Implementing Live Camera OCR with Jetpack Compose

Published at
12/6/2024
Categories
android
jetpackcompose
Author
rockandnull

Building apps that can seamlessly interpret real-world data is becoming increasingly essential, especially with the rise of AI and machine learning.

Integrating features like Optical Character Recognition (OCR) directly into mobile apps allows users to extract and process text from images or camera feeds, enhancing the app's interactivity and usefulness.

In this post, we'll explore how to implement a live camera view with OCR in Jetpack Compose. Leveraging Compose's modern UI toolkit, Jetpack's CameraX, and the power of ML Kit, we'll create a streamlined, intuitive experience for real-time text detection. Whether you're building a document scanner, a data entry helper, or just want to experiment with cool tech, this guide will provide a practical, step-by-step approach to integrating these features into your app.
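
For orientation, the three libraries named above map to Gradle artifacts roughly as follows. This is a sketch of a module-level build.gradle.kts, not the author's actual build file: the version numbers are assumptions chosen for illustration, so check the current releases before copying them.

```kotlin
// build.gradle.kts (module level) - versions below are illustrative assumptions
val cameraxVersion = "1.3.4"

dependencies {
    // CameraX: core, Camera2 backend, lifecycle binding, and PreviewView / LifecycleCameraController
    implementation("androidx.camera:camera-core:$cameraxVersion")
    implementation("androidx.camera:camera-camera2:$cameraxVersion")
    implementation("androidx.camera:camera-lifecycle:$cameraxVersion")
    implementation("androidx.camera:camera-view:$cameraxVersion")

    // ML Kit on-device text recognition (Latin script, model bundled with the app)
    implementation("com.google.mlkit:text-recognition:16.0.0")

    // Accompanist helper for requesting runtime permissions from Compose
    implementation("com.google.accompanist:accompanist-permissions:0.34.0")
}
```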

The Composable

@OptIn(ExperimentalPermissionsApi::class)
@Composable
private fun CameraView(
    modifier: Modifier,
    onTextDetected: (Text) -> Unit = {},
) {
    val context = LocalContext.current
    val lifecycleOwner = LocalLifecycleOwner.current
    val permissionState = rememberPermissionState(permission = Manifest.permission.CAMERA) // 1.

    val cameraController = remember {
        LifecycleCameraController(context).apply {
            setEnabledUseCases(CameraController.IMAGE_ANALYSIS)
            setImageAnalysisAnalyzer(
                ContextCompat.getMainExecutor(context),
                TextRecognitionAnalyzer(onTextDetected = onTextDetected), // 2.
            )
        }
    }

    Box(
        modifier = modifier.fillMaxWidth(),
        contentAlignment = Alignment.Center,
    ) {
        AndroidView(
            modifier = Modifier
                .fillMaxSize()
                .clip(RoundedCornerShape(12.dp)),
            factory = { context ->
                PreviewView(context).apply { // 3.
                    scaleType = PreviewView.ScaleType.FILL_CENTER
                    layoutParams = ViewGroup.LayoutParams(
                        ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT,
                    )
                    this.controller = cameraController
                    cameraController.bindToLifecycle(lifecycleOwner) // 4.
                }
            },
        )

        if (!permissionState.status.isGranted) { // 5.
            Column(
                horizontalAlignment = Alignment.CenterHorizontally,
            ) {
                Text(
                    text = "Needs camera permission",
                )
                Spacer(modifier = Modifier.size(8.dp))
                Button(
                    onClick = {
                        permissionState.launchPermissionRequest()
                    },
                ) {
                    Text(text = "Request permission")
                }
            }
        }
    }
}
  1. Accessing the camera requires the CAMERA permission, which must also be declared in the app's manifest. This example requests the permission and tracks its status using the Google Accompanist Permissions library.
  2. A custom analyzer that takes a camera frame as input and outputs the recognized text. It is covered in the next section.
  3. The PreviewView from Jetpack's CameraX renders the live preview from the camera. Unfortunately, it is not Compose-native, so we need to wrap the classic Android View in an AndroidView.
  4. The controller must know when the app moves between the foreground and the background so it can acquire and release the camera. Binding it to the lifecycle conveniently takes care of that.
  5. The UI shown while the camera permission has not been granted, with a button to request it.
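
To show how the composable might be used, here is a minimal sketch of a screen that hosts the camera view and displays the most recently recognized text below it. The OcrScreen name is hypothetical, and the sketch assumes Material 3 and that it lives in the same file as CameraView (which is declared private above).

```kotlin
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp

// Hypothetical host screen; CameraView is the composable from this post.
@Composable
fun OcrScreen() {
    // Holds the latest recognized text; updating it recomposes the Text below.
    var detectedText by remember { mutableStateOf("") }

    Column(
        modifier = Modifier
            .fillMaxSize()
            .padding(16.dp),
    ) {
        CameraView(
            // Let the preview take the space not used by the result text.
            modifier = Modifier.weight(1f),
            // `recognized` is ML Kit's Text result; `.text` is the full recognized string.
            onTextDetected = { recognized -> detectedText = recognized.text },
        )
        Spacer(modifier = Modifier.height(8.dp))
        Text(text = detectedText)
    }
}
```

Letting the lambda's parameter type be inferred avoids importing ML Kit's Text class next to the Compose Text composable, which share the same simple name.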

The Analyzer

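import androidx.annotation.OptIn
import androidx.camera.core.ExperimentalGetImage
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.Text
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.SupervisorJob
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

internal class TextRecognitionAnalyzer(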
    private val onTextDetected: (Text) -> Unit,
) : ImageAnalysis.Analyzer {

    private val scope: CoroutineScope = CoroutineScope(Dispatchers.IO + SupervisorJob())
    private val textRecognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS) // 1.

    @OptIn(ExperimentalGetImage::class)
    override fun analyze(imageProxy: ImageProxy) {
        scope.launch { // 2.
            val mediaImage = imageProxy.image ?: run {
                imageProxy.close()
                return@launch
            }

            val inputImage =
                InputImage.fromMediaImage(mediaImage, imageProxy.imageInfo.rotationDegrees)

            // Suspend until ML Kit finishes, so the frame is closed only afterwards.
            suspendCoroutine<Unit> { continuation ->
                textRecognizer.process(inputImage)
                    .addOnSuccessListener { visionText: Text ->
                        if (visionText.text.isNotBlank()) {
                            onTextDetected(visionText) // 3.
                        }
                    }
                    .addOnCompleteListener {
                        continuation.resume(Unit)
                    }
            }
            delay(100)
        }.invokeOnCompletion { exception ->
            exception?.printStackTrace()
            imageProxy.close()
        }
    }
}
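  1. This is Google's ML Kit Text Recognition client, which performs OCR (Optical Character Recognition) using on-device machine learning. Essentially, it analyzes images and returns the text it finds.
  2. We use coroutines to handle the processing in the background. The frame is closed once the coroutine completes, which tells CameraX that it can deliver the next frame, and the short delay throttles how often frames are analyzed.
  3. This is called when the OCR processing succeeds and the result contains text. From here we can proceed with the business logic for the extracted text.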
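
Because the analyzer runs continuously, the callback in step 3 will often fire repeatedly with the same text while the camera points at the same scene. One way to keep the business logic quiet is to forward only results that actually changed. The sketch below is plain Kotlin with no Android dependencies; DistinctTextFilter is a hypothetical helper, not part of ML Kit or CameraX, and it is not thread-safe, so it assumes all calls arrive on a single thread.

```kotlin
/**
 * Hypothetical helper: forwards a recognized string only when it differs
 * from the previously forwarded one (after whitespace normalization).
 * Not thread-safe; call submit() from a single thread.
 */
class DistinctTextFilter(private val onNewText: (String) -> Unit) {
    private var last: String? = null

    fun submit(raw: String) {
        // Collapse runs of whitespace so line-break jitter does not count as a change.
        val normalized = raw.trim().replace(Regex("\\s+"), " ")
        if (normalized.isEmpty() || normalized == last) return
        last = normalized
        onNewText(normalized)
    }
}

fun main() {
    val seen = mutableListOf<String>()
    val filter = DistinctTextFilter { seen += it }

    // Simulated sequence of OCR results from consecutive frames.
    listOf("Hello  world", "Hello world\n", "", "Total: 42").forEach(filter::submit)

    println(seen) // [Hello world, Total: 42]
}
```

In the app, such a filter could sit inside the onTextDetected callback, receiving visionText.text from each successful recognition.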

By integrating a live camera view with OCR in Jetpack Compose, you unlock powerful capabilities to process real-world data in real-time, adding immense value to your app's user experience. This combination of CameraX, ML Kit, and Compose makes it possible to create seamless, modern, and efficient interfaces while leveraging cutting-edge AI tools.

Happy coding!
