Julien Salvi
September 06, 2024

Leverage our skills and apps with new AI/ML tools for Android

2023 has been a pivotal year in terms of AI, introducing groundbreaking tools and libraries that bring AI/ML concepts to the Android ecosystem, like Gemini, Studio Bot or MediaPipe. Not to forget what was already in place, like ML Kit or TensorFlow.

This talk will introduce these new tools by showing how they can help you produce better code (with Studio Bot) or build AI-powered applications (with Gemini & co). Each tool has its own learning curve, usability and potential blockers. We'll deep dive into their capabilities and see how they can empower you to build great projects.

By the end of the session, you'll have a clearer picture of which tool to use in which situation. AI is here to amplify our capabilities, so let's take advantage of it! 🚀

Julien Salvi

September 06, 2024

Transcript

  1. Leverage your skills & apps with new AI/ML tools for Android
     Julien Salvi - Android GDE | Android @ Aircall
     droidcon Lisbon 2024 | @JulienSalvi
     Let's use AI/ML wisely 🤓
  2. The age of AI/ML
     A little bit of context…
     It's there, we cannot escape 🫣
  3. The Age of AI/ML: AI context in 2024
     • AI/ML has been booming for two years now, with many tools that are more and more accessible to everyone
     • We've seen the rise of Generative AI (ChatGPT, Gemini, MistralAI…) and a push in other ML tools (MediaPipe, TensorFlow, ML Kit…)
     • Android isn't escaping the AI trend!
     • We now have a large set of tools to leverage our skills and apps 🚀
  4. The Age of AI/ML: AI/ML on Android
     • We can identify 2 categories of tools:
       ◦ AI for developers
       ◦ AI for apps
     • The first helps developers build great apps, acting as a daily assistant that leverages their skills
     • The second provides a set of tools to build better apps
     • Each tool has its own learning curve, cost and set of features 🚀
  5. AI/ML tools to leverage your skills
     Let AI assist you 🤖
  6. AI Assistants: Gemini, Copilot, JetBrains AI…
     • AI assistants should be seen as pair programmers 🤓 Use them to boost productivity, explore new ideas, and learn new techniques.
     • DON'T blindly accept every suggestion! 🫣
     • Security & privacy matter a lot! 🔐 Check your company policy before using a new AI companion. Be mindful of the code you share with AI assistants.
  7. AI chats: The art of prompting
     • When using LLM-based chats, you must provide the best context possible to get the best answers 📝
     • You're not chatting with a human! 🤖 Be clear, specific, and avoid ambiguity.
     • Context is key: provide relevant information about your code, goals, and desired outcome.
       ◦ Instead of: "Make this better"
       ◦ Try: "Simplify this Kotlin function and explain the changes"
  8. AI chats: The art of prompting
     • Keywords are your best friends: use relevant technical terms ("Jetpack Compose", "coroutines", "Room database")
     • Structure for success (a composed example follows this list):
       ◦ State the task clearly: "Write a…", "Explain how…", "Find errors in…"
       ◦ Provide code snippets or context (use the "Insert code" button in Gemini)
       ◦ Specify the desired format: "Kotlin code", "Bulleted list", "Slides"…
     • If the first response isn't perfect, rephrase or refine your prompt!
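     Putting the structure together, a composed prompt might look like this (purely illustrative):

       "Simplify this Kotlin function using coroutines and explain the changes.
        Return the answer as Kotlin code followed by a bulleted list of the changes.
        [code snippet added via the 'Insert code' button]"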
  9. AI chats: Gemini in Android Studio
     • Gemini is directly built into Android Studio 🛠
     • It can answer coding questions, generate code or help you debug parts of your code 🤓
     • You can control the data/code shared with Gemini 🧐
     • Fine-grained control over the files you share with Gemini using an .aiexclude file 🔐 (sketched below)
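     A minimal .aiexclude sketch, assuming gitignore-style patterns; the file names below are illustrative:

       # Keep secrets and sensitive config away from Gemini
       apikeys.properties
       secrets/
       *.pem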
  10. AI chats: Gemini in Android Studio
     • Get quick answers: ask about Android APIs, libraries, best practices, or even general coding concepts.
       ◦ "How do I use Room to store data?"
       ◦ "How can I make my app more accessible?"
       ◦ "What's the difference between ViewModel and SavedStateHandle?"
  11. AI chats: Gemini in Android Studio
     • Generate different code options: describe the functionality you need, and Gemini will suggest code snippets.
       ◦ "Create a function to fetch data from this API endpoint using Retrofit"
       ◦ "Write a composable function that displays a list of items in a lazy grid"
  12. AI chats: Gemini in Android Studio
     • Improve existing code: ask Gemini to review your code for potential issues, optimizations, or improvements.
       ◦ "Can you help me simplify this code?"
       ◦ "Is there a more efficient way to implement this feature?"
  13. Gemini in Android Studio
     Main Usage: Offer suggestions and automate repetitive tasks
     Cost: $0*
     Learning Curve: Fast
     Pros: • Integrated within Android Studio • Trained for Android dev • No additional cost • Privacy controls
     Cons: • Still in development • No offline support • Privacy concerns
     *depending on how you value your code 😅
  14. GitHub Copilot in Android Studio
     Main Usage: Offer suggestions and automate repetitive tasks
     Cost: From $10/month to $39/seat per month
     Learning Curve: Fast
     Pros: • Official plugin for Android Studio • Easy to connect with your GitHub account • Lots of features if fully integrated with GitHub • Privacy controls • Context awareness
     Cons: • Non-negligible cost • No offline support • Privacy concerns
  15. JetBrains AI
     Main Usage: Offer suggestions and automate repetitive tasks in IntelliJ
     Cost: €8.33 per month
     Learning Curve: Fast
     Pros: • Plugin for IntelliJ • Efficient code completion and generation • Can generate commit messages, explain errors and generate documentation • Privacy controls • Customer data not used to train the models
     Cons: • Still in development • No offline support • Privacy concerns
     ℹ JetBrains AI uses LLMs from OpenAI and Google
  16. AI/ML tools for your apps
     Build smarter & richer apps
  17. Gemini for Android: In a nutshell
     • Gemini easily enables generative AI capabilities in your apps to build enhanced features like sentiment analysis, smart bots, text summarization and more
     • ⚠ Only use the Google AI Client SDK for prototyping, as you can leak your API key if it's embedded in your app
     • Prefer using Gemini in Firebase with Vertex AI, or your own gateway, for safe usage
  18. Gemini for Android: In a nutshell
     • Gemini on-device with Nano is still a private preview 🥲
     • The more context (text and images) you give, the more accurate your response will be!
     • Experiment with the parameters to get the desired output
  19. dependencies {
         // Google AI Client SDK for Android (🧪 prototyping only)
         implementation 'com.google.ai.client.generativeai:generativeai:0.9.0'
         // Vertex AI in Firebase
         implementation 'com.google.firebase:firebase-vertexai:16.0.0-beta04'
     }
  20. // With Google AI SDK on Android
     val model = GenerativeModel(
         model = "gemini-1.5-flash",
         apiKey = "<MY-API-KEY>",
         generationConfig = generationConfig {
             temperature = 0.15f
             topK = 32
             topP = 1f
             maxOutputTokens = 4096
         },
         safetySettings = listOf(
             SafetySetting(HarmCategory.HARASSMENT, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.HATE_SPEECH, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.DANGEROUS_CONTENT, BlockThreshold.MEDIUM_AND_ABOVE),
         )
     )
  21. // With Vertex AI in Firebase
     val model = Firebase.vertexAI.generativeModel(
         model = "gemini-1.5-flash",
         generationConfig = generationConfig {
             temperature = 0.15f
             topK = 32
             topP = 1f
             maxOutputTokens = 4096
         },
         safetySettings = listOf(
             SafetySetting(HarmCategory.HARASSMENT, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.HATE_SPEECH, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, BlockThreshold.MEDIUM_AND_ABOVE),
             SafetySetting(HarmCategory.DANGEROUS_CONTENT, BlockThreshold.MEDIUM_AND_ABOVE),
         )
     )
  22. // Text generation with a simple prompt
     scope.launch {
         val response = model.generateContent("Give a recipe with the best Portuguese ingredients")
     }

     // Use an image and a prompt
     scope.launch {
         val response = model.generateContent(
             content {
                 image(bitmap)
                 text("Is there some carrot in this picture?")
             }
         )
     }

     // Text generation as a stream thanks to Flow
     scope.launch {
         var outputContent = ""
         model.generateContentStream("My awesome prompt").collect { response ->
             outputContent += response.text
         }
     }
  23. Gemini for Android
     Main Usage: Enhance your apps with GenAI-based features
     Cost: from $0 to $21/1 million tokens (output)*
     Learning Curve: Fast
     Pros: • Fast integration with Android • Proxy with Firebase • Fast learning curve • Text and image as input • On-device capabilities with Gemini Nano
     Cons: • Lots of things still in preview • Heavy processing can be costly • High risk of leaking your API key if you embed the SDK in your app
     *depending on the Gemini model and/or Firebase cost
  24. ML Kit on Android: In a nutshell
     • ML Kit brings powerful and easy-to-use ML features, optimized for Android and iOS, with minimal coding and resources.
     • It provides pre-built and customizable models for common use cases such as image and text recognition, face detection, barcode scanning…
     • ML Kit also allows developers to train custom models using their own data.
  25. ML Kit on Android: Model installation
     • Models in ML Kit APIs can be installed in 3 different ways (see the Gradle sketch below):
       ◦ Unbundled: models are downloaded and managed via Google Play services.
       ◦ Bundled: models are statically linked to your app at build time.
       ◦ Dynamically downloaded: models are downloaded on demand.
     • Using ML Kit will increase your app size (2 to 10 MB per model).
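     A minimal sketch of how the bundled vs unbundled choice shows up in Gradle, using barcode scanning as an example (versions are illustrative; you would pick one of the two artifacts, not both):

       dependencies {
           // Bundled: the model ships inside your APK (bigger app, available immediately)
           implementation 'com.google.mlkit:barcode-scanning:17.3.0'

           // Unbundled: the model is downloaded and managed by Google Play services
           implementation 'com.google.android.gms:play-services-mlkit-barcode-scanning:18.3.1'
       }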
  26. ML Kit on Android: Vision libraries
     Text Recognition v2 • Face Detection • Face Mesh Detection (beta) • Object Detection • Image Labeling • Document Scanning (beta) • Pose Detection (beta) • Barcode Scanning • Digital Ink Recognition (beta) • Selfie and subject segmentation (beta)
  27. ML Kit on Android: Text Recognition v2
     • Text Recognition v2 allows us to extract text from images (camera or static images)
     • Trained to recognize text in over 100 languages, including Latin-based scripts and non-Latin scripts such as Japanese or Chinese.
     • The Text Recognizer segments text into blocks, lines, elements and symbols.
     Bundled model: +4 MB per architecture
  28. ML Kit on Android: Text Recognition v2
     [Diagram: a recognized sentence annotated with its hierarchy of Blocks, Lines and Elements]
  29. dependencies {
         // To recognize Latin script
         implementation 'com.google.mlkit:text-recognition:16.0.0'
         // To recognize Chinese script
         implementation 'com.google.mlkit:text-recognition-chinese:16.0.0'
         // To recognize Devanagari script
         implementation 'com.google.mlkit:text-recognition-devanagari:16.0.0'
         // To recognize Japanese script
         implementation 'com.google.mlkit:text-recognition-japanese:16.0.0'
         // To recognize Korean script
         implementation 'com.google.mlkit:text-recognition-korean:16.0.0'
     }
  30. // Init the TextRecognition client (here for Latin languages)
     val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)

     // Load a bitmap image, for instance
     val image = InputImage.fromBitmap(bitmap, 0)

     // Use the client to process the image
     val result = recognizer.process(image)
         .addOnSuccessListener { visionText ->
             // Get the text from the image and info about where it is located
             val allText = visionText.text
             val blocks = visionText.textBlocks
             // ...
         }
         .addOnFailureListener { e ->
             // Task failed with an exception
         }
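     A small sketch of drilling down the recognized hierarchy (blocks, lines, elements) inside the success listener above; the log tag is illustrative:

       visionText.textBlocks.forEach { block ->
           block.lines.forEach { line ->
               line.elements.forEach { element ->
                   // element.text is roughly a word; boundingBox locates it in the image
                   Log.d("TextRecognition", "${element.text} at ${element.boundingBox}")
               }
           }
       }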
  31. ML Kit for Android
     Main Usage: Build computer vision and NLP features with pre-built models
     Cost: $0
     Learning Curve: Quite fast
     Pros: • Fast integration with Android and on-device • Free to use • Pre-built models for various use cases • Custom model deployment • Optimized for mobile usage
     Cons: • Black box with pre-built models • Limited model customization • Some features require Google Play services • Performance for computer vision
  32. TensorFlow Lite or LiteRT: In a nutshell
     • LiteRT (formerly TensorFlow Lite) is a high-performance cross-platform runtime for on-device AI
     • Convert or use existing models that suit your use cases, or build your own!
     • LiteRT is optimized for mobile with a focus on privacy, size and performance
     • The learning curve for building your own TFLite models can be quite steep and can require Python knowledge
     https://ai.google.dev/edge/litert
  33. TensorFlow Lite or LiteRT: In a nutshell
     • You can take advantage of Google Play services to ship a lighter app and use a high-level API in Java/Kotlin (the recommended way)
     • The high-level API lets you run inferences through the Interpreter API exposed by the library (sketched below)
     • You keep control over the input, output and learning parts
     • Otherwise, you'll have to deal with a C/C++ API!
     https://ai.google.dev/edge/litert
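     A minimal sketch of the Play services flow, assuming a .tflite model already loaded into a ByteBuffer (modelBuffer, inputBuffer and outputBuffer are placeholders):

       import com.google.android.gms.tflite.java.TfLite
       import org.tensorflow.lite.InterpreterApi
       import org.tensorflow.lite.InterpreterApi.Options.TfLiteRuntime

       // 1. Initialize the TFLite runtime shipped with Google Play services
       TfLite.initialize(context).addOnSuccessListener {
           // 2. Create an interpreter backed by the system runtime
           val interpreter = InterpreterApi.create(
               modelBuffer, // ByteBuffer holding your .tflite model
               InterpreterApi.Options().setRuntime(TfLiteRuntime.FROM_SYSTEM_ONLY)
           )
           // 3. Run the inference: you control the input and output buffers
           interpreter.run(inputBuffer, outputBuffer)
       }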
  34. LiteRT / TensorFlow Lite
     Main Usage: Build and deploy ML models on Android to bring on-device ML features
     Cost: $0*
     Learning Curve: High
     Pros: • Build your own models or use existing ones • Optimized for on-device ML • Offline support • Low latency and real-time performance • Full control of the flow
     Cons: • Steepest learning curve for Android devs • Python knowledge mandatory • Model conversion to the .tflite format • Requires strong ML knowledge
     *building and hosting your models can be non-negligible
  35. MediaPipe Framework: In a nutshell
     • MediaPipe Framework is a low-level tool to build on-device ML pipelines
     • It requires NDK/C++ to run the pipelines on Android, and you must be familiar with several Framework concepts (Packets, Graphs, Calculators)
     • The learning curve is steep and it can take some time to master the entire flow
     https://ai.google.dev/edge/mediapipe/framework
  36. MediaPipe Solutions: In a nutshell
     • MediaPipe brings cross-platform and easy-to-use ML solutions, optimized for mobile, with minimal coding and resources.
     • It provides pre-built models for multiple fields such as vision, text, audio or GenAI… or you can build and evaluate your own models with the Model Maker & Studio tools
     • You must add the models to the app resources before using the MediaPipe libraries
     https://ai.google.dev/edge/mediapipe/solutions/guide
  37. MediaPipe Solutions
     Vision: Gesture recognition • Image classification • Face stylization • Object detection • Interactive segmentation • Hand detection • Face detection (some tasks experimental)
     Text: Text classification • Text embedding • Language identification
     Audio: Audio classification
     GenAI (experimental): Image generation • LLM inference
  38. MediaPipe Solutions: Hand landmarks detection
     • Hand landmarks detection identifies the key points of the hand
     • The input can be a static image, a decoded video frame or a live stream
     • The library offers many configurable options
     • ⚠ Embedding models in your app will have an impact on your app size
  39. dependencies {
         // To recognize hand landmarks
         implementation 'com.google.mediapipe:tasks-vision:0.10.15'
     }

     // Download the pre-built model
     // Add it to your app assets: <your-project-root>/src/main/assets
  40. // Pass the model to the library
     val baseOptions = BaseOptions.builder().setModelAssetPath("path_to_model").build()

     // Configure the hand landmarks detection here
     val optionsBuilder = HandLandmarker.HandLandmarkerOptions.builder()
         .setBaseOptions(baseOptions)
         .setMinHandDetectionConfidence(minHandDetectionConfidence)
         .setMinTrackingConfidence(minHandTrackingConfidence)
         .setMinHandPresenceConfidence(minHandPresenceConfidence)
         .setNumHands(maxNumHands)
         .setRunningMode(RunningMode.IMAGE)

     // Start detecting!
     val handLandmarker = HandLandmarker.createFromOptions(context, optionsBuilder.build())
     val mediaPipeImage = BitmapImageBuilder(image).build() // image is a Bitmap
     val result = handLandmarker?.detect(mediaPipeImage)
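     As a follow-up sketch, reading the detected key points out of the result above (the log tag is illustrative):

       // Each inner list holds the normalized landmarks of one detected hand
       result?.landmarks()?.forEachIndexed { handIndex, handLandmarks ->
           handLandmarks.forEach { landmark ->
               // x/y are normalized to [0, 1] relative to the image dimensions
               Log.d("HandLandmarker", "hand $handIndex -> (${landmark.x()}, ${landmark.y()})")
           }
       }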
  42. MediaPipe (Solutions & Framework)
     Main Usage: Build ML-based features (vision, text, audio) with turnkey models or your own
     Cost: $0*
     Learning Curve: Medium-high
     Pros: • Built-in models for MediaPipe Tasks • Lots of use cases covered • On-device ML • Customize your own models • Cross-platform
     Cons: • Steeper learning curve • MediaPipe Framework requires NDK knowledge • Python recommended to build ML models
     *building and hosting your models can be non-negligible
  43. AI/ML tools
     ML Kit: https://developers.google.com/ml-kit
     MediaPipe: https://ai.google.dev/edge/mediapipe/solutions/guide
     Gemini on Android: https://developer.android.com/ai/generativeai
     ML/AI Codelabs: https://codelabs.developers.google.com/?category=aiandmachinelearning&product=android