Eat smarter - Building an AI-powered meal planner (with SwiftUI and Firebase)

Firebase is best known as a backend as a service (BaaS) for mobile and web developers, and for its realtime databases. But it also includes a number of AI features that you can use to implement exciting new use cases for your apps.

In this talk, I will demonstrate how to implement an AI-powered meal planner using Firebase, bringing together Firestore, Gemini, Genkit, Cloud Storage, and SwiftUI.

In addition to walking you through plenty of code samples, I will explain the key concepts that underpin the technology powering the app. There is also going to be a good helping of live demos, which will include pictures of delicious recipes, so better make sure you’re not hungry when attending this talk.

You will learn

- How to securely call LLMs like Gemini and Imagen from mobile apps, without leaking your API keys
- How to use multimodal prompts
- How to generate text and images
- How to generate structured data
- How to use vector embeddings to implement semantic search
- How to use Retrieval Augmented Generation to tap into the user’s data stored in the app
- How to monitor your AI features and keep an eye on the number of tokens consumed and produced
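
To make the structured-data item above concrete: the idea is to ask the model for output that matches a schema and to check the result before the app uses it. In the talk this is done with a Genkit output schema (`ingredientSchema`), whose definition is not shown in the slides. The sketch below is a dependency-free illustration of the same check, not the talk's code. The `Ingredient` shape, `isIngredient`, and `parseIngredientList` are hypothetical names, and the `amount` and `unit` fields are assumptions.

```typescript
// Assumed shape, for illustration only. The slides confirm an `ingredients`
// array whose items have a `name`; `amount` and `unit` are guesses.
type Ingredient = { name: string; amount?: number; unit?: string };
type IngredientList = { ingredients: Ingredient[] };

// Type guard: true only if `value` looks like an Ingredient.
function isIngredient(value: unknown): value is Ingredient {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["name"] === "string" &&
    (v["amount"] === undefined || typeof v["amount"] === "number") &&
    (v["unit"] === undefined || typeof v["unit"] === "string")
  );
}

// Parse a JSON string returned by a model and reject anything off-schema.
function parseIngredientList(json: string): IngredientList {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const list = (data as Record<string, unknown>)["ingredients"];
  if (!Array.isArray(list) || !list.every(isIngredient)) {
    throw new Error("Response does not match the ingredient schema");
  }
  return { ingredients: list };
}

// Made-up model response, used only to exercise the parser.
const sample =
  '{"ingredients":[{"name":"milk","amount":2,"unit":"liters"},{"name":"eggs"}]}';
const parsed = parseIngredientList(sample);
console.log(parsed.ingredients.map((i) => i.name));
```

A schema library such as Zod, which the Genkit slides use, replaces the hand-written guard with a declarative schema. The point of the sketch is only that off-schema output is rejected at the boundary instead of reaching the UI.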

Peter Friese

May 09, 2025

Transcript

  1. AI Development with Firebase

    Peter Friese, Staff Developer Relations Engineer, Google. @peterfriese.dev, https://peterfriese.dev, peterfriese. Icon created by Mamank from Noun Project.
  2. Friendly Meals

    - Generate recipes based on what’s in your fridge
    - Include an inspirational image of the dish
    - Base the recipe on a curated set of recipes
    - Eat pizza 🍕
  3. Friendly Meals

    - Generate text from text
    - Analyse images
    - Generate images from text
    - Retrieval Augmented Generation
    - Eat pizza 🍕
  4. Setting up the model

        class RecipeGenerationService {
          private let vertexAI = VertexAI.vertexAI()
          private lazy var model = vertexAI
            .generativeModel(modelName: "gemini-2.0-flash")

          func foo() async throws {
            let prompt = "Hello World!"
            let response = try await model.generateContent(prompt)
            print(response.text)
          }
        }
  5. Generate text from text

        func generateRecipe(from ingredients: String,
                            cuisine: Cuisine,
                            mealType: MealType,
                            servings: Int) async throws -> String {
          let prompt = """
            Create a \(cuisine.rawValue) \(mealType.rawValue.lowercased()) recipe for \(servings) people using these ingredients: \(ingredients).
            Generate:
            1. A creative title that describes the dish
            2. A brief, appetizing description
            3. Estimated cooking time in minutes
            4. List of ingredients with measurements
            5. Step-by-step cooking instructions
            6. Include the cuisine type ("\(cuisine.rawValue)")
            7. For the imageURL, provide a URL to a high-quality food photo from Pexels.com that most closely matches this exact \(cuisine.rawValue) dish. The image should show a finished, plated dish that matches the recipe's style and ingredients.
            """
          let response = try await model.generateContent(prompt)
          return response.text ?? ""
        }
  6. Analyse images (callout: Multimodal input)

        private lazy var visionModel = vertexAI.generativeModel(
          modelName: "gemini-2.0-flash",
          generationConfig: GenerationConfig()
        )

        func analyzeImage(_ image: UIImage) async throws -> String {
          let prompt = """
            Please analyze this image and list all visible food ingredients. \
            Format the response as a comma-separated list of ingredients. \
            Be specific with measurements where possible, but focus on identifying the \
            ingredients accurately.
            """
          let response = try await visionModel.generateContent(prompt, image)
          return response.text ?? ""
        }
  7. Generate images from text

        private lazy var imagenModel = vertexAI.imagenModel(
          modelName: "imagen-3.0-generate-002",
          generationConfig: ImagenGenerationConfig(numberOfImages: 1)
        )

        func generateImage(for recipe: Recipe) async throws -> UIImage? {
          let prompt = """
            A professional food photography shot of \(recipe.title). \
            The dish should be \(recipe.description). \
            Style: High-end food photography, restaurant-quality plating, soft natural \
            lighting, shot from above on a clean background, showing the complete \
            plated dish. \
            Cuisine style: \(recipe.cuisine.rawValue)
            """
          let response = try await imagenModel.generateImages(prompt: prompt)
          return response.images.first.flatMap { UIImage(data: $0.data) }
        }
  8. Setting up the model (callout: Unified model interface)

        const ai = genkit({
          plugins: [
            vertexAI({ location: "us-central1" }),
          ],
          model: gemini20Flash
        });
  9. Multimodal prompts in Genkit (callout: Genkit flow). The slide cuts off inside the `ai.generate` call.

        const detectIngredientsFlow = ai.defineFlow({
          name: "detectIngredientsFlow",
          inputSchema: z.object({
            image: z.string().describe("Base64 encoded image of a fridge")
          }),
          outputSchema: ingredientSchema,
        }, async (input) => {
          const prompt = `Analyze this image of a fridge or food items and list all the ingredients you can identify.
            For each ingredient, provide:
            1. The name of the ingredient
            2. The amount and unit (if visible, e.g., "2 liters", "500 grams")
            Format the response as a structured list of ingredients.`;

          const { output } = await ai.generate({
            prompt: [
              { media: { url: input.image }
  10. Multimodal prompts in Genkit (callout: Input schema). Same code as slide 9.
  11. Multimodal prompts in Genkit (callout: Prompt). Same code as slide 9.
  12. Multimodal prompts in Genkit (callout: Call the model). Same code as slide 9.
  13. Multimodal prompts in Genkit (callout: Multimodal prompt). The slide shows the end of the same flow, starting inside the prompt string.

            2. The amount and unit (if visible, e.g., "2 liters", "500 grams")
            Format the response as a structured list of ingredients.`;

          const { output } = await ai.generate({
            prompt: [
              { media: { url: input.image } },
              { text: prompt }
            ],
            output: { schema: ingredientSchema }
          });

          if (!output) {
            throw new Error("Failed to detect ingredients");
          }
          return output;
        });
  14. Task: Find all words that are food in the following sentence.

    “I went down to Aberystwyth on foot to buy some welsh cakes and a few berries. When I finished doing my groceries, I had a latte at Coffee #1, where I met a few other speakers.”
  15. Same task and sentence as slide 14.
  16. Same task and sentence as slide 14.
  17. Vector embedding for food:

        [-0.018035058, 0.013980114, -0.01309541, 0.024956783, 0.02708295, -0.074924484, 0.03496225, 0.0125780115, ...]

    Vector embedding for foot:

        [-0.016025933, 0.008207399, -0.03572462, 0.020942606, -0.0003162824, -0.041694388, 0.050102886, 0.007380137, ...]
  18. Computing embeddings

        const computeIngredientsEmbeddings = async (ingredientsList: string[]): Promise<number[]> => {
          const text = ingredientsList.join(", ");
          const result = await ai.embed({
            embedder: textEmbedding005,
            content: text,
          });
          return result[0].embedding;
        };
  19. Retrieving documents

        const recipeRetriever = defineFirestoreRetriever(ai, {
          name: 'recipeRetriever',
          firestore,
          collection: 'recipes',
          contentField: 'recipe',
          vectorField: 'ingredientsEmbeddings',
          embedder: textEmbedding005,
          distanceMeasure: 'COSINE'
        });
  20. Nested flows in Genkit

        const inventRecipeFlow = ai.defineFlow({
          name: "inventRecipeFlow",
          inputSchema: imageBasedRecipeSchema,
          outputSchema: recipeSearchSchema,
        }, async (input) => {
          const detectedIngredients = await detectIngredientsFlow({
            image: input.image
          });
          const inspirationRecipes = await findRecipesFlow({
            ingredientsList: detectedIngredients.ingredients.map(i => i.name)
          });
          const generatedRecipe = await inventRecipeFlow({
            inspirationRecipes,
            cuisine: input.cuisine,
            mealType: input.mealType,
            servings: input.servings
          });
          return generatedRecipe;
        });
  21. Calling Genkit flows from Swift (callout: Expose Genkit flow as callable function)

    TypeScript (Cloud Functions):

        export const inventRecipe = onCallGenkit({
          secrets: [apiKey],
        }, inventRecipeFlow);

    Swift (excerpt from inside a function):

        let request = GenerateRecipeRequest(
          image: base64String,
          cuisine: cuisine.rawValue,
          mealType: mealType.rawValue.lowercased(),
          servings: servings,
          dietaryRestrictions: dietaryRestrictions
        )

        let generateRecipe: Callable<GenerateRecipeRequest, ImageBasedRecipeResponse> =
          functions.httpsCallable("generateRecipe")

        let response = try await generateRecipe(request)
        guard let generatedRecipe = response.recipes.first else {
          // handle error
        }
        return generatedRecipe
        }
  22. Calling Genkit flows from Swift (callout: Define remote function in Swift). Same code as slide 21.
  23. Calling Genkit flows from Swift (callout: Call the function). Same code as slide 21.
  24. AI Development with Firebase (closing slide)

    Peter Friese, Staff Developer Relations Engineer, Google. @peterfriese.dev, https://peterfriese.dev, peterfriese. Icon created by Mamank from Noun Project.
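
Slides 17 to 19 rely on comparing embedding vectors: the retriever on slide 19 is configured with `distanceMeasure: 'COSINE'`. The sketch below shows what a cosine-based lookup computes, using a small in-memory list in place of Firestore. It is an illustration of the idea, not the talk's code or Firestore's implementation. `EmbeddedDoc`, `cosineSimilarity`, `topK`, and the three-dimensional vectors are all made up for the example; real embeddings have many more dimensions than this.

```typescript
// Hypothetical document type for illustration; not taken from the talk.
type EmbeddedDoc = { id: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Ranges from -1 to 1.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // Convention chosen here for zero vectors.
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k documents most similar to the query, best match first.
function topK(query: number[], docs: EmbeddedDoc[], k: number): EmbeddedDoc[] {
  return docs
    .map((doc) => ({ doc, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.doc);
}

// Toy vectors, invented for this example.
const docs: EmbeddedDoc[] = [
  { id: "margherita", embedding: [0.9, 0.1, 0.0] },
  { id: "fruit-salad", embedding: [0.0, 0.2, 0.9] },
  { id: "calzone", embedding: [0.8, 0.3, 0.1] },
];
const matches = topK([1.0, 0.0, 0.0], docs, 2).map((d) => d.id);
console.log(matches);
```

A linear scan like this is only practical for small collections. A vector index in the database, which is what the Firestore retriever on slide 19 relies on, avoids scoring every document on each query.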