
Avoid common LLM pitfalls

Mete Atamel
September 15, 2024


It’s easy to generate content with a Large Language Model (LLM), but the output often suffers from hallucinations (fabricated content), outdated information (not based on the latest data), reliance on public data only (no private data), and a lack of citations back to original sources. None of this is ideal for real-world applications. In this talk, we’ll provide a quick overview of the latest advancements in multi-modal LLMs, highlighting their capabilities and limitations. We’ll then explore various techniques to overcome common LLM pitfalls, including Retrieval-Augmented Generation (RAG) to enhance prompts with relevant data, ReAct prompting to guide LLMs in verbalizing their reasoning, Function Calling to grant LLMs access to external APIs, Grounding to link LLM outputs to verifiable information sources, and more.

Transcript

  1. Avoid common LLM pitfalls
     Mete Atamel, Developer Advocate @ Google
     @meteatamel · atamel.dev · speakerdeck.com/meteatamel · github.com/meteatamel/genai-beyond-basics
  2. AI Landscape
     Artificial Intelligence · NLP · Data Science
     Machine Learning: Unsupervised, Supervised, Reinforcement Learning
     Deep Learning: Artificial, Convolutional, Recurrent Neural Networks
     Generative AI: GAN, VAE, Transformers
     LLMs: Transformers · Image Gen: GAN, VAE
  3. Google AI Landscape
     Gemini (brand): Gemini App (previously Bard), Gemini Cloud Assist (previously Duet AI), Gemini Code Assist (previously Duet AI for developers), …
     Vertex AI: Google AI Studio (previously MakerSuite), Model Garden (Codey, Imagen, Gemma, Llama 3, Claude 3, Falcon, Vicuna, Stable Diffusion, …), Search & Conversation, Vector Search, Notebooks, Pipelines, AutoML, Gemini (model), …
     Vision, Video, TTS / STT, NL APIs
  4. Gemini vs. Gemma
     Type: Gemini is closed/proprietary; Gemma is open.
     Size: Gemini is very large; Gemma is smaller (2B & 7B versions).
     Modality: Gemini handles text, image, video, speech; Gemma is text-only.
     Languages: Gemini supports 39 languages; Gemma is English-only.
     Function calling: Gemini ✅; Gemma ❌.
     Context window: Gemini 32K for 1.0 Pro (8K output max), 1M+ for 1.5 Pro; Gemma 8K tokens (input + output).
     Performance: Gemini is state-of-the-art among large models, high quality out-of-the-box; Gemma is state-of-the-art in its class but can require fine-tuning.
     Use cases: Gemini for enterprise, scale, SLOs, model updates, etc.; Gemma for experimentation, research, education; can run locally, privacy.
     Pricing & management: Gemini is a fully managed API, pay per character; Gemma you manage yourself and pay for your own hardware & hosting.
     Customization: Gemini through managed tuning (supervised, RLHF, distillation); Gemma by programmatically modifying the underlying weights.
  5. ⚠ LLMs require pre- and post-processing → 💡 LLM frameworks
     LangChain is the most popular one; Semantic Kernel, AutoGen and others are alternatives.
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/langchain
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/semantic-kernel
  6. ⚠ LLMs hallucinate → 💡 Grounding (the easy way)
     Grounding with Google Search for public data
     Grounding with Vertex AI Search for private data
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/google-search
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/vertexai-search
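
     A minimal sketch of grounding a Gemini call with Google Search through the Vertex AI Python SDK; the exact module paths can differ by SDK version, and the project/location values are placeholders.

         # Hypothetical grounding-with-Google-Search example (Vertex AI SDK)
         import vertexai
         from vertexai.generative_models import GenerativeModel, Tool, grounding

         vertexai.init(project="my-project", location="us-central1")  # placeholder project/region

         search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())
         model = GenerativeModel("gemini-1.5-pro")

         response = model.generate_content(
             "Who won the most recent Champions League final?",
             tools=[search_tool],  # answers get grounded in (and cite) web results
         )
         print(response.text)
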
  7. ⚠ LLMs hallucinate → 💡 Grounding (RAG)
     At some point, you’ll need Retrieval-Augmented Generation (RAG) to ground on your own private data and for more control.
  8. RAG
     ❶ INGESTION: split DOCS into chunks → calculate vector embeddings → store vector + chunk in the Vector DB
     ❷ QUERYING: calculate the prompt’s vector embedding → find similar chunks in the Vector DB → send context + prompt + chunks to the LLM → answer goes back to the chatbot app
  9. RAG gets complicated
     • How to parse & chunk docs?
     • What embedding model to use?
     • What vector database to use?
     • How to retrieve similar docs and add them to the prompt?
     • What about images?
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/rag-pdf-langchain-firestore
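
     A minimal end-to-end RAG sketch answering the questions above with one concrete set of choices: LangChain for orchestration, Vertex AI embeddings, and an in-memory FAISS index. The repo sample uses PDFs and Firestore instead, so every component here is an illustrative assumption.

         # Hypothetical RAG pipeline: ingest (split, embed, store) then query (retrieve, augment, generate)
         from langchain_community.vectorstores import FAISS             # needs faiss-cpu installed
         from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings
         from langchain_text_splitters import RecursiveCharacterTextSplitter

         docs = ["...your private documents as plain text..."]          # placeholder corpus

         # Ingestion: split docs into chunks, calculate embeddings, store vector + chunk
         chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).create_documents(docs)
         store = FAISS.from_documents(chunks, VertexAIEmbeddings(model_name="text-embedding-004"))

         # Querying: embed the prompt, find similar chunks, send context + prompt + chunks to the LLM
         question = "What does the contract say about termination?"
         context = "\n".join(d.page_content for d in store.similarity_search(question, k=3))
         answer = ChatVertexAI(model_name="gemini-1.5-pro").invoke(
             f"Answer using only this context:\n{context}\n\nQuestion: {question}"
         )
         print(answer.content)
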
  10. ⚠ LLMs rely on outdated public data → 💡 Function calling
      Function calling: augment LLMs with external APIs for more real-time data.
  11. Function calling
      User: “What’s the weather like in Zadar?”
      ❶ The chatbot app sends the user prompt + the getWeather(String) function contract to Gemini.
      ❷ Gemini replies: call getWeather("Zadar") for me please.
      ❸ The app calls the external API or service: getWeather("Zadar") → {"forecast":"sunny"}.
      ❹ The app sends the function response {"forecast":"sunny"} back to Gemini.
      ❺ Gemini answers: “It’s sunny in Zadar!”
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/function-calling/weather
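
      A minimal sketch of the flow above with the Vertex AI SDK; the getWeather contract and the canned forecast stand in for the repo's weather sample and a real external API.

          # Hypothetical function-calling round trip with Gemini
          from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Part, Tool

          get_weather = FunctionDeclaration(
              name="getWeather",
              description="Get the current weather for a city",
              parameters={"type": "object", "properties": {"city": {"type": "string"}}},
          )
          model = GenerativeModel("gemini-1.5-pro", tools=[Tool(function_declarations=[get_weather])])

          chat = model.start_chat()
          response = chat.send_message("What's the weather like in Zadar?")
          call = response.candidates[0].content.parts[0].function_call  # Gemini asks us to call getWeather("Zadar")

          forecast = {"forecast": "sunny"}                              # pretend we called the external API
          final = chat.send_message(Part.from_function_response(name=call.name, response=forecast))
          print(final.text)                                            # "It's sunny in Zadar!"
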
  12. ⚠ LLM outputs can be chaotic → 💡 Response type and schema
      LLMs now support a response type (JSON) and response schemas to better control the output format.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/controlled-generation
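
      A minimal controlled-generation sketch: request JSON that matches a schema. The schema and prompt are made up, and response_schema support depends on the model and SDK version.

          # Hypothetical response type + schema example (Vertex AI SDK)
          from vertexai.generative_models import GenerationConfig, GenerativeModel

          schema = {
              "type": "object",
              "properties": {"name": {"type": "string"}, "population": {"type": "integer"}},
              "required": ["name", "population"],
          }
          response = GenerativeModel("gemini-1.5-pro").generate_content(
              "Give me the largest city in Croatia.",
              generation_config=GenerationConfig(
                  response_mime_type="application/json",  # force JSON output
                  response_schema=schema,                 # and make it match the schema
              ),
          )
          print(response.text)  # JSON conforming to the schema instead of free-form prose
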
  13. ⚠ LLM inputs can get expensive → 💡 Context caching
      Reduce costs (though not necessarily latency) when a large context is referenced repeatedly by shorter requests.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/context-caching
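
      A minimal context-caching sketch; it assumes the preview caching module of the Vertex AI SDK, and the bucket path, model version and TTL are placeholders. Cached content also has to meet a minimum token count.

          # Hypothetical context-caching example: cache a large document once, reuse it across requests
          import datetime
          from vertexai.preview import caching
          from vertexai.preview.generative_models import GenerativeModel, Part

          cache = caching.CachedContent.create(
              model_name="gemini-1.5-pro-001",
              contents=[Part.from_uri("gs://my-bucket/big-manual.pdf", mime_type="application/pdf")],
              ttl=datetime.timedelta(hours=1),
          )

          # Shorter follow-up prompts reference the cached context instead of resending (and re-paying for) it
          model = GenerativeModel.from_cached_content(cached_content=cache)
          print(model.generate_content("Summarize chapter 3.").text)
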
  14. ⚠ LLM outputs are hard to measure → 💡 Evaluation frameworks
      DeepEval is an open-source evaluation tool; Vertex AI offers rapid evaluation and AutoSxS evaluation.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/evaluation/deepeval
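
      A minimal DeepEval sketch scoring answer relevancy; the test case values are invented, and the metric needs an LLM judge configured (OpenAI by default, or a custom model).

          # Hypothetical DeepEval evaluation of a single LLM answer
          from deepeval import evaluate
          from deepeval.metrics import AnswerRelevancyMetric
          from deepeval.test_case import LLMTestCase

          test_case = LLMTestCase(
              input="What's the weather like in Zadar?",
              actual_output="It's sunny in Zadar!",
          )
          evaluate(test_cases=[test_case], metrics=[AnswerRelevancyMetric(threshold=0.7)])
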
  15. ⚠ And there’s more…
      LLMs change all the time, jailbreaking, PII data in prompts and responses, involving users in tuning LLMs, etc.
  16. 📋 Summary
      • LLM frameworks to orchestrate LLM calls
      • Grounding and function calling for private and real-time data
      • Response type and schemas to structure outputs
      • Context caching to optimize costs
      • Evaluation frameworks to evaluate LLM outputs
  17. Thank you!
      Mete Atamel, Developer Advocate at Google
      @meteatamel · atamel.dev · speakerdeck.com/meteatamel · github.com/meteatamel/genai-beyond-basics