
Avoid common LLM pitfalls

Mete Atamel
September 15, 2024


It’s easy to generate content with a Large Language Model (LLM), but the output often suffers from hallucinations (fabricated content), outdated information (not based on the latest data), reliance on public data only (no private data), and a lack of citations back to original sources. None of this is ideal for real-world applications. In this talk, we’ll provide a quick overview of the latest advancements in multi-modal LLMs, highlighting their capabilities and limitations. We’ll then explore various techniques to overcome common LLM pitfalls, including Retrieval-Augmented Generation (RAG) to enhance prompts with relevant data, ReAct prompting to guide LLMs in verbalizing their reasoning, Function Calling to grant LLMs access to external APIs, Grounding to link LLM outputs to verifiable information sources, and more.

Transcript

  1. Avoid common LLM pitfalls
     Mete Atamel, Developer Advocate @ Google
     @meteatamel · atamel.dev · speakerdeck.com/meteatamel · github.com/meteatamel/genai-beyond-basics
  2. AI Landscape
     Artificial Intelligence · NLP · Data Science
     Machine Learning: Unsupervised, Supervised, Reinforcement Learning
     Deep Learning: Artificial, Convolutional, Recurrent Neural Networks
     Generative AI: GAN, VAE, Transformers
     LLMs: Transformers · Image Gen: GAN, VAE
  3. Google AI Landscape
     Gemini (brand): Gemini App (previously Bard), Gemini Cloud Assist (previously Duet AI), Gemini Code Assist (previously Duet AI for developers), …
     Vertex AI: Google AI Studio (previously MakerSuite), Model Garden (Codey, Imagen, Gemma, Llama 3, Claude 3, Falcon, Vicuna, Stable Diffusion, …), Search & Conversation, Vector Search, Notebooks, Pipelines, AutoML, Gemini (model), …
     Vision, Video, TTS / STT, NL APIs
  4. Gemini vs. Gemma
     Type: Gemini is closed/proprietary; Gemma is open.
     Size: Gemini is very large; Gemma is smaller (2B & 7B versions).
     Modality: Gemini handles text, image, video, speech; Gemma is text-only.
     Languages: Gemini supports 39 languages; Gemma is English-only.
     Function calling: Gemini ✅; Gemma ❌.
     Context window: Gemini 32K for 1.0 Pro (8K output max), 1M+ for 1.5 Pro; Gemma 8K tokens (input + output).
     Performance: Gemini is state-of-the-art among large models, high quality out-of-the-box; Gemma is state-of-the-art in its class but can require fine-tuning.
     Use cases: Gemini for enterprise, scale, SLOs, model updates, etc.; Gemma for experimentation, research, education; can run locally, privacy.
     Pricing & management: Gemini is a fully managed API, pay per character; Gemma you manage yourself and pay for your own hardware & hosting.
     Customization: Gemini through managed tuning (supervised, RLHF, distillation); Gemma by programmatically modifying the underlying weights.
  5. ⚠ LLMs require pre- and post-processing → 💡 LLM frameworks
     LangChain is the most popular one; Semantic Kernel, AutoGen and others are alternatives.
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/langchain
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/frameworks/semantic-kernel
  6. ⚠ LLMs hallucinate → 💡 Grounding (the easy way)
     Grounding with Google Search for public data
     Grounding with Vertex AI Search for private data
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/google-search
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/vertexai-search
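
     A minimal sketch of grounding a Gemini call with Google Search through the Vertex AI Python SDK; the exact module paths can differ by SDK version, and the project/location values are placeholders.

         # Hypothetical grounding-with-Google-Search example (Vertex AI SDK)
         import vertexai
         from vertexai.generative_models import GenerativeModel, Tool, grounding

         vertexai.init(project="my-project", location="us-central1")  # placeholder project/region

         search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())
         model = GenerativeModel("gemini-1.5-pro")

         response = model.generate_content(
             "Who won the most recent Champions League final?",
             tools=[search_tool],  # answers get grounded in (and cite) web results
         )
         print(response.text)
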
  7. ⚠ LLMs hallucinate → 💡 Grounding (RAG)
     At some point, you’ll need Retrieval-Augmented Generation (RAG) to ground on your own private data and for more control.
  8. RAG
     ❶ INGESTION: split DOCS into chunks → calculate vector embeddings → store vector + chunk in the Vector DB
     ❷ QUERYING: calculate the prompt’s vector embedding → find similar chunks in the Vector DB → send context + prompt + chunks to the LLM → answer goes back to the chatbot app
  9. RAG gets complicated
     • How to parse & chunk docs?
     • What embedding model to use?
     • What vector database to use?
     • How to retrieve similar docs and add them to the prompt?
     • What about images?
     github.com/meteatamel/genai-beyond-basics/tree/main/samples/grounding/rag-pdf-langchain-firestore
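
     A minimal end-to-end RAG sketch answering the questions above with one concrete set of choices: LangChain for orchestration, Vertex AI embeddings, and an in-memory FAISS index. The repo sample uses PDFs and Firestore instead, so every component here is an illustrative assumption.

         # Hypothetical RAG pipeline: ingest (split, embed, store) then query (retrieve, augment, generate)
         from langchain_community.vectorstores import FAISS             # needs faiss-cpu installed
         from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings
         from langchain_text_splitters import RecursiveCharacterTextSplitter

         docs = ["...your private documents as plain text..."]          # placeholder corpus

         # Ingestion: split docs into chunks, calculate embeddings, store vector + chunk
         chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).create_documents(docs)
         store = FAISS.from_documents(chunks, VertexAIEmbeddings(model_name="text-embedding-004"))

         # Querying: embed the prompt, find similar chunks, send context + prompt + chunks to the LLM
         question = "What does the contract say about termination?"
         context = "\n".join(d.page_content for d in store.similarity_search(question, k=3))
         answer = ChatVertexAI(model_name="gemini-1.5-pro").invoke(
             f"Answer using only this context:\n{context}\n\nQuestion: {question}"
         )
         print(answer.content)
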
  10. ⚠ LLMs rely on outdated public data → 💡 Function calling
      Function calling: augment LLMs with external APIs for more real-time data.
  11. Function calling
      User: “What’s the weather like in Zadar?”
      ❶ The chatbot app sends the user prompt + the getWeather(String) function contract to Gemini.
      ❷ Gemini replies: call getWeather("Zadar") for me please.
      ❸ The app calls the external API or service: getWeather("Zadar") → {"forecast":"sunny"}.
      ❹ The app sends the function response {"forecast":"sunny"} back to Gemini.
      ❺ Gemini answers: “It’s sunny in Zadar!”
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/function-calling/weather
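
      A minimal sketch of the flow above with the Vertex AI SDK; the getWeather contract and the canned forecast stand in for the repo's weather sample and a real external API.

          # Hypothetical function-calling round trip with Gemini
          from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Part, Tool

          get_weather = FunctionDeclaration(
              name="getWeather",
              description="Get the current weather for a city",
              parameters={"type": "object", "properties": {"city": {"type": "string"}}},
          )
          model = GenerativeModel("gemini-1.5-pro", tools=[Tool(function_declarations=[get_weather])])

          chat = model.start_chat()
          response = chat.send_message("What's the weather like in Zadar?")
          call = response.candidates[0].content.parts[0].function_call  # Gemini asks us to call getWeather("Zadar")

          forecast = {"forecast": "sunny"}                              # pretend we called the external API
          final = chat.send_message(Part.from_function_response(name=call.name, response=forecast))
          print(final.text)                                            # "It's sunny in Zadar!"
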
  12. ⚠ LLM outputs can be chaotic → 💡 Response type and schema
      LLMs now support a response type (JSON) and response schemas to better control the output format.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/controlled-generation
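
      A minimal controlled-generation sketch: request JSON that matches a schema. The schema and prompt are made up, and response_schema support depends on the model and SDK version.

          # Hypothetical response type + schema example (Vertex AI SDK)
          from vertexai.generative_models import GenerationConfig, GenerativeModel

          schema = {
              "type": "object",
              "properties": {"name": {"type": "string"}, "population": {"type": "integer"}},
              "required": ["name", "population"],
          }
          response = GenerativeModel("gemini-1.5-pro").generate_content(
              "Give me the largest city in Croatia.",
              generation_config=GenerationConfig(
                  response_mime_type="application/json",  # force JSON output
                  response_schema=schema,                 # and make it match the schema
              ),
          )
          print(response.text)  # JSON conforming to the schema instead of free-form prose
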
  13. ⚠ LLM inputs can get expensive → 💡 Context caching
      Reduce costs (though not necessarily latency) when a large context is referenced repeatedly by shorter requests.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/context-caching
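
      A minimal context-caching sketch; it assumes the preview caching module of the Vertex AI SDK, and the bucket path, model version and TTL are placeholders. Cached content also has to meet a minimum token count.

          # Hypothetical context-caching example: cache a large document once, reuse it across requests
          import datetime
          from vertexai.preview import caching
          from vertexai.preview.generative_models import GenerativeModel, Part

          cache = caching.CachedContent.create(
              model_name="gemini-1.5-pro-001",
              contents=[Part.from_uri("gs://my-bucket/big-manual.pdf", mime_type="application/pdf")],
              ttl=datetime.timedelta(hours=1),
          )

          # Shorter follow-up prompts reference the cached context instead of resending (and re-paying for) it
          model = GenerativeModel.from_cached_content(cached_content=cache)
          print(model.generate_content("Summarize chapter 3.").text)
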
  14. ⚠ LLM outputs are hard to measure → 💡 Evaluation frameworks
      DeepEval is an open-source evaluation tool; Vertex AI offers rapid evaluation and AutoSxS evaluation.
      github.com/meteatamel/genai-beyond-basics/tree/main/samples/evaluation/deepeval
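
      A minimal DeepEval sketch scoring answer relevancy; the test case values are invented, and the metric needs an LLM judge configured (OpenAI by default, or a custom model).

          # Hypothetical DeepEval evaluation of a single LLM answer
          from deepeval import evaluate
          from deepeval.metrics import AnswerRelevancyMetric
          from deepeval.test_case import LLMTestCase

          test_case = LLMTestCase(
              input="What's the weather like in Zadar?",
              actual_output="It's sunny in Zadar!",
          )
          evaluate(test_cases=[test_case], metrics=[AnswerRelevancyMetric(threshold=0.7)])
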
  15. ⚠ And there’s more…
      LLMs change all the time, jailbreaking, PII data in prompts and responses, involving users in tuning LLMs, etc.
  16. 📋 Summary
      • LLM frameworks to orchestrate LLM calls
      • Grounding and function calling for private and real-time data
      • Response type and schemas to structure outputs
      • Context caching to optimize costs
      • Evaluation frameworks to evaluate LLM outputs
  17. Thank you!
      Mete Atamel, Developer Advocate at Google
      @meteatamel · atamel.dev · speakerdeck.com/meteatamel · github.com/meteatamel/genai-beyond-basics