Upgrade to Pro — share decks privately, control downloads, hide ads and more …

From Zero to Production: Build Your Own GenAI S...

From Zero to Production: Build Your Own GenAI Solution.

Thanks to powerful frameworks and libraries, the first Generative AI applications can be realized at Hello World level with just a few lines of code. However, these first attempts also reach their limits just as quickly. Why? Because reality presents challenges that cannot be solved easily with this trivial approach. But what is needed for a Generative AI application at enterprise level? A well designed Generative AI architecture!
During this full day hands-on workshop, we will build a complex Generative AI application step by step. Starting with a minimalist RAG system, we will look at various challenges and discuss suitable solutions. In the end, we will have an architecture that can withstand the challenges of reality.

Avatar for Lars Roewekamp

Lars Roewekamp PRO

November 22, 2024
Tweet

More Decks by Lars Roewekamp

Other Decks in Technology

Transcript

  1. Generative AI From Zero to Production: Build your own GenAI

    solution #WISSENTEILEN powered by Lars Röwekamp | Tim Wüllner | open knowledge GmbH Hands-on Workshop
  2. The Idea: The question and answer are SEMANTICALY very similar.

    Sematic proximity is represented MATHEMATICALLY with the aid of vectors. How does an LLM generates a response?
  3. Dog Cat Tiger Mouse Hamburg Berlin Madrid Amazon Embedding Model

    0.6 0.3 0.1 … 0.8 0.5 0.3 … 0.4 0.2 0.9 … Embeddings (semantic vectors)
  4. Embedding Model 0.6 0.3 0.1 … 0.8 0.5 0.3 …

    0.4 0.2 0.9 … Berlin: City, Europe, Capital of Germany, … Dog Cat Tiger Mouse Hamburg Berlin Madrid Amazon Embeddings (semantic vectors)
  5. Dog Cat Mouse Tiger Hamburg Berlin Madrid Amazon Embedding Model

    0.6 0.3 0.1 … 0.8 0.5 0.3 … 0.4 0.2 0.9 … Berlin: City, Europe, Capital of Germany, … Dog Cat Tiger Mouse Hamburg Berlin Madrid Amazon
  6. Dog Cat Mouse Tiger Hamburg Berlin Madrid Amazon Embedding Model

    0.6 0.3 0.1 … 0.8 0.5 0.3 … 0.4 0.2 0.9 … Dog Cat Tiger Mouse Hamburg Berlin Madrid Amazon
  7. Dog Cat Mouse Tiger Hamburg Berlin Madrid Amazon Embedding Model

    0.6 0.3 0.1 … 0.8 0.5 0.3 … 0.4 0.2 0.9 … Dog Cat Tiger Mouse Hamburg Berlin Madrid Amazon
  8. Tiger Hamburg Berlin Madrid Amazon „Can you name a few

    European capitals?“ Embedding Model 0.8 0.5 0.4 … QUESTION: „Can you name ..?“: City, Europe, Capital of … Dog Cat Mouse
  9. Tiger Hamburg Berlin Madrid Amazon „Can you name a few

    European capitals?“ Embedding Model 0.8 0.5 0.4 … QUESTION: „Can you name ..?“: City, Europe, Capital of … Dog Cat Mouse
  10. Tiger Hamburg Berlin Madrid Amazon „Can you name a few

    European capitals?“ Embedding Model 0.8 0.5 0.4 … QUESTION: „Can you name ..?“: City, Europe, Capital of … Our goal? Dog Cat Mouse
  11. Hamburg Berlin Madrid Amazon Our goal? Do everything possible to

    ensure that the ‚question‘ and the ‚answer‘ ars as close together as possible!
  12. man woman king queen queen – woman + man =

    king Okay, but what‘s so great about that? We can calculate with it and ‚understand‘.
  13. man woman king queen doctor – man + woman =

    ? Okay, but what‘s so great about that? We can calculate with it and ‚understand‘.
  14. man woman king queen doctor – man + woman =

    nurse Okay, but what‘s so great about that? We can calculate with it and ‚understand‘.
  15. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? word2vec style vectors
  16. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? John wants his bank to cash the … ? (verb) (verb) context aka hidden state
  17. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? John wants his bank to cash the … ? (verb) (verb) context aka hidden state Key-Vector: btw I‘am a noun describing a male person. Query-Vector: btw I‘am seeking for a noun describing a male person.
  18. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? Transformer (LLM Layer) context aka hidden state John wants his bank to cash the … ? (verb) (verb) (John‘s) (finance) John wants his bank to cash the … ? (verb) (verb) (male) ( ... )
  19. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? John (main character, male, married to Cheryl, cousin of Donald, from Minnesota, currently in Boise, … ) wants his bank to cash the … ?
  20. Generative AI under the hood Transformer (LLM Layer) John wants

    his bank to cash the … ? Transformer (LLM Layer) John wants his bank to cash the … (verb) (verb) (John‘s) (finance) John wants his bank to cash the … ? (verb) (verb) (male) Enough context information to be able to „guess“ the next word. cheque ( ... )
  21. Generative AI under the Hood Large Language Model John wants

    his bank to … Large Language Model Next Token increase cash close … Probability 0.2 0.4 0.1 … Next Token account cheque money order … Probability 0.15 0.75 0.05 … John wants his bank to cash the …
  22. Health Care & Pharmaceuticals Advertisment & Marketing Media & Entertainment

    Production & Manufactoring Financial Services Software Development Generative AI am Beispiel
  23. Generative AI by Example Health Care Enhancing medical images Discovering

    new drugs* Simplifying medical tasks Persionalized treatment *via Generative Design
  24. Model A-small Model Provider model-name model-parameter provider-parameter prompt Model A-big

    Model B e.g. openAI, Google, … Provider Client API GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Reminder: 00-hello-genai.ipynb
  25. Hands-On Create a Jupyter Notebook in the Google Colab environment.

    Calling a GenAI model with a prompt via a provider-specific API. Hello IPYNB*, hello GenAI *Python Notebook aka Jupyter Notebook
  26. GenAI- Model Prompt GenAI Basics Model Selection Prompt Engineering 2

    1 „What is the most beautiful holiday destination?“
  27. GenAI- Model Prompt GenAI Basics Model Selection Prompt Engineering 2

    Model Selection „What is the most beautiful holiday destination?“ Welches Model? Welche Parameter? Welche Infrastruktur? Welche Lizenz? Welches Preismodell? Welche Governance? 1
  28. GenAI Basics Model Selection „In the world of GenAI, model

    selection is part of the product design. It affects user experience, cost, scalability, and reliability.“ „While we might not pick the perfect model on the first try, starting with clear priorities like speed, cost, accuracy helps guide the process.“
  29. Step 1 Step 2 Step 3 Step 4 GenAI Basics

    Model Selection aka „How do I find* the right model?“
  30. Define specific GenAI use cases Clearly outline the purpose and

    application of GenAl. 1. Use Case Step 2 Step 3 Step 4 GenAI Basics Model Selection aka „How do I find* the right model?“
  31. Define specific GenAI use cases Create a shortlist of foundation

    models Models that meet the business and technical requirements. 1. Use Case 2. Shortlist Step 3 Step 4 GenAI Basics Model Selection aka „How do I find* the right model?“ Clearly outline the purpose and application of GenAl.
  32. Define specific GenAI use cases Create a shortlist of foundation

    models Evaluate and test selected models Models that meet the business and technical requirements. Conduct thorough testing to evaluate model performance in your own context. 1. Use Case 2. Shortlist 3. Evaluation Step 4 GenAI Basics Model Selection aka „How do I find* the right model?“ Clearly outline the purpose and application of GenAl.
  33. Define specific GenAI use cases Create a shortlist of foundation

    models Evaluate and test selected models Integrate the best models and evaluate them on an ongoing basis Models that meet the business and technical requirements. Conduct thorough testing to evaluate model performance in your own context. Integrate, regularly evaluate and, if necessary, optimise the performance of the model.. 1. Use Case 2. Shortlist 3. Evaluation 4. Integration GenAI Basics Model Selection aka „How do I find* the right model?“ Clearly outline the purpose and application of GenAl.
  34. GenAI Basics Model Selection aka „How do I find* the

    right model?“ „Know Your Use-Case!“ *wenn du am Ende erfolgreich sein möchtest.
  35. GenAI- Model Prompt GenAI Basics Model Integration Prompt Engineering 2

    Model Selection 1 Suitable model found: But how do I address it from my application? „What is the most beautiful holiday destination?“
  36. Model A-small Model Provider model-name model-parameter provider-parameter prompt Model A-big

    Model B e.g. openAI, Google, … Provider Client API GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Reminder: 00-hello-genai.ipynb
  37. Model X mini Model Provider II model-name model-parameter provider-parameter prompt

    Model Y Model Y ++ e.g. openAI, Google, … Provider Client API GenAI Basics Model Integration „What is the most beautiful holiday destination?“
  38. Model X mini Model Provider II Provider II Client API

    model-name model-parameter provider-parameter prompt Model Y Model Y ++ e.g. openAI, Google, … GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Die openAI API ist aktuell der de-facto Standard.
  39. Hub-Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. Hugging-Face Provider-II

    hosted Models Provider-I hosted Models Hub-hosted Models* Hub-Provider Dispatcher *Provider X, fine-tuned or adapted, own GenAI Basics Model Integration „What is the most beautiful holiday destination?“ Cloud based Resources
  40. Hub-Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. Hugging-Face Provider-II

    hosted Models Provider-I hosted Models Hub-hosted Models* Hub-Provider Dispatcher *Provider X, fine-tuned or adapted, own GenAI Basics Model Integration „What is the most beautiful holiday destination?“ cloud-not-allowed exception
  41. Local-Hub Provider Client API model-name model-parameter hub-provider-parameter prompt e.g. GPT4all,

    ollama Local hosted Models (Provider III) Local hosted Models (Provider II) Local hosted Models (own or adaptedI) Local-Hub Provider Dispatcher Local Machine GenAI Basics Model Integration „What is the most beautiful holiday destination?“
  42. GenAI Basics Model Adaption „ What additional levers do I

    have?“ • Temperature degree of ‘fantasy‘ • Max Tokens length of answer • Top K selection of hits from the top K hits • Top P selection of hits from the top P per cent • Presence Penalty avoid repetition • Frequence Penalty avoid repetition (weighted edition)
  43. GenAI- Model Prompt Model Selection Prompt Engineering 2 1 GenAI

    Basics Prompt Engineering „What is the most beautiful holiday destination?“
  44. GenAI- Model Prompt Model Selection Prompt Engineering 1 Parts of

    a Prompt System vs User Prompt Principles of Prompting Patterns for Prompting Context Window Query vs Chat GenAI Basics Prompt Engineering 2 „What is the most beautiful holiday destination?“
  45. GenAI Basics Prompt Engineering Role Instruction Example 1 Context Example

    n Question Who am I? What is my intention? What are helpful examples? Are there any additional information? BTW: what is the task I ask for? What should a good prompt should look like?
  46. GenAI Basics Prompt Engineering Role Instruction Example 1 Context Example

    n Question What is your role as an assistant? What is my intention? What are helpful examples? Are there any additional information? BTW: what is the task I ask for? What should a good prompt should look like?
  47. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to cook something. LR Topic: Recipe Recommendations
  48. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to cook something. LR You Acting as an expert home cook, for someone who enjoys vegetarian Italian food and has only 30 minutes to prepare dinner, could you recommend a recipe including a list of ingredients and step-by- step instructions? LR Output Example Context Question Role Acting as an expert home cook, for someone who enjoys vegetarian Italian food and has only 30 minutes to prepare dinner, could you recommend a recipe including a list of ingredients and step-by-step instructions? You could point at recipes you like from the BBC’s Good Food guide, providing URL’s to recipes you love. Topic: Recipe Recommendations
  49. GenAI Basics Good Prompt vs. Bad Prompt You I want

    to learn something new. LR You Acting as a coding instructor, for a beginner with a goal to learn Python within 4 weeks, please provide a learning plan including resources and a weekly schedule for 10 hours per week. LR Output Example Context Question Role Acting as a coding instructor, for a beginner with a goal to learn Python within 4 weeks, Please provide a learning plan, including resources and a weekly schedule for 10 hours per week Point it at courses you’ve done in the past that you’ve liked. Tell it you like to learn by reading books, or websites, or by watching videos, or a mixture of both. Topic: Learning new skills
  50. GenAI Basics System Prompt vs. User Prompt System Assistant User

    „You“ Chat-Model Behaviour of Assistant
  51. GenAI Basics System Prompt vs. User Prompt System Assistant User

    ‘You're a friendly assistant.’ „You“ Chat-Model Behaviour of Assistant
  52. GenAI Basics System Prompt vs. User Prompt System Assistant User

    ‘You're a friendly assistant and always keep things brief.’ ‚Which is the most beautiful city in the world?‘ ‚Which is the most beautiful city in the world?‘ ‘You are a philosopher and have a tendency to be verbose.’ ??? ??? „You“ Chat-Model Behaviour of Assistant
  53. GenAI Basics System Prompt vs. User Prompt System-Prompt: ‘You're a

    friendly assistant and always keep things brief.’ Assistant: 65 Tokens
  54. GenAI Basics System Prompt vs. User Prompt System-Prompt: ‘You are

    a philosopher and have a tendency to be verbose.’ Assistant: 434 Tokens
  55. GenAI Basics Patterns for Prompting Pros & Cons Consideration Stepwise

    Interaction Flipped Interaction Discussion of Experts I‘am planing to travel to China. I‘am planing a trip to China. I‘am planing a trip to China. I‘am planing a trip to China.
  56. GenAI Basics Patterns for Prompting I‘am planing to travel to

    China. Please analyze the pros and cons of traveling by bike, ship or plane. Pros & Cons Consideration Stepwise Interaction Flipped Interaction Discussion of Experts I‘am planing a trip to China. I‘am planing a trip to China. I‘am planing a trip to China.
  57. GenAI Basics Patterns for Prompting I‘am planing to travel to

    China. Please analyze the pros and cons of traveling by bike, ship or plane. Pros & Cons Consideration Stepwise Interaction Flipped Interaction Discussion of Experts I‘am planing a trip to China. Walk me through the planing process step by step, explaining your reasoning at each stage. Wait for my con- firmation before moving to the next step. I‘am planing a trip to China. I‘am planing a trip to China.
  58. GenAI Basics Patterns for Prompting I‘am planing to travel to

    China. Please analyze the pros and cons of traveling by bike, ship or plane. Pros & Cons Consideration Stepwise Interaction Flipped Interaction Discussion of Experts I‘am planing a trip to China. Walk me through the planing process step by step, explaining your reasoning at each stage. Wait for my con- firmation before moving to the next step. I‘am planing a trip to China. Before providing a solution, please ask me relevant questions about the trip, so that you can give me the most appropriate planing advice.“ I‘am planing a trip to China.
  59. GenAI Basics Patterns for Prompting I‘am planing to travel to

    China. Please analyze the pros and cons of traveling by bike, ship or plane. Pros & Cons Consideration Stepwise Interaction Flipped Interaction Discussion of Experts I‘am planing a trip to China. Walk me through the planing process step by step, explaining your reasoning at each stage. Wait for my con- firmation before moving to the next step. I‘am planing a trip to China. Before providing a solution, please ask me relevant questions about the trip, so that you can give me the most appropriate planing advice.“ I‘am planing a trip to China. I am torn between sustainable and cheap travelling. Can you provide two schools or trends within the travel community that can represent the two approaches.
  60. Hands-On Prompting with a system Prompting using various prompting patterns

    and best practices. Extended prompting via chatbot using a storybook and chat history. Prompting & Models
  61. GenAI- Model Prompt GenAI Basics myDomain + „Leave regulation at

    open knowledge“ „How to apply for vacation at open knowledge?“
  62. GenAI- Model Prompt GenAI Basics myDomain + „Leave regulation at

    open knowledge“ „How to activate the alarm system at open knowledge?“
  63. GenAI- Model Prompt GenAI Basics myDomain + + „How to

    activate the alarm system at open knowledge?“ „Leave regulation + Alarm system usage at open knowledge“
  64. GenAI- Model Prompt GenAI Basics myDomain + „All internal knowledge

    (Wiki, DB, …) at open knowledge“ „How to activate the alarm system at open knowledge?“
  65. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Token Limit

    exceeded Output Size Context Window Size „How to activate the alarm system at open knowledge?“ „All internal knowledge at open knowledge“
  66. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Out-of-Budget Exception

    Context Window Size „All internal knowledge at open knowledge“ „How to activate the alarm system at open knowledge?“
  67. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Lost-in-the-Middle Context

    Window Size „All internal knowledge at open knowledge“ „How to activate the alarm system at open knowledge?“
  68. GenAI- Model Prompt GenAI Basics myDomain + ERROR: Self-fullfilling Prophecy

    Context Window Size Context Window Size Context Window Size Context Window Size „Alles interne Wissen (Wiki, DB, …) bei open knowledge“ „How to activate the alarm system at open knowledge?“
  69. GenAI- Model Prompt GenAI Advanced myDomain + ERROR: Token Limit

    exceeded! Output Size Context Window Size „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  70. myGenAI Model Prompt GenAI Advanced myDomain „All internal knowledge at

    open knowledge“ „How to apply for vacation at open knowledge?“
  71. myGenAI Model Prompt GenAI Advanced myDomain Option 1: Build own

    Model Option 2: Fine-tune existing Model ERROR: Way to expensive ERROR: Way to complex WARNING: Out-of-Sync „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  72. Prompt GenAI Advanced myDomain GenAI- Model SOME MAGIC „MAGIC Enrichment“

    „How to apply for vacation at open knowledge?“
  73. Prompt GenAI- Model GenAI Advanced myDomain SOME MAGIC „How to

    apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  74. Prompt Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever GenAI- Model

    GenAI Advanced myDomain „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  75. „How to apply for vacation at open knowledge?“ Prompt Ingesting-

    Pipeline Knowledge Database Retriever GenAI Advanced myDomain R etrieval Augmented Generation „All internal knowledge at open knowledge“
  76. Prompt Ingesting- Pipeline Knowledge Database Retriever GenAI Advanced RAG Systems

    R etrieval Augmented Generation „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  77. Prompt Ingesting- Pipeline Knowledge Database GenAI Advanced RAG Systems R

    etrieval Augmented Generation Retriever „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  78. GenAI Advanced RAG Systems GenAI Model INGESTING Pipeline (Async.) Chunking

    Embedding RETRIEVAL Pipeline (Sync.) Top-k Fetching Augmenting UI/UX (Prompt) Query- Embedding „How to apply for vacation at open knowledge?“ Retriever
  79. GenAI Advanced RAG Systems GenAI Model INGESTING Pipeline (Async.) Chunking

    Embedding RETRIEVAL Pipeline (Sync.) Top-k Fetching Augmenting UI/UX Query- Embedding It looks complicated, and it is! Except in ‘Hello World’ examples from the Internet. Retriever
  80. Chunking Embedding Ingesting Pipeline by Example Document (Doc. Id:1) „It

    will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“
  81. Chunking Embedding Ingesting Pipeline by Example [„ It will soon

    be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  82. Chunking Embedding Ingesting Pipeline by Example [-0.24, 0.15, …, 0.52]

    [-0.13, 0.95, …, 0.31] […] [„ It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  83. Chunking Embedding Ingesting Pipeline by Example Id Doc. Id Embedding

    1 1 [-0.24, 0.15, …, 0.52] 2 1 [-0.13, 0.95, …, 0.31] […] […] […] [-0.24, 0.15, …, 0.52] [-0.13, 0.95, …, 0.31] […] [„ It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype.“] [„ The added value of this technology is too great and its use in the form of a chat is too intuitive.“] Document (Doc. Id:1) „It will soon be the second anniversary of the release of ChatGPT. There is still no end in sight to the hype. The added value of this technology is too great and its use in the form of a chat is too intuitive. In 2024, it was expected that the leap from exploratory playing around with large language models [...].“ [ … ]
  84. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting ‘What is

    the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  85. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting [0.35, 0.25,

    …, 0.62] ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  86. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting Embeddings from

    knowledge base Embeddings Dimension 1 Embed. Dim. 2 Top-k=1 document(s) [0.35, 0.25, …, 0.62] ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  87. Retrieval Pipeline by Example Embedding Top-k Fetching Augmenting Embeddings from

    knowledge base Embeddings Dimension 1 Embed. Dim. 2 Top-k=1 document(s) [0.35, 0.25, …, 0.62] & „ What is the lecture […] based on the following context: ‘An upstream retrieval process to pull in relevant information from our own dynamic database can provide well-founded answers.’ ‘What is the lecture ‘The architecture for language models in practice - Retrieval Augmented Generation’ about?’
  88. Hands-On Implementing a simple RAG system using ingestion and retrieval

    pipelines. Applying the RAG system to questions from your own domain. RAG System Simple Version
  89. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Advanced

    RAG Systems „How to apply for vacation at open knowledge?“
  90. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Advanced

    RAG Systems „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  91. Retrieval Augmented Knowledge Database Retriever Generation Ingesting- Pipeline GenAI Professional

    Guardrails „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  92. Retrieval Augmented Knowledge Database Retriever Generation Guarding Output Ingesting- Pipeline

    GenAI Professional Guardrails Output Profanity Check „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  93. Retrieval Augmented Knowledge Database Retriever Generation Guarding Output Ingesting- Pipeline

    GenAI Professional Guardrails „Unfortunately, I can't answer that!“ Output Profanity Check „…. Lazy piece! …“ „How to apply for vacation at open knowledge?“ „Working like this you doesn't need a holiday. Lazy piece!“
  94. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Guarding Topic Check „Which Netflix series do you recommend?“
  95. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Guarding Topic Check „Which Netflix series do you recommend?“ „Sorry, but this does not fall within my area of responsibility.“
  96. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guardrails Output-Governance: Toxic Speech, Fact-Checking, Ethical-Guidelines, Cross-Validation, … Input-Evaluation: Prompt-Injection, Jailbreak-Attemp, Topic-Filter, PII-Cleansing, Rate-Limiting,… Guarding
  97. Hands-On Create your own guardrails for input and output to

    prevent toxic language. Implementing a simple Output Guardrail as a fact checker. RAG System Guardrails
  98. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    GenAI Professional Guarding What else is missing for a professional and productive system?
  99. Retrieval Augmented Knowledge Database Retriever Generation Input Output Ingesting- Pipeline

    RAG-Monitoring Cost- Monitoring Model- Monitoring Response-Quality- Monitoring GenAI Professional Monitoring Guarding Ragas: https://docs.ragas.io/en/stable/concepts/metrics
  100. Generated Answer Retrieved Context User Input ‚Is the answer supported

    by the context?‘ ‚Is the context found relevant to the question?‘ ‚Is the answer relevant to the question?‘ GenAI Professional Monitoring: The RAG-Triad
  101. Generated Answer Retrieved Context User Input ‚Is the given answer

    similar to the correct answer?‘ ‚Has the entire context to be found been found?‘ Ground Truth Answer Ground Truth Context ‚Is the answer supported by the context?‘ ‚Is the context found relevant to the question?‘ ‚Is the answer relevant to the question?‘ GenAI Professional Monitoring: The RAG-Triad+
  102. Retrieval Augmented Generation GenAI Professional Response Quality Monitoring https://docs.confident-ai.com/docs/metrics-llm-evals G-Eval

    Deep Acyclic Graph Answer Relevancy Faithfulness Summerization Hallucination Contextual Recall Contextual Relevancy Contextual Precision Relevant Context Generated Answer User Input
  103. Retrieval Augmented Generation User Input GenAI Professional Response Quality Monitoring

    Faithfulness Is the supported by the ? Relevant Context Generated Answer
  104. Retrieval Augmented Generation User Input GenAI Professional Response Quality Monitoring

    Does the match the ? Relevant Context Generated Answer Answer Relevancy
  105. GenAI Professional Response Quality Monitoring Step 1: Generate possible questions

    ( ) from Embeddings Dimension 1 Embed. Dim. 2 Answer Relevancy Does the match the ?
  106. GenAI Professional Response Quality Monitoring Step 1: Generate possible questions

    ( ) from Step 2: Determine the average distance from Embeddings Dimension 1 Embed. Dim. 2 Answer Relevancy Does the match the ?
  107. Best Practices What are the typical pitfalls that I will

    run into - and how do I deal with them?
  108. Retrieval Augmented Knowledge Database Retriever Generation Guarding Input Output Ingesting-

    Pipeline RAG-Monitoring Cost- Monitoring Model- Monitoring Response-Quality- Monitoring GenAI Best Practices Real Life Survival Guide
  109. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Hallucination (via Faithfulness / Answer Relevancy) GenAI Best Practices Real Life Survival Guide
  110. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide
  111. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide Chunk not found
  112. GenAI Model UI/UX Guardrails Output Ingesting-Pipeline (Async.) Chunking Embedding Retrieval-Pipeline

    (Sync.) Query- Embedding Top-k Fetching Augmenting Input Missing Data GenAI Best Practices Real Life Survival Guide Chunk not found Chunk not relevant
  113. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: RAG does not know the required data.
  114. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: RAG does not know the required data. Solution : Implement a suitable ingestor.
  115. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: Snapshot of the data is not up-to-date.
  116. Ingesting-Pipeline (Async.) Chunking Embedding Real Life RAG #1 Missing-Data Challenge

    Missing Data Problem: Snapshot of the data is not up-to-date. Solution: Interface for ‘continuous’ data sync.
  117. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found.
  118. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #1: Adapt chunk-sizes and overlapping.
  119. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #2: Choose a different splitter (e.g. Semantic-Splitter).
  120. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #3: Optimize the distance threshold or k from top-k.
  121. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #4: Selection of a more suitable embedding model.
  122. Query- Embedding Top-k Fetching Ingesting-Pipeline (Async.) Chunking Embedding Augmenting Retrieval-Pipeline

    (Sync.) Real Life RAG #2 Chunk-not-found Challenge Chunk not found Problem: Existing chunks will not be found. Solution #5: Hypothetical document embeddings (HyDE) Input HyDE
  123. Guardrails Input Output Gao et al. (2023) HyDE = Hypothetical

    Document Embeddings Real Life RAG #2 Chunk-not-found Challenge HyDE
  124. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant GenAI

    Model Query- Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.)
  125. Real Life RAG #3 Chunk-not-relevant Challenge Problem: Chunk is not

    considered to be relevant by the model. Chunk not relevant GenAI Model Query- Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.)
  126. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant Query-

    Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.) Problem: Chunk is not considered to be relevant by the model. Solution #1: Select a more suitable model for the use case. GenAI Model
  127. Real Life RAG #3 Chunk-not-relevant Challenge Chunk not relevant Query-

    Embedding Top-k Fetching Augmenting Retrieval-Pipeline (Sync.) GenAI Model Problem: Chunk is not considered to be relevant by the model. Solution #2: Add additional context to the chunk (small-to-big). Chunk
  128. GenAI Model UI/UX Guardrails Input Output Ingesting-Pipeline (Async.) Chunking Embedding

    Retrieval-Pipeline (Sync.) Query- Embedding Top-k Fetching Augmenting
  129. Hands-On Surviving the RAG jungle: Using LLM and RAG metrics

    to optimise the retrieval process. RAG System typical Pitfalls
  130. Generated Answer Retrieved Context User Input ‚Is the given answer

    similar to the correct answer?‘ ‚Has the entire context to be found been found?‘ Ground Truth Answer Ground Truth Context ‚Is the answer supported by the context?‘ ‚Is the context found relevant to the question?‘ ‚Is the answer relevant to the question?‘ GenAI Professional Monitoring: The RAG-Triad+
  131. 1 Provide evaluation data that includes the ground truth context.

    2 Generate responses to evaluation input and collect context. Real Life RAG Metrics in Information Retrieval
  132. 1 2 Generate responses to evaluation input and collect context.

    3 Determine whether context is included in ground truth. 1 0 Hit Rate Real Life RAG Metrics in Information Retrieval – Hit Rate Provide evaluation data that includes the ground truth context.
  133. 1 2 Generate responses to evaluation input and collect context

    based on relevance. 3 Determine the proportion of relevant documents in relation to all documents. 𝑃𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = 𝑟𝑒𝑙𝑒𝑣𝑎𝑛𝑡 𝑅𝑒𝑠𝑢𝑙𝑡𝑠 𝑡𝑜𝑡𝑎𝑙 𝑅𝑒𝑠𝑢𝑙𝑡𝑠 0.00 0.33 Real Life RAG Metrics in Information Retrieval – Precision Provide evaluation data that includes the ground truth context.
  134. 1 2 Generate responses to evaluation input and collect context

    based on relevance. … 3 Determine the rank of the first relevant document. Calculate and average the reciprocal rank (RR): 𝑅𝑅 = 1 𝑅𝑎𝑛𝑔 1 2 3 1 2 3 1 2 0 𝑀𝑅𝑅 = 1 𝑛 σ𝑖=1 𝑛 𝑅𝑅𝑖 𝑀𝑅𝑅 = 0.25 1 2 3 1 2 3 Real Life RAG Metrics in Information Retrieval – Mean Reciprocal Rank (MRR) Provide evaluation data that includes the ground truth context.
  135. Prompt Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever GenAI- Model

    GenAI Outlook What else is there to consider? „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  136. UI/UX Enterprise Integration „Comment demander des vacances chez ok?“ Retrieve

    Augment Ingesting- Pipeline Retriever GenAI- Model GenAI Outlook Multi-Language RAG Knowledge Database „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  137. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Language- Detection Translation Translation Translation GenAI Outlook Multi-Language RAG Knowledge Database „All internal knowledge at open knowledge“ „Comment demander des vacances chez ok?“
  138. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model Translation Translation Translation Translation FR_fr language not supported domain specific wording GenAI Outlook Multi-Language RAG Language- Detection „Comment demander des vacances chez ok?“ „All internal knowledge at open knowledge in multiple languages.“
  139. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Access Control „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  140. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Access Control „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  141. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Token Knowledge Database Input Guardrail Some kind of Attack? Prompt-Injection? Prompt-Abuse? Rate-Limit? GenAI Outlook Access Control „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  142. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database with

    AC Meta Data Retriever GenAI- Model RBAC via PBF Token Token Input Guardrail GenAI Outlook Access Control RBAC = Role based Access Control PBF = Permission based Filtering „What are the top secret KPIs of open knowledge?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  143. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database with

    AC Meta Data Retriever GenAI- Model RBAC via PBF Token Token Input Guardrail Output Guardrail PII: Personally identifiable information? GenAI Outlook Access Control „What does the personal data of X look like at ok?“ „All internal knowledge at open knowledge“ incl C-Level information.“
  144. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Retriever GenAI- Model

    Knowledge Database GenAI Outlook Integration „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  145. GenAI Outlook Integration GenAI based Service A P I ML

    as a Service Input Output UI/UX Enterprise Integration „All internal knowledge at open knowledge“ „How to apply for vacation at open knowledge?“
  146. Hands-On Provision of a GenAI-based RAG system as a service.

    Integration of the service into a proprietary web-based interface. RAG System GenAI as a Service
  147. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model GenAI Outlook Multi-Agenten Systeme „How to apply for vacation at open knowledge?“ „All internal knowledge at open knowledge“
  148. UI/UX Enterprise Integration Retrieve Augment Ingesting- Pipeline Knowledge Database Retriever

    GenAI- Model GenAI Outlook Multi-Agenten Systeme „All internal knowledge at open knowledge“ ‚Apply for my vacation from […] to [… ] at ok, please.‘
  149. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    Agentic AI ‚Apply for my vacation from […] to [… ] at ok, please.‘
  150. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    ‚You are an assistant for employees. Perform the following steps to complete the tasks: [...]‘ Agentic AI ‚Apply for my vacation from […] to [… ] at ok, please.‘
  151. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    I should first retrieve past interactions with the user and user information. Agentic AI ‚You are an assistant for employees. Perform the following steps to complete the tasks: [...]‘ ‚Apply for my vacation from […] to [… ] at ok, please.‘
  152. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    User : {Name: „Tim Wüllner“, ID: „123“} […] Agentic AI ‚Apply for my vacation from […] to [… ] at ok, please.‘
  153. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    User : {Name: „Tim Wüllner“, ID: „123“} […] I should now obtain information on how to apply for leave at open knowledge. Agentic AI ‚Apply for my vacation from […] to [… ] at ok, please.‘
  154. Retriever Memory Tools Planning Database Prompting GenAI- Model API-Call UI/UX

    ‚You can apply for holiday leave at open knowledge via the BCS system. Leave is applied for as follows […]. Agentic AI ‚Apply for my vacation from […] to [… ] at ok, please.‘
  155. Agentic AI Retriever Memory Tools Planning Database Prompting GenAI- Model

    API-Call UI/UX I should now use the proposed tool(s) to apply for the holiday for the user as described. ‚You can apply for holiday leave at open knowledge via the BCS system. Leave is applied for as follows […]. ‚Apply for my vacation from […] to [… ] at ok, please.‘
  156. Agentic AI Retriever Memory Tools Planning Database Prompting GenAI- Model

    API-Call UI/UX requestVacation(userId=„123“,from=„x“,to=„y“) ‚Apply for my vacation from […] to [… ] at ok, please.‘
  157. Agentic AI Retriever Memory Tools Planning Database Prompting GenAI- Model

    API-Call UI/UX requestVacation(userId=„123“,from=„x“,to=„y“) I should now inform the user about the holiday request. ‚Apply for my vacation from […] to [… ] at ok, please.‘
  158. Agentic AI Retriever Memory Tools Planning Database Prompting GenAI- Model

    API-Call UI/UX ‚Apply for my vacation from […] to [… ] at ok, please.‘ „I applied for … .“
  159. UI/UX Spezialist für Urlaub Spezialist für User- Interaktion Spezialist für

    Coding Spezialist für Social Media ‚Create a proposal for a contribution on the topic of AI agents.‘ ‚Why does this code result in a NullPointerException ?‘ Agentic AI Multi Agent Systems
  160. GenAI Model UI/UX Guardrails Input Output Ingesting-Pipeline (Async.) Chunking Embedding

    Retrieval-Pipeline (Sync.) Top-k Fetching Augmenting Query- Embedding
  161. GenAI from prototype to production Your personal „takeaways“: • GenAI

    is powerful. But also expensive. • Professional prompting is your super power. • Every model has its own character. • RAG for your own domain knowledge. • GenAI is also just software. Your use case determines the right path!
  162. #WISSENTEILEN #WISSENTEILEN IMAGE REFERENCE Folie 21: © photoplotnikov - istockphoto.com

    Folie 23: © Mix und Match Studios - shutterstock.com Folie 23: © Mix und Match Studios - shutterstock.com All other pictures, drawings and icons originate from • pexels.com, • pixabay.com, • unsplash.com, • flaticon.com or are created by my own.