Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Techorama BE 2025: Pragmatic Gen AI: Top Patter...

Techorama BE 2025: Pragmatic Gen AI: Top Patterns & Solutions for Successful LLM-Powered Applications

Generative AI and human language as first-class assets in your software—moving beyond buzzword bingo. Join Christian Weyer in this session to explore actionable patterns and solutions for seamlessly integrating language and embedding models into modern software architectures. Key topics, including Semantic Routing, Retrieval-Augmented Generation (RAG), Structured Output, and Observability, are demonstrated through harmonized, real-world examples in an end-to-end system featuring multiple services and client applications. Developers and architects will gain practical insights into bringing natural language user interfaces to life in their projects.

Avatar for Christian Weyer

Christian Weyer

May 28, 2025
Tweet

More Decks by Christian Weyer

Other Decks in Programming

Transcript

  1. Pragmatic Gen AI Top Patterns & Solutions for successful LLM-powered

    Applications Christian Weyer | Co-Founder & CTO | Thinktecture AG | [email protected] 1
  2. § Technology catalyst § AI-powered solutions § Pragmatic end-to-end architectures

    § Microsoft Regional Director § Microsoft MVP for AI § Google GDE for Web AI [email protected] https://www.thinktecture.com Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications Christian Weyer Co-Founder & CTO @ Thinktecture AG 2
  3. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Our journey 3 Models for our software Lightweight RAG Semantic Routing Observability LLM all-the-things? Structured Output / Tool Calling
  4. Language Models understand and generate semantically rich human language, transforming

    it into text or structured data for both humans and machines. ⚠ Non-deterministic: same input can lead to different outputs. Embedding Models capture semantic meaning by encoding human language into numerical vector representations, facilitating understanding, comparison, and retrieval for both humans and machines. ✅ Deterministic: same input always results in the same embedding. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications 5 🫱 🫲 Semantic AI Generative AI
  5. § Language & embedding models part of end-to-end architectures §

    E-M enable semantic search & comparison § L-M enable human language understanding via context § System prompt § Conversation history § User query Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications API-based model integrations 7
  6. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Classical applications & UIs 8 API-based data Document-based data
  7. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Language-enabled “UIs” – Talk-to-TT 9
  8. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications 10 C4 system context diagram § Various tech stacks § Docker-based distributed system
  9. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications PATTERN LIGHTWEIGHT RAG [RETRIEVAL-AUGMENTED GENERATION] 11
  10. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications “Talk to your Data” Cleanup & Split Text Embedding Question Text Embedding Save Query Relevant Results Question Answ er w / sources LLM Embedding Model Embedding Model 💡 Indexing / Embedding Question Answering .md, .docx, .pdf etc. “Lorem ipsum…?” 💡 Vector DB 12
  11. § Frameworks § LangChain § FastEmbed § Lightweight & efficient

    for generating text embeddings § Embedding model § jinaai/jina-embeddings-v2-base-de (local) – 768 dims § Vector store § PostgreSql (pgvector) vector store § LLM/SLM § Llama 3.3 70B on Cerebras (very fast) Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications Technical implementation – Lightweight RAG 13
  12. § Integration is being standardized with MCP Pragmatic Gen AI

    Top Patterns & Solutions for Successful LLM-Powered Applications Structured data from unstructured input For calling APIs / tools 15 “OK, when is my colleague CW available for a two- days workshop?” System Prompt (with employee data) + Schema / Function Calling (for structured output) (Internal) Web API Availability business logic
  13. § Frameworks § Pydantic § Instructor § Methodology § Schema

    with JSON Mode (not Function Calling) § SLM/LLM § Llama 3.3 70B on Cerebras (very fast) Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications Technical implementation – Structured Output 16
  14. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Semantics-based decisions for user interactions Guarding (e.g. prompt injection) Routing (selecting correct target) “Lorem ipsum…?” Target RAG Target API Call Target … something else … Fine-tuned Language Model Embedding Model 18
  15. Guarding § Frameworks § llm-guard § HuggingFace Transformers § Model

    § deepset/ deberta-v3-base-injection (local) Routing § Frameworks § semantic-routing § FastEmbed § Embedding model § intfloat/ multilingual-e5-large (local) – 1024 dims § Vector store § PostgreSql (pgvector) Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications Technical implementation – Semantic Guarding & Routing 19
  16. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications PATTERN / SOLUTION OBSERVABILITY 20
  17. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Things can get… overwhelming 21
  18. § Methodology § Open Telemetry (OTel) § Frameworks § OTel

    Python packages § LogFire SDK § Tools § LogFire, LangFuse § Any OTel-enabled system Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered Applications Technical implementation - Observability 23
  19. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications END-TO-END SOLUTION ILLUSTRATED 24
  20. Semantic routing Pragmatic Gen AI Top Patterns & Solutions for

    Successful LLM-Powered Applications "Talk to your systems" - for Availability queries 25 Web App / Watch App Speech-to-Text Internal Gateway (Python FastAPI) LLM / SLM Text-to-Speech Transcribe spoken text Transcribed text Check for experts availability with text Extract { experts, booking times } from text Structured JSON data (Function calling) Generate response with availability Response Response with experts availability 🔉 Speech-to-text for response Response audio Internal Business API (node.js – veeeery old) Query Availability API Availability When is CL…? CL will be…
  21. Pragmatic Gen AI Top Patterns & Solutions for Successful LLM-Powered

    Applications Recap: Top Semantic AI patterns & solutions – in end-to-end software engineering 26 Lightweight RAG Structured Output Semantic Guarding & Routing Insightful Observability 💡 Fun Fact: Large parts been built with AI-assisted Coding / Vibe Coding