Upgrade to Pro — share decks privately, control downloads, hide ads and more …

.NET Day Franken 2025: Pragmatische Gen-AI-Anwe...

.NET Day Franken 2025: Pragmatische Gen-AI-Anwendungen: Top Patterns & Lösungen für nahtlose LLM-Integration

Generative Al jenseits des Buzzword-Bingos. In diesem Vortrag präsentiert Christian Weyer konkrete Patterns & Lösungen für die Integration von Large Language Models (LLMs) wie GPT, Mistral, Claude oder Llama in eigene Software-Architekturen.
Wichtige Themen wie Semantic Routing, RAG, Structured Output oder Observability werden mit Code-Beispielen illustriert. Es erwartet Entwickler und Architekten ein pragmatischer Einblick zur möglichen Umsetzung von Natural Language User Interfaces in eigenen Projekten.

Avatar for Christian Weyer

Christian Weyer

May 10, 2025
Tweet

More Decks by Christian Weyer

Other Decks in Programming

Transcript

  1. § Technology catalyst § AI-powered solutions § Pragmatic end-to-end architectures

    § Microsoft Regional Director § Microsoft MVP for AI § Google GDE for Web AI [email protected] https://www.thinktecture.com Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Christian Weyer Co-Founder & CTO @ Thinktecture AG 2
  2. Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Our

    journey 3 Models for our software Lightweight RAG Semantic Routing Observability LLM all-the-things? Structured Output / Tool Calling
  3. Language Models understand and generate semantically rich human language, transforming

    it into text or structured data for both humans and machines. ⚠ Non-deterministic: same input can lead to different outputs. Embedding Models capture semantic meaning by encoding human language into numerical vector representations, facilitating understanding, comparison, and retrieval for both humans and machines. ✅ Deterministic: same input always results in the same embedding. Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration 5 🫱 🫲 Semantic AI Generative AI
  4. § Language & embedding models part of end-to-end architectures §

    E-M enable semantic search & comparison § L-M enable human language understanding via context § System prompt § Conversation history § User query Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration API-based model integrations 7
  5. Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration “Talk

    to your Data” Cleanup & Split Text Embedding Question Text Embedding Save Query Relevant Results Question Answ er w / sources LLM Embedding Model Embedding Model 💡 Indexing / Embedding Question Answering .md, .docx, .pdf etc. “Lorem ipsum…?” 💡 Vector DB 12
  6. § Frameworks § LangChain § FastEmbed § Lightweight & efficient

    for generating text embeddings § Embedding model § jinaai/jina-embeddings-v2-base-de (local) § Vector store § PostgreSql (pgvector) vector store § LLM/SLM § Llama 3.3 70B on Cerebras (very fast) Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Technical implementation – Lightweight RAG 13
  7. § Integration is being standardized with MCP Pragmatische Gen-AI-Anwendungen Top

    Patterns & Lösungen für nahtlose LLM-Integration Structured data from unstructured input For calling APIs / tools 15 “OK, when is my colleague CW available for a two- days workshop?” System Prompt (with employee data) + Schema / Function Calling (for structured output) (Internal) Web API Availability business logic
  8. § Frameworks § Pydantic § Instructor § Methodology § Schema

    with JSON Mode (not Function Calling) § SLM/LLM § Llama 3.3 70B on Cerebras (very fast) Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Technical implementation – Structured Output 16
  9. Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Semantics-based

    decisions for user interactions Guarding (e.g. prompt injection) Routing (selecting correct target) “Lorem ipsum…?” Target RAG Target API Call Target … something else … Fine-tuned Language Model Embedding Model 18
  10. Guarding § Frameworks § llm-guard § HuggingFace Transformers § Model

    § deepset/ deberta-v3-base-injection (local) Routing § Frameworks § semantic-routing § FastEmbed § Embedding model § intfloat/ multilingual-e5-large (local) § Vector store § PostgreSql (pgvector) Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Technical implementation – Semantic Guarding & Routing 19
  11. § Methodology § Open Telemetry (OTel) § Frameworks § OTel

    Python packages § LogFire SDK § Tools § LogFire, LangFuse § Any OTel-enabled system Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Technical implementation - Observability 23
  12. Semantic routing Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose

    LLM-Integration "Talk to your systems" - for Availability queries 25 Web App / Watch App Speech-to-Text Internal Gateway (Python FastAPI) LLM / SLM Text-to-Speech Transcribe spoken text Transcribed text Check for experts availability with text Extract { experts, booking times } from text Structured JSON data (Function calling) Generate response with availability Response Response with experts availability 🔉 Speech-to-text for response Response audio Internal Business API (node.js – veeeery old) Query Availability API Availability When is CL…? CL will be…
  13. Pragmatische Gen-AI-Anwendungen Top Patterns & Lösungen für nahtlose LLM-Integration Recap:

    Top Semantic AI patterns & solutions – in end-to-end software engineering 26 Lightweight RAG Structured Output Semantic Guarding & Routing Insightful Observability 💡 Fun Fact: Large parts been built with AI-assisted Coding / Vibe Coding