Advanced RAG Pipelines: Engineering Scalable Retrieval Systems for Enterprise AI

NVIDIA GTC 2025
by Meriem Bendris & Bilge Yücel

Retrieval-augmented generation (RAG) systems combine the reasoning capabilities of large language models with information retrieval: they search a large corpus for relevant content and use it to generate informed, accurate responses. Learn how to build advanced, custom RAG applications. You'll explore the design of an end-to-end RAG system, including data preparation, indexing, retrieval, and response generation. Then, we'll show how you can use Haystack pipelines with multiple NVIDIA NIMs, such as the LLM, text embedding, and reranking microservices, to self-host at scale in a Kubernetes production environment. Finally, we'll discuss AI model evaluation and customization for specific RAG tasks.

Bilge Yücel

March 21, 2025

Transcript

  1. Today's Presenters

    Bilge Yücel, Haystack Developer Relations Engineer at deepset
    Meriem Bendris, Senior Solution Architect AI at NVIDIA
  2. Agenda

    01 deepset & Haystack
    02 NVIDIA NIM
    03 Advanced RAG using NVIDIA NIM in deepset Studio
    04 Deploying Haystack Pipelines with NVIDIA NIM on Kubernetes
    05 Conclusion
  3. deepset: Solving Custom AI challenges since 2018

    2010s: Early Advancements. Foundational NLP technology development
    2017: Transformer Models. Google BERT, "Attention is all you need"
    2019-2021: Early LLMs. GPT2, GPT3, LaMDA
    Nov 2022: Inflection Point. GPT3.5 & ChatGPT, major public awareness
    2024-beyond: Broad Ecosystem of AI Tools. Business apps and agents running on compound AI systems
    • Flexible AI Orchestration
    • Variety of LLMs
    • Multimodal, Agentic systems
    • Variety of deployments
    Studio 2.0 on-prem Our Community
  4. AI Architectures in the Enterprise

    Retrieval Augmented Generation (RAG): Retrieve relevant documents, use them to inform responses, and generate accurate and contextually rich answers to complex queries.
    Intelligent AI Agents: Automate and streamline tasks, workflows, and insight generation with a compound AI system capable of complex reasoning and decision making.
    Text-to-SQL / Conversational BI: Transform natural language queries into SQL commands, ask questions of complex datasets, and make data analysis more accessible & intuitive.
    Intelligent Document Processing: Process documents at scale, accelerate insight extraction, and boost workflow efficiency with AI-powered automation.
    Semantic Search: Understand and retrieve information based on semantic similarity, deliver highly accurate and relevant search results, and provide an advanced recommendation service.
    Multimodal: Integrate and process multiple forms of data including text, PDFs, images, audio, and video for an enriched user experience.
  5. deepset AI Platform (Studio, Enterprise Editions)

    Orchestration Tools: Build, Test, Deploy, Monitor
    Framework: Components, Pipelines, Solutions
    Flexible Architecture: Templates (e.g., Agent, RAG, GraphRAG, Multimodal, Search, IDP, Text2SQL)
    Open Ecosystem: Any Data, LLMs, and Integrations
    deepset: Delivering Custom Enterprise-Grade Gen AI
    Haystack: Open Source LLM Orchestration
    AI Tools: VectorDBs, Embedding Models, LLMs, Evaluation, Observability
  6. Open-source LLM orchestration framework

    Provides the tools that Python developers need to build real-world, advanced AI systems
    Quickly combine models, data, and other tools for custom Gen AI
    Building blocks = Components & Pipelines
    [Diagram: Component, Component, Pipeline]
    pip install haystack-ai
  7. Advanced RAG Pipeline

    Hybrid Retrieval: Query → BM25 Retrieval + Embedding Retrieval → Join documents → Reranking
    Generation: Prompt Builder → LLM → Conditional Router → Answer
    Fallback to web: Web Search → Prompt Builder → LLM → Answer
    Use a fallback branch
    • Alternative data sources
    • Error handling
  8. deepset Studio

    Development Environment for Haystack
    • Drag, drop, and construct Haystack pipelines
    • Ready-made pipelines
    • Bring your own files or connect to your database
    • Deploy on Studio or export pipelines
    • Free and open to everyone
  9. Deploying Pipelines as REST APIs with Hayhooks

    • Hayhooks: tool to deploy and serve Haystack pipelines as REST APIs
    • Pipeline → Endpoint
  10. Reference Architecture for Enterprise AI

    AI Orchestration: Build, Test and Monitor Your Agents and Applications
    Vector Database (Dependent on Customer): Any Database
    Inference Service: NVIDIA NIM Microservice for LLM Models / Embedders / Rerankers
    NVIDIA Accelerated Computing: Network Infrastructure, Compute Infrastructure, Storage Infrastructure
    Infrastructure Management: NVIDIA NIM Operator, NVIDIA GPU Operator, Kubernetes Workers, Container Runtime, Enterprise Linux
  11. Thank You

    Visit our booth #2111
    👉 Get all resources
    Learn more: deepset.ai, haystack.deepset.ai
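
The control flow of slide 7's Advanced RAG Pipeline (hybrid retrieval, joining documents, then a conditional fallback to web search) can be sketched in plain Python. This is an illustrative sketch only and does not use Haystack's API. The slides do not say how the two retrievers' results are joined, so reciprocal rank fusion is swapped in here as one common choice. The `no_answer` sentinel, the function names, and the stub retrievers and LLM are all hypothetical stand-ins, and the reranking step is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical sentinel: we assume the LLM is prompted to reply with this
# string when the retrieved documents do not contain the answer.
NO_ANSWER = "no_answer"


def reciprocal_rank_fusion(rankings, k=60):
    """Join ranked lists of document ids into one list (best first).

    Each document scores 1 / (k + rank) per list it appears in;
    k=60 is a conventional default, not a value taken from the talk.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def answer_with_fallback(query, retrievers, generate, web_search):
    """Hybrid retrieval, generation, then a routed fallback to web search."""
    fused = reciprocal_rank_fusion([retrieve(query) for retrieve in retrievers])
    reply = generate(query, fused)
    if reply.strip().lower() == NO_ANSWER:
        # Fallback branch: local documents were not enough, so try the web.
        reply = generate(query, web_search(query))
        return {"answer": reply, "source": "web"}
    return {"answer": reply, "source": "documents"}


# Stubs standing in for BM25 retrieval, embedding retrieval, web search,
# and the LLM. A real system would call actual services here.
def bm25_stub(query):
    return ["d1", "d2", "d3"]


def dense_stub(query):
    return ["d2", "d4", "d1"]


def web_stub(query):
    return ["w1"]


def llm_stub(query, context):
    # Pretend the model can answer only when it is given some context.
    return f"answer from {context[0]}" if context else NO_ANSWER


if __name__ == "__main__":
    print(reciprocal_rank_fusion([bm25_stub("q"), dense_stub("q")]))
    print(answer_with_fallback("q", [bm25_stub, dense_stub], llm_stub, web_stub))
    # An empty retriever forces the fallback branch.
    print(answer_with_fallback("q", [lambda q: []], llm_stub, web_stub))
```

Keeping the routing decision in one place mirrors the Conditional Router on the slide: the same generation step is reused for both branches, and only the source of the context changes.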