ABS2024: ChatGPT Over Your Own Data by Marco Gerber & Michael Rüefli

⭐️ ChatGPT over your own data
As the demand for intelligent chatbots grows, organizations seek ways to tailor these conversational agents to their specific needs. In this session, we explore how to create a private ChatGPT instance that leverages your own data. Whether you’re building a customer support bot, an internal knowledge base, or a specialized domain-specific assistant, understanding the architecture and data requirements is crucial.
🙂 MARCO GERBER ⚡️ Senior Cloud Engineer @ scopewyse
🙂 MICHAEL RÜEFLI ⚡️ Solutions Architect @ scopewyse

Transcript

  1. Our focus

    ▪ Cloud Security: We follow the Zero Trust principle using the combined security features from Microsoft Azure and Microsoft 365.
    ▪ Cloud Platform: Microsoft Azure is our selected platform for your critical business applications, whether they are IaaS, PaaS, or microservice based.
    ▪ Data & AI: Unlock the full potential of your data with Microsoft Data and AI services. We empower your business with cutting-edge solutions.
  2. Michael Rüefli

    Partner | Solutions Architect, scopewyse GmbH
    [email protected] | www.miru.ch | @drmiru | drmiru
    About me | Tech: Azure Cloud Platform & Security, Security in focus, MCT (Microsoft Certified Trainer), Community worker
    About me | Private: Father, Husband, Skydiver, Skier
  3. Marco Gerber

    Senior Cloud Engineer, scopewyse GmbH
    [email protected] | linkedin.com/in/marcogerber | @rolebasedpotato | marcogerber.ch
    About me | Tech: Azure Architecture, Security, AI, Automation
    About me | Private: Winterthur, alpine sports, photography, travel
  4. The AI journey

    ▪ November 2021: Microsoft introduces Azure OpenAI Service
    ▪ November 30, 2022: Launch of ChatGPT
    ▪ January 2023: Microsoft announces Azure OpenAI Service is now GA
    ▪ January 2023: OpenAI and Microsoft further extend their partnership
    ▪ Phases up to today: the rise of Large Language Models (LLM), the rise of Large Multimodal Models (LMM), build and integration of AI applications
  5. Initial thoughts

    ▪ We cannot use the public ChatGPT due to privacy and compliance reasons. Let's deploy Azure OpenAI and use it as an internal ChatGPT.
    ▪ Wouldn't it be great if we could integrate our own documents and create a knowledge base bot? Well, let's use "On Your Data" in Azure OpenAI and integrate it as a Copilot inside Teams.
    ▪ Prompt-in-prompt-out and asking questions is great, but we need our AI application to do stuff for us. Let's bring in an orchestration layer and chain services together.
  6. LLM / LMM

    ▪ Large Language Model (LLM)
      ▪ GPT-4 (OpenAI)
      ▪ Llama 3 (Meta)
    ▪ Large Multimodal Model (LMM)
      ▪ GPT-4 Turbo with Vision (OpenAI | Images, Video, Text)
      ▪ GPT-4o (OpenAI | Audio, Images, Video, Text)
      ▪ Sora (OpenAI | Video generation)
    ▪ They can: put one token (a short sequence of characters) after the other, i.e. text generation
    ▪ They cannot: run functions, call APIs, retrieve documents, etc.
    Therefore, ChatGPT is not a model, it's an AI application!
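To make "text generation is just appending tokens" concrete, here is a toy sketch in Python. The lookup table and its probabilities are made up for illustration and stand in for the model; a real LLM computes next-token probabilities over a large vocabulary with a neural network and usually samples rather than always picking the top candidate.

```python
# Toy stand-in for a language model: maps the last token to candidate next
# tokens with made-up probabilities (an assumption for illustration only).
NEXT_TOKEN_PROBS = {
    "<start>": {"our": 0.9, "the": 0.1},
    "our": {"expense": 0.8, "policy": 0.2},
    "expense": {"policy": 1.0},
    "policy": {"<end>": 1.0},
}


def generate(max_tokens: int = 10) -> list[str]:
    """Greedy generation: repeatedly append the most probable next token."""
    tokens = ["<start>"]
    for _ in range(max_tokens):
        # Unknown tokens simply end the sequence in this toy setup.
        candidates = NEXT_TOKEN_PROBS.get(tokens[-1], {"<end>": 1.0})
        next_token = max(candidates, key=candidates.get)
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the artificial start marker


print(" ".join(generate()))
```

Nothing in this loop can call an API or open a document. Anything beyond appending tokens has to come from the application wrapped around the model.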
  7. Data enrichment techniques

    Fine-tuning
    • Retraining a base model (like GPT-4) with a custom dataset
    • Adapts or enhances the model for a specific behaviour
    • Static, not suitable for changing data
    • Costly to train and maintain

    Retrieval-Augmented Generation (RAG)
    • Use your own data
    • Static and changing data
    • APIs, databases, storage (PDF, Word, Markdown, etc.) and many more
    • Trust through citations

    Embeddings
    • Format: Vectors
    • Good for similarity search
    • Example vectors: rgb(255,180,0) and rgb(250,145,0)

    GPT-4 model training
    • Costs of more than $100 million
    • Trained on ~25,000 Nvidia A100 Tensor Core GPUs simultaneously (cluster provided by Microsoft)
    • Continuous training of 90 to 100 days
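A minimal sketch of why vectors are good for similarity search, treating the two rgb values from the slide as if they were tiny three-dimensional embeddings (an interpretation; the slide does not spell this out). The blue vector is a hypothetical contrast value that is not on the slide. Real embedding models produce vectors with hundreds or thousands of dimensions, but the comparison works the same way.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


orange_a = (255, 180, 0)  # from the slide
orange_b = (250, 145, 0)  # from the slide
blue = (0, 0, 255)        # hypothetical contrast vector, not from the slide

sim_oranges = cosine_similarity(orange_a, orange_b)
sim_blue = cosine_similarity(orange_a, blue)

print(f"orange vs. orange: {sim_oranges:.3f}")
print(f"orange vs. blue:   {sim_blue:.3f}")
```

The two oranges score close to 1, while orange and blue share no component and score 0. A vector index ranks document chunks against a question in the same way.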
  8. Azure OpenAI On Your Data (RAG)

    How it works:
    ▪ Index documents from storage account
    ▪ User prompt: "Tell me something about our expense policy."
    ▪ Query AI Search
    ▪ Generated answer, grounded using data from the AI Search index
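The retrieve-then-ground pattern can be sketched without any cloud service. In the sketch below, the in-memory dictionary, the document texts, and the word-overlap scoring are all made-up stand-ins for the AI Search index; the function names are hypothetical and this is not the Azure OpenAI On Your Data API. The final call to the model is deliberately left out.

```python
# Hypothetical mini "index" with invented content; in the setup described
# above this role is played by an AI Search index.
DOCUMENTS = {
    "expense-policy.pdf": "Expense policy: amounts above 50 CHF require a receipt and manager approval.",
    "vacation-guideline.docx": "Vacation guideline: employees request vacation through the HR portal.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip(".,:?!").lower() for word in text.split()}


def retrieve(question: str, top_k: int = 1) -> list[tuple[str, str]]:
    """Rank documents by word overlap with the question (a crude stand-in for search)."""
    question_words = tokenize(question)
    scored = [
        (len(question_words & tokenize(text)), source, text)
        for source, text in DOCUMENTS.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(source, text) for score, source, text in scored[:top_k] if score > 0]


def build_grounded_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Put the retrieved passages into the prompt so the answer can cite them."""
    sources = "\n".join(
        f"[{i}] {source}: {text}" for i, (source, text) in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


question = "Tell me something about our expense policy."
prompt = build_grounded_prompt(question, retrieve(question))
print(prompt)  # this prompt would then be sent to the chat model
```

Because the sources travel with the question, the model can answer from current documents without retraining, and the numbered citations let readers check the answer.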
  9. Prompt orchestration

    ▪ Why do I need prompt orchestration?
      ▪ Advanced workflows
      ▪ Integrate business logic
      ▪ Trigger actions
    ▪ Example: Insurance application
    ▪ Tools: Prompt Flow, Semantic Kernel, LangChain
  10. HR AI Web App

    Components:
    ▪ HR application: web portal, API
    ▪ Guideline documents: PDF, Word, DB entries, etc.
    ▪ Prompt Flow
    ▪ Azure AI Search: index documents and files
    ▪ Content Safety: monitor and safeguard prompts
    ▪ Azure OpenAI: interact, extract, format, translate, etc.
    ▪ Graph API: user information, calendar, etc.
    ▪ Outlook calendar

    Tasks:
    - Check available vacation days
    - Check overtime hours
    - Ask questions about internal HR guidelines
    - Submit vacation requests
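What an orchestration layer adds can be sketched as a small router that maps a prompt to an action. Everything here is hypothetical: the function names, the returned numbers, and the keyword matching are illustration only. In a real flow, the model itself would typically pick the action, and the actions would call the HR application or the Graph API instead of returning canned strings.

```python
from typing import Callable


# Hypothetical actions with made-up data; real ones would call backend APIs.
def check_vacation_days(user: str) -> str:
    return f"{user} has 12 vacation days left."


def check_overtime(user: str) -> str:
    return f"{user} has 3.5 overtime hours."


def answer_guideline_question(user: str) -> str:
    # Fallback: a real flow would run retrieval over the guideline documents here.
    return "Looking this up in the HR guidelines."


# Keyword routing is a stand-in for the model deciding which action fits.
ROUTES: list[tuple[str, Callable[[str], str]]] = [
    ("vacation", check_vacation_days),
    ("overtime", check_overtime),
]


def orchestrate(user: str, prompt: str) -> str:
    """Pick the first action whose keyword appears in the prompt, else fall back."""
    lowered = prompt.lower()
    for keyword, action in ROUTES:
        if keyword in lowered:
            return action(user)
    return answer_guideline_question(user)


print(orchestrate("Alex", "How many vacation days do I have left?"))
print(orchestrate("Alex", "What is the dress code?"))
```

Keeping the actions as plain functions means business logic stays testable on its own, independent of whichever orchestration tool chains them together.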
  11. Key takeaways

    ▪ Think different
    ▪ Service availability
    ▪ Responsible and trusted AI
    ▪ Transparency of training data
    ▪ Data sources
    ▪ Race, gender, age, ethnicity, etc.
    ▪ Regulation
    ▪ Security, Networking, Monitoring, Logging
    ▪ Python basics are beneficial
    ▪ Be open and eager to learn. Have fun!