DWX 2024: AI++: Multimodale Large Language Models als Kern moderner Business-Anwendungen – in Action (Multimodal Large Language Models at the Core of Modern Business Applications, in Action)

Human language as a universal interface for software solutions: that sounds exciting! Combined with images, it promises a new way for users to access applications. Beyond the ChatGPT hype, Christian dives into the world of Large Multimodal Models (LMMs) and focuses on how to integrate AI functionality into your own applications in a meaningful way via data and APIs. We will explore pragmatic scenarios and use cases that demonstrate the potential of LMMs (GPT- or Llama-based) and discuss how AI techniques can be incorporated into existing architectures. Attendees get a first look at frameworks such as LangChain and Semantic Kernel for programming generative-AI-based systems. We will also look at using not only closed-source solutions (such as OpenAI) but also open-source options (such as Llama or Mistral) in order to meet differing requirements. Universal user interfaces based on human language: come in and find out!

Christian Weyer

July 03, 2024
Transcript

  1. AI++: Multimodale Large Language Models als Kern moderner Business-Anwendungen – in Action
    Christian Weyer, Co-Founder & CTO, @christianweyer
  2. Christian Weyer, Co-Founder & CTO @ Thinktecture AG
    § Technology catalyst
    § AI-powered solutions
    § Pragmatic end-to-end architectures
    § Microsoft Regional Director
    § Microsoft MVP for Developer Technologies & Azure
    § ASPInsider, AzureInsider
    § Google GDE for Web Technologies
    @christianweyer, https://www.thinktecture.com
  3. Our journey today
    § AI all-the-things?
    § LLMs in your Solutions
    § Talk to your Data
    § Exciting Times…
    § Democratizing Generative AI
    § Talk to your Systems
  4. AI all-the-things?
    § Data Science
    § Artificial Intelligence
    § Machine Learning: unsupervised, supervised, reinforcement learning
    § Deep Learning: ANN, CNN, RNN etc.
    § NLP (Natural Language Processing)
    § Generative AI: GAN, VAE, Transformers etc.
      § Image / Video Generation: GAN, VAE
      § Large Language Models: Transformers
  5. Large Language Models (LLMs) – like GPT powering ChatGPT
    § LLMs generate text based on input
    § LLMs can understand text – this changes a lot
      § Without having to train them on domains or use cases
    § Prompts are the universal interface (“UI”) → unstructured text with semantics
    § Human language evolves as a first-class citizen in software architecture 🤯
    Text… – really, just text?
  6. Large Language Models demystified 🔍
    § LLMs are programs
    § LLMs are highly specialized neural networks
    § LLMs use(d) lots of data
    § LLMs need a lot of resources to be operated
    § LLMs are used through an API
  7. Using LLMs: It’s just HTTP APIs
    Inference, FTW.
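To illustrate the "it's just HTTP APIs" point, here is a minimal sketch that builds a chat completion request using only Python's standard library. The endpoint and model name are taken from the curl example later in the deck; the helper names `build_chat_request` and `send` are hypothetical, and the response handling assumes a plain JSON body.

```python
import json
import urllib.request


def build_chat_request(api_key: str, user_message: str,
                       model: str = "gpt-3.5-turbo") -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url="https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs network access and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_chat_request(key, "Hello"))` would perform the actual inference call; everything else is ordinary HTTP plumbing, which is what the SDKs on the following slides wrap.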
  8. DEMO: GPT-4 API access, OpenAI Playground
  9. Building LLM-based end-to-end applications
    § Barebones SDKs
      § Abstraction over HTTP APIs
      § E.g. OpenAI SDK, Mistral SDK
      § Available for any programming language
      § Also available from other LLM providers
    § Or: Abstracting frameworks
  10. LangChain - building LLM-based applications
    § OSS framework for developing applications powered by LLMs
      § > 1000 contributors
    § Python and TypeScript versions
    § Chains for sequences of LLM-related actions in code
    § Abstractions for
      § Prompts & LLMs (local and remote)
      § Memory
      § Vector stores
      § Tools
      § Loading text from a wide range of sources
    § Alternatives like LlamaIndex, Haystack, etc.
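The idea of a chain as a sequence of LLM-related actions can be sketched in plain Python. This is not LangChain's API: `make_chain`, `prompt_template`, `fake_llm` and `strip_output` are hypothetical names, and the model call is replaced by a stand-in so the example runs offline.

```python
from typing import Callable

Step = Callable[[str], str]


def make_chain(*steps: Step) -> Step:
    """Compose steps so that each step's output becomes the next step's input."""
    def run(value: str) -> str:
        for step in steps:
            value = step(value)
        return value
    return run


def prompt_template(question: str) -> str:
    # Step 1: turn the raw question into a prompt.
    return f"Answer briefly: {question}"


def fake_llm(prompt: str) -> str:
    # Step 2: stand-in for a real model call; it only echoes the prompt.
    return f"  [model output for: {prompt}]  "


def strip_output(text: str) -> str:
    # Step 3: post-process the model output.
    return text.strip()


chain = make_chain(prompt_template, fake_llm, strip_output)
print(chain("What is RAG?"))
```

Frameworks add many ready-made steps (prompts, memory, vector stores, tools) on top of this basic composition pattern.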
  11. Semantic Kernel
    § Microsoft’s open-source framework to integrate LLMs into applications
    § .NET, Python, and Java versions
    § Plugins encapsulate AI capabilities
      § Semantic functions for prompting
      § Native functions to run local code
    § A chain is a collection of plugins
    § Planners are similar to Agents in LangChain
    § Feature set not as broad as LangChain’s
      § E.g., no concept/abstraction for loading & working with data
  12. Answering Questions on Data: Retrieval-augmented generation (RAG)
    § Indexing / Embedding: Cleanup & Split → Text → Embedding model → Embedding → Save → Vector DB
    § Question Answering: Question → Embedding model → Embedding → Query Vector DB → Relevant Text + Question → LLM → Answer 💡
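The two RAG phases on this slide (indexing, then question answering) can be sketched with a toy in-memory store. This is a simplified illustration under loud assumptions: the word-count `embed` function stands in for a real embedding model, `VectorStore` for a real vector database, and the sample policy texts are invented.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class VectorStore:
    """Minimal in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items = []  # list of (embedding, chunk) pairs

    def add(self, chunk: str) -> None:
        self.items.append((embed(chunk), chunk))

    def query(self, question: str, k: int = 1) -> list:
        q = embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]


def build_prompt(question: str, context: list) -> str:
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Indexing phase: split text into chunks and store their embeddings.
store = VectorStore()
for chunk in ["Employees get 30 vacation days per year.",
              "Expenses must be submitted within 14 days."]:
    store.add(chunk)

# Question-answering phase: retrieve relevant text, then hand it to the LLM.
question = "How many vacation days do employees get?"
prompt = build_prompt(question, store.query(question))
print(prompt)
```

The resulting prompt is what would be sent to the LLM; the model answers from the retrieved text rather than from its training data alone.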
  13. DEMO: RAG: Learning about company’s policies via Slack
    LangChain, Weaviate – GPT-4o
  14. Extract structured data from textual information
    § Write or speak your input → get structured data for your programs & systems
    § Clever & strict prompting
      § Schema description: JSON, TypeScript types, etc.
    § Framework or tools support
      § Pydantic, Kor, TypeChat, etc.
    § OpenAI Function / Tool Calling
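A minimal sketch of the "strict prompting plus schema description" approach, using only the standard library rather than Pydantic or TypeChat. The `Booking` type, the schema hint, and the sample model reply are all hypothetical and serve only to show the parse-and-validate step.

```python
import json
from dataclasses import dataclass


@dataclass
class Booking:
    name: str
    date: str
    guests: int


# Schema description that is embedded into the prompt.
SCHEMA_HINT = '{"name": string, "date": "YYYY-MM-DD", "guests": integer}'


def build_extraction_prompt(user_text: str) -> str:
    return ("Extract the booking from the text below. "
            f"Reply with JSON only, matching: {SCHEMA_HINT}\n\n{user_text}")


def parse_booking(model_reply: str) -> Booking:
    """Parse the model's reply and reject anything that does not fit the schema."""
    data = json.loads(model_reply)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {"name": str, "date": str, "guests": int}
    if set(data) != set(expected):
        raise ValueError(f"unexpected keys: {sorted(data)}")
    for key, typ in expected.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(data[key], typ) or isinstance(data[key], bool):
            raise ValueError(f"{key} must be {typ.__name__}")
    return Booking(**data)


# Hypothetical reply a model might give for "Table for 4 on 2024-07-03, name Weyer".
reply = '{"name": "Weyer", "date": "2024-07-03", "guests": 4}'
booking = parse_booking(reply)
print(booking)
```

Validating the reply before use matters because a model can return malformed or incomplete JSON; libraries such as Pydantic provide the same check with less code.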
  15. DEMO: Extracting structured data from text & voice: Form filling
    JSON extraction, OpenAI JS SDK, Angular Forms - Mixtral-8x7B on Groq
  16. DEMO (Multimodal): Extracting structured data from PDF
    Python SDK, OpenAI Tool Calling - GPT-4o
  17. Extending LLM capabilities
    § Integrate LLM-external systems to aid LLMs
    § Tool / function calling standard established by OpenAI
    § LLM outputs JSON containing arguments to call one or many functions
    § LLM does not call the function
    § All major libs support tool calling
      § OpenAI SDKs
      § LangChain
      § Semantic Kernel
      § etc.

        curl https://api.openai.com/v1/chat/completions \
          -H "Content-Type: application/json" \
          -H "Authorization: Bearer $OPENAI_API_KEY" \
          -d '{
            "model": "gpt-3.5-turbo",
            "messages": [
              { "role": "user", "content": "What is the weather like in Boston?" }
            ],
            "tools": [
              {
                "type": "function",
                "function": {
                  "name": "get_current_weather",
                  "description": "Get the current weather in a given location",
                  "parameters": {
                    "type": "object",
                    "properties": {
                      "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                      },
                      "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
                    },
                    "required": ["location"]
                  }
                }
              }
            ],
            "tool_choice": "auto"
          }'
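Since the LLM only outputs the function name and JSON arguments, the application has to run the function itself. The sketch below shows that dispatch step for the `get_current_weather` tool from the request above. The weather implementation and the sample assistant message are hand-written assumptions; the message shape follows the chat completions tool-call format, where `arguments` arrives as a JSON string.

```python
import json


def get_current_weather(location: str, unit: str = "celsius") -> dict:
    # Hypothetical local implementation; a real one would query a weather service.
    return {"location": location, "temperature": 22, "unit": unit}


# Registry of functions the application is willing to run for the model.
TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and build the 'tool' messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        handler = TOOLS[function["name"]]              # KeyError for unknown tools
        arguments = json.loads(function["arguments"])  # arguments are a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(handler(**arguments)),
        })
    return results


# Hand-written example of an assistant message that requests a tool call.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_current_weather",
                     "arguments": '{"location": "Boston, MA"}'},
    }],
}
tool_messages = run_tool_calls(assistant_message)
print(tool_messages)
```

The resulting `tool` messages are appended to the conversation and sent back in a second request, so the model can phrase the final answer from the function results.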
  18. DEMO: Ask for experts availability in my company systems
    Angular, Speech-to-text, internal HTTP API, Node.js OpenAI SDK + Tool Calling, Text-to-speech – GPT-4o
  19. Talking to internal APIs – Ask for experts availability
    Components: Angular PWA, Internal Systems Gateway, OpenAI Speech-to-Text, OpenAI GPT-4, OpenAI Text-to-Speech, Internal Company API
    § 🗣 Transcribe spoken text → transcribed text (OpenAI Speech-to-Text)
    § Check for experts availability with text (Internal Systems Gateway)
    § Extract { experts, booking times } from text → structured JSON data via tool calling (OpenAI GPT-4)
    § Query Availability API → availability (Internal Company API)
    § Generate response with availability → response (OpenAI GPT-4)
    § Response with experts availability
    § 🔉 Text-to-speech for response → response audio (OpenAI Text-to-Speech)
    Example: “When is CL…?” → “CL will be…”
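The flow on this slide can be read as a pipeline of five steps. The sketch below wires them together with injectable callables; all function and parameter names are hypothetical, and the lambdas in the usage part are stubs standing in for the real speech, LLM, and company API calls.

```python
from typing import Callable


def answer_spoken_question(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    extract_query: Callable[[str], dict],
    query_availability: Callable[[dict], dict],
    generate_response: Callable[[str, dict], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    text = transcribe(audio)                          # speech-to-text
    query = extract_query(text)                       # LLM tool calling -> structured JSON
    availability = query_availability(query)          # internal company API
    response = generate_response(text, availability)  # LLM phrases the answer
    return synthesize(response)                       # text-to-speech


# Usage with stubs; a real gateway would plug in actual service calls here.
audio_out = answer_spoken_question(
    b"<audio>",
    transcribe=lambda audio: "When is Alex available?",
    extract_query=lambda text: {"expert": "Alex"},
    query_availability=lambda query: {"expert": query["expert"], "free": ["Monday"]},
    generate_response=lambda text, data: f"{data['expert']} is free on {', '.join(data['free'])}.",
    synthesize=lambda response: response.encode("utf-8"),
)
print(audio_out)
```

Passing the steps in as parameters keeps the orchestration testable without network access and makes it easy to swap one provider for another.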
  20. LLMs everywhere
    § OpenAI-related (cloud): OpenAI, Azure OpenAI Service
    § Big cloud providers: Google Model Garden on Vertex AI, Amazon Bedrock
    § Other providers: Anthropic, Cohere, Mistral AI, Hugging Face
    § Open-source: Edge, IoT, Server, Desktop, Mobile, Web
  21. Open-source LLMs thrive
    § Open-source community drives innovation in Generative AI
    § Important factors
      § Use case
      § Parameter size
      § Quantization
      § Processing power needed
    § Llama- & Mistral-based families show big potential for local use cases
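How parameter size and quantization interact can be shown with a rough back-of-the-envelope calculation. The sketch below estimates the memory needed for the weights alone; it deliberately ignores runtime overhead such as the KV cache and activations, so real requirements are higher. The function name is hypothetical.

```python
def weight_memory_gb(parameters_in_billions: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    bytes_total = parameters_in_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# A 7-billion-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{weight_memory_gb(7, bits):.1f} GB")
```

This is why a 7B model quantized to 4 bits (roughly 3.5 GB of weights) can fit on consumer hardware, while the same model at 16 bits needs about four times as much.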
  22. DEMO: Local RAG: Mistral-7B open-source LLM
    llama.cpp, Ollama, LangChain, Streamlit
  23. Current state
    § LLMs & LMMs enable new scenarios & use cases to incorporate human language into software solutions
    § Fast-moving and changing field
      § Every week something “big” happens in LLM space
      § Frameworks & ecosystem are evolving together with LLMs
    § Closed vs open LLMs
      § Competition drives invention & advancement
    § SLMs: specialized, fine-tuned for domains
    § SISO (sh*t in, sh*t out)
      § Quality of results heavily depends on your data & input