LLM Development Landscape

Presented at Data + AI Day 2024
6th October 2024

Kamolphan Liwprasert

October 06, 2024

Transcript

  1. LLM Development Landscape
    ✨ Overview of LLM app development
    ✨ Concepts worth knowing about LLMs
    ✨ How can you develop an LLM application?
    ✨ Which frameworks are available to choose from?
  2. Definition of LLM
    A large language model (LLM) is a computational model capable of language generation or other natural language processing tasks.
    https://en.wikipedia.org/wiki/Large_language_model
  3. Definition of Multimodal LLM
    Multimodal = characterized by several different modes of activity or occurrence.
    https://research.google/blog/multimodal-medical-ai/
  4. Model Serving
    Application 📱 💻 🌐 (Client) → API → Model 🤖 (Server)
    API = “Client - Server”
  5. Why self-host an LLM?
    💲 Cost-efficient in the long term (i.e. on-premise)
      → Need to tune latency to make the model faster
    ⚙ Customization & fine-tuning
      → No lock-in to a particular model
    🔒 Security compliance & data residency / privacy
  6. LangChain 🦜🔗
    Python / JS framework for developing applications powered by large language models (LLMs).
    https://www.langchain.com/langchain
  7. Semantic Kernel from Microsoft
    Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java.
    https://github.com/microsoft/semantic-kernel
  8. vLLM = Model serving for LLMs
    Easy, fast, and cheap LLM serving for everyone
    vLLM is fast with:
    ✅ State-of-the-art serving throughput
    ✅ Efficient management of attention key and value memory with PagedAttention
    ✅ Continuous batching of incoming requests
    ✅ Fast model execution with CUDA/HIP graph
    ✅ Quantization: GPTQ, AWQ, SqueezeLLM, FP8 KV Cache
    ✅ Optimized CUDA kernels
    https://github.com/vllm-project/vllm
    [Chart: throughput comparison; higher is better]
  9. Responsible AI
    ✅ Always verify correctness
    ✅ Human-centered design: design for the people who will use it
    ⚠ Be careful about data privacy
    ⚠ Biases and fairness: make it fair to users
  10. Event announcements :)
    Technologista by PyLadies x Women Techmakers: Saturday 26 October 2024 @ Cleverse. Register now: bit.ly/technologista-2024
    DevFest Cloud Bangkok by GDG Cloud Bangkok: Sunday 3 November 2024 @ K+ Building Samyan. Register now: bit.ly/devfest-cloud-bkk24
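
Companion sketch for slide 4 (Model Serving): the slide's picture of an application reaching a model through an API can be illustrated with a small client-server pair built on Python's standard library. Everything named here is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real LLM, and the `/generate` path and the JSON fields `prompt` and `completion` are invented, not taken from the talk or from any particular serving product.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM: returns the prompt in upper case."""
    return prompt.upper()


class ModelHandler(BaseHTTPRequestHandler):
    """Server side: exposes the model behind a POST endpoint."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        body = json.dumps({"completion": toy_model(request["prompt"])}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging to keep the example quiet


def start_server() -> HTTPServer:
    """Start the model server on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(server: HTTPServer, prompt: str) -> str:
    """Client side: the application talks only to the API, never to the model."""
    url = f"http://127.0.0.1:{server.server_address[1]}/generate"
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    # Bypass any configured proxy so the request stays on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())["completion"]


server = start_server()
answer = ask(server, "hello llm")
print(answer)
server.shutdown()
server.server_close()
```

Because the application depends only on the HTTP contract, the model behind the endpoint could be swapped (for example, for a self-hosted one as on slide 5) without changing client code.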
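
Companion sketch for slide 8 (vLLM): one item on that slide, continuous batching of incoming requests, can be illustrated with a toy step counter. This is a simplified model and not vLLM's scheduler: it assumes all requests are already queued, each decode step yields one token per running request, and prefill cost and memory limits are ignored. The function names and the example workload are hypothetical.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Each batch runs until its longest request finishes, then the next batch starts."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """After every decode step, finished requests free their slot for waiting ones."""
    waiting = deque(lengths)
    running = []  # tokens still to generate for each in-flight request
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots before the next step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decode step: every running request emits one token;
        # requests with nothing left to generate leave the batch.
        running = [r - 1 for r in running if r - 1 > 0]
        steps += 1
    return steps


# Hypothetical workload: one long request followed by seven short ones.
output_lengths = [8, 1, 1, 1, 1, 1, 1, 1]
static_steps = static_batching_steps(output_lengths, batch_size=2)
continuous_steps = continuous_batching_steps(output_lengths, batch_size=2)
print(static_steps, continuous_steps)  # prints "11 8"
```

In this toy workload the long request holds one slot for eight steps while the short requests cycle through the other slot, so nothing waits for a whole batch to drain. Fewer total steps for the same work is the intuition behind the throughput claim, under the simplifications stated above.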