Upgrade to Pro — share decks privately, control downloads, hide ads and more …

DevConf.CZ 2024: AI: Open Source will save us (...

DevConf.CZ 2024: AI: Open Source will save us (InstructLab)

Christian Heimes

June 14, 2024
Tweet

More Decks by Christian Heimes

Other Decks in Programming

Transcript

  1. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 DevConf.CZ 2024 AI: Open source will save us Christian Heimes Principal Software Engineer 1
  2. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 I'm not Máirín Duffy or Aakanksha Duggal 2
  3. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 My role in InstructLab & RHEL AI 4 ▸ Python ▸ Python packaging and PyPI ▸ Containerization ▸ Hardware enablement (AMD ROCm, Intel Gaudi)
  4. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Literacy 5 Democracy needs educated, well-informed citizens Who is in control? Humans or computers? ▸ general literacy (reading, writing) ▸ numeracy (math literacy) ▸ basic analytic reasoning ▸ computer literacy (basic coding skills) ▸ AI literacy
  5. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Agenda 6 ▸ AI hardware and software stacks ▸ Making AI accessible ▸ LAB: Large-Scale Alignment for ChatBots ▸ InstructLab step-by-step ▸ (demo, maybe…)
  6. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 8 hardware: CPU, GPU, HPU, NPU, TPU, XPU, … drivers, support libraries (CUDA, ROCm, oneAPI) Large Language Models (LLM) complicated to install expensive some closed source hard to install some proprietary hard to improve
  7. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 9 open source-licensed large language model (LLM) Graphical tool for Windows, macOS, and Linux with Podman AI Lab extension Podman Desktop model alignment tools based on the LAB (Large-scale Alignment for chatBots) methodology InstructLab • bootc container with drivers • application container with PyTorch stack container Granite model from IBM Research Accessible Open Source Generative AI
  8. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 10 bootc container ▸ Kernel drivers ▸ firmware ▸ firmware tools ▸ podman Application container ▸ Python ▸ PyTorch ▸ vLLM / llama-cpp-python ▸ DeepSpeed ▸ vendor extensions and optimizations ▸ InstructLab Containers AMD ROCm, Intel Gaudi, Nvidia CUDA
  9. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 11 Containers (2) PyTorch installation woes python3 -m venv --upgrade-deps venv source venv/bin/activate (venv) $ pip cache remove llama_cpp_python (venv) $ pip install instructlab \ --extra-index-url https://download.pytorch.org/whl/rocm6.0 \ -C cmake.args="-DLLAMA_HIPBLAS=on" \ -C cmake.args="-DAMDGPU_TARGETS=all" \ -C cmake.args="-DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang" \ -C cmake.args="-DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++" \ -C cmake.args="-DCMAKE_PREFIX_PATH=/opt/rocm"
  10. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 12 flatpak install io.podman_desktop.PodmanDesktop Podman Desktop AI Lab
  11. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 13 "Representing IBM’s commitment to open source innovation IBM has released granite-7b-base, a base pre-trained LLM from IBM’s Granite model series, under an apache-2.0 license for community and commercial use. Granite-7b-base was pre-trained from scratch on IBM-curated data as an open reference implementation of Meta’s Llama-2-7B. In a commitment to data transparency and fostering open innovation, the data sources, sampling proportions, and URLs for access are provided below." IBM Granite 7B model https://huggingface.co/ibm-granite/granite-7b-base
  12. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Open Source foundational model 14 ▸ self-hosting (instead of "LLM as a Service") ▸ no commercial restriction ▸ keep your data in-house ▸ build your own products
  13. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 15 LAB: Large-Scale Alignment for ChatBots This work introduces LAB, a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. Leveraging a taxonomy-guided synthetic data generation process and a multi-phase tuning framework, LAB significantly reduces reliance on expensive human annotations and proprietary models like GPT-4. [...] marking a step forward in the efficient training of LLMs for a wide range of applications. arXiv:2403.01081 Shivchander Sudalairaj, Akash Srivastava, Abhishek Bhandwaldar, Aldo Pareja, Kai Xu, David D. Cox
  14. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Models aren't databases 16 Tensors … tensors everywhere token_embd.weight | size 32008 x 4096 | type F16 | T+ 1 blk.0.attn_norm.weight | size 4096 | type F32 | T+ 2 blk.0.ffn_down.weight | size 4096 x 14336 | type F16 | T+ 2 blk.0.ffn_gate.weight | size 14336 x 4096 | type F16 | T+ 2 blk.0.ffn_up.weight | size 14336 x 4096 | type F16 | T+ 2 blk.0.ffn_norm.weight | size 4096 | type F32 | T+ 2 blk.0.attn_k.weight | size 1024 x 4096 | type F16 | T+ 2 blk.0.attn_output.weight | size 4096 x 4096 | type F16 | T+ 2 blk.0.attn_q.weight | size 4096 x 4096 | type F16 | T+ 2 blk.0.attn_v.weight | size 1024 x 4096 | type F16 | T+ 2 blk.1.attn_norm.weight | size 4096 | type F32 | T+ 2 blk.1.ffn_down.weight | size 4096 x 14336 | type F16 | T+ 3 blk.1.ffn_gate.weight | size 14336 x 4096 | type F16 | T+ 3 blk.1.ffn_up.weight | size 14336 x 4096 | type F16 | T+ 3 ... blk.31.attn_v.weight | size 1024 x 4096 | type F16 | T+ 29 tensor output_norm.weight| size 4096 | type F32 | T+ 29
  15. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Additive Improvements with Pull Request github.com/instructlab/taxonomy 18
  16. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 pip install instructlab 19 LABrador
  17. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 hic sunt dracones! 20 InstructLab 0.16.1 bugs and future changes InstructLab is still under development ▸ InstructLab 0.17 will change subcommands and arguments ▸ demo with low-fidelity approximation of synthetic data generation and model instruction tuning ▸ training data with markup or non-ASCII characters breaks training (sorry "Řečkovice" and "Božetěchova") ▸ PyTorch training on Linux is currently broken mlx training on macOS works, shout out to Grant Shipley
  18. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 pip install instructlab ▸ Linux (Fedora, C9S, RHEL), macOS, WSL2 ▸ 60 GB free disk space ▸ Python 3.9 – 3.11 ▸ C/C++ compiler, devel packages ▸ hardware accelerator recommended ・ AMD ROCm, Nvidia CUDA, Apple Silicon (M1, M2, M3) ▸ free GPU memory ・ > 6 GB for inference ・ > 18 GB for training 21 https://github.com/instructlab/instructlab
  19. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 pip install instructlab (2) 22 https://github.com/instructlab/instructlab python3 -m venv --upgrade-deps venv source venv/bin/activate (venv) $ pip cache remove llama_cpp_python (venv) $ pip install instructlab -C cmake.args="-DLLAMA_CUDA=on" # AMD ROCm build (venv) $ ilab sysinfo instructlab.version: 0.16.1 torch.version: 2.3.1+rocm6.0 torch.backends.cpu.capability: AVX2 torch.version.cuda: None torch.version.hip: 6.0.32830-d62f6a171 torch.cuda.0.name: AMD Radeon RX 7900 XT torch.cuda.0.free: 19.9 torch.cuda.0.total: 20.0 llama_cpp_python.version: 0.2.75 llama_cpp_python.supports_gpu_offload: True
  20. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 InstructLab alignment workflow 23 ▸ ilab init ▸ ilab download ▸ ilab chat (ilab serv + ilab chat) ▸ modify taxonomy ▸ ilab diff ▸ ilab generate ▸ ilab train ▸ ilab chat with the aligned new model
  21. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 init & download 24 (venv) $ ilab init --model-path models/granite-7b-lab-Q4_K_M.gguf Welcome to InstructLab CLI. This guide will help you to setup your environment. Please provide the following values to initiate the environment [press Enter for defaults]: Path to taxonomy repo [taxonomy]: `taxonomy` seems to not exist or is empty. Should I clone https://github.com/instructlab/taxonomy.git for you? [y/N]: Path to your model [models/granite-7b-lab-Q4_K_M.gguf]: Generating `config.yaml` in the current directory… Initialization completed successfully, you're ready to start using `ilab`. Enjoy! (venv) $ ilab download --repository instructlab/granite-7b-lab-GGUF --filename granite-7b-lab-Q4_K_M.gguf Downloading model from instructlab/granite-7b-lab-GGUF@main to models... granite-7b-lab-Q4_K_M.gguf: 100%|█████| 4.08G/4.08G [02:07<00:00, 32.1MB/s] Download complete. Moving file to models/granite-7b-lab-Q4_K_M.gguf
  22. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 serve & chat 25 (venv) $ ilab serve --model-path models/granite-7b-lab-Q4_K_M.gguf --model-family merlinite lab.py:340 Using model 'models/granite-7b-lab-Q4_K_M.gguf' with -1 gpu-layers and 4096 max context size. server.py:206 Starting server process, press CTRL+C to shutdown server... (venv) $ $ ilab chat ╭───────────────────────────────────────────── system ──────────────────────────────────────────────╮ │ Welcome to InstructLab Chat w/ MODELS/GRANITE-7B-LAB-Q4_K_M.GGUF (type /h for help) │ ╰───────────────────────────────────────────────────────────────────────────────────────────────────╯ >>> When is DevConf.CZ in 2024? [S][default] ╭──────────────────────────────── models/granite-7b-lab-Q4_K_M.gguf ────────────────────────────────╮ │ DevConf.CZ, an annual software development conference held in the Czech Republic, typically takes │ │ place in February or March. To be more specific, it usually occurs around the second week of │ │ March. However, I would recommend checking the official website or following their announcements │ │ on social media for the most accurate and up-to-date information regarding the date and venue of │ │ DevConf.CZ 2024. This way, you can ensure that you have the most current details and plan your │ │ attendance accordingly. │ ╰─────────────────────────────────────────────────────────────────────────── elapsed 1.791 seconds ─╯
  23. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 AI hallucination 26 >>> When is DevConf.CZ in 2024? I apologize for any confusion earlier. Based on the information available up to March 15, 2023, the exact dates for DevConf.CZ 2024 have not been announced yet. >>> /N >>> When is DevConf.CZ in 2024? DevConf.CZ is an annual software development conference held in the Czech Republic, focusing on open source technologies. In 2024, the event is scheduled to take place from February 7 to February 9. This three-day conference offers a wide range of sessions, workshops, and networking opportunities for developers, enthusiasts, and professionals in the field of software development. >>> /N >>> When is DevConf.CZ in 2024? DevConf.CZ is an annual technology conference held in the Czech Republic. In 2024, it will take place from March 13th to March 15th. [...]
  24. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Adding knowledge 27 https://github.com/tiran/taxonomy-devconfcz2024 # knowledge/events/conferences/devconfcz2024/qna.yaml --- created_by: Christian Heimes domain: conferences seed_examples: - answer: DevConf.CZ is a community conference for developers, admins, DevOps engineers, testers, documentation writers and other contributors to open source technologies. question: What is DevConf.CZ? - answer: The conference is from Thursday June 13 to Friday June 15, 2024 (2024-06-13 to 2024-06-15). question: When is DevConf.CZ in 2024? - answer: DevConf.CZ 2024 is in the Faculty of Information Technology in Brno, Czech Republic question: Where is DevConf.CZ in 2024? - answer: You can fly to Vienna or Prague airport, then take the train to Brno main train station. question: How do I get to Brno by air travel? - answer: Open source, Linux distributions, hybrid cloud, edge computing, artificial intelligence, devops. question: What are the conference themes? - answer: '"From General AI to the Edge" and "What if you could boot a container?"' question: What are the DevConf.CZ 2024 keynotes? task_description: 'DevConf.CZ 2024 Brno' document: repo: https://github.com/tiran/taxonomy-devconfcz2024.git commit: main patterns: - conference.md - themes.md - travel.md - keynotes.md text book pop quiz
  25. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Synthetic data generation 28 (venv) $ ilab diff knowledge/events/conferences/devconfcz2024/qna.yaml Taxonomy in /taxonomy/ is valid :) (venv) $ ilab generate --num-instructions 300 Generating synthetic data using 'models/granite-7b-lab-Q4_K_M.gguf' model, taxonomy:'taxonomy' against http://127.0.0.1:8000/v1 server Synthesizing new instructions. If you aren't satisfied with the generated instructions, interrupt training (Ctrl-C) and try adjusting your YAML files. Adding more examples may help. generate_data.py:506 Selected taxonomy path knowledge->events->conferences->devconfcz2024 Q> What is the location of DevConf.CZ 2024? I> A> DevConf.CZ 2024 will be held at the Faculty of Information Technology in Brno, Czech Republic.
  26. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Synthetic data generation (SDG) 29 Q> What can you tell me about DevConf.CZ 2024 Brno? A> DevConf.CZ 2024 is a tech conference held in the city of Brno, Czech Republic from June 13-15, 2024. The event features presentations on modern application development and services, Linux distributions and operating systems, edge computing and automotive, artificial intelligence and data science, DevOps and automation, security and compliance, and open source success stories. This year's theme is "Navigating the Future of Technology." The conference offers keynote speeches from industry leaders and experts in their respective fields. Attendees can participate in workshops, tutorials, and other hands-on sessions to learn about new tools and technologies. The event also features talks on agility, leadership, diversity, equity, and inclusion (DEI), as well as future technology trends. [...] Q> What is the closest tram or bus stop to the venue? A> The closest bus and tram stop to DevConf.CZ is Semilasso, which can be reached by taking the No. 72 bus from Technologický park or tram No. 12 from Red Hat Office, as described in detail above.
  27. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 SDG: Food 30 Q> What kind of food will be provided at DevConf.CZ? A> Food will include breakfast snacks like pastries, sandwiches, fruit, and coffee. For lunch there will be a variety of hot meals available such as burgers, pasta salads, and even some traditional Czech dishes like goulash and schnitzel. Vegetarian options are also available. Q> What is the most popular dish in the Czech Republic? A> One of the most popular traditional dishes in the Czech Republic is "svíčka", also known as "bramborky" or "cabbage rolls". This dish consists of a savory mixture of ground meat (usually pork, beef, or sometimes turkey), herbs such as garlic and onions, spices including black pepper and paprika, and shredded cabbage leaves. The filling is rolled up in the cabbage leaves and then steamed or boiled to achieve a tender texture. Svícička can be served with various side dishes like potato salad, pickled vegetables, and bread dumplings called "knedlíky".
  28. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 SDG: Places to visit 31 Q> What are some of the best attractions in Brno for tourists to see? I> A> Some popular attractions in Brno include: - The Spilberk Fortress - The Old Town Hall - The Olomouc Gate Q> What are the best places to visit in Brno? A> The best places to visit in Brno depend on your interests. For history and architecture, you can visit the old town hall, Spilberk fortress, and Villa Tugendhat. For nature and recreation, you can go to the Botanical Garden of Masaryk University or take a walk through the city forest. Q> What is a good source for local attractions and museums? A> One possible source for local attractions and museums would be <https://www.visitbrno.cz/en/things-to-do/museums/>, which has several pages listing different types of museums in Brno, as well as other attractions and activities.
  29. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 training 32 (venv) $ ilab train --device=cuda --iters 300 --num-epochs 5 LINUX_TRAIN.PY: NUM EPOCHS IS: 5 LINUX_TRAIN.PY: TRAIN FILE IS: generated/train_merlinite-7b-lab-Q4_K_M_2024-06-09T13_02_30.jsonl LINUX_TRAIN.PY: TEST FILE IS: generated/test_merlinite-7b-lab-Q4_K_M_2024-06-09T13_02_30.jsonl LINUX_TRAIN.PY: Using device 'cuda:0' AMD ROCm HIP version: 6.0.32830-d62f6a171 cuda:0 is 'AMD Radeon RX 7900 XT' (19.9 GiB of 20.0 GiB free, capability: 11.0) LINUX_TRAIN.PY: LOADING DATASETS Generating train split: 301 examples [00:00, 88236.34 examples/s] Generating train split: 7 examples [00:00, 4256.94 examples/s] [...] LINUX_TRAIN.PY: LOADING THE BASE MODEL config.json: 100%|███| 644/644 [00:00<00:00, 6.69MB/s] model.safetensors.index.json: 100%|███| 23.9k/23.9k [00:00<00:00, 113MB/s] model-00001-of-00003.safetensors: 100%|███| 4.94G/4.94G [02:33<00:00, 32.2MB/s] model-00002-of-00003.safetensors: 100%|███| 5.00G/5.00G [03:40<00:00, 22.7MB/s] model-00003-of-00003.safetensors: 100%|███| 4.54G/4.54G [04:56<00:00, 15.3MB/s]
  30. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Serve and test the new model 33 (venv) $ ilab serve --model-path models/ggml-model-f16.gguf --model-family merlinite lab.py:340 Using model 'models/ggml-model-f16.gguf' with -1 gpu-layers and 4096 max context size. server.py:206 Starting server process, press CTRL+C to shutdown server... (venv) $ ilab chat --model-path models/ggml-model-f16.gguf --model-family merlinite ... $ gpt --temperature 0.1 > When is DevConf.CZ 2024? DevConf.CZ 2024 will take place on June 13-15, 2024.
  31. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 low-fidelity approximation 34 "ilab train" uses quantized LoRA for training, which results in a low fidelity model! ▸ ilab generate and train: couple of hours on consumer hardware for $3k USD ▸ backend training with DeepSpeed: couple of days on hardware worth several $ 100k USD
  32. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 Summary 35 https://github.com/instructlab ▸ bootc and PyTorch container ▸ Podman desktop and AI Lab ▸ Granite model ▸ Model alignment with InstructLab Please join the InstructLab community. Slack invite like in instructlab/community repository.
  33. DevConf.CZ: AI: Open source will save us, Christian Heimes, CC

    BY-SA 4.0 linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHat 36 Red Hat is the world’s leading provider of enterprise open source software solutions. Award-winning support, training, and consulting services make Red Hat a trusted adviser to the Fortune 500. Thank you