

Towards Structured Data: LLMs from Prototype to Production

Large Language Models (LLMs) have enormous potential, but they also challenge existing industry workflows that require modularity, transparency, data privacy and structured data. In this talk, I'll present pragmatic approaches for using LLMs beyond chat bots, for shipping more successful NLP projects from prototype to production, and for using the latest state-of-the-art models in real-world applications while distilling their knowledge into smaller, faster components that can be run and maintained in-house.

Ines Montani

June 12, 2024


Transcript

  1. TOWARDS STRUCTURED DATA: LLMS FROM PROTOTYPE TO PRODUCTION. Ines Montani 💥 Explosion

    LARGE LANGUAGE MODELS ✨ CHATGPT 🤖 ARTIFICIAL INTELLIGENCE 🧠 MACHINE LEARNING ✨ LLAMA 🦙 NATURAL LANGUAGE PROCESSING 💬 OPEN SOURCE 🌎 PYTHON 🐍 PROMPT ENGINEERING ⚙ ZERO-SHOT LEARNING 🎯 GPT-4 EVALUATION 📈 COPILOT 🚀 GENERATIVE AI 👾
  2. SOFTWARE IN INDUSTRY black-box models modular 🧩 transparent 🔎 explainable

    🔮 🔒 data-private ✅ reliable 💸 affordable
  3. SOFTWARE IN INDUSTRY black-box models modular 🧩 transparent 🔎 explainable

    🔮 third-party APIs 🔒 data-private ✅ reliable 💸 affordable
  4. 📖 single/multi-doc summarization ✅ problem solving ✍ paraphrasing 🧮 reasoning

    🖼 style transfer Generative ❓question answering 📚 text classification 🏷 entity recognition 🔗 relation extraction 🧬 grammar & morphology 🎯 semantic parsing 👫 coreference resolution 💬 discourse structure Predictive UNDERSTANDING NLP TASKS
  5. 📖 single/multi-doc summarization ✅ problem solving ✍ paraphrasing 🧮 reasoning

    🖼 style transfer Generative ❓question answering 📚 text classification 🏷 entity recognition 🔗 relation extraction 🧬 grammar & morphology 🎯 semantic parsing 👫 coreference resolution 💬 discourse structure Predictive UNDERSTANDING NLP TASKS human-readable machine-readable
  6. 🔮 large generative model 📦 distilled task-specific model in-context learning

    Falcon MIXTRAL GPT-4 transfer learning ELECTRA T5 BERT-base still very competitive!
  7. GITHUB.COM/EXPLOSION/SPACY-LLM Named Entity Recognition Text Classification Relation Extraction Lemmatization

    💬 unstructured text input 📊 structured Doc object 🔮 LLM ⚙ Supervised Model ✍ Rules mix, match and replace techniques
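
The "mix, match and replace" idea on this slide can be sketched without any particular library: if every component reads and writes the same structured object, a rule, a supervised model or an LLM can be swapped for one another. The example below is a minimal illustration under stated assumptions. `SimpleDoc`, `rule_based_ner`, `fake_llm_ner` and `run_pipeline` are hypothetical names, not part of spaCy or spacy-llm, and the "LLM" is a canned stub rather than a real model call.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a structured document (spaCy's real Doc is far richer)."""
    text: str
    # Each entity is (start_char, end_char, label).
    ents: List[Tuple[int, int, str]] = field(default_factory=list)


# Every component has the same signature, so components are interchangeable.
Component = Callable[[SimpleDoc], SimpleDoc]


def rule_based_ner(doc: SimpleDoc) -> SimpleDoc:
    """Rule component: label a few known ingredient words with a regular expression."""
    for match in re.finditer(r"\b(garlic|butter|flour)\b", doc.text, flags=re.IGNORECASE):
        doc.ents.append((match.start(), match.end(), "INGREDIENT"))
    return doc


def fake_llm_ner(doc: SimpleDoc) -> SimpleDoc:
    """Placeholder for an LLM-backed component: returns a canned answer, no model is called."""
    start = doc.text.find("skillet")
    if start != -1:
        doc.ents.append((start, start + len("skillet"), "EQUIPMENT"))
    return doc


def run_pipeline(text: str, components: List[Component]) -> SimpleDoc:
    """Apply each component in order and return the annotated document."""
    doc = SimpleDoc(text)
    for component in components:
        doc = component(doc)
    doc.ents.sort()
    return doc


doc = run_pipeline("Melt butter in a skillet, then add garlic.", [rule_based_ner, fake_llm_ner])
labelled = [(doc.text[start:end], label) for start, end, label in doc.ents]
print(labelled)
```

Because downstream code only sees `SimpleDoc`, replacing `fake_llm_ner` with a trained model later would not require changes anywhere else in the pipeline.
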
  8. CLOSE THE GAP BETWEEN PROTOTYPE AND PRODUCTION 🔗 standardize inputs

    and outputs 📈 start with evaluation EXPLOSION.AI/BLOG/APPLIED-NLP-THINKING 🎯 assess utility, not just accuracy
  9. CLOSE THE GAP BETWEEN PROTOTYPE AND PRODUCTION 🔗 standardize inputs

    and outputs 📈 start with evaluation EXPLOSION.AI/BLOG/APPLIED-NLP-THINKING 🎯 assess utility, not just accuracy 🔁 work on data iteratively
  10. CLOSE THE GAP BETWEEN PROTOTYPE AND PRODUCTION 🔗 standardize inputs

    and outputs 📈 start with evaluation EXPLOSION.AI/BLOG/APPLIED-NLP-THINKING 🎯 assess utility, not just accuracy 🔁 work on data iteratively 💬 consider structure and ambiguity of natural language
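
"Start with evaluation" can be made concrete with a small scoring function that is written before any model is chosen. The sketch below computes precision, recall and F1 over exact-match entity spans. The function name `prf` and the toy spans are illustrative assumptions, and exact span matching is only one of several reasonable scoring schemes.

```python
from typing import Dict, Set, Tuple

# A span is (start_char, end_char, label).
Span = Tuple[int, int, str]


def prf(gold: Set[Span], predicted: Set[Span]) -> Dict[str, float]:
    """Exact-match precision, recall and F1 between gold and predicted spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy data: the prediction gets one of three labels wrong.
gold = {(5, 11, "INGREDIENT"), (17, 24, "EQUIPMENT"), (35, 41, "INGREDIENT")}
predicted = {(5, 11, "INGREDIENT"), (17, 24, "INGREDIENT"), (35, 41, "INGREDIENT")}

scores = prf(gold, predicted)
print(scores)
```

The same function can score an LLM prompt, a rule set and a trained component against one held-out set, which gives a shared baseline for comparing them. Whether a given score is good enough is the separate utility question the slide raises.
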
  11. processing pipeline prototype 🔮 📦 GITHUB.COM/EXPLOSION/SPACY-LLM processing pipeline in production

    📦 📦 📦 📦 📊 structured Doc object 📊 structured Doc object PROTOTYPE TO PRODUCTION
  12. processing pipeline prototype 🔮 📦 prompt model & transform output

    to structured data GITHUB.COM/EXPLOSION/SPACY-LLM processing pipeline in production 📦 📦 📦 📦 📊 structured Doc object 📊 structured Doc object PROTOTYPE TO PRODUCTION
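
The "prompt model & transform output to structured data" step can be sketched as two functions: one builds the prompt, the other validates the raw response and converts it into character spans. This is a minimal sketch under stated assumptions, not how spacy-llm implements it. `build_prompt`, `parse_response`, the JSON response format and the hard-coded response string are all hypothetical, and the labels are borrowed from the case study in this talk.

```python
import json
from typing import List, Tuple

LABELS = ["DISH", "INGREDIENT", "EQUIPMENT"]


def build_prompt(text: str) -> str:
    """Ask for a JSON object so the answer can be parsed rather than read."""
    return (
        "Extract entities from the text below. Respond with a JSON object "
        f"mapping each of {LABELS} to a list of exact substrings.\n\n"
        f"Text: {text}"
    )


def parse_response(text: str, response: str) -> List[Tuple[int, int, str]]:
    """Turn a raw model response into (start_char, end_char, label) spans.

    Anything malformed, or not found verbatim in the text, is dropped.
    Only the first occurrence of each phrase is used in this sketch.
    """
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    spans = []
    for label in LABELS:
        phrases = data.get(label, [])
        if not isinstance(phrases, list):
            continue
        for phrase in phrases:
            if not isinstance(phrase, str) or not phrase:
                continue
            start = text.find(phrase)
            if start == -1:
                continue  # the model produced a phrase that is not in the text
            spans.append((start, start + len(phrase), label))
    return sorted(spans)


text = "I made carbonara in a cast iron pan with pancetta."
prompt = build_prompt(text)
# Hard-coded stand-in for a model response; "cream" does not occur in the text.
response = '{"DISH": ["carbonara"], "INGREDIENT": ["pancetta", "cream"], "EQUIPMENT": ["cast iron pan"]}'
spans = parse_response(text, response)
print([(text[start:end], label) for start, end, label in spans])
```

Keeping the parsing step strict means the rest of the pipeline only ever sees spans that exist in the input, regardless of which model produced the response.
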
  13. ▪ PyData NYC 2023 workshop: extracting dishes, ingredients and equipment

    from r/cooking Reddit posts SPACY.FYI/PYDATA-NYC CASE STUDY 🕓 8 hours DATA DEV TIME 📦 400mb MODEL SIZE 🔥 2000+ WORDS / SECOND
  14. ▪ PyData NYC 2023 workshop: extracting dishes, ingredients and equipment

    from r/cooking Reddit posts ▪ used LLM during annotation SPACY.FYI/PYDATA-NYC CASE STUDY 🕓 8 hours DATA DEV TIME 📦 400mb MODEL SIZE 🔥 2000+ WORDS / SECOND
  15. ▪ PyData NYC 2023 workshop: extracting dishes, ingredients and equipment

    from r/cooking Reddit posts ▪ used LLM during annotation ▪ beat few-shot LLM baseline of 0.74 with task-specific model SPACY.FYI/PYDATA-NYC CASE STUDY 🕓 8 hours DATA DEV TIME 📦 400mb MODEL SIZE 🔥 2000+ WORDS / SECOND
  16. ▪ PyData NYC 2023 workshop: extracting dishes, ingredients and equipment

    from r/cooking Reddit posts ▪ used LLM during annotation ▪ beat few-shot LLM baseline of 0.74 with task-specific model ▪ 20× inference time speedup SPACY.FYI/PYDATA-NYC CASE STUDY 🕓 8 hours DATA DEV TIME 📦 400mb MODEL SIZE 🔥 2000+ WORDS / SECOND
  17. ▪ LLMs can be one part of a product or

    process, and swapped for different approaches. CONCLUSION
  18. ▪ LLMs can be one part of a product or

    process, and swapped for different approaches. ▪ Iteration and the right tooling can get you past the prototype plateau. CONCLUSION
  19. ▪ LLMs can be one part of a product or

    process, and swapped for different approaches. ▪ Iteration and the right tooling can get you past the prototype plateau. ▪ There’s no need to compromise on development best practices or privacy. CONCLUSION
  20. THANK YOU! 💥 Explosion 💫 spaCy ✨ Prodigy 🐦 Twitter

    🐘 Mastodon 🦋 Bluesky 💼 LinkedIn explosion.ai spacy.io prodigy.ai @_inesmontani @[email protected] @inesmontani.bsky.social