Large Language Models (LLMs) have enormous potential, but they also challenge existing industry workflows that require modularity, transparency and data privacy. In this talk, I'll show practical solutions for using the latest state-of-the-art models in real-world applications and for distilling their knowledge into smaller and faster components that you can run and maintain in-house.
_________________________________________________
▪️ Case Study #1: https://speakerdeck.com/inesmontani/workshop-half-hour-of-labeling-power-can-we-beat-gpt
▪️ Case Study #2: https://explosion.ai/blog/sp-global-commodities
▪️ Case Study #3: https://explosion.ai/blog/gitlab-support-insights
https://explosion.ai/blog/human-in-the-loop-distillation
This blog post presents practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house.
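To make the workflow concrete, here's a minimal sketch of the prototyping step using spacy-llm, spaCy's LLM integration (assuming a recent spacy-llm version; the labels, model choice and example text are illustrative assumptions, not taken from the post):

```python
# Minimal sketch: prototype an NER component backed by a hosted LLM
# via spacy-llm. Assumes `pip install spacy-llm` and an OPENAI_API_KEY
# in the environment; labels and model are illustrative assumptions.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        "task": {
            "@llm_tasks": "spacy.NER.v2",
            "labels": ["PERSON", "ORGANISATION", "LOCATION"],
        },
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
    },
)

doc = nlp("Ines Montani is a co-founder of Explosion, based in Berlin.")
print([(ent.text, ent.label_) for ent in doc.ents])
```

Once the prototype works, its predictions can be corrected by a human and used as training data for a much smaller in-house model, as sketched after the next case study.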
https://explosion.ai/blog/applied-nlp-thinking
This blog post discusses some of the biggest challenges of applied NLP and of translating business problems into machine learning solutions, including the distinction between utility and accuracy.
https://speakerdeck.com/inesmontani/workshop-half-hour-of-labeling-power-can-we-beat-gpt
A case study on using LLMs to create training data and beating the few-shot baseline with a distilled, task-specific model that extracts dishes, ingredients and equipment from r/cooking Reddit posts.
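As a rough sketch of the distillation side, using this case study's label scheme (the example text and token spans are made up for illustration): reviewed and corrected LLM predictions are packed into a spaCy DocBin and used to train a compact, task-specific pipeline.

```python
# Sketch: export (human-corrected) LLM annotations as spaCy training
# data for a small task-specific model. Text and spans are made up.
import spacy
from spacy.tokens import DocBin, Span

nlp = spacy.blank("en")
db = DocBin()

# In the real workflow, these spans come from LLM predictions that
# were reviewed and corrected by a human annotator.
doc = nlp("Whisk the eggs in a copper bowl.")
doc.ents = [
    Span(doc, 2, 3, label="INGREDIENT"),  # "eggs"
    Span(doc, 5, 7, label="EQUIPMENT"),   # "copper bowl"
]
db.add(doc)
db.to_disk("./train.spacy")

# The exported data can then be used to train a compact pipeline:
# python -m spacy train config.cfg --paths.train ./train.spacy --paths.dev ./dev.spacy
```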
https://explosion.ai/blog/sp-global-commodities
A case study on how S&P Global uses human-in-the-loop distillation to build efficient information extraction pipelines for real-time commodities trading insights in a high-security environment.
https://explosion.ai/blog/gitlab-support-insights
A case study on GitLab’s large-scale NLP pipelines for extracting actionable insights from support tickets and usage questions.
https://prodi.gy/docs/large-language-models
Prodigy comes with preconfigured workflows for using LLMs to speed up and automate annotation, and for creating the datasets needed to distill large generative models into task-specific components that are smaller, faster, more accurate and fully private.
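For illustration, here's a hedged sketch of launching one of these workflows from Python (assuming Prodigy v1.12+; the dataset name, config path and input file are placeholders, and the recipe call follows Prodigy's usual dataset-config-source argument order):

```python
# Sketch: start an LLM-assisted NER annotation session with Prodigy's
# ner.llm.correct recipe, which pre-annotates each example using the
# LLM defined in config.cfg so the annotator only corrects mistakes.
# "cooking_ner", config.cfg and data.jsonl are placeholders; the
# equivalent CLI call would be:
#   prodigy ner.llm.correct cooking_ner config.cfg data.jsonl
import prodigy

prodigy.serve("ner.llm.correct cooking_ner config.cfg data.jsonl")
```

The corrected annotations can then be exported with Prodigy's data-to-spacy command and used to train the distilled task-specific component.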