The AI Revolution Will Not Be Monopolized: How open-source beats economies of scale, even for LLMs (QCon London)

With the latest advancements in Natural Language Processing and Large Language Models (LLMs), and big companies like OpenAI dominating the space, many people wonder: Are we heading further into a black-box era, with larger and larger models obscured behind APIs controlled by big tech monopolies?

I don’t think so, and in this talk, I’ll show you why. I’ll take a closer look at the open-source model ecosystem, some common misconceptions about use cases for LLMs in industry, practical real-world examples, and how basic principles of software development such as modularity, testability and flexibility still apply. LLMs are a great new tool in our toolkit, but the end goal remains to create a system that does what you want it to do. Explicit is still better than implicit, and composable building blocks still beat huge black boxes.

As ideas develop, we’re seeing more and more ways to use compute efficiently, producing AI systems that are cheaper to run and easier to control. In this talk, I'll share some practical approaches that you can apply today. If you’re trying to build a system that does a particular thing, you don’t need to transform your request into arbitrary language and call into the largest model that understands arbitrary language the best. The people developing those models are telling that story, but the rest of us aren’t obliged to believe them.

Ines Montani

April 08, 2024

Resources

Behind the scenes

https://speakerdeck.com/inesmontani/the-ai-revolution-will-not-be-monopolized-behind-the-scenes

A more in-depth look at the concepts and ideas behind the talk, including academic literature, related experiments and preliminary results for distilled task-specific models.

Transcript

  1. EXPLOSION: developer tools company specializing in AI, machine learning
     and NLP. explosion.ai
     Ines Montani, Founder and CEO · Matthew Honnibal, Founder and CTO
  2. PRODIGY: modern, scriptable annotation tool for machine learning
     developers. prodigy.ai
     9k+ users · 800+ companies
  3. WHY OPEN SOURCE? Transparent · no lock-in · up to date · programmable ·
     extensible · community-vetted · runs in-house · easy to get started ·
     also free!
  4. OPEN-SOURCE MODELS
     task-specific models: small, often fast, cheap to run, don’t always
     generalize well, need data to fine-tune
     encoder models (ELECTRA, T5): relatively small and fast, affordable to
     run, generalize & adapt well, need data to fine-tune
     large generative models (Falcon, Mixtral): very large, often slower,
     expensive to run, generalize & adapt well, need little to no data
  5. ENCODING & DECODING TASKS
     encoder models: a network trained for a specific task, using the model
     to encode the input: text → model → vectors → task network (trained
     with labels) → task output
     large generative models: the model generates text that can be parsed
     into task-specific output: prompt template + text → model → raw output
     → parser → task output
     (Both patterns are sketched in code below.)
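To make the two patterns concrete, here is a minimal sketch using Hugging Face transformers. This is an illustration, not code from the talk: the model names are assumptions, and gpt2 is just a stand-in for a stronger instruction-tuned model.

```python
# Minimal sketch of the two patterns (model names are illustrative
# assumptions, not from the talk).
from transformers import pipeline

# Encoding: an encoder plus a task head maps text directly to labels.
encoder_clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(encoder_clf("The new release is fantastic."))  # [{'label': 'POSITIVE', ...}]

# Decoding: a generative model produces free text, which a parser must
# turn into structured, machine-facing output.
generator = pipeline("text-generation", model="gpt2")
prompt = (
    "Decide if the sentiment is POSITIVE or NEGATIVE.\n"
    "Text: The new release is fantastic.\nSentiment:"
)
raw = generator(prompt, max_new_tokens=3)[0]["generated_text"]
completion = raw[len(prompt):].upper()  # the parser step, reduced to a substring check
label = "POSITIVE" if "POSITIVE" in completion else "NEGATIVE"
print(label)
```

The encoder path returns structured output directly; the generative path always needs that extra parsing step before anything downstream can use it.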
  6. ECONOMIES OF SCALE
     Output costs: OpenAI and Google vs. you. Access to talent, compute
     etc., and API request batching: high traffic keeps batches full, low
     traffic leaves them mostly empty. (A toy calculation follows below.)
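As a rough illustration of the batching point, here is a toy calculation; the numbers are my own assumptions, not figures from the talk.

```python
# Toy model of batching economics (numbers are assumptions, not from the
# talk): a GPU forward pass costs about the same whether or not the batch
# is full, so providers with enough traffic to fill batches pay far less
# per request.
BATCH_SIZE = 32       # requests processed together in one forward pass
COST_PER_BATCH = 1.0  # arbitrary cost unit per forward pass

def cost_per_request(concurrent_requests: int) -> float:
    """Average cost per request when batches are filled only with your own traffic."""
    filled = min(concurrent_requests, BATCH_SIZE)
    return COST_PER_BATCH / filled

print(cost_per_request(32))  # high traffic: full batch, ~0.031 per request
print(cost_per_request(2))   # low traffic: nearly empty batch, 0.5 per request
```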
  7. AI PRODUCTS ARE MORE THAN JUST A MODEL
     human-facing systems (ChatGPT): the most important differentiation is
     product, not just technology: UI/UX, marketing, customization
     machine-facing models (GPT-4): swappable components based on research,
     with quantifiable impacts: cost, speed, accuracy, latency
     But what about the data? User data is an advantage for product, not
     the foundation for machine-facing tasks. You don’t need specific data
     to gain general knowledge.
  8. USE CASES IN INDUSTRY
     predictive tasks (structured data): entity recognition, relation
     extraction, coreference resolution, grammar & morphology, semantic
     parsing, discourse structure, text classification
     generative tasks: single/multi-doc summarization, reasoning, problem
     solving, paraphrasing, style transfer, question answering
     Many industry problems have remained the same; they’ve just changed
     in scale.
  9. EVOLUTION OF PROBLEM DEFINITIONS
     programming & rules: rules or instructions · machine learning
     (supervised learning): examples · in-context learning (prompt
     engineering): rules or instructions
     instructions: human-shaped, easy for non-experts, risk of data drift
     examples: nuanced and intuitive behaviors, specific to use case,
     labor-intensive (see the sketch after this item)
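To spell out the contrast, here is a small sketch of the same task defined both ways; the task and labels are hypothetical.

```python
# The same (hypothetical) support-ticket task defined two ways.

# Instructions: human-shaped and easy to write, but the mapping from
# wording to behavior can drift as real inputs change.
PROMPT_TEMPLATE = (
    "Label the support ticket as BUG, BILLING or OTHER.\n"
    "Ticket: {text}\nLabel:"
)

# Examples: capture nuanced, use-case-specific behavior, but each one has
# to be collected and labeled; this is the training data for a supervised
# classifier.
EXAMPLES = [
    ("App crashes when I upload a photo", "BUG"),
    ("I was charged twice this month", "BILLING"),
    ("Do you have a student discount?", "OTHER"),
]
```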
 10. WORKFLOW EXAMPLE: iterative model-assisted data annotation
     Prompt a large general-purpose model to pre-annotate, correct its
     output to build domain-specific data, evaluate continuously against a
     baseline, then use transfer learning to train a distilled
     task-specific model. (A schematic loop follows below.)
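Schematically, the loop might look like this. This is my reading of the slide, not code from the talk, and every helper passed in is a hypothetical stand-in: an LLM API call, a human review step in an annotation tool such as Prodigy, and a spaCy or transformers training run.

```python
# Schematic distillation loop (my interpretation of the slide, not code
# from the talk). All helper callables are hypothetical stand-ins.
from typing import Any, Callable

def distill(
    raw_texts: list[str],
    eval_set: list[Any],
    prompt_llm: Callable[[str], Any],     # large general-purpose model pre-annotates
    correct: Callable[[str, Any], Any],   # human reviews/corrects the draft
    train: Callable[[list[Any]], Any],    # transfer learning on a small model
    evaluate: Callable[[Any, list[Any]], float],
    target_score: float,
) -> Any:
    dataset: list[Any] = []
    model = None
    for text in raw_texts:
        draft = prompt_llm(text)              # model-assisted annotation
        dataset.append(correct(text, draft))  # corrected, domain-specific data
        model = train(dataset)                # distilled task-specific model
        if evaluate(model, eval_set) >= target_score:  # continuous evaluation
            break                             # good enough: stop annotating
    return model
```

In practice you would retrain per batch of corrections rather than per example, but the shape of the loop is the point: annotation, training and evaluation feed each other until the small model is good enough.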
 11. PROTOTYPE TO PRODUCTION (github.com/explosion/spacy-llm)
     Prototype: prompt the model & transform its output to structured data,
     a structured, machine-facing Doc object.
     Production: a processing pipeline where you can swap, replace and mix
     components. (A minimal example follows below.)
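A minimal example along the lines of the spacy-llm README; the exact registry names (e.g. "spacy.NER.v2", "spacy.GPT-3-5.v1") depend on the installed version.

```python
# Minimal spacy-llm sketch, following the project README; registry names
# may differ between versions.
import spacy

nlp = spacy.blank("en")
nlp.add_pipe(
    "llm",
    config={
        # Task: defines the prompt and how to parse raw output into a Doc
        "task": {
            "@llm_tasks": "spacy.NER.v2",
            "labels": "PERSON,ORGANISATION,LOCATION",
        },
        # Model: the backend being prompted; swappable independently of the task
        "model": {"@llm_models": "spacy.GPT-3-5.v1"},
    },
)
doc = nlp("Ines Montani is a co-founder of Explosion, based in Berlin.")
print([(ent.text, ent.label_) for ent in doc.ents])
```

Because the output is a standard Doc object, the llm component can later be swapped for a distilled task-specific component without changing any downstream code.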
 12. RESULTS & CASE STUDIES: CoNLL 2003 Named Entity Recognition

                                            F-Score   Speed (words/s)
     GPT-3.5 [1]                              78.6         < 100
     GPT-4 [1] (SOTA on few-shot prompting)   83.5         < 100
     spaCy (RoBERTa-base)                     91.6         4,000
     Flair                                    93.1         1,000
     SOTA 2023 [2]                            94.6         1,000
     SOTA 2003 [3]                            88.8      > 20,000

     [1] Ashok and Lipton (2023) · [2] Wang et al. (2021) ·
     [3] Florian et al. (2003)
 13. [Chart: accuracy on FabNER vs. number of training examples for a
     task-specific model, compared against Claude 2; annotated at 20
     examples.]
     Annotations: “we don’t need crowd workers” and “says more about crowd
     worker methodology than LLMs”.
 14. DISTILLED TASK-SPECIFIC COMPONENTS: modular · testable · flexible ·
     predictable · transparent · no lock-in · programmable · extensible ·
     run in-house · cheap to run
 15. MONOPOLY STRATEGIES: control of a resource · regulation · compounding
     economies of scale · network effects
     human-facing products vs. machine-facing models
 16. THE AI REVOLUTION WON’T BE MONOPOLIZED
     The software industry does not run on secret sauce. Knowledge gets
     shared and published. Secrets won’t give anyone a monopoly.
     Usage data is great for improving a product, but it doesn’t
     generalize. Data won’t give anyone a monopoly.
     LLMs can be one part of a product or process, and swapped for
     different approaches. Interoperability is the opposite of monopoly.
     Regulation could give someone a monopoly, if we let it. It should
     focus on products and actions, not components.