
Context Engineering - Making Every Token Count

Addy Osmani
September 09, 2025


Slides from my talk at O'Reilly AI CodeCon

Video: https://www.youtube.com/watch?v=zMM5zqesL1g

How do you get the best out of AI systems when every token in the context window matters? In this talk I break down context engineering - the art and science of filling an AI’s limited memory with the right mix of instructions, data, and history so it can perform at its best.

You'll learn:

- What tokens and context windows really are, and why they matter 
- Why prompt engineering often fails without strong context management 
- Practical strategies to avoid vague, hallucinated, or poisoned responses 
- Patterns for context management - Write, Select, Compress, Isolate
- How modern AI tools like Cursor and Cline optimize context automatically 
- Actionable tips for AI-assisted coding: from error logs and design docs to database schemas and PR feedback

Whether you’re building with coding agents, debugging with AI, or designing smarter prompts, this talk will help you make every token count.


Transcript

  1. CONTEXT WINDOWS ARE LIKE LIMITED RAM. Curation of what fits into RAM is analogous to "context engineering". (ANALOGY)
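
To make the RAM analogy concrete, here is a minimal sketch of checking how much of a fixed token budget a piece of context consumes. It assumes the tiktoken library as the tokenizer; the 128,000-token budget and the file name are illustrative, not any specific model's window.

```python
# A context window is a fixed token budget, much like RAM.
# tiktoken is used as one example tokenizer; the budget and file name
# below are illustrative assumptions, not tied to a particular model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def tokens_used(text: str) -> int:
    """Count how many tokens a piece of context will consume."""
    return len(enc.encode(text))

CONTEXT_BUDGET = 128_000  # assumed window size

context = open("design_doc.md").read()  # illustrative file
used = tokens_used(context)
print(f"{used} / {CONTEXT_BUDGET} tokens ({used / CONTEXT_BUDGET:.1%} of the window)")
```
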
  2. PROMPT ENGINEERING: Clear instructions for models so they can accomplish a task. https://www.youtube.com/watch?v=ysPbXH0LpIE (PRELUDE)
  3. CONTEXT ENGINEERING MEANS PROVIDING AN AI WITH ALL THE INFORMATION AND TOOLS IT NEEDS TO SUCCESSFULLY COMPLETE A TASK – NOT JUST A CLEVERLY WORDED PROMPT. (DEFINITION)
  4. TOO LITTLE CONTEXT: VAGUE OR HALLUCINATED RESPONSES. TOO MUCH CONTEXT: DISTRACTED, UNABLE TO FIND RELEVANT INFO, OVER-INDEXING ON PATTERNS. BAD CONTEXT: POISONING, TRUSTING INCORRECT STATEMENTS OVER TRAINING.
  7. SELECT CONTEXT • Retrieve relevant tools • Retrieve from scratchpad • Retrieve long-term memory • Retrieve relevant knowledge. @aifolksorg
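
A minimal sketch of the select pattern, assuming naive keyword-overlap scoring; real tools typically rank candidates with embeddings or a code-aware index rather than this toy scorer.

```python
# "Select context": retrieve only the snippets most relevant to the task
# instead of pasting everything into the prompt. The scoring below is
# naive keyword overlap, used purely for illustration.
def relevance(snippet: str, query: str) -> int:
    terms = set(query.lower().split())
    return sum(term in snippet.lower() for term in terms)

def select_context(snippets: list[str], query: str, top_k: int = 3) -> list[str]:
    """Return the top_k snippets that best match the query."""
    return sorted(snippets, key=lambda s: relevance(s, query), reverse=True)[:top_k]

docs = ["def checkout(cart): ...", "README: project setup", "def apply_discount(total): ..."]
print(select_context(docs, "fix the discount bug in checkout", top_k=2))
```
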
  8. COMPRESS CONTEXT • Summarize context to retain relevant tokens • Trim to remove irrelevant tokens @aifolksorg
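
A minimal sketch of the compress pattern, assuming a hypothetical summarize_with_llm() helper in place of a real model call: older turns are summarized so their relevant tokens survive, and recent turns are kept verbatim.

```python
# "Compress context": summarize older conversation turns and trim the
# rest. summarize_with_llm() is a hypothetical helper standing in for
# whatever summarization call your stack provides.
def compress_history(turns: list[str], keep_recent: int = 4) -> list[str]:
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    if not older:
        return recent  # nothing old enough to compress yet
    summary = summarize_with_llm("\n".join(older))  # hypothetical helper
    return [f"Summary of earlier conversation: {summary}"] + recent
```
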
  9. ISOLATE CONTEXT • Partition context in state • Hold in environment/sandbox • Partition across multi-agents @aifolksorg
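
A minimal sketch of the isolate pattern, assuming a hypothetical run_subagent() helper: each partition of the work is handled by a separate call whose context holds only its own files, so no single window has to carry everything.

```python
# "Isolate context": partition work so each sub-task gets its own,
# smaller context instead of one shared window. run_subagent() is a
# hypothetical helper and the file names are illustrative.
partitions = {
    "frontend": ["src/ui/App.tsx", "src/ui/styles.css"],
    "backend": ["api/server.py", "api/models.py"],
}

results = {
    name: run_subagent(task="review these files for bugs", files=files)  # hypothetical
    for name, files in partitions.items()
}
```
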
  10. CONTEXT ENGINEERING FOR AI CODERS
      • Be precise: Vague requests lead to vague answers. The more specific you are, the better your results will be.
      • Provide relevant code: Share the specific files, folders, or code snippets that are central to your request.
      • Include design documents: Paste or attach sections from relevant design docs to give the AI the bigger picture.
      • Share full error logs: For debugging, always provide the complete error message and any relevant logs or stack traces.
      • Show database schemas: When working with databases, a screenshot of the schema helps the AI generate accurate code for data interaction.
      • Use PR feedback: Comments from a pull request make for context-rich prompts.
      • Give examples: Show an example of what you want the final output to look like.
      • State your constraints: Clearly list any requirements, such as libraries to use, patterns to follow, or things to avoid.
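
Putting several of these tips together, a minimal sketch of assembling a context-rich prompt that bundles the task, the relevant code, the full error log, constraints, and an example of the desired output; the file names and constraints are illustrative placeholders.

```python
# Assemble a context-rich prompt for an AI coding assistant.
# All file names and constraints below are illustrative.
relevant_code = open("src/checkout.py").read()
error_log = open("pytest_failure.log").read()

prompt = f"""Task: fix the failing checkout test.

Relevant code (src/checkout.py):
{relevant_code}

Full error log:
{error_log}

Constraints:
- Use the existing payments client; do not add new dependencies.
- Follow the repository's existing typing conventions.

Desired output: a unified diff that touches only src/checkout.py."""
```
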