Everything Old is New Again: Why Information Retrieval Still Powers AI Search

As generative AI reshapes how users discover information, many assume traditional search principles are obsolete. This talk argues the opposite. From ranking and relevance to retrieval pipelines and evaluation, modern AI Search systems are built on decades of information retrieval research. Attendees will learn how foundational IR concepts quietly underpin LLM-powered search experiences—and why understanding them is essential for building, optimising, and auditing AI-driven discovery systems.

Dawn Anderson

March 25, 2026

Transcript

  1. Who is Dawn Anderson?
     • SEO practitioner for almost 20 years
     • International SEO conference speaker since 2017
     • Boutique agency owner (Bertey)
     • SEO consultant
     • Information retrieval & AI search world interloper
     • Currently commissioned to write a book on AI SEO
  2. The 'R' in RAG Does the Heavy Lifting
     The generation is only as good as the context you provide it (retrieval).
  3. What is Information Retrieval (IR)?
     • The computer science field behind search as we know it
     • Many nuanced specialisms within the field
     • Close relatives, offspring and siblings in:
       o Natural language processing
       o Recommender systems
       o AI search / generative information retrieval
       o Knowledge graphs and structured data specialisms
  4. Chunking Strategies
     • There are many different types of chunking; the core split is fixed-size vs. semantic
     • Fixed-size chunking (by word, sentence or paragraph count) is very rigid
     • Semantic chunking takes meaning and context into account
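The fixed-size approach described above can be sketched in a few lines. This is a minimal illustration, not a production chunker; the function name, chunk size, and sample text are my own choices, and a semantic chunker would instead split on meaning boundaries (e.g. embedding-similarity drops between sentences).

```python
def fixed_size_chunks(text, size=5):
    """Split text into rigid chunks of `size` words each.

    A semantic chunker would split where the topic shifts instead
    of at an arbitrary word count -- this rigidity is the trade-off.
    """
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

text = "Vector search retrieves passages. Chunk size shapes what context the model sees."
chunks = fixed_size_chunks(text, size=5)
# Note how the first chunk cuts mid-sentence: "Vector search retrieves passages. Chunk"
```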
  5. Metadata: The Unsung Hero
     • Tag chunks with metadata (date, author, source)
     • Allows for search filtering later
     • Classic IR approaches adapted for AI search
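Metadata filtering can be as simple as matching key-value pairs before (or after) vector retrieval. The chunk records and field names below are hypothetical examples, assuming chunks are stored as dictionaries alongside their text.

```python
chunks = [
    {"text": "Q3 revenue grew 12%.", "date": "2024-10-01", "source": "earnings"},
    {"text": "Q3 revenue grew 8%.",  "date": "2023-10-01", "source": "earnings"},
    {"text": "Office dog policy.",   "date": "2024-06-01", "source": "handbook"},
]

def filter_chunks(chunks, **criteria):
    """Keep only chunks whose metadata matches every criterion."""
    return [c for c in chunks if all(c.get(k) == v for k, v in criteria.items())]

# Restrict retrieval to the current earnings report only
recent_earnings = filter_chunks(chunks, source="earnings", date="2024-10-01")
```

Without the date filter, both earnings chunks would compete and the stale figure could win on similarity alone -- exactly the classic IR filtering problem the slide points to.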
  6. Precision vs. Recall - The Classic IR Trade-off
     • Precision: of all the documents we retrieved, how many were actually useful?
     • Recall: of all the useful documents in the database, how many did we manage to find?
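The two definitions above translate directly into set arithmetic. The document IDs here are made up for illustration:

```python
retrieved = {"d1", "d2", "d3", "d4"}        # documents the system returned
relevant  = {"d2", "d4", "d5", "d6", "d7"}  # documents that are actually useful

true_positives = retrieved & relevant        # {"d2", "d4"}

precision = len(true_positives) / len(retrieved)  # 2 of 4 retrieved were useful
recall    = len(true_positives) / len(relevant)   # 2 of 5 useful docs were found
```

Retrieving more documents tends to raise recall while dropping precision, which is why this is called a trade-off.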
  7. Semantic Similarity (K-Nearest Neighbors)
     • How vector search ranks results
     • Distance between the query vector and document vectors
     • The query acts as the centroid
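A bare-bones sketch of how k-nearest-neighbour ranking works: embed the query, score every document by cosine similarity, keep the top k. The three-dimensional vectors are toy values standing in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1.0, 0.2, 0.0]          # the query vector is the "centroid" we rank around
docs = {
    "doc_a": [0.9, 0.3, 0.1],
    "doc_b": [0.1, 0.8, 0.9],
    "doc_c": [1.0, 0.1, 0.0],
}

# k-nearest neighbours: sort all documents by similarity to the query, keep top k
k = 2
top_k = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:k]
```

Real systems use approximate nearest-neighbour indexes rather than this exhaustive scan, but the ranking principle is the same.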
  8. The Flaws of Vector Search - Why vector search isn't a silver bullet
     • "Cat chasing dog" vs. "dog chasing cat": in vector search these look mostly the same, but obviously they are NOT
     • Vector search struggles with exact phrasing, part numbers, and negations
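The cat/dog point can be made starkly with a word-count representation. Dense embeddings are not literally bags of words, but they inherit some of the same order-insensitivity; in the extreme bag-of-words case, the two opposite sentences collapse to the *identical* vector:

```python
from collections import Counter

def bag_of_words(text):
    """Order-blind word-count vector for a sentence."""
    return Counter(text.lower().split())

a = bag_of_words("cat chasing dog")
b = bag_of_words("dog chasing cat")
# Same words, same counts -> identical vectors, even though who is
# chasing whom has been reversed. Word order is simply discarded.
```

This is why hybrid setups keep a lexical/exact-match component alongside the vector index for phrases, part numbers, and negations.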
  9. Learning to Rank (LTR)
     • Using machine learning to optimise ranking
     • Training a model to weigh different signals
     • What matters most in different contexts for different queries?
     • Tune from learnings
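At its simplest, "weighing different signals" means a scoring function over per-document features. The signals, documents, and weights below are hand-set for illustration only; in real LTR (e.g. pointwise regression or LambdaMART-style models) the weights are *learned* from labelled relevance judgements or click data.

```python
# Hypothetical per-document ranking signals (all values invented).
signals = {
    "doc_a": {"bm25": 0.8, "semantic": 0.4, "freshness": 0.9},
    "doc_b": {"bm25": 0.5, "semantic": 0.9, "freshness": 0.2},
}

# In a trained LTR model these weights come from the training loop,
# not from a human; different query types can get different weightings.
weights = {"bm25": 0.5, "semantic": 0.4, "freshness": 0.1}

def score(doc):
    """Linear combination of weighted signals for one document."""
    return sum(weights[s] * v for s, v in signals[doc].items())

ranked = sorted(signals, key=score, reverse=True)
```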
  10. Context Injection in LLM Models – Advanced RAG Technique
      Source: https://apxml.com/courses/getting-started-rag/chapter-4-rag-generation-augmentation/context-injection-methods
  11. Fallback Strategies
      • What to do when no relevant data is found
      • Good LLM systems should return "I don't know", but often they just guess the most probable answer
      • The LLM must be explicitly instructed to answer "I don't know"
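One common way to enforce this is a retrieval-confidence gate in front of generation. The function, threshold value, and result format below are illustrative assumptions; in practice the "I don't know" instruction also belongs in the system prompt itself.

```python
def answer_with_fallback(results, threshold=0.75):
    """Refuse to answer when retrieval confidence is too low.

    `results` is a list of (passage, similarity_score) pairs;
    the 0.75 threshold is an arbitrary example value to tune.
    """
    if not results or max(score for _, score in results) < threshold:
        return "I don't know"
    best_passage, _ = max(results, key=lambda r: r[1])
    return f"Based on the retrieved context: {best_passage}"
```

With no results, or only weak matches, the system declines instead of letting the model guess.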
  12. Classic IR Metrics
      1. MRR (Mean Reciprocal Rank)
      2. NDCG (Normalised Discounted Cumulative Gain)
      3. MAP (Mean Average Precision)
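Of the three, MRR is the simplest to compute: for each query, take the reciprocal of the rank of the first relevant result, then average across queries. The rankings below are invented; NDCG and MAP follow the same per-query-then-average pattern with more nuanced per-query scores.

```python
def mrr(rankings):
    """Mean Reciprocal Rank over a list of per-query result lists.

    Each ranking is a list of booleans: True where the result at
    that position was relevant, in rank order.
    """
    total = 0.0
    for ranking in rankings:
        for rank, is_relevant in enumerate(ranking, start=1):
            if is_relevant:
                total += 1.0 / rank
                break  # only the FIRST relevant hit counts for MRR
    return total / len(rankings)

# Two example queries: first relevant hit at rank 1 and rank 3
queries = [
    [True,  False, False],
    [False, False, True],
]
# MRR = (1/1 + 1/3) / 2 = 2/3
```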
  13. RAG Evaluation
      • Was good context retrieved?
      • Did the LLM stick to the context?
      • Did it actually answer the user's question?
  14. The Future - Beyond RAG: Agentic Search
      The shift from single queries to multi-step reasoning.
  15. Respect the Foundations
      Generative AI may be the shiny user interface, but IR is the reliable pipeline.
  16. Thank you
      • X – dawnieando
      • LinkedIn – MsDawnAnderson
      • Threads – dawnieando
      • Bluesky – dawnieando
      • Bertey.com