
Lessons Learnt from Crawling 1000+ Websites

First presented at BrightonSEO in October 2025, this deck contains 12 important lessons for crawling websites, based on our successes and failures over the last 30 years.

We've also included plenty of examples in the lessons.

Charles Meaden

October 27, 2025

Transcript

  1. @charlesmeaden #brightonseo This Was The Height of SEO

    <meta name="keywords" content="Robert Jordan, house, flat, bungalow, farm, apartment, country house, terraced house, semi-detached house, detached house, condominium, rental, rent, landlord, tenant, property management, homes, leases, agreements, landlords insurance, accommodation, rooms, management lettings, residential, renters,">
  2. 1: Make Friends With The Devs

    • They know all the tricks to getting your crawl to proceed • Speak their language
  3. 2: Never Assume

    • Work on the premise that somewhere, someone has got it wrong • Always Be Curious
  4. 3: Ask Some, Not Many Questions

    • Just enough for the basic facts • Go in with an open mind
  5. 5: Do A Quick First Pass Crawl

    You’ll quickly see what you can safely exclude from the whole crawl
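One way to read a first-pass crawl is to count the sampled URLs by site section and by query parameter, so that large, low-value areas stand out as candidates for exclusion. The sketch below is illustrative only: the `summarise_sample` helper, the `example.com` URLs and the idea of grouping by first path segment are assumptions, not something taken from the deck.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def summarise_sample(urls):
    """Count sampled URLs per top-level section and per query parameter name."""
    sections = Counter()
    params = Counter()
    for url in urls:
        parts = urlsplit(url)
        segments = [s for s in parts.path.split("/") if s]
        # Group by the first path segment; URLs with none count as the root.
        section = f"/{segments[0]}/" if segments else "/"
        sections[section] += 1
        for name, _value in parse_qsl(parts.query, keep_blank_values=True):
            params[name] += 1
    return sections, params


# Hypothetical sample from a capped first-pass crawl.
sample_urls = [
    "https://example.com/",
    "https://example.com/products/kettle",
    "https://example.com/search/?q=kettle&page=2",
    "https://example.com/search/?q=lamp",
    "https://example.com/search/?q=lamp&sort=price",
    "https://example.com/blog/crawling",
]

sections, params = summarise_sample(sample_urls)
for section, count in sections.most_common():
    print(f"{count:>3}  {section}")
for name, count in params.most_common():
    print(f"{count:>3}  ?{name}=")
```

In this made-up sample the internal search section dominates, which is the kind of pattern that might justify an exclusion rule before the full crawl.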
  6. 6: Check What Appears On The Page

    • Data from a crawl does a great job of telling you about the page structure • What about the content on the page?
  7. A Case Study

    • USA Hotel chain site with 900 hotels, each with their own page(s) • Former agency had crawled for 18 months and found no issues
  8. 7: Every Web Site is Slightly Different

    Content management systems and JavaScript frameworks handle things differently
  9. 8: Compare and Contrast

    • Run a text and JavaScript crawl • You may be surprised by the difference
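A rough sketch of such a comparison, assuming you already have two versions of the same page as strings: the raw HTML from a text crawl and the rendered HTML from a JavaScript crawl. How the rendered HTML is obtained is out of scope here, and the `PageSummary`, `summarise` and `compare` names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collect link targets and a visible-word count from one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = set()
        self.words = 0
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.add(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words += len(data.split())


def summarise(html):
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser


def compare(raw_html, rendered_html):
    """Report what differs between the text crawl and the JavaScript crawl."""
    raw, rendered = summarise(raw_html), summarise(rendered_html)
    return {
        "links_only_after_js": sorted(rendered.links - raw.links),
        "links_only_in_raw": sorted(raw.links - rendered.links),
        "word_count_raw": raw.words,
        "word_count_rendered": rendered.words,
    }


# Hypothetical page whose main content is injected by JavaScript.
raw_html = (
    '<html><body><div id="app"></div>'
    '<a href="/about">About</a><script>render()</script></body></html>'
)
rendered_html = (
    '<html><body><div id="app"><h1>Rooms</h1>'
    '<a href="/rooms/1">Room one</a></div>'
    '<a href="/about">About</a></body></html>'
)

report = compare(raw_html, rendered_html)
print(report)
```

Links or text that only exist after rendering are the ones worth a closer look, since a text-only crawl never sees them.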
  10. 8: Keep Your Crawls

    • Always handy when a change has been made • You’ve been tasked with working out what went wrong
  11. Examples

    • Lighthouse tests • Checking for 301 and 404 • XML Orphan Pages • Overly large images
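As an illustration of why saved crawls are useful, the sketch below diffs the status codes from two stored crawls to show what changed between them. It assumes each crawl has been reduced to a mapping of URL path to HTTP status; the `diff_status` name and the sample data are made up for the example.

```python
def diff_status(before, after):
    """Compare two crawl snapshots given as {url: status_code} mappings."""
    shared = before.keys() & after.keys()
    changed = {
        url: (before[url], after[url])
        for url in sorted(shared)
        if before[url] != after[url]
    }
    return {
        "changed": changed,
        "disappeared": sorted(before.keys() - after.keys()),
        "new": sorted(after.keys() - before.keys()),
    }


# Hypothetical snapshots taken before and after a site release.
crawl_before = {"/": 200, "/old": 200, "/sale": 200, "/contact": 200}
crawl_after = {"/": 200, "/old": 301, "/contact": 404, "/new": 200}

diff = diff_status(crawl_before, crawl_after)
for url, (was, now) in diff["changed"].items():
    print(f"{url}: {was} -> {now}")
print("No longer found:", diff["disappeared"])
print("Newly found:", diff["new"])
```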
  12. Case Study – Embargoed Products

    • Resellers who have restrictions on the brands they can show in Google • Run a weekly crawl to check for pages that mention those brands
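A minimal sketch of the checking step, assuming the weekly crawl has already extracted each page's text. The brand names, page paths and the `find_embargoed_mentions` helper are hypothetical; the matching is a simple case-insensitive whole-word search, which may need adjusting for brand names that contain punctuation.

```python
import re


def find_embargoed_mentions(pages, brands):
    """Return (url, brand, count) for every page that mentions an embargoed brand.

    pages  -- mapping of URL to the text extracted from that page
    brands -- iterable of embargoed brand names
    """
    patterns = {
        brand: re.compile(r"\b" + re.escape(brand) + r"\b", re.IGNORECASE)
        for brand in brands
    }
    hits = []
    for url, text in pages.items():
        for brand, pattern in patterns.items():
            count = len(pattern.findall(text))
            if count:
                hits.append((url, brand, count))
    return hits


# Hypothetical crawl output and embargo list.
pages = {
    "/p/1": "New Acme kettle. ACME toaster too.",
    "/p/2": "Generic kettle",
    "/p/3": "Globex lamp, made in Acmeville",
}
embargoed = ["Acme", "Globex"]

hits = find_embargoed_mentions(pages, embargoed)
for url, brand, count in hits:
    print(f"{url} mentions {brand} ({count}x)")
```

The whole-word match is a deliberate choice here: it avoids flagging "Acmeville" as a mention of "Acme", at the cost of missing brand names glued to other text.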
  13. Things We’ve Extracted

    • Stock levels / Out of stock • Categories and taxonomies • Schemas • Video embeds
  14. Crawl and Analytics Data

    Use traffic data to understand which changes will make the most impact
  15. Case Study – Crawl and Stock Data

    • Ecommerce client combines crawl data with stock data • Determines which category pages get updated first
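The deck does not say how the client ranks category pages, so the sketch below shows just one plausible rule: join crawled product rows to stock levels by SKU and put the categories with the highest share of out-of-stock products first. The field names (`category_url`, `sku`), the `rank_categories` helper and the data are all assumptions for illustration.

```python
from collections import defaultdict


def rank_categories(crawl_rows, stock_levels):
    """Rank category pages by the share of their products that are out of stock.

    crawl_rows   -- dicts with 'category_url' and 'sku', one per crawled product
    stock_levels -- mapping of SKU to units in stock
    """
    totals = defaultdict(int)
    out_of_stock = defaultdict(int)
    for row in crawl_rows:
        category = row["category_url"]
        totals[category] += 1
        # Assumption: a SKU missing from the stock feed counts as out of stock.
        if stock_levels.get(row["sku"], 0) <= 0:
            out_of_stock[category] += 1
    ranked = [(cat, out_of_stock[cat] / totals[cat]) for cat in totals]
    # Highest out-of-stock share first; ties broken by URL for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical crawl extract and stock feed.
crawl_rows = [
    {"category_url": "/c/kettles", "sku": "A1"},
    {"category_url": "/c/kettles", "sku": "A2"},
    {"category_url": "/c/lamps", "sku": "B1"},
    {"category_url": "/c/lamps", "sku": "B2"},
    {"category_url": "/c/lamps", "sku": "B3"},
    {"category_url": "/c/lamps", "sku": "B4"},
]
stock_levels = {"A1": 0, "A2": 5, "B1": 0, "B2": 0, "B3": 0, "B4": 2}

ranking = rank_categories(crawl_rows, stock_levels)
for category, share in ranking:
    print(f"{category}: {share:.0%} out of stock")
```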