as you know it. It's typically: • Stripped of HTML tags • Cleared of boilerplate • Cleaned of stray artefacts • (Usually) captured without rendering JavaScript • Not current
• Structured data and entity recognition are how you get on the shortlist to be part of the RAG pipeline, or how you’re considered a “good result” in the training data in the first place • When you codify how you talk about your brand and keep that consistent across channels, across media, across everything you can control and influence, that’s a pattern LLMs can recognise and interpolate
'Session source/medium' to match regex .*meta.ai.*|.*perplexity.*|.*claude.*|.*mistral.*|.*gemini.*|.*chatgpt.*|.*copilot.*|.*manus.*|.*huggingface.co.*|.*grok.*|.*deepseek.*|.*you.com.*|.*poe.*|.*character.ai.*
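To sanity-check the filter before relying on it in a report, the same pattern can be run over an export of source/medium values. This is a minimal sketch: the pattern is the one above written without embedded spaces, while `is_ai_referral` and the sample rows are made up for illustration.

```python
import re

# The pattern from the slide, written without embedded spaces.
AI_SOURCE_PATTERN = re.compile(
    r".*meta.ai.*|.*perplexity.*|.*claude.*|.*mistral.*|.*gemini.*"
    r"|.*chatgpt.*|.*copilot.*|.*manus.*|.*huggingface.co.*|.*grok.*"
    r"|.*deepseek.*|.*you.com.*|.*poe.*|.*character.ai.*"
)


def is_ai_referral(source_medium: str) -> bool:
    """Return True when the whole 'source / medium' string matches the pattern."""
    return AI_SOURCE_PATTERN.fullmatch(source_medium.lower()) is not None


# Hypothetical rows, as they might appear in an exported report.
rows = [
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "google / organic",
    "(direct) / (none)",
]
ai_rows = [row for row in rows if is_ai_referral(row)]
print(ai_rows)
```

Short tokens such as `poe` match any source that merely contains those letters, so it is worth eyeballing the matched rows for false positives.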
Access logs • Usually from your server, sometimes from your CDN (e.g. Cloudflare) • Probably a request to your development team, or to your engineers if they sit separately • 3 months for a top-level view • 6-12 months for a trend • Can use Screaming Frog to aggregate data (or build it yourself with Claude’s help)
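If you go the build-it-yourself route, the aggregation can start very small. The sketch below assumes logs in the common "combined" format and counts requests per crawler and status code; the `BOT_TOKENS` list is an assumption to extend for your own logs, and the sample lines are invented for illustration.

```python
import re
from collections import Counter

# Combined Log Format:
# ip identity user [time] "request" status bytes "referrer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Assumed list of crawler tokens to look for in the user agent.
BOT_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def count_bot_hits(lines):
    """Count requests per (bot token, status code), skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        agent = match.group("agent").lower()
        for token in BOT_TOKENS:
            if token.lower() in agent:
                counts[(token, match.group("status"))] += 1
                break
    return counts


# Invented sample lines, not real log data.
sample = [
    '203.0.113.7 - - [10/Mar/2025:10:00:00 +0000] "GET /shirts HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [10/Mar/2025:10:00:05 +0000] "GET /old-page HTTP/1.1" 404 310 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.2 - - [10/Mar/2025:10:01:00 +0000] "GET / HTTP/1.1" 200 2048 "https://example.com/" "Mozilla/5.0 (Windows NT 10.0)"',
]
hits = count_bot_hits(sample)
print(hits)
```

Grouping by status code as well as by bot makes it easy to spot where crawlers are hitting errors.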
Chrome (118/119/120) • Often won’t have a referrer, though referrers are frequently missing from logs anyway • Geo mismatch (Vietnam location, en-US lang) • Simple user-agent names, e.g. “python-requests/2.28” • Matches datacentre IPs rather than residential
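These signals can be combined into a rough per-request check. The following is only a sketch under stated assumptions: the `Request` fields, the `SCRIPT_AGENT_PREFIXES` list, and the language lookup are hypothetical, and `DATACENTRE_RANGES` uses a reserved documentation block as a placeholder for a real list of datacentre CIDRs that you would need to source yourself.

```python
import ipaddress
from dataclasses import dataclass

# Placeholder range (a reserved documentation block), standing in for a
# real list of datacentre CIDRs.
DATACENTRE_RANGES = [ipaddress.ip_network("203.0.113.0/24")]

# Assumed prefixes of script-style user agents.
SCRIPT_AGENT_PREFIXES = ("python-requests/", "curl/", "Go-http-client/")


@dataclass
class Request:
    ip: str
    user_agent: str
    referrer: str
    country: str   # e.g. "VN", from a geo-IP lookup
    language: str  # e.g. "en-US", from the Accept-Language header


def bot_signals(req: Request, expected_language_by_country: dict) -> list:
    """Return the names of the heuristics this request trips."""
    signals = []
    if req.referrer in ("", "-"):
        signals.append("no_referrer")
    if req.user_agent.startswith(SCRIPT_AGENT_PREFIXES):
        signals.append("script_user_agent")
    expected = expected_language_by_country.get(req.country)
    if expected is not None and not req.language.lower().startswith(expected):
        signals.append("geo_language_mismatch")
    address = ipaddress.ip_address(req.ip)
    if any(address in network for network in DATACENTRE_RANGES):
        signals.append("datacentre_ip")
    return signals


# Hypothetical request matching the pattern described above.
suspect = Request("203.0.113.9", "python-requests/2.28", "-", "VN", "en-US")
print(bot_signals(suspect, {"VN": "vi"}))
```

No single signal is conclusive (a missing referrer is common for real visitors too), so treat the result as a prompt for review rather than a verdict.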
for the specific shade of grey your client asked for? Or it doesn’t have the right alt attribute to describe it properly? Abandon. • Page takes too long to load? Nope. • Can’t search easily for v-neck in the on-site search? Gone. • Exact travel dates not available? Next hotel. • Have a pointless discount pop-up that blocks the category page? Bye!
• Are all pages a 200 response? • Do your forms work properly? • Accessible to a text browser? (e.g. WCAG AA/AAA) • Does your filtering work? Your on-site search? • Does your discount pop-up block the conversion path visually? • Have you removed automatic geo-based redirects? • Do you allow guest checkout? Or force login early?
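The first item on this checklist is easy to automate. Below is a minimal sketch using only the Python standard library; the function names are made up for illustration, and it assumes your server answers HEAD requests (switch to GET if it does not).

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; keep the code.
        return error.code


def non_200_pages(urls, fetch=fetch_status):
    """Return {url: status} for every URL that does not answer 200."""
    results = {}
    for url in urls:
        status = fetch(url)
        if status != 200:
            results[url] = status
    return results


# Usage (performs real network requests):
# print(non_200_pages(["https://example.com/", "https://example.com/missing"]))
```

Because redirects are followed, a redirected URL reports the status of its final destination, and connection failures are raised rather than swallowed. The `fetch` parameter lets you swap in a different client or a stub for testing.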
of a purely “SEO” stance again, with: • Retention conversions • CRO initiatives • Product details and ontology • Accessibility • Brand building via digital footprint …and yes, stricter technical SEO and managing tech debt
cookies, 3 of 4 times • Agents will probably be logged as direct traffic • Agents will likely be logged as desktop • Agents will be Chrome/Chromium • Engagement metrics will be... weird
in step with the business 1. LLM bots are here to stay, whether we want them to be or not 2. Bots are not forgiving of errors or conversion blockers 3. This is more than traditional SEO