Reimagining SEO in the AI Era [SoCi Reimagine 2024 Talk]

Keynote Speaker: Michael King, CEO of iPullRank

3 3 Download this deck: https://speakerdeck.com/ipullrank

4 Salutations! I’m Mike King (@iPullRank)

6 People are Searching Differently?

7 7 “Search It Up” is the new “Google it”

9 9 What does it mean when a brand ceases
to be a verb?

10 There’s A Lot of Discussion of ChatGPT Replacing Google
The same is true for TikTok, Perplexity, Bing’s Copilot, and [insert new genAI search tool here].

11 11 40% of People Leaving ChatGPT Go to Google
My assumption is that many of these people are fact-checking. That is a bad behavior to establish for nearly half of your users. This is also an indication that people are deeply aware of the issues related to hallucinations. In other words, people don’t trust the product.

12 Yes, TikTok is a (Nascent) Search Engine
41% of TikTok users perform searches, but there is not enough search volume across broad, meaningful queries to make it more than a small supplement to Google Search.

13 21% of People Going to TikTok Come from Google
24.5% of People Leaving TikTok Go to Google

14 14 The Last Time the Major Search Engine Died
it Looked Like This

15 15 Google Search Still Dwarfs Everything, but More People
are Using More Channels

16 16 It means there is fragmentation in how information
needs are being met.

17 I wish OpenAI the best with this. It will be very difficult to supplant Google as the search engine of record.

18 18 Users Have Very High Expectations of a Search
Box

19 19 …but search outside of major search engines is
still really bad

20 20 Google is still the main event, but we
are going back into a world where we need to optimize for multiple search engines across a series of channels.

21 GenAI Comes from Search Tech

22 22 Real quick. Let’s talk about how search engines
work.

23 23 Search Engines Work based on the Vector Space
Model Documents and queries are plotted in multidimensional vector space. The closer a document vector is to a query vector, the more relevant it is.

24 24 TF-IDF Vectors The vectors in the vector space
model were built from TF-IDF. These were simplistic based on the Bag-of-Words model and they did not do much to encapsulate meaning.
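
To make the bag-of-words idea concrete, here is a minimal sketch that builds TF-IDF vectors for a tiny corpus. The example documents, the function name `tfidf_vectors`, and the unsmoothed `log(N / df)` weighting are assumptions chosen for illustration; real systems use many variants of this formula.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # Avoid dividing by zero for empty documents.
        vectors.append([
            (counts[term] / total) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vocab, vectors


# Hypothetical example corpus.
docs = ["pizza near me", "best pizza in brooklyn", "plumber near me"]
vocab, vectors = tfidf_vectors(docs)
```

Each document becomes one sparse vector with a slot per vocabulary term. Word order is discarded, which is why these vectors say little about meaning.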

25 Relevance is a Function of Cosine Similarity
When we talk about relevance, we mean how similar the document and query vectors are. This is a quantitative measure, not the qualitative idea of relevance we typically have in mind.
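
As a small illustration of relevance as a number, the sketch below computes cosine similarity between a query vector and two document vectors. The three-dimensional vectors are made up for the example; real query and document vectors have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention for zero vectors in this sketch.
    return dot / (norm_a * norm_b)


# Hypothetical vectors for one query and two documents.
query = [1.0, 1.0, 0.0]
doc_a = [2.0, 2.0, 0.0]  # Points in the same direction as the query.
doc_b = [0.0, 0.0, 3.0]  # Orthogonal to the query.

scores = {
    "doc_a": cosine_similarity(query, doc_a),
    "doc_b": cosine_similarity(query, doc_b),
}
```

Here `doc_a` scores close to 1 and `doc_b` scores 0, regardless of vector length. Only direction matters.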

26 Google Shifted from Lexical to Semantic a Decade Ago
The lexical model counts the presence and distribution of words, whereas the semantic model captures meaning. This was the quantum leap behind Google’s Hummingbird update, and most SEO software has been behind for over a decade.

27 Word2Vec Gave Us Embeddings Word2Vec was an innovation led
by Tomas Mikolov and Jeff Dean that yielded an improvement in natural language understanding by using neural networks to compute word vectors. These were better at capturing meaning. Many follow-on innovations like Sentence2Vec and Doc2Vec would follow.

28 28 Tomas Mikolov Led the Word2Vec Research Tomas is
a Czech computer scientist behind many of these natural language understanding innovations.

29 He was accompanied by the Chuck Norris of Computer
Science Jeff Dean Jeff Dean has been a part of nearly every major innovation that has powered Google Search.

30 30 This Allows for Mathematical Operations Comparisons of content
and keywords become linear algebraic operations.

31 31 Words are Converted to Multi-dimensional Coordinates in Vector
Space

32 32 We Went from Sparse Representations to Dense Representations

33 33 Word2Vec Gave Us Hummingbird

34 34 8 Google Employees Are Responsible for Generative AI
https://www.wired.com/story/eight-google-employees-invented-modern-ai-transformers-paper/

35 The Transformer
The transformer is a deep learning model used in natural language processing (NLP) that relies on self-attention mechanisms to process sequences of data simultaneously, improving efficiency and understanding in tasks like translation and text generation. Its architecture enables it to capture complex relationships within the text, making it a foundational model for many state-of-the-art NLP applications.

36 36 The Transformer Gave us BERT

37 37 Word2Vec Captured Relationship, but Not Context – BERT
Captures Context

38 38 BERT Yields Embeddings with Higher Dimensionality and Information
Capture

39 39 Leveraging generative AI is a combination of content
strategy, your unique creative angles, and deep understanding of the technical nuances of a channel.

40 40 SEOs are uniquely positioned for this moment.

41 This is our opportunity to get up from the kids table.

42 The threat of Google’s AI Overviews

43 Queries are Longer and the Featured Snippet is Bigger
1. The query is more natural language and no longer Orwellian Newspeak. It can be much longer than the 32 words it has historically been limited to.
2. The Featured Snippet has become the “AI snapshot,” which takes 3 results and builds a summary.
3. Users can also ask follow-up questions in conversational mode.

44 The Search Demand Curve will Shift
With the change in the level of natural language query that Google can support, we’re going to see far fewer head terms and many more long-tail terms.

45 The CTR Model Will Change
With the search results being pushed down by the AI snapshot experience, what is considered #1 will change. We should also expect that any organic result will be clicked less and that standard organic click-through rates will drop dramatically. However, this will likely yield query displacement.

46 Rank Tracking Will Be More Complex As an industry,
we’ll need to decide what is considered the #1 result. Based on this screenshot positions 1-3 are now the citations for the AI snapshot and #4 is below it. However, the AI snapshot loads on the client side, so rank tracking tools will need to change their approach.

47 Context Windows Will Yield More Personalized Results
AIO maintains the context window of the previous search in the journey as the user goes through predefined follow-up questions. This will need to drive the composition of pages to ensure they remain in the consideration set for subsequent results.

48 AI Overviews are Susceptible to Spam and Lower Quality
Sites AI Overviews operate outside of core Organic Search so they don’t always have the same spam ﬁlters.

49 49 Luckily Users Love it So Much

50 50 Ranking in AI Overviews is more about relevance
than the other signals.

51 What is Retrieval Augmented Generation (RAG)?

52 Combining a Search Engine with a Language Model is called “Retrieval Augmented Generation”
Neeva (RIP), Bing, and now Google’s Search Generative Experience all pull documents based on search queries and feed them to a language model to generate a response. This concept was developed by the Facebook AI Research (FAIR) team.

53 Google’s Initial Version of this is called Retrieval-Augmented Language Model Pre-Training (REALM) from 2020
REALM identifies full documents, finds the most relevant passages in each, and returns the single most relevant one for information extraction.

54 54 DeepMind followed up with Retrieval-Enhanced Transformer (RETRO) DeepMind's
RETRO (Retrieval-Enhanced Transformer) is a language model that combines a large text database with a transformer architecture to improve performance and reduce the number of parameters required. RETRO is able to achieve comparable performance to state-of-the-art language models such as GPT-3 and Jurassic-1, while using 25x fewer parameters.

55 Google’s Later Innovation Retroﬁt Attribution using Research and Revision
(RARR) RARR does not generate text from scratch. Instead, it retrieves a set of candidate passages from a corpus and then reranks them to select the best passage for the given task.

56 AIO is built from REALM/RETRO/RARR + PaLM 2 and MUM
MUM is the Multitask Unified Model that Google announced in 2021 as a way to do retrieval augmented generation. PaLM 2 is their latest (released) state of the art large language model. The functionality from REALM, RETRO, and RARR is also rolled into this.

57 57 Sounds cool, but how does it work?

58 58 Documents are Broken into Chunks and the Most
Relevant Chunks are Fed to the Language Model to Generate a Response
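
The chunk-and-retrieve step can be sketched in a few lines. This is a toy under stated assumptions: the chunk size, the sample document, and every function name are hypothetical, and a bag-of-words cosine stands in for the dense embedding similarity that production RAG systems use.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8):
    """Split text into consecutive chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def bow(text):
    """Lowercased bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = bow(query)
    return sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)[:k]


def build_prompt(query, chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical page content.
document = (
    "Our bakery opens at seven every morning. "
    "We bake sourdough bread fresh each day. "
    "Parking is available behind the building. "
    "Gluten free bread is baked on Fridays only."
)
question = "when is gluten free bread baked"
chunks = chunk_text(document, chunk_size=7)
best = top_chunks(question, chunks, k=1)
prompt = build_prompt(question, best)
```

Note that the fixed-size chunker splits the last sentence across two chunks. Where the chunk boundaries fall affects what gets retrieved, which is one reason passage structure matters.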

59 How to Appear in LLMs

60 60 Blocking LLMs is a Mistake. Appearing in these
places will be recognized as brand awareness opportunities very soon.

Embrace Structured Data
There are three models gaining popularity:
1. KG-enhanced LLMs - Language Model uses KG during pre-training and inference
2. LLM-augmented KGs - LLMs do reasoning and completion on KG data
3. Synergized LLMs + KGs - Multilayer system using both at the same time
Source: Unifying Large Language Models and Knowledge Graphs: A Roadmap, https://arxiv.org/pdf/2306.08302.pdf

62 What is Mitigation for AIO?
1. Manage expectations on the impact
2. Understand the keywords under threat
3. Re-prioritize your focus to keywords that are not under threat
4. Optimize the passages for the keywords you want to save

64 64 We Can Also Show You Per Keyword How
You Show Up

65 65 It’s all about the Fraggles. (Fragment + Handle)

67 67 The Fraggles Show What AIO Used for the
AI Snapshot

68 68 Scroll to Text
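
Assuming this slide refers to the browser Scroll to Text Fragment feature (the `#:~:text=` URL syntax), the sketch below builds such a link for a given passage. The page URL, the passage, and the helper name `text_fragment_url` are hypothetical.

```python
from urllib.parse import quote


def text_fragment_url(page_url, passage):
    """Append a text fragment directive that targets `passage` on the page."""
    # Percent-encode everything, including '-', which quote() leaves alone
    # but which has special meaning inside a text directive.
    encoded = quote(passage, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


url = text_fragment_url(
    "https://example.com/guide",
    "long-tail keywords, explained",
)
```

Opening the resulting URL in a supporting browser scrolls to and highlights the passage, which is a quick way to check which fragment of a page a citation points at.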

69 69 Fraggles Relevance Relevance against the chunks to keyword:
Relevance against AI Snapshot:

Check out MarketBrew’s Free Tool to Help

74 The GEO team shared their ChatGPT prompts
The GEO team also shared the ChatGPT prompts that help them improve their visibility. You can augment them and put them to work right away. https://github.com/GEO-optim/GEO/blob/main/src/geo_functions.py

Check out @GarrettSussman’s post on how to optimize for AI
Overviews: https://ipullrank.com/optimize-content-for-sge

76 So, It Turns Out Google is All About the
Clicks

77 77 Google’s Algorithms Inner Workings Have Been Put on
Full Display Lately Through a combination of what’s come out of Google’s DOJ antitrust trial and the Google API documentation leak, we have a much clearer picture of how Google actually functions.

78 78 I was the First to Publish on the
Google Leak

79 79 We Now Have a Much Stronger Understanding of
the Architecture https://searchengineland.com/how-google-search-ranking-works-445141

80 80 These speciﬁcs help us zero in on what
really matters.

81 The Primary Takeaway is the Value of User Behavior
in Organic Search Google’s Navboost system keeps track of user click behavior and uses that to inform what should rank in various contexts.

82 82 Google Has Denied this Many Times

83 83 Many times…

84 84 But it’s Been Conﬁrmed in Pandu Nayak’s Testimony

85 85 And in Google’s documentation for their Cloud Search
services

86 The Leaked Docs Have Several Features Measuring Clicks “Long
clicks” are those clicks that resulted in a successful user session meaning that the user did not come back to Google or otherwise demonstrated that they found what they were looking for.

88 88 That’s Also Why Google Was So Mad when
@DejanSEO Did This…

89 89 User Click Data is What Makes Google More
Powerful Than Any Other Search Engine The court opinion in the DoJ Antitrust trial, Google’s leaked documents, and Google’s own internal documentation all support the fact that click behavior is what makes Google perform the way that it does.

90 90 13 Months of Google Data = 17 Years
of Bing Data

91 Modern SEO Needs UX Baked-in Google has expectations of
performance for every position in the SERP. The user behavior signals collected reinforce what should rank and demote what doesn’t perform just like a social media channel. The best way to scale this is by generating highly-relevant content with a strong user experience.

92 Local Insights from the Google Leak

93 93 The leak indicates that Google is trying to
localize every page and every query.

94 94 Although this yields signals that impact rankings positively,
this is not what you should be doing for long term success.

95 User Click Behavior Can Brand a Page with a
Location Again, continued performance in Google is all about the clicks. The clickRadius50Percent indicates that Google will assign a location to a document based on the location where half its clicks come from. Action: For location based content, make it clear as early as the metadata that this page is relevant to speciﬁc locations and leverage rel-canonical tags effectively.

96 96 Google Attempts to Localize All Documents Google looks
to semantic and explicit signals within a document to associate a location with it. Action: Clarify which location a page is associated with in language on the page.

97 Google Uses Many Semantic Signals to Determine Location Google
looks to semantic and explicit signals within a document to associate a location with it. Action: Clarify which location a page is associated with in language on the page.

98 98 Your Primary Category Is Given a Ton of
Value The primary category in your GMB holds a lot of weight for what you rank for. If you have locations going rogue, show them this slide and tell them they are shooting themselves in the foot. Action: Make sure your primary category is set to the one that matters most to the business.

99 99 Signals from Location Pages are Very Valuable As
Well People generally neglect the location landing pages as secondary. The LocalWWWInfo module in the leak suggests that the topics and relationships of these pages have impact on how your location ranks. Action: Spend the time to make these pages robust and original.

10 0 10 0 Locations Are Associated with “Conceptual Centers”
Google identiﬁes the centroid for a given concept in a given area. There’s not much to do here if the business location is already set, but if the business is considering moving, you may want to weigh in with this data to give your locations a better opportunity.

10 1 10 1 You’re Not Imagining it Chains are
Treated Differently Google treats bigger brands separately and applies what it understands across the masterbrand to all locations. If you’re not a brand, this does not apply to you, but it’s effectively a multiplier effect that the bigger brands beneﬁt from.

10 2 10 2 NAP is Still Critical The basics
of making sure your name, address, and phone number are consistent across all citations is still a very valuable thing to do.
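
One way to audit NAP consistency is to normalize each citation and compare it against your source of truth. The sketch below is a rough illustration: the citation data is invented, the normalization rules are simplistic, and the 10-digit phone assumption only fits North American numbers.

```python
import re


def normalize_nap(name, address, phone):
    """Reduce a name/address/phone triple to a comparable canonical form."""
    def clean(text):
        # Lowercase, drop apostrophes, replace other punctuation with spaces,
        # then collapse whitespace.
        text = re.sub(r"['’]", "", text.lower())
        return " ".join(re.sub(r"[^a-z0-9 ]", " ", text).split())

    digits = re.sub(r"\D", "", phone)
    return clean(name), clean(address), digits[-10:]  # Assumes 10-digit numbers.


def inconsistent_citations(citations):
    """Return sources whose normalized NAP differs from the first citation."""
    reference = normalize_nap(*citations[0][1])
    return [source for source, nap in citations[1:]
            if normalize_nap(*nap) != reference]


# Hypothetical citation data: (source, (name, address, phone)).
citations = [
    ("website", ("Joe's Pizza", "12 Main St., Brooklyn, NY", "(718) 555-0100")),
    ("directory_a", ("Joes Pizza", "12 Main St Brooklyn NY", "718-555-0100")),
    ("directory_b", ("Joe's Pizza", "21 Main St., Brooklyn, NY", "+1 718 555 0100")),
]
mismatches = inconsistent_citations(citations)
```

Formatting differences are ignored, so only `directory_b`, with its transposed street number, is flagged.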

10 3 10 3 For Local Search the Leak Data
Reinforces Best Practices Get your on-page SEO house in order, but really focus on your GMB, reviews, and citations.

10 4 Less is More, More or Less It’s time
to cut out the content madness

105 You Don’t Need Link Volume, You Need Link Quality
Indexing Tier Impacts Link Value
A metric called sourceType shows a loose relationship between where a page is indexed and how valuable it is. For quick background, Google’s index is stratified into tiers where the most important, regularly updated, and accessed content is stored in flash memory. Less important content is stored on solid state drives, and irregularly updated content is stored on standard hard drives. The higher the tier, the more valuable the link. Pages that are considered “fresh” are also considered high quality. Suffice it to say, you want your links to come from pages that are either fresh or otherwise featured in the top tier.

10 6 10 6 No one is actually improving content
when they do the Skyscraper Technique

10 7 Indexing is Also Harder It’s not being talked
about as much, but indexing has gotten a lot harder since the Helpful Content update. You’ll see a lot more pages in the “Discovered - currently not indexed” and “Crawled - currently not indexed” than you did previously because the bar is higher for what Google deems worth capturing from the web.

108 Google Wants to Crawl Even Less
Gary Illyes has indicated that he wants to have Google crawl less. Search quality certainly cannot suffer, so crawling has to get increasingly intelligent.

10 9 10 9 I Believe This is a Function
of Information Gain Conceptually, as it relates to search engines, Information Gain is the measure of how much unique information a given document adds to the ranking set of documents. In other words, what are you talking about that your competitors are not?
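
A crude way to approximate this idea is to ask what share of a page's distinct terms appear in none of the pages already ranking. The sketch below does that with simple term overlap. It is a hypothetical proxy for illustration, not how Google computes Information Gain, and the sample texts and the name `novelty_score` are invented.

```python
import re


def terms(text):
    """Distinct lowercased alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def novelty_score(candidate, ranking_docs):
    """Fraction of the candidate's distinct terms absent from all ranking docs."""
    candidate_terms = terms(candidate)
    if not candidate_terms:
        return 0.0
    seen = set().union(*(terms(doc) for doc in ranking_docs))
    return len(candidate_terms - seen) / len(candidate_terms)


# Hypothetical ranking set and candidate page text.
ranking_docs = [
    "how to prune roses in spring",
    "prune roses with clean shears",
]
candidate = "prune roses in spring and seal cuts"
score = novelty_score(candidate, ranking_docs)
```

In practice you would compare entities or embeddings rather than raw terms, but the question stays the same: what does this page say that the ranking set does not?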

110 In conclusion: “More content” is no longer inherently the most effective approach because there’s no guarantee of traffic from Google. Google’s sophistication won’t allow it.

11 1 11 1 The only content you should be
making

11 2 11 2 I’m Leaving Y’all with Three Actions
Today 1. How to Prune Your Content 2. How to Use LLMs to Generate Valuable Content 3. AI Tools to Use for SEO

11 3 The Content Pruning Process

11 4 11 4 Pruning and Optimization Work Quite Well
Together

115 Aleyda Has a Process
Aleyda’s workflow is a great place to work through whether your content should be pruned or not. https://www.aleydasolis.com/en/crawling-mondays/how-to-prune-your-website-content-in-an-seo-process-crawlingmondays-16th-episode/

116 We like to automate to get to a Keep. Revise. Kill. (Review.)

117 Content Decay
The web is a rapidly changing organism. Google always wants the most relevant content, with the best user experience, and most authority. Unless you stay on top of these measures, you will see traffic fall off over time. Measuring this content decay is as simple as comparing page performance period over period in analytics or GSC. Just knowing content has decayed is not enough to be strategic.

11 8 11 8 It’s not enough to know that
the page has lost traﬃc.

12 0 12 0 The Content Potential Rating (CPR).

12 1 12 1 Content Potential Score

122 Interpreting the Content Potential Rating
80 - 100: High Priority for Optimization
60 - 79: Moderate Priority for Optimization
40 - 59: Selective Optimization
20 - 39: Low Priority for Optimization
0 - 19: Minimal Benefit from Optimization
If you want quick and dirty, you can prune everything below a 40 that is not driving significant traffic.
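
The bands above translate directly into a lookup. In this sketch the function names are hypothetical, and what counts as "significant traffic" is left to the caller as a boolean because the slide does not define a threshold.

```python
def cpr_priority(score):
    """Map a Content Potential Rating (0-100) to its priority band."""
    if not 0 <= score <= 100:
        raise ValueError("CPR must be between 0 and 100")
    if score >= 80:
        return "High Priority for Optimization"
    if score >= 60:
        return "Moderate Priority for Optimization"
    if score >= 40:
        return "Selective Optimization"
    if score >= 20:
        return "Low Priority for Optimization"
    return "Minimal Benefit from Optimization"


def is_prune_candidate(score, drives_significant_traffic):
    """Quick-and-dirty rule: below 40 and not driving significant traffic."""
    return score < 40 and not drives_significant_traffic
```

For example, `cpr_priority(72)` falls in the moderate band, and `is_prune_candidate(35, False)` flags the page for pruning.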

12 3 12 3 Combining CPR with pages that lost
traﬃc helps you understand if it’s worth it to optimize.

12 4 12 4 Step 1. Pull the Rankings Data
from Semrush Organic Research > Positions > Export

12 5 12 5 Step 2: Pull the Decaying Content
from GSC Google Search Console is a great source to spot Content Decay by comparing the last three months year over year. Filter for those pages where the Click Difference is negative (smaller than 0) then export.
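
If you would rather do this comparison in code than in the GSC interface, the filter is a simple subtraction. The sketch assumes you have already exported clicks per URL for both periods; the dictionaries and the function name are hypothetical.

```python
def decaying_pages(current, previous):
    """Return (url, click_difference) pairs for pages that lost clicks."""
    decayed = []
    # Include URLs that only appear in the earlier period (they dropped to 0).
    for url in set(current) | set(previous):
        diff = current.get(url, 0) - previous.get(url, 0)
        if diff < 0:
            decayed.append((url, diff))
    # Largest losses first; URL breaks ties for a stable order.
    return sorted(decayed, key=lambda item: (item[1], item[0]))


# Hypothetical click totals: last three months vs. the same months a year ago.
current = {"/guide": 120, "/pricing": 300, "/blog/old-post": 15}
previous = {"/guide": 200, "/pricing": 250, "/blog/old-post": 90, "/events-2023": 40}
losses = decaying_pages(current, previous)
```

Pages with a positive or zero difference, such as `/pricing` here, are left out of the result.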

12 6 12 6 Step 3: Drop them in the
Spreadsheet and Press the Magic Button

127 The Output is a List of URLs Prioritized by Action
Each URL is marked as Keep, Revise, Kill or Review based on the keyword opportunities available and the effort required to capitalize on them. Sorting the URLs marked as “Revise” by Aggregated SV and CPR will give you the best opportunities first.

12 8 12 8 Get your copy of the Content
Pruning Workbook : https://ipullrank.com/cpr-sheet

12 9 How to Kill Content Content may be valuable
for channels outside of Organic Search. So, killing it is about changing Google’s experience of your website to improve its relevance and reinforce its topical clusters. The best approach is to noindex the pages themselves, nofollow the links pointing to them, and submit an XML sitemap of all the pages that have changed. This will yield the quickest recrawling and reconsideration of the content.
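
For the sitemap step, a short script can generate the file from a list of changed URLs. This is a minimal sketch using the standard sitemap namespace; the URLs, dates, and function name are hypothetical. The noindex itself is set on each page, for example with a `<meta name="robots" content="noindex">` tag or an `X-Robots-Tag: noindex` response header.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, lastmod) pairs; lastmod is YYYY-MM-DD."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages that were just noindexed or otherwise changed.
changed = [
    ("https://example.com/old-post", "2024-09-01"),
    ("https://example.com/thin-page", "2024-09-01"),
]
sitemap = build_sitemap(changed)
```

Write the returned string to a file such as `changed-pages.xml` and submit it in Google Search Console.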

130 How to Revise Content
- Review content across the topic cluster
- Use co-occurring keywords and entities in your content
- Add unique perspectives that can’t be found on other ranking pages
- Answer common questions
- Answer the People Also Ask Questions
- Restructure your content using headings relevant to the above
- Add relevant Structured markup
- Expand on previous explanations
- Add authorship
- Update the dates
- Make sure the needs of your audiences are accounted for
- Add to an XML sitemap of only updated pages

13 1 How to Review Content The sheet marks content
that has a low content potential rating and a minimum of 500 in monthly search volume as “Review” because they may be long tail opportunities that are valuable to the business. You should take a look at the content you have for that landing page and determine if you think the effort is worthwhile.

13 2 Using AI for your SEO

133 With AI, I’m giving y’all legos. What you build is up to you, but I’m going to show you things to consider.

13 4 Setting Up LLMs Locally You don’t need ChatGPT
anymore

13 6 LLaMa 3.1 is SOTA Facebook’s open source model
is outperforming the best closed-source models on a variety of different evaluation metrics.

14 2 142 You can now unlock state of the
art generative AI use cases from your laptop for free.

14 3 Make Sure You Hook It Up To Your
GPU On a Windows machine you’ll need to go to the NVIDIA Control Panel and add the Ollama server application under Manage 3D Settings.
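
Once the Ollama server is running locally, you can call it from Python over HTTP. The sketch below assumes Ollama's default endpoint at `http://localhost:11434/api/generate` and a model pulled under the tag `llama3.1`; adjust both if your setup differs. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1"):
    """Create the HTTP request for a single non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, `generate("Write a meta description for a page about content pruning.")` returns the model's text. Nothing leaves your machine.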

14 4 How to Use LLMs for Content and SEO

145 The Three Laws of Generative AI content
1. Generative AI is not the end-all-be-all solution. It is not the replacement for a content strategy or your content team.
2. Generative AI for content creation should be a force multiplier to be utilized to improve workflow and augment strategy.
3. You should consider generative AI content for awareness efforts, but continue to leverage subject matter experts for lower funnel content.

14 6 14 6 Think back to 10 Minutes Ago
- Retrieval Augmented Generation

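147 It’s Not Difficult to Build with Llama Index

    # Imports are not shown on the slide; these paths assume llama-index 0.10+.
    import advertools as adv
    from llama_index.core import VectorStoreIndex
    from llama_index.core.query_engine import CitationQueryEngine

    # Pull every URL from the XML sitemap
    sitemap_url = "[SITEMAP URL]"
    sitemap = adv.sitemap_to_df(sitemap_url)
    urls_to_crawl = sitemap['loc'].tolist()
    ...  # steps elided on the slide; they produce `documents`
    # Make an index from your documents
    index = VectorStoreIndex.from_documents(documents)
    # Set up your index for citations
    query_engine = CitationQueryEngine.from_args(
        index,
        # indicate how many document chunks it should return
        similarity_top_k=5,
        # here we can control how granular citation sources are, the default is 512
        citation_chunk_size=155,
    )
    response = query_engine.query("YOUR PROMPT HERE")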

14 8 148 Everyone can code now.

151 Generative AI Productivity Use Cases
RAG opens up a series of generative AI use cases that work well for your situation.
- Briefing & Business Cases
- Content Analysis
- First-pass Brand Review
- First-pass Legal Review
- Content First Draft
- Keyword Insertion
- Structured Data Generation
- Link Identification & Insertion
- Generating Voiceovers
- Generating Images
- Generating Videos
- Writing Code

152 @BritneyMuller’s Guide to Using Colab
Britney talked about how easy it is to use Colab with Python. Now it’s even easier using LLMs. https://github.com/BritneyMuller/colab-notebooks?tab=readme-ov-file

15 3 Just describe what you want You can tell
your language model what you want the code to do and it will handle the rest. If it doesn’t work, just describe what went wrong or paste the error and it will ﬁx it for you. In this example my prompt is: {write python code for colab that takes a csv file of keywords and using bertopic with the chatgpt to compute the natural language topics for each row.}

15 4 15 4 Colab + Gemini

15 5 15 5 Llama Index - RAG - https://www.llamaindex.ai/

15 6 15 6 LangChain - Build Agents - https://www.langchain.com/

15 7 15 7 LangFuse - Prompt Management & Observability
tool - https://langfuse.com/

15 8 15 8 Literal AI

15 9 159 You don’t need to code for any
of this.

16 6 16 6 Bubble - No Code Apps -
https://bubble.io/

16 7 Integrate Promptitude with Zapier or Make

168 Prompts You Need To Write
ChatGPT is very effective at doing the following SEO related tasks:
- Page Title writing
- Meta Description writing
- Keyword Insertion
- Link Insertion
You should use your own prompts for these though so you don’t copy other people’s patterns.

169 Page Titles
Feature: Token Count features
Hypothesis: There’s no hard max page title length indicated in the attributes, so we can test lengths longer than the 60-70 characters to determine impact.

170 Page Title Test
Hypothesis: A page title that’s longer than the standard best practice will negatively impact rankings for the primary keyword target.
Variables: Control, Short page title, Long page title
Metrics: Ranking Increase

171 lastSignificantUpdate
Feature: lastSignificantUpdate - The date of the last time Google encountered the page as materially updated.
Hypothesis: Making substantial updates to pages regularly yields improved crawl activity and more opportunities to rank better.

172 TL;DR Generation for Significant Updates

173 Test Structure
The goal of this test is to determine how much content is considered a “significant update” that yields crawl activity. Create control and variant pages that test the length of added content. We measure the impact on organic traffic in order to capture changes to rankings and/or changes to clickthrough rate.

174 Let Screaming Frog Do the Heavy Lifting
Generate embeddings while you crawl using Screaming Frog SEO Spider. Take the file to Colab and do the following things:
- Keyword - Landing Page Relevance Scoring
- Keyword Mapping
- Link Building Target Identification
- Redirect Mapping
- Internal Link Mapping
https://ipullrank.com/vector-embeddings-is-all-you-need
You can also work with your language model to combine crawl data with SERP data and do things like information gain calculations.
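
As one example of what that Colab step could look like, the sketch below maps each keyword to its most similar page by cosine similarity. It assumes you have already loaded page embeddings from the crawl export and embedded your keywords with the same model. The tiny three-dimensional vectors, URLs, and function names are placeholders for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def map_keywords_to_pages(keyword_vectors, page_vectors):
    """For each keyword, return the (url, score) of the most similar page."""
    mapping = {}
    for keyword, kvec in keyword_vectors.items():
        best_url, best_score = max(
            ((url, cosine(kvec, pvec)) for url, pvec in page_vectors.items()),
            key=lambda pair: pair[1],
        )
        mapping[keyword] = (best_url, round(best_score, 3))
    return mapping


# Hypothetical embeddings; real ones have hundreds of dimensions.
page_vectors = {
    "/pizza-menu": [0.9, 0.1, 0.0],
    "/catering": [0.1, 0.9, 0.2],
}
keyword_vectors = {
    "pizza delivery": [0.8, 0.2, 0.1],
    "office catering": [0.0, 1.0, 0.1],
}
mapping = map_keywords_to_pages(keyword_vectors, page_vectors)
```

The same nearest-neighbor lookup covers redirect mapping and internal link mapping if you compare page vectors against each other instead of against keywords.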

17 5 17 5 Brief Writing

17 6 17 6 Persona Modeling based on Desk Research
Data

17 7 17 7 Persona Writing with SparkToro

17 8 17 8 Brand Voice and Tone Review

17 9 17 9 Schema Markup Generator https://chatgpt.com/g/g-MvH0WHO3e-schema-markup-generator-gpt

18 0 18 0 Taskade - AI All-in-One - https://www.taskade.com/

18 1 18 1 AIPower - All-in-One for WP -
https://aipower.org/

18 2 18 2 Thunderbit - Build No Code AI
Automation Tools - https://thunderbit.com/

18 3 18 3 Keyword Insights - Keyword Clustering Tool

18 4 18 4 Octoparse - Combine a scraper with
Generative AI - https://www.octoparse.ai/

185 DejanSEO’s LinkBERT - https://www.linkbert.com https://dejanmarketing.com/tools/linkbert/

18 6 18 6 InLinks - Automated Internal Links -
https://inlinks.com/

18 7 18 7 Link Whisper - Internal Link automation
- https://www.linkwhisper.com

188 Respona - AI-enabled Link Building - https://www.respona.com

18 9 18 9 SEOJuice - https://seojuice.io/

190 UX Sniff - An AI-enabled HotJar - https://uxsniff.com/

19 1 Roll the Credits

192 What you should know and do to win
- Google is still the primary show in town
- Relevance is a quantitative measure
- GenAI works on the same math as search engines
- Focus on making your chunks more relevant to rank in GenAI Search
- Improve UX to drive more long clicks
- Improve web search factors and local search factors
- Focus on content your audience wants, prune what they don’t
- Use RAG to generate content with AI
- Embrace AI tools to improve your workflows and your ability to test

Thank You | Q&A
Mike King, Chief Executive Officer, @iPullRank
Award Winning, #GirlDad
Get Your AIO Threat Report: https://ipullrank.com/AIO-report
Play with Raggle: https://www.raggle.net
Download the Slides: https://speakerdeck.com/ipullrank
