Deciphering SEO News: The Infamous 2024 Google Leak
The dust has settled after another big news story in the SEO world: a purported leak of internal Google data about how its ranking algorithm works.
If you’re plugged into the SEO sphere, you likely caught wind of the drama. But if it’s news to you, you’re not alone. Either way, you likely have some questions.
- What happened?
- What information did this leak include?
- What’s the controversy?
- What does it mean for my website?
Now that we have more context about the leak, the Nexus Marketing team wanted to offer a quick overview and our take on its implications for your organization’s SEO.
What happened?
On May 5th, 2024, SEO thought leader Rand Fishkin received news of leaked Google API documentation from a source claiming that it came from Google’s secretive Search Division. On May 27th, Fishkin published a full write-up:
Rand Fishkin: An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them
It immediately became the biggest topic in the SEO world in years—the buzz eclipsed even the very recent rollout of AI Overviews.
Since then, the leaked information has been a major point of controversy. We’ll dig into why it’s controversial among SEO professionals, but first, let’s clarify:
Was this a leak of the actual Google ranking algorithm?
No. At this point, it’s been pretty much disproven.
- The leak consisted of internal Google API documentation. (API stands for application programming interface: a defined set of rules that lets separate pieces of software exchange data and requests. It’s normal stuff in the tech world, and companies use APIs to connect their own tools or external ones.)
- While the original source claimed that the leaked documentation came from Google’s Search Division, this has not been verified. In fact, this does not appear to be true.
- This information is actually related to a completely separate Google product, Document AI Warehouse, a public Google Cloud platform for analyzing and storing data.
- However, it’s still fair to consider the leaked information potentially ranking-related. It details a wide range of rules and protocols that this particular Google product potentially uses to analyze and interpret data. It’s not unreasonable to assume that some or many of its features might also exist in the core Google Search ranking algorithm—but there are caveats.
Here’s how SEO expert Eli Schwartz put it in a recent newsletter:
“Google has said this information is out of context, which is an accurate way to describe it. At best, it’s a comprehensive list of definitions that Google could track in its ranking algorithms, but we have no indication that these elements are used, if they are, and how they are used.”
What information did the Google leak include?
So, what exactly did the leaked Google API documentation include?
It included detailed lists of over 14,000 named protocols that Google’s Document AI Warehouse potentially uses to understand and sort information.
Some of the specific protocols that caught people’s attention include:
siteFocusScore
A potential representation of a site’s topical authority, or how closely it sticks to its relevant subject matter
siteRadius
A potential measure of how closely a given page aligns with the site’s core topics
EffortScore
A potential estimate of the relative effort that went into content creation on a page, related to Google’s PageQuality measures
freshdocs
A “link value multiplier” that appears to weight links from newer pages more heavily than links from older pages
fullLeftContext and fullRightContext
Protocols that appear to tell Google to interpret the surrounding context of a link’s anchor text, meaning anchor text like “click here” can still be understood
Mentions of the NavBoost system
A system of protocols that likely reranks results based on their click metrics (as an approximation of content quality, user satisfaction, etc.)
Based on this information, many SEO professionals quickly assumed that these or similar factors are used in the Google Search algorithms to understand and rank content out in the wild. The thinking is that if Google has supposedly built these systems and features, they’re likely deployed elsewhere.
Note how often we use “potential” in these descriptions—remember that we don’t know for sure whether these factors are actually included in Google Search algorithms.
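To make one of these ideas more concrete, here is a toy Python sketch of what "reading the context around a link" could look like. It is only an illustration of the general concept behind fullLeftContext and fullRightContext. It is not Google's implementation, and the function name link_context and its window parameter are hypothetical names invented for this example.

```python
def link_context(text: str, anchor: str, window: int = 5) -> tuple[str, str]:
    """Return up to `window` words on each side of the first `anchor` match.

    Illustrative only: a hypothetical helper, not a real Google API.
    """
    start = text.find(anchor)
    if start == -1:
        raise ValueError(f"anchor text {anchor!r} not found")
    end = start + len(anchor)

    # Split the text before and after the anchor into words.
    left_words = text[:start].split()
    right_words = text[end:].split()

    # Keep only the words closest to the anchor on each side.
    left = " ".join(left_words[max(0, len(left_words) - window):])
    right = " ".join(right_words[:window])
    return left, right


sentence = "To download our free fundraising guide, click here and enter your email."
left, right = link_context(sentence, "click here")
print(left)   # words immediately before the link
print(right)  # words immediately after the link
```

On its own, "click here" says nothing about the linked page. The surrounding words ("free fundraising guide") are what give a reader, or a hypothetical system like this one, something to work with.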
Want to go deeper? This SearchEngineLand article explains more of the specific protocols included in the leak. However, keep in mind that this article was published before we learned more about the actual provenance of the data—take its analyses of the leak’s SEO implications with a grain of salt.
Want an even deeper dive? (Not highly recommended, but it’s definitely interesting!) Here’s a comprehensive table of all the protocols found in the leaked documentation:
How did the SEO world react and why?
The news of the Google leak created waves immediately. First reactions ran the gamut from measured skepticism to frenzied anger to jaded dismissal.
Now that a few weeks have passed and we’ve learned more about the leak, the buzz has calmed significantly.
But why the dramatic range of initial reactions?
First, some context. Whenever we get news of big Google shake-ups, algorithm changes, and hints about how the ranking algorithm works, a similar swarm of reactions occurs.
The SEO space is as diverse as the internet itself. When websites use released information to make assumptions about updated SEO best practices, not all of them will reap benefits. Some may actually see decreased SEO performance. No two sites will experience an algorithm update the same way or see the same impacts from making similar changes to their content.
This, understandably, can create frustration and blame-casting.
Some SEO professionals then assume that Google intentionally misleads them. Seemingly contradictory statements or mixed results from implementing assumption-based changes fuel the fire.
When the leak landed, it was largely misinterpreted.
Many immediately thought the leaked information came directly from the core Search ranking algorithms. Many of the signals and features included in the leak do seem to contradict past statements from Google about how Search works. Confirmation bias set in, and we saw some angry and jaded reactions.
But fast forward just a few days.
A clearer picture of the leak emerged. Here’s how a Search Engine Journal article put it in late May 2024:
“Many SEOs [have come] to the conclusion that the alleged Google data leak was not a leak, did not contain ranking algorithm secrets, was five years out of date, and did not show anything new. While that’s not how everyone feels about it, SEOs in general don’t tend to agree about anything.”
The leak is complicated. And while some in the space distrust Google’s statements, jumping to black-and-white conclusions one way or the other is never the best move.
Why does it matter?
All this context might be confusing. We get it, so why are we sharing it?
We want to arm our clients and partners (and everyone else) with knowledge and a healthy dose of skepticism.
If you work in marketing and encounter SEO news and trends, you need to understand why we see a range of dramatic reactions to industry developments as they break. These reactions tend to die down after a few days, and only then do the real takeaways emerge (and we’ll tell you about them).
Extra Context: Why is the algorithm a secret, anyway?
To be fair, Google does very carefully word its releases and public discussions of algorithm details. Why?
If Google were to fully publicize its algorithmic ranking factors or even discuss a single new one in great depth, it could wreak havoc on the search results. Here’s a hypothetical:
- Let’s say Google very clearly announces one day that page views from its Chrome browser are a ranking signal (which was implied in the leak—but remember all the caveats).
- A marketplace springs up overnight to sell shady bot traffic for websites to boost their Chrome page views.
- Chaos breaks out on the SERPs, and spammy bot-supported websites start climbing the rankings.
- More websites, SEOs, and users complain about Google becoming overrun with spammy content.
- Google scrambles to make changes while the user experience and rankings for high-quality sites suffer.
A little dramatic, maybe, but that’s the general reason Google is so secretive about the exact makeup of its ranking algorithms.
What does it mean for your website?
Honestly, not much.
If you follow tried-and-true best practices, create quality content for your users, build links to it, and maintain your website’s health, you’re already ahead!
And even if the leaked information did come directly from the Search algorithm, it wouldn’t have taught us much that we didn’t already know or couldn’t piece together.
If you pay attention to Google’s big-picture changes (like their Helpful Content Update, spam crackdown, and the rollout of AI Overviews), you can already infer that more deeply understanding the context and quality of content has been a big priority. It makes sense that Google would want to be capable of piecing together the full context of the language surrounding a link, for instance.
But if you’re already creating excellent, user-centric, well-organized content, this shouldn’t even be a concern.
Here’s our take:
When SEO professionals take leaks, rumors, clickbait strategies, and Google’s hints too literally, it’s no surprise when they see mixed results.
There’s a commonly held misconception that there’s some kind of exact formula for SEO success. This is not true.
Google’s ranking system is extremely complex, with thousands and thousands of ranking factors constantly in flux.
It’s not a checklist or recipe of exact measurements to get just right. And you definitely don’t succeed at SEO by taking dramatic swings at big strategy changes based on assumptions and expecting all-or-nothing results.
SEO success comes from building up your website’s quality, helping your users, and creating a respected brand. You’ll satisfy algorithm signals along the way, and Google will learn to recognize you for your hard work.
To sum it all up, we love this quote from SEO thought leader Eli Schwartz:
“The user pays your bills, not the search engines.”
Learn from a balanced mix of sources—Google’s statements, firsthand trial and error, experts in the field, and your audiences’ needs.
This is exactly how we approach SEO at Nexus Marketing, and it’s served our clients well for a decade now. Want to learn more? We’d love to hear from you.
Have any questions or want to learn more about our approach to SEO?
Contact us or reach out directly to your Nexus points of contact at any time!