This week, we saw news of a leak from Google that one leading SEO blog claims “will likely be one of the biggest stories in the history of SEO and Google Search.”
That’s a heady claim. But what does it actually mean?
First, some background: in March, Google appears to have accidentally uploaded internal documentation to GitHub that contains information on how its algorithm decides which sites should be ranked for different types of search queries. All told, some 2,596 modules with 14,014 attributes were disclosed.
Manna from heaven for SEO nerds, for sure, and there will undoubtedly be a tsunami of content from experts in the weeks and months ahead explaining how different factors apply. In fact, within days of the leak being made public, we've already seen posts arguing that Google has either withheld information from SEOs or outright misled them about how the algorithm works.
But inside baseball aside, I’m skeptical that this is a development that will fundamentally change the understanding or practice of SEO. Here’s why:
First, the sheer number of attributes underlines just how complex Google's search indexing processes are. It has long been said that even folks within Google don't fully understand how the algorithm works, and I don't expect the list of attributes to help non-Googlers get any closer to a complete understanding of it.
Second: the attributes aren't ranked or weighted in any way. So even getting confirmation that, yes, post-search click behavior is a factor that impacts your site's ability to rank is only marginally helpful, because we don't know how significant a factor it actually is.
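To make that concrete, here's a minimal sketch in Python of why a list of attributes without their weights tells you so little. The attribute names gesture at the leak's vocabulary, but the scoring function and every weight below are invented for illustration; nothing like this appears in the documents themselves:

```python
# Purely illustrative: the attribute names echo the leak's vocabulary, but
# the scoring function and all weights below are invented assumptions.

def score(page: dict[str, float], weights: dict[str, float]) -> float:
    """Hypothetical ranking score: a weighted sum of attribute values."""
    return sum(weights.get(attr, 0.0) * value for attr, value in page.items())

page = {"post_click_behavior": 0.8, "link_authority": 0.5, "content_depth": 0.9}

# Two equally plausible weightings over the exact same attribute list...
weights_a = {"post_click_behavior": 0.7, "link_authority": 0.2, "content_depth": 0.1}
weights_b = {"post_click_behavior": 0.05, "link_authority": 0.15, "content_depth": 0.8}

print(score(page, weights_a))  # 0.75  -- click behavior dominates
print(score(page, weights_b))  # 0.835 -- content depth dominates
```

The same page, scored against the same confirmed attributes, ranks very differently depending on weights the leak simply doesn't disclose.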
And, finally: what's true about Google today won't necessarily be true tomorrow. While some SEOs have understandably been frustrated to learn that the company has misled them about how the algorithm works, there's a good reason for that: Google is constantly evolving to prevent sites from gaming its rankings. If the company came out and gave the world the keys to its operations, every spam site and bad actor in existence would simply optimize for those parameters.
One very basic example of that is apparent from even a cursory glance at the "Demotions" section in the first article linked above. We've long known that Google has the ability to "demote" content for a variety of reasons, from potentially harmful content (e.g. unqualified medical advice) to bait-and-switch link tactics. But the leaks confirm that an "exact match domain" is another factor that can lead to a site being demoted in the rankings. The reason for that? In the early days of search, it was common practice for scammers to game the rankings for, say, sneakers, by setting up a site on a domain like "bestcheapsneakersite.com," taking payments, and then never delivering any merchandise. As a result, Google now explicitly penalizes sites that use that kind of exact-match strategy. So, as scammers evolve their tactics, the algorithm will continue to change to combat them.
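For illustration only, here's roughly what an exact-match-domain check could look like in principle. The keyword vocabulary and the word-break segmentation below are my own assumptions; the leak names the demotion but says nothing about how it's actually computed:

```python
# Hypothetical sketch of an exact-match-domain heuristic. The vocabulary and
# the segmentation approach are illustrative assumptions, not leaked logic.
COMMERCIAL_KEYWORDS = {"best", "cheap", "buy", "discount", "sneaker", "site", "shop"}

def is_exact_match_domain(domain: str, vocab: set[str] = COMMERCIAL_KEYWORDS) -> bool:
    """True if the bare domain name segments entirely into commercial
    keywords (classic word-break dynamic programming)."""
    name = domain.split(".")[0].lower()
    # reachable[i] is True if name[:i] can be split into vocabulary words
    reachable = [True] + [False] * len(name)
    for end in range(1, len(name) + 1):
        for start in range(end):
            if reachable[start] and name[start:end] in vocab:
                reachable[end] = True
                break
    return reachable[len(name)]

print(is_exact_match_domain("bestcheapsneakersite.com"))  # True  -> demotion candidate
print(is_exact_match_domain("example.com"))               # False
```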
Don't get me wrong: I'm not saying that Google is perfect. Search quality has deteriorated significantly over the past several months for a number of reasons, from changes at the top (the former head of its ads unit is now running search, a subject worthy of its own deep dive) to the rise of a deeply flawed AI results model.
But I am predicting that this data leak will turn out to be little more than a storm in a teacup for the majority of publishers. In a world where SEO is often viewed as a battle between search engines and people looking to gain an edge, however small or temporary, the only constant is that, eventually, quality content wins out. Producing that content has been key to our approach as an agency from day one, and nothing in this leak convinces me that the value of content that serves the needs of users is going to change.