Goodbye ChatGPT, hello Yandex.
The source code for Yandex (a Russian search engine) leaked recently and the one question on everyone’s mind is whether we should care.
If you are reading this you are probably interested in knowing more about the Yandex leak, what it means, and how to leverage this rare inside look at a search engine.
You’re in the right place (one of them, at least). But let me stop you right there before you get too excited, at least for a moment.
Yandex is not Google.
Yes, it is a search engine. Yes, it was launched around the same time and designed to be a Google competitor. But there are pretty distinct differences as well. It’s important to take anything coming out of Yandex with a grain of salt, because Yandex is not Google and is not necessarily indicative of how Google works.
But darn it, this was fun.
While there is a ton of poorly translated text and internal lingo that makes about half the ranking factors in the Yandex file useless, the other half is a fascinating look at how the Russian search engine operates and gives a few clues into how Google might work.
I read the entire list (as much as I could, at least) and I’m going to break down the findings in this article. I’ll answer a couple of questions for folks out of the loop first, so if you want to get to the good stuff, scroll down the page a bit.
Yandex is a Russian search engine. It is the largest technology company in Russia and the 5th largest search engine in the world behind Google, Bing, Yahoo!, and Baidu.
From testing a few queries, the results are close to Google’s but not quite the same. Right off the bat, there are distinct differences at play that (again) are worth bearing in mind.
Yandex was founded in the early 1990s and launched in 1997, around the same time that Google’s search engine was launched. I’ve seen claims that many ex-Googlers work at Yandex, but I can find no proof of this apart from posts on Twitter.
All we do know is that Yandex was created around the same time as Google, has benefited from watching Google grow for 20+ years, and is the top technology employer in Russia, so make of that what you will.
In January 2023 Yandex’s source code was leaked on a popular hacking forum. The hacker posted a 44.7 GB file they claimed contained the entire source code minus its anti-spam rules.
What is especially fascinating here is that it included a file of their 1900+ ranking factors for the world to see. Again, Yandex is not Google. However, this is a fascinating look into how search engines operate, and it gives some clues into how we can approach Google SEO.
I only accessed the ranking factors list, since that seems to be what most people care about anyway. There is much more to the source code, though, if you are adventurous.
I nabbed the full TXT file, wrote a Python script to pull the descriptions into a CSV, and ran that CSV through Google Sheets, where I could Google Translate it.
The translation is an imperfect process, but it was quick enough for me to read the file in a weekend and make as much sense of it as possible.
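For the curious, here is a minimal sketch of what a script like that could look like. It is not the script used for this article. The line layout (`NAME: "description"`), the regex, the factor names, and the function names are all assumptions made for illustration, so the pattern would need adjusting to whatever the real file looks like.

```python
import csv
import io
import re

# Assumed layout (hypothetical, not the real file format): one factor per line,
# written as NAME: "description". Adjust the pattern to match the actual file.
LINE_PATTERN = re.compile(
    r'^\s*(?P<name>[A-Za-z0-9_]+)\s*:\s*"(?P<description>.*)"\s*$'
)


def extract_factors(text):
    """Return (name, description) pairs for every line matching the assumed layout."""
    factors = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            factors.append((match.group("name"), match.group("description")))
    return factors


def factors_to_csv(factors):
    """Serialize the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "description"])
    writer.writerows(factors)
    return buffer.getvalue()


# Made-up sample input; the factor names here are placeholders.
sample = (
    'FI_EXAMPLE_ONE: "Первое описание"\n'
    "not a factor line\n"
    'FI_EXAMPLE_TWO: "Второе, описание"\n'
)
csv_text = factors_to_csv(extract_factors(sample))
print(csv_text)
```

Once the CSV is in Google Sheets, one way to translate a description column is the built-in `GOOGLETRANSLATE` function, for example `=GOOGLETRANSLATE(B2, "ru", "en")`.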
There are over 1800 factors in the Yandex ranking factor list, and about half of them are worth talking about. Poor translations aside, there is a lot of internal lingo and references to an internal wiki that I cannot access.
If you’ve worked in a software company you are likely familiar with the niche inside terms that your team uses in their code. It’s no different with large companies and search engines. There is a lot of terminology here that limits the usefulness of the file.
For the most part, the contents of the file are not surprising.
Many of the ranking factors are things we already “knew”. However, there are a few surprises I’ll get to later. I did my best to classify the factors according to their “SEO category”. Here is a distribution of terms:
And here’s how often different industries are mentioned:
At a glance, this tells us:
This is not new, but it is nice to get a little confirmation. Now for some deeper analysis!
Again, take this with a grain of salt. After reviewing each Yandex ranking factor, there are three things I’d prioritize:
Let’s dig into the specifics.
Technical performance and crawl depth come up quite a bit in the Yandex file.
Crawl depth is how far search engine spiders have to dig into your website to find a given page. Crawl depth is mentioned 4 times in the ranking factors list.
Like Google, Yandex does not want to try hard or dig deep into your website to find key pages. Based on the ranking factors it is important to:
This is somewhat basic/common knowledge in SEO, though. We all know we should be choosy with our hosts and that our sites should be accessible to search engine crawlers. The best way to optimize your website’s crawl depth is to have a relatively flat website structure:
Yandex even checks for the number of slashes in a URL! Too many slashes is a symptom of a deep site structure, which is something to avoid.
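If you want a quick check of your own URLs, counting path segments is a rough stand-in for counting slashes. The sketch below is only an illustration: it is not how Yandex computes anything, the function names are made up, and the threshold of 3 is an arbitrary assumption you should tune for your own site.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Count non-empty path segments, a rough proxy for how deep a URL sits."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


def flag_deep_urls(urls, max_depth=3):
    """Return URLs whose path depth exceeds max_depth (an arbitrary threshold)."""
    return [url for url in urls if path_depth(url) > max_depth]


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/",
    "https://example.com/blog/yandex-leak",
    "https://example.com/a/b/c/d/e",
]
deep = flag_deep_urls(urls)
print(deep)
```

Running this over a sitemap export gives you a short list of pages that may be buried deeper than they need to be.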
How your website performs in search results matters a lot to Yandex.
Does it matter to Google? No idea, but this does confirm what I call the “compounding effect of SEO”: the better your website performs in search results, the better it will go on to perform.
Starting a website from scratch is hard. Once you get a bit of traction and are ranking for keywords, it becomes easier. The more keywords you rank for, and the more your content appears in search results the more authority your website will build, and it continues to get easier from there.
Yandex pays close attention to how websites perform in search results. It checks for things like:
The biggest takeaway I find from this is to target queries you can rank for. This will help your site rank more consistently, build your authority, and make it easier to rank as time goes on. Rather than chase competitive queries and spin your wheels, pick your battles wisely, and you will see your site improve in search results.
Again, nothing new. I’ve seen this firsthand in working with our clients and building websites, but it’s fascinating to confirm.
Keyword usage matters, and it’s fascinating to see how.
Yandex looks for how the keyword is used along with synonyms, things like:
What’s interesting, though, is that it’s not just a target keyword that is factored in, it’s the search term. Again, no surprise there, but it’s easy as SEOs to forget that we are not targeting keywords as much as we are targeting search terms, and bearing in mind the different variations of searches around a core keyword is critical.
Word usage matters as well. Yandex pays close attention to whether “like terms” appear in the content. For example, if you have an e-commerce site, Yandex will look for words like “checkout”, “add to cart”, and “buy” on the page in addition to the product description.
Not too many surprises here. The key takeaway is to understand the intent behind each keyword you are targeting and think through the various ways that keyword is searched for, and ensure those variations are addressed in your content.
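One simple way to audit this on your own pages is to list the search variations and “like terms” you expect and check which ones never show up in the copy. The sketch below is a hypothetical helper, not anything taken from the Yandex file: the function name, the sample page text, and the term list are all assumptions, and the matching is a plain whole-phrase check on ASCII words.

```python
import re


def normalize(text):
    """Lowercase the text and collapse it to single-spaced ASCII words."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def missing_terms(page_text, terms):
    """Return the terms that never appear in page_text as a whole phrase."""
    haystack = f" {normalize(page_text)} "
    return [term for term in terms if f" {normalize(term)} " not in haystack]


# Made-up product page copy and an assumed list of terms to look for.
page = "Buy the trail running shoe today. Add to cart and head to checkout."
terms = ["buy", "add to cart", "checkout", "free shipping"]
gaps = missing_terms(page, terms)
print(gaps)
```

Anything the helper reports is a prompt to review the page, not proof that a ranking factor is being missed.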
One of the most common factors is content quality. Yandex pays close attention to the relevance of the content to the search query, how users interact with the content, and of course, the quality of the content.
Content quality is not just “is this well-written”. Yandex checks for things like:
Yandex looks at how well the content addresses the query, but it also pays very close attention to how visitors to the page interact with it and the experience that they have.
Bad content (poorly written, using too many links, too long, too short, or using too many difficult words) leads to a bad experience, and it is clear from this list and Google’s recent changes that search engines value experience.
Google added “Experience” to the classic EAT acronym, so this does not come as a shock. Given the amount of data Google has, I would not be surprised if Google is paying attention to how people interact with your content, how long they spend reading it, and how often they bounce off to click on something else.
Location and language come up a lot. Russia is a large and diverse country, so serving up content based on where you are is important. That’s the same for Google though and that is not limited to the United States either.
Still, it is interesting to see just how much location is factored in and how nuanced searches can be. This is something to bear in mind for local businesses, but it is also worth reviewing if you have a national/international brand. Search engines pay a lot of attention to the location of a search, and optimizing for one location does not guarantee you are optimized for them all.
No list of ranking factors would be complete without a mention of link building. As you might expect, link building comes up quite often in the Yandex list. Here are the notable factors:
Like I have been saying, no major surprises here. Links matter, but they need to be from good websites, and they need to be relevant to your site.
The theme of this article (as you can imagine) is: few surprises, but good to know.
I don’t feel compelled to overhaul my approach to SEO based on this exercise of reading through the Yandex list. However, I do feel compelled to continue to create a good experience for my readers through my website and content. This is yet another reminder of the drum I beat with our clients: your website is not a brochure, it is a product.
The more you can treat your website like a product to be used, rather than a dumping ground for content and links, the better your website will perform in search results. The Yandex leak shows one thing clearly: search engines value relevant content on websites that create good experiences. Download our free SEO strategy guide to see how you can leverage this approach with your website.