We read all 1800+ Yandex ranking factors and here is what we found

Tyler Scionti | Published on September 29, 2023

Goodbye ChatGPT, hello Yandex.

The source code for Yandex (a Russian search engine) leaked recently and the one question on everyone’s mind is whether we should care.

If you are reading this you are probably interested in knowing more about the Yandex leak, what it means, and how to leverage this rare inside look at a search engine.

You’re in the right place (one of them, at least). But let me stop you right there before you get too excited, at least for a moment.

Yandex is not Google.

Yes, it is a search engine. Yes, it was launched around the same time and designed to be a Google competitor. But there are pretty distinct differences as well. It’s important to take anything coming out of Yandex with a grain of salt because it’s not Google, and not necessarily indicative of how Google works.

But darn it, this was fun.

While there is a ton of poorly translated text and internal lingo that makes about half the ranking factors in the Yandex file useless, the other half is a fascinating look at how the Russian search engine operates and gives a few clues into how Google might work.

I read the entire list (as much as I could, at least) and I’m going to break down the findings in this article. I’ll answer a couple of questions for folks out of the loop first, so if you want to get to the good stuff, scroll down the page a bit.

What is Yandex?

Yandex is a Russian search engine. It is the largest technology company in Russia and the 5th largest search engine in the world behind Google, Bing, Yahoo!, and Baidu.

From testing a few queries, the results are close to Google’s but not quite the same. Right off the bat, there are distinct differences at play that (again) are worth bearing in mind.

Yandex was founded in the early 1990s and launched in 1997, around the same time that Google’s search engine was launched. I’ve seen claims that many ex-Googlers work at Yandex; however, I can find no proof of this apart from claims on Twitter.

All we do know is that Yandex was created around the same time as Google, has benefited from watching Google grow for 20+ years, and is the top technology employer in Russia, so make of that what you will.

What happened?

In January 2023 Yandex’s source code was leaked on a popular hacking forum. The hacker posted a 44.7 GB file they claimed contained the entire source code minus its anti-spam rules.

What is especially fascinating here is that it included a file of their 1800+ ranking factors for the world to see. Again, Yandex is not Google. However, this is a fascinating look into how search engines operate, and it gives some clues into how we can approach Google SEO.

I only accessed the ranking factors list, since that seems to be what most people care about anyway. There is much more to the source code, though, if you are adventurous.

Our process for analyzing the file

I nabbed the full TXT file, wrote a Python script to pull the descriptions into a CSV, and ran that CSV through Google Sheets where I could Google Translate it.
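Here is a minimal sketch of what a script like that could look like. The layout of the leaked file is not described in this article, so the `Name:` and `Description:` fields, the sample text, and the function names below are all assumptions for illustration, not the real format or the script I actually ran.

```python
import csv
import io
import re

# Hypothetical layout: one "Name:" line followed by one "Description:" line
# per factor. The real file's format is not described in this article.
SAMPLE = """\
Name: FactorA
Description: first description

Name: FactorB
Description: second description
"""

FIELD = re.compile(r"^(Name|Description):\s*(.*)$")


def parse_factors(text):
    """Return a list of (name, description) tuples from the assumed layout."""
    rows, name = [], None
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if not match:
            continue  # skip blank lines and anything we do not recognize
        key, value = match.groups()
        if key == "Name":
            name = value
        elif name is not None:
            rows.append((name, value))
            name = None
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "description"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_factors(SAMPLE)))
```

Once the CSV is in Google Sheets, the description column can be translated in bulk, for example with the built-in `GOOGLETRANSLATE` function.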

The translation is an imperfect process, but it was quick enough for me to read the file in a weekend and make as much sense of it as possible.

There are over 1800 factors in the Yandex ranking factor list, and about half of them are worth talking about. Poor translations aside, there is a lot of internal lingo and references to an internal wiki that I cannot access.

If you’ve worked in a software company you are likely familiar with the niche inside terms that your team uses in their code. It’s no different with large companies and search engines: there is a lot of terminology here that limits the usefulness of the file.

A summary of what we found

For the most part, the contents of the file are not surprising.

Many of the ranking factors are things we already “knew”. However, there are a few surprises I’ll get to later. I did my best to classify the factors according to their “SEO category”. Here is a distribution of terms:

And here’s how often different industries are mentioned:

At a glance, this tells us:

  • Search engines pay attention to the industry of a query
  • They also care a lot about spam/pornography
  • Content quality matters
  • Keyword usage is important
  • Links matter

This is not new, but it is nice to get a little confirmation. Now for some deeper analysis!

Key takeaways

Again, take this with a grain of salt. After reviewing each Yandex ranking factor, there are three things I’d prioritize:

  • Do the basics of technical optimization (have a good host, make sure your website is easy to crawl and fast)
  • Create high-quality content
  • Prioritize relevance and user experience

Let’s dig deeper into the specifics.

Technical performance and crawl depth

Technical performance and crawl depth come up quite a bit in the Yandex file.

Crawl depth is how many levels down from the home page a search engine spider has to crawl to find a page on your website. Crawl depth is mentioned 4 times in the ranking factors list.

Like Google, Yandex does not want to try hard or dig deep into your website to find key pages. Based on the ranking factors it is important to:

  • Have a good, reliable host so the site is accessible
  • Make sure your pages load quickly
  • Ensure important pages are linked from the home page
  • Limit the number of levels down to a page from the home page
  • Do not orphan pages (pages that are not linked from anywhere)
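To make the last three points concrete, here is a rough sketch of how you could audit your own site. It assumes you already have an internal link map from a crawler export; the `SITE` dictionary and the function names are made up for illustration, and this is not how Yandex computes anything.

```python
from collections import deque


def crawl_depths(links, home):
    """Breadth-first search from the home page: {page: clicks from home}."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links, home):
    """Pages with no inbound links at all (the home page is excluded)."""
    linked_to = {target for targets in links.values() for target in targets}
    return sorted(page for page in links if page not in linked_to and page != home)


# Hypothetical site: each key is a page, each value lists the pages it links to.
SITE = {
    "/": ["/blog", "/pricing"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-1/comments"],
    "/blog/post-1/comments": [],
    "/pricing": [],
    "/old-landing-page": ["/pricing"],
}

if __name__ == "__main__":
    for page, depth in sorted(crawl_depths(SITE, "/").items(), key=lambda item: item[1]):
        print(depth, page)
    print("orphans:", orphan_pages(SITE, "/"))
```

Pages that come back with a large depth, or that show up in the orphan list, are the ones to link from higher up in the site.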

This is somewhat basic/common knowledge in SEO, though. We all know we should be choosy with our hosts and that our sites should be accessible to search engine crawlers. The best way to optimize your website’s crawl depth is to have a relatively flat website structure:

Yandex even checks for the number of slashes in a URL! Too many slashes is a symptom of a deep site structure, which is something to avoid.
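If you want to spot deep URLs on your own site, a quick check like the one below works. It counts non-empty path segments as a stand-in for slashes; how Yandex actually counts them is not something I can confirm from the file, and the example URLs are placeholders.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Count non-empty path segments, a rough proxy for slashes in the URL."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    for url in (
        "https://example.com/",
        "https://example.com/blog/yandex-leak",
        "https://example.com/resources/blog/2023/09/seo/yandex-leak/",
    ):
        print(path_depth(url), url)
```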

Search performance

How your website performs in search results matters a lot to Yandex.

Does it matter to Google? No idea, but this does confirm what I call the “compounding effect of SEO”, meaning the better your website performs in search results the better it will perform.

Starting a website from scratch is hard. Once you get a bit of traction and are ranking for keywords, it becomes easier. The more keywords you rank for, and the more your content appears in search results the more authority your website will build, and it continues to get easier from there.

Yandex pays close attention to how websites perform in search results. It checks for things like:

  • Your click-through rate from search results
  • How your site performs at different times of day/days of the week
  • Your average rank across keywords you rank for
  • How your website stacks up against competitors in your niche
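You can track the first and third of these yourself from any search performance export. The sketch below assumes a list of rows with `impressions`, `clicks`, and `position` fields; those field names and the numbers are invented for the example, and the formulas are the standard definitions rather than anything taken from the Yandex file.

```python
def click_through_rate(clicks, impressions):
    """Clicks divided by impressions; 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def average_rank(positions):
    """Plain mean of the ranking positions, or None for an empty list."""
    return sum(positions) / len(positions) if positions else None


# Hypothetical export rows, one per query.
ROWS = [
    {"query": "seo strategy", "impressions": 1000, "clicks": 50, "position": 3},
    {"query": "yandex leak", "impressions": 400, "clicks": 10, "position": 8},
    {"query": "ranking factors", "impressions": 0, "clicks": 0, "position": 25},
]

if __name__ == "__main__":
    for row in ROWS:
        ctr = click_through_rate(row["clicks"], row["impressions"])
        print(f"{row['query']}: CTR {ctr:.1%}, position {row['position']}")
    print("average rank:", average_rank([row["position"] for row in ROWS]))
```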

The biggest takeaway I find from this is to target queries you can rank for. This will help your site rank more consistently, build your authority, and make it easier to rank as time goes on. Rather than chase competitive queries and spin your wheels, pick your battles wisely, and you will see your site improve in search results.

Again, nothing new. I’ve seen this firsthand in working with our clients and building websites, but it’s fascinating to confirm.

Keyword usage

Keyword usage matters, and it’s fascinating to see how.

Yandex looks for how the keyword is used along with synonyms, things like:

  • Whether the keyword is explicitly used
  • How often the keyword is used
  • Where the keyword is used
  • How synonyms are used
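Here is a small sketch of how you might check those four things on your own page copy. The tokenizer is deliberately simple, the synonym list is something you would supply yourself, and none of the names come from the Yandex file.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(words, phrase):
    """Token indexes at which the phrase starts."""
    target = tokenize(phrase)
    if not target:
        return []
    size = len(target)
    return [i for i in range(len(words) - size + 1) if words[i:i + size] == target]


def keyword_usage(text, search_term, synonyms=()):
    """Report whether, how often, and where a search term appears in the text."""
    words = tokenize(text)
    positions = phrase_positions(words, search_term)
    return {
        "used": bool(positions),
        "count": len(positions),
        "first_position": positions[0] if positions else None,
        "synonym_counts": {s: len(phrase_positions(words, s)) for s in synonyms},
    }


if __name__ == "__main__":
    copy = "Running shoes for beginners. Our trainers are built for running."
    print(keyword_usage(copy, "running shoes", ["trainers", "sneakers"]))
```

The report tells you at a glance whether the exact term is missing, buried far down the page, or only covered through synonyms.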

What’s interesting, though, is that it’s not just a target keyword that is factored in, it’s the search term. Again, no surprise there, but it’s easy as SEOs to forget that we are not targeting keywords as much as we are targeting search terms, and bearing in mind the different variations of searches around a core keyword is critical.

Word usage matters as well. Yandex pays close attention to whether “like terms” appear in the content. For example, if you have an e-commerce site, Yandex will look for words like “checkout”, “add to cart”, and “buy” on the page in addition to the product description.
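A quick way to sanity-check this on a product page is to look for those terms yourself. The term list below only contains the three examples from this article, and the function name is made up for the sketch.

```python
import re

# Only the three example terms mentioned above; extend the list for your niche.
COMMERCE_TERMS = ("checkout", "add to cart", "buy")


def commerce_terms_found(page_text, terms=COMMERCE_TERMS):
    """Return the terms that appear in the page text as whole words or phrases."""
    lowered = page_text.lower()
    return [
        term
        for term in terms
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]


if __name__ == "__main__":
    print(commerce_terms_found("Add to Cart or proceed to checkout"))
```

The word-boundary match is a design choice: it keeps “buy” from matching inside words like “buyers”.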

Not too many surprises here. The key takeaway is to understand the intent behind each keyword you are targeting and think through the various ways that keyword is searched for, and ensure those variations are addressed in your content.

Content quality

One of the most common factors is content quality. Yandex pays close attention to the relevance of the content to the search query, how users interact with the content, and of course, the quality of the content.

Content quality is not just “is this well-written”. Yandex checks for things like:

  • Do like words go together
  • Do the titles use excessive capitalization
  • The ratio of links (internal and external) to text on the page
  • Whether a page is especially long and does not link anywhere
  • The number of sentences
  • The average length of the words in the text
  • If there is a large image on the page (featured image?)
  • Do visitors to the page bookmark it
  • Do visitors to the page come back to it
  • How often do visitors come back to it within a set period
  • Does the page contain offensive language, spam, or pornography
  • Does the page contain pirated video content, or videos that do not load
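A few of the text-based checks in this list are easy to approximate on your own content. The sketch below covers title capitalization, sentence count, average word length, and links relative to text. The formulas are my own simple stand-ins, not the ones Yandex uses, and the file gives no thresholds for what counts as too much.

```python
import re

WORD = re.compile(r"[A-Za-z']+")


def uppercase_ratio(title):
    """Share of the letters in a title that are uppercase."""
    letters = [char for char in title if char.isalpha()]
    return sum(char.isupper() for char in letters) / len(letters) if letters else 0.0


def sentence_count(text):
    """Count sentences by splitting on ., ! and ? (a rough approximation)."""
    return len([part for part in re.split(r"[.!?]+", text) if part.strip()])


def average_word_length(text):
    """Mean number of characters per word."""
    words = WORD.findall(text)
    return sum(len(word) for word in words) / len(words) if words else 0.0


def links_per_100_words(text, link_count):
    """Number of links on the page for every 100 words of text."""
    words = WORD.findall(text)
    return 100 * link_count / len(words) if words else 0.0


if __name__ == "__main__":
    title = "BUY NOW: The BEST Running Shoes"
    body = "These shoes are light. They last for years! Want a pair?"
    print("uppercase ratio:", round(uppercase_ratio(title), 2))
    print("sentences:", sentence_count(body))
    print("average word length:", round(average_word_length(body), 2))
    print("links per 100 words:", round(links_per_100_words(body, 2), 1))
```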

Yandex looks at how well the content addresses the query, but it also pays very close attention to how visitors to the page interact with it and the experience that they have.

Bad content (poorly written, using too many links, too long, too short, or using too many difficult words) leads to a bad experience, and it is clear from this list and Google’s recent changes that search engines value experience.

Google added “Experience” to the classic E-A-T acronym (making it E-E-A-T), so this does not come as a shock. Given the amount of data Google has, I would not be surprised if Google is paying attention to how people interact with your content, how long they spend reading it, and how often they bounce off to click on something else.

Location/Geography

Location and language come up a lot. Russia is a large and diverse country, so serving up content based on where you are is important. The same is true for Google, though, and not just in the United States.

Still, it is interesting to see just how much location is factored in and how nuanced searches can be. This is something to bear in mind for local businesses but also worth reviewing if you have a national/international brand. Search engines pay a lot of attention to the location of a search, and optimizing for one location does not guarantee you are optimized for them all.

Link building

No list of ranking factors would be complete without a mention of link building. As you might expect, link building comes up quite often in the Yandex list. Here are the notable factors:

  • Where the links come from
  • The age of the links
  • The relevancy of the links (to the site and to the search term)
  • The anchor text for the link
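If you keep a list of your backlinks, you can review the age and anchor text points with a few lines of code. The record fields (`source`, `anchor`, `first_seen`) and the sample data are assumptions for this sketch, and the simple substring match is only a crude stand-in for relevance.

```python
from datetime import date


def link_summary(links, search_term, today):
    """For each backlink, report its age in days and whether the anchor text contains the term."""
    term = search_term.lower()
    return [
        {
            "source": link["source"],
            "age_days": (today - link["first_seen"]).days,
            "anchor_matches": term in link["anchor"].lower(),
        }
        for link in links
    ]


# Hypothetical backlink records for illustration.
BACKLINKS = [
    {"source": "example.org", "anchor": "SEO strategy guide", "first_seen": date(2023, 1, 1)},
    {"source": "example.net", "anchor": "click here", "first_seen": date(2023, 1, 21)},
]

if __name__ == "__main__":
    for row in link_summary(BACKLINKS, "seo strategy", date(2023, 1, 31)):
        print(row)
```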

Like I have been saying, no major surprises here. Links matter, but they need to be from good websites, and they need to be relevant to your site.

How will you apply the Yandex leak to search?

The theme of this article (as you can imagine) is: few surprises, but good to know.

I don’t feel compelled to overhaul my approach to SEO based on this exercise of reading through the Yandex list. However, I do feel compelled to continue to create a good experience for my readers through my website and content. This is yet another reminder of the drum I beat with our clients: your website is not a brochure, it is a product.

The more you can treat your website like a product to be used, rather than a dumping ground for content and links, the better your website will perform in search results. The Yandex leak shows one thing clearly: search engines value relevant content on websites that create good experiences. Download our free SEO strategy guide to see how you can leverage this approach with your website.
