Google Ranking Factor Studies Debugged
Over the years a great number of studies have tried to show which Google ranking factors are the most important. Recently another big SEO Ranking Factors study came out – this time from the tool provider SEMRush.
Although I very much appreciate all SEO studies as a basis for discussing what might be important ranking factors, most of them suffer from the same basic flaws. Often conclusions are drawn that are simply not valid, and that certainly don’t align with my professional experience of what actually works.
The most recent study from SEMRush is better than most, but many of the problems remain the same. In this post I will look closer at a few of the main problems and explain why I think the real value is limited.
Correlation Doesn’t Imply Causation
You probably already know this, but to be sure we all agree let me repeat the facts: just because two numbers correlate does not prove that either of them has any influence on the other. Often it is just a coincidence. Even if the curves look very similar, that is still not, in itself, any proof of causation.
Let’s take a look at a few funny examples to highlight my point …
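A quick way to see this for yourself is a small sketch with made-up numbers. The two series below are hypothetical and generated independently of each other; the only thing they share is that both grow over time. The names `series_a` and `series_b`, the seed and the noise levels are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def changes(series):
    """Week-over-week differences, which remove the shared upward trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Synthetic data: two unrelated metrics that both happen to grow each week.
rng = random.Random(7)
weeks = range(200)
series_a = [1.0 * t + rng.gauss(0, 2) for t in weeks]
series_b = [3.0 * t + rng.gauss(0, 5) for t in weeks]

# The raw curves look almost identical in shape ...
r_levels = pearson(series_a, series_b)
# ... but the weekly changes have nothing to do with each other.
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of the raw curves:     {r_levels:.2f}")
print(f"correlation of the weekly changes: {r_changes:.2f}")
```

The raw curves correlate strongly simply because both go up. Once the trend is removed, most of that apparent relationship should disappear.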
Direction of Correlation – or No Correlation
Even if all logic indicates that two numbers are connected, you usually can’t be sure which way the influence goes.
A good example can be found in the new SEMRush Ranking Factors study, in what comes out as the most important factor: “Website Visits”. What they found is that the more visits a website has, the better it ranks.
But does a site rank better because it has more traffic, or does it have more traffic because it ranks better? Or is there really no causal link between the two at all?
In the SEMRush study they actually did try to eliminate search traffic, and the result came out almost the same. I am glad they did this, but it still does not prove much.
Maybe Google is just very good at finding the best websites and giving them the best rankings, and those are the same websites that most people visit because they actually are very good.
We had that same discussion almost 10 years ago, when the first correlation studies between social visibility and Google rankings came out. Several studies showed that if a site is more visible in the major social media and appears more often in the social streams, such as Twitter, Facebook and LinkedIn, it also tends to rank better in Google. That is true.
But again, as I also argued back then: how do we know, just based on this sort of analysis, that one actually causes the other? The fact is we don’t!
Maybe Facebook and Google are just equally and independently good at finding the best sites. Maybe what most people choose to share, comment on and “like” are the most interesting sites, the talk of the town, and Google, with its own algorithms, also finds this to be true.
I am not saying that social visibility is not a ranking factor. I think it is, and my own real-life tests have proven me right. But the point is that from a correlation study in itself we just can’t be sure.
With my 20 years of experience in SEO I am 100% sure that Google does not use website traffic as a direct ranking factor. Not even a small one. It would simply be much too easy to manipulate, and in itself it is a very weak indicator of quality or relevancy. Even if Google only included Chrome traffic, it would still be too easy to manipulate that data for those of us who know how to. No major search engine today is this “stupid”. It would be a spammer’s heaven.
Rather, I am pretty sure it’s an unrelated correlation. Websites that have the most traffic have most often invested a lot in getting it in many ways: organic optimization at all levels, e-mail marketing, social and paid. So of course there is a correlation. It’s just not a cause.
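The “unrelated correlation” argument can be sketched as a small simulation. This is only an illustration under my own assumptions, not a model of how Google or the SEMRush study works: a hidden `quality` value (hypothetical) drives both traffic and ranking, while traffic never feeds into ranking and ranking never feeds into traffic.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)
n_sites = 2000

# Assumption: every site has a hidden "quality" that nobody measures directly.
quality = [rng.gauss(0, 1) for _ in range(n_sites)]
# Traffic depends on quality plus noise. It never looks at the ranking.
traffic = [q + rng.gauss(0, 1) for q in quality]
# The ranking score depends on quality plus noise. It never looks at traffic.
rank_score = [q + rng.gauss(0, 1) for q in quality]

# Traffic and ranking correlate even though neither influences the other.
r_raw = pearson(traffic, rank_score)

# Remove the shared driver and compare only what is left over.
traffic_rest = [t - q for t, q in zip(traffic, quality)]
rank_rest = [r - q for r, q in zip(rank_score, quality)]
r_controlled = pearson(traffic_rest, rank_rest)

print(f"traffic vs. ranking:                  {r_raw:.2f}")
print(f"same, with quality taken out of both: {r_controlled:.2f}")
```

In this toy setup a correlation study would report traffic as a strong “ranking factor”, although by construction it is not one. In real data the hidden driver cannot simply be subtracted, which is exactly why a correlation alone settles nothing.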
Full or Partial Match
I am not sure how SEMRush has done the keyword matching part of their study related to on-page factors, because they do not provide any detailed information about it in the report. But a very common problem (also known from some of the most used SEO tools and plugins) is that only full matches are counted. Let me give you an example.
Many people search for “paintball Copenhagen”. However, in a good text that exact keyword phrase is hard to use. Most often it will be used with one or more words between the two words in the keyword phrase. Something like this: “We provide paintball in Copenhagen” or “come and play great paintball games in Copenhagen”.
But Google is smart enough to understand this. The text examples above will be relevant for the keyword phrase “paintball Copenhagen”. In fact, with Hummingbird and RankBrain many would argue that using variations of your keywords that mix in related words, as in my (simple) example above, actually scores better in Google today. In my practical experience this is true.
So if you do a study that only looks for 100% exact phrase matches (or use a tool that does), you will most likely not end up with valid results. In the SEMRush report I suspect that the low score they got for on-page factors may be due to this problem, but again I can’t be sure, because they did not provide details about the data matching.
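To make the difference concrete, here is a minimal sketch of the two ways of counting a match. It is not how SEMRush or Google does it (neither is documented here); the function names `exact_phrase_match` and `all_terms_match` are my own, and “all terms appear somewhere in the text” is just one simple stand-in for partial matching.

```python
import re


def tokens(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_phrase_match(keyword, text):
    """True only if the keyword words appear next to each other, in order."""
    kw, words = tokens(keyword), tokens(text)
    if not kw:
        return False
    n = len(kw)
    return any(words[i:i + n] == kw for i in range(len(words) - n + 1))


def all_terms_match(keyword, text):
    """True if every keyword word appears somewhere in the text."""
    kw, words = tokens(keyword), set(tokens(text))
    return bool(kw) and all(term in words for term in kw)


keyword = "paintball Copenhagen"
texts = [
    "We provide paintball in Copenhagen",
    "come and play great paintball games in Copenhagen",
    "Paintball Copenhagen is open all year",
]

for text in texts:
    print(f"{text!r}: exact={exact_phrase_match(keyword, text)}, "
          f"partial={all_terms_match(keyword, text)}")
```

A study that counts only `exact_phrase_match` would treat the first two texts as not containing the keyword at all, even though both are clearly about paintball in Copenhagen.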
Google Ranking Factors Are Relative
In the good old days search algorithms were simple “one stringed” mechanisms applied to all websites, languages, verticals and searches. That is not the case today!
For one type of search, in one language, in one market, the factors that count may be completely different from those that count in another scenario.
So making a list of fixed factors to analyse all markets and a broad scope of keywords just makes very little sense. You have to look at your own market and analyze the top-ranking websites in it to get a true understanding of what makes them rank.
You could argue that by analyzing a broad scope you get a useful average idea of what ranking factors count. In my mind you don’t. The problem is that “average” data might be good for statistics, but it’s not good for operational work.
It’s similar to human analysis and marketing. Although it can be interesting to analyse average human psychology factors, using them to target humans just doesn’t work. There are no average people. Only in statistics. You can’t talk to average people. In real life you need to target the real people.
The same is true in SEO. Average “truths” about what the most important ranking factors are will most often not work for your specific market. You need to specifically optimize for what works in your niche.
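The point about averages can be shown with a tiny made-up example. The numbers below are invented purely for illustration: in the hypothetical `market_a` a factor goes hand in hand with better rankings, in `market_b` it goes the opposite way, and the pooled “average” view sees almost nothing.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Invented data: (factor value, ranking score) pairs per market.
market_a = [(1, 1.2), (2, 1.9), (3, 3.1), (4, 3.8), (5, 5.0)]
market_b = [(1, 5.1), (2, 3.9), (3, 3.0), (4, 2.2), (5, 0.9)]


def correlation(pairs):
    """Correlation between factor value and ranking score for a set of sites."""
    factor = [f for f, _ in pairs]
    score = [s for _, s in pairs]
    return pearson(factor, score)


r_a = correlation(market_a)
r_b = correlation(market_b)
r_pooled = correlation(market_a + market_b)

print(f"market A: {r_a:+.2f}")
print(f"market B: {r_b:+.2f}")
print(f"pooled:   {r_pooled:+.2f}")
```

Read on its own, the pooled figure would suggest the factor hardly matters, which would be the wrong conclusion for both markets.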
Details About the SEMRush Ranking Factor 2017 Study
In the methodology description at the beginning of the report, SEMRush does state that “correlation analysis is not the best match for this type of research”, and they further claim that “we applied more complex methods in order to reveal the parameters that influence SERP results”.
However, they do not explain what methods they have used to verify the validity of the correlation data presented in the report. To me, it looks like most of it is pure correlation data, segmented by search volume but not by keyword type, language or region, and not cross-referenced with any other factors that might be related. Maybe they did do some other magic to the numbers that they just have not documented.
However, they do also end the methodology description with a very important disclaimer:
We cannot state explicitly that if you improve the factor X, you will rank higher for Y, but we have come up with a list of observations regarding the nature of these alleged ranking factors.
Keep that in mind. I actually like that statement, and something like it should indeed be included in all such studies. I am just afraid that many will not read it and will only look at the results of the study, however weak the results and conclusions may be.
You can download the full study from SEMRush here.
Can the SEMRush Study Be Used at All?
It depends on what you want to use it for. I have just used it as a basis for this blog post to discuss Google Ranking Factors. I think that’s a great use of it.
But if you want to use it to get a true blueprint of how Google works, and use the results as they are to tailor your future SEO strategies, then I am afraid it’s not very useful. Please don’t do that!
Use studies like this to question yourself. Use them for discussions with colleagues and experts in the field. Use them as inspiration, but never accept them as solid facts.
My good friend Martin MacDonald posted a very good comment on the updated version of the SEMRush study that you should read:
https://webmarketingschool.com/semrush-direct-traffic-ranking-factor-claim/