A recent study by SEO company Searchmetrics asserts that there is a high degree of correlation between social signals and Google ranking. One of the especially provocative conclusions from the study is that Google Plus social signals have the highest correlation of any variable informing search engine ranking. If true, this could be a huge shift in the conventional wisdom about search and the value of social media.
I want to discuss their findings and the statistical method behind their conclusions, and show why social signals probably aren’t informing search as heavily as this study implies.
Searchmetrics looked at the top 10,000 Google keywords and analyzed the first three SERPs for each. They captured a number of factors from the top web pages and did a regression analysis of each, calculating a correlation coefficient (expressed as a number between -1 and 1).
Here is a summary of the variables and their correlation coefficients:
They also found that the average page load time for top results was 1.2 seconds.
Three aspects of the study that are worthy of further discussion are the use of the correlation coefficient, the benchmarking of large digital entities versus small, and the implication of social correlation for search.
What is the correlation coefficient? It is an expression of how closely two variables move together, or put another way, how well a best-fit line matches its data. It is expressed as a number between -1 and 1, where -1 shows a perfect inverse correlation between two variables, 1 shows a perfect positive correlation, and 0 indicates no correlation. As a rule of thumb, a magnitude greater than .8 indicates high correlation between variables, and less than .5 is considered weaker evidence of correlation. Here is a graphic representation of diminishing correlation coefficients from Wolfram Alpha:
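To make the definition concrete, here is a minimal Python sketch of the Pearson correlation coefficient, the most common form of the statistic. The function name pearson_r and the sample numbers are invented for illustration. They are not Searchmetrics data, and this is not a claim about the exact formula the study used.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (the 1/n factors cancel, so they are omitted).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up sample data, chosen only to show the range of the statistic.
x = [1, 2, 3, 4, 5]
perfect = pearson_r(x, [2, 4, 6, 8, 10])   # y rises in lockstep with x
inverse = pearson_r(x, [10, 8, 6, 4, 2])   # y falls in lockstep with x
noisy = pearson_r(x, [2, 1, 4, 3, 5])      # y mostly rises, with some scatter

print(perfect, inverse, noisy)
```

The first two cases land on the extremes of 1 and -1, while the scattered case comes out at 0.8, right at the edge of what the rule of thumb above would call high correlation.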
Because the highest coefficient in the Searchmetrics data is only .42, the data is certainly not as consistent as the narrative suggests. In fact, even the strongest correlation coefficients in this study are weak. In the study, the authors make strong claims using language like “clearly important” and “considerable negative effect” for variables that effectively show no correlation to search. It’s very hard to see the utility of any of this data in the context of such weak correlation. I suspect that the same data set, framed differently, could have produced stronger results.
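One rough way to see how weak .42 is: if the coefficient is read as a Pearson-style r from a simple linear fit (an assumption on my part, not something the study confirms), then squaring it gives the share of variation in one variable that the other accounts for. A quick sketch comparing .42 with the .8 rule-of-thumb level:

```python
# Treating these as Pearson-style r values is an assumption for illustration.
coefficients = {
    "strongest in study": 0.42,
    "rule-of-thumb 'high'": 0.8,
}

explained = {}
for label, r in coefficients.items():
    # r squared: fraction of variance accounted for by a simple linear fit.
    explained[label] = r ** 2
    print(f"{label}: r = {r}, explained = {explained[label]:.1%}, "
          f"unexplained = {1 - explained[label]:.1%}")
```

Under that reading, even the strongest variable leaves more than four fifths of the variation unaccounted for.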
Benchmarking large digital entities versus small
One of the big takeaways for me (although it wasn’t explicitly stated) was the monopoly of big business for top keywords on the SERP. The telling statistic for me was the page load time of 1.2 seconds. My page load time is 1.5 seconds, which is both fast and outside the threshold of page load time in the study. Consider these as well:
chrisbrogan.com – 0.96 sec
marismith.com (lots of Facebook signals here) – 2.23 sec
copyblogger.com – 1.5 sec
socialmediaexaminer.com – 2.89 sec
convinceandconvert.com – 3.22 sec
mashable.com – 1.74 sec
Of these very prominent social media sites, only one comes in under the 1.2-second average cited in the study. This suggests that a lot of the big keywords belong to bigger businesses. The budget needed to run such high-performing web properties is presumably matched by spending on marketing and distribution as well, something smaller businesses can’t compete with. For everyone else, this is probably a strong indicator that long-tail keyword research is an important factor for search.
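The comparison above is easy to check for yourself. This small Python sketch uses the load times listed in this post and the study's 1.2-second average as the cutoff. The variable names are my own, and the numbers are one-off measurements, so treat it as a rough illustration and not a benchmark.

```python
# Load times in seconds, as listed above (single measurements, not averages).
load_times = {
    "chrisbrogan.com": 0.961,
    "marismith.com": 2.23,
    "copyblogger.com": 1.5,
    "socialmediaexaminer.com": 2.89,
    "convinceandconvert.com": 3.22,
    "mashable.com": 1.74,
}

STUDY_AVERAGE = 1.2  # average load time of top results reported by the study

# Sites at or under the study's average, and those over it.
within = [site for site, secs in load_times.items() if secs <= STUDY_AVERAGE]
slower = [site for site, secs in load_times.items() if secs > STUDY_AVERAGE]

print("at or under the study average:", within)
print("slower than the study average:", slower)
```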
Implication of social correlation informing search
There are myriad ways to controvert the assertion that social is informing search to such a high degree. You could point out that Google doesn’t pay a whole lot of attention to Facebook, or that Searchmetrics made the same assertion about the influence of Google Plus in their report last year when Google Plus was far under the radar. What troubles me about using low correlation to make determinations about search is that you could use nearly any variable and it could be misconstrued as correlated with search. For some of these factors, roast beef consumption in the U.S. at the time of publishing would probably show a greater correlation than the statistics cited.
In the case of social media, you have a piece of content that Google thinks is valuable. It stands to reason that readers find it valuable as well and share it. There will be some correlation there, but should we draw the conclusion that social is informing search because of this? No. Until an assertion like this has been substantiated by multiple credible analyses, it’s probably not time to shift your link-building strategy to Google Plus empire building. That is not to say that social has no influence over search, but Facebook’s Graph Search is a strong indication that social can’t be a major contributor to a SERP just yet.
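The common-cause argument above can be illustrated with a small simulation. In this hypothetical model, a hidden "quality" score drives both a page's ranking signal and its share count, with independent noise added to each. Shares never feed into ranking. Every name and parameter here is invented for illustration, and nothing in it reflects how Google actually ranks pages.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hidden common cause: how good the content is (hypothetical scale).
quality = [rng.gauss(0, 1) for _ in range(n)]

# Ranking and shares each depend on quality plus their own independent noise.
ranking = [q + rng.gauss(0, 1) for q in quality]
shares = [q + rng.gauss(0, 1) for q in quality]

# Raw correlation: expected to be near 0.5 with these equal variances.
r_raw = pearson_r(ranking, shares)

# Subtract out quality; what remains is just the independent noise terms.
ranking_resid = [r - q for r, q in zip(ranking, quality)]
shares_resid = [s - q for s, q in zip(shares, quality)]
r_controlled = pearson_r(ranking_resid, shares_resid)

print(f"raw correlation: {r_raw:.2f}")
print(f"after removing quality: {r_controlled:.2f}")
```

In this toy setup, shares and ranking correlate at roughly the level the study reports for its strongest variable, and the relationship all but disappears once the shared cause is removed.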
But the study does substantiate the value of creating content that people want to consume. Whether social informs search or they work somewhat independently, if you create content that is meaningful for people, it will have value regardless of its distribution channel.