Putting facts first: changes to Google’s algorithm and their impact on content marketing

Google holds about 85% of the UK search market, so any change it makes to its ranking methodology affects everyone who produces content. Now the company is considering what could be its most controversial change yet.

In February, Google released a research paper outlining how it could rank individual pages on the quality and factual accuracy of their content rather than their popularity.

A long history of change

Google has been evolving its page-ranking algorithm – how it decides what ranks highest when returning a search query – since its foundation in the late 1990s. By 2003, the focus was on preventing the all-important first page from being dominated by commercial sites using paid-for links from dubious places. The change came when Google identified who was involved in this practice and was able to penalise them accordingly. Many companies lost their place on the first page overnight.

Google’s last major change came in 2013. The “Hummingbird” algorithm – following 2011’s “Panda” – set out to focus on the search query as a whole and interpret its intent, rather than looking solely at individual keywords. But backlinks from other sites still play a major part in what gets returned as search results to the user.

Now, a new methodology could move Google’s search rules further forward. In the future, pages may be downgraded for containing “factual inaccuracies” or other poor content. Google’s aim is to return high-quality, rather than simply popular, results.

The new way of doing things

Google proposes measuring a page’s “trustworthiness” and downgrading its search ranking if it finds incorrect facts or other examples of poor content. To do this, Google proposes scanning the text of any given page for factual statements and testing them against what it holds in its own database.

The database in question is Google’s Knowledge Vault (KV), containing around 1.6 billion facts gleaned from the web. Under the bonnet (or hood, if you’re in the US), the database is automatically compiled and evaluated from sources like Wikipedia and Freebase.

Defining a fact

In the research paper, Google cited as an example the nationality of US President Barack Obama. According to the KV, Obama was born in the US. As a result, any site stating that he was born in Kenya would be downgraded in Google’s rankings.
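As a rough illustration of the idea – not Google’s actual implementation, which extracts facts at web scale – the sketch below represents facts as (subject, predicate, object) triples and checks a page’s extracted statements against a small reference store standing in for the KV. Everything here is a hypothetical simplification.

```python
# Minimal sketch of checking extracted facts against a reference store.
# The store and the (subject, predicate, object) representation are
# hypothetical stand-ins for Google's Knowledge Vault and its extractors.

knowledge_base = {
    ("Barack Obama", "born_in"): "United States",
    ("Sholay", "language"): "Hindi",
}

def check_extraction(subject, predicate, obj):
    """True if the triple agrees with the knowledge base, False if it
    contradicts it, None if the fact is unknown."""
    known = knowledge_base.get((subject, predicate))
    if known is None:
        return None              # no evidence either way
    return known == obj

def trust_score(extracted_triples):
    """Fraction of verifiable extractions that match the knowledge base."""
    results = [check_extraction(*t) for t in extracted_triples]
    verifiable = [r for r in results if r is not None]
    if not verifiable:
        return None              # nothing to judge the page on
    return sum(verifiable) / len(verifiable)

# A page claiming Obama was born in Kenya scores zero on this measure.
page_facts = [("Barack Obama", "born_in", "Kenya")]
print(trust_score(page_facts))   # 0.0
```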

A site’s subject matter also comes into consideration when evaluating individual pages. This could be determined on the basis of a site’s overall content, its name, or “about us” page. Any new content would then be compared with this to see if it was “on topic”. If not, it could be penalised in the rankings.

Google’s engineers put it more succinctly: “If the website is about business directories in South America but the extractions [facts] are about cities and countries in South America, we consider them as not topic relevant.”
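A crude way to picture that check, assuming the site’s topics and each extraction’s category have already been labelled (both labels are invented for this sketch), is to discard any fact whose category the site is not judged to cover:

```python
# Illustrative sketch of discarding off-topic extractions before scoring.
# The topic labels and category field are hypothetical; the paper describes
# a more involved comparison between site subject matter and extractions.

def filter_on_topic(extractions, site_topics):
    """Keep only extracted facts whose category matches a topic the site
    is judged to cover (from its overall content, name or 'about us' page)."""
    return [fact for fact in extractions if fact["category"] in site_topics]

# The paper's example: a site about business directories in South America
# whose extractions concern cities and countries in South America.
site_topics = {"business directories"}
extractions = [
    {"category": "geography", "fact": ("Lima", "capital_of", "Peru")},
]
print(filter_on_topic(extractions, site_topics))   # [] -> not topic relevant
```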

Basic errors could also be penalised even if they are grammatically correct. For example, mistakenly adding an extra zero and stating that an athlete weighs 1,000 pounds could prove a problem.

Another downgrade could come from what Google calls a “triviality of facts”. One example given by Google concerns a review page on a website clearly dedicated to Hindi-language movies. If that page also states that the language of each film is Hindi, those statements could be deemed trivial.
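One way to picture the triviality test, under the simplifying assumption that the site’s defining attributes (here, language) are already known, is to flag any extracted fact that merely restates one of them:

```python
# Illustrative check for "trivial" facts: statements that only repeat an
# attribute already implied by the site as a whole. The attribute names
# are hypothetical.

def is_trivial(fact, site_attributes):
    """A fact is trivial if the site already asserts the same attribute
    and value for everything it covers."""
    predicate, obj = fact
    return site_attributes.get(predicate) == obj

# A site dedicated to Hindi-language films implies language == Hindi,
# so restating it for each film adds nothing.
site_attributes = {"language": "Hindi"}
print(is_trivial(("language", "Hindi"), site_attributes))      # True
print(is_trivial(("release_year", "1975"), site_attributes))   # False
```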

How content marketing could gain

Questions are already arising over whether Google is the right arbiter of what is correct, especially when the judgments are being made by software, however sophisticated. Obama’s place of birth is not the only issue disputed by a sizeable minority, whatever we may think about the beliefs and intentions involved. Sceptics of man-made climate change theories have raised concerns that sites could be downgraded in related search queries.

Arguably, if you’re producing good content then these changes should be a help rather than a hindrance. Part of maintaining a good content standard should also involve avoiding extraneous material. It might seem harsh, but one could question the editorial standards of a page entitled “Top 100 Hindi-language films” that has a line next to the 1975 classic Sholay stating it is in Hindi.

Being off-topic for the site could prove a problem, however. An article for a bank about a corporate social responsibility (CSR) programme could be downgraded because it is not about finance.

One possible benefit could be removing a perverse incentive faced by all content producers: at the moment they are only too aware that organic traffic for new content depends on search rankings based on popularity and backlinks. The introduction to Google’s paper makes this very point: high-quality content is effectively hidden under the current system.

Will it happen?

Right now, these proposals are some way from becoming reality. At the end of the paper, Google said it planned to do more work on the algorithm, such as ensuring site topics were analysed correctly. No date was given for the next milestone, but it outlined areas for further investigation and improvement. These include detecting sites that use copied content, improving the analysis of site content, and better defining what constitutes triviality.

Since it came into existence, Google has always sought to improve its number one product. An indication of how far it wants to go was given by Google founder Larry Page in 2004. In an interview with Bloomberg, he said: “The ultimate search engine would basically understand everything in the world, and it would always give you the right thing. And we’re a long, long ways from that.”