Google News: How does the search giant’s headline aggregator work?

Silicon Valley giant's popular service uses complex algorithm to assess page quality 

Joe Sommerlad
Monday 18 June 2018 17:15 BST
Google CEO Sundar Pichai (Justin Sullivan/Getty Images)

Google News is checked by millions of people on a daily basis looking for quick access to a range of coverage of a given event or issue.

It was founded by software developer Krishna Bharat in 2002 in response to the scramble for news that followed the attacks on the World Trade Centre on 11 September 2001.

The service collects and ranks articles on the topics making international headlines and groups them into clusters, allowing readers to choose which publication’s account they read.

But how does Google rank the content it shows?

Rather than a physical team of news editors, Google relies on an algorithm whose methodology, like Colonel Sanders’ recipe for fried chicken, is a closely guarded secret.

The algorithm reviews content automatically, looking for indicators of quality, assessing a story’s placement based on the number of user clicks it is attracting, the popular consensus on the trustworthiness of its publisher, the relevance of the story to the reader’s current geographical location and the freshness (i.e. publication date and time) of the story in question.
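Since Google's real formula is secret, the following Python is only a rough sketch of how four signals of this kind could be blended into a single ranking score. Everything in it is an assumption made for illustration: the `Story` fields, the `WEIGHTS` values, the 12-hour freshness half-life and the one-million-click ceiling are hypothetical and are not taken from Google.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Story:
    title: str
    clicks: int             # user clicks the story has attracted so far
    publisher_trust: float  # 0.0 to 1.0, a hypothetical trust rating
    is_local: bool          # publisher is based near the reader
    published: datetime


# Hypothetical weights: the real signals and their weighting are not public.
WEIGHTS = {"clicks": 0.3, "trust": 0.3, "local": 0.2, "fresh": 0.2}


def score(story: Story, now: datetime, half_life_hours: float = 12.0) -> float:
    """Blend the four signals into one number between 0 and 1."""
    age_hours = max((now - story.published).total_seconds() / 3600, 0.0)
    # Freshness halves every `half_life_hours` (an assumed decay curve).
    freshness = 0.5 ** (age_hours / half_life_hours)
    # Log-scale the clicks so one viral story does not swamp everything else.
    click_signal = min(math.log1p(story.clicks) / math.log1p(1_000_000), 1.0)
    return (
        WEIGHTS["clicks"] * click_signal
        + WEIGHTS["trust"] * story.publisher_trust
        + WEIGHTS["local"] * (1.0 if story.is_local else 0.0)
        + WEIGHTS["fresh"] * freshness
    )


def rank(stories: list[Story], now: datetime) -> list[Story]:
    """Return the stories ordered from highest to lowest score."""
    return sorted(stories, key=lambda s: score(s, now), reverse=True)


# Example: two reports on the same event, published an hour ago.
now = datetime(2018, 6, 18, 17, 15, tzinfo=timezone.utc)
local_report = Story("Fire in London", 5_000, 0.8, True, now - timedelta(hours=1))
distant_report = Story("Fire in London", 20_000, 0.9, False, now - timedelta(hours=1))
ranked = rank([distant_report, local_report], now)
```

With these made-up weights, the local report ranks first despite attracting fewer clicks and carrying a slightly lower trust rating, because the location signal outweighs the difference.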

Google News is therefore more likely to rank British news sites highly when the story concerns a fire in London than reports on the same incident from much-admired publishers from further afield like The New York Times or Washington Post.

The recurrence of specific keywords across publications and the level of public interest indicated by user searches guide the algorithm in its creation and organisation of specific subjects into clusters.
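As a loose illustration of clustering by recurring keywords, the Python sketch below groups headlines whose keyword sets overlap. It uses Jaccard similarity with a greedy single pass, a simple technique chosen for this example rather than Google's undisclosed method. The stopword list and the 0.3 threshold are assumptions, and the search-interest signal mentioned above is not modelled.

```python
import re

# A tiny, assumed stopword list; a real system would use a far larger one.
STOPWORDS = {"a", "an", "the", "in", "on", "of", "to", "and", "for", "as", "at", "is"}


def keywords(headline: str) -> set[str]:
    """Lower-case the headline and keep its non-stopword words."""
    words = re.findall(r"[a-z]+", headline.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of keywords the two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(headlines: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Greedily assign each headline to the most similar existing cluster."""
    clusters: list[dict] = []
    for headline in headlines:
        kw = keywords(headline)
        best, best_sim = None, threshold
        for candidate in clusters:
            sim = jaccard(kw, candidate["keywords"])
            if sim >= best_sim:
                best, best_sim = candidate, sim
        if best is None:
            # Nothing is similar enough, so this headline starts a new cluster.
            clusters.append({"keywords": set(kw), "headlines": [headline]})
        else:
            best["headlines"].append(headline)
            best["keywords"] |= kw
    return [c["headlines"] for c in clusters]


groups = cluster([
    "Fire breaks out in central London hotel",
    "London hotel fire: hundreds evacuated",
    "Google announces overhaul of Google News",
    "Google News overhaul uses new AI techniques",
])
```

Here `groups` contains two clusters, one holding the two fire headlines and one holding the two Google News headlines.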

Josh Cohen, Google’s business product manager, explained in an interview with Search Engine Land the sort of questions Google’s system is asking itself: “What’s the aggregate editorial interest in a given story? What does everyone have on their front page? That’s going to drive the results. What do editors collectively feel is the top story of the day?”

Google rarely excludes sites from featuring in its search results but does confess to favouring those that “primarily offer timely reporting or analysis of recent events” when it comes to Google News.

The company does invite publishers to submit a request for inclusion in Google News results, at which point it verifies the applicant’s claim to ownership of the site. In making this invitation, Google offers guidelines for publishers advising them to “write original content that’s clear and free of grammatical errors”.

Other quality checks include clear attribution, any author bias being clearly signposted and the inclusion of manually-added hyperlinks in copy.

“Sites included in Google News must not misrepresent, misstate, or conceal information about their owner or primary purpose,” the company states.

The amount of text featured should also not be dwarfed by the amount of advertising content displayed.

But it is unclear how strictly Google News adheres to its own rules: articles from The Economist, which credits no writer, feature in results, while publications like RT, whose reliability has been widely questioned, also rank.

Given that page popularity is assessed by clicks, problems can also arise for subscription-access news providers with a paywall in place, as the obstacle is likely to deter casual non-subscribers from clicking through and thus have a negative impact on the story’s Google News placing.

The increasing pressure on Silicon Valley giants like Google and Facebook to be more transparent about their practices in light of the 2016 US presidential election and the Cambridge Analytica scandal shows no sign of letting up.

Google announced a significant overhaul of Google News in May 2018 in a renewed bid to inspire trust regarding its workings.

“The reimagined Google News uses a new set of AI techniques to take a constant flow of information as it hits the web, analyse it in real time and organise it into storylines,” Google News boss Trystan Upstill said in a blog post.

“This approach means Google News understands the people, places and things involved in a story as it evolves, and connects how they relate to one another.

“At its core, this technology lets us synthesise information and put it together in a way that helps you make sense of what’s happening, and what the impact or reaction has been.”
