Why did Facebook, Instagram and WhatsApp go down?

Andrew Griffin
Tuesday 05 October 2021 09:33 BST


Facebook, WhatsApp and Instagram have all gone down in a major outage.

Such problems, especially once they have been ongoing for hours, likely indicate a major fault in the technology underpinning Facebook’s services.

And those issues can easily last for hours. In 2019, when Facebook suffered its biggest ever outage, more than 24 hours passed between the start of the problem and the company saying it was resolved.


What’s more, Facebook might never truly reveal what caused the problems. After that record outage in 2019, it said only that the problems were “a result of a server configuration change”.

This time around, at least some of the problems were related to the domain name system, or DNS, which works something like a phone book for the internet. When a user types in a web address, such as facebook.com, their computer needs to turn that into an IP address, which is a series of numbers, so that it can access the data that makes up the page they want to see.
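For readers who want to see that lookup step, here is a minimal sketch in Python using the standard library's socket.getaddrinfo, which asks the operating system's resolver to turn a name into addresses. The helper name resolve is our own, and the sketch says nothing about how Facebook's systems are built.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the system resolver
    reports for hostname.

    Raises socket.gaierror if the name cannot be resolved.
    """
    records = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each record is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({sockaddr[0] for _fam, _type, _proto, _canon, sockaddr in records})


# An address literal needs no network lookup and comes back unchanged.
literal = resolve("127.0.0.1")

# With a working connection, resolve("facebook.com") would return whatever
# addresses DNS currently publishes for that name.
```

The results for a real hostname vary by location and over time, so none are shown here.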

When Facebook was down, however, that system was not working: the computer searched for the numbers it needed, but the numbers weren’t there. Facebook’s servers should have provided them, but the phone book was in effect blank.
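The blank phone book can be pictured with a toy model, offered purely as an illustration. The dictionary, the lookup helper and the NameNotFound error are all hypothetical, and 192.0.2.10 is an address reserved for documentation rather than one of Facebook's real addresses.

```python
class NameNotFound(Exception):
    """Raised when the phone book has no entry for a name."""


def lookup(phone_book, hostname):
    """Return the address listed for hostname, or raise NameNotFound."""
    try:
        return phone_book[hostname]
    except KeyError:
        raise NameNotFound(f"no address listed for {hostname}") from None


# Placeholder entry: 192.0.2.10 is a documentation-only address.
phone_book = {"facebook.com": "192.0.2.10"}

# Normal operation: the name maps to a number.
before = lookup(phone_book, "facebook.com")

# During the outage the entries were, in effect, gone.
phone_book.clear()

try:
    after = lookup(phone_book, "facebook.com")
except NameNotFound as error:
    after = None
    failure = str(error)
```

Real DNS is a distributed system with caching and multiple layers of servers, so this only captures the basic idea: once the entry disappears, the name leads nowhere.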


For Facebook, that meant anyone attempting to access the site saw an error message, which varied depending on the browser they used. Apps worked a little differently: they could still show existing content that had already been downloaded, such as WhatsApp messages or Instagram posts, but they were not able to ask Facebook’s servers for new ones.
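That app behaviour, showing what is already on the phone while failing to fetch anything new, can be sketched as a simple cache with a fallback. This is an assumed design for illustration only: the Feed class and the two fetch functions are hypothetical and do not describe how WhatsApp or Instagram are actually written.

```python
class Feed:
    """Keeps the last successfully downloaded items so they can still be shown."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable that downloads the latest items
        self._cached = []     # items already stored on the device

    def refresh(self):
        """Try to download new items; keep the old ones if the network fails."""
        try:
            self._cached = list(self._fetch())
            return True
        except OSError:
            # Failed name lookups and refused connections both surface as OSError.
            return False

    def items(self):
        return list(self._cached)


def fetch_when_online():
    return ["message 1", "message 2"]


def fetch_during_outage():
    raise OSError("could not resolve server name")


feed = Feed(fetch_when_online)
first_refresh = feed.refresh()

# Simulate the outage: the next download attempt fails.
feed._fetch = fetch_during_outage
second_refresh = feed.refresh()
still_visible = feed.items()
```

Here the second refresh fails, but the two messages downloaded earlier remain available to display.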

Facebook is far from the only company to suffer such issues. In July, many major websites, including those of seemingly unconnected companies such as Home Depot and Delta Air Lines, went down because of problems at Akamai, which provides DNS services to its customers.

But Facebook’s DNS problems were only a symptom, even if they were the one that left many people unable to access those sites. The system would not break spontaneously, so it is likely that something happened to the underlying infrastructure (a stray settings change, a physical outage at a server, or something else entirely) that stopped it from working.


It appeared, at least from the outside, that Facebook had done the damage to itself: the company maintains its own DNS, unlike many smaller companies, and the changes were made from inside the company. At some point during Monday, the relevant directions to web browsers appeared to have been removed, though at the time of publication Facebook was yet to explain how or why.

The fact that Facebook runs so extensively on its own systems also meant that the company itself was affected by the outage, with internal communications tools going offline. That also reportedly kept engineers from fixing the problems remotely, since they could not access the systems they needed, meaning the company was forced to send engineers to work on the servers in person.
