CrowdStrike reveals how huge Microsoft outage that led to global chaos actually happened

A bug took TV stations offline, brought global air travel to standstill and led to cancelled hospital appointments

Andrew Griffin
Thursday 25 July 2024 06:02 BST
American Airlines international flight passengers line up to check in during a global technical outage at Miami International Airport in Miami, Florida, USA, 19 July 2024 (EPA)

CrowdStrike, the company at the centre of an IT outage that brought much of the world to a standstill last week, has finally revealed how the problem was able to happen.

Last Friday, users of Windows PCs reported that their computers were failing to start. That rapidly led to global chaos as people who relied on those computers were unable to do their jobs: TV stations went offline, global air travel was brought to a standstill, with many major airlines cancelling flights entirely, and hospital appointments had to be cancelled.

It quickly became clear that the bug was not the result of a Microsoft update, but of one from cybersecurity company CrowdStrike. All of the affected computers were running its security software.

The company has now revealed that the bug reached customers because of a failure in its quality control mechanism.

The bug was released in an update to CrowdStrike’s Falcon Sensor, a platform that protects computers from malicious software and hackers. To be able to do so, however, it must be constantly updated and have deep access to a computer – both of which led to the problems.

A stray update containing a fault meant that Windows operating systems were not able to run properly once they had installed it. That led to the “Blue Screen of Death” that greeted users on Friday morning, and meant those computer systems were not able to get online.

CrowdStrike, like other major software companies, has systems that are intended to spot such bugs in software before it is sent out to the public. But that system itself had a bug that meant it certified the release despite the fact it included dangerous errors.

“Due to a bug in the Content Validator, one of the two Template Instances passed validation despite containing problematic content data,” CrowdStrike said in a statement, referring to the failure of an internal quality control mechanism that allowed the problematic data to slip through the company’s own safety checks.

CrowdStrike did not say what that content data was, nor why it was problematic. A “Template Instance” is a set of instructions that guides the software on what threats to look for and how to respond. CrowdStrike said it had added a “new check” to its quality control process in a bid to prevent the issue from occurring again.

The extent of the damage from the botched update is still being assessed. On Saturday, Microsoft said about 8.5 million Windows devices had been affected, and the US House of Representatives Homeland Security Committee has sent a letter to CrowdStrike CEO George Kurtz asking him to testify.

CrowdStrike said it will publicly release its full analysis of the meltdown once its investigation is complete.

CrowdStrike released information to fix affected systems last week, but experts said getting them back online would take time as it required manually weeding out the flawed code.

Wednesday’s statement was in line with a widely held assessment from cybersecurity experts that something in CrowdStrike’s quality control process had gone badly wrong.

Additional reporting by agencies
