CrowdStrike: How a single file broke Windows

And why it's so tedious to fix.

Photo Credit: Paul Mah

By now, you have probably read about how a botched update from a cybersecurity firm caused global IT chaos by crashing systems:

  • Self-service kiosks.
  • IT systems in banks and hospitals.
  • Check-ins at airports around the world.
  • Even hundreds of carpark gantry systems in Singapore.

All down. At the same time.

Who is CrowdStrike?

CrowdStrike is a top US-based cybersecurity company valued at US$80 billion. Its products compete successfully across multiple niches:

  • SIEM.
  • Cloud.
  • Endpoint.
  • Threat intel.
  • Data protection.
  • Exposure management.

It's also the leader in Windows endpoint protection for businesses, with around 22% market share, or roughly 1 in 5 business Windows PCs.

Crucially, CrowdStrike is used heavily in the Critical Information Infrastructure (CII) sector, which means areas like healthcare, transportation, and government.

How did it get so bad?

The issue was triggered by a faulty file in the Falcon Sensor component of its Falcon solution. The problem file was pushed out as part of an automated update.

The faulty file crashed Windows systems as soon as it arrived. Restarting an affected computer only crashes it again, leaving it stuck in a continuous restart loop.

By itself, Falcon Sensor does fairly basic data collection, packages it up, and uploads it to the CrowdStrike cloud (~5-8MB/day).

Despite its *.sys extension, the file is not a kernel-mode driver. It is a data file used to deliver channel updates to the sensor. The file was somehow improperly formatted, and the CrowdStrike driver crashed when it tried to read it.

Non-Windows systems are not affected. Windows Server systems running Falcon Sensor are hit in the same way as Windows PCs.

Why can't it be quickly fixed?

The fix is relatively simple, and entails replacing the offending file. It looks like this:

  • Sit down at PC and press F4 to boot into safe mode.
  • Key in unique 48-digit BitLocker key*.
  • Log in and replace offending file.
  • Reboot and verify that it works.
  • Repeat for each and every PC.
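To make the "replace offending file" step concrete, here is a minimal sketch in Python. The article does not name the file, so the folder and the `C-00000291*.sys` pattern below are assumptions taken from the publicly reported workaround, and `quarantine_channel_files` is a hypothetical helper. It moves matching files aside rather than deleting them. In practice, technicians did this by hand from safe mode, so treat this purely as an illustration of the logic.

```python
from pathlib import Path
import shutil

# Assumptions (not from this article): folder and file pattern as publicly
# reported for the workaround. Adjust or verify before relying on them.
DEFAULT_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
DEFAULT_PATTERN = "C-00000291*.sys"


def quarantine_channel_files(driver_dir, quarantine_dir, pattern=DEFAULT_PATTERN):
    """Move files matching `pattern` out of `driver_dir`.

    Returns the names of the files that were moved, in sorted order.
    Files that do not match the pattern are left untouched.
    """
    driver_dir = Path(driver_dir)
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for path in sorted(driver_dir.glob(pattern)):
        if path.is_file():
            # Move instead of delete so the original can be inspected later.
            shutil.move(str(path), str(quarantine_dir / path.name))
            moved.append(path.name)
    return moved
```

Calling `quarantine_channel_files(DEFAULT_DIR, r"C:\quarantine")` would return the list of files it moved. An empty list means nothing matched.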

There are multiple solutions to reboot/manage PCs remotely. But they are pricey and must be set up ahead of time.

TL;DR: Almost everyone is fixing this manually.

*BitLocker is a Windows disk encryption feature used by businesses to protect onboard data. The decryption key is unique for every PC.
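Part of the tedium is typing that key correctly. As commonly documented, a BitLocker recovery password is written as eight groups of six digits, and each group is a multiple of 11 whose quotient fits in 16 bits. The sketch below uses that property to flag mistyped groups before a key is read out to someone at a PC. The function name is hypothetical, and the check is an assumption based on public descriptions of the format. It cannot tell whether a key belongs to a given PC.

```python
def check_recovery_password(text):
    """Return the 1-based positions of groups that fail the basic checks.

    An empty list means every group looks well-formed. It does NOT mean
    the key is the right one for a particular machine.
    """
    groups = text.strip().split("-")
    if len(groups) != 8:
        raise ValueError("expected 8 groups separated by hyphens")

    bad_positions = []
    for position, group in enumerate(groups, start=1):
        well_formed = (
            len(group) == 6
            and group.isascii()
            and group.isdigit()
            and int(group) % 11 == 0       # each group is a multiple of 11
            and int(group) // 11 < 65536   # quotient must fit in 16 bits
        )
        if not well_formed:
            bad_positions.append(position)
    return bad_positions
```

A helpdesk script could run this on a key copied from a spreadsheet and ask for a recheck of only the flagged groups, rather than all 48 digits.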

The real question

Competitors have been quick to release statements deriding the situation. Of course, they tend to recommend their solutions instead.

To me, the real question is why the update was pushed out to everyone at the same time, and not staggered.
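For readers unfamiliar with the idea, a staggered (or "ring") rollout sends an update to a small slice of machines first and widens only if nothing breaks. Here is a minimal sketch of one common way to do it, by hashing each host ID into a fixed bucket. The ring sizes (1%, 9%, 90%) and function names are illustrative assumptions, not a description of how CrowdStrike or any vendor deploys updates.

```python
import hashlib


def rollout_ring(host_id, ring_percentages=(1, 9, 90)):
    """Map a host to a ring index (0 = earliest) based on a hash of its ID.

    The mapping is deterministic: the same host always lands in the same ring.
    """
    if sum(ring_percentages) != 100:
        raise ValueError("ring percentages must add up to 100")

    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # 0..99

    upper_bound = 0
    for ring, percentage in enumerate(ring_percentages):
        upper_bound += percentage
        if bucket < upper_bound:
            return ring
    raise AssertionError("unreachable: percentages sum to 100")


def hosts_for_ring(hosts, ring, ring_percentages=(1, 9, 90)):
    """Return the hosts that should receive the update in the given ring."""
    return [h for h in hosts if rollout_ring(h, ring_percentages) == ring]
```

With a scheme like this, a file that crashes machines on arrival would surface in ring 0, and the later rings would never receive it.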

Tomorrow, I'll write about what I think will be the repercussions of this.