
CrowdStrike broke Debian and Rocky Linux months ago, but no one noticed


A widespread Blue Screen of Death (BSOD) issue on Windows PCs disrupted operations across various sectors, notably impacting airlines, banks, and healthcare providers. The issue was caused by a problematic channel file delivered via an update from the popular cybersecurity provider CrowdStrike. CrowdStrike confirmed that the crash did not affect Mac or Linux machines.

While many may view this as an isolated incident, it turns out that similar problems have been occurring for months with little awareness. Users of Debian and Rocky Linux also experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company's software update and testing procedures. These incidents highlight potential risks for customers who rely on its products daily.

In April, a CrowdStrike update caused all Debian Linux servers in a civic tech lab to crash simultaneously and refuse to boot. The update proved incompatible with the latest stable version of Debian, even though that specific Linux configuration was supposedly supported. The lab's IT team discovered that removing CrowdStrike allowed the machines to boot, and reported the incident.

A team member involved in the incident expressed dissatisfaction with CrowdStrike's delayed response. Although the company acknowledged the issue a day later, it took weeks to provide a root cause analysis, which revealed that the Debian Linux configuration was not included in its test matrix.

"Crowdstrike's model seems to be 'we push software to your machines any time we want, whether or not it's urgent, without testing it'," lamented the team member.

This was not an isolated incident. CrowdStrike users also reported similar issues after upgrading to Rocky Linux 9.4, with servers crashing due to a kernel bug. CrowdStrike support acknowledged the issue, highlighting a pattern of inadequate testing and insufficient attention to compatibility across different operating systems.

To avoid such issues in the future, CrowdStrike should prioritize rigorous testing across all supported configurations. Additionally, organizations should approach CrowdStrike updates with caution and have contingency plans in place to mitigate potential disruptions.
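To make the test-coverage point concrete, here is a minimal, purely illustrative sketch in Python. It is not CrowdStrike's tooling; the configuration names and the release_gate function are invented for this example. It shows the kind of pre-release gate that blocks an update when a supported configuration, such as Debian stable in the April incident, is missing from the test matrix.

```python
# Hypothetical illustration only: a minimal pre-release gate that refuses to
# ship an update unless every supported (distro, version, kernel) configuration
# appears in the vendor's test matrix. Names and data are made up; this is not
# CrowdStrike's actual tooling or process.

SUPPORTED_CONFIGS = {
    ("debian", "12", "6.1"),
    ("rocky", "9.4", "5.14"),
    ("ubuntu", "22.04", "5.15"),
}

# Configurations the update was actually verified against before release.
TESTED_CONFIGS = {
    ("rocky", "9.4", "5.14"),
    ("ubuntu", "22.04", "5.15"),
}


def untested_configs(supported: set, tested: set) -> set:
    """Return supported configurations that the release was never tested on."""
    return supported - tested


def release_gate() -> None:
    missing = untested_configs(SUPPORTED_CONFIGS, TESTED_CONFIGS)
    if missing:
        # In the April incident, Debian stable was reportedly supported but
        # absent from the test matrix -- exactly the gap this check catches.
        raise SystemExit(f"Blocking release: untested configurations: {sorted(missing)}")
    print("All supported configurations covered; release may proceed.")


if __name__ == "__main__":
    release_gate()
```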

Source: Y Combinator, Rocky Linux



26 Comments

Quote: "Crowdstrike's model seems to be 'we push software to your machines any time we want, whether or not it's urgent, without testing it'"

Exactly what happened today with Windows: it seems CrowdStrike pushed the faulty update without properly testing it, because an issue this huge should have been caught by proper testing before mass deployment.

I mean, you only need to install the update and restart the machine for it to break down. This proves that they didn't test anything.

Quote: "Exactly what happened today with Windows: it seems CrowdStrike pushed the faulty update without properly testing it [...]"

What is worse is that a large corporation like this should know the importance of staggered releases; except for high-severity security patches, updates should go out in batches.
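The staggered-release idea this commenter describes can be sketched in a few lines. The following Python snippet is a rough, hypothetical illustration (host names, batch sizes, and the healthy() probe are made up, not any vendor's real pipeline): push the update to a small canary wave first, check health, and only widen the rollout if nothing breaks.

```python
# A rough sketch of a staggered (batched) rollout: deploy to a small canary
# wave, verify health, then widen. Everything here is a placeholder.
import random
import time
from typing import Iterable, List


def batches(hosts: List[str], sizes: Iterable[int]):
    """Yield successive waves of hosts: e.g. a 1% canary, then 10%, then the rest."""
    start = 0
    for size in sizes:
        yield hosts[start:start + size]
        start += size
    if start < len(hosts):
        yield hosts[start:]


def deploy(host: str) -> None:
    print(f"deploying update to {host}")


def healthy(host: str) -> bool:
    # Placeholder health probe; a real one would check boot status,
    # service state, crash telemetry, etc.
    return random.random() > 0.01


def staged_rollout(hosts: List[str]) -> None:
    for wave in batches(hosts, sizes=[max(1, len(hosts) // 100), len(hosts) // 10]):
        for host in wave:
            deploy(host)
        time.sleep(1)  # soak time before judging the wave (shortened for the sketch)
        if not all(healthy(h) for h in wave):
            print("halting rollout: failures detected in this wave")
            return
    print("rollout complete")


if __name__ == "__main__":
    staged_rollout([f"host-{i:04d}" for i in range(500)])
```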

Quote: "What is worse is that a large corporation like this should know the importance of staggered releases [...]"

In CrowdStrike's defense, they've ALWAYS been a meager fly-by-night outfit with limited enterprise value. Most often, they'll deploy things that others have already detected -- so they're performing redundant actions with limited authority or expertise on proper testing... as proven in these situations.

Quote: "Exactly what happened today with Windows: it seems CrowdStrike pushed the faulty update without properly testing it [...]"

Maybe they should rename the company ClownStrike.

Quote: "In CrowdStrike's defense, they've ALWAYS been a meager fly-by-night outfit with limited enterprise value [...]"

That sounds more like an indictment than a defense!

Quote: "In CrowdStrike's defense, they've ALWAYS been a meager fly-by-night outfit with limited enterprise value [...]"

That's a defense? Remind me to never hire you as my lawyer! 🤣🤣🤣

"Crowdstrike's model seems to be 'we push software to your machines any time we want, whether or not it's urgent, without testing it"

Sounds like CrowdStrike were inspired by Microsoft's ability to get away with such atrocious nonsense with Winbug10, so they figured they'd give it a shot too. Good to see it backfire on them; hopefully they learn that such practices aren't welcome and rehire their QA/QC department...

Wandering the Neowin website, I still see people complaining about companies reducing their testing budgets.

ACCOUNTANTS run these companies, not technicians. IT and passion haven't gone together since 2010.

I assume they're THE most hated company right now, especially among IT people; lots of weekends screwed up and long hours because of this. I also hope they're sued to hell and back.

Weekends screwed up?

How about millions or tens of millions or hundreds of millions lost due to these outages.

My daughter is an RN at a major hospital in my area and they are still down. Other hospitals as well. Did anyone die because of this? If CrowdStrike is around in 2 years I would be surprised.

All that said, the fact that Windows lets third-party software access ring 0 is a YUGE issue. Windows has an absolutely horrible history of security issues. It is long past time that people moved away from Windows for anything mission-critical.

Quote: "Weekends screwed up? How about millions or tens of millions or hundreds of millions lost due to these outages. [...]"

I was referring to IT people who now have to work extra hard to fix this mess. Today is Friday, so this weekend is for sure an "all hands on deck" situation. Luckily we weren't affected, but oh boy would I be ###### with CrowdStrike.

And about that last paragraph: every OS lets third-party software run with Ring-0 privileges; that's not a Windows-only thing. Let's not try to blame Microsoft for this one; they aren't at fault here for once. As the article clearly explains, Linux was also affected, but nobody cared, so it didn't show up on the news. So no, moving away from Windows wouldn't have prevented any of this. The title is:

"CrowdStrike broke Debian and Rocky Linux months ago, but no one noticed".

Quote: "Weekends screwed up? How about millions or tens of millions or hundreds of millions lost due to these outages. [...]"

I think it's more like tens of billions of $ lost, and that's without exaggeration. The worldwide damage was MASSIVE!

When we first went to CrowdStrike almost 2 years ago, we put it on a brand new AMD Epyc server, and it broke the system to the point where opening a command prompt took 90+ seconds. We had tickets open with them and no one seemed to care... we did clean reinstalls multiple times showing them the issue, and it broke consistently, but they played it off as bad hardware... so we tried it on a completely different AMD Epyc system (both of these were Dells) and hit the same exact issue, yet their support people just kept collecting logs, taking screen recordings, and telling us it's not them, it's us... so after 6 months, yes SIX MONTHS, of back and forth on it we gave up and put Defender ATP on with no issues. A few months later they put out a "Fix for issues with some AMD Epyc systems", which really ###### us off...

And don't get me started on some of the Linux issues we've had. Everyone was touting today that this is a Windows-only issue, but we've had some doozies over the past 2 years, with various kernel problems with their Falcon service...

Hello,

I work for a competitor and know that companies in this space go to great lengths to avoid these kinds of issues. What could have happened here might be some kind of process error, and those can occur at any company.

Nobody knows the actual root cause of this, and until CrowdStrike has released their post-mortem incident report, commenting on what they may--or may not--have done is speculative and ill-informed.

Let's give CrowdStrike a chance to get back up on their feet and get their customers' systems up and running. Then we can focus on what we can learn from this event to prevent it from happening again in the future.

Regards,

Aryeh Goretsky

Quote: "I work for a competitor and know that companies in this space go to great lengths to avoid these kinds of issues. [...]"

This event won't happen again because CrowdStrike will no longer be relevant. They have effectively ruined their own reputation!

Any sensible enterprise will limit their dealings with CrowdStrike to the point that it becomes just another service provider whose contract is not renewed. Just look at HP Cloud and other large enterprises that have since morphed/sold/closed due to costly blunders.

The worst part? We might never know (from CrowdStrike themselves) why this all happened. It's partly process, it's partly technical, and it's 100% CrowdStrike's fault. I say a post-mortem might not come from them because other cybersecurity researchers may spill the beans and prove conclusively what actually went wrong... publicly.

Lastly, your company may be going to great lengths to avoid these fiascos. CrowdStrike has always had a questionable reputation in that regard. They've always been a mediocre competitor in the space with a strong sales team -- and this WORLDWIDE fiasco proves it.
