A widespread Blue Screen of Death (BSOD) issue on Windows PCs disrupted operations across various sectors, notably impacting airlines, banks, and healthcare providers. The issue was caused by a problematic channel file delivered via an update from the popular cybersecurity service provider, CrowdStrike. CrowdStrike confirmed that the crash did not affect Mac or Linux machines.
Although many may view this as an isolated incident, it turns out that similar problems have been occurring for months with little awareness. Users of Debian and Rocky Linux also experienced significant disruptions as a result of CrowdStrike updates, raising serious concerns about the company's software update and testing procedures. These occurrences highlight potential risks for customers who rely on its products daily.
In April, a CrowdStrike update caused all Debian Linux servers in a civic tech lab to crash simultaneously and refuse to boot. The update proved incompatible with the latest stable version of Debian, despite the specific Linux configuration being supposedly supported. The lab's IT team discovered that removing CrowdStrike allowed the machines to boot and reported the incident.
A team member involved in the incident expressed dissatisfaction with CrowdStrike's delayed response. Although the company acknowledged the issue a day later, it took weeks to provide a root cause analysis. The analysis revealed that this Debian Linux configuration was not included in CrowdStrike's test matrix.
"Crowdstrike's model seems to be 'we push software to your machines any time we want, whether or not it's urgent, without testing it'," lamented the team member.
This was not an isolated incident. CrowdStrike users also reported similar issues after upgrading to Rocky Linux 9.4, with their servers crashing due to a kernel bug. CrowdStrike support acknowledged the issue, highlighting a pattern of inadequate testing and insufficient attention to compatibility across different operating systems.
To avoid such issues in the future, CrowdStrike should prioritize rigorous testing across all supported configurations. Additionally, organizations should approach CrowdStrike updates with caution and have contingency plans in place to mitigate potential disruptions.
Source: Ycombinator, RockyLinux
Exactly what happened today with Windows: it seems CrowdStrike pushed the faulty update without properly testing it, because an issue this severe should have been detected with proper testing before mass deployment.
I mean, you only need to install the update and restart the machine for it to break down. This proves that they didn't test anything.
What is worse, a large corporation like this should know the importance of staggered releases; except for some high-severity security patches, updates should go out in batches.
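The staggered-release idea described above can be sketched in a few lines. This is a minimal illustration, not anything CrowdStrike actually uses; the function name, batch logic, and `health_check` callback are all hypothetical:

```python
import math

def staggered_rollout(hosts, batch_count=4, health_check=lambda h: True):
    """Deploy an update to the fleet in waves, halting if any wave fails.

    hosts: list of host identifiers to update.
    batch_count: number of waves to split the fleet into.
    health_check: hypothetical callback that reports whether a host
    is still healthy after receiving the update.
    """
    batch_size = math.ceil(len(hosts) / batch_count)
    deployed = []
    for i in range(0, len(hosts), batch_size):
        batch = hosts[i:i + batch_size]
        deployed.extend(batch)  # push the update to this wave only
        # If any host in the wave fails its health check, stop the
        # rollout: the blast radius is limited to the waves so far.
        if not all(health_check(h) for h in batch):
            return deployed, False
    return deployed, True  # every wave passed its health check
```

The point of the sketch: had a bad update gone to one wave first, only that fraction of machines would have blue-screened before the rollout halted, instead of the entire fleet at once.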
In CrowdStrike's defense, they've ALWAYS been a meager fly-by-night outfit with limited enterprise value. Most often, they'll deploy things that others have already detected -- so they're performing redundant actions with limited authority or expertise on proper testing... as proven in these situations.
Maybe they should rename the company ClownStrike.
That sounds more like an indictment than a defense!
That's a defense? Remind me to never hire you as my lawyer! 🤣🤣🤣
Sounds like CrowdStrike were inspired by Microsoft's ability to get away with such atrocious nonsense with Winbug10, so they figured they'd give it a shot too. Good to see it backfire on them; hopefully they learn that such practices aren't welcome and rehire their QA/QC department...
Browsing the Neowin website, I still see people complaining about companies cutting their testing budgets.
ACCOUNTANTS run these companies, not technicians. IT and passion haven't gone together since 2010.
Why is everyone attacking this small tech startup?
They're worth $74.22B as of this moment, and that's after an 18% fall because of this fiasco. But yeah, I agree they should lose everything because of how much damage they caused worldwide, and be demoted to a (dodgy) startup.
Surely this isn’t a serious question.
You'd have to assume he's joking, yeah. He can't be THAT naive...
https://pbs.twimg.com/media/BZsPpRJCQAA3hJ5.jpg
The disruption was so widespread I expect they will be sued out of existence.
I assume they're THE most hated company right now, especially among IT people, lots of weekends screwed up and long hours because of this. I also hope they're sued to hell and back.
Weekends screwed up?
How about millions or tens of millions or hundreds of millions lost due to these outages.
My daughter is a RN at a major hospital in my area and they are still down. Other hospitals as well. Did anyone die because of this? If CrowdStrike is around in 2 years I would be surprised.
All that said, the fact that Windows lets third-party software access ring 0 is a YUGE issue. Windows has an absolutely horrible history of security issues. It is long past time that people move away from Windows for anything mission critical.
I was referring to IT people who now have to work extra hard to fix this mess. Today is Friday, so this weekend is for sure an "all hands on deck" situation. Luckily we weren't affected, but oh boy would I be ###### with CrowdStrike.
And about that last paragraph, every OS lets third-party software run with Ring-0 privileges, that's not a Windows-only thing. Let's not try to blame Microsoft for this one, they aren't at fault here for once. Like the article clearly explains, Linux was also affected but nobody cared so it didn't show up on the news. So no, moving from Windows wouldn't have prevented any of this. The title is:
"CrowdStrike broke Debian and Rocky Linux months ago, but no one noticed".
Not to hell and back, just to hell. That'll do.
I wonder what it would look like for a company to get sued back into existence.
Edit: probably an integer overflow somewhere would do it
I think it's more like tens of billions of $ lost, and that's without exaggeration. The worldwide damage was MASSIVE!
https://www.youtube.com/watch?v=d8cgl19Jk8c
https://twitter.com/vxunderground/status/1814323450489582022
You can't hack a computer that won't boot up.
When we first went to CrowdStrike almost 2yrs ago, we put it on a brand new AMD Epyc server and it broke the system to the point that opening a command prompt took 90+ seconds... had tickets open with them, no one seemed to care... did clean reinstalls multiple times showing them the issue, it broke consistently, they played it off as bad hardware... so we did it on a completely different AMD Epyc system (both of these were Dells) and hit the same exact issue, but their support people just kept collecting logs, taking screen recordings and telling us it's not them, it's us... so after 6 months, yes SIX MONTHS! of back and forth on it, we gave up and put Defender ATP on with no issues. A few months later they put out a "Fix for issues with some AMD Epyc systems", which really ###### us off...
and don't get me started on some of the Linux issues we've had. Everyone was touting today that this is a Windows-only issue, but we've had some doozies in the past 2 years with various kernel problems caused by their Falcon service...
Hello,
I work for a competitor and know that companies in this space go to great lengths to avoid these kinds of issues. What could have happened here might be some kind of process error, and those can occur at any company.
Nobody knows the actual root cause of this, and until CrowdStrike has released their post-mortem incident report, commenting on what they may--or may not--have done is speculative and ill-informed.
Let's give CrowdStrike a chance to get back up on their feet and get their customers' systems up and running. Then we can focus on what we can learn from this event to prevent it from happening again in the future.
Regards,
Aryeh Goretsky
This event won't happen again because CrowdStrike will no longer be relevant. They have effectively ruined their own reputation!
Any sensible enterprise will limit their dealings with CrowdStrike to the point that they (CrowdStrike) become another service provider whose contract is not renewed. Just look at HP Cloud and other large enterprises that have since morphed/sold/closed due to costly blunders.
The worst part? We might never know (from CrowdStrike themselves) why this all happened. It's partly process, it's partly technical, and it's 100% CrowdStrike's fault. I say a post-mortem might not come from them because other cybersecurity researchers may spill the beans and prove conclusively what actually went wrong... publicly.
Lastly, your company may be going to great lengths to avoid these fiascos. CrowdStrike has always had a questionable reputation in those regards. They've always been a mediocre competitor in the space with a strong sales team -- and this WORLDWIDE fiasco proves it.