Although most had never heard of it before last week, it’s now obvious to everyone that CrowdStrike’s software is deeply woven into the fabric of our society. One botched update disrupted healthcare, grounded flights, and stalled businesses worldwide — despite affecting less than 1% of all Windows devices.
But in these so-called ‘unprecedented’ times, it’s comforting to know that when it comes to security and safety, cyber or otherwise, most challenges are in fact precedented. Throughout history, safety mechanisms have tended to reveal a fundamental flaw only after a failure, and that flaw has then been remedied with the right technology and philosophical approach.
Let’s journey back to the early days of the railways. Early train braking systems were designed to be off until pulled on. But after accidents such as the Armagh disaster (the worst rail accident of the Victorian age), the design changed so that the brakes were “always on” by default: if the system failed, the brakes would be on, not off. That simple change in design thinking has produced a robust, fail-safe system ever since.
Software and computing are just the same, and the industry will certainly learn lessons from the CrowdStrike / Microsoft outage. So what changes in systems thinking might come out of this, and what opportunities are there for investors like Dawn?
1. SDLC. At Dawn, we’ve been investing for many years in tooling for the software development lifecycle (SDLC), the process that helps development teams build high-quality software quickly and with minimal risk. Yet even in the age of GitHub Copilot, the SDLC is still a surprisingly artisanal, human-driven process. Testing, quality assurance, documentation, dependency mapping and package management are all areas where we see a continued need for innovation and robustness, particularly as the role and skillset of developers evolve through the GenAI revolution. The SDLC tooling and processes of today will look truly primitive to people a decade from now.
2. Observability. The financial impact of a global outage is astronomical, spanning disrupted virtual work, digital experiences, hospital appointments, and financial services. Companies today must manage an ever-growing surface of applications and infrastructure, through which an exponentially increasing volume of data flows. We expect to see renewed attention and spend on next-gen monitoring and observability to manage this explosion. We’ve previously invested in monitoring and response platforms such as Onum and Shoreline.io (recently acquired by Nvidia), and we continue to be excited by opportunities such as extending observability “right” into action and response.
3. (Semi-)automated actions. The CrowdStrike / Microsoft debacle reminded us all that although automated updates are convenient, they carry risks. We’re not yet ready to hand over all updates and actions to automated systems, especially as doing so could open the door to another SolarWinds-style hack. Nonetheless, there is huge scope to use technology to give operational leverage to technical professionals, allowing them to dedicate their attention to high-value tasks. For now, we’ll still want a human to press the button, but teams in security operations centres (SOCs), IT operations and similar functions stand to benefit enormously from machine-assisted input. We backed exposure management platform Vulcan.io in an adjacent area, and are hunting for more opportunities.
McAfee CTO Steve Grobman told Bloomberg, “The software industry will learn from this [incident], we always do.” I am certain that forward-thinking companies worldwide will seize this chance to future-proof their processes and systems design. The next ‘unprecedented’ event might just be a familiar challenge with a well-prepared response.