Crowdstrike - post-incident review: a dozen learning points
I blogged about the Crowdstrike incident on July 21st while it was still playing out. Now, having drained the swamp and let the dust settle, I'm due to draw out, deconstruct and decide what to do about the Crowdstrike disaster, so here goes:

1. Design, build and test systems for resilience, where 'systems' means not just IT systems but the totality of interdependent technologies, organisations, people, information flows and other resources necessary to deliver and support critical business activities. Hinson tip: "be prepared" is not just for boy scouts! Those dependencies are potential pinch plus pain points.

2. Test software before release. Sounds easy, right? It isn't. There is an infinite amount of testing that could be performed, only a fraction of which realistically should be, while the amount and quality of testing actually performed are resource-constrained and time-boxed for business and uncertainty (risk!) reasons (delaying secu...