SMotW #70: incident detection time lag

Security Metric of the Week #70: delay between incident occurrence and detection

While some information security incidents are immediately obvious, others take a while to come to light ... and an unknown number are never discovered. Compare, say, a major storm that knocks out the computer suite with an APT (Advanced Persistent Threat) incident. During the initial period between occurrence and detection, and subsequently between detection and resolution, incidents are probably impacting the organization. Measuring how long it takes to identify incidents that have occurred therefore sounds like a useful way of assessing, and if necessary improving, the efficiency of incident detection in order to reduce the time lag.
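As a rough illustration of what the raw metric involves, the sketch below computes the lag for each incident and summarizes it. The incident records, identifiers and dates are entirely hypothetical, and the choice of median and maximum as summary figures is an assumption rather than anything ACME specified. Incidents with no known occurrence time are simply left out, which is itself a weakness of the metric.

```python
from datetime import datetime
from statistics import median

# Hypothetical incident records: (identifier, occurred_at, detected_at).
# occurred_at is None where nobody could establish when the incident began.
incidents = [
    ("INC-001", datetime(2024, 1, 10, 8, 0), datetime(2024, 1, 10, 8, 5)),  # obvious at once
    ("INC-002", datetime(2024, 1, 2, 0, 0), datetime(2024, 2, 1, 0, 0)),    # covert, found late
    ("INC-003", None, datetime(2024, 2, 15, 12, 0)),                        # start unknown
]


def detection_lags_hours(records):
    """Return the occurrence-to-detection lag in hours for each measurable incident."""
    return [
        (detected - occurred).total_seconds() / 3600
        for _, occurred, detected in records
        if occurred is not None  # incidents with an unknown start cannot be measured
    ]


lags = detection_lags_hours(incidents)
print(f"Incidents measured: {len(lags)} of {len(incidents)}")
print(f"Median lag: {median(lags):.1f} hours")
print(f"Longest lag: {max(lags):.1f} hours")
```

Reporting how many incidents could be measured alongside the lag figures gives the reader some idea of how much of the picture is missing.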

When ACME's managers scratched beneath the surface of this candidate security metric, thinking more deeply about it as they worked methodically through the PRAGMATIC analysis, it turned out to be not quite so promising as some initially thought:

P	R	A	G	M	A	T	I	C	Score
80	70	72	30	75	50	50	65	65	62%
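The overall figure is consistent with a simple unweighted mean of the nine ratings, rounded to a whole percent. The sketch below assumes that is how the score was derived; the function name is purely illustrative.

```python
# Column letters and ratings copied from the table above.
# Two columns share the letter A, so the letters are kept in a list, not a dict.
columns = ["P", "R", "A", "G", "M", "A", "T", "I", "C"]
ratings = [80, 70, 72, 30, 75, 50, 50, 65, 65]


def overall_score(values):
    """Unweighted mean of the ratings, rounded to the nearest whole percent (assumed method)."""
    return round(sum(values) / len(values))


score = overall_score(ratings)
lowest = min(zip(ratings, columns))  # (rating, column letter) of the weakest criterion

print(f"Score: {score}%")
print(f"Lowest rating: {lowest[1]} at {lowest[0]}%")
```

Picking out the lowest rating as well as the mean shows at a glance which criterion is dragging the metric down.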

Management was concerned that, in practice, while the time that an incident is detected can be ascertained from incident reports (assuming that incidents are being reliably and rapidly reported - a significant assumption), it is harder to determine, with any accuracy, exactly when an incident first occurred. Root cause analysis often discovers a number of control failures that contributed or led to an incident, while in the early stages of many deliberate attacks the perpetrators are gathering information, passively at first then more actively but often covertly. Forensic investigation might establish more objectively the history leading up to the discovery and reporting of incidents, but at what cost?

For the purposes of the metric, one might arbitrarily state that an incident doesn't exist until the moment it creates an adverse impact on the organization, but that still leaves the question of degree. Polling the corporate website for information to use in a hacking or phishing attack has a tiny - negligible, almost unmeasurable - impact on the organization, so a better definition of the start of an incident might involve 'material impact' above a specified dollar value: fine if the costs are known, otherwise not so good.
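One way to picture the 'material impact' definition is sketched below. It assumes, purely for illustration, that each event in an incident's timeline carries an estimated cost and that the incident 'starts' when the running total first reaches the threshold. The timeline, dollar figures and function names are all hypothetical, and the approach only works where costs can actually be estimated.

```python
from datetime import datetime, timedelta

# Hypothetical timeline for one incident: (timestamp, estimated cost in dollars).
events = [
    (datetime(2024, 3, 1, 9, 0), 0.0),        # website polled for information: negligible impact
    (datetime(2024, 3, 4, 14, 0), 2_000.0),   # phishing emails land, minor cleanup cost
    (datetime(2024, 3, 6, 10, 0), 15_000.0),  # account compromised, data taken
]
detected_at = datetime(2024, 3, 8, 10, 0)


def incident_start(timeline, threshold):
    """Return the time at which cumulative cost first reaches the threshold, or None."""
    total = 0.0
    for when, cost in sorted(timeline):
        total += cost
        if total >= threshold:
            return when
    return None  # impact never became material under this threshold


def detection_lag(timeline, detected, threshold):
    """Return the lag from material impact to detection, or None if never material."""
    start = incident_start(timeline, threshold)
    if start is None:
        return None
    # An incident detected before it became material has no lag to report.
    return max(detected - start, timedelta(0))


for threshold in (1_000, 10_000):
    print(f"Threshold ${threshold:,}: lag = {detection_lag(events, detected_at, threshold)}")
```

The same incident yields a different lag depending on where the threshold is set, which illustrates why the definition matters so much to the trustworthiness of the numbers.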

The 30% rating for Genuineness highlights management's key concern with this metric. The more they discussed it, the more issues, pitfalls and concerns came out of the woodwork, leaving an overriding impression that the numbers probably couldn't be trusted. On the other hand, the 62% score means the metric has some potential: the CISO was asked to suggest other security incident-related metrics, perhaps variants of this one that would address management's reservations.

[This is one of eight possible security incident metrics discussed in the book, two of which scored quite a bit higher on the PRAGMATIC scale. There are many, many more possibilities in this space: how would you and your colleagues choose between them?]