Tuesday 20 March 2018

A critique of CIS netsec metrics (LONG)

Perusing a CIS paper on metrics for their newly-updated recommended network security controls (version 7), I'm struck by several things all at once, a veritable rash of issues.

Before reading on, please at least take a quick squint at the CIS paper. See what you see. Think what you think. You'll get more out of this blog piece if you've done your homework first. You may well disagree with me, and we can talk about that. That way, I'll get more out of this blog piece too!





[Pause while you browse the CIS paper on metrics]






[Further pause while you get your thoughts in order]





OK, here's my take on it:
  1. The recommended controls are numerous, specific and detailed cybersecurity stuff, hence the corresponding metrics are equally granular: the CIS team has evidently decided that each control should be measured individually. In practice, I'd be more inclined to take the metrics up a level or three, since my main interest in metrics is making decisions in order to manage things, not doing them nor 'proving' that someone is doing something. I'm not entirely sure even the network security wonks would welcome or appreciate such detailed metrics: they should already know how they are doing, pretty much, without needing to measure and prove it (to themselves!). Management, on the other hand, could do with something more than the tech guys telling them "Oh it's all OK! We're using the CIS guidance! Nothing to see here - move along!" or "Of course we are terribly insecure: we've told you a million times we need more resources!". I contend that overview/status or maturity metrics would be far more useful for management. [I'll circle back to that point at the end. Skip the rest if this is all too much.]

  2. I guess if all the individual metrics were generated, it would be possible to produce an overall score simply by averaging them (taking the mean, and maybe the variance too since that relates to consistency). That could be used as a crude indication of the status, and a lever to drive up implementation, but it would be better to at least group the detailed metrics into categories (perhaps relating to the categories of control) and report each category separately, providing a better indication of where the strengths and weaknesses lie (there's a rough sketch of that kind of roll-up after this list). However, I'm still troubled by the first part: "if all the individual metrics were generated" implies a very tedious and potentially quite costly measurement process. Someone familiar with the organization's network security controls (a competent IT auditor, for instance, or consultant - a reasonably independent, unbiased, diligent and intelligent person anyway) ought to be able to identify the main strengths and weaknesses directly, categorize them, measure and report them, and offer some suggestions on how to address the issues, without the tedium. I figure it's better for the network security pros to secure the network than to generate reams of metrics of dubious value. [More on this below]

  3. I'm sure most of us would challenge at least some of the CIS recommended controls: they mean well but there are situations where the controls won't work out in practice, or they go too far or not far enough, or there are other approaches not considered, or the wording isn't right, or ... well, let's just say there are lots of potential issues way down there in the weeds, and that's bound to be an issue with such prescriptive, detailed, "do this to be secure" check-the-box approaches (I know, I know, I'm exaggerating for effect). Plucking but one example from my own specialism, control 17.4 says "Update Awareness Content Frequently - Ensure that the organization's security awareness program is updated frequently (at least annually) to address new technologies, threats, standards and business requirements." Updating awareness and training program content to reflect the ever-changing information risk landscape is good practice, I agree, but annually is definitely not, especially if it also implies that checking for changes in the information risks only needs to happen once a year. Hello! Wakey wakey! There is new stuff happening every hour, every day, certainly every few weeks, with potentially significant implications that ought to be identified, evaluated and appropriately responded to, promptly. Annual updates are way too slow, a long way short of "frequent" to use their word. Furthermore, the metric for 17.4 is equally misleading: "Has the organization ensured that the organization's security awareness program is updated frequently (at least annually) to address new technologies, threats, standards and business requirements: yes/no?" Using their metric, any sort of 'update' to the awareness program that happens just once a year justifies answering yes - ticking the box - but to me (as an awareness specialist) that situation would be woefully inadequate, indicative of an organization that patently does not understand the purpose and value of security awareness and training. In that specific example, I would suggest that the frequency of meaningful reviews and updates to the information risk profile and the awareness and training program would be a much more useful metric - two metrics in fact, since each aspect can be measured separately and they may not align (hinting at a third metric!).

  4. The underlying problem is that we could have much the same discussion on almost every control and metric in their list. How many are there in total? Over 100, so that's roughly 100 discussions. Pains will be taken! Set aside a good few hours for that, easily half to a whole day. You could argue that we would end up with a much better appreciation of the controls and the metrics ... but I would counter that there are better ways to figure out worthwhile metrics than to assess/measure and report the implementation status of every individual control. That half a day or so could be used more productively.

  5. My suggestion to use 'frequency of risk and awareness updates' reminds me of the concern you raised, Walt. Binary metrics are crude while analog metrics are more indicative of the true status, particularly in boundary cases where a straight yes or no does not tell the whole story and can be quite misleading (e.g. as I indicated above). Binary metrics and crude checklists are especially problematic if the metrician has skin in the game (which would be true if the CIS network security metrics were being measured and reported by network security pros), and if the outcome of the measurement may reflect badly or well on them personally. The correct answer is of course "Yes" if the situation clearly and completely falls into the "Yes" criterion, but what if the situation is not quite so clear-cut? What if the appropriate, honest answer would be "Mostly yes, but slightly no - there are some issues in this area"? Guess what: if "Yes" leads to fame and fortune, then "No" doesn't even get a look-in! In extreme cases, people have been known to ignore all the "No" situations, picking out a single "Yes" example and using that exception, that outlier, to justify ticking the "Yes" box. This is of course an information risk, a measurement bias, potentially a significant concern depending on how the metrics are going to be used. The recipient and user of the metrics can counter the bias to some extent if they are aware of it and so inclined, but then we're really no better off than if they had just discussed and assessed the situation without binary metrics. If they are unaware of the bias and unwisely trusting of the metric, or if they too are biased (e.g. an IT manager reporting to the Exec Team on the network security status, using the 'facts' reported up the line by the network security team as their get-out-of-jail-free card - plausible deniability if it turns out to be a tissue of lies), then all bets are off. There are situations where such biased metrics can be totally counterproductive, leaving us worse off than if the metrics did not exist (consider the VW emissions-testing scandal, plucking a random example out of the air, one that I brought up yesterday in relation to assurance).

  6. Furthermore, I have concerns about the CIS version of an analog metric in this document. Someone at CIS has clearly been on the 'six sigma' training, swallowed the Kool-Aid, and transferred the concept directly to all the analog metrics, with no apparent effort to adapt it to the situation. Every CIS analog metric in the paper has the identical form with identical criteria for the six levels: 69% or less; 31% or less; 6.7% or less; 0.62% or less; 0.023% or less; 0.00034% or less (see the little derivation sketch after this list for where those numbers come from). That categorization or gradation really doesn't make a lot of sense in every case, leading to inconsistencies from one metric or one control to the next. I challenge anyone to determine and prove the distinction between the upper three values on their scale for any real-world network security measurement in the table, at least without gathering further measurement data (which sort of defeats the purpose) ... so despite the appearance of scientific rigour, the measurement values are at least partially arbitrary and subjective anyway. Trying to shoe-horn the measurement of a fair variety of network security control implementation statuses into the same awkward set of values is not helpful. For me, it betrays a lack of fundamental understanding of six sigma, continuous improvement and process maturity. Frankly, it's a mess.

  7. Returning to the idea of averaging scores to generate overall ratings, that approach is technically invalid if the individual values being averaged are not equivalent - which they aren't, for the reasons given above. Seems to me The Big Thing that's missing is some appreciation and recognition of the differing importance or value of each control. If all the controls were weighted, perhaps ranked or at least categorized (e.g. vital, important, recommended, suggested, optional), there would be a better basis for generating an overall or section-by-section score (the roll-up sketch below shows the sort of thing I mean). [In fact, the process of determining the weightings or ranking or categorization would itself generate valuable insight ... a bonus outcome from designing better security metrics! The CIS controls are supposedly 'prioritized' so it's a shame that approach didn't filter down to the metrics paper.] One thing we could do, for example, is ignore all except the vital controls on a first pass: getting those properly specified, fully implemented, operational, actively managed and maintained would be an excellent starting point for an organization that has no clue about what it ought to be doing in this space. Next pass, add in the important controls. Lather-wash-rinse-repeat ...
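To make points 2 and 7 a bit more concrete, here's a rough Python sketch of the kind of roll-up I have in mind: per-control scores grouped by category, weighted by an assumed importance rating, with the variance reported as a crude consistency indicator. The control IDs, categories, importance ratings and scores are all invented purely for illustration - they are not taken from the CIS paper.

from collections import defaultdict
from statistics import fmean, pvariance

# Assumed importance weights per point 7 (my own, purely illustrative)
IMPORTANCE = {"vital": 5, "important": 3, "recommended": 2, "suggested": 1, "optional": 0.5}

# (control id, control category, importance, implementation score 0..1)
# - all made-up numbers, just to show the mechanics
controls = [
    ("1.1",  "inventory", "vital",       0.9),
    ("1.2",  "inventory", "important",   0.4),
    ("17.4", "awareness", "recommended", 0.2),
    ("17.5", "awareness", "vital",       0.7),
]

by_category = defaultdict(list)
for _cid, category, importance, score in controls:
    by_category[category].append((IMPORTANCE[importance], score))

for category, items in sorted(by_category.items()):
    scores = [score for _weight, score in items]
    weighted_mean = sum(w * s for w, s in items) / sum(w for w, _ in items)
    print(f"{category:10s} weighted mean {weighted_mean:.2f}, "
          f"variance {pvariance(scores):.3f}")

# First pass (point 7): look only at the 'vital' controls
vital = [score for _cid, _cat, imp, score in controls if imp == "vital"]
print(f"vital-only mean: {fmean(vital):.2f}")

Even this toy version makes the point 2 benefit obvious: the per-category view shows where the weak spots are, in a way a single overall average would hide.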
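And on point 6: as far as I can tell, those six identical bands are simply the textbook 'six sigma' defect rates, i.e. the upper tail of a standard normal distribution with the conventional 1.5-sigma shift applied. A few lines of Python reproduce them, which rather underlines how directly they were transplanted from manufacturing defect-counting:

from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

for sigma_level in range(1, 7):
    # Apply the conventional 1.5-sigma shift used in six sigma practice
    defect_rate = upper_tail(sigma_level - 1.5)
    print(f"{sigma_level}-sigma: {defect_rate * 100:.5g}% or less")

Fine for counting defective widgets rolling off a production line in their millions; meaningless for distinguishing whether, say, 0.023% or 0.00034% of your network devices lack a particular configuration setting when you only have a few hundred devices to start with.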

Overall, the CIS paper, and bottom-up metrics in general, generate plenty of data but precious little insight: quantity, not quality.

Earlier I hinted that as well as their use for decision-making and managing stuff, metrics are sometimes valued as a way of ducking accountability and reinforcing biases. I trust anyone reading this blog regularly knows where I stand on that.  Integrity is a core value. 'Nuff said.

If I were asked to design a set of network security metrics, I would much prefer a top-down approach (e.g. the goal-question-metric method favoured by Lance Hayden, or a process/organizational maturity metric of my own invention), either instead of, or as well as, the bottom-up control implementation status approach and other information sources (e.g. there is likely to be a fast-flowing stream of measurement data from assorted network security boxes and processes). 
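For what it's worth, here's a toy illustration of the goal-question-metric structure - my own invented goal, questions and metrics, not Hayden's - just to show the direction of travel: start from what management needs to achieve and decide, derive the questions, and only then pick metrics that answer them.

# A toy GQM example (goal, questions and metrics invented for illustration)
gqm = {
    "goal": "Maintain network security at a level acceptable to the business",
    "questions": {
        "Are our most critical network controls in place and working?": [
            "Proportion of 'vital' controls fully operational",
            "Age of the oldest unresolved critical control failure",
        ],
        "Are we getting better or worse over time?": [
            "Quarterly trend in per-category maturity scores",
        ],
    },
}

print(gqm["goal"])
for question, metrics in gqm["questions"].items():
    print(" ", question)
    for metric in metrics:
        print("    -", metric)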

Perhaps these alternatives are complementary? I guess it depends on how they are used - not just how they are meant or designed to be used, but what actually happens in practice: any metric (even a good one, carefully designed, competently measured, analyzed and reported with integrity) can be plucked out of context to take on a life of its own as people clutch at data straws that reinforce their own biases and push their own agendas. See any marketing-led "survey" for clear evidence of that! 
