SMotW #84: % of security-certified systems

Security Metric of the Week #84: proportion of IT systems whose security has been certified compliant



Large organizations face the thorny problem of managing their information security consistently across the entire population of business units, locations, networks, systems etc. Achieving consistency of risk analysis and treatment is tricky in practice for all sorts of reasons: diverse and geographically dispersed business units, unique security challenges, cultural differences, political issues, cost and benefit issues, differing rates of change, and more.

Three common approaches are to: 
  1. Devolve security responsibilities as much as possible, basically leaving the distributed business units and teams to their own devices (which implies widespread distrust of each other's security arrangements between different parts of the organization);
  2. Centralize security as much as possible, possibly to the extent that remote security teams are mere puppets, with all the heavy lifting in security done via the network by a tight-knit centralized team (with the risk that the standard security settings might not, in fact, be appropriate in every case);
  3. Hybrid approaches, often involving strong central guidance (security policies and standards mandated by HQ) but local security implementation/configuration and management (with some discretion over the details).
Some highly-organized organizations (military and governmental, mostly) take the hybrid approach a step further, with strong compliance and enforcement actions driven from the center in an effort to ensure that those naughty business units out in the field are all playing the game by the rules. Testing and certifying the compliance of IT systems against well-defined system security standards, for instance, gives management greater assurance that system security is up to scratch - provided the testing is performed competently, which usually means someone checking and accrediting the teams who do the testing before they are permitted to issue compliance certificates.

ACME Enterprises Inc may not be the very largest of imaginary corporations but it does have a few separate sites and lots of servers. With some concern about how consistently the servers were secured, ACME's managers agreed to take a PRAGMATIC look at this metric:

  P     R     A     G     M     A     T     I     C    Score
 72    79    73    89    68    32    22    89    88     68%
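
For readers following along at home, here is a minimal sketch in Python of how the overall score falls out of the nine criterion ratings, assuming (consistent with the table above) that the overall PRAGMATIC score is simply the unweighted mean of the nine ratings; the criterion names in the code are the usual expansion of the PRAGMATIC acronym.

```python
# Minimal sketch: derive ACME's overall PRAGMATIC score, assuming the
# overall score is the unweighted mean of the nine criterion ratings
# (which matches the 68% shown in the table above).

ratings = {
    "Predictiveness": 72,
    "Relevance":      79,
    "Actionability":  73,
    "Genuineness":    89,
    "Meaningfulness": 68,
    "Accuracy":       32,
    "Timeliness":     22,
    "Independence":   89,
    "Cost":           88,
}

overall = sum(ratings.values()) / len(ratings)
print(f"Overall PRAGMATIC score: {overall:.0f}%")   # -> 68%

# Flag the weakest criteria - the ones most worth discussing in the workshop
for criterion, rating in sorted(ratings.items(), key=lambda kv: kv[1])[:2]:
    print(f"Low rating: {criterion} = {rating}%")
```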

With most of the numbers hovering in the 70s and 80s, the two lowest ratings stand out. The reasoning behind the 32% rating for Accuracy was that certified compliance with a technical security standard does not necessarily mean a system is actually secure: ACME has suffered security incidents on certified-compliant servers that met the standard but, for various reasons, turned out to have been inadequately secured after all.

On the other hand, it was seen as A Good Thing overall that more and more servers were both being made compliant and certified as such, so management thought this metric had some potential as an organization-wide security indicator: they gave it 72% for Predictiveness since, in their opinion, there was a reasonably strong correlation between the proportion of servers certified compliant and ACME's overall security status.

Let me repeat that: although certification is not a terribly reliable guide to the security of a given server, the certification process is driving server security in the right direction, hence the proportion of certified servers could be a worthwhile strategic-level security metric for ACME.  Interesting finding!

The rating of just 22% for Timeliness was justified on the basis that the certification process is slow: the certification tests take some time to complete, and the certification team has a backlog of work. Both the process and the metric give a delayed picture of the state of security. Focusing management attention on the proportion of servers certified would undoubtedly have the side-effect of pressuring the team to certify more of the currently unchecked servers (perhaps increasing the possibility of the tests being cut short, although the certification team leader was known to be a no-compromise, do-it-right-or-not-at-all kind of person), but there are ways to deal with that issue.

The metrics discussion headed off at a tangent at this point, as they realized that "Time taken to security-certify a server" might be another metric worth considering. Luckily, with many other security metrics on the table already, someone had the good sense to park that particular proposal for now, adding time-to-certify to the list of metrics to be PRAGMATICally assessed later, and they got back on track - well almost ...

One of the managers queried the central red stripe on the mock-up area graph on the table. The CISO admitted that the stripe represented the servers that had failed their certification testing, and so opened another can o' worms when the penny dropped that 'proportion of servers certified or not certified' is not the whole story here. As the temperature in the workshop room rapidly escalated, the arrival of lunch and the temporary departure of several managers to catch up with their emails saved the day!
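
To make the CISO's can o' worms a little more concrete, here is a minimal sketch (with made-up server counts, purely for illustration) of how the single headline proportion quietly lumps together the servers that failed certification and those that have not yet been tested at all.

```python
# Hypothetical sketch of the point raised in the workshop: the headline
# "% of servers certified" hides a three-way split between servers that
# passed certification, servers that failed, and servers not yet tested.
# The counts below are invented purely for illustration.

servers = {"certified": 412, "failed": 57, "untested": 131}

total = sum(servers.values())
for status, count in servers.items():
    print(f"{status:>9}: {count:4d}  ({count / total:.0%})")

# The metric as originally proposed would report only this single figure,
# treating failed and untested servers as one undifferentiated remainder:
print(f"\nProportion certified: {servers['certified'] / total:.0%}")
```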