Saturday 29 March 2014

Resistance is futile: new compliance awareness module

We have just delivered April's awareness module on information security and privacy compliance, a perennial topic that remains stubbornly on management's agenda.

This time around, we had the 'benefit' of an excellent ready-made compliance case study in the shape of Target's recent breach. Reviewing the news on Target revealed plenty of lessons on compliance, security, privacy, governance, risk management, incident response, press relations and accountability - a rich vein indeed!

Something else that came out of our research was the value of encouraging compliance in a positive sense, as much as hammering non-compliance through enforcement and penalties, the more conventional approach (typified by this poster image - one of six new designs in the module).  Compliance benefits the organization, management, the authorities, customers, business partners, owners, stakeholders and society, as well as individual workers. The module talks about good practices, maturity and ethics. It's good to promote the upside of compliance for a change rather than simply ringing the warning bells, yet again.

Monday 24 March 2014

Rejected ISO/IEC 27002 controls for BCM (L O N G)

Continuing the series of bloggings on new/changed controls proposed to SC 27 in 2011 for incorporation into the 2013 version of ISO/IEC 27002, we come next to the thorny issue of business continuity.

Let me set the scene for this by reminding you what ISO/IEC 27002:2005 had to say about business continuity management in its section 14 (italicized) along with my comments (not italicized).

14.1 INFORMATION SECURITY ASPECTS OF BUSINESS CONTINUITY MANAGEMENT 

Objective: To counteract interruptions to business activities and to protect critical business processes from the effects of major failures of information systems or disasters and to ensure their timely resumption.

Mmm.  OK, well I note that it mentions 'information systems', primarily meaning IT systems - at least, that is how the vast majority of readers will interpret it. The mention of resumption also hints at IT Disaster Recovery (DR) which in practice was the main emphasis of business continuity management in the IT context at the time the standard was written. The whole emphasis of the BCM objective was to deal with the aftermath of disasters, on the presumption that something had already gone horribly wrong.  

While the rest of 27002 generally concerns avoiding or preventing disasters, one concept that spans the divide between prevention and recovery was noticeably absent from the standard, namely resilience. Resilience involves hardening and strengthening critical business processes and their supporting infrastructures so that, in the event of a serious incident, they hopefully continue operating. I deliberately said "hopefully" because there is always a chance that the resilience arrangements may themselves fail when they are needed most, or an incident may be so disastrous in scale that they are completely overwhelmed. Therefore disaster recovery and contingency arrangements are still needed, even if the resilience arrangements are sound.

14.1.1 Including information security in the business continuity management process

Control: A managed process should be developed and maintained for business continuity throughout the organization that addresses the information security requirements needed for the organization’s business continuity.

What is "a managed process for business continuity?" I hear you ask.  The standard went on to expand on that, explaining that the process should 'bring together the key elements of business continuity' which included understanding risks, impacts and assets associated with critical business processes, insurance, additional [but unspecified] preventive and mitigating controls, resources, ensuring the safety of personnel and information processing facilities, business continuity planning plus testing and updating the plans, oh and nominating a manager (which really ought to come first!).

Twice mentioning 'information processing facilities' again indicates the section's IT perspective and, to be honest, betrays a persistent IT bias throughout the ISO27k standards.  It's a bugbear of mine that I think is too deeply entrenched for SC 27 to tackle ... but that doesn't stop me trying!

14.1.2 Business continuity and risk assessment

Control: Events that can cause interruptions to business processes should be identified, along with the probability and impact of such interruptions and their consequences for information security.

It was common practice at that time to develop DR plans based around specific disaster scenarios - often, only those specific scenarios were considered. Consequently if a disaster happened to involve something unexpected, or an unfortunate coincidence of multiple disastrous causes, the IT function, along with the critical business processes IT supported/enabled, was stuffed.  

This is another bugbear of mine, the lack of emphasis on contingency thinking, by which I mean 'What we actually do following a disaster is contingent on the nature of the disaster that unfolds, and since we don't know exactly what will happen, we need to prepare ourselves to cope with almost anything.' The point is to prepare even if you can't sensibly plan. Contingency preparations include stockpiling or securing alternative sources of essential supplies, tools etc., and of course preparing the people, getting them ready to think on their feet as well as knuckle down and get on with whatever has to be done to maintain critical business processes. Seems to me a highly resilient workforce is a tremendously valuable business asset.

14.1.3 Developing and implementing continuity plans including information security

Control: Plans should be developed and implemented to maintain or restore operations and ensure availability of information at the required level and in the required time scales following interruption to, or failure of, critical business processes.

Here again I note the emphasis on planning, not preparing. This control is hinting at meeting the RTO/RPO parameters typically specified for DR. 'Including information security' was presumably meant to refer to ensuring the availability of information, but in the 2013 version of the standard, that casual mention resulted in the whole section being diverted into a discussion about business continuity planning for the information security function(!!).

14.1.4 Business continuity planning framework

Control: A single framework of business continuity plans should be maintained to ensure all plans are consistent, to consistently address information security requirements, and to identify priorities for testing and maintenance.

Spot "plans" and "planning" again. Need I say more?

Also, this control seems out-of-sequence to me, along with the earlier mention of identifying a business continuity manager.  I appreciate that there is not supposed to be any special significance to the order of items in the standard, but in practice things that end up tucked away in the body are less prominent, and are commonly perceived to be less important, than those that come first. The standard paid scant attention to the governance of business continuity management, which is why my re-written version (below) put business continuity strategy first and foremost.

14.1.5 Testing, maintaining and re-assessing business continuity plans

Control: Business continuity plans should be tested and updated regularly to ensure that they are up to date and effective. 

"Plans".



Again, the accepted wisdom of the day was that DR plans should be tested periodically, with good practice hovering between 1 and 3 years. Mostly, this was entirely within the IT domain, ensuring that the main IT systems could be recovered as per the DR plans within the RTO/RPO: a minority of organizations paid any attention at all to the business process angle (e.g. persuading a few token "end users" - business people - to check that the recovered business applications could be launched, seldom much more than that). As to testing and (im)proving the organization's ability to recover from supply chain failures, customer failures, loss of key people and so forth, no, not a chance.

OK, that's enough of my ranting, cut to the chase. Here's the replacement text I proposed (renumbered as it would have been in the 2013 standard) ...

-------------------

17 Business continuity management

17.1  Business continuity management policy

Objective: to clarify the organization’s overall objectives in relation to maintaining the operation of business processes and related information assets.

17.1.1  Business continuity policy

Control

Management should adopt a business continuity management policy or strategy.

Implementation guidance

Management should consider, develop, mandate, implement and maintain a coherent high-level policy or strategy for business continuity management, concerning important aspects such as:
  • The overall objectives or aims of business continuity (e.g. “To maintain the operation of business processes that are deemed critical to the organization’s mission through the use of resilience measures, supported by recovery and contingency arrangements”);
  • Governance of business continuity, including accountability and key responsibilities (such as a nominated business continuity manager as well as business continuity rôles within operations, risk management, information security, compliance and other functions or departments);
  • Resourcing of business continuity (e.g. the allocation of costs associated with providing the resilience, recovery and contingency arrangements for shared resources such as the IT infrastructure, as well as activities such as business impact analysis and exercises).

Other information

Failure to plan and prepare suitable business continuity arrangements may ultimately contribute to the failure of an organization due to a serious incident/disaster, or an accumulation of effects arising from multiple incidents, affecting the organization directly or affecting vital suppliers, partners and customers.  Given the scale, this would probably be considered a governance and/or risk management failure of senior management by the organization’s disenfranchised stakeholders.  Compared to doing nothing, investing in adequate business continuity arrangements is a wise move over the medium to long-term.

Having a business continuity management policy or strategy removes all doubt that management supports the arrangements necessary to ensure the continuity of processes (along with the associated resources, including information) that are deemed critical to the organization’s mission, along with the ability to recover less-critical processes (and resources).  Senior management’s overt support should ensure that business continuity is adequately addressed throughout the organization, even when other objectives and activities compete for limited resources.  It makes it harder for individual senior managers to deny, ignore or downplay their obligations towards business continuity.

The business continuity management policy/strategy need not necessarily be integrated within the information security and risk management policy suite, but must align with them as there are many points of overlap.  It also needs to align with business strategies, budgets etc., in other words it should not be developed and maintained in isolation.

17.1.2  Business continuity management procedures

Control

The organization should design, document and implement necessary business continuity processes.

Implementation guidance

In support of the policy, the business continuity manager should lead the design, development and implementation of procedures documenting business continuity management processes, including in particular:
  • Business impact analysis;
  • Resilience, recovery and contingency plan development and maintenance;
  • Lifecycle management for resilience, recovery and contingency controls.
Those specific aspects are described more fully below.  Elsewhere this standard also describes incident management, including crisis and disaster management activities, and other associated aspects such as compliance and assurance, all of which support business continuity.

In addition, suitable metrics should be adopted, enabling management to determine the extent to which the arrangements in place satisfy the objectives, along with their efficiency and effectiveness and opportunities for improvement. Furthermore, suitable awareness, training and compliance activities should be instituted to ensure that activities in practice conform to the business continuity management procedures and thus satisfy the policy.

Additional information

If business continuity is considered vital to the organization and is sufficiently complex to justify the investment, management may wish to adopt a discrete/separate business continuity management system and/or a dedicated business continuity team, function or department.  However, as with policy, it is important to maintain close alignment with other business objectives, activities and initiatives, so an integrated or consultative approach (albeit with clear leadership to achieve and maintain the alignment and integration necessary to fulfil the policy) may be more suitable.

17.2     Business Impact Analysis

Objective: to identify the importance of various information assets in achieving the organization’s mission by considering the consequences of various kinds of information security incident.

17.2.1  Determine the criticality of business processes and information assets

Control

Assess and rank business processes or activities, plus the associated information systems, networks and other information assets, in terms of their criticality to the organization’s mission.

Implementation guidance

Workshops or study groups are effective ways of involving managers and staff with knowledge of critical business processes (including relevant information asset owners), led or facilitated by the business continuity manager and supported by subject matter experts in related areas such as risk management, information security, human resources, finance, compliance and IT.

Starting with the organization’s core operations (i.e. the business activities that most directly and obviously relate to its central mission), identify business processes or activities without which the organization would cease to have any purpose and/or income.  Such business-critical processes deserve more detailed analysis to determine, for example, the rate at which impacts accumulate if they are interrupted.  Estimating the likelihood and projecting the possible costs of serious incidents helps by providing key parameters for business continuity planning.
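
To make the idea of impacts accumulating over time concrete, here is a minimal Python sketch. The hourly cost figures, the tolerable-loss threshold and the function name are all invented for illustration; real numbers would come out of the workshops described above.

```python
from itertools import accumulate


def hours_until_intolerable(hourly_costs, tolerable_loss):
    """Return the first hour of an outage at which the running total of
    costs exceeds tolerable_loss, or None if it never does within the
    projection supplied."""
    for hour, total in enumerate(accumulate(hourly_costs), start=1):
        if total > tolerable_loss:
            return hour
    return None


# Hypothetical projection: the cost of each successive hour of interruption
# escalates as backlogs build and customers start to defect.
order_processing = [1_000, 1_000, 2_500, 5_000, 10_000, 20_000]

print(hours_until_intolerable(order_processing, tolerable_loss=8_000))  # prints 4
```

The hour at which the running total crosses whatever loss management deems tolerable is one possible starting point for discussing recovery targets, nothing more rigorous than that.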

Considering a broad range of possible incident scenarios and developing “worst case” projections can be helpful in business impact analysis, but these should not become the entire focus of all business continuity planning.  The organization also needs to cope with unforeseen incidents, including low-probability high-impact extreme situations and failures of controls that are anticipated to ensure business continuity, falling into the realm of contingency planning (see below).

Given limited resources, there is little point in evaluating relatively low priority business processes or activities beyond confirming that they are indeed low priority.  The business continuity manager may apply arbitrary criteria to identify such processes/activities, but should nevertheless ensure that they are adequately supported by generic recovery and contingency arrangements.  Furthermore, the criteria should be reviewed periodically since the organization’s capability for business continuity management is expected to increase with maturity.

Other information

Business continuity involves maintaining vital operations despite all manner of events, incidents and disasters, particularly those which are unforeseen since, arguably, many of those which are foreseen should be handled adequately by routine operations and controls.  The aim is of course to avoid serious disruption to the business.  Interruptions to less critical activities may be insignificant in isolation but costs and disruption tend to mount if they are widespread, or if they are not recovered to some semblance of normality within a reasonable period.  This raises questions such as “Which activities are so critically important to the organization that they absolutely must be maintained without interruption?” and “How much would it cost if business processes were interrupted, and how do these costs accumulate over time?”  Business impact analysis is a systematic way to address questions of this nature.

The failure of vital operations can lead to consequential damages for the organization such as:
  • Delays and mounting backlogs to production processes;
  • Missed deadlines;
  • Customer complaints, missed business opportunities;
  • Health and safety issues (especially evident with safety-critical systems);
  • Fines, penalties and other liabilities;
  • Bad press, reputational damage, customer defections, claims from suppliers and customers;
  • Relatively inefficient and often rather costly fallback arrangements (note: business continuity arrangements generally incur costs when they are invoked, but also incur costs to develop and maintain the capability);
  • Supply chain issues, potentially leading to systemic failure and collapse of tightly integrated partner organizations with industry-wide and international repercussions.
In most circumstances, information asset owners are best placed to consider and assess the nature and scale of business impacts, taking account of advice on the possibilities or probabilities of various kinds of information security incident from subject matter experts.  Team/workshop approaches are favoured for this reason, often with several iterations to achieve consensus and parity with other business processes and systems.

Inventories and other repositories or collections of information concerning information assets, risks and incidents, along with information architectures or models, complement business impact analysis, planning and other business continuity activities, providing inputs and/or making use of the outputs.  This is another sound reason to integrate business continuity management with other business activities rather than handle it as an entirely separate issue.

Information on critical business risks, processes, information, systems, suppliers, people etc. is itself valuable and sensitive, implying the need to secure it using suitable information security controls.

17.2.2  Specify resilience and recovery requirements

Control

Based on the business impact analysis, clarify and document the resilience and recovery requirements for business processes/activities plus the associated information systems, networks and other information assets.

Implementation guidance

It is helpful to distinguish resilience measures designed to ensure the continued, uninterrupted operation of vital business processes (such as high-availability arrangements for IT systems and networks) from recovery measures designed to recover business operations following interruptions (such as restoration from backups and so-called disaster recovery).  One way to do this is to define Recovery Time Objectives and Recovery Point Objectives for IT systems using a common basis (such as the projected accumulation of costs due to service interruptions resulting from serious incidents or disasters).  Techniques such as Failure Modes and Effects Analysis can facilitate structured, detailed analysis of critical systems.  A simpler if less rigorous approach is to prioritize or rank systems etc. relative to each other, and to apply common or ‘baseline’ controls to arbitrarily-defined categories or groups of systems etc., supplemented by additional controls where justified.
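
The simpler ranking approach might look something like the following Python sketch. The tier names, the 0-10 criticality scale and the RTO/RPO figures are assumptions made up for the example, precisely the sort of arbitrarily-defined categories mentioned above.

```python
from dataclasses import dataclass

# Hypothetical baseline tiers, highest first:
# (minimum criticality score, tier name, RTO in hours, RPO in hours)
TIERS = [
    (8, "critical", 4, 1),
    (5, "important", 24, 8),
    (0, "routine", 72, 24),
]


@dataclass
class System:
    name: str
    criticality: int  # assumed 0-10 score agreed in the impact analysis


def assign_tier(system):
    """Map a system to the first tier whose minimum score it meets."""
    for min_score, tier, rto_hours, rpo_hours in TIERS:
        if system.criticality >= min_score:
            return {
                "system": system.name,
                "tier": tier,
                "rto_hours": rto_hours,
                "rpo_hours": rpo_hours,
            }
    raise ValueError("criticality score must not be negative")


# Rank the systems relative to each other, most critical first.
systems = [System("intranet wiki", 2), System("order entry", 9), System("payroll", 6)]
for s in sorted(systems, key=lambda s: s.criticality, reverse=True):
    print(assign_tier(s))
```

Systems that deserve more than their tier's baseline would then have additional controls recorded against them individually.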

Other information

Identifying and characterizing the business continuity requirements for business units, processes, systems, people, suppliers etc. enables the associated continuity arrangements to be optimized, especially when resources are limited.  In the absence of clear priorities, vital time may be lost in recovering non-critical systems, for example, thereby delaying and perhaps jeopardising the successful recovery of more critical systems and processes.

As with other information security controls, resilience and recovery arrangements generally involve a combination of general purpose infrastructure or baseline controls (such as regular offline data backups and tested restore capabilities) plus additional custom-designed controls protecting high-risk processes, systems, networks etc. (such as load-balancing, clustering and distributed computing arrangements).

Since it is hard for anyone to predict the duration and scale of incidents and disasters, assumptions about either aspect are inherently risky.  As a general rule, it is safer to assume that things might be even worse than predicted, leaving plans open-ended where possible and giving employees the time and opportunity to make alternative arrangements before essential resources are exhausted.

17.2.3  Maintain business impact/business continuity information

Control

Proactively manage and maintain the information relating to business impacts and business continuity, ensuring that significant changes to the business, processes, IT systems etc. which affect business continuity requirements are adequately reflected in the business continuity arrangements.

Implementation guidance

The business continuity manager is normally responsible for leading, stimulating, guiding, directing and controlling business continuity activities as a whole.  This includes ensuring that business continuity information (such as business impact analyses, and resilience and recovery requirements) is properly managed and maintained.

Developing, implementing, reviewing and updating business continuity arrangements lends itself to a cyclical approach, but the period of review and depth of analysis should preferably reflect the criticality of the business processes.  For example, business continuity arrangements for the most critical processes may be reviewed every 3 to 6 months with an annual in-depth analysis, whereas less critical processes may be reviewed less often.  It is generally better to start implementing continuity arrangements for business processes that are clearly critical to the organization as soon as practicable, rather than waiting for the entire analysis of all business processes to complete.  One way to do this is to run successive rounds or generations of analysis, addressing first the core business processes and gradually working down the priority list.
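
A criticality-driven review cycle could be tracked with something as simple as this Python sketch. The tier names and intervals are assumptions: 90 days sits at the short end of the 3 to 6 month range suggested for the most critical processes, and the longer intervals are purely illustrative.

```python
from datetime import date, timedelta

# Hypothetical review intervals per criticality tier, in days.
REVIEW_INTERVAL_DAYS = {"critical": 90, "important": 180, "routine": 365}


def next_review(last_review, tier):
    """Date on which the next review of a process falls due."""
    return last_review + timedelta(days=REVIEW_INTERVAL_DAYS[tier])


def overdue(processes, today):
    """Names of processes whose next review date has already passed."""
    return sorted(
        name
        for name, (tier, last_review) in processes.items()
        if next_review(last_review, tier) < today
    )


# Invented example data: process name -> (tier, date of last review).
processes = {
    "order processing": ("critical", date(2014, 1, 1)),
    "staff newsletter": ("routine", date(2014, 1, 1)),
}
print(overdue(processes, today=date(2014, 6, 1)))  # prints ['order processing']
```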

Other information

A single, complete round of detailed business impact analysis can easily involve months if not years of work in a large organization, with the unfortunate consequence that the organization may change substantially during the process, thus invalidating early work.  Techniques such as restricting the scope and time-boxing the analyses may help by keeping the process rolling and ensuring that significant business changes are picked up reasonably quickly.  Significant business changes (such as new markets, new products, mergers and acquisitions, and new IT systems) should ideally incorporate business impact analysis and business continuity management activities to align them with other business continuity arrangements at the time they are implemented, and then fold into the routine business continuity management processes.

17.3     Business continuity controls

Objective: to develop, test, implement and maintain various controls necessary to fulfill the identified resilience and recovery requirements.

17.3.1  Resilience of critical business processes and associated information assets

Control

Ensure through the application of resilience engineering that critical business processes, along with the associated IT systems, networks, resources etc., are sufficiently resilient and robust to resist failure except, perhaps, under the most extreme circumstances.

Implementation guidance

Processes, IT systems etc. in this category may require investment in high-availability controls such as:
  • Fault-tolerance, diversity, redundancy and automated failover techniques (e.g. uninterruptible power supplies and diversely-routed communications networks);
  • Excess capacity, ‘over-engineering’ and ‘graceful-decline’ (e.g. reallocating resources from lower to higher priority activities to prevent or slow down declining performance);
  • Fail-safe designs;
  • Preventive maintenance;
  • Strict change management, with comprehensive pre-implementation preparation, planning and testing of changes and the ability to reverse unsuccessful changes very reliably, quickly and efficiently;
  • Additional monitoring with high-priority responses to impending and actual incidents (e.g. routine performance monitoring and capacity planning, coupled with alerts or alarms if systems or processes exceed permitted response times or throughput).
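
The last item, alerting when systems stray outside permitted response times or throughput, can be sketched in a few lines of Python. The field names, limits and sample readings are hypothetical; a real implementation would hang off whatever monitoring tools the organization already uses.

```python
def check_thresholds(samples, max_response_ms, min_throughput):
    """Return alert messages for any sample outside the permitted limits."""
    alerts = []
    for sample in samples:
        if sample["response_ms"] > max_response_ms:
            alerts.append(
                f"{sample['system']}: response time {sample['response_ms']} ms "
                f"exceeds {max_response_ms} ms"
            )
        if sample["throughput"] < min_throughput:
            alerts.append(
                f"{sample['system']}: throughput {sample['throughput']}/s "
                f"below {min_throughput}/s"
            )
    return alerts


# Invented readings for two systems.
samples = [
    {"system": "orders", "response_ms": 850, "throughput": 120},
    {"system": "billing", "response_ms": 200, "throughput": 40},
]
for alert in check_thresholds(samples, max_response_ms=500, min_throughput=50):
    print(alert)
```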

Other information

Many of the information security controls described elsewhere in this standard are particularly significant for business-critical IT systems and networks.  This is patently obvious in the case of availability-related controls, but also applies to controls whose prime focus is the maintenance of integrity (e.g. malware controls) and, to a lesser extent, confidentiality (e.g. access controls).  Therefore, business impact analysis and resilience engineering can benefit information security and hence the organization in various ways in addition to continuity, illustrating the value of the systems approach to information security management. 

Resilience engineering is a form of preventive control.  The idea is to make critical processes, plus the supporting services etc., so robust that they keep on working (to some extent) through incidents and disasters that would otherwise have severely disrupted or interrupted them.  The concept of resilience engineering includes but extends beyond the realm of IT, including for example:

  • Diversity of supply for vital raw materials/supplies, business/IT services etc. (e.g. power feeds from multiple substations, commoditised cloud computing services);
  • Deputies, understudies, multi-skilled employees and/or the availability of competent contractors/consultants capable of covering for the loss of key employees at short notice;
  • Proactive risk management, with a reduced tolerance for risks relating to business-critical processes relative to others and stricter controls.
Ideally, the interdependency of critical business functions, systems, networks, people, organizations etc. should be mapped and reviewed since even a relatively small incident in one part may have more serious consequences elsewhere.  Practical constraints generally make such a rigorous approach unworkable in practice, although it may be feasible and worthwhile to at least map key first-level dependencies relating to business-critical and safety-critical processes, systems, people and suppliers.
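
Mapping first-level dependencies need not be elaborate. In this minimal Python sketch, the process and resource names are invented for illustration, and only direct dependencies are recorded, in keeping with the pragmatic approach suggested above.

```python
# Hypothetical first-level dependency map: process -> resources it relies on.
DEPENDS_ON = {
    "order processing": {"web servers", "payment gateway", "warehouse staff"},
    "payroll": {"hr system", "bank link"},
    "customer support": {"web servers", "phone system"},
}


def directly_affected(failed_resource):
    """Processes that depend directly on the failed resource."""
    return sorted(
        process
        for process, resources in DEPENDS_ON.items()
        if failed_resource in resources
    )


print(directly_affected("web servers"))  # prints ['customer support', 'order processing']
```

Even a flat map like this shows up shared resources whose failure would hit several critical processes at once.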

17.3.2  Recovery of information processes

Control

Facilitate the restoration of information processes that fail, despite the presence of preventive controls.

Implementation guidance

Various types of backups, archives, fall-backs, stand-ins and replacements are the usual ways of providing for the restoration or recovery of information systems, networks and content that fail in service.  There are many possible alternatives, and although choices may be made serendipitously (for instance using the backup and recovery options provided by default with most systems), management should ensure that the options and configuration do actually fulfil the recovery requirements, especially in respect of business-critical information processes and systems. 

Furthermore, the arrangements should be proven adequate, for example through periodically testing the ability to recover systems from offline backups onto suitable test systems (avoiding overwriting live data on the production systems just in case the tests should fail).
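
One way to gain that proof is to compare what comes back from a test restore against checksums recorded when the backup was taken. The Python sketch below assumes a manifest of SHA-256 digests and, to stay self-contained, holds file contents in memory; the function names and data are hypothetical.

```python
import hashlib


def digest(data):
    """SHA-256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_restore(manifest, restored):
    """Compare restored content with the digests recorded at backup time.

    manifest: file name -> expected digest
    restored: file name -> bytes recovered onto the test system
    Returns a list of (file name, problem) pairs; empty means all matched.
    """
    problems = []
    for name, expected in manifest.items():
        if name not in restored:
            problems.append((name, "missing"))
        elif digest(restored[name]) != expected:
            problems.append((name, "mismatch"))
    return problems


# Invented example: one file restores intact, one is corrupt, one is absent.
manifest = {
    "ledger.db": digest(b"ledger contents"),
    "customers.db": digest(b"customer contents"),
    "config.ini": digest(b"settings"),
}
restored = {"ledger.db": b"ledger contents", "customers.db": b"corrupted"}
print(verify_restore(manifest, restored))
```

Matching checksums only show that the data came back intact; whether the business can actually run on the recovered system is a separate question.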

Other information

Due to the wide variety of technical options available and spectrum of recovery requirements, it is not appropriate for this standard to specify or recommend particular solutions.  Information asset owners, in conjunction with subject matter experts, should ensure that the appropriate recovery arrangements are specified, funded, implemented and maintained, complementing other information security controls.

17.3.3  Business continuity tests and exercises

Control

Conduct tests and exercises to gain assurance of the adequacy of business continuity arrangements.

Implementation guidance

Although actual incidents and disasters are the ultimate proving grounds, business continuity arrangements should preferably have been tested previously to confirm that they would operate as specified and expected.  Such testing offers the opportunity to revise or refine the arrangements if necessary, and assures management, information asset owners and other stakeholders that adequate arrangements are in place.

In addition, business continuity exercises that simulate various kinds of incident or disaster allow those involved in resilience, recovery and contingency activities to become more familiar and competent through training, practice and rehearsal. 

Other information

There are many ways of conducting business continuity tests and exercises, ranging from paper-based checks of the plans through to full invocations under simulated disaster conditions.  Factors to take into account when planning such tests and exercises include:
  • Their scope, coverage, depth, frequency and timing (e.g. is it appropriate to test assumptions made in planning?; the confidence necessary to authorize full failover tests on live production systems at peak times generally implies a very high level of assurance in the design and operation of the failover arrangements, whereas authorizing limited tests at off-peak times usually indicates a far lower confidence and assurance level);
  • The amount of assurance required, which strongly relates to the criticality of the business processes, systems etc. whose continuity is to be maintained, along with the nature of the continuity arrangements (e.g. minor changes to existing recovery plans probably do not deserve the same level of assurance as new plans or complete re-writes) and stakeholder requirements (e.g. organizations involved in critical national infrastructure services may be held to a higher standard of proof than businesses in general);
  • Resources available for testing/exercising, and priorities relative to other business continuity, information security and general business activities;
  • Risks to the business, including the risks associated with conducting the tests/exercises themselves as well as the possibility of the business continuity arrangements proving inadequate when invoked for real;
  • The scenarios or situations being simulated, including ‘wildcards’ designed to test/exercise contingency arrangements (see below);
  • The maturity of the organization’s business continuity and other information security management practices.

17.4     Contingency arrangements

Objective: to enhance the organization’s capability to deal with exceptional information security risks that are not adequately mitigated by other risk treatments.

17.4.1  Contingency preparations

Control

Develop the organization’s broad capabilities to cope effectively and positively with unanticipated situations, events, incidents and disasters, whatever their nature.

Implementation guidance

Generalized contingency capabilities include:
  • Employees’ willingness to rise to a challenge, take personal risks (within reason), be resourceful, creative, resilient and adaptive under pressure, and collaborate with colleagues to make the best use of available resources;
  • Management’s willingness to give staff the latitude and discretion necessary to take matters into their own hands, when appropriate;
  • The availability of emergency supplies such as first-aid kits, flashlights, water, gloves etc., information resources such as policies, procedures, instructions, communications facilities and backups, and external assistance such as the emergency and specialist services and assistance from business partners;
  • Training, practice and rehearsals in the associated skills and activities, increasing employees’ competence and confidence in challenging situations;
  • The overall status/strength and resilience of the organization as a whole, and potentially also the supply chain, industry and/or nation for truly massive disasters.

Other information

True contingency activities are contingent (dependent) on the exact situations that unfold, hence while it is not appropriate to develop detailed/specific plans for most circumstances, general approaches, strategies and ways of dealing with novel situations are of value.

Although it makes sense for the organization to do all it reasonably can do to avoid or prevent incidents and disasters, there are far too many possible scenarios to plan fully for them all.  There inevitably remains a possibility that the analysis, planning and preparations will prove inadequate (such as underestimating or failing to foresee certain threats, vulnerabilities or impacts) or the preventive controls may prove inadequate given certain ‘unfortunate’ situations (such as rare combinations of events).  The sheer cost of trying to prevent absolutely everything is prohibitive, and continuity planning on this basis soon becomes unworkable, hence the emphasis on risk-based planning and prioritization, coupled with recovery and contingency arrangements as a last resort.

-------------------------------


OK, so that's what I proposed.  Now take a look at what ended up being published in section 17 of ISO/IEC 27002:2013, and recall the old adage about a camel being a horse designed by a committee. About the only discernible vestige of my lovingly researched and written proposal is the garbled section 17.2 "Redundancies" which is (naturally) IT-specific. Section 17.1 appears to be advising the information security management function to develop its own business continuity plans - quite extraordinary! Yes, it is necessary to consider information security in the aftermath of a disaster, but no, that is not THE primary consideration in business continuity management. I despair!


Sorry for this extraordinarily long post but I feel much better now I've got that little lot off my chest.


PS  I also proposed new or changed security controls for:

  • SCADA/ICS (industrial control systems)
  • The computer suite (mostly physical controls)
  • SDLC (software development life cycle) and
  • Cloud computing

Friday 21 March 2014

Avoiding metrics myopia

Things being measured are patently being observed in a very specific, focused manner. Things that are observed so closely tend to be presumed by others to be of concern to the measurer/observer, at least. Things under the measurement ruler therefore assume a certain significance simply by dint of their being closely observed, regardless of any other factors.

We see this 'fascination bias' being actively exploited by cynical promoters, marketers and advertisers on a daily basis through an endless stream of largely banal and unscientific online polls and infographics, their true purpose made all the more obvious by the use of bright primary-color eye-catching graphics. They are manipulating readers to believe that since something has been measured, it must be important. How many of us take the trouble to think about the quality of the metrics, or about all the other aspects that haven't been measured? Like bunnies in the headlights, we stare at the numbers.

In PRAGMATIC Security Metrics, we outlined 21 observer biases drawn from an even longer list drawn up by Kahneman, Slovic and Tversky in Judgment Under Uncertainty (1982): what I'm calling 'fascination bias' has some resemblance to what Kahneman et al. described as 'attentional bias', the tendency to neglect relevant data when making judgments of a correlation or association.

Fascination bias creates a genuine concern in that we tend to measure things that are relatively easy to measure, and place undue faith in those metrics relative to other factors that are not being measured. Back in 2011, Michal Zalewski said in his blog:
"Using metrics as long-term performance indicators is a very dangerous path: they do not really tell you how secure you are, because we have absolutely no clue how to compute that. Instead, by focusing on hundreds of trivial and often irrelevant data points, they take your eyes off the new and the unknown."
While we don't entirely accept that we 'have no clue how to compute security performance', his point about neglecting other risks and challenges due to being inordinately focused on specific metrics is sound.  It's only natural that what gets measured gets addressed (though not necessarily improved!). The unfortunate corollary is that what doesn't get measured gets neglected.

The upshot of this is that there is a subtle obligation on those who choose metrics to find ways to measure all the important matters, even if some of those metrics are expensive/complex/qualitative/whatever. It's simply not good enough to measure the easy stuff, such as the numbers that assorted security systems constantly pump out 'for free'. It's inappropriate to disregard harder-to-measure issues such as culture, ethics, awareness and trust, just as it is inappropriate to restrict your metrics to IT or cybersecurity rather than information security.

That's one of the key reasons why we favor the systematic top-down GQM approach: if you start by figuring out the Goals or objectives of security, expand on the obvious Questions that arise and only then pick out a bunch of potential Metrics to answer those questions, it's much harder to overlook important factors. As to figuring out the goals or objectives, conceptual frameworks for information security such as BMIS and ISO27k, based on fundamental principles, are an obvious way to kick off the thinking process and frame the initial discussion.
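
To make the top-down idea a little more concrete, here is a minimal Python sketch of a GQM structure. It is only an illustration: the class names, the helper function and the sample goal, questions and metric are all hypothetical, invented for this example rather than taken from any GQM tool or from PRAGMATIC Security Metrics. The point it demonstrates is that starting from goals and questions makes the unmeasured questions visible, instead of letting them quietly drop out of sight.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Question:
    """A question arising from a goal, with the metrics chosen to answer it."""
    text: str
    metrics: List[str] = field(default_factory=list)


@dataclass
class Goal:
    """A security goal or objective, with the questions it raises."""
    text: str
    questions: List[Question] = field(default_factory=list)


def unmeasured_questions(goals: List[Goal]) -> List[str]:
    """Return the questions that have no metric yet, i.e. those at risk of neglect."""
    return [q.text for g in goals for q in g.questions if not q.metrics]


# Hypothetical goal, questions and metric, invented purely for illustration.
goals = [
    Goal(
        "Workers handle information securely",
        [
            Question(
                "Do people report suspected incidents promptly?",
                ["Median time from detection to report"],
            ),
            # Harder to measure, so no metric has been chosen yet.
            Question("Is the security culture improving?"),
        ],
    )
]

print(unmeasured_questions(goals))
```

Running the sketch lists the culture question as unmeasured, which is exactly the kind of harder-to-measure issue that tends to be neglected when metrics are picked bottom-up from whatever numbers happen to be available.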

Wednesday 19 March 2014

Refreshing transparency or naivete?

In the course of researching for next month's awareness module on security compliance, I came across a surprisingly honest security and compliance statement by UserVoice - a cloud-based customer service provider.

I suspect, like all those website privacy policies, very few visitors are concerned or knowledgeable enough to read the compliance statement word-for-word.  A quick scan of the headlines tells us they comply with Safe Harbor, PCI-DSS and others.  It mentions SSAE16 certification. Sounds OK at this point - all very positive. They are making the right noises.

Most of the more technical security-related statements are similarly inspiring.  I am impressed to read, for instance:
"Development and test environments do not use customer data.  We use fake customer data in those environments."
I wish that approach was universal - in fact as I said just last week here on the blog, I tried (and failed!) to get advice along those very lines inserted into ISO/IEC 27002:2013.

Read on, though, and a few of the latter statements are not quite so confidence-inspiring e.g.
"We’re running the latest version of Ruby on Rails 2.3 and we review/apply the latest security patches as they come out."
According to the Ruby on Rails site, it is up to version 4-something now. Perhaps someone simply forgot to update the compliance statement ... but if so, what else might have changed and not been reflected in their fine words?

Later, it says:
"Your password is stored securely.  For performance reasons our database itself is not encrypted (though backups are; more on that below), but all user passwords are hashed using the SHA1 algorithm with salt. Hashing passwords is actually more secure than encrypting them, because that means we don’t have access to the original passwords, nor does anyone else. So even if our database is compromised, everyone’s passwords will stay secure."
Fair enough.  One might question their continued use of the deprecated SHA1 standard but it makes sense to use one-way hashing rather than reversible encryption ... except that three paragraphs below we read:
"User passwords are encrypted.  For performance reasons our database itself is not encrypted (though backups are; more on that below), but all user passwords are encrypted."
So at this point I'm confused.  Possibly "user passwords" and "your password" are referring to different things? If not, those two statements seem directly contradictory. Again, this could just be a simple typo, misunderstanding or error in the compliance statement, and again it makes me wonder whether any of this is trustworthy.
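
For readers unsure why the hashing/encryption distinction matters, here is a minimal Python sketch of salted one-way password hashing. It is emphatically not UserVoice's implementation, which I have not seen. Rather than plain salted SHA1, it swaps in PBKDF2-HMAC-SHA256 from Python's standard hashlib module, a deliberately slow key derivation function. The function names, iteration count and salt length are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed values for illustration only; tune the iteration count to your own hardware.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256 with a random per-user salt."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the candidate password and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected_key)
```

Only the salt and the derived key are stored. There is no decryption key to steal and no function that turns the stored value back into the password, which is the sense in which hashing beats reversible encryption here. Whether stolen hashes actually resist guessing still depends on the algorithm and the work factor, hence the doubts about SHA1.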

Up to now, I could easily give them the benefit of doubt, but a couple of statements towards the end really stand out for me. They are what prompted me to write this blog piece:
"Do you have a rigorous screening process for your employees that includes credit checks, background checks, reference checks and cavity searches? No. We will often check references, we will always verify that local standard for proof of eligibility to work, and we’ll make sure that potential team members fit our company values, but we will do little beyond that. We also have every employee sign a confidentiality agreement before he or she can work for us. We also do performance and peer reviews twice a year and retrospectives after any critical issue. If someone is not meeting our own performance standard, including being the root or contributing cause of a critical issue(s), we will not hesitate to terminate our working relationship with that person."
'We will often check references' is basically admitting that they don't always do so. Um. I'd argue that's a fundamental security control, an important guard against identity theft for starters - one of several unmentioned controls for new employees (such as security orientation, to name but one). Requiring employees to sign confidentiality agreements is fine, but what about contractors, consultants, suppliers and others? And aside from confidentiality, what about other aspects of information security and privacy - such as ethics for instance:
"Do you have a _____ ethics policy? No. We’re guided by our company values and good old common sense. We believe that policies around ethics lead to less ethical behavior."
Oh, I see. The company values remind me of the Internet boom at the start of the millennium, with comments such as 'Don't be a dick' and 'Don't take ourselves too seriously' painting a picture of a fun-filled laissez-faire work environment ... hardly one where good old-fashioned ethical values stand out.

Perhaps I'm just an old fart. Perhaps I simply don't get it. But perhaps I have good reason to question their attitudes and approach towards information security.

So I'll leave you to ponder this. How would you feel if you were one of the backers behind a company that made such a compliance statement, at times brutally honest, contradictory and tongue-in-cheek? What if a security or privacy breach occurs, and customer interests are harmed, and the news media sink their teeth into this statement? I'd respectfully suggest it might be time to reconsider, or at least review and update the compliance statement.

Tuesday 18 March 2014

Rejected ISO/IEC 27002 control for the computer suite

I proposed the following text to update the ISO/IEC 27002:2005 advice on physical security.  Unusually for me, it specifically relates to protecting IT rather than information assets of all kinds.  Various changes were made in that section, but virtually all that remains of my suggested control is 11.1.4's "Physical protection against natural disasters, malicious attacks or accidents should be designed and applied", so obtain specialist advice.

------------------------------

Protecting the IT environment and services

Control
IT facilities should provide a suitable environment to protect the IT equipment and other information assets, and to maintain essential supplies and services.

Implementation guidance
IT facilities such as computer rooms, computer suites and computer centres commonly house significant quantities of highly valuable and somewhat fragile IT equipment and other information assets (including data storage media). Where possible, IT facilities should be located to avoid the risks associated with: fires, floods, earthquakes, volcanic eruptions and other natural disasters; vandalism, sabotage, arson and criminal damage; overheating, power interruption etc., for example not being sited in areas prone to:
  • earthquakes, volcanoes, landslides and subsidence;
  • flooding and tsunami e.g. coastal zones, valley floors and flood plains;
  • wild fires e.g. bush and forested areas;
  • climatic extremes e.g. tropical and polar regions, deserts;
  • lightning, storms, tornadoes etc.;
  • power cuts, brownouts, surges and spikes; or
  • vandalism and sabotage e.g. lawless areas, dense urban areas, run-down industrial estates or socially-deprived/high-crime areas. 
It is particularly important for business continuity purposes that primary (operational) and secondary (load-sharing, standby or disaster recovery) sites are not subject to common environmental risks (e.g. being located too close together or sharing the same fault line).

If it is impossible to avoid the risks entirely, mitigating controls should ideally be designed, implemented, operated, managed and maintained as a coherent and comprehensive set. The following guidelines are not intended to be comprehensive as specialist advice is recommended, particularly given compliance obligations imposed by relevant laws, local building codes and regulations. However, these are commonplace controls:

a) controls to mitigate fire and smoke risks include:
  • semi-sealed positive-pressure arrangements to reduce the incursion of smoke, flames and dust from adjacent areas;
  • the use of fire-proof, fire-resistant and low-smoke building materials, doors, walls, wallpapers and floor coverings etc.; 
  • avoiding the use or storage of unnecessary quantities of combustible materials (e.g. flammable fuel or solvents, wood, paper or plastics including computer tapes and packaging materials) and combustion sources (e.g. fires, ovens, furnaces, chemical laboratories, cigarettes) in or near IT facilities; 
  • cleaning and other measures to reduce the build up of dust, waste, stores etc. that may occlude air filters, cause electrical short-circuits, reduce the reliability of electronic equipment and may cause safety issues (e.g. by blocking emergency exits);
  • suitable fire alarms having appropriate smoke and/or heat detectors (including high sensitivity units where high-flow air conditioning units are installed, and perhaps aspirating units inside equipment racks or cabinets), manual fire alarm triggers, alarm/annunciator panels that unambiguously identify the zones where fires are detected, fire-resistant cabling etc.; 
  • fail-safe power system interlocks to shut down air conditioners, IT equipment, Uninterruptible Power Supplies and generators in the appropriate sequence if a fire is detected (e.g. stopping air conditioners early to avoid fanning the flames and spreading smoke);
  • provision of manual fire extinguishers and, where appropriate, automated fire suppression systems, in all cases suitable to protect the IT facilities and electrical equipment, and to protect the health and safety of everyone in the facility; 
  • properly protected, signed and alarmed emergency exits; and 
  • fire incident response and evacuation procedures;

b) controls to mitigate flood risks include:
  • locating facilities well above water courses and anticipated flood levels;
  • not running water supplies, sewage and drainage pipes through, over or in fact near the facilities (including drains that may block and back-up under false floors);
  • properly installing and maintaining roofs, walls, windows and pipework (including air conditioner condensate drains) to minimize the possibility of leaks;
  • sub-floor water detection with alarms; and 
  • provision of mops, buckets, plastic sheeting etc. plus flood response procedures to cope with floods;

c) controls to mitigate accidental or deliberate physical damage to IT facilities such as vandalism, sabotage and arson include:
  • strong walls, gates, doors and other physical access barriers and controls (see 9.1.1 [physical security perimeter] and 9.1.2 [physical entry controls]); 
  • security guards and procedures such as regular and ad hoc facilities inspections or rounds; 
  • identification, authorization, access and logging/monitoring procedures covering everyone who enters the facilities e.g. employees (staff and managers); contractors and consultants; suppliers, partners and customers; installation and maintenance workers; visitors; cleaners; and security guards; and
  • deterrent signs, CCTV monitoring (potentially using covert as well as overt cameras, protected cabling, encrypted wireless links, pan-tilt-zoom cameras, supplementary lighting etc.) and intruder alarms (potentially including ‘silent alarm’ facilities and interlocks with other controls such as CCTV and lighting systems and egress controls);

d) controls to mitigate lightning- and static-electricity related risks include:
  • installing suitable lightning conductors on exposed building high points, all properly grounded through directly-routed heavy-duty copper straps etc.;
  • safety grounding of all exposed metalwork, metal cabinets and racks housing electrical and IT equipment, raised floor panels etc.;
  • using surge arrestors on all wires and cables entering the facility, and fibre-optic or radio instead of copper wires for high-availability data communications links;
  • fitting low-static floor coverings, maintaining at least 50-60% relative humidity, and/or the use of static-dissipating conductive coatings or sprays; and
  • using grounded static-dissipating mats and wrist bands while working on static-sensitive electronic equipment;

e) controls to mitigate the risks of excessive temperature variations include:
  • installation, operation, management and maintenance of suitable heating, ventilating and air conditioning systems, ideally with sufficient excess capacity under normal circumstances to handle the load with ease, cope with extremes, and permit individual units to fail or be taken out of service for maintenance without unduly compromising the temperature control;
  • monitoring temperatures locally and remotely (e.g. at security guard stations), with alarms and response procedures accordingly;
  • maintaining power consumption and cooling demands within acceptable limits by systematically monitoring and controlling the installation and location of electrical equipment (e.g. high-density racks of blade servers may easily exceed default planning assumptions for both power and local cooling requirements); and
  • provision of emergency cooling facilities such as windows or doors that can be left open without compromising physical access controls, fans and portable air conditioners;

f) controls to mitigate risks associated with electrical power include:
  • professionally-designed, installed, operated, tested and maintained electrical power systems, covering the power sources, distribution, switching, monitoring and control systems, cabling, sockets, fuses, grounding, ducting etc.;
  • power sources with more than adequate capacity to cope with current and projected peak loads, ideally supplied from diverse parts of the electrical grid (e.g. separate substations fed from separate high voltage transmission lines) with appropriate facilities to cross-link or change over supplies safely in an emergency;
  • resilient power sources such as computer-grade Uninterruptible Power Supplies, electrical generators, power-conditioners etc., all adequately specified for the intended use; 
  • preferential use of low power/highly efficient equipment; and
  • systematic monitoring of power loads, balancing the load across phases and circuits, and ensuring that capacities are never exceeded;

g) general-purpose controls for IT facilities include:
  • adoption of international and national standards relevant to the risks and controls in this section;
  • consulting subject-matter experts offering more specific and detailed/tailored guidance in these areas;
  • awareness, training and management oversight activities to reinforce compliance with applicable laws, regulations, policies, procedures, guidelines etc.;
  • regular inspections, surveys or tests of IT facilities and the associated controls for signs of inadequately mitigated risks (these may perhaps be fulfilled at least partly through security guards’ rounds, routine health and safety inspections, facilities audits etc.); and
  • business continuity arrangements including resilience measures (such as hot sites), crisis and incident management procedures and contingency plans to mitigate the impacts of environmental or physical disasters despite the preventive controls (see also clause 14).
-------------------------------------------------
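
As a small illustration of the capacity guidance in items e) and f) of the proposed text, the following Python sketch checks two things: whether the cooling load can still be handled with the largest air conditioning unit out of service, and which electrical circuits are running above a chosen fraction of their rated capacity. The function names, the figures in kW and the 80% headroom threshold are all assumptions made up for this example, not values taken from the standard or from any building code, so treat it as a sketch and take specialist advice for real facilities.

```python
from typing import Dict, List


def cooling_survives_one_failure(unit_capacities_kw: List[float], heat_load_kw: float) -> bool:
    """True if the remaining units still cover the load with the largest unit out of service."""
    if len(unit_capacities_kw) < 2:
        return False  # a single unit leaves no spare capacity at all
    remaining = sum(unit_capacities_kw) - max(unit_capacities_kw)
    return remaining >= heat_load_kw


def overloaded_circuits(loads_kw: Dict[str, float],
                        capacities_kw: Dict[str, float],
                        headroom: float = 0.8) -> List[str]:
    """Return the circuits whose load exceeds the chosen fraction of rated capacity.

    The 0.8 default is an assumed planning threshold, used here only for illustration.
    """
    return sorted(name for name, load in loads_kw.items()
                  if load > headroom * capacities_kw[name])


# Hypothetical figures for a small computer room.
print(cooling_survives_one_failure([30.0, 30.0, 30.0], heat_load_kw=55.0))
print(overloaded_circuits({"L1": 9.0, "L2": 5.0}, {"L1": 10.0, "L2": 10.0}))
```

With these made-up numbers, three 30 kW units can lose one and still cover a 55 kW heat load, while circuit L1 is flagged for running above 80% of its rating.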

I deliberately structured the proposed control around the specific risks being addressed - an approach I favor throughout ISO27k: if we are advising people to identify, assess, treat and monitor their information-related risks, it makes sense to me for the standards themselves to be overtly risk-driven. I envisaged information security pros systematically working through the text above, thinking about the relevance of the listed risks, and considering the advice on controls.  However, 27002 is controls-driven, making little reference to the risks.

That final reference to clause 14 on business continuity should now read clause 17, although that is another section where my suggested changes were rejected, leaving us with a distinctly unimpressive result.  More on that later.


PS  I also proposed new or changed security controls for:
  • SCADA/ICS (industrial control systems)
  • SDLC (software development life cycle)
  • BCM (business continuity management) and
  • Cloud computing