Performance Based Regulation and Detecting the Pathogens
At a time when Performance Based Regulation (PBR) is a hot topic in the aviation industry, a series of rail accidents in North America helps demonstrate the type of poor performance that PBR must successfully detect. These accidents were what James Reason, Professor Emeritus at the University of Manchester, described as ‘organisational accidents’ in his classic 1997 book Managing the Risks of Organizational Accidents. Reason explained that:
Organizational accidents have multiple causes involving many people operating at different levels of their respective companies.
Such accidents result from ‘latent organisational failures’ that are, according to Reason, like pathogens that have infected the organisation. A key challenge for an organisation’s Safety Management System (SMS) is to detect latent pathogens before they cause harm. PBR needs to give the regulator assurance that the organisation’s SMS is vigilant and effective at doing that.
As we recently discussed, the US National Transportation Safety Board (NTSB) published a special investigation report into the organisational factors that emerged after five accidents at Metro-North (discussed at a special hearing). Metro-North is the second largest commuter railroad, and one of the busiest, in the United States. Between May 2013 and March 2014 Metro-North had five significant accidents resulting in 6 fatalities, 126 injuries and more than $28 million in damages. In 2012 the Federal Railroad Administration (FRA) issued a Notice of Proposed Rule Making (NPRM) that would require a ‘system safety program’, which the NTSB likens to an SMS in other industries. The NPRM states:
Since most of these are procedures, processes, and programs railroads should already have in place, the railroads would most likely only have to identify and describe such procedures, processes, and programs to comply with the regulation.
Comparable statements have been made in similar rule-making initiatives in other industries. They help defuse potential complaints about extra red tape but do raise the question: ‘so is there a real benefit to the proposed regulation?’ The prime benefit is of course ensuring that organisations that would not operate an effective SMS voluntarily at least have to justify their SMS performance to an independent regulator. NTSB observe that:
Metro-North has for many years had an SSPP [System Safety Program Plan] that presumably will fulfill the proposed regulatory requirement for such a program. However, while the NTSB investigations found Metro-North had a written SSPP, its implementation was very limited and represented little more than a paperwork exercise. Few Metro-North employees even knew the program existed. The identified deficiencies in the Metro-North SSPP implementation provide a cautionary example to FRA as it finalizes the proposed regulation.
They also note that:
A management systems approach will require cultural change at the [regulator] as well as in the industry.
The US Federal Aviation Administration (FAA) has introduced its own Part 5 Safety Management System (SMS) requirement for Part 121 carriers (as we discussed in June 2015), so the NTSB have issued a timely reminder. We have however previously expressed concerns that the FAA’s fondness for fines may undermine that implementation.
The Transportation Safety Board of Canada (TSB), in its final report on the crude oil train derailment and fire that killed 47 people on 6 July 2013 at Lac-Megantic, Quebec, expressed concerns about how the regulator, Transport Canada (TC), dealt with SMS regulation (emphasis added):
…the first SMS audit to assess the effectiveness of the company’s safety management processes took place in 2010, which was 7 years after the company was found to be in compliance with the SMS Regulations. During this audit, inspectors were informed that the SMS had not yet been implemented because the company was awaiting regulatory approval. TC then clarified with MMA [Montreal, Maine & Atlantic Railway] that TC does not approve a railway’s SMS.
A second SMS audit was conducted in 2012, and focused on a very limited subset of SMS elements. …many of the deficiencies in MMA’s SMS that came to light through the audit process were never resolved. For example, weaknesses in MMA’s risk assessment process were identified during TC’s pre-audit in 2003. The 2010 audit found that risk assessments were being conducted only for major operational changes. Since that time, very few risk assessments had been conducted…
The absence of an internal audit procedure at MMA was first identified during TC’s pre-audit in 2003, and again in the 2010 SMS audit. An internal audit procedure had not been developed, and no internal SMS audits had taken place at MMA. Other weaknesses in MMA’s SMS, including the fact that the toll-free number for reporting safety concerns was not being used…
Although TC inspections identified problems at MMA between 2003 and 2010, and it was clear to TC that MMA’s SMS was not effective, no SMS audits were conducted in that time frame. The 2010 TC audit determined that MMA had not implemented its SMS. The limited number and scope of SMS audits that were conducted by TC Quebec Region, as well as the absence of a follow-up procedure to ensure MMA’s corrective action plans had been implemented, contributed to the fact that systemic weaknesses in MMA’s SMS remained unaddressed.
An organization with a strong safety culture is generally proactive when it comes to addressing safety issues. MMA was generally reactive. There were also significant gaps between the company’s operating instructions and how work was done day to day. This and other signs in MMA’s operations were indicative of a weak safety culture—one that contributed to the continuation of unsafe conditions and unsafe practices, and significantly compromised the company’s ability to manage risk.
When the investigation looked carefully at MMA’s operations, it found that employee training, testing, and supervision were not sufficient, particularly when it came to the operation of hand brakes and the securement of trains. Although MMA had some safety processes in place and had developed a safety management system in 2002, the company did not begin to implement this safety management system until 2010—and by 2013, it was still not functioning effectively.
TSB identified 18 distinct causes and contributing factors, “many of them influencing one another”, and many of which should have been detectable by an observant regulator.
This isn’t the only example of a stalled regulatory audit programme. We previously discussed one at a Canadian air operator: Culture + Non Compliance + Mechanical Failures = DC3 Accident. In that accident report TSB comment:
While a move towards SMS has great potential to enhance safety by encouraging operators to put in place a systemic approach to proactively manage safety, the regulator must also have assurances of compliance with existing regulations, particularly for operators that have demonstrated a reluctance to exceed minimum regulatory compliance.
In order to assess regulatory compliance, and hence whether risks are sufficiently mitigated, inspectors must have appropriate processes and carry out detailed inspections of actual operating procedures and practices.
The current approach to regulatory oversight, which focuses on an operator’s SMS processes almost to the exclusion of verifying compliance with the regulations, is at risk of failing to address unsafe practices and conditions.
If TC does not adopt a balanced approach that combines inspections for compliance with audits of safety management processes, unsafe operating practices may not be identified, thereby increasing the risk of accidents.
Of course the organisations discussed above should be the rare exception. A key to PBR is directing regulatory attention to the weaker organisations with marginal compliance, poor systems and weak cultures before an accident.
Prof Sidney Dekker comments on the danger that an SMS can become a “self-referential system”: a system that just exists for itself and is a sponge for data but one from which intelligence never emerges.
For more on the general topic of PBR see this 2002 paper from the Harvard John F Kennedy School of Government: Performance-Based Regulation Prospects and Limitations in Health, Safety and Environmental Protection
Also see this piece on lessons from the formation of the UK Military Aviation Authority (MAA): Regulatory Reflections & Resisting the Seduction of the Risk Management Process
UPDATE 9 Sept 2015: The Buffalo accident was used as an SMS case study at a European Aviation Safety Agency (EASA) Workshop in Cologne. It was stated that EASA would address the lessons from this accident by:
- Phased approach
- Stakeholder involvement
- Maintain compliance backstops
- Balance the split between rules and AMCs
- Combine safety management system assessments with audits for regulatory compliance
We have also subsequently discussed other examples of possibly lax regulatory oversight in Canada:
- UPDATE 19 June 2016: HEMS Black Hole Accident: “Organisational, Regulatory and Oversight Deficiencies”
- UPDATE 17 July 2016: Fatal Flaws in Canadian Medevac Service
- UPDATE 19 August 2016: Canadian KA100 Fuel Exhaustion Accident. This accident highlights important human factors, competence and regulatory oversight issues.
UPDATE 28 August 2016: We look at an EU research project that recently investigated the concepts of organisational safety intelligence (the safety information available) and executive safety wisdom (in using that to make safety decisions) by interviewing 16 senior industry executives: Safety Intelligence & Safety Wisdom. They defined these as:
Safety Intelligence: the various sources of quantitative information an organisation may use to identify and assess various threats.
Safety Wisdom: the judgement and decision-making of those in senior positions who must decide what to do to remain safe, and how they also use quantitative and qualitative information to support those decisions.
UPDATE 31 December 2016: TC has imposed C$409k of civil penalties in 2016.
UPDATE 17 January 2017: We discuss new UK CAA guidance: Performance Based Oversight: Accountable Manager Meetings (CAP1508)