In November 2006, the International Civil Aviation Organization (ICAO) required the implementation of an aviation safety management system (SMS) by the following aviation service provider organizations:
Since then, the list of required participants to implement SMS has increased to include:
Luckily for the aviation industry, ICAO published some excellent guidance material in the Safety Management Manual (SMM), now in the third edition. There is a fourth edition, but all I've seen as of February 2019 is the draft SMM.
ICAO broke down a very broad topic into bite-sized portions, called components. Aviation safety professionals routinely call these four components the "Four Pillars of SMS," or simply "Four Pillars."
A common misconception about aviation SMS is that SMS is all about safety risk management (SRM). SRM activities that may come to mind include:
If you are following along, and you are an SMS professional, you will realize that not all of the activities on this shortlist belong to SRM. The last three are "Safety Assurance" (SA) activities. SRM activities are "system design" activities, while SA activities are monitoring activities that ensure the system is properly designed and can tolerate operations in the presence of hazards.
Contrary to a layman's belief, SRM does not consume most of a safety manager's energies in managing SMS requirements. Once the SMS has been implemented, most safety teams are working on the last three items in the above list, as well as several other tasks that we'll describe below.
Without a doubt, safety risk management is the rock star of the four pillars, but it is not where you should focus all your energies. If you are a new safety manager, where should your energies be focused?
Let's quickly review the definition of an aviation SMS and lay out the four pillars or components of an ICAO-compliant SMS.
First, (in case you need a quick review) the four pillars are:
SKYbrary (skybrary.aero) has grown into an excellent resource over the past ten years. Their definition of "safety management system" is succinct and to the point:
A safety management system (SMS) is a systematic approach to managing safety, including the necessary organizational structures, accountabilities, policies, and procedures.
I also like how the FAA describes the purpose of safety management systems:
As the two definitions above show, safety risk management is only a small portion of the entire SMS. The other three ICAO pillars are what actually consume most of a safety manager's time.
From my experience, the ICAO SMS Documentation element and the Safety Communication element will take up most of a safety manager's time.
The safety risk management component consists of two elements: "Hazard Identification" and "Risk Assessment and Mitigation."
In short, safety risk management deals with system design. During the design process, which is a formal process, safety teams, operational department heads, and subject matter experts will be spending a lot of time analyzing and documenting the system.
One of the first tasks is to describe each system. The system description can be a short narrative that answers these questions:
The system description is important in order to narrow the scope of analysis. Your company is composed of many "systems" and the description allows safety teams to focus on one system at a time. System descriptions are most useful in triggering the thought process as the proactive hazard identification process begins.
To assist in system description, two useful templates may help:
There are no hard and fast rules for drafting the system description, but for each system, you should be considering the:
As you describe the system and the interfaces with the system, you will begin to note hazards related to the system. This will be the beginning of the hazard register that is so prevalent in aviation. Unfortunately, the majority of aviation service providers still manage their hazard register in an Excel spreadsheet that was either downloaded off of their civil aviation authority's website or given to them by an SMS consultant.
Spreadsheets are the wrong technology to manage a hazard register except for very small companies, such as those with fewer than 80 to 100 employees. Another instance where spreadsheets are adequate to manage the hazard register is for operators only needing a "paper SMS," which is where the operator does the absolute bare minimum to demonstrate the existence of an SMS.
For each hazard in the system, you will be documenting hazard-related consequences. A convenient way to document consequences that may result from interacting with the hazard is to list "risk scenarios." When you draft risk scenarios, do not become the drama queen and claim that the worst thing that could happen is a fatality or multiple fatalities. Document "worst credible scenarios." What would usually happen if this hazard manifests itself?
If you haven't been following along, I'm providing a process to document your hazards in your proactive hazard analysis. In some parts of the world, I've heard this exercise called "safety risk analysis." I prefer proactive hazard analysis because it is more descriptive and accurate of what we are doing: proactively identifying and assessing hazards in our system design.
Up until now, we have our:
Hazard-related consequences are the risks you face should an event occur (hazard manifests).
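To make the pieces gathered so far concrete, here is a minimal sketch of how a hazard register entry might be structured: a system, a hazard found in that system, and the risk scenarios attached to the hazard. The class names, field names, and sample hazards are all hypothetical and chosen for illustration only; they are not taken from any particular SMS database.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RiskScenario:
    """A worst credible consequence of interacting with a hazard."""
    description: str


@dataclass
class Hazard:
    """One hazard register entry, tied to the system it was found in."""
    system: str
    description: str
    risk_scenarios: list[RiskScenario] = field(default_factory=list)


@dataclass
class HazardRegister:
    """A simple in-memory hazard register (illustrative only)."""
    hazards: list[Hazard] = field(default_factory=list)

    def add(self, hazard: Hazard) -> None:
        self.hazards.append(hazard)

    def for_system(self, system: str) -> list[Hazard]:
        """Return the hazards documented for a single system."""
        return [h for h in self.hazards if h.system == system]


# Hypothetical sample data.
register = HazardRegister()
register.add(
    Hazard(
        system="Ramp operations",
        description="Ground vehicles operating near parked aircraft",
        risk_scenarios=[
            RiskScenario("Vehicle strikes aircraft, causing minor damage"),
        ],
    )
)
register.add(Hazard(system="Fueling", description="Fuel spill during refueling"))

ramp_hazards = register.for_system("Ramp operations")
```

Filtering by system mirrors the advice to analyze one system at a time; a real register would also need identifiers, owners, and review dates.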
A suggested next step is to list your existing risk controls that are fully implemented. I'm using the term "risk control" here. You may also see these common synonyms: control measure, control.
A useful strategy for listing risk controls is to classify risk controls according to easily recognizable criteria, such as the Hierarchy of Control. We've included more information about this immediately above.
If you classify risk controls according to the Hierarchy of Control, management will be more apt to develop better risk controls because they intuitively understand a control's potential utility by where it sits in the hierarchy. If you are not familiar with the Hierarchy of Controls, here is the list ordered according to their perceived effectiveness:
Another power tip is to classify risk controls according to their function in mitigating risk. This is simple and intuitive. Once I list them, you will immediately see the usefulness of this classification strategy:
An important note regarding this classification. Risk controls may be classified using a combination of two or three of the above classifications, such as
These are very good tips. You can see how descriptive our risk controls can be to the entire team. By using the Hierarchy of Control and the risk control's "purpose or function," we are exposed to a more complete picture of how this risk control mitigates risk. Trying to manage all these classifications will be excruciating and next to impossible using a "spreadsheet SMS." You will need an SMS database to develop a sustainable proactive hazard analysis and monitoring process.
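As a rough sketch of what this double classification could look like in data, the example below tags each risk control with a hierarchy level and one or more function tags. The hierarchy levels follow the commonly cited five-level Hierarchy of Controls, which is my assumption and may differ from the list your organization uses. The function tags ("preventive", "recovery") and the sample controls are hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import IntEnum


class HierarchyLevel(IntEnum):
    """Commonly cited Hierarchy of Controls (assumed here).

    A lower value means a control that is generally perceived as more effective.
    """
    ELIMINATION = 1
    SUBSTITUTION = 2
    ENGINEERING = 3
    ADMINISTRATIVE = 4
    PPE = 5


@dataclass(frozen=True)
class RiskControl:
    name: str
    level: HierarchyLevel
    functions: frozenset  # free-form function tags; a control may carry several


# Hypothetical sample controls.
controls = [
    RiskControl("Ramp driver training", HierarchyLevel.ADMINISTRATIVE,
                frozenset({"preventive"})),
    RiskControl("High-visibility vests", HierarchyLevel.PPE,
                frozenset({"preventive"})),
    RiskControl("Physical barriers around stands", HierarchyLevel.ENGINEERING,
                frozenset({"preventive", "recovery"})),
]


def strongest_first(items):
    """Sort controls from most to least effective hierarchy level."""
    return sorted(items, key=lambda c: c.level)


def with_function(items, tag):
    """Return the controls that carry the given function tag."""
    return [c for c in items if tag in c.functions]


ranked = strongest_first(controls)
recovery_controls = with_function(controls, "recovery")
```

Storing the function tags as a set lets one control carry two or three classifications at once, which is exactly the kind of bookkeeping that gets painful in a spreadsheet.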
We are progressing further into the proactive hazard analysis process. We now have:
Our next step is to assess each risk. When the risk is unacceptable, additional risk controls will be required. As we have seen from the Hierarchy of Control, the most effective risk control reduces exposure to the hazard by effectively eliminating the hazard or your interaction with the hazard. In the aviation industry, this is not always practical as we would not be able to accomplish the corporate mission unless we "interact with the environment."
Whenever risk controls are added to the system, another system analysis is required before implementing the risk control. This will allow safety teams to predict the residual risk.
Predicted residual risk is the estimated risk after safety requirements are implemented or after all avenues of risk reduction have been explored. Predicted residual risk is based on the assumption that controls are in place and/or all safety requirements are implemented and are valid.
Whenever safety requirements are not documented in your proactive hazard identification process, the predicted residual risk should be the same as the initial risk.
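The relationship between initial risk and predicted residual risk can be sketched in a few lines. This example assumes a simple 5x5 risk matrix scored as likelihood times severity and an acceptance threshold of 6. Both the scale and the threshold are hypothetical; your own risk matrix and acceptance criteria will differ.

```python
# Hypothetical acceptance threshold on a 5x5 matrix (index ranges from 1 to 25).
ACCEPTABLE_MAX = 6


def risk_index(likelihood, severity):
    """Score a risk as likelihood x severity, each rated 1 (lowest) to 5."""
    for value in (likelihood, severity):
        if not 1 <= value <= 5:
            raise ValueError("likelihood and severity must be between 1 and 5")
    return likelihood * severity


def predicted_residual(initial, after_requirements=None):
    """Return the predicted residual (likelihood, severity) rating.

    When no safety requirements are documented, the predicted residual
    risk is the same as the initial risk.
    """
    return initial if after_requirements is None else after_requirements


def is_acceptable(rating):
    """Check a (likelihood, severity) rating against the assumed threshold."""
    return risk_index(*rating) <= ACCEPTABLE_MAX


initial = (4, 3)  # hypothetical initial assessment
no_requirements = predicted_residual(initial)
with_requirements = predicted_residual(initial, after_requirements=(2, 3))
```

In this sketch the initial rating scores 12 and is unacceptable, while the rating predicted after the additional controls scores 6 and passes. The prediction is only as good as the assumption that the controls are in place and valid, which is what monitoring later confirms.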
The process of designing and evaluating additional risk controls is done using the Management of Change Process.
When the predicted residual risk is acceptable after the consideration of additional risk controls, you can implement the risk controls and monitor. And this is where the safety assurance (SA) activities take over.
Monitoring the effectiveness of risk controls is performed by line employees and auditors.
Safety managers intuitively believe that when they have finally created their SMS hazard register, they have reached a milestone in their SMS maturation. This is indeed a congratulatory moment because this is a lot of work. These six words "Hazard Identification" + "Risk Assessment and Mitigation" may take weeks for a safety team to complete.
Luckily, legacy operators don't have to document all of the existing hazards, risks, and risk controls to have a compliant SMS. Legacy operators will have to go through the SRM process to identify activities that are required to be run through the Management of Change process. These activities include:
As you have seen, there are considerable documentation requirements in SRM. Furthermore, most safety managers don't realize that adequate performance of this ICAO SMS component is dependent on more than the safety manager!
I repeat: "Adequate performance of this ICAO SMS component is dependent on more than the safety manager!" Safety managers are not typically subject matter experts of the entire airline or airport operation. Therefore, you will need buy-in and support from other managers to adequately identify hazards and their respective control, mitigation, and recovery measures.
Safety risk management is the rock star of an aviation SMS because these activities are truly achieved when you have sincere acceptance of your SMS.
We have seen hundreds of SMS implementations. Safety managers are results-driven, hard-working professionals. Many are pilots and believe the shortest distance between two points is a straight line. Safety managers are also very structured and detail-oriented. That is what makes them great safety managers.
One potential problem with the four pillars is that the Safety Risk Management pillar is the second component when I believe it should be the last to be considered when implementing an SMS in an existing operation. Aviation service providers already have risk management strategies in place, otherwise, they would not be operating. Too many times we have seen safety managers hurriedly crank out their "hazard identification" and "risk assessment and mitigation" requirements in a vacuum. In short, they perform these tasks with very little involvement from other operational managers and subject matter experts.
Why do safety managers perform the initial hazard identification activities by themselves? An immature aviation SMS doesn't have sincere acceptance from upper management. They may assign a safety manager, which is great, but then these upper managers assume these regulatory SMS requirements are the safety manager's problem. "Leave me alone and take care of it, safety manager."
Consequently, the safety manager does the best he can and cobbles together a hazard register. He may also do a fairly good job at documenting the risk assessments and controls; however, when additional controls are required to reduce risk, he will require the involvement of the other managers, who believe that "safety management is the safety manager's problem." This is certainly a safety culture issue. I'm delighted to see that the mindset is changing slowly in Canada, which is one of the leaders in SMS implementations.
Hazard registers must be regularly reviewed. So if you created a hazard register early on in your SMS implementation, this is not always a bad thing. We have seen best-in-class aviation SMS wait for three years after the start of their SMS implementations to create a hazard register.
After all, it takes time to shape a culture and break down the barriers to resistance. I am not against waiting, but if you don't have a hazard register, you may risk an audit finding.
One approach to dealing with an auditor is to tell him you are researching best practices to implement a hazard register. You are also waiting to gain sincere acceptance from the other managers in order to accurately perform risk assessments and develop additional controls. This approach may buy you some time; however, after three years of saying you have a demonstrable SMS, the auditors are going to expect to see something.
Many of you have pencil-whipped a hazard register together or paid a consultant to give you a spreadsheet with your hazards and risks. You can still use this work to get by, but you will still need to review this hazard register annually. As you review, continue to improve the early work by beefing up documentation and adding control measures to mitigate risk to ALARP.
Now let's quickly look at some best practices to strive for when satisfying the safety risk management requirements. We'll summarize a few of the best practices for each element, which come from the SM ICG SMS Evaluation Tool.
Safety risk management is truly the rock star of SMS implementation. When you have reached this plateau and not simply pencil-whipped this requirement, you have finally achieved an SMS implementation.
SRM provides the design, i.e., the blueprint that managers have prepared to conduct operations in the safest manner practical. Safety assurance (SA) monitors the design and provides feedback on the system to provide management the assurance that their system is designed appropriately. Of course, there will always be bugs. But there is a process in place to continually improve the "system."
An aviation SMS is never complete. It requires considerable work to continually promote the SMS and educate employees about its principles. An aviation SMS implementation is not a sprint, but a marathon.
As you have seen, a hazard register in a spreadsheet is not a best practice. Having the hazard register in an SMS database offers considerable advantages in both the SRM and SA processes. As reported safety issues and audit findings enter the SMS through the SA processes, these items can be classified according to their associated hazard.
Now this is powerful. If you have an SMS database that doesn't have these features, you may want to pay attention. Or if you are a competitor who scours the SMS Pro website for ideas, here is a great tip:
As reported safety issues and audit findings come into the system, you classify them according to one of the hazards from the hazard register. This classification will give you access in real-time to risk controls that mitigate hazard-related consequences. Now, for the really sexy part. During the reactive risk management process, you can now monitor AND evaluate risk controls in real time.
This is an example of the power of an SMS database.
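Here is a minimal sketch of that idea: each reported issue is classified against a hazard, and the hazard links back to its risk controls, so you can see which controls keep showing up in reports. The dictionary layout, field names, and sample reports are hypothetical and do not describe how any specific SMS database works.

```python
from collections import defaultdict

# Hypothetical hazard register: hazard name -> risk controls that mitigate it.
HAZARD_CONTROLS = {
    "Ground vehicles near parked aircraft": ["Ramp driver training", "Stand barriers"],
    "Fuel spill during refueling": ["Fueling checklist"],
}

# Hypothetical reported safety issues, each classified by hazard.
reports = [
    {"id": 1, "summary": "Tug clipped a wingtip",
     "hazard": "Ground vehicles near parked aircraft"},
    {"id": 2, "summary": "Baggage cart near miss",
     "hazard": "Ground vehicles near parked aircraft"},
    {"id": 3, "summary": "Small fuel spill at gate",
     "hazard": "Fuel spill during refueling"},
]


def controls_under_review(reports, hazard_controls):
    """Count, for each risk control, the reported issues tied to its hazard."""
    counts = defaultdict(int)
    for report in reports:
        # Reports classified under an unknown hazard contribute nothing.
        for control in hazard_controls.get(report["hazard"], []):
            counts[control] += 1
    return dict(counts)


review = controls_under_review(reports, HAZARD_CONTROLS)
```

A control that keeps appearing in the counts is a natural candidate for an effectiveness review during the reactive risk management process.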
When you are reluctant to have employees report every minor incident or close call, it is usually due to the inability to easily and quickly manage many reported safety issues at one given time. With an SMS database, you don't have this concern. You can easily manage hundreds of reported safety issues each month using an SMS database.
If you have an SMS database and it is not treating you like it should, consider SMS Pro. We want to be your SMS Partners.
Last updated April 2024.