Introduction
The models used in accident investigation can typically be grouped into three types: sequential, epidemiological, and systemic models. Although the sequential and epidemiological models have contributed to the understanding of accidents, they are not suitable for clarifying the complexities and dynamics of modern sociotechnical systems. In these systems, the interactions and events are connected in complicated ways, and standard safety engineering techniques alone are not sufficient to comprehend accident causation. When analyzing major accidents in process industries, a more systematic and professional model is needed than when supervisors and workers are investigating a normal minor accident in a simple setting.
The purpose of accident investigation
There are several definitions of accidents. In the context of accident investigation, here we will use the definition that an accident is an unplanned, unwanted, but controllable event which disrupts the work process and inflicts injuries.
An accident investigation may have different purposes[1]:
- Identify and describe the true course of events (what, where, when)
- Identify the direct and root causes / contributing factors of the accident (why)
- Identify risk-reducing measures to prevent future, comparable accidents (learning)
- Investigate and evaluate the basis for potential criminal prosecution (blame)
- Evaluate the question of guilt in order to assess the liability for compensation (pay)
In an accident investigation, one tries to obtain answers to the following questions: what happened, why it happened, and how it could have been prevented.
An accident investigation should adopt a systematic approach to identify the factors leading to the accident. In addition, it should examine what improvements are needed in the work environment and in organisational procedures, and clarify the responsibilities of each participant. The use of a systematic approach confers reliability on the investigation and makes it possible to describe comprehensively the course of the accident and all the factors influencing it. In brief, one needs to have rules of conduct for an investigation: who should participate and how to implement the investigation in practice.
After every incident and accident, we should decide what kind of safety measures, guidance, training and information will be needed in the workplace to prevent the same kind of incidents, and who should deal with this information in the first place.
Main techniques for accident investigations and analyses
Simple techniques
Simple accident investigation techniques do not require the users to be safety professionals, i.e. learning these techniques does not require a long period of training or a certified degree. Learning to apply simple techniques only requires orientation and commitment. A typical feature of a simple technique is that the time required is not excessive: it should only take a couple of hours to perform this kind of accident investigation. A good example of these simple techniques is the Finnish model for accident investigation[2]. The Finnish model is not statutory, but it is a practical and easy-to-use tool for accident investigation at workplaces that can be used by non-experts.
In the Finnish model for accident investigation, it is recommended that the accident investigation should be conducted in working groups that include individuals from different levels of the organisation. Answers should be sought to such questions as:
- What happened (description)?
- Where did it happen?
- What were the circumstances at the accident scene?
- Which persons, machines, equipment were involved in the accident?
- What work was being performed when the accident occurred?
- Was there anything unusual in the situation?
The Finnish model for investigating occupational accidents consists of 10 steps:
1. Orientating to the accident case: After an accident has occurred, it is essential to check the scene immediately in order to gather information about what happened. Eyewitnesses should be interviewed and the circumstances can be photographed. All unusual and deviant events and occurrences should be recognised and reported. For example, checking the scene of the accident should include the following points:
   - The names and locations of victim(s), eyewitnesses, and other persons who were working in the area;
   - What was being done and which equipment was being used;
   - The circumstances at the accident scene;
   - The circumstances of the wider working environment in general (e.g. lighting, noise);
   - The level of training of the personnel involved;
   - Organisation of the work and responsibilities of the persons involved.
2. Describing the events in chronological order: The events should be outlined and separated. An easy way to undertake this description is to start with the accident itself. The investigation should be extended backwards until the last "normal" working act was performed; it is not enough to describe only the event that led to the accident:
   - What were the previous events before the accident occurred?
   - What was the result? (injury type and injured body part)
   - What was the type of the accident?
   - What was the concrete cause of the injury?
3. Gathering information on how the victim was involved with the cause of the injury:
   - The scene and occasion;
   - What he/she was doing before the accident happened.
4. Gathering information on how the cause of the injury was related to the accident cause:
   - The cause of the injury may exist as a part of normal operations, but it may also result from broken or malfunctioning machines and/or equipment, or from equipment wrongly placed in the workplace.
5. Gathering information on contributing factors, i.e. what were the factors that contributed to the accident:
   - Contributing factors should be considered for each event described in step 2;
   - Each event may include more than one contributing factor;
   - Recognition of contributing factors is based on careful inspection at the actual accident scene, instead of guessing from behind an office desk.
6. Gathering information on why the cause of the injury existed and how it came to be present at the accident scene, especially when that is not its permanent location. One should also consider which accident factors contributed to the existence of the cause of the injury.
7. Considering ways to prevent similar accidents from occurring again.
8. Choosing the best measures for preventing similar accidents in the future and considering how best to implement these measures:
   - When several optional measures exist, it is essential to consider which one is the best and most realistic to implement;
   - Choose the person responsible for implementing these measures;
   - Set a schedule for the implementation.
9. Distributing information on the results of the accident investigation at the workplace:
   - It is essential to also inform other departments in addition to those at the scene of the accident, because similar accidents may occur in other locations as well.
10. Following up to confirm that the measures are implemented, and evaluating their impact.
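The final steps of the model (choosing measures, naming a responsible person, setting a schedule, and following up) lend themselves to simple record keeping. The following is a minimal sketch only; the `Measure` record, the `overdue` helper, and the example entries are hypothetical illustrations and are not part of the Finnish model itself.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical record for one preventive measure chosen after an investigation.
@dataclass
class Measure:
    description: str
    responsible: str           # person named as responsible for implementation
    due: date                  # date set in the implementation schedule
    implemented: bool = False  # updated during follow-up


def overdue(measures, today):
    """Return the measures that are not yet implemented and are past their due date."""
    return [m for m in measures if not m.implemented and m.due < today]


# Invented example data, for illustration only.
measures = [
    Measure("Install guard on conveyor", "Maintenance supervisor", date(2024, 3, 1)),
    Measure("Update lock-out instructions", "Safety manager", date(2024, 4, 15),
            implemented=True),
]

for m in overdue(measures, today=date(2024, 5, 1)):
    print(f"Overdue: {m.description} (responsible: {m.responsible}, due {m.due})")
```

A list like this only supports the follow-up; evaluating the impact of each measure still requires judgement at the workplace.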
Advanced techniques
Good examples of more complex and systematic accident investigation techniques are AcciMap, the STAMP model, MTO-analysis and the FRAM method. Each of these advanced techniques requires specialized training before being mastered in practice, and therefore they will only be briefly overviewed in this article.
The AcciMap
The AcciMap accident analysis technique is based on Rasmussen’s risk management framework [3], [4]. Initially, different accident scenarios are selected and the causal chains of events are analysed using a cause-consequence chart. A cause-consequence chart represents a generalisation that aggregates a set of accidental courses of events. Cause-consequence charts have been widely used as the basis for predictive risk analysis[5]. See Figure 1.
Source:[6]
The set that is chosen to be included in a cause-consequence chart is defined by the choice of the critical event, which reflects the release of a well-defined hazard source, such as “loss of containment of hazardous substance”, or “loss of control of accumulated energy”. The critical event connects the causal tree (the logic relation among potential causes) with the subsequent event tree. In this way, the AcciMap serves to identify relevant decision-makers and the normal work situation in which they influence and modulate possible accidents.
The focus of AcciMap is not on the traditional search for the “guilty person”, but on identifying those people in the system who can make decisions resulting in improved risk management and hence in the design of improved system safety.[7]
STAMP
STAMP (Systems-Theoretic Accident Model and Processes) focuses on the role of constraints in safety management. Instead of defining safety in terms of preventing component failure events, safety is defined as a continuous control task to impose the constraints necessary to limit system behaviour to ensure only safe changes and adaptations. Accidents are seen as resulting from inadequate control or enforcement of constraints on safety-related behaviour at each level of the system development and system operations control structures. Therefore, accidents can be understood in terms of why the controls that were in place did not prevent or detect maladaptive changes (e.g. identifying the safety constraints that were violated at each level of the control structure, as well as why the constraints were inadequate or, if they were potentially adequate, why the system was unable to exert appropriate control over their enforcement).
The process leading to an accident (loss event) can be described as an adaptive feedback function that fails to maintain safety as performance changes over time to meet a complex set of goals and values. This adaptive feedback mechanism allows the model to incorporate adaptation as a fundamental property.[8]
MTO-analysis
The basis for the MTO-analysis (Man, Technology and Organization) is that human, organisational, and technical factors are equally important in an accident investigation. The method is based on "Human Performance Enhancement System (HPES)"[9], which will not be described in detail in this article.
The MTO-analysis is based on three methods:
- Structured analysis by use of an event- and cause-diagram;
- Change analysis by describing how events have deviated from earlier events or common practice;
- Barrier analysis by identifying technological and administrative barriers, which have failed or are missing.
The first step in an MTO-analysis is to lay out the events in chronological order and to illustrate the event sequence in a block diagram. The next step is to identify possible technical and human causes of each event and to draw these vertically into each event in the diagram. The third step is to analyse which technical, human or organisational barriers failed or were missing during the accident and to illustrate all missing or failed barriers below the events, as shown in Figure 2.
Source: Adapted by the author[10]
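The three steps of the event- and cause-diagram can also be captured in a simple data structure before the diagram is drawn. The sketch below is illustrative only: the `MtoEvent` record, the `summarise` helper, and the example events are invented for this illustration and are not prescribed by the MTO methodology.

```python
from dataclasses import dataclass, field


# Hypothetical record for one block in the event sequence.
@dataclass
class MtoEvent:
    description: str
    causes: list[str] = field(default_factory=list)           # technical and human causes
    failed_barriers: list[str] = field(default_factory=list)  # failed or missing barriers


def summarise(sequence):
    """Return text lines listing each event with its causes and failed/missing barriers."""
    lines = []
    for step, event in enumerate(sequence, start=1):
        lines.append(f"{step}. {event.description}")
        for cause in event.causes:
            lines.append(f"   cause: {cause}")
        for barrier in event.failed_barriers:
            lines.append(f"   failed/missing barrier: {barrier}")
    return lines


# Invented example sequence, listed in chronological order.
sequence = [
    MtoEvent("Maintenance starts on pump", causes=["Work permit not checked"]),
    MtoEvent("Pump started from control room",
             causes=["Shift handover incomplete"],
             failed_barriers=["Lock-out not applied"]),
    MtoEvent("Worker injured by rotating shaft", failed_barriers=["Guard removed"]),
]

print("\n".join(summarise(sequence)))
```

Change analysis, the second of the three methods, is not represented here; it would require a description of earlier events or common practice to compare against.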
A checklist for identification of failure causes is also part of the MTO-methodology. The checklist contains the following factors: work organisation, work practice, management of work, change procedures, ergonomics/deficiencies in the technology, communication, instructions/procedures, education/competence, and work environment. For each of these failure causes, there is a detailed checklist for basic or fundamental causes.
Functional Resonance Accident Model (FRAM)
The Functional Resonance Accident Model (FRAM) and the associated method provide a way to describe how multiple functions and conditions can combine to produce an adverse outcome[11].
FRAM is based on the following principles:
- The principle of equivalence of successes and failures. FRAM adheres to the resilience engineering view that failures represent the reverse side of the adaptations necessary to cope with real world complexity rather than a failure of normal system functions. Success depends on the ability of organisations, groups and individuals to anticipate risks and to appreciate critical situations, to recognise them in time, and to take appropriate action; failure is due to the temporary or permanent absence of that ability.
- The principle of approximate adjustments. Since the conditions of work never completely match the conditions that have been specified or prescribed, individuals and organisations must always adjust their performance so that it can succeed under the existing conditions, specifically the actual resources and requirements. Since resources (time, manpower, information, etc.) always are limited, such adjustments are invariably approximations rather than exact characterisations.
- The principle of emergence. The variability of normal performance is rarely large enough to be the cause of an accident in itself or even to constitute a malfunction. However, the variability from multiple functions may combine in unexpected ways, leading to consequences that are disproportionally large, hence they produce a non-linear effect. Both failures and normal performance are emergent rather than resultant phenomena, because neither can be attributed to or explained only by referring to the (mal)functions of specific components or parts.
- The principle of functional resonance. The variability of a number of functions may every now and then resonate, i.e., reinforce each other and thereby cause the variability of one function to exceed its normal limits. The consequences may spread through tight couplings rather than via identifiable and enumerable cause-effect links, e.g., as described by the Small World Phenomenon. This can be described as a resonance of the normal variability of functions, hence as functional resonance. The resonance analogy emphasises that this is a dynamic phenomenon, hence not attributable to a simple combination of causal links.
When conducting an accident investigation with FRAM, the explanation is produced by proceeding through the following steps:
Step 1. Identify essential system functions, using normal or accident-free performance as a baseline. This step characterises each function separately but does not try to arrange or order them in any way. The starting point may be existing task analyses, procedures, expert knowledge, etc. The characterisation involves the following six aspects:
- Input (I): that which the function processes or transforms or that which starts the function,
- Output (O): that which is the result of the function, either an entity or a state change,
- Preconditions (P): conditions that must exist before a function can be executed,
- Resources (R): that which the function needs or consumes to produce the output,
- Time (T): temporal constraints affecting the function (with regard to starting time, finishing time or duration),
- Control (C): how the function is monitored or controlled.
Each function may be described by a simple table, which can then be used for further analysis. It is also possible to show the functions graphically using a hexagon to represent each function (FRAM modules, Figure 3).
Source: [12]
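As a minimal sketch of the table-based description, the six aspects of a function could be recorded in a small data structure. The class name `FramFunction`, its field names, and the example function "Isolate pump" are hypothetical and chosen only for illustration; FRAM itself does not prescribe any particular notation for the table.

```python
from dataclasses import dataclass, field


# Hypothetical record type: one FRAM function with its six aspects.
# Each aspect is a list, since a function may need several instances of an aspect.
@dataclass
class FramFunction:
    name: str
    inputs: list[str] = field(default_factory=list)         # I
    outputs: list[str] = field(default_factory=list)        # O
    preconditions: list[str] = field(default_factory=list)  # P
    resources: list[str] = field(default_factory=list)      # R
    time: list[str] = field(default_factory=list)           # T
    control: list[str] = field(default_factory=list)        # C

    def as_table(self):
        """Return (aspect, description) rows; an aspect with no entries is shown as '-'."""
        aspects = [
            ("Input (I)", self.inputs),
            ("Output (O)", self.outputs),
            ("Preconditions (P)", self.preconditions),
            ("Resources (R)", self.resources),
            ("Time (T)", self.time),
            ("Control (C)", self.control),
        ]
        return [(label, "; ".join(values) or "-") for label, values in aspects]


# Invented example function, for illustration only.
isolate_pump = FramFunction(
    name="Isolate pump",
    inputs=["Work order issued"],
    outputs=["Pump isolated"],
    preconditions=["Work permit approved"],
    resources=["Lock-out kit"],
    control=["Isolation procedure"],
)

for aspect, description in isolate_pump.as_table():
    print(f"{aspect:<20}{description}")
```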
Step 2. Characterise the observed variability of system functions, considering both actual and potential variability. The purpose of FRAM is to provide an explanation of the accident in terms of combinations of performance variabilities. The second step is therefore for each function to describe the actual variability during the accident. This may point to other functions that must be characterised as part of the explanation. For instance, if the input to a function came too late, or was of the wrong kind, then the source of that input – i.e., another function – must be described and characterised. This may in turn require even more functions to be described, until one has accounted for the total scenario.
Step 3. Identify and describe the functional resonance from the observed dependencies / couplings among functions and the observed performance variability. The output of the first and the second steps is a list of functions each characterised by two or more of the six aspects. (Note that a function may require several instances of an aspect to be described.) The dependencies among functions can be found by matching or linking their aspects. For example, the output of one function may a) be the input to another function, b) constitute a resource, c) fulfil a pre-condition, or d) enforce a control or time constraint. The result is an overall description of how the functions were linked or coupled in the accident scenario, and therefore a description of how functional variability propagated through the system. In general, the links specify where the variability of one function may have an impact, or how it may propagate. Many such occurrences and propagations of variability may create a resonance effect: although the variability of each function may be below the normal detection threshold, in combination they may become a ‘signal’, hence this constitutes a risk. This step may be supported by a visualisation of how the functions are linked. This kind of visualisation can be valuable in tracing functional dependencies, but the analysis should nevertheless be based on the description of the functions rather than on the graphical representation.
Step 4. Identify barriers for variability (damping factors) and specify required performance monitoring. Barriers are means to prevent an unwanted event from taking place, or to protect against the consequences of an unwanted event [9]. Barriers can be described in terms of barrier systems (the organizational and/or physical structure of the barrier) and barrier functions (the manner by which the barrier achieves its purpose). The four fundamental barrier systems are:
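The matching of aspects described in Step 3 can be sketched in a few lines of code. This is an illustration under strong simplifying assumptions: functions are plain dictionaries, a coupling is assumed whenever an output label is textually identical to an entry in another function's input, precondition, resource, control or time aspect, and all function names and labels are invented. A real analysis rests on the investigators' description of the functions, not on string matching.

```python
# Hypothetical, simplified function descriptions: aspect name -> list of labels.
functions = {
    "Issue work permit": {"output": ["Work permit approved"]},
    "Isolate pump": {
        "input": ["Work order issued"],
        "precondition": ["Work permit approved"],
        "output": ["Pump isolated"],
    },
    "Open pump casing": {
        "precondition": ["Pump isolated"],
        "output": ["Casing opened"],
    },
}

# Aspects through which a function can receive the output of another function.
RECEIVING_ASPECTS = ("input", "precondition", "resource", "control", "time")


def find_couplings(functions):
    """Return (upstream, label, downstream, aspect) tuples wherever an output
    label of one function matches a receiving aspect of another function."""
    couplings = []
    for upstream, up_aspects in functions.items():
        for label in up_aspects.get("output", []):
            for downstream, down_aspects in functions.items():
                if downstream == upstream:
                    continue
                for aspect in RECEIVING_ASPECTS:
                    if label in down_aspects.get(aspect, []):
                        couplings.append((upstream, label, downstream, aspect))
    return couplings


for upstream, label, downstream, aspect in find_couplings(functions):
    print(f"{upstream} --[{label}]--> {downstream} ({aspect})")
```

The resulting list shows where the variability of one function could reach another; whether it actually did so in the accident still has to be judged from the variability characterised in Step 2.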
- Physical barrier systems that block the movement or transportation of mass, energy, or information;
- Functional barrier systems that set up pre-conditions that must be met before an action (by a human and/or machine) can be undertaken;
- Symbolic barrier systems that are indications of constraints on action that are physically present; and
- Incorporeal barrier systems that are indications of constraints on action that are not physically present.
In addition to recommendations for barriers, a FRAM analysis can provide the basis for issuing recommendations on how to monitor performance in order to detect excessive variability. Performance indicators may be developed both for functions and for the couplings between the functions.
Selecting a suitable accident investigation technique
It is essential that workplaces have a plan for how to investigate accidents. Irrespective of the technique, it is important that those persons who are involved in accident investigation know how to conduct the investigation and are aware of the guidelines for investigating accidents in their workplace. The persons who participate in these investigations should be named (usually safety managers and supervisors), and in addition a worker from the accident scene may usefully be included in the investigation.
When selecting a suitable technique for accident investigation, there should be at least one person who has a good knowledge about the different accident investigation techniques suitable for use in their work environment, and who is able to choose the proper method for each case. Some minor accidents may not need to be investigated in the same kind of depth as those that have led to serious injuries.
Some basic practical guidance on investigating an accident can be found in the publication: Investigating accidents and incidents.[13]
Conclusions
Accidents and near misses almost never result from one single cause; most accidents involve multiple, interrelated causal factors. All actors or decision-makers influencing the normal work process might also influence accident scenarios, either directly or indirectly. This complexity should also be reflected in the accident investigation process. The aim of accident investigations should be to identify the event sequences and all (causal) factors influencing the accident scenario in order to be able to propose risk-reducing measures which may prevent future accidents.[14]
Often, accident investigations involve using a set of accident investigation methods. Each method might have a different purpose and may make its own contribution to the total investigation process. It is important to remember that every piece of a puzzle is as significant as the others.
Graphical illustrations of the event sequence are useful during the investigation process because they provide an effective visual aid that summarises key information and provides a structured method for collecting, organising and integrating collected evidence to facilitate communication between the investigators. Graphical illustrations also help to identify information gaps.[15]
During the investigation process, different methods should be used in order to analyse emerging problem areas. There should be at least one member of the multi-disciplinary investigation team who has good knowledge about the different accident investigation methods, and is able to choose the optimal methods for analysing the different problems. [16]