Tools, Technologies and Training for Healthcare Laboratories

Is Risk Analysis Reliable?

As we enter the fall of 2012, we're well into the "roll-out" phase of the new CLSI EP23 guideline. There have been articles, "advertorials", webinars, workshops and more - all about the new Risk Analysis Approach. But what's interesting is that while Risk Analysis is somewhat "new" to US healthcare, it's been done for years in other countries. US laboratories should take serious note of recent articles on the effectiveness of these Risk Analysis techniques in healthcare.

Risk Analysis: Is it Reliable? Is it Valid? Is it too little for too much?

Sten Westgard, MS
September 2012

EP23. Risk Management. Risk Analysis. Quality Control Plans. Individualized Quality Control Plans (IQCP). These are just some of the new terms flooding the regulatory landscape of the US medical laboratory. After the "We blew it" failure of EQC, CMS and CLSI came up with the latest new approach to quality control: developing QC Plans based on Risk Management techniques.

Risk Management has been around for a while. It's not even new to healthcare: Risk Analysis and Failure Mode and Effects Analysis (FMEA) have been implemented in healthcare outside the US for quite some time. Recently, papers have begun to appear that scrutinize the benefits - and costs - of using these Risk Analysis techniques. For laboratories considering the adoption of the new EP23 and IQCPs, it's worth taking a serious look at the following three articles:

Failure mode and effects analysis: are they valid? Shebl NA, Franklin BD, Barber N. BMC Health Services Research 2012;12:150.

Failure mode and effects analysis: too little for too much? Franklin BD, Shebl NA, Barber N. BMJ Quality and Safety 2012;21:607-611.

Is Failure Mode and Effect Analysis Reliable? Shebl NA, Franklin BD, Barber N. J Patient Saf 2009;5:86-94.

It's important to note that these studies focus on FMEA, which is the predominant Risk Analysis technique in use for healthcare processes. Risk Analysis and FMEA are not synonymous: Risk Analysis can be performed with many other techniques, FMEA being just one of them, but the main examples given for Risk Analysis tend to be FMEA. When EP23 talks about concepts like Occurrence and Severity in Risk, those concepts come from FMEA.

Is Risk Analysis by FMEA reliable?

Shebl, Franklin and Barber tackled the subject of FMEA in a novel way: they had two groups perform Risk Analysis by FMEA on the same process and assessed the agreement between groups. In other words, would two groups tackling the same project reach the same answer? In even fewer words, is FMEA reproducible? One of the main strengths of scientific approaches is that the technique should produce the same result, regardless of who carries out the technique. Equations deliver the same answers regardless of the operator (barring mistakes of course). Instruments are designed to give the same answer on a specimen regardless of the operator. We would like to find the same strength in FMEA.

The authors established two teams to study the same process: the use of vancomycin and gentamicin. The teams were expected to go through several Risk Analysis steps: map the process, identify failure modes, assess the risk of those failure modes, and propose improvements to mitigate the risks. Unfortunately, the two groups produced significantly different results:

FMEA Step                                      Group 1             Group 2
Mapping the process                            8 steps             10 steps
Identifying sub-processes                      23 sub-processes    29 sub-processes
Calculating Total Risk Priority Number (RPN)   1165                4518
Identifying Failure Modes                      approximately 50    approximately 50
  (only 17 failure modes in common; no agreement between the two groups on the top 5 failure modes)
Identifying the Causes of Failure Modes        21                  32
  (only 10 causes in common to both groups)
Making Recommendations for Risk Mitigation     26                  39
  (only 9 recommendations common to both groups)

To paraphrase the table, the two groups differed significantly in every aspect of the FMEA process. They mapped out different process steps, identified different failure modes, assigned different ranks and priorities to those failure modes, identified different causes for those failure modes, and made different recommendations for Risk Mitigation. To cite just one stark difference: Group 2 assessed the risks of the process to be nearly 4 times higher than Group 1 did.

The authors summarized it this way: "The results of this study call into question the reliability of the FMEA because its outcomes cannot be repeated; instead the results appear to depend on the individual teams' experience, knowledge, and perceptions. The fact that different groups identify different high risk failures makes it impossible to tell which failures should be addressed and thus where money, time, and effort should be allocated to avoid these failures."

Is Risk Analysis by FMEA valid?

Shebl, Franklin, and Barber have published a new study that expands their evaluation of FMEA. While this study is again based on a duplicated FMEA of the use of vancomycin and gentamicin, this time the results of the FMEA were further analyzed in four ways:

  1. Face validity: "refers to the investigator's or an expert panel's subjective assessment of the presentation and relevance of the tool in question." This was evaluated by comparing the FMEA teams' results against an independent researcher's study of the same process. The assessment concentrated on the validity of the workflow or process map.
  2. Content validity: "involves the judgment, usually by an expert panel, about the extent to which the contents of the FMEA results appear to examine and include the domains it is intended to measure." Here, Shebl et al were able to get three different medical consultants to review the FMEA and determine if anything was left out.
  3. Criterion validity: "refers to the extent to which the method correlates with other measures of the same variable." In other words, does the FMEA match up with observable reality and real-world data? The study authors compared the FMEA findings to actual recorded incidents from the relevant healthcare institution's incident database, covering 2006 through 2009.
  4. Construct validity: This is a mathematical assessment of the results of the FMEA process. The scoring system - calculating RPNs - was evaluated to see if failures were prioritized correctly.

This study found that, in general, the FMEA met the conditions of face validity: a research study generally agreed with the process steps defined by the FMEA groups. The findings for content validity, however, were not as positive. Two of the reviewers pointed out failure modes that had not been identified, and disagreed with some of the RPN rankings and priorities assigned to various failure modes. As for criterion validity, 41% of the actual recorded incidents in the incident database involved a failure mode that the FMEA teams had not identified. There was also no significant relationship between the FMEA-predicted probability of occurrence and the actual rate of occurrence; it seems that the FMEA teams generally predicted higher probabilities and severities of failure modes than the incident database actually recorded.

Finally, regarding the construct validity, the authors cite a study by Bowles [Bowles JB: An assessment of RPN prioritization in a failure modes effects and criticality analysis. J IEST 2003, 47:51-56] where the known problems with ordinal scales are discussed:

1. Holes in the scales. Ordinal scales are not continuous. Instead, there are gaps between the valuations, particularly when people assign the number rather than derive it from a measurement. Because an RPN is formed from the product of three factors, many of the possible values can never occur. Thus, we have a warped distribution of the numbers.

2. Duplicate RPN values are not really identical. A failure mode that involves a severity of 8 should be significantly different from a failure mode that has a severity of 2. Yet we can easily construct a scenario where the RPNs of two such failure modes are the same. (Failure Mode A: 10 x 8 x 2 = 160; Failure Mode B: 2 x 8 x 10 = 160; recalling that RPN = OCC x SEV x DET.)

3. Sensitivity to small changes. A one-point change in one of the factors of a failure mode can lead to a much larger change in the resulting RPN. Again, this can distort the comparisons of failure modes and the resulting prioritization. In our previous example, if we increase the SEV of Failure Mode A from 8 to 9, its RPN increases to 180 (a 20-point increase). If instead we change the OCC of Failure Mode B from 2 to 3, its RPN increases to 240 (an 80-point increase).

4. Comparisons of RPNs. Because these factors are ordinal, they are not truly comparable to one another, despite the ranking scales, and "mathematical theory states that they shouldn't be used in arithmetic." Multiplying these factors does not produce meaningful results.
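These ordinal-scale problems are easy to demonstrate with a little arithmetic. The following sketch (in Python, illustrative only and not taken from the cited studies) enumerates every possible RPN on the common 1-10 scoring scales and reproduces the duplicate-value and sensitivity examples above:

```python
from itertools import product

# RPN = OCC x SEV x DET, with each factor scored on an ordinal 1-10 scale.
scores = range(1, 11)
rpns = {occ * sev * det for occ, sev, det in product(scores, repeat=3)}

# 1. Holes in the scale: the 1000 factor combinations reach only 120
#    distinct RPN values, so the distribution of RPNs is warped.
print(len(rpns))  # 120

# 2. Duplicate RPNs are not really identical: very different failure
#    modes can score exactly the same.
mode_a = 10 * 8 * 2   # OCC=10, SEV=8,  DET=2  -> 160
mode_b = 2 * 8 * 10   # OCC=2,  SEV=8,  DET=10 -> 160
print(mode_a == mode_b)  # True

# 3. Sensitivity to small changes: a one-point change in a single factor
#    shifts the RPN by very different amounts depending on the others.
print(10 * 9 * 2 - mode_a)  # SEV 8 -> 9 raises the RPN by 20
print(3 * 8 * 10 - mode_b)  # OCC 2 -> 3 raises the RPN by 80
```

The 120-distinct-values figure is the same warping Bowles describes: most of the 1-1000 range simply cannot occur as an RPN.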

At the end of this review, the study authors conclude:

"Following the result of this study and previous reliability studies, it would not be appropriate to recommend the use of FMEA alone as a tool for preventing patient harm."

Nevertheless, should we stop doing FMEA as part of Risk Analysis?

In a recent editorial, Franklin, Shebl, and Barber tackle the delicate issue of whether we should abandon FMEA entirely. It's undeniable that Risk Management and Risk Analysis are techniques that aren't going to go away. The need is too great. However, is FMEA really fit for purpose? In addition to the problems noted earlier, the authors bring up one more bottom-line obstacle: the time and effort required to perform an FMEA.

"[A] problem in conducting FMEA is that it is very time-consuming. We identified 10 published studies of FMEA in healthcare which stated the number of meetings that had been required. There was an average of eight meetings....which had a mean duration of 1.5h each. Of 26 studies which cited the number of participants, the average was eight....This corresponds to 96h of healthcare professionals' time per FMEA. This number and length of meetings may result in inconsistent attendance due to work schedules and time commitments, resulting in loss of expertise and continuity."

The summary of all their studies is found here:

"In short, FMEA in healthcare is associated with a lack of standardisation in how the scoring scales are used and how failures are prioritised. Different team members and different scoring methods yield dissimilar results, and the concept of multiplying ordinal scales to prioritise failures is mathematically flawed. The FMEA process is subjective, but the use of numerical scores gives an unwarranted impression of objectivity and precision. FMEA is therefore a tool for which there is a lack of evidence. It is surprising that such a commonly used and widely promoted technique within healthcare appears to have no evidence that its outcomes are valid and reliable; particularly as it is used to prioritise patient safety practices and requires so much staff time."

Not a glowing endorsement of a technique that seems bound to impact US laboratory regulations very, very soon.

Conclusion

One key thing to remember is that the EP23 guideline does not mandate the use of FMEA as the Risk Analysis technique, nor does it prescribe the number of factors or the ranking scales to use in an FMEA. Thus, even if Risk Analysis becomes the law of the land, laboratories will have options to incorporate more data-driven techniques and sources of information. As we have shown elsewhere, Six Sigma can provide a strong evidence-based foundation for Risk Analysis.

At the time of this writing, it is still unknown what the exact letter of the law will say about FMEA and Risk Analysis. Rumor has it that a "full" FMEA will not be required, but something far simpler: that instead of a complete Risk Acceptability Matrix, there may only be a "Yes/No" style worksheet; that no team of professionals will be required to identify hazards; and that the laboratory manager alone will have the power to decide whether failure modes are acceptable risks. If all that turns out to be true, unfortunately, this will not avoid the problems of FMEA, and may instead amplify its weaknesses.

There has been a relentless drive to reduce the amount of QC effort required of laboratories and manufacturers, yet little evidence has been offered that performance has improved enough to justify this reduction. Risk Analysis is the latest attempt, following on the heels of the failed policy of EQC. Ultimately, there is no free lunch: quality cannot be assumed, it must be managed. Relying on a flawed technique to reduce QC frequency may indeed reduce laboratory effort, but it is unlikely to reduce patient risk.