
The Detectability Debate

At the Westgard Workshops 2011, a hot topic of discussion was the issue of Detectability: whether or not to include it in the Risk Analysis methodology being promoted to the medical laboratory community. There are several arguments for eliminating detection from Risk Analysis. But should we really do away with Detectability?

The Debate on Detectability.

To count or not? Too much risk without.

June 2011
James O. Westgard, PhD and Sten Westgard, MS

Unfortunately, this is a discussion "deep in the weeds" of Risk Analysis. For those new to the topic, it may be more helpful to review the basic methodology and concepts of Risk Analysis before diving in.

But, if we had to summarize the debate in brief, we would say this: In Risk Analysis, there are two choices of model for Failure Mode and Effects Analysis (FMEA). You can choose a 2-factor model, which includes severity and occurrence (SEV and OCC), or you can choose a 3-factor model, which includes severity, occurrence, and detectability (SEV, OCC, and DET).

There are numerous arguments in favor of either the 2-factor model or the 3-factor model, and our intent here is to summarize and comment on these arguments. But let us admit right now that we have a preference for the 3-factor model in the medical laboratory when the intent is to use risk analysis to develop an Analytic QC Plan.

Why might we ignore detectability?

1. It's simpler.
Two factors are fewer than three, yes? When you use a 2-factor model, you calculate "criticality" (SEV*OCC) or use a graphical table called a Risk Acceptability Matrix to judge the acceptability of risk. When you use a 3-factor model, you calculate a Risk Priority Number (RPN = SEV*OCC*DET) to prioritize the importance of failure modes. ISO 14971 recommends the 2-factor model and the use of the risk acceptability matrix, which further influences the recommendations that may emerge in CLSI EP23.
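To make the difference concrete, here is a minimal sketch in Python of the two calculations. The ratings, the 1-to-10 scale, and the acceptability cut-offs are our own hypothetical illustration, not values taken from ISO 14971 or CLSI EP23.

```python
# Hypothetical ratings for a single failure mode, on a 1-to-10 scale.
SEV, OCC, DET = 7, 3, 4

criticality = SEV * OCC        # 2-factor model: 21
rpn = SEV * OCC * DET          # 3-factor model: 84

# A 2-factor Risk Acceptability Matrix is essentially a lookup table that maps
# (severity, occurrence) to "acceptable" or "not acceptable".
# The bands below are invented for illustration only.
def acceptable(sev, occ):
    if sev >= 7:
        return occ <= 2        # high severity tolerated only if very rare
    if sev >= 4:
        return occ <= 4
    return occ <= 6

print(f"Criticality = {criticality}, RPN = {rpn}, acceptable? {acceptable(SEV, OCC)}")
```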

2. It's hard to estimate detection.
In the usual Risk Analysis approach, a team of experts and users uses a rating scale (typically from 1 to 3, or 1 to 4, or 1 to 5, or 1 to 10) to assign values to the different risk factors. Even if there is consensus agreement from the team, the rating can still be rather arbitrary, and it may end up being little more than a guess.

3. Manufacturers shouldn't estimate Detectability.
When a manufacturer creates a new method or instrument or medical device, their main focus is on reducing occurrence, i.e., preventing failures from occurring. They may make recommendations for detection, but the performance of those measures will often end up in the hands of the end user. They might hope that detectability will be high (i.e., that their product will reliably detect failures and defects when they occur), but the truth may be something else. Their product could end up in an office lab staffed by unskilled technicians, so Detectability may be far lower than it would be in the hands of a skilled laboratory scientist. Since manufacturers don't always know where their instruments will end up, it's a challenge for them to predict Detectability.

4. When the failure occurs, it may not be possible to recover, in which case detectability is irrelevant.
One of our speakers, Tina Krenc, gave this example: If a grenade is thrown into the room where you are, and there is no chance you can get to it before it explodes, does it matter that you've detected the grenade? No. [Note: this is an example of a tightly-coupled process where there is little or no time between an event and its consequence.] What's more important - trying to develop a faster reaction time so you can get to the grenade and throw it out of the room, or trying to develop a barrier so the grenade doesn't get tossed in the first place? The logic of this scenario is that it's better to focus on other factors, particularly problems upstream. If we can concentrate on reducing occurrence, then there will be fewer failures to detect and detectability will not be as important. Sometimes this approach is discussed as "assuming detection is 0" - that once an error occurs, it's already too late, so it is better to focus on preventing the error from occurring in the first place.

5. Are we "double-counting" Detectability?
If we determine the occurrence of harm rather than the occurrence of the failure/hazard, we're already taking Detectability into account. In classical risk analysis, the factor of interest is the “probability of occurrence of a failure.” In the ISO and CLSI guidelines, though, the term “probability of occurrence of harm” is being used. This implies that when we make our ranking of occurrence, we may already be factoring in the detection of the error at that point. So if we rank a failure mode on the basis of probability of occurrence of harm and then also include Detectability as an independent factor, we're "double-counting" that impact.

Why should we include detection?

We find some of the reasons to ignore Detectability in Risk Analysis more compelling than others. Still, we believe that Detectability should be included in Risk Analysis, particularly in medical laboratories and particularly when a laboratory is developing a QC Plan.

1. Do we want the risk model to be simple or simplistic?
Certainly 2 factors are fewer than 3, but dropping one factor isn't going to give a big savings of time or effort in the Risk Analysis process. Keeping Detectability means that you calculate RPN (SEV*OCC*DET) instead of Criticality (SEV*OCC) and use a spreadsheet instead of a Risk Acceptability Matrix (a table), but neither is really beyond the capabilities of today's analysts. If you think about all the variables today's technicians are already managing, this additional step isn't too hard.

2. Detectability is doable.
Medical labs can readily estimate detection for Statistical QC procedures and possibly for patient data QC procedures. Detection can be described using power curves, which are plots of the probability for rejection vs the size of error (or vs the Sigma-metric of the analytic process). The size of a medically important error can be calculated from the quality required for the test (allowable Total Error) and the precision (CV) and accuracy (bias) of the measurement procedure. The performance of the QC procedure can then be expressed by the probability for error detection (Ped), and detectability can be expressed as 1-Ped, the fraction of critical errors that escape detection. That means an SQC procedure with a Ped of 0.90 will provide a 10-fold reduction in errors that impact patient test results; a Ped of 0.99 would provide a 100-fold reduction. In other words, we can get a number for detectability instead of a ranking - and a number that is more evidence-based than a "committee dead-reckoning."
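Here is a minimal sketch of that arithmetic in Python. The quality requirement, bias, CV, and the Ped value are hypothetical; in practice Ped is read from the power curve of the SQC rule actually in use, and the critical-error formula shown (Sigma-metric minus 1.65) is one common form for systematic error.

```python
# Hypothetical performance data for one test.
TEa = 10.0    # allowable Total Error (%), the quality requirement
bias = 1.0    # method inaccuracy (%)
CV = 2.0      # method imprecision (%)

sigma = (TEa - abs(bias)) / CV     # Sigma-metric: 4.5 in this example
dSE_crit = sigma - 1.65            # medically important systematic shift, in SD units

# Ped is an assumed value here; it would normally be read from the power curve
# of the chosen SQC rule at the critical error size above.
Ped = 0.90
undetected = 1 - Ped               # fraction of critical errors that escape detection
reduction = 1 / undetected         # 10-fold reduction for Ped = 0.90

print(f"Sigma-metric = {sigma:.2f}, critical SE = {dSE_crit:.2f} SD")
print(f"Detectability (1 - Ped) = {undetected:.2f}, error reduction = {reduction:.0f}-fold")
```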

3. Manufacturers shouldn't guess at Detectability. Laboratories should make that decision.
This is a good point, and back when EP22 was a possibility, it probably was worth debating whether the risk information that manufacturers supply to customers should include Detectability. Now, however, EP22 is dead and buried, and no one knows if manufacturers will provide any risk information at all. So how will a laboratory make this assessment of Detectability when only the manufacturer knows how the internal control mechanisms work? Certainly the laboratory can request or demand risk information, but it's now safe to assume that the manufacturer will not provide any estimate of Detectability. The laboratory will need to depend on statistical QC and patient data control procedures for which it is able to determine Detectability.

4. Detectability is relevant for the medical laboratory.
The thing about the medical laboratory is that by performing QC, we actually have a chance to detect errors and recover from failures. Lab tests aren't grenades. When they fail, we don't get blown to bits - we get the opportunity to troubleshoot, recalibrate, and possibly retest patient specimens. So a focus on Detectability does have an impact on how we operate in the laboratory.

Stamatis's book on FMEA makes this comment about detection: "if the ability of the controls to detect failure is unknown, or the detection cannot be estimated, then the detection rating should be 10." [ed. the worst possible rating, 10 on a scale of 1 to 10, where 10 indicates no detection at all]

5. Adding Detectability as a factor does not mean it will be “double-counted”.
When we think about occurrence of failures, it may be best to confine ourselves to the occurrence of the defect, not try to squeeze in another factor. The strength of the Risk Analysis approach is that it separates out these factors into independent components; folding them into one another weakens the approach. This "double-counting" point seems to us the least compelling argument. At best, we're fudging the factors. At worst, it seems like an accounting trick or sleight of hand - "I don't need to account for detectability because I've modified one of the other variables instead." (Why not just modify severity to account for both detectability and occurrence - then we'll have just a one-variable problem?)

Why not use both 2- and 3-factor models?

There is room for compromise here. In Six Sigma Risk Analysis, we recommend a process where laboratories start with a 2-factor FMEA (i.e., without Detectability); once the first round of mitigating and reducing risks is complete, the next FMEA is completed with Detectability as a factor. By that second FMEA, the laboratory has implemented the QC Plan and should have a quantitative estimate of error detection that can be used. So this mixed approach would allow labs to start simple, then add more quantitative data at an appropriate time.
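As a rough sketch of that two-pass idea (our own illustration, not a prescribed procedure), the failure modes, ratings, measured Ped values, and the mapping from Ped to a DET rating below are all hypothetical.

```python
# Pass 1: 2-factor FMEA - rank failure modes by Criticality (SEV*OCC).
# Pass 2: after the QC Plan is running, convert the measured probability of
# error detection (Ped) into a DET rating and re-rank by RPN (SEV*OCC*DET).
# All numbers below are hypothetical.

failure_modes = {
    "reagent lot shift": {"SEV": 8, "OCC": 4},
    "calibration drift": {"SEV": 6, "OCC": 3},
}

# Pass 1: criticality ranking, no Detectability yet.
for fm in failure_modes.values():
    fm["criticality"] = fm["SEV"] * fm["OCC"]

def det_rating(ped):
    """Hypothetical mapping: 1 = near-certain detection, 10 = no detection."""
    return min(10, max(1, round(10 * (1 - ped)) + 1))

# Pass 2: Ped estimated from the implemented QC procedures (assumed values).
measured_ped = {"reagent lot shift": 0.90, "calibration drift": 0.50}

for name, fm in failure_modes.items():
    fm["DET"] = det_rating(measured_ped[name])
    fm["RPN"] = fm["SEV"] * fm["OCC"] * fm["DET"]

for name, fm in sorted(failure_modes.items(), key=lambda kv: -kv[1]["RPN"]):
    print(name, fm)
```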

Our final plea is this: Risk Analysis is being proposed as a way to create an Analytical QC Plan in the medical laboratory. It's worth recalling the purpose of QC. What is QC supposed to do? Detect errors. Detect errors with enough warning so that recovery is possible. So error detection is fundamental to the process and we should include Detectability in the Risk Analysis model, particularly to evaluate the residual risk remaining after implementation of a QC Plan.