The debate on performance specification (quality goal) models has been going on for decades. What's the bottom line? If we used permissible uncertainty instead of allowable total error, would our judgments on method acceptability be any different?
Comparing Verdicts from Different Goals, Models, and Specifications - Roche cobas 8000 c701
February 2016
Sten Westgard, MS
As we debate the best way to determine ideal and practical performance specifications (quality goals or quality requirements), it may be useful to come down out of the clouds and look at some actual laboratory examples. In a previous article, we analyzed the Roche cobas 8000 c701 using Sigma-metrics and allowable total error performance specifications (mainly from the US CLIA program). Now we're going to look at the same data, but use different models and specifications. Will the different models mean a different verdict? Let's see...
The original paper can be found here:
Marques-Garcia F, Garcia-Codesal F, del Rosario Caro-Narros M, Contreras-SanFeliciano T. Importance of implementing an analytical quality control system in a core laboratory. Revista de Calidad Asistencial 2015 Nov-Dec;30(6):302-9.
The Sigma-metric analysis can be found here: https://www.westgard.com/cobas-c701.htm
Dr. Rainer Haeckel, Werner Wosniok, and colleagues published a set of permissible limits for measurement uncertainty last year (2015). These include permissible imprecision (pCV%), permissible bias (pB%), and permissible standard and expanded measurement uncertainty (pU%). In addition to that article, the authors have a spreadsheet that provides a more comprehensive set of specifications than their journal article. Note that these target measurement uncertainties are supposed to be based on the level of interest within the reference range. Haeckel, Wosniok et al provide these examples based on some textbook reference ranges and their own example control levels.
We'll calculate measurement uncertainty simply as the square root of the sum of the square of the observed imprecision and the square of the observed bias. This is an unorthodox calculation of measurement uncertainty, since the usual prescription is to eliminate any bias. In this case, we're going to incorporate the bias into the measurement uncertainty calculation as if it were any other source of uncertainty.
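As a quick sketch, that combination can be computed in a few lines of Python (the function name is ours, not from any of the cited papers):

```python
import math

def measurement_uncertainty(cv_pct, bias_pct):
    """Combine observed imprecision (CV%) and observed bias (%)
    into a single uncertainty estimate: sqrt(CV^2 + bias^2)."""
    return math.sqrt(cv_pct**2 + bias_pct**2)

# ALT from the table below: CV = 3.8%, bias = 8.7%
print(round(measurement_uncertainty(3.8, 8.7), 1))  # 9.5, matching the MU column
# Creatinine: CV = 9.6%, bias = 0.5% -- dominated by imprecision
print(round(measurement_uncertainty(9.6, 0.5), 1))  # 9.6
```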
Assay | CV% | Bias% | MU | Permissible CV (pCV%) | Permissible Bias (pB%) | Permissible MU (pU%) | Sigma-metric (CLIA goals) |
Albumin | 4.8 | 0.4 | 4.9 | 3.2 | 2.2 | 3.9 | 2.0 |
Alk Phosphatase | 3.3 | 1.3 | 3.5 | 4.7 | 3.3 | 6.0 | >6 |
ALT | 3.8 | 8.7 | 9.5 | 5.2 | 3.6 | 6.7 | 3.0 |
AST | 2.8 | 1.8 | 3.4 | 5.5 | 3.9 | 6.7 | >6 |
Total Bilirubin | 4.0 | 2.8 | 4.9 | 6.4 | 4.5 | 7.8 | >6 |
Calcium | 2.5 | 0.2 | 2.5 | 2.1 | 1.5 | 2.6 | 1.6 |
Chloride | 1.6 | 2.0 | 2.6 | 1.6 | 1.1 | 1.9 | 1.8 |
Cholesterol | 2.7 | 3.9 | 4.8 | 3.2 | 2.2 | 3.9 | 2.3 |
Creatine Kinase | 3.2 | 1.2 | 3.4 | 6.5 | 4.6 | 7.9 | >6 |
Creatinine | 9.6 | 0.5 | 9.6 | 4.1 | 2.9 | 5.0 | 1.5 |
GGT | 3.4 | 3.4 | 4.8 | 5.8 | 4.0 | 7.1 | 5.6* |
Iron | 2.7 | 1.8 | 3.3 | 5.2 | 3.6 | 6.3 | >6 |
Lipase | 2.5 | 4.5 | 5.2 | 6.0 | 4.2 | 7.3 | >6 |
LDH | 2.6 | 2.5 | 3.5 | 4.1 | 2.8 | 5.0 | >6 |
Magnesium | 2.8 | 2.8 | 4.7 | 2.7 | 1.9 | 3.3 | 5.8* |
Phosphate | 2.8 | 1.5 | 3.1 | 3.7 | 2.6 | 4.5 | 3.3* |
Potassium | 1.9 | 3.5 | 4.0 | 2.5 | 1.8 | 3.1 | >6 |
Total Protein | 2.7 | 0.7 | 2.8 | 2.4 | 1.7 | 2.9 | 3.4 |
Sodium | 1.3 | 0.6 | 1.4 | 1.3 | 0.9 | 1.5 | 2.2 |
Transferrin | 1.5 | 0.5 | 1.5 | 3.8 | 2.7 | 4.7 | 2.3* |
Triglycerides | 2.1 | 5.9 | 6.2 | 5.6 | 3.9 | 6.8 | >6 |
Urea Nitrogen | 2.7 | 1.3 | 3.0 | 4.8 | 3.4 | 5.9 | 2.8 |
Uric Acid | 3.0 | 1.1 | 3.2 | 4.2 | 3.0 | 5.2 | 5.3 |
*indicates that these Sigma-metrics were generated by using Ricos goals or CAP goals - for those analytes a CLIA goal was not available.
Anything highlighted in red exceeds an allowable performance specification.
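The two kinds of verdicts in the table can be sketched as follows, using the ALT row as the example and assuming the CLIA allowable total error for ALT of 20% from the earlier Sigma-metric analysis (function names are illustrative):

```python
import math

def verdict(cv, bias, p_cv, p_bias, p_mu):
    """Flag which permissible limits (Haeckel/Wosniok-style) the
    observed performance exceeds."""
    mu = math.sqrt(cv**2 + bias**2)
    flags = []
    if cv > p_cv:
        flags.append("imprecision")
    if bias > p_bias:
        flags.append("bias")
    if mu > p_mu:
        flags.append("MU")
    return ("unacceptable " + ", ".join(flags)) if flags else "acceptable"

def sigma_metric(tea, bias, cv):
    """Classic Sigma-metric: (TEa - |bias|) / CV."""
    return (tea - abs(bias)) / cv

# ALT row: CV 3.8, bias 8.7 vs pCV 5.2, pB 3.6, pU 6.7
print(verdict(3.8, 8.7, 5.2, 3.6, 6.7))      # unacceptable bias, MU
# Assuming a CLIA TEa of 20% for ALT
print(round(sigma_metric(20, 8.7, 3.8), 1))  # 3.0, matching the last column
```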
Now, we can further evaluate this same performance by making simple comparisons of the imprecision and bias against the desirable specifications for imprecision and bias (Ricos goals).
Assay | CV% | Bias% | Desirable CV | Desirable Bias | Sigma-metric (CLIA goals) |
Albumin | 4.8 | 0.4 | 1.6 | 1.4 | 2.0 |
Alk Phosphatase | 3.3 | 1.3 | 3.2 | 6.7 | >6 |
ALT | 3.8 | 8.7 | 9.7 | 11.5 | 3.0 |
AST | 2.8 | 1.8 | 6.2 | 6.5 | >6 |
Total Bilirubin | 4.0 | 2.8 | 10.9 | 9.0 | >6 |
Calcium | 2.5 | 0.2 | 1.1 | 0.8 | 1.6 |
Chloride | 1.6 | 2.0 | 0.6 | 0.5 | 1.8 |
Cholesterol | 2.7 | 3.9 | 3.0 | 4.1 | 2.3 |
Creatine Kinase | 3.2 | 1.2 | 11.4 | 11.5 | >6 |
Creatinine | 9.6 | 0.5 | 3.0 | 4.0 | 1.5 |
GGT | 3.4 | 3.4 | 6.7 | 11.1 | 5.6* |
Iron | 2.7 | 1.8 | 13.3 | 8.8 | >6 |
Lipase | 2.5 | 4.5 | 16.1 | 11.3 | >6 |
LDH | 2.6 | 2.5 | 4.3 | 4.3 | >6 |
Magnesium | 2.8 | 2.8 | 1.8 | 1.8 | 5.8* |
Phosphate | 2.8 | 1.5 | 4.08 | 3.38 | 3.3* |
Potassium | 1.9 | 3.5 | 2.3 | 1.81 | >6 |
Total Protein | 2.7 | 0.7 | 1.38 | 1.36 | 3.4 |
Sodium | 1.3 | 0.6 | 0.3 | 0.2 | 2.2 |
Transferrin | 1.5 | 0.5 | 1.5 | 1.3 | 2.3* |
Triglycerides | 2.1 | 5.9 | 9.95 | 9.97 | >6 |
Urea Nitrogen | 2.7 | 1.3 | 6.1 | 5.6 | 2.8 |
Uric Acid | 3.0 | 1.1 | 4.3 | 4.9 | 5.3 |
Notice that just by using the simple desirable specifications for the individual components of error, we judge more assays unacceptable than we do with Sigma-metrics.
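This kind of check against the Ricos desirable specifications is even simpler, since no combined uncertainty is involved (a sketch; the function name is ours):

```python
def ricos_verdict(cv, bias, desirable_cv, desirable_bias):
    """Compare observed CV and bias directly against desirable
    (biological-variation-based) specifications."""
    flags = []
    if cv > desirable_cv:
        flags.append("imprecision")
    if bias > desirable_bias:
        flags.append("bias")
    return ("unacceptable " + " and ".join(flags)) if flags else "acceptable"

# Albumin: CV 4.8 vs desirable 1.6; bias 0.4 vs desirable 1.4
print(ricos_verdict(4.8, 0.4, 1.6, 1.4))  # unacceptable imprecision
# Sodium: CV 1.3 vs 0.3; bias 0.6 vs 0.2
print(ricos_verdict(1.3, 0.6, 0.3, 0.2))  # unacceptable imprecision and bias
```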
Here's an overview of the assays and which ones are unacceptable according to each goal source.
Assay | CV% | Bias% | Target MU | Ricos Goals | Sigma-metric (CLIA goals) |
Albumin | 4.8 | 0.4 | unacceptable imprecision, MU | unacceptable imprecision | 2.0 |
Alk Phosphatase | 3.3 | 1.3 | | unacceptable imprecision | |
ALT | 3.8 | 8.7 | unacceptable bias, MU | | |
AST | 2.8 | 1.8 | | | |
Total Bilirubin | 4.0 | 2.8 | | | |
Calcium | 2.5 | 0.2 | unacceptable imprecision | unacceptable imprecision | 1.6 |
Chloride | 1.6 | 2.0 | unacceptable imprecision, bias, MU | unacceptable imprecision and bias | 1.8 |
Cholesterol | 2.7 | 3.9 | unacceptable bias, MU | | 2.3 |
Creatine Kinase | 3.2 | 1.2 | | | |
Creatinine | 9.6 | 0.5 | unacceptable imprecision, MU | unacceptable imprecision | 1.5 |
GGT | 3.4 | 3.4 | | | |
Iron | 2.7 | 1.8 | | | |
Lipase | 2.5 | 4.5 | | | |
LDH | 2.6 | 2.5 | | | |
Magnesium | 2.8 | 2.8 | unacceptable imprecision, bias, MU | unacceptable imprecision and bias | |
Phosphate | 2.8 | 1.5 | | | |
Potassium | 1.9 | 3.5 | unacceptable imprecision, bias, MU | unacceptable bias | |
Total Protein | 2.7 | 0.7 | unacceptable imprecision | unacceptable imprecision | |
Sodium | 1.3 | 0.6 | | unacceptable imprecision and bias | 2.2 |
Transferrin | 1.5 | 0.5 | | | 2.3* |
Triglycerides | 2.1 | 5.9 | unacceptable bias | | |
Urea Nitrogen | 2.7 | 1.3 | | | 2.8 |
Uric Acid | 3.0 | 1.1 | | | |
Total unacceptable methods (%) | | | 10 out of 23 (43.5%) | 9 out of 23 (39.1%) | 8 out of 23 (34.8%) |
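The overlap (and disagreement) between the three sets of verdicts can be checked with a quick set comparison, transcribing the unacceptable assays from the summary table:

```python
# Unacceptable assays per goal model, taken from the summary table above
mu_fail = {"Albumin", "ALT", "Calcium", "Chloride", "Cholesterol",
           "Creatinine", "Magnesium", "Potassium", "Total Protein",
           "Triglycerides"}
ricos_fail = {"Albumin", "Alk Phosphatase", "Calcium", "Chloride",
              "Creatinine", "Magnesium", "Potassium", "Total Protein",
              "Sodium"}
sigma_fail = {"Albumin", "Calcium", "Chloride", "Cholesterol",
              "Creatinine", "Sodium", "Transferrin", "Urea Nitrogen"}

# Failure counts out of 23 assays: 10, 9, and 8
print(len(mu_fail), len(ricos_fail), len(sigma_fail))

# Assays flagged by every model
print(sorted(mu_fail & ricos_fail & sigma_fail))
# Assays flagged only by the Sigma-metric
print(sorted(sigma_fail - mu_fail - ricos_fail))
```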
The measurement uncertainty goals are the most stringent, followed by the Ricos imprecision and bias specifications, with Sigma-metrics coming in last. But while the overall number of methods judged unacceptable is nearly the same (a difference of only one or two methods), which methods are judged unacceptable depends on the goals being used.
CLIA goals are often derided as too wide and lenient, but the Sigma-metric calculation is clearly still a demanding benchmark. Judging the individual error components against Ricos goals is slightly more demanding, and the target measurement uncertainty specifications are more challenging still. This raises the question: if one of the latest instruments on the market has between a third and nearly half of 23 biochemistry analytes judged unacceptable, does it matter which set of goals we use? We're not going to turn off half of our testing methods. We may need to accept that our goals are too tight, no matter what the model or how we calculate performance, and scale back our expectations and, most importantly, those of our clinicians. We may need to re-educate clinicians on how they interpret these test results, because they may currently be making a lot of decisions based on noise, not signal.