Sigma Metric Analysis
Evaluation of a Chinese Automated Chemistry Analyzer
A 2013 study from the Journal of Laboratory Medicine and Quality Assurance evaluated the performance of a Chinese-manufactured automated chemistry analyzer running Korean reagents. The study produced a lot of r-values in the 0.99 range. Does this mean the analyzer-reagent combination is better than what the West offers?
Sigma-metrics of a Chinese automated chemistry analyzer
- The Precision and Comparison data
- Determine quality requirements at the critical decision level
- Calculate Sigma metrics
- Summary of Performance by Sigma-metrics chart and OPSpecs Chart
- Conclusion
December 2014
Sten Westgard, MS
[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]
This analysis looks at an analyzer probably unknown to the west: a CS-6400 automated chemistry analyzer from Dirui in China:
Evaluation of the CS-6400 Automated Chemistry Analyzer. Hyo-Jun Ahn, Hye-Ryun Kim, and Young-Kyu Sun, J Lab Med Qual Assur 2013;35:36-46.
The Imprecision and Bias Data
Controls from chemTRAK H from MAS were used along with HiSens ProTrol controls from HBI corporation in a precision study that followed the CLSI EP5-A2 guideline. This allowed calculation of within-run, between-day, and total imprecision.
Assay | Level 1 | CV% | Level 2 | CV% |
Albumin | 4.55 | 1.45% | 3.79 | 1.75% |
Alk Phos | 92.1 | 8.78% | 366.4 | 3.19% |
AST | 44.2 | 3.1% | 165.7 | 2.72% |
ALT | 29.4 | 4.0% | 122.3 | 2.88% |
Amylase | 262.8 | 2.54% | 836.2 | 1.59% |
Bilirubin, Direct | 0.39 | 4.84% | 1.44 | 6.34% |
Bilirubin, Total | 0.63 | 4.2% | 2.73 | 7.19% |
Calcium | 6.46 | 4.38% | 9.49 | 3.9% |
Cholesterol | 219.8 | 1.53% | 167.5 | 2.61% |
Creatine Kinase (CK) | 110 | 2.17% | 372.2 | 2.65% |
Chloride | 98.9 | 1.03% | 92.8 | 0.92% |
Creatinine | 1.19 | 3.97% | 4.07 | 3.25% |
GGT | 27.9 | 4.82% | 70.4 | 3.65% |
Glucose | 61.4 | 1.86% | 208.1 | 3.1% |
HDL | 76.1 | 2.53% | 65.1 | 2.34% |
Potassium | 2.77 | 1.7% | 4.38 | 1.0% |
Sodium | 151 | 0.9% | 140.6 | 0.9% |
Phosphate | 2.73 | 2.0% | 5.42 | 1.6% |
Total Protein | 7.13 | 1.8% | 5.48 | 2.3% |
Triglycerides | 88.1 | 5.0% | 57 | 6.7% |
Urea Nitrogen | 13.53 | 2.8% | 39.89 | 2.7% |
Uric Acid | 3.32 | 2.4% | 7.03 | 2.1% |
LDH | 246.4 | 1.2% | 480.7 | 2.3% |
Magnesium | 0.58 | 8.8% | 1.36 | 4.6% |
LDL | 140.1 | 2.5% | 106.8 | 2.9% |
Lipase | 38.6 | 2.4% | 5.9 | 2.6% |
We've got two levels of imprecision measured - it may be that only one of them is the truly medically important decision level. For our purposes, we'll work with all the data, but for a particular laboratory, they might narrow their focus to just one level.
Next, we need the bias data. The study states that 128 serum samples were obtained from healthy adults "and values obtained from DxC800 and Vista500 were used to make comparison."
The study shows the correlation coefficient, slope and y-intercept. The regression equation can be used to determine the difference between the DxC800/Vista500 methods and the CS-6400 methods. We will not duplicate all of that regression data, but we will show just the bias values.
Newlevel = (slope * Oldlevel ) + Y-intercept
As an example, let's take Albumin, where the study determined a slope of 0.75 and a y-intercept of 1.34. If we use the regression equation at levels of 4.55 and 3.79, this is what we see for bias:
Newlevel1 = (0.75 * 4.55 ) + 1.34
Newlevel1 = (3.41 ) + 1.34
Newlevel1 = 4.75
The bias between the old and new levels is the absolute value of their difference: |4.75 - 4.55| = 0.20
This is a 4.37% bias at the level of 4.55.
Newlevel2 = (0.75 * 3.79 ) + 1.34
Newlevel2 = (2.84) + 1.34
Newlevel2 = 4.18
The bias between the old and new levels is the absolute value of their difference: |4.18 - 3.79| = 0.39
This is a 10.26% bias at the level of 3.79.
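The Albumin arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names `predicted_level` and `percent_bias` are made up for this example, and the slope and y-intercept are the rounded values quoted in the text. Because of that rounding, the percentages it prints may differ slightly from the 4.37% and 10.26% shown in the table.

```python
def predicted_level(old_level: float, slope: float, intercept: float) -> float:
    """Value the new method is expected to report, from the regression line."""
    return slope * old_level + intercept


def percent_bias(old_level: float, slope: float, intercept: float) -> float:
    """Absolute difference between the methods, as a percentage of the old level."""
    new_level = predicted_level(old_level, slope, intercept)
    return abs(new_level - old_level) / old_level * 100.0


# Albumin, using the rounded slope and y-intercept quoted in the text.
ALBUMIN_SLOPE = 0.75
ALBUMIN_INTERCEPT = 1.34

for level in (4.55, 3.79):
    new = predicted_level(level, ALBUMIN_SLOPE, ALBUMIN_INTERCEPT)
    bias = percent_bias(level, ALBUMIN_SLOPE, ALBUMIN_INTERCEPT)
    print(f"level {level}: predicted {new:.2f}, bias {bias:.2f}%")
```

The same two functions can be reused for any other assay by swapping in that assay's slope, y-intercept, and control levels.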
Now we'll just fill in all the biases...
Assay | Level 1 | CV% | Bias% | Level 2 | CV% | Bias% |
Albumin | 4.55 | 1.45% | 4.37% | 3.79 | 1.75% | 10.26% |
Alk Phos | 92.1 | 8.78% | 286.24% | 366.4 | 3.19% | 281.66% |
AST | 44.2 | 3.1% | 12.12% | 165.7 | 2.72% | 10.13% |
ALT | 29.4 | 4.0% | 18.65% | 122.3 | 2.88% | 19.93% |
Amylase | 262.8 | 2.54% | 137.03% | 836.2 | 1.59% | 124.70% |
Bilirubin, Direct | 0.39 | 4.84% | 12.93% | 1.44 | 6.34% | 27.51% |
Bilirubin, Total | 0.63 | 4.2% | 12.63% | 2.73 | 7.19% | 5.04% |
Calcium | 6.46 | 4.38% | 0.36% | 9.49 | 3.9% | 6.17% |
Cholesterol | 219.8 | 1.53% | 3.88% | 167.5 | 2.61% | 4.63% |
Creatine Kinase (CK) | 110 | 2.17% | 13.07% | 372.2 | 2.65% | 13.71% |
Chloride | 98.9 | 1.03% | 0.05% | 92.8 | 0.92% | 0.42% |
Creatinine | 1.19 | 3.97% | 10.92% | 4.07 | 3.25% | 0.29% |
GGT | 27.9 | 4.82% | 32.92% | 70.4 | 3.65% | 32.76% |
Glucose | 61.4 | 1.86% | 6.53% | 208.1 | 3.1% | 4.76% |
HDL | 76.1 | 2.53% | 13.94% | 65.1 | 2.34% | 15.66% |
Potassium | 2.77 | 1.7% | 5.56% | 4.38 | 1.0% | 2.55% |
Sodium | 151 | 0.9% | 1.24% | 140.6 | 0.9% | 1.64% |
Phosphate | 2.73 | 2.0% | 2.29% | 5.42 | 1.6% | 1.31% |
Total Protein | 7.13 | 1.8% | 7.59% | 5.48 | 2.3% | 8.26% |
Triglycerides | 88.1 | 5.0% | 8.78% | 57 | 6.7% | 18.98% |
Urea Nitrogen | 13.53 | 2.8% | 19.85% | 39.89 | 2.7% | 5.0% |
Uric Acid | 3.32 | 2.4% | 8.14% | 7.03 | 2.1% | 10.85% |
LDH | 246.4 | 1.2% | 136.14% | 480.7 | 2.3% | 127.49% |
Magnesium | 0.58 | 8.8% | 9.70% | 1.36 | 4.6% | 4.46% |
LDL | 140.1 | 2.5% | 4.65% | 106.8 | 2.9% | 6.31% |
Lipase | 38.6 | 2.4% | 53.91% | 5.9 | 2.6% | 233.50% |
Now you may be wondering about these bias numbers, because some of them are quite large. For example, the Lipase biases are 54% and 233%. What's happening here? The slope is 2.06 and the y-intercept is -20.02. A slope greater than 2 means the new method's results rise more than twice as fast across the range as the comparative method's. Even if you've got a great correlation coefficient, that's a whole lot of proportional error. Now keep in mind this CS-6400 is being compared to the DxC800 and Vista500, well-known analyzers which are expected to give pretty reliable results. What the bias numbers are telling us is that the CS-6400 is not getting the same answer as the Beckman Coulter and Siemens analyzers.
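To see why a high r-value and a large bias can coexist, here is a small Python sketch. The comparative-method values are hypothetical, noise-free numbers invented for illustration (they are not data from the study); only the Lipase slope and y-intercept come from the text. The helper `pearson_r` is written out by hand so the example has no dependencies.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Lipase regression quoted in the text.
SLOPE, INTERCEPT = 2.06, -20.02

# Hypothetical comparative-method results (illustration only).
comparative = [20.0, 40.0, 60.0, 80.0, 100.0]
# What the new method would report if it followed the regression line exactly.
new_method = [SLOPE * x + INTERCEPT for x in comparative]

r = pearson_r(comparative, new_method)
biases = [abs(y - x) / x * 100.0 for x, y in zip(comparative, new_method)]

print(f"r = {r:.4f}")
for x, b in zip(comparative, biases):
    print(f"comparative {x:6.1f}: bias {b:6.1f}%")
```

Any two methods related by a straight line correlate perfectly, whatever the slope. Correlation measures how tightly the points follow a line, not whether that line is the line of identity.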
Determine Quality Requirements at the decision levels
Now that we have our imprecision and bias data, we're almost ready to calculate our Sigma-metrics. We're just missing one key thing: the analytical quality requirement. We're going to use, for the most part, the CLIA proficiency testing criteria, which set specifications in the form of a total allowable error. Where CLIA doesn't regulate an analyte (for example Lipase), we'll use the "Ricos goals" which are based on biologic variation.
For some analytes, CLIA sets a unit-based goal, which then becomes a variable allowable total error across the range of the assay. We'll note where that is taking place.
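As a rough sketch of how a unit-based goal turns into a variable percentage, the Python below converts a fixed concentration limit into a TEa% at a given level. The function name `tea_percent` is made up for this example, and the specific limits used (1.0 mg/dL for Calcium, 0.3 mg/dL or 15% for Creatinine, whichever is greater) are assumptions chosen because they reproduce the TEa values in the table that follows. Check them against the current CLIA criteria before relying on them.

```python
def tea_percent(level, percent_limit=None, unit_limit=None):
    """Allowable total error, as a percentage, at a given concentration.

    percent_limit: goal already expressed in percent (e.g. 15.0)
    unit_limit:    goal expressed in concentration units (e.g. 1.0 mg/dL)
    If both are supplied, the larger one applies ("whichever is greater").
    """
    candidates = []
    if percent_limit is not None:
        candidates.append(percent_limit)
    if unit_limit is not None:
        candidates.append(unit_limit / level * 100.0)
    if not candidates:
        raise ValueError("supply percent_limit, unit_limit, or both")
    return max(candidates)


# Assumed limits (see the note above), evaluated at the study's control levels.
print(f"Calcium at 6.46:    {tea_percent(6.46, unit_limit=1.0):.2f}%")
print(f"Calcium at 9.49:    {tea_percent(9.49, unit_limit=1.0):.2f}%")
print(f"Creatinine at 1.19: {tea_percent(1.19, percent_limit=15.0, unit_limit=0.3):.2f}%")
print(f"Creatinine at 4.07: {tea_percent(4.07, percent_limit=15.0, unit_limit=0.3):.2f}%")
```

The lower the concentration, the larger a fixed unit limit becomes in percentage terms, which is why the same analyte can carry two different TEa values at its two control levels.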
Assay | TEa | Level 1 | CV% | Bias% | TEa | Level 2 | CV% | Bias% |
Albumin | 10% | 4.55 | 1.45% | 4.37% | 10% | 3.79 | 1.75% | 10.26% |
Alk Phos | 30% | 92.1 | 8.78% | 286.24% | 30% | 366.4 | 3.19% | 281.66% |
AST | 20% | 44.2 | 3.1% | 12.12% | 20% | 165.7 | 2.72% | 10.13% |
ALT | 20% | 29.4 | 4.0% | 18.65% | 20% | 122.3 | 2.88% | 19.93% |
Amylase | 30% | 262.8 | 2.54% | 137.03% | 30% | 836.2 | 1.59% | 124.70% |
Bilirubin, Direct | 44.5% | 0.39 | 4.84% | 12.93% | 44.5% | 1.44 | 6.34% | 27.51% |
Bilirubin, Total | 63.5% | 0.63 | 4.2% | 12.63% | 20.0% | 2.73 | 7.19% | 5.04% |
Calcium | 15.48% | 6.46 | 4.38% | 0.36% | 10.54% | 9.49 | 3.9% | 6.17% |
Cholesterol | 10% | 219.8 | 1.53% | 3.88% | 10% | 167.5 | 2.61% | 4.63% |
Creatine Kinase (CK) | 30% | 110 | 2.17% | 13.07% | 30% | 372.2 | 2.65% | 13.71% |
Chloride | 5% | 98.9 | 1.03% | 0.05% | 5% | 92.8 | 0.92% | 0.42% |
Creatinine | 25.21% | 1.19 | 3.97% | 10.92% | 15% | 4.07 | 3.25% | 0.29% |
GGT | 22.11% | 27.9 | 4.82% | 32.92% | 22.11% | 70.4 | 3.65% | 32.76% |
Glucose | 10% | 61.4 | 1.86% | 6.53% | 10% | 208.1 | 3.1% | 4.76% |
HDL | 30% | 76.1 | 2.53% | 13.94% | 30% | 65.1 | 2.34% | 15.66% |
Potassium | 18.05% | 2.77 | 1.7% | 5.56% | 11.42% | 4.38 | 1.0% | 2.55% |
Sodium | 2.65% | 151 | 0.9% | 1.24% | 2.84% | 140.6 | 0.9% | 1.64% |
Phosphate | 10.7% | 2.73 | 2.0% | 2.29% | 10.7% | 5.42 | 1.6% | 1.31% |
Total Protein | 10% | 7.13 | 1.8% | 7.59% | 10% | 5.48 | 2.3% | 8.26% |
Triglycerides | 25% | 88.1 | 5.0% | 8.78% | 25% | 57 | 6.7% | 18.98% |
Urea Nitrogen | 9% | 13.53 | 2.8% | 19.85% | 9% | 39.89 | 2.7% | 5.0% |
Uric Acid | 17% | 3.32 | 2.4% | 8.14% | 17% | 7.03 | 2.1% | 10.85% |
LDH | 20% | 246.4 | 1.2% | 136.14% | 20% | 480.7 | 2.3% | 127.49% |
Magnesium | 25% | 0.58 | 8.8% | 9.70% | 25% | 1.36 | 4.6% | 4.46% |
LDL | 20% | 140.1 | 2.5% | 4.65% | 20% | 106.8 | 2.9% | 6.31% |
Lipase | 37.44% | 38.6 | 2.4% | 53.91% | 37.44% | 5.9 | 2.6% | 233.50% |
It's starting to become clear there are going to be problems with this analyzer. If you look at the last row, Lipase, you can see that the "Ricos goals" set a desirable allowable error of 37.44%, while the bias alone at both control levels far exceeds that number. What does that mean?
With Sigma-metrics we can start to make some sense of the scale of the problem.
Calculate Sigma metrics
Now all the pieces are in place. Remember, this time we have two levels, so we're going to calculate two Sigma metrics.
Remember the equation for Sigma metric is (TEa - bias%) / CV.
Example calculation: for Albumin, with a 10% quality requirement, at the level of 4.55, given 1.45% imprecision and 4.37% bias:
(10 - 4.37) / 1.45 = 5.63 / 1.45 = 3.9 Sigma
At the lower level of 3.79, again with a 10% quality requirement, an imprecision of 1.75% and a bias of 10.26%:
(10 - 10.26) / 1.75 = less than zero / 1.75 = ???
Now, let's be clear that there is no real "negative Sigma." Really, if you reach that point, where the bias exceeds the allowable total error, the assays are just not aiming at the same target. They are getting significantly different answers (now the new method might still be giving consistent, precise answers, but nonetheless they are not the same answers as the comparative method). In cases where this happens, we prefer to put "n/a" for not applicable, rather than an artificial negative Sigma.
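A minimal Python sketch of the calculation, including the "n/a" convention, might look like this. The function name `sigma_metric` and the choice to return `None` for "n/a" are assumptions made for this example.

```python
from typing import Optional


def sigma_metric(tea: float, bias: float, cv: float) -> Optional[float]:
    """Sigma metric = (TEa - bias%) / CV%, with all inputs in percent.

    Returns None (reported as "n/a") when the bias exceeds the allowable
    total error, rather than an artificial negative Sigma.
    """
    if cv <= 0:
        raise ValueError("CV must be greater than zero")
    if bias > tea:
        return None
    return (tea - bias) / cv


def format_sigma(value: Optional[float]) -> str:
    """Render a Sigma metric the way the tables in this article do."""
    return "n/a" if value is None else f"{value:.1f}"


# Albumin, 10% quality requirement, at the two control levels.
print(format_sigma(sigma_metric(tea=10.0, bias=4.37, cv=1.45)))
print(format_sigma(sigma_metric(tea=10.0, bias=10.26, cv=1.75)))
```

Returning `None` rather than a negative number keeps "n/a" results from being averaged or ranked as if they were real metrics.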
So here's the full table with all the metrics, where possible:
Assay | TEa | Level 1 | CV% | Bias% | Level 1 Sigma | TEa | Level 2 | CV% | Bias% | Level 2 Sigma |
Albumin | 10% | 4.55 | 1.45% | 4.37% | 3.9 | 10% | 3.79 | 1.75% | 10.26% | n/a |
Alk Phos | 30% | 92.1 | 8.78% | 286.24% | n/a | 30% | 366.4 | 3.19% | 281.66% | n/a |
AST | 20% | 44.2 | 3.1% | 12.12% | 2.5 | 20% | 165.7 | 2.72% | 10.13% | 3.6 |
ALT | 20% | 29.4 | 4.0% | 18.65% | 0.3 | 20% | 122.3 | 2.88% | 19.93% | 0.03 |
Amylase | 30% | 262.8 | 2.54% | 137.03% | n/a | 30% | 836.2 | 1.59% | 124.70% | n/a |
Bilirubin, Direct | 44.5% | 0.39 | 4.84% | 12.93% | 6.5 | 44.5% | 1.44 | 6.34% | 27.51% | 2.7 |
Bilirubin, Total | 63.5% | 0.63 | 4.2% | 12.63% | 12.1 | 20.0% | 2.73 | 7.19% | 5.04% | 2.1 |
Calcium | 15.48% | 6.46 | 4.38% | 0.36% | 3.5 | 10.54% | 9.49 | 3.9% | 6.17% | 1.1 |
Cholesterol | 10% | 219.8 | 1.53% | 3.88% | 4.0 | 10% | 167.5 | 2.61% | 4.63% | 2.1 |
Creatine Kinase (CK) | 30% | 110 | 2.17% | 13.07% | 7.8 | 30% | 372.2 | 2.65% | 13.71% | 6.1 |
Chloride | 5% | 98.9 | 1.03% | 0.05% | 4.8 | 5% | 92.8 | 0.92% | 0.42% | 5.0 |
Creatinine | 25.21% | 1.19 | 3.97% | 10.92% | 3.6 | 15% | 4.07 | 3.25% | 0.29% | 4.5 |
GGT | 22.11% | 27.9 | 4.82% | 32.92% | n/a | 22.11% | 70.4 | 3.65% | 32.76% | n/a |
Glucose | 10% | 61.4 | 1.86% | 6.53% | 1.9 | 10% | 208.1 | 3.1% | 4.76% | 1.7 |
HDL | 30% | 76.1 | 2.53% | 13.94% | 6.3 | 30% | 65.1 | 2.34% | 15.66% | 6.1 |
Potassium | 18.05% | 2.77 | 1.7% | 5.56% | 7.3 | 11.42% | 4.38 | 1.0% | 2.55% | 8.5 |
Sodium | 2.65% | 151 | 0.9% | 1.24% | 1.6 | 2.84% | 140.6 | 0.9% | 1.64% | 1.4 |
Phosphate | 10.7% | 2.73 | 2.0% | 2.29% | 4.2 | 10.7% | 5.42 | 1.6% | 1.31% | 5.8 |
Total Protein | 10% | 7.13 | 1.8% | 7.59% | 1.4 | 10% | 5.48 | 2.3% | 8.26% | 0.8 |
Triglycerides | 25% | 88.1 | 5.0% | 8.78% | 3.2 | 25% | 57 | 6.7% | 18.98% | 0.9 |
Urea Nitrogen | 9% | 13.53 | 2.8% | 19.85% | n/a | 9% | 39.89 | 2.7% | 5.0% | 1.5 |
Uric Acid | 17% | 3.32 | 2.4% | 8.14% | 3.7 | 17% | 7.03 | 2.1% | 10.85% | 3.0 |
LDH | 20% | 246.4 | 1.2% | 136.14% | n/a | 20% | 480.7 | 2.3% | 127.49% | n/a |
Magnesium | 25% | 0.58 | 8.8% | 9.70% | 1.7 | 25% | 1.36 | 4.6% | 4.46% | 4.5 |
LDL | 20% | 140.1 | 2.5% | 4.65% | 6.2 | 20% | 106.8 | 2.9% | 6.31% | 4.7 |
Lipase | 37.44% | 38.6 | 2.4% | 53.91% | n/a | 37.44% | 5.9 | 2.6% | 233.50% | n/a |
Recall that in industries outside healthcare, on the short-term scale, 3.0 Sigma is the minimum performance for routine use and 6.0 Sigma is considered world class quality. We're looking at the long-term scale for this Sigma-metric calculation, which is 1.5s higher (the short-term scale builds in a 1.5s shift, to allow for "normal process variation"). So we could go as low as 1.5 for the bare minimum acceptability. Still, what this is telling us is that this analyzer has a lot of problem assays, particularly if we compare the performance to some of the leading instruments.
Out of 52 data points for 26 analytes, only 9 (about 17%) are above Six Sigma, while more than 55% of the assays perform below 3 Sigma.
Summary of Performance by Sigma-metrics Method Decision Chart and OPSpecs chart
We can make visual assessments of this performance using a Normalized Sigma-metric Method Decision Chart:
Here we can see that no method is hitting the bull's eye, and a majority of the methods seem to be missing the target. A lot of those dots are actually "off the map" - so far off the chart that they are floating above your monitor.
Now what about QC? How do we monitor and control these methods? For that, we need a Normalized OPSpecs chart:
Most of the methods are simply not controllable. That is, even with the full "Westgard Rules" we probably won't be catching errors when they first occur - it will take a while before we pick them up. We would also need to double or triple the number of controls in use for the troublesome methods, raising the expense of running this instrument. We may even need to increase the frequency of running controls because of the poor performance. However, HDL, CK, Potassium, LDL, and Chloride are methods that can be controlled, some of them with only single rules.
Conclusion
The authors conclude "The CS-6400 with HiSens showed excellent analytical performance (precision, linearity and accuracy). Furthermore results from the CS-6400 were highly correlated with those obtained from similar tests performed on DXC800 and Vista500. Therefore, the CS-6400 is appropriate for tertiary care hospitals where large volumes of test samples must be processed within a short period with minimal cost."
Based on Sigma-metric analysis, we would reach a different conclusion. Yes, the correlations between the methods of the CS-6400 and DXC and Vista are high, but they will generate distinctly different answers. Unless the CS-6400 is being used in isolation, or the biases can be eliminated, this is not a good candidate instrument for a laboratory.