Sigma-metrics of an AU 5800

Back in 2011, we took a look at the Olympus AU 2700 plus. In 2016, it's time to take a look at the more recent models, like the Olympus AU 5800.

Sigma-metrics of a Beckman Coulter Olympus AU 5800

December 2016
Sten Westgard, MS

[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]

[Additional Note: On December 9th, the final paragraph of the conclusion was modified slightly.]

Clinical Laboratory published a Belgian study (at the Hospital AZ Glorieux Ronse) of the Beckman Coulter AU 5800. This is one of the Olympus instruments that Beckman Coulter acquired when it purchased the company some years ago.

Glibert B, Bourleaux V, Peeters R, Reynolds T, Vranken G. Analytical Performance Verification of the Beckman Coulter AU5800 Clinical Chemistry Analyzer Against Recognized Quality Specifications Reveals Relevance of Method Harmonization. Clin Lab 2016;62:57-72.

The Imprecision and Bias Data

"Three types of replication experiments were used. The different experiments were not carried out at the same time....[I]mprecision was studied according to the CLSI EP15-A2 protocol. Firstly, for every control level, three replicate samples were run per day for five consecutive days.
"Secondly, 30 patient serum samples were selected and assigned to a low and high concentration group....Each patient sample was split into two aliquots and analysed in singly. Imprecision was calculated separately for the low and high concentration groups...
"Finally, imprecision was estimated from an IQC cycle of at least 35 working days."

Of the three studies, the long-term performance of the IQC controls is what interests us. It gives the most realistic estimate of what routine IQC performance will actually look like.

The comparison was performed between the AU5800 and the UniCel DxC 800. While these are two field methods, they are in fact two instruments from the same company, Beckman Coulter. Instruments from the same company would typically be expected to perform similarly, with minimal bias between them. However, remember that the AU5800 line originally came from Olympus, while the DxC instruments are from the original Beckman Coulter. So there is reason to investigate the comparability of these two instruments.

"Twenty surplus fresh serum samples selected to have concentrations distributed through the widest part of the analytical measuring range, as described in CLSI protocol EP15-A2, were selected. Samples were collected in a two week period and handled as routine samples. Single measurements were completed as soon as possible one after another on both systems. This experiment is conducted to verify whether a difference between methods is statistically different from zero, is in agreement with expected bias or is in compliance with allowable goals for bias."

From this study, "Linear regression statistics were calculated with the Passing Bablok procedure." Using this regression equation, it's possible to calculate the bias at each of the levels where IQC was run.
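The bias calculation described above can be sketched as follows. This is a minimal illustration, assuming the regression is expressed as test = slope x comparator + intercept; the function name and the slope and intercept values are hypothetical and are not taken from the paper.

```python
def bias_percent_at_level(slope, intercept, level):
    """Percent bias of the test method versus the comparator at one concentration.

    Assumes a regression line of the form: test = slope * comparator + intercept.
    """
    if level <= 0:
        raise ValueError("level must be a positive concentration")
    predicted = slope * level + intercept  # test-method result expected at this level
    return (predicted - level) / level * 100.0


# Hypothetical regression line (NOT from the paper): y = 1.05x + 2.0
example_bias = bias_percent_at_level(slope=1.05, intercept=2.0, level=100.0)
print(f"Bias at 100: {example_bias:.2f}%")
```

Evaluating the same line at each control level gives one bias estimate per IQC level, which is why the table below lists a separate bias for each target value.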

In other words, this study did a very good job of collecting both imprecision and bias data. While the study looked at 24 methods, we're going to focus on the 19 most common chemistry assays.

Assay                  Target Value   CV%    Bias%
Alkaline Phosphatase   133            7.4    17.15
Alkaline Phosphatase   530            2.4    16.29
ALT                    41             4.6    3.56
ALT                    121            2.4    9.14
Amylase                86             1.9    1.07
Amylase                241            1.7    1.67
AST                    45             5.0    9.96
AST                    137            1.5    12.0
Calcium                2.4            1.7    2.08
Calcium                3.41           1.3    1.47
Chloride               103            1.4    1.29
Chloride               93             0.8    0.25
Cholesterol            3.77           0.9    2.0
Cholesterol            7.37           0.9    2.0
Creatine Kinase        157            3.1    6.42
Creatine Kinase        395            1.3    6.77
Creatinine             99             2.7    4.34
Creatinine             453            1.6    4.08
GGT                    58             0.8    36.9
GGT                    178            3.5    37.64
Glucose                5.3            2.4    2.04
Glucose                13.6           2.5    4.46
LD                     154            3.5    20.83
LD                     516            2.1    22.35
Phosphorus             1.59           3.9    2.37
Phosphorus             3.03           2.9    2.67
Potassium              2.72           1.1    6.81
Potassium              4.29           0.7    1.02
Sodium                 148            1.0    0.68
Sodium                 137            1.1    0.73
Total Bilirubin        103            5.0    4.16
Total Bilirubin        26             4.0    13.5
Total Protein          38.4           3.3    9.18
Total Protein          76.6           3.3    5.1
Triglycerides          1.67           2.7    2.99
Triglycerides          3.79           3.4    0.36
Uric Acid              325            4.3    2.89
Uric Acid              586            2.9    3.39

In the absence of context, it's often hard to know if this is good performance. There are a few analytes where bias does seem rather high (Alkaline Phosphatase, AST, GGT, LD). But we'll have to see whether the context of the allowable error is such that it allows that bias.

Determine Quality Requirements at the decision levels

Now that we have our imprecision and bias data, we're almost ready to calculate our Sigma-metrics. We're just missing one critical component: the analytical quality requirement.

Assay                  TEa%    Source            Target Value   CV%    Bias%
Alkaline Phosphatase   30      CLIA              133            7.4    17.15
Alkaline Phosphatase   30      CLIA              530            2.4    16.29
ALT                    20      CLIA              41             4.6    3.56
ALT                    20      CLIA              121            2.4    9.14
Amylase                30      CLIA              86             1.9    1.07
Amylase                30      CLIA              241            1.7    1.67
AST                    20      CLIA              45             5.0    9.96
AST                    20      CLIA              137            1.5    12.0
Calcium                10.42   CLIA              2.4            1.7    2.08
Calcium                7.33    CLIA              3.41           1.3    1.47
Chloride               5       CLIA              103            1.4    1.29
Chloride               5       CLIA              93             0.8    0.25
Cholesterol            10      CLIA              3.77           0.9    2.0
Cholesterol            10      CLIA              7.37           0.9    2.0
Creatine Kinase        30      CLIA              157            3.1    6.42
Creatine Kinase        30      CLIA              395            1.3    6.77
Creatinine             15      CLIA              99             2.7    4.34
Creatinine             15      CLIA              453            1.6    4.08
GGT                    22.11   Ricos desirable   58             0.8    36.9
GGT                    22.11   Ricos desirable   178            3.5    37.64
Glucose                10      CLIA              5.3            2.4    2.04
Glucose                10      CLIA              13.6           2.5    4.46
LD                     20      CLIA              154            3.5    20.83
LD                     20      CLIA              516            2.1    22.35
Phosphorus             10.7    Ricos desirable   1.59           3.9    2.37
Phosphorus             10.7    Ricos desirable   3.03           2.9    2.67
Potassium              18.38   CLIA              2.72           1.1    6.81
Potassium              11.66   CLIA              4.29           0.7    1.02
Sodium                 2.7     CLIA              148            1.0    0.68
Sodium                 2.92    CLIA              137            1.1    0.73
Total Bilirubin        20      CLIA              103            5.0    4.16
Total Bilirubin        26.31   CLIA              26             4.0    13.5
Total Protein          10      CLIA              38.4           3.3    9.18
Total Protein          10      CLIA              76.6           3.3    5.1
Triglycerides          25      CLIA              1.67           2.7    2.99
Triglycerides          25      CLIA              3.79           3.4    0.36
Uric Acid              17      CLIA              325            4.3    2.89
Uric Acid              17      CLIA              586            2.9    3.39

Note that the majority of the performance specifications we use here are CLIA goals, but where CLIA does not provide a goal, we have opted to select the desirable specifications for allowable total error from the biologic variation database ("Ricos goals"). Also note in cases where the CLIA allowable total error is expressed in units, that means that the % goal is different at each level - so those goals are listed in each row of the table, where necessary.
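To illustrate why a unit-based goal turns into a different percentage at each level, here is a small sketch. The fixed limits used (plus or minus 0.5 mmol/L for potassium and plus or minus 4 mmol/L for sodium) are an assumption about the CLIA criteria behind the table; they are not stated in this article, although they do reproduce the percentages listed above. The function name is hypothetical.

```python
def tea_percent_from_units(allowable_units, level):
    """Convert a fixed allowable error (in concentration units) to a percent of the level."""
    if level <= 0:
        raise ValueError("level must be a positive concentration")
    return allowable_units / level * 100.0


# Assumed unit-based limits (not quoted in the article), with the two IQC levels from the table.
unit_goals = {
    "Potassium": (0.5, [2.72, 4.29]),  # assumed +/- 0.5 mmol/L
    "Sodium": (4.0, [148.0, 137.0]),   # assumed +/- 4 mmol/L
}

for assay, (limit, levels) in unit_goals.items():
    for level in levels:
        print(f"{assay} at {level}: TEa = {tea_percent_from_units(limit, level):.2f}%")
```

The lower the concentration, the larger the same fixed limit becomes as a percentage, which is why the low potassium control has a much more generous TEa% than the higher one.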

Now we have all the numbers we need for Sigma-metrics.

Calculate Sigma metrics

Sigma-metrics takes both imprecision and bias into account in a single equation. We're going to calculate Sigma-metrics using both "Ricos goals" and the CLIA goals.

Remember, the equation for the Sigma-metric is (TEa - bias%) / CV, with TEa, bias, and CV all expressed as percentages at the same concentration level.

Example calculation: for Alkaline Phosphatase, with a 30% quality requirement, given 7.4% imprecision and 17.15% bias:

(30 - 17.15) / 7.4 = 12.85 / 7.4 = 1.7 Sigma
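The same arithmetic can be written as a small helper. This is only a sketch of the equation above; the function name is hypothetical, and taking the absolute value of the bias is an assumption made so that a negative bias is penalized the same way as a positive one.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma-metric = (TEa% - |bias%|) / CV%, all at the same concentration level."""
    if cv_pct <= 0:
        raise ValueError("CV must be greater than zero")
    # abs() is an assumption: bias of either sign consumes the error budget.
    return (tea_pct - abs(bias_pct)) / cv_pct


# Alkaline Phosphatase, low control: TEa 30%, bias 17.15%, CV 7.4%
print(round(sigma_metric(30, 17.15, 7.4), 1))

# Alkaline Phosphatase, high control: TEa 30%, bias 16.29%, CV 2.4%
print(round(sigma_metric(30, 16.29, 2.4), 1))
```

Running the helper on both Alkaline Phosphatase levels shows how much the result depends on the CV at each level: a similar bias produces a very different metric once the imprecision drops.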

The Sigma-metric verdict on this Alkaline Phosphatase assay is not good. But this is only one level, and perhaps not the most critical one. The bias, which is the main contributor to the low metric, will be discussed shortly.

But first here's the table with all the Sigma-metrics using the (mostly CLIA) analytical performance goals:

Assay                  TEa%    Source            Target Value   CV%    Bias%    Sigma-metric
Alkaline Phosphatase   30      CLIA              133            7.4    17.15    1.7
Alkaline Phosphatase   30      CLIA              530            2.4    16.29    5.7
ALT                    20      CLIA              41             4.6    3.56     3.6
ALT                    20      CLIA              121            2.4    9.14     4.5
Amylase                30      CLIA              86             1.9    1.07     15.2
Amylase                30      CLIA              241            1.7    1.67     16.7
AST                    20      CLIA              45             5.0    9.96     2.0
AST                    20      CLIA              137            1.5    12.0     5.3
Calcium                10.42   CLIA              2.4            1.7    2.08     4.9
Calcium                7.33    CLIA              3.41           1.3    1.47     4.5
Chloride               5       CLIA              103            1.4    1.29     2.6
Chloride               5       CLIA              93             0.8    0.25     5.9
Cholesterol            10      CLIA              3.77           0.9    2.0      8.9
Cholesterol            10      CLIA              7.37           0.9    2.0      8.9
Creatine Kinase        30      CLIA              157            3.1    6.42     7.6
Creatine Kinase        30      CLIA              395            1.3    6.77     17.9
Creatinine             15      CLIA              99             2.7    4.34     3.9
Creatinine             15      CLIA              453            1.6    4.08     6.8
GGT                    22.11   Ricos desirable   58             0.8    36.9     negative
GGT                    22.11   Ricos desirable   178            3.5    37.64    negative
Glucose                10      CLIA              5.3            2.4    2.04     3.3
Glucose                10      CLIA              13.6           2.5    4.46     2.2
LD                     20      CLIA              154            3.5    20.83    negative
LD                     20      CLIA              516            2.1    22.35    negative
Phosphorus             10.7    Ricos desirable   1.59           3.9    2.37     2.1
Phosphorus             10.7    Ricos desirable   3.03           2.9    2.67     2.8
Potassium              18.38   CLIA              2.72           1.1    6.81     10.5
Potassium              11.66   CLIA              4.29           0.7    1.02     15.2
Sodium                 2.7     CLIA              148            1.0    0.68     2.0
Sodium                 2.92    CLIA              137            1.1    0.73     2.0
Total Bilirubin        20      CLIA              103            5.0    4.16     3.2
Total Bilirubin        26.31   CLIA              26             4.0    13.5     3.2
Total Protein          10      CLIA              38.4           3.3    9.18     0.3
Total Protein          10      CLIA              76.6           3.3    5.1      1.5
Triglycerides          25      CLIA              1.67           2.7    2.99     8.1
Triglycerides          25      CLIA              3.79           3.4    0.36     7.3
Uric Acid              17      CLIA              325            4.3    2.89     3.3
Uric Acid              17      CLIA              586            2.9    3.39     4.7

Two points should be made right away. First, any Sigma-metric above 6 is simply treated the same way as Six Sigma: all methods with Sigma-metrics above 6 are considered world class. Second, the Sigma-metrics that are "below zero" (listed as "negative" in the table) are not necessarily worse, but they do indicate that the bias is significant. Any time the bias exceeds the allowable total error, the Sigma-metric falls below zero. This essentially means that the difference between the comparative method and the new method is so large that they are effectively aiming at different targets. The bigger question arises: which method is correct?

Overall, there are a lot of good metrics here. A few troubling numbers, but not too many of them.
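One way to read a table like this is to sort each metric into a few bands. The sketch below is a hypothetical helper, not part of the original study: the labels are illustrative, treating exactly 6.0 as world class is an assumption, and the 3 Sigma cutoff is simply the threshold this article uses later when counting troublesome methods.

```python
def describe_sigma(sigma):
    """Return an illustrative label for a Sigma-metric value."""
    if sigma < 0:
        # Bias alone exceeds the allowable total error.
        return "negative (bias exceeds TEa)"
    if sigma >= 6:
        # Values above 6 are all treated as Six Sigma.
        return "world class (treated as Six Sigma)"
    if sigma < 3:
        return "below 3 Sigma"
    return "3 to 6 Sigma"


# A few values taken from the table above
for assay, sigma in [("Amylase", 15.2), ("ALT", 3.6), ("Glucose", 2.2)]:
    print(f"{assay}: {sigma} -> {describe_sigma(sigma)}")
```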

When two field methods are compared, the traditional method validation study assigns all the "error" to the new method, assuming that the old method was correct. However, in this unique case, we have two instruments from the same manufacturer but built by different original companies. The Unicel DxC was a true Beckman Coulter instrument, while the Olympus AU5800 is an instrument that Beckman Coulter acquired from Japan. The differences may be primarily methodological - that is, the measurement principles of these methods may be fundamentally different. That doesn't minimize the issue that patient results run on the Olympus may be significantly different than results run on the Beckman Coulter, but it may explain why.

Summary of Performance by Sigma-metrics Method Decision Chart and OPSpecs chart

We can make visual assessments of this performance using a Normalized Sigma-metric Method Decision Chart. In this case, we're going to reduce the number of data points so that we select only one point to plot for each of the 19 assays. That means picking one of the two data points provided. We're guided in this by the decision level recommendations of the Sigma Verification Program, which provides participating labs with not only the analytical performance specifications but also the critical decision levels where those goals should be applied. Using that data, we then plot the selected points on the Normalized MEDx chart:

 2016 Glibert and Vranken AU 5800 Method Decision Chart
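For readers who want to place their own points on a normalized chart, the sketch below shows the usual normalization, assuming the conventional layout in which imprecision is plotted on the x-axis and bias on the y-axis, each expressed as a percentage of the allowable total error. The function names are hypothetical. On such a chart the line for a given Sigma value runs from 100 on the y-axis to 100 divided by that Sigma on the x-axis, so the Sigma-metric can be recovered from the normalized coordinates.

```python
def normalized_point(tea_pct, bias_pct, cv_pct):
    """Return (x, y) for a normalized chart: CV and |bias| as percentages of TEa."""
    if tea_pct <= 0:
        raise ValueError("TEa must be greater than zero")
    x = cv_pct / tea_pct * 100.0         # normalized imprecision
    y = abs(bias_pct) / tea_pct * 100.0  # normalized bias
    return x, y


def sigma_from_normalized(x, y):
    """Recover the Sigma-metric from normalized coordinates."""
    if x <= 0:
        raise ValueError("normalized CV must be greater than zero")
    return (100.0 - y) / x


# Alkaline Phosphatase, low control: TEa 30%, bias 17.15%, CV 7.4%
x, y = normalized_point(30, 17.15, 7.4)
print(f"x = {x:.1f}, y = {y:.1f}, sigma = {sigma_from_normalized(x, y):.1f}")
```

Normalizing is what allows assays with very different quality requirements to share a single chart.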

Now what about QC? How do we monitor and control these methods? For that, we would usually need a Normalized OPSpecs chart. However, because the differences between methods are so significant, we should probably stop and consider our options.

If we assume that the new instrument (AU5800) is "new and improved" then while there are significant biases observed, we may consider them biases in the "right" direction. That is, our old UniCel was significantly biased and we are just now correcting those issues. This means we may need to re-baseline all of our patients, develop new reference ranges, and re-educate our clinicians for any methods where the biases were clinically significant. They need to know that the new method has a new range and their old cutoffs need to be adjusted.

The authors also commented on these significant biases: "Substantial differences were observed between AU and DxC methods for ALP, GGT, and LD. This may be due to differences in calibration and/or methodology. Methods for enzymes on the AU analyser are calibrated, with the calibrator value of GGT and LD traceable to the IFCC reference method and IFCC/IRMM certified reference material. Calibrator values for ALP are traceable to the Beckman Coulter master calibration. By comparison, except for [Amylase], enzyme methods were not calibrated on the DxC analyser."

We can confirm our assumption by doing further experiments - perhaps by obtaining IFCC/IRMM reference materials, where available, for some of these biased methods. Then we can confirm that the new methods are closer to truth. If we still have the old UniCel, we can also confirm that the old methods were more biased from the reference materials.

Once we've decided how to handle this bias, we can calculate our Sigma-metrics again (this time using the bias from the reference methods or materials, or perhaps from a peer group once we have installed the instrument and collected enough data). If we simply zero out the bias, then what was an instrument where 8 of the methods are below 3 Sigma becomes an instrument where only 1 method is below 3 Sigma. In the interim, using a robust series of "Westgard Rules" on those troublesome methods is a practical safeguard.
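The effect of zeroing out the bias can be sketched for a handful of rows from the table above. This is only an illustration of the recalculation, not a re-analysis of the study: the helper name is hypothetical, only four levels are shown, and setting bias to zero is the simplifying assumption discussed in the text rather than a measured result.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma-metric = (TEa% - |bias%|) / CV%."""
    if cv_pct <= 0:
        raise ValueError("CV must be greater than zero")
    return (tea_pct - abs(bias_pct)) / cv_pct


# (assay, level, TEa%, bias%, CV%) copied from the table above
rows = [
    ("Alkaline Phosphatase", 133, 30.0, 17.15, 7.4),
    ("AST", 45, 20.0, 9.96, 5.0),
    ("Total Protein", 38.4, 10.0, 9.18, 3.3),
    ("Sodium", 148, 2.7, 0.68, 1.0),
]

for assay, level, tea, bias, cv in rows:
    observed = sigma_metric(tea, bias, cv)
    zero_bias = sigma_metric(tea, 0.0, cv)  # assumption: bias treated as zero
    print(f"{assay} at {level}: {observed:.1f} -> {zero_bias:.1f} with bias zeroed")
```

With the bias removed, the metric reduces to TEa divided by CV, so any method that stays below 3 Sigma has a precision problem rather than a comparability problem.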

Conclusion

The authors stated "One important finding of the current study is the magnitude of some between method differences, which emphasizes the importance of harmonizing test results. Preferably, in the interest of appropriate patient care, CE-marked reagents measuring the same analyte on different platforms from the same manufacturer, should give comparable results...However, the methods compared were not always identical and for some tests IFCC standardized techniques were only used on the AU platform."

They also concluded, "Although the AU 5800 assays correlated very well with the comparator device, considerable differences between methods were observed for some chemistries, underlining the importance of harmonizing results. Regardless of the biases observed, the AU5800 analyser is considered suitable for performing clinical chemistry testing in medium to high volume laboratories."

Based on Sigma-metric analysis, we would agree with most of those conclusions. Certainly it looks like UniCel users may want to switch away from their current instrument to another one that is more traceable. However, UniCel DxC users cannot assume that a switch from UniCel to Olympus AU instruments is seamless and that the assay ranges will be perfectly comparable. The assumption that you can simply standardize your laboratory by buying all your instruments from the same diagnostic manufacturer is clearly incorrect. Beneath the top level company name, in this case Beckman Coulter, significant differences in performance may exist in the product line, particularly when diagnostic companies are built of conglomerations and consolidations of smaller diagnostic firms.