Tools, Technologies and Training for Healthcare Laboratories

Roche Integra 400+

Continuing in our series on translating method validation studies into Six Sigma metrics, we examine a Roche Integra 400+ chemistry analyzer, with data provided by St. Joseph Hospital in Houston, Texas.

From Method Performance Claims to Six Sigma Metrics: A Chemistry Analyzer

[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]

Recently, the staff at Westgard Web had the pleasure of receiving some real-world laboratory data and calculations from one of our website visitors. Inspired by the Six Sigma calculations presented in the QC application, From Method Validation to Six Sigma: a POC chemistry analyzer, the ED Laboratory Supervisor at St. Joseph Hospital in Houston, Texas decided to perform a similar analysis of their chemistry analyzers.

For the purposes of this application, we are not going to walk through the steps of the calculations (see the previous article links or the recap below to see how these numbers are calculated). We will only present the final results here and discuss them.
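As a quick reference, the metric itself comes from a simple formula: Sigma = (TEa - bias) / CV, with all three terms expressed as percentages at the same decision level. A minimal sketch (the numeric values below are hypothetical, chosen only to illustrate how the choice of CV drives the result):

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric = (allowable total error - |bias|) / CV,
    all expressed as percentages at the same decision level."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# Hypothetical example: TEa = 10%, bias = 1%, within-run CV = 0.5%
print(sigma_metric(10.0, 1.0, 0.5))   # 18.0 - looks spectacular
# Same test with a (hypothetical) between-day CV of 2.0%
print(sigma_metric(10.0, 1.0, 2.0))   # 4.5 - the practical picture
```

The same formula applied with a within-run CV versus a between-day CV is exactly why the preliminary results below look better than they really are.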

What data was used and how was it collected?

The laboratory performed a comparison of methods experiment between a Roche Integra 400+ and a Hitachi 917 in June 2003. The data were collected over one week of sample analysis, and the number of samples varied by analyte.

They also performed two imprecision studies: a within-run replication experiment with 10 samples, and a between-day replication experiment with 25 samples derived from 12 days of control data, except for Dbili and CO2, which were only stable in the QC sample for 5 days.

Preliminary Results

Here is the Method Validation study data along with the preliminary Sigma calculations:

At first glance, these data are amazing: CO2 with a Sigma metric over 100! The lowest metric reported is 5.3, which is still excellent.

However, this is a case where the metrics are too good to be true. The key is to notice where the CV figures come from: these imprecision numbers are from the within-run study. The variation within a single run does not adequately estimate the errors and variation a laboratory will experience run-to-run, shift-to-shift, and day-to-day, so these figures do not represent a realistic or practical assessment of the instrument.

Since we also had the between-day imprecision study, we asked the lab to recalculate with that data:

A More Realistic Picture of Performance

Using these CV figures, the Sigma metrics decrease dramatically (CO2 at the upper level drops from 127 to "only" 12), although some numbers remain very high. We also see cases where the ideal is not being achieved: Amylase, BUN, CO2, and Na (which, as any regular visitor to Westgard Web knows, is an incredibly hard test to control within the requirements set by CLIA).

What does it all mean?

Without doubt, this study has a lot of good news. The instrument performs very well on many of the tests. But where does one go with that news? What control rules should be used? How many controls should be run? This next step is QC Design. The QC Design process determines the best control procedures to use for each test.

As with previous cases, it is immediately apparent that performance varies depending on the level at which the control is run. For instance, glucose has a Sigma metric of 9.0 at a level of 87, and a Sigma metric of over 15 at a level of 301. So which is the real Sigma metric for glucose? Normally, we would recommend finding the single most important decision level for each test, determining the CV and bias at that level, and using those estimates to calculate a single critical Sigma metric. That metric represents the performance of the test at the level you determine is most important for the patient. QC Design would then use those same estimates of CV and bias to determine the best control procedures at that critical level.

However, in this case, we have such a bounty of good news that we can perform QC Design in a simpler fashion. We could describe this as the "Worst Case Scenario" QC Design. Simply take the worst performance for the test (the level with the lowest Sigma metric) and perform QC Design based on those numbers. Whatever control rule you arrive at based on those figures will automatically work for the other levels.
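The "Worst Case Scenario" selection can be sketched in a few lines: compute the Sigma metric at each control level and design QC around the minimum. (The two-level data below are hypothetical, for illustration only.)

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric = (allowable total error - |bias|) / CV, as percentages."""
    return (tea_pct - abs(bias_pct)) / cv_pct

def worst_case_sigma(levels):
    """levels: list of (TEa%, bias%, CV%) tuples, one per control level.
    Returns the lowest Sigma metric across levels; a rule chosen for
    that level will automatically cover the better-performing levels."""
    return min(sigma_metric(tea, bias, cv) for tea, bias, cv in levels)

# Hypothetical two-level example: the low level is the weaker performer
levels = [(10.0, 1.2, 2.5), (10.0, 0.5, 1.0)]
print(worst_case_sigma(levels))  # ~3.52
```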

There are several different QC Design tools available.

In this application, we'll demonstrate the use of EZ Rules®. Here is an example screen, showing the results for calcium:

After entering the Quality Requirement, CV, and bias, and pertinent details about the instrument (the number of controls run, for instance), the EZ Rules® program makes an automatic QC selection. It then presents this control rule and number of control measurements on a series of charts. The first chart is a Critical-Error Graph, which displays the medically important systematic error as well as the Sigma metric:

Note that alternative control rules are also presented. In this case, more complicated multirules could be used, but that would simply be overkill. The 13s rule achieves 97% error detection with virtually no false rejection at all.
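Where does a figure like 97% error detection come from? Under a simplified Gaussian model (our own sketch, not the internals of EZ Rules®), the critical systematic error is ΔSEc = Sigma - 1.65, where 1.65 reflects the conventional 5% maximum defect allowance, and the probability that at least one of N controls exceeds the ±3s limits can be computed directly:

```python
from math import erf, sqrt

def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def ped_13s(delta_se, n=2):
    """Probability that at least one of n control results falls outside
    +/-3s when the method has shifted by delta_se (in SD units)."""
    p_single = (1.0 - phi(3.0 - delta_se)) + phi(-3.0 - delta_se)
    return 1.0 - (1.0 - p_single) ** n

sigma = 6.0                            # hypothetical Sigma metric
delta_se_crit = sigma - 1.65           # medically important shift
print(round(ped_13s(delta_se_crit), 3))  # 0.992 error detection
print(round(ped_13s(0.0), 4))            # 0.0054 false rejection
```

The same calculation at Sigma levels below 4 shows why wide 3s limits are no longer enough there and multirules become necessary.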

The EZ Rules® program also presents an OPSpecs chart. This presents much of the same data as the graph above, but in a slightly simplified format.

The main advantage of the OPSpecs chart is that it shows the "operating point" of the method. Using the CV as the x-coordinate and the bias as the y-coordinate, performance can be plotted on the chart, and any control rule whose line lies above the operating point will achieve the listed error detection. The benefit of this visual simplification is that it lets you project the effects of improved performance (for instance, if bias were zero, what control rules could be used?). For calcium, performance is already nearly ideal, so there isn't much need to project the effect of improvements.

After performing QC Design on all the tests, here are our results:

It may seem like there are a lot of different recommendations, but they boil down to basically three categories. Here is a summary:

For Sigma metrics of 5.0 and above:
Control rule: 13s or 13.5s with N=2
Analytes: ALB, ALKP, ALT, AMY, AST, Tbili, Ca, Creat, K, Gluc, Lip, TP

For Sigma metrics between 4.0 and 5.0:
Control rule: 12.5s with N=2
Analytes: Chloride

For Sigma metrics below 4.0:
Control rule: "Westgard Rules" with N=4
Analytes: CO2, Dbil, BUN, Sodium

The majority of the tests (12 of 17) can be controlled with a single control rule with fairly wide limits - and still this rule would guarantee over 90% error detection. As a side effect of using the wide limits, false rejection would go down dramatically - to essentially zero.
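The three-tier summary above amounts to a simple decision rule. A sketch (thresholds taken from the summary; the rule names are plain strings, and a real selection should still come from a full QC Design with a tool such as EZ Rules®):

```python
def recommend_qc(sigma):
    """Map a test's worst-level Sigma metric to the tiered QC
    recommendation summarized above: (control rule, N)."""
    if sigma >= 5.0:
        return ("13s or 13.5s", 2)        # single rule, wide limits
    if sigma >= 4.0:
        return ("12.5s", 2)               # tighter single-rule limits
    return ("Westgard multirule", 4)      # full "Westgard Rules", N=4

print(recommend_qc(9.0))   # ('13s or 13.5s', 2)
print(recommend_qc(4.5))   # ('12.5s', 2)
print(recommend_qc(3.2))   # ('Westgard multirule', 4)
```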

Observations, Conclusions, and Important Notes

1. Sigma Metrics are not a magic bullet. Beware the salesman who says, "Yeah, our instrument has great Six Sigma metrics." Just because an instrument comes with a claim of Six Sigma or better performance does not eliminate the need for you to examine the data. A Sigma metric is like any other statistic - it can be manipulated. You can still "game the system" to get a high Sigma metric, if you know what you're doing. So as a customer, if you are given a claim of Sigma performance, you must examine the underlying data to make sure the calculations are correct, the assumptions are correct, and the data sets are correct. In the application here, the first presentation of data was unrealistic because it was based on within-run imprecision estimates.

In fact, we predict that as Six Sigma becomes more popular in the diagnostic and healthcare market, these manipulations will occur: manufacturers will offer data based on small sample sizes, over short periods of time, using calibrators, and so on. For the application here, for example, it would have been better to have replication results over a 30-day period. It will be the customer's responsibility to demand that realistic performance data be provided.

2. Current chemistry instrumentation performs well. This is an obvious statement, based on the data we've just seen, but it bears repeating: this analyzer achieved some high Sigma metrics. In fact, it performs so well that many of the tests could be adequately controlled with JUST ONE CONTROL. That option is not presented in the data here because CLIA requires at least two controls. This is an explicit case where regulations are doubling the cost of QC on an instrument.

3. Lab testing can perform much better than POC testing. It's worth noting that there is a marked contrast between the "in-the-lab" performance vs. the "near-patient" testing performance. Witness our recent evaluation of a POC device. Based on the data provided by the manufacturer, that device had negative Sigma metrics. Basically, it was not a stable process. If you give your physician or patient a choice between 6+ sigma performance and no performance at all, what will they choose?

4. Some methods are always going to be trouble. Sodium, BUN, and CO2 were still hard to control. For years, we have reported that these methods are difficult, and the reason is known: the CLIA limits on these analytes are tight. As we've noted in other Six Sigma articles, these analytes are often more easily controlled by Average of Normals control rules.

5. How do you QC a chemistry panel when performance varies test by test? This is a good question. A few of the analytes in this set are going to need Max QC: "Westgard Rules" with at least 4 controls. But the majority of the tests need only a minimum of controls and control rules. If you can choose only one control rule and number of controls for the whole panel, what will you choose? The ideal solution is to allow different QC for different tests; using a different control rule for each test is possible, or at least should be. Where it isn't, the laboratory professional must make a judgment call: design QC for the worst performer, which is overkill for the others, or design QC based on the performance of the majority, knowing that some tests will not be adequately controlled. Ultimately, it is your call.

Recap: What do you need to go from method validation to Six Sigma?

From the Method Validation study provided by the manufacturer:

- Imprecision: the CV, preferably from a long-term (between-day) replication experiment
- Inaccuracy: the bias, estimated from a comparison of methods experiment

From other sources:

- The quality requirement for each test (here, the CLIA allowable total error)