Tools, Technologies and Training for Healthcare Laboratories

A POC Chemistry Device

Now that we know how to translate the manufacturer's performance claims into Six Sigma metrics, let's take a hard look at some real-world data. With a performance study supplied by the manufacturer of a "near-patient" chemistry analyzer, we find out just how good (and how bad) the performance of tests is at the POC.

FROM METHOD PERFORMANCE CLAIMS TO SIX SIGMA METRICS: A POC CHEMISTRY ANALYZER


[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]

Recap: What do you need to go from method validation to Six Sigma?

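From the Method Validation study provided by the manufacturer: the CV at each control level from the precision studies, and the slope and y-intercept from the comparison of methods experiment.

From other sources: the quality requirement for each test and the critical medical decision levels.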

What calculations do I have to perform, and in what order?

  1. Use the regression equation to estimate bias at the levels where precision studies were performed
  2. Find the quality requirement for those levels.
  3. Calculate Six Sigma metrics.

Here is the Method Validation study data from our anonymous instrument:

| Test Name | Control/Level | CV % | Slope | Y-Int | R with Comments |
|---|---|---|---|---|---|
| Glucose | I: 217.9 | 0.79 | 1.0377 | 5.37 | correlation of the instruments is extraordinary at 100% |
| | II: 81.5 | 0.93 | | | |
| BUN | I: 11.3 | 4.11 | 1.0219 | 2.8 | correlation between instruments is almost perfect at 99. |
| | II: 43.3 | 1.07 | | | |
| Creatinine | I: 0.63 | 25.3 | 1.0523 | 0.09 | correlation is almost perfect at 99. |
| | II: 3.28 | 2.92 | | | |
| Creatine Kinase | I: 176.5 | 2.82 | 1.0419 | 44.11 | the correlation coefficient between the analyzers is excellent at 9. |
| | II: 514.3 | 1.68 | | | |
| Sodium | I: 140.6 | 1.14 | 1.1193 | -4.82 | correlation is, again, almost perfect at 99. |
| | II: 118 | 0.64 | | | |
| Potassium | I: 6.18 | 2.08 | 1.0055 | -0.70 | correlation between the two instruments is outstanding at 98. |
| | II: 4.23 | 1.79 | | | |
| tCO2 | I: 25.4 | 10.52 | 0.7339 | 3.54 | (94.4%) The accuracy data shows how noisy the method is in both instruments by the scattering of the data points…. This is inherent to the methodology of measuring tCO2. |
| | II: 12.6 | 12.66 | | | |

At first glance, the report contents are clearly favorable. It’s hard to understand the real meaning of the numbers, but the words the report uses about the correlation are clear: almost perfect, excellent, and outstanding. When the correlation coefficient isn’t that great, it’s not the new instrument’s fault; it’s the fault of all tCO2 methods.

Now, let’s take this manufacturer-supplied data and work with it.

Estimating Bias at the same levels where the Precision studies were performed.

How do you do this? By using the Regression Equation:

Yc = a + bXc, where Yc and Xc represent the test and comparison values, respectively, at a concentration level of interest, b is the slope, and a is the y-intercept. The slope and y-intercept come from the comparison of methods experiment.

Use a level close to the mean of the data where your imprecision study was performed as your Xc value. For instance, for Glucose level I at 217.9, use 220 as the Xc value. Then solve the Regression Equation for Yc. This estimates the value the test method would report when the comparison method reads Xc.

Next, take the value of Yc-Xc, divide it by Xc, and multiply by 100. This gives you a % bias estimate at that level. The tables that follow report the size of the bias without its sign.

At the end of these calculations, you have estimates of bias and CV at the same level.
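The bias estimate described above can be sketched in a few lines of Python. This is only an illustration: the function name `percent_bias` is ours, not part of the manufacturer's study, and the inputs are the slope, y-intercept, and Xc values from the example data.

```python
def percent_bias(slope, intercept, xc):
    """Estimate % bias at comparison level xc using Yc = a + b*Xc."""
    yc = intercept + slope * xc          # value the test method is expected to report
    return (yc - xc) / xc * 100.0        # signed difference as a percentage of Xc

# Glucose level I (Xc = 220) and Potassium level I (Xc = 6.0)
print(round(percent_bias(slope=1.0377, intercept=5.37, xc=220), 1))   # 6.2
print(round(percent_bias(slope=1.0055, intercept=-0.70, xc=6.0), 1))  # -11.1
```

A negative result means the test method reads lower than the comparison method at that level; only the size of the bias is carried forward.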

Here’s what our example data looks like after we’ve performed these calculations:

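| Test Name | Control/Level | CV % | Bias % | Slope | Y-Int | Level used for Xc calculations |
|---|---|---|---|---|---|---|
| Glucose | I: 217.9 | 0.79 | 6.2 | 1.0377 | 5.37 | 220 |
| | II: 81.5 | 0.93 | 10.5 | | | 80 |
| BUN | I: 11.3 | 4.11 | 27.6 | 1.0219 | 2.8 | 11.0 |
| | II: 43.3 | 1.07 | 8.7 | | | 43.0 |
| Creatinine | I: 0.63 | 25.3 | 20.2 | 1.0523 | 0.09 | 0.6 |
| | II: 3.28 | 2.92 | 8.0 | | | 3.2 |
| Creatine Kinase | I: 176.5 | 2.82 | 29.4 | 1.0419 | 44.11 | 175 |
| | II: 514.3 | 1.68 | 12.8 | | | 510 |
| Sodium | I: 140.6 | 1.14 | 8.5 | 1.1193 | -4.82 | 140 |
| | II: 118 | 0.64 | 7.5 | | | 110 |
| Potassium | I: 6.18 | 2.08 | 11.1 | 1.0055 | -0.70 | 6.0 |
| | II: 4.23 | 1.79 | 16.9 | | | 4.0 |
| tCO2 | I: 25.4 | 10.52 | 12.4 | 0.7339 | 3.54 | 25.0 |
| | II: 12.6 | 12.66 | 2.9 | | | 12.0 |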

Note that even after those calculations, it’s still difficult to judge the quality of these methods. Certainly, we can look at methods that have high CV and high bias and wonder about them, but we really don’t have an intuitive feel for what the best values for those quantities should be. That’s why we need a quality requirement for each test.

What’s a quality requirement and where do I find it?

Finding or defining quality requirements is a critical step in the QC Design Process. We refer you to those articles on the website for more explanation. Since we are working with a chemistry instrument, we are in luck. CLIA has defined the quality requirements for all the tests on our new instrument. Let’s add those to our table:

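| Test Name | Control/Level | Q.R. % | CV % | Bias % | Slope | Y-Int | Level used for Xc calculations |
|---|---|---|---|---|---|---|---|
| Glucose | I: 217.9 | 10 | 0.79 | 6.2 | 1.0377 | 5.37 | 220 |
| | II: 81.5 | 10 | 0.93 | 10.5 | | | 80 |
| BUN | I: 11.3 | 18.2 | 4.11 | 27.6 | 1.0219 | 2.8 | 11.0 |
| | II: 43.3 | 9 | 1.07 | 8.7 | | | 43.0 |
| Creatinine | I: 0.63 | 50 | 25.3 | 20.2 | 1.0523 | 0.09 | 0.6 |
| | II: 3.28 | 15 | 2.92 | 8.0 | | | 3.2 |
| Creatine Kinase | I: 176.5 | 30 | 2.82 | 29.4 | 1.0419 | 44.11 | 175 |
| | II: 514.3 | 30 | 1.68 | 12.8 | | | 510 |
| Sodium | I: 140.6 | 2.8 | 1.14 | 8.5 | 1.1193 | -4.82 | 140 |
| | II: 118 | 3.6 | 0.64 | 7.5 | | | 110 |
| Potassium | I: 6.18 | 8.3 | 2.08 | 11.1 | 1.0055 | -0.70 | 6.0 |
| | II: 4.23 | 12.5 | 1.79 | 16.9 | | | 4.0 |
| tCO2 | I: 25.4 | 20 | 10.52 | 12.4 | 0.7339 | 3.54 | 25.0 |
| | II: 12.6 | 41.66 | 12.66 | 2.9 | | | 12.0 |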

One important thing to note is that the CLIA quality requirements are sometimes stated as percentages, but other times they are stated in concentration units, and a fixed number of units works out to a different percentage at each level. That’s why the table presents different quality requirements at different levels.

Now that we’ve added quality requirements, you can already see where there are some tests that aren’t performing so well. For instance, if Potassium has an 8.3% quality requirement at a level of 6.18, having a CV of 2.08 and a bias of 11.1 probably isn’t good. How can you fit the simple addition (2.08 + 11.1) into 8.3?
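As a rough illustration of how a requirement changes with level, the sketch below converts a criterion into a percentage at a given concentration. The function name `requirement_percent` is ours, and the limits used in the examples (9% or 2 units for BUN, 0.5 units for Potassium) are assumptions inferred from the table values rather than figures stated in the study. Where a criterion gives both a percentage and a unit limit, the sketch takes whichever is greater.

```python
def requirement_percent(level, percent=None, absolute=None):
    """Express a quality requirement as a percentage of the given level."""
    candidates = []
    if percent is not None:
        candidates.append(float(percent))
    if absolute is not None:
        # A fixed unit limit becomes a larger percentage at lower levels
        candidates.append(absolute / level * 100.0)
    if not candidates:
        raise ValueError("supply a percent limit, an absolute limit, or both")
    return max(candidates)  # "whichever is greater"

# Assumed limits, inferred from the table rather than stated in the study
print(round(requirement_percent(11.0, percent=9, absolute=2), 1))  # 18.2
print(requirement_percent(43.0, percent=9, absolute=2))            # 9.0
print(requirement_percent(4.0, absolute=0.5))                      # 12.5
```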

In any case, we’re ready to get Six Sigma metrics! Now we’ll really be able to see how the tests stand up.

Calculating Sigma Metrics from Bias, CV and Quality Requirement.

Again, the website has already covered the relationship between Six Sigma Metrics and bias, CV, and quality requirements. There is even a free online calculator on Westgard Web to perform the calculations.
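For readers who want to check the numbers by hand, here is a minimal sketch of the calculation, assuming the usual form of the metric: Sigma = (quality requirement - bias) / CV, with all three terms in percent at the same level. The function name `sigma_metric` is ours.

```python
def sigma_metric(quality_requirement, bias, cv):
    """Sigma = (allowable total error % - |bias %|) / CV %."""
    return (quality_requirement - abs(bias)) / cv

# Creatine Kinase level II, BUN level II, Potassium level I from the example data
print(round(sigma_metric(30, 12.8, 1.68), 1))  # 10.2
print(round(sigma_metric(9, 8.7, 1.07), 2))    # 0.28
print(sigma_metric(8.3, 11.1, 2.08) < 0)       # True
```

The metric goes negative whenever the bias by itself is larger than the quality requirement.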

Let’s see the Sigma Metrics:

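| Test Name | Control/Level | Q.R. % | CV % | Bias % | Sigma Metric | Slope | Y-Int | Level used for Xc calculations |
|---|---|---|---|---|---|---|---|---|
| Glucose | I: 217.9 | 10 | 0.79 | 6.2 | 4.56 | 1.0377 | 5.37 | 220 |
| | II: 81.5 | 10 | 0.93 | 10.5 | negative | | | 80 |
| BUN | I: 11.3 | 18.2 | 4.11 | 27.6 | negative | 1.0219 | 2.8 | 11.0 |
| | II: 43.3 | 9 | 1.07 | 8.7 | 0.28 | | | 43.0 |
| Creatinine | I: 0.63 | 50 | 25.3 | 20.2 | 1.18 | 1.0523 | 0.09 | 0.6 |
| | II: 3.28 | 15 | 2.92 | 8.0 | 2.39 | | | 3.2 |
| Creatine Kinase | I: 176.5 | 30 | 2.82 | 29.4 | 0.21 | 1.0419 | 44.11 | 175 |
| | II: 514.3 | 30 | 1.68 | 12.8 | 10.2 | | | 510 |
| Sodium | I: 140.6 | 2.8 | 1.14 | 8.5 | negative | 1.1193 | -4.82 | 140 |
| | II: 118 | 3.6 | 0.64 | 7.5 | negative | | | 110 |
| Potassium | I: 6.18 | 8.3 | 2.08 | 11.1 | negative | 1.0055 | -0.70 | 6.0 |
| | II: 4.23 | 12.5 | 1.79 | 16.9 | negative | | | 4.0 |
| tCO2 | I: 25.4 | 20 | 10.52 | 12.4 | 0.72 | 0.7339 | 3.54 | 25.0 |
| | II: 12.6 | 41.66 | 12.66 | 2.9 | 3.05 | | | 12.0 |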

At this point, we expect that there may be some shock and incredulity. There are some wild and wide-ranging numbers here, and not many of them are high. Can this data really reflect the performance of an actual method? Remember, this is method validation performance data supplied by the manufacturer of the instrument itself. The manufacturer gave us these numbers. But the manufacturer clearly doesn’t understand how those numbers convert into Sigma metrics.

What does it mean when a test has a NEGATIVE Sigma metric?

Once the Sigma metric falls below zero, the actual value is unimportant. A metric below zero means the bias alone is larger than your quality requirement, before any imprecision is counted. Just looking at the table explains it: for Potassium at level II, where the quality requirement is 12.5%, you can’t have a 16.9% bias and a 1.79% CV. Those two numbers can’t fit within 12.5%.

The final meaning of a negative Sigma metric for a test is this: there is so much variation in that process it can’t provide quality results of any kind. Find a better method.

What does it mean when a test has 2 widely different Sigma metrics?

To those more comfortable with Six Sigma, it is probably disconcerting to find that a single test process has two different Sigma metrics. We are used to encountering just one metric associated with one process. However, it’s certainly not surprising that a test performs differently at different levels. It would be far more unusual if a test performed the same at all the levels of concentration.

For some of the tests, the two different values are close enough to give an overall feeling about the test. Both Sigma metrics for Potassium are negative; that’s bad. For Creatinine, the metrics are 1.18 and 2.39. That gives you a range of performance and an idea that this isn’t a great method, either. But for a method like Creatine Kinase, you’ve got a 10.2 Sigma metric and then a 0.21 Sigma metric. One is great. The other is bad. What does that mean?

Calculating Sigma Metrics at the Critical Medical Decision Level

Remember that these Sigma metrics are calculated at the levels where controls are being run. Are those the best levels to judge the performance of the test? Or are there better, more appropriate levels to use? Ultimately, the Sigma metrics at the control levels matter less. We are more interested in the Sigma performance at the levels where medical decisions are made and where patients are most affected by the test results.

Dr. Bernard Statland has a critical reference for this area. He has graciously allowed us to post some of those values on the website. Using those medical decision levels, we can recalculate the Sigma metrics at medically important levels.

The process for working with the critical medical decision levels is similar to our earlier calculations. We use the regression equation again to estimate Yc and Yc-Xc, by which we obtain a bias estimate. However, for CV, we will need to rely on the precision studies. The practice here is to use the CV estimate which is closest to the critical level. So for glucose, where the known CV values are found at levels of 217.9 and 81.5, and the critical medical decision level is 120, we would use the CV value from the study at 81.5, since that is the closest.

Otherwise, the process is identical. We find quality requirements for that critical level, then we recalculate the Six Sigma metric.

To summarize the steps here:

  1. Find a critical medical decision level.
  2. Use the regression equation to estimate bias at that level.
  3. Pick the closest precision study to estimate CV at that level.
  4. Find the quality requirement for that level.
  5. Calculate Six Sigma metrics.
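The five steps above can be strung together in one small function. This is a sketch under stated assumptions: the name `sigma_at_decision_level` and the dictionary layout for the precision data are ours, and the quality requirement is passed in already expressed as a percentage at the decision level.

```python
def sigma_at_decision_level(xc, slope, intercept, precision, requirement):
    """Return (bias %, CV %, Sigma) at decision level xc.

    precision maps each control level to the CV % observed there.
    """
    # Step 2: bias from the regression equation Yc = a + b*Xc
    bias = abs((intercept + slope * xc - xc) / xc * 100.0)
    # Step 3: borrow the CV from the precision study closest to xc
    nearest_level = min(precision, key=lambda level: abs(level - xc))
    cv = precision[nearest_level]
    # Step 5: Sigma = (requirement - bias) / CV
    return bias, cv, (requirement - bias) / cv

# Glucose at the 120 decision level, with a 10% quality requirement
bias, cv, sigma = sigma_at_decision_level(
    xc=120, slope=1.0377, intercept=5.37,
    precision={217.9: 0.79, 81.5: 0.93}, requirement=10)
print(round(bias, 1), cv, round(sigma, 2))  # 8.2 0.93 1.89
```

Carrying the unrounded bias gives a Sigma of about 1.89 for glucose; rounding the bias to 8.2 first gives 1.94, so small differences from published figures can come from rounding alone.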

Having completed this process for all the tests, here are the final results:

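| Test Name | Control/Level | Q.R. % | CV % | Bias % | Sigma Metric | Slope | Y-Int | Level used for Xc calculations |
|---|---|---|---|---|---|---|---|---|
| Glucose | I: 217.9 | 10 | 0.79 | 6.2 | 4.56 | 1.0377 | 5.37 | 220 |
| | II: 81.5 | 10 | 0.93 | 10.5 | negative | | | 80 |
| | Crit: 120 | 10 | 0.93 | 8.2 | 1.94 | | | 120 |
| BUN | I: 11.3 | 18.2 | 4.11 | 27.6 | negative | 1.0219 | 2.8 | 11 |
| | II: 43.3 | 9 | 1.07 | 8.7 | 0.28 | | | 43 |
| | Crit: 26 | 9 | 4.11 | 12.9 | negative | | | 26 |
| Creatinine | I: 0.63 | 50 | 25.3 | 20.2 | 1.18 | 1.0523 | 0.09 | 0.6 |
| | II: 3.28 | 15 | 2.92 | 8.0 | 2.39 | | | 3.2 |
| | Crit: 1.6 | 18.75 | 25.3 | 10.8 | 0.31 | | | 1.6 |
| Creatine Kinase | I: 176.5 | 30 | 2.82 | 29.4 | 0.21 | 1.0419 | 44.11 | 175 |
| | II: 514.3 | 30 | 1.68 | 12.8 | 10.2 | | | 510 |
| | Crit: 240 | 30 | 2.82 | 22.5 | 2.66 | | | 240 |
| Sodium | I: 140.6 | 2.8 | 1.14 | 8.5 | negative | 1.1193 | -4.82 | 140 |
| | II: 118 | 3.6 | 0.64 | 7.5 | negative | | | 110 |
| | Crit: 135 | 2.96 | 1.14 | 8.35 | negative | | | 135 |
| Potassium | I: 6.18 | 8.3 | 2.08 | 11.1 | negative | 1.0055 | -0.70 | 6.0 |
| | II: 4.23 | 12.5 | 1.79 | 16.9 | negative | | | 4.0 |
| | Crit: 5.8 | 8.6 | 2.08 | 11.5 | negative | | | 5.8 |
| tCO2 | I: 25.4 | 20 | 10.52 | 12.4 | 0.72 | 0.7339 | 3.54 | 25.0 |
| | II: 12.6 | 41.66 | 12.66 | 2.9 | 3.05 | | | 12.0 |
| | Crit: 20 | 25 | 10.52 | 8.9 | 1.53 | | | 20 |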

Conclusion: We wouldn’t want this Point-of-Care device anywhere near us.

Based on the final calculations, at all critical levels, for all tests on the instrument, the Sigma metrics are below 3. As you may recall, in industry, any process below 3 sigma is considered too unstable for routine use. Therefore, your final judgement on this instrument should not be positive, to put it mildly. These tests have far too much variation. The quality required by the tests is not being met by the performance of the instrument.

Postscript: How would you QC this instrument?

For a moment, let’s assume that you already have this instrument and you’re stuck with it – there’s no money in the budget to get a new one for quite some time. If this instrument is the only method to provide test results, you’ll still have to use it, no matter how bad the performance is.

If the Sigma metrics were above 3 sigma, we would recommend using a QC Design or QC Planning tool like the Normalized OPSpecs charts available on the website, or the software programs QC Validator® 2.0, or EZ Rules®. But in this case, performance is so poor that a blanket recommendation will suffice.

For methods below 3 sigma, you want to use the "full Westgard Rules" with as many controls as you can afford: for example, 1:3s/2:2s/R:4s/4:1s/8:x with 4 control measurements or more.
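To make the multirule idea concrete, here is a simplified sketch that checks the most recent control results against the five rules, written here as 1:3s, 2:2s, R:4s, 4:1s and 8:x. It assumes the results have already been converted to z-scores (the number of SDs from the control mean) and treats them as a single stream in time order. It does not separate within-run from across-run checks or different control materials, so it is an illustration rather than a full implementation. The function name `violated_rules` is ours.

```python
def violated_rules(z):
    """Return the rules violated by the latest points in z (oldest first)."""
    def same_side_beyond(n, limit):
        # True when the last n points all lie beyond +limit, or all beyond -limit
        tail = z[-n:]
        return len(tail) == n and (
            all(v > limit for v in tail) or all(v < -limit for v in tail))

    rules = []
    if z and abs(z[-1]) > 3:
        rules.append("1:3s")   # one result beyond 3 SD
    if same_side_beyond(2, 2):
        rules.append("2:2s")   # two in a row beyond 2 SD on the same side
    if len(z) >= 2 and max(z[-2:]) > 2 and min(z[-2:]) < -2:
        rules.append("R:4s")   # one beyond +2 SD and the other beyond -2 SD
    if same_side_beyond(4, 1):
        rules.append("4:1s")   # four in a row beyond 1 SD on the same side
    if same_side_beyond(8, 0):
        rules.append("8:x")    # eight in a row on the same side of the mean
    return rules

print(violated_rules([0.4, -0.8, 2.3, 2.6]))  # ['2:2s']
print(violated_rules([2.1, -2.2]))            # ['R:4s']
```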