Tools, Technologies and Training for Healthcare Laboratories

Real-World Chemistry Data

We recently received a set of data for a chemistry analyzer. An analysis of these numbers gives us an eye-opening glimpse of real-world performance. On some tests, there were Sigma-metrics higher than 20! And yet on one test, the Sigma metric was actually below 1.0!! See which tests were good, which ones were great, and which ones were just plain ugly.



[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]

Below follows a Six Sigma metric evaluation of some real-world chemistry data from a hospital laboratory.

Summary of Analyte Performance

  • Data was gathered on the performance of controls of a chemistry analyzer.
  • CLIA PT criteria were used to define the quality requirements for the analytes.
  • %Bias was estimated as the absolute % difference of the Laboratory mean from the Peer Group mean during a proficiency testing survey. The average SDI was converted into concentration units by multiplying it by the group SD; this number was then divided by the group mean (also in concentration units), and the final result is a bias expressed as a percentage.
  • In addition, %CV was calculated for two controls, Material I and Material II. Where the CVs for Material I and II were similar, the worst (largest) value was used to calculate a "worst-case" scenario for the Sigma Metric. Where the CVs for Material I and II were widely different, an average was used to calculate the Sigma Metric.
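
The bias and CV bookkeeping described in the list above can be sketched in Python. This is a minimal sketch, not the laboratory's actual procedure: the function names and the survey numbers are hypothetical, and the `max_ratio` cutoff for deciding when two CVs count as "widely different" is an assumption, since no numeric threshold is given here.

```python
def percent_bias_from_sdi(avg_sdi, group_sd, group_mean):
    """Convert an average SDI from a PT survey into an absolute %bias."""
    bias_in_units = avg_sdi * group_sd           # SDI -> concentration units
    return abs(bias_in_units) / group_mean * 100.0


def cv_for_sigma(cv_1, cv_2, max_ratio=1.2):
    """Choose the %CV to use in the Sigma calculation from two control materials.

    max_ratio is a hypothetical cutoff: CVs whose ratio is at or below it are
    treated as "similar", anything above it as "widely different".
    """
    low, high = sorted((cv_1, cv_2))
    if high / low <= max_ratio:
        return high                  # similar CVs: use the worst case
    return (cv_1 + cv_2) / 2.0       # widely different CVs: use the average


# Hypothetical survey numbers, for illustration only
print(round(percent_bias_from_sdi(avg_sdi=0.5, group_sd=4.0, group_mean=100.0), 1))
print(cv_for_sigma(0.9, 0.8))   # similar CVs
print(cv_for_sigma(1.0, 3.0))   # widely different CVs
```

With these inputs, the SDI of 0.5 corresponds to a 2.0% bias, the similar pair returns the larger CV, and the widely different pair returns the average.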

| Analyte | CLIA PT Criteria (%) | PT %Bias | %CV |
|---|---|---|---|
| Triglycerides | 25 | 2.0 | 0.9 |
| ALP | 30 | 3.8 | 1.2 |
| Magnesium | 25 | 2.8 | 1.3 |
| Uric Acid | 17 | 1.2 | 1.1 |
| Creatinine | 15 | 1.7 | 1.2 |
| CPK | 30 | 4.6 | 3.1 |
| Glucose | 10 | 1.9 | 0.9 |
| Total Protein | 10 | 2.6 | 1.4 |
| Cholesterol | 10 | 5.8 | 1.3 |
| Chloride | 5 | 2.7 | 0.6 |
| LDH | 20 | 10.5 | 2.0 |
| BUN | 9 | 3.6 | 1.3 |
| Albumin | 10 | 5.6 | 1.5 |
| AST | 20 | 8.9 | 3.9 |
| ALT | 20 | 1.1 | 4.0 |
| T Bilirubin | 20 | 2.9 | 4.9 |
| Amylase | 30 | 3.1 | 2.8 |
| Potassium | (+/- 0.5 mmol/L) | 1.2 | 1.5 |
| Sodium | (+/- 4 mmol/L) | 2.1 | 1.2 |
| Calcium | (+/- 1.0 mg/dL) | 5.2 | 1.0 |

Sigma metrics of analytes

Sigma metrics were calculated using the simple equation: Sigma = (Quality Requirement - Bias) / CV, with all three terms expressed as percentages. More about this way of calculating Sigma-metrics can be found here.
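
The equation can be written as a small Python function. This is a minimal sketch; the function name and the input check are illustrative choices, and the three sample rows are taken from the summary table.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma = (quality requirement - bias) / CV, all expressed in percent."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# (quality requirement %, %bias, %CV) from the summary table
print(round(sigma_metric(25, 2.0, 0.9), 1))   # Triglycerides
print(round(sigma_metric(10, 1.9, 0.9), 1))   # Glucose
print(round(sigma_metric(10, 5.8, 1.3), 1))   # Cholesterol
```

Rounded to one decimal place, these come out to 25.6, 9.0, and 3.2.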

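| Analyte | CLIA PT Criteria (%) | PT %Bias | %CV | Sigma Metric |
|---|---|---|---|---|
| Triglycerides | 25 | 2.0 | 0.9 | 25.6 |
| ALP | 30 | 3.8 | 1.2 | 21.8 |
| Magnesium | 25 | 2.8 | 1.3 | 17.1 |
| Uric Acid | 17 | 1.2 | 1.1 | 14.4 |
| Creatinine | 15 | 1.7 | 1.2 | 11.1 |
| CPK | 30 | 4.6 | 3.1 | 8.2 |
| Glucose | 10 | 1.9 | 0.9 | 9.0 |
| Total Protein | 10 | 2.6 | 1.4 | 5.3 |
| Cholesterol | 10 | 5.8 | 1.3 | 3.2 |
| Chloride | 5 | 2.7 | 0.6 | 3.8 |
| LDH | 20 | 10.5 | 2.0 | 4.8 |
| BUN | 9 | 3.6 | 1.3 | 4.2 |
| Albumin | 10 | 5.6 | 1.5 | 2.9 |
| AST | 20 | 8.9 | 3.9 | 2.8 |
| ALT | 20 | 1.1 | 4.0 | 4.7 |
| T Bilirubin | 20 | 2.9 | 4.9 | 3.5 |
| Amylase | 30 | 3.1 | 2.8 | 9.6 |
| Potassium | 12% | 1.2 | 1.5 | 7.2 |
| Sodium | 2.6% | 2.1 | 1.2 | 0.4 |
| Calcium | 14% | 5.2 | 1.0 | 8.8 |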

At first glance, the Sigma metrics here are quite impressive. There are ten tests with Sigma Metrics greater than 6. There are only six tests with Sigma Metrics less than 4.

Why are some Sigma Metrics so high and others so low? Much of it comes down to method performance: if bias and/or CV is high, the Sigma value will be low. But for some analytes, the quality requirement defined by CLIA'88 is very generous (i.e., large). For instance, the triglycerides method has such a high Sigma value partly because CLIA sets the quality requirement at 25%. That's a fairly low bar of achievement. At the opposite extreme, the CLIA analytical quality requirement for Sodium is very tight and very tough. There's not much room for variation, and even with a small bias and CV, the Sigma value is very small.
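
To see how strongly the quality requirement drives the result, the sketch below recomputes Sodium's Sigma with its bias and CV unchanged but judged against a 25% requirement. This is purely a thought experiment: 25% is the triglycerides requirement, not a real CLIA limit for sodium, and the function name is illustrative.

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma = (quality requirement - bias) / CV, all expressed in percent."""
    return (tea_pct - bias_pct) / cv_pct


# Sodium as reported: requirement 2.6%, bias 2.1%, CV 1.2%
sodium_actual = sigma_metric(2.6, 2.1, 1.2)

# Hypothetical: the same bias and CV against a 25% requirement
sodium_at_25 = sigma_metric(25, 2.1, 1.2)

print(round(sodium_actual, 1), round(sodium_at_25, 1))
```

The same bias and CV that yield a Sigma of about 0.4 against a 2.6% requirement would yield about 19.1 against a 25% requirement.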

QC Rule Recommendations

The data from the preceding tables was entered into the EZ Rules software program. For each test, Automatic QC Selection was used. The following QC procedures were recommended by the software.

| Analyte | CLIA PT Criteria (%) | PT %Bias | %CV | Sigma Metric | EZ Rules™ QC Procedure |
|---|---|---|---|---|---|
| Triglycerides | 25 | 2.0 | 0.9 | 25.6 | 13.5s with N=2 |
| ALP | 30 | 3.8 | 1.2 | 21.8 | 13.5s with N=2 |
| Magnesium | 25 | 2.8 | 1.3 | 17.1 | 13.5s with N=2 |
| Uric Acid | 17 | 1.2 | 1.1 | 14.4 | 13.5s with N=2 |
| Creatinine | 15 | 1.7 | 1.2 | 11.1 | 13.5s with N=2 |
| CPK | 30 | 4.6 | 3.1 | 8.2 | 13.5s with N=2 |
| Glucose | 10 | 1.9 | 0.9 | 9.0 | 13.5s with N=2 |
| Total Protein | 10 | 2.6 | 1.4 | 5.3 | 13s with N=2 |
| Cholesterol | 10 | 5.8 | 1.3 | 3.2 | 13s/22s/R4s/41s with N=4, 50% AQA |
| Chloride | 5 | 2.7 | 0.6 | 3.8 | 13s/22s/R4s/41s with N=4, 50% AQA |
| LDH | 20 | 10.5 | 2.0 | 4.8 | 12.5s with N=2 |
| BUN | 9 | 3.6 | 1.3 | 4.2 | 12.5s with N=4 |
| Albumin | 10 | 5.6 | 1.5 | 2.9 | 13s/22s/R4s/41s/8x with N=4, MAX QC |
| AST | 20 | 8.9 | 3.9 | 2.8 | 13s/22s/R4s/41s/8x with N=4, MAX QC |
| ALT | 20 | 1.1 | 4.0 | 4.7 | 12.5s with N=2 |
| T Bilirubin | 20 | 2.9 | 4.9 | 3.5 | 13s/22s/R4s/41s with N=4, 50% AQA |
| Amylase | 30 | 3.1 | 2.8 | 9.6 | 13.5s with N=2 |
| Potassium | 12% | 1.2 | 1.5 | 7.2 | 13.5s with N=2 |
| Sodium | 2.6% | 2.1 | 1.2 | 0.4 | 13s/22s/R4s/41s/8x with N=4, MAX QC |
| Calcium | 14% | 5.2 | 1.0 | 8.8 | 13.5s with N=2 |

Some notes on terminology and recommendations:

  • N means Number of Control Measurements. For example, N = 2 could mean running two controls OR it could mean reading one control twice. Likewise, N = 4 could mean running 4 controls or it could mean reading 2 controls twice.
  • Unless otherwise stated, the control rules recommended detect critically sized systematic errors with 90% probability or better. When this is not true, for instance with Cholesterol and Chloride, the term "50% AQA" is used. This indicates that the control rule can only provide approximately 50% Analytical Quality Assurance, that is, detect a critical error with roughly 50% probability. This means that medically important errors will only be detected half the time and some runs with poor quality test results will be reported. On average, it will take two runs to detect any analytical problems when the QC procedure provides only 50% error detection.
  • "MAX QC" indicates that the control rule used does not even achieve 50% Analytical Quality Assurance. The recommendation is to use the maximum QC procedure you are able to implement, increase non-statistical methods of QC for this method, and consider finding other methods for this analyte.
  • For those methods with Sigma values of 6.0 or higher, the same control rule is recommended: 13.5s with N=2. This will give the necessary error detection for critical-errors, but has essentially NO false rejects or repeat runs.
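
The "two runs on average" figure for 50% error detection follows from treating each run as an independent chance to flag the error. The sketch below makes that arithmetic explicit. It assumes the error persists unchanged from run to run and that the detection probability is the same in every run; the function names are illustrative.

```python
def average_runs_to_detect(p_detect):
    """Expected number of runs until a persistent error is flagged (1 / p)."""
    if not 0 < p_detect <= 1:
        raise ValueError("detection probability must be in (0, 1]")
    return 1.0 / p_detect


def prob_still_undetected(p_detect, runs):
    """Chance that a persistent error has slipped through `runs` runs in a row."""
    return (1.0 - p_detect) ** runs


print(average_runs_to_detect(0.5))             # 50% AQA
print(round(average_runs_to_detect(0.9), 2))   # 90% AQA
print(prob_still_undetected(0.5, 3))
```

At 50% detection the expected wait is 2 runs, and there is still a 1-in-8 chance that the error goes unnoticed for three consecutive runs; at 90% detection the expected wait drops to about 1.1 runs.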

More Practical Rules to Use

While EZ Rules recommendations are mathematically solid, there are times when a more practical approach is needed for the laboratory. The requisite QC procedures must be reconciled with some of the laboratory "intangibles":

| Control Rule | Analytes |
|---|---|
| For 5.0 Sigma values and above: 13s with N=2 | Triglycerides, ALP, Magnesium, Uric Acid, Creatinine, CPK, Glucose, Total Protein, Amylase, Potassium, Calcium |
| For 4.0 to 5.0 Sigma values: 12.5s with N=2 | LDH, BUN, ALT |
| For Sigma values below 4.0: "Westgard Rules" with N=2 | Cholesterol, Chloride, Albumin, AST, T. Bilirubin, Sodium |
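
The banding in this table amounts to a simple lookup on the Sigma value. The sketch below is one way to express it; the function name is illustrative, and placing a value of exactly 5.0 in the top band and exactly 4.0 in the middle band is an assumption about how the boundaries are meant to be read.

```python
def practical_rule(sigma):
    """Map a Sigma metric to the practical QC procedure for its band."""
    if sigma >= 5.0:
        return "13s with N=2"
    if sigma >= 4.0:
        return "12.5s with N=2"
    return '"Westgard Rules" with N=2'


# Sigma values from the tables above
for analyte, sigma in [("Total Protein", 5.3), ("BUN", 4.2), ("Sodium", 0.4)]:
    print(analyte, "->", practical_rule(sigma))
```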

Intangible #1: Since all of these analytes are on the same instrument, it's unlikely that different numbers of controls can be run for each analyte. It is most likely that just two controls will be run, which is fine for the high Sigma value analytes, but will further reduce the performance of the methods that need Ns of 4. However, the interpretation of the QC can and should be done differently for each analyte.

Intangible #2: Many of the QC software packages available on the market do not support 2.5s or 3.5s control limits. Where the recommendation is 13.5s, a 13s rule is fine: its control limits are tighter, so error detection will not suffer if it is used instead.
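
The cost of swapping limits shows up as false rejections rather than lost error detection. The sketch below estimates the chance that at least one of N in-control measurements falls outside limits of ±k SD. It assumes Gaussian, independent control values from a stable method, which is a textbook simplification rather than anything measured in this data set; the function names are illustrative.

```python
import math


def prob_outside(k):
    """P(|z| > k) for a single in-control, normally distributed control value."""
    return math.erfc(k / math.sqrt(2.0))


def false_reject_prob(k, n):
    """Chance that at least one of n independent controls exceeds +/- k SD."""
    return 1.0 - (1.0 - prob_outside(k)) ** n


for k in (3.5, 3.0, 2.5):
    print(f"limits at {k} SD, N=2: {false_reject_prob(k, 2) * 100:.2f}% false rejects")
```

Under these assumptions, N=2 gives roughly 0.1% false rejections at 3.5 SD, about 0.5% at 3 SD, and about 2.5% at 2.5 SD, so moving from 3.5s to 3s limits costs only a small increase in false alarms.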

What can be done?

For methods with 5 Sigma performance and above, all that needs to be done is breathe a sigh of relief, pat yourself on the back, and move on to the tough methods.

For methods with Sigma performance between 4.0 and 5.0, minor improvements in CV and Bias will have a big impact and move them into the "great" category.

For methods with Sigma performance below 4.0, serious efforts must be made to significantly improve the CV and Bias. Reducing Bias is the first and easiest step. But the laboratory must also increase all types of monitoring on these methods, not just statistical QC. More instrument function checks, calibrations, preventive maintenance, etc., are indicated here.

For more advice on what to do once you learn your Sigma performance, learn about Total QC Strategies.

Example 1. Triglycerides.

CLIA quality requirement for proficiency testing is 25%. %Bias is 2.0. %CV for Material I is 0.9 or 0.8.

EZ Rules Automatic QC Selection: When the above data is entered into the program, the screen below indicates the automatically selected control rule.

Sigma Metric Graph: See Critical-Error graphs for more information. The key on the right details 8 different control rules that will detect nearly every critical error. Normally, in a Sigma Metric Graph, a bold vertical line will display the Sigma Metric of the particular test. In this example, method performance is so good that the vertical line is actually far off the chart to the right. Basically, performance is so good that any QC procedure will suffice, and widening limits is a good idea.

OPSpecs chart: The Operating point of this method is very low and to the left, which again shows that really any of the control rules indicated by the key at right will provide the necessary error detection.

Example 2. BUN.

CLIA quality requirement is 9%. %Bias is 3.6. %CV is 1.4 or 1.1.

Sigma Metrics Graph: Here we actually can see the critical-error displayed. Because the performance of this method is not as good, the control rules required are more complex ("Westgard Rules") and require more controls (N=4). The control rules displayed actually don't provide the error detection desired (around 90% or better), although a few rules are quite close (see the second vertical column in the key for specifics).
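
How the critical error is computed is not spelled out here. A relationship commonly used in QC design is: critical systematic error = (quality requirement - bias) / CV - 1.65, where 1.65 corresponds to tolerating at most 5% of results beyond the quality requirement. The sketch below applies that relationship, as an assumption, to the BUN and Triglycerides figures from the tables; the function name is illustrative.

```python
def critical_systematic_error(tea_pct, bias_pct, cv_pct, z=1.65):
    """Size of the systematic shift (in multiples of the SD) that QC must catch.

    Assumes the common form (TEa - bias) / CV - z, with z = 1.65 for a
    maximum 5% defect rate.
    """
    return (tea_pct - bias_pct) / cv_pct - z


print(round(critical_systematic_error(9, 3.6, 1.3), 2))    # BUN
print(round(critical_systematic_error(25, 2.0, 0.9), 2))   # Triglycerides
```

For BUN the critical shift is about 2.5 SD, small enough to need multirule QC and more controls; for Triglycerides it is nearly 24 SD, which is why that operating line sits far off the graph.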

OPSpecs chart: The OPSpecs chart displayed is actually for only 50% error detection, not 90%, since that was not achieved. As the Sigma metrics graph indicated above, all of these control rules provide at least 50% error detection, but the best one is a "Westgard Rule" with N of 4.

What is required to achieve 90% error detection? If 90% or more error detection is mandated, you will need to use at least 8 control measurements (8 controls, or 4 controls measured twice, etc.). Given that close to 90% can already be achieved with N=4, this more rigorous QC procedure is probably not desirable.

Conclusion: Room for Improvement, Reasons to Celebrate, and an Eye-Opener

This assessment of Sigma Metrics for chemistry tests shows what can be expected in any laboratory that employs state-of-the-art automated analytical systems. There will be many methods that show excellent performance and require minimal QC. There will be some tests that require careful selection of control rules to maximize error detection. There will be a few tests where error detection is not sufficient and improvement efforts are required.

The initial focus for performance improvement should be to reduce method bias. This often is related to the calibration materials and their assigned values. The potential benefit of reductions in bias can be assessed from the OPSpecs chart by simply dropping the operating point to sit on the x-axis (i.e., y-value or bias is represented as being zero).
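
The same "what if bias were zero" exercise can be done numerically with the Sigma equation instead of the OPSpecs chart. The sketch below recomputes Sigma with bias set to zero for the six analytes whose Sigma is below 4 in the tables. It is an idealized upper bound, since eliminating bias completely is an assumption; the helper names are illustrative.

```python
# (analyte, quality requirement %, %bias, %CV) for the tests with Sigma below 4
LOW_SIGMA = [
    ("Cholesterol", 10, 5.8, 1.3),
    ("Chloride", 5, 2.7, 0.6),
    ("Albumin", 10, 5.6, 1.5),
    ("AST", 20, 8.9, 3.9),
    ("T Bilirubin", 20, 2.9, 4.9),
    ("Sodium", 2.6, 2.1, 1.2),
]


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma = (quality requirement - bias) / CV, all expressed in percent."""
    return (tea_pct - bias_pct) / cv_pct


def zero_bias_gain(rows):
    """Return {analyte: (current Sigma, Sigma with bias removed)}, 1 decimal."""
    return {
        name: (round(sigma_metric(tea, bias, cv), 1),
               round(sigma_metric(tea, 0.0, cv), 1))
        for name, tea, bias, cv in rows
    }


for name, (current, no_bias) in zero_bias_gain(LOW_SIGMA).items():
    print(f"{name}: {current} -> {no_bias}")
```

Cholesterol, for example, would move from about 3.2 to about 7.7 Sigma with its bias removed, while Sodium would only reach about 2.2, which suggests that its imprecision also needs attention.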

One last, but important, point. The Sigma Metrics that are calculated show the potential capability of achieving or satisfying the stated quality requirement. In routine operation, the actual achievement of this quality is assured by applying the proper QC procedure.