Assessing the time to rejection for QC procedures

James O. Westgard, Ph.D.

How many runs does it take before your instrument will detect a medically important error? This is a basic question that other industries take great pains to answer - so why is it that healthcare laboratories generally don't know the answer? Dr. Westgard explains how this number can be calculated - and how new technologies in the lab are creating a new way to describe the performance of QC procedures.

It's about time! Assessing the time to rejection for QC procedures

An Analogy: How long does it take you to "catch on"?
Average Run Length (ARL)
Average Time to Rejection
iQM Performance Figures
References

This discussion is the third in a series that focuses on understanding the performance characteristics of QC procedures. This series is particularly applicable to understanding a new QC technology - iQM, intelligent Quality Management - that has been introduced by Instrumentation Laboratory for their GEM analyzer. You can access the earlier discussions at these links:

CLIA QC clearance postponed again and again and again... describes the principles of the new iQM technology that has been approved by FDA "replacing the use of traditional external quality controls." iQM makes use of frequent measurements on internal "Process Control Solutions" to monitor the stability of the electrode measurements. Internal "drift limits" are applied as statistical control limits.
"Area under a Table" provides the background for understanding how the probabilities for rejection can be determined for a QC procedure using a table of areas under a normal or gaussian curve. It describes how to use the CLIA requirements for quality and the iQM specifications for "drift limits" to determine the probabilities for false rejection and error detection.

The purpose of this third lesson is to show how the probabilities for rejection can be translated into practical figures that describe the "time to rejection", i.e., how many runs it takes to detect a medically important error. It introduces a new parameter - Average Run Length - as an alternate way of describing the performance of a QC procedure.

An Analogy

When I teach this material "live" in the classroom or in a workshop, I tell the participants that they will need to be exposed to these ideas three times, on average, before it all becomes clear. The first time is the "live" presentation. The second time is a review of the notes and handout materials. The third time may be a review of these lessons on the website. There may be additional times based on study of materials and references in the scientific literature.

Of course, some people will grasp the ideas quicker than others. A few actually catch on in the first presentation (1 time), but many more catch on after both the live presentation and a further study of their notes (2 times). Others will make use of these lessons on the website to review the ideas (3 times). Some will go on to study additional materials, such as the discussion of the concept of ARL and its application to QC procedures in healthcare laboratories [1] (4 times), plus the specific application to iQM published in the scientific literature [2] (5 times), etc. Yes, it could take more than 5 times. It has taken many years for me to understand this, so my number of exposures far exceeds these numbers. Hopefully, I can reduce the number of exposures needed for your learning and understanding!

Graphically, the results would look something like the histogram in this figure. The columns from left to right represent the increasing number of exposures from 1 to 6. The heights of the columns represent the numbers of people who catch on after that number of exposures. If we were to multiply the number of people in each column by the corresponding number of exposures, add up those products, and divide by the total number of people, that would give us the average number of exposures needed before people catch on.
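As a rough illustration of that averaging, here is a minimal Python sketch. The head counts in `people_by_exposures` are invented for the example (the figure's actual values are not given in the text), and the function name is hypothetical.

```python
# Hypothetical head counts: how many people catch on after 1..6 exposures.
# These numbers are made up for illustration; they are not read from the figure.
people_by_exposures = {1: 5, 2: 20, 3: 35, 4: 25, 5: 10, 6: 5}


def average_exposures(counts):
    """Weighted average: sum(exposures * people) / total people."""
    total_people = sum(counts.values())
    total_exposures = sum(n * people for n, people in counts.items())
    return total_exposures / total_people


avg = average_exposures(people_by_exposures)
print(f"Average number of exposures: {avg:.2f}")
```

With these assumed counts, the average works out to a little over three exposures, in line with the "three times, on average" rule of thumb above.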

Average Run Length (ARL)

That's the concept of "average run length" - the average number of times or exposures needed to catch something. For an analytical measurement process, we're interested in how many times the measurement process needs to be exposed to a control solution before a rejection occurs. Two situations are of interest:

the number of times a control solution will be analyzed before a false rejection occurs, which we call the average run length for false rejection (ARLfr)
the number of times a control solution will be analyzed before a medically important error is detected, which we call the average run length for error detection (ARLed)

The use of ARLs is the standard way of describing the performance of QC procedures in the industrial literature. We have also used the ARL characteristics for QC procedures in healthcare applications, as described in our 1986 book on Cost-Effective QC [1]. However, the terminology is different. Industry uses the term ARL for acceptable quality in place of ARLfr and the term ARL for rejectable quality in place of ARLed. We use the "fr" and "ed" terms here to tie these characteristics to the "false rejection" and "error detection" situations.

The industrial literature also tells us how to determine average run lengths from probabilities for rejection. It's a simple matter of taking the reciprocal of the probability for rejection characteristics, for example:

ARLfr = 1/Pfr
ARLed = 1/Ped

Thus, the probabilities of false rejection and error detection that were determined in the previous lesson can be used to calculate the average run lengths for false rejection and error detection. Using these probability figures, here are the ARLs for the example from the previous lesson:

Pfr was 0.0027, therefore the ARLfr is 370, which means it will take, on average, 370 measurements before a false rejection is observed. For false rejections, the longer the ARL, the better!
Ped was 0.91, therefore the ARLed is 1.1, which means it will take, on average, 1.1 runs to detect the problem. For error detection, the shorter the better.
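The reciprocal relationship can be checked with a few lines of Python. This is only a sketch: the function name `average_run_length` is an assumption made for the example, while the probabilities are the ones quoted above.

```python
def average_run_length(p_reject):
    """Return the average run length, ARL = 1 / P, for a rejection probability P."""
    if not 0 < p_reject <= 1:
        raise ValueError("probability of rejection must be in (0, 1]")
    return 1.0 / p_reject


arl_fr = average_run_length(0.0027)  # false rejection, Pfr from the previous lesson
arl_ed = average_run_length(0.91)    # error detection, Ped from the previous lesson

print(f"ARLfr = {round(arl_fr)} runs")     # ARLfr = 370 runs
print(f"ARLed = {round(arl_ed, 1)} runs")  # ARLed = 1.1 runs
```

The guard against a zero probability matters in practice: a QC procedure that can never reject has no finite run length.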

Average Time to Rejection

To make this information even more useful, ARLs can be converted to units of time to describe how long it takes until a rejection is observed. Given the known time periods for analysis of the different process control solutions (PCS A 1 to 4 hours, PCS B 3 to 30 minutes), the ARLs can be multiplied by the sampling time to describe the average time to rejection.

For example, given that PCS B is analyzed at least every 30 minutes, the average times before a rejection will be observed can be calculated, as follows:

ARLfr * 30 minutes = 370 * 30 minutes = 11,100 minutes or 185 hours or 7.7 days, which means that a false rejection is expected about once in an 8-day period.
ARLed * 30 minutes = 1.1 * 30 minutes = 33 minutes, which means that a medically important systematic error will be detected, on average, within 33 minutes. The reality is that the error is most often detected in 30 minutes (the first analysis of PCS B) and occasionally in 60 minutes (the second analysis), giving an average of 33 minutes.
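To see why the average lands at 33 minutes even though detection actually happens at 30, 60, 90 minutes and so on, the chance of first detecting the error on each successive analysis can be tabulated. The sketch below assumes each PCS B analysis is an independent trial with the same Ped (a simplifying assumption, not something stated in the text); the function name is hypothetical.

```python
P_ED = 0.91          # probability of error detection per analysis (from the text)
INTERVAL_MIN = 30    # PCS B sampling interval in standby, in minutes


def detection_probability_by_run(p_ed, max_runs):
    """P(error is first detected on run k) for k = 1..max_runs.

    Assumes independent runs: (k - 1) misses followed by one detection.
    """
    return [p_ed * (1 - p_ed) ** (k - 1) for k in range(1, max_runs + 1)]


probs = detection_probability_by_run(P_ED, 50)

# The first few runs carry almost all of the probability.
for k, p in enumerate(probs[:3], start=1):
    print(f"detected at {k * INTERVAL_MIN:3d} min: probability {p:.4f}")

# Probability-weighted average run number, then converted to minutes.
mean_runs = sum(k * p for k, p in enumerate(probs, start=1))
print(f"average time to detection: {mean_runs * INTERVAL_MIN:.0f} minutes")
```

Under this assumption, roughly nine errors in ten are caught at the first analysis, and nearly all the rest at the second, which is what pulls the average just above 30 minutes.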

It is important to recognize that the 30-minute sampling time for PCS B is the maximum time, which applies when the instrument is in standby operation. When patient specimens are being analyzed, PCS B is analyzed after each specimen, or every 3 minutes. That means that during heavy workload, the average time for detecting an error will be only 3.3 minutes.
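The conversion from run length to time is a single multiplication, so the standby and heavy-workload cases can be compared side by side. The following is a minimal sketch; the function and variable names are assumptions made for illustration, while the ARLs and sampling intervals are the figures used in this lesson.

```python
def average_time_to_rejection(arl, sampling_interval_min):
    """Average time to rejection in minutes: ARL multiplied by the sampling interval."""
    return arl * sampling_interval_min


ARL_FR = 370   # average run length for false rejection
ARL_ED = 1.1   # average run length for error detection

standby_fr = average_time_to_rejection(ARL_FR, 30)  # PCS B every 30 minutes
standby_ed = average_time_to_rejection(ARL_ED, 30)
busy_ed = average_time_to_rejection(ARL_ED, 3)      # PCS B every 3 minutes

print(f"false rejection, standby: {standby_fr} min "
      f"= {standby_fr / 60:.0f} h = {standby_fr / 60 / 24:.1f} days")
print(f"error detection, standby: {standby_ed:.0f} min")
print(f"error detection, heavy workload: {busy_ed:.1f} min")
```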

iQM Performance Figures

This methodology for determining probabilities, average run lengths, and average times to rejection has been applied test by test, using the CLIA requirements for quality and the manufacturer's specifications for drift limits. For Process Control Solution B, the average times for error detection are expected to be 3 to 33 minutes for pH, PCO2, PO2, K+, Ca++, lactate, and hematocrit; 7 to 71 minutes for glucose; and 10 to 102 minutes for sodium. The shorter times correspond to the heavy workload situation where PCS B is analyzed every 3 minutes and the longer times are for the low workload situation where PCS B is analyzed every 30 minutes.
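As a cross-check, the relationships in this lesson can be run in reverse to see what ARLed and Ped the quoted times imply. The sketch below is a back-calculation from the low-workload times and the 30-minute interval only; the resulting values are implied by ARL = 1/P and are not figures taken from the published validation [2]. The function name is hypothetical.

```python
STANDBY_INTERVAL_MIN = 30  # PCS B sampling interval at low workload

# Low-workload average detection times quoted in the text, in minutes.
detection_times_min = {
    "pH, PCO2, PO2, K+, Ca++, lactate, hematocrit": 33,
    "glucose": 71,
    "sodium": 102,
}


def implied_performance(time_min, interval_min):
    """Back-calculate (ARLed, Ped) from an average detection time."""
    arl_ed = time_min / interval_min
    p_ed = 1.0 / arl_ed
    return arl_ed, p_ed


implied = {
    name: implied_performance(t, STANDBY_INTERVAL_MIN)
    for name, t in detection_times_min.items()
}

for name, (arl_ed, p_ed) in implied.items():
    print(f"{name}: ARLed about {arl_ed:.1f} runs, implied Ped about {p_ed:.2f}")
```

Dividing the same ARLs by a 3-minute interval instead gives times close to the heavy-workload figures quoted above, which suggests both ends of each range come from one ARLed per test.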

Detailed performance figures for each test are available in the scientific literature [2]. Given the methodology described in these lessons, you should be able to work through all these figures yourself and confirm those performance figures.

References

1. Westgard JO, Barry PL. Cost-Effective Quality Control: Manage the quality and productivity of analytical processes. Washington DC: AACC Press, 1986, pp 72-75, 106-111.
2. Westgard JO, Mansouri S, Fallon KD. Validation of iQM active process control technology. Point of Care 2003:2 (in press).
