This question comes from Robbie Keith of Summit Laboratory: We are in the process of evaluating our QC program. Our techs monitor Levey-Jennings charts for shifts and trends weekly. We would like to know what you consider to define a shift or trend (e.g., how many points are required increasing or decreasing to define a trend?).

Consider control rules such as 4_1s, 10_x, etc., as good indicators of shifts and trends. To keep false rejections down, the number of observations needed increases as the limit approaches the mean of the control material. The minimum number of consecutive observations above or below the mean should probably be set at 6. There are some recommendations, particularly in Germany, to use 7 above or below the mean, or 7 trending consecutively in one direction.
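As a rough illustration of the counting described above, the following Python sketch counts how many of the most recent control values fall consecutively on one side of the mean, and how many move consecutively in one direction. The function names and the default threshold of 7 are illustrative assumptions, not taken from any QC software; the threshold could equally be set to 6.

```python
def trailing_same_side(values, mean):
    """Count the most recent consecutive values that lie on one side of the mean."""
    count = 0
    side = 0
    for v in reversed(values):
        s = (v > mean) - (v < mean)  # +1 above, -1 below, 0 exactly on the mean
        if s == 0:
            break
        if side == 0:
            side = s
        elif s != side:
            break
        count += 1
    return count


def trailing_trend(values):
    """Count the most recent consecutive values that move strictly in one direction."""
    if len(values) < 2:
        return len(values)
    count = 1
    direction = 0
    # Walk backwards over adjacent pairs (later value, earlier value).
    for later, earlier in zip(reversed(values), reversed(values[:-1])):
        d = (later > earlier) - (later < earlier)
        if d == 0:
            break
        if direction == 0:
            direction = d
        elif d != direction:
            break
        count += 1
    return count


def flag_shift_or_trend(values, mean, min_points=7):
    """Flag a shift or a trend when at least min_points consecutive values qualify."""
    return {
        "shift": trailing_same_side(values, mean) >= min_points,
        "trend": trailing_trend(values) >= min_points,
    }


# Hypothetical control values with an assumed mean of 100.
print(flag_shift_or_trend([101, 102, 103, 101, 104, 102, 105], mean=100))
```

Lowering min_points makes the check more sensitive but also more prone to false signals, which is the trade-off the answer describes.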

FAQs about Multirule QC

Frequently-Asked-Questions (FAQs) about "Westgard Rules" and multirules.

Plus, some questions about Immunoassays and QC (scroll down past the first section).

Frequently-Asked-Questions about Westgard Rules

Should I have a 1_2s rule violation for starting evaluation of violations of 4_1s, 10_x, 8_x and 12_x rules?
When would I use 8_x and 12_x rules?
What is N?
What's the best way to chart QC for multirule applications?
Does the 1_2s warning rule have to be used in a computerized implementation?
Can other rules be used as warning rules rather than rejection rules?
Other than better error detection, are there any reasons to use multirule procedures instead of single rules?
What rules are most sensitive for detecting systematic errors?
What causes systematic errors?
What rules are most sensitive for detecting random error?
What causes random errors?
When can a single rule QC procedure be used instead of a multirule procedure?
How do you decide if you need to apply rules across runs?
When one rule in a multirule combination is violated, do you exclude just that control value from the QC statistics?

New Questions about Multirule QC

1. In your article about multi-rules published in 1981 and in your book Cost-Effective Quality Control: Managing the Quality and Productivity of Analytical Processes, page 95, you say that violations of the 4_1s and 10_x rules are signals of out-of-control and lead to rejection of the run. In the same book, in the paragraph "Modifications to use for warning purposes," pages 113 and 114, you say that those rules should be used as warnings for preventive maintenance in order to reduce false rejections. On your website, at the link "Multirule and 'Westgard' rules: what are they?", the rules are again signals of out-of-control runs. Could you clarify this for me?

One situation might be methods on instrument systems where periodic changes in reagents introduce small systematic errors that can't be easily or completely eliminated by recalibration. These systematic changes may be judged to be medically unimportant, but the 4_1s and 10_x rules might still detect them. In such a case, it may be useful to apply the 4_1s and 10_x rules when you change the lot number of reagents and want to check for small shifts, but then "turn off" those rules once you decide there aren't any shifts or when an observed shift is judged to be small and not important to detect. Otherwise, those rules will continue to give rejection signals even though you have decided not to do anything about the problem.

You can make these decisions on what rules to apply much more objectively if you follow our recommended QC Planning Process, where you define the quality required for the test, account for the imprecision and inaccuracy observed for your method, then select the control rules and numbers of control measurements necessary to detect medically important errors.

2. Should I have a 1_2s rule violation for starting evaluation of violations of 4_1s, 10_x, 8_x and 12_x rules?

In the original multirule paper, we recommended the use of the 1_2s rule as a warning, then the inspection of the data by the other rules to decide whether or not the run should be rejected. This was done to simplify the manual application of the rules and keep from wasting time when there wasn't likely to be a problem. While in principle it is possible that there might be a 4_1s or 10_x violation without ever exceeding the 2s warning limit, our experience has been that it seldom happens, at least if the control charts have been properly set up and the limits calculated from good estimates of the observed method variation.

In computer-assisted applications of the multirule procedure, there is no need to use a 2s warning limit. All the chosen rejection rules can be applied simultaneously.
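A minimal Python sketch of applying several rules at once might look like the following. It assumes the control results have already been converted to z-scores (difference from the expected mean divided by the expected SD) and are listed in time order with the most recent last. The function name, the particular set of rules, and the assumption that the last two values come from the same run (for the R_4s check) are illustrative choices, not a description of any specific QC program.

```python
def violated_rules(z):
    """Return the names of the rules violated by the most recent z-scores."""
    z = list(z)
    hits = []

    # 1_3s: the latest value is beyond +/-3s.
    if z and abs(z[-1]) > 3:
        hits.append("1_3s")

    last2 = z[-2:]
    if len(last2) == 2:
        # 2_2s: two consecutive values beyond the same 2s limit.
        if all(v > 2 for v in last2) or all(v < -2 for v in last2):
            hits.append("2_2s")
        # R_4s: one value beyond +2s and the other beyond -2s
        # (assumed here to be two controls in the same run).
        if max(last2) > 2 and min(last2) < -2:
            hits.append("R_4s")

    # 4_1s: four consecutive values beyond the same 1s limit.
    last4 = z[-4:]
    if len(last4) == 4 and (all(v > 1 for v in last4) or all(v < -1 for v in last4)):
        hits.append("4_1s")

    # 10_x: ten consecutive values on the same side of the mean.
    last10 = z[-10:]
    if len(last10) == 10 and (all(v > 0 for v in last10) or all(v < 0 for v in last10)):
        hits.append("10_x")

    return hits


# A 4_1s violation in which no value exceeds the 2s limit.
print(violated_rules([1.2, 1.5, 1.1, 1.3]))
```

The example input shows the case mentioned above: a 4_1s violation can occur without any single value exceeding the 2s limit, so a program that checks every rule directly does not depend on the 1_2s warning.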

3. When would I use 8_x and 12_x rules? What are the advantages of these rules over 4_1s and 10_x in the indication of systematic errors?

You pick the number of consecutive control measurements to fit the product N times R, where N is the number of control observations in the run and R is the number of runs. For example, if N is 4 and R is 2, you would want to use the 8_x rule to look at the control data from the current and previous run, or a 12_x rule to look back over three consecutive runs. A 10_x rule would require looking back 2.5 runs, which doesn't make any sense. You would either look at the control measurements in 2 runs or in 3 runs, not 2.5 runs.
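The arithmetic above can be sketched in a few lines of Python. The function names are hypothetical and exist only to illustrate the N times R calculation.

```python
def across_run_rule(n_per_run, runs):
    """Name the mean rule whose span equals N control measurements times R runs."""
    return f"{n_per_run * runs}_x"


def runs_spanned(consecutive, n_per_run):
    """Number of runs a rule must look back over, given N measurements per run."""
    return consecutive / n_per_run


def fits_whole_runs(consecutive, n_per_run):
    """True when the rule covers a whole number of runs."""
    return consecutive % n_per_run == 0


print(across_run_rule(4, 2))   # rule spanning the current and previous run
print(across_run_rule(4, 3))   # rule spanning three consecutive runs
print(runs_spanned(10, 4))     # a 10_x rule with N = 4
print(fits_whole_runs(10, 4))
```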

What is N?

When N is 2, that can mean 2 measurements on one control material or 1 measurement on each of two different control materials. When N is 3, the application would generally involve 1 measurement on each of three different control materials. When N is 4, that could mean 2 measurements on each of two different control materials, or 4 measurements on one material, or 1 measurement on each of four materials.

In general, N represents the total number of control measurements that are available at the time a decision on control status is to be made.

What's the best way to chart QC for multirule applications?

You can chart your QC data using regular Levey-Jennings control charts, on which additional lines have been drawn to represent the mean plus/minus 1s, plus/minus 2s, and plus/minus 3s. You can set up one control chart for each level of control material being analyzed. These individual charts have the advantage of showing your method performance at each control level. However, it is difficult to visually combine the measurements from consecutive control measurements on different materials.

To combine measurements on different materials, you can first calculate the difference of each control observation from its expected mean, divide by the expected standard deviation to give a z-score or a standard deviation index (SDI), and then plot the SDI value on a control chart whose central mean is zero and whose control limits are drawn as plus/minus 1, plus/minus 2, and plus/minus 3. You can plot the values for the different materials using different colors to help keep track of trends within a material. This is a lot of work if you have to do it by hand, but many computerized QC programs support this type of calculation and often provide an SDI chart.
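The SDI calculation described above is simple enough to sketch in Python. The control names, means, and SDs below are made-up example values, and the function name is an assumption for illustration.

```python
def sdi(value, expected_mean, expected_sd):
    """Standard deviation index (z-score): distance from the expected mean in SD units."""
    if expected_sd <= 0:
        raise ValueError("expected_sd must be positive")
    return (value - expected_mean) / expected_sd


# Hypothetical expected mean and SD for two control materials.
controls = {
    "level 1": {"mean": 100.0, "sd": 4.0},
    "level 2": {"mean": 250.0, "sd": 5.0},
}

# Hypothetical observations from one run, one per material.
observations = [("level 1", 108.0), ("level 2", 240.0)]

for material, value in observations:
    expected = controls[material]
    print(material, sdi(value, expected["mean"], expected["sd"]))
```

Because both results are now on the same scale, they can be plotted on a single chart with limits at plus/minus 1, 2, and 3.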

Does the 1_2s warning rule have to be used in a computerized implementation?

No, it was mainly intended for manual implementation to trigger the application of the other rules. When you apply the rules manually, it sometimes is a lot of work to look through all the control data and check it with several rules. If the computer can do all the rule checking, then it's not much work for the analyst to apply all the rules and there's really no need to apply the 1_2s rule at all.

Can other rules be used as warning rules rather than rejection rules?

There's another type of warning rule that can be used to indicate prospective action instead of run rejection. With very stable analytical systems, it may be advantageous to interpret rules like the 4_1s and 10_x as warning rules because they are quite sensitive to small shifts that occur from run to run, day to day, or reagent lot to reagent lot. If you do this, you also need to define the response that is appropriate for a warning. That may be to perform maintenance before the next run, carefully inspect the analytical system, review system changes, review patient data, etc.

Other than better error detection, are there reasons to use multi-rule procedures instead of single rules?

If the same error detection is available by multirule and single rule QC procedures, but that error detection is less than 90%, then it would still be valuable to use a multirule procedure because it can increase the number of control measurements inspected by applying rules across runs, thereby improving the detection of persistent errors (i.e., errors that begin in one run and persist until detected).

Another potential advantage of a multirule procedure is that the rule violated can provide a clue about the type of analytical error that is occurring. For example, violations of rules such as 2_2s, 4_1s, and 8_x are more likely due to systematic errors, whereas violations of rules such as 1_3s and R_4s are likely due to random errors.

What rules are most sensitive for detecting systematic errors?

Rules like the 2_2s, 3_1s, 4_1s, 6_x, 8_x, 9_x, 10_x, and 12_x tend to be more sensitive to systematic error than random error.

What causes systematic errors?

Systematic errors may be caused by inaccurate standards, poor calibration, inadequate blanks, improperly prepared reagents, degradation of reagents, drift of detectors, degradation of instrument components, improper setting of temperature baths, etc.

What rules are most sensitive for detecting random error?

Rules like the 1_2.5s, 1_3s, 1_3.5s, and R_4s are most likely to detect random errors.
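If rule checking is computerized, the groupings given in the answers above could be used to attach a hint about the likely error type to each violation. The following Python sketch is only an illustration; the set and function names are assumptions, and the hint is a tendency, not a diagnosis.

```python
# Rules listed above as more sensitive to systematic error.
SYSTEMATIC_RULES = {"2_2s", "3_1s", "4_1s", "6_x", "8_x", "9_x", "10_x", "12_x"}

# Rules listed above as most likely to detect random error.
RANDOM_RULES = {"1_2.5s", "1_3s", "1_3.5s", "R_4s"}


def likely_error_type(rule):
    """Suggest the type of error a violated rule most likely points to."""
    if rule in SYSTEMATIC_RULES:
        return "systematic"
    if rule in RANDOM_RULES:
        return "random"
    return "unknown"


for rule in ["4_1s", "R_4s", "1_2s"]:
    print(rule, "->", likely_error_type(rule))
```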

What causes random errors?

With automated systems, random errors may be due to incomplete mixing, bubbles or particles in the reagents, probe and syringe variations, optical problems, sample line problems, etc. With multitest use of a control material, apparent control problems on several tests may actually be a random or individual problem with the control material itself. With manual methods, random errors may be due to aliquoting and pipetting, timing variations in critical steps, readout variation from cell to cell, etc.

When can a single rule QC procedure be used instead of a multirule procedure?

In the case where a single rule procedure provides 90% detection of the critical-sized errors in a single run, any problems that occur will generally be detected right away, in the first run in which they occur. The single rule procedure may be simpler to implement and therefore be preferable for such applications. With high precision automated chemistry and hematology analyzers, there may be many tests for which a single rule QC procedure is perfectly adequate.

How do you decide if you need to apply rules across runs?

You need to first assess the error detection within the run by use of a critical-error graph or an OPSpecs chart. If error detection is less than 90%, then it will generally be advantageous to apply rules across runs to detect persistent errors as soon as possible.

When one rule in a multirule combination is violated, do you exclude just that control value from the QC statistics?

No, you exclude all the control values in that run. Remember the QC statistics are supposed to provide estimates of the mean and SD to represent stable method performance, which are then used to compare with current performance. All control results in an out-of-control run are suspected of not representing stable performance.
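A small Python sketch of this bookkeeping follows. It assumes each run is stored as a list of control values for one control material plus a flag saying whether the run was rejected; that data layout and the function name are hypothetical.

```python
from statistics import mean, stdev


def stable_statistics(runs):
    """Mean and SD computed only from control values in runs that were accepted.

    runs is a list of (control_values, rejected) pairs. Every value in a
    rejected run is dropped, not just the value that violated a rule.
    """
    kept = [v for values, rejected in runs if not rejected for v in values]
    return mean(kept), stdev(kept)


# Hypothetical data: the third run was rejected, so both of its values are excluded.
runs = [
    ([100, 102], False),
    ([98, 100], False),
    ([110, 99], True),
]
print(stable_statistics(runs))
```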

Immunoassay QC

This month's second question comes from Brisbane, Australia:

A website visitor raised some issues about QC for immunoassay methods. He noted that the daily controls are well within the manufacturer's specifications - so much so they often form a straight line over periods of time and are not randomly distributed about the mean as one would expect.

Are the manufacturer's specifications for acceptable control values too wide?
Should we set our own control limits based on our control data?
How do you use control charts on extremely stable immunoassay analyzers?
How do you determine the frequency with which to run controls on extremely stable analyzers?
Where can I find some example QC planning applications for immunoassay methods?

Are the manufacturer's specifications for acceptable control values too wide?

Manufacturers sometimes use controls to ascertain whether their systems are behaving in a normal manner or whether they need to troubleshoot the system. In doing so, they may set the limits of acceptable values to encompass the performance expected from most of their systems in the field. These limits may reflect both within laboratory variation and between laboratory variation, and, therefore, may be much wider than appropriate for QC within a single laboratory.

Should we set our own control limits based on our control data?

Yes, this is the best practice for achieving tight control of a testing process. This practice allows you to optimize the testing process in your laboratory for cost-effective operation. You can then take into account the quality required in your laboratory when you select control rules and numbers of control measurements. You can follow our QC planning process and apply our QC planning tools to help you set appropriate control limits.

How do you use control charts on extremely stable immunoassay analyzers?

Make sure you assess stability relative to the critical sizes of error that would be medically important in your laboratory, rather than the manufacturer's specifications. If the method operates well within the quality required in your laboratory, you will be able to employ single rules such as 1_3.5s or 1_3s with a low number of control measurements. These QC procedures will assure a low rate of false rejections, and therefore contribute to cost-effective operation of your testing processes.

How do you determine the frequency with which to run controls on extremely stable analyzers?

This is a tough question and we don't have an easy answer!

In principle, if the analyzer were perfectly stable and never had a problem, it wouldn't be necessary to run any controls. On the other hand, if the analyzer frequently has problems, then it is necessary to run controls very often. Deciding how often to run controls is difficult because we seldom have adequate information about the stability of an analyzer under the operating conditions of our own laboratory. Furthermore, it is difficult to transfer any information from other laboratories because they may operate the analyzer under conditions that are different from ours.

The frequency of problems could be assessed if we were able to implement an ideal QC design that detects all medically important errors and gives essentially no false rejections. Then we could count the number of rejections, compare to the total number of acceptances plus rejections (or total number of runs) to determine the rate or frequency of problems, then optimize the QC design based on the frequency of problems.
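The counting step above amounts to a simple ratio, sketched here in Python. The function name and the example counts are hypothetical, and the result is only meaningful under the stated assumption of a QC design that detects all medically important errors with essentially no false rejections.

```python
def problem_frequency(rejections, acceptances):
    """Fraction of runs rejected: rejections / (acceptances + rejections)."""
    total_runs = rejections + acceptances
    if total_runs == 0:
        raise ValueError("no runs recorded")
    return rejections / total_runs


# Hypothetical counts: 3 rejected runs out of 300 in total.
print(problem_frequency(rejections=3, acceptances=297))
```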

Alternatively, we can make some judgment on the factors which make an analyzer susceptible to problems, such as a change in operator, change in reagents, recalibration, maintenance, etc., then establish the appropriate times for checking the operation of the analyzer. Manufacturers will generally recommend the maximum length of time that the analyzer can be run without rechecking performance. Laboratories may need to set a shorter length of time based on their operating conditions.

One novel way of measuring run length would be to use patient data to monitor the stability of the analyzer, as we discussed in a recent article in Clinical Chemistry:

JO Westgard, FA Smith, PJ Mountain, S Boss. Design and assessment of average of normals (AON) patient data algorithms to maximize run lengths for automatic process control. Clin Chem 1996;42:1683-1688.

The ability to implement this approach will depend on the workload of the laboratory. The approach is probably most workable in high volume automated laboratories.

So, in practice we usually end up making our best judgment of when to run controls on the basis of the maximum period allowable according to the manufacturer's recommendations, our knowledge of the methods and their susceptibility to problems, our experience with how often we have problems with the analyzer, and the factors that affect operation of the analyzers in our own laboratories.

Where can I find some example QC planning applications for immunoassay methods?

We provided some detailed examples for prolactin, total β-hCG, CEA, FSH, LH, TSH, and β2-microglobulin in a recent paper:

K Mugan, IH Carlson, JO Westgard. Planning QC procedures for immunoassays. Journal of Clinical Immunoassay 1994;17:216-222.

The application of higher N multirule procedures to immunoassay methods was discussed in a recent continuing education publication:

JO Westgard, CA Haberzettl. Quality control for immunoassays. AACC Diagnostic Endocrinology and Metabolism: An in-service training & continuing education program. 1996;14(September):239-243.

Neill Carey has written a couple of good application papers:

RN Carey. Quality-control rules for immunoassay. AACC Endocrinology and Metabolism: In-service training and continuing education 1992;10(September):9-14.
RN Carey, JL Tyvoll, DS Plaut, MS Hancock, PL Barry, JO Westgard. Performance characteristics of some statistical quality control rules for radioimmunoassay. J Clin Immunoassay 1985;8:245-252.

We also include example applications for cortisol, thyroxine, and FSH in our OPSpecs Manual - Expanded Edition (pp 5-30 to 5-35).


The particular recommendation (on pages 113-114 of Cost-Effective QC) concerned when you might choose not to use the 4_1s and 10_x rules or when you might apply them as "warning" rules that are used prospectively to trigger inspection and preventive maintenance rather than apply them to reject a run.
