Misleading Comparisons, Minimal Review, Mars Moving Average Paper

If you compare 20 apples to 1 orange, is there a problem with your study? If you are using only 1 orange when you claim to be comparing against 5 oranges, is that another problem with your study?

“IQC: Moving average algorithms outperform Westgard rules”, Really?

James O. Westgard, PhD
September 2021

With multirule QC having been a standard for IQC for nearly 40 years, we periodically see studies that claim a new QC procedure is superior in performance. It's an easy way to make an exciting title for a journal. Topple the standard with a new kid on the block.

Such a study was just published in Clinical Biochemistry: Poh DKH, Lim CY, Tan RZ, Markus C, Loh TP. “Internal quality control: Moving average algorithms outperform Westgard rules.” Clin Biochem 2021; https://doi.org/10.1016.j.clinbiochem.2021.09.007.

The authors claim that “moving averages” of control results outperformed Westgard rules. “The larger the block size used in the moving averages algorithm, the greater the power of error detection.”

For those readers not familiar with recent publications about Patient Based Real Time Quality Control (PBRTQC), “block size” may not be a recognized term. What it refers to is the number of results included in a moving average (MA). When applied to patient data, it is the number of patient samples that are analyzed and results included in an estimate of a particular control statistic, e.g., a moving average of the patient results. When applied to QC data, it is the number of QC results that are included in the particular “moving average” statistic, whether a simple moving average, weighted moving average, or exponentially weighted moving average.
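For readers who prefer to see the three statistics side by side, here is a minimal sketch of a simple, a linearly weighted, and an exponentially weighted moving average over a series of QC results. The function names, the linear weighting scheme, and the smoothing constant `alpha` are illustrative assumptions; the paper's exact formulations may differ.

```python
from typing import Sequence


def simple_ma(values: Sequence[float], block: int) -> list[float]:
    """Plain mean of the most recent `block` results, once a full block exists."""
    return [sum(values[i - block:i]) / block
            for i in range(block, len(values) + 1)]


def weighted_ma(values: Sequence[float], block: int) -> list[float]:
    """Linearly weighted mean (assumed scheme): oldest result in the block
    gets weight 1, newest gets weight `block`."""
    weights = range(1, block + 1)
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, values[i - block:i])) / total
            for i in range(block, len(values) + 1)]


def exponential_ma(values: Sequence[float], alpha: float) -> list[float]:
    """EWMA: each new result gets weight `alpha`, the running value 1 - alpha.
    The first result seeds the running value."""
    out: list[float] = []
    running = values[0]
    for v in values:
        running = alpha * v + (1 - alpha) * running
        out.append(running)
    return out


qc_results = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0]  # made-up QC values
print(simple_ma(qc_results, block=5))
print(weighted_ma(qc_results, block=5))
print(exponential_ma(qc_results, alpha=0.2))
```

In every case the block size (or, for the EWMA, the smoothing constant) determines how many control results effectively contribute to each decision.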

In this particular study, the block sizes for the various MA algorithms are given as 5, 10, and 20. For the multirule procedures used for comparison of performance, only the rules are given: 1:3s, 2:2s, and 1:3s/2:2s. We are left on our own to understand that the block size would be 1 for the 1:3s procedure, 2 for a 2:2s procedure, and 2 for a multirule 1:3s/2:2s procedure.

The authors show power curves that demonstrate better error detection for MA algorithms having 5, 10, and 20 control results vs multirule procedures having 1 and 2 control results. That seems like a pretty obvious outcome because the number of control measurements is a primary contributor to increasing the error detection capabilities of SQC procedures.
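A rough normal-theory sketch shows how strongly N drives error detection. This is not the authors' simulation. It assumes Gaussian, independent control results, a systematic shift present in every result, and a simple mean rule with limits at ±3 SD/√N standing in for the MA algorithms, so that the false rejection rate is the same for every N. The function name `p_reject_mean_rule` is made up for illustration. With N=1 the rule reduces to 1:3s.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def p_reject_mean_rule(shift_sd: float, n: int, limit: float = 3.0) -> float:
    """Probability that the mean of n control results falls outside
    +/- limit * SD / sqrt(n), given a systematic shift of shift_sd SDs."""
    z = shift_sd * sqrt(n)  # shift expressed in standard errors of the mean
    p_inside = STD_NORMAL.cdf(limit - z) - STD_NORMAL.cdf(-limit - z)
    return 1.0 - p_inside


# Detection probability for a 2 SD systematic shift at several block sizes.
for n in (1, 2, 5, 10, 20):
    print(f"N={n:2d}  P(reject)={p_reject_mean_rule(2.0, n):.3f}")
```

Under these assumptions the false rejection rate stays fixed while the detection probability climbs with N, which is the point: more control measurements buy more power regardless of which rule is applied to them.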

This comparison issue raises an initial concern: 1 vs 5, 1 vs 10, 1 vs 20. How is this a comparison of apples to apples? Should we be surprised that 20 measurements might have a different ability to detect errors than 1 measurement? If we are surprised, expect a follow-up paper to come: 40 - gasp - is also better than 1. I can even see a number of papers that might come out concerning 2...

A second issue comes up under what is being claimed as "Westgard Rules" by the authors. First, let's remember that the literature does not really recognize the term "Westgard Rules"; that's just the common name. What became popular as the Westgard Rules was what the original paper called a multirule procedure. But the typical implementation of "Westgard Rules" is not just 1 single rule (it can't be a multirule if there is only 1 rule). Usually when you say "Westgard Rules" you think of the 1:3s, 2:2s, R:4s, 4:1s, and 10:x rules. The 4:1s and 10:x rules would require block sizes of 4 and 10, and that would be a fairer comparison.
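To make the block sizes concrete, here is a simplified sketch of the five rules applied to control results expressed as z-scores (deviations from the target mean in SD units, newest last). The function name `violated_rules` is hypothetical, and the sketch makes simplifying assumptions: it looks only at the most recent results, pools all results into one series, and applies R:4s to the last two results rather than strictly within a run.

```python
from typing import Optional, Sequence


def violated_rules(z: Sequence[float]) -> list[str]:
    """Return the multirule violations triggered by the most recent z-scores."""
    if not z:
        return []

    def last(n: int) -> Optional[Sequence[float]]:
        # The n most recent results, or None if fewer than n are available.
        return z[-n:] if len(z) >= n else None

    hits: list[str] = []

    # 1:3s needs 1 result: one value beyond +/-3 SD.
    if abs(z[-1]) > 3:
        hits.append("1:3s")

    # 2:2s and R:4s need 2 results.
    w = last(2)
    if w and (all(v > 2 for v in w) or all(v < -2 for v in w)):
        hits.append("2:2s")
    if w and max(w) > 2 and min(w) < -2:
        hits.append("R:4s")

    # 4:1s needs 4 results: all beyond the same 1 SD limit.
    w = last(4)
    if w and (all(v > 1 for v in w) or all(v < -1 for v in w)):
        hits.append("4:1s")

    # 10:x needs 10 results: all on the same side of the mean.
    w = last(10)
    if w and (all(v > 0 for v in w) or all(v < 0 for v in w)):
        hits.append("10:x")

    return hits


print(violated_rules([0.4, 1.3, 1.1, 1.6, 1.2]))
```

The 4:1s and 10:x checks cannot fire until 4 and 10 results are available, so a comparison limited to 1 or 2 control results never exercises them.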

One wonders about the reviewers who accepted a study with such fundamental errors: the comparison is incorrect, and what is actually being compared is not what the authors state is being compared. Yes, any paper with the term PBRTQC and "moving averages" is very trendy right now, but there should still be some standards.

What is most worrisome is that this paper's publication may indicate a growing lack of knowledge about SQC procedures and the factors that influence performance, not just among the authors, but among reviewers, journal editors, and the laboratory community in general.

It should also be pointed out that it was actually never intended that multirule QC procedures be applied with more than 4 to 6 control measurements. In the original publication, Table 4, we summarized the QC rules that would be appropriate for different numbers of control observations, as follows:

Number of control measurements	Individual run	Consecutive runs
N=1	1:2s	4:1s
N=2	1:3s/2:2s/R:4s	4:1s/10:x
N=3	1:3s/2of3:2s/R:4s	9:x
N=4	1:3s/2:2s/R:4s/4:1s	8:x
N=4-10	Mean/range	Trend analysis
N=4-20	Mean/chi-square	Trend analysis

Note the recommendation to use mean and range rules once N exceeds 4 control measurements. Note also the use of trend analysis for higher Ns and consecutive runs (which are referenced to exponentially smoothed moving averages).

Given the overwhelming tide of in-print, pre-print, online-first papers, it is definitely a challenge for authors and reviewers to read all the original publications they cite to ensure they properly understand the sources. The original multirule paper is not just a set of rules, it's also a list of limitations of multirule procedures and practical guidance for applications. In this case, the authors did not even reference the original publication, but instead referenced a 1-page report about multirule performance by other authors about 15 years later. If you are going to compare against something, though, you should at least get the reference right, even if you don't actually read the paper. But reading the paper would be even better - in this case it would have saved the authors the trouble of having to write their paper.

So, for those who want to better understand our recommendations for multirule QC procedures, please see Clin Chem 1981;27(3):493-501. Note also that this publication includes the names of the reviewers and addresses their particular comments and recommendations in the text of the publication. That’s a novel approach for providing more transparent publications.
