# Quality Requirements and Standards

## Theory versus Practice in Analytical Quality Specifications

**As Yogi Berra once said, "In theory, there is no difference between theory and practice - in practice, there is." So it is with MU.**


#### March 2024

James O. Westgard, Sten A. Westgard

Recent papers have renewed our interest in the discussions (arguments) about current theories and practices for analytical goal-setting. The first is a summary opinion paper, written on behalf of the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Task Group for the Biological Variation Database, that updates the EFLM recommendations for Analytical Performance Specifications (APS) [1]. The second discusses actual laboratory practices related to the use of APS in Italy [2]. The interesting connection between these two papers is that the Italian authors are perhaps the leading promoters of metrology applications in laboratory medicine. Together, the two papers contrast APS theory and EFLM guidelines with the actual practices of a group of laboratories in Italy.

### EFLM models for APS

EFLM prefers separate specifications for imprecision and bias, but also describes how specifications for allowable total error can be calculated from biologic variation data, where CV_{I} represents individual variation and CV_{G} represents group variation.

Allowable imprecision = CV_{A} = 0.5*CV_{I}

Allowable bias = Bias_{A} = 0.25*(CV_{I}^{2}+CV_{G}^{2})^{1/2}

Allowable total error = TE_{A} = 1.65*(0.5*CV_{I}) + 0.25*(CV_{I}^{2}+CV_{G}^{2})^{1/2}

The model for calculating allowable imprecision was originally formulated in the 1970s by Cotlove *et al* [3], who recommended that the allowable analytical CV should be one-half of CV_{I}. This limits the added analytical noise to about 12% of the biological signal, because the combined variation becomes (1 + 0.5^{2})^{1/2} = 1.12 times CV_{I}. The model for allowable bias was developed by Gowans *et al* [4] in 1988, and the model for allowable total error was formulated in 1993 by Fraser and Hyltoft Petersen [5]. EFLM officially discourages the use of the TE_{A} model in favor of allowable measurement uncertainty, defined as the allowable standard uncertainty (CV_{A}) expanded by a factor of 2 to give approximately 95% limits, as described by Braga *et al* [6].

Allowable measurement uncertainty = MU_{A} = 2*CV_{A} = 2*0.5*CV_{I} = CV_{I}
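The four EFLM expressions above can be put side by side in a few lines of code. The following is a minimal sketch: the function name and the example CV_{I} and CV_{G} values are hypothetical and chosen only for illustration, not taken from the EFLM database.

```python
import math

def aps_from_biological_variation(cv_i, cv_g):
    """Return the EFLM-style specifications (all in %) for a measurand.

    cv_i: within-subject (individual) biological variation, in %
    cv_g: between-subject (group) biological variation, in %
    """
    cv_a = 0.5 * cv_i                              # allowable imprecision
    bias_a = 0.25 * math.sqrt(cv_i**2 + cv_g**2)   # allowable bias
    te_a = 1.65 * cv_a + bias_a                    # allowable total error
    mu_a = 2.0 * cv_a                              # allowable expanded MU (= CV_I)
    return {"CV_A": cv_a, "Bias_A": bias_a, "TE_A": te_a, "MU_A": mu_a}

# Illustrative (hypothetical) inputs: CV_I = 5.0%, CV_G = 10.0%
specs = aps_from_biological_variation(cv_i=5.0, cv_g=10.0)
for name, value in specs.items():
    print(f"{name}: {value:.2f}%")
```

For these inputs the allowable total error is noticeably larger than the allowable measurement uncertainty, because TE_{A} adds a bias allowance on top of the imprecision term while MU_{A} depends on CV_{I} alone.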

The EFLM website states the argument against total error as follows:

*“[T]his conventional model for deriving total allowable error from biological variation data is flawed. It sums up two mutually exclusive terms, i.e., maximum allowable bias and maximum allowable imprecision, resulting in overestimating allowable total error.”*

That the website continues to provide these calculations was the subject of vocal, vigorous debate at the recent Prague 2023 conference.

### Total Analytical Error (TAE) Model

It is important first to point out that this criticism of total error applies to the EFLM goal-setting model, which is different from the Total Analytical Error (TAE) model that we developed for determining the total amount of analytical error that might affect a test result. In estimating TAE, the observed imprecision and observed bias are added together (TAE_{obs} = |bias_{obs}| + 1.65*s_{obs}) to estimate the TAE observed for a particular test method. From our perspective, both imprecision and bias are error terms, and both types of error can affect patient test results. Their total effect should therefore be determined to judge how analytical errors will influence the use and interpretation of a patient test result. In practice, bias causes a systematic error (in one direction), whereas imprecision is a random error that can be in either direction. Bias will shift the location of the test value in one direction or the other, but in either case it adds to the total deviation or total error, hence our use of absolute bias for the estimation of TAE.
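As a minimal sketch of the TAE estimate just described, the function below adds the absolute observed bias to 1.65 times the observed SD (or CV). The function name and the example numbers are hypothetical.

```python
def observed_tae(bias_obs, sd_obs, z=1.65):
    """Observed total analytical error: |bias_obs| + z * sd_obs.

    bias_obs and sd_obs must be in the same units (both % or both concentration).
    The absolute value is used because bias in either direction adds to total error.
    """
    if sd_obs < 0:
        raise ValueError("sd_obs must be non-negative")
    return abs(bias_obs) + z * sd_obs

# Hypothetical method: bias of -2.0% (or +2.0%) with a CV of 1.5%
tae_negative_bias = observed_tae(bias_obs=-2.0, sd_obs=1.5)
tae_positive_bias = observed_tae(bias_obs=2.0, sd_obs=1.5)
print(f"TAE with -2.0% bias: {tae_negative_bias:.3f}%")
print(f"TAE with +2.0% bias: {tae_positive_bias:.3f}%")
```

Both calls give the same result, which is the point of taking the absolute bias: the sign of the shift does not change the size of the total error.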

We should also note that the calculation of the observed TAE originated for the intended application in method evaluation/verification experiments, where separate experiments provide estimates of imprecision (replication experiment) and bias (comparison of methods experiment), as originally described by Westgard, Carey and Wold [7]. That is the source of the Total Analytical Error model.

This TAE model is related to industrial process capability indices, as discussed elsewhere [8]. The most common process capability index is Cp, which considers the tolerance limits for an acceptable product relative to the variation (SD) of the production process. A related index, Cpk, takes the “centerness” of the process into account (bias) and relates directly to the Sigma Metric, as shown below:

3*Cpk = (TE_{A} - |Bias_{obs}|)/SD_{obs} = Sigma Metric

Industrial recommendations for process performance have been based on Cpk. The minimum acceptable performance for a production process is a Cpk of 1.0, or 3 Sigma. For a more controllable process, Cpk should be 1.33, or 4 Sigma. The goal for excellent performance is a Cpk of 2.0, which corresponds to a 6 Sigma process for world class quality.
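The relationship between the Sigma Metric, Cpk, and these industrial benchmarks can be sketched as follows. The function names, the label wording, and the example inputs are hypothetical; the thresholds are the 3, 4, and 6 Sigma levels named above.

```python
def sigma_metric(te_a, bias_obs, sd_obs):
    """Sigma Metric = (TE_A - |Bias_obs|) / SD_obs, all in the same units."""
    if sd_obs <= 0:
        raise ValueError("sd_obs must be positive")
    return (te_a - abs(bias_obs)) / sd_obs

def cpk_from_sigma(sigma):
    """Cpk is one-third of the Sigma Metric (3 * Cpk = Sigma)."""
    return sigma / 3.0

def performance_label(sigma):
    """Map a Sigma Metric onto the industrial benchmarks (illustrative labels)."""
    if sigma >= 6.0:
        return "world class"
    if sigma >= 4.0:
        return "more controllable"
    if sigma >= 3.0:
        return "minimum acceptable"
    return "below minimum"

# Hypothetical test: TE_A = 10%, observed bias = -1.0%, observed CV = 1.5%
sigma = sigma_metric(te_a=10.0, bias_obs=-1.0, sd_obs=1.5)
print(f"Sigma = {sigma:.2f}, Cpk = {cpk_from_sigma(sigma):.2f}, "
      f"performance: {performance_label(sigma)}")
```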

### EFLM Goal Setting Model for Allowable Total Error (TE_{A})

The EFLM goal-setting model for defining the total amount of error that is allowable (TE_{A}) was proposed by Fraser and Hyltoft Petersen [5] for use with external QC and/or proficiency testing applications. The key factor in its recommendation was the condition that only one measurement is allowed on EQC/PT samples; that measurement is therefore subject to both random and systematic errors. They then used the biologic goals for precision and bias to calculate a 95% confidence limit for allowable total error.

It is this TE_{A} goal-setting model that has been criticized by Oosterhuis [9] for overestimating the allowable total error. The argument is that the expression for allowable bias comes from an application by Gowans and Hyltoft Petersen for setting analytical specifications for the transfer of reference ranges within a geographic area [4]. The objective in transferring a reference range to another method is to limit the population falling outside the reference limits to 4.6%, corresponding to the uncertainty observed when a reference range is determined from analysis of 120 normal subjects. To achieve this condition, the maximum allowable bias between methods is 0.25*(CV_{I}^{2}+CV_{G}^{2})^{1/2} when imprecision is 0.0%. In a related manner, the allowable imprecision is 0.6*CV_{G} when bias is 0.0%. In theory, this expression for allowable bias applies to a particular application, the transfer of reference intervals, and to a special condition, i.e., imprecision is 0.0%. Nonetheless, Fraser and Hyltoft Petersen adopted this expression for the allowable bias. That choice can be argued as appropriate because it combines the allowable bias in classifying a patient against a reference limit with the allowable random variation or imprecision of 1.65*CV_{A} for monitoring a patient, thus providing a 95% limit for scoring EQC/PT results.

### EFLM biologic variation database and online calculations

Notable accomplishments of the EFLM and its bio-database project include standardizing the criteria for studies of biologic variation in the database, review of existing studies that fulfill those criteria, assembling a comprehensive database of the results of those studies, and providing users with online calculation tools for allowable MU, allowable bias, and allowable TE [10]. The EFLM preferred nomenclature uses MAU for allowable expanded measurement uncertainty, MAu for allowable standard measurement uncertainty, CVA for allowable relative imprecision, and TEa for allowable total error.

### EFLM recommendations

*“There are many different approaches for setting APS, but today, APS for many measurands are defined based on BV data. Which formulae to use is dependent on e.g. if the TE or MU paradigm is used, the latter being metrologically the more correct. APS for bias is, however, appropriate for laboratories implementing TE paradigms.”*

Perhaps the most important revelation here is the acknowledgment that **bias still exists**, therefore an APS for bias is still relevant, attributing this of course to those laboratories that still employ the supposedly outdated concept of Total Error [11]. However, models for calculating MU still vacillate in whether to use the uncertainty in the estimate of bias or the actual existing bias (as bias squared) together with the estimate of intermediate imprecision.
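To illustrate why the choice between the two treatments of bias matters, the sketch below compares them using a generic root-sum-of-squares combination with a coverage factor of 2. This is an assumption made for illustration only: the function names, the specific formulas, and the input values are hypothetical and are not taken from the EFLM paper or any particular guideline.

```python
import math

def expanded_mu_with_bias_uncertainty(cv_rw, u_bias, k=2.0):
    """Combine intermediate imprecision with the uncertainty of the bias estimate."""
    return k * math.sqrt(cv_rw**2 + u_bias**2)

def expanded_mu_with_bias_itself(cv_rw, bias, k=2.0):
    """Combine intermediate imprecision with the existing bias (as bias squared)."""
    return k * math.sqrt(cv_rw**2 + bias**2)

# Hypothetical method: intermediate imprecision 2.0%, bias 1.5%,
# standard uncertainty of the bias estimate 0.5%
mu_from_u_bias = expanded_mu_with_bias_uncertainty(cv_rw=2.0, u_bias=0.5)
mu_from_bias = expanded_mu_with_bias_itself(cv_rw=2.0, bias=1.5)
print(f"Expanded MU using u(bias): {mu_from_u_bias:.2f}%")
print(f"Expanded MU using bias:    {mu_from_bias:.2f}%")
```

With these assumed numbers, the same method reports a smaller uncertainty when only the uncertainty of the bias estimate is included, so the two conventions are not interchangeable when results are compared against an APS.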

### Theoretical vs practical considerations

We have written earlier that the metrology MU model may be preferred for its theory, but that the TAE model is more practical and useful in the laboratory [12]. As confirmation, the recent report by Ceriotti *et al* is revealing [2]. They surveyed the use of APS in laboratories in northern Italy, also including members of the Italian Society of Clinical Biochemistry and Clinical Molecular Biology. The survey comprised the seven questions shown below:

1) Have you ever heard of Analytical Performance Specifications (APS)?

2) If so, have you ever used APS targets in your laboratory work?

3) If so, in which situation?

4) Have you ever used APS to evaluate the performance of a new method before introducing it for clinical practice?

5) If you have used APS at least on some occasions, where did you get the information from (in most cases)?

6) If you have used APS at least on some occasions, what type of estimate did you apply them to:

7) Did you ever calculate the uncertainty of your measurements?

For brevity, we focus here on results from questions 6 and 7.

What kind of estimate? Imprecision goals were identified by 72 labs (67%); bias goals by 26 (24%); total error goals by 51 (47%); uncertainty goals by 19 (18%). **Thus, TE goals are in more widespread use than MU goals, cited more than twice as often (51 vs 19 labs).**

Did you ever calculate the uncertainty of your measurements? No by 146 labs (73%); Yes for selected measurements by 44 labs (21.9%); Yes for the majority of measurements by 11 labs (5.5%). **Thus, most laboratories do not currently calculate measurement uncertainty.**
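The reported percentages can be checked directly from the counts. The short sketch below assumes that the three question 7 responses add up to the total number of respondents (201), which is an inference from the counts rather than a figure stated above; the variable names are hypothetical.

```python
# Question 7 counts as reported in the survey results
q7_counts = {
    "No": 146,
    "Yes, selected measurements": 44,
    "Yes, majority of measurements": 11,
}
q7_total = sum(q7_counts.values())  # assumed to equal the number of respondents
q7_percent = {answer: round(100 * n / q7_total, 1) for answer, n in q7_counts.items()}

# Question 6: laboratories citing total error goals vs uncertainty goals
te_labs, mu_labs = 51, 19
te_to_mu_ratio = te_labs / mu_labs

for answer, pct in q7_percent.items():
    print(f"{answer}: {pct}%")
print(f"TE goals cited {te_to_mu_ratio:.1f} times as often as MU goals")
```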

In explaining these results, the authors make the following statement:

*“Regarding the calculation of measurement uncertainty, we must underline the fact that in Italy the ISO 15189 accreditation is still at the beginning (only 49 accredited laboratories) and the calculation of measurement uncertainty is not a requirement for Institutional accreditation of clinical laboratories. So, it is not surprising that only the 5.5% of laboratories declared to calculate measurement uncertainty for the majority of measurands.”*

This confirms our worst fear: that MU is an ISO 15189 accreditation requirement whose only effect is to mandate that the theory (actually, only the calculations) be implemented in laboratories. If so, MU will be (and in many cases already is) just an exercise performed for accreditation, rather than an effort to improve the practice of measuring and managing quality in medical laboratories. In contrast, the TAE model has gained world-wide acceptance because of its practical applications in the validation of method performance, its usefulness together with Six Sigma concepts for quality assessment, and its practical tools for designing SQC procedures to verify the attainment of the intended quality of results (an ISO 15189 requirement).

In 25 years, measurement uncertainty has managed to gain the dedicated following of 5.5% of laboratories. At this rate, we'll see complete adoption sometime around 2230.

### References

1. Sandberg S, Coskun A, Carobene A, Fernandez-Calle P, Diaz-Garzon J, Bartlett WA, Jonker N, Galior K, Gonzalez-Lao E, Moreno-Parro I, Sufrate-Vergara B, Webster C, Aarsand AK. Analytical performance specifications based on biological variation data – considerations, strengths, and limitations. Clin Chem Lab Med 2024. https://doi.org/10.1515/cclm-2024-0108.
2. Ceriotti F, Buoro S, Pasotti F. How clinical laboratories select and use analytical performance specifications (APS) in Italy. Clin Chem Lab Med 2024. https://doi.org/10.1515/cclm-2023-1314.
3. Cotlove E, Harris E, Williams G. Biological and analytical components of variation in long-term studies of serum constituents in normal subjects: Physiological and medical implications. Clin Chem 1970;16:1028-1032.
4. Gowans EMS, Hyltoft Petersen P, Blaabjerg O, Horder M. Analytical goals for the acceptance of common reference intervals for laboratories throughout a geographical area. Scand J Clin Lab Invest 1988;48:757-764.
5. Fraser CG, Hyltoft Petersen P. Quality goals in external quality assessment are best based on biology. Scand J Clin Lab Invest 1993;53(Suppl 212):8-9.
6. Braga F, Pasqualetti S, Borrillo F, Capoferri A, Chibireva M, Rovegno L, Panteghini M. Definition and application of performance specifications for measurement uncertainty of 23 common laboratory tests: linking theory to daily practice. Clin Chem Lab Med 2023;61:213-223. https://doi.org/10.1515/cclm-2022-0806.
7. Westgard JO, Carey RN, Wold S. Criteria for judging precision and accuracy in method development and evaluation. Clin Chem 1974;20:825-33.
8. Westgard SA, Bayat H, Westgard JO. Mistaken assumptions drive new Six Sigma model off the road. Biochemia Medica (Zagreb) 2019;29(1):010903. https://doi.org/10.11613/BM.2019.010903.
9. Oosterhuis W. Gross overestimation of Total Allowable Error based on biological variation. LTE. Clin Chem 2011;57:1334-36.
10. EFLM Biological Variation Database. https://biologicalvariation.eu.
11. Panteghini M. Reply to Westgard et al.: ‘Keep your eyes wide…as the present now will later be past’. Clin Chem Lab Med 2022. https://doi.org/10.1515/cclm-2022-0557.
12. Westgard JO. Total error more practical, but measurement uncertainty may still be preferred. Clin Chem 2018;64:636-638.