
W.o.W., part I: Total Error vs. Trueness and Measurement Uncertainty

September 2007

Total Error. Trueness. Uncertainty. Can these terms coexist? Under ISO, will defining an allowable error for a test become unacceptable? Will the embrace of ISO accreditation mandate the rejection of all non-ISO-conforming terminology and concepts?

The debate - and the future - is uncertain.

A War of Words (W.o.W.) in Laboratory Medicine, Part I:

Total Error vs. Trueness and Measurement Uncertainty

US politics has become a war of words, with each side trying to name their policies and positions in a way that will be viewed favorably by the voters. There now are entire books that focus on “framing” ideas so they will be seen positively and therefore be more readily accepted by the public. We have a “No Child Left Behind Program” that leaves many children without an adequate education and a Clean Air Act that allows continuing pollution of the environment. Instead of an escalation of the war in Iraq, we have just witnessed a “surge” in US troops. Many other examples of framing issues and ideas can be found in the news today!

Would you believe that there also is a war of words going on in laboratory medicine? Yes, it’s true, or should I say, there is a certain “trueness” to that statement. If you’re uncertain about trueness, you’re not alone!

This discussion will describe the conflict that is occurring (Part I), trace the history and evolution of the concepts and terminology used to describe the quality and performance of measurement procedures (Part II), then rationalize the differences between conflicting terms and their underlying concepts to provide a framework for understanding how these concepts can and should be used together (Part III), and finally provide some example data to illustrate the practical meaning of different estimates and measures of test quality and method performance (Part IV).

The Current Battle!

In describing the quality of laboratory testing, the favored terms today are trueness, bias, precision, and uncertainty [1], rather than inaccuracy (or systematic error), imprecision (or random error), and total error, which have been commonly used in laboratory medicine for the last two to three decades. Not only are the words different, but so are the concepts of accuracy or correctness and the preferred measures or methodology for characterizing those terms.

This new terminology is being advanced by ISO, the International Organization for Standardization [1], which adheres to the official vocabulary of the VIM [2, the International Vocabulary of Basic and General Terms in Metrology] and recommends estimation of uncertainty using the methodology of the GUM [3, the Guide to the Expression of Uncertainty in Measurement].

The intent of ISO/VIM/GUM is to make measurements transferable globally by eliminating or correcting biases or systematic errors between measurement systems, and then to report any remaining variability of a test result (its uncertainty) to inform the user of its quality. These are good intentions, but they’re certainly not new or unique to ISO/VIM/GUM, nor are those objectives achievable only by using their proposed concepts and terminology for measurement quality and method performance.
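For readers unfamiliar with the GUM calculation, the basic idea is that the individual uncertainty components of a measurement are combined by root-sum-of-squares and the result is multiplied by a coverage factor. A minimal sketch, with illustrative symbols of my own choosing rather than notation quoted from the GUM itself:

  u_c(y) = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}   (combined standard uncertainty from independent components)
  U = k \cdot u_c(y), with k ≈ 2 for roughly 95% coverage   (expanded uncertainty)

The result is then reported as y ± U.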

Why must it be done the ISO/VIM/GUM way? Dybkaer, a medical laboratory scientist who is one of the leading advocates of this approach, argues that the current total error concept is flawed because it allows systematic errors to exist, rather than requiring them to be eliminated [4]:

“When describing the performance of procedures and the reliability of their results, ISO terminology should be used. Results should be universally comparable and this requires metrological traceability, the concomitant uncertainty (inversely) indicating reliability should be obtained in a universal and transparent fashion, and should be combinable. Therefore, the approach of [GUM], leading to a result with known bias and a combined standard uncertainty has advantages over the allowable total error concept, incorporating procedural bias.”

The phrase “incorporating procedural bias” is key here. Dybkaer is saying that the inclusion of “procedural bias” or systematic error in the concept of total error allows manufacturers to avoid the goal of globally consistent test results:

“The allowable total error (which for practical purposes could also have been termed ‘allowable deviation’) is set for a given type of quantity and purpose. The distribution between constant and random contributions may then be chosen freely within the total sum, which may include a known procedural bias. This is one reason for the outcome of external quality assessment where results are clustered in method-dependent groups when measuring systems are precise rather than true. Consequently,

  • Results from different measurement procedures are not directly comparable;
  • Biological reference intervals will depend on the procedure or have to be widened to accommodate results from all procedures, leading to a loss of diagnostic capability;
  • Classification of biological states by comparing a given result with common limits becomes hazardous;
  • Equations between different types of quantities cannot work across procedures;
  • Movement of patients between health services requires repeat measurements.

Such conditions do not seem acceptable in terms of health and resources, and may lead to complaints and loss of business.”

The remedy proposed is to avoid defining a quality requirement for a test and, instead, estimate the uncertainty of the test result:

“Rather than defining an allowable total error with estimated elements of all types of systematic and random error (hitherto often called inaccuracy and imprecision, respectively), any result should be corrected for known significant biases and should have a measure of uncertainty attached giving an interval comprising a large fraction of the reasonably possible values of the measurand with a given level of confidence.”

Metrologists call for correction of any known biases, but that is a risky procedure for clinical tests because there are relatively few reference methods and materials, so it is difficult to know what correction is actually correct. Even after correction, there will likely be some remaining biases, which are then to be included in the estimate of measurement uncertainty.
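In the GUM framework, the uncertainty of the bias correction itself simply becomes another component of the combined uncertainty. Schematically (again with illustrative symbols rather than any official notation):

  u_c^2 = u_{measurement}^2 + u_{correction}^2

so an imperfect or poorly characterized correction inflates, rather than removes, the reported uncertainty.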

In Defense of Total Error

My perspective is quite different! I believe laboratories need to manage the quality of their testing process in order to deliver test results that are fit for use in patient care. If you don’t know the quality that is required, then how can you manage quality objectively? A good analogy is to consider an error budget for a testing process: what errors are observed during method evaluation, and how do they compare with the maximum error that can be tolerated without compromising the use and interpretation of the test result?
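To make the error-budget analogy concrete, the observed errors can be combined into an estimate of total error and compared against the allowable total error (TEa). A minimal sketch, using hypothetical numbers rather than any particular test:

  TE_obs = |bias| + z \cdot s   (z is commonly taken as 1.65 or 1.96)
  The method is acceptable when TE_obs ≤ TEa.

  Example: bias = 2.0%, CV = 3.0%, TEa = 10%
  TE_obs = 2.0 + 1.65 × 3.0 = 6.95%, which fits within the 10% budget.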

ISO 15189 actually acknowledges this important application with its statement in section 5.5:

  • Performance specifications for each procedure used in an examination shall relate to the intended use of that procedure

This implies that the precision and accuracy of a measurement procedure (examination procedure in ISO terminology) should be appropriate for the “intended use,” which is the clinical application of the test. To do this, the laboratory should define how good the test must be for its intended clinical use, which then leads to a definition of a quality requirement for the test. This can most readily be done by applying the total error model and the criteria for acceptability of different types of analytical errors [5].

ISO also provides guidance that laboratories should design their internal quality control on the basis of the intended quality of test results, as stated in section 5.6:

  • The laboratory shall design internal quality control systems that verify the attainment of the intended quality of results.

This ISO guidance can most readily be accomplished by utilizing the total error model and employing the expanded “error budgeting” models, e.g., the analytical and clinical quality planning models [6-7]. These models are supported by manual [8-9] and computer [10-11] tools that make them readily available for practical applications in the laboratory. Therefore, the traditional error concepts and error budgeting tools should not be displaced by ISO’s guidance for trueness and uncertainty. These different concepts have different applications in the overall framework for quality management, and some of those applications are more readily accomplished with existing error models and related tools.
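To illustrate how the quality-planning models connect a quality requirement to QC design, here is a minimal sketch in Python of the kind of calculation involved; the function names are mine, and the formulas are the commonly published sigma-metric and critical systematic error expressions, not code taken from the cited tools [10-11]:

  def sigma_metric(tea, bias, cv):
      """How many standard deviations of 'budget' remain after bias (all values in %)."""
      return (tea - abs(bias)) / cv

  def critical_systematic_error(tea, bias, cv, z=1.65):
      """Size of systematic shift (in multiples of s) that QC must be able to detect."""
      return (tea - abs(bias)) / cv - z

  # Hypothetical example: TEa = 10%, bias = 2%, CV = 3%
  print(sigma_metric(10, 2, 3))               # about 2.67
  print(critical_systematic_error(10, 2, 3))  # about 1.02, which demands sensitive QC rules

The larger the critical systematic error, the simpler the QC rules and the fewer control measurements needed to detect medically important errors.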

What’s the point?

When these concepts and terms are debated, the arguments tend to go to extremes and discourage or even exclude the use and application of the offending terms. Dybkaer is absolutely correct about the problems we are having due to systematic differences between analytical methods and about the need for improvements to make clinical test results transferable. But those improvements are not restricted to, or dependent on, the use of ISO concepts and terminology. There also is a place and role for more traditional terms and concepts that can fulfill some of the needs for improving laboratory quality management.

Strangely, total error now finds itself being a “traditional” term and seemingly subject to attempted replacement or displacement. I say “strangely” because the concept of total error was considered revolutionary when introduced some 30 years ago and only became well accepted after some 10 to 15 years in the field. The ISO terms of trueness and measurement uncertainty face that same challenge today to gain acceptance in the field. That will be more readily accomplished by incorporating them into the existing laboratory concepts and terminology, rather than by attempting to displace the well-established total error concept.

In part II, I’ll discuss the history and evolution of these different concepts and terms.

References

  1. ISO/FDIS 15189. Medical laboratories – Particular requirements for quality and competence. International Organization for Standardization, Geneva, Switzerland, 2002.
  2. International Vocabulary of Basic and General Terms in Metrology (VIM). 3rd ed. Draft April 2004, Annex A.
  3. Guide to the Expression of Uncertainty in Measurement (GUM). ISO, Geneva, 1995.
  4. Dybkaer R. Setting quality specifications for the future with newer approaches to defining uncertainty in laboratory medicine. Scand J Clin Lab Invest 1999;59:579-584.
  5. Westgard JO, Carey RN, Wold S. Criteria for judging precision and accuracy in method development and evaluation. Clin Chem 1974;20:825-833.
  6. Westgard JO, Hyltoft Petersen P, Wiebe DA. Laboratory process specifications for assuring quality in the U.S. National Cholesterol Education Program (NCEP). Clin Chem 1991;37:656-661.
  7. Westgard JO, Wiebe DA. Cholesterol operational process specifications for assuring the quality required by CLIA proficiency testing. Clin Chem 1991;37:1938-1944.
  8. CLSI C24-A3. Statistical Quality Control for Quantitative Measurement Procedures: Principles and Definitions. Clinical and Laboratory Standards Institute, Wayne, PA, 2006.
  9. Westgard JO. Basic Planning for Quality. Madison, WI: Westgard QC, Inc., 2001.
  10. Westgard JO, Stein B, Westgard SA, Kennedy R. QC Validator 2.0: a computer program for automatic selection of statistical QC procedures in healthcare laboratories. Comput Methods Programs Biomed 1997;53:175-186.
  11. Westgard JO, Stein B. An automatic process for selecting statistical QC procedures to assure clinical or analytical quality requirements. Clin Chem 1997;43:400-403.

James O. Westgard, PhD, is a professor emeritus of pathology and laboratory medicine at the University of Wisconsin Medical School, Madison. He also is president of Westgard QC, Inc. (Madison, Wis.), which provides tools, technology, and training for laboratory quality management.