A Demanding Standard for Quality
February 2003
An updated version of this essay appears in the Nothing but the Truth about Quality manual.
James O. Westgard, PhD, FACB
I had an experience recently that gave me a new insight into quality. In studying certain evidence, it became clear that the evidence available, even if true, was not sufficient to know if it represented the truth. I suddenly understood the meaning of telling "the truth, the whole truth, and nothing but the truth".
Partial evidence may not be a lie, but it isn't necessarily the truth. To know the truth requires full disclosure of the available evidence (the whole truth), plus recognition of other factors that may confound the truth (nothing but the truth).
American politicians provide a good example of telling only part of the truth. There is always at least one fact to support their statements, but they seldom tell the whole truth and nothing but the truth. In the heat of political campaigns, one might wonder whether there is even one iota of truth (an infinitesimal amount).
As scientists, we think we're objective and never stray from the truth. Do you think our work measures up to the "truth standard?" Here are some examples where the truth standard challenges the quality of laboratory medicine.
In the evening news on January 27, 2003, Dan Rather talked about THE new test for heart disease and the recommendations for widespread use. Of course, this is the high sensitivity C-reactive protein test (hs-CRP) that has been growing and growing in interest through publications of more and more articles in scientific journals. Now the test has made it into main street journalism and will be covered in the "medical moments" discussions on local TV. The February 2003 issue of Medical Laboratory Observer shows an advertisement (Dr. Shuman shown here) and includes a product announcement for a new instrument that "is suited to screening of large patient populations in specific clinical locations." Pretty soon there will be advertisements telling you to "remind your doctor" about the importance of this new test to prevent future cardiac events.
In evaluating a new diagnostic test by the "truth standard", the test should demonstrate the following characteristics:
- First, a relationship to the disease of interest;
- Second, a reliable measurement process that provides correct test results; and
- Third, other factors should not interfere and confound the meaning of the test result.
The first dimension of truth has been demonstrated by epidemiological studies that show a relationship between slight elevations of CRP and a higher risk of future cardiovascular events. A risk assessment algorithm has been developed that makes use of a series of five clinical cutpoints, hence the name "quintiles of risk".
The second dimension of truth is provided by special high sensitivity assays designed to measure very low concentrations of CRP. These assays are now being marketed and make it possible for routine service laboratories to provide the tests widely throughout the country.
The third dimension of truth is more difficult to assess. First of all, CRP will be elevated in many common infections, so the test is by its very nature expected to give high results often, due to any infectious process. That suggests that false positives will be a big problem when the test is used widely for screening individuals. Even if such infections were absent, the classification can be confounded by the biologic variation of the individual patient, as discussed in the January 2003 issue of Clinical Chemistry [1,2]. I also assert that the test cannot be adequately controlled to meet the demanding analytical requirements of the quintile classification scheme. [See the discussion on this website of "Quintiles and Quality" at http://www.westgard.com/quest15.htm.]
By the "truth standard", hs-CRP is truly related to the risk of future cardiovascular events, and the new high sensitivity measurement systems make it possible to measure it correctly, but the criterion for "nothing but the truth" is not satisfied. The biologic variability of the individual patient can confound the use and interpretation of the test, as can the analytical variability and the difficulty of controlling the quality of the test.
Here's a chance to watch the truth unfold and to see if science and scientists are truly objective. Over the course of the next few years, we will learn the truth about the clinical usefulness of hs-CRP measurements on the patient population at large.
New guidelines for the use of laboratory tests in the diagnosis and treatment of diabetes have been published in the last year. The American Diabetes Association (ADA), together with the U.S. Department of Health and Human Services (HHS), has recommended that a normal fasting glucose should be below 110 mg/dL and that an individual whose fasting test result is greater than 126 mg/dL should be classified as hyperglycemic, which, if confirmed, establishes a diagnosis of diabetes [3]. Guidelines published by the National Academy of Clinical Biochemistry (NACB) describe the desirable performance characteristics for glucose, as well as glycated hemoglobin, which is recommended as the best monitor of long-term control and should be measured at least twice a year [4]. Glycated hemoglobin should be 7.0% or less, and if it exceeds 8.0%, the treatment strategy should be reassessed.
In assessing the truthfulness of tests for glucose and glycated hemoglobin, there are three considerations:
- First, the truth requires that the test be able to distinguish between values that lead to different clinical actions or treatments;
- Second, the method performance specifications for precision and accuracy must provide reliable estimates of the test values; and
- Third, the patient's own biologic variation must be accounted for and any additional method variation (unstable performance) must be detected by statistical QC to eliminate false classifications.
A test for glucose must clearly separate a patient whose homeostatic set point is 126 mg/dL from one whose set point is 110 mg/dL. This difference from 110 to 126 mg/dL is a clinical decision interval (Dint), which can be expressed as a quality requirement of 16/110 or 14.5%. The first dimension of truth for a glucose test is that it must clearly distinguish between values that would classify a patient as normal and abnormal.
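The arithmetic behind the 14.5% figure can be checked with a few lines of Python. This is only a minimal sketch: the function name `decision_interval_percent` is made up for illustration, and the choice to express the interval relative to the lower cutoff simply follows the 16/110 calculation above.

```python
def decision_interval_percent(lower_cutoff: float, upper_cutoff: float) -> float:
    """Gap between two clinical cutoffs, as a percent of the lower cutoff."""
    if lower_cutoff <= 0 or upper_cutoff <= lower_cutoff:
        raise ValueError("cutoffs must satisfy 0 < lower_cutoff < upper_cutoff")
    return (upper_cutoff - lower_cutoff) / lower_cutoff * 100.0


# Fasting glucose cutoffs from the guidelines discussed above (mg/dL).
glucose_dint = decision_interval_percent(110.0, 126.0)
print(f"Glucose decision interval: {glucose_dint:.1f}%")
```

The same helper applies to any pair of cutoffs that lead to different clinical actions.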
The second dimension of truth has to do with the performance specifications for the method. The National Academy of Clinical Biochemistry (NACB) cites a recommendation of a CV of 2.2% and a bias of 0% as desirable [4].
The third dimension of truth has to do with factors that might cause a misclassification of a patient, such as the known individual biologic variability and the minimal QC performed in laboratories today. For a clinical quality requirement of 14.5% and a known biologic variation of 6.5%, the maximum allowable CV for a glucose method is approximately 1.0% if the method bias is 0.0% and the minimum QC is performed (2 measurements using 3s control limits). Recent data from the New York State Department of Health proficiency survey show method biases from 1.4% to greater than 3.0%, with the most likely estimate being approximately 2.0% for the largest method subgroups. This analysis is illustrated in a previous discussion "Cooking the books", which is found at http://www.westgard.com/essay46.htm.
The truth is that glucose will not be a reliable test when used with the new ADA/HHS interpretation guidelines. Method performance and laboratory QC cannot guarantee that patients will be classified correctly.
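To get a feel for how biologic variation, bias, and method imprecision compete for the same decision interval, here is a rough sketch of an error budget. It is not the calculation used in the essay, which does not state its formula. The additive model, the one-sided z-value of 1.65, and the function name `remaining_shift_in_sd` are all assumptions made for illustration. The sketch also stops short of the QC step, which would need the power curve of the control rule to decide which shift sizes are actually detected.

```python
import math


def remaining_shift_in_sd(
    dint_pct: float,
    within_subject_cv_pct: float,
    method_cv_pct: float,
    bias_pct: float = 0.0,
    z: float = 1.65,  # assumed one-sided 95% coverage factor
) -> float:
    """Room left for an undetected systematic shift, in multiples of method SD.

    Assumed budget (illustrative only):
        dint = |bias| + shift * method_cv + z * sqrt(within_subject_cv^2 + method_cv^2)
    solved here for `shift`.
    """
    if method_cv_pct <= 0:
        raise ValueError("method_cv_pct must be positive")
    random_part = z * math.sqrt(within_subject_cv_pct**2 + method_cv_pct**2)
    return (dint_pct - abs(bias_pct) - random_part) / method_cv_pct


# Glucose figures quoted in the text: 14.5% interval, 6.5% biologic variation.
for cv, bias in [(1.0, 0.0), (2.2, 0.0), (1.0, 2.0)]:
    shift = remaining_shift_in_sd(14.5, 6.5, cv, bias)
    print(f"CV {cv:.1f}%, bias {bias:.1f}%: room for a shift of {shift:.2f} SD")
```

Under these assumptions, a larger method CV or a nonzero bias leaves less room for a shift, so QC would have to catch smaller shifts to prevent misclassification. That is the direction of the argument in the text, though the exact numbers depend on the model chosen.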
The national guidelines describe a glycated hemoglobin level of 7% or lower as desirable and state that a value of 8.0% or higher should be grounds for re-assessing the patient's treatment plan. The first dimension of truth is that a difference from 7% to 8% must be distinguishable by the measurement procedure.
The second dimension of truth deals with the specifications for method performance. The NACB recommends that the desirable CV for a method is 3.0% and that the maximum allowable CV is 5.0% [4]. Method bias should be minimal if the method is properly calibrated and certified by the National Glycohemoglobin Standardization Program.
The third dimension of truth is that the known within-subject biologic variation and sensitivity of the laboratory QC procedures should not allow any misclassifications. Given the expected within-subject biologic variation of 4.1% and the minimal QC specification of two control materials per run, the desirable method CV is actually 1.9% to 2.2% when only 2 control measurements are utilized. The recommended method specification of a 3.0% CV is not sufficient to assure proper classification of individual patients. This analysis is illustrated in a previous essay "Why not evidence-based method specifications?" which can be found at http://www.westgard.com/essay44.htm.
The common problem that links these examples together is the dimension of "nothing but the truth." Our current scientific methodologies for validating the clinical usefulness of a test and its analytical reliability do not sufficiently encompass that dimension of the truth. Recent efforts known as the STARD initiative are aimed at improving the validation of clinical or medical usefulness of a test. [See "Good data wanted: Bad data should not apply", at http://www.westgard.com/essay49.htm.] However, there will still be a need to account for the known within-subject biologic variability. The information in Fraser's book on Biological Variation needs to be studied carefully and taken seriously [5].
Efforts to improve analytical reliability may be more difficult. I have been arguing the case about the importance of statistical QC - the third dimension of the truth - for the analytical reliability of laboratory tests for well over a decade. Example applications showing the importance of QC performance for lipid tests [6-7] have been vigorously opposed by the lipid standardization people themselves [8]. Now that Six Sigma concepts are being applied in laboratories, it becomes clear that the NCEP specifications often result only in 3 sigma performance, which is considered minimally acceptable for a production process [9].
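For readers unfamiliar with the sigma scale, the commonly used sigma-metric for a laboratory method is the allowable total error minus the absolute bias, divided by the CV, all in percent. The sketch below applies that formula. The input values are illustrative placeholders, not the actual NCEP specifications, and the function name `sigma_metric` is chosen here for convenience.

```python
def sigma_metric(allowable_total_error_pct: float, bias_pct: float, cv_pct: float) -> float:
    """Sigma-metric: (allowable total error - |bias|) / CV, all in percent."""
    if cv_pct <= 0:
        raise ValueError("cv_pct must be positive")
    return (allowable_total_error_pct - abs(bias_pct)) / cv_pct


# Hypothetical example: a 9% error allowance with a 3% CV.
print(sigma_metric(9.0, 0.0, 3.0))  # no bias
print(sigma_metric(9.0, 3.0, 3.0))  # same method with a 3% bias
```

With these placeholder numbers, the unbiased method sits at 3 sigma and the biased one drops to 2 sigma, which shows how quickly bias erodes the margin that QC has to protect.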
During the last decade, most laboratories have chosen to reduce QC to the minimum specified by the government CLIA regulations, rather than considering what is needed based on scientific evidence. Do you think that government regulations represent the truth, the whole truth, and nothing but the truth? Not likely, given the politics involved in the regulatory process!
As scientists, do we choose to follow only those parts of the truth that are easy to satisfy and convenient for our work? Are we selective and not completely objective? I suggest that we do indeed err by not adequately accounting for known causes of variation that will confound the use and interpretation of laboratory test results. That's nothing but the truth.
1. Campbell B, Flatman R, Badrick T, Kanowski D. Problems with high-sensitivity C-reactive protein. Letter. Clin Chem 2003;49:201.
2. Ockene IS, Matthews CE, Rifai N, Ridker PM, Reed G, Stanek E. Letter in response to Campbell et al. in Clin Chem 2003;49:201-202.
3. Sainato D. A new attack on the diabetes epidemic. Clinical Laboratory News 2002;28*Hybe):1-5.
4. Sacks DB, Bruns DE, Goldstein DE, et al. Guidelines and recommendations for laboratory analysis in the diagnosis and management of diabetes mellitus. Clin Chem 2002;48:436-472.
5. Fraser CG. Biological Variation: From Principles to Practice. Washington DC: AACC Press, 2001.
6. Westgard JO, Hyltoft Petersen P, Wiebe DA. Laboratory process specifications for assuring quality in the U.S. National Cholesterol Education Program. Clin Chem 1991;37:656-661.
7. Fallest-Strobl PC, Olafsdottir E, Wiebe DA, Westgard JO. Comparison of NCEP performance specifications for triglycerides, HDL-, and LDL-cholesterol with operation specifications based on NCEP clinical and analytical goals. Clin Chem 1997;43:2164-2168.
8. Caudill SP, Cooper GR, Smith SH, Myers GL. Assessment of current National Cholesterol Education Program guidelines for total cholesterol, triglyceride, HDL-cholesterol, and LDL-cholesterol measurements. Clin Chem 1998;44:1650-1658.
9. Westgard JO. Six Sigma Quality Design and Control. Madison WI: Westgard QC, Inc., 2001. Lesson 11, pages 139-154.
