METHOD VALIDATION: Class Questions
James O. Westgard, PhD
- This group of questions comes from students in Chem 555, which
is a graduate course in Clinical Chemistry at West Chester University,
West Chester, PA (Fall '99). Dr. Al Caffo is the course instructor
(acaffo@wcupa.edu) and an old friend from DuPont/Dade/Behring
days (the good old days). It's been great to have this opportunity
to discuss method validation practices.
-
- Q: When CLIA regulations are fully implemented, will the
requirements of inspecting agencies with deemed status (CAP and
JCAHO) change as well? (from David Cardamone, Chem 555, WCU/Caffo)
- A: The adequacy of the regulatory approach seems to be challenged
by the recent FDA/Abbott consent decree, which illustrates the
need for laboratories to maintain a quality system that is independent
of the manufacturer (see http://www.westgard.com/essay28.htm).
If the regulations were completely implemented, including the
FDA implementation of a QC clearance process to review manufacturer's
QC instructions, the FDA-Abbott consent decree would still pose
a significant problem for laboratories. I would hope that professional
organizations would have standards that require laboratories
to establish and maintain independent quality systems, in which
case the standards of deemed organizations will need to go beyond
the CLIA requirements.
-
- Q: What statistics should be applied to compare methods
that are reported in different units? (For example, CK-MB assays
that measure mass or activity, where one uses ng/mL and the other
U/L) (from Jody Williams, Chem 555, WCU/Caffo)
-
- A: In this situation, a comparison plot and regression statistics
will still be useful to demonstrate the relationship between
the methods. The slope and intercept are still important to quantitatively
describe the relationship. While the correlation coefficient
may also be useful, remember that the value will depend on the
analytical range that is studied. The fit of the data to a straight
line is actually best described using the standard deviation
of the points about the regression, which will provide a quantitative
estimate of the scatter or random error in units of the new method
(i.e., y-axis units). The comparison plot should be carefully
inspected for linearity and outliers, which should be investigated
further if necessary. More effort should be expended in establishing
reference intervals since these will be critical for the clinical
use and interpretation of the data.
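As a rough illustration of the statistics named above, the following sketch computes the slope, intercept, standard deviation of the points about the regression line (s_yx), and correlation coefficient by ordinary least squares. The function name, the variable names, and the data values are made up for illustration; they are not taken from any real CK-MB comparison.

```python
import math

def regression_stats(x, y):
    """Ordinary least-squares line y = intercept + slope*x, plus scatter statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired results")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Standard deviation of the points about the line, in y-axis units.
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(residual_ss / (n - 2))
    # Correlation coefficient; its value depends on the range of x studied.
    r = sxy / math.sqrt(sxx * syy)
    return {"slope": slope, "intercept": intercept, "s_yx": s_yx, "r": r}

# Hypothetical paired results: comparison method in U/L, new method in ng/mL.
activity_u_per_l = [5.0, 10.0, 20.0, 40.0, 80.0]
mass_ng_per_ml = [1.1, 2.0, 4.2, 7.9, 16.1]
stats = regression_stats(activity_u_per_l, mass_ng_per_ml)
print(stats)
```

Here the slope carries units of ng/mL per U/L, while the intercept and s_yx are in ng/mL, the units of the method plotted on the y-axis.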
-
- Q: How important is an initial replication study, since
an estimate of random error can be obtained from the duplicate
determinations done as a part of the method comparison study?
(from Ruth Mortimer, Chem 555, WCU/Caffo)
-
- A: The short-term replication study is very useful. First
of all, it gives immediate information on your technique or the
best performance of the analytical system. If you can't reproduce
results under these conditions, there's no need to go any further.
If you can reproduce results, then you should have some confidence
that your technique or the instrument is working properly. You
should also remember that the estimates of imprecision from the
patient duplicates in a comparison of methods study will only include
short-term variation, essentially within-run variation. It's
still important to do a long-term replication study using control
materials.
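For reference, the estimate of random error from duplicates is commonly calculated as the square root of the sum of squared differences divided by twice the number of pairs. The sketch below assumes that formula and assumes the imprecision is roughly constant over the concentrations tested; the function name and the data are hypothetical.

```python
import math

def sd_from_duplicates(pairs):
    """Estimate SD from duplicate results: sqrt(sum(d^2) / (2 * N))."""
    if not pairs:
        raise ValueError("need at least one pair of duplicates")
    sum_sq_diff = sum((first - second) ** 2 for first, second in pairs)
    return math.sqrt(sum_sq_diff / (2 * len(pairs)))

# Hypothetical duplicate results on four patient specimens.
duplicates = [(98.0, 100.0), (152.0, 149.0), (75.0, 76.0), (210.0, 207.0)]
print(round(sd_from_duplicates(duplicates), 2))
```

Because the duplicates are usually run close together, this figure reflects short-term variation only.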
-
- Q: Is there a way to account for lot-to-lot reagent variation
when establishing a reportable range? (from Nicole Bethke, Chem
555, WCU/Caffo)
-
- A: You could do this if you had several different lots of
materials available to you; however, that is seldom the case
in a real laboratory situation. It's more likely that you will
test each new lot of materials to be sure the reportable range
remains adequate. Over time, you will accumulate data that demonstrate
the stability of the reportable range. That information may help
you optimize the way you perform the linearity experiment or
maybe establish a way to monitor the limits of reportable range
as part of routine QC.
-
- Q: How can a Reportable Range start at zero, since the
detection limit will restrict the reportable results to some
value greater than zero? (from Peter Szczerba, Chem 555, WCU/Caffo)
-
- A: In principle, you could argue that you need to characterize
the detection limit as the true bottom, or "zero," of
any reportable range. However, for many tests, there's no need
to determine "zero" so exactly. The clinically important
test values are always considerably above zero. Take glucose
for example. Low values (less than say 50 mg/dL or so) are important,
but it's not critical to know if "zero" really means
0.0 mg/dL or 3 mg/dL or 5 mg/dL.
-
- Q: From your experience with clinical chemistry analyzer systems,
what is the primary source of imprecision: the design of the
method or the manufacturing consistency of the reagents? (from
Maria Gonzalez, Chem 555, WCU/Caffo)
-
- A: I think precision is very much related to the level (or
generation) of automation. Highly automated systems, such as
today's 4th and 5th generation chemistry analyzers, tend to be
very precise because operator variability has been almost completely
eliminated, environmental variability has been controlled, and
instrument variability has been reduced with improved components.
With these systems, manufacturing consistency might be more of
an issue for accuracy than precision. With manual methods, operator
variability would certainly be important, as might reagent variability.
-
- Q: In the past, we have been able to cross over to a new lot
of QC material by running it in parallel with the old lot over
20 days. With the economic situation in our lab now, we don't
have the time to do this any longer. Is there a minimum number
of days for cross over? Can we run more testing on fewer days?
(from Vivian Anton, Chem 555, WCU/Caffo)
-
- A: It's still good practice to obtain at least 20 measurements
as a starting point for new lots of materials. However, you might
get those measurements in a shorter period of time if necessary.
A two-week period might be more practical, but then you should
be sure to calculate cumulative means, SDs, and limits, and update
these calculated values with each week's additional data. If
you go to a one-week period, then it may be better to establish
new mean values on the basis of new data (with a minimum of 10-20
measurements over a 5-day period), but use your old CVs to establish
preliminary control limits. Begin using the actual control
limits after collecting two weeks of data, then update the cumulative
values with each additional week's data.
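The arithmetic behind this approach can be sketched as follows. This is only an illustration: the function names are hypothetical, the control values are made up, and the multiplier of 2 SDs for the limits is an assumption, since the answer does not say which control limits are in use.

```python
import statistics

def preliminary_limits(new_values, old_cv_percent, multiplier=2.0):
    """Limits from the new lot's mean and the OLD lot's CV (in percent).

    multiplier is an assumed number of SDs; set it to match your control rules.
    """
    mean = statistics.mean(new_values)
    sd = mean * old_cv_percent / 100.0
    return mean - multiplier * sd, mean + multiplier * sd

def cumulative_limits(all_values, multiplier=2.0):
    """Mean, SD, and limits recalculated from all new-lot data collected so far."""
    mean = statistics.mean(all_values)
    sd = statistics.stdev(all_values)
    return mean, sd, mean - multiplier * sd, mean + multiplier * sd

# Hypothetical first-week results on the new lot (two per day for five days).
week1 = [101.0, 99.0, 102.0, 98.0, 100.0, 103.0, 97.0, 100.0, 101.0, 99.0]
print(preliminary_limits(week1, old_cv_percent=2.0))

# After the second week, switch to limits calculated from the new lot itself,
# then repeat the calculation as each additional week of data is added.
week2 = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 102.0, 98.0]
print(cumulative_limits(week1 + week2))
```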
-
- Q: How can acceptable statistical limits for a patient
correlation be defined when the reference range of the test method
differs from the reference range of the comparison method (there
is a known bias between methods)? (from Sandy Krakowsky, Chem
555, WCU/Caffo)
-
- A: The reason for doing a method comparison study is to see
if the new method can replace the old method without changing
any of the test values. If there are systematic errors between
the methods, the first issue is to determine which method is
correct. If the "correct" method is the new method,
then you will have to carefully establish the reference intervals
for the new method. It may also be necessary to investigate clinically
important populations to demonstrate the range of values expected
for clinical applications.
-
- Q: When developing a new method in the research laboratory
(e.g., for a new investigational drug), replication, linearity, and
interference experiments can be done, but how can a method comparison
be performed if this is the only method in existence? (from Heather
Bonner, Chem 555, WCU/Caffo)
-
- A: When no comparison method is available, you have to depend
on the interference and recovery experiments to assess systematic
errors. This means much more extensive testing by these experiments,
much more effort in establishing reference intervals, and most
likely considerable effort in establishing the range of values
expected in different clinical populations. When a comparison
method exists, the advantage is to transfer what is known about
the usefulness of that test by demonstrating comparability with
the known