Tools, Technologies and Training for Healthcare Laboratories

Method Validation - the Real World Applications

A review of Method Validation to help provide you with the skills to establish your own method validation process.

The lessons in this series should help you establish a systematic process that is tailored to the needs of your laboratory and to the characteristics of the methods being tested. Don't assume that method performance is okay just because you've purchased new instruments or new reagent kits! In the real world, methods still have real problems. See MV - Myths of Quality.

A review of the MV process

Here's a brief summary of our lessons on method validation. You can access each lesson via the links provided to review the material in greater detail.

  • Method Validation should be a standard laboratory process, but the process need not be exactly the same for every laboratory or for every method validated by a laboratory. See MV - The Management of Quality for an overview of the quality management process that is needed in healthcare laboratories and the role of method validation in establishing standard testing processes.
  • Remember the purpose of method validation is error assessment. See MV - The Inner, Hidden, Deeper, Secret Meaning for a description of random, systematic, and total analytical errors that are the focus of method validation studies.
  • Note that a USA laboratory is required by CLIA regulations to "demonstrate that prior to reporting patient test results, it can obtain the performance specifications for accuracy, precision, and reportable range of patient test results, comparable to those established by the manufacturer. The laboratory must also verify that the manufacturer's reference range is appropriate for the laboratory's patient population." For modified methods or high complexity methods, CLIA also requires verification of analytical sensitivity and analytical specificity. See MV - The Regulations.
  • Other critical method factors or characteristics, such as cost/test, specimen types, specimen volumes, time required for analysis, rate of analysis, equipment required, personnel required, efficiency, safety, etc., must be considered during the selection of the analytical method. See MV - Selecting a Method to Validate.
  • The approach in method validation is to perform a series of experiments designed to estimate certain types of analytical errors: a linearity experiment to determine reportable range, a replication experiment to estimate imprecision or random error, a comparison of methods experiment to estimate inaccuracy or systematic error, interference and recovery experiments to specifically estimate constant and proportional systematic errors (analytical specificity), and a detection limit experiment to characterize analytical sensitivity. See MV - The Experimental Plan. For details of the different types of experiments, see the following:
    • MV - The Linearity or Reportable Range Experiment. A minimum of 5 specimens with known or assigned values should be analyzed in triplicate to assess the reportable range.
    • MV - The Replication Experiment. A minimum of 20 replicate determinations on at least two levels of control materials is recommended to estimate the imprecision or random error of the method. (A calculation sketch follows this list.)
    • MV - The Comparison of Methods Experiment. A minimum of 40 patient specimens should be analyzed by the new method (test method) and an established method (comparison method) to estimate the inaccuracy or systematic error of the method.
    • MV - The Interference and Recovery Experiments. Common interferences, such as lipemia, hemolysis, and elevated bilirubin, are usually tested, along with potential interferences that are specific to the test and methodology. Recovery experiments are used to test competitive interferences, such as the possible effects of proteins and metabolites in the specimens.
    • MV - The Detection Limit Experiment. Generally, a "blank" specimen and a specimen "spiked" with the amount of analyte in the manufacturer's claim for the limit of detection are each analyzed 20 times. (A sketch of one common calculation follows this list.)
  • The data collected in the different experiments need to be summarized (by statistical calculations) to provide estimates of the analytical errors that are the focus of each experiment. See MV - The Data Analysis Tool Kit.
  • The acceptability of these observed errors is judged by comparison to standards of quality, i.e., recommendations for the types and amounts of analytical errors that are allowable without invalidating the medical usefulness of the test results. Method performance is acceptable when the observed errors are smaller than the stated limits of allowable errors, and unacceptable when they are larger. For a quality standard in the form of an allowable total error (such as provided by the CLIA proficiency testing criteria for acceptable performance), a simple graphical tool - the Method Decision Chart - can be constructed to classify method performance as excellent, good, marginal, or unacceptable [1]. See MV - The Decision on Method Performance. (A sigma-metric sketch of this decision follows this list.)
  • If method performance is judged acceptable, the reference intervals should be verified. See MV - Reference Interval Transference.
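
To make these calculations concrete, here is a minimal sketch in Python of the replication statistics and the decision on method performance. The replicate results, the allowable total error (TEa), the bias, and the sigma-scale cutoffs for the performance classes are illustrative assumptions, not values prescribed by these lessons; substitute your own quality requirement and experimental data.

```python
import statistics

# --- Replication experiment: estimate random error (imprecision) ---
# Illustrative day-to-day results for one control material (minimum n = 20).
replicates = [198, 202, 201, 197, 200, 203, 199, 198, 202, 200,
              201, 196, 199, 204, 200, 198, 202, 201, 199, 200]
mean = statistics.mean(replicates)
sd = statistics.stdev(replicates)   # sample standard deviation
cv = 100 * sd / mean                # coefficient of variation, %

# --- Decision on method performance ---
# Judge the observed errors against an allowable total error; both values
# below are assumptions for illustration. Bias (%) would come from the
# comparison of methods experiment at the medical decision concentration.
TEa = 10.0   # allowable total error, % (e.g., a CLIA PT criterion)
bias = 2.0   # observed systematic error, %

# One common way to express the Method Decision Chart classes is the sigma
# metric, sigma = (TEa - |bias|) / CV; the cutoffs below follow that
# convention and are an assumption, not a rule from these lessons.
sigma = (TEa - abs(bias)) / cv
if sigma >= 6:
    verdict = "excellent"
elif sigma >= 4:
    verdict = "good"
elif sigma >= 3:
    verdict = "marginal"
else:
    verdict = "unacceptable"

print(f"mean={mean:.1f}  SD={sd:.2f}  CV={cv:.2f}%  sigma={sigma:.1f}  {verdict}")
```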
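
For the detection limit experiment, the appropriate equation depends on which quantity is claimed (limit of blank, limit of detection, functional sensitivity, etc.). The sketch below uses one common convention, limit of blank = blank mean + 1.65 blank SDs, purely as an assumed illustration; use the definition and equation that match the manufacturer's claim.

```python
import statistics

# Illustrative replicate results (n = 20 each) for a "blank" specimen and a
# specimen "spiked" at the manufacturer's claimed limit of detection.
blank  = [0.02, 0.03, 0.01, 0.02, 0.04, 0.02, 0.03, 0.01, 0.02, 0.03,
          0.02, 0.01, 0.03, 0.02, 0.04, 0.02, 0.03, 0.02, 0.01, 0.02]
spiked = [0.11, 0.09, 0.12, 0.10, 0.13, 0.11, 0.10, 0.12, 0.09, 0.11,
          0.10, 0.12, 0.11, 0.13, 0.10, 0.11, 0.09, 0.12, 0.10, 0.11]

# Assumed convention: the limit of blank is the blank mean plus 1.65 blank
# SDs; the claimed detection limit is supported if the spiked results sit
# clearly above that limit.
lob = statistics.mean(blank) + 1.65 * statistics.stdev(blank)
print(f"LoB ~ {lob:.3f}; spiked mean = {statistics.mean(spiked):.3f}")
```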

Applications with published data

A critical review of the literature is always a good starting point when selecting and evaluating a method. This literature includes scientific papers as well as manufacturer's method descriptions. The time and effort needed for method validation studies in your own laboratory can be minimized by a careful assessment of the data in the literature.

Published evaluation studies seldom follow a standard validation process. Therefore, it is necessary to impose your own system of organization, data analysis, and data interpretation if you are to make sense of the published results. This is a process of critical review, which is distinctly different from just accepting the organization, data analysis, and conclusions that have been published.

  • Define the quality requirement in the form of an allowable total error (TEa) for the test (or tests) of interest at the medical decision concentration for critical test interpretation. Note that few journals require authors to declare the quality they consider acceptable; therefore, the conclusions of a published study seldom refer to any standards of quality for the tests being studied. The notable exception is the journal Clinical Chemistry, which in January 1999 began advising contributors of method performance studies to reference their findings to defined quality standards [2].
  • Prepare a "data page" to summarize information about the experiments. List the standard experiments that would be expected, e.g., reportable range, within day replication, day-to-day replication, interference, recovery, and comparison of methods.
  • Scan the published report to locate the different experiments, summarize the critical factors (number of patient specimens, number of replicate measurements, etc.), and identify the strengths and weaknesses of the published studies.
    • For a replication experiment, assess the suitability of the concentrations of the control materials, the sample matrix, the time period of study, and other conditions, such as the number of different reagent lots included, number of analysts involved, etc. Tabulate the number of measurements, the mean, and the standard deviation or coefficient of variation for each material.
    • For an interference experiment, assess whether the substances and concentrations tested are appropriate. Tabulate the average difference or bias as your estimate of constant systematic error.
    • For a recovery experiment, determine how the calculations were done (whether recovery was calculated on the total amount measured or on the amount added, the latter being the correct way). Tabulate the number of experiments and the average recovery. Calculate the proportional error (average % recovery minus 100%), then multiply by the critical medical decision concentration (divided by 100) to estimate the proportional systematic error, as illustrated in the first sketch after this list.
    • For a detection limit experiment, clarify the definition of the particular term being used and the experimental approach for making the estimate. Identify the samples analyzed, the number of replicate measurements, and the equation for calculating the detection limit.
    • For a comparison experiment, assess whether the comparison method itself is a good choice. Tabulate the number of patient specimens analyzed by the two methods, the concentration range studied, and the distribution of data over that range. Tabulate the statistical results (most likely t-test and regression statistics). Assess whether the regression statistics will provide reliable estimates of errors by inspecting a comparison plot and also from the value of the correlation coefficient (which should be 0.99 or higher). When regression statistics are reliable, estimate the inaccuracy or systematic error from the equation SE = (a + bXc) - Xc, where a is the y-intercept, b is the slope, and Xc is the critical medical decision concentration. If ordinary regression statistics do not provide reliable estimates of errors, determine whether the bias from t-test statistics will be reliable, which requires that the mean of the comparison results be close to the medical decision concentration of interest. (See the second sketch after this list.)
  • Utilize the Method Decision Chart to assess whether method performance is satisfactory for your laboratory. Show the individual estimates of systematic errors or inaccuracy (from interference and recovery experiments) as points on the y-axis; show the individual estimates of random errors or imprecision as points on the x-axis. Assess the combined effects of random and systematic errors by plotting the operating point whose y-coordinate is the bias or SE from the comparison of methods experiment and whose x-coordinate is the CV from the day-to-day replication study.
  • Review the authors' conclusions and recommendations. Resolve any differences between your conclusions and those of the authors. Identify the factors that will be critical if you test the method in your own laboratory.
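
As a concrete illustration of the recovery arithmetic above (recovery figured on the amount added, not on the total measured), here is a minimal Python sketch; all concentrations and the decision level Xc are made-up values.

```python
# Recovery experiment: a patient pool is split, one aliquot is spiked with a
# known amount of analyte, and the other receives an equal volume of diluent.
baseline = 5.0   # result for pool + diluent (illustrative units)
spiked   = 9.6   # result for pool + standard addition
added    = 5.0   # concentration added by the spike

# Correct calculation: recovery is based on the amount ADDED.
recovery = 100 * (spiked - baseline) / added   # 92.0 %

# Proportional error and the systematic error it causes at the medical
# decision concentration Xc (an assumed value).
prop_error_pct = recovery - 100                # -8.0 %, i.e., under-recovery
Xc = 10.0
SE_at_Xc = prop_error_pct / 100 * Xc           # -0.8 concentration units

print(f"recovery={recovery:.1f}%  PE={prop_error_pct:+.1f}%  SE at Xc={SE_at_Xc:+.2f}")
```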
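A similar sketch covers the comparison-of-methods calculations, using the r >= 0.99 rule of thumb and the SE = (a + bXc) - Xc estimate described above. The paired results and the decision level Xc are illustrative; statistics.correlation and statistics.linear_regression require Python 3.10 or later.

```python
import statistics

# Illustrative paired patient results: x = comparison method, y = test method.
x = [50, 75, 90, 110, 126, 140, 160, 185, 210, 240,
     60, 85, 100, 120, 130, 150, 175, 195, 225, 250]
y = [52, 76, 93, 112, 129, 142, 163, 189, 214, 246,
     61, 87, 103, 122, 133, 153, 178, 199, 230, 256]

r = statistics.correlation(x, y)           # Pearson correlation coefficient
fit = statistics.linear_regression(x, y)   # ordinary least-squares fit
Xc = 126                                   # assumed medical decision level

if r >= 0.99:
    # Regression statistics are reliable: SE = (a + b*Xc) - Xc.
    se = (fit.intercept + fit.slope * Xc) - Xc
    print(f"r={r:.4f}  slope={fit.slope:.3f}  intercept={fit.intercept:.2f}  "
          f"SE at Xc={se:+.2f}")
else:
    # Fall back on the average difference (t-test bias), reliable only when
    # the mean of the comparison results is close to Xc.
    bias = statistics.mean(y) - statistics.mean(x)
    print(f"r={r:.4f} (<0.99): bias={bias:+.2f} near mean x={statistics.mean(x):.1f}")
```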

Applications in your own laboratory

It is important to have a clear understanding of the method validation process and be well-organized in carrying out your experimental studies. Good record keeping is essential to document the conditions of the studies (reagent lot numbers, calibration lot numbers, re-calibrations, preventive maintenance procedures, any method changes or corrective actions).

  • Carefully specify the application, methodology, and performance requirements for the test of interest. State the quality requirement for the test in the form of an allowable total error (TEa), such as specified in many proficiency testing programs. Conduct a careful literature search and select a method that has appropriate application and methodology characteristics and has a good chance of achieving the desired performance.
  • Develop an evaluation plan on the basis of the characteristics of the test and method that will be critical for its successful application in your laboratory. Identify the experiments, specify the amount of data to be collected, and identify the decision concentrations or analytical ranges where the data should be collected. Schedule personnel time to carry out the validation studies.
  • Implement the method validation plan by preparing a set of worksheets that define the amount of data to be collected in the different experiments. These worksheets will formalize the planning of the experiments and also facilitate the collection of the data.
    • The reportable range worksheet should have columns for the date, analyst, sample identification, assigned or known value, observed result #1, observed result #2, observed result #3, average result, and comments. The number of rows will depend on the number of specimens analyzed, which will usually be from 5 to 10. Also include information about the source of the specimens, preparation of specimen pools, manufacturer and lot numbers of commercial materials. See Reportable Range Worksheet for an example.
    • The replication worksheet should have columns for date, time, analyst, result for material 1, result for material 2, (result for material 3 if needed), and comments. The number of rows should be a minimum of 20. Also include information about the manufacturer and the lot numbers for the control materials being analyzed. Note that you will usually need one worksheet for the within-day replication study and a second for the day-to-day study. See Replication Experiment Worksheet for an example.
    • The comparison of methods worksheet should have columns for date, analyst, specimen identification number, test result (y-value), comparison result (x-value), difference (y-x), and comments. Add extra columns if duplicate measurements are to be performed. The number of rows should be at least 40. See Comparison of Methods Worksheet for an example.
  • Begin plotting the comparison data on a daily basis as it is collected. Identify discrepant values and repeat those tests by both methods. A difference plot will point out these discrepancies more clearly than a comparison plot, but either or both can be used for this purpose. (A flagging sketch follows this list.)
  • Perform the statistical calculations that are appropriate for the data collected in the different experiments.
  • Utilize the Method Decision Chart to assess whether method performance is satisfactory for your laboratory.
  • Document the method validation studies. If method performance is acceptable, prepare a method procedure to document the standard testing process. Prepare teaching materials for in-service training. Select appropriate QC materials, control rules, and numbers of control measurements to monitor routine performance.
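
As one way to implement the daily screening of comparison data described above, the following sketch computes the y-x differences and flags candidates for repeat testing by both methods. The percentage flagging limit and the specimen results are assumptions; in practice, tie the limit to the allowable error for the test.

```python
# Daily screen of comparison-of-methods data: compute y - x differences and
# flag discrepant specimens for repeat analysis by BOTH methods.
pairs = {  # specimen id -> (test result y, comparison result x); illustrative
    "S001": (101, 98),
    "S002": (55, 54),
    "S003": (130, 112),   # deliberately discrepant example
    "S004": (212, 208),
}

LIMIT_PCT = 10.0   # assumed flagging limit

for sid, (y_val, x_val) in pairs.items():
    diff_pct = 100 * (y_val - x_val) / x_val
    flag = "  <-- repeat by both methods" if abs(diff_pct) > LIMIT_PCT else ""
    print(f"{sid}: y-x = {y_val - x_val:+d} ({diff_pct:+.1f}%){flag}")
```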

Adaptations for individual laboratory applications

It should be recognized that each laboratory's situation may be different; therefore, different adaptations are possible in different laboratories. The approach is to maintain the principles of the method validation process while making the experimental work as efficient and practical as possible. Some ideas are presented here concerning the scope of studies, personnel skills, and data analysis techniques.

  • The scope of studies may be adapted on the basis of the information available in the scientific literature. When thorough studies have been published, minimal work may suffice in your own laboratory: always perform a linearity or reportable range experiment and a replication study over at least ten days, reduce the number of patient comparisons to 20 specimens whose concentrations are selected to span the analytical range, and minimize the use of recovery and interference studies. Likewise, when replacing a method or instrument with the same or similar method or instrument, your earlier laboratory experience allows you to reduce the amount of data needed to validate the new method or instrument.
  • New technology or changes in method or measurement principles will require more extensive validation studies. New methodology that is just being released and not yet in widespread use must be critically evaluated. If the laboratory is involved in "field testing" for a manufacturer, even more extensive studies will be required, way beyond the minimums suggested here for basic method validation studies.
  • The laboratory personnel involved in method validation studies may have a variety of experience. However, it is important to have at least one skilled analyst to organize the studies, specify the amount of data to be collected, monitor the data during collection to identify obvious method problems, carefully inspect the data to identify discrepant results, properly analyze the data statistically, critically interpret the results, and make any necessary changes or adjustments to the validation plan. Other analysts may carry out the experiments and tabulate the data. Participation of several analysts will provide more realistic estimates of the imprecision expected under routine operation of the method.
  • The data analysis should be understandable by the laboratory analysts; otherwise, good data may still not provide good decisions about method performance. The comparison of methods data are the most difficult to analyze. A plot of the data should always be prepared. Regression statistics are generally preferred, but t-test statistics may be sufficient when the estimate of bias is obtained at a mean that is very close to the medical decision level of interest. See Points of Care with Method Comparison Data for more detailed guidelines on the data analysis. (A minimal sketch of the t-test bias estimate follows this list.)
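
For the t-test fallback mentioned in the last bullet, the bias estimate is simply the mean of the paired differences. Here is a minimal sketch with illustrative data; the bias is a reliable estimate of systematic error at a decision level only when the mean of the comparison results lies close to that level.

```python
import math
import statistics

# Paired results for the same specimens: test method (y) and comparison (x).
test = [102, 76, 55, 131, 208, 93, 149, 170, 118, 240]
comp = [100, 75, 54, 128, 205, 92, 147, 168, 116, 237]

diffs = [t - c for t, c in zip(test, comp)]
bias = statistics.mean(diffs)                    # estimate of systematic error
sd_d = statistics.stdev(diffs)                   # SD of the differences
t_stat = bias / (sd_d / math.sqrt(len(diffs)))   # paired t statistic

print(f"bias={bias:+.2f}  SD of differences={sd_d:.2f}  t={t_stat:.2f}  "
      f"mean comparison result={statistics.mean(comp):.1f}")
```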

References

  1. Westgard JO. A method evaluation decision chart (MEDx Chart) for judging method performance. Clin Lab Sci 1995;8:277-83. See PDF files on this website.
  2. Information for authors. Clinical Chemistry 1999;45:1-4.