The latest on Reference Values and Reference Intervals
Dr. Per Hyltoft-Petersen discusses why reference values and reference intervals are still an issue and are still important.
A Review of the Clinical Chemistry and Laboratory Medicine special issue on Reference Values (Volume 42, Number 7, 2004)
- Can the concept of reference values be improved?
- Is there a concept of health and normality?
- Statistical description of reference values
- Interpretation of the reference limits
- The concept of common and multicentre intervals
- Special aspects of reference intervals
- Analytical quality in relation to reference intervals
- An Alternative Approach?
Per Hyltoft Petersen
Department of Clinical Biochemistry, Odense University Hospital, 5000 Odense C, Denmark
and NOKLUS, Norwegian centre for external quality assurance of primary care laboratories, Division for General Practice, University of Bergen, N-5021 Bergen, Norway
Because there is still a lot to do.
Through the 1980s, IFCC’s expert panel on reference values developed the theories on reference values and reference intervals and published their thoughts and recommendations in a series of excellent papers (1-6), which handled the most important aspects of the concept of reference values and the calculation and presentation of reference intervals. These recommendations have been widely adopted, but the concepts of reference values and reference intervals are not static. They are still changing, even though the fundamental ideas are kept as the basis for further developments. Thus, in 2000 Henny et al. (7) presented a “Need for revisiting the concept of reference values”, pointing to the need for more practical recommendations regarding systematic errors and transferability, the reference population, the statistical methods used, reference and decision limits, and the question of which percentiles to use.
These questions and many others are dealt with in the July special issue of Clinical Chemistry and Laboratory Medicine (CCLM) on 'Reference values and reference intervals' (Volume 42, Number 7, 2004). In it, many of the most outstanding scientists working on the reference concept and related areas present their thoughts, in order to reinforce the concepts that are still valid and to draw attention to new ideas that have developed since the publication of the recommendations, with the aim of stimulating further developments. There are different opinions among the authors, but these divergences make the different ideas more approachable and open for debate. The goal of the special issue is not to produce new guidelines, but to bring more focus and debate.
Ralph Gräsbeck gives a personal historical view of the evolution of the reference value concept (8), which he developed together with Professor Saris back in 1969. Introducing the philosophy of reference values was a long process, since the profession was satisfied with the traditional idea of normal values, and Gräsbeck describes how they had to admit that health is a relative thing, and that a person may be ill from one point of view and well from another. He also points out that the field of reference values is only one part of laboratory medicine, where tests should also be evaluated with respect to other qualities, such as their clinical utility. Gräsbeck ends his contribution by saying that he is glad that the seed they planted gave such an unexpected harvest. Claude PetitClerc, another key person from the IFCC committee on reference values, is less optimistic, as ‘the term “reference values” is well implemented but the concept not’ (9). He discusses “normality” from different perspectives, from the clinician’s pragmatic approach to preventive medicine, and goes on to the challenges created by clinicians’ demands that reference values be kept constant over time and geography, ending with the need for individual reference intervals. Ritchie and Palomaki (10) question the relevance of one single reference interval based on healthy individuals. They point out that each specific disease in principle also needs a specific reference interval, which in reality may lead to variable decision limits or cut-off points.
So, even though the basic ideas are well accepted, there are still questions about ‘how to define these reference individuals’ in practice.
It is clear from Helge Erik Solberg’s contribution (11) on the IFCC recommendations on estimation of reference intervals that the concepts are still valid and that the RefVal programme is the best basis for calculations of reference intervals. Solberg recommends the non-parametric calculations, and the programme has been expanded with the bootstrap technique in order to reduce the size of the confidence intervals around the reference limits. Hyltoft Petersen et al. (12) introduce a graphical technique, based on assumptions about Gaussian distributions of logarithmic reference values under continued partitioning into subgroups, in order to disclose lack of homogeneity and to give a better parametric description of the subgroups. The question of partitioning is also dealt with by Ari Lahti (13), who compares the published statistical methods for partitioning reference values into separate reference intervals and establishes criteria for when partitioning is recommended and when it is not. He also indicates a grey zone where non-statistical conditions are considered as well. Reference values that increase or decrease continuously, principally in relation to age, are described by Virtanen et al. (14) on the basis of covariate-dependent reference limits. The problem of combining two or more reference regions in order to improve the interpretation of patient data is described by James Boyd (15). This elegant combination, based on parametric distributions, is not in general use, as it also has some drawbacks, which are discussed.
Here, we have the statistics for describing crude reference populations by non-parametric methods, supplemented by parametric statistics to select and describe subgroups, as well as criteria for when to perform the partitioning. Continuously changing values have to be described by other tools, and the combination of two or more quantities likewise requires different tools.
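As an illustration of the non-parametric approach, the following minimal Python sketch estimates the 2.5 and 97.5 percentiles from ranked reference values and attaches a simple percentile-bootstrap confidence interval to each limit. It is not the RefVal programme and does not claim to reproduce its algorithm. The rank convention p(n + 1), the function names, the number of resamples and the example data are all assumptions made for this illustration.

```python
import random

def fractile(sorted_values, p):
    """Value at rank p*(n+1) (1-based), with linear interpolation."""
    n = len(sorted_values)
    rank = min(max(p * (n + 1), 1.0), float(n))  # clamp to ranks 1..n
    lower = int(rank)                            # integer part of the rank
    frac = rank - lower
    if lower >= n:
        return sorted_values[n - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)

def reference_interval(values, low=0.025, high=0.975):
    """Non-parametric central 95 % interval from the ranked values."""
    s = sorted(values)
    return fractile(s, low), fractile(s, high)

def bootstrap_limits(values, n_resamples=500, ci=0.90, seed=0):
    """Percentile-bootstrap confidence interval for each reference limit.

    Hypothetical helper: resamples with replacement, recomputes both
    limits each time, and reads the CI off the resampled limits.
    """
    rng = random.Random(seed)
    n = len(values)
    lows, highs = [], []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in range(n)]
        lo, hi = reference_interval(sample)
        lows.append(lo)
        highs.append(hi)
    lows.sort()
    highs.sort()
    tail = (1.0 - ci) / 2.0
    return ((fractile(lows, tail), fractile(lows, 1.0 - tail)),
            (fractile(highs, tail), fractile(highs, 1.0 - tail)))

if __name__ == "__main__":
    # Made-up reference values: the integers 1..200.
    data = list(range(1, 201))
    print("reference limits:", reference_interval(data))
    print("CIs of the limits:", bootstrap_limits(data))
```

With small reference samples, the confidence intervals around the limits become wide, which is one reason why large sample sizes are emphasized throughout the special issue.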
The combination of reference regions (15) solves some of the problems of probabilities in repeated testing, which is the main theme of Jørgensen et al.’s contribution on the increasing use of test results in wellness testing (16). Such testing results in a high percentage of false positive results when the traditional description of reference values as 95 % reference intervals is used for the purpose. George Klee (17) compares reference limits and decision limits and makes it clear that, in general, the reference limits should not be used as cut-off points. He also points to the costs related to wrong diagnosis and stresses the need for improving the analytical and clinical quality. The problem with the use of population-based reference intervals is also the theme of Callum G. Fraser’s contribution (18). He points to the flaws of population-based reference intervals that arise from the biological individuality shown by everyone, as the dispersion of values for any individual may span only a small part of the traditional reference interval for many quantities. The quantitative measure of this relationship is the index of individuality, which reflects the ratio between within-subject and between-subject biological variation. The smaller the index, the more we can benefit from individual reference intervals. Josep Queraltó (19) describes such individual reference intervals and gives the concept and formulas for time series analysis in its many aspects, from ‘reference change values’ to ‘random walk models’.
Here the use of 95 % reference intervals is questioned, both because the probabilities change under repeated testing and because reference limits are misused as decision limits (cut-off points). Further, the use of population-based reference intervals is criticised, as individual reference intervals for each single individual are preferable if available.
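These arguments can be made concrete with a little arithmetic. The Python sketch below is illustrative only. It assumes that repeated tests are independent and that each is judged against a 95 % reference interval. It uses one simple, commonly cited form of the index of individuality (within-subject CV divided by between-subject CV) and of the reference change value. The function names and the example coefficients of variation are hypothetical and are not taken from the special issue.

```python
import math

def prob_at_least_one_outside(n_tests, coverage=0.95):
    """Probability that a healthy person has at least one result outside
    the reference interval among n independent tests (assumption)."""
    return 1.0 - coverage ** n_tests

def index_of_individuality(cv_within, cv_between):
    """Ratio of within-subject to between-subject biological variation."""
    return cv_within / cv_between

def reference_change_value(cv_analytical, cv_within, z=1.96):
    """Smallest percentage difference between two serial results that is
    unlikely to arise from analytical and within-subject variation alone."""
    return math.sqrt(2.0) * z * math.sqrt(cv_analytical ** 2 + cv_within ** 2)

if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, "tests:", round(prob_at_least_one_outside(n), 3))
    # Hypothetical CVs in percent, chosen only to show the calculation.
    print("index of individuality:", index_of_individuality(5.0, 20.0))
    print("RCV (%):", round(reference_change_value(3.0, 4.0), 1))
```

Under the independence assumption, a panel of 20 tests gives a healthy individual a probability of roughly 64 % of at least one result outside a 95 % reference interval, which illustrates the concern about wellness testing.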
In principle we should have the same reference intervals for homogeneous groups, independent of which laboratory performs the measurements, but this is not the case. Thus, the idea today is to establish common reference intervals where possible. Fuentes-Arderiu et al. (20) have established common reference intervals in Spain for laboratories using the same equipment and the same reagents. Rustad et al. (21) have made the same effort in the Nordic countries, but across analytical methods, using a liquid frozen reference preparation with concentration values traceable to reference methods. It is clear from the investigations that this is a cumbersome task and that firm organization is needed to complete such projects. Investigations of different racial/ethnic groups reveal genetic and physiological differences in reference intervals for proteins. Johnson et al. (22) investigated Caucasians and Asian Indians in Leeds, England, where the plasma protein alpha1-antitrypsin in particular showed phenomenological differences that could be explained genetically. Ichihara et al. (23) investigated different Asian cities and showed major differences between people in Tokyo and the other cities, such as Hong Kong and Singapore.
Thus, the goal is to establish common reference intervals for homogeneous groups across methods and across countries, based on very large sample sizes, which makes it possible to establish reference intervals for relevant subgroups of the main populations as well as for ethnic and racial minorities. Differences in reference intervals should depend on differences between homogeneous populations and not on analytical methods or individual laboratories establishing their own intervals based on poor selection of reference individuals and with small sample sizes.
The selection of reference individuals is a difficult process, and the recommendations state that the important issue is to describe criteria for rule-in and rule-out for published reference intervals. Gérard Siest (24) describes the French approach, in which special centres for Preventive Medicine are established, where the individuals are examined by a physician and by a series of clinical chemical analyses at regular intervals. This makes it possible to compare the results over time and thereby confirm the absence of malignant diseases (as well as early indications of pathological changes). For plasma glucose the establishment of a reference population is especially difficult, as the diagnosis of diabetes mellitus is defined by the measured concentration of the quantity. Consequently, Jørgensen et al. (25) ruled out all individuals with risk factors, and thus established a ‘low risk’ reference interval. In a comparable study on the thyroid hormone TSH, Jensen et al. (26) excluded all individuals with thyroid antibodies or a family history of thyroid diseases. A further problem in establishing reference intervals arises for special fluids, which are difficult or even risky to obtain from the patient or reference individual. This very difficult process is described by Jean-Louis Dhondt (27) for cerebrospinal fluid, giving an example of how to handle this delicate problem.
The centres for Preventive Medicine make it attractive to establish reference intervals retrospectively, as the reference individuals are seen regularly and any outcome can be seen at the next visit. As in the first section, the contributors here also focus on the problem of variable criteria for establishing reference intervals for quantities with a specific relation to the diseases under investigation, and thereby on the problem of recruiting individuals without risk of the disease. The problem of reference intervals for cerebrospinal fluid can hardly be solved, due to ethical and risk problems, but cooperation between all laboratories could be of substantial help.
Analytical quality must be sufficient to secure the optimal utilization of reference intervals. For individually established laboratory reference intervals, constancy from the creation of the reference intervals to their use is crucial, but for common reference intervals shared by several laboratories, painstaking control is mandatory. Thienpont et al. (28) have described the elements of traceability of values from SI units and reference methods to the calibrator used in the laboratory. They have further pointed to the many problems with the traceability of most clinical chemical quantities, due to ‘families’ of molecules and to matrix problems. The practical creation of the needed analytical quality is described by Klein and Junge (29) as performed in industry, with the many checks of the quality of reference intervals and analytical performance before a kit is released. Turning to the control of analytical quality, Ricós et al. (30) illustrate the external control performed with commutable control materials in relation to analytical quality specifications, which are derived for the purpose of sharing common reference intervals for homogeneous groups. In daily routine performance, analytical stability is most important, and the control materials and control rules of internal control systems as designed by James Westgard (31) are necessary tools to monitor this quality. The combination of rules with a low probability of false rejection and a high probability of error detection can be selected by computer programme.
To assure proper analytical quality, at least three elements must be present: analytical quality specifications (to know what the quality should be); creation of that quality through reference methods and industrial production of kits according to the analytical quality specifications; and control procedures based on the same specifications, performed in practice as external control by external quality assessment organizers and as internal quality control in each single laboratory.
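As a hedged illustration of how control rules can be combined, the Python sketch below applies two widely described control rules to a series of control measurements: 1_3s (one control result more than 3 SD from the target) and 2_2s (two consecutive results beyond the same 2 SD limit). The choice of these two rules, the function names and the numbers are assumptions made for illustration. They are not the design described in reference (31).

```python
def z_scores(controls, target_mean, target_sd):
    """Express each control result as its distance from target in SD units."""
    return [(x - target_mean) / target_sd for x in controls]

def rule_1_3s(z):
    """Violated if any single control result is more than 3 SD from target."""
    return any(abs(v) > 3 for v in z)

def rule_2_2s(z):
    """Violated if two consecutive results exceed the same 2 SD limit."""
    return any((a > 2 and b > 2) or (a < -2 and b < -2)
               for a, b in zip(z, z[1:]))

def run_rejected(controls, target_mean, target_sd):
    """Reject the analytical run if either rule is violated."""
    z = z_scores(controls, target_mean, target_sd)
    return rule_1_3s(z) or rule_2_2s(z)

if __name__ == "__main__":
    # Hypothetical control material with target 100 and SD 2.
    print(run_rejected([100.0, 101.0, 99.0], 100.0, 2.0))   # in control
    print(run_rejected([100.0, 107.0, 100.0], 100.0, 2.0))  # 1_3s violated
    print(run_rejected([104.5, 104.2, 100.0], 100.0, 2.0))  # 2_2s violated
```

Combining a rule with a wide limit (few false rejections) with a rule that looks at consecutive results (better detection of systematic shifts) reflects the trade-off between false rejection and error detection.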
Is there an alternative to the reference concept with reference individuals, reference values, reference intervals and so on? Henk Goldschmidt (32) has put forward such a provocative vision, ‘The NEXUS vision.’ While it will probably not be established within the next few years, it can provide the basis and stimulation for future discussions.
The conclusion from all these important contributions to this special issue of CCLM on reference values and reference intervals is that the concepts from the IFCC recommendations are still valid, but need to be expanded considerably, in order to catch up with all the additional ideas, concepts and practical problems.
1. Solberg HE. Approved recommendation (1986) on the theory of reference values. Part 1. The concept of reference values. J Clin Chem Clin Biochem 1987; 25:337-42.
2. PetitClerc C, Solberg HE. Approved recommendation (1987) on the theory of reference values. Part 2. Selection of individuals for the production of reference values. J Clin Chem Clin Biochem 1987; 25:639-44.
3. Solberg HE, PetitClerc C. Approved recommendation (1988) on the theory of reference values. Part 3. Preparation of individuals and collection of specimens for the production of reference values. J Clin Chem Clin Biochem 1988; 26:593-8.
4. Solberg HE, Stamm D. Approved recommendation on the theory on reference values. Part 4. Control of analytical variation in the production, transfer and application of reference values. Eur J Clin Chem Clin Biochem 1991; 29:531-5.
5. Solberg HE. Approved recommendation (1987) on the theory of reference values. Part 5. Statistical treatment of collected reference values. Determination of reference limits. J Clin Chem Clin Biochem 1987; 25:645-56.
6. Dybkær R, Solberg HE. Approved recommendation (1987) on the theory of reference values. Part 6. Presentation of observed values related to reference values. J Clin Chem Clin Biochem 1987; 25:657-62.
7. Henny J, Petitclerc C, Fuentes-Arderiu X, Hyltoft Petersen P, Queraltó JM, Schiele F, Siest G. Need for revisiting the concept of reference values. Clin Chem Lab Med 2000; 38:589-95.
8. Gräsbeck R. The evolution of the reference value concept. Clin Chem Lab Med 2004; 42:692-7.
9. Petitclerc C. Normality: The unreachable star? Clin Chem Lab Med 2004; 42:698-701.
10. Ritchie RF, Palomaki G. Selection of clinically relevant populations for reference ranges. Clin Chem Lab Med 2004; 42:702-9.
11. Solberg HE. The IFCC Recommendation on estimation of reference intervals. The RefVal Program. Clin Chem Lab Med 2004; 42:710-4.
12. Hyltoft Petersen P, Blaabjerg O, Andersen M, Jørgensen LGM, Schousboe K, Jensen E. Graphical interpretation of confidence curves in rankit plots. Clin Chem Lab Med 2004; 42:715-24.
13. Lahti A. Partitioning biochemical reference data into subgroups: Comparison of existing methods. Clin Chem Lab Med 2004; 42:725-33.
14. Virtanen A, Kairisto V, Uusipaikka E. Parametric methods for estimating covariate dependent reference limits. Clin Chem Lab Med 2004; 42:734-8.
15. Boyd JC. Reference regions of two or more dimensions. Clin Chem Lab Med 2004; 42:739-46.
16. Jørgensen LGM, Brandslund I, Hyltoft Petersen P. Shall we maintain the 95 percent reference intervals in the era of wellness testing. A concept paper. Clin Chem Lab Med 2004; 42:747-51.
17. Klee GG. Clinical interpretation of reference intervals and reference limits. A plea for assay harmonization. Clin Chem Lab Med 2004; 42:752-7.
18. Fraser CG. Inherent biological variation and reference values. Clin Chem Lab Med 2004; 42:758-64.
19. Queraltó JM. Intraindividual reference values. Clin Chem Lab Med 2004; 42:765-77.
20. Fuentes-Arderiu X, Mas-Serra R, Alumà-Trullàs A, Martí-Marcet MI, Dot-Bach D. Guideline for the production of multicentre physiological reference values using the same measurement system. A proposal of the Catalan Association for Clinical Laboratory Sciences. Clin Chem Lab Med 2004; 42:778-82.
21. Rustad P, Felding P, Lahti A. Proposal for guidelines to establish common biological reference intervals in large geographical areas for biochemical quantities measured frequently in serum and plasma. Clin Chem Lab Med 2004; 42:783-91.
22. Johnson AM, Hyltoft Petersen P, Whicher JT, Carlström A, MacLennan S, On behalf of the International Federation for Clinical Chemistry and Laboratory Medicine, Committee on Plasma Proteins. Reference intervals for plasma proteins: Similarities and differences between adult Caucasians and Asian Indian males in Yorkshire, UK. Clin Chem Lab Med 2004; 42:792-9.
23. Ichihara K, Itoh Y, Min WK, Yap SF, Lam CWK, Kong XT, Chou CT, Nakamura H. Diagnostic and epidemiological implications of regional differences in serum concentrations of proteins observed in six Asian cities. Clin Chem Lab Med 2004; 42:800-9.
24. Siest G. Study of reference values and biological variations: a necessity and a model for Preventive Medicine Centers. Clin Chem Lab Med 2004; 42:810-6.
25. Jørgensen LGM, Brandslund I, Hyltoft Petersen P, Stahl M, de Fine Olivarius N. Creation of a low risk reference group and reference interval of fasting venous plasma glucose. Clin Chem Lab Med 2004; 42:817-23.
26. Jensen E, Hyltoft Petersen P, Blaabjerg O, Skov Hansen P, Brix TH, Ohm Kyvik K, Hegedüs L. Establishment of a serum TSH reference interval in healthy adults. The importance of environmental factors, including thyroid antibodies. Clin Chem Lab Med 2004; 42:824-32.
27. Dhondt J-L. Difficulties for establishing reference intervals for special fluids: the example of 5-hydroxyindole acetic acid and homovanillic acid in cerebro-spinal fluids. Clin Chem Lab Med 2004; 42:833-41.
28. Thienpont T, Van Uytfanghe K, Rodríguez Cabaleiro D. Metrological traceability of calibration in the estimation and use of common medical decision-making criteria. Clin Chem Lab Med 2004; 42:842-50.
29. Klein G, Junge W. Creation of the necessary analytical quality for generating and using reference intervals. Clin Chem Lab Med 2004; 42:851-7.
30. Ricós C, Doménech M, Perich C. Analytical quality specifications for common reference intervals. Clin Chem Lab Med 2004; 42:858-62.
31. Westgard JO. Design of internal quality control for reference value studies. Clin Chem Lab Med 2004; 42:863-7.
32. Goldschmidt HMJ. The NEXUS vision. An alternative to the reference value concept. Clin Chem Lab Med 2004; 42:868-73.