Tools, Technologies and Training for Healthcare Laboratories

What's the Right Goal?

Laboratories are blessed today with a wealth of information about quality requirements, from CLIA, Rilibak, RCPA and of course the Ricos et al biological database. But all those choices can be overwhelming. Often we are asked, "What's the best quality requirement?" or "What's the right quality requirement for my lab?" Here's some advice on what goals to choose.

Sten Westgard, MS
May 2011

A recent article in CCLM asked a reasonable question: why are there differences in quality requirements?

Why do different EQA schemes have apparently different limits of acceptability? Bedrich Friedecky, Josef Kratochvila and Marek Budina, Clin Chem Lab Med 2011;49(4).

Friedecky and colleagues point out that there are very large differences in the quality required by different EQA programs. Shouldn't the limits of acceptability in Germany be the same as the limits in the Czech Republic? Shouldn't the limits in Australia be the same limits in the US? Why is the bar set higher in some countries and lower in others?

Given the global interconnectedness of the diagnostic industry (labs around the world tend to use the same instruments and work with the same diagnostic companies) and the medical laboratory profession (scientific findings and principles of best practices are disseminated widely), surely we should all have the same quality requirements, right?

Why are quality requirements different?

Back when we first started asking how good a test needs to be, there weren't many answers. CLIA provided one of the first sets of guidelines, with proficiency testing criteria established in 1992. But those criteria didn't cover all tests (and haven't expanded to cover any more analytes since the original publication date, nor have the numbers been updated in more than 20 years). Slowly, though, other organizations and expert groups began formulating goals.

Today, we have a wealth of information about the quality required of various laboratory tests. It's still incomplete, of course, with many new tests lacking any specification for quality, and it's still subject to change, but a laboratory can now turn to the Ricos biologic variation database, the German Rilibak, or the Royal College of Pathologists of Australasia (RCPA) Allowable Limits of Performance.

Aren't all quality requirements the same?

Friedecky et al created a table highlighting some of the differences:

[this table is adapted from the letter, omitting some analytes, using updated 2010 RCPA requirements, and eliminating the listing of the GE-RSMD% requirement ]

Analyte       CLIA                           Ricos    RCPA                                       Rilibak   SEKK
Sodium        4 mmol/L                       0.9%     ±3 mmol/L below 150 mmol/L; ±2% above      5%        5%
Potassium     0.5 mmol/L                     6%       ±0.2 mmol/L below 4.0 mmol/L; ±5% above    8%        8%
Chloride      5%                             1.5%     ±3.0 mmol/L below 100 mmol/L; ±3% above    8%        7%
Calcium       0.25 mmol/L                    2.4%     ±0.10 mmol/L below 2.5 mmol/L; ±4% above   10%       10%
Protein       10%                            3.4%     ±3.0 g/L below 60 g/L; ±5% above           10%       10%
Albumin       10%                            4%       ±2.0 g/L below 33.0 g/L; ±6% above         20%       12%
Bilirubin     20%                            31.1%    ±3 µmol/L below 25 µmol/L; ±12% above      22%       21%
Cholesterol   10%                            8.5%     ±0.3 mmol/L below 5 mmol/L; ±6% above      13%       10%
Glucose       ±6 mg/dL or ±10% (greater)     6.9%     ±0.4 mmol/L below 5.0 mmol/L; ±8% above    15%       10%
Urea          ±2 mg/dL or ±9% (greater)      15.7%    ±0.5 mmol/L below 4.0 mmol/L; ±12% above   20%       15%
Creatinine    ±0.3 mg/dL or ±15% (greater)   8.2%     ±8 µmol/L below 100 µmol/L; ±8% above      20%       15%

These differences are not trivial, and they have a significant impact on EQA program outcomes. Assessing laboratory results from the SEKK EQA program (380 to 402 participants), Friedecky et al found that 98% of labs could meet the Rilibak and SEKK acceptance limits, 87% could meet the CLIA limits, 72% could meet the RCPA limits, and only 22% could meet the biological limits from the Ricos et al database. The full letter lists all the different success rates. The authors conclude:

"From the data it is clear that it would be undesirable to derive the size of the acceptance limits from the biological variability of electrolytes (with the exception of potassium), proteins, creatinine, and albumin."

In other words, the choice of quality goal matters. If you choose small targets, many of today's methods will not hit them.
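To make that concrete, here is a minimal sketch of how one method stacks up against the different sodium goals in the table above. The method's bias and CV are invented for illustration, and the conversion of CLIA's absolute 4 mmol/L limit to a percentage assumes a decision level of 140 mmol/L; total error is estimated with the common TE = |bias| + 1.65·CV model.

```python
# Compare one method's total analytical error against several TEa goals.
# The sodium TEa values (%) come from the table above; the method's bias
# and CV are hypothetical illustration values, not real data.

def total_error(bias_pct, cv_pct, z=1.65):
    """Estimate total analytical error: TE = |bias| + z * CV."""
    return abs(bias_pct) + z * cv_pct

# Hypothetical sodium method: 0.5% bias, 1.0% imprecision (CV)
te = total_error(0.5, 1.0)   # 0.5 + 1.65 * 1.0 = 2.15%

sodium_goals_pct = {
    "CLIA (4 mmol/L, assumed level 140)": 4 / 140 * 100,
    "Ricos (biologic)": 0.9,
    "RCPA (above 150 mmol/L)": 2.0,
    "Rilibak": 5.0,
    "SEKK": 5.0,
}

for goal, tea in sodium_goals_pct.items():
    verdict = "pass" if te <= tea else "FAIL"
    print(f"{goal}: TEa = {tea:.1f}%, TE = {te:.2f}% -> {verdict}")
```

The same hypothetical method passes the consensus-driven goals and fails the biologically derived one, which is exactly the pattern the SEKK pass rates show.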

So why are quality requirements different?

You might wonder if the Germans and Australians are using tests in different ways, but that's actually not the reason why their quality requirements differ. The differences in goals are caused not by the use of test results, but by the approach to formulating the goal.

It's helpful to review the different approaches to goal setting. These are most clearly defined by the "Stockholm consensus hierarchy," the outcome of a 1999 conference in Stockholm, Sweden (Strategies to Set Global Quality Specifications in Laboratory Medicine) that sought at least to harmonize the quality requirements in use.

Here is the hierarchy of models as determined by the conference:

"I. Evaluation of the effect of analytical performance on clinical outcomes in specific clinical settings

II. Evaluation of the effect of analytical performance on clinical decisions in general:

A. data based on components of biological variation
B. data based on analysis of clinicians' opinions

III. Published professional recommendations

A. from national and international expert bodies
B. from expert local groups or individuals

IV. Performance goals set by

A. regulatory bodies
B. organisers of External Quality Assessment (EQA) schemes

V. Goals based on the current state of the art

A. as demonstrated by data from EQA or Proficiency Testing schemes
B. as found in current publications on methodology.

Where available, and when appropriate for the intended purpose, models higher in the hierarchy are to be preferred to those at lower levels."

While the hierarchy defines five levels, we can boil them down to three general approaches:

1. Consensus-driven. This approach includes "state of the art" goals and compliance goals set by regulators. The aim is to find a goal that most laboratories and methods can meet: whatever the prevailing performance of methods happens to be at the time, that becomes the goal for acceptable performance, so that, except for some egregious outliers, most laboratories and methods pass. In other words, lower the bar until most labs and methods can succeed.

This type of approach, in other spheres, is known as "social promotion." Rather than objectively evaluating performance, you just try to keep the peer group together and hope that, over time, things will improve. The unfortunate side effect is that bad performers get through. This type of approach can perpetuate bad performance, may provide no incentive for methods to improve, and may have a significant deleterious effect on patients.

2. Biology-driven. By studying within-subject biologic variation, we can determine how much natural variation is expected in a test result, and from that, how much variation a laboratory method can be allowed to add to the total. That's exactly what Carmen Ricos and her colleagues have done for more than a decade, compiling the published studies of biologic variation for different tests into a database. The Ricos database further specifies goals for minimum, desirable, and optimal performance, using Callum Fraser's work on biologic variation.

This approach ignores current method performance in the field, and instead determines how good performance should be. The advantage of this is that it provides a demanding objectively-determined goal. The disadvantage is that it may set a goal that is out of reach for current methods. However, even such a demanding quality requirement serves to provide an ultimate goal for performance, akin to "this is how good test methods ultimately need to be."
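As a sketch of how such biologically derived goals are calculated, here are Fraser's widely cited formulas; the sodium CVi and CVg values below are illustrative figures of the kind compiled in the Ricos database, not authoritative entries.

```python
import math

# Performance specifications derived from biologic variation (Fraser).
# cvi = within-subject CV (%), cvg = between-subject CV (%).
# The 0.5/0.25 factors give "desirable" goals; 0.75/0.375 give
# "minimum" and 0.25/0.125 give "optimal".

def biologic_goals(cvi, cvg, level="desirable"):
    factors = {"minimum": (0.75, 0.375),
               "desirable": (0.5, 0.25),
               "optimal": (0.25, 0.125)}
    f_imp, f_bias = factors[level]
    imprecision = f_imp * cvi                    # allowable CV
    bias = f_bias * math.sqrt(cvi**2 + cvg**2)   # allowable bias
    tea = bias + 1.65 * imprecision              # allowable total error
    return imprecision, bias, tea

# Illustrative sodium values of the kind listed in the Ricos database:
cvi, cvg = 0.7, 1.0
imp, bias, tea = biologic_goals(cvi, cvg)
print(f"desirable: CV <= {imp:.2f}%, bias <= {bias:.2f}%, TEa <= {tea:.1f}%")
```

With these inputs the desirable TEa works out to roughly 0.9%, which is the order of the sodium figure in the Ricos column of the table above, and which explains why so few methods can meet it.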

3. Clinical-Use-driven. A final way to set quality specifications is to study not the biologic variation, nor the current method performance, but instead the actual behavior of clinicians using test results. If you can evaluate how decisions, diagnoses, and treatments are made at different cutoffs and thresholds, you can reverse engineer that information into quality specifications for analytical methods.

This approach has most recently been used by Karon and Klee in their work on glucose and tight glycemic control. By analyzing medical records and glucose results, they determined how clinicians made decisions on insulin dosing. Through simulation, they determined the size of the analytical error that would significantly change the dose. From that critical error, laboratories can then determine the allowable error in glucose methods. This kind of approach doesn't accept current analytical performance, but it does reflect current clinical practice. It may not agree with the biologic limits, although potentially it could.
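The reverse-engineering logic can be sketched in a few lines. This is emphatically not Karon and Klee's actual model: the dosing thresholds, glucose distribution, and 5% tolerance below are all hypothetical, chosen only to show how decision changes translate into an error limit.

```python
import random

# Sketch of reverse-engineering a quality goal from clinical decisions.
# NOT the Karon/Klee model: thresholds, patient glucose range, and the
# 5% tolerance are hypothetical illustrations.

DOSE_THRESHOLDS = [4.0, 6.0, 8.0, 10.0]   # mmol/L, hypothetical insulin bands

def dose_band(glucose):
    """Which insulin-dosing band a glucose value falls into (0-4)."""
    return sum(glucose >= t for t in DOSE_THRESHOLDS)

def fraction_changed(cv_pct, n=20000, seed=1):
    """Fraction of dosing decisions altered by analytical error of size cv_pct."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(n):
        true_val = rng.uniform(3.0, 12.0)   # hypothetical patient range
        measured = true_val * (1 + rng.gauss(0, cv_pct / 100))
        if dose_band(measured) != dose_band(true_val):
            changed += 1
    return changed / n

# Find the largest CV that changes fewer than, say, 5% of dosing decisions:
for cv in (1, 2, 3, 4, 5, 6):
    print(f"CV = {cv}%: {fraction_changed(cv):.1%} of decisions change")
```

The largest error size that keeps decision changes below the chosen tolerance becomes the candidate quality specification for the method.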

Still, what quality requirement should a laboratory use?

The question remains. Even sorting the types of quality requirements into hierarchies and approaches doesn't necessarily make the choice any easier. What's a laboratory to do?

The Stockholm hierarchy recommends that clinical use trumps biology and consensus. That is, how your clinicians use the test is the most important basis for determining the quality the test requires. That clinical use is probably more demanding than the consensus goals, but may be less demanding than the biological goals.

We can hope that manufacturers set their design goals based on clinical use and biology, and further hope that, as method performance reaches the goals set by biology, clinicians modify their use of test results to take advantage of those improvements. In that final state, the clinical-use and biology quality requirements agree.

The typical laboratory faces a set of consensus-driven quality requirements, which are often less demanding but obligatory (the laboratory must meet them or risk being shut down, or losing reimbursement and funding). If those compliance goals can be met, the next step is to evaluate whether the clinical-use-driven goals are achievable. Often, though, clinical use is not well known to the laboratory (or clinicians don't agree on use), while the biological goals have been documented and are readily available. Meeting either of those types of goals will be more of a challenge. Ultimately, the laboratory wants to meet clinically relevant performance goals, so it can reliably and usefully support the medical decisions made with its test results.

Coda: if goals disagree, should we drop them?

There is a temptation to inflate these differences into a fatal flaw for the entire allowable-error approach. The thinking goes somewhat like this: "If no one can agree on the quality required by a test, we should give up on quality requirements entirely and choose another approach... [insert your favorite quality fad of the moment here]"

We may not agree on the right speed limit for our streets and highways, but surely that doesn't mean we should abolish speed limits outright. There is certainly room for improvement in the formulation and use of quality goals, but it will take less effort to improve this system than to abandon it and start over.