Milan Manifesto: Are Quality Goals Evolving, Devolving, or simply Revolving?

While Americans were gorging themselves on turkey, assiduously avoiding talking politics with their relatives, and slipping into tryptophan-induced unconsciousness in front of the television, an elite group of metrologists, biochemical scientists, and leaders of the quality field were gathering in Milan to discuss a far-ranging series of topics. Oh, and they were also deciding the fate of the laboratory world: how and what analytical quality specifications will be allowed in the future.

 

James O. Westgard, PhD and Sten Westgard, MS
November 2014

On November 24th and 25th, the first European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Strategic Conference, "Defining analytical performance goals 15 years after the Stockholm conference," was held in Milan, Italy.

The goal was to revisit the 1999 Stockholm consensus statement, a seminal document which established the ranking and relative utility of analytical quality specifications. The Stockholm consensus was a recommendation for a hierarchy of goals, with clinical goals at the top, followed by goals based on biologic variation, then expert groups, proficiency (or EQA) criteria, and lastly "state of the art" performance. Whether you realize it or not, the Stockholm consensus has informed the setting of most quality specifications for the last decade and a half, particularly the widespread use of goals based on biologic variability. The Milan meeting was to decide if this hierarchy should be changed.

Anticlimactically, the Milan conference began with an announcement that the organizers had already reached their conclusions. In fact, they had held an earlier meeting in Istanbul (during the IFCC WorldLab conference), and in the month of August they had already drafted the new consensus of the Milan meeting. Those conclusions were revealed to participants at the end of the first day.

Speaking of participants, all the speakers were from Europe or Australia. That might appear reasonable, since this was planned by a European professional organization. However, the earlier Stockholm conference included more widespread participation, including several speakers from the western part of the world. Because of this limitation, the revised consensus may actually be of more limited value than the original Stockholm hierarchy. It seemed like a strange way to hold a meeting that is intended to influence global practices in medical laboratories! The conclusions were already drafted and the meeting agenda was set to discuss only those points of view that agreed with those conclusions. The meeting was therefore less a debate or discussion to establish consensus than an attempt to dictate the consensus of a few to the many who were attending. It was very reminiscent of the process followed by CLSI and CMS for the development of risk-based QC plans in the US.

To be fair, the issued consensus statement was only a draft, and participants were invited to make suggestions for modifications. So it is possible that the participants of the meeting will indeed have some role in shaping the meeting's conclusion.

The Milan Manifesto (or is it the Edict of Istanbul?)

The new consensus draft is fairly innocuous. Where the Stockholm consensus had established 5 levels of hierarchy for quality specifications, the new suggestion is to simplify the hierarchy to just 3 models: goals from clinician surveys, goals from biologic variation, and goals from state of the art. That is not to say that any model from the original Stockholm consensus is excluded. Two previous levels of the hierarchy, the one from "expert groups" or "guidelines" and the one from government and EQA/PT guidelines, are now basically lumped into the state of the art model. It's hard to argue against reshuffling the deck slightly, refining the terminology, etc. But from our perspective, this modification was not the full agenda of the meeting.

It was also announced that there would be working groups to follow the meeting and help implement the recommendations. These Task Force Groups were not revealed until the very end of the meeting. But, surprise, surprise, five different Task Force Groups had already been established to focus on the following issues:

  1. Performance on criteria models for specific laboratory tests. To allocate different tests to different models, producing a list of proposed models for different tests. No one model will cover all tests, so the triage of tests between models will be completed by this group.
  2. Harmonization of allowable limits in EQAs. To define performance criteria for common analytes to be adopted by all EQA programs globally. This is like the standardization and harmonization movement, but for all EQA and PT programs in the world.
  3. Measurement of Total Error. This group will examine whether Total Error should be used at all, or whether it should be reformulated or modified. The goal implied by the organizers and several of the presenters is that this group will conclude that the Total Error concept should be significantly modified or abandoned.
  4. Performance criteria for pre- and post-analytical (extra-analytical) phases. To generate performance criteria for the pre- and post-analytical phases.
  5. Biological variation database. To implement a critical appraisal of all currently available literature on biologic variation. There is a possibility of launching a new website that will house a reconfigured, recalculated, and more stringent set of biologically-derived performance criteria.

Where the Milan meeting and its task forces seem to be heading is a familiar destination: 50 years ago, quality specifications were expressed only as allowable imprecision and allowable bias. It was the stated intention of many of the organizers and speakers to return to that golden era: no Total Error, just maximum CV and maximum bias. Indeed, some presenters and participants went so far as to demand that only allowable imprecision should be specified, and that no bias be allowed or tolerated. This is an even more radical deconstruction of the current quality goal hierarchy of Stockholm. Ambitious indeed, to eliminate bias globally and create an alternative reality to the real world of laboratory measurements.

This debate over the form of a quality specification has a long history. The argument between separate specifications for CV and bias and a unified specification of Total Error has persisted for at least 40 years, dating back to a paper published by Westgard, Carey, and Wold in 1974 ["Criteria for judging precision and accuracy in method development and evaluation," Clin Chem 1974;20:825-833]. Yes, we at Westgard QC have our own bias in this matter because of our long history of developing practical approaches for evaluating, controlling, and managing the quality of laboratory measurement procedures. Total Error and Sigma Metrics are the keys to practical yet quantitative tools that support laboratories in their efforts to deliver quality test results. That's our bias and we won't pretend it doesn't exist, nor do we believe it should be eliminated.
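For readers less familiar with the two quantities named above, here is a minimal sketch of how they are commonly calculated. It assumes the widely used linear form of Total Error (absolute bias plus a multiple of the CV) and the usual Sigma-metric expression; the multiplier `z`, the function names, and the example numbers are illustrative assumptions, not values taken from the Milan draft.

```python
def total_error(bias_pct, cv_pct, z=2.0):
    """Linear Total Error estimate in percent: |bias| + z * CV.

    z is an assumed multiplier; different sources use different values.
    """
    return abs(bias_pct) + z * cv_pct


def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma metric: (allowable total error - |bias|) / CV, all in percent."""
    if cv_pct <= 0:
        raise ValueError("CV must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# Hypothetical method: 1.0% bias, 2.0% CV, judged against a 10% allowable total error.
observed_te = total_error(bias_pct=1.0, cv_pct=2.0)
sigma = sigma_metric(tea_pct=10.0, bias_pct=1.0, cv_pct=2.0)
print(f"Observed total error: {observed_te:.1f}%  Sigma metric: {sigma:.1f}")
```

Both calculations need nothing more than an estimate of bias, an estimate of CV, and a single allowable total error, which is why this form of specification is practical for routine use.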

It is a more literal revolution to turn back the clock on quality specifications and return to the separate specifications of the 1960s. Other recommendations mentioned in Milan involve generating far more complex models, indeed ones that are so complex that some numbers (like bias) can't be estimated at all. It is as if Quality Specifications are entering a "Chaos Theory" era, where it becomes unknowable what the actual quality specification should be unless all the exact pre-condition variables are known and calculated first. While complex equations may produce a more sophisticated estimate of allowable error, if these estimates can only be performed on a lab-by-lab, test-by-test, day-by-day basis, there is no generalization possible and every laboratory must generate its own highly individual quality specifications.

Finally, it should be said that nearly all of the specifications that are used in EQA and PT programs are implicitly in the form of Total Allowable Errors. CAP PT surveys don't give goals in the form of RMSD, or separate specifications for CV and Bias, or target measurement uncertainty; they set one number that encapsulates implicitly both CV and Bias. While it may be possible to generate a conclusion from a working group to eliminate Total Error, it's going to be a much more challenging task to rewrite the EQA and PT guidelines of every program in the world.
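To illustrate what it means for one number to encapsulate both CV and bias, the sketch below compares a single allowable-total-error check with a pair of separate limits. The limits, the multiplier `z`, and the two methods are hypothetical values chosen only for illustration; they do not come from CAP or any other EQA/PT program.

```python
def meets_total_error_limit(bias_pct, cv_pct, tea_pct, z=2.0):
    """Single combined criterion: |bias| + z * CV must not exceed TEa."""
    return abs(bias_pct) + z * cv_pct <= tea_pct


def meets_separate_limits(bias_pct, cv_pct, max_bias_pct, max_cv_pct):
    """Separate criteria: bias and CV are each judged against their own limit."""
    return abs(bias_pct) <= max_bias_pct and cv_pct <= max_cv_pct


# Hypothetical methods as (bias %, CV %): one imprecise but nearly unbiased,
# one precise but noticeably biased.
methods = {"A": (0.5, 4.0), "B": (4.0, 2.0)}

results = {}
for name, (bias, cv) in methods.items():
    results[name] = (
        meets_total_error_limit(bias, cv, tea_pct=10.0),
        meets_separate_limits(bias, cv, max_bias_pct=3.0, max_cv_pct=3.0),
    )
    print(name, "combined:", results[name][0], "separate:", results[name][1])
```

Under these assumed limits both methods satisfy the combined criterion while each fails one of the separate limits, which shows how a single Total Allowable Error lets bias and imprecision trade off against each other.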

An historical perspective: The Edict of Milan 313 vs. The Edict of Milan 2014

Image: "Constantine Chiaramonti Inv1749" by Unknown. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Constantine_Chiaramonti_Inv1749.jpg

In February of 313, the two Roman Emperors (Constantine I, who controlled the West, and Licinius, who controlled the East) met in Milan and reached a consensus on the treatment of Christians in the Empire, and in more general terms, the religious liberty available to Roman citizens:

"We have resolved among the first thing to ordain those matters by which reverence and worship to the Deity might be exhibited; that is, how we may grant likewise to the Christians, and to all, the free choice to follow that mode of worship which they may wish, that whatsoever divinity and celestial power may exist, may be propitious to us and to all that live under our government. Therefore, we have decreed the following ordinance, as our will, with a salutary and most correct intention, that no freedom at all shall be refused to Christians, to follow or to keep their observances or worship; but that to each one power be granted to devote his mind to that worship which he may think adapted to himself, that the Deity may in all things exhibit to us his accustomed favour and kindness."

From Ecclesiastical History, Book 10, Chapter 5, by Eusebius, translated by C.F. Cruse
http://wadsworth.cengage.com/history_d/special_features/ilrn_legacy/wawc1c01c/content/wciv1/readings/eusebius.html

The importance of this Edict for Christian history is immense. It is often assumed that this edict made Christianity the official religion of the Roman Empire, particularly since Constantine eventually became a strong advocate for Christianity. But the Edict itself was actually not about establishing one and only one religion, but allowing religious freedom.

The 1999 Stockholm meeting established a "freedom of quality" in its time. While it created the hierarchy, it did not specifically forbid any type of specification. The meeting in Milan this year has not exhibited that kind of tolerance. It seems designed less to grant specification freedom than to spark persecution. Those who hold faith in certain models shall be deemed elect and preferred. Those who persist in holding to models deemed unclean (such as Total Error) will face ostracism.

It is worth noting again that the organization of this meeting, while perhaps typical of the current mode of operation for scientific conferences, is nevertheless disconcerting. Before the meeting was convened, the conclusion was pre-ordained. Before any discussion was allowed, the consensus draft was already dictated. Before anyone could debate, the task force groups were already defined. The ground was prepared to eliminate room for objections and minimize any time for opposition.

For those who are so concerned about elimination of bias, it is curious to see such strong bias pervade the creation, control and conclusions of a meeting whose purpose was to create a consensus. While this may be the new norm for "consensus" meetings, this doesn't mean that the consensus position should be adopted without further study and discussion.

We at Westgard QC will continue to contribute to that ongoing dialogue, where we are allowed, as part of the "sifting and winnowing" that is necessary to separate the chaff from the kernels of truth.