
The Differences Dilemma: How to standardize Sigma metrics in a chaotic world

A growing number of publications on Sigma metrics are reporting differences that derive from discrepant performance specifications. Are we in the wrong car, on the wrong road, heading in the wrong direction?

The Differences Dilemma: The Desire for Standardized Sigma metrics in a world of discrepant performance specifications

Many roads, many destinations, no clear path

Sten Westgard, MS
March 2022

As more and more articles on Sigma metrics are published, many of them reach the same conclusion: the choice of performance specification matters. Choose CLIA goals, and you'll get one Sigma metric. Choose goals derived from the EFLM biological variation database, and you'll get a different Sigma metric. Choose the Australian RCPA goals, and you'll get a third Sigma metric.
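To make that point concrete, here is a minimal sketch of the standard Sigma metric calculation, Sigma = (TEa − |bias|) / CV, applied to one method's observed performance under three different goals. The bias, CV, and TEa values are purely illustrative assumptions, not the actual CLIA, EFLM, or RCPA specifications for any analyte.

```python
def sigma_metric(tea_pct: float, bias_pct: float, cv_pct: float) -> float:
    """Sigma metric = (TEa - |bias|) / CV, with all terms expressed in percent."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# One method, one observed performance...
bias, cv = 1.0, 2.0  # percent

# ...evaluated against three hypothetical performance specifications
goals = {"Goal A (generous)": 10.0, "Goal B (moderate)": 7.0, "Goal C (tight)": 4.0}

for name, tea in goals.items():
    print(f"{name}: TEa = {tea}% -> Sigma = {sigma_metric(tea, bias, cv):.1f}")
# Goal A (generous): TEa = 10.0% -> Sigma = 4.5
# Goal B (moderate): TEa = 7.0% -> Sigma = 3.0
# Goal C (tight): TEa = 4.0% -> Sigma = 1.5
```

The method has not changed at all; only the destination has. The same performance looks world class against one goal and unacceptable against another.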

This is often presented as a shocking, even damning, discovery. How can we get different Sigma metrics on the same test? But in truth, there is a lot of naiveté contained in that conclusion.

Of course, the reason that different goals produce different Sigma metrics is just as simple as the reason why Stockholm and Milan and Paris are all different destinations. If you choose to go to Paris, you won’t get the same outcome as if you head to Stockholm. If you decide that Milan is your goal, you will have a different path, a different experience, a different end point than if you were trying to get to Stockholm. Destinations matter. Performance specifications matter, too.

Discerning researchers understand that the root of the problem is not in the Sigma metric, nor is there anything to blame on Six Sigma for this situation. Instead, the problem happens before we get to the Sigma metrics, when we establish our performance specifications. Our performance specifications vary as much as Paris and Milan and Stockholm vary from each other.

Some goals are mandated by the local authority. CLIA is mandatory for all US laboratories and, by extension, all CAP-accredited laboratories around the world. Rilibaek is mandatory for German laboratories. The WS/T 403-2012 goals are mandatory in China. These mandates are not likely to change anytime soon. In Europe, the more popular goals tend to be those derived from the EFLM biological variation database, although those goals can be extremely challenging to hit.

Another reason goals differ is their application or intended use. Goals used in mandatory programs, such as CLIA and CAP and Rilibaek, frequently have more generous specifications, because the consequences of failing are severe: fail a PT survey too many times and your lab is no longer able to take in patient samples and generate revenue. Goals from programs that are more "educational" can afford to be tighter, because the worst thing that will happen after a failure is that a letter or message is sent to the director noting that improvements are needed.

The 2014 Milan Consensus made explicit note of a third difference between performance specifications: whether they are developed from Model 1 (clinical outcomes), Model 2 (biological variation), or Model 3 (SOTA, state-of-the-art, considerations). While the Milan hierarchy strongly prefers Model 1 over Model 2 and Model 2 over Model 3, it acknowledges that some analytes will only be able to use SOTA goals, while other analytes will be able to use biological variation goals.

There are even more fundamental debates about performance specifications, more akin to the difference between heading for an ocean in a submarine and heading to a city in a plane. The academic back-and-forth over measurement uncertainty vs. total allowable analytical error has provided a great opportunity to spill a lot of digital ink, but while the elephants battle, the ants are getting crushed.

A final thought on this debate: the targets themselves are moving. Think of HbA1c, considered perhaps the poster child, the great success story, of standardization. Much of that progress was accomplished by tightening the TEa year after year, effectively forcing the bad methods out of business and rewarding the better methods. So performance specifications may change not only because of scientific efforts, but also because of new clinical uses for older tests and because of breakthroughs in engineering that enable more precise measurements, which in turn expose more granular clinical changes in patients.
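As a rough illustration of that moving target (the years and TEa values below are invented for the example, not the actual HbA1c proficiency testing criteria), a method whose bias and CV never change steadily loses Sigma as the goal tightens:

```python
# Illustrative only: invented TEa values, not the real HbA1c criteria.
# A method with fixed bias and CV loses Sigma as the goal tightens,
# which is how a shrinking TEa "forces the bad methods out of business."
bias, cv = 0.0, 2.0  # percent, held constant

for year, tea in [("Year 1", 10.0), ("Year 2", 8.0), ("Year 3", 6.0), ("Year 4", 5.0)]:
    sigma = (tea - abs(bias)) / cv
    print(f"{year}: TEa = {tea}% -> Sigma = {sigma:.1f}")
# Year 1: TEa = 10.0% -> Sigma = 5.0
# Year 2: TEa = 8.0% -> Sigma = 4.0
# Year 3: TEa = 6.0% -> Sigma = 3.0
# Year 4: TEa = 5.0% -> Sigma = 2.5
```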

Just as methods and instruments continue to evolve, expect that performance specifications will evolve, too. That doesn't absolve us from trying to settle on the best goal of the moment (perhaps the state-of-the-art goal for the state-of-the-art methods). But at the least we need to acknowledge the complexity of the landscape while we argue about the correct destination.

There are increasing calls for standardization of performance specifications. Perhaps this can be achieved, but we will more likely have to settle for a Milan-type consensus. What is most likely to happen is that forces on the ground, the labs in the real world, will gravitate toward the most practical goals for today. For our part, we are trying to facilitate better decision-making by displaying consolidated goals for chemistry, immunoassays, and hematology. These are the goals we use when we judge performance, no matter where in the world the lab is located.

Our choices may not be your choices, but at least we can have a discussion and debate about those choices. The more dialogue we have, the better our chances of finding consensus.