Tools, Technologies and Training for Healthcare Laboratories

Analytical Evaluation of a POC Blood Analysis System

A 2013 study evaluated the analytical performance of a POC blood analysis system. Are we seeing improvements in POC performance? Are there additional sources of error that need to be tracked when we evaluate POC devices?

Analytical Evaluation of a POC Blood Analysis System

NOVEMBER 2013
Sten Westgard, MS

[Note: This QC application is an extension of the lesson From Method Validation to Six Sigma: Translating Method Performance Claims into Sigma Metrics. This article assumes that you have read that lesson first, and that you are also familiar with the concepts of QC Design, Method Validation, and Six Sigma. If you aren't, follow the link provided.]

This analysis looks at a paper from a 2013 issue of American Journal of Clinical Pathology which examined the performance of a point-of-care (POC) blood analysis system:

Analytical and Clinical Performance of the [name withheld] Blood Analysis System, Brie Stotler, Alexander Kratz, Am J Clin Pathol 2013; 140:710-720.

This study was conducted at the Columbia University Medical Center of New York-Presbyterian Hospital.  The controls used were from Eurotrol.

Rather than focus on the name of the POC instrument, we're going to focus on the POC performance.

The Imprecision and Bias Data

The imprecision data used in the study was collected thus:

"The precision data shown in this article are based on 37 readers. Three levels of QC material were run on each reader every day for 10 consecutive days, giving 370 measurements for each level of QC material. The means, standard deviations, and coefficients of variation were then calculated with Excel..."

As for bias, the study compared against another popular POC device, but also against the instrument used in the central core laboratory, a Nova CCX:

"Correlation studies were performed on 15 randomly selected [POC] systems, representing 10% of the analyzers purchased...Approximately 40 patient samples were analyzed on each of the [POC] systems...results from the 15 [POC] systems were compared with those from the...Nova CCX instruments."

While it may be interesting to compare this POC against another POC device, the practical utility lies in the comparison between the POC and the core laboratory. Those differences are what will have the biggest impact on the clinical care of patients. So we will use those results for our estimates of bias.

Below is the table of inter-day ("inter-batch") imprecision and bias for the three levels of each analyte:

Method       Level   CV%    Slope   Y-int   Bias %
Sodium       109.7   0.7%   0.995    0.7     0.14%
             133.2   0.9%                    0.03%
             156.6   0.7%                    0.05%
Potassium      1.8   1.4%   0.928    0.08    2.76%
               4.2   1.3%                    5.30%
               6.2   1.3%                    5.91%
Glucose       85.4   2.1%   0.863    7.70    4.68%
             211     3.8%                   10.05%
             282.5   5.6%                   10.97%
Calcium        2.0   2.3%   1.106   -0.15    2.9%
               4.4   1.9%                    7.1%
               5.6   1.5%                    7.85%
Hematocrit    22.7   2.0%   1.289  -11.4    21.32%
              32.6   1.6%                    6.07%
              70.4   1.9%                   12.07%
pCO2          21.0   3.3%   1.073    5.2    32.06%
              40.3   2.1%                   20.20%
              71.2   2.9%                   14.6%
pO2           73.4   8.3%   0.95    -0.70    5.95%
             110.7   5.7%                    5.63%
             155.2   5.4%                    5.45%
pH             7.11  0.1%   1.07    -0.53    0.45%
               7.40  0.1%                    0.16%
               7.61  0.1%                    0.03%

So we have a lot of numbers, right? We have three levels of controls, so we have imprecision estimates for all of those levels. Using the regression equation, we can estimate the bias at each.

Looking at the raw numbers, you may find it difficult to judge the method performance. From experience, you might be able to tell when a particular method CV is high or low. But the numbers themselves don't tell the story.

If we want an objective assessment, we can set analytical goals - specify quality requirements - and use those to calculate the Sigma-metric. 

Determine Quality Requirements

Now that we have our imprecision and bias data, we're almost ready to calculate our Sigma-metrics. But we're missing one key thing: the analytical quality requirements.

In the US, traditionally labs look first to CLIA for guidance. For this study, conducted in the US, it's natural to take those quality requirements as our source for total allowable error (TEa).

Method       CLIA Goal
Sodium       ± 4 mmol/L
Potassium    ± 0.5 mmol/L
Glucose      Target value ± 6 mg/dL or ± 10% (greater)
Calcium      ± 1.0 mg/dL
Hematocrit   ± 6.0%
pCO2         Target value ± 5 mm Hg or ± 8% (greater)
pO2          ± 18% (Rilibak)
pH           ± 0.04 pH

A few notes to make here. First, many of these goals are unit-based, which means the % allowable error will change depending on the level of interest. As you descend the range, the effective TEa grows larger; as you ascend the range, the TEa grows smaller. Also, in some cases, CLIA provides both a units-based goal and a % goal; this dual goal is implemented by using whichever goal is larger at the level of interest. For the study here, in both cases (glucose and pCO2), the units-based goal is smaller, so the percentage-based goal will be used. Finally, the CLIA goal for pO2 is actually ± 3 SD of the group SD. This is an older type of goal, one that is antiquated and should be replaced. We have instead substituted the goal for pO2 that is found in the German Rilibak.
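The conversion from a unit-based goal to a percent TEa, and the "whichever is greater" rule for dual goals, can be sketched in a few lines of Python (the numbers below are taken from the tables in this article):

```python
# Sketch: converting a CLIA unit-based goal into a % TEa at a given
# decision level, and applying the "whichever is greater" rule.

def tea_percent(unit_goal, level):
    """Percent TEa implied by a unit-based goal at a decision level."""
    return 100.0 * unit_goal / level

def dual_goal_percent(unit_goal, pct_goal, level):
    """CLIA dual goal: use whichever goal is larger at that level."""
    return max(tea_percent(unit_goal, level), pct_goal)

# Sodium, +/- 4 mmol/L: the effective % goal shrinks as the level rises
print(round(tea_percent(4, 109.7), 2))   # low level  -> 3.65
print(round(tea_percent(4, 156.6), 2))   # high level -> 2.55

# Glucose, +/- 6 mg/dL or +/- 10%: at 211 mg/dL the 10% goal is larger
print(round(dual_goal_percent(6, 10.0, 211), 2))  # -> 10.0
```

This is why the TEa column of the Sigma-metric table below varies by level for sodium, potassium, calcium, and pH, but is a flat 10% for glucose.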

Calculate Sigma metrics

Now the pieces are in place. Remember the equation for Sigma metric is (TEa - bias) / CV.
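The equation above is simple enough to express directly; here is a minimal sketch in Python, checked against the sodium row of the tables in this article:

```python
# Minimal Sigma-metric calculation: (TEa - bias) / CV, all in percent.
def sigma_metric(tea_pct, bias_pct, cv_pct):
    return (tea_pct - bias_pct) / cv_pct

# Sodium at 109.7 mmol/L: TEa 3.65%, bias 0.14%, CV 0.7%
print(round(sigma_metric(3.65, 0.14, 0.7), 2))  # -> 5.01
```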

Example calculation: for Hematocrit, at the level of 22.7%, the CLIA goal is ± 6.0%. We also know from the study that at that level, the POC device has 2.0% imprecision and 21.32% bias:

(6.0 - 21.32) / 2.0 = -15.32 / 2.0 = negative

So the POC device at the lower level of Hematocrit is not delivering acceptable performance. In fact, the bias between the POC and the core laboratory method is so large that it exceeds the total allowable error. Across the three levels, the estimated biases range from about 6% to 21%. What this tells us is that these two methods (the POC device vs. the Nova CCX) are not in agreement, and could produce significantly different results on the same patient.

Recall that in industries outside healthcare, 3.0 Sigma is the minimum performance for routine use, while 6.0 Sigma and higher is considered world class performance. We'll simplify the table below and calculate all the Sigma-metrics.

 

Method       Level   CV%    Bias %   TEa %   Sigma-metric
Sodium       109.7   0.7%    0.14%    3.65    5.01
             133.2   0.9%    0.03%    3.0     3.31
             156.6   0.7%    0.05%    2.55    3.57
Potassium      1.8   1.4%    2.76%   27.78   17.87
               4.2   1.3%    5.30%   11.9     5.08
               6.2   1.3%    5.91%    8.06    1.66
Glucose       85.4   2.1%    4.68%   10       2.53
             211     3.8%   10.05%   10       n/a
             282.5   5.6%   10.97%   10       n/a
Calcium        2.0   2.3%    2.9%    50      20.48
               4.4   1.9%    7.1%    22.73    8.22
               5.6   1.5%    7.85%   17.86    6.67
Hematocrit    22.7   2.0%   21.32%    6.0     n/a
              32.6   1.6%    6.07%    6.0     n/a
              70.4   1.9%   12.07%    6.0     n/a
pCO2          21.0   3.3%   32.06%   23.81    n/a
              40.3   2.1%   20.20%   12.41    n/a
              71.2   2.9%   14.6%     8.0     n/a
pO2           73.4   8.3%    5.95%   18.0     1.45
             110.7   5.7%    5.63%   18.0     2.17
             155.2   5.4%    5.45%   18.0     2.32
pH             7.11  0.1%    0.45%    0.56    1.11
               7.40  0.1%    0.16%    0.54    3.79
               7.61  0.1%    0.03%    0.53    4.94

Yes, there are a lot of numbers here, but believe it or not, things are starting to look clearer. We have some great news here, particularly with Potassium and Calcium. But we also have troubling news with other analytes, such as Hematocrit and pCO2. The table lists "n/a" for the Sigma-metric for those analytes because the bias between the core lab method and this POC method exceeds the total allowable error. We've blown our error budget. In other words, when the two methods are that discordant, we can't really benchmark the performance. Unless the POC and core lab are reconciled (possibly recalibrated), the bias is so large that the results of the two methods are just too different to be useful. Indeed, the discrepancy is more likely to generate confusion, delay (while more tests are performed to determine which result is "real"), and in the worst case, incorrect diagnoses and decisions.
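The "blown error budget" rule behind those "n/a" entries can be sketched as a guard on the Sigma-metric calculation (the example rows are taken from the table above):

```python
# Sketch of the "n/a" rule used in the table above: when bias alone
# exceeds TEa, the Sigma-metric would be negative and meaningless,
# so it is reported as "n/a" instead.

def sigma_or_na(tea_pct, bias_pct, cv_pct):
    if bias_pct >= tea_pct:
        return "n/a"                       # bias alone exceeds the error budget
    return round((tea_pct - bias_pct) / cv_pct, 2)

# Hematocrit at 22.7 (TEa 6.0%, bias 21.32%, CV 2.0%): budget blown
print(sigma_or_na(6.0, 21.32, 2.0))        # -> n/a
# Potassium at 4.2 (TEa 11.9%, bias 5.30%, CV 1.3%): acceptable
print(sigma_or_na(11.9, 5.30, 1.3))        # -> 5.08
```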

Summary of Performance by Sigma-metric Method Decision Chart and QC Design by OPSpecs Chart

If the numbers are too much to digest, we can put this into a graphic format with a Six Sigma Method Decision Chart. Here's the chart for CLIA specifications for allowable total error:

[Figure: Normalized Method Decision Chart, CLIA quality requirements (2013-POC-CLIA-NMEDx)]

Here's where the graphic display helps reveal issues with performance. You can see Calcium and Potassium are close to the bulls-eye of this graph, while many of the other analytes are missing the target completely. In some cases, it's the bias that is the big problem, but you can also see that the imprecision of the sodium method is a concern. For glucose, both imprecision and bias are problematic.

Now the question becomes, what would the laboratory do if this instrument were in routine operation? What QC would be necessary to assure the level of quality required for use of the tests? In this case, we use the same data, but plot the methods on an OPSpecs (Operating Specifications) chart.

[Figure: Normalized OPSpecs Chart, CLIA quality requirements (2013-POC-CLIA-NOPSpecs)]

Out of 24 possible data points, a majority are below 3 Sigma (or missing the target) and only 4 are in the bulls-eye. Potassium and Calcium could be easily controlled with 2 control materials and limits set at 3 SD. For the rest of these analytes, however, extensive "Westgard Rules" with at least 4 controls per run would be needed. Of course, given that this is a POC device, that might not even be possible - most POC devices make it difficult to run QC with external control materials. For pCO2, Hematocrit, and Glucose, "Westgard Rules" will not be enough - there is a correlation issue between the POC and core lab method that needs to be addressed first.

The IQC failure rate actually registered by the study site reached 3.1% of test cards (3,781 failures out of 122,317 cards used in the first year). That corresponds to 3.3 Sigma on the short-term scale, or 1.8 Sigma on the long-term scale. Given that the POC device probably has a fixed QC procedure built into the instrument, that rejection rate may in fact be lower than the tests actually require.

A Concerning Coda

In addition to IQC failures, there were other failure rates recorded by the study. In the first year of implementation, 15,957 test cards failed (again out of 122,317), a 13% failure rate. In addition, each POC device was paired up with a personal data assistant (PDA); 33% of those PDAs failed in the first year. Given that the paired devices were returned when either or both components had failed, the hospital found that 55% of the inventory of devices needed repair or replacement in the first year of implementation.

  • A 13% card failure rate corresponds to 2.6 Sigma on the short-term scale
  • A 33% PDA failure rate corresponds to 1.9 Sigma on the short-term scale
  • A 55% overall inventory failure rate corresponds to 1.3 Sigma on the short-term scale
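The conversion from a failure rate to a short-term Sigma can be sketched using the standard normal distribution; short-term Sigma conventionally adds a 1.5 SD shift allowance to the z-value of the yield. (Exact figures depend on the rounding convention, so the results land near, not exactly on, the values quoted above.)

```python
# Sketch: translating a defect (failure) rate into short- and
# long-term Sigma. Short-term Sigma adds the conventional 1.5 SD shift.
from statistics import NormalDist

def sigma_from_failure_rate(p, shift=1.5):
    z = NormalDist().inv_cdf(1.0 - p)   # z-value of the yield (1 - p)
    return z + shift, z                 # (short-term, long-term)

# 3.1% IQC failures (3,781 out of 122,317 cards in the first year)
short, long_ = sigma_from_failure_rate(3781 / 122317)
print(round(short, 1), round(long_, 1))
```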

Recall, again, that outside of healthcare, a 3 Sigma process is considered the minimum acceptable performance. Any process below that is considered wasteful, error-prone, and unprofitable.

Conclusion

The study authors concluded the following:

"Our experience provides examples of some factors that can determine the success or failure of [a POC] project. Challenges faced during this implementation arose from a lack of user familiarity with the test system, a high card failure rate, and inventory supply problems due to the high PDA/reader mechanical failure rate. The POC department had to invest a significant amount of time retraining staff and orchestrating equipment repairs."

I believe that is a rather restrained conclusion given the high failure rates demonstrated across the board by this POC device. There are some good POC methods and devices out there. I would not classify this particular device as one of them. Nor would this be a good candidate for an IQCP or Risk-based QC - unless that Risk Analysis process ultimately recommended choosing a different device.
