Covid-19 Testing: Maintaining Quality in a State of Emergency
Wayne Dimech examines the quality - or lack thereof - of SARS-CoV-2 testing. Wayne Dimech is Executive Manager, Scientific and Business Relations at NRL in Melbourne, Australia. He is a recognized expert in infectious disease serology and laboratory quality.
Part 2: Maintaining Quality of Testing in a State of Emergency
Wayne Dimech, B. Appl Sci.; MASM; MBA; FAIMS; FRCPA (science faculty)
Our COVID-19 special coverage:
- This is not a Test https://www.westgard.com/covid19-this-is-not-a-test.htm
- Sprinting into a marathon: https://www.westgard.com/covid-19-marathon.htm
- Key Facts in COVID-19: https://www.westgard.com/covid-19-test-test-test.htm
- Clinical Agreement for Qualitative Testing: https://www.westgard.com/qualitative-test-clinical-agreement.htm
- Clinical Agreement Calculator: http://tools.westgard.com/two-by-two-contingency.shtml
- Prevalence and Predictive Value: https://www.westgard.com/predictive-value.htm
- Predictive Value calculator: http://tools.westgard.com/predictive.shtml
- Serology Selection Shortlist: https://www.westgard.com/serology-eua-shortlist.htm
Surely the quick release of so many COVID-19 test kits, also known as in-vitro diagnostic devices (IVDs), is a good thing? Well, yes and no. I’ll let you into a trade secret. No medical pathology test is 100% sensitive and specific. To put it another way, all pathology tests will report some false positive and some false negative results; some tests more than others. Like pharmaceuticals, the sale of IVDs is highly regulated in jurisdictions with developed regulatory systems such as Europe, USA, Canada, Australia, Japan, Korea and others. Stringent regulatory conditions apply to the sale of IVDs, with the extent of regulation depending on the risk that false results pose to the individual and/or to the community. Many countries have immature IVD regulatory systems, and some have none. Often governments in these countries select test kits through a tender process in which the lowest priced product is chosen, rather than the one best suited to the purpose. In response to this situation, the World Health Organization (WHO) created the IVD Prequalification Program. WHO and collaborating laboratories like NRL assess the performance of IVDs and make the results public to help inform decisions made by governments and NGOs procuring test kits.
IVDs for HIV and hepatitis and all test kits used to screen the blood supply are highly regulated, as a false result can cause harm to both the individual and the community. IVDs for other infectious diseases are usually placed in a lower risk class, and blood tests for markers such as glucose or liver/renal function are lower still. To register a test kit for high risk organisms, the manufacturer must hold international accreditation as a manufacturer (ISO 13485), provide the Regulator with a dossier containing comprehensive scientific evidence of the performance of the IVD, and have package inserts or instructions for use (IFU) reviewed for clarity and completeness and approved for compliance with the regulations. A risk assessment for the safety of the IVD is also undertaken. Often the manufacturing facility is assessed for conformance to the standard. Registration of IVDs is complex and expensive, but is in place to minimise the risk of poorly performing tests being released into the market and the misuse of IVDs. A Global Harmonisation Task Force, now superseded by the International Medical Device Regulators Forum (IMDRF), was established to encourage Regulators in different countries to recognise the registration of IVDs in other signatory countries, thereby reducing the regulatory burden on IVD manufacturers. Most leading economies are signatories to the IMDRF.
An IVD is registered for an “Intended Use”, and laboratories cannot legally use it for any other purpose. If an IVD is modified, it becomes an “in-house” test used “off license”, and the laboratory takes responsibility for the quality of the test results and any implications for patients. Some countries, including Australia, have specific regulations around the use of “in-house” tests. Manufacturers of high risk IVDs must show scientific evidence of the test kit performance and its suitability to the “Intended Use” stated in the IFU. Examples of Intended Use are: screening of blood and tissue donations; general laboratory testing for diagnostic purposes; monitoring effectiveness of treatment or disease progression; confirmatory testing; rapid point of care test for community testing; or for home use only. Depending on the IVD and the Intended Use, a range of performance characteristics may be assessed, including:
- Sensitivity – the ability of a test to report a positive result on a truly positive sample;
- Specificity – the ability of a test to report a negative result on a truly negative sample;
- Precision – the amount of variation inherent in the test;
- Bias – the accuracy of the test measured against a true result (usually a reference measurement or standard);
- Limit of detection (LOD) – a measure of the lowest amount of analyte that the test kit can detect;
- Limit of quantification (LOQ) – a measure of the lowest amount of analyte the test kit can quantify (usually reported as 95% confidence);
- Linearity – a demonstration that, as the amount of analyte increases, the signal increases proportionally;
- Cross reactivity – whether analytes other than the analyte being detected cause the test kit to report positive results. This is common in some disease states such as autoimmune diseases, or infections with organisms similar to that being detected;
- Serotypes/Genotype variation – Many organisms have a number of different circulating serotypes or genotypes, which may not be detectable in a poorly designed test kit;
- Stability – Ensuring the reagents are stable and do not deteriorate over time.
Not all of these performance characteristics are relevant for all test kits, but where they are, they must be understood by the user so an appropriate interpretation of the result can be made. Most of these characteristics are self-explanatory, but a brief review of how each may be important for COVID-19 testing follows.
Sensitivity and specificity are important performance characteristics for almost all test kits. To estimate these characteristics, access to large numbers of known positive and negative samples is required. The European Common Technical Specifications currently require about 500 positive samples and 5,000 negative samples to assess specificity and sensitivity for blood screening assays. For Rapid Diagnostic Tests (RDT) for HIV, HCV and HBsAg, 500 positive and 1,500 negative samples are required, with an expected sensitivity of > 99.0%. Samples obtained from blood donors, clinical samples, pregnant women and samples with potentially interfering substances need to be sourced and tested. NAT for HIV RNA, HCV RNA and HBV DNA need to have the LOD estimated, as well as the LOQ for quantitative assays, by testing against an international standard. At least 10 samples for each HIV genotype must be tested to demonstrate the ability to detect each genotype equally. Samples containing cross-reacting analytes and interfering substances are tested to demonstrate specificity. Precision experiments are conducted both with a limited number of variables, i.e. the same instrument, lot number and operator over a short period of time (repeatability), and with multiple variables over a longer period of time (reproducibility). Demonstration of minimal lot-to-lot variation is required and, in Europe and the USA, the regulator must test and release each batch of high risk IVDs until it is satisfied with the IVD performance. WHO Prequalification protocols have similar but often less stringent criteria. By the time an IVD is released to the market, extensive performance evaluations have been conducted and are in the public domain for potential users to assess.
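As a minimal sketch of how sensitivity and specificity are estimated from a panel of known positive and negative samples, the Python below computes both from the four possible outcomes. The class name, function names and all counts are hypothetical and chosen only for illustration; they are not taken from any real evaluation.

```python
from dataclasses import dataclass


@dataclass
class EvaluationCounts:
    """Results of testing a panel of samples whose true status is known."""
    true_positives: int   # known positive samples reported positive
    false_negatives: int  # known positive samples reported negative
    true_negatives: int   # known negative samples reported negative
    false_positives: int  # known negative samples reported positive


def sensitivity(c: EvaluationCounts) -> float:
    """Fraction of truly positive samples that the test reports as positive."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: EvaluationCounts) -> float:
    """Fraction of truly negative samples that the test reports as negative."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


# Hypothetical panel of 500 positives and 5,000 negatives (made-up counts).
panel = EvaluationCounts(true_positives=497, false_negatives=3,
                         true_negatives=4990, false_positives=10)

print(f"sensitivity = {sensitivity(panel):.1%}")
print(f"specificity = {specificity(panel):.1%}")
```

The size of the panel matters as much as the point estimates: the larger the number of known positives and negatives, the narrower the uncertainty around each figure.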
Quality of Current COVID-19 Testing
However, in the current situation with COVID-19 testing, a different quality paradigm exists. Correctly, governments waived the regulatory requirements to allow use of kits without the complete manufacturer evidence normally required. According to IVD Directive 98/79/EC, COVID-19 diagnostic devices are Annex 3 and the manufacturer has to specify device performance characteristics and self-declare conformity with the safety and performance requirements listed in the Directive. No performance testing or batch release is required by Notified Bodies. Europe is currently transitioning to a new regulatory framework based on the IMDRF; however, the new framework is not yet in place. It is uncertain if COVID-19 would be in the highest risk category, requiring a full assessment including batch release. Most likely, Europe would follow the same path as USA, Australia and WHO and allow Emergency Use provisions. In Australia, special legislation was passed to exempt IVDs from normal regulatory scrutiny. WHO implemented an Emergency Use Listing Procedure for COVID-19 NAT assays (but not serology), requiring limited evidence of performance. Similarly, the USA (FDA), Canada, Japan, Korea and Singapore have implemented Emergency Use Listing for NAT and serology without requiring complete performance evidence. Europe and Australia have referred validation of COVID-19 tests to public health laboratories outside the usual regulatory processes. These laboratories are the front line of COVID-19 testing, but have limited experience in formal test kit evaluations and are currently overloaded dealing with large volumes of samples. FIND (Geneva, Switzerland) is compiling results of evaluations for COVID-19 NAT and serology assays.
There are some technical difficulties faced when designing evaluation protocols for COVID-19 tests. Unlike for HIV, syphilis and HBsAg serology tests, there are no acknowledged reference or confirmatory methods for COVID-19. In HIV serology, a testing strategy using multiple tests, including screening tests followed by supplemental and/or confirmatory tests such as western blots, is used to confirm positivity. Similarly, syphilis testing uses multiple specific anti-treponemal tests such as EIAs, CHLIA and TPPA in the testing strategy. HBsAg and HIV p24 positivity can be confirmed using neutralisation testing. COVID-19 serology currently has no reference test; however, NRL is developing a western blot. Viral neutralisation may be useful. An international standard of a known viral load is required to assess the LOD or LOQ of COVID-19 NAT. However, this takes a considerable amount of time to prepare. In the meantime, multiple panels of serial dilutions of virus should be made available and all NAT compared using the same dilution series so a comparison of LOD can be made. Without this level of discipline, our understanding of the performance of COVID-19 test kits will remain limited.
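To illustrate why a shared dilution series makes LOD comparable across assays, here is a rough Python sketch. It uses a simple empirical rule (the lowest concentration at which at least 95% of replicates are detected, with every higher concentration also meeting that rate); formal studies often fit a model such as probit regression instead. The assay names, concentrations, units and replicate counts are all invented for illustration.

```python
from typing import Dict, Optional, Tuple

# concentration (hypothetical copies/mL) -> (replicates detected, replicates tested)
DilutionResults = Dict[float, Tuple[int, int]]


def lowest_detected_level(results: DilutionResults,
                          min_hit_rate: float = 0.95) -> Optional[float]:
    """Walk down the dilution series from the highest concentration and return
    the lowest level reached before the hit rate first drops below the threshold.
    Returns None if even the highest concentration fails."""
    lod = None
    for conc in sorted(results, reverse=True):
        detected, tested = results[conc]
        if tested == 0 or detected / tested < min_hit_rate:
            break
        lod = conc
    return lod


# Two hypothetical NAT assays run on the SAME dilution series (made-up data).
shared_panel = {
    "assay_A": {10000.0: (20, 20), 1000.0: (20, 20), 100.0: (19, 20), 10.0: (8, 20)},
    "assay_B": {10000.0: (20, 20), 1000.0: (19, 20), 100.0: (12, 20), 10.0: (1, 20)},
}

for name, results in shared_panel.items():
    print(name, "empirical LOD:", lowest_detected_level(results))
```

Because both assays see identical material at identical concentrations, any difference in the reported level reflects the assays themselves rather than differences between panels.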
It is important that we recognise that there is currently little performance data available for COVID-19 test kits. A recent publication by National University of Singapore (NUS) included an annex of the test kits available at the time. Disappointingly, most kits had little or no performance testing data available. Those that did inspired little confidence, having unacceptable sensitivity and specificity. One assay that does have data presented in the NUS study is the BioMedomics/Jiangsu Medomics Medical COVID-19 IgM/IgG Rapid Test. Using this as an example only, the stated sensitivity is 88.6% and the specificity is 90.6%. Firstly, without 95% confidence limits, any report of sensitivity or specificity should be queried. If an assay detects 9 out of 10 positive samples correctly, it will have a sensitivity of 90% (95% CI: 54.1 – 99.5%), meaning that the real sensitivity could be as low as 54.1%. If 1,000 samples were tested and 900 were detected, the sensitivity is also 90% (95% CI: 87.9 – 91.7%), demonstrating much greater confidence in the estimation. In the example, 397 positive samples were tested, and 351 were detected, giving a sensitivity of 88.6% (95% CI: 84.8 – 91.3%). To put it differently, for every 100,000 true positive individuals tested, 11,400 will incorrectly be reported as negative. The danger of using poorly performing test kits is stark, both in health and economic terms.
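The effect of sample size on the confidence interval can be sketched in a few lines of Python. The article does not say which interval method was used; the Wilson score interval with continuity correction is assumed here because it gives roughly 54.1% to 99.5% for 9 of 10, and values close to the quoted range for 900 of 1,000. The function names are hypothetical.

```python
import math
from typing import Tuple


def wilson_cc_interval(successes: int, n: int,
                       z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval with continuity correction for a proportion.
    z = 1.959964 corresponds to a two-sided 95% interval."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    q = 1.0 - p
    z2 = z * z
    denom = 2.0 * (n + z2)
    lower = (2.0 * n * p + z2 - 1.0
             - z * math.sqrt(z2 - 2.0 - 1.0 / n + 4.0 * p * (n * q + 1.0))) / denom
    upper = (2.0 * n * p + z2 + 1.0
             + z * math.sqrt(z2 + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0))) / denom
    # By convention the bounds are pinned at the extremes of the range.
    if successes == 0:
        lower = 0.0
    if successes == n:
        upper = 1.0
    return max(0.0, lower), min(1.0, upper)


def expected_false_negatives(sensitivity: float, true_positives_tested: int) -> int:
    """Number of truly positive individuals expected to be reported negative."""
    return round(true_positives_tested * (1.0 - sensitivity))


for detected, tested in [(9, 10), (900, 1000)]:
    lo, hi = wilson_cc_interval(detected, tested)
    print(f"{detected}/{tested}: {detected / tested:.1%} (95% CI {lo:.1%} to {hi:.1%})")

print(expected_false_negatives(0.886, 100_000), "false negatives per 100,000 true positives")
```

Both scenarios report the same 90% point estimate, yet the interval for the small study is several times wider, which is why a sensitivity figure quoted without its confidence limits says little about how a kit will perform in practice.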
We are flying blind. At the time of writing, just over one million people have confirmed COVID-19 infection, and the true number is assumed to be much greater. However, difficulties in obtaining sufficient numbers of samples of appropriate quantity to undertake comprehensive studies remain. One problem is that, when individuals test NAT positive, they are put into isolation and samples are difficult to obtain. Those on the frontline are unaware of the need for evaluation samples and do not understand the type of sample required. Yes, there are many “evaluations” being undertaken using samples of convenience. FIND is collecting and compiling these data for public use. Manufacturers are required to obtain data to support their claims for future submission once COVID-19 tests are no longer covered by emergency use provisions.
However, to my knowledge, there has not been a systematic and scientifically constructed process to evaluate test kits using the same well-constructed panel of samples. Currently, there are over 200 known test kits, not to mention in-house tests, available or being developed. What is required is a panel of samples that can be used to evaluate the relevant performance characteristics of each COVID-19 test kit. These panels should be in sufficient quantity to allow testing of as many test kits as possible. Compiling the results obtained from each test kit will add value to the evaluation panel. The WHO prequalification program has assembled such panels of samples for HIV serology and NRL is currently assembling panels for syphilis. This activity will take a coordinated effort and knowledge of evaluation protocols.
NRL would welcome communicating with potential collaborators that can support ethically accessed, non-commercial clinical samples to be used in development of an evaluation panel. As a scientific community, and to inform our governments, the need for comprehensive, scientifically-sound information on COVID-19 testing is urgent and important. Ad-hoc validations, using samples of convenience without well-constructed scientific protocols will just not suffice; and may even be dangerous and misleading. We need a coordinated international approach to solve this problem.
About the author
Wayne Dimech is Executive Manager, Scientific and Business Relations at NRL, a World Health Organization (WHO) Collaborating Centre for Diagnostics and Laboratory Support for HIV and AIDS and Other Blood-borne Infections. Mr Dimech obtained a degree in medical laboratory science at Royal Melbourne Institute of Technology (RMIT) University in Melbourne, before undertaking a microbiology fellowship at the Australian Institute of Medical Scientists and completing an MBA at LaTrobe University in Melbourne. He is also a Fellow of the Faculty of Science (Research) of the Royal College of Pathologists of Australasia. He has worked in private and public pathology laboratories, predominantly in Microbiology Departments, where he specialised in infectious disease serology. Mr Dimech’s research interests include the control and standardisation of assays that detect and monitor blood-borne and sexually transmitted infectious diseases. A particular interest is the standardization of rubella testing and the monitoring of infectious disease assay variability. He was instrumental in the development of EDCNet, an internet-based program for monitoring quality control test results, which is now used worldwide, and in the optimization of OASYS, software designed to manage external quality assessment schemes. Mr Dimech is an advisor for numerous national and international working groups, including the Australian Hepatitis B Testing Strategy, Standards Australia and consultancies under the auspices of WHO, International Health Regulations, UNDP and the Global Fund. He has authored or co-authored about 50 articles in international peer-reviewed journals and contributed to three book chapters.