This article appeared first on the website of Significance magazine.
By Dr. Jim Cochran, Associate Dean for Research at Culverhouse
In the US, President Trump has said that testing for novel coronavirus infection will be limited to people who believe they may be infected. But if we only test people who believe they may be infected, we cannot understand how far the virus has spread through the population. This approach could work only if those who believe they may be infected were representative of the population with respect to novel coronavirus infection. Does anyone believe this is so?
What those who believe they may be infected have in common is that they show outward symptoms of infection by the virus. In other words, the people being tested for the novel coronavirus are disproportionately those showing symptoms, often severe ones.
This would not be a problem if everyone infected by the novel coronavirus immediately showed symptoms, but this is not the case. We have strong evidence that some people develop mild cases or show no symptoms at all, and so carry the virus without knowing it. Thus, efforts to understand the virus’s penetration into the population must include observation of the asymptomatic.
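The estimate of the proportion of the population that is infected can be calculated as:

$$\hat{p} = \frac{\text{infected with symptoms} + \text{infected without symptoms}}{\text{infected with symptoms} + \text{infected without symptoms} + \text{not infected}}$$

where each term is the number of people in the sample who fall into that category.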
So, we need data from a random sample of the entire population in order to gather data from infected people who are showing symptoms, infected people who are asymptomatic, and people who are not infected. All have some probability of being included in a true random sample of the population.
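To make the sampling argument concrete, here is a minimal simulation sketch in Python. Every number in it (population size, infection rate, share of infections that are asymptomatic, rate of similar symptoms among the uninfected, number of tests) is a hypothetical assumption chosen for illustration, not an estimate for the novel coronavirus, and the test is assumed to be perfectly accurate. It compares the proportion of positives found when only symptomatic people are tested with the proportion found in a simple random sample of the whole population.

```python
import random


def simulate(population_size=100_000, infection_rate=0.05,
             asymptomatic_share=0.4, background_symptom_rate=0.02,
             tests=2_000, seed=0):
    """Compare symptomatic-only testing with a simple random sample.

    All default values are hypothetical and for illustration only.
    """
    rng = random.Random(seed)

    # Build a synthetic population of (infected, symptomatic) pairs.
    people = []
    for _ in range(population_size):
        infected = rng.random() < infection_rate
        if infected:
            # Some infected people never show symptoms.
            symptomatic = rng.random() >= asymptomatic_share
        else:
            # Some uninfected people show similar symptoms from other causes.
            symptomatic = rng.random() < background_symptom_rate
        people.append((infected, symptomatic))

    true_rate = sum(infected for infected, _ in people) / population_size

    # Strategy 1: spend every test on people who show symptoms.
    symptomatic_people = [p for p in people if p[1]]
    tested = rng.sample(symptomatic_people, min(tests, len(symptomatic_people)))
    symptomatic_estimate = sum(infected for infected, _ in tested) / len(tested)

    # Strategy 2: spend the same number of tests on a simple random sample.
    tested = rng.sample(people, tests)
    random_estimate = sum(infected for infected, _ in tested) / len(tested)

    return true_rate, symptomatic_estimate, random_estimate


if __name__ == "__main__":
    true_rate, symptomatic_estimate, random_estimate = simulate()
    print(f"true infection rate:        {true_rate:.3f}")
    print(f"symptomatic-only estimate:  {symptomatic_estimate:.3f}")
    print(f"random-sample estimate:     {random_estimate:.3f}")
```

Under these assumptions, the proportion of positives among symptomatic people says little about the population as a whole, because no asymptomatic carrier and almost no uninfected person is ever tested. The random sample gives every group some chance of inclusion, so its estimate lands close to the true rate.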
Why are leaders resisting the use of random sampling to assess how widespread the virus is? It could be ignorance, disregard, or lack of appreciation of statistical principles – a consequence of the lack of statistical literacy that pervades the general population. (If the general population insisted on the use of random sampling to assess how widespread the virus is, leaders would not likely resist.) Or it could reflect concern over the limited availability of tests and a desire to devote all of these limited tests to those who show symptoms of novel coronavirus infection.
Unfortunately, this might be inadvertently helping the novel coronavirus spread. If a society does not understand the extent of infection in the general population or the virus’s infectivity, how can it prepare and optimally devote its resources to slow the spread of the virus? How does it decide what preventive measures are appropriate or necessary? How does it minimize the likelihood that the virus spreads to the point that the capacity of the hospital system is overwhelmed? How does it convince the public of the necessity of these measures? How does it predict the duration of this pandemic? How does it anticipate how many hospital beds, ventilators, or N95 respirators it will need? How does it decide where to deploy medical personnel? How does it know if it is making progress or if conditions are deteriorating?
To this point, we have no idea of the rate of infection in the general population – we may be failing to capture infections by a factor of 5 or a factor of 50. And without the evidence that a random sample of the general population would provide, we are operating in the dark. While we operate in the dark, preventable deaths will accumulate, and we will continue to take measures that are not only ineffective, but also unnecessarily costly.
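As a rough sketch of the arithmetic a random sample would make possible, the Python below turns a sample result into a range for the infection rate and then into a range for the undercount factor. The Wilson score interval is one common choice for a proportion and is used here as an assumption, not as a method the argument above prescribes. The function names and all figures in the usage lines are hypothetical, and the test is again assumed to be perfectly accurate.

```python
import math


def prevalence_interval(positives, sample_size, z=1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96)."""
    p = positives / sample_size
    denom = 1 + z ** 2 / sample_size
    centre = (p + z ** 2 / (2 * sample_size)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1 - p) / sample_size + z ** 2 / (4 * sample_size ** 2)
    )
    return centre - half_width, centre + half_width


def undercount_factor(prevalence, population, confirmed_cases):
    """Estimated infections in the population divided by confirmed cases."""
    return prevalence * population / confirmed_cases


if __name__ == "__main__":
    # Hypothetical inputs: 30 positives in a random sample of 3,000 people,
    # drawn from a population of 1,000,000 with 1,000 confirmed cases.
    low, high = prevalence_interval(positives=30, sample_size=3_000)
    print(f"infection rate between {low:.4f} and {high:.4f}")
    print(f"undercount factor between "
          f"{undercount_factor(low, 1_000_000, 1_000):.1f} and "
          f"{undercount_factor(high, 1_000_000, 1_000):.1f}")
```

Without a random sample there is no defensible value to put in for the infection rate, which is why the undercount could plausibly be anywhere between 5 and 50.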
Now, three months after the outbreak emerged, most of the world still lacks the ability to test a large number of people. This understandably makes even those leaders who appreciate sampling hesitant to test a random sample of the general population. But the bottom line is, we need more coronavirus tests than we think we need.
About the author
James J. Cochran is associate dean for research, professor of applied statistics, and the Rogers-Spivey faculty fellow at the Culverhouse College of Business, University of Alabama. He is vice-chair of the Significance Editorial Board.