Sunday, December 10, 2023

# COVID calculations – applying statistics to COVID-19 testing

Probabilities are tricky things to wrap your head around, and they are often not entirely intuitive. When a media report says that a COVID-19 test is 80% accurate or a vaccine is 95% effective, how do we know if that is good news? How reliable are those numbers, and how do we put them in context? Because our intuition can fail us, we need to rely on mathematics. Face your fear (of math), and bravely read on!

BIG Media contributor Brian Russell, in "Bayes' Theorem and the probability of having cancer," gave us a straightforward explanation of a mathematical process for determining probabilities in uncertain scenarios. His examples dealt with cancer testing and diagnosis. Now that we have the tools, the same process can be applied to any scenario with uncertain inputs, including the accuracy of COVID-19 testing.

As Russell described, 18th-century philosopher and mathematician Thomas Bayes[1] deduced a method of relating probabilities to predict uncertain outcomes, namely (in the cancer context):

"The conditional probability of having cancer given that you have a positive test result is equal to the conditional probability of having a positive test given that you have cancer, multiplied by the marginal probability of having cancer, and divided by the marginal probability of a positive test."

That might be a completely foreign language to most, but the concept is much more elegantly expressed in its natural mathematical state as an equation:

$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$

where:

- **posterior** refers to what we want to know: the probability of having cancer after having received a positive test;
- **likelihood** refers to the conditional probability of having a positive test result, given that you have cancer (the reliability of a positive test result);
- **prior** refers to the marginal probability of having cancer (the rate of cancer occurrence in the general population);
- **evidence** refers to the marginal probability of having a positive test (the rate of positive tests in the general population).

We can determine the value of each of the parameters on the right side of this equation (likelihood, prior, and evidence) using reported information on the proportion of the population that has cancer at any given time, and the reliability of the tests used by medical professionals to indicate the presence or absence of cancer in an individual. These terms and the method are explained in detail in Russell’s article.
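As a minimal sketch of this relationship (the function names are my own, not from Russell's article), the whole calculation fits in a few lines of Python. The evidence term is expanded using the law of total probability: true positives plus false positives.

```python
def bayes_posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' Theorem: posterior = likelihood * prior / evidence."""
    return likelihood * prior / evidence

def posterior_from_test(sensitivity: float, specificity: float, prior: float) -> float:
    """Probability of having the disease, given a positive test result.

    sensitivity: probability of a positive test given disease (positive reliability)
    specificity: probability of a negative test given no disease (negative reliability)
    prior: rate of the disease in the general population
    """
    false_positive_rate = 1.0 - specificity
    # Evidence = P(positive) = true positives + false positives
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return bayes_posterior(sensitivity, prior, evidence)
```

A perfect test (100% sensitivity and specificity) returns a posterior of exactly 1.0, as expected.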

## COVID-19 testing

Let’s now substitute “COVID-19” for “cancer” in this method to determine the uncertainties inherent in a COVID diagnosis.

A study on the reliability of COVID testing reported the data shown in Table 1.[2] The number of people in the population likely to have COVID at any snapshot in time was assumed to be 5 in every 100, or 5%. This number is the “prior”. For people with symptoms, the tests had a positive reliability factor of 72% (meaning 72% of people with COVID will get a positive test result, the remaining 28% of people with COVID will get an incorrect negative test result), and a negative reliability factor of 99.5% (99.5% of people without COVID will get a negative test result, 0.5% of people without COVID will get an incorrect positive test result). For people without symptoms, these numbers changed to 58% and 98.9%, respectively.

Table 1: COVID test reliability data for people with and without symptoms.

These values are all that are needed to construct a table of probabilities, as described in Russell’s story. Table 2 shows the “with symptoms” example; the relevant numbers are colour coded to match the definition of the parameters in the equation.

Table 2: Bayes’ Theorem probability tables derived from the “with symptoms” data in Table 1: Top: marginal probability, bottom: conditional probabilities, given prior COVID status (infected or not infected).

Using these numbers and the equation form of Bayes’ Theorem shown above, we can easily calculate the probability that we have COVID, given symptoms and a positive test result.

For those individuals with symptoms, therefore, the probability of having COVID-19, given a positive test, is 87.8%. Redoing the calculation using the “without symptoms” numbers, this probability decreases to 74.4%.
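Both results can be reproduced directly from the Table 1 inputs. Here is a sketch (the tiny differences from the quoted 87.8% and 74.4% come from rounding of the reliability factors as reported):

```python
def prob_covid_given_positive(sensitivity: float, specificity: float, prior: float) -> float:
    """P(COVID | positive test) via Bayes' Theorem."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Table 1 values, prior = 5%
with_symptoms = prob_covid_given_positive(0.72, 0.995, 0.05)     # ~0.88
without_symptoms = prob_covid_given_positive(0.58, 0.989, 0.05)  # ~0.74
```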

Obviously, there can be considerable uncertainty in the initial parameters. How sensitive are the results to changes in the estimates of the “prior” and the reliability factors? Adjusting each of the initial parameters individually from the starting point of our first “with symptoms” example gives us a good idea of the sensitivity of our calculation to a range of reasonable possible inputs.

Charts 1, 2, and 3 show the probability of having COVID, given symptoms and a positive test relative to changes in each of the parameters. Note that only one parameter at a time is adjusted – the two parameters that are not adjusted are fixed at the “with symptoms” values from the example shown above.

Chart 1: Sensitivity of the probability of having COVID (with symptoms) to a range of “prior” values. The positive and negative test reliability factors are constant.

Chart 2: Sensitivity of the probability of having COVID (with symptoms) to a range of positive test reliability values. The prior and negative test reliability factors are constant.

Chart 3: Sensitivity of the probability of having COVID (with symptoms) to a range of negative test reliability values. The prior and positive test reliability factors are constant.

Notice how much more sensitive the probability of having COVID is to small adjustments in the negative test reliability factor than to the other two input parameters. Whereas the positive reliability factor can drop considerably without a significant change in the probability of having COVID, a change in the negative reliability factor of only 1.6 percentage points (from 99.8% to 98.2%) results in a drop of 27.2 percentage points in the probability of having COVID – from 95% down to 67.8%. This is one of those instances where our intuition might have led us astray.
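This asymmetry is easy to verify numerically. The sketch below sweeps only the negative reliability factor, holding the sensitivity (72%) and prior (5%) fixed at the "with symptoms" values:

```python
def prob_covid_given_positive(sensitivity: float, specificity: float, prior: float) -> float:
    """P(COVID | positive test) via Bayes' Theorem."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Vary the negative test reliability (specificity) only.
for spec in (0.998, 0.995, 0.990, 0.982):
    p = prob_covid_given_positive(0.72, spec, 0.05)
    print(f"negative reliability {spec:.1%} -> P(COVID | positive) = {p:.1%}")
# A 1.6-point change in specificity (99.8% -> 98.2%) moves the
# posterior from about 95.0% down to about 67.8%.
```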

The reason for this sensitivity is that many more people do not have COVID than do. If even a small percentage of these healthy people receive (false) positive results, the number of positive results grows substantially. The false positives can then rival or even outnumber the true positives, significantly reducing the chance that any given positive result indicates a genuine COVID infection.

While reliability rates are typically quoted for ideal laboratory conditions, in reality there are many opportunities for those conditions to be compromised. Poor sampling technique and sample degradation or contamination are common culprits, and these can lead to poorer reliability than expected. For example, another study[3] reported false positive rates of between 0.8% and 4.0%, implying negative test reliability ranging from 99.2% down to 96%. As we have seen, the chances of having COVID, given a positive test, are very sensitive to the negative test reliability rate. Substituting 96% into our first calculation reduces the probability of having COVID, given symptoms and a positive test, to only 48.6%.

Ironically, it is, in part, the number of positive tests recorded that determines the estimated prevalence of COVID infections in the general population – the prior. If many of these tests are false positives, and the prior is consequently decreased, then the effect on the final result is compounded. In our example using the 96% value for the negative reliability rate, if we also reduce the prior from 5% to 3%, the probability of having COVID, given symptoms and a positive test, drops to 35.8%.
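Both effects can be checked with the same calculation (a sketch using the values quoted in the text):

```python
def prob_covid_given_positive(sensitivity: float, specificity: float, prior: float) -> float:
    """P(COVID | positive test) via Bayes' Theorem."""
    true_positives = sensitivity * prior
    false_positives = (1.0 - specificity) * (1.0 - prior)
    return true_positives / (true_positives + false_positives)

# Degraded negative reliability (96%) alone, prior still 5%:
degraded = prob_covid_given_positive(0.72, 0.96, 0.05)    # ~48.6%
# Compounded: same degraded reliability plus a reduced prior of 3%:
compounded = prob_covid_given_positive(0.72, 0.96, 0.03)  # ~35.8%
```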

Table 3 provides a summary of some of the numbers mentioned in this article to illustrate the range of probabilities inherent in the estimates of positive COVID tests. A probability is meant to quantify uncertainty, but, ironically, there can also be considerable uncertainty in the determination of that probability.

Table 3: Sample probabilities of true positive COVID tests, given various prior and test reliability inputs.

Nothing in our lives is certain, including numbers that are reported – and often stated as unequivocal facts – in the media. Even though probabilities are not trivial to understand or calculate, it is important to dig a little deeper to understand the true meaning of chance and risk. We may have less to fear than we thought.

[1] Britannica, The Editors of Encyclopaedia. “Thomas Bayes”. Encyclopedia Britannica, 13 Apr. 2021, https://www.britannica.com/biography/Thomas-Bayes. Accessed 25 June 2021.

[2] Dinnes J, Deeks JJ, Berhane S, Taylor M, Adriano A, Davenport C, Dittrich S, Emperador D, Takwoingi Y, Cunningham J, Beese S, Domen J, Dretzke J, Ferrante di Ruffano L, Harris IM, Price MJ, Taylor-Phillips S, Hooft L, Leeflang MMG, McInnes MDF, Spijker R, Van den Bruel A, Cochrane COVID-19 Diagnostic Test Accuracy Group. Rapid, point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection. Cochrane Database of Systematic Reviews 2021, Issue 3. Art. No.: CD013705. DOI: 10.1002/14651858.CD013705.pub2.

[3] False-positive COVID-19 results: hidden problems and costs (Comment). The Lancet, Volume 8, Issue 12, P1167-1168, December 01, 2020. Accessed July 7, 2021.

Laurie Weston
Laurie Weston is a co-founder and scientific strategist for BIG Media, with a Bachelor of Science degree with honours in Physics and Astronomy from the University of Victoria in Canada. Laurie has more than 35 years of experience as a geophysicist in the oil and gas industry. She is president of Sound QI Solutions Ltd., a data analysis software and services company she founded in 2007.