A family party with 25 participants took place on 15.03.2022 in a closed room, and no masks were worn. In the following days, some participants started to show symptoms of a COVID-19 infection: one on 18.03.2022 and three on 20.03.2022. How many further symptomatic infections can be expected in the following days?
get_expected_total_infections
The function get_expected_total_infections() can be used to give a first answer to the question. It returns a prediction of how many people in the group are expected to show symptoms in total.
The following input values are necessary for the function get_expected_total_infections() to work:
The group_size is the number of people participating in the event, including all observed infections.
The last_day_reported_infection is the number of days after the event on which the most recent symptom onset was observed, and total_reported_infections is the total number of infections observed so far.
Finally, meanlog and sdlog are the mean and standard deviation parameters of the log-normal distribution for the incubation time, derived from the paper by Xin et al. [1].
Based on the incubation time distribution, one can calculate the percentage of all symptomatic infections that will have had their symptom onset up to the last_day_reported_infection. This percentage is then combined with total_reported_infections to calculate the total number of symptomatic infections. The minimum of the result and group_size is returned, because group_size is an upper bound for the total number of infections.
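The calculation described above can be sketched in a few lines of R. This is only a minimal sketch and not the package's implementation: the function name sketch_expected_total() is hypothetical, and it assumes that the share of onsets up to a given day is read from the continuous log-normal CDF (plnorm()) and that the resulting estimate is rounded up.

```r
# Hypothetical sketch, NOT the package's code.
# Assumptions: continuous log-normal CDF for the share of onsets, rounding up.
sketch_expected_total <- function(group_size, last_day, reported,
                                  meanlog, sdlog) {
  # Share of all symptomatic infections with onset up to `last_day`
  share_observed <- plnorm(last_day, meanlog = meanlog, sdlog = sdlog)
  # Scale the observed cases up to the expected total
  estimated_total <- ceiling(reported / share_observed)
  # The group size is an upper bound for the number of infections
  min(estimated_total, group_size)
}

sketch_expected_total(25, 5, 4, meanlog = 1.69, sdlog = 0.55)
```

With the values of the family party, roughly 44% of all onsets are expected by day 5, so the 4 observed cases scale up to about 9.05, which this sketch rounds up to 10.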
group_size <- 25
last_day_reported_infections <- 5 # day 0 = event day
total_reported_infections <- 4
meanlog <- 1.69
sdlog <- 0.55

predicted_total_infections <- get_expected_total_infections(group_size,
                                                            last_day_reported_infections,
                                                            total_reported_infections,
                                                            meanlog,
                                                            sdlog)
print(predicted_total_infections)
#> [1] 10
The output is the number of people who are expected to develop a symptomatic infection. In the example, 10 infections are predicted in total, so 6 further people can be expected to start showing symptoms in the next days, because 4 infections have already been observed.
predict_future_infections
The function predict_future_infections() can be used to give a more detailed answer. It creates a vector containing the predicted number of further people starting to show symptoms on each of the days after the event.
Multiple arguments are necessary for the function predict_future_infections() to work:
The last_day_reported_infection is the number of days after the event on which the most recent symptom onset was observed, and total_reported_infections is the total number of infections observed so far.
Then, total_expected_infections is needed, which defines the total number of expected infections, including the ones already observed. One can use the output of get_expected_total_infections() or one's own estimate based on, e.g., reported symptomatic infection rates in a population of interest. If the output of get_expected_total_infections() is used, it should be based on the same meanlog and sdlog as the call to predict_future_infections().
Finally, meanlog and sdlog are the mean and standard deviation parameters of the log-normal distribution for the incubation time.
The function predict_future_infections() uses the function get_incubation_day_distribution() to get a vector of day-specific probabilities of symptom onset, given that a person will develop symptoms. Default values of the log-normal distribution for the incubation time used in that function are taken from the paper by Xin et al. [1].
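To illustrate what such a vector of day-specific probabilities could look like, here is a minimal sketch. It is an assumption about the discretisation, not the code of get_incubation_day_distribution(): the hypothetical helper sketch_day_distribution() assigns to day d the log-normal probability mass between d - 1 and d, and its defaults simply mirror the parameter values used in the examples of this vignette.

```r
# Hypothetical sketch, NOT the package's get_incubation_day_distribution().
# Assumption: day d receives the probability mass of the interval (d - 1, d].
sketch_day_distribution <- function(max_day, meanlog = 1.69, sdlog = 0.55) {
  # Cumulative probability of symptom onset at the end of days 0..max_day
  cdf <- plnorm(0:max_day, meanlog = meanlog, sdlog = sdlog)
  # Differences of the CDF give one probability per day 1..max_day
  diff(cdf)
}

p <- sketch_day_distribution(14)
round(p, 3)  # probability of symptom onset on each of days 1 to 14
sum(p)       # share of all onsets that fall within the first 14 days
```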
Starting on the first day after the last_day_reported_infection, the probability of symptom onset on a particular day, given that no symptoms have occurred so far, is multiplied by the number of further expected infections and rounded up to obtain the expected number of people starting to show symptoms on that day. The probability that no symptoms have occurred so far is 1 minus the sum of the symptom onset probabilities for all previous days. The number of further expected infections is simply total_expected_infections minus total_reported_infections. The latter is then increased by the predicted number of people with symptom onset on the current day, so that the next day can be processed. As soon as the updated total_reported_infections is no longer smaller than total_expected_infections, a 0 is appended to signal that all further expected symptomatic infections have been allocated, and the loop stops.
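The allocation loop described above can be sketched as follows. This is a hedged illustration rather than the package's implementation: sketch_predict_future() and its max_day guard are hypothetical, and the day-specific probabilities are assumed to be differences of the log-normal CDF between consecutive days.

```r
# Hypothetical sketch, NOT the package's predict_future_infections().
sketch_predict_future <- function(last_day, reported, expected_total,
                                  meanlog, sdlog, max_day = 60) {
  # Assumed discretisation: probability of symptom onset on days 1..max_day
  p <- diff(plnorm(0:max_day, meanlog = meanlog, sdlog = sdlog))
  # No predictions for the days up to the last observed onset
  predicted <- rep(0, last_day)
  day <- last_day + 1
  while (reported < expected_total && day <= max_day) {
    # Probability that no symptoms occurred on any previous day
    p_no_onset_yet <- 1 - sum(p[seq_len(day - 1)])
    # Probability of onset today, given no onset so far
    p_conditional <- p[day] / p_no_onset_yet
    # Allocate a rounded-up share of the infections still expected
    new_cases <- ceiling(p_conditional * (expected_total - reported))
    predicted <- c(predicted, new_cases)
    reported <- reported + new_cases
    day <- day + 1
  }
  # A trailing 0 signals that all expected infections are allocated
  c(predicted, 0)
}

sketch_predict_future(5, 4, 10, meanlog = 1.69, sdlog = 0.55)
```

Because each step rounds up, every processed day receives at least one case while infections remain to be allocated, so the loop ends after at most as many days as there are further expected infections.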
last_day_reported_infections <- 5 # day 0 = event day
total_reported_infections <- 4
total_expected_infections <- get_expected_total_infections(25, 5, 4)
meanlog <- 1.69
sdlog <- 0.55

predicted_daily_infections <- predict_future_infections(last_day_reported_infections,
                                                        total_reported_infections,
                                                        total_expected_infections,
                                                        meanlog,
                                                        sdlog)
print(predicted_daily_infections)
#> [1] 0 0 0 0 0 2 1 1 1 1 0
The function predict_future_infections() creates a vector with values representing the expected distribution of new symptomatic infections over the days after the event. Up to the last_day_reported_infections the entries are 0, because only infections in the future are predicted.
predict_future_infections
library(ggplot2)

# Observed cases per day (Erkrankungsdatum = date of symptom onset, Neue_Faelle = new cases)
data <- data.frame("Erkrankungsdatum" = as.Date("2022-03-15") + 0:5,
                   "Neue_Faelle" = c(0, 0, 0, 1, 0, 3))
# Predicted cases per day (ErwarteteWeitereFaelle = expected further cases)
expected <- data.frame("Erkrankungsdatum" = as.Date("2022-03-15") + 1:(length(predicted_daily_infections)),
                       "ErwarteteWeitereFaelle" = predicted_daily_infections)

g <- ggplot(expected) +
  # Bars for the observed cases
  geom_bar(
    data = data,
    aes(Erkrankungsdatum, Neue_Faelle, fill = "observation"),
    stat = 'identity'
  ) +
  # Bars for the predicted cases
  geom_bar(
    data = expected,
    aes(Erkrankungsdatum, ErwarteteWeitereFaelle, fill = "prediction"),
    stat = 'identity'
  ) +
  # Vertical line and label marking the event
  geom_vline(xintercept = expected$Erkrankungsdatum[1]) +
  geom_label(aes(x = expected$Erkrankungsdatum[1], y = data$Neue_Faelle[1] + 4, label = "event"),
             colour = "black", fill = "white", vjust = 1, size = 7) +
  # Integer breaks on the y axis, one tick per day on the x axis
  scale_y_continuous(breaks = function(x) unique(
    floor(pretty(seq(0, (max(x) + 1) * 1.1))))) +
  scale_x_date(date_breaks = "1 day", date_labels = "%d %b", minor_breaks = NULL) +
  ylab("infected") +
  xlab("timeline") +
  labs(fill = 'type of cases') +
  theme(legend.position = c(0.75, 0.85), text = element_text(size = 16),
        axis.text.x = element_text(face = "bold", angle = 30, hjust = 1))
return(g)
[1] Xin H, Wong JY, Murphy C, Yeung A, Ali ST, Wu P, Cowling BJ. The Incubation Period Distribution of Coronavirus Disease 2019: A Systematic Review and Meta-Analysis. Clinical Infectious Diseases. 2021; 73(12): 2344-2352.