Model
-
class covid19_inference.model.Cov19Model(new_cases_obs, data_begin, fcast_len, diff_data_sim, N_population, name='', model=None)
Model class used to create a COVID-19 propagation dynamics model. The parameters below are passed to the constructor. The attributes (variables) are available after creation and can be accessed from every instance. Some background:
The simulation starts diff_data_sim days before the data.
The data has a certain length, on which the inference is based. This length (data_len) is given by the number of time points in new_cases_obs.
After the inference, a forecast of length fcast_len takes place, starting on the day after the last data point in new_cases_obs.
In total, traces produced by a model run have the length sim_len = diff_data_sim + data_len + fcast_len.
Date ranges include both boundaries. For example, if data_begin is March 1 and data_end is March 3 then data_len will be 3.
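The relationships between the lengths and dates described above can be sketched with the standard library alone. This is an illustration, not the class's implementation: the helper name `derive_model_dates` is hypothetical, and the formulas are inferred from the descriptions of the attributes (inclusive date ranges, forecast starting one day after `data_end`, `sim_end` matching `fcast_end`).

```python
from datetime import datetime, timedelta


def derive_model_dates(data_begin, data_len, fcast_len, diff_data_sim):
    """Hypothetical helper: derive the date attributes from the constructor inputs."""
    # Date ranges include both boundaries, hence the "- 1".
    data_end = data_begin + timedelta(days=data_len - 1)
    # The simulation starts diff_data_sim days before the data.
    sim_begin = data_begin - timedelta(days=diff_data_sim)
    # The forecast starts on the day after the last data point.
    fcast_begin = data_end + timedelta(days=1)
    fcast_end = data_end + timedelta(days=fcast_len)
    return {
        "data_end": data_end,
        "sim_begin": sim_begin,
        "sim_end": fcast_end,  # the simulation ends with the forecast
        "fcast_begin": fcast_begin,
        "fcast_end": fcast_end,
        "sim_len": diff_data_sim + data_len + fcast_len,
    }


dates = derive_model_dates(datetime(2020, 3, 1), data_len=3, fcast_len=2, diff_data_sim=16)
print(dates["data_end"].date(), dates["sim_len"])
```

Counting the days from `sim_begin` to `sim_end`, both included, gives the same value as `sim_len`, which is a quick consistency check on the inclusive-boundary convention.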
- Parameters
new_cases_obs (1 or 2d array) – If the array is two-dimensional, a hierarchical model will be constructed. The first dimension is then time, the second the region/country.
data_begin (datetime.datetime) – Date of the first data point
fcast_len (int) – Number of days the simulation runs longer than the data
diff_data_sim (int) – Number of days the simulation starts earlier than the data. Should be significantly longer than the delay between infection and report of cases.
N_population (number or 1d array) – Number of inhabitants in the region, needed for the S(E)IR model. Ideally one-dimensional if new_cases_obs is two-dimensional.
name (string) – suffix appended to the name of random variables saved in the trace
model – specify a model, if this one should expand another
- Variables
new_cases_obs (1 or 2d array) – as passed during construction
data_begin (datetime.datetime) – date of the first data point in the data
data_end (datetime.datetime) – date of the last data point in the data
sim_begin (datetime.datetime) – date at which the simulation begins
sim_end (datetime.datetime) – date at which the simulation ends (should match fcast_end)
fcast_begin (datetime.datetime) – date at which the forecast starts (should be one day after data_end)
fcast_end (datetime.datetime) – date at which the forecast ends
data_len (int) – total number of days in the data
sim_len (int) – total number of days in the simulation
fcast_len (int) – total number of days in the forecast
diff_data_sim (int) – difference in days between the simulation begin and the data begin. The simulation starting time is usually earlier than the data begin.
Example
with Cov19Model(**params) as model:
    # Define model here
-
covid19_inference.model.modelcontext(model)
Returns the given model, or tries to find it in the context if none was supplied.
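The lookup described here follows a common context-stack pattern. The sketch below is a toy illustration of that pattern only; `ToyModel` and `toy_modelcontext` are hypothetical names and do not reproduce the library's code.

```python
class ToyModel:
    """Hypothetical stand-in for a model class usable in a `with` block."""

    _context_stack = []  # models currently opened with `with`, innermost last

    def __enter__(self):
        ToyModel._context_stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        ToyModel._context_stack.pop()
        return False  # do not swallow exceptions


def toy_modelcontext(model=None):
    """Return `model` if given, otherwise the innermost model on the stack."""
    if model is not None:
        return model
    if not ToyModel._context_stack:
        raise TypeError("No model was supplied and none is on the context stack.")
    return ToyModel._context_stack[-1]


with ToyModel() as m:
    assert toy_modelcontext() is m  # found through the context
```

This is why most functions on this page accept `model=None`: inside a `with` block the model does not have to be passed explicitly.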
-
covid19_inference.model.student_t_likelihood(new_cases_inferred, pr_beta_sigma_obs=30, nu=4, offset_sigma=1, model=None, data_obs=None, name_student_t='_new_cases_studentT', name_sigma_obs='sigma_obs')
Sets the likelihood to apply to the model observations (model.new_cases_obs). We assume a StudentT distribution because it is robust against outliers [Lange1989]. The observations follow a StudentT likelihood with nu degrees of freedom, centered on new_cases_inferred. Its scale is governed by an observation-error parameter (saved in the trace under name_sigma_obs) that follows a HalfCauchy prior distribution with parameter beta set by pr_beta_sigma_obs. If the input is two-dimensional, this parameter is different for every region.
- Parameters
new_cases_inferred (TensorVariable) – One- or two-dimensional array. If two-dimensional, the first dimension is time and the second is the regions/countries.
pr_beta_sigma_obs (float) – The beta of the HalfCauchy prior distribution of the observation-error scale.
nu (float) – Degrees of freedom of the StudentT distribution, which control how heavy its tails are. Smaller nu gives heavier tails and makes the model more robust to outliers. Defaults to 4 [Lange1989].
offset_sigma (float) – An offset added to the sigma to make the inference procedure robust. Otherwise, small numbers of new_cases_inferred would lead to very small errors and diverging likelihoods. Defaults to 1.
model – The model to which the distribution is added
data_obs (array) – The data that is observed. By default it is model.new_cases_obs
name_student_t – The name under which the StudentT distribution is saved in the trace.
name_sigma_obs – The name under which the distribution of the observable error is saved in the trace
- Returns
None
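To make the roles of nu and offset_sigma concrete, here is a plain-Python sketch of a Student-t log-likelihood. The density itself is the standard location-scale Student-t. The way the scale is built in `observation_sigma` (growing with the square root of the inferred cases plus the offset) is an assumption made for illustration and is not confirmed by the text above; all function names are hypothetical.

```python
import math


def student_t_logpdf(x, mu, sigma, nu):
    """Log-density of a location-scale Student-t distribution."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


def observation_sigma(new_cases_inferred, sigma_obs, offset_sigma=1.0):
    # ASSUMPTION: the scale grows like sqrt(cases + offset). The offset keeps
    # the scale away from zero when the inferred case numbers are small.
    return sigma_obs * math.sqrt(abs(new_cases_inferred + offset_sigma))


def log_likelihood(data_obs, new_cases_inferred, sigma_obs, nu=4, offset_sigma=1.0):
    """Sum of pointwise Student-t log-densities over the observed days."""
    return sum(
        student_t_logpdf(obs, pred, observation_sigma(pred, sigma_obs, offset_sigma), nu)
        for obs, pred in zip(data_obs, new_cases_inferred)
    )


print(log_likelihood([0, 3, 12], [0.0, 2.5, 10.0], sigma_obs=2.0))
```

With an inferred value of exactly zero the scale stays finite because of the offset, so the log-likelihood does not diverge.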
References
-
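covid19_inference.model.SIR(lambda_t_log, mu, pr_I_begin=100, model=None, return_all=False, save_all=False)
Implements the susceptible-infected-recovered model.
The prior distribution of the recovery rate is set to a lognormal distribution with median pr_median_mu and sigma pr_sigma_mu, and the prior distribution of the initial number of infected to a Half-Cauchy distribution with beta pr_I_begin.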
- Parameters
lambda_t_log (TensorVariable) – time series of the logarithm of the spreading rate, 1- or 2-dimensional. If 2-dimensional, the first dimension is time.
mu (TensorVariable) – the recovery rate, typically a random variable. Can be 0- or 1-dimensional. If 1-dimensional, the dimension corresponds to the different regions.
pr_I_begin (float or array_like or TensorVariable) – Prior beta of the Half-Cauchy distribution of the initial number of infected.
pr_median_mu (float or array_like) – Prior for the median of the lognormal distribution of the recovery rate.
pr_sigma_mu (float or array_like) – Prior for the sigma of the lognormal distribution of the recovery rate.
model (Cov19Model) – if None, it is retrieved from the context
return_all (bool) – if True, returns new_I_t, I_t, S_t; otherwise returns only new_I_t
save_all (bool) – if True, saves new_I_t, I_t, S_t in the trace; otherwise saves only new_I_t
- Returns
new_I_t (TensorVariable) – time series of the number of daily newly infected persons.
I_t (TensorVariable) – time series of the infected (if return_all set to True)
S_t (TensorVariable) – time series of the susceptible (if return_all set to True)
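A deterministic, single-region sketch of a discrete-time SIR update may help to read the inputs and outputs of SIR(). It uses the textbook daily update with plain floats instead of random variables; the exact update used by the library is not stated above, so the equations in the loop, the function name `sir_sketch`, and the explicit `I_begin` and `N` arguments are assumptions.

```python
import math


def sir_sketch(lambda_t_log, mu, I_begin, N):
    """Hypothetical discrete SIR loop returning new_I_t, I_t and S_t."""
    S, I = N - I_begin, I_begin
    new_I_t, I_t, S_t = [], [], []
    for log_lam in lambda_t_log:
        lam = math.exp(log_lam)    # the spreading rate is passed as a logarithm
        new_I = lam / N * I * S    # ASSUMPTION: standard mass-action infection term
        S = S - new_I              # newly infected leave the susceptible pool
        I = I + new_I - mu * I     # recoveries happen at rate mu
        new_I_t.append(new_I)
        I_t.append(I)
        S_t.append(S)
    return new_I_t, I_t, S_t


new_I_t, I_t, S_t = sir_sketch([math.log(0.4)] * 30, mu=0.125, I_begin=10.0, N=1000.0)
print(round(new_I_t[0], 2), round(I_t[0], 2))
```

Passing the logarithm of the spreading rate keeps the rate positive for any real-valued input, which is convenient when the rate is itself inferred.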
-
covid19_inference.model.SEIR(lambda_t_log, pr_beta_I_begin=100, pr_beta_new_E_begin=50, pr_median_mu=0.125, pr_mean_median_incubation=4, pr_sigma_median_incubation=1, sigma_incubation=0.4, pr_sigma_mu=0.2, model=None, return_all=False, save_all=False, name_median_incubation='median_incubation')
Implements a model similar to the susceptible-exposed-infected-recovered model. Instead of an exponentially decaying incubation period, the length of the period is lognormally distributed.
The recovery rate and the incubation period are the same for all regions. The recovery rate follows a Lognormal prior with median pr_median_mu and sigma pr_sigma_mu; the median incubation period follows a Normal prior with mean pr_mean_median_incubation and sigma pr_sigma_median_incubation.
The initial numbers of infected and newly exposed differ for each region and follow prior HalfCauchy distributions with beta pr_beta_I_begin and pr_beta_new_E_begin, respectively.
- Parameters
lambda_t_log (TensorVariable) – time series of the logarithm of the spreading rate, 1- or 2-dimensional. If 2-dimensional, the first dimension is time.
pr_beta_I_begin (float or array_like) – Prior beta of the HalfCauchy distribution of the initial number of infected.
pr_beta_new_E_begin (float or array_like) – Prior beta of the HalfCauchy distribution of the initial number of newly exposed.
pr_median_mu (float or array_like) – Prior for the median of the Lognormal distribution of the recovery rate.
pr_mean_median_incubation – Prior mean of the Normal distribution of the median incubation delay. Defaults to 4 days [Nishiura2020], which is the median serial interval (the important measure here is not exactly the incubation period, but the delay until a person becomes infectious, which seems to be about 1 day earlier than the onset of symptoms).
pr_sigma_median_incubation – Prior sigma of the Normal distribution of the median incubation delay. Default is 1 day.
sigma_incubation – Scale parameter of the Lognormal distribution of the incubation time / delay until infectiousness. The default is set to 0.4, which is about the scale found in [Nishiura2020], [Lauer2020].
pr_sigma_mu (float or array_like) – Prior for the sigma of the Lognormal distribution of the recovery rate.
model (Cov19Model) – if None, it is retrieved from the context
return_all (bool) – if True, returns new_I_t, new_E_t, I_t, S_t; otherwise returns only new_I_t
save_all (bool) – if True, saves new_I_t, new_E_t, I_t, S_t in the trace; otherwise saves only new_I_t
name_median_incubation (str) – The name under which the median incubation time is saved in the trace
- Returns
new_I_t (TensorVariable) – time series of the number of daily newly infected persons.
new_E_t (TensorVariable) – time series of the number of daily newly exposed persons (if return_all set to True)
I_t (TensorVariable) – time series of the infected (if return_all set to True)
S_t (TensorVariable) – time series of the susceptible (if return_all set to True)
References
- Nishiura2020
Nishiura, H.; Linton, N. M.; Akhmetzhanov, A. R. Serial Interval of Novel Coronavirus (COVID-19) Infections. Int. J. Infect. Dis. 2020, 93, 284–286. https://doi.org/10.1016/j.ijid.2020.02.060.
- Lauer2020
Lauer, S. A.; Grantz, K. H.; Bi, Q.; Jones, F. K.; Zheng, Q.; Meredith, H. R.; Azman, A. S.; Reich, N. G.; Lessler, J. The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application. Ann Intern Med 2020. https://doi.org/10.7326/M20-0504.
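The idea behind SEIR(), a lognormally distributed incubation period instead of an exponentially decaying one, can be sketched in plain Python. This is a deterministic, single-region illustration under stated assumptions: the update equations, the truncation of the kernel to the length of `new_E_begin`, and the names `lognormal_kernel` and `seir_sketch` are all hypothetical and are not taken from the library.

```python
import math


def lognormal_kernel(median, sigma, n_lags):
    """Lognormal density evaluated at lags 1..n_lags, normalized to sum to 1."""
    weights = [
        math.exp(-math.log(tau / median) ** 2 / (2 * sigma ** 2))
        / (tau * sigma * math.sqrt(2 * math.pi))
        for tau in range(1, n_lags + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def seir_sketch(lambda_t_log, mu, I_begin, new_E_begin, N,
                median_incubation=4.0, sigma_incubation=0.4):
    """Hypothetical SEIR-like loop; new_E_begin[0] is the most recent day."""
    kernel = lognormal_kernel(median_incubation, sigma_incubation, len(new_E_begin))
    recent_E = list(new_E_begin)
    S, I = N - I_begin - sum(new_E_begin), I_begin
    new_I_t, new_E_t, I_t, S_t = [], [], [], []
    for log_lam in lambda_t_log:
        new_E = math.exp(log_lam) / N * I * S  # ASSUMPTION: mass-action exposure term
        S -= new_E
        # Exposed persons become infectious after a lognormally distributed delay.
        new_I = sum(k * e for k, e in zip(kernel, recent_E))
        I += new_I - mu * I
        recent_E = [new_E] + recent_E[:-1]     # shift the exposure history by one day
        new_I_t.append(new_I)
        new_E_t.append(new_E)
        I_t.append(I)
        S_t.append(S)
    return new_I_t, new_E_t, I_t, S_t


new_I_t, new_E_t, I_t, S_t = seir_sketch(
    [math.log(0.4)] * 30, mu=0.125, I_begin=10.0, new_E_begin=[5.0] * 10, N=1000.0
)
```

With a constant exposure history, the first value of `new_I_t` equals the daily number of newly exposed, because the normalized kernel only redistributes cases over time.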
-
covid19_inference.model.delay_cases(new_I_t, pr_median_delay=10, pr_sigma_median_delay=0.2, pr_median_scale_delay=0.3, pr_sigma_scale_delay=None, model=None, save_in_trace=True, name_delay='delay', name_delayed_cases='new_cases_raw', len_input_arr=None, len_output_arr=None, diff_input_output=None)
Convolves the input with a lognormal distribution in order to model a delay. For clarification: the lognormal distribution acts as a kernel, that is, a function evaluated at the time lag between input and output.
If the model is two-dimensional, the delay is hierarchically modelled with the hierarchical_normal() function using the default parameters, except that the prior of the hierarchical sigma is HalfNormal distributed (error_cauchy=False).
- Parameters
new_I_t (TensorVariable) – The input, typically the number of newly infected cases from the output of SIR() or SEIR().
pr_median_delay (float) – The mean of the Normal distribution which models the prior median of the LogNormal delay kernel.
pr_sigma_median_delay (float) – The standard deviation of the Normal distribution which models the prior median of the LogNormal delay kernel.
pr_median_scale_delay (float) – The scale (width) of the LogNormal delay kernel.
pr_sigma_scale_delay (float) – If it is not None, the scale of the delay kernel follows a prior LogNormal distribution, with median pr_median_scale_delay and scale pr_sigma_scale_delay.
model (Cov19Model) – if None, it is retrieved from the context
save_in_trace (bool) – whether to save the delayed cases in the trace
name_delay (str) – The name under which the delay is saved in the trace; suffixes and prefixes are added depending on which variable is saved.
name_delayed_cases (str) – The name under which the delayed cases are saved in the trace; suffixes and prefixes are added depending on which variable is saved.
len_input_arr – Length of new_I_t. By default equal to model.sim_len. Necessary because the shape of theano tensors is not defined when the graph is built.
len_output_arr (int) – Length of the array returned. By default it is set to the length of the cases_obs saved in the model plus the number of days of the forecast.
diff_input_output (int) – Number of days the returned array begins later than the input. Should be significantly larger than the median delay. By default it is set to model.diff_data_sim.
- Returns
new_cases_inferred (TensorVariable) – The delayed input, typically the daily number of new cases that one expects to measure.
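The convolution with a lognormal kernel, together with the role of len_output_arr and diff_input_output, can be sketched as follows. It is a plain-Python illustration with fixed kernel parameters instead of priors; the function names and the exact indexing are assumptions, not the library's code.

```python
import math


def lognormal_pdf(x, median, scale):
    """Lognormal density with the given median and scale; zero for x <= 0."""
    if x <= 0:
        return 0.0
    return math.exp(-math.log(x / median) ** 2 / (2 * scale ** 2)) / (
        x * scale * math.sqrt(2 * math.pi)
    )


def delay_cases_sketch(new_I_t, median_delay, scale_delay, len_output, diff_input_output):
    """Hypothetical delay convolution.

    Output index 0 corresponds to input index diff_input_output, so the output
    starts later than the input by that many days.
    """
    delayed = []
    for t_out in range(len_output):
        t = t_out + diff_input_output  # position on the input time axis
        total = 0.0
        for tau, cases in enumerate(new_I_t):
            # The kernel is evaluated at the lag between infection and report.
            total += cases * lognormal_pdf(t - tau, median_delay, scale_delay)
        delayed.append(total)
    return delayed


impulse = [1.0] + [0.0] * 29  # all infections happen on day 0
reported = delay_cases_sketch(impulse, median_delay=10, scale_delay=0.3,
                              len_output=30, diff_input_output=0)
```

Feeding in a single impulse returns the kernel itself, which makes it easy to see why the output should begin well after the median delay: earlier output days depend on infections that lie before the start of the input.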
-
covid19_inference.model.week_modulation(new_cases_raw, week_modulation_type='abs_sine', pr_mean_weekend_factor=0.3, pr_sigma_weekend_factor=0.5, week_end_days=(6, 7), model=None, save_in_trace=True)
Adds a weekly modulation of the number of new cases. If week_modulation_type is "abs_sine" (the default), the modulation is based on the absolute value of a sine function. If week_modulation_type is "step", the new cases are simply multiplied by the weekend factor on the days set by week_end_days.
The weekend factor follows a Lognormal distribution with median pr_mean_weekend_factor and sigma pr_sigma_weekend_factor. If the input is two-dimensional, it is constructed hierarchically by the function hierarchical_normal() with default arguments.
The offset from Sunday follows a flat VonMises distribution and is the same for all regions.
- Parameters
new_cases_raw (TensorVariable) – The input array, can be one- or two-dimensional
week_modulation_type (str) – The type of modulation, accepts "step" or "abs_sine" (the default).
pr_mean_weekend_factor (float) – Sets the prior mean of the factor by which weekends are counted.
pr_sigma_weekend_factor (float) – Sets the prior sigma of the factor by which weekends are counted.
week_end_days (tuple of ints) – The days counted as weekend if week_modulation_type is "step"
model (Cov19Model) – if None, it is retrieved from the context
save_in_trace (bool) – If True (default), the new_cases are saved in the trace.
- Returns
new_cases (TensorVariable)
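The "step" variant is simple enough to sketch directly from its description: cases on weekend days are multiplied by the weekend factor. The sketch uses a fixed factor instead of a random variable. The function name is hypothetical, and reading week_end_days=(6, 7) as ISO weekdays (Saturday and Sunday) is an assumption.

```python
from datetime import date, timedelta


def step_week_modulation(new_cases_raw, first_day, weekend_factor, week_end_days=(6, 7)):
    """Hypothetical "step" modulation: scale the cases on weekend days."""
    modulated = []
    for offset, cases in enumerate(new_cases_raw):
        day = first_day + timedelta(days=offset)
        # ASSUMPTION: days are numbered as ISO weekdays, Monday=1 ... Sunday=7.
        if day.isoweekday() in week_end_days:
            cases = cases * weekend_factor
        modulated.append(cases)
    return modulated


raw = [100.0] * 7
week = step_week_modulation(raw, first_day=date(2020, 3, 2), weekend_factor=0.3)
print(week)
```

Here the series starts on a Monday, so only the last two entries are scaled.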
-
covid19_inference.model.make_change_point_RVs(change_points_list, pr_median_lambda_0, pr_sigma_lambda_0=1, model=None)
- Parameters
priors_dict –
change_points_list –
model –
-
covid19_inference.model.lambda_t_with_sigmoids(change_points_list, pr_median_lambda_0, pr_sigma_lambda_0=0.5, model=None)
- Parameters
change_points_list –
pr_median_lambda_0 –
pr_sigma_lambda_0 –
model (Cov19Model) – if None, it is retrieved from the context
-
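covid19_inference.model.hierarchical_normal(name, name_sigma, pr_mean, pr_sigma, len_L2, w=1.0, error_fact=2.0, error_cauchy=True)
Implements a hierarchical normal model with a top-level (L1) random variable and a second-level (L2) random variable of length len_L2, whose spread around the top level is set by sigma_L2.
It is, however, implemented in a non-centered way: the second-level variable is not sampled directly but expressed as the top-level variable plus a standard-normal variable scaled by sigma_L2.
See for example https://arxiv.org/pdf/1312.0906.pdf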
- Parameters
name (str) – Name under which the top-level and second-level random variables are saved in the trace. '_L1' and '_L2' are appended.
name_sigma (str) – Name under which sigma_L2 is saved in the trace. '_L2' is appended.
pr_mean (float) – Prior mean of the top-level variable
pr_sigma (float) – Prior sigma for the top-level variable and (multiplied by error_fact) for sigma_L2
len_L2 (int) – length of the second-level variable
error_fact (float) – Factor by which pr_sigma is multiplied as prior for sigma_L2
error_cauchy (bool) – if False, a HalfNormal distribution is used for sigma_L2 instead of a HalfCauchy distribution
- Returns
y (TensorVariable) – the random variable
x (TensorVariable) – the random variable
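The difference between a centered and a non-centered parameterization can be shown with the standard library's random number generator. This is a sketch of the general technique only; the function names are hypothetical and the values of the top-level variable and of sigma_L2 are fixed numbers here rather than random variables.

```python
import random
import statistics


def sample_centered(x, sigma_l2, len_l2, rng):
    """Centered form: draw each second-level value directly around x."""
    return [rng.gauss(x, sigma_l2) for _ in range(len_l2)]


def sample_non_centered(x, sigma_l2, len_l2, rng):
    """Non-centered form: draw standard-normal offsets, then shift and scale."""
    offsets = [rng.gauss(0.0, 1.0) for _ in range(len_l2)]
    return [x + z * sigma_l2 for z in offsets]


rng = random.Random(0)
centered = sample_centered(3.0, 2.0, 20000, rng)
non_centered = sample_non_centered(3.0, 2.0, 20000, rng)
print(statistics.mean(non_centered), statistics.stdev(non_centered))
```

Both forms describe the same distribution. The non-centered form is usually preferred in hierarchical models because the sampler then explores standard-normal offsets whose scale does not depend on sigma_L2.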
-
covid19_inference.model.make_prior_I(lambda_t_log, mu, pr_median_delay, pr_sigma_I_begin=2, n_data_points_used=5, model=None)
Builds the prior for I_begin by solving the SIR differential equation backwards from the first data points. This decorrelates I_begin from lambda_t at the beginning, allowing more efficient sampling. The example_one_bundesland runs about 30% faster with this prior than with a HalfCauchy prior.
- Parameters
lambda_t_log (TensorVariable) –
mu (TensorVariable) –
pr_median_delay (float) –
pr_sigma_I_begin (float) –
n_data_points_used (int) –
model (Cov19Model) – if None, it is retrieved from the context
- Returns
I_begin (TensorVariable)