Plotting

We provide a range of plotting functions that can be used to recreate our plots or to create entirely new visualizations. If you are familiar with matplotlib, you should have no problem using them.

We provide three different types of functions here:

  • High level functions: These can be used to create figures similar to those in our paper, Dehning et al. arXiv:2004.01105. They are neat one-liners that create a good-looking plot from our model but offer few customization options.

  • Low level functions: These extend the normal matplotlib plotting functions and can be used to plot arbitrary data. They have many customization options, although it can take some time to get good-looking plots with them.

  • Helper functions: These are mainly functions that manipulate data or retrieve data from our model. Most of the time they do not have to be used directly; they are documented here only for completeness.

If you just want to recreate our figures with a different color, the easiest way is to change the default rc parameters.

covid19_inference.plot.get_rcparams_default()[source]

Returns a Param (dict) holding the default parameters; this is where our default values are set. The result is assigned once to the module variable rcParamsDefault on load.

covid19_inference.plot.set_rcparams(par)[source]

Sets the rc parameters used for plotting. The provided instance of Param has to have the following keys (attributes):

Variables
  • locale (str) – region settings, passed to setlocale(). Default: “en_US”

  • date_format (str) – Format of the dates on the x axis of time-like data (see https://strftime.org/). Examples for April 1, 2020: “%m/%d” gives 04/01, “%-d. %B” gives 1. April. Default: “%b %-d”, which gives Apr 1

  • date_show_minor_ticks (bool) – whether to show the minor ticks (for every day). Default: True

  • rasterization_zorder (int or None) – Rasterizes plotted content below this zorder value. Set to None to keep everything as vector graphics. Default: -1

  • draw_ci_95 (bool) – For timeseries plots, indicate 95% Confidence interval via fill between. Default: True

  • draw_ci_75 (bool) – For timeseries plots, indicate 75% Confidence interval via fill between. Default: False

  • draw_ci_50 (bool) – For timeseries plots, indicate 50% Confidence interval via fill between. Default: False

  • color_model (str) – Base color used for model plots; any mpl-compatible color code, e.g. “C0” or “#303030”. Default: “tab:green”

  • color_data (str) – Base color used for data. Default: “tab:blue”

  • color_annot (str) – Color used for annotations. Default: “#646464”

  • color_prior (str) – Color used for priors in distributions. Default: “#708090”

Example

import covid19_inference as cov

# Get the default parameters
pars = cov.plot.get_rcparams_default()

# Change parameters
pars["locale"] = "de_DE"
pars["color_data"] = "tab:purple"

# Set parameters
cov.plot.set_rcparams(pars)
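
To preview how a date_format string renders before setting it, the standard library's strftime can be called directly. The following is a minimal sketch that does not touch covid19_inference; the helper name preview_date_format is hypothetical. Note that “%-d” (day without a leading zero) is a platform-specific extension that is not available on every system, so the sketch sticks to portable format codes.

```python
import datetime


def preview_date_format(fmt, day=datetime.date(2020, 4, 1)):
    """Render `day` with a strftime format string (hypothetical helper)."""
    return day.strftime(fmt)


# Portable format codes only; month names depend on the active locale.
for fmt in ("%m/%d", "%b %d", "%Y-%m-%d"):
    print(fmt, "->", preview_date_format(fmt))
```

Once a format looks right, it can be assigned to pars["date_format"] as in the example above.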

High level functions

covid19_inference.plot.timeseries_overview(model, trace, start=None, end=None, region=None, color=None, save_to=None, offset=0, annotate_constrained=True, annotate_watermark=True, axes=None, forecast_label='Forecast', forecast_heading='$\\bf Forecasts\\!:$', add_more_later=False)[source]

Creates a time series overview similar to the one in our paper (Dehning et al. arXiv:2004.01105). Contains \lambda, new cases, and cumulative cases.

Parameters
  • model (Cov19Model) –

  • trace (trace instance) – needed for the data

  • offset (int) – offset that needs to be added to the (cumulative sum of) new cases at time model.data_begin to arrive at cumulative cases

  • start (datetime.datetime) – only used to set xrange in the end

  • end (datetime.datetime) – only used to set xrange in the end

  • color (str) – main color to use, default from rcParam

  • save_to (str or None) – path where to save the figures. default: None, not saving figures

  • annotate_constrained (bool) – show the unconstrained/constrained annotation in the lambda panel

  • annotate_watermark (bool) – show our watermark

  • axes (np.array of mpl axes) – provide an array of existing axes (from previously calling this function) to add more traces. Data will not be added again. Ideally call this first with add_more_later=True

  • forecast_label (str) – legend label for the forecast, default: “Forecast”

  • forecast_heading (str) – if add_more_later, how to label the forecast section. default: “$\bf Forecasts\!:$”

  • add_more_later (bool) – set this to true if you plan to add multiple models to the plot. changes the layout (and the color of the fit to past data)

Returns

  • fig (mpl figure)

  • axes (np array of mpl axes (insets not included))

Todo

  • Replace offset with an instance of a data class that yields the cumulative cases. We should not do calculations here.

Low level functions

covid19_inference.plot._timeseries(x, y, ax=None, what='data', draw_ci_95=None, draw_ci_75=None, draw_ci_50=None, date_format=True, alpha_ci=None, **kwargs)[source]

low-level function to plot anything that has a date on the x-axis.

Parameters
  • x (array of datetime.datetime) – times for the x axis

  • y (array, 1d or 2d) – data to plot. If 2d, the first dim is realization and the second dim is time matching x, and we plot the CI as fill_between (if CI is enabled in rc params). If 1d, the first dim is time matching x.

  • ax (mpl axes element, optional) – plot into an existing axes element. default: None

  • what (str, optional) – what type of data is provided. Sets the style used for plotting: “data” for data points, “fcast” for model forecast (prediction), “model” for model reproduction of data (past)

  • date_format (bool, optional) – automatically convert the index to dates. default: True

  • kwargs (dict, optional) – passed directly to the mpl plotting call.

Returns

ax
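
For 2d input, the confidence bands reflect the spread across realizations at each time step. A rough sketch of that idea, using only the standard library, is given here. It assumes that the 95% band is bounded by the 2.5th and 97.5th percentiles; the documentation above does not state how the library computes its intervals, and the helper name ci_band is hypothetical.

```python
import statistics


def ci_band(samples):
    """Return (lower, upper) lists with the 2.5th / 97.5th percentile per time step.

    `samples` is a list of realizations, each a list over time
    (first dim realization, second dim time).
    """
    lower, upper = [], []
    # zip(*samples) iterates over time steps, yielding all realizations at once.
    for values_at_t in zip(*samples):
        cuts = statistics.quantiles(values_at_t, n=40, method="inclusive")
        lower.append(cuts[0])   # 1/40 = 2.5 %
        upper.append(cuts[-1])  # 39/40 = 97.5 %
    return lower, upper


# 41 realizations, 3 time steps: realization r has value r + t at time t.
samples = [[r + t for t in range(3)] for r in range(41)]
lower, upper = ci_band(samples)
```

The two lists could then be passed, together with x, to matplotlib's fill_between to shade the band.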

covid19_inference.plot._distribution(model, trace, key, ax=None, color=None, draw_prior=True)[source]

Todo

documentation

Example

In this example we use the low level time series function to plot the new daily cases and deaths reported by the Robert Koch Institute.

import datetime
import matplotlib.pyplot as plt
import covid19_inference as cov19

# Data retrieval, i.e. download new data from the Robert Koch Institute
rki = cov19.data_retrieval.RKI()
rki.download_all_available_data()

new_deaths = rki.get_new(
    value="deaths",
    data_begin=datetime.datetime(2020, 3, 15),  # arbitrary date
    data_end=datetime.datetime.today(),
)

new_cases = rki.get_new(
    value="confirmed",
    data_begin=datetime.datetime(2020, 3, 15),
    data_end=datetime.datetime.today(),
)

# Create a multiplot
fig, axes = plt.subplots(2,1, figsize=(12,6))

# Plot the new cases onto axes[0]
cov19.plot._timeseries(
    x=new_cases.index,
    y=new_cases,
    ax=axes[0],
    what="model", #We define model here to get a line instead of data points
)

# Plot the new deaths onto axes[1]
cov19.plot._timeseries(
    x=new_deaths.index,
    y=new_deaths,
    ax=axes[1],
    what="model", #We define model here to get a line instead of data points
)

# Label the plots

axes[0].set_title("New cases")

axes[1].set_title("New deaths")

# Show the figure
fig.show()

Helper functions

covid19_inference.plot._get_array_from_trace_via_date(model, trace, var, start=None, end=None, dates=None)[source]
Parameters
  • model (model instance) –

  • trace (trace instance) –

  • var (str) – the variable name in the trace

  • start (datetime.datetime) – get all data for a range from start to end. (both boundary dates included)

  • end (datetime.datetime) –

  • dates (list of datetime.datetime objects, optional) – the dates for which to get the data. Default: None, will return all available data.

Returns

  • data (nd array, 3 dim) – the elements from the trace matching the dates. The dimensions are: 0: samples (if no samples, only one entry); 1: data with time matching the returned dates (if compatible variable); 2: region (if no regions, only one entry)

  • dates (pandas DatetimeIndex) – the matching dates. This is essentially an array of dates that can be passed to matplotlib

Example

import covid19_inference as cov
model, trace = cov.create_example_instance()
y, x = cov.plot._get_array_from_trace_via_date(
    model, trace, "lambda_t", model.data_begin, model.data_end
)
ax = cov.plot._timeseries(x, y[:,:,0], what="model")
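
The start and end arguments select a date window with both boundary dates included. The index arithmetic behind such a selection can be sketched as follows, assuming one value per day starting at a known first date. The helper slice_by_date is hypothetical and is not taken from the library.

```python
import datetime


def slice_by_date(values, first_date, start, end):
    """Return (dates, values) for start..end, both boundaries included."""
    i_start = (start - first_date).days
    i_end = (end - first_date).days
    if i_start < 0 or i_end >= len(values) or i_start > i_end:
        raise ValueError("requested range is outside the available data")
    dates = [
        first_date + datetime.timedelta(days=i) for i in range(i_start, i_end + 1)
    ]
    # The +1 makes the end date part of the selection.
    return dates, values[i_start : i_end + 1]


first = datetime.datetime(2020, 3, 1)
dates, vals = slice_by_date(
    list(range(10)), first, datetime.datetime(2020, 3, 3), datetime.datetime(2020, 3, 5)
)
```
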
covid19_inference.plot._new_cases_to_cum_cases(x, y, what, offset=0)[source]

Converts new cases to cumulative cases. This conversion got ugly quickly; the dimensionality of y needs to be checked.

Parameters
  • x (pandas DatetimeIndex array) – will be padded accordingly

  • y (1d or 2d numpy array) – new cases matching dates in x. If 1d, we assume raw data (no samples). If 2d, we assume results from a trace, with 0th dim samples and 1st dim new cases matching x

  • what (str) – dirty workaround to differentiate between traces and raw data: “data” or “trace”

  • offset (int or array like) – added to the cumulative sum (should be the known cumulative case number at the first date provided in x)

Returns

  • x_cum (pandas DatetimeIndex array) – dates of the cumulative cases

  • y_cum (nd array) – cumulative cases matching x_cum and the dimension of input y

Example

# what="data" marks new_cases as raw data (use "trace" for samples from a trace)
cum_dates, cum_cases = cov.plot._new_cases_to_cum_cases(new_dates, new_cases, "data")
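
The core of the conversion is a running sum with offset added. The sketch below shows this for the 1d (raw data) case only, using the standard library; it does not reproduce the padding of x or the 2d trace handling, and the function name cumulative_cases is hypothetical.

```python
from itertools import accumulate


def cumulative_cases(new_cases, offset=0):
    """Running sum of daily new cases with `offset` added to every entry."""
    return [offset + total for total in accumulate(new_cases)]


print(cumulative_cases([5, 3, 7], offset=100))  # [105, 108, 115]
```
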
covid19_inference.plot._label_for_varname(key)[source]

get the label for trace variable names (e.g. placed on top of distributions)

default for unknown keys is the key itself

Todo

add more parameters

covid19_inference.plot._math_for_varname(key)[source]

get the math string for trace variable name, e.g. used to print the median representation.

default for unknown keys is “$x$”

Todo

use regex

covid19_inference.plot._days_to_mpl_dates(days, origin)[source]

convert days given as numbers to matplotlib-compatible date numbers. These are not the same as a pandas DatetimeIndex, but numpy operations work on them

Parameters
  • days (number, 1d array of numbers) – the day number to convert, e.g. integer values >= 0, one day per int

  • origin (datetime.datetime) – the date object corresponding to day 0
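
The underlying date arithmetic can be sketched with the standard library: each day number is an offset from origin. The real helper returns matplotlib date numbers; this sketch stops at datetime objects, only handles sequences of day numbers, and the name days_to_datetimes is hypothetical.

```python
import datetime


def days_to_datetimes(days, origin):
    """Map day numbers (day 0 == origin) to datetime objects."""
    return [origin + datetime.timedelta(days=float(d)) for d in days]


origin = datetime.datetime(2020, 3, 1)
print(days_to_datetimes([0, 1, 31], origin)[-1])  # 2020-04-01 00:00:00
```

If date numbers are needed, matplotlib.dates.date2num can convert such datetime objects.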

covid19_inference.plot._get_mpl_text_coordinates(text, ax)[source]

helper to get coordinates of a text object in the coordinates of the axes element [0,1]. used for the rectangle backdrop.

Returns: x_min, x_max, y_min, y_max

covid19_inference.plot._add_mpl_rect_around_text(text_list, ax, x_padding=0.05, y_padding=0.05, **kwargs)[source]

add a rectangle to the axes (behind the text)

provide a list of text elements and optional keyword arguments that are passed to mpl.patches.Rectangle, e.g. facecolor=“grey”, alpha=0.2, zorder=99

covid19_inference.plot._rx_cp_id(key)[source]

get the change_point index from a compatible variable name

covid19_inference.plot._rx_hc_id(key)[source]

get the L1 / L2 value of hierarchical variable name

covid19_inference.plot._format_k(prec)[source]

format the y axis so that 10_000 is shown as 10 k. For example, _format_k(0)(1200, 1000.0) gives “1 k” and _format_k(1)(1200, 1000.0) gives “1.2 k”
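
The call pattern above (a function that takes a precision and returns a formatter) can be sketched as a small closure. This is a minimal illustration and not the library's implementation: the name format_k_sketch is hypothetical, and the handling of values below 1000 is an assumption, since the documentation does not describe it.

```python
def format_k_sketch(prec):
    """Return a tick formatter that writes thousands as 'k' with `prec` decimals."""

    def formatter(value, pos=None):
        # `pos` is the tick position that matplotlib passes to formatter
        # callbacks; it is not needed for the label text.
        if abs(value) >= 1000:
            return f"{value / 1000:.{prec}f} k"
        # Assumption: smaller values are printed unchanged in scale.
        return f"{value:.{prec}f}"

    return formatter


print(format_k_sketch(0)(1200, 1000.0))  # 1 k
print(format_k_sketch(1)(1200, 1000.0))  # 1.2 k
```

A function with this (value, position) signature can be wrapped in matplotlib.ticker.FuncFormatter and applied with ax.yaxis.set_major_formatter.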

covid19_inference.plot._format_date_xticks(ax, minor=None)[source]
covid19_inference.plot._truncate_number(number, precision)[source]
covid19_inference.plot._string_median_CI(arr, prec=2)[source]
covid19_inference.plot._add_watermark(ax, mark='Dehning et al. 10.1126/science.abb9789')[source]

Add our watermark (by default the reference given in mark) to an axes as the (upper right) title

class covid19_inference.plot.Param[source]

Parameters base class (a tweaked dict)

We inherit from dict and also provide keys as attributes, mapped to .get() of dict. This avoids a KeyError: when getting parameters via .the_parname, None is returned if the parameter does not exist.

Avoid using keys that have the same name as class functions etc.

Example

from covid19_inference.plot import Param

foo = Param(lorem="ipsum")
print(foo.lorem)  # prints: ipsum
print(foo.does_not_exist is None)  # prints: True
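
The attribute access described above can be sketched in a few lines. This is a stand-alone illustration of the pattern under the name ParamSketch (hypothetical), not the library's actual class.

```python
class ParamSketch(dict):
    """dict whose keys can also be read as attributes, defaulting to None."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods
        # such as .keys() or .get() keep working.
        return self.get(name)


foo = ParamSketch(lorem="ipsum")
print(foo.lorem)           # ipsum
print(foo.does_not_exist)  # None

# A key named like a dict method is not reachable as an attribute,
# which is why such key names should be avoided.
bar = ParamSketch(keys="shadowed")
print(callable(bar.keys))  # True
```
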