Serving COVID-19 Epidemic Models with Watson Machine Learning

Written by Lukasz Cmielowski , Alexandre Quemy, Rafal Bigaj & Wojciech Sobala

In this story, we would like to share our recent experience of building, serving and integrating COVID-19 models using IBM Cloud.

Building epidemic models

For the purpose of this experiment, we have tried several epidemic models. Since we are located in Poland we have adopted all tested models to Poland COVID-19 situation and data. However, shared below examples can be easily adapted to other locations and data sources by simply passing country name and other input parameters.

We have evaluated SIR, logistic, double-exponential and Weibull models to predict the number of confirmed cases in Poland in the next days. The next model we worked on was the calibration model based on China's historical data, social distancing (lockdown date) and sigmoid-like asymptotical shape of an epidemic.

From the tooling perspective, we have used the Jupyter Notebook environment available in Watson Studio (IBM Cloud).

Calibration model

The calibration model uses a different approach than most data science and epidemiological methods. It relies on two simples and testable assumptions:

  • (H1): the asymptotical shape of an epidemic is a sort of sigmoid
  • (H2): as of today, only social distancing measure can slower the epidemic

Based on these two hypotheses, the method consists of taking a full trajectory as reference (e.g. Hubei, China) and calibrate as best as possible the (potentially partial) trajectories of other countries using a conformal map. All trajectories can then be considered as realizations of the same stochastic process. It is then possible to robustly estimate the parameters using all trajectories, i.e. by using ALL available information as of today. We paid particular attention to use non-biased estimators as the bias is larger when few trajectories are available. Using the inverse map, we can come back to the original space for each trajectory.

The model has many interesting properties:

  • Uses all the available information
  • Allows to easily include expert knowledge (e.g. complex model for the impact of social distancing measures depending on the country)
  • Has a close form, a proof of convergence, and the mathematical properties can be easily studied
  • Easy to make sound scenarios at a given risk (e.g. 5%)
  • Many ways to check if the model does not behave properly globally or for each particular trajectory
  • We can make a counterfactual analysis easily

However, the model works only for a country that implemented social distancing measures and it needs at least one full epidemic trajectory for the calibration. It is the case for COVID-19 since the epidemic in Hubei is over, but in more general cases it might not be the case.

To evaluate the model, we use several validation metrics that are also used to recalibrate the model depending on the deviation observed. We provide three scenarios:

  • Central,
  • Worst case a 5%
  • The best case at 5% (under a Gaussian assumption on the noise).

When the real curve starts to deviate from the central scenario, we readjust the individual trajectory parameters to realign the central scenario on it. This affects positively all other trajectories.
We provide and monitor the absolute and relative uncertainty between the 5% scenarios and the average scenario to ensure it converges toward zero with time.

The average relative error for the central scenario for a J+1 prediction from 28/03/2020 to 01/04/2020:

  • Italy: 1.14%
  • Poland: 1.35%
  • France: 2.08%
Projection of confirmed cases for Italy
Projection of confirmed cases for Poland
Projection of confirmed cases for France

The black dots represent the confirmed number of cases for the given country.
The black curve represents the expected number of cases found by the model. With a 95% confidence interval, the real trajectory should be between the worst case and the best-case scenario (assuming our hypotheses hold).

The vertical bars represent the number of new cases detected for each day. It is directly derived from the model. In blue, our estimation, in orange, the real observations. The text below the top left legend box indicated:

  • the risk of computing the best and worst scenarios, and the parameters for the Gaussian noise
  • the total number of cases the model predicts, with the interval for the risk mentioned above.
  • the expected date of the point of inflection as well as the interval for the risk mentioned above

If you are interested in technical details of the calibration model please read the calibration model story written by Alexandre Quemy.

If you want to play with source code please refer to this Jupyter notebook.

SIR model

The SIR model belongs to the compartmental models family. The model consists of three compartments:

  • S the number of susceptible,
  • I the number of infectious
  • R the number of recovered individuals.

S, I, and R parameters represent the number of people in each compartment at a particular time. To represent that those numbers (S, I, R) may vary over time, a function of t (time): S(t), I(t) and R(t) is introduced. SIR model allows describing the number of people in each compartment at a particular time [1].

The SIR model requires initialization parameters to be set:

  • Initial conditions: S(0), I(0), R(0)
  • β is a parameter controlling how much the disease can be transmitted through exposure
  • γ is a parameter expressing how much the disease can be recovered in a specific period [2].

β and γ can be found based on historical data. The technique is well described in “COVID-19 dynamics with SIR model” publication. You can also find implementation examples and detailed description in [2].

For the Poland use case, we have used historical data updated daily from GitHub repository CSSEGISandData. The simulation result can be found below.

Projection of confirmed cases (I) for Poland

Parametric curves

Models based on theories of epidemics such as SIR or SEIR are useful to understand dynamic and make simulations whereas simple models can provide better short term predictions [4].

Widely used machine learning algorithms are developed to provide good models for large datasets with many predictors (features) but simple nonlinear parametric curves are proved to be useful for predicting epidemic. Interpretable parameters make forecasting more stable. An item that remains crucial for selecting a good candidate curve is the ability to model flattening the epidemic curve which is commonly observed.

The most common choice is the logistic curve. We added two additional curves for comparison:

  • logistic
  • double exponential
  • Weibull

Weibull curve is inspired based on Weibull cumulative distribution with an additional parameter to rescale predicted values. We are interested in curve fitting to the number of daily cases at the country level. Data for period from first case detection to April, 1st is used. The last five days are used to compare prediction error between fitted models.

The logistic curve for confirmed cases in Poland
The double-exponential curve for confirmed cases in Poland
Weibull curve for confirmed cases in Poland

As we can see the Weibull curve looks very promising. The Mean Square Error is the smallest of all three evaluated curves.

The Jupyter notebook with source code can be found here.

For comparative analysis we have also trained a few regression models:

  • SVR (polynomial kernel)
  • Polynomial Regression
  • Bayesian Ridge

For each of them, we have calculated MSE on the same points as for described above curves. Please find MSE comparison in the table below.

MODEL                              | MSE
SVR(poly kernel) | 810720.6
Polynomial regression (degree=6) | 12156.5
BayesianRidge | 23566.0
Logistic: | 106986.5
Double exponential | 4369.6
Weibull | 899.6
SIR | 1826.7

Serving models as web service

To serve models as web service we use Watson Machine Learning service on IBM Cloud.

Please refer to this Jupyter notebook for calibration model, deployment and scoring steps source code.

If you want to run the model for your country you need to add country name and lockdown date to dictionary under CalibrationModel class:

ref_countries = {
'Hubei': {
'key': 'Province',
'name': 'Hubei',
'lockdown': '2020-01-23'
'Poland': {
'key': 'Country',
'name': 'Poland',
'lockdown': '2020-03-15'

To server a model as WebService you need to create an instance of Watson Machine Learning Service. Next copy instance credentials from credentials tab and paste them to notebook cell:

Now you can create a WebService serving calibration or any other models. You just need to pass scoring function name and WebService name to client.deployments.create() method.

To make a scoring request you need:

  • scoring URL (as printed below)
  • initialized python client (WatsonMachineLearningAPIClient) or Bearer token (can be generated using WML REST API)
  • scoring payload

Now, we can test our WebService by calling client.deployments.score(scoring_url, payload) method.

As a result of a scoring request

  • real confirmed
  • predicted confirmed
  • inflexion day
  • asymptotic cases
'inflexion_day': 70.0, 'asymptotic_cases': 5232.893425532289

are returned.

Using Plotly we can easily visualize confirmed cases prediction curve as well as real cases.

Projection of confirmed cases for Poland

Building dashboard

To build an interactive dashboard consuming our WebService we have used Plotly Dash. That allowed us to stay in the python eco-system and achieve our goals easily. The application itself is hosted on IBM Cloud.

The sample visualization from shows results based on the calibration model predictions. The actual and predicted a total number of cases is on the primary Y-axis, while new cases per day on the secondary Y-axis.

fig = make_subplots(specs=[[{"secondary_y": True}]])fig.add_trace(
go.Bar(x=days, y=calibration_result['PredictedChange'], name='Predicted Change', opacity=0.5),
go.Bar(x=days, y=calibration_result['ActualChange'], name='Actual Change', opacity=0.5),
go.Scatter(x=days, y=calibration_result['Predicted'], name='Calibration'),
go.Scatter(x=days, y=calibration_result['Actual'], name='Actual', mode="markers", marker=dict(size=8)),

For details see: how to make a graph with multiple axes in python.

The source code for the application can be found here.

Dashboard with CALIBRATION model results for Poland



Automation architect and data scientist at IBM Krakow Software Lab. Currently working on Watson Machine Learning cloud offering.