Predicting COVID19 cases with AutoAI time series API

IBM Watson AutoAI has recently introduced a new beta feature: time series support. It is as easy as a walk in the park: all you need to do is drag and drop your time series data, then sit back and relax while the best model is prepared for you.

In this story I will show how easily the IBM AutoAI Python API can be applied to COVID19 data to predict confirmed cases for the next few days.

Setup

To work with AutoAI for time series, you need a Watson Machine Learning service instance (included with the free plan). Watson Machine Learning provides a Python interface via a package available on PyPI, which you can easily install with pip.

Next, you need to provide authentication information to initialise the python client.
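A minimal sketch of the client initialisation. It assumes the package in question is ibm-watson-machine-learning (installed with `pip install ibm-watson-machine-learning`) and that you work in a deployment space. The API key, endpoint URL and space ID are placeholders, and the helper names are hypothetical.

```python
def build_wml_credentials(api_key, url="https://us-south.ml.cloud.ibm.com"):
    """Return the credentials dictionary passed to the client.

    The default URL is the Dallas endpoint; use the one matching your region.
    """
    if not api_key:
        raise ValueError("an IBM Cloud API key is required")
    return {"apikey": api_key, "url": url}


def create_client(wml_credentials, space_id):
    """Create the Python client and point it at a deployment space."""
    # Imported lazily so the sketch can be loaded without the package
    # (assumption: the package is ibm-watson-machine-learning).
    from ibm_watson_machine_learning import APIClient

    client = APIClient(wml_credentials)
    client.set.default_space(space_id)
    return client


wml_credentials = build_wml_credentials("PASTE_YOUR_API_KEY_HERE")
# client = create_client(wml_credentials, space_id="PASTE_YOUR_SPACE_ID_HERE")
```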

The time series data

The prepared data set contains the date and daily_cases columns. The daily_cases column contains the number of confirmed COVID19 cases in Poland on a particular day. The data set tracks confirmed cases from January 22, 2020 to March 28, 2021. Towards the end of that period, we started to observe a dramatic increase in daily cases here in Poland (around 30k per day).

Here is a visualisation of this data set, prepared using the plotly package.
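A sketch of how such a data set could be parsed and plotted. It assumes a CSV with date and daily_cases columns holding ISO-formatted dates; the helper names are hypothetical and the two sample rows are placeholders for illustration, not the actual data set.

```python
import csv
import io
from datetime import date


def parse_daily_cases(csv_text):
    """Parse CSV text with `date` and `daily_cases` columns into two lists."""
    dates, cases = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        dates.append(date.fromisoformat(row["date"]))
        cases.append(int(row["daily_cases"]))
    return dates, cases


def plot_daily_cases(dates, cases):
    """Build a line chart of the series (requires the plotly package)."""
    import plotly.graph_objects as go

    fig = go.Figure(go.Scatter(x=dates, y=cases, mode="lines", name="daily_cases"))
    fig.update_layout(
        title="Confirmed COVID19 cases in Poland",
        xaxis_title="date",
        yaxis_title="daily_cases",
    )
    return fig


# Placeholder rows for illustration only.
sample = "date,daily_cases\n2020-01-22,0\n2020-01-23,0\n"
dates, cases = parse_daily_cases(sample)
# plot_daily_cases(dates, cases).show()
```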

Now we use the python client to upload the data to Cloud Object Storage, to make it available for AutoAI.
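One possible way to do the upload, sketched under assumptions: it uses the DataConnection, S3Connection and S3Location helpers from ibm-watson-machine-learning with HMAC-style COS credentials. The bucket, endpoint, keys and file names are placeholders, the helper functions are hypothetical, and depending on the client version the connection may need extra setup before it can write.

```python
def build_data_connection(endpoint_url, access_key_id, secret_access_key,
                          bucket, remote_name):
    """Describe where the training file lives in Cloud Object Storage."""
    # Lazy import; assumes the ibm-watson-machine-learning package.
    from ibm_watson_machine_learning.helpers import (
        DataConnection,
        S3Connection,
        S3Location,
    )

    return DataConnection(
        connection=S3Connection(
            endpoint_url=endpoint_url,
            access_key_id=access_key_id,
            secret_access_key=secret_access_key,
        ),
        location=S3Location(bucket=bucket, path=remote_name),
    )


def upload_training_data(data_connection, local_path, remote_name):
    """Upload a local CSV file through the given connection and return it."""
    data_connection.write(data=local_path, remote_name=remote_name)
    return data_connection


# Usage (placeholders):
# conn = build_data_connection("https://<cos-endpoint>", "<key id>", "<secret>",
#                              "<bucket>", "covid19_poland.csv")
# upload_training_data(conn, "covid19_poland.csv", "covid19_poland.csv")
```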

AutoAI for time series

Using the python API we can easily define the AutoAI experiment for time series data. We need to define the following parameters for our experiment’s optimizer:

  • experiment name
  • problem type
  • indices of target columns
  • date&time column index
  • number of days to be predicted
  • number of holdout records
  • optimization metric
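The parameters listed above could map to an optimizer definition roughly as follows. The keyword names are my assumption based on the ibm-watson-machine-learning client, and the column indices assume that date is column 0 and daily_cases is column 1. The holdout size and metric shown in the example call are placeholders, not the values used in this experiment, and the experiment name is made up.

```python
def build_optimizer_settings(holdout_size, scoring):
    """Collect the experiment parameters as keyword arguments."""
    return {
        "name": "COVID19 daily cases forecast",  # experiment name (made up)
        "prediction_type": "forecasting",        # problem type
        "prediction_columns": [1],               # indices of target columns
        "timestamp_column_name": 0,              # date&time column index
        "forecast_window": 7,                    # number of days to be predicted
        "holdout_size": holdout_size,            # number of holdout records
        "scoring": scoring,                      # optimization metric
    }


def create_optimizer(wml_credentials, space_id, settings):
    """Define the AutoAI experiment and return its optimizer."""
    # Lazy import; assumes the ibm-watson-machine-learning package.
    from ibm_watson_machine_learning.experiment import AutoAI

    experiment = AutoAI(wml_credentials, space_id=space_id)
    return experiment.optimizer(**settings)


# Placeholder values; choose a holdout size and a metric that your
# client version supports.
settings = build_optimizer_settings(holdout_size=14, scoring="r2")
# pipeline_optimizer = create_optimizer(wml_credentials, space_id, settings)
```

Keeping the settings in a plain dictionary makes it easy to log or tweak them before the experiment is created.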

Now, call the optimizer's training method to start the training job.
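A sketch of this step, assuming the optimizer exposes a fit method that accepts a list of data connections, as in the ibm-watson-machine-learning client. The helper name is hypothetical.

```python
def run_training(pipeline_optimizer, training_data_reference):
    """Start the training job and wait until it finishes."""
    return pipeline_optimizer.fit(
        training_data_reference=[training_data_reference],
        background_mode=False,  # block instead of returning immediately
    )


# Usage (both objects come from the earlier steps):
# run_details = run_training(pipeline_optimizer, data_connection)
```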

As soon as training is completed, we can list all models found for us by AutoAI.

We can retrieve the details of each pipeline with a single method call.
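A sketch of listing the pipelines and fetching the details of the top one. It assumes the optimizer offers summary() and get_pipeline_details(), as the ibm-watson-machine-learning client does, and that the summary is a table indexed by pipeline name with the best pipeline first. The helper name is hypothetical.

```python
def describe_best_pipeline(pipeline_optimizer):
    """Return the pipeline summary, the best pipeline name and its details."""
    summary = pipeline_optimizer.summary()
    # Assumption: rows are indexed by pipeline name and sorted best-first.
    best_name = list(summary.index)[0]
    details = pipeline_optimizer.get_pipeline_details(best_name)
    return summary, best_name, details


# Usage:
# summary, best_name, details = describe_best_pipeline(pipeline_optimizer)
# print(summary)
```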

The details of each pipeline contain data for visualization. Below is a simple comparison of observed vs. predicted values on the holdout data set.
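Besides plotting, the same comparison can be summarised numerically. The layout of the visualization data is not shown here, so this sketch only assumes you have extracted two equally long lists of observed and predicted holdout values. The function names are hypothetical and the numbers in the example are made up.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def smape(observed, predicted):
    """Symmetric mean absolute percentage error, in percent."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for o, p in zip(observed, predicted):
        denominator = (abs(o) + abs(p)) / 2
        # Treat a 0-vs-0 pair as a perfect prediction.
        total += abs(o - p) / denominator if denominator else 0.0
    return 100 * total / len(observed)


# Made-up numbers, for illustration only.
observed = [100, 200]
predicted = [110, 180]
mae = mean_absolute_error(observed, predicted)
error_pct = smape(observed, predicted)
```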

The top-ranked pipeline is the best model returned by AutoAI, so we will use it for deployment and scoring.

Deployment and scoring

In this section we will deploy the best pipeline as a webservice. Then we will use this webservice’s scoring endpoint to obtain predictions for the next 7 days.
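A sketch of the deployment step, assuming the WebService helper from ibm-watson-machine-learning and its create method. The deployment name is a placeholder, the helper functions are hypothetical, and the run ID is assumed to come from the optimizer's run details.

```python
def create_web_service(wml_credentials, space_id):
    """Build the helper object used to deploy and score online."""
    # Lazy import; assumes the ibm-watson-machine-learning package.
    from ibm_watson_machine_learning.deployment import WebService

    return WebService(wml_credentials, source_space_id=space_id)


def deploy_pipeline(service, pipeline_name, run_id, deployment_name):
    """Deploy the chosen pipeline as a web service and return the service."""
    service.create(
        model=pipeline_name,
        experiment_run_id=run_id,
        deployment_name=deployment_name,
    )
    return service


# Usage (placeholders):
# service = create_web_service(wml_credentials, space_id)
# deploy_pipeline(service, best_name, run_id, "COVID19 forecast deployment")
```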

Now that our deployment has been created, we can ask it for predictions through its scoring method.
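A sketch of the scoring call, assuming the deployed service exposes score(payload=...) and answers in the usual Watson Machine Learning shape, a dictionary with a predictions list whose first entry holds values. Both the payload format (here, the most recent observations) and the nesting of the returned values are assumptions, so the helper simply flattens whatever nesting it finds.

```python
def flatten(values):
    """Flatten arbitrarily nested lists into one flat list."""
    flat = []
    for item in values:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def forecast(service, recent_observations):
    """Score the deployment and return the predicted values as a flat list."""
    response = service.score(payload=recent_observations)
    # Assumed response shape: {"predictions": [{"values": [...]}]}
    return flatten(response["predictions"][0]["values"])


# Usage:
# predictions = forecast(service, recent_observations)
```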

We receive a list of predicted confirmed cases in Poland for the following week. Let’s visualize that.
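To put the predictions on a time axis, each value needs a date. This sketch assumes daily data ending on March 28, 2021 (the last day in the data set) and a 7-day horizon. The helper names are hypothetical and the plotting part requires the plotly package.

```python
from datetime import date, timedelta


def future_dates(last_observed, horizon):
    """Return the `horizon` calendar days that follow `last_observed`."""
    return [last_observed + timedelta(days=i) for i in range(1, horizon + 1)]


def plot_forecast(dates, predictions):
    """Build a line chart of the predicted values."""
    import plotly.graph_objects as go

    fig = go.Figure(
        go.Scatter(x=dates, y=predictions, mode="lines+markers",
                   name="predicted daily_cases")
    )
    fig.update_layout(xaxis_title="date", yaxis_title="daily_cases")
    return fig


forecast_dates = future_dates(date(2021, 3, 28), 7)
# plot_forecast(forecast_dates, predictions).show()
```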

Based on the prediction we can expect a significant peak in 3–4 days.

Go to IBM Cloud and check this new feature out.
You can also find sample AutoAI notebooks here.

Automation architect and data scientist at IBM Krakow Software Lab. Currently working on Watson Machine Learning cloud offering.
