Watson AutoAI — can I get the model?
Is it possible to download a model trained by Watson AutoAI and use it outside the Watson Studio ecosystem? The answer is yes.
This short story describes in detail how to download an AutoAI trained model and use it in a third-party environment (local machine, cloud service, etc.).
Getting the model
The easiest way to download a trained pipeline model is to use the Python SDK and the autogenerated notebook. From the drop-down menu next to the selected pipeline model, click “Save as Notebook”.
The notebook can be run either in the Watson Studio runtime or, after downloading it, on any other notebook server. It automatically installs all required dependencies:
- xgboost
- lightgbm
- scikit-learn
- autoai-libs
- watson-machine-learning-client-V4
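To see which of these packages are already present in an environment outside Watson Studio, a small check like the one below can help. It is a minimal sketch, assuming Python 3.8 or later; the helper name installed_versions is hypothetical and not part of any SDK, and the package names are simply copied from the list above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the dependency list above.
REQUIRED_PACKAGES = [
    "xgboost",
    "lightgbm",
    "scikit-learn",
    "autoai-libs",
    "watson-machine-learning-client-V4",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper for illustration only.
    """
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, found in installed_versions(REQUIRED_PACKAGES).items():
        print(f"{name}: {found or 'NOT INSTALLED'}")
```

Running the same check in the notebook environment and in the target environment makes it easy to spot a package that is missing in one of them.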
The autogenerated notebook contains a “Get pipeline model” section that demonstrates how to download the trained pipeline.
pipeline_model = optimizer.get_pipeline(pipeline_name=pipeline_name)
This line of code downloads the trained AutoAI pipeline model and loads it as a fitted scikit-learn pipeline. By default, the pipeline is wrapped in a lale wrapper. Lale is an open-source library for semi-automated data science. It extends a pure scikit-learn pipeline with visualisation, source code pretty-printing (pretty_print) and refinement capabilities. To get a pure scikit-learn pipeline, pass the astype="sklearn" parameter.
pipeline_model = optimizer.get_pipeline(pipeline_name=pipeline_name, astype="sklearn")
From that point you can call the predict method on the pipeline object to get predictions for new observations.
predictions = pipeline_model.predict(test_X)
You can also use joblib to dump the pipeline model, copy the file to a different environment and finally load it there:
from joblib import dump, load
sklearn_pipeline = optimizer.get_pipeline(pipeline_name=pipeline_name, astype='sklearn')
dump(sklearn_pipeline, 'sklearn_pipeline.joblib')
loaded_pipeline = load('sklearn_pipeline.joblib')
loaded_pipeline.predict(test_X)
Please keep in mind that the required libraries need to be installed in the target environment as well.
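The dump and load round trip can be tried without access to an AutoAI experiment. The sketch below uses a small, ordinary scikit-learn pipeline as a stand-in for the object returned by get_pipeline with astype="sklearn"; the synthetic data, the pipeline steps and the temporary file location are all assumptions made for illustration. It requires scikit-learn and joblib.

```python
import os
import tempfile

from joblib import dump, load
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the fitted pipeline downloaded from AutoAI (assumption:
# any fitted scikit-learn pipeline behaves the same way for this purpose).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
stand_in_pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
]).fit(X, y)

# Persist the pipeline to a file, as one would before copying it elsewhere.
model_path = os.path.join(tempfile.mkdtemp(), "sklearn_pipeline.joblib")
dump(stand_in_pipeline, model_path)

# Load the file (in practice, in the target environment) and predict.
loaded_pipeline = load(model_path)
original_predictions = stand_in_pipeline.predict(X)
loaded_predictions = loaded_pipeline.predict(X)

# The reloaded pipeline should reproduce the original predictions.
print((original_predictions == loaded_predictions).all())
```

Comparing predictions before and after the round trip is a cheap sanity check that the target environment has compatible library versions installed.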
Getting the pipeline definition
It is also possible to get the source code of the pipeline definition. The definition uses scikit-learn syntax and can be used to modify and re-train the pipeline.
pipeline_model.pretty_print(combinators=False, ipython_display=True)
Please note that this feature is supported only for pipeline models of lale type. You can also insert the source code into the next notebook cell by passing the ipython_display="input" parameter.
pipeline_model.pretty_print(combinators=False, ipython_display='input')
You can also get the source code in the form of a Python script by calling the helper function pipeline_to_script().