Introduction

I read the article "Platform that can deploy machine learning as an actual service only with Jupyter" and was very impressed that, with Jupyter as the foundation, notebooks could be incorporated into a service directly. The article mentioned a library called "papermill" that can execute Jupyter notebooks from the outside, so I would like to try it here.

Reference

nteract/papermill

Environment

macOS Catalina 10.15.3 (19D76)
Python 3.7.5

Procedure

Installation

pip install papermill

Run

The folder structure is as follows. `input.ipynb` is the notebook to be executed, and the code that executes it goes in `main.py`.

work/
　├ main.py
　└ input.ipynb

The contents of `input.ipynb` are very simple.

Here, the first cell is tagged with `parameters`. Select View > Cell Toolbar > Tags from the menu to display a text box at the upper right of each cell. Enter `parameters` there and click `Add tag` to add the tag. papermill looks for the cell with the `parameters` tag in the notebook and overrides the variables defined in it.
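Since a notebook file is plain JSON, the tag can also be seen (or set) without the Jupyter UI. The following is a minimal sketch using only the standard library. The function names, the default values (0.1 and 0.5), and the contents of the second cell are hypothetical, chosen for illustration; only the variable names alpha and ratio and the `parameters` tag come from this article.

```python
import json


def make_parameterized_notebook(defaults):
    """Build a minimal nbformat-4 notebook dict whose first cell is tagged 'parameters'."""
    # One assignment per line, e.g. "alpha = 0.1"
    source = "\n".join(f"{name} = {value!r}" for name, value in defaults.items())
    params_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["parameters"]},  # the tag papermill looks for
        "outputs": [],
        "source": source,
    }
    # Hypothetical second cell that just prints the parameter values
    body_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": "print(" + ", ".join(defaults) + ")",
    }
    return {
        "cells": [params_cell, body_cell],
        "metadata": {
            "kernelspec": {
                "display_name": "Python 3",
                "language": "python",
                "name": "python3",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def write_notebook(path, notebook):
    """Save the notebook dict as an .ipynb file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
```

For example, `write_notebook("sketch_input.ipynb", make_parameterized_notebook({"alpha": 0.1, "ratio": 0.5}))` should produce a file that opens in Jupyter with the tag already set on the first cell.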

To run it with the Python API, write the following. The executed notebook is written to ./output.ipynb.

`main.py`


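import papermill as pm

# Execute input.ipynb with the given parameters and save the result as output.ipynb
pm.execute_notebook(
    './input.ipynb',   # notebook to execute
    './output.ipynb',  # where the executed notebook is written
    parameters=dict(alpha=0.6, ratio=0.1)  # overrides for the cell tagged "parameters"
)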

Run it with python main.py.

$ python main.py 
Executing: 100%|████████████████████████████████| 3/3 [00:01<00:00,  1.80cell/s]
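The same call can be repeated to run one notebook with several parameter sets. Below is a rough sketch under my own assumptions: the helper names `build_runs` and `run_all`, the `runs` directory, and the output file naming are all hypothetical. Only `pm.execute_notebook` and its `parameters` argument are taken from the example above.

```python
from pathlib import Path


def build_runs(alphas, ratio, out_dir="runs"):
    """Return (output_path, parameters) pairs, one per alpha value."""
    return [
        (
            str(Path(out_dir) / f"output_alpha_{alpha}.ipynb"),
            {"alpha": alpha, "ratio": ratio},
        )
        for alpha in alphas
    ]


def run_all(input_path, runs):
    """Execute the input notebook once per (output_path, parameters) pair."""
    # Imported here so build_runs can be used even where papermill is not installed
    import papermill as pm

    for output_path, params in runs:
        # Make sure the output directory exists before papermill writes to it
        Path(output_path).parent.mkdir(parents=True, exist_ok=True)
        pm.execute_notebook(input_path, output_path, parameters=params)
```

Calling `run_all("./input.ipynb", build_runs([0.2, 0.4, 0.6], ratio=0.1))` would then leave one executed notebook per alpha value in the `runs` folder.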

Opening ./output.ipynb shows that a cell tagged `injected-parameters` has been added, overriding the parameters.
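The injected cell can also be checked without opening Jupyter, because the output notebook is a JSON file. This is a small sketch with the standard library only. The helper names are hypothetical, and it assumes the tag is stored under each cell's `metadata.tags`, which is where the Jupyter tag toolbar puts it.

```python
import json


def load_notebook(path):
    """Read an .ipynb file into a plain dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def find_tagged_cells(notebook, tag):
    """Return (index, source) for every cell whose metadata tags include `tag`."""
    found = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if tag in cell.get("metadata", {}).get("tags", []):
            source = cell.get("source", "")
            # The notebook format stores source either as one string or as a list of lines
            if isinstance(source, list):
                source = "".join(source)
            found.append((index, source))
    return found
```

For example, `find_tagged_cells(load_notebook("./output.ipynb"), "injected-parameters")` should return a single entry holding the injected assignments.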

To run from the CLI, use the following. papermill automatically infers booleans and numeric values from the arguments.

$ papermill ./input.ipynb ./output.ipynb  -p alpha 0.6 -p ratio 0.1
Input Notebook:  ./input.ipynb
Output Notebook: ./output.ipynb
Executing: 100%|████████████████████████████████| 3/3 [00:01<00:00,  2.67cell/s]

Parameters can also be specified in a YAML file.

work/
　├ main.py
　├ input.ipynb
　└ parameters.yaml

On the CLI, pass the file with the -f option:

papermill ./input.ipynb ./output.ipynb -f ./parameters.yaml
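The contents of parameters.yaml are not shown here, but for flat parameters the file is just one `name: value` line per parameter. The sketch below generates such a file from a dict using only the standard library. The function names are hypothetical, and the values simply mirror the ones passed with -p earlier. It assumes simple string keys and scalar values; for anything nested, a real YAML library would be the better choice.

```python
import json
from pathlib import Path


def to_flat_yaml(parameters):
    """Render a flat dict of scalars as YAML text, one 'name: value' line per entry."""
    # json.dumps gives YAML-compatible scalars: 0.6, true/false, "quoted strings"
    return "".join(f"{name}: {json.dumps(value)}\n" for name, value in parameters.items())


def write_parameters_file(parameters, path="parameters.yaml"):
    """Write the rendered YAML text to `path` and return the path."""
    Path(path).write_text(to_flat_yaml(parameters), encoding="utf-8")
    return path
```

With `write_parameters_file({"alpha": 0.6, "ratio": 0.1})`, the file would contain the two lines `alpha: 0.6` and `ratio: 0.1`.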

The output can also be saved to cloud storage. In that case, you need to install the optional dependencies as well.

pip install "papermill[all]"

Replace ./output.ipynb with the cloud destination. Below is an example for AWS S3. It works as long as your AWS credentials have already been configured, for example with the AWS CLI.

papermill ./input.ipynb s3://xxxxxxxxxx/output.ipynb -f ./parameters.yaml

Bonus

If the output destination is the same as the input, the input notebook is overwritten.

papermill ./input.ipynb ./input.ipynb -f ./parameters.yaml

Running this repeatedly only overwrites the cell tagged `injected-parameters`, so the parameters are updated correctly each time.

In conclusion

It would be interesting to build a management screen with Flask and use it to manage training runs.

I tried papermill