Let's run IPython Notebook with Python 3.4 in Docker. Some libraries are not yet fully compatible with Python 3.4, but migration is progressing steadily, so Python 3.4 is a good choice for code you will write from now on. You can check library compatibility at Python 3 Wall of Superpowers.
Why use Docker:
I think this is the reason why IPython Notebook is used for data analysis.
Docker images for Python and IPython are registered on Docker Hub or published on GitHub. There does not seem to be a definitive one yet, so it is worth reading and comparing several `Dockerfile`s.
Most of them are built on Ubuntu, but the C libraries they include differ depending on which Python modules they target. Roughly speaking, some assume numerical computing with *numpy*, while others assume database connections such as PostgreSQL. The Python version also differs between the 2.x and 3.x series.
Repository | Python | Description |
---|---|---|
ipython/notebook (Docker Hub) | Python 2.x | IPython notebook in a docker container. |
micktwomey/ipython3.4-notebook (Docker Hub) | Python 3.4 | Docker IPython 2.0 Notebook (micktwomey/ipython3.4) + Python 3.4 (micktwomey/python3.4) image |
unfairbanks/docker-ipython-notebook (Docker Hub) | Python 2.x | Docker container image capable of running an iPython notebook server |
dckc/ipython-docker (GitHub) | Python 2.x | docker container for ipython notebook |
crosbymichael/python-docker (GitHub) | Python 2.x | Dockerfile for python on debian |
mingfang/docker-ipython (GitHub) | Python 2.x | Run IPython, Pattern, NLTK, Pandas, NumPy, SciPy, Numba, Biopython inside Docker |
This time, for my own study, I built a Docker image by hand so that I could use the latest version of each library, and registered it on Docker Hub. If you set up an Automated Build linked to GitHub, the image is built automatically, which is convenient.
Starting a container from the image launches the IPython Notebook on port 8888. Use the `-p` option to specify which host port is forwarded to it. (Here the port number is changed so that you can see how to specify it.)
$ docker run -d -p 8080:8888 skitazaki/python34-ipython
If you want to share files with the host machine, mount `/notebook`.
$ docker run -d -p 8080:8888 -v $PWD:/notebook skitazaki/python34-ipython
You can use the IPython Notebook by accessing port 8080 with a browser.
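The two `docker run` invocations above could be wrapped in a small shell function so that the host port and the shared directory become parameters. This is only a sketch: `start_notebook` is a hypothetical name, and the defaults (port 8080, the current directory) simply mirror the commands above.

```shell
# Sketch: start the notebook container with a chosen host port and shared directory.
# start_notebook is a hypothetical helper name; the image name, container port 8888,
# and the /notebook mount point come from the commands above.
start_notebook() {
    host_port="${1:-8080}"      # port to open on the host machine
    notebook_dir="${2:-$PWD}"   # host directory to share with the container
    # -d: run in the background
    # -p HOST:CONTAINER: forward the host port to port 8888 in the container
    # -v HOST_DIR:/notebook: mount the host directory at /notebook
    docker run -d \
        -p "${host_port}:8888" \
        -v "${notebook_dir}:/notebook" \
        skitazaki/python34-ipython
}
```

For example, `start_notebook 9000 "$HOME/notebooks"` would publish the notebook on port 9000 of the host and share that (hypothetical) directory. Quoting the variables keeps directory names with spaces intact.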
You can create a new notebook by clicking "New Notebook". You can enable *pylab* (note: going forward, using only *matplotlib* seems to be better practice) and use *pandas* to draw graphs.
When running on EC2, you can expose port 80 or use SSH tunneling. Set the security group and the range of exposure according to the data you are handling.
After editing a notebook, you can export it as HTML, and you can download both the Python source code and the IPython Notebook source from the screen. This is often sufficient for checking data summaries, such as pivot tables, and for drawing graphs. IPython Notebook also lets you add explanations in Markdown, so I find it easier to manage than plain source code.
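For the SSH tunneling case, a minimal sketch might look like the following. It assumes the container on the remote host is published on host port 8080; `open_notebook_tunnel` is a hypothetical helper name, and the remote login is passed as an argument because no host name is given here.

```shell
# Sketch: reach a notebook running on a remote host through an SSH tunnel.
# open_notebook_tunnel is a hypothetical helper; the remote login is an argument
# because the actual host name is not given in the article.
open_notebook_tunnel() {
    remote="$1"               # SSH login such as user@host (placeholder)
    local_port="${2:-8080}"   # port to open on your own machine
    remote_port="${3:-8080}"  # host port the container is published on
    # -N: do not run a remote command, only forward ports
    # -L: forward local_port to remote_port on the remote host's loopback interface
    ssh -N -L "${local_port}:localhost:${remote_port}" "$remote"
}
```

While the tunnel is open, the notebook is available at `http://localhost:8080/` on your own machine. If the tunnel is the only intended entry point, publishing the container with `-p 127.0.0.1:8080:8888` on the remote host binds the port to the loopback interface only, so it does not need to be opened in the security group.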
If you want to use a simple interactive shell, launch the container as follows.
$ docker run -it skitazaki/python34-ipython ipython
If you want to install additional Python libraries, specify `/bin/bash` as the command and install them with `pip3 install {LIBNAME}`. After installation, you can start the IPython Notebook server with the following command.
container
root% ipython-notebook-startup.sh /notebook
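If you would rather not type these steps interactively, the install and the server start can probably be combined into a single `docker run`. The sketch below assumes that `ipython-notebook-startup.sh` is on the container's `PATH` (as the prompt above suggests) and that it stays in the foreground so the detached container keeps running; `start_notebook_with` is a hypothetical name.

```shell
# Sketch: install one extra library, then start the notebook server, in one step.
# start_notebook_with is a hypothetical helper; pip3 and ipython-notebook-startup.sh
# are the commands the article runs inside the container.
start_notebook_with() {
    lib="$1"   # library name to pass to pip3 install
    # The trailing "/bin/sh -c ..." replaces the image's default command,
    # the same way "ipython" and "/bin/bash" do in the examples above.
    docker run -d -p 8080:8888 skitazaki/python34-ipython \
        /bin/sh -c "pip3 install ${lib} && ipython-notebook-startup.sh /notebook"
}
```

Note that the library is installed into that one container only. A new container started from the same image will not have it, so the install runs again each time.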
Useful Python libraries are summarized here. If you look at it once in a while, you may discover something new.
When using Docker, it is convenient to set up the aliases listed in the cheat sheet (wsargent/docker-cheat-sheet). Add them to `.bashrc`, `.zshrc`, etc.
.bashrc
alias dl='docker ps -l -q'
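The `dl` alias prints the ID of the most recently created container, so it is mostly useful inside command substitution, as in `docker logs $(dl)`. As a sketch, the same idea can be written as shell functions, which also work in scripts (bash does not expand aliases in non-interactive shells by default). The names `latest_container`, `dlogs`, and `dstop` are hypothetical and are not taken from the cheat sheet.

```shell
# Sketch: function versions of "act on the latest container" helpers.
# latest_container, dlogs, and dstop are hypothetical names.
latest_container() {
    docker ps -l -q     # same command as the dl alias: ID of the latest container
}

dlogs() {
    docker logs "$(latest_container)"   # show the latest container's output
}

dstop() {
    docker stop "$(latest_container)"   # stop the latest container
}
```

After starting the notebook container, `dlogs` would show the server's startup messages and `dstop` would shut it down, without copying the container ID by hand.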
I think the following article is easy to understand as an introduction to Docker.