[Introduction to Docker] Create a Docker image for machine learning and use Jupyter notebook

About this article

This is a memo for my future self, for when I forget how to use Docker, written after I moved my machine learning environment to Docker.

Environment

This article assumes a Mac, so commands for Windows are not covered.

GitHub

The Dockerfile introduced in this article is published on GitHub.

GitHub:/kuboshu/pythonml

What you will learn from this article

How to build a Docker image with machine learning libraries installed, and how to play with machine learning by using the Jupyter notebook running in the container from the host.

Libraries installed in the Docker image

Since I use Python, I installed Python libraries related to machine learning. They are listed in the pip install lines of the Dockerfile below.

Contents of the created Dockerfile

The image is based on Ubuntu 20.04. I'm basically just installing Python packages with pip, so nothing special is going on.

FROM ubuntu:20.04
LABEL maintainer="kuboshu83"
ENV DEBIAN_FRONTEND=noninteractive
ARG INSTALLDIR_PYOCR="/app/ocr"
RUN apt-get update && \
    apt-get -y upgrade && \
    apt-get install -y git \
                       make \
                       cmake \
                       gcc \
                       g++ \
                       wget \
                       zip \
                       curl && \
    # ~~~~~Python installation~~~~~
    apt-get install -y python3 python3-pip && \
    ln -s $(which python3) $(dirname $(which python3))/python  && \
    ln -s $(which pip3) $(dirname $(which python3))/pip && \
    # ~~~~~Installation of ML related libraries for Python~~~~~
    # TensorFlow and PyTorch are large, so comment them out if you don't need them.
    # Estimated sizes: tensorflow = 1.2GB, pytorch = 2GB.
    # The ML libraries other than TensorFlow and PyTorch take about 2GB.
    pip install pystache \
                numpy==1.18.5 \
                pandas \
                scikit-learn \
                matplotlib \
                jupyterlab \
                pycaret \
                lightgbm \
                alembic==1.4.1 \
                sqlalchemy==1.3.13 \
                optuna && \
    pip install tensorflow && \
    pip install torch torchvision && \
    # ~~~~~OpenCV installation~~~~~
    pip install opencv-python && \
    apt-get install -y libgl1-mesa-dev && \
    # ~~~~Install Tesseract~~~~~
    apt-get install -y libleptonica-dev tesseract-ocr && \
    # ~~~~Install PyOCR~~~~~
    pip install pyocr && \
    mkdir -p /usr/local/share/tessdata/ && \
    curl https://raw.githubusercontent.com/tesseract-ocr/tessdata_best/master/jpn.traineddata -sS -L -o /usr/share/tesseract-ocr/4.00/tessdata/jpn.traineddata && \
    # ~~~~Install MeCab~~~~
    apt-get install -y mecab libmecab-dev mecab-ipadic && \
    pip install --no-binary :all: mecab-python3 && \
    pip install neologdn && \
    #~~~~Creating a working directory~~~~
    mkdir -p /home/share

#Launch Python shell by default
CMD ["python"]

How to build a Docker image

You can build the Docker image with the following command. The same command is also written in build.sh on GitHub, so you can build the image by running build.sh instead.

docker build -t <image name>:<version> <location of the Dockerfile>
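The three parts of the command above are the image name, the version tag, and the directory that contains the Dockerfile. As a minimal sketch, the helper below fills them in and prints the resulting command without running Docker. The function name build_cmd is hypothetical and is not taken from the build.sh in the repository, which may look different.

```shell
# Hypothetical helper (not the repository's build.sh).
# Prints the docker build command for an image name, a version tag and
# the directory containing the Dockerfile (defaults to the current one).
build_cmd() {
    name="$1"
    version="$2"
    context="${3:-.}"
    if [ -z "$name" ] || [ -z "$version" ]; then
        echo "usage: build_cmd NAME VERSION [DOCKERFILE_DIR]" >&2
        return 1
    fi
    printf 'docker build -t %s:%s %s\n' "$name" "$version" "$context"
}

# Example with the image name and version used in this article.
build_cmd pythonml v0.1.0 .
# prints: docker build -t pythonml:v0.1.0 .
```

Printing the command first makes it easy to check the tag before starting a long build; the printed line can then be run as it is.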

How to launch Jupyter-notebook

If you create share/ in the current directory and run the following command, Jupyter notebook will start inside the container. You can then use it by opening the displayed URL in a browser. Specify the version of the Docker image that you built; the example below uses v0.1.0.

# Create a directory to share with /home/share in the container.
> mkdir share

# Start the container.
# --rm: Delete the container when it stops.
# -it : Required to use a terminal in the container.
#       It is unnecessary if you only use Jupyter notebook as here, but I included it anyway.
# -p  : Map host port 8888 to container port 8888.
# -v  : Mount (current directory)/share/ on the host at /home/share/ in the container.
# -w  : Set the working directory of the container at startup to /home/share/.
# JupyterLab listens on port 8888.
> docker run --rm -it -p 8888:8888 -w /home/share -v $(pwd)/share:/home/share pythonml:v0.1.0 /usr/local/bin/jupyter lab --ip=0.0.0.0 --port 8888 --allow-root

When the container starts, /home/share/, which was prepared as the working directory, becomes the current directory, so it is convenient to share it with a directory on the host.
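The -p option takes the form host port:container port, and the container side has to match the port that Jupyter listens on. The sketch below makes that explicit by keeping the container port at 8888 and letting only the host port and the shared directory vary. The function name jupyter_run_cmd is hypothetical; it only prints the command and does not start Docker.

```shell
# Hypothetical helper: print the docker run command from this article
# for a given image, host directory and (optionally) host port.
jupyter_run_cmd() {
    image="$1"              # e.g. pythonml:v0.1.0
    host_dir="$2"           # host directory mounted at /home/share
    host_port="${3:-8888}"  # port opened on the host side
    # The container port stays 8888 because jupyter is started with --port 8888.
    printf 'docker run --rm -it -p %s:8888 -w /home/share -v "%s:/home/share" %s /usr/local/bin/jupyter lab --ip=0.0.0.0 --port 8888 --allow-root\n' \
        "$host_port" "$host_dir" "$image"
}

# Same settings as in the article, with the share directory under the current one.
jupyter_run_cmd pythonml:v0.1.0 "$(pwd)/share"
```

If a host port other than 8888 is passed, the URL printed by Jupyter still shows 8888, so the port in the URL has to be replaced with the host port when opening it in the browser.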

What I looked up when writing a Dockerfile

Avoid interactive installation

- Reference (What is DEBIAN_FRONTEND=noninteractive: qiita @udzura)

I want the Docker image to build fully automatically, without being prompted for manual settings while packages are installed. To disable interactive configuration during installation, I set the following environment variable.

ENV DEBIAN_FRONTEND=noninteractive
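A variable set with ENV stays set in the final image and in containers started from it. As an aside, and not something the Dockerfile in this article does, a shell can also limit a variable to a single command by writing the assignment in front of it. The sketch below only demonstrates that scoping in plain sh; inside a RUN instruction the same prefix could be placed before apt-get.

```shell
# Start from a clean state so the demo does not depend on the caller's environment.
unset DEBIAN_FRONTEND

# Prefix assignment: only this one child process sees the variable.
DEBIAN_FRONTEND=noninteractive sh -c 'echo "during: ${DEBIAN_FRONTEND:-unset}"'
# prints: during: noninteractive

# The next command no longer sees it.
sh -c 'echo "after: ${DEBIAN_FRONTEND:-unset}"'
# prints: after: unset
```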

Reduce the number of times RUN is used

- Reference (Tutorial aiming to understand Docker image: qiita @zembutsu)

At first I used the RUN instruction freely without much thought, but when I checked the images with docker image ls -a, a large number of intermediate-layer images had been produced, as shown below. Apparently, Docker creates an intermediate layer each time an instruction in the Dockerfile is executed, and stacks these layers to form the final image. I therefore reduced the number of instructions as much as possible.

I don't yet understand whether having many intermediate layers is actually a problem. However, seeing so many <none> entries in the image list bothered me, so I wrote the Dockerfile to reduce the intermediate layers.

REPOSITORY          TAG       IMAGE ID      
pythonml            v0.1.0    xxxxxx        
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=Like this
<none>              <none>    xxxxxx        <=This too
ubuntu              20.04     xxxxxx        
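To see how many of these untagged entries a change to the Dockerfile removes, the rows can simply be counted. The sketch below is a minimal example, assuming the listing has the repository name in its first column as in the output above; count_untagged is a hypothetical name, and the input here is the sample listing from this article rather than real Docker output.

```shell
# Count rows whose REPOSITORY column is "<none>" in a listing read from stdin.
count_untagged() {
    awk '$1 == "<none>" { n++ } END { print n + 0 }'
}

# Sample listing copied from this article (image IDs are placeholders).
count_untagged <<'EOF'
REPOSITORY          TAG       IMAGE ID
pythonml            v0.1.0    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
<none>              <none>    xxxxxx
ubuntu              20.04     xxxxxx
EOF
# prints: 7
```

With Docker available, the same function can read the real list with docker image ls -a | count_untagged, and docker history pythonml:v0.1.0 shows which instruction produced each layer.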

Summary

This time I simply built a Docker image with Python packages installed, and noted how to set up an environment for playing with machine learning through Jupyter notebook from the host. There are still libraries I want to try, so I'd like to add them in the future.

Also, this time I gave priority to keeping the Dockerfile tidy, and since all the libraries are installed with apt-get or pip, some of them are older versions. If I have time, I would like to build them from source and install the latest versions.

References

- What is DEBIAN_FRONTEND=noninteractive: qiita @udzura
- Tutorial aiming to understand Docker image: qiita @zembutsu
