Machine Learning with docker (40) with anaconda (40) "Hands-On Data Science and Python Machine Learning" By Frank Kane

1. For those who want to use it right away

「Hands-On Data Science and Python Machine Learning」By Frank Kane


http://shop.oreilly.com/product/9781787280748.do

docker

Please install docker and start it on Windows or Mac. On Windows, docker may not start unless Intel Virtualization is enabled in the BIOS. Security warnings may also appear.

docker run

$ docker pull kaizenjapan/anaconda-frank

$ docker run -it -p 8888:8888 kaizenjapan/anaconda-frank /bin/bash

In the shell sessions below, (base) root@f19e2f06eabb:/# is the command prompt. The hexadecimal ID part will differ in your environment. Type only what appears to the right of # on such a line; the other lines are output. If there are any errors or differences in the output, please let us know in the comments. Navigate to the folder for each chapter.

If the docker shell and the shell of the host OS that started docker look similar, it is easy to confuse which one you are working in. Pay attention to the docker command prompt.
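As a small illustration of the rule above (the part to the right of # is the command, everything else is output), here is a minimal Python sketch that pulls the typed commands out of a pasted transcript. The function name commands_from_transcript and the prompt pattern are my own assumptions, based only on the prompts that appear in this article.

```python
import re

# Matches prompts such as "(base) root@f19e2f06eabb:/# " or
# "root@0ece3ffce439:/some/dir# ". The container ID is any hex string,
# so it does not matter that your ID differs from the one shown here.
PROMPT = re.compile(r"^(?:\(base\) )?root@[0-9a-f]+:[^#]*#\s*(.*)$")


def commands_from_transcript(text):
    """Return the commands typed after each container prompt.

    Lines that do not start with a prompt are treated as output and skipped,
    as are prompt lines with nothing typed after the #.
    """
    commands = []
    for line in text.splitlines():
        match = PROMPT.match(line)
        if match and match.group(1).strip():
            commands.append(match.group(1).strip())
    return commands
```

Pasting one of the sessions from this article into commands_from_transcript should return only the lines you need to type.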

File sharing or copying

To view generated files in a browser or similar, either share files between the host OS that started docker and the docker container, or copy them from the container to the host. URLs describing both methods are listed in the Reference section.

I am still looking for a good way to organize the disk of the host OS. Some setups have sharing configured from the beginning.

When copying, run the command on the host OS side, not inside docker. Replace the container ID with your own. I opened the copied file in a browser and checked its contents.
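The copy command itself is not shown here, so the following is only a sketch of how it could look when driven from Python on the host. It relies on the standard docker cp CONTAINER:SRC_PATH DEST_PATH form. The helper names and the file name output.png are hypothetical; replace the container ID with your own.

```python
import subprocess


def docker_cp_command(container_id, container_path, host_path):
    """Build the argument list for copying a file from a container to the host."""
    source = "%s:%s" % (container_id, container_path)
    return ["docker", "cp", source, host_path]


def copy_from_container(container_id, container_path, host_path):
    """Run docker cp on the host. Raises CalledProcessError if docker fails."""
    subprocess.run(
        docker_cp_command(container_id, container_path, host_path),
        check=True,
    )


if __name__ == "__main__":
    # Hypothetical example: the ID and the file name are placeholders.
    print(" ".join(docker_cp_command(
        "0ece3ffce439",
        "/Hands-On-Data-Science-and-Python-Machine-Learning/output.png",
        ".",
    )))
```

Building the argument list separately from running it makes it easy to print and check the command before anything is copied.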


root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# python SparkDecisionTree.py 
Traceback (most recent call last):
  File "SparkDecisionTree.py", line 1, in <module>
    from pyspark.mllib.regression import LabeledPoint
ImportError: No module named pyspark.mllib.regression
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# pip install pyspark
bash: pip: command not found
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# find / -name pip -print
/root/anaconda3/bin/pip
/root/anaconda3/lib/python3.7/site-packages/pip
/root/anaconda3/pkgs/pip-10.0.1-py37_0/bin/pip
/root/anaconda3/pkgs/pip-10.0.1-py37_0/lib/python3.7/site-packages/pip
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# /root/anaconda3/bin/pip install pyspark        
Collecting pyspark
  Downloading https://files.pythonhosted.org/packages/5e/cb/d8ff49ba885e2c88b8cf2967edd84235ffa9ac301bffef657dfa5605a112/pyspark-2.3.2.tar.gz (211.9MB)
    100% |████████████████████████████████| 211.9MB 201kB/s 
Collecting py4j==0.10.7 (from pyspark)
  Downloading https://files.pythonhosted.org/packages/e3/53/c737818eb9a7dc32a7cd4f1396e787bd94200c3997c72c1dbe028587bd76/py4j-0.10.7-py2.py3-none-any.whl (197kB)
    100% |████████████████████████████████| 204kB 951kB/s 
Building wheels for collected packages: pyspark
  Running setup.py bdist_wheel for pyspark ... done
  Stored in directory: /root/.cache/pip/wheels/be/7d/34/cd3cfbc75d8b6b6ae0658e5425348560b86d187fe3e53832cc
Successfully built pyspark
twisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.
Installing collected packages: py4j, pyspark
Successfully installed py4j-0.10.7 pyspark-2.3.2
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# pip install --upgrade pip
bash: pip: command not found
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# apt-get install pip
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package pip
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning#  /root/anaconda3/bin/pip install --upgrade pip
Collecting pip
  Downloading https://files.pythonhosted.org/packages/c2/d7/90f34cb0d83a6c5631cf71dfe64cc1054598c843a92b400e55675cc2ac37/pip-18.1-py2.py3-none-any.whl (1.3MB)
    100% |████████████████████████████████| 1.3MB 8.5MB/s 
twisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.
Installing collected packages: pip
  Found existing installation: pip 10.0.1
    Uninstalling pip-10.0.1:
      Successfully uninstalled pip-10.0.1
Successfully installed pip-18.1
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning#  /root/anaconda3/bin/pip install PyHamcrest   
Collecting PyHamcrest
  Downloading https://files.pythonhosted.org/packages/9a/d5/d37fd731b7d0e91afcc84577edeccf4638b4f9b82f5ffe2f8b62e2ddc609/PyHamcrest-1.9.0-py2.py3-none-any.whl (52kB)
    100% |████████████████████████████████| 61kB 2.6MB/s 
Requirement already satisfied: six in /root/anaconda3/lib/python3.7/site-packages (from PyHamcrest) (1.11.0)
Requirement already satisfied: setuptools in /root/anaconda3/lib/python3.7/site-packages (from PyHamcrest) (40.2.0)
Installing collected packages: PyHamcrest
Successfully installed PyHamcrest-1.9.0
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# python SparkDecisionTree.py
Traceback (most recent call last):
  File "SparkDecisionTree.py", line 1, in <module>
    from pyspark.mllib.regression import LabeledPoint
ImportError: No module named pyspark.mllib.regression
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning#  /root/anaconda3/bin/pip install LabeledPoint
Collecting LabeledPoint
  Could not find a version that satisfies the requirement LabeledPoint (from versions: )
No matching distribution found for LabeledPoint
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning#  /root/anaconda3/bin/pip install regression  
Collecting regression
  Could not find a version that satisfies the requirement regression (from versions: )
No matching distribution found for regression
root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# 
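
One possible reading of the session above (my assumption, not something confirmed here): the bare python command may be a different interpreter from the Anaconda one whose pip installed pyspark, which would explain why the import still fails after a successful install. The sketch below reports which interpreter is running and whether a module is visible to it. The function name describe_environment is hypothetical.

```python
import importlib.util
import sys


def describe_environment(module_name="pyspark"):
    """Report the running interpreter and whether module_name can be imported by it."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        "module": module_name,
        "found": spec is not None,
        # Installing through the interpreter itself keeps pip and python in sync.
        "pip_command": [sys.executable, "-m", "pip", "install", module_name],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("%s: %s" % (key, value))
```

If the reading above is right, running this file with /root/anaconda3/bin/python (assuming that path exists next to the pip found above) should report pyspark as found, and running SparkDecisionTree.py with the same interpreter would be the next thing to try.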

jupyter notebook

root@0ece3ffce439:/Hands-On-Data-Science-and-Python-Machine-Learning# /root/anaconda3/bin/jupyter notebook --ip=0.0.0.0 --allow-root
[I 14:00:45.307 NotebookApp] JupyterLab extension loaded from /root/anaconda3/lib/python3.7/site-packages/jupyterlab
[I 14:00:45.307 NotebookApp] JupyterLab application directory is /root/anaconda3/share/jupyter/lab
[I 14:00:45.311 NotebookApp] Serving notebooks from local directory: /Hands-On-Data-Science-and-Python-Machine-Learning
[I 14:00:45.311 NotebookApp] The Jupyter Notebook is running at:
[I 14:00:45.311 NotebookApp] http://(0ece3ffce439 or 127.0.0.1):8888/?token=03a8851511d5e0e2457d5448b0f66f71b8378d4ac9b1c141
[I 14:00:45.311 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 14:00:45.313 NotebookApp] No web browser found: could not locate runnable browser.
[C 14:00:45.313 NotebookApp] 

Open localhost:8888 in the browser.


In the above case, enter 03a8851511d5e0e2457d5448b0f66f71b8378d4ac9b1c141 as the token.
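
Because the token changes every time the server starts, it can help to extract it from the log line rather than copy it by hand. This is a minimal sketch. The function names are hypothetical, and it assumes the log line has the "?token=..." form shown above and that port 8888 is mapped to the host as in the docker run command.

```python
import re

# The token is the hex string after "token=" in the NotebookApp log line.
TOKEN_PATTERN = re.compile(r"[?&]token=([0-9a-f]+)")


def extract_token(log_line):
    """Return the token found in a Jupyter log line, or None if there is none."""
    match = TOKEN_PATTERN.search(log_line)
    return match.group(1) if match else None


def local_url(token, port=8888):
    """Build the URL to open on the host, assuming the port is mapped 1:1."""
    return "http://localhost:%d/?token=%s" % (port, token)
```

Opening the URL returned by local_url should skip the token prompt, since the token is already part of the address.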


2. For those who build the docker image themselves

From here on, I record the policy and procedure I followed to build the docker image that was pulled above. It is reference material for using that image and is not needed to run the rest of the book. This is the procedure for building the docker / anaconda environment by hand. It is not a way to write a Dockerfile, sorry.

docker

docker is a mechanism that lets Linux distributions such as ubuntu and debian be used in the same way from linux, windows and mac os. It is convenient because it can be used without changing the settings of the host OS, and many people can use an environment with identical specifications. Both images officially supported by software developers and images conveniently tailored by users are available. This time, I tailor an officially distributed image so that others can use it.

python

I came to Python through deep learning training. The reason for using python is that many machine learning tools can be used from python, and statistical analysis tools such as R can also be used easily from python.

anaconda

python differs between versions 2 and 3, and in how it is distributed. I have been using python3 with Anaconda for the last year and a half. The reason I use Anaconda is that it ships with statistical analysis libraries and Jupyter Notebook from the beginning.

official docker distribution

There are official distributions of operating systems such as ubuntu and debian, and official distributions of languages and tools such as gcc and anaconda. By building on these and registering the result on docker-hub, you can rely on the quality of the official distribution and share a wide range of information, including the right to make changes. "Official" here does not mean that docker itself distributes the image, but that each software provider distributes it officially.

docker pull

You use an official docker distribution by pulling it from its URL.

docker Anaconda

Use the one officially distributed by anaconda.

$ docker pull continuumio/anaconda3
Using default tag: latest
latest: Pulling from continuumio/anaconda3
Digest: sha256:e07b9ca98ac1eeb1179dbf0e0bbcebd87701f8654878d6d8ce164d71746964d1
Status: Image is up to date for continuumio/anaconda3:latest

$ docker run -it -p 8888:8888 continuumio/anaconda3 /bin/bash

Alternatively, pull another image that I pushed earlier and actually used with keras and tensorflow.

apt-get

(base) root@d8857ae56e69:/# apt-get update

(base) root@d8857ae56e69:/# apt-get install -y procps

(base) root@d8857ae56e69:/# apt-get install -y vim

(base) root@d8857ae56e69:/# apt-get install -y apt-utils

(base) root@d8857ae56e69:/# apt-get install sudo

apt-get install scala

apt-get install default-jre 



Source git

(base) root@f19e2f06eabb:/# git clone https://github.com/PacktPublishing/Hands-On-Data-Science-and-Python-Machine-Learning

conda

(base) root@f19e2f06eabb:/d# conda update --prefix /opt/conda anaconda

pip

(base) root@f19e2f06eabb:/# pip install --upgrade pip

/root/anaconda3/bin/pip install pyspark  

Register with docker hub

$ docker ps
CONTAINER ID        IMAGE                   COMMAND                  CREATED             STATUS              PORTS                    NAMES
caef766a99ff        continuumio/anaconda3   "/usr/bin/tini -- /b…"   10 hours ago        Up 10 hours         0.0.0.0:8888->8888/tcp   sleepy_bassi

$ docker commit 3bf1f723168d   kaizenjapan/anaconda-frank
 

$ docker push kaizenjapan/anaconda-frank
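
The ID passed to docker commit has to be the one that docker ps reports in your own session. As a hedged sketch, the snippet below reads the container ID from docker ps output and builds the commit and push commands used above. The function names are hypothetical, and the parsing assumes the default table layout, where the first column is the container ID.

```python
def container_ids(ps_output):
    """Return the container IDs listed in the default `docker ps` table output."""
    lines = [line for line in ps_output.splitlines() if line.strip()]
    # The first non-empty line is the header row; every later row
    # starts with the container ID.
    return [line.split()[0] for line in lines[1:]]


def commit_and_push_commands(container_id, image):
    """Build the argument lists for `docker commit` followed by `docker push`."""
    return [
        ["docker", "commit", container_id, image],
        ["docker", "push", image],
    ]
```

Each argument list can be printed for review or passed to subprocess.run on the host once you are happy with it.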

Reference

Why machine learning with docker Book / source list is being created (Goal 100) https://qiita.com/kaizen_nagoya/items/ddd12477544bf5ba85e2

Machine learning with docker (1) with anaconda (1) "Deep Learning from scratch-The theory and implementation of deep learning learned with Python" by Yasuki Saito https://qiita.com/kaizen_nagoya/items/a7e94ef6dca128d035ab

Machine learning with docker (2) with anaconda (2) "Deep Learning from scratch 2 Natural language processing" by Yasuki Saito https://qiita.com/kaizen_nagoya/items/3b80dfc76933cea522c6

Machine learning with docker (3) with anaconda (3) "Intuition Deep Learning" Antonio Gulli, Sujit Pal Chapter 1, Chapter 2 https://qiita.com/kaizen_nagoya/items/483ae708c71c88419c32

Machine learning with docker (71) Environment construction (1) docker Somehow, no matter what, there are only errors. https://qiita.com/kaizen_nagoya/items/690d806a4760d9b9e040

Machine learning with docker (72) Environment construction (2) Docker for Windows https://qiita.com/kaizen_nagoya/items/c4daa5cf52e9f0c2c002

Machine learning with docker (73) Environment construction (3) docker / linux / macos bash script, ms-dos batch file https://qiita.com/kaizen_nagoya/items/3f7b39110b7f303a5558

Machine learning with docker (74) Environment construction (4) R How many difficulties? https://qiita.com/kaizen_nagoya/items/5fb44773bc38574bcf1c

Machine learning with docker (75) Environment construction (5) Management of docker related files https://qiita.com/kaizen_nagoya/items/4f03df9a42c923087b5d

I tried to run OpenCV with Python and was told that libGL.so was missing, but I solved it. https://qiita.com/toshitanian/items/5da24c0c0bd473d514c8

Drawing tips with matplotlib on the server side https://qiita.com/TomokIshii/items/3a26ee4453f535a69e9e

Copy files between host and container with Docker https://qiita.com/gologo13/items/7e4e404af80377b48fd5

Use file sharing with Docker for Mac https://qiita.com/seijimomoto/items/1992d68de8baa7e29bb5

"Nagoya's IoT is Nagoya's OS" How can I use Docker? TOPPERS / FMP on RaspberryPi with Macintosh 5 barriers https://qiita.com/kaizen_nagoya/items/9c46c6da8ceb64d2d7af

Road to 64bit CPU and / or 64 year old determination https://qiita.com/kaizen_nagoya/items/cfb5ffa24ded23ab3f60

Deep Learning 2 Natural Language Processing from Zero How to proceed with a book club (example) https://qiita.com/kaizen_nagoya/items/025eb3f701b36209302e

Try using NVIDIA Docker on Ubuntu 16.04 LTS https://blog.amedama.jp/entry/2017/04/03/235901

Document history

ver. 0.10 First draft 20181024
ver. 0.11 push 20181028
