Jupyter begins

Introduction

I'm a Jupyter beginner, so this is a memo on how to build a Jupyter environment on AWS EC2 and how to use it, written so that I can easily follow it again myself six months from now.

Leaving detailed settings aside, the goal is to quickly build a Jupyter environment on EC2, run a simple Python script on Jupyter, and learn the basics of operating Jupyter's UI. Since my Linux skills are weak, the procedure is written so that it can be followed by copy and paste as much as possible.

If you want to run Jupyter on a Spark cluster using Amazon EMR, see here. In addition, Jupyter Notebook will become **Jupyter Lab** from the next version, and the UI and functions will change significantly. For how to build a Jupyter Lab environment, please refer to here.

Jupyter environment construction

The first step is to build a Jupyter environment.

Creating EC2

Launch the EC2 instance that will run Jupyter and log in with ssh.

Install required modules

Install the required packages with apt-get, update pip, and install ipython[notebook]. If you see a message like "WARNING! Your environment specifies an invalid locale." during ssh login, run export LC_ALL=C first.

$ export LC_ALL=C
$ sudo apt-get update
$ sudo apt-get install -y python-pip libpq-dev python-dev libpng12-dev libjpeg8-dev libfreetype6-dev libxft-dev
$ sudo pip install -U pip
$ sudo pip install numpy pandas matplotlib seaborn scikit-learn plotly ipython[notebook]
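After the install finishes, it can be useful to confirm that the libraries are actually visible to Python. The following is a minimal sketch, assuming a Python 3 interpreter; the helper name missing_packages is hypothetical, and the list simply mirrors the pip command above (scikit-learn is imported under the name sklearn).

```python
import importlib.util

# Import names of the packages installed with pip above.
# scikit-learn is imported as "sklearn"; ipython is imported as "IPython".
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "sklearn", "plotly", "IPython"]


def missing_packages(names):
    """Return the module names that Python cannot find, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(PACKAGES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found")
```

If anything is reported as not installed, re-running the pip command for just that package is usually the quickest fix.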

Jupyter settings

The following command creates a template of the Jupyter configuration file (~/.jupyter/jupyter_notebook_config.py).

$ jupyter notebook --generate-config

Then edit ~/.jupyter/jupyter_notebook_config.py. It is a large file in which every line is commented out with #, so add the following 5 lines at the beginning of the file and save it. (Please note that with these settings anyone can access the Jupyter server.)

c = get_config()
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8080
c.NotebookApp.token = ''

If you want to set a password for Jupyter login

Recent versions of Jupyter require login with a password or token as a security measure. In the example above, c.NotebookApp.token = '' allows access without a token.

If you want to set a login password, you need to generate the hash string of the password in advance. When you execute the following command, a prompt for entering the password appears, so enter the password you want to set.

$ python -c "import IPython;print(IPython.lib.passwd())"

It then returns a hash string starting with sha1:, such as sha1:3be1549bb425:1500071094720b33gf8f0feg474931dc5e43dfed, so copy it.
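To make the three colon-separated parts of that string less mysterious, here is a small sketch of how such a value can be built and checked. It assumes the layout is algorithm:salt:digest, with the digest computed over the password followed by the salt; that is my understanding of the legacy format, not something confirmed from the IPython source. The function names make_hash and check_password are hypothetical, and for a real server you should use the output of passwd() itself.

```python
import hashlib
import hmac
import secrets


def make_hash(password, salt=None, algorithm="sha1"):
    """Build an 'algorithm:salt:digest' string (assumed layout) from a password."""
    if salt is None:
        salt = secrets.token_hex(6)  # 12 random hex characters
    digest = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return ":".join((algorithm, salt, digest))


def check_password(hashed, password):
    """Recompute the digest from the stored salt and compare it with the stored digest."""
    algorithm, salt, digest = hashed.split(":", 2)
    expected = hashlib.new(algorithm, (password + salt).encode("utf-8")).hexdigest()
    return hmac.compare_digest(expected, digest)


hashed = make_hash("my-password")
print(hashed)
print(check_password(hashed, "my-password"))
```

The point of the salt is that two users with the same password still end up with different hash strings.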

Then change the contents of the ~/.jupyter/jupyter_notebook_config.py file edited above as follows. Replace the hash string after c.NotebookApp.password with the hash string you generated above.

c = get_config()
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8080
c.NotebookApp.password = u'sha1:3be1549bb425:1500071094720b33gf8f0feg474931dc5e43dfed'

Launch Jupyter

Run the following command to start Jupyter.

$ jupyter notebook

Open the EC2 instance in your browser at an address such as ec2-53-239-93-85.ap-northeast-1.compute.amazonaws.com:8080. Don't forget the port, 8080. If the Jupyter login screen appears and you can log in with the password set above, it worked. If you want to run Jupyter in the background, use nohup jupyter notebook > /dev/null 2>&1 & and Jupyter will keep running even after you disconnect ssh.
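If the browser cannot reach the login screen, a quick way to separate "Jupyter is not running" from "the port is blocked" is to test the TCP port from your own machine. This is a minimal sketch; the helper name is_port_open is hypothetical, and the host name is just the example address used above.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and host names that do not resolve
        return False


# Example call (replace the host with your own instance):
# is_port_open("ec2-53-239-93-85.ap-northeast-1.compute.amazonaws.com", 8080)
```

If this returns False while Jupyter is running on the instance, the EC2 security group may not allow inbound traffic on port 8080.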

Jupyter autostart settings

Create a script called start_jupyter.sh and register it in /etc/rc.local so that Jupyter is started in the background when the EC2 instance boots.

How to use Jupyter

This is only a very brief look at how to use Jupyter.

Create a Jupyter notebook

After logging in, select Python 2 from the New menu to create a Python 2 notebook.

Setting1.png

About various icons

In Jupyter, you write code and descriptions (Markdown) in boxes called **Cells** and execute them in sequence. The functions of the various icons are as follows.

Setting2.png

Try to enter Python code

Enter the following Python code into Cells. The code simply prints the current time once per second, 10 times.

import datetime, time

def main():
    # Print the current time once per second, 10 times
    for count in range(0, 10):
        print_current_time()
        time.sleep(1)

def print_current_time():
    # Format: year/month/day hour:minute:second
    print(datetime.datetime.now().strftime('%Y/%m/%d %H:%M:%S'))

if __name__ == '__main__':
    main()

Enter the above Python code into four Code Cells as shown below.

Setting3.png

Try to enter a comment

Add a Markdown Cell as shown below and enter a comment in Markdown. If you execute Cell > Run All, the Cells are processed in order from the top and the Markdown is shown in its rendered state.

Setting5.png

Try to create a 2D Chart

It is common to use matplotlib to create a 2D Chart. The Chart is created and displayed by executing the following code on Jupyter. %matplotlib inline is an idiom required to display matplotlib's output Chart on Jupyter; it is enough to declare and execute it once somewhere in the Notebook.

%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randint(0, 100, 10000)
plt.hist(x, bins=20)
plt.show()

np.random.randint(0, 100, 10000) creates 10,000 random integers in the range 0-99, and the distribution of those random numbers is displayed as a histogram with 20 bins.
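The same binning can be inspected as plain numbers with np.histogram, which matplotlib's hist uses to bin the data. The sketch below fixes a random seed so that repeated runs give the same values; the seed is my addition and is not part of the original code.

```python
import numpy as np

np.random.seed(0)  # fixed seed for repeatability (an assumption, not in the original)
x = np.random.randint(0, 100, 10000)  # 10,000 integers in the range 0-99

# counts holds the number of samples per bin; edges holds the 21 bin boundaries
counts, edges = np.histogram(x, bins=20)

for left, right, n in zip(edges[:-1], edges[1:], counts):
    print("{:6.2f} - {:6.2f}: {}".format(left, right, n))
```

Because the numbers are drawn uniformly, every bin should hold roughly the same count, which is what the flat shape of the histogram shows.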

matplotlib.png

matplotlib is a 2D Chart library that has been around for a long time, and some people find its parameter settings a little complicated. There is also a library called seaborn that can create Charts with a modern design in a small amount of code (refer to here). However, be aware that some chart types are supported by matplotlib but not by seaborn.

Try to create a 3D Chart

A 3D Chart can be created using a library called plotly. It is useful when you want to check a 3D data distribution, for example in machine learning, and it supports Scatter, Surface, and Mesh charts in 3D. If you execute the following code on Jupyter, a 3D Scatter Chart is displayed. You can change the viewpoint by dragging and zoom by pinching in and out.

np.random.multivariate_normal([0,0,0], [[0.1, 0, 0], [0, 1, 0], [0, 0, 2]], 1000).T creates 1000 random points from a three-dimensional normal distribution with mean [0,0,0] and variances [0.1, 1, 2] along the three axes. The trailing .T transposes the result so that it can be unpacked into x, y, and z.

import numpy as np
import plotly.graph_objs as go
import plotly.offline

# Enable plotly's offline mode so that charts render inside the notebook
plotly.offline.init_notebook_mode()

# Cluster 1: mean [3,3,3], variance 0.5 on every axis
x1, y1, z1 = np.random.multivariate_normal([3,3,3], [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5]], 1000).T
trace1 = go.Scatter3d(x=x1, y=y1, z=z1, mode='markers', marker=dict(size=1, line=dict(color='blue')))

# Cluster 2: mean [0,0,0], variances [0.1, 1, 2]
x2, y2, z2 = np.random.multivariate_normal([0,0,0], [[0.1, 0, 0], [0, 1, 0], [0, 0, 2]], 1000).T
trace2 = go.Scatter3d(x=x2, y=y2, z=z2, mode='markers', marker=dict(size=1, line=dict(color='red')))

fig = go.Figure(data=[trace1, trace2])
plotly.offline.iplot(fig, show_link=False)
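To see that the generated points really follow the stated mean and variances, the sample statistics can be checked with numpy alone. This is a small sketch using the same parameters as the second cluster; the fixed seed is my addition for repeatability.

```python
import numpy as np

np.random.seed(0)  # fixed seed for repeatability (an assumption, not in the original)
mean = [0, 0, 0]
cov = [[0.1, 0, 0], [0, 1, 0], [0, 0, 2]]

# Without .T the result has shape (1000, 3): one row per point
samples = np.random.multivariate_normal(mean, cov, 1000)

# With .T the shape becomes (3, 1000), which unpacks into one array per axis
x, y, z = samples.T

print("sample mean:    ", samples.mean(axis=0))
print("sample variance:", samples.var(axis=0))
```

With 1000 points the printed values will be close to, but not exactly, the mean and variances that were requested.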

Try to enter a formula

There are several ways to express formulas on Jupyter.

Here, I describe how to use MathJax in Markdown.

$$r=\frac{1}{f}$$
$$\left(x + y\right)^{5}$$

Enter the above in a **Markdown Cell** and execute the Cell. If it is rendered as follows, it worked.

mathinput.jpg
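As one more formula to try in a Markdown Cell, the second expression above can be written out in full with the binomial theorem. This is only an additional sample, assuming the \binom command is available in the MathJax setup that Jupyter ships with.

```latex
$$(x + y)^{5} = \sum_{k=0}^{5} \binom{5}{k} x^{5-k} y^{k}$$
$$= x^{5} + 5x^{4}y + 10x^{3}y^{2} + 10x^{2}y^{3} + 5xy^{4} + y^{5}$$
```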

Sample formulas using MathJax on Jupyter can be found in this article and this article (//www.suluclac.com/Wiki+MathJax+Syntax). MathJax syntax is also summarized in this article.

Try saving notebook

You can name the created notebook with File > Rename and save it in *.ipynb format with File > Download as.
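A downloaded *.ipynb file is plain JSON, so it can be inspected with the standard library. The sketch below writes a tiny notebook in the nbformat 4 layout to a temporary file and reads it back; the file name sample.ipynb and the helper list_cells are hypothetical, and a real notebook will carry more metadata than this.

```python
import json
import os
import tempfile

# A minimal notebook in nbformat 4 layout, created only to keep the example self-contained
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('hello')"]},
    ],
}

path = os.path.join(tempfile.mkdtemp(), "sample.ipynb")  # hypothetical file name
with open(path, "w") as f:
    json.dump(notebook, f)


def list_cells(ipynb_path):
    """Return (cell_type, source_text) for every cell in a notebook file."""
    with open(ipynb_path) as f:
        nb = json.load(f)
    # "source" may be a single string or a list of lines; join handles both
    return [(cell["cell_type"], "".join(cell["source"])) for cell in nb["cells"]]


for cell_type, source in list_cells(path):
    print(cell_type, "|", source)
```

The same function can be pointed at a notebook you downloaded to get a quick overview of its Cells without opening Jupyter.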

Other Jupyter functions

Run shell script on Jupyter

While using Jupyter, there are occasionally cases where you want to execute a shell command, for example to add a Python library or to fetch a file from another server with wget.

You can log in to the EC2 instance with ssh and run the script there, but you can also run a shell script directly on Jupyter with either of the following methods. The script is executed with the privileges of the user who started Jupyter.

**Use commands**

The commands module from the Python 2 standard library runs a shell command from Python.

import commands
commands.getoutput("date")
commands.getoutput("curl yahoo.co.jp")
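The commands module exists only in Python 2. If your notebook uses a Python 3 kernel, the subprocess module offers the same kind of call; the following is a minimal sketch that assumes a Linux environment where the date and echo commands exist.

```python
import subprocess

# subprocess.getoutput is the Python 3 counterpart of commands.getoutput
print(subprocess.getoutput("date"))

# subprocess.run also exposes the exit status of the command
result = subprocess.run(
    ["echo", "hello"],
    stdout=subprocess.PIPE,
    universal_newlines=True,  # return text instead of bytes
)
print(result.returncode, result.stdout.strip())
```

Passing the command as a list, as in the second call, avoids going through a shell and sidesteps quoting problems.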

**Use Jupyter's !**

As a Jupyter-specific function, anything you write after ! is executed as a shell command.

!date
!curl yahoo.co.jp

It is also possible to run a shell script with sudo as shown below, but Jupyter then stays busy after running it, so you need to recover with Interrupt or a similar operation.

!sudo su
!find / -name 'hoge.txt'
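The Cell probably stays busy because sudo su opens an interactive shell that waits for input which never arrives. One way to avoid a stuck Cell is to run the command from Python with no interactive input and a time limit. This is a sketch under that assumption; the helper name run_with_timeout is hypothetical.

```python
import shlex
import subprocess


def run_with_timeout(command, timeout=10):
    """Run a command without a shell and return its output, or None if it times out."""
    try:
        completed = subprocess.run(
            shlex.split(command),       # split the string into an argument list
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # merge error messages into the output
            stdin=subprocess.DEVNULL,   # no interactive input is available
            universal_newlines=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


print(run_with_timeout("echo finished", timeout=5))
```

Note also that each ! line runs in its own shell, so a sudo su on one line does not carry over to the next line; putting sudo directly in front of the command that needs it is likely the simpler approach.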

Jupyter notebook extensions

Jupyter / IPython extensions are being developed separately from Jupyter's own development team. This article gives a well-organized overview of the functions they provide.

To install the extensions, edit ~/.jupyter/jupyter_notebook_config.py as described in the Jupyter settings section above, and run the following two lines before starting Jupyter with jupyter notebook.

mkdir -p ~/.local/share/jupyter
sudo pip install https://github.com/ipython-contrib/IPython-notebook-extensions/archive/master.zip

To be honest, I don't really use the extensions... although Execute Time is convenient.

Jupyter Magic Commands

Jupyter / IPython has a dedicated feature called Magic Commands. If you search for "ipython magic command" you will find many of them, but the following are well known.

If you enter %whos in a Cell and execute the Cell, the Magic Command runs. To be honest, though, I don't really use Magic Commands... I do use %whos occasionally because it's convenient.
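%whos prints a table of the variables defined in the notebook. For a feel of what that involves, here is a rough approximation in plain Python. It is not how IPython implements the command, and the function name whos_like is hypothetical.

```python
def whos_like(namespace):
    """Return sorted (name, type name, short repr) rows for user-visible variables."""
    rows = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_"):
            continue  # skip private and internal names
        rows.append((name, type(value).__name__, repr(value)[:40]))
    return rows


# Example with an explicit namespace; in a notebook you could pass globals() instead
for name, type_name, text in whos_like({"count": 10, "label": "jupyter"}):
    print("{:<10} {:<10} {}".format(name, type_name, text))
```

Passing globals() in a notebook will also list imported modules and functions, so the real %whos remains the more convenient tool.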

Finally

Have a good Jupyter life!
