This article is Day 7 of the **DMM WEBCAMP Advent Calendar**. It is a super-introductory piece, written for **people who have never done machine learning but want to get started**.
**The goal is to get a rough idea of what machine learning is and to implement a basic simple perceptron**, so the explanation of machine learning itself is kept very rough (writing each topic out properly would make this article insanely long). If you want more detail, please look it up yourself.
The environment is built on Windows 10. This is my first Qiita post, so if anything is hard to follow or simply wrong, I would appreciate it if you could point it out in the comments.
Overview
- A rough explanation of machine learning
- Python installation
- Miniconda installation
- Implementation of a simple perceptron (neural network)
Suppose someone walking toward you is taking an animal for a walk. You judge mainly by sight what the animal is (a dog? a cat?) and what breed it is. Now suppose you hear a sound like "tick, tick, tick...". You judge mainly by ear whether it is footsteps, applause, or the hands of a clock. In this way, human beings judge everything in their daily lives based on their experience.
Machine learning exists to let a computer do the same. By having the computer repeat learning until it can make judgments with high accuracy, it becomes possible to predict results for unknown data.
Machine learning methods can be broadly divided into two types: **unsupervised** learning and **supervised** learning.
The biggest difference from supervised learning is that it learns without correct-answer (label) data. If there are 1,000 data points, all 1,000 are treated as training data. It is mainly used for clustering, that is, grouping data by common characteristics. The best-known method is **K-Means**. I won't touch on unsupervised learning any further, but if you are interested, please look into it.
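As an aside, a minimal K-Means sketch looks something like this (it assumes scikit-learn, which is not part of the environment built later in this article):

```python
import numpy as np
from sklearn.cluster import KMeans  # assumed to be installed separately

# Unlabeled 2-D points forming two loose groups (made-up data)
points = np.array([[0.1, 0.2], [0.2, 0.1], [0.15, 0.3],
                   [0.9, 0.8], [0.8, 0.9], [0.95, 0.85]])

# Group the points into 2 clusters without any correct-answer labels
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(points)
print(labels)                    # e.g. [0 0 0 1 1 1] (cluster ids may be swapped)
print(kmeans.cluster_centers_)   # the center of each cluster
```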
Unlike unsupervised learning, this is a method that learns from correct-answer data. Again assuming 1,000 data points, you might, for example, split them into 700 training data and 300 test (correct-answer) data and repeat learning on the training portion. Within supervised learning, this article touches on neural networks and deep learning.
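As an illustration of such a split (a sketch with made-up data, using only numpy):

```python
import numpy as np

# 1,000 made-up samples with 2 features each and a 0/1 label
rng = np.random.default_rng(0)
features = rng.random((1000, 2))
labels = (features.sum(axis=1) > 1.0).astype(int)

# Shuffle, then split into 700 training samples and 300 test samples
indices = rng.permutation(1000)
train_idx, test_idx = indices[:700], indices[700:]
x_train, y_train = features[train_idx], labels[train_idx]
x_test, y_test = features[test_idx], labels[test_idx]
print(x_train.shape, x_test.shape)   # (700, 2) (300, 2)
```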
This is a learning method modeled on the nerve cells of the human brain. Many nerve cells are connected together; each receives input through synapses from neighboring cells and passes information on to the next cell. A neural network is likewise divided into an input layer and an output layer, and when the weighted sum of the inputs exceeds a certain value, a value is produced at the output layer. This time we will write a program that implements such a neural network.
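The core idea in code (a plain-Python sketch with made-up inputs and weights):

```python
# One artificial neuron: output 1 when the weighted sum of the inputs plus a bias exceeds 0
inputs = [1.0, 0.0]       # values arriving from the previous layer (made up)
weights = [0.75, 0.5]     # one weight per input (made up)
bias = -0.5               # threshold term

weighted_sum = sum(n * w for n, w in zip(inputs, weights)) + bias
output = 1 if weighted_sum > 0 else 0
print(weighted_sum, output)   # 0.25 1
```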
Roughly speaking, deep learning is a neural network made more complex by stacking many layers. Last year's **DMM WEBCAMP Advent Calendar** article "For super beginners. Differences between deep learning and machine learning" compared deep learning with machine learning, so please refer to it as well.
That was a long and rough explanation, but it is all the theory we need. Did you get a general sense of what machine learning is? Now, let's build an environment for machine learning.
The first thing to do is, of course, install Python.
Before that, make sure Windows is up to date. The latest version as of December 2019 is **Version 1909**; if you are not on it, updating is recommended.
Go to https://www.python.org and, in the middle of the page, click **Download Latest: Python (version)**. The latest version of Python as of December 2019 is **Python 3.8.0**.
On the page you land on, click **Windows x86-64 executable installer** under **Files** to download the Python installer.
Launch the downloaded installer and check **Add Python (version) to PATH** before starting the installation. This lets you launch Python from the command prompt. If you don't need that, start the installation without checking it.
If **Disable path length limit** appears on the screen telling you the installation is complete, you can select it to remove the limit on file path length. (There is no screenshot here because the limit had already been removed when Python was installed on this machine earlier.)
The Python installation is now complete. You can type **IDLE** into the Windows Start menu, launch the program that appears, and enter code there, so it is easy to write Python programs even without a text editor such as Sublime Text 3 or Visual Studio Code.
If you have never written a Python program, it's a good idea to play with it a bit before implementing the simple perceptron. If you normally write C or Java, Python lets you express what you want with very little code, so it should be easy to pick up.
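For example, you can type something like the following directly into IDLE to get a feel for the syntax (just an illustrative snippet):

```python
# A few lines to try in IDLE: no type declarations, no semicolons, blocks by indentation
numbers = [1, 2, 3, 4, 5]
squares = [n * n for n in numbers]   # list comprehension
for n, s in zip(numbers, squares):
    print(f"{n} squared is {s}")
```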
Miniconda is a distribution built around the open-source package and environment manager **Conda**. There is also the similar **Anaconda**; the two are basically the same, but installing Anaconda requires a fairly large download, and Miniconda is the version that provides only the minimum necessary functionality. In other words, **Miniconda ⊆ Anaconda**.
Using Python together with Miniconda makes data analysis more convenient.
Go to Miniconda — Conda documentation (https://docs.conda.io/en/latest/miniconda.html) and download the Miniconda installer: under **Windows installers**, select **Miniconda3 Windows 64-bit** for the latest Python version. There is no Python 3.8 Miniconda at the moment, but the Python 3.7 Miniconda is perfectly fine.
Launch the downloaded installer, click Next -> I Agree, and make sure the **Just Me (recommended)** radio button is selected before proceeding.
Changing the **Destination Folder** shown on the next screen can cause trouble later, so it is recommended to proceed to the next screen without changing it.
On the **Advanced Installation Options** screen, uncheck **Register Anaconda as my default Python 3.7** (check it only if you want to fix the system's default Python to 3.7), then start the installation with **Install**.
On the screen that appears when the installation finishes, uncheck both checkboxes before closing the installer.
Launch Anaconda Prompt from the Windows Start menu. We installed Miniconda, but as mentioned earlier, Miniconda is simply Anaconda reduced to its minimum functionality, so the prompt is still labeled Anaconda. Enter the following command to continue building the environment.
conda update conda -y
AnacondaPrompt
>conda update conda -y
Collecting package metadata (current_repodata.json): done
Solving environment: done
# All requested packages already installed.
If the output looks like this, there is no problem.
As of December 2019 there is no Miniconda that supports Python 3.8, so the Python used by Anaconda Prompt is version 3.7; this causes almost no problems.
You can check this with `conda list`.
AnacondaPrompt
>conda list
# packages in environment at C:\Users\User name\Miniconda3:
#
# Name Version Build Channel
asn1crypto 1.2.0 py37_0
ca-certificates 2019.11.27 0
・ ・ ・
python 3.7.4 h5263a28_0
The environment name can be anything; this time we will use **NN_sample**. To avoid problems caused by the version difference with the Python 3.8 installed earlier, we specify Python 3.7.4 when creating the virtual environment. When **Proceed ([y]/n)?** appears along the way, type **y** and press Enter.
conda create -n NN_sample python=3.7.4
AnacondaPrompt
>conda create -n NN_sample python=3.7.4
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: C:\Users\User name\Miniconda3\envs\NN_sample
Proceed ([y]/n)? y
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate NN_sample
#
# To deactivate an active environment, use
#
# $ conda deactivate
Start the virtual environment. It can be activated with `activate <environment name>`. If it starts successfully, **(base)** changes to **(<environment name>)**.
AnacondaPrompt
(base) C:\Users\User name>activate NN_sample
(NN_sample) C:\Users\User name>
Just as you would write `#include <~.h>` in C or `import java.io.~` in Java, we now add the necessary libraries to the virtual environment so that they can be imported. Please proceed with the virtual environment activated.
The following five are installed this time:
- jupyterlab
- numpy
- matplotlib
- tensorflow
- keras
**jupyter lab** is not a library; unlike the others installed this time, it is a web-based tool, like an editor, that lets you write code while checking your data. **jupyter notebook** used to be the mainstream, but nowadays jupyter lab tends to be used instead. To install jupyter lab, enter the following command.
AnacondaPrompt
>conda install -c conda-forge jupyterlab
Next, install the required libraries. I won't describe each one, but they are the minimum needed to implement a simple perceptron.
AnacondaPrompt
>conda install numpy matplotlib tensorflow keras
There are many other libraries; here I have only installed a few that will be used for almost any machine learning work in the future.
That is all for building the environment, including the virtual environment.
We finally made it here. We will now write a program that implements a simple perceptron in jupyter lab. Before that, let's take a quick look at what a simple perceptron is.
As mentioned earlier, a simple perceptron is a kind of neural network: it has an input layer and an output layer, and information is passed to the output layer when the weighted sum of the inputs exceeds a certain value. A simple perceptron has **multiple inputs and a single output**. Writing the inputs as N₁, N₂, ..., Nx, their weights as W₁, W₂, ..., Wx, and the bias (the weight not tied to any input) as θ, the perceptron outputs 1 when the sum of each input times its weight, plus θ, is greater than 0, and outputs 0 otherwise. Drawn as a model it looks like the figure below; written as a formula, it is
N₁ × W₁ + N₂ × W₂ + ... + Nx × Wx + θ > 0
For two inputs, the boundary between output 1 and output 0 is where the left-hand side equals 0. Rearranging and solving for N₂ gives
N₁ × W₁ + N₂ × W₂ + θ = 0 \\
N₂ = (- \frac{W₁}{W₂}) × N₁ + (- \frac{θ}{W₂})
which, if you rename the variables, is simply
y = a × x + b
a straight line. In other words, a simple perceptron can draw a straight line that splits the inputs into two classes according to the output value, as shown below.
That is a rough explanation of the simple perceptron; did it more or less make sense? This time we will write a program that separates the two-input **AND function** into two classes. The AND function outputs 1 only when both inputs are 1, and 0 otherwise, so we expect a division like the one below.
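For example (one possible choice of weights for illustration, not the values the training below will find), W₁ = W₂ = 1 and θ = -1.5 realize the AND function:
1 × N₁ + 1 × N₂ - 1.5 > 0
holds only for (N₁, N₂) = (1, 1), and the boundary line N₂ = -N₁ + 1.5 passes between (1, 1) and the other three input points.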
Use the **cd command** to move to your working folder and then start jupyter lab. This time we work under Documents/sample.
AnacondaPrompt
(base) C:\Users\User name>cd Documents
#Create sample folder
(base) C:\Users\User name\Documents>mkdir sample
(base) C:\Users\User name\Documents>cd sample
(base) C:\Users\User name\Documents\sample>activate NN_sample
(NN_sample) C:\Users\User name\Documents\sample>
To start jupyter lab, just type `jupyter lab`; this starts jupyter lab and launches your web browser. If the browser does not open, run `jupyter lab`, copy the URL printed at the end of the output, and open it in a browser.
AnacondaPrompt
(NN_sample) C:\Users\User name\Documents\Folder name>jupyter lab
[I 20:43:50.910 LabApp] JupyterLab extension loaded from C:\Users\User name\Miniconda3\envs\Environment name\lib\site-packages\jupyterlab
・ ・ ・
Or copy and paste one of these URLs:
http://localhost:8888/?token=43ed3a669bd4da587fa6febf75f3e38b0f7de64916e96648
or http://127.0.0.1:8888/?token=43ed3a669bd4da587fa6febf75f3e38b0f7de64916e96648
You can create a notebook however you like, but here select *File -> New -> Notebook* from the top left of jupyter lab to create a new one. The **Select Kernel** dialog that appears can be left at **Python 3**.
In the created notebook the code is divided into cells, and each cell can be set to Code, Markdown, or Raw. If you split the cells sensibly, you can run only the part of the program you need. Each cell is executed with Shift + Enter.
First, import the required libraries. Write them as `import <library name>` or `import <library name> as <name used in the program>`.
sample.ipynb
import numpy as np
import matplotlib.pyplot as plt
import os
import csv
from keras.models import Sequential
from keras.layers import Dense, Activation
The variables used in the code are defined here. This time the code is short, but in machine learning you often change things such as the number of training runs or the input file, so defining them in one place means that editing this single cell applies the change throughout the program.
sample.ipynb
CSVFILE = 'data.csv'
GRIDFILE = 'grid.csv'
header = ['x', 'y', 'class']
body = [
[0, 0, 0],
[1, 0, 0],
[0, 1, 0],
[1, 1, 1]
]
Next we write the AND function, which will later serve as the training data, to a csv file. We also create a csv file holding the grid of points needed to draw the graph after training. This part is not essential, so it is fine to skim it. Since the csv files only need to be created, you basically run this cell only once.
sample.ipynb
# Delete the csv files if they already exist
if os.path.exists(CSVFILE):
os.unlink(CSVFILE)
if os.path.exists(GRIDFILE):
os.unlink(GRIDFILE)
# Write the training data to CSVFILE
# with open(<file name>, <mode> ('w' = write)) as <variable>:
with open(CSVFILE, 'w') as v:
writer = csv.writer(v)
writer.writerow(header)
writer.writerows(body)
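# Generate a grid of points spaced 0.05 apart covering roughly [0, 1] x [0, 1];
# these points are only used later to visualize which side of the boundary they fall on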
with open(GRIDFILE, 'w') as v:
x = float(0)
y = float(0)
writer = csv.writer(v)
writer.writerow([x, y])
pre_x = x
pre_y = y
for _ in range(0, 20):
pre_y += 0.05
writer.writerow([round(pre_x, 4), round(pre_y, 4)])
for _ in range(0, 20):
pre_y = y
pre_x += 0.05
for _ in range(0, 21):
writer.writerow([round(pre_x, 4), round(pre_y, 4)])
pre_y += 0.05
Next, extract the data needed for training from the csv file, using Python's **slicing**. Besides slicing, it is also possible to load the data as a **DataFrame** using **pandas**, but this time we need an ndarray, so we extract by slicing. Slicing has many more features worth knowing, so please look them up.
sample.ipynb
# Skip the first (header) line of CSVFILE, split on ',', and store the result in the variable data as an ndarray
data = np.loadtxt(CSVFILE, delimiter=',', skiprows=1)
#Extract data other than the last column
ip_train = data[:, :-1]
# [[0. 0.]
# [1. 0.]
# [0. 1.]
# [1. 1.]]
#Extract the last column of data
class_train = data[:, -1]
# [0. 0. 0. 1.]
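For reference, the pandas version mentioned above might look like this (a sketch only; pandas is not installed in the environment built in this article):

```python
import pandas as pd  # would need to be installed separately, e.g. conda install pandas

df = pd.read_csv(CSVFILE)          # the header row becomes the column names
ip_train = df[['x', 'y']].values   # .values converts the DataFrame to an ndarray
class_train = df['class'].values
```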
If a Warning appears here, there is no real problem, but try running the cell again.
Next, create the model (constructor) used for training. Layers can then be added with `<variable holding Sequential()>.add(...)`. Of course, you can also pass the layers all at once, as in `Sequential([Dense(...), Activation(...)])`.
As mentioned earlier, we add a layer to this constructor. This time the activation function is sigmoid, the number of neurons is 1, and the number of inputs is 2. The activation function determines how N₁ × W₁ + N₂ × W₂ + ... + Nx × Wx + θ is passed on to the output layer; other choices include **softmax** and **relu**.
`compile` sets the details of training: the loss function to minimize is **binary_crossentropy**, the evaluation metric (which measures performance but is not used for training) is **accuracy**, and the optimizer is **sgd**.
Training itself is done with `fit`, to which we pass the training data, the number of epochs, and the batch size.
sample.ipynb
#Neural network constructor generation
model = Sequential()
model.add(Dense(1, input_dim=2, activation='sigmoid')) # Dense(Number of neurons in the layer,Number of input dimensions,Activation function)
# Training settings: .compile(loss function, optimizer, evaluation metric)
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
#Execution of learning
fg = model.fit(ip_train, class_train, epochs=1000, batch_size=1)
# Train on 4 samples
# Epoch 1/1000
# 4/4 [==============================] - 1s 137ms/sample - loss: 0.7022 - acc: 0.7500
# Epoch 2/1000
# 4/4 [==============================] - 0s 5ms/sample - loss: 0.7004 - acc: 0.7500
# Epoch 3/1000
# 4/4 [==============================] - 0s 5ms/sample - loss: 0.6987 - acc: 0.7500
#・ ・ ・
# Epoch 999/1000
# 4/4 [==============================] - 0s 4ms/sample - loss: 0.2649 - acc: 1.0000
# Epoch 1000/1000
# 4/4 [==============================] - 0s 3ms/sample - loss: 0.2647 - acc: 1.0000
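For reference, the list-style construction mentioned above would look like this (an equivalent sketch, not the cell actually used here):

```python
from keras.models import Sequential
from keras.layers import Dense, Activation

# The same single-neuron model, with the layers passed as a list and the
# sigmoid split out into its own Activation layer
model_alt = Sequential([
    Dense(1, input_dim=2),
    Activation('sigmoid'),
])
model_alt.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
```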
Now that training is finished, we use `predict()` to see which side of the boundary each grid point falls on, and matplotlib to display the result. Here too, slicing is used to extract the data to plot. There is nothing in the code that needs a detailed explanation, so please just read through it.
sample.ipynb
t1 = ip_train[ class_train==1 ]
t0 = ip_train[ class_train==0 ]
#An array of x-coordinates of points with a teacher signal of 1
t1_x = t1[:, 0]
#An array of y-coordinates of points with a teacher signal of 1
t1_y = t1[:, 1]
t0_x = t0[:, 0]
t0_y = t0[:, 1]
# Read GRIDFILE, separated by ','
g = np.loadtxt(GRIDFILE, delimiter=',')
#Predict boundaries from learning results
pred_g = model.predict(g)[:, 0]
#An array of points with a predicted value of 1
g1 = g[ pred_g >= 0.5 ]
#An array of points with a predicted value of 0
g0 = g[ pred_g < 0.5 ]
#An array of x-coordinates of points with a predicted value of 1
g1_x = g1[:, 0]
#An array of y-coordinates of points with a predicted value of 1
g1_y = g1[:, 1]
g0_x = g0[:, 0]
g0_y = g0[:, 1]
plt.scatter(t1_x, t1_y, marker='o', facecolor='black', s=100)
plt.scatter(t0_x, t0_y, marker='o', facecolor='white', edgecolor='black', s=100)
plt.scatter(g1_x, g1_y, marker='o', facecolor='black', s=20)
plt.scatter(g0_x, g0_y, marker='o', facecolor='white', edgecolor='black', s=20)
plt.show()
When you run it, the boundary line itself is not drawn, but you should get the classification we expected: the black circles are the points judged as output 1, and the white circles the points judged as output 0. That is the end of the implementation.
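If you also want to draw the boundary line itself, something like the following should work (a sketch; it reads the learned weights W₁, W₂ and bias θ back out of the model with `get_weights()` and plots the line where the weighted sum is 0):

```python
# get_weights() returns [kernel, bias]; kernel has shape (2, 1), bias has shape (1,)
weights, bias = model.get_weights()
w1, w2 = weights[:, 0]
theta = bias[0]

# Boundary: w1*x + w2*y + theta = 0  ->  y = -(w1/w2)*x - theta/w2
xs = np.linspace(0, 1, 100)
ys = -(w1 / w2) * xs - theta / w2

plt.scatter(t1_x, t1_y, marker='o', facecolor='black', s=100)
plt.scatter(t0_x, t0_y, marker='o', facecolor='white', edgecolor='black', s=100)
plt.plot(xs, ys, color='gray')
plt.xlim(0, 1)
plt.ylim(0, 1)
plt.show()
```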
We have now roughly walked through the implementation of the most basic simple perceptron. To repeat, the purpose of this article was **to get a rough idea of machine learning and to implement a basic simple perceptron**, so I hope it gives you a starting point for machine learning. As models become more complex, they can recognize what appears in images and identify audio data, so please look into that and try implementing it too. Also, in the DMM WEBCAMP Advent Calendar that I am taking part in, senior mentors and employees have written far more impressive articles; if you are interested, please take a look.
Machine learning starting from scratch (overview of machine learning)
[Differences between supervised learning and unsupervised learning - AI artificial intelligence technology](https://newtechnologylifestyle.net/%E6%95%99%E5%B8%AB%E3%81%82%E3%82%8A%E5%AD%A6%E7%BF%92%E3%81%A8%E6%95%99%E5%B8%AB%E3%81%AA%E3%81%97%E5%AD%A6%E7%BF%92%E3%81%AE%E9%81%95%E3%81%84%E3%81%AB%E3%81%A4%E3%81%84%E3%81%A6/)
Deep Learning - Three Things You Should Know
[Cluster analysis with scikit-learn (K-means method) – Data science with Python](https://pythondatascience.plavox.info/scikit-learn/%E3%82%AF%E3%83%A9%E3%82%B9%E3%82%BF%E5%88%86%E6%9E%90-k-means)
How to install Python (Windows)
Miniconda Usage Note — Reading Note v1.5dev - Prefabricated Hut
Install Miniconda on Windows (2018) - Qiita
[For beginners] Let's create a virtual environment with Anaconda - Qiita
Jupyter Lab's recommendation - Qiita
[About the simplest simple perceptron - AI artificial intelligence technology](https://newtechnologylifestyle.net/%E4%B8%80%E7%95%AA%E7%B0%A1%E5%8D%98%E3%81%AA%E5%8D%98%E7%B4%94%E3%83%91%E3%83%BC%E3%82%BB%E3%83%97%E3%83%88%E3%83%AD%E3%83%B3%E3%81%AB%E3%81%A4%E3%81%84%E3%81%A6/)
Introduction to deep learning starting with Keras / Tensorflow - Qiita
Implementation of Multilayer Perceptron - Python and Machine Learning