Visualize the effects of deep learning / regularization

1. Introduction

I started studying deep learning. This time, I will briefly summarize regularization.

2. Data creation

Based on the equation $y = 0.001(-x^3 + x^2 + x)$, we create the data as follows: x is 50 evenly spaced values from -10 to 10, and y is the result of substituting each x into the equation and then adding Gaussian noise with mean 0 and standard deviation 0.05.

from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
import numpy as np
import matplotlib.pyplot as plt

# Data generation: 50 points on the true curve plus Gaussian noise
np.random.seed(0)
X = np.linspace(-10, 10, 50)
Y_truth = 0.001 * (-X**3 + X**2 + X)
Y = Y_truth + np.random.normal(0, 0.05, len(X))

plt.figure(figsize=(5, 5))
plt.plot(X, Y_truth, color='gray')
plt.plot(X, Y, '.', color='k')
plt.show()

(Figure: the generated data)

This is the created data. The solid line is the true value (the value of the equation), and the points are the actually observed values (y, i.e. the true value plus noise).

3. Introduction of polynomial regression

Overfitting is more likely to occur with more degrees of freedom, so we deliberately introduce a degree-30 polynomial regression.

# Graph display helper: plot truth, prediction, and observations
def graph(Y_lr, name):
    plt.figure(figsize=(6, 6))
    plt.plot(X, Y_truth, color='gray', label='truth')
    plt.plot(xs, Y_lr, color='r', markersize=2, label=name)
    plt.plot(X, Y, '.', color='k')
    plt.legend()
    plt.ylim(-1, 1)
    plt.show()

# Dense grid of x values used for plotting the fitted curves
xs = np.linspace(-10, 10, 200)

# Polynomial regression: expand x into the powers x^1, ..., x^30
poly = PolynomialFeatures(degree=30, include_bias=False)
X_poly = poly.fit_transform(X[:, np.newaxis])

After defining the graph helper and the grid of x values used for display, PolynomialFeatures is instantiated and fitted. The degree is 30 (degree=30), so each x is expanded into the 30 features x^1 through x^30 (include_bias=False drops the constant term).
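
As a quick sanity check (a small sketch added here, not in the original post), the shape of the transformed matrix confirms the expansion:

# Each row is one sample; each column is one power of x (x^1 ... x^30)
print(X_poly.shape)    # (50, 30)
print(X_poly[0, :3])   # for the first sample x = -10: [-10., 100., -1000.]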

4. No regularization

First, we perform polynomial regression without regularization.

# No regularization: ordinary least squares on the polynomial features
lr0 = linear_model.LinearRegression(normalize=True)
lr0.fit(X_poly, Y)
Y_lr0 = lr0.predict(poly.fit_transform(xs[:, np.newaxis]))
graph(Y_lr0, 'No Regularization')

(Figure: fitted curve without regularization)

With the high degree of freedom of a degree-30 polynomial, the curve manages to pass through many of the points, resulting in typical overfitting. It is far from the true values, and no generalization performance can be expected from this.
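
To put a rough number on this (a sketch added here; Y_truth_xs, the true function evaluated on the plotting grid, is my own helper):

from sklearn.metrics import mean_squared_error

# True function evaluated on the same dense grid as the predictions
Y_truth_xs = 0.001 * (-xs**3 + xs**2 + xs)
print(mean_squared_error(Y_truth_xs, Y_lr0))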

5. L2 regularization

L2 regularization is a technique known from Ridge regression. It keeps the coefficients from becoming too large by adding the squared L2 norm of the parameters to the loss ($c$ is a constant): $L(w) + c\|w\|_2^2$
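
As a minimal sketch of this loss (ridge_loss and its arguments are illustrative names; sklearn's Ridge scales the penalty slightly differently internally):

# Penalized loss from the formula above: squared error plus c * ||w||_2^2
def ridge_loss(w, X_design, y, c):
    residual = y - X_design @ w                      # prediction error L(w)
    return np.sum(residual**2) + c * np.sum(w**2)    # add the squared L2 norm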

# L2 regularization: Ridge regression with penalty strength alpha
lr2 = linear_model.Ridge(normalize=True, alpha=0.5)
lr2.fit(X_poly, Y)
Y_lr2 = lr2.predict(poly.fit_transform(xs[:, np.newaxis]))
graph(Y_lr2, 'L2')

(Figure: fitted curve with L2 regularization)

It looks like the regression has succeeded nicely this time.
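
If you want to see how the penalty strength changes the fit, a sweep over alpha works (an optional sketch; these alpha values are arbitrary choices, not from the original post):

# Larger alpha means a stronger penalty and a smoother (eventually underfit) curve
for alpha in [0.01, 0.1, 0.5, 1.0]:
    m = linear_model.Ridge(normalize=True, alpha=alpha)
    m.fit(X_poly, Y)
    graph(m.predict(poly.fit_transform(xs[:, np.newaxis])), f'L2 alpha={alpha}')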

6. L1 regularization

L1 regularization is a technique known from Lasso regression. It likewise keeps the coefficients from becoming too large, this time by adding the L1 norm of the parameters to the loss ($c$ is a constant): $L(w) + c\|w\|_1$
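
Again as a minimal sketch of the formula (lasso_loss is an illustrative name; sklearn's Lasso scales the error term differently internally):

# Penalized loss from the formula above: squared error plus c * ||w||_1
def lasso_loss(w, X_design, y, c):
    residual = y - X_design @ w                          # prediction error L(w)
    return np.sum(residual**2) + c * np.sum(np.abs(w))   # add the L1 norm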

# L1 regularization: LassoLars (Lasso fitted via the LARS algorithm)
lr1 = linear_model.LassoLars(normalize=True, alpha=0.001)
lr1.fit(X_poly, Y)
Y_lr1 = lr1.predict(poly.fit_transform(xs[:, np.newaxis]))
graph(Y_lr1, 'L1')

(Figure: fitted curve with L1 regularization)

The shape is very close to a perfect fit. Here, L1 regularization seems to regress even better than L2 regularization.

7. Comparison of the polynomial coefficients

Let's compare the 30 coefficients for each of no regularization, L2 regularization, and L1 regularization (listed from the lowest degree).

import pandas as pd

# Collect the coefficients of the three models side by side, ordered by degree
result = []
for i in range(len(lr0.coef_)):
    tmp = lr0.coef_[i], lr2.coef_[i], lr1.coef_[i]
    result.append(tmp)
df = pd.DataFrame(result)
df.columns = ['No Regularization', 'L2', 'L1']
print(df)

(Figure: table of the 30 coefficients for each model)

You can see that the L2 coefficients are smaller than those with no regularization. L1 moreover gives a sparse representation in which many coefficients are exactly zero.
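
To make the sparsity concrete (a small added check, not in the original post):

# Count how many of the 30 coefficients L1 drove to exactly zero
print(np.sum(lr1.coef_ == 0), 'of', len(lr1.coef_), 'coefficients are zero')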

It is nice that L1 regularization can both suppress overfitting and reduce the number of dimensions.
