Table of Contents
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)
The nonlinear regression model is not so different from the linear model: the only difference is that the linear map (the weights) is applied not to the input itself but to the input after it has been transformed by basis functions.
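Written out (a standard restatement of the basis expansion model; here $\phi_j$ are the basis functions, $w$ the weights, and $\Phi$ the design matrix with entries $\Phi_{ij}=\phi_j(x_i)$):

```math
y_i = w_0 + \sum_{j=1}^{m} w_j \,\phi_j(x_i) + \varepsilon_i,
\qquad
\hat{\boldsymbol{y}} = \Phi\,(\Phi^{\top}\Phi)^{-1}\Phi^{\top}\boldsymbol{y}
```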
Commonly used basis functions:
- Polynomial functions
- Gaussian basis functions
- Spline functions / B-spline functions

Nonlinear regression can be built on one-dimensional basis functions or, in exactly the same way, on two-dimensional (multivariate) basis functions; a one-dimensional sketch follows below.
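As a concrete one-dimensional illustration, here is a minimal sketch of regression with Gaussian basis functions (the number of centers, the bandwidth, and the toy data are illustrative choices, not values from the course material):

```python
import numpy as np

# Illustrative toy data: a noisy nonlinear target on [0, 1]
np.random.seed(0)
x = np.sort(np.random.rand(50))
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(50)

# Gaussian basis functions phi_j(x) = exp(-(x - c_j)^2 / (2 s^2))
centers = np.linspace(0, 1, 10)   # positions of the basis functions
s = 0.1                           # bandwidth

def design_matrix(x):
    # one column of ones (bias) plus one column per basis function
    Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * s ** 2))
    return np.hstack([np.ones((len(x), 1)), Phi])

# The model is linear in w, so ordinary least squares on Phi gives the fit
Phi = design_matrix(x)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```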
Underfitting and overfitting
A model that cannot achieve a sufficiently small error on the training data → underfitting
(Countermeasure 1) The model's expressive power is too low, so use a model with higher expressive power.
A model that fits the training data too closely and does not generalize → overfitting; the countermeasures below suppress excess expressive power.
(Countermeasure 2) **Remove unnecessary basis functions (variables)** to suppress expressive power.
(Countermeasure 3) Use regularization to suppress expressive power.
Remove unnecessary basis functions
Model complexity changes with the number, positions, and bandwidths of the basis functions.
If you prepare too many basis functions for the problem at hand, overfitting occurs, so prepare an appropriate set of basis functions (selected by cross-validation, etc.).
Regularization method (penalization method)
Minimize a loss function to which a regularization term (penalty term), whose value grows with the complexity of the model, has been added.
Regularization term (penalty term)
Several forms are possible, and each form yields an estimator with different properties.
Regularization (smoothing) parameter
Adjusts the smoothness of the model's curve ▶ it must be chosen appropriately.
Role of the regularization term (penalty term) (see the objectives written out below):
- None ▶ least squares estimator
- L2 norm ▶ **Ridge estimator**
- L1 norm ▶ **Lasso estimator**
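Written out as objective functions (a standard formulation; $\Phi$ is the design matrix of basis-function values and $\lambda \ge 0$ the regularization parameter):

```math
\text{Ridge: } \min_{w}\ \|y - \Phi w\|_2^2 + \lambda \|w\|_2^2
\qquad
\text{Lasso: } \min_{w}\ \|y - \Phi w\|_2^2 + \lambda \|w\|_1
```

The ridge estimator has the closed form $\hat{w}_{\mathrm{ridge}} = (\Phi^{\top}\Phi + \lambda I)^{-1}\Phi^{\top} y$; the lasso has no closed form and is computed iteratively, and its L1 penalty tends to drive some coefficients exactly to zero (sparse estimation).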
Role of the regularization parameter:
- Small ▶ the constraint surface becomes large
- Large ▶ the constraint surface becomes small
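To see what the regularization parameter does in practice, here is a minimal sketch (using scikit-learn's KernelRidge; the alpha values and toy data are illustrative): a small alpha follows the data closely, while a large alpha forces a smoother curve.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.kernel_ridge import KernelRidge

np.random.seed(1)
x = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + 0.3 * np.random.randn(100)

plt.scatter(x, y, s=10, label='data')
for alpha in [1e-4, 1e-1, 10]:
    model = KernelRidge(alpha=alpha, kernel='rbf', gamma=20)
    model.fit(x, y)
    plt.plot(x, model.predict(x), label='alpha=%g' % alpha)
plt.legend()
plt.show()
```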
How to tell whether the model is underfitting or overfitting the data:
- Both training error and test error are small ▶ the model may generalize well
- Training error is small but test error is large ▶ overfitting
- Neither training error nor test error becomes small ▶ underfitting
In the case of regression, the solution is obtained explicitly (compare the values of the training error and the test error).
Holdout method
Split the finite data into two parts, one for training and one for testing, and use them to estimate the prediction accuracy and error rate.
If more data is used for training, less is left for testing, so training accuracy improves but the accuracy of the performance evaluation gets worse.
Conversely, if more data is used for testing, less is left for training, so the accuracy of the training itself deteriorates.
The method therefore does not give a good performance evaluation unless a large amount of data is available; for example, when the data are simply split in two, outliers may end up in only one of the splits and distort the result.
In the nonlinear regression model based on the basis expansion method, the number, positions, and bandwidths of the basis functions, together with the tuning (regularization) parameters, are chosen as the values that minimize the holdout error.
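As a minimal sketch of holdout-based selection (using scikit-learn's train_test_split and KernelRidge; the candidate values and toy data are illustrative), one keeps the settings whose error on the held-out data is smallest:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error

# Illustrative toy data
np.random.seed(0)
X = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.3 * np.random.randn(100)

# Holdout: one split into training data and validation data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best = None
for gamma in [1, 10, 50, 100]:        # candidate kernel widths
    for alpha in [1e-4, 1e-2, 1]:     # candidate regularization strengths
        model = KernelRidge(alpha=alpha, kernel='rbf', gamma=gamma).fit(X_tr, y_tr)
        err = mean_squared_error(y_val, model.predict(X_val))
        if best is None or err < best[0]:
            best = (err, gamma, alpha)
print('best holdout MSE=%.4f (gamma=%g, alpha=%g)' % best)
```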
Cross-validation (CV): for each iteration, the data are split into validation data and training data, and a model is trained on each split. The example that follows divides the data into 5 folds for training and evaluation. The important point is that the validation data and the training data must not overlap.
Even if the accuracy is 70% under holdout validation and 65% under cross-validation, the CV result is the one used as the estimate of generalization performance, because cross-validation gives a more reliable estimate of generalization performance than holdout validation.
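A minimal 5-fold cross-validation sketch (using scikit-learn's KFold and cross_val_score; the model and data are illustrative). Each fold serves as validation data exactly once, and the fold scores are averaged to estimate generalization performance:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.kernel_ridge import KernelRidge

np.random.seed(0)
X = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.3 * np.random.randn(100)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # 5 non-overlapping folds
scores = cross_val_score(KernelRidge(alpha=0.01, kernel='rbf', gamma=20),
                         X, y, cv=cv, scoring='neg_mean_squared_error')
print('MSE per fold:', -scores)
print('mean CV MSE:', -scores.mean())
```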
Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# seaborn settings
sns.set()
# background style
sns.set_style("darkgrid", {'grid.linestyle': '--'})
# context (scale) setting
sns.set_context("paper")
n = 100

def true_func(x):
    z = 1 - 48*x + 218*x**2 - 315*x**3 + 145*x**4
    return z

def linear_func(x):
    z = x
    return z
# Generate noisy data from the true function
data = np.random.rand(n).astype(np.float32)
data = np.sort(data)
target = true_func(data)

# Add noise
noise = 0.5 * np.random.randn(n)
target = target + noise
# Draw the noisy data
plt.scatter(data, target, label='data')
plt.title('NonLinear Regression')
plt.legend(loc=2)
from sklearn.linear_model import LinearRegression
clf = LinearRegression()
data = data.reshape(-1,1)
target = target.reshape(-1,1)
clf.fit(data, target)
p_lin = clf.predict(data)
plt.scatter(data, target, label='data')
plt.plot(data, p_lin, color='darkorange', marker='', linestyle='-', linewidth=1, markersize=6, label='linear regression')
plt.legend()
print(clf.score(data, target))
from sklearn.kernel_ridge import KernelRidge
clf = KernelRidge(alpha=0.0002, kernel='rbf')
clf.fit(data, target)
p_kridge = clf.predict(data)
plt.scatter(data, target, color='blue', label='data')
plt.plot(data, p_kridge, color='orange', linestyle='-', linewidth=3, markersize=6, label='kernel ridge')
plt.legend()
#plt.plot(data, p, color='orange', marker='o', linestyle='-', linewidth=1, markersize=6)
#Ridge
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.linear_model import Ridge
kx = rbf_kernel(X=data, Y=data, gamma=50)
#KX = rbf_kernel(X, x)
#clf = LinearRegression()
clf = Ridge(alpha=30)
clf.fit(kx, target)
p_ridge = clf.predict(kx)
plt.scatter(data, target,label='data')
for i in range(len(kx)):
    plt.plot(data, kx[i], color='black', linestyle='-', linewidth=1, markersize=3, label='rbf', alpha=0.2)
#plt.plot(data, p, color='green', marker='o', linestyle='-', linewidth=0.1, markersize=3)
plt.plot(data, p_ridge, color='green', linestyle='-', linewidth=1, markersize=3,label='ridge regression')
#plt.legend()
print(clf.score(kx, target))
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
#PolynomialFeatures(degree=1)
deg = [1,2,3,4,5,6,7,8,9,10]
for d in deg:
    regr = Pipeline([
        ('poly', PolynomialFeatures(degree=d)),
        ('linear', LinearRegression())
    ])
    regr.fit(data, target)
    # make predictions
    p_poly = regr.predict(data)
    # plot regression result
    plt.scatter(data, target, label='data')
    plt.plot(data, p_poly, label='polynomial of degree %d' % d)
#Lasso
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.linear_model import Lasso
kx = rbf_kernel(X=data, Y=data, gamma=5)
#KX = rbf_kernel(X, x)
#lasso_clf = LinearRegression()
# With such a large alpha the L1 penalty dominates and every coefficient is shrunk to exactly zero (see coef_ printed at the end)
lasso_clf = Lasso(alpha=10000, max_iter=1000)
lasso_clf.fit(kx, target)
p_lasso = lasso_clf.predict(kx)
plt.scatter(data, target)
#plt.plot(data, p, color='green', marker='o', linestyle='-', linewidth=0.1, markersize=3)
plt.plot(data, p_lasso, color='green', linestyle='-', linewidth=3, markersize=3)
print(lasso_clf.score(kx, target))
from sklearn import model_selection, preprocessing, linear_model, svm
# SVR-rbf
clf_svr = svm.SVR(kernel='rbf', C=1e3, gamma=0.1, epsilon=0.1)
clf_svr.fit(data, target)
y_rbf = clf_svr.fit(data, target).predict(data)
# plot
plt.scatter(data, target, color='darkorange', label='data')
plt.plot(data, y_rbf, color='red', label='Support Vector Regression (RBF)')
plt.legend()
plt.show()
result
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py:724: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py:724: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.1, random_state=0)
from keras.callbacks import EarlyStopping, TensorBoard, ModelCheckpoint
cb_cp = ModelCheckpoint('/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.{epoch:02d}-{val_loss:.2f}.hdf5', verbose=1, save_weights_only=True)
cb_tf = TensorBoard(log_dir='/content/drive/My Drive/study_ai_ml/skl_ml/out/tensorBoard', histogram_freq=0)
def relu_reg_model():
    model = Sequential()
    model.add(Dense(10, input_dim=1, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='linear'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout, BatchNormalization
from keras.wrappers.scikit_learn import KerasRegressor
# use data split and fit to run the model
estimator = KerasRegressor(build_fn=relu_reg_model, epochs=100, batch_size=5, verbose=1)
history = estimator.fit(x_train, y_train, callbacks=[cb_cp, cb_tf], validation_data=(x_test, y_test))
result
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1033: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1020: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3005: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
Train on 90 samples, validate on 10 samples
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:216: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:223: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1122: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1125: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
Epoch 1/100
90/90 [==============================] - 2s 17ms/step - loss: 1.7399 - val_loss: 0.4522
Epoch 00001: saving model to /content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.01-0.45.hdf5
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-19-26d4341f0e70> in <module>()
6 estimator = KerasRegressor(build_fn=relu_reg_model, epochs=100, batch_size=5, verbose=1)
7
----> 8 history = estimator.fit(x_train, y_train, callbacks=[cb_cp, cb_tf], validation_data=(x_test, y_test))
8 frames
/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
146 fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
147 elif mode == 'w':
--> 148 fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
149 elif mode == 'a':
150 # Open in append mode (read/write).
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5f.pyx in h5py.h5f.create()
OSError: Unable to create file (unable to open file: name = '/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.01-0.45.hdf5', errno = 2, error message = 'No such file or directory', flags = 13, o_flags = 242)
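The OSError above occurs because the checkpoint output directory does not yet exist on the mounted Drive (errno = 2, 'No such file or directory'). A minimal fix, assuming the same path as in the ModelCheckpoint callback, is to create the directory before calling fit:

```python
import os
os.makedirs('/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints', exist_ok=True)
```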
y_pred = estimator.predict(x_train)
result
90/90 [==============================] - 0s 1ms/step
plt.title('NonLinear Regression via DL by ReLU')
plt.plot(data, target, 'o')
plt.plot(data, true_func(data), '.')
plt.plot(x_train, y_pred, "o", label='predicted: deep learning')
#plt.legend(loc=2)
print(lasso_clf.coef_)
result
[-0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0.
-0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0.
-0. -0. -0. -0. -0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Related Sites
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)