Table of Contents
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)
The nonlinear regression model is not so different from the linear model: the only difference is that the linear map (the weights) is applied not to the input itself but to the input after it has been transformed by basis functions.
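Written out (a standard restatement of the basis expansion model; here $\phi_j$ are the basis functions, $w$ the weights, and $\Phi$ the design matrix with entries $\Phi_{ij}=\phi_j(x_i)$):

```math
y_i = w_0 + \sum_{j=1}^{m} w_j \,\phi_j(x_i) + \varepsilon_i,
\qquad
\hat{\boldsymbol{y}} = \Phi\,(\Phi^{\top}\Phi)^{-1}\Phi^{\top}\boldsymbol{y}
```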
Commonly used basis functions:
- Polynomial functions
- Gaussian basis functions
- Spline functions / B-spline functions

Nonlinear regression can be built on one-dimensional basis functions or, in exactly the same way, on two-dimensional (multivariate) basis functions; a one-dimensional sketch follows below.
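As a concrete one-dimensional illustration, here is a minimal sketch of regression with Gaussian basis functions (the number of centers, the bandwidth, and the toy data are illustrative choices, not values from the course material):

```python
import numpy as np

# Illustrative toy data: a noisy nonlinear target on [0, 1]
np.random.seed(0)
x = np.sort(np.random.rand(50))
y = np.sin(2 * np.pi * x) + 0.1 * np.random.randn(50)

# Gaussian basis functions phi_j(x) = exp(-(x - c_j)^2 / (2 s^2))
centers = np.linspace(0, 1, 10)   # positions of the basis functions
s = 0.1                           # bandwidth

def design_matrix(x):
    # one column of ones (bias) plus one column per basis function
    Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * s ** 2))
    return np.hstack([np.ones((len(x), 1)), Phi])

# The model is linear in w, so ordinary least squares on Phi gives the fit
Phi = design_matrix(x)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```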
Underfitting and overfitting
A model that cannot achieve a sufficiently small error on the training data → underfitting
(Countermeasure 1) The model's expressive power is too low, so use a model with higher expressive power.
A model that fits the training data too closely and does not generalize → overfitting; the countermeasures below suppress excess expressive power.
(Countermeasure 2) **Remove unnecessary basis functions (variables)** to suppress expressive power.
(Countermeasure 3) Use regularization to suppress expressive power.
Remove unnecessary basis functions
Model complexity changes with the number, positions, and bandwidths of the basis functions.
If you prepare too many basis functions for the problem at hand, overfitting occurs, so prepare an appropriate set of basis functions (selected by cross-validation, etc.).
Regularization method (penalization method)
Minimize a loss function to which a regularization term (penalty term), whose value grows with the complexity of the model, has been added.
Regularization term (penalty term)
Several forms are possible, and each form yields an estimator with different properties.
Regularization (smoothing) parameter
Adjusts the smoothness of the model's curve ▶ it must be chosen appropriately.
Role of the regularization term (penalty term) (see the objectives written out below):
- None ▶ least squares estimator
- L2 norm ▶ **Ridge estimator**
- L1 norm ▶ **Lasso estimator**
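Written out as objective functions (a standard formulation; $\Phi$ is the design matrix of basis-function values and $\lambda \ge 0$ the regularization parameter):

```math
\text{Ridge: } \min_{w}\ \|y - \Phi w\|_2^2 + \lambda \|w\|_2^2
\qquad
\text{Lasso: } \min_{w}\ \|y - \Phi w\|_2^2 + \lambda \|w\|_1
```

The ridge estimator has the closed form $\hat{w}_{\mathrm{ridge}} = (\Phi^{\top}\Phi + \lambda I)^{-1}\Phi^{\top} y$; the lasso has no closed form and is computed iteratively, and its L1 penalty tends to drive some coefficients exactly to zero (sparse estimation).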
Role of the regularization parameter:
- Small ▶ the constraint surface becomes large
- Large ▶ the constraint surface becomes small
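To see what the regularization parameter does in practice, here is a minimal sketch (using scikit-learn's KernelRidge; the alpha values and toy data are illustrative): a small alpha follows the data closely, while a large alpha forces a smoother curve.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.kernel_ridge import KernelRidge

np.random.seed(1)
x = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + 0.3 * np.random.randn(100)

plt.scatter(x, y, s=10, label='data')
for alpha in [1e-4, 1e-1, 10]:
    model = KernelRidge(alpha=alpha, kernel='rbf', gamma=20)
    model.fit(x, y)
    plt.plot(x, model.predict(x), label='alpha=%g' % alpha)
plt.legend()
plt.show()
```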
How to tell whether the model is underfitting or overfitting the data:
- Both training error and test error are small ▶ the model may generalize well
- Training error is small but test error is large ▶ overfitting
- Neither training error nor test error becomes small ▶ underfitting
In the case of regression, the solution is obtained explicitly (compare the values of the training error and the test error).
Holdout method
Split the finite data into two parts, one for training and one for testing, and use them to estimate the prediction accuracy and error rate.
If more data is used for training, less is left for testing, so training accuracy improves but the accuracy of the performance evaluation gets worse.
Conversely, if more data is used for testing, less is left for training, so the accuracy of the training itself deteriorates.
The method therefore does not give a good performance evaluation unless a large amount of data is available; for example, when the data are simply split in two, outliers may end up in only one of the splits and distort the result.
In the nonlinear regression model based on the basis expansion method, the number, positions, and bandwidths of the basis functions, together with the tuning (regularization) parameters, are chosen as the values that minimize the holdout error.
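As a minimal sketch of holdout-based selection (using scikit-learn's train_test_split and KernelRidge; the candidate values and toy data are illustrative), one keeps the settings whose error on the held-out data is smallest:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics import mean_squared_error

# Illustrative toy data
np.random.seed(0)
X = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.3 * np.random.randn(100)

# Holdout: one split into training data and validation data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

best = None
for gamma in [1, 10, 50, 100]:        # candidate kernel widths
    for alpha in [1e-4, 1e-2, 1]:     # candidate regularization strengths
        model = KernelRidge(alpha=alpha, kernel='rbf', gamma=gamma).fit(X_tr, y_tr)
        err = mean_squared_error(y_val, model.predict(X_val))
        if best is None or err < best[0]:
            best = (err, gamma, alpha)
print('best holdout MSE=%.4f (gamma=%g, alpha=%g)' % best)
```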
Cross-validation (CV): for each iteration, the data are split into validation data and training data, and a model is trained on each split. The example that follows divides the data into 5 folds for training and evaluation. The important point is that the validation data and the training data must not overlap.
Even if the accuracy is 70% under holdout validation and 65% under cross-validation, the CV result is the one used as the estimate of generalization performance, because cross-validation gives a more reliable estimate of generalization performance than holdout validation.
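A minimal 5-fold cross-validation sketch (using scikit-learn's KFold and cross_val_score; the model and data are illustrative). Each fold serves as validation data exactly once, and the fold scores are averaged to estimate generalization performance:

```python
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
from sklearn.kernel_ridge import KernelRidge

np.random.seed(0)
X = np.sort(np.random.rand(100)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + 0.3 * np.random.randn(100)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # 5 non-overlapping folds
scores = cross_val_score(KernelRidge(alpha=0.01, kernel='rbf', gamma=20),
                         X, y, cv=cv, scoring='neg_mean_squared_error')
print('MSE per fold:', -scores)
print('mean CV MSE:', -scores.mean())
```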
Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# seaborn settings
sns.set()
# background style
sns.set_style("darkgrid", {'grid.linestyle': '--'})
# context (scale) setting
sns.set_context("paper")
n = 100

def true_func(x):
    z = 1 - 48*x + 218*x**2 - 315*x**3 + 145*x**4
    return z

def linear_func(x):
    z = x
    return z
# Generate noisy data from the true function
data = np.random.rand(n).astype(np.float32)
data = np.sort(data)
target = true_func(data)

# Add noise
noise = 0.5 * np.random.randn(n)
target = target + noise
# Draw the noisy data
plt.scatter(data, target, label='data')
plt.title('NonLinear Regression')
plt.legend(loc=2)
from sklearn.linear_model import LinearRegression
clf = LinearRegression()
data = data.reshape(-1,1)
target = target.reshape(-1,1)
clf.fit(data, target)
p_lin = clf.predict(data)
plt.scatter(data, target, label='data')
plt.plot(data, p_lin, color='darkorange', marker='', linestyle='-', linewidth=1, markersize=6, label='linear regression')
plt.legend()
print(clf.score(data, target))
from sklearn.kernel_ridge import KernelRidge
clf = KernelRidge(alpha=0.0002, kernel='rbf')
clf.fit(data, target)
p_kridge = clf.predict(data)
plt.scatter(data, target, color='blue', label='data')
plt.plot(data, p_kridge, color='orange', linestyle='-', linewidth=3, markersize=6, label='kernel ridge')
plt.legend()
#plt.plot(data, p, color='orange', marker='o', linestyle='-', linewidth=1, markersize=6)
#Ridge
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.linear_model import Ridge
kx = rbf_kernel(X=data, Y=data, gamma=50)
#KX = rbf_kernel(X, x)
#clf = LinearRegression()
clf = Ridge(alpha=30)
clf.fit(kx, target)
p_ridge = clf.predict(kx)
plt.scatter(data, target,label='data')
for i in range(len(kx)):
    plt.plot(data, kx[i], color='black', linestyle='-', linewidth=1, markersize=3, label='rbf', alpha=0.2)
#plt.plot(data, p, color='green', marker='o', linestyle='-', linewidth=0.1, markersize=3)
plt.plot(data, p_ridge, color='green', linestyle='-', linewidth=1, markersize=3,label='ridge regression')
#plt.legend()
print(clf.score(kx, target))
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
#PolynomialFeatures(degree=1)
deg = [1,2,3,4,5,6,7,8,9,10]
for d in deg:
    regr = Pipeline([
        ('poly', PolynomialFeatures(degree=d)),
        ('linear', LinearRegression())
    ])
    regr.fit(data, target)
    # make predictions
    p_poly = regr.predict(data)
    # plot regression result
    plt.scatter(data, target, label='data')
    plt.plot(data, p_poly, label='polynomial of degree %d' % d)
#Lasso
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.linear_model import Lasso
kx = rbf_kernel(X=data, Y=data, gamma=5)
#KX = rbf_kernel(X, x)
#lasso_clf = LinearRegression()
# With such a large alpha the L1 penalty dominates and every coefficient is shrunk to exactly zero (see coef_ printed at the end)
lasso_clf = Lasso(alpha=10000, max_iter=1000)
lasso_clf.fit(kx, target)
p_lasso = lasso_clf.predict(kx)
plt.scatter(data, target)
#plt.plot(data, p, color='green', marker='o', linestyle='-', linewidth=0.1, markersize=3)
plt.plot(data, p_lasso, color='green', linestyle='-', linewidth=3, markersize=3)
print(lasso_clf.score(kx, target))
from sklearn import model_selection, preprocessing, linear_model, svm
# SVR-rbf
clf_svr = svm.SVR(kernel='rbf', C=1e3, gamma=0.1, epsilon=0.1)
clf_svr.fit(data, target)
y_rbf = clf_svr.fit(data, target).predict(data)
# plot
plt.scatter(data, target, color='darkorange', label='data')
plt.plot(data, y_rbf, color='red', label='Support Vector Regression (RBF)')
plt.legend()
plt.show()
result
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py:724: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py:724: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
y = column_or_1d(y, warn=True)
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, target, test_size=0.1, random_state=0)
from keras.callbacks import EarlyStopping, TensorBoard, ModelCheckpoint
cb_cp = ModelCheckpoint('/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.{epoch:02d}-{val_loss:.2f}.hdf5', verbose=1, save_weights_only=True)
cb_tf = TensorBoard(log_dir='/content/drive/My Drive/study_ai_ml/skl_ml/out/tensorBoard', histogram_freq=0)
def relu_reg_model():
    model = Sequential()
    model.add(Dense(10, input_dim=1, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='relu'))
    model.add(Dense(1000, activation='linear'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    # model.add(Dense(100, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout, BatchNormalization
from keras.wrappers.scikit_learn import KerasRegressor
# use data split and fit to run the model
estimator = KerasRegressor(build_fn=relu_reg_model, epochs=100, batch_size=5, verbose=1)
history = estimator.fit(x_train, y_train, callbacks=[cb_cp, cb_tf], validation_data=(x_test, y_test))
result
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1033: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1020: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3005: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
Train on 90 samples, validate on 10 samples
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:216: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:223: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1122: The name tf.summary.merge_all is deprecated. Please use tf.compat.v1.summary.merge_all instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/callbacks.py:1125: The name tf.summary.FileWriter is deprecated. Please use tf.compat.v1.summary.FileWriter instead.
Epoch 1/100
90/90 [==============================] - 2s 17ms/step - loss: 1.7399 - val_loss: 0.4522
Epoch 00001: saving model to /content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.01-0.45.hdf5
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-19-26d4341f0e70> in <module>()
6 estimator = KerasRegressor(build_fn=relu_reg_model, epochs=100, batch_size=5, verbose=1)
7
----> 8 history = estimator.fit(x_train, y_train, callbacks=[cb_cp, cb_tf], validation_data=(x_test, y_test))
8 frames
/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
146 fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
147 elif mode == 'w':
--> 148 fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
149 elif mode == 'a':
150 # Open in append mode (read/write).
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5f.pyx in h5py.h5f.create()
OSError: Unable to create file (unable to open file: name = '/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints/weights.01-0.45.hdf5', errno = 2, error message = 'No such file or directory', flags = 13, o_flags = 242)
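The OSError above occurs because the checkpoint output directory does not yet exist on the mounted Drive (errno = 2, 'No such file or directory'). A minimal fix, assuming the same path as in the ModelCheckpoint callback, is to create the directory before calling fit:

```python
import os
os.makedirs('/content/drive/My Drive/study_ai_ml/skl_ml/out/checkpoints', exist_ok=True)
```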
y_pred = estimator.predict(x_train)
result
90/90 [==============================] - 0s 1ms/step
plt.title('NonLinear Regression via DL by ReLU')
plt.plot(data, target, 'o')
plt.plot(data, true_func(data), '.')
plt.plot(x_train, y_pred, "o", label='predicted: deep learning')
#plt.legend(loc=2)
print(lasso_clf.coef_)
result
[-0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0.
-0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0. -0.
-0. -0. -0. -0. -0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Related Sites
Chapter 1: Linear Regression Model
[Chapter 2: Nonlinear Regression Model](https://qiita.com/matsukura04583/items/baa3f2269537036abc57)
[Chapter 3: Logistic Regression Model](https://qiita.com/matsukura04583/items/0fb73183e4a7a6f06aa5)
[Chapter 4: Principal Component Analysis](https://qiita.com/matsukura04583/items/b3b5d2d22189afc9c81c)
[Chapter 5: Algorithm 1 (k-nearest neighbor method (kNN))](https://qiita.com/matsukura04583/items/543719b44159322221ed)
[Chapter 6: Algorithm 2 (k-means)](https://qiita.com/matsukura04583/items/050c98c7bb1c9e91be71)
[Chapter 7: Support Vector Machine](https://qiita.com/matsukura04583/items/6b718642bcbf97ae2ca8)