Introduction

I am trying to build a multivariate regression model, and I would like to pick a few of the many machine learning methods available, compare them, and check their accuracy.

scikit-learn is a Python machine learning library that implements many of these methods and is convenient to use, so I gave it a quick try.

First demo

The following is taken from the scikit-learn examples.


import numpy as np
from sklearn.svm import SVR
import matplotlib.pyplot as plt
# %matplotlib inline  # Jupyter-only magic; uncomment when running in a notebook

# Generate the input from random numbers
X = np.sort(5 * np.random.rand(40, 1), axis=0)
# The output is a sine function of the input
y = np.sin(X).ravel()

# Add noise to every fifth output value
y[::5] += 3 * (0.5 - np.random.rand(8))

# Fit with RBF, linear, and polynomial kernels
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_rbf = svr_rbf.fit(X, y).predict(X)
y_lin = svr_lin.fit(X, y).predict(X)
y_poly = svr_poly.fit(X, y).predict(X)

# Plot the data and the three fitted models
# (plt.hold was removed from matplotlib; plots overlay by default)
plt.figure(figsize=[10, 5])
plt.scatter(X, y, c='k', label='data')
plt.plot(X, y_rbf, c='g', label='RBF model')
plt.plot(X, y_lin, c='r', label='Linear model')
plt.plot(X, y_poly, c='b', label='Polynomial model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()

Result:

Trying it out

Conditions

4 input variables (4 dimensions)
Prepare a training dataset and a test dataset
After training, predict by feeding in the test data
Compare the prediction accuracy of the RBF, linear, and polynomial kernels
Measure prediction accuracy with RMSE and the correlation coefficient
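
The two accuracy measures named in the conditions can be written out directly. The following is a minimal, dependency-free sketch of the formulas; the function names `rmse` and `pearson_r` are hypothetical and are not the calls used in this post, which relies on `mean_squared_error` from scikit-learn and `np.corrcoef` from NumPy.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sum of products of deviations (covariance up to a constant factor)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    # Sums of squared deviations (variances up to the same factor)
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)
```

A lower RMSE means the predictions are closer to the true values, while a correlation coefficient near 1 means the predictions rise and fall together with the true values. The two can disagree, which is why both are reported.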


import numpy as np
from sklearn.svm import SVR
import matplotlib.pyplot as plt

# Generate some arbitrary inputs (each feature is sorted independently)
X1 = np.sort(5 * np.random.rand(40))
X2 = np.sort(3 * np.random.rand(40))
X3 = np.sort(9 * np.random.rand(40))
X4 = np.sort(4 * np.random.rand(40))

# Combine the input arrays into a single (40, 4) matrix
X = np.c_[X1, X2, X3, X4]

# Compute the output
y = np.sin(X1).ravel() + np.cos(X2).ravel() + np.sin(X3).ravel() - np.cos(X4).ravel()

# Keep a noise-free copy of the output
y_o = y.copy()

# Add noise to every fifth output value
y[::5] += 3 * (0.5 - np.random.rand(8))

# Fit
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=3)
y_rbf = svr_rbf.fit(X, y).predict(X)
y_lin = svr_lin.fit(X, y).predict(X)
y_poly = svr_poly.fit(X, y).predict(X)

# Prepare the test data (same construction as the training data, without noise)
test_X1 = np.sort(5 * np.random.rand(40))
test_X2 = np.sort(3 * np.random.rand(40))
test_X3 = np.sort(9 * np.random.rand(40))
test_X4 = np.sort(4 * np.random.rand(40))

test_X = np.c_[test_X1, test_X2, test_X3, test_X4]
test_y = np.sin(test_X1).ravel() + np.cos(test_X2).ravel() + np.sin(test_X3).ravel() - np.cos(test_X4).ravel()

# Predict on the test data with each trained model
test_rbf = svr_rbf.predict(test_X)
test_lin = svr_lin.predict(test_X)
test_poly = svr_poly.predict(test_X)

The verification follows.


from sklearn.metrics import mean_squared_error
from math import sqrt

# Compute the correlation coefficient between true and predicted values
rbf_corr = np.corrcoef(test_y, test_rbf)[0, 1]
lin_corr = np.corrcoef(test_y, test_lin)[0, 1]
poly_corr = np.corrcoef(test_y, test_poly)[0, 1]

# Compute the RMSE
rbf_rmse = sqrt(mean_squared_error(test_y, test_rbf))
lin_rmse = sqrt(mean_squared_error(test_y, test_lin))
poly_rmse = sqrt(mean_squared_error(test_y, test_poly))

# Python 3 print calls (the original used Python 2 print statements)
print("RBF: RMSE %f \t\t Corr %f" % (rbf_rmse, rbf_corr))
print("Linear: RMSE %f \t Corr %f" % (lin_rmse, lin_corr))
print("Poly: RMSE %f \t\t Corr %f" % (poly_rmse, poly_corr))

I got the following result:



RBF: RMSE 0.707305 		 Corr 0.748894
Linear: RMSE 0.826913 	 Corr 0.389720
Poly: RMSE 2.913726 	 Corr -0.614328
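
Because the inputs and the noise come from unseeded `np.random.rand` calls, these figures will differ from run to run. The following is a minimal sketch of one way to make the experiment repeatable, assuming NumPy's `Generator` API; `make_data` is a hypothetical helper and the seed value is arbitrary, neither is part of the code above.

```python
import numpy as np


def make_data(rng, n=40):
    """Build 4 independently sorted features and the noise-free target."""
    scales = (5, 3, 9, 4)  # upper bounds used for X1..X4 in this post
    X = np.column_stack([np.sort(s * rng.random(n)) for s in scales])
    y = np.sin(X[:, 0]) + np.cos(X[:, 1]) + np.sin(X[:, 2]) - np.cos(X[:, 3])
    return X, y


rng = np.random.default_rng(0)   # fixed seed (assumption: any integer works)
X, y = make_data(rng)            # training set
test_X, test_y = make_data(rng)  # test set drawn from the same generator
```

With the same seed, every run produces the same training and test sets. The training noise can be drawn from the same generator, for example `y[::5] += 3 * (0.5 - rng.random(8))`, so the RMSE and correlation values of the three kernels become comparable across runs.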

Multivariate regression model with scikit-learn - I tried comparing and verifying SVR
