A series of posts implementing the Coursera Machine Learning programming exercises in Python. (2015/10/23) Added ex2_reg. (2015/12/25) Added a version of ex2_reg that can be written more simply using PolynomialFeatures.
In this exercise, the scores of two exams are given as input data and the admission result (pass or fail) as output data, and we build a classifier using logistic regression.
ex2.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
data = pd.read_csv("ex2data1.txt", header=None)
# read 1st, 2nd column as feature matrix (100x2)
X = np.array([data[0],data[1]]).T
# read 3rd column as label vector (100)
y = np.array(data[2])
# plot
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(X[pos,0], X[pos,1], marker='+', c='b')
plt.scatter(X[neg,0], X[neg,1], marker='o', c='y')
plt.legend(['Admitted', 'Not admitted'], scatterpoints=1)
plt.xlabel("Exam 1 Score")
plt.ylabel("Exam 2 Score")
# Logistic regression model with no regularization
model = linear_model.LogisticRegression(C=1000000.0)
model.fit(X, y)
# Extract model parameter (theta0, theta1, theta2)
[theta0] = model.intercept_
[[theta1, theta2]] = model.coef_
# Plot decision boundary
plot_x = np.array([min(X[:,0])-2, max(X[:,0])+2]) # lowest and highest x1
plot_y = - (theta0 + theta1*plot_x) / theta2 # calculate x2
plt.plot(plot_x, plot_y, 'b')
plt.show()
The resulting plot looks like this.
To perform logistic regression, use the sklearn.linear_model.LogisticRegression class and train it with the familiar model.fit(X, y).
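As a quick check (this is not part of the original script), the trained model can also be used to predict the outcome for a new pair of exam scores; the scores 45 and 85 below are just illustrative values:

scores = np.array([[45.0, 85.0]])         # hypothetical new student: Exam 1 = 45, Exam 2 = 85
print(model.predict_proba(scores)[0, 1])  # probability of admission (y = 1)
print(model.predict(scores)[0])           # predicted class (1 = admitted, 0 = not admitted)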
In the LogisticRegression class, the strength of the regularization is specified by the parameter C. In the Coursera class this was specified by the parameter $\lambda$, but C is the reciprocal of $\lambda$ (this will come up again in the later SVM session). The smaller C is, the stronger the regularization; the larger C is, the weaker the regularization. In this example we want essentially no regularization, so we set C to a large value (1,000,000).
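For reference, a minimal sketch of that correspondence; lambda_to_C is a hypothetical helper, not something from scikit-learn or the exercise:

def lambda_to_C(lam, no_reg_C=1e6):
    # C = 1 / lambda; lambda = 0 (no regularization) is approximated by a very large C
    return no_reg_C if lam == 0 else 1.0 / lam

print([lambda_to_C(lam) for lam in (0, 1, 100)])  # -> [1000000.0, 1.0, 0.01]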
After training the model, we draw the decision boundary. The decision boundary of logistic regression is the straight line defined by $\theta^T x = 0$. Written out in components for this example, that is $\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$; solving for $x_2$ gives $x_2 = -\frac{\theta_0 + \theta_1 x_1}{\theta_2}$, which is used to compute the coordinates of points on the decision boundary and pass them to the plot function.
Since the boundary is a straight line, plot_x only needs to hold the smallest and largest x1 values, and the corresponding plot_y is computed with a single vectorized expression. Note that plot_x must be created as a NumPy array; if you create it as a plain Python list, these vector operations will not work.
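A minimal sketch of the difference (the parameter values here are made up purely for illustration):

import numpy as np
theta0, theta1, theta2 = -25.0, 0.2, 0.2         # made-up values, for illustration only
plot_x = np.array([30.0, 100.0])
plot_y = -(theta0 + theta1 * plot_x) / theta2    # element-wise arithmetic works on an ndarray
plot_x = [30.0, 100.0]
# -(theta0 + theta1 * plot_x) / theta2           # TypeError: a plain list cannot be scaled by a float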
In this next exercise, the results of two microchip tests are given as input data and a pass/fail flag as the output data, and we build a classifier that separates pass from fail using a logistic regression model. Polynomial features are used because the data cannot be separated by a straight line. We also train models with different regularization parameters $\lambda$ to see the effect of regularization.
ex2_reg.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
# mapFeature(x1, x2)
#   Performs feature mapping.
#   Arguments: feature vectors x1, x2 (must have the same length n)
#   Returns: feature matrix X (n x 28)
#   Contains all polynomial terms up to degree 6:
#   1, x1, x2, x1^2, x1*x2, x2^2, x1^3, ..., x1*x2^5, x2^6 (28 columns)
def mapFeature(x1, x2):
    degree = 6
    out = np.ones(x1.shape)                        # the first column is all ones
    for i in range(1, degree+1):                   # loop from 1 to degree
        for j in range(0, i+1):                    # loop from 0 to i
            out = np.c_[out, (x1**(i-j) * x2**j)]  # append one column at a time
    return out
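# Quick sanity check, added here as a note (not part of the original exercise):
# for degree 6 the mapped matrix should have 1 + 2 + ... + 7 = 28 columns.
assert mapFeature(np.array([0.5, -0.5]), np.array([1.0, 2.0])).shape == (2, 28)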
# Main script starts here
data = pd.read_csv("ex2data2.txt", header=None)
x1 = np.array(data[0])
x2 = np.array(data[1])
y = np.array(data[2])
#Plot sample data
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(x1[pos], x2[pos], marker='+', c='b') # positive examples plotted with '+'
plt.scatter(x1[neg], x2[neg], marker='o', c='y') # negative examples plotted with 'o'
plt.legend(['y = 1', 'y = 0'], scatterpoints=1)
plt.xlabel("Microchip Test 1")
plt.ylabel("Microchip Test 2")
# Apply feature mapping: X is an n x 28 matrix
X = mapFeature(x1, x2)
#Logistic regression model with regularization
model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y)
# Plot the decision boundary
px = np.arange(-1.0, 1.5, 0.1)
py = np.arange(-1.0, 1.5, 0.1)
PX, PY = np.meshgrid(px, py) # PX and PY are each 25x25 matrices
XX = mapFeature(PX.ravel(), PY.ravel()) # feature mapping; ravel() flattens each matrix to a 625-dimensional vector, so XX is a 625x28 matrix
Z = model.predict_proba(XX)[:,1] # predict with the logistic regression model; the probability of y=1 is in the second column, so extract it; Z is a 625-dimensional vector
Z = Z.reshape(PX.shape) # reshape Z back into a 25x25 matrix
plt.contour(PX, PY, Z, levels=[0.5], linewidths=3) # the Z=0.5 contour is the decision boundary
plt.show()
The part that uses the LogisticRegression class is the same as in the previous example. The regularization covered in Coursera is L2 regularization (the kind used in ridge regression), so the option penalty='l2' is added. We then draw the decision boundary with models trained at different regularization strengths: Coursera used the three values $\lambda = 0, 1, 100$, while the Python version uses C = 1000000.0, C = 1.0, and C = 0.01.
C = 1000000.0 (essentially no regularization; overfitting)
C = 1.0
C = 0.01 (too much regularization; underfitting)
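The three plots above can be reproduced with a loop along these lines (a minimal sketch that simply retrains the model and redraws the boundary for each C, assuming the variables from ex2_reg.py are already in scope):

for C, label in [(1000000.0, "lambda = 0"), (1.0, "lambda = 1"), (0.01, "lambda = 100")]:
    model = linear_model.LogisticRegression(penalty='l2', C=C)
    model.fit(X, y)
    Z = model.predict_proba(XX)[:, 1].reshape(PX.shape)
    # re-plot the scatter of the data points here if you want them in each figure
    plt.contour(PX, PY, Z, levels=[0.5], linewidths=3)
    plt.title("C = %g (%s)" % (C, label))
    plt.show()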
The ravel() method flattens a matrix into a vector; it corresponds to A(:) in Matlab/Octave. It is needed here because the coordinate matrices created by meshgrid must be turned into vectors before being passed to mapFeature and LogisticRegression.predict_proba(). Conversely, the returned vector Z is turned back into a matrix with the reshape() method. LogisticRegression.predict_proba() returns a matrix with one row per sample (625 rows) and one column per class (2 columns), containing the probability that each sample belongs to each class. In this example we want the probability that y == 1, so we extract only the second column as a vector with Z[:, 1]. The contour line where this probability equals 0.5 is the decision boundary.
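A minimal sketch of what that slicing does, using the shapes from this example:

probs = model.predict_proba(XX)   # shape (625, 2): columns are P(y=0) and P(y=1)
Z = probs[:, 1]                   # shape (625,): keep only P(y=1)
Z = Z.reshape(PX.shape)           # shape (25, 25): back onto the meshgrid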
In the code above I wrote my own mapFeature() function to generate the polynomial features, but scikit-learn has a class, sklearn.preprocessing.PolynomialFeatures, that does the same thing, so here is a version that uses it instead.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures
data = pd.read_csv("ex2data2.txt", header=None)
x1 = np.array(data[0])
x2 = np.array(data[1])
y = np.array(data[2])
#Plot sample data
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(x1[pos], x2[pos], marker='+', c='b') # positive examples plotted with '+'
plt.scatter(x1[neg], x2[neg], marker='o', c='y') # negative examples plotted with 'o'
plt.legend(['y = 1', 'y = 0'], scatterpoints=1)
plt.xlabel("Microchip Test 1")
plt.ylabel("Microchip Test 2")
# Apply feature mapping: X is an n x 28 matrix
poly = PolynomialFeatures(6)
X = poly.fit_transform(np.c_[x1,x2])
#Logistic regression model with regularization
model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y)
# Plot the decision boundary
px = np.arange(-1.0, 1.5, 0.1)
py = np.arange(-1.0, 1.5, 0.1)
PX, PY = np.meshgrid(px, py) # PX and PY are each 25x25 matrices
XX = poly.fit_transform(np.c_[PX.ravel(), PY.ravel()]) # feature mapping; ravel() flattens each matrix to a 625-dimensional vector, so XX is a 625x28 matrix
Z = model.predict_proba(XX)[:,1] # predict with the logistic regression model; the probability of y=1 is in the second column, so extract it; Z is a 625-dimensional vector
Z = Z.reshape(PX.shape) # reshape Z back into a 25x25 matrix
plt.contour(PX, PY, Z, levels=[0.5], linewidths=3) # the Z=0.5 contour is the decision boundary
plt.show()
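As a sanity check (not part of the original code), PolynomialFeatures(6) applied to two input features should also produce 28 columns, matching mapFeature:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
a = np.array([[0.5, 1.0], [-0.5, 2.0]])
print(PolynomialFeatures(6).fit_transform(a).shape)  # expected: (2, 28)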
I'm studying both Python and machine learning, so I'd be happy if you could point out any strange points (^^)