A series of posts implementing the Coursera Machine Learning programming exercises in Python. (2015/10/23) Added ex2_reg. (2015/12/25) Added a version of ex2_reg that can be written more simply using PolynomialFeatures.
In this exercise, the scores of two exams are given as input data and the admission result (pass or fail) as output data, and we build a classifier using logistic regression.
ex2.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
data = pd.read_csv("ex2data1.txt", header=None)
# read 1st, 2nd column as feature matrix (100x2)
X = np.array([data[0],data[1]]).T
# read 3rd column as label vector (100)
y = np.array(data[2])
# plot
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(X[pos,0], X[pos,1], marker='+', c='b')
plt.scatter(X[neg,0], X[neg,1], marker='o', c='y')
plt.legend(['Admitted', 'Not admitted'], scatterpoints=1)
plt.xlabel("Exam 1 Score")
plt.ylabel("Exam 2 Score")
# Logistic regression model with no regularization
model = linear_model.LogisticRegression(C=1000000.0)
model.fit(X, y)
# Extract model parameter (theta0, theta1, theta2)
[theta0] = model.intercept_
[[theta1, theta2]] = model.coef_
# Plot decision boundary
plot_x = np.array([min(X[:,0])-2, max(X[:,0])+2]) # lowest and highest x1
plot_y = - (theta0 + theta1*plot_x) / theta2 # calculate x2
plt.plot(plot_x, plot_y, 'b')
plt.show()
The resulting plot looks like this.
To perform logistic regression, use the sklearn.linear_model.LogisticRegression class and train it with the familiar model.fit(X, y).
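As a quick check (this is not part of the original script), the trained model can also be used to predict the outcome for a new pair of exam scores; the scores 45 and 85 below are just illustrative values:

scores = np.array([[45.0, 85.0]])         # hypothetical new student: Exam 1 = 45, Exam 2 = 85
print(model.predict_proba(scores)[0, 1])  # probability of admission (y = 1)
print(model.predict(scores)[0])           # predicted class (1 = admitted, 0 = not admitted)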
In the LogisticRegression class, the strength of the regularization is specified by the parameter C. In the Coursera class this was specified by the parameter $\lambda$, but C is the reciprocal of $\lambda$ (this will come up again in the later SVM session). The smaller C is, the stronger the regularization; the larger C is, the weaker the regularization. In this example we want essentially no regularization, so we set C to a large value (1,000,000).
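For reference, a minimal sketch of that correspondence; lambda_to_C is a hypothetical helper, not something from scikit-learn or the exercise:

def lambda_to_C(lam, no_reg_C=1e6):
    # C = 1 / lambda; lambda = 0 (no regularization) is approximated by a very large C
    return no_reg_C if lam == 0 else 1.0 / lam

print([lambda_to_C(lam) for lam in (0, 1, 100)])  # -> [1000000.0, 1.0, 0.01]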
After training the model, we draw the decision boundary. The decision boundary of logistic regression is the straight line defined by $\theta^T x = 0$. Written out in components for this example, that is $\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$; solving for $x_2$ gives $x_2 = -\frac{\theta_0 + \theta_1 x_1}{\theta_2}$, which is used to compute the coordinates of points on the decision boundary and pass them to the plot function.
Since the boundary is a straight line, plot_x only needs to hold the smallest and largest x1 values, and the corresponding plot_y is computed with a single vectorized expression. Note that plot_x must be created as a NumPy array; if you create it as a plain Python list, these vector operations will not work.
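A minimal sketch of the difference (the parameter values here are made up purely for illustration):

import numpy as np
theta0, theta1, theta2 = -25.0, 0.2, 0.2         # made-up values, for illustration only
plot_x = np.array([30.0, 100.0])
plot_y = -(theta0 + theta1 * plot_x) / theta2    # element-wise arithmetic works on an ndarray
plot_x = [30.0, 100.0]
# -(theta0 + theta1 * plot_x) / theta2           # TypeError: a plain list cannot be scaled by a float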
In this next exercise, the results of two microchip tests are given as input data and a pass/fail flag as the output data, and we build a classifier that separates pass from fail using a logistic regression model. Polynomial features are used because the data cannot be separated by a straight line. We also train models with different regularization parameters $\lambda$ to see the effect of regularization.
ex2_reg.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
# mapFeature(x1, x2)
#   Performs feature mapping.
#   Arguments: feature vectors x1, x2 (must have the same length n)
#   Returns: feature matrix X (n x 28)
#   Contains all polynomial terms up to degree 6:
#   1, x1, x2, x1^2, x1*x2, x2^2, x1^3, ..., x1*x2^5, x2^6 (28 columns)
def mapFeature(x1, x2):
    degree = 6
    out = np.ones(x1.shape)                        # the first column is all ones
    for i in range(1, degree+1):                   # loop from 1 to degree
        for j in range(0, i+1):                    # loop from 0 to i
            out = np.c_[out, (x1**(i-j) * x2**j)]  # append one column at a time
    return out
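# Quick sanity check, added here as a note (not part of the original exercise):
# for degree 6 the mapped matrix should have 1 + 2 + ... + 7 = 28 columns.
assert mapFeature(np.array([0.5, -0.5]), np.array([1.0, 2.0])).shape == (2, 28)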
# Main script starts here
data = pd.read_csv("ex2data2.txt", header=None)
x1 = np.array(data[0])
x2 = np.array(data[1])
y = np.array(data[2])
#Plot sample data
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(x1[pos], x2[pos], marker='+', c='b') # positive examples plotted with '+'
plt.scatter(x1[neg], x2[neg], marker='o', c='y') # negative examples plotted with 'o'
plt.legend(['y = 1', 'y = 0'], scatterpoints=1)
plt.xlabel("Microchip Test 1")
plt.ylabel("Microchip Test 2")
# Apply feature mapping: X is an n x 28 matrix
X = mapFeature(x1, x2)
#Logistic regression model with regularization
model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y)
# Plot the decision boundary
px = np.arange(-1.0, 1.5, 0.1)
py = np.arange(-1.0, 1.5, 0.1)
PX, PY = np.meshgrid(px, py) # PX and PY are each 25x25 matrices
XX = mapFeature(PX.ravel(), PY.ravel()) # feature mapping; ravel() flattens each matrix to a 625-dimensional vector, so XX is a 625x28 matrix
Z = model.predict_proba(XX)[:,1] # predict with the logistic regression model; the probability of y=1 is in the second column, so extract it; Z is a 625-dimensional vector
Z = Z.reshape(PX.shape) # reshape Z back into a 25x25 matrix
plt.contour(PX, PY, Z, levels=[0.5], linewidths=3) # the Z=0.5 contour is the decision boundary
plt.show()
The part that uses the LogisticRegression class is the same as in the previous example. The regularization covered in Coursera is L2 regularization (the kind used in ridge regression), so the option penalty='l2' is added. We then draw the decision boundary with models trained at different regularization strengths: Coursera used the three values $\lambda = 0, 1, 100$, while the Python version uses C = 1000000.0, C = 1.0, and C = 0.01.
C = 1000000.0 (essentially no regularization; overfitting)
C = 1.0
C = 0.01 (too much regularization; underfitting)
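The three plots above can be reproduced with a loop along these lines (a minimal sketch that simply retrains the model and redraws the boundary for each C, assuming the variables from ex2_reg.py are already in scope):

for C, label in [(1000000.0, "lambda = 0"), (1.0, "lambda = 1"), (0.01, "lambda = 100")]:
    model = linear_model.LogisticRegression(penalty='l2', C=C)
    model.fit(X, y)
    Z = model.predict_proba(XX)[:, 1].reshape(PX.shape)
    # re-plot the scatter of the data points here if you want them in each figure
    plt.contour(PX, PY, Z, levels=[0.5], linewidths=3)
    plt.title("C = %g (%s)" % (C, label))
    plt.show()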
The ravel() method flattens a matrix into a vector; it corresponds to A(:) in Matlab/Octave. It is needed here because the coordinate matrices created by meshgrid must be turned into vectors before being passed to mapFeature and LogisticRegression.predict_proba(). Conversely, the returned vector Z is turned back into a matrix with the reshape() method. LogisticRegression.predict_proba() returns a matrix with one row per sample (625 rows) and one column per class (2 columns), containing the probability that each sample belongs to each class. In this example we want the probability that y == 1, so we extract only the second column as a vector with Z[:, 1]. The contour line where this probability equals 0.5 is the decision boundary.
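A minimal sketch of what that slicing does, using the shapes from this example:

probs = model.predict_proba(XX)   # shape (625, 2): columns are P(y=0) and P(y=1)
Z = probs[:, 1]                   # shape (625,): keep only P(y=1)
Z = Z.reshape(PX.shape)           # shape (25, 25): back onto the meshgrid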
In the code above I wrote my own mapFeature() function to generate the polynomial features, but scikit-learn has a class, sklearn.preprocessing.PolynomialFeatures, that does the same thing, so here is a version that uses it instead.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures
data = pd.read_csv("ex2data2.txt", header=None)
x1 = np.array(data[0])
x2 = np.array(data[1])
y = np.array(data[2])
#Plot sample data
pos = (y==1) # numpy bool index
neg = (y==0) # numpy bool index
plt.scatter(x1[pos], x2[pos], marker='+', c='b') # positive examples plotted with '+'
plt.scatter(x1[neg], x2[neg], marker='o', c='y') # negative examples plotted with 'o'
plt.legend(['y = 1', 'y = 0'], scatterpoints=1)
plt.xlabel("Microchip Test 1")
plt.ylabel("Microchip Test 2")
# Apply feature mapping: X is an n x 28 matrix
poly = PolynomialFeatures(6)
X = poly.fit_transform(np.c_[x1,x2])
#Logistic regression model with regularization
model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y)
# Plot the decision boundary
px = np.arange(-1.0, 1.5, 0.1)
py = np.arange(-1.0, 1.5, 0.1)
PX, PY = np.meshgrid(px, py) # PX and PY are each 25x25 matrices
XX = poly.fit_transform(np.c_[PX.ravel(), PY.ravel()]) # feature mapping; ravel() flattens each matrix to a 625-dimensional vector, so XX is a 625x28 matrix
Z = model.predict_proba(XX)[:,1] # predict with the logistic regression model; the probability of y=1 is in the second column, so extract it; Z is a 625-dimensional vector
Z = Z.reshape(PX.shape) # reshape Z back into a 25x25 matrix
plt.contour(PX, PY, Z, levels=[0.5], linewidths=3) # the Z=0.5 contour is the decision boundary
plt.show()
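As a sanity check (not part of the original code), PolynomialFeatures(6) applied to two input features should also produce 28 columns, matching mapFeature:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
a = np.array([[0.5, 1.0], [-0.5, 2.0]])
print(PolynomialFeatures(6).fit_transform(a).shape)  # expected: (2, 28)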
I'm studying both Python and machine learning, so I'd be happy if you could point out any strange points (^^)