First TensorFlow (Revised): Linear Regression and Logistic Regression

My earlier article, "First TensorFlow: Linear Regression as an Introduction", was written just after TensorFlow was released, and it has been one of my more widely read posts. About a year and a half has passed since then, so I wanted to revise it to account for the TensorFlow version upgrades. The content follows the original: linear regression and logistic regression.

(The programming environment is as of May 29, 2017.)

Linear Regression

In the previous version of "First ...", I introduced the Theano tutorial Newmu/Theano-Tutorials on GitHub and proceeded with the discussion by porting it to TensorFlow. https://github.com/Newmu/Theano-Tutorials

The other day I found a TensorFlow tutorial with almost the same content on GitHub, so I would like to introduce it here. https://github.com/nlintz/TensorFlow-Tutorials

TensorFlow-Tutorials

Introduction to deep learning based on Google's TensorFlow framework. These tutorials are direct ports of Newmu's Theano Tutorials.

Short tutorial scripts are posted so that you can deepen your understanding of TensorFlow step by step. In addition to the Python code, Jupyter notebooks are also provided. Even if you are no longer a beginner, I recommend cloning the repository to your PC and referring to it like a cookbook while coding. 00_multiply.py (simple multiplication) is followed by 01_linear_regression.py (linear regression) and 02_logistic_regression.py (logistic regression), and this article follows them in that order. However, since simply reprinting the code would be boring, the content has been customized slightly.

First, linear regression. After importing the related modules, prepare the data to be used. (The original used test data without an intercept b; here a bias term (b = 3.) is included.)

import tensorflow as tf
import numpy as np

trX = np.linspace(-1, 1, 101)
# The original code has no bias term, but here the model includes a bias (= 3.)
target_w = 2.
target_b = 3.
noise_gain = 0.33
trY = target_w * trX + target_b + np.random.randn(*trX.shape) * noise_gain

model() is the function used as the regression model (a linear function with slope w and intercept b as its parameters).

def model(X, w, b):
    # Linear predictor model
    # The original code uses tf.multiply(), but the "*" operator also works.
    # return tf.multiply(X, w) + b
    return X * w + b

Next, prepare the necessary variables.

# TensorFlow placeholders
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# TensorFlow Variables
w = tf.Variable(0.0, name="weights")    # zero initialize
b = tf.Variable(0.0, name="bias")       # zero initialize

X and Y are prepared as "placeholders", a TensorFlow-specific concept. Since they are placeholders, they have no actual values at the time of declaration; the actual values are supplied later during program execution. On the other hand, the ordinary Tensor variables w and b are initialized to zero.
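To make this concrete, here is a minimal sketch (not part of the tutorial code) showing that a placeholder only receives its value when the graph is run:

# Minimal placeholder sketch (illustration only, not part of the tutorial code)
a = tf.placeholder(tf.float32)
doubled = a * 2.0
with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={a: 3.0}))    # -> 6.0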

Then we describe the graph, i.e. the important relationships between the variables.

y_model = model(X, w, b)
cost = tf.reduce_mean(tf.square(Y - y_model))

cost is the cost function, corresponding to the difference between the model and the actual data. tf.reduce_mean() is a function that computes the mean, so the code above calculates the MSE (mean squared error).
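For intuition, the same quantity written in plain NumPy would look like the following sketch (this is outside the TensorFlow graph and only for illustration):

# NumPy sketch of the same cost, for intuition only (not part of the graph)
def mse_numpy(y_true, y_pred):
    return np.mean(np.square(y_true - y_pred))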

Next, we specify the Optimizer and define the operation it uses to search for the parameters.

train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

Here, the gradient descent Optimizer is specified, with a learning rate of 0.01 passed as an argument.

About Optimizer

The Optimizer deserves a closer look. TensorFlow currently supports the following (a partial list). http://www.tensorflow.org/api_docs/python/train.html#optimizers

Optimizer name: description (reference link)

- GradientDescentOptimizer: Gradient descent method (https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95)
- AdagradOptimizer: AdaGrad method (https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95#AdaGrad)
- MomentumOptimizer: Momentum method (https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95#.E3.83.A2.E3.83.A1.E3.83.B3.E3.82.BF.E3.83.A0.E6.B3.95)
- AdamOptimizer: Adam method (https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95#Adam)
- RMSPropOptimizer: RMSProp method (https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95#RMSProp)

The parameters that must be set differ depending on the Optimizer; this time we use the basic GradientDescentOptimizer (its required parameter, the learning rate, was set as shown above).
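If you want to try one of the other optimizers, only this line needs to change. A sketch (the learning rates here are illustrative, not tuned recommendations):

# Sketch: swapping the optimizer (learning rates are illustrative only)
# train_op = tf.train.AdamOptimizer(0.001).minimize(cost)
# train_op = tf.train.MomentumOptimizer(0.01, momentum=0.9).minimize(cost)
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)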

Linear Regression (continued)

Now that we are almost ready, we start a Session, which carries out the main computation.

# Launch the graph in a session
with tf.Session() as sess:
    # you need to initialize variables (in this case w and b)
    tf.global_variables_initializer().run()

    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})

    final_w, final_b = sess.run([w, b])

Immediately after the session starts, the Variables are initialized. (Note: in the previous version I wrote that "Variables must be initialized before starting the Session", but that was incorrect. In the older code, an initialization op (init op) is defined before the Session, and after the Session starts it is executed with sess.run(init). This time, the initialization op is defined and run() immediately after the Session starts.)
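For reference, the older style mentioned in the note looks roughly like this (a sketch):

# Older style (sketch): define the init op outside the Session, then run it inside
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    # ... training loop as before ...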

For reference, regarding the initialization of Variables:

- Variables created with tf.Variable() must be initialized.
- Placeholders created with tf.placeholder() do not need to be initialized (their actual values are fed in later).
- Constants created with tf.constant() do not need to be initialized.

There seem to be several ways to write a program that runs a Session, but as shown above, enclosing it in a "with" statement makes the scope of the Session clear, and the Session is automatically closed when the "with" block is exited, which is convenient.
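Another way to write it, without the "with" statement, is roughly the following sketch; in that case you have to close the Session yourself:

# Sketch without "with": the Session must be closed explicitly
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# ... training loop as before ...
sess.close()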

One point to be aware of is how training data is supplied in TensorFlow. Data feeding is the part of the code that changes depending on how you want to proceed with learning; as shown in the listing above, it is done via "feed_dict", so keep this in mind.
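For example, inside the Session above, feed_dict accepts either a single value or a whole array (a sketch, not in the original code):

# Sketch: feed_dict with a single sample vs. the whole training set
sess.run(train_op, feed_dict={X: trX[0], Y: trY[0]})    # one sample
mse_all = sess.run(cost, feed_dict={X: trX, Y: trY})    # evaluate the cost on all the data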

With the above, the regression parameters are computed. (Note: with the current version of TensorFlow, cpu_feature_guard.cc may output a warning depending on the environment, but it can be ignored. The warning can also be suppressed with a shell environment variable, for example TF_CPP_MIN_LOG_LEVEL.)

$ python 01_linear_regression.py 

predicted model: y = [   2.045] * x + [   3.001]
target model   : y = [   2.000] * x + [   3.000]

We can see that the regression parameters of the predicted model approximate those of the target model.
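As a quick sanity check (a sketch, not part of the tutorial), the same fit can be obtained in closed form with NumPy:

# Sketch: closed-form least-squares fit with NumPy, for comparison
w_ls, b_ls = np.polyfit(trX, trY, 1)    # slope and intercept
print('least-squares fit: y = [{:>8.3f}] * x + [{:>8.3f}]'.format(w_ls, b_ls))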

Summarizing the above, the full program is posted again below.

import tensorflow as tf
import numpy as np

trX = np.linspace(-1, 1, 101)
# The original code has no bias term, but here the model includes a bias (= 3.)
target_w = 2.
target_b = 3.
noise_gain = 0.33
trY = target_w * trX + target_b + np.random.randn(*trX.shape) * noise_gain

def model(X, w, b):
    # Linear predictor model
    # The original code uses tf.multiply(), but the "*" operator also works.
    # return tf.multiply(X, w) + b
    return X * w + b

# TensorFlow placeholders
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# TensorFlow Variables
w = tf.Variable(0.0, name="weights")    # zero initialize
b = tf.Variable(0.0, name="bias")       # zero initialize

y_model = model(X, w, b)

# Cost function: since this is a regression model, define the (mean) squared error.
cost = tf.reduce_mean(tf.square(Y - y_model))
# Set the optimization operator (optimizer). Learning rate = 0.01
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

# Launch the graph in a session
with tf.Session() as sess:
    # you need to initialize variables (in this case w and b)
    tf.global_variables_initializer().run()

    for i in range(100):
        for (x, y) in zip(trX, trY):
            sess.run(train_op, feed_dict={X: x, Y: y})

    final_w, final_b = sess.run([w, b])

# (w, b) becomes an approximation of (2, 3).
print('predicted model: y = [{:>8.3f}] * x + [{:>8.3f}]'.format(
                                        final_w, final_b))
print('target model   : y = [{:>8.3f}] * x + [{:>8.3f}]'.format(
                                        target_w, target_b))

Logistic Regression

The Logistic Regression code in the original (https://github.com/nlintz/TensorFlow-Tutorials) deals with MNIST, but here we use the lightweight handwritten digits dataset "digits" that comes with scikit-learn.

First, define the weights initialization support function and model.

#Support functions for weights initialization
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# The bias is not included in the model in the original code, but it is included here.
def model(X, w, b):
    return tf.matmul(X, w) + b

Prepare the "digits" data.

# digits is a handwritten digit dataset provided by scikit-learn. pixels: 8 x 8
def load_data():
    digits = load_digits()
    digits_images = digits.data / 16.   # scaling to (0 .. 1)
    digits_target_ = []
    for i in range(len(digits.target)):
        target_one = np.zeros([10], dtype=np.float32)
        target_one[digits.target[i]] = 1.
        digits_target_.append(target_one)
    digits_target_onehot = np.asarray(digits_target_)
    return digits_images, digits_target_onehot

X, Y = load_data()

# Split into train / test sets
trX, teX, trY, teY =  train_test_split(X, Y, test_size=0.2)
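Incidentally, the one-hot encoding loop in load_data() could be written more compactly with np.eye; a sketch of the equivalent:

# Sketch: equivalent one-hot encoding with np.eye (same result as the loop in load_data)
digits = load_digits()
digits_target_onehot = np.eye(10, dtype=np.float32)[digits.target]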

After that, following the Linear Regression in the previous section, define the graph.

# TensorFlow placeholders
X_ph = tf.placeholder(tf.float32, [None, 64])
Y_ph = tf.placeholder(tf.float32, [None, 10])

# The bias is not included in the model in the original code, but it is included here.
w = init_weights([64, 10])
# Make the bias a Variable so that it is actually learned (a plain tf.zeros tensor would stay constant)
b = tf.Variable(tf.zeros([10]))

py_x = model(X_ph, w, b)

# Cost function, optimizer, prediction
# Take the mean over the batch so that the cost is a scalar
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y_ph))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
predict_op = tf.argmax(py_x, 1)

In Linear Regression the cost function was defined as the MSE (mean squared error); in Logistic Regression it is replaced with the cross entropy (softmax cross entropy). tf.nn.softmax_cross_entropy_with_logits() in the listing above computes the per-example cross entropy for multiclass classification (this function requires the keyword arguments logits= and labels=), and wrapping it in tf.reduce_mean() gives the scalar cost.
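For intuition, here is a sketch (using the py_x and Y_ph defined above) of what the function computes, written out by hand; the built-in op is preferred because it is numerically more stable:

# Sketch: the same cross entropy written out manually (the built-in op is numerically stabler)
probs = tf.nn.softmax(py_x)
manual_cost = tf.reduce_mean(-tf.reduce_sum(Y_ph * tf.log(probs), axis=1))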

Initialization of the Variables and how to proceed with the Session are the same as in the previous code. The full code is posted together below.

import tensorflow as tf
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

#Support functions for weights initialization
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# The bias is not included in the model in the original code, but it is included here.
def model(X, w, b):
    return tf.matmul(X, w) + b

# digits is a handwritten digit dataset provided by scikit-learn. pixels: 8 x 8
def load_data():
    digits = load_digits()
    digits_images = digits.data / 16.   # scaling to (0 .. 1)
    digits_target_ = []
    for i in range(len(digits.target)):
        target_one = np.zeros([10], dtype=np.float32)
        target_one[digits.target[i]] = 1.
        digits_target_.append(target_one)
    digits_target_onehot = np.asarray(digits_target_)
    return digits_images, digits_target_onehot

X, Y = load_data()

# Split into train / test sets
trX, teX, trY, teY =  train_test_split(X, Y, test_size=0.2)

# TensorFlow placeholders
X_ph = tf.placeholder(tf.float32, [None, 64])
Y_ph = tf.placeholder(tf.float32, [None, 10])

# The bias is not included in the model in the original code, but it is included here.
w = init_weights([64, 10])
# Make the bias a Variable so that it is actually learned (a plain tf.zeros tensor would stay constant)
b = tf.Variable(tf.zeros([10]))

py_x = model(X_ph, w, b)

# Cost function, optimizer, prediction
# Take the mean over the batch so that the cost is a scalar
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y_ph))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
predict_op = tf.argmax(py_x, 1)

# TensorFlow session
with tf.Session() as sess:
    # you need to initialize all variables
    tf.global_variables_initializer().run()

    for i in range(100):
        for start, end in zip(range(0, len(trX), 128), range(128, len(trX)+1, 128)):
            train_fd = {X_ph: trX[start:end], Y_ph: trY[start:end]}
            sess.run(train_op, feed_dict=train_fd)
        # Compute and print the accuracy at the NumPy level
        if i % 10 == 0:
            print('step {:>3d}: accuracy = {:>8.3f}'.format(
                i, np.mean(np.argmax(teY, axis=1) ==
                         sess.run(predict_op, feed_dict={X_ph: teX}))))

TensorFlow lets you write code at several levels, from relatively primitive APIs (as in other tools) to high-level APIs, including the Keras API. Opinions may differ, but I think it is better for beginners to first learn the primitive usage described above, and then move on to a high-level API once the basics are understood to some extent.

Recently I have started studying "PyTorch" out of interest, and even with knowledge of TensorFlow there are plenty of stumbling blocks. I think simple tutorial code such as "linear regression" and "logistic regression" is useful for introductory study in any library.

(To learn more about TensorFlow, again, check out https://github.com/nlintz/TensorFlow-Tutorials!)
