This is the third post in the series. This time I'd like to do linear regression. As usual it is implemented in Python, and a detailed explanation is more than I can cover here, so please check other sites for the theory (laugh). It's getting a bit messy, but please bear with me.
There are two terms to know: the mean squared error and the gradient method.
In linear regression, a line y = Θ_1x + Θ_2 is drawn through the scattered data, and the squared error between that line and the actual data, summed over the samples and averaged, is called the mean squared error.
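Written out (a sketch in the notation above, where (x^(i), y^(i)) denotes the i-th training example and m the number of examples), the cost being minimized is:

```math
J(\Theta_1, \Theta_2) = \frac{1}{2m} \sum_{i=1}^{m} \left( \Theta_1 x^{(i)} + \Theta_2 - y^{(i)} \right)^2
```

The extra factor of 2 in the denominator is just a convention that makes the derivative come out cleaner.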
In the sample, it is defined as follows.
T.dot() computes the inner product. The cost takes the intercept plus the inner product of the data and the weights, subtracts the actual y, squares it, sums over all samples, and divides by twice the number of training examples (2m).
```python
# Cost function
j = T.sum(T.sqr(t[0] + T.dot(data, t[1:]) - target)) / (2 * m)
```
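As a sanity check, here is a minimal plain-numpy sketch of the same cost (hypothetical helper; `theta[0]` is the intercept, `theta[1:]` the weights):

```python
import numpy as np

def cost(theta, data, target):
    # Prediction: intercept + inner product of features and weights
    pred = theta[0] + data.dot(theta[1:])
    # Sum of squared errors, divided by 2m
    return np.sum((pred - target) ** 2) / (2 * len(data))
```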
For a picture of what this looks like, please refer to the Machine Learning Algorithm Implementation Series [Linear Regression]!
The gradient method is a way of correcting Θ_1 and Θ_2 of y = Θ_1x + Θ_2
in the direction that fits the data better.
The code is as follows. T.grad() differentiates the cost with respect to the shared variable t (= theta = Θ, initialized to zeros, e.g. np.array([0, 0])): it takes the partial derivatives of the mean squared error defined above with respect to Θ_1 and Θ_2, respectively.
Then a function called train() is defined, and `updates={...}` overwrites the value of t on every call.
```python
# Partial derivatives
dt = T.grad(cost=j, wrt=t)

# Gradient method (update of theta)
train = theano.function(
    inputs=[],
    outputs=[j],
    updates={t: t - alpha * dt}
)
```
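To make the update rule concrete, here is a minimal hand-rolled sketch of the same gradient step in plain numpy (a hypothetical one-feature example; Theano does the differentiation for us, so here the partial derivatives are written out by hand):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])  # roughly y = 2x + 1
theta = np.zeros(2)                 # theta[0]: intercept, theta[1]: slope
alpha, m = 0.1, len(x)

for _ in range(100):
    err = theta[0] + theta[1] * x - y
    # Partial derivatives of the cost with respect to each theta
    grad = np.array([np.sum(err) / m, np.sum(err * x) / m])
    # Move theta in the direction that lowers the cost
    theta -= alpha * grad
```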
For a picture, please refer to [4th Gradient Descent Method](https://github.com/levelfour/machine-learning-2014/wiki/%E7%AC%AC4%E5%9B%9E---%E5%8B%BE%E9%85%8D%E6%B3%95%EF%BC%88%E6%9C%80%E6%80%A5%E9%99%8D%E4%B8%8B%E6%B3%95%EF%BC%89)!
Now, here is the full code. I use a library called theano for the implementation, so if something is unclear, please google it! The code doesn't cover prediction yet (I will update it if I have time).
```python
# -*- coding: utf-8 -*-
import numpy as np
import theano
import theano.tensor as T


class Regression:
    def __init__(self):
        self.t = None

    def fit(self, data, target, alpha=0.1):
        # Coerce to a 2-D array of m examples x n explanatory variables
        data = np.asarray(data, dtype=float)
        if data.ndim == 1:
            data = data.reshape(-1, 1)
        m, n = data.shape
        # Shared variable: theta (t[0] is the intercept, t[1:] the weights)
        t = theano.shared(np.zeros(n + 1), name='theta')
        # Cost function
        j = T.sum(T.sqr(t[0] + T.dot(data, t[1:]) - target)) / (2 * m)
        # Partial derivatives
        dt = T.grad(cost=j, wrt=t)
        # Gradient method (update of theta)
        train = theano.function(
            inputs=[],
            outputs=[j],
            updates={t: t - alpha * dt}
        )
        # Learning: 100 gradient descent steps
        for i in range(100):
            train()
        # Keep the learned parameters
        self.t = t.get_value()


if __name__ == '__main__':
    from sklearn import datasets

    iris = datasets.load_iris()
    reg = Regression()
    reg.fit(data=iris.data, target=iris.target)
```
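Since prediction isn't included above, here is a minimal sketch of what a `predict()` method could look like (a hypothetical addition, assuming `fit()` stores the learned parameters in `self.t` as in the version above):

```python
# A possible predict() method for the Regression class (hypothetical addition)
def predict(self, data):
    # Coerce to 2-D, matching fit()
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data.reshape(-1, 1)
    # Learned line: intercept + inner product of features and weights
    return self.t[0] + data.dot(self.t[1:])
```

With that in place, `reg.predict(iris.data)` would return the fitted values for the training data.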
The following site was very helpful