In "Explanation of the concept of regression analysis using Python Part 1", I explained that the least squares method draws the optimal straight line through the data by choosing the parameters that minimize the difference (error) between the line and the data points. Here, as an extra edition, I drew graphs that animate how the fit changes as each parameter varies. Watching the line move should give you an intuitive picture of what is being minimized.
The "sum of square errors" in the graph title is the sum of the squared errors between the line and the data points, so the best position for the line is where this value is smallest.
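To make the quantity in the title concrete, here is a minimal sketch of how the sum of squared errors could be computed for a candidate line y = slope * x + intercept. The function name sum_of_squared_errors and the three sample points are hypothetical, chosen only for illustration; they are not the cars data.

```python
def sum_of_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between each point and the line."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = slope * x + intercept  # height of the line at this x
        total += (predicted - y) ** 2
    return total


# Hypothetical sample points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]

exact = sum_of_squared_errors(xs, ys, slope=2.0, intercept=1.0)
too_flat = sum_of_squared_errors(xs, ys, slope=1.0, intercept=1.0)
print(exact, too_flat)  # prints: 0.0 14.0
```

The line that passes through every point has an error sum of zero, and flattening the slope makes the sum grow. The animations in this post show the same effect on real data.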
First, let's see how the fit changes as the slope changes.
I use matplotlib.animation.FuncAnimation to output the animation with matplotlib. The code that draws the graph is moved out into its own function, which takes as its argument the value that the animation varies. Here that function is called animate. FuncAnimation calls animate once per frame, passing the values 0 through frames - 1 in order as the argument nframe.
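The calling convention can be checked in isolation with a small sketch. This is not the author's script: it assumes the headless Agg backend, a 5-frame animation, and matplotlib's PillowWriter writing a GIF to a temporary directory, all chosen here so that the example runs without ffmpeg. The list called_with is a hypothetical helper that records each argument.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib import animation as ani

called_with = []  # hypothetical helper: records every nframe passed in


def animate(nframe):
    called_with.append(nframe)
    plt.clf()                      # redraw the whole figure for each frame
    plt.plot([0, 1], [0, nframe])  # something that depends on nframe
    plt.title("nframe=%d" % nframe)


fig = plt.figure()
anim = ani.FuncAnimation(fig, animate, frames=5)

# Saving drives the animation, which is what triggers the calls to animate
out_path = os.path.join(tempfile.mkdtemp(), "sketch.gif")
anim.save(out_path, writer=ani.PillowWriter(fps=13))

print(sorted(set(called_with)))  # prints: [0, 1, 2, 3, 4]
```

The recorded values are collapsed with set() because matplotlib may call the function one extra time with the first frame for its initial draw.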
import numpy as np
import matplotlib.pyplot as plt
from moviepy.editor import VideoFileClip  # moviepy 1.x import path
from matplotlib import animation as ani

data = np.loadtxt('cars.csv', delimiter=',', skiprows=1)
data[:, 1] = data[:, 1] * 1.61    # convert speed from mph to km/h
data[:, 2] = data[:, 2] * 0.3048  # convert distance from ft to m

def animate(nframe):
    plt.clf()  # clear graph canvas
    slope = 0.746606334842 * (float(nframe) / 50) * 2  # the slope changes as the argument nframe changes
    intercept = -5.41583710407
    x = np.linspace(0, 50, 50)
    y = slope * x + intercept
    plt.ylim(-10, 80)
    plt.xlim(0, 50)
    plt.xlabel("speed(km/h)")
    plt.ylabel("distance(m)")
    plt.scatter(data[:, 1], data[:, 2])
    # draw errors
    se = 0
    for d in data:
        predicted = d[1] * slope + intercept  # height of the line at this point's speed
        plt.plot([d[1], d[1]], [d[2], predicted], "k")
        se += (predicted - d[2]) ** 2  # error measured at the data point, not at the grid x
    plt.title("Stopping Distances of Cars (slope=%.3f, sum of square errors=%5d)" % (slope, se))
    # draw the current line (base line: y = 0.74x - 5)
    plt.plot(x, y)

fig = plt.figure(figsize=(10, 6))
# blit=True is dropped: animate redraws everything with plt.clf() and returns no artists
anim = ani.FuncAnimation(fig, animate, frames=50)
anim.save('regression_anim.mp4', fps=13)

clip = VideoFileClip("regression_anim.mp4")
clip.write_gif("regression_anim.gif")
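To see which slopes the 50 frames actually cover, the expression inside animate can be evaluated on its own. The sketch below copies that expression; the names BASE_SLOPE, FRAMES and slope_for_frame are hypothetical, introduced only for this illustration.

```python
BASE_SLOPE = 0.746606334842  # slope constant taken from the listing
FRAMES = 50                  # same value as frames=50 in FuncAnimation


def slope_for_frame(nframe):
    """Same expression as in animate: scales the base slope by a factor from 0 up to almost 2."""
    return BASE_SLOPE * (float(nframe) / FRAMES) * 2


first = slope_for_frame(0)            # flat line
middle = slope_for_frame(25)          # equals BASE_SLOPE
last = slope_for_frame(FRAMES - 1)    # just under twice BASE_SLOPE
print("%.3f %.3f %.3f" % (first, middle, last))  # prints: 0.000 0.747 1.463
```

The sweep starts from a horizontal line, passes through the base slope at frame 25, and ends at nearly double that slope, so the smallest sum of squared errors should appear around the middle of the animation.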
Next, unlike before, the slope stays fixed and the intercept moves.
def animate(nframe):
    plt.clf()  # clear graph canvas
    slope = 0.746606334842
    intercept = -5.41583710407 + (float(nframe - 25) / 50) * 50  # the intercept changes as the argument nframe changes
    x = np.linspace(0, 50, 50)
    y = slope * x + intercept
    plt.ylim(-30, 80)
    plt.xlim(0, 50)
    plt.xlabel("speed(km/h)")
    plt.ylabel("distance(m)")
    plt.scatter(data[:, 1], data[:, 2])
    # draw errors
    se = 0
    for d in data:
        predicted = d[1] * slope + intercept  # height of the line at this point's speed
        plt.plot([d[1], d[1]], [d[2], predicted], "k")
        se += (predicted - d[2]) ** 2  # error measured at the data point, not at the grid x
    # show the intercept in the title, since it is the value that changes here
    plt.title("Stopping Distances of Cars (intercept=%.3f, sum of square errors=%5d)" % (intercept, se))
    # draw the current line (base line: y = 0.74x - 5)
    plt.plot(x, y)

fig = plt.figure(figsize=(10, 6))
# blit=True is dropped: animate redraws everything with plt.clf() and returns no artists
anim = ani.FuncAnimation(fig, animate, frames=50)
anim.save('regression_anim_i.mp4', fps=13)

clip = VideoFileClip("regression_anim_i.mp4")
clip.write_gif("regression_anim_i.gif")
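Both animations suggest that the error sum is smallest at the least squares line and grows when either parameter moves away from it. This can be checked numerically with a small sketch. It assumes the standard closed-form solution for a straight-line fit and uses four hypothetical sample points rather than the cars data; the function names sse and least_squares_fit are also made up for this illustration.

```python
def sse(xs, ys, slope, intercept):
    """Sum of squared errors of the line y = slope * x + intercept."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))


def least_squares_fit(xs, ys):
    """Closed-form least squares slope and intercept for a straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample points, not the cars data
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.5, 4.5, 5.0]

slope, intercept = least_squares_fit(xs, ys)
best = sse(xs, ys, slope, intercept)

# Move one parameter at a time away from the fit, as the two animations do
slope_sweep = [sse(xs, ys, slope + d, intercept) for d in (-0.5, -0.1, 0.1, 0.5)]
intercept_sweep = [sse(xs, ys, slope, intercept + d) for d in (-2.0, -0.5, 0.5, 2.0)]

print("fit: slope=%.2f intercept=%.2f sse=%.2f" % (slope, intercept, best))
print(all(v > best for v in slope_sweep + intercept_sweep))  # prints: True
```

Every perturbed line has a larger error sum than the fitted one, which is what the two animations show frame by frame.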