This article explains how to approximate and model observed values (pairs of x and y) with a function you specify in Python.
Image: the kind of result we are aiming for
Fitting a function to data is the basis of modeling. For a linear approximation you can usually use a linear regression package. Here, we explain how to fit any function you specify, including non-linear ones.
We use a function called "curve_fit" from Python's scipy package. More precisely, it is part of the scipy.optimize module.
First, load the package to be used this time.
import.py
##What to use for fitting
from scipy.optimize import curve_fit
import numpy as np
##What to use for illustration
import seaborn as sns
import matplotlib.pyplot as plt
numpy is used to represent the exponential function. If you are not familiar with plotting with seaborn, please see the following article.
Beautiful graph drawing with python http://qiita.com/hik0107/items/3dc541158fceb3156ee0
Let's see how function fitting actually works. As a warm-up, we start with a linear approximation.
Create the observation data to be fitted as follows. Since we will fit a straight line, prepare data that lies close to one.
linear.py
list_linear_x = range(0,20,2)
array_error = np.random.normal(size=len(list_linear_x))
array_x = np.array(list_linear_x)
array_y = array_x + array_error ##Add an error term to the straight line y = x to make noisy data
Let's take a look at the resulting data.
linear.py
sns.pointplot(x=array_x, y=array_y, join=False)
Now let's fit this in the form of Y = ax + b. This is where curve_fit comes in.
fitting.py
##Define the function expression you want to fit as a function
def linear_fit(x, a, b):
    return a*x + b

param, cov = curve_fit(linear_fit, array_x, array_y)
That is all. The first return value, param, holds the estimates of parameters a and b as a NumPy array, and the second, cov, is their estimated covariance matrix. The arguments of curve_fit are written as (function used for fitting, x for fitting, y for fitting). To handle several independent variables, pass x as a two-dimensional array with one row per variable and unpack it inside the function.
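As a rough sketch of the multi-variable case, the example below fits a plane with two independent variables. The model `plane_fit`, the true coefficients (2, -1, 3), the noise level, and the fixed random seed are all assumptions made up for illustration and are not part of the original article.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical two-variable model: y = a*x1 + b*x2 + c
def plane_fit(X, a, b, c):
    x1, x2 = X  # X has shape (2, M): one row per independent variable
    return a * x1 + b * x2 + c

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
x1 = np.linspace(0, 10, 50)
x2 = rng.uniform(0, 5, size=50)
# Assumed "true" coefficients a=2, b=-1, c=3 plus small Gaussian noise
y = 2.0 * x1 - 1.0 * x2 + 3.0 + rng.normal(scale=0.1, size=50)

X = np.vstack([x1, x2])  # stack the variables into a (2, 50) array
param_multi, cov_multi = curve_fit(plane_fit, X, y)
print(param_multi)  # estimates of a, b, c
```

The y data stays one-dimensional with one value per observation; only x gains an extra dimension.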
Let's see the result of the fitting.
fitting.py
array_y_fit = param[0] * array_x + param[1]
sns.pointplot(x=array_x, y=array_y, join=False)
sns.pointplot(x=array_x, y=array_y_fit, markers="")
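The fitting method is least squares: curve_fit looks for the parameters that minimize the sum of squared differences between the function and the observed y values. For a straight line this is the same criterion as OLS (ordinary least squares).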
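One way to check the least-squares claim is to compare curve_fit with np.polyfit, which solves the straight-line least-squares problem directly. This is only a sketch: the data is regenerated here with a fixed seed (an assumption, so the block runs on its own), and the names `param_cf` and `param_pf` are made up for this example.

```python
import numpy as np
from scipy.optimize import curve_fit

def linear_fit(x, a, b):
    return a * x + b

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
x = np.arange(0, 20, 2, dtype=float)
y = x + rng.normal(size=x.size)  # straight line y = x plus noise

param_cf, _ = curve_fit(linear_fit, x, y)  # iterative least squares
param_pf = np.polyfit(x, y, deg=1)         # direct least squares: [slope, intercept]

print(param_cf)
print(param_pf)
```

Both calls should give the same slope and intercept up to numerical precision, since they minimize the same sum of squared errors.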
Next, let's try an approximation with a slightly more complicated non-linear function. For example, consider a function such as f(x) = b * exp(x / (a + x)).
nonlinear.py
list_y = []
##Reuse a and b from the linear fit above as the "true" parameters and add uniform noise
for num in array_x:
    list_y.append( param[1] * np.exp( num /(param[0] + num) ) + np.random.rand() )
array_y= np.array(list_y)
sns.pointplot(x=array_x, y=array_y, join=False)
The data looks like this. A non-linear function that levels off seems likely to fit better than a straight line.
Now let's fit this in the form f(x) = b * exp(x / (a + x)).
fitting.py
def nonlinear_fit(x,a,b):
    return b * np.exp(x / (a+x) )

param, cov = curve_fit(nonlinear_fit, array_x, array_y)
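Non-linear fits can depend on where the search starts, so it may help to pass an initial guess through the `p0` argument of curve_fit (when it is omitted, all parameters start at 1). The sketch below also turns the diagonal of the covariance matrix into rough standard errors. The true values a=2.0 and b=1.5, the denser x grid, the Gaussian noise, and the seed are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import curve_fit

def nonlinear_fit(x, a, b):
    return b * np.exp(x / (a + x))

rng = np.random.default_rng(2)  # fixed seed, chosen arbitrarily
x = np.linspace(0, 20, 200)     # denser grid than in the article (assumption)
# Assumed "true" parameters a=2.0, b=1.5 plus small Gaussian noise
y = nonlinear_fit(x, 2.0, 1.5) + rng.normal(scale=0.05, size=x.size)

# p0 is the starting point for (a, b)
param_nl, cov_nl = curve_fit(nonlinear_fit, x, y, p0=[1.0, 1.0])

# Square roots of the diagonal of cov give approximate standard errors
perr = np.sqrt(np.diag(cov_nl))
for name, est, err in zip(["a", "b"], param_nl, perr):
    print(f"{name} = {est:.3f} +/- {err:.3f}")
```

If the fit fails to converge or lands on implausible values, trying a different `p0` is usually the first thing to check.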
draw.py
list_y = []
for num in array_x:
    list_y.append( param[1] * np.exp( num /(param[0] + num) ))
sns.pointplot(x=array_x, y=array_y, join=False)
sns.pointplot(x=array_x, y=np.array(list_y), markers="")
This is how the fitted curve looks against the data.
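Besides looking at the plot, the fit can be checked numerically. The sketch below evaluates the fitted function directly with `nonlinear_fit(x, *param)` instead of a loop, then computes the root-mean-square error and R squared. The data here is regenerated with assumed true parameters (a=2.0, b=1.5), Gaussian noise, and a fixed seed so the block runs on its own; the variable names are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def nonlinear_fit(x, a, b):
    return b * np.exp(x / (a + x))

rng = np.random.default_rng(3)  # fixed seed, chosen arbitrarily
x_obs = np.arange(0, 20, 2, dtype=float)
# Assumed "true" parameters a=2.0, b=1.5 plus small Gaussian noise
y_obs = nonlinear_fit(x_obs, 2.0, 1.5) + rng.normal(scale=0.05, size=x_obs.size)

param_d, _ = curve_fit(nonlinear_fit, x_obs, y_obs)

# Vectorized evaluation: unpack the fitted parameters into the function
y_hat = nonlinear_fit(x_obs, *param_d)

# A dense grid gives a smooth curve if you want to plot the fit
x_dense = np.linspace(x_obs.min(), x_obs.max(), 200)
y_dense = nonlinear_fit(x_dense, *param_d)

# Goodness of fit on the observed points
residuals = y_obs - y_hat
rmse = np.sqrt(np.mean(residuals ** 2))
r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
print(f"RMSE = {rmse:.4f}, R^2 = {r2:.4f}")
```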
A rudimentary summary of data manipulation in Python Pandas http://qiita.com/hik0107/items/d991cc44c2d1778bb82e
Data analysis in Python Summary of sources to look at first for beginners http://qiita.com/hik0107/items/0bec82cc09d0e05d5357
If you are interested in data science, start here: a summary of literature and videos http://qiita.com/hik0107/items/ef5e044d2f47940ba712