This article is the day-21 entry of the ACCESS Co., Ltd. Advent Calendar 2019.
For an upcoming project at our company we need to do nonlinear least-squares fitting, so as preparation I will review the nonlinear least-squares fitting with lmfit that I have used before and summarize it here.
lmfit
As its official subtitle, "Non-Linear Least-Squares Minimization and Curve-Fitting for Python", says, lmfit is a library for model fitting by nonlinear least squares. It is developed as an extension built on top of the many optimization methods in scipy.optimize.
The following features are listed:
- Introduction of the `Parameter` class object, which makes handling model parameters easier.
- Ease of switching between model-fitting algorithms.
- Improved estimation of parameter confidence intervals; confidence intervals can be obtained more concisely than with `scipy.optimize.leastsq`.
- Improved curve fitting through the `Model` class object, which extends the functionality of `scipy.optimize.curve_fit`.
- Many built-in models for common lineshapes are available and ready to use.
- You can easily introduce your own function as a fitting model.
- Easy access to model parameters through Python's dictionary interface.
- Parameter bounds and fixing can be set easily: you can restrict the range searched for a parameter's optimum value, or fix a parameter's value so it is no longer a free parameter (see the sketch after this list).
- Parameters can be modified without changing the objective function.
- Parameters can be algebraically constrained, i.e. tied to other parameters through expressions.
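As a minimal sketch of the dictionary-style access and the bounds/fixing features (the parameter names and values here are made up purely for illustration):

```python
import lmfit

params = lmfit.Parameters()
params.add('amplitude', value=10.0, min=0.0)  # bounded below
params.add('decay', value=0.5, vary=False)    # fixed, i.e. not a free parameter

print(params['amplitude'].value)  # dictionary-style access to a parameter
```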
The data was created for this article and is plotted as follows:
As you can see at a glance, this data contains two structures: around 6.0 and around 7.5 there are peaks that look like normal distributions. Let's try fitting this data.
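The article does not show the code that generated the data, but a dataset like this could be produced roughly as follows (all values here, including the noise model, are assumptions; a `DataFrame` `df` with columns `x`, `y`, `dy` is the form the fit below expects):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = np.linspace(5.0, 8.5, 50)

# two Gaussian peaks on a constant offset, eyeballed from the plot
y_true = (20.0
          + 350.0 * np.exp(-(x - 6.0)**2 / (2 * 0.1**2))
          + 120.0 * np.exp(-(x - 7.5)**2 / (2 * 0.1**2)))
dy = np.sqrt(y_true)             # assume counting-statistics-like errors
y = y_true + rng.normal(0.0, dy)

df = pd.DataFrame({'x': x, 'y': y, 'dy': dy})
```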
As mentioned above, there are two normal-distribution-like structures, so the model needs a component representing each of them. Also, since there appears to be a constant non-zero offset, the model needs a component to reproduce that as well. As a formula:
f\left( x \right) = G_1 \left( x \right) + G_2 \left( x \right) + C
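Here C is a constant offset and each G_i is a Gaussian; in the parameterization used by lmfit's `GaussianModel` (amplitude A_i, center \mu_i, width \sigma_i) it reads:

G_i \left( x \right) = \frac{A_i}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{\left( x - \mu_i \right)^2}{2\sigma_i^2} \right)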
This can be expressed using the `lmfit.models` classes as follows.
```python
import lmfit as lf
import lmfit.models as lfm

# model definition
model = lfm.ConstantModel() + lfm.GaussianModel(prefix='gauss1_') + lfm.GaussianModel(prefix='gauss2_')
```
Specifying `prefix` is convenient: even when the same model function is used more than once, the parameters can still be distinguished.
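For example, the composite model above exposes prefixed parameter names (the exact ordering may differ):

```python
print(model.param_names)
# ['c', 'gauss1_amplitude', 'gauss1_center', 'gauss1_sigma',
#  'gauss2_amplitude', 'gauss2_center', 'gauss2_sigma']
```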
A function you define yourself can also be introduced in one shot by passing it to the `Model` class:
```python
def constant(x, const):
    return const

model = lf.Model(constant) + lfm.GaussianModel(prefix='gauss1_') + lfm.GaussianModel(prefix='gauss2_')
```
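`Model` inspects the function signature, so the parameter is named `const` after the function argument (the first argument, `x`, is recognized as the independent variable); this is why the fit report below shows `const` rather than `ConstantModel`'s `c`.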
Declaring a `Model` class object does not by itself declare the corresponding `Parameters` class object, so we start with that declaration.
```python
# set the initial value of each parameter
# ('c' is the ConstantModel parameter name, 'const' comes from the
#  user-defined constant function; listing both lets the same dicts
#  serve either model definition)
par_val = {
    'c': 20,
    'const': 20,
    'gauss1_center': 6.0,
    'gauss1_sigma': 1.0,
    'gauss1_amplitude': 400,
    'gauss2_center': 7.5,
    'gauss2_sigma': 1.0,
    'gauss2_amplitude': 150
}

# lower limits
par_min = {
    'c': 0,
    'const': 0,
    'gauss1_center': 0,
    'gauss1_sigma': 0,
    'gauss1_amplitude': 0,
    'gauss2_center': 0,
    'gauss2_sigma': 0,
    'gauss2_amplitude': 0
}

# upper limits
par_max = {
    'c': 100,
    'const': 100,
    'gauss1_center': 10,
    'gauss1_sigma': 10,
    'gauss1_amplitude': 1000,
    'gauss2_center': 10,
    'gauss2_sigma': 10,
    'gauss2_amplitude': 1000
}

# whether each parameter is free to vary
par_vary = {
    'c': True,
    'const': True,
    'gauss1_center': True,
    'gauss1_sigma': True,
    'gauss1_amplitude': True,
    'gauss2_center': True,
    'gauss2_sigma': True,
    'gauss2_amplitude': True
}

# declare the Parameters class object for the defined model
params = model.make_params()
for name in model.param_names:
    params[name].set(
        value=par_val[name],   # initial value
        min=par_min[name],     # lower limit
        max=par_max[name],     # upper limit
        vary=par_vary[name]    # whether the parameter is free to vary
    )
```
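To check the settings, `Parameters` provides a `pretty_print()` method that prints a table of value/min/max/vary/expr for each parameter:

```python
params.pretty_print()
```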
The relationship between parameters can be algebraically constrained as follows.
```python
params['gauss2_center'].set(expr='gauss1_center*1.25')
```
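Note that this constraint removes `gauss2_center` as a free parameter: the fit report below counts 6 variables rather than 7 and shows `gauss2_center` as derived from the expression.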
With the model and parameters ready, the fit itself is a single call:

```python
result = model.fit(x=df.x, data=df.y, weights=df.dy**(-1.0), params=params, method='leastsq')
```
Note: the data is given as a pandas `DataFrame` with columns `x`, `y`, and `dy`.
Summary
```python
print(result.fit_report())
```

This prints a summary of the fitting results:
```
[[Model]]
((Model(constant) + Model(gaussian, prefix='gauss1_')) + Model(gaussian, prefix='gauss2_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 383
# data points = 50
# variables = 6
chi-square = 29.2390839
reduced chi-square = 0.66452463
Akaike info crit = -14.8258350
Bayesian info crit = -3.35369698
[[Variables]]
const: 21.4200227 +/- 1.47331527 (6.88%) (init = 20)
gauss1_amplitude: 88.4566498 +/- 1.39004141 (1.57%) (init = 400)
gauss1_center: 5.99651706 +/- 0.00136123 (0.02%) (init = 6)
gauss1_sigma: 0.09691089 +/- 0.00153157 (1.58%) (init = 1)
gauss2_amplitude: 31.6555368 +/- 1.39743186 (4.41%) (init = 150)
gauss2_center: 7.49564632 +/- 0.00170154 (0.02%) == 'gauss1_center*1.25'
gauss2_sigma: 0.09911853 +/- 0.00446584 (4.51%) (init = 1)
gauss1_fwhm: 0.22820770 +/- 0.00360656 (1.58%) == '2.3548200*gauss1_sigma'
gauss1_height: 364.139673 +/- 4.89553113 (1.34%) == '0.3989423*gauss1_amplitude/max(2.220446049250313e-16, gauss1_sigma)'
gauss2_fwhm: 0.23340629 +/- 0.01051624 (4.51%) == '2.3548200*gauss2_sigma'
gauss2_height: 127.410415 +/- 4.77590041 (3.75%) == '0.3989423*gauss2_amplitude/max(2.220446049250313e-16, gauss2_sigma)'
[[Correlations]](unreported correlations are < 0.100)
C(gauss2_amplitude, gauss2_sigma) = 0.647
C(gauss1_amplitude, gauss1_sigma) = 0.636
C(const, gauss2_amplitude) = -0.555
C(const, gauss1_amplitude) = -0.552
C(const, gauss1_sigma) = -0.367
C(const, gauss2_sigma) = -0.359
C(gauss1_amplitude, gauss2_amplitude) = 0.306
C(gauss1_sigma, gauss2_amplitude) = 0.203
C(gauss1_amplitude, gauss2_sigma) = 0.198
C(gauss1_sigma, gauss2_sigma) = 0.131
```
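The reduced chi-square of roughly 0.66 is close to 1, meaning the model reproduces the data within the assumed errors; a value noticeably below 1 can also hint that the error bars are somewhat overestimated.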
Confidence intervals
```python
print('[[Confidence Intervals]]')
print(result.ci_report())
```
```
[[Confidence Intervals]]
                    99.73%    95.45%    68.27%    _BEST_    68.27%    95.45%    99.73%
 const           : -4.72534  -3.05035  -1.49522  21.42002  +1.48820  +3.02111  +4.65531
 gauss1_amplitude: -4.37979  -2.84434  -1.40305  88.45665  +1.41243  +2.88513  +4.47133
 gauss1_center   : -0.00435  -0.00278  -0.00131   5.99652  +0.00131  +0.00278  +0.00436
 gauss1_sigma    : -0.00479  -0.00312  -0.00154   0.09691  +0.00156  +0.00321  +0.00499
 gauss2_amplitude: -4.31612  -2.82491  -1.40294  31.65554  +1.43259  +2.94767  +4.61057
 gauss2_sigma    : -0.01344  -0.00889  -0.00446   0.09912  +0.00466  +0.00971  +0.01540
```
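`gauss2_center` does not appear in this table because it is tied to `gauss1_center` by the constraint expression and is not a free parameter.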
- Plot of the fitting results
```python
fig, gridspec = result.plot(data_kws={'markersize': 5})
```
The following graph is automatically generated.
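If you only need one of the two panels, `ModelResult` also provides `plot_fit()` and `plot_residuals()`.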
- Fitting results: the properties below hold the optimum values and other fit results, so you can post-process them however you like.
```python
result.chisqr                  # chi-square value
result.nfree                   # degrees of freedom
result.redchi                  # reduced chi-square value
result.aic                     # Akaike information criterion
result.bic                     # Bayesian information criterion
result.best_fit                # best-fit model values (the orange curve in the plot above)
result.residual                # (best-fit model) - (data) values (upper panel of the plot)
result.eval_components(x=df.x) # values of each component of the best-fit model
result.best_values             # optimum value of each parameter (dict)
result.params['const'].value   # optimum value of one parameter
result.params['const'].stderr  # standard error of one parameter
```
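As one example of using these, here is a sketch that overlays each model component on the data (it assumes matplotlib and the `df` and `result` objects from above):

```python
import matplotlib.pyplot as plt
import numpy as np

comps = result.eval_components(x=df.x)

plt.errorbar(df.x, df.y, yerr=df.dy, fmt='.', label='data')
for name, comp in comps.items():
    # the constant component can come back as a scalar, so broadcast it
    plt.plot(df.x, np.broadcast_to(comp, df.x.shape), '--', label=name)
plt.plot(df.x, result.best_fit, label='best fit')
plt.legend()
plt.show()
```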
That's all. Please look forward to @rheza_h's article on day 22.