This article is the day-21 entry of the ACCESS Co., Ltd. Advent Calendar 2019.
For an upcoming project at our company we need to do nonlinear least-squares fitting, so as preparation I will review the nonlinear least-squares fitting with lmfit that I have used before and summarize it here.
lmfit
As its official subtitle, "Non-Linear Least-Squares Minimization and Curve-Fitting for Python", says, lmfit is a library for model fitting by nonlinear least squares. It is developed as an extension built on top of the many optimization methods in scipy.optimize.
The following features are listed:
- Introduction of the `Parameter` class object, which makes handling model parameters easier.
- Ease of switching between model-fitting algorithms.
- Improved estimation of parameter confidence intervals; confidence intervals can be obtained more concisely than with `scipy.optimize.leastsq`.
- Improved curve fitting through the `Model` class object, which extends the functionality of `scipy.optimize.curve_fit`.
- Many built-in models for common lineshapes are available and ready to use.
- You can easily introduce your own function as a fitting model.
- Easy access to model parameters through Python's dictionary interface.
- Parameter bounds and fixing can be set easily: you can restrict the range searched for a parameter's optimum value, or fix a parameter's value so it is no longer a free parameter (see the sketch after this list).
- Parameters can be modified without changing the objective function.
- Parameters can be algebraically constrained, i.e. tied to other parameters through expressions.
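As a minimal sketch of the dictionary-style access and the bounds/fixing features (the parameter names and values here are made up purely for illustration):

```python
import lmfit

params = lmfit.Parameters()
params.add('amplitude', value=10.0, min=0.0)  # bounded below
params.add('decay', value=0.5, vary=False)    # fixed, i.e. not a free parameter

print(params['amplitude'].value)  # dictionary-style access to a parameter
```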
The data was created for this article and is plotted as follows:
As you can see at a glance, this data contains two structures: around 6.0 and around 7.5 there are peaks that look like normal distributions. Let's try fitting this data.
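The article does not show the code that generated the data, but a dataset like this could be produced roughly as follows (all values here, including the noise model, are assumptions; a `DataFrame` `df` with columns `x`, `y`, `dy` is the form the fit below expects):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
x = np.linspace(5.0, 8.5, 50)

# two Gaussian peaks on a constant offset, eyeballed from the plot
y_true = (20.0
          + 350.0 * np.exp(-(x - 6.0)**2 / (2 * 0.1**2))
          + 120.0 * np.exp(-(x - 7.5)**2 / (2 * 0.1**2)))
dy = np.sqrt(y_true)             # assume counting-statistics-like errors
y = y_true + rng.normal(0.0, dy)

df = pd.DataFrame({'x': x, 'y': y, 'dy': dy})
```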
As mentioned above, there are two normal-distribution-like structures, so the model needs a component representing each of them. Also, since there appears to be a constant non-zero offset, the model needs a component to reproduce that as well. As a formula:
f\left( x \right) = G_1 \left( x \right) + G_2 \left( x \right) + C
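Here C is a constant offset and each G_i is a Gaussian; in the parameterization used by lmfit's `GaussianModel` (amplitude A_i, center \mu_i, width \sigma_i) it reads:

G_i \left( x \right) = \frac{A_i}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{\left( x - \mu_i \right)^2}{2\sigma_i^2} \right)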
This can be expressed using the `lmfit.models` classes as follows.
```python
import lmfit as lf
import lmfit.models as lfm

# model definition
model = lfm.ConstantModel() + lfm.GaussianModel(prefix='gauss1_') + lfm.GaussianModel(prefix='gauss2_')
```
Specifying `prefix` is convenient: even when the same model function is used more than once, the parameters can still be distinguished.
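For example, the composite model above exposes prefixed parameter names (the exact ordering may differ):

```python
print(model.param_names)
# ['c', 'gauss1_amplitude', 'gauss1_center', 'gauss1_sigma',
#  'gauss2_amplitude', 'gauss2_center', 'gauss2_sigma']
```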
A function you define yourself can also be introduced in one shot by passing it to the `Model` class:
```python
def constant(x, const):
    return const

model = lf.Model(constant) + lfm.GaussianModel(prefix='gauss1_') + lfm.GaussianModel(prefix='gauss2_')
```
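`Model` inspects the function signature, so the parameter is named `const` after the function argument (the first argument, `x`, is recognized as the independent variable); this is why the fit report below shows `const` rather than `ConstantModel`'s `c`.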
Declaring a `Model` class object does not by itself declare the corresponding `Parameters` class object, so we start with that declaration.
```python
# set the initial value of each parameter
# ('c' is the ConstantModel parameter name, 'const' comes from the
#  user-defined constant function; listing both lets the same dicts
#  serve either model definition)
par_val = {
    'c': 20,
    'const': 20,
    'gauss1_center': 6.0,
    'gauss1_sigma': 1.0,
    'gauss1_amplitude': 400,
    'gauss2_center': 7.5,
    'gauss2_sigma': 1.0,
    'gauss2_amplitude': 150
}

# lower limits
par_min = {
    'c': 0,
    'const': 0,
    'gauss1_center': 0,
    'gauss1_sigma': 0,
    'gauss1_amplitude': 0,
    'gauss2_center': 0,
    'gauss2_sigma': 0,
    'gauss2_amplitude': 0
}

# upper limits
par_max = {
    'c': 100,
    'const': 100,
    'gauss1_center': 10,
    'gauss1_sigma': 10,
    'gauss1_amplitude': 1000,
    'gauss2_center': 10,
    'gauss2_sigma': 10,
    'gauss2_amplitude': 1000
}

# whether each parameter is free to vary
par_vary = {
    'c': True,
    'const': True,
    'gauss1_center': True,
    'gauss1_sigma': True,
    'gauss1_amplitude': True,
    'gauss2_center': True,
    'gauss2_sigma': True,
    'gauss2_amplitude': True
}

# declare the Parameters class object for the defined model
params = model.make_params()
for name in model.param_names:
    params[name].set(
        value=par_val[name],   # initial value
        min=par_min[name],     # lower limit
        max=par_max[name],     # upper limit
        vary=par_vary[name]    # whether the parameter is free to vary
    )
```
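To check the settings, `Parameters` provides a `pretty_print()` method that prints a table of value/min/max/vary/expr for each parameter:

```python
params.pretty_print()
```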
The relationship between parameters can be algebraically constrained as follows.
```python
params['gauss2_center'].set(expr='gauss1_center*1.25')
```
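Note that this constraint removes `gauss2_center` as a free parameter: the fit report below counts 6 variables rather than 7 and shows `gauss2_center` as derived from the expression.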
With the model and parameters ready, the fit itself is a single call:

```python
result = model.fit(x=df.x, data=df.y, weights=df.dy**(-1.0), params=params, method='leastsq')
```
Note: the data is given as a pandas `DataFrame` with columns `x`, `y`, and `dy`.
Summary
```python
print(result.fit_report())
```

This prints a summary of the fitting results:
```
[[Model]]
((Model(constant) + Model(gaussian, prefix='gauss1_')) + Model(gaussian, prefix='gauss2_'))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 383
# data points = 50
# variables = 6
chi-square = 29.2390839
reduced chi-square = 0.66452463
Akaike info crit = -14.8258350
Bayesian info crit = -3.35369698
[[Variables]]
const: 21.4200227 +/- 1.47331527 (6.88%) (init = 20)
gauss1_amplitude: 88.4566498 +/- 1.39004141 (1.57%) (init = 400)
gauss1_center: 5.99651706 +/- 0.00136123 (0.02%) (init = 6)
gauss1_sigma: 0.09691089 +/- 0.00153157 (1.58%) (init = 1)
gauss2_amplitude: 31.6555368 +/- 1.39743186 (4.41%) (init = 150)
gauss2_center: 7.49564632 +/- 0.00170154 (0.02%) == 'gauss1_center*1.25'
gauss2_sigma: 0.09911853 +/- 0.00446584 (4.51%) (init = 1)
gauss1_fwhm: 0.22820770 +/- 0.00360656 (1.58%) == '2.3548200*gauss1_sigma'
gauss1_height: 364.139673 +/- 4.89553113 (1.34%) == '0.3989423*gauss1_amplitude/max(2.220446049250313e-16, gauss1_sigma)'
gauss2_fwhm: 0.23340629 +/- 0.01051624 (4.51%) == '2.3548200*gauss2_sigma'
gauss2_height: 127.410415 +/- 4.77590041 (3.75%) == '0.3989423*gauss2_amplitude/max(2.220446049250313e-16, gauss2_sigma)'
[[Correlations]](unreported correlations are < 0.100)
C(gauss2_amplitude, gauss2_sigma) = 0.647
C(gauss1_amplitude, gauss1_sigma) = 0.636
C(const, gauss2_amplitude) = -0.555
C(const, gauss1_amplitude) = -0.552
C(const, gauss1_sigma) = -0.367
C(const, gauss2_sigma) = -0.359
C(gauss1_amplitude, gauss2_amplitude) = 0.306
C(gauss1_sigma, gauss2_amplitude) = 0.203
C(gauss1_amplitude, gauss2_sigma) = 0.198
C(gauss1_sigma, gauss2_sigma) = 0.131
```
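The reduced chi-square of roughly 0.66 is close to 1, meaning the model reproduces the data within the assumed errors; a value noticeably below 1 can also hint that the error bars are somewhat overestimated.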
Confidence intervals
```python
print('[[Confidence Intervals]]')
print(result.ci_report())
```
```
[[Confidence Intervals]]
                    99.73%    95.45%    68.27%    _BEST_    68.27%    95.45%    99.73%
 const           : -4.72534  -3.05035  -1.49522  21.42002  +1.48820  +3.02111  +4.65531
 gauss1_amplitude: -4.37979  -2.84434  -1.40305  88.45665  +1.41243  +2.88513  +4.47133
 gauss1_center   : -0.00435  -0.00278  -0.00131   5.99652  +0.00131  +0.00278  +0.00436
 gauss1_sigma    : -0.00479  -0.00312  -0.00154   0.09691  +0.00156  +0.00321  +0.00499
 gauss2_amplitude: -4.31612  -2.82491  -1.40294  31.65554  +1.43259  +2.94767  +4.61057
 gauss2_sigma    : -0.01344  -0.00889  -0.00446   0.09912  +0.00466  +0.00971  +0.01540
```
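`gauss2_center` does not appear in this table because it is tied to `gauss1_center` by the constraint expression and is not a free parameter.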
- Plot of the fitting results
```python
fig, gridspec = result.plot(data_kws={'markersize': 5})
```
The following graph is automatically generated.
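If you only need one of the two panels, `ModelResult` also provides `plot_fit()` and `plot_residuals()`.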
- Fitting results: the properties below hold the optimum values and other fit results, so you can post-process them however you like.
```python
result.chisqr                  # chi-square value
result.nfree                   # degrees of freedom
result.redchi                  # reduced chi-square value
result.aic                     # Akaike information criterion
result.bic                     # Bayesian information criterion
result.best_fit                # best-fit model values (the orange curve in the plot above)
result.residual                # (best-fit model) - (data) values (upper panel of the plot)
result.eval_components(x=df.x) # values of each component of the best-fit model
result.best_values             # optimum value of each parameter (dict)
result.params['const'].value   # optimum value of one parameter
result.params['const'].stderr  # standard error of one parameter
```
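As one example of using these, here is a sketch that overlays each model component on the data (it assumes matplotlib and the `df` and `result` objects from above):

```python
import matplotlib.pyplot as plt
import numpy as np

comps = result.eval_components(x=df.x)

plt.errorbar(df.x, df.y, yerr=df.dy, fmt='.', label='data')
for name, comp in comps.items():
    # the constant component can come back as a scalar, so broadcast it
    plt.plot(df.x, np.broadcast_to(comp, df.x.shape), '--', label=name)
plt.plot(df.x, result.best_fit, label='best fit')
plt.legend()
plt.show()
```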
That's all. Please look forward to @rheza_h's article on day 22.