Module kmpfit (API)¶
Introduction¶
This module provides the class Fitter, which uses the implementation in C of MPFIT, Craig Markwardt’s nonlinear least squares curve fitting routines for IDL. MPFIT uses the LevenbergMarquardt technique to solve the leastsquares problem, which is a particular strategy for iteratively searching for the best fit. In its typical use, MPFIT will be used to fit a usersupplied function (the “model”) to usersupplied data points (the “data”) by adjusting a set of parameters. MPFIT is based upon the robust routine MINPACK1 (LMDIF.F) by Moré and collaborators.
For example, a researcher may think that a set of observed data points is best modelled with a Gaussian curve. A Gaussian curve is parameterized by its mean, standard deviation and normalization. MPFIT will, within certain constraints, find the set of parameters which best fits the data. The fit is “best” in the leastsquares sense; that is, the sum of the weighted squared differences between the model and data is minimized.
See also
Tutorial with background information for this module and practical examples.
Class Fitter¶

class
kapteyn.kmpfit.
Fitter
(residuals, deriv=None, ...)¶ Parameters:  residuals – the residuals function, see description below.
 deriv – optional derivatives function, see description below. If a derivatives
function is given, usercomputed explicit derivatives are automatically
set for all parameters in the attribute
parinfo
, but this can be changed by the user.  .. – other parameters, each corresponding with one of the configuration
attributes described below. They can be defined here, when the Fitter
object is created, or later. The attributes
params0
anddata
must be defined before the methodfit()
is called.
Objects of this class are callable and return the fitted parameters when called.
Residuals function
The residuals function must return a NumPy (dtype=’d’) array with weighted deviations between the model and the data. It takes two arguments: a NumPy array containing the parameter values and a reference to the attribute
data
which can be any object containing information about the data to be fitted. E.g., a tuple like(xvalues, yvalues, errors)
.In a typical scientific problem the residuals should be weighted so that each deviate has a Gaussian sigma of 1.0. If x represents values of the independent variable, y represents a measurement for each value of x, and err represents the error in the measurements, then the deviates could be calculated as follows:
\[deviates = (y  f(x)) / err\]where f is the analytical function representing the model. If err are the 1sigma uncertainties in y, then
\[\sum deviates^2\]will be the total chisquared value. Fitter will minimize this value. As described above, the values of x, y and err are passed through Fitter to the residuals function via the attribute
data
.Derivatives function
The optional derivates function can be used to compute weighted function derivatives, which are used in the minimization process. This can be useful to save time, or when the derivative is tricky to evaluate numerically.
The function takes three arguments: a NumPy array containing the parameter values, a reference to the attribute
data
and a list with boolean values corresponding with the parameters. If a boolean in the list is True, the derivative with respect to the corresponding parameter should be computed, otherwise it may be ignored. Fitter determines these flags depending on how derivatives are specified in itemside
of the attributeparinfo
, or whether the parameter is fixed.The function must return a NumPy array with partial derivatives with respect to each parameter. It must have shape (n,m), where n is the number of parameters and m the number of data points.
Configuration attributes
The following attributes can be set by the user to specify a Fitter object’s behaviour.

params0
¶ Required attribute. A NumPy array, a tuple or a list with the initial parameters values.

data
¶ Required attribute. Python object with information for the residuals function and the derivatives function. See above.

parinfo
¶ A list of dictionaries with parameter contraints, one dictionary per parameter, or None if not given. Each dictionary can have zero or more items with the following keys and values:
'fixed'
: a boolean value, whether the parameter is to be held fixed or not. Default: not fixed.'limits'
: a twoelement tuple or list with upper end lower parameter limits or None, which indicates that the parameter is not bounded on this side. Default: no limits.'step'
: the step size to be used in calculating the numerical derivatives. Default: step size is computed automatically.'side'
: the sidedness of the finite difference when computing numerical derivatives. This item can take four values:0  onesided derivative computed automatically (default)
1  onesided derivative \((f(x+h)  f(x) )/h\)
1  onesided derivative \((f(x)  f(xh))/h\)
2  twosided derivative \((f(x+h)  f(xh))/2h\)
3  usercomputed explicit derivatives
where \(h\) is the value of the parameter
'step'
described above. The “automatic” onesided derivative method will chose a direction for the finite difference which does not violate any constraints. The other methods do not perform this check. The twosided method is in principle more precise, but requires twice as many function evaluations. Default: 0.'deriv_debug'
: boolean to specify console debug logging of usercomputed derivatives. True: enable debugging. If debugging is enabled, then'side'
should be set to 0, 1, 1 or 2, depending on which numerical derivative you wish to compare to. Default: False.As an example, consider a function with four parameters of which the first parameter should be fixed and for the third parameter explicit derivatives should be used. In this case,
parinfo
should have the value[{'fixed': True}, None, {'side': 3}, None]
or[{'fixed': True}, {}, {'side': 3}, {}]
.

ftol
¶ Relative \(\chi^2\) convergence criterium. Default: 1e10

xtol
¶ Relative parameter convergence criterium. Default: 1e10

gtol
¶ Orthogonality convergence criterium. Default: 1e10

epsfcn
¶ Finite derivative step size. Default: 2.2204460e16 (MACHEP0)

stepfactor
¶ Initial step bound. Default: 100.0

covtol
¶ Range tolerance for covariance calculation. Default: 1e14

maxiter
¶ Maximum number of iterations. Default: 200

maxfev
¶ Maximum number of function evaluations. Default: 0 (no limit)
Result attributes
After calling the method
fit()
, the following attributes are available to the user:
params
¶ A NumPy array, list or tuple with the fitted parameters. This attribute has the same type as
params0
.

xerror
¶ Final parameter uncertainties (\(1 \sigma\))

covar
¶ Final parameter covariance (NumPy) matrix.

chi2_min
¶ Final \(\chi^2\).

orignorm
¶ Starting value of \(\chi^2\).

rchi2_min
¶ Minimum reduced \(\chi^2\).

stderr
¶ Standard errors.

npar
¶ Number of parameters.

nfree
¶ Number of free parameters.

npegged
¶ Number of pegged parameters.

dof
¶ Number of degrees of freedom.

resid
¶ Final residuals.

niter
¶ Number of iterations.

nfev
¶ Number of function evaluations.

version
¶ mpfit.c’s version string.

status
¶ Fitting status code.

message
¶ Message string.
Methods:

fit
(params0=None)¶ Perform a fit with the current values of parameters and other attributes.
Optional argument params0: initial fitting parameters. (Default: previous initial values are used.)

confidence_band
(x, dfdp, confprob, f, abswei=False)¶ This method requires SciPy. After the method
fit()
has been called, this method calculates the upper and lower value of the confidence interval for all elements of the NumPy array x. The model values and the arrays with confidence limits are returned and can be used to plot confidence bands.Parameters:  x – NumPy array with the independent values for which the confidence interval is to be found.
 dfdp – a list with derivatives. There must be as many elements in this list as there are parameters in the model. Each element must be a NumPy array with the same length as x.
 confprob – confidence probability, e.g. 0.95 (=95%). From this number the confidence level is derived, e.g. 0.05. The Confidence Band is a (1alpha)*100% band. This implies that for a given value of x the probability that the ‘true’ value of f falls within these limits is (1alpha)*100%.
 f – the model function returning the value y = f(p,x).
p are the bestfit parameters as found by the method
fit()
and x is the given NumPy array with independent values.  abswei – True if weights are absolute. For absolute weights the unscaled covariance matrix elements are used in the calculations. For unit weighting (i.e. unweighted) and relative weighting, the covariance matrix elements are scaled with the value of the reduced chi squared.
Returns: A tuple with the following elements, each one is a Numpy array:
 y: the model values at x: y = f(p,x);
 upperband: the upper confidence limits;
 lowerband: the lower confidence limits.
Note
If parameters were fixed in the fit, the corresponding error is 0 and there is no contribution to the confidence interval.
Testing derivatives¶
In principle, the process of computing explicit derivatives should be straightforward. In practice, the computation can be error prone, often being wrong by a sign or a scale factor.
In order to be sure that the explicit derivatives are correct,
for debugging purposes the
user can set the attribute parinfo[‘deriv_debug’] = True
for any parameter. This will cause Fitter.fit()
to
print both explicit derivatives and numerical derivatives to the
console so that the user can compare the results.
When debugging derivatives, it is important to set parinfo[‘side’]
to the kind of numerical derivative to compare with:
it should be set to 0, 1, 1, or 2, and not set to 3.
When parinfo[‘deriv_debug’] is set for a parameter, then
Fitter.fit()
automatically understands to request usercomputed derivatives.
The console output will be sent to the standard output, and will appear as a block of ASCII text like this:
1 2 3 4 5 6 7 8  FJAC DEBUG BEGIN
# IPNT FUNC DERIV_U DERIV_N DIFF_ABS DIFF_REL
FJAC PARM 1
.... derivative data for parameter 1 ....
FJAC PARM 2
.... derivative data for parameter 2 ....
.... and so on ....
FJAC DEBUG END

which is to say, debugging data will be bracketed by pairs of “FJAC DEBUG” BEGIN/END phrases. Derivative data for individual parameter i will be labeled by “FJAC PARM i”. The columns are, in order,
IPNT  data point number \(j\)
FUNC  residuals function evaluated at \(x_j\)
DERIV_U  usercalculated derivative \({\partial f(x_j)}/{\partial p_i}\)
DERIV_N  numerically calculated derivative according to the value of parinfo[‘side’]
DIFF_ABS  difference between DERIV_U and DERIV_N: fabs(DERIV_UDERIV_N)
DIFF_REL  relative difference: fabs(DERIV_UDERIV_N)/DERIV_U
Since individual numerical derivative values may contain significant roundoff errors, it is up to the user to critically compare DERIV_U and DERIV_N, using DIFF_ABS and DIFF_REL as a guide.
Example¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28  #!/usr/bin/env python
import numpy
from kapteyn import kmpfit
def residuals(p, d):
a, b, c = p
x, y, w = d
return (y  (a*x*x+b*x+c))/w
x = numpy.arange(50,50,0.2)
y = 2*x*x + 3*x  3 + 2*numpy.random.standard_normal(x.shape)
w = numpy.ones(x.shape)
a = [x, y, w]
f = kmpfit.Fitter(residuals, params0=[1, 2, 0], data=a)
f.fit() # call fit method
print(f.params)
print(f.message)
# result:
# [2.0001022845514451, 3.0014019147386, 3.0096629062273133]
# mpfit (potential) success: Convergence in chisquare value (1)
a[1] = 3*x*x  2*x  5 + 0.5*numpy.random.standard_normal(x.shape)
print(f(params0=[2, 0, 1])) # call Fitter object
# result:
# [3.0000324686457871, 1.999896340813663, 5.0060187435412962]

Function simplefit¶

kapteyn.kmpfit.
simplefit
(model, p0, x, y, err=1.0, ...)¶ Simple interface to
Fitter
.Parameters:  model – model function which must take two arguments: a sequence with initial values and a sequence with xvalues. It must return a NumPy array with function results.
 p0 – a sequence with the initial parameter values.
 x – a sequence with independent variable values.
 y – a sequence with dependent variable values.
 err – a sequence with \(1 \sigma\) errors.
 .. – other arguments, each corresponding with one of the configuration
attributes for an object of class
Fitter
.
Returns: a
Fitter
object from which the fit results can be extracted.