Program: SURFACEFIT Purpose: Given values Z at positions (X,Y) and optional standard errors for Z from a text file with data in columns, find best fit parameters for a user given model function F(X,Y) Category: ANALYSIS, FITTING File: surfacefit.src Author: M.G.R. Vogelaar Keywords: FILENAME= Enter name of file: Name of text file with columns which represent values Z at positions X, Y. A fourth column optionally represents 1 sigma errors of Z. These values are used as weights in the fit. COLS= Enter columns for X, Y, Z, sigmaZ ... [1,2,3,4]: First column should be X. Second is Y and the third is Z. If a fourth column is omitted then all the weights are set to 1 in the fit. Else, weights are calculated as w_i = 1/(sigmaZ_i^2) SCALE= Multiply Z and sigmaZ with factor .... [1.0]: If your Z values are very big or very small, it is often wise to scale your data to smaller values because it is numerically more stable and less dependent on good initial estimates (see INITVALS=). The default is 1.0 which implies that the data is not scaled at all. FXY= Enter F(x,y) with parameters a0,..,a9 [a0+a1*x+a2*y]: Specify the model with parameters a0, a1 etc. for which we want a 'best fit'. The data is stored in x and y. If the default is used, then the program uses an optimized version of the function evaluations needed for the fit. A more generic surface is: FXY=a0+a1*x+a2*y+a3*x*y+a4*x*x+a5*y*y INITVALS= Init. values for a0, a1,.. [0,0,...]: Initial estimates of the parameters for which you want to find a 'best' value. If the values of Z are near 1.0 then the default values (all zero) often suffice. Note that you need to enter exactly n values where n is the number of best-fit parameters. RESTABLE= Plot a table with the residuals ... [Y]/N: For each position in the data, print the position, the Z value, the Z value of the model with the 'best fit' parateters and the difference between these two Z values (the residual) CHAUVENET= Remove Outliers with Chauvenet ... [Y]/N: After fitting the model parameters, we have an impression of what outliers are. Filter possible outliers based on Chauvenet's criterion Calculate mean and standard deviation of the residuals. Express residual-mean in standard deviations of the sample: d = |res_i-mean|/std The probability to find this value in a range mean-d*sigma and mean+d*sigma is P = erf(d/sqrt(2)) The probabilty to find a residual outside this interval is Q=1-P According to Chauvenet we can discard data points where Q < N/2 The fit will be repeated with the outliers removed. Note that Chauvenet can be applied only one. TRIALS= Number of trials for bootstrap ... [0]: If you need a realistic error on your fitted parameters then apply a bootstrap method. The number of random samples is given in this keyword. For each sub sample a fit is made and the error is given by the standard deviation of (an_boot - an_first) where an_boot is parameter n for all the bootstrap samples and an_first is parameter n for the first fit. The default is TRIALS=0 which implies no bootstrapping. Note that if the least squares fitting of your model is sensitive to its initial estimates, these bootstrapping errors can be unreasonable big. Usually this is a hint to scale the data. Note: For functions that are supplied by a user, the function evaluation is done by Python's eval() function. This implies that this evaluation is not optimal. So if you want a reliable bootstrap error estimation (e.g.: TRIALS=5000) then the procedure can take a while. PLOT= Do you want a 3d plot ... [Y]: Display a gridded (i.e. interpolated) version of your data and the fitted curve. The plot is interactive, i.e. one can rotate, pan and zoom the plot. Description: Fit parameters of any model given by a user, given data from a text file on disk. The parameters in the function that describes the model are a0, a1, .. a9. If your data file has a column with a standard error for each datapoint and you selected that column (as the fourth column), then you get a meaningful value for Chi square and the reduced Chi square and the estimated errors. We list both the errors derived from the covariance matrix (made by the fit routine) and the scaled errors, which are derived from the covariance errors while assuming a reduced Chi square equal to 1.0. It seems that bootstrapping (TRIALS=) indicate that these scaled errors are more realistic. Example: FILENAME= testplane.txt COLS= 1 2 3 SCALE= 1e-13 FXY= a0+a1*x+a2*y+a3*x*x+a4*y*y+a5*x*y INITVALS= 0::6 TRIALS= 200 PLOT= Test Data: ! X Y Z 5.0 3996.0 -1.485E+17 397.0 3736.0 -1.328E+17 1995.0 3994.0 -2.147E+17 1994.0 7.0 -6.402E+16 10.0 59.0 -3.415E+13 1177.0 16.0 -2.231E+16 600.0 1610.0 -3.017E+16 1420.0 2070.0 -7.314E+16 1141.0 1160.0 -3.385E+16 990.0 3085.0 -1.052E+17 574.0 2389.0 -5.877E+16 1740.0 1586.0 -7.291E+16 1456.0 3278.0 -1.354E+17 864.0 3846.0 -1.505E+17 1604.0 3030.0 -1.282E+17 1332.0 3658.0 -1.544E+17 554.0 3198.0 -1.006E+17 1643.0 900.0 -5.141E+16 442.0 1937.0 -3.828E+16 1266.0 1661.0 -5.205E+16 Results: Program scaled data as Z = Z*1e-13 Results for F(x,y)=a0+a1*x+a2*y+a3*x*x+a4*y*y+a5*x*y ==================================================== par value cov.error std error a0 0.278127 0.944506 1.93969 a1 -0.000673375 0.00167908 0.00344825 a2 -0.000103869 0.000800445 0.00164384 a3 -0.00160955 7.79177e-07 1.60017e-06 a4 -0.000930062 1.74786e-07 3.58951e-07 a5 -2.83247e-05 2.3408e-07 4.80722e-07 Chi^2 min: 59.0453 Reduced Chi^2: 4.21752 Iterations: 12 Function ev: 95 Status: 2 Residuals from Z - Zmodel: Residuals max: 3.65061 Residuals min: 0.0327222 Residuals standard deviation: 1.175 Standard errors with bootstrap: sigma a0: 5.52513 sigma a1: 0.00792682 sigma a2: 0.00247251 sigma a3: 3.08476e-06 sigma a4: 4.59987e-07 sigma a5: 1.10049e-06 The value Chi^2 min is a measure of goodness of fit. Note that it scales both with the values of Z (high values of Z result in high values of the residuals) and with the weights. The reduced Chi-square gives a relation between the covariance errors and the standard deviation of the mother sample. If this value is ~1 then the errors from the covariance matrix can be used. If not, or when you have equal weights, then you should use the standard errors derived from the assumption that the reduced Chi-square is 1. If the weights are not under- or over estimated, then the bootstrap errors will look like the standard errors. Notes: This is a GIPSY task written in Python. It uses kmpfit, the least squares routine in the Kapteyn Package. Python enables us to evaluate model expressions given by a user. The fit routine does not need expressions for the parameter derivatives, so we end up with a flexible program that enables a user to explore different models. Updates: Nov 20, 2011: VOG, Document created.