

Fitting to Data with Experimental Errors

As discussed in Chapter 3, in an experimental context in the physical sciences almost all measured quantities have an error, because a perfect experimental apparatus does not exist. That chapter also provides some guidelines for determining the values of those errors.

Nonetheless, all too often real experimental data in the sciences and engineering do not have explicit errors associated with the values of the dependent or independent variables. In that case the least-squares fitting techniques discussed in the previous subsection are usually used. As we shall see, EDA also provides extensions to this standard method with some reweighting heuristics.

If there are assigned errors in the experimental data, say erry, then these errors are used to weight each term in the sum of the squares. If the errors are estimates of the standard deviation, such a weighted sum is called the "chi-squared", ChiSquared, of the fit.

    ChiSquared = Sum[ (y[[i]] - f[x[[i]]])^2 / erry[[i]]^2, {i, 1, n} ]

where f[x[[i]]] is the value of the model being fit at x[[i]] and the sum runs over the n data points.

The least-squares technique takes the derivatives of ChiSquared with respect to the parameters of the fit, sets each derivative equal to zero, and solves the resulting set of equations. Thus, the only difference between this situation and the one discussed in the previous section is that we weight each residual with the inverse of the error.
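
To make the procedure concrete, here is a small sketch of this "by hand" (it is not EDA's LinearFit, the data are invented, and the names xdata, ydata, erry, and chi2 are ours): the chi-squared for a straight-line model a1 + a2 x is defined, differentiated with respect to each parameter, and the resulting equations solved.

    xdata = {1., 2., 3., 4., 5.};
    ydata = {2.1, 3.9, 6.2, 7.8, 10.1};
    erry  = {0.2, 0.2, 0.3, 0.3, 0.4};    (* estimated standard deviations *)

    (* chi-squared for the straight-line model y = a1 + a2 x *)
    chi2[a1_, a2_] :=
      Sum[ (ydata[[i]] - a1 - a2 xdata[[i]])^2 / erry[[i]]^2,
           {i, Length[xdata]} ]

    (* set each derivative equal to zero and solve for the parameters *)
    Solve[ { D[chi2[a1, a2], a1] == 0,
             D[chi2[a1, a2], a2] == 0 }, {a1, a2} ]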

Some references describe a fit in terms of the weights w[[i]], while others work with the errors erry, which they may call the standard deviations. The two descriptions are related by

    w[[i]] = 1 / erry[[i]]^2

Also, some people refer to the "variance", which is the error or standard deviation squared.

If the data have errors in both the independent variable and the dependent one, say errx and erry respectively, the fitting programs in EDA use what is called an "effective variance technique". For example, imagine we are fitting data to a[2] x^2, and we have a data point where x = 3 +/- 0.1.

[Figure: the curve a[2] x^2 in the neighborhood of the data point at x = 3, showing the +/- 0.1 error bar in x and the corresponding spread in y.]

To a good approximation, the uncertainty in y, because of the errors in x, is the error in x times the slope of the line.

    errx * (dy/dx) = 0.1 * (2 a[2] * 3) = 0.6 a[2]

Thus, if we can assume that the errors in x are independent of the errors in y, we can combine erry and this term in quadrature to get an effective error in y.

    erryEff = Sqrt[ erry^2 + (errx * (dy/dx))^2 ]
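
As a purely illustrative calculation (the values of a[2] and erry below are made up, not measured), the effective error for this point can be evaluated directly:

    a2   = 1.5;       (* assumed current estimate of the parameter a[2] *)
    x0   = 3.;  errx = 0.1;
    erry = 0.5;       (* assumed error in y for this point *)

    slope   = D[a2 x^2, x] /. x -> x0;        (* dy/dx = 2 a2 x = 9. at x = 3 *)
    erryEff = Sqrt[erry^2 + (errx slope)^2]   (* = Sqrt[0.25 + 0.81] ~ 1.03 *)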

Using these errors instead of erry is called the "effective variance technique." In general, if we are modeling

    y = f[x]

then the algorithm involves replacing erry with

    erryEff[[i]] = Sqrt[ erry[[i]]^2 + (errx[[i]] * f'[x[[i]]])^2 ]

where f'[x[[i]]] denotes the derivative of the model with respect to x, evaluated at x[[i]].

The square of this effective error, erryEff^2, is the "effective variance".
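
A short sketch of this replacement for a general model follows; the helper effectiveErry and the sample numbers are our own illustration and are not part of EDA:

    (* build the effective error for every data point, given a model f  *)
    (* and the current parameter estimates baked into it                *)
    effectiveErry[f_, xdata_, errx_, erry_] :=
      Table[ Sqrt[ erry[[i]]^2 + (errx[[i]] * f'[xdata[[i]]])^2 ],
             {i, Length[xdata]} ]

    (* example with the model of this section, y = a[2] x^2, a[2] = 1.5 *)
    effectiveErry[ Function[x, 1.5 x^2],
                   {1., 2., 3.}, {0.1, 0.1, 0.1}, {0.5, 0.5, 0.5} ]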

Notice that since this effective variance contains at least some of the parameters to which we are fitting, the chi-squared is nonlinear in those parameters. This implies that, in principle, a nonlinear fitting technique is required. However, when the errors are small the nonlinearities are also small, and LinearFit can almost always iterate successfully to a reasonable solution.

There are also some subtleties about which value of the independent variable to use when evaluating the derivatives of the function. In almost all cases, the differences between fitted values obtained with different choices are small compared to the errors in those values. Thus, LinearFit simply evaluates the derivatives at the observed values of the independent variable.
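
The following toy iteration (our illustration, not the code inside LinearFit) shows one way such a fit can proceed: start from a crude estimate of the parameter, recompute the effective errors at the observed values of x, refit, and repeat. The model is the a[2] x^2 example from above, with invented data:

    xdata = {1., 2., 3., 4.};
    ydata = {1.6, 5.9, 13.8, 24.3};
    errx  = {0.1, 0.1, 0.1, 0.1};
    erry  = {0.5, 0.5, 0.5, 0.5};

    a2est = 1.;                        (* crude starting estimate of a[2] *)
    Do[
      (* effective errors at the observed x, using the current estimate *)
      eff = Table[ Sqrt[ erry[[i]]^2 + (errx[[i]] * 2 a2est * xdata[[i]])^2 ],
                   {i, Length[xdata]} ];
      (* refit: set d(chi-squared)/d(a2) equal to zero and solve for a2 *)
      a2est = a2 /. First @ Solve[
                D[ Sum[ (ydata[[i]] - a2 xdata[[i]]^2)^2 / eff[[i]]^2,
                        {i, Length[xdata]} ], a2 ] == 0, a2 ],
      {3} ];                           (* a few passes are usually enough *)
    a2est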

When the fit is to a straight line, a particularly effective way to apply the effective variance technique is an algorithm called Brent minimization. This is the default for LinearFit. Section 4.4.1.2 discusses this further.
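
To see why a one-dimensional minimizer is enough for a straight line a1 + a2 x, note that once the slope is fixed the best intercept has a closed form, so the chi-squared becomes a function of the slope alone. The sketch below (our illustration with invented data) uses the built-in FindMinimum merely as a stand-in for a one-dimensional minimizer such as Brent's; it is not the EDA implementation.

    xdata = {1., 2., 3., 4., 5.};
    ydata = {2.2, 3.8, 6.1, 8.2, 9.9};
    errx  = {0.1, 0.1, 0.1, 0.1, 0.1};
    erry  = {0.3, 0.3, 0.3, 0.3, 0.3};

    (* chi-squared as a function of the slope m alone: for each trial m   *)
    (* the best intercept b follows in closed form from the effective     *)
    (* weights 1/(erry^2 + (m errx)^2)                                    *)
    chi2line[m_?NumericQ] :=
      Module[{w, b},
        w = 1 / (erry^2 + (m errx)^2);
        b = Total[w (ydata - m xdata)] / Total[w];
        Total[w (ydata - b - m xdata)^2] ]

    (* one-dimensional minimization over the slope *)
    FindMinimum[ chi2line[m], {m, 2.} ]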



