I1.11: Section 8

Section 8: Systematically finding parameter values that fit a model to a dataset

The Models.xls workbook (or a similar workbook derived from it) lets you compare a dataset of x and y values (in columns A and B, respectively) with the y predictions that a model formula computes for each x value (in column C). The model formula uses the values of a few parameters stored in column G, so changing a parameter changes every column C value. For every type of model, fitting consists of finding parameter values that bring the column C model values as close as possible to the column B data values.

The simplest way to find reasonable parameter settings is to make a scatter-plot graph of columns A, B, and C together, which displays the data and the model predictions on the same scale. With good parameter settings, the two plotted series overlap as closely as possible. Once they match well visually, the numerical differences between the data y values and the model y predictions (computed in column D) can sometimes be used to make fine adjustments to the parameter settings.
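For readers who want to see the column layout outside the spreadsheet, here is a minimal Python sketch of the same bookkeeping for a linear model. The data values, the parameter values, and the names `linear_model`, `predictions`, and `deviations` are all hypothetical, and the sign convention for the deviations (data minus model) is an assumption, since the workbook formula is not shown here.

```python
# Hypothetical data standing in for columns A (x) and B (y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

# Hypothetical parameter values standing in for the cells in column G.
intercept = 1.0
slope = 2.0


def linear_model(x, intercept, slope):
    """Model y prediction for one x value (the role of the column C formula)."""
    return intercept + slope * x


# Column C: one model prediction per data row.
predictions = [linear_model(x, intercept, slope) for x in xs]

# Column D: deviation of each data value from the model (data minus model, assumed).
deviations = [y - p for y, p in zip(ys, predictions)]

print("    x   data  model    dev")
for x, y, p, d in zip(xs, ys, predictions, deviations):
    print(f"{x:5.1f} {y:6.2f} {p:6.2f} {d:+6.2f}")
```

Changing `intercept` or `slope` and rerunning plays the same role as editing a column G cell: every prediction and every deviation is recomputed.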

In general, even the best model will not fit the data exactly. If the deviations fall randomly above and below the model, they represent noise in the data, and in that case probably no other model would do better. On the other hand, if most deviations occur in groups of several consecutive points with the same sign, the model does not fit the data well, perhaps because the kind of model formula being used cannot produce a shape similar to that of the data.
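One rough way to look for such grouping numerically is to measure how long the runs of same-sign deviations are. The sketch below is only an illustration: the function name `sign_runs` and both lists of deviations are made up, and it is not a formal statistical test.

```python
def sign_runs(deviations):
    """Return the lengths of consecutive same-sign runs, skipping exact zeros."""
    runs = []          # each entry is [is_positive, run_length]
    for d in deviations:
        if d == 0:
            continue   # a zero deviation has no sign, so it is ignored
        positive = d > 0
        if runs and runs[-1][0] == positive:
            runs[-1][1] += 1
        else:
            runs.append([positive, 1])
    return [length for _, length in runs]


# Hypothetical deviations that alternate in sign, as noise around a good model might.
noisy = [0.1, -0.2, 0.15, -0.05, 0.1, -0.1]

# Hypothetical deviations from fitting a straight line to curved data.
curved = [0.9, 0.4, 0.1, -0.3, -0.5, -0.3, 0.2, 0.8]

print(sign_runs(noisy))   # [1, 1, 1, 1, 1, 1]
print(sign_runs(curved))  # [3, 3, 2]
```

Many short runs are consistent with noise, while a few long runs suggest that the model formula has the wrong shape for the data.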

Summary of systematic curve-fitting techniques

  • Make a graph showing the data and model points, so you can observe how good the fit is.
  • First set the position parameters to approximate values, based on the kind of model:
    • Linear: estimate the intercept as the data y value for the x value closest to zero.
      • (If no x value is close to zero, redefine the data or formula by subtracting the first x value from every x value.)
    • Quadratic: estimate the vertex x and y parameters from the graph.
  • Estimate a beginning value for the remaining parameter from the graph or the first two data points:
    • Linear slope: set to the difference of the y values divided by the difference of the x values.
    • Quadratic shape: positive if the ends curve up, negative if the ends curve down.
  • Adjust the parameter values systematically to improve the initial estimates.
    • Change the first significant digit by 1, and observe whether this moves the model toward the data.
    • Continue first-digit changes in the right direction until the model graph crosses the data.
    • Once the first digit is known, adjust the second digit.
    • Readjust the other parameter(s) to the same precision.
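
The procedure summarized above can be sketched in Python for the linear case. This is a sketch under stated assumptions, not the workbook's method: the data are hypothetical, all function names are invented for illustration, and because a program cannot look at the graph, the sum of squared deviations is used as a stand-in for "moves the model toward the data". A parameter keeps stepping in one direction until that sum stops decreasing.

```python
import math

# Hypothetical data standing in for columns A (x) and B (y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]


def linear_model(x, intercept, slope):
    return intercept + slope * x


def sum_sq(params, xs, ys, model):
    """Sum of squared deviations (an assumed numeric stand-in for judging the graph)."""
    return sum((y - model(x, *params)) ** 2 for x, y in zip(xs, ys))


def initial_linear_estimates(xs, ys):
    """Intercept from the point whose x is closest to zero; slope from the first two points."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i]))
    intercept = ys[nearest]
    slope = (ys[1] - ys[0]) / (xs[1] - xs[0])
    return [intercept, slope]


def first_digit_step(value):
    """Size of one unit in the first significant digit of value (1.0 if value is 0)."""
    if value == 0:
        return 1.0
    return 10.0 ** math.floor(math.log10(abs(value)))


def adjust(params, xs, ys, model, digits=2):
    """Adjust every parameter one significant digit at a time, `digits` levels deep."""
    params = [round(p, 10) for p in params]      # rounding hides floating-point drift
    steps = [first_digit_step(p) for p in params]
    for _ in range(digits):
        improved = True
        while improved:                          # readjust the other parameter(s) too
            improved = False
            for i in range(len(params)):
                for direction in (1, -1):
                    while True:
                        trial = list(params)
                        trial[i] = round(trial[i] + direction * steps[i], 10)
                        if sum_sq(trial, xs, ys, model) < sum_sq(params, xs, ys, model):
                            params = trial       # the step helped, so keep going
                            improved = True
                        else:
                            break                # the step overshot, so stop here
        steps = [s / 10 for s in steps]          # move on to the next digit
    return params


start = initial_linear_estimates(xs, ys)
fitted = adjust(start, xs, ys, linear_model, digits=2)
print(start, sum_sq(start, xs, ys, linear_model))
print(fitted, sum_sq(fitted, xs, ys, linear_model))  # fitted is [1.2, 1.9] for this sample data
```

Raising `digits` refines the parameters further. Adjusting one parameter at a time can take several passes when the parameters interact, which is why the sketch repeats each digit level until no parameter improves.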