## I1.04: Section 3 Part 1

### Section 3: Using the Models.xls spreadsheet to find a linear model

The Models.xls spreadsheet available on the class web site is a template into which you can put any data values, then fit a mathematical model to closely match that data. It has all the needed formulas preset, including a formula in C3 that is based on the kind of model that is wanted (e.g., linear or quadratic). You will use this or similar spreadsheets as your main tool in this topic and later modeling topics, so it will save you a lot of time if you become skilled at using it.

Models.xls has multiple sheets, each of them preset to fit a particular type of model. In this section we will use the “Linear Model” sheet. Worksheets for Quadratic and Exponential are also included in Models.xls, and will be discussed in later sections and topics. You will find that there is very little difference in the process used for finding different models, so when you learn to use the Linear Model worksheet you will be able to quickly make use of any other model that turns out to be appropriate for the dataset you are fitting.

### Steps for using the Models.xls spreadsheet to find a good linear model for a dataset:

1. Insert a new worksheet with the Insert > Worksheet menu choice, and label the tab at the bottom of this new sheet with an appropriate name (e.g., “Linear Model for Sediment Model”).
2. Copy the contents of the worksheet labeled Linear Model (select it and then use a Ctrl-A, Ctrl-C, Ctrl-V sequence to paste a copy of its contents into the worksheet you inserted in step 1). (It will not work to use the Linear Model worksheet directly or to use the Edit > Move or Copy Sheet menu option, because the model-template worksheets are protected from change so that they are always available.)
3. Look at your data and decide which column you want your model to predict. That column of the data will be labeled as y values and compared to the output of the model. The other column will be labeled as x values, and will be used in the model formula to compute the model output. If you need to rearrange the columns to make the inverse graph or otherwise modify the dataset, use the Data Scratch Pad worksheet in Models.xls to get the data ready for step 4.
4. Place the data into columns A and B, starting the numbers at row 3 (you can put column labels in row 1 if you wish). Use column A for the input x data values and column B for the output y data values. Use as many rows as needed for the data; this may be different for different data sets.
5. Spread the formula in cell C3 down to as many rows as the data. (In the Linear Model sheet, C3 has been preset to “=$G$4*A3+$G$3”, a linear formula that uses the value in cell G4 as slope and the value in cell G3 as intercept.)
6. Spread the formulas in cell D3 and E3 down beside the data and model rows. The formula “=B3-C3” in D3 computes the difference between the data and the model (called the residual deviation), and the formula in E3 computes the square of that deviation. These column E values are used to compute (in G13) a numerical average of how well the model fits the data, which is called the standard deviation.
7. Make a graph of the data and model together. Select the data rows in columns A, B, and C, then select the Scatter Plot option from the Chart menu. The data and model points will be different colors, with one of each kind at each horizontal position.
8. Adjust the “parameter” values at the top of column G to move the model close to the data. Each type of model has different parameters (e.g., intercept and slope for linear models), but a similar process of adjustment can used in all cases.
9. Decide if the model used is appropriate for the dataset. A good model will come close over the whole range of input values, with deviations due mainly to noise in the data. If the best parameter settings cannot produce a good model (e.g., a line model fitted to parabolic data), then that model should not be used for predictions. A different model (e.g., a quadratic one) should be tried.
10. Write the mathematical formula for this set of model parameters (e.g., y = 1.4 x + 7.3).