Rates of Change

Linear functions apply to real-world problems that involve a constant rate of change.

Learning Objectives

Apply linear equations to solve problems about rates of change

Key Takeaways

Key Points

• If you know a real-world problem is linear, such as the distance you travel when you go for a jog, you can graph the function and make some assumptions with only two points.
• The slope of a function is the same as the rate of change for the dependent variable $(y)$. For instance, if you’re graphing distance vs. time, then the slope is how fast your distance is changing with time, or in other words, your velocity.

Key Terms

• rate of change: Ratio between two related quantities that are changing.
• linear equation: A polynomial equation of the first degree (such as $x=2y-7$).
• slope: The ratio of the vertical and horizontal distances between two points on a line; zero if the line is horizontal, undefined if it is vertical.

Rate of Change

Linear equations often include a rate of change. For example, the rate at which distance changes over time is called velocity. If two points in time and the total distance traveled at each are known, the rate of change, also known as the slope, can be determined. From this information, a linear equation can be written, and predictions can then be made from the equation of the line.

If the quantity with respect to which something is changing is not specified, the rate is usually understood to be per unit of time. Rates per unit of time are the most common type; examples include speed, heart rate, and flux. Rates that have a non-time denominator include exchange rates, literacy rates, and electric field (in volts/meter).

In describing the units of a rate, the word “per” is used to separate the units of the two measurements used to calculate the rate (for example a heart rate is expressed “beats per minute”).

Example

An athlete begins his normal practice for the next marathon during the evening. At 6:00 pm he leaves his home and starts to run. At 7:30 pm, the athlete finishes the run at home, having run a total of 7.5 miles. What was his average speed over the course of the run?

The rate of change is the speed of his run: distance over time. Therefore, the two variables are time $(x)$ and distance $(y)$. The first point is at his house, where his watch read 6:00 pm. This is the starting time, so we set it to $0$. Our first point is $(0,0)$ because he has not run anywhere yet. Measuring time in hours, the second point is $1.5$ hours later, when he has run $7.5$ miles, so the second point is $(1.5,7.5)$. His speed (rate of change) is simply the slope of the line connecting the two points. The slope formula $m = \frac{y_{2}-y_{1}}{x_{2}-x_{1}}$ gives $m = \frac{7.5-0}{1.5-0}=5$ miles per hour.
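The slope calculation can also be sketched in a few lines of Python. This is only an illustration; the function name `slope` and the variable names are assumptions chosen for readability, not part of the original example.

```python
def slope(p1, p2):
    """Return the slope of the line through two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        # A vertical line has an undefined slope.
        raise ValueError("slope is undefined for a vertical line")
    return (y2 - y1) / (x2 - x1)


# Points are (hours since 6:00 pm, miles run so far).
start = (0.0, 0.0)
finish = (1.5, 7.5)

speed_mph = slope(start, finish)
print(speed_mph)  # 5.0
```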

Example:  Graph the line illustrating speed

To graph this line, we need the $y$-intercept and the slope to write the equation. The slope was $5$ miles per hour and since the starting point was at $(0,0)$, the $y$-intercept is $0$. So our final function is $y=5x$.

Distance and time graph: The graph of $y=5x$. The two variables are time $(x)$ and distance $(y)$. The runner runs at $5$ miles per hour. Using the graph, predictions can be made assuming that his average speed remains the same.

With this new function, we can now answer some more questions.

• How many miles had he run after the first half hour? Using the equation, if $x=\frac{1}{2}$, solve for $y$. If $y=5x$, then $y=5(0.5)=2.5$ miles.
• If he kept running at the same pace for a total of $3$ hours, how many miles will he have run? If $x=3$, solve for $y$.  If $y=5x$, then $y=5(3)=15$ miles.

There are many such applications for linear equations. Anything that involves a constant rate of change can be represented by a line whose slope is that rate. Indeed, if you know the function is linear, two points are enough to graph it and begin asking questions! Just make sure that what you are asking and graphing makes sense. For instance, in the marathon example, the domain is really only $x\geq0$, since it does not make sense to go into negative time and lose miles!
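As a rough sketch, the model $y=5x$ and its domain restriction can be expressed as a small Python function. The names `SPEED_MPH` and `distance_miles` are hypothetical, introduced here only for illustration.

```python
SPEED_MPH = 5.0  # slope of the line y = 5x, in miles per hour


def distance_miles(hours):
    """Predict miles run after `hours`, assuming a constant pace."""
    if hours < 0:
        # The model only makes sense on the domain x >= 0.
        raise ValueError("time must be non-negative")
    return SPEED_MPH * hours


print(distance_miles(0.5))  # 2.5
print(distance_miles(3))    # 15.0
```

Rejecting negative inputs mirrors the observation that the line is only meaningful for $x\geq0$.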

Linear Mathematical Models

Linear mathematical models describe real world applications with lines.

Learning Objectives

Apply linear mathematical models to real world problems

Key Takeaways

Key Points

• A mathematical model describes a system using mathematical concepts and language.
• Linear mathematical models can be described with lines. For instance, the distance traveled by a car going $50$ mph is represented by $y=50x$, where $x$ is time in hours and $y$ is distance in miles. The equation and graph can be used to make predictions.
• Real world applications can also be modeled with multiple lines, for example two trains traveling toward each other. The point where the two lines intersect is the point where the trains meet.

Key Terms

• mathematical model: An abstract mathematical representation of a process, device, or concept; it uses a number of variables to represent inputs, outputs, internal states, and sets of equations and inequalities to describe their interaction.
• linear regression: An approach to modeling the linear relationship between a dependent variable $y$ and an independent variable $x$.

Mathematical Models

A mathematical model is a description of a system using mathematical concepts and language. Mathematical models are used not only in the natural sciences and engineering disciplines, but also in the social sciences. Linear modeling can describe population change, telephone call charges, the cost of renting a bike, weight management, or fundraising. A linear model includes the rate of change $(m)$ and the initial amount, the $y$-intercept $(b)$. After the model is written and a graph of the line is made, either one can be used to make predictions about behaviors.

Real Life Linear Model

Many everyday activities require the use of mathematical models, perhaps unconsciously. One difficulty with mathematical models lies in translating the real world application into an accurate mathematical representation.

Example: Renting a Moving Van

A rental company charges a flat fee of $30$ dollars and an additional $0.25$ dollars per mile to rent a moving van. Write a linear equation to approximate the cost $y$ (in dollars) in terms of $x$, the number of miles driven. How much would a $75$-mile trip cost?

Using the slope-intercept form of a linear equation, with the total cost labeled $y$ (dependent variable) and the miles labeled $x$ (independent variable):

$\displaystyle y=mx+b$

The total cost is equal to the rate per mile times the number of miles driven plus the cost for the flat fee:

$\displaystyle y=0.25x+30$

To calculate the cost of a $75$ mile trip, substitute $75$ for $x$ into the equation:

\displaystyle \begin{align} y&=0.25x+30\\ &=0.25(75)+30\\ &=18.75+30\\ &=48.75 \end{align}
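The same cost model can be written as a short Python function. This is a minimal sketch; the names `FLAT_FEE`, `RATE_PER_MILE`, and `rental_cost` are assumptions made for the example.

```python
FLAT_FEE = 30.00       # y-intercept b: fixed charge in dollars
RATE_PER_MILE = 0.25   # slope m: dollars per mile driven


def rental_cost(miles):
    """Total rental cost in dollars for a trip of `miles` miles (y = mx + b)."""
    return RATE_PER_MILE * miles + FLAT_FEE


print(rental_cost(75))  # 48.75
```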

Real Life Model with Multiple Equations

It’s also possible to model multiple lines and their equations.

Example

Initially, trains A and B are $325$ miles away from each other. Train A is traveling towards B at $50$ miles per hour and train B is traveling towards A at $80$ miles per hour. After how long will the two trains meet, and how far will each train have traveled by then?

First, begin with the starting positions of the trains (the $y$-intercepts, $b$). Train A starts at the origin, $(0,0)$. Since train B is $325$ miles away from train A initially, its position is $(0,325)$.

Second, in order to write the equations representing each train’s position in terms of time, calculate the rate of change for each train. Since train A is traveling towards train B, which has a greater $y$ value, train A’s rate of change must be positive and equal to its speed of $50$. Train B is traveling towards A, which has a lesser $y$ value, giving B a negative rate of change: $-80$.

The two lines are thus:

$\displaystyle y_{A}=50x$

And:

$\displaystyle y_{B}=-80x+325$

The two trains will meet where the two lines intersect.  To find where the two lines intersect set the equations equal to each other and solve for $x$:

$\displaystyle y_{A}=y_{B}$

$\displaystyle 50x=-80x+325$

Adding $80x$ to both sides gives $130x=325$, and dividing both sides by $130$ gives:

$\displaystyle x=2.5$

The two trains meet after $2.5$ hours. To find where this is, plug $2.5$ into either equation.

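Plugging it into the first equation gives us $50(2.5)=125$, which means the trains meet after train A has traveled $125$ miles. Train B has then traveled the remaining $325-125=200$ miles, which agrees with $80(2.5)=200$.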

Here is the distance versus time graph of the two trains.

Trains: Train A (red line) is represented by the equation $y=50x$, and train B (blue line) is represented by the equation $y=-80x+325$. The two trains meet at the intersection point $(2.5,125)$, which is $125$ miles from train A's starting point after $2.5$ hours.
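The intersection of two non-parallel lines can also be computed directly. The following Python sketch sets $m_1x+b_1=m_2x+b_2$ and solves for $x$; the function name `intersection` and its parameter names are hypothetical, chosen only for this illustration.

```python
def intersection(m1, b1, m2, b2):
    """Return the (x, y) point where y = m1*x + b1 meets y = m2*x + b2."""
    if m1 == m2:
        # Equal slopes mean the lines are parallel (or identical).
        raise ValueError("lines with equal slopes have no unique intersection")
    x = (b2 - b1) / (m1 - m2)
    y = m1 * x + b1
    return x, y


# Train A: y = 50x; train B: y = -80x + 325.
hours, position = intersection(50, 0, -80, 325)
print(hours, position)   # 2.5 125.0
print(325 - position)    # 200.0 (miles traveled by train B)
```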

Fitting a Curve

Curve fitting with a line attempts to draw a line so that it “best fits” all of the data.

Learning Objectives

Use the least squares regression formula to calculate the line of best fit for a set of points

Key Takeaways

Key Points

• Curve fitting is useful for finding a curve that best fits the data. This allows assumptions about how the data is roughly spread out and predictions about future data points.
• Linear regression attempts to graph a line that best fits the data.
• Ordinary least squares approximation is a type of linear regression that minimizes the sum of the squares of the difference between the approximated value (from the line), and the actual value.
• The slope of the line that approximates $n$ data points is given by $m=\frac{\sum_{i=1}^{n}x_{i}y_{i}-\frac{1}{n}\sum_{i=1}^{n}x_{i}\sum_{j=1}^{n}y_{j}}{\sum_{i=1}^{n}(x_{i}^{2})-\frac{1}{n}(\sum_{i=1}^{n}x_{i})^{2}}$.
• The $y$-intercept of the line that approximates $n$ data points is given by: $b= \displaystyle{\frac{1}{n} \sum_{i=1}^{n} y_{i} - m \frac{1}{n} \sum_{i=1}^{n} x_{i} = \bar{y} - m \bar{x}}$

Key Terms

• curve fitting: The process of constructing a curve, or a mathematical function, that has the best fit to a series of data points, possibly subject to constraints.
• outlier: A value in a statistical sample that does not fit the pattern followed by most of the other data points.
• least squares approximation: An attempt to minimize the sum of the squared distances between the predicted points and the actual points.
• linear regression: An approach to modeling the linear relationship between a dependent variable, $y$ and an independent variable, $x$.

Curve Fitting

Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a “smooth” function is constructed that approximately fits the data. Fitted curves can be used as an aid for data visualization, to infer values of a function where no data are available, and to summarize the relationships among two or more variables. Extrapolation refers to the use of a fitted curve beyond the range of the observed data, and is subject to a greater degree of uncertainty since it may reflect the method used to construct the curve as much as it reflects the observed data.

In this section, we will only fit lines to data points, but one can also fit polynomial functions, circles, piecewise functions, and many other functions to data; curve fitting is heavily used in statistics.

Linear Regression Formula

Linear regression is an approach to modeling the linear relationship between a dependent variable, $y$ and an independent variable, $x$. With linear regression, a line in slope-intercept form, $y=mx+b$ is found that “best fits” the data.

The simplest and perhaps most common linear regression model is the ordinary least squares approximation. This approximation attempts to minimize the sum of the squared vertical distances between the line and the data points.
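As a sketch of what is being minimized (the symbol $S$ is introduced here only for illustration and is not part of the original notation), the sum of squared vertical distances between a candidate line $y=mx+b$ and the data points $(x_{i},y_{i})$ can be written as:

$\displaystyle S(m,b)=\sum_{i=1}^{n}\left(y_{i}-(mx_{i}+b)\right)^{2}$

The values of $m$ and $b$ that make $S$ as small as possible are given by the formulas that follow.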

$\displaystyle m=\frac{\sum_{i=1}^{n}x_{i}y_{i}-\frac{1}{n}\sum_{i=1}^{n}x_{i}\sum_{j=1}^{n}y_{j}}{\sum_{i=1}^{n}(x_{i}^{2})-\frac{1}{n}(\sum_{i=1}^{n}x_{i})^{2}}$

To find the slope of the line of best fit, calculate the following:

1. The sum of the products of the $x$- and $y$-coordinates $\sum_{i=1}^{n}x_{i}y_{i}$.
2. The sum of the $x$-coordinates $\sum_{i=1}^{n}x_{i}$.
3. The sum of the $y$-coordinates $\sum_{j=1}^{n}y_{j}$.
4. The sum of the squares of the $x$-coordinates $\sum_{i=1}^{n}(x_{i}^{2})$.
5. The square of the sum of the $x$-coordinates $(\sum_{i=1}^{n}x_{i})^{2}$.
6. The quotient of the numerator and denominator.

\displaystyle \begin{align} b&= \frac{1}{n} \sum_{i=1}^{n} y_{i} - m \frac{1}{n} \sum_{i=1}^{n} x_{i} \\ &= \bar{y} - m \bar{x} \end{align}

To find the $y$-intercept ($b$), calculate using the following steps:

1. The average of the $y$-coordinates. Let $\bar{y}$, pronounced $y$-bar, represent the mean (or average) $y$ value of all the data points: $\bar y =\frac{1}{n}\sum_{i=1}^{n} y_{i}$.
2. The average of the $x$-coordinates. Respectively $\bar{x}$, pronounced $x$-bar, is the mean (or average) $x$ value of all the data points: $\bar x=\frac{1}{n}\sum_{i=1}^{n} x_{i}$.
3. Substitute these values, together with the slope $m$, into the formula above: $b=\bar{y} - m \bar{x}$.

Using these values of $m$ and $b$ we now have a line that approximates the points on the graph.
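The two formulas can be combined into one short Python function. This is a minimal sketch under the assumption that the data are given as a list of $(x, y)$ pairs; the function name `least_squares_line` and the variable names are hypothetical.

```python
def least_squares_line(points):
    """Return (m, b) for the least squares line y = m*x + b through `points`."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")

    sum_x = sum(x for x, _ in points)        # sum of the x-coordinates
    sum_y = sum(y for _, y in points)        # sum of the y-coordinates
    sum_xy = sum(x * y for x, y in points)   # sum of the products x_i * y_i
    sum_x2 = sum(x * x for x, _ in points)   # sum of the squares of x_i

    denominator = sum_x2 - sum_x ** 2 / n
    if denominator == 0:
        # All x-coordinates are equal, so the best-fit line would be vertical.
        raise ValueError("all x-coordinates are equal; slope is undefined")

    m = (sum_xy - sum_x * sum_y / n) / denominator
    b = sum_y / n - m * sum_x / n            # b = y_bar - m * x_bar
    return m, b


# Three points that lie exactly on y = 2x + 1 should recover that line.
print(least_squares_line([(0, 1), (1, 3), (2, 5)]))  # (2.0, 1.0)
```

Checking the function on points that are already collinear is a quick sanity test, since the best-fit line must then pass through every point.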

Example:  Write the least squares fit line and then graph the line that best fits the data

Consider the $n=8$ points $(-1,0),(0,0),(1,1),(2,2),(3,1),(4,2.5),(5,3)$ and $(6,4)$.

Example Points: The points are graphed as a scatterplot.

First, find the slope $(m)$ and $y$-intercept $(b)$ that best approximate this data, using the equations from the prior section:

To find the slope, calculate:

1. The sum of the products of the $x$- and $y$-coordinates $\sum_{i=1}^{n}x_{i}y_{i}$.
2. The sum of the $x$-coordinates $\sum_{i=1}^{n}x_{i}$.
3. The sum of the $y$-coordinates $\sum_{i=1}^{n}y_{i}$.

\displaystyle \begin{align} \sum_{i=1}^{n}x_{i}y_{i}&=0+0+1+4+3+10+15+24\\&=57 \end{align}

\displaystyle \begin{align} \sum_{i=1}^{n}x_{i}&=-1+0+1+2+3+4+5+6\\&=20 \end{align}

\displaystyle \begin{align} \sum_{i=1}^{n}y_{i}&=0+0+1+2+1+2.5+3+4\\&=13.5 \end{align}

$\displaystyle m=\frac{\sum_{i=1}^{n}x_{i}y_{i}-\frac{1}{n}\sum_{i=1}^{n}x_{i}\sum_{j=1}^{n}y_{j}}{\sum_{i=1}^{n}(x_{i}^{2})-\frac{1}{n}(\sum_{i=1}^{n}x_{i})^{2}}$

4. Calculate the numerator: the sum of the products of the $x$- and $y$-coordinates minus one-eighth the product of the sum of the $x$-coordinates and the sum of the $y$-coordinates:

$\displaystyle \sum_{i=1}^{n}x_{i}y_{i}-\frac{1}{n}\sum_{i=1}^{n}x_{i}\sum_{j=1}^{n}y_{j}$

The numerator in the slope equation is:

$\displaystyle 57-\frac{1}{8}(20)(13.5)=23.25$

5. Calculate the denominator: the sum of the squares of the $x$-coordinates minus one-eighth the square of the sum of the $x$-coordinates:

$\displaystyle \sum_{i=1}^{n}(x_{i}^{2})-\frac{1}{n}(\sum_{i=1}^{n}x_{i})^{2}$

\displaystyle \begin{align} \sum_{i=1}^{n}(x_{i}^{2})&=1+0+1+4+9+16+25+36\\&=92 \end{align}

The denominator is $92-\frac{1}{8}(20)^{2}=92-50=42$ and the slope is the quotient of the numerator and denominator: $\frac{23.25}{42}\approx0.554.$

Now for the $y$-intercept $(b)$, find the average of the $x$-coordinates (one-eighth times their sum), $\bar{x}=\frac{20}{8}=2.5$, and the average of the $y$-coordinates (one-eighth times their sum), $\bar{y}=\frac{13.5}{8}=1.6875$.

Therefore, using $b=\frac{1}{n} \sum_{i=1}^{n} y_{i} - m \frac{1}{n} \sum_{i=1}^{n} x_{i}$:

$\displaystyle b\approx1.6875-0.554(2.5)=0.3025.$

Our final equation is therefore $y=0.554x+0.3025$, and this line is graphed along with the points.

Least Squares Fit Line: The line found by the least squares approximation, $y = 0.554x+0.3025$. Notice 4 points are above the line, and 4 points are below the line.
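The arithmetic in this example can be checked with a short Python script that mirrors the steps one at a time. This is only a sketch; all variable names are assumptions made for the illustration.

```python
points = [(-1, 0), (0, 0), (1, 1), (2, 2), (3, 1), (4, 2.5), (5, 3), (6, 4)]
n = len(points)

sum_xy = sum(x * y for x, y in points)   # 57.0
sum_x = sum(x for x, _ in points)        # 20
sum_y = sum(y for _, y in points)        # 13.5
sum_x2 = sum(x * x for x, _ in points)   # 92

numerator = sum_xy - sum_x * sum_y / n   # 23.25
denominator = sum_x2 - sum_x ** 2 / n    # 42.0

m = numerator / denominator
b = sum_y / n - m * sum_x / n

# Count the points lying strictly above and strictly below the fitted line.
above = sum(1 for x, y in points if y > m * x + b)
below = sum(1 for x, y in points if y < m * x + b)

print(round(m, 4), round(b, 4))  # 0.5536 0.3036
print(above, below)              # 4 4
```

The intercept comes out near $0.3036$ rather than $0.3025$ because the worked example rounds $m$ to $0.554$ before computing $b$, while the script carries the unrounded slope through.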

Outliers and Least Squares Regression

If we have a point that is far away from the approximating line, then it will skew the results and make the line fit the rest of the data much worse. For instance, suppose that in our original example, instead of the point $(-1,0)$ we have $(-1,6)$.

Using the same calculations as above with the new point, the results are $m\approx0.0536$ and $b\approx2.3035$, giving the new equation $y=0.0536x+2.3035$.

Looking at the points and line in the new figure below, this new line does not fit the data well, due to the outlier $(-1,6)$. Indeed, trying to fit linear models to data that is quadratic, cubic, or otherwise non-linear, or to data with many outliers or errors, can result in bad approximations.

Outlier Approximated Line: The approximated line given the new outlier point at $(-1,6)$.
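The effect of the outlier can be reproduced with a small Python comparison. This sketch uses the mean-centered form of the slope, $m=\frac{\sum(x_{i}-\bar{x})(y_{i}-\bar{y})}{\sum(x_{i}-\bar{x})^{2}}$, which is algebraically equivalent to the formula used in this section; the function name `fit_line` and the variable names are hypothetical.

```python
def fit_line(points):
    """Return (m, b) of the least squares line, using the mean-centered form."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    m = s_xy / s_xx
    return m, y_bar - m * x_bar


original = [(-1, 0), (0, 0), (1, 1), (2, 2), (3, 1), (4, 2.5), (5, 3), (6, 4)]
# Replace the first point (-1, 0) with the outlier (-1, 6).
with_outlier = [(-1, 6)] + original[1:]

m1, b1 = fit_line(original)
m2, b2 = fit_line(with_outlier)

print(round(m1, 4), round(b1, 4))  # 0.5536 0.3036
print(round(m2, 4), round(b2, 4))  # 0.0536 2.3036
```

Changing a single point reduces the slope by roughly a factor of ten, which is why the new line no longer follows the trend of the other seven points. The last digit of each intercept differs slightly from the values in the text because the text rounds $m$ before computing $b$.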