Using the Normal Distribution

Learning Outcomes

  • Recognize the normal probability distribution and apply it appropriately.
  • Compare normal probabilities by converting to the standard normal distribution and directly using technology.

Example

The shaded area in the following graph indicates the area to the left of x. This area is represented by the probability P(X < x). Normal tables, computers, and calculators provide or calculate the probability P(X < x).

This is a normal distribution curve. A value, x, is labeled on the horizontal axis, X. A vertical line extends from point x to the curve, and the area under the curve to the left of x is shaded. The area of this shaded section represents the probability that a value of the variable is less than x.

Remember, P(X < x) = Area to the left of the vertical line through x.

The area to the right is then P(X > x) = 1 – P(X < x) = Area to the right of the vertical line through x.

P(X < x) is the same as P(X ≤ x) and P(X > x) is the same as P(X ≥ x) for continuous distributions.

If the area to the left is 0.0228, then the area to the right is 1 – 0.0228 = 0.9772.
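The complement rule can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`, which is an assumption of this illustration (the text itself relies on tables, Excel, and TI calculators). The value z = -2 on the standard normal curve is also chosen only for illustration, because the area to its left is about 0.0228.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
Z = NormalDist()

# Illustrative value: the area to the left of z = -2 is about 0.0228.
area_left = Z.cdf(-2)

# Complement rule: the total area under the curve is 1.
area_right = 1 - area_left

print(f"Area to the left:  {area_left:.4f}")
print(f"Area to the right: {area_right:.4f}")
```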

try it

If the area to the left of x is 0.012, then what is the area to the right?

Calculation of Probabilities and Finding Values from Probabilities

Probabilities are calculated using technology. We can also find values in the distribution (such as percentiles) from given probabilities using technology.

Regardless of the method used to find the probabilities or numbers, always draw a sketch of the normal distribution, shade the relevant area under the curve and label the known and unknown values. You can use the fact that the mean is in the center of the distribution (50% of the area to its left and 50% of the area to its right) and the Empirical Rule percentages in the previous section to determine if your answers are reasonable.
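As a rough aid for these reasonableness checks, the Empirical Rule percentages can be reproduced from the standard normal curve. This is only a sketch, and it assumes Python's standard-library `statistics.NormalDist` rather than the tools named in the text.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"Within {k} standard deviation(s) of the mean: {area:.4f}")

# The mean sits at the center: half of the area lies to its left.
half = Z.cdf(0)
print(f"Area to the left of the mean: {half:.2f}")
```

An answer that disagrees badly with these benchmark areas is a signal to recheck the sketch or the inputs.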

Finding Probabilities for the Normal Distribution Using Excel

Access the NORM.DIST function (under [latex]f_{x}[/latex]). Four inputs are required: [latex]x[/latex], the mean, the standard deviation, and cumulative. For all work in this class, we use this function to find P([latex]X < x[/latex]), so we always set cumulative = 1 (or True).

  • Area/Probability to the Left of a Number: To calculate P([latex]x < [/latex] number), use NORM.DIST(number, [latex]\displaystyle{\mu},{\sigma}[/latex], 1).
  • Area/Probability Between Two Numbers: To calculate P([latex]number1 < x < number2 [/latex]), subtract the smaller area from the larger area: NORM.DIST(number2, [latex]\displaystyle{\mu},{\sigma}[/latex], 1) – NORM.DIST(number1, [latex]\displaystyle{\mu},{\sigma}[/latex], 1).
  • Area/Probability to the Right of a Number: To calculate P([latex]x > [/latex] number), subtract the area to the left from one: 1 – NORM.DIST(number, [latex]\displaystyle{\mu},{\sigma}[/latex], 1).

If you convert your x values to z-scores, your distribution will be the standard normal distribution. You can enter a mean of zero and a standard deviation of one into the NORM.DIST function, or use the NORM.S.DIST function, which requires only two inputs: the z-score and cumulative = 1 (or True).
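The three area calculations above can also be sketched outside Excel. The example below assumes Python's standard-library `statistics.NormalDist`, and the mean of 100 and standard deviation of 15 are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical mean and standard deviation, not taken from the text.
mu, sigma = 100, 15
X = NormalDist(mu, sigma)

# Area to the left of a number, analogous to NORM.DIST(115, mu, sigma, 1).
p_left = X.cdf(115)

# Area between two numbers: larger left-area minus smaller left-area.
p_between = X.cdf(115) - X.cdf(85)

# Area to the right of a number: one minus the area to the left.
p_right = 1 - X.cdf(115)

# Same left-area after converting to a z-score,
# analogous to NORM.S.DIST(z, 1).
z = (115 - mu) / sigma
p_left_from_z = NormalDist().cdf(z)

print(f"{p_left:.4f} {p_between:.4f} {p_right:.4f} {p_left_from_z:.4f}")
```

Here `cdf` plays the role of the cumulative = 1 setting: it always returns the area to the left.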

Finding Probabilities for the Normal Distribution Using TI-83, 83+, 84, 84+

Go into 2nd DISTR. After pressing 2nd DISTR, press 2:normalcdf. The syntax is as follows: normalcdf(lower value, upper value, mean, standard deviation)

You get 1E99 (= [latex]10^{99}[/latex]) by pressing 1, the EE key (a 2nd key), and then 99. Or, you can enter 10^99 instead. The number [latex]10^{99}[/latex] is way out in the right tail of the normal curve, and the number [latex]-10^{99}[/latex] is way out in the left tail, so they stand in for unbounded upper and lower limits.

  • Area/Probability to the Left of a Number: To calculate P([latex]x < [/latex] number), use normalcdf(-1E99, number, [latex]\displaystyle{\mu},{\sigma}[/latex]).
  • Area/Probability Between Two Numbers: To calculate P([latex]number1 < x< number2 [/latex]), where number2 is greater than number1, use normalcdf(number1, number2, [latex]\displaystyle{\mu},{\sigma}[/latex]).
  • Area/Probability to the Right of a Number: To calculate P([latex]x > [/latex] number), use normalcdf(number, 1E99, [latex]\displaystyle{\mu},{\sigma}[/latex]).

If the mean and standard deviation are left out of the function, the calculator will use the mean of zero and standard deviation of one for the Standard Normal Distribution.
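To see what the calculator's four inputs do, here is a minimal Python sketch that mimics the normalcdf syntax. The helper function is hypothetical (written for this illustration with the standard-library `statistics.NormalDist`), and the input values are arbitrary.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper.

    Hypothetical helper that imitates the TI syntax; the defaults
    give the standard normal distribution.
    """
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Arbitrary illustrative inputs on the standard normal curve.
left = normalcdf(-1e99, 1.5)    # area to the left of 1.5
between = normalcdf(-1.0, 1.0)  # area between -1 and 1
right = normalcdf(1.5, 1e99)    # area to the right of 1.5

print(f"{left:.4f} {between:.4f} {right:.4f}")
```

As on the calculator, -1e99 and 1e99 act as stand-ins for the far left and far right tails.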

Finding Values for the Normal Distribution Using Excel

Access the NORM.INV function (under [latex]f_{x}[/latex]). There are three inputs required (the area to the left, the mean, and the standard deviation).

  • Number with a given probability/area to its left: NORM.INV(probability/area to the left, [latex]\displaystyle{\mu},{\sigma}[/latex]).

– If the problem gives you an area to the right of your unknown number, convert it to the area to the left by subtracting it from one. Example: the x value that has area 0.015 to its right also has 1 – 0.015 or 0.985 to its left.

– The NORM.S.INV function provides z-scores from the Standard normal distribution without having to enter the mean of zero and a standard deviation of one.
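The same lookups can be sketched in Python with the standard-library `statistics.NormalDist`, whose `inv_cdf` method takes an area to the left. This tool choice and the mean of 100 and standard deviation of 15 are assumptions made for illustration only.

```python
from statistics import NormalDist

# Hypothetical mean and standard deviation, not taken from the text.
X = NormalDist(mu=100, sigma=15)

# Value with area 0.985 to its left, analogous to NORM.INV(0.985, 100, 15).
x_left = X.inv_cdf(0.985)

# An area of 0.015 to the right is an area of 1 - 0.015 to the left.
x_right = X.inv_cdf(1 - 0.015)

# z-score with area 0.985 to its left, analogous to NORM.S.INV(0.985).
z = NormalDist().inv_cdf(0.985)

print(f"x = {x_left:.2f}, z = {z:.2f}")
```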

Finding Values for the Normal Distribution Using TI-83, 83+, 84, 84+

Go into 2nd DISTR. After pressing 2nd DISTR, press 3:invNorm. The syntax is as follows: invNorm(probability/area to the left, mean, standard deviation)

  • Number with a given probability/area to its left: invNorm(probability/area to the left, [latex]\displaystyle{\mu},{\sigma}[/latex]).

– If the problem gives you an area to the right of your unknown number, convert it to the area to the left by subtracting it from one. Example: the x value that has area 0.015 to its right also has 1 – 0.015 or 0.985 to its left.

– If the mean and standard deviation are left out of the function, the calculator will use the mean of zero and standard deviation of one for the Standard Normal Distribution.
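The default behavior described above connects z-scores and x values: a z-score found on the standard normal curve can be converted back with x = μ + zσ. The sketch below illustrates this with a hypothetical helper that imitates the invNorm syntax. It assumes Python's standard-library `statistics.NormalDist`, and the mean of 100 and standard deviation of 15 are made-up values.

```python
from statistics import NormalDist

def inv_norm(area_left, mean=0.0, sd=1.0):
    """Value with the given area to its left.

    Hypothetical helper that imitates the TI syntax; the defaults
    give the standard normal distribution.
    """
    return NormalDist(mean, sd).inv_cdf(area_left)

# Mean and standard deviation left out: the result is a z-score.
z = inv_norm(0.985)

# Converting the z-score back to an x value: x = mu + z * sigma.
x_from_z = 100 + z * 15

# Supplying the mean and standard deviation gives x directly.
x_direct = inv_norm(0.985, 100, 15)

print(f"z = {z:.4f}, x = {x_direct:.2f}")
```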

Note

To calculate the probability without the use of technology, use the probability tables provided here. The tables include instructions for how to use them.

Example

The final exam scores in a statistics class were normally distributed with a mean of 63 and a standard deviation of five.

  1. Find the probability that a randomly selected student scored more than 65 on the exam.
  2. Find the probability that a randomly selected student scored less than 85.
  3. Find the 90th percentile (that is, find the score k that has 90% of the scores below k and 10% of the scores above k).
  4. Find the 70th percentile (that is, find the score k such that 70% of scores are below k and 30% of the scores are above k).

Solution:  Let X = a score on the final exam. X ~ N(63, 5), where μ = 63 and σ = 5

  1. Draw a graph. Then, find P(x > 65).
    This is a normal distribution curve. The peak of the curve coincides with the point 63 on the horizontal axis. The point 65 is also labeled. A vertical line extends from point 65 to the curve. The probability area to the right of 65 is shaded; it is equal to 0.3446.

Using Excel: 1 – NORM.DIST(65, 63, 5, 1). The result is [latex]P(x > 65)=0.3446[/latex].

Using TI-83/84: 2nd DISTR, 2:normalcdf(65,1E99,63,5). The result is [latex]P(x > 65)=0.3446[/latex].

The probability that any student selected at random scores more than 65 is 0.3446.

2. Draw a graph. Then find P(x < 85), and shade the graph.

Using Excel: NORM.DIST(85, 63, 5, 1). The result is [latex]P(x < 85)=1.0000[/latex] when rounded to four decimal places.

Using TI-83/84: 2nd DISTR, 2:normalcdf(-1E99,85,63,5). The result is [latex]P(x < 85)=1.0000[/latex] when rounded to four decimal places.

The probability that one student scores less than 85 is approximately one (or 100%).

3. Find the 90th percentile. For each problem or part of a problem, draw a new graph. Draw the x-axis. Shade the area that corresponds to the 90th percentile.
Let k = the 90th percentile. The variable k is located on the x-axis. P(x < k) is the area to the left of k. The 90th percentile k separates the exam scores into those that are the same or lower than k and those that are the same or higher. Ninety percent of the test scores are the same or lower than k, and ten percent are the same or higher. The variable k is often called a critical value.
This is a normal distribution curve. The peak of the curve coincides with the point 63 on the horizontal axis. A point, k, is labeled to the right of 63. A vertical line extends from k to the curve. The area under the curve to the left of k is shaded. This represents the probability that x is less than k: P(x < k) = 0.90

Using Excel: NORM.INV(0.90, 63, 5). The result is [latex]\approx69.4[/latex].

Using TI-83/84: 2nd DISTR, 3:invNorm(0.90,63,5). The result is [latex]\approx69.4[/latex].

The 90th percentile is 69.4. This means that 90% of the test scores fall at or below 69.4 and 10% fall at or above.

4. Find the 70th percentile. Draw a new graph and label it appropriately.

Using Excel: NORM.INV(0.70, 63, 5). The result is [latex]\approx65.6[/latex].

Using TI-83/84: 2nd DISTR, 3:invNorm(0.70,63,5). The result is [latex]\approx65.6[/latex].

k = 65.6 The 70th percentile is 65.6. This means that 70% of the test scores fall at or below 65.6 and 30% fall at or above 65.6.
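The four answers in this example can be cross-checked with a short script. This sketch assumes Python's standard-library `statistics.NormalDist` as a third tool alongside Excel and the TI calculators; the variable names are illustrative.

```python
from statistics import NormalDist

# Exam scores: X ~ N(63, 5).
scores = NormalDist(mu=63, sigma=5)

# 1. P(x > 65): one minus the area to the left of 65.
p_more_than_65 = 1 - scores.cdf(65)

# 2. P(x < 85): area to the left of 85.
p_less_than_85 = scores.cdf(85)

# 3. and 4. Percentiles: values with 90% and 70% of the area to the left.
percentile_90 = scores.inv_cdf(0.90)
percentile_70 = scores.inv_cdf(0.70)

print(f"P(x > 65) = {p_more_than_65:.4f}")
print(f"P(x < 85) = {p_less_than_85:.4f}")
print(f"90th percentile = {percentile_90:.1f}")
print(f"70th percentile = {percentile_70:.1f}")
```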

try it

The golf scores for a school team were normally distributed with a mean of 68 and a standard deviation of three.

Find the probability that a randomly selected golfer scored less than 65.

Example

A personal computer is used for office work at home, research, communication, personal finances, education, entertainment, social networking, and a myriad of other things. Suppose that the average number of hours a household personal computer is used for entertainment is two hours per day. Assume the times for entertainment are normally distributed and the standard deviation for the times is half an hour.

  1. Find the probability that a household personal computer is used for entertainment between 1.8 and 2.75 hours per day.
  2. Find the maximum number of hours per day that the bottom quartile of households uses a personal computer for entertainment.
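One way to approach both parts is sketched below, assuming Python's standard-library `statistics.NormalDist` (a tool the text does not use). Part 2 reads "the maximum for the bottom quartile" as the 25th percentile, the value with 25% of the area to its left.

```python
from statistics import NormalDist

# Entertainment hours per day: mean 2 hours, standard deviation 0.5 hours.
hours = NormalDist(mu=2, sigma=0.5)

# 1. P(1.8 < x < 2.75): larger left-area minus smaller left-area.
p_between = hours.cdf(2.75) - hours.cdf(1.8)

# 2. Bottom quartile cutoff: value with area 0.25 to its left.
first_quartile = hours.inv_cdf(0.25)

print(f"P(1.8 < x < 2.75) = {p_between:.4f}")
print(f"25th percentile = {first_quartile:.2f} hours")
```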

try it

The golf scores for a school team were normally distributed with a mean of 68 and a standard deviation of three. Find the probability that a golfer scored between 66 and 70.

Example

There are approximately one billion smartphone users in the world today. In the United States the ages 13 to 55+ of smartphone users approximately follow a normal distribution with approximate mean and standard deviation of 36.9 years and 13.9 years, respectively.

  1. Determine the probability that a random smartphone user in the age range 13 to 55+ is between 23 and 64.7 years old.
  2. Determine the probability that a randomly selected smartphone user in the age range 13 to 55+ is at most 50.8 years old.
  3. Find the 80th percentile of this distribution, and interpret it in a complete sentence.
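The three parts can be set up as follows. This is a sketch that assumes Python's standard-library `statistics.NormalDist`; because the distribution is continuous, "at most 50.8" uses the same area as "less than 50.8".

```python
from statistics import NormalDist

# Smartphone user ages: mean 36.9 years, standard deviation 13.9 years.
ages = NormalDist(mu=36.9, sigma=13.9)

# 1. P(23 < x < 64.7)
p_between = ages.cdf(64.7) - ages.cdf(23)

# 2. P(x <= 50.8): area to the left of 50.8.
p_at_most = ages.cdf(50.8)

# 3. 80th percentile: the age with 80% of the area to its left.
percentile_80 = ages.inv_cdf(0.80)

print(f"P(23 < x < 64.7) = {p_between:.4f}")
print(f"P(x <= 50.8) = {p_at_most:.4f}")
print(f"80th percentile = {percentile_80:.1f} years")
```

For the interpretation in part 3, the 80th percentile is the age at or below which 80% of these smartphone users fall.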

try it

Use the information in the previous example to answer the following questions.

  1. Find the 30th percentile, and interpret it in a complete sentence.
  2. What is the probability that the age of a randomly selected smartphone user in the range 13 to 55+ is less than 27 years old?

Example

There are approximately one billion smartphone users in the world today. In the United States the ages 13 to 55+ of smartphone users approximately follow a normal distribution with approximate mean and standard deviation of 36.9 years and 13.9 years respectively. Using this information, answer the following questions (round answers to one decimal place).

  1. Calculate the interquartile range (IQR).
  2. Forty percent of the ages that range from 13 to 55+ are at least what age?
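A possible setup for these two questions is sketched below, assuming Python's standard-library `statistics.NormalDist`. The IQR is the third quartile minus the first quartile, and "forty percent are at least k" means 40% of the area lies to the right of k, so 60% lies to its left.

```python
from statistics import NormalDist

# Smartphone user ages: mean 36.9 years, standard deviation 13.9 years.
ages = NormalDist(mu=36.9, sigma=13.9)

# 1. IQR = Q3 - Q1, where Q1 and Q3 have areas 0.25 and 0.75 to the left.
q1 = ages.inv_cdf(0.25)
q3 = ages.inv_cdf(0.75)
iqr = q3 - q1

# 2. Area 0.40 to the right means area 1 - 0.40 = 0.60 to the left.
age_40_percent_above = ages.inv_cdf(1 - 0.40)

print(f"Q1 = {q1:.1f}, Q3 = {q3:.1f}, IQR = {iqr:.1f}")
print(f"40% of ages are at least {age_40_percent_above:.1f}")
```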

try it

Two thousand students took an exam. The scores on the exam have an approximate normal distribution with a mean μ = 81 points and standard deviation σ = 15 points.

  1. Calculate the first- and third-quartile scores for this exam.
  2. The middle 50% of the exam scores are between what two values?

Example

A citrus farmer who grows mandarin oranges finds that the diameters of mandarin oranges harvested on his farm follow a normal distribution with a mean diameter of 5.85 cm and a standard deviation of 0.24 cm.

  1. Find the probability that a randomly selected mandarin orange from this farm has a diameter larger than 6.0 cm. Sketch the graph.
  2. The middle 20% of mandarin oranges from this farm have diameters between ______ and ______.
  3. Find the 90th percentile for the diameters of mandarin oranges, and interpret it in a complete sentence.
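The sketch below shows one way to set up all three parts, assuming Python's standard-library `statistics.NormalDist`. The middle 20% is centered on the mean, so it leaves 40% of the area in each tail and runs from the 40th percentile to the 60th percentile.

```python
from statistics import NormalDist

# Mandarin orange diameters: mean 5.85 cm, standard deviation 0.24 cm.
diameters = NormalDist(mu=5.85, sigma=0.24)

# 1. P(x > 6.0): one minus the area to the left of 6.0.
p_larger = 1 - diameters.cdf(6.0)

# 2. Middle 20%: from the 40th percentile to the 60th percentile.
middle_low = diameters.inv_cdf(0.40)
middle_high = diameters.inv_cdf(0.60)

# 3. 90th percentile: the diameter with 90% of the area to its left.
percentile_90 = diameters.inv_cdf(0.90)

print(f"P(x > 6.0) = {p_larger:.4f}")
print(f"Middle 20%: {middle_low:.2f} cm to {middle_high:.2f} cm")
print(f"90th percentile = {percentile_90:.2f} cm")
```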

try it

Using the information from the previous example, answer the following:

  1. The middle 40% of mandarin oranges from this farm are between ______ and ______.
  2. Find the 16th percentile and interpret it in a complete sentence.

Concept Review

The normal distribution, which is continuous, is the most important of all the probability distributions. Its graph is bell-shaped, and this bell-shaped curve is used in almost all disciplines. As with any continuous probability distribution, the total area under the curve is one. The parameters of the normal distribution are the mean µ and the standard deviation σ. A special normal distribution, called the standard normal distribution, is the distribution of z-scores. Its mean is zero, and its standard deviation is one.

Formula Review

Normal Distribution:
X ~ N(µ, σ) where µ is the mean and σ is the standard deviation.

Standard Normal Distribution:
Z ~ N(0, 1).

Calculator function for probability: normalcdf (lower x value of the area, upper x value of the area, mean, standard deviation).

Excel function for probability/area to the left: NORM.DIST(x value, mean, standard deviation,1).

Calculator function for the kth percentile: k = invNorm (area to the left of k, mean, standard deviation)

Excel function for the kth percentile: k = NORM.INV(area to the left of k, mean, standard deviation)