FALL 2019

Normal Distribution

MOTIVATING EXAMPLE

  • Red line - true density
  • Blue dashed line - normal model with mean and SD from observed heights

NORMAL DISTRIBUTION

The normal distribution is the most common distribution in statistics. It has a mean \(\mu\) and a standard deviation \(\sigma\), which describe it entirely.

If we assume a random variable \(X\) is normally distributed (mean \(\mu\), SD \(\sigma\)), then

  • \(X\sim N(\mu,\sigma)\)
  • Distribution of \(X\) is:
    • Unimodal
    • Symmetric
    • “Bell-shaped”

HOW DO WE COMPARE NORMAL CURVES?

STANDARDIZING WITH Z-SCORES

The Z-score of an observation is the number of standard deviations it falls above or below the mean. The Z-score for an observation \(x\) that follows a distribution with mean \(\mu\) and standard deviation \(\sigma\) is computed as:

\[Z=\frac{x-\mu}{\sigma}\]
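A minimal R sketch of this formula, assuming the height model from the practice problems in these notes (mean 70.25 inches, SD 3.01 inches). The helper name `z_score` is hypothetical and only used for illustration.

```r
# Hypothetical helper: number of SDs an observation lies above or below the mean
z_score <- function(x, mu, sigma) (x - mu) / sigma

# Assumed model: heights with mean 70.25 in and SD 3.01 in
z <- z_score(c(69, 72), mu = 70.25, sigma = 3.01)
z
# 69 in is below the mean (negative Z); 72 in is above the mean (positive Z)
```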

STANDARDIZED NORMAL CURVES - HEIGHTS

STANDARD NORMAL DISTRIBUTION

The normal distribution with \(\mu=0\) and \(\sigma=1\) is called the standard normal distribution.
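One way to see why the standard normal is enough: a lower-tail probability computed on the original scale matches the one computed from the Z-score on the standard normal scale. The sketch below assumes the height model used elsewhere in these notes (mean 70.25, SD 3.01).

```r
# Assumed model: X ~ N(70.25, 3.01)
mu <- 70.25
sigma <- 3.01
x <- 72

p_original <- pnorm(x, mean = mu, sd = sigma)  # P(X <= 72) on the original scale
p_standard <- pnorm((x - mu) / sigma)          # P(Z <= z) with mean 0 and SD 1 (the defaults)

all.equal(p_original, p_standard)              # the two routes agree
```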

WHY STANDARDIZE?

  • Facilitate comparisons
    • Percentiles for standardized testing, SAT vs ACT
    • Percentiles for height and weight for toddlers in different populations
  • Z-scores have a special meaning
    • number of standard deviations above or below the mean

NORMAL DISTRIBUTION - PROBABILITY

If \(X \sim N(\mu, \sigma)\), then \(X\) is a continuous random variable (recall end of Chapter 3 lecture).

What does probability mean for a continuous random variable?

  • Defined as area under the curve (e.g. \(P(69\leq X \leq 73)\)):
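The "area under the curve" idea can be checked numerically. This sketch, which assumes the height model from the practice problems (mean 70.25, SD 3.01), integrates the normal density between 69 and 73 and compares the result with a difference of two lower-tail probabilities.

```r
# Assumed model: X ~ N(70.25, 3.01)
mu <- 70.25
sigma <- 3.01

# Area under the density curve between 69 and 73 (numerical integration)
area <- integrate(dnorm, lower = 69, upper = 73, mean = mu, sd = sigma)$value

# Same probability from lower-tail areas: P(X <= 73) - P(X <= 69)
tail_diff <- pnorm(73, mean = mu, sd = sigma) - pnorm(69, mean = mu, sd = sigma)

c(area = area, tail_diff = tail_diff)
```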

FINDING TAIL AREAS

How do we find the area under the curve (probability) if \(X \sim N(\mu,\sigma)\)?

  1. Calculus (not covered in this course)
  2. R: pnorm() function
  3. Normal table

For (2) and (3), we assume our process follows a normal distribution - this is a MODEL - it is NOT EXACT.

FINDING TAIL AREAS: R

pnorm() gives the probability that an observation falls below a given value (the lower-tail area), for a specified mean and SD.

## P(X < 69):
pnorm(69, mean=mean(cdc_m$height), sd=sd(cdc_m$height))
## [1] 0.338728
## P(X < 72):
pnorm(72, mean=mean(cdc_m$height), sd=sd(cdc_m$height))
## [1] 0.7193796

FINDING TAIL AREAS: TABLE

PRACTICE

According to the CDC database, the mean height of US men is 70.25 inches and the SD is 3.01 inches. If we model height (random variable \(X\)) as normal (with the previously stated mean and SD), what is the probability a male is between 69 and 72 inches tall?

Step 1: Draw a picture to identify what you want

PRACTICE

Step 2: Identify what information you can get

## P(X < 69)
pnorm(q=69, 
      mean=mean(cdc_m$height), 
      sd=sd(cdc_m$height))
## [1] 0.338728

## P(X < 72)
pnorm(q=72, 
      mean=mean(cdc_m$height), 
      sd=sd(cdc_m$height))
## [1] 0.7193796

PRACTICE

Step 3: Connect what we have to what we want

Both pnorm() and the Z table give lower tail probabilities. To get what we want:

\[P(69 \leq X \leq 72)=P(X\leq 72)-P(X\leq 69)=0.7194-0.3387=0.3807\]
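The subtraction can be done in a single R expression. This sketch uses the rounded mean (70.25) and SD (3.01) stated in the problem rather than the `cdc_m` data, so later decimals may differ slightly from the output shown earlier.

```r
# Rounded parameters from the problem statement (not computed from cdc_m)
mu <- 70.25
sigma <- 3.01

# P(69 <= X <= 72) = P(X <= 72) - P(X <= 69)
p_between <- pnorm(q = 72, mean = mu, sd = sigma) -
  pnorm(q = 69, mean = mu, sd = sigma)
p_between  # about 0.38
```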

PRACTICE

At the Heinz ketchup factory, the amounts that go into bottles of ketchup are supposed to be normally distributed with mean 36 oz. and standard deviation 0.11 oz. Once every 30 minutes a bottle is selected from the production line, and its contents are noted precisely. If the amount of ketchup in the bottle is below 35.8 oz. or above 36.2 oz., then the bottle fails the quality control inspection. What percent of bottles have less than 35.8 ounces of ketchup?

\(P(X<35.8)=P(X\leq 35.8)=\) 0.0345182
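A sketch of how this value can be obtained with pnorm(), using the mean and SD from the problem. The upper tail is included for comparison; `lower.tail = FALSE` asks pnorm() for the area above the cutoff instead of below it.

```r
# X ~ N(36, 0.11), as stated in the problem
p_low  <- pnorm(q = 35.8, mean = 36, sd = 0.11)                      # P(X < 35.8)
p_high <- pnorm(q = 36.2, mean = 36, sd = 0.11, lower.tail = FALSE)  # P(X > 36.2)

# Expressed as percentages; the two tails match because 35.8 and 36.2
# are equally far from the mean and the normal curve is symmetric
round(100 * c(below = p_low, above = p_high), 2)
```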

PRACTICE

What percent of bottles pass quality control inspection?

## P(X < 36.2)-P(X < 35.8)
pnorm(q=36.2, mean=36, sd=0.11)-pnorm(q=35.8, mean=36, sd=0.11)
## [1] 0.9309637
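An equivalent route, sketched here as a cross-check, is to add up the two ways a bottle can fail and take the complement.

```r
# P(fail) = P(X < 35.8) + P(X > 36.2), with X ~ N(36, 0.11)
p_fail <- pnorm(q = 35.8, mean = 36, sd = 0.11) +
  pnorm(q = 36.2, mean = 36, sd = 0.11, lower.tail = FALSE)

# P(pass) = 1 - P(fail)
p_pass <- 1 - p_fail
p_pass
```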

FINDING CUTOFF POINTS

Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures?

  • Use R: qnorm() function (what does this return?)
qnorm(0.03)
## [1] -1.880794
  • Use Z table (work backwards)

FINDING CUTOFF POINTS

Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the lowest 3% of human body temperatures on the original scale?

  • Use R: qnorm() function (what does this return?)
co <- qnorm(0.03)
co
## [1] -1.880794

\(Z=\frac{x-\mu}{\sigma}=\frac{x-98.2}{0.73}=-1.88\)

\(x=-1.88\times 0.73+98.2\approx 96.8\) F
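The back-transformation can also be left to R: qnorm() accepts `mean` and `sd` arguments and then returns the cutoff on the original scale. A short sketch comparing both routes:

```r
# Route 1: Z cutoff from the standard normal, then undo the standardization
z_cut <- qnorm(0.03)
x_by_hand <- z_cut * 0.73 + 98.2

# Route 2: let qnorm() work on the original scale directly
x_direct <- qnorm(0.03, mean = 98.2, sd = 0.73)

c(by_hand = x_by_hand, direct = x_direct)  # both about 96.8 F
```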

PRACTICE

Body temperatures of healthy humans are distributed nearly normally with mean 98.2 F and standard deviation 0.73 F. What is the cutoff for the highest 10% of human body temperatures (on the original scale)?

co <- qnorm(0.90)
co
## [1] 1.281552

\(Z=\frac{x-\mu}{\sigma}=\frac{x-98.2}{0.73}=1.28\)

\(x=1.28\times 0.73+98.2\approx 99.1\) F
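For an upper-tail cutoff there are two equivalent ways to call qnorm(): convert "highest 10%" to "lowest 90%", or keep 0.10 and set `lower.tail = FALSE`. A sketch on the original temperature scale:

```r
# Highest 10% is the same cutoff as the lowest 90%
x_lower90 <- qnorm(0.90, mean = 98.2, sd = 0.73)

# Or state the upper-tail area directly
x_upper10 <- qnorm(0.10, mean = 98.2, sd = 0.73, lower.tail = FALSE)

c(x_lower90, x_upper10)  # both about 99.1 F
```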

68-95-99.7 RULE

Rule of thumb for the probability of falling within 1, 2, and 3 standard deviations of the mean in the normal distribution.
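The three percentages in the rule's name can be recovered from the standard normal distribution. A short sketch:

```r
# P(-k <= Z <= k) for k = 1, 2, 3 standard deviations
k <- 1:3
within_k_sd <- pnorm(k) - pnorm(-k)
round(within_k_sd, 4)
## [1] 0.6827 0.9545 0.9973
```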

USING THE 68-95-99.7 RULE

SAT scores are distributed nearly normally with mean 1500 and standard deviation 300.
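Applied to this SAT model, the rule gives the score ranges below. This is a sketch; the column names are made up for illustration, and the shares are the rule-of-thumb values, not exact probabilities.

```r
# SAT model from the slide: mean 1500, SD 300
mu <- 1500
sigma <- 300
k <- 1:3

sat_ranges <- data.frame(
  k_sd         = k,
  lower        = mu - k * sigma,      # 1200,  900,  600
  upper        = mu + k * sigma,      # 1800, 2100, 2400
  approx_share = c(0.68, 0.95, 0.997) # rule-of-thumb shares
)
sat_ranges
```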

PRACTICE

Which of the following is false?

  1. Majority of Z scores in a right skewed distribution are negative.
  2. In skewed distributions the Z score of the mean might be different than 0.
  3. For a normal distribution, IQR is less than 2 x SD.
  4. Z scores are helpful for determining how unusual a data point is compared to the rest of the data in the distribution.
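One way to probe the first three statements numerically is sketched below. The exponential sample is only an assumed example of a right-skewed distribution, and the seed is arbitrary.

```r
set.seed(1)                   # arbitrary seed, for reproducibility
x <- rexp(10000)              # assumed example of right-skewed data
z <- (x - mean(x)) / sd(x)    # Z-scores computed from the sample itself

share_negative <- mean(z < 0)                   # statement 1: share of negative Z-scores
z_of_mean <- (mean(x) - mean(x)) / sd(x)        # statement 2: Z-score of the mean
iqr_in_sd <- qnorm(0.75) - qnorm(0.25)          # statement 3: normal IQR in SD units

c(share_negative = share_negative, z_of_mean = z_of_mean, iqr_in_sd = iqr_in_sd)
```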

Binomial Distribution

MOTIVATING EXAMPLE

When a certain telemarketer makes a call, they have a 10% chance of making a sale (Y) and a 90% chance of not making a sale (N).

If the telemarketer makes three calls, what is the probability of making exactly one sale?

Scenario 1:

\[P(YNN)=P(Y)P(N)P(N)=0.1\times 0.9\times 0.9=0.081\]

Scenario 2:

\[P(NYN)=P(N)P(Y)P(N)=0.9\times 0.1\times 0.9=0.081\]

Scenario 3:

\[P(NNY)=P(N)P(N)P(Y)=0.9\times 0.9\times 0.1=0.081\]

Probability of exactly one sale: \(0.081+0.081+0.081=3\times 0.081=0.243\)
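The scenario-by-scenario reasoning can be reproduced by brute force. This sketch lists all eight Y/N sequences for three calls, assuming independent calls with a 10% chance of a sale; the variable names are chosen for illustration.

```r
p_sale <- 0.1

# All 2^3 = 8 possible outcomes of three calls
calls <- expand.grid(call1 = c("Y", "N"),
                     call2 = c("Y", "N"),
                     call3 = c("Y", "N"),
                     stringsAsFactors = FALSE)

n_sales <- rowSums(calls == "Y")                      # sales in each scenario
prob <- p_sale^n_sales * (1 - p_sale)^(3 - n_sales)   # probability of each scenario

sum(n_sales == 1)        # 3 scenarios with exactly one sale
sum(prob[n_sales == 1])  # 0.243
```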

MOTIVATING EXAMPLE

The question on the previous slide asked for the probability of a given number of “successes”, \(k\), in a given number of independent trials, \(n\) (\(k=1\) success in \(n=3\) trials).

We calculated this probability as: \[\# \ scenarios \times P(single \ scenario)\]

Fortunately, there is a less tedious way to count the “number of scenarios”.

CHOOSE FUNCTION

Writing out the number of scenarios is possible for small examples, like the telemarketer problem. For larger \(n\) and/or \(k\) different than 1, (e.g. \(n=9\), \(k=2\)), this gets much more tedious and error prone (feel free to try by modifying the previous example).

The choose function is used to calculate the number of ways to choose \(k\) successes in \(n\) trials:

\[\left(\begin{matrix}n \\ k \end{matrix}\right)=\frac{n!}{k!(n-k)!}\]

Factorial:

  • \(n!=n\times(n-1)\times\cdots\times 2 \times 1\)
  • \(k!=k\times(k-1)\times\cdots\times 2\times 1\)
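In R, the built-in factorial() and choose() functions cover both definitions. A sketch for the \(n=9\), \(k=2\) case mentioned above, with combn() used to list the scenarios explicitly:

```r
n <- 9
k <- 2

by_formula <- factorial(n) / (factorial(k) * factorial(n - k))  # n! / (k! (n-k)!)
by_choose  <- choose(n, k)                                      # built-in choose function
n_listed   <- ncol(combn(n, k))   # combn() lists every way to pick k of the n trials

c(by_formula, by_choose, n_listed)  # all 36
```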

PRACTICE

Which of the following is false?

  1. There are \(n\) ways of getting 1 success in \(n\) trials, \(\left(\begin{matrix}n \\ 1 \end{matrix}\right)=n\).
  2. There is only 1 way of getting \(n\) successes in \(n\) trials, \(\left(\begin{matrix}n \\ n \end{matrix}\right)=1\).
  3. There is only 1 way of getting \(n\) failures in \(n\) trials, \(\left(\begin{matrix}n \\ 0 \end{matrix}\right)=1\).
  4. There are \(n-1\) ways of getting \(n-1\) successes in \(n\) trials, \(\left(\begin{matrix}n \\ n-1 \end{matrix}\right)=n-1\).

BINOMIAL DISTRIBUTION

Binomial distribution: used to describe the number of successes \(k\) in a fixed number of independent trials \(n\)

\[P(X=k)=\underbrace{\left(\begin{matrix}n \\ k \end{matrix}\right)}_{\# \ scenarios}\underbrace{p^k(1-p)^{n-k}}_{P(single \ scenario)}=\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}\]

  • \(p:\) probability of “success”
  • \(\mu=E(X)=np\): expectation/population mean
  • \(\sigma^2=Var(X)=np(1-p)\): population variance
  • \(\sigma=\sqrt{np(1-p)}\): population standard deviation
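The formulas above can be checked against R's dbinom(). This sketch uses the telemarketer values (\(n=3\), \(p=0.1\)) and computes the mean and variance directly from the probabilities.

```r
n <- 3
p <- 0.1
k <- 0:n

# Binomial formula: (# scenarios) x P(single scenario)
pmf <- choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(pmf, dbinom(x = k, size = n, prob = p))  # formula matches dbinom()

mu     <- sum(k * pmf)            # equals n * p = 0.3
sigma2 <- sum((k - mu)^2 * pmf)   # equals n * p * (1 - p) = 0.27
c(mean = mu, variance = sigma2, sd = sqrt(sigma2))
```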

BINOMIAL CONDITIONS

The random variable \(X\) is binomial if the following conditions are met:

  1. The trials are independent.
  2. The number of trials, \(n\), is fixed.
  3. Each trial outcome can be classified as a success or failure.
  4. The probability of a success, \(p\), is the same for each trial.

All four of these conditions must be satisfied to have a binomial distribution. You should know these conditions.

PRACTICE

Revisiting our motivating example:

When a certain telemarketer makes a call, they have a 10% chance of making a sale (Y) and a 90% chance of not making a sale (N).

Question: Is this binomial?

  1. Are the trials independent?
  2. Is the number of trials, \(n\), fixed?
  3. Can each trial outcome be classified as a success or failure?
  4. Is the probability of success, \(p\), the same for each trial?

  1. Are the trials independent? Yes
  2. Is the number of trials, \(n\), fixed? Yes, \(n=3\)
  3. Can each trial outcome be classified as a success or failure? Yes
  4. Is the probability of success, \(p\), the same for each trial? Yes, \(p=0.1\)

FINDING PROBABILITIES FOR BINOMIAL

How do we find probabilities when \(X \sim Binomial(n,p)\)?

  1. Use the formula (OK for small \(n\))
  2. R: dbinom() and pbinom() functions
  3. Binomial table
  4. Normal approximation to the binomial

For all options, we assume our process follows a binomial distribution - this is a MODEL

FINDING PROBABILITIES FOR BINOMIAL

You can always calculate binomial probabilities using the formula, but it gets unwieldy quickly as \(n\) increases. For small enough \(n\), it is practical to use the formula:

\[P(X=k)=\left(\begin{matrix}n \\ k \end{matrix}\right)p^k(1-p)^{n-k}=\frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}\]

FINDING PROBABILITIES FOR BINOMIAL

dbinom() gives us the probability of observing exactly a certain number of successes \(k\) (x argument) given a fixed number of trials, \(n\) (size argument) and a probability of success, \(p\) (prob argument)

EXAMPLE: Suppose there is a quiz with 10 multiple choice questions, each having four possible answers. If a student guesses randomly on each question, what is the probability that they get exactly 8 questions correct?

## P(X=8), X ~ Binomial(10, 0.25)
dbinom(x=8, size=10, prob=0.25)
## [1] 0.0003862381

EXAMPLE: Suppose there is a quiz with 10 multiple choice questions, each having four possible answers. If a student guesses randomly on each question, what is the probability that they get at least 8 questions correct?

## P(X>=8), X ~ Binomial(10, 0.25)
 (dbinom(x=8, size=10, prob=0.25)
 +dbinom(x=9, size=10, prob=0.25)
 +dbinom(x=10, size=10, prob=0.25))
## [1] 0.000415802

FINDING PROBABILITIES FOR BINOMIAL

pbinom() gives us the probability of observing at most a certain number of successes \(k\) (q argument) given a fixed number of trials, \(n\) (size argument) and a probability of success, \(p\) (prob argument)

EXAMPLE: Suppose there is a quiz with 10 multiple choice questions, each having four possible answers. If a student guesses randomly on each question, what is the probability that they get at least 8 questions correct?

## P(X >= 8), X ~ Binomial(10, 0.25)
1-pbinom(q=7, size=10, prob=0.25)
## [1] 0.000415802
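The dbinom() and pbinom() routes to \(P(X\geq 8)\) can be compared side by side. The sketch below also shows the `lower.tail = FALSE` option of pbinom(), which returns \(P(X > q)\) directly.

```r
# X ~ Binomial(10, 0.25); three equivalent ways to get P(X >= 8)
p_sum   <- sum(dbinom(x = 8:10, size = 10, prob = 0.25))   # add P(X=8), P(X=9), P(X=10)
p_comp  <- 1 - pbinom(q = 7, size = 10, prob = 0.25)       # 1 - P(X <= 7)
p_upper <- pbinom(q = 7, size = 10, prob = 0.25,
                  lower.tail = FALSE)                      # P(X > 7)

c(p_sum, p_comp, p_upper)
```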

NORMAL APPROXIMATION

The binomial distribution with probability of success, \(p\), is nearly normal when the sample size \(n\) is sufficiently large such that \(np\) and \(n(1-p)\) are both at least 10. The approximate normal distribution has parameters corresponding to the mean and standard deviation of the binomial distribution:

  • \(\mu=np\)
  • \(\sigma=\sqrt{np(1-p)}\)
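A sketch of the approximation in R. The values \(n=100\), \(p=0.25\), and the cutoff 30 are hypothetical, chosen only so that the expected numbers of successes and failures are both well above 10. No continuity correction is applied, so the two answers are close but not identical.

```r
# Hypothetical values, for illustration only
n <- 100
p <- 0.25

# Expected successes and failures should both be at least 10
c(successes = n * p, failures = n * (1 - p))

mu    <- n * p                    # mean of the binomial
sigma <- sqrt(n * p * (1 - p))    # SD of the binomial

exact         <- pbinom(q = 30, size = n, prob = p)    # exact P(X <= 30)
approx_normal <- pnorm(q = 30, mean = mu, sd = sigma)  # normal approximation

c(exact = exact, approx_normal = approx_normal)
```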

SO YOU WANT TO COMPUTE BINOMIAL PROBABILITIES

Steps

  1. Check that the model is appropriate (check four conditions).
  2. Identify \(n\), \(p\), and \(k\).
  3. Identify what probability you want to compute using appropriate notation.
  4. Use R (if available), R output (possibly provided on exams), or formulas to determine the probability. If you are doing it by hand, check to see if it is appropriate to use a normal approximation.
  5. Interpret results.

REFERENCES

  • Diez et al. (2019) OpenIntro Statistics, Fourth Edition