FALL 2019
Random process: a process/situation where we can identify a set of possible events/outcomes that could occur, but we don’t know which event will happen
Examples include coin tosses, die rolls, and daily stock prices
Often it is helpful to model processes as random that are not truly random
Terminology
Random process: Whether the Dow Jones goes up or down tomorrow
What is the sample space? Set of outcomes: up, down, no change
Give an example of an event: Stock market goes up
How would we try to find the probability of that event? Sample from historical data and find the observed proportion of times the stock market went up relative to the previous day
Law of large numbers: as the sample size grows (i.e. as more observations are recorded), the observed proportion of times an event occurs (\(\hat{p}_n\)) converges to the probability of that event, \(p\).
Example: flips of a fair coin; \(\hat{p}_n\) is the proportion of observed heads
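The convergence of \(\hat{p}_n\) to \(p\) can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the function name `running_proportion_heads` and the fixed seed are assumptions chosen for illustration.

```python
import random


def running_proportion_heads(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return p-hat after each flip."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = 0
    proportions = []
    for n in range(1, n_flips + 1):
        heads += rng.random() < 0.5  # True counts as 1, so this adds one head
        proportions.append(heads / n)
    return proportions


p_hat = running_proportion_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: p_hat = {p_hat[n - 1]:.4f}")
```

For small \(n\) the printed proportions can sit far from 0.5; as \(n\) grows they should settle close to it.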
Assume you are tossing a fair coin and you observe 10 heads in a row. What is the probability that the eleventh toss will also result in a head? Is it 0.5? Less than 0.5? More than 0.5?
It is still 0.5; \(P(11^{th} \text{ toss H})=0.5\).
Two events/outcomes are disjoint if they cannot both happen. In other words, if you know one event happens, you also know that the other does not happen. Such events are also called mutually exclusive.
What are some examples of disjoint events?
If two events \(A\) and \(B\) are disjoint, then the probability that at least one of them occurs is
\[P(A \text{ or } B)=P(A)+P(B)\] If there are more than two disjoint events (\(k\), in this case), then the probability that at least one of them occurs is
\[P(A_1 \text{ or } A_2 \text{ or } \cdots \text{ or } A_k)=P(A_1)+P(A_2)+\cdots+P(A_k)\]
Let’s play a game. As a class, you can pick a card color (red/black) and then a suit corresponding to that color (diamonds/hearts if red; spades/clubs if black). Then, we will draw 10 cards from a standard, well-shuffled deck (52 cards). The class gets a point for each card matching its chosen suit and color. I get a point otherwise.
As we play, think about whether this is a fair game.
What is the probability that I get a point on a draw?
Suppose the class picked hearts. Then,
\[P(\text{not } heart)=P(diamond)+P(spade)+P(club)\] \[P(\text{not } heart)=1/4+1/4+1/4=3/4\]
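The addition rule for disjoint events can be checked by counting cards directly. The sketch below assumes a single draw from a full 52-card deck; the variable names are illustrative only.

```python
from fractions import Fraction

suits = ["hearts", "diamonds", "spades", "clubs"]
ranks = range(1, 14)  # 13 ranks per suit
deck = [(rank, suit) for suit in suits for rank in ranks]

# Probability of each suit on a single draw from the full deck.
p_suit = {
    suit: Fraction(sum(1 for card in deck if card[1] == suit), len(deck))
    for suit in suits
}

# The three non-heart suits are disjoint events, so their probabilities add.
p_not_heart = p_suit["diamonds"] + p_suit["spades"] + p_suit["clubs"]
print(p_not_heart)  # 3/4
```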
Consider the example about the Dow Jones. What is the probability that the Dow Jones does not go down tomorrow? Denote this using probability notation.
\[P(\text{not } D)=P(U \text{ or } NC)=P(U) +P(NC)\]
Let \(A\) and \(B\) be any two events (disjoint or not). Then, the probability that at least one of them occurs is \[P(A \text{ or } B)=P(A)+P(B)-P(A \text{ and } B)\] where \(P(A \text{ and }B)\) is the probability that both events occur.
Why do we need to subtract \(P(A \text{ and }B)\) in this expression?
Recall our smallpox example from last week.
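Counts by inoculation status (rows) and result (columns):

Inoculated | died | lived | Sum |
---|---|---|---|
no | 844 | 5136 | 5980 |
yes | 6 | 238 | 244 |
Sum | 850 | 5374 | 6224 |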
What is the probability that an individual was inoculated (V) or lived (L)?
\[P(V \text{ or } L)=P(V)+P(L)-P(V \text{ and } L)\]
\[P(V \text{ or } L)=244/6224+5374/6224-238/6224=5380/6224\approx 0.864\]
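The same calculation can be reproduced from the table counts, which also shows why the overlap is subtracted: people who were inoculated and lived would otherwise be counted twice. This is a sketch with illustrative variable names, not code from the course.

```python
from fractions import Fraction

# Counts from the smallpox table: (inoculated, result) -> number of people.
counts = {
    ("no", "died"): 844,
    ("no", "lived"): 5136,
    ("yes", "died"): 6,
    ("yes", "lived"): 238,
}
total = sum(counts.values())

p_v = Fraction(sum(n for (inoc, _), n in counts.items() if inoc == "yes"), total)
p_l = Fraction(sum(n for (_, res), n in counts.items() if res == "lived"), total)
p_v_and_l = Fraction(counts[("yes", "lived")], total)

# General addition rule.
p_v_or_l = p_v + p_l - p_v_and_l

# Direct count of everyone who was inoculated or lived (each person counted once).
p_direct = Fraction(
    sum(n for (inoc, res), n in counts.items() if inoc == "yes" or res == "lived"),
    total,
)

print(float(p_v_or_l))
```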
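A probability distribution is a list of the possible outcomes/events with corresponding probabilities satisfying the following three rules: the outcomes listed must be disjoint, each probability must be between 0 and 1, and the probabilities must sum to 1.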
For discrete random variables, a probability distribution can be represented in a table of all disjoint outcomes and their associated probabilities.
Example: Handedness
About 90% of people are right-handed and the remainder are left-handed. The probability distribution for handedness of a child is:
Handedness | R | L |
---|---|---|
Probability | 0.90 | 0.10 |
Example: STAT 140 class data
Class year probability distribution:
Class year | 2020 | 2021 | 2022 | 2023 |
---|---|---|---|---|
Probability | 0.333 | 0.273 | 0.364 | 0.030 |
In a survey, 52% of respondents said they are Democrats. What is the probability that a randomly selected respondent from this sample is a Republican?
This depends on how many party affiliations we have. We need to be able to list all the possible events and their probabilities (the probability distribution) to answer this question.
The complement of an event \(A\) is denoted by \(A^c\), and \(A^c\) represents all the outcomes not in \(A\). \(A\) and \(A^c\) are related: \[P(A)+P(A^c)=1\]
Examples of complements:
Assume you have two fair dice. What is the probability that the sum of the dice is less than or equal to 10?
Let T denote the total (sum) of the two dice faces.
\(P(T \leq 10)=1-P(T=11 \text{ or } T=12)\)
\(P(T=11 \text{ or } T=12)=P(T=11)+P(T=12)\)
\(P(T=11)+P(T=12)=2/36+1/36=1/12\)
\(P(T \leq 10)=1-1/12=11/12\)
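The complement calculation can be verified by listing all 36 equally likely outcomes of two fair dice. This is a minimal sketch; the helper `prob` is a hypothetical name introduced here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, given as a function of one outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_direct = prob(lambda o: sum(o) <= 10)          # count T <= 10 directly
p_complement = 1 - prob(lambda o: sum(o) >= 11)  # 1 - P(T = 11 or T = 12)

print(p_direct, p_complement)  # 11/12 11/12
```

Counting the complement is less work by hand because only three outcomes give a total of 11 or 12.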
If two events are complements of each other, are they disjoint?
Yes, if one happens, then the other cannot happen.
Example: Flipping a coin - if the coin is heads, then it cannot be tails on that flip. The events are heads (\(H\)) and tails (\(T\)), which can also be denoted not heads (\(H^c\)).
Example: Residence hall - if I live in the Rockies (\(R\)), I cannot live in any other residence hall (\(R^c\)).
If two events are disjoint, are they complements of each other?
Not necessarily. If \(B\) is the complement of \(A\) (i.e. \(B\) is \(A^c\)), then \(B\) is everything that \(A\) is not. \(A\) and \(B\) can be disjoint, however, without being complements. Consider the following:
Example: Flipping a coin - there are only two possible events here, and we know that they are disjoint. In this case, \(H\) and \(T\) are disjoint events, but they are also complements (\(T=H^c\)).
Example: Residence hall - living in the Rockies (\(R\)) and living in Wilder (\(W\)) are disjoint events. However, the complement of \(R\), \(R^c\), is not \(W\). Rather, \(R^c\) is all the residence halls excluding the Rockies.
Two processes are independent if knowing the outcome of one provides no information about the outcome of the other.
Examples:
About 90% of people are right-handed and the remainder are left-handed. Knowing the handedness of one person does not give you any information about the handedness of another person.
The probability distribution for handedness of two people is:
Handedness | RR | RL | LR | LL |
---|---|---|---|---|
Probability | (0.90)(0.90)=0.81 | (0.90)(0.10)=0.09 | (0.10)(0.90)=0.09 | (0.10)(0.10)=0.01 |
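The table above can be rebuilt by multiplying the single-person probabilities, which is valid only under the independence assumption stated earlier. The sketch below uses illustrative names.

```python
from itertools import product

# Single-person handedness distribution from the slides.
p = {"R": 0.90, "L": 0.10}

# Joint distribution for two independent people: multiply the probabilities.
joint = {a + b: p[a] * p[b] for a, b in product(p, repeat=2)}

for outcome, prob in joint.items():
    print(f"{outcome}: {prob:.2f}")
```

The four probabilities sum to 1, as a probability distribution requires.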
Let A and B represent events from two different and independent processes. Then the probability that both A and B occur is the product of their separate probabilities:
\[ P(A \text{ and } B)=P(A)\times P(B).\] If there are \(k\) events \(A_1,...,A_k\) from \(k\) independent processes, then the probability they all occur is
\[P(A_1)\times P(A_2)\times\cdots\times P(A_k).\]
When a certain telemarketer makes a call, they have a 10% chance of making a sale (Y) and a 90% chance of not making a sale (N).
If the telemarketer makes three calls, what is the probability of making exactly one sale?
\[\begin{aligned}
P(\text{exactly one } Y)&=P(\text{one } Y \text{ and } \text{two } N)\\
&=P(YNN \text{ or } NYN \text{ or } NNY)\\
&=P(Y)P(N)P(N)+P(N)P(Y)P(N)+P(N)P(N)P(Y)\\
&=3\times(0.1)(0.9)(0.9)=0.243
\end{aligned}\]
When a certain telemarketer makes a call, they have a 10% chance of making a sale (Y) and a 90% chance of not making a sale (N).
If the telemarketer makes three calls, what is the probability of making at least one sale?
\[P(\text{at least one } Y)=1-P(\text{all } N)=1-P(N)P(N)P(N)=1-(0.9)^3=0.271\]
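Both telemarketer answers can be checked by enumerating all eight call sequences. This sketch assumes the three calls are independent, as the worked solutions do; `seq_prob` is a hypothetical helper name.

```python
from itertools import product
from math import prod

p = {"Y": 0.1, "N": 0.9}  # sale / no sale on a single call

# All 2^3 = 8 possible sequences of three calls, e.g. "YNN".
sequences = ["".join(s) for s in product("YN", repeat=3)]


def seq_prob(seq):
    """Probability of one sequence: multiply per-call probabilities (independence)."""
    return prod(p[c] for c in seq)


# The sequences are disjoint, so their probabilities add.
p_exactly_one = sum(seq_prob(s) for s in sequences if s.count("Y") == 1)
p_at_least_one = 1 - seq_prob("NNN")  # complement of "no sales at all"

print(round(p_exactly_one, 3), round(p_at_least_one, 3))  # 0.243 0.271
```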
Let’s revisit our smallpox example one more time. At the time of the 1721 smallpox epidemic in Boston, there was a hypothesis that vaccinated people were more likely to survive the epidemic than non-vaccinated people.
Conditional probability allows us to explore the relationship between vaccination status (yes/no) and result (lived/died) to address that hypothesis.
Marginal Probability: a probability based on only one variable or process; example form:
\[ P(A) \]
Smallpox example
(\(V\)=vaccinated, \(V^c\)= not vaccinated, \(L\)=lived, \(L^c\)=died):
Joint Probability: a probability of outcomes for two or more variables or processes; example form:
\[ P(A \text{ and }B) \]
Smallpox example
Conditional Probability:
\[P(A|B)=\frac{P(A \text{ and }B)}{P(B)}\]
Components of conditional probability:
Smallpox example:
What is the probability of a randomly chosen person not being vaccinated and living?
| | \(L^c\) | \(L\) | Total |
---|---|---|---|
\(V^c\) | 844 | 5136 | 5980 |
\(V\) | 6 | 238 | 244 |
Total | 850 | 5374 | 6224 |
Joint probability: \[P(V^c \text{ and } L)=5136/6224\]
Smallpox example:
What is the probability of a randomly chosen person who is not vaccinated living?
Conditional probability: \[P(L|V^c)=5136/5980\]
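The difference between the joint and conditional answers comes down to the denominator: everyone in the table versus only the non-vaccinated row. The sketch below computes both from the table counts; the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the table: (vaccination status, result) -> number of people.
counts = {
    ("not vaccinated", "died"): 844,
    ("not vaccinated", "lived"): 5136,
    ("vaccinated", "died"): 6,
    ("vaccinated", "lived"): 238,
}
total = sum(counts.values())

# Joint probability P(V^c and L): divide by the grand total.
p_joint = Fraction(counts[("not vaccinated", "lived")], total)

# Marginal probability P(V^c): the non-vaccinated row total over the grand total.
p_not_vaccinated = Fraction(
    sum(n for (status, _), n in counts.items() if status == "not vaccinated"),
    total,
)

# Conditional probability P(L | V^c) = P(V^c and L) / P(V^c).
p_conditional = p_joint / p_not_vaccinated

print(float(p_joint), float(p_conditional))
```

Dividing the joint probability by the marginal probability cancels the grand total, leaving 5136/5980, the same value read directly from the non-vaccinated row.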
In general, if \(A\) and \(B\) represent any two outcomes or events (either independent or dependent), then
\[P(A \text{ and } B)=P(A|B)\times P(B).\]
It is always true that
\[P(A \text{ and } B)=P(A|B)\times P(B).\]
Recall for independent events \(A\) and \(B\),
\[P(A \text{ and } B)=P(A)\times P(B).\]
Why does \(P(A)=P(A|B)\) when \(A\) and \(B\) are independent?
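One way to answer this, sketched under the assumption that \(P(B)>0\) so that the conditional probability is defined: start from the definition of conditional probability and substitute the multiplication rule for independent events.

```latex
\[
P(A|B)=\frac{P(A \text{ and } B)}{P(B)}=\frac{P(A)\times P(B)}{P(B)}=P(A)
\]
```

In words, knowing that \(B\) occurred does not change the probability of \(A\), which matches the idea that independent processes provide no information about each other.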
Millennials led US pet ownership to 84.6 million in 2016
Pet Owner: Y or N
Demographic: Gen Y/millennial (M), Baby Boomer (B), Other (O)
Probability of (living in a household) owning a pet: \[P(Y)=0.68\]
Demographic breakdown of pet owners (conditioning on pet ownership, denoted \(Y\)): \[ P(M|Y)+P(B|Y)+P(O|Y)=0.35+0.32+0.33=1 \]
Let \(A_1, ..., A_k\) represent all the disjoint events for a variable or process. Then, if \(B\) is an event, possibly for another variable or process, we have:
\[P(A_1|B)+\cdots +P(A_k|B)=1 \]
Probability of not being a millennial given pet ownership: \[ P(M^c|Y)=1-P(M|Y)=1-0.35=0.65 \]
The rule for complements also holds when an event and its complement are conditioned on the same information:
\[P(A|B)=1-P(A^c|B) \]
Diagnostic tests are used to determine whether an individual has a particular disease. However, diagnostic tests are not perfect, so a positive test result does not guarantee that the tested individual has the disease. Generally, characteristics of the test are determined in a lab so we know how much we can trust the result. These are called sensitivity and specificity.
Known quantities:
Want to know:
Bayes' theorem allows us to invert conditional probabilities. For events \(A\) and \(B\),
\[P(B|A)=\frac{P(A|B)P(B)}{P(A \text{ and } B)+P(A \text{ and } B^c)}=\frac{P(A\text{ and }B)}{P(A)}.\]
This rule sets the foundation for Bayesian statistics, which we will not cover in this class.
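To see how the inversion works for a diagnostic test, here is a minimal sketch. It takes \(A\) to be a positive test and \(B\) to be having the disease, with sensitivity \(P(A|B)\) and specificity \(P(A^c|B^c)\). The prevalence, sensitivity, and specificity values are hypothetical, chosen only for illustration, and the function name `posterior` is an assumption.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem.

    prior       = P(disease)
    sensitivity = P(positive | disease)
    specificity = P(negative | no disease)
    """
    p_pos_and_disease = sensitivity * prior                    # P(A and B)
    p_pos_and_healthy = (1 - specificity) * (1 - prior)        # P(A and B^c)
    return p_pos_and_disease / (p_pos_and_disease + p_pos_and_healthy)


# Hypothetical values, not from the slides: 1% prevalence,
# 95% sensitivity, 90% specificity.
p_disease_given_positive = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(round(p_disease_given_positive, 4))
```

With these made-up numbers, a positive result still leaves the probability of disease below 10%, because the disease is rare and false positives among healthy people outnumber true positives.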