Administrative info

PA2 due Monday. HW5 due Tuesday.

Review

Recall that a probability space consists of the following:
(1) A random experiment.
(2) The sample space (set of possible outcomes).
(3) The probability of each possible outcome of the experiment.

Further recall that the probabilities must satisfy the following:
(1) ∀ω∈Ω . 0 <= Pr[ω] <= 1
(2) ∑_{ω∈Ω} Pr[ω] = 1

An event is a subset of the sample space, i.e. a set of outcomes from the sample space. The probability of an event E is the sum of the probabilities of the outcomes in E: Pr[E] = ∑_{ω∈E} Pr[ω]. In the case of a uniform distribution, this simplifies to Pr[E] = |E|/|Ω|.

Probability Identities

Before we move on, let's note some facts about probability that can make it easier to compute probabilities.

We defined the complement of an event E as Ē = Ω\E. Then Pr[Ē] = 1 - Pr[E].
Proof: 1 = ∑_{ω∈Ω} Pr[ω] = ∑_{ω∈E} Pr[ω] + ∑_{ω∈Ω\E} Pr[ω] = Pr[E] + Pr[Ē].

Let A and B be events in Ω. Then Pr[A ∪ B] = Pr[A] + Pr[B] - Pr[A ∩ B]. Writing this out in terms of sums, we get
∑_{ω∈A ∪ B} Pr[ω] = ∑_{ω∈A} Pr[ω] + ∑_{ω∈B} Pr[ω] - ∑_{ω∈A ∩ B} Pr[ω].
As in inclusion/exclusion for sets, the first two terms on the right double count the probabilities of the outcomes in A ∩ B, so we have to subtract the probability of A ∩ B.

EX: What is the probability that a random integer n between 1 and 100 is divisible by 5 or 7?
ANS: Let A be the event that n is divisible by 5, B be the event that it is divisible by 7. Then Pr[A] = 20/100 = 1/5, Pr[B] = 14/100 = 7/50, and Pr[A ∩ B] = 2/100 = 1/50, since only 35 and 70 are divisible by both. So Pr[A ∪ B] = 1/5 + 7/50 - 1/50 = 16/50 = 8/25.

Let A_1, ..., A_n be n mutually disjoint events in Ω. Then Pr[A_1 ∪ ... ∪ A_n] = Pr[A_1] + ... + Pr[A_n]. This follows from the identity above, generalized to n events by induction, with the intersection terms removed since they are all 0.

EX: Suppose I roll a red and a blue die. What is the probability that the red die is less than 4?
ANS: Let A_i be the event that the red die is i.
Then Pr[A_i] = 1/6 for 1 <= i <= 6, and the A_i are mutually disjoint. Thus, Pr[A_1 ∪ A_2 ∪ A_3] = Pr[A_1] + Pr[A_2] + Pr[A_3] = 1/2.

Conditional Probability

A pharmaceutical company is marketing a new test for HIV that it claims is 99% effective, meaning that it will report positive for 99% of people who have HIV and negative for 99% of those who don't. Suppose a random person takes the test and gets a positive result. What is the probability that the person has HIV?

This is an example of conditional probability. Given some information about a particular event in the sample space, we want to compute new probabilities for other events. Let's start off with simpler examples before coming back to the above.

EX: Suppose I flip a fair coin twice. The result of the first flip is heads. What is the probability that I got two heads?
ANS: Let's start by drawing the sample space Ω. There are 4 equally likely outcomes HH, HT, TH, and TT. We are now told that event A = "the first flip is H" has occurred. Which outcomes are now possible? There are only 2 outcomes in A, HH and HT, each of which is equally likely. So we have a new sample space Ω' that consists of just the outcomes HH and HT, each with probability 1/2. Let event B = "both flips are heads." What is the probability of B in this new sample space? Only one of the two outcomes in Ω' is in B, so Pr[B] = 1/2 in the new sample space. We write this as Pr[B|A], "the probability of B given A," which is the probability of B occurring in a new sample space consisting of just those outcomes in A.

Generalizing the above procedure, suppose we are told an event A occurs. Then what is the new conditional probability of each outcome ω, i.e. Pr[ω|A]? For ω ∉ A, this is clearly 0. For ω ∈ A, the relative likelihood of any two outcomes in A should remain the same, but we need to renormalize so that we satisfy the requirement that all probabilities add to 1.
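The two-coin-flip conditioning above can be checked by direct enumeration. Here is a minimal sketch (the helper names are illustrative, not from the notes):

```python
from fractions import Fraction
from itertools import product

# Uniform sample space for two fair coin flips: HH, HT, TH, TT.
omega = list(product("HT", repeat=2))

A = [w for w in omega if w[0] == "H"]        # first flip is heads
B = [w for w in omega if w == ("H", "H")]    # both flips are heads

# Pr[A] in the original sample space.
pr_A = Fraction(len(A), len(omega))

# Conditioning on A: restrict to the outcomes in A and renormalize,
# so Pr[B|A] is the fraction of outcomes in A that are also in B.
pr_B_given_A = Fraction(len([w for w in A if w in B]), len(A))

assert pr_A == Fraction(1, 2)
assert pr_B_given_A == Fraction(1, 2)
```

Because the outcomes are equally likely, the renormalization reduces to counting outcomes in the restricted sample space.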
By definition, we had ∑_{ω ∈ A} Pr[ω] = Pr[A], so if we normalize by dividing by Pr[A], i.e. Pr[ω|A] = Pr[ω]/Pr[A], we get ∑_{ω ∈ A} Pr[ω|A] = ∑_{ω ∈ A} Pr[ω]/Pr[A] = Pr[A]/Pr[A] = 1.

Now suppose we have another event B. What is Pr[B|A]? The outcomes in B that are not in A contribute nothing, since their new conditional probabilities are 0. So only the outcomes in both B and A contribute any probability, and we get Pr[B|A] = ∑_{ω ∈ B ∩ A} Pr[ω|A] = ∑_{ω ∈ B ∩ A} Pr[ω]/Pr[A] = Pr[B ∩ A]/Pr[A].

To summarize, when conditioning on an event A, we cross out any outcomes that are incompatible with A and then renormalize by 1/Pr[A] so that the probabilities of the remaining outcomes add to 1. We can compute the probabilities of events directly in this new sample space or use the identity above to get the same result.

EX: Suppose I toss a red and a blue die, and I tell you that the resulting sum is 4. What is the probability that the red die is 1?
ANS: Let A be the event that the sum is 4, B be the event that the red die is 1. The outcomes (1, 3), (2, 2), and (3, 1) are in A, so Pr[A] = 3/36 = 1/12. What is Pr[B ∩ A]? Only the outcome (1, 3) is in B ∩ A, so Pr[B ∩ A] = 1/36. Then Pr[B|A] = Pr[B ∩ A]/Pr[A] = (1/36)/(1/12) = 1/3.
We could also have redefined the sample space to come up with the same result. Given A, we have a new sample space Ω' consisting of the outcomes (1, 3), (2, 2), and (3, 1), each with probability 1/3. Then B has probability 1/3 in this new sample space. So Pr[B|A] = 1/3.

EX: Suppose I toss a red and a blue die, and I tell you that the resulting sum is 7. What is the probability that the red die is 1?
ANS: Let A be the event that the sum is 7, B be the event that the red die is 1. There are six outcomes with sum 7, so Pr[A] = 6/36 = 1/6. What is Pr[B ∩ A]? Only the outcome (1, 6) is in B ∩ A, so Pr[B ∩ A] = 1/36. Then Pr[B|A] = Pr[B ∩ A]/Pr[A] = 1/6.

EX: Suppose I toss 3 balls into 3 bins (with replacement). Let A = "1st bin empty," B = "2nd bin empty." What is Pr[A|B]?
ANS: Pr[B] = 2^3/3^3 = 8/27, Pr[A ∩ B] = 1/3^3 = 1/27, so Pr[A|B] = (1/27)/(8/27) = 1/8. Thus, the fact that the 2nd bin is empty makes it much less likely that the 1st one is as well (unconditionally, Pr[A] = 8/27).

EX: Suppose I flip a fair coin 51 times. If the first 50 flips are heads, what is the probability that the 51st is heads?
ANS: Let A be the event that the first 50 flips are heads, B be the event that the 51st is heads. There are only 2 outcomes in A out of 2^51, so Pr[A] = 1/2^50. There are 2^50 outcomes in B, so Pr[B] = 1/2. Only one outcome is in both A and B, so Pr[A ∩ B] = 1/2^51. Then Pr[B|A] = (1/2^51)/(1/2^50) = 1/2. So the first 50 flips tell us nothing about the 51st; the probability of heads is still 1/2.

We have seen multiple examples where Pr[B|A] = Pr[B]. We say that A and B are "independent" if this is the case. Intuitively, two events A and B are independent if knowing that one happens does not change the likelihood of the other happening. So the 51st flip of a fair coin is independent of what came before. If A and B are independent, we get Pr[B|A] = Pr[B ∩ A]/Pr[A] = Pr[B], so Pr[B ∩ A] = Pr[A] Pr[B]. This is a very useful identity.

EX: Suppose I flip a coin with probability p of heads n times. What is the probability of a particular outcome with k heads?
ANS: Each flip is independent, with probability p of heads. The k heads flips each have probability p, and the n-k tails flips each have probability 1-p. So an outcome with k heads has probability p^k (1-p)^(n-k).

EX: Suppose a casino advertises the following game. You pick a number from 1 to 6. The casino rolls three dice, and if your number comes up, you win. What is your probability of winning?
ANS: It's not 1/2! Let A_i be the event that your number comes up on the ith die. We want to know Pr[A_1 ∪ A_2 ∪ A_3] = 1 - Pr[Ā_1 ∩ Ā_2 ∩ Ā_3] = 1 - Pr[Ā_1] Pr[Ā_2] Pr[Ā_3] = 1 - (5/6)^3 ≈ 1 - 0.58 = 0.42. In the second step, we used the fact that the results of the three dice are mutually independent.
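The casino answer can be sanity-checked by exhaustively enumerating all 6^3 = 216 equally likely rolls (a sketch; the names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 216 equally likely outcomes of rolling three dice.
rolls = list(product(range(1, 7), repeat=3))

pick = 1                                   # the number you bet on
wins = [r for r in rolls if pick in r]     # your number appears at least once

p_win = Fraction(len(wins), len(rolls))

# Matches the complement-based formula: 1 - (5/6)^3 = 91/216 ≈ 0.42.
assert p_win == 1 - Fraction(5, 6) ** 3 == Fraction(91, 216)
```

By symmetry the choice of `pick` doesn't matter; any number from 1 to 6 gives the same 91/216.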
We will come back to the concept of mutual independence later. So your probability of winning is less than 1/2.

Suppose you are flying to Las Vegas (in order to play the game above). Your friend, fearing for your safety, gives you the following advice: "You know, you should always carry a bomb on an airplane. The chance of there being one bomb on the plane is pretty small, but the chance of two bombs is minuscule. So by carrying a bomb on the airplane, your chances of being blown up are astronomically reduced."

What do you think of his advice? Let A be the event that you carry a bomb on board, B be the event that someone else carries a bomb on board. How are A and B related? They are independent, so Pr[B|A] = Pr[B], and the likelihood that someone else has a bomb doesn't change one bit if you bring one aboard.
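The independence identity Pr[B ∩ A] = Pr[A] Pr[B] used above can be verified on a scaled-down version of the coin-flip example, with 3 flips instead of 51 (a sketch; helper names are not from the notes):

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=3))          # 8 equally likely outcomes

A = [w for w in omega if w[:2] == ("H", "H")]  # first two flips are heads
B = [w for w in omega if w[2] == "H"]          # third flip is heads

def pr(event):
    """Probability of an event under the uniform distribution on omega."""
    return Fraction(len(event), len(omega))

both = [w for w in A if w in B]

# A and B are independent: Pr[B ∩ A] = Pr[A] Pr[B],
# so conditioning on A does not change the probability of B.
assert pr(both) == pr(A) * pr(B)
assert pr(both) / pr(A) == pr(B) == Fraction(1, 2)
```

The same enumeration with 51 flips would be infeasible (2^51 outcomes), but the argument is identical flip by flip.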