Administrative info
  Updated MT2 solutions posted on Piazza
  Exams can be picked up from the Soda front office
  PA3 due today
  HW8 due next Wednesday
  Final exam next Thursday
  No regrades for HW8 (not enough time) or the final exam (UCB policy)

Review

In the continuous sample spaces we consider in this class, the probability
of any particular outcome is 0. So instead of working with outcomes, we
work directly with events.

EX: James Bond jumps out of a plane and lands at a position uniformly
    distributed in [0, 1000]. The probability that he lands in an interval
    [a, b], where 0 <= a <= b <= 1000, is (b-a)/1000.

Often, we work directly with random variables. A continuous random
variable has a range that includes a continuous subset of R. Thus, for a
continuous random variable X, Pr[X = a] = 0 for any a. Again, we work with
intervals, such as Pr[a < X <= b], which can have non-zero probability.

We can describe a continuous random variable X in two ways.

(1) The cumulative distribution function (cdf):
      F(x) = Pr[X <= x].
    Then Pr[a < X <= b] = F(b) - F(a).
    The cdf F(x) must satisfy a few properties.
    (a) 0 <= F(x) <= 1 for all x∈R, since F(x) is a probability.
    (b) F(x) must be monotonically increasing: F(x) <= F(y) if x <= y.

(2) The probability density function (pdf):
      f(x) = d/dx F(x).
    Then
      F(x) = ∫_{-∞}^{x} f(y) dy
      Pr[a < X <= b] = ∫_a^b f(y) dy.
    The pdf f(x) must satisfy a few properties.
    (a) f(x) >= 0 for all x∈R. Otherwise it would be possible to find an
        interval over which the integral is negative, resulting in a
        negative probability.
    (b) ∫_{-∞}^{+∞} f(x) dx = 1, i.e. the probability Pr[-∞ < X < +∞] is 1.

The pdf tells us where there is higher probability density. Its graph is
similar to the distribution graph for a discrete random variable.

EX: Let X be Bond's landing position when he jumps out of the plane. Then
    the cdf of X is
             { 0       if x < 0
      G(x) = { x/1000  if 0 <= x <= 1000
             { 1       if x > 1000
    The pdf of X is
             { 0       if x < 0
      g(x) = { 1/1000  if 0 <= x <= 1000
             { 0       if x > 1000
    A plot of the pdf:

      g(x)
      1/1000   ______________
              |              |
      ________|              |________
              0           1000      x

    As you can see, he is equally likely to land anywhere in [0, 1000].

EX: James Bond shoots at a 1 foot radius gas tank, hitting any point on it
    with uniform probability. What is the pdf of the distance from the
    center to where he hits?

    Let Y = distance of the hit from the center. A hit within distance y
    of the center lands in a disk of area πy^2 out of a total area π·1^2,
    so the cdf of Y is
             { 0    if y < 0
      F(y) = { y^2  if 0 <= y <= 1
             { 1    if y > 1
    The pdf of Y is
             { 0    if y < 0
      f(y) = { 2y   if 0 <= y <= 1
             { 0    if y > 1
    A plot of the pdf:

      f(y)
      2           /|
                 / |
      1        /   |
              /    |
      _______/     |_______
             0     1     y

    This shows us that there is higher density away from the center than
    close to it. So Bond is more likely to hit far from the center than
    near it.

As you can see in the above example, the pdf is not restricted to the
range [0, 1]. This is because the pdf is a density, not an actual
probability. We defined the pdf as
  f(y) = lim_{δ->0} Pr[y < Y <= y + δ] / δ,
so it is the limit of a tiny probability divided by a tiny length, which
can give us any non-negative value.

We derived an expression for the expectation of a continuous random
variable:
  E(X) = ∫_{-∞}^{+∞} x f(x) dx.
Then the variance is defined as in the discrete case,
Var(X) = E(X^2) - E(X)^2, with
  E(X^2) = ∫_{-∞}^{+∞} x^2 f(x) dx.

EX: What is E(X), Bond's expected landing position? It is
      E(X) = ∫_{-∞}^{+∞} x g(x) dx = ∫_0^{1000} x/1000 dx
           = x^2/2000 |_0^{1000} = 1000^2/2000 = 1000/2 = 500.
    Then
      E(X^2) = ∫_{-∞}^{+∞} x^2 g(x) dx = ∫_0^{1000} x^2/1000 dx
             = x^3/3000 |_0^{1000} = 1000^3/3000 = 1000^2/3.
    Then
      Var(X) = E(X^2) - E(X)^2 = 1000^2/3 - 1000^2/4 = 1000^2/12.
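One quick way to double-check these numbers is by simulation. Here is a
minimal Python sketch (assuming numpy is available) that estimates E(X)
and Var(X) for Bond's landing position, and samples the hit distance Y
from the gas tank example using the inverse of its cdf, F^{-1}(u) = √u:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    # Bond's landing position: uniform on [0, 1000]
    x = rng.uniform(0, 1000, size=n)
    print(x.mean())   # ~ 500      = 1000/2
    print(x.var())    # ~ 83333.3  = 1000^2/12

    # Distance from center of a uniform hit on a radius-1 disk.
    # The cdf is F(y) = y^2 on [0, 1], so Y = sqrt(U) for U uniform
    # on [0, 1] has exactly this distribution (inverse-cdf sampling).
    y = np.sqrt(rng.uniform(0, 1, size=n))
    print((y <= 0.5).mean())   # ~ 0.25 = F(0.5): only a quarter of
                               # hits land within half the radius

The estimates approach the exact values as the number of samples grows.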
In general, for a random variable Z that is uniformly distributed in the
interval [0, d], we get
  E(Z) = d/2
  Var(Z) = d^2/12.
A random variable W that is uniformly distributed in the interval
[a, a+d] is just W = Z + a, so we get
  E(W) = E(Z) + a = a + d/2
  Var(W) = Var(Z) = d^2/12.

Exponential Distribution

Recall that if we have a number of independent trials, each of which has
a probability p of success, then the number of trials T until the first
success follows a geometric distribution T ~ Geom(p), so
  Pr[T = i] = p(1-p)^{i-1} for all i∈Z^+, and
  Pr[T > i] = (1-p)^i for all i∈N.

Suppose now that we perform a large number of trials every second, where
each trial has a small probability p of success: we perform a trial every
δ seconds for some small δ. By linearity of expectation, the average rate
of success λ per second is λ = p/δ, since there are 1/δ trials per second,
each with probability of success p, so we have p = λδ.

Let S be the number of seconds until the first success. Since each trial
takes δ seconds,
  Pr[T > k] = Pr[S > kδ] = Pr[S > t],
where t = kδ and t >= 0 since k >= 0. Then
  Pr[S > t] = Pr[T > k]
            = (1 - p)^k        (since T ~ Geom(p))
            = (1 - p)^{t/δ}    (since k = t/δ)
            = (1 - λδ)^{t/δ}   (since p = λδ)
            ≈ (e^{-λδ})^{t/δ}  (since 1 - λδ ≈ e^{-λδ} when λδ is small)
            = e^{-λt}.
Finally, we get, for t >= 0,
  Pr[S <= t] = 1 - Pr[S > t] = 1 - e^{-λt}
as the cdf of S, which gives us a pdf
  f(t) = d/dt (1 - e^{-λt}) = λ e^{-λt}.
Both the pdf and cdf are 0 if t < 0.

This is the "exponential distribution," which has pdf
  f(t) = { λ e^{-λt}  if t >= 0
         { 0          if t < 0
It is the continuous version of the geometric distribution and tells us
how long we need to wait for a success, if successes can occur at any
time and λ is the average rate of success per unit time. We write
S ~ Exp(λ).

Computing the expectation and variance of an exponential random variable
requires integration by parts, and we get
  E(S) = 1/λ
  Var(S) = 1/λ^2.
These are similar to the geometric distribution, where we got an
expectation of 1/p and a variance of (1-p)/p^2. Note that though p is
restricted to [0, 1], since it is a probability, λ can be any
non-negative value, since it is the average rate of success. In
particular, it may be the case that we expect many successes in a unit
of time, in which case λ will be greater than 1.

Recall the relationship between the binomial and the geometric
distribution. They both examine what happens when we have a series of
independent trials, each with probability p of success. The binomial
distribution tells us how many successes we get in a fixed number of
trials, while the geometric distribution tells us in which trial the
first success occurs.

The exponential distribution has a similar relationship to the Poisson
distribution. They both examine what happens when we have a particular
average rate of success λ. The Poisson tells us how many successes we
get in a fixed unit length of time, and the exponential tells us at what
time the first success occurs.

EX: Suppose a web server processes on average 1.2 requests per second.
    Then the amount of time between requests follows an exponential
    distribution with λ = 1.2. Suppose a request comes in. What is the
    probability that a new request will come in within the next second?

    Let S be the amount of time until the next request. Then
    S ~ Exp(1.2), and
      Pr[S <= 1] = ∫_0^1 1.2 e^{-1.2t} dt = -e^{-1.2t} |_0^1
                 = -e^{-1.2} + 1 ≈ 0.7.
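Here is a minimal Python sketch of this example (assuming numpy is
available; note that numpy parameterizes the exponential by the scale
1/λ rather than the rate λ). It estimates Pr[S <= 1] by sampling, and
also reruns the discrete approximation from the derivation above with
one trial every δ = 0.001 seconds:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    lam = 1.2                    # average requests per second

    # S ~ Exp(1.2): time until the next request
    s = rng.exponential(scale=1/lam, size=n)
    print((s <= 1).mean())       # ~ 0.699
    print(1 - np.exp(-lam))      # exact: 1 - e^{-1.2} ≈ 0.699

    # Geometric approximation: a trial every delta seconds, each with
    # success probability p = lam*delta; first success at time delta*T
    delta = 0.001
    t = delta * rng.geometric(lam * delta, size=n)
    print((t <= 1).mean())       # ~ 0.699 as well, as delta -> 0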
    Note that we could also use the Poisson distribution to solve this
    problem. Let R be the number of requests in the next second. Then
    R ~ Poiss(1.2), and
      Pr[S <= 1] = Pr[R >= 1] = 1 - Pr[R = 0]
                 = 1 - (1.2^0/0!) e^{-1.2} = 1 - e^{-1.2},
    as we computed before.

Normal Distribution

A random variable X has a "normal distribution," also called a "Gaussian
distribution," if it has a pdf of the form
  f(x) = 1/√{2πσ^2} e^{-(x-μ)^2/(2σ^2)}
for some values of μ and σ. It can then be shown that
∫_{-∞}^{+∞} f(x) dx = 1, as required for a pdf, and that
  E(X) = μ
  Var(X) = σ^2,
hence the parameters μ and σ. We write X ~ N(μ, σ^2).

The pdf of a normal distribution is a symmetric bell-shaped curve
centered at μ, with a width determined by σ. The "standard normal
distribution" has parameters μ = 0, σ = 1. So if Y is a standard normal,
then Y ~ N(0, 1), and the pdf of Y is
  g(y) = 1/√{2π} e^{-y^2/2}.

More on the normal distribution next time.
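In the meantime, here is a minimal Python sketch of these facts
(assuming numpy is available). It checks numerically that the standard
normal pdf integrates to 1, and that samples from N(0, 1) have mean
close to 0 and variance close to 1:

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma = 0.0, 1.0

    def f(x):
        # pdf of N(mu, sigma^2)
        return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

    # Riemann-sum check that the pdf integrates to 1 (the tails beyond
    # [-10, 10] contribute a negligible amount)
    xs = np.linspace(-10, 10, 200_001)
    dx = xs[1] - xs[0]
    print(f(xs).sum() * dx)              # ~ 1.0

    # sampling check of E(X) = mu and Var(X) = sigma^2
    x = rng.normal(mu, sigma, size=1_000_000)
    print(x.mean(), x.var())             # ~ 0.0, ~ 1.0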