Administrative info
Review session Sunday 7/10 5pm in 310 Soda
MT1 policies
- 1 cheat sheet (8.5x11, double sided)
- no calculators
Old exams online; see email
Review
So far, we have seen six counting principles:
(1) Enumeration
(2) Product rule
(3) Sum rule
(4) Isomorphism principle
(5) Pigeonhole principle
(6) Permutations
Recall that an r-permutation of a set of n elements S is an ordered
list of r items from S. There are n!/(n-r)! such lists.
EX: How many anagrams are there of the word "eraser"?
ANS: There are 6 letters in "eraser":
2 each of 'e' and 'r'
1 each of 'a' and 's'
If we pretend that repeated letters are distinguishable, then
there are 6! permutations of "eraser". But there is a 2-to-1
correspondence between the set of such anagrams and the set of
anagrams with identical e's and distinguishable r's. There is a
further 2-to-1 correspondence between this set and the set of
anagrams with identical e's and r's. So the size of the latter
set is 6!/(2^2).
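As a sanity check on this count, here is a minimal Python sketch. It is not part of the original notes, and the function names are made up for illustration. It counts the anagrams of "eraser" two ways: by brute-force enumeration, and by dividing 6! by the factorial of each letter's multiplicity. Since 2! = 2, that division agrees with 6!/(2^2) here.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def count_anagrams_brute_force(word):
    # Generate all len(word)! orderings, then collapse duplicates with a set.
    return len(set(permutations(word)))

def count_anagrams_formula(word):
    # Start from n! and divide by m! for each letter that appears m times.
    total = factorial(len(word))
    for multiplicity in Counter(word).values():
        total //= factorial(multiplicity)
    return total

print(count_anagrams_brute_force("eraser"))  # 180
print(count_anagrams_formula("eraser"))      # 180
```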
More Counting
(7) Combinations
EX: How many 5-card poker hands are there?
ANS: We know that there are 52!/47! 5-permutations of the set of
cards. However, as in the "eraser" case above, we've
overcounted, since the hand "10 J Q K A" (all spades) is
the same as the hand "A K Q J 10". In fact, there are 5!
orderings of this hand (permutations of the set {10, J, Q,
K, A}). So there is a 5!-to-1 correspondence between the
set of ordered poker hands and the set of unordered poker
hands, and the number of different hands is actually
52!/(47!5!).
What we are doing here is "choosing" 5 cards out of the 52, i.e.
constructing a subset of size 5 from a set of 52. Choosing an
r-combination of items from a set of n is so common that it has
its own notation:
(n)
(r)
This is pronounced as "n choose r." We may write C(n, r) since
we can't really use the proper notation in ASCII text. The
formula for C(n, r) is n!/[r!(n-r)!]. (Proof: count the
r-permutations, then use the r!-to-1 correspondence, as above.)
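The relationship between permutations and combinations can be checked numerically. The sketch below assumes Python 3.8 or later, where the standard library provides math.perm and math.comb. It reproduces the poker-hand count by dividing the number of ordered hands by 5!.

```python
from math import comb, factorial, perm

ordered_hands = perm(52, 5)                       # 52!/47! ordered 5-card lists
unordered_hands = ordered_hands // factorial(5)   # each hand was counted 5! times

# The division by 5! should agree with the direct formula n!/[r!(n-r)!].
assert unordered_hands == comb(52, 5)
assert comb(52, 5) == factorial(52) // (factorial(5) * factorial(47))

print(ordered_hands)    # 311875200
print(unordered_hands)  # 2598960
```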
EX: There are 70 people in this class. If I ask for 5 volunteers
to play Set, how many possible sets of volunteers are there?
ANS: C(70, 5) = 70!/(5!65!)
EX: If I flip a fair coin 100 times, how many sequences of flips
contain exactly 50 heads?
ANS: Choose 50 out of the 100 flips to be heads, the rest tails. So
C(100, 50).
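Enumerating all 2^100 sequences is infeasible, but the same argument can be checked on a smaller case. The sketch below uses 10 flips and 5 heads, sizes chosen here only for illustration, and a made-up helper name. It compares a brute-force count against C(10, 5).

```python
from itertools import product
from math import comb

def count_sequences_with_heads(num_flips, num_heads):
    # Enumerate every sequence of H/T and keep those with the right number of heads.
    return sum(
        1
        for flips in product("HT", repeat=num_flips)
        if flips.count("H") == num_heads
    )

print(count_sequences_with_heads(10, 5))  # 252
print(comb(10, 5))                        # 252
print(comb(100, 50))                      # the answer to the 100-flip question
```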
There are many useful identities between combinations. The
simplest is
C(n, r) = C(n, n-r).
As for some others, perhaps you have seen Pascal's triangle:
            (0)
            (0)
          (1) (1)
          (0) (1)
        (2) (2) (2)
        (0) (1) (2)
      (3) (3) (3) (3)
      (0) (1) (2) (3)
    (4) (4) (4) (4) (4)
    (0) (1) (2) (3) (4)
  (5) (5) (5) (5) (5) (5)
  (0) (1) (2) (3) (4) (5)
(6) (6) (6) (6) (6) (6) (6)
(0) (1) (2) (3) (4) (5) (6)
            ...

Evaluating the entries gives:

                             row sum
             1                     1
           1   1                   2
         1   2   1                 4
       1   3   3   1               8
     1   4   6   4   1            16
   1   5  10  10   5   1          32
 1   6  15  20  15   6   1        64
(By convention, 0! = 1).
Note that to get any entry, you sum its neighbors to the top
left and top right. This is due to the identity
C(n+1, m+1) = C(n, m) + C(n, m+1).
Also note that
C(n, 0) + C(n, 1) + ... + C(n, n) = 2^n
We will prove some of these identities later.
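The two observations about Pascal's triangle can be checked with a short Python sketch. The function name pascal_rows is made up for illustration. Each row is built from the previous one using only the neighbor-sum rule, then compared against C(n, k) and against 2^n.

```python
from math import comb

def pascal_rows(num_rows):
    rows = [[1]]
    for _ in range(num_rows - 1):
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        interior = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + interior + [1])
    return rows

for n, row in enumerate(pascal_rows(7)):
    assert row == [comb(n, k) for k in range(n + 1)]  # entries are C(n, k)
    assert sum(row) == 2 ** n                         # row sum is 2^n
    print(row, sum(row))
```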
(8) Stars and bars
EX: A band of 2 pirates (say Johnny Depp and Orlando Bloom) have
4 indistinguishable gold coins to divide among them. How
many different ways are there to split up the booty?
(They're pirates, so it doesn't have to be split equally.)
ANS: If Johnny Depp gets i, 0 <= i <= 4, Orlando Bloom gets 4-i.
There are 5 possible values of i.
EX: A band of 3 pirates (add Keira Knightley) have 4
indistinguishable gold coins to divide among them. How many
different ways are there to split up the booty?
ANS: This seems harder. Let's line up the coins and partition
them into sets, the first one going to Johnny Depp, the
second to Orlando Bloom, and the third to Keira Knightley.
We'll draw a line in the sand to separate each pirate's
share from the others. Here are the possibilities:
OOOO||
OOO|O| OOO||O
OO|OO| OO|O|O OO||OO
O|OOO| O|OO|O O|O|OO O||OOO
|OOOO| |OOO|O |OO|OO |O|OOO ||OOOO
So there are 15.
Note that there is a 1-to-1 correspondence between
splitting up the booty and 6-bit strings with exactly 2
ones. The number of the latter is C(6, 2) = 15 (i.e. choose
2 positions out of the string to be ones). So there are
C(6, 2) = 15 ways to split up the booty.
In general, if we want to split up k identical items (e.g.
coins) into n (distinguishable) sets (e.g. one for each pirate),
there are C(n+k-1, k) = C(n+k-1, n-1) ways to do so. This
procedure is called "stars and bars." Maybe someone stole the
coins and replaced them with starfish?
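The stars-and-bars formula can be compared against brute force for the pirate examples. This is a sketch with made-up helper names, not part of the original notes. It enumerates every assignment of shares that sums to k and also renders a split in the O and | notation used above.

```python
from itertools import product
from math import comb

def count_splits(k, n):
    # Brute force: try every tuple of n shares in 0..k and keep those summing to k.
    return sum(1 for shares in product(range(k + 1), repeat=n) if sum(shares) == k)

def to_stars_and_bars(shares):
    # One 'O' per coin, with a '|' between consecutive pirates' shares.
    return "|".join("O" * share for share in shares)

print(count_splits(4, 2), comb(2 + 4 - 1, 4))  # 5 5
print(count_splits(4, 3), comb(3 + 4 - 1, 4))  # 15 15
print(to_stars_and_bars((2, 1, 1)))            # OO|O|O
```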
Balls and Bins Framework
The course reader uses a "balls and bins" framework to introduce
counting. Here, we will see how to apply our counting principles to
the various balls and bins examples in the course reader.
The basic idea in this framework is that we are placing k balls into
n bins, under various constraints. We want to know how many ways
there are to do this, depending on whether:
(a) the balls are distinguishable or identical
(b) a bin can contain only one ball or more than one
In terms of (b), we use the term "sampling with replacement" if a
bin can contain more than one ball (i.e. once we pick a bin for the
first ball, we "replace" that bin in the set of allowed bins for the
remaining balls). Otherwise, we are "sampling without replacement."
(1) Distinguishable balls, with replacement
We want to throw k balls into n bins, such that multiple balls
can go into the same bin.
This is just like the example of 3-digit area codes; there, we
had k=3 balls (digits) to place into n=10 bins (numbers [0-9])
such that multiple digits can have the same number. By the
product rule, this is just 10^3, or n^k in the general case.
(2) Distinguishable balls, without replacement
We want to throw k balls into n bins, such that no bin contains
multiple balls.
This is just like the example of 3-digit area codes with no
repeated numbers; there, we had k=3 balls (digits) to place into
n=10 bins (numbers [0-9]) such that no two digits can have the same
number. By the product rule, this is just 10*9*8 = 10!/7!, or
n!/(n-k)! in the general case. Another way to think about this
is in terms of a k-permutation of the n bins, which gives us the
same result.
(3) Identical balls, without replacement
We want to throw k identical balls into n bins, such that no bin
contains multiple balls.
This is just like the poker hand example; there, we had k=5
identical balls (cards in a hand) that we wanted to throw into
52 bins (cards in a deck) such that no bin (card) is repeated.
This was just C(52, 5), or C(n, k) in the general case.
(4) Identical balls, with replacement
We want to throw k identical balls into n bins, such that
multiple balls can go into the same bin.
This is just like the pirate treasure example; there we had k=4
identical balls (coins) that we wanted to divide among n bins
(pirates). This was just C(6, 2), or C(n+k-1, n-1) = C(n+k-1, k)
in the general case.
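The four cases line up with four iterators in Python's itertools module, which gives a quick way to check the formulas on a small instance. The sketch below uses n=10 bins and k=3 balls, matching the area-code example; that choice of sizes for all four cases is an assumption made here. Each generated tuple lists the bin chosen for each ball; for identical balls only the collection of chosen bins matters, so order is ignored.

```python
from itertools import combinations, combinations_with_replacement, permutations, product
from math import comb, perm

n, k = 10, 3   # n bins, k balls
bins = range(n)

# (1) Distinguishable balls, with replacement: ordered, repeats allowed.
case1 = sum(1 for _ in product(bins, repeat=k))
# (2) Distinguishable balls, without replacement: ordered, no repeats.
case2 = sum(1 for _ in permutations(bins, k))
# (3) Identical balls, without replacement: unordered, no repeats.
case3 = sum(1 for _ in combinations(bins, k))
# (4) Identical balls, with replacement: unordered, repeats allowed.
case4 = sum(1 for _ in combinations_with_replacement(bins, k))

assert case1 == n ** k               # 1000
assert case2 == perm(n, k)           # 720
assert case3 == comb(n, k)           # 120
assert case4 == comb(n + k - 1, k)   # 220
print(case1, case2, case3, case4)    # 1000 720 120 220
```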
Combinatorial Proofs
We have seen various combinatorial identities. These can be proven
by expanding out the terms and algebraic manipulation, but this can
be quite tedious. Instead, we can prove them using counting
arguments. We come up with a particular set of items and show that
if you count the items one way, you end up with one expression, and
if you count them a different way, you end up with a different
expression. Since the set you counted is the same in both cases,
those two expressions must be equal (assuming you didn't make a
mistake!). This is called a "combinatorial proof."
Let's do some examples.
EX: Prove that C(n, r) = C(n, n-r).
ANS: Suppose we have n items. We want to know how many ways we can
choose r of them. (The set that we are counting here is the set
of all r-combinations of the set of n items.) We can do so in
the following ways:
(a) Choose r items directly from the n. There are C(n, r) ways
to do so.
(b) Pick n-r items and throw them away. Keep the remaining r
items. There are C(n, n-r) ways to do this.
Since we are counting the same thing in either procedure, we
must have C(n, r) = C(n, n-r).
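The "keep r" versus "throw away n-r" argument describes a one-to-one pairing between r-subsets and their complements. The sketch below illustrates that pairing for n=6 and r=2; these sizes are chosen here only for illustration.

```python
from itertools import combinations
from math import comb

n, r = 6, 2
items = frozenset(range(n))

# Procedure (a): every way to keep r items.
kept = [frozenset(s) for s in combinations(items, r)]
# Procedure (b): the n-r items thrown away for each of those choices.
thrown_away = {items - s for s in kept}

# Distinct kept sets give distinct discarded sets, and every (n-r)-subset appears.
assert len(thrown_away) == len(kept)
assert thrown_away == {frozenset(s) for s in combinations(items, n - r)}
print(len(kept), comb(n, r), comb(n, n - r))  # 15 15 15
```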
EX: Prove that C(n+1, m+1) = C(n, m) + C(n, m+1).
ANS: Suppose we have n+1 items and want to choose m+1 of them. We
can
(a) Choose the m+1 directly out of the n+1. (C(n+1, m+1)).
(b) Decide whether or not to pick the first item. There are
two cases:
(1) Pick the first item. Then we have to choose m remaining
items out of the remaining n items. (C(n, m))
(2) Don't pick the first item. Then we have to choose m+1
items out of the remaining n items. (C(n, m+1))
By the sum rule, there are C(n, m) + C(n, m+1) total ways.
Again, these two procedures are counting the same set, so
C(n+1, m+1) = C(n, m) + C(n, m+1).
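The case split in this proof can be mirrored directly in code. The sketch below uses n=6 and m=2 (sizes chosen here for illustration) and treats item 0 as "the first item". It enumerates all (m+1)-subsets of n+1 items and partitions them by whether they contain that item.

```python
from itertools import combinations
from math import comb

n, m = 6, 2
items = range(n + 1)   # n+1 items; item 0 plays the role of "the first item"

subsets = list(combinations(items, m + 1))
with_first = [s for s in subsets if 0 in s]         # case (1): pick the first item
without_first = [s for s in subsets if 0 not in s]  # case (2): don't pick it

assert len(subsets) == comb(n + 1, m + 1)
assert len(with_first) == comb(n, m)
assert len(without_first) == comb(n, m + 1)
print(len(subsets), len(with_first), len(without_first))  # 35 15 20
```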
===========
Introduction to Probability Theory
Now that we've learned to count, we turn our attention back to
probability theory. Here are some statements we'd like to be able to
understand:
(1) The chance of getting a flush (i.e. all the cards have the same
suit) in a 5-card poker hand is around 2 in 1000.
(2) If you flip a fair coin 50 times, resulting in 50 heads, the
probability that the 51st flip is heads is 1/2.
(3) If quicksort picks a random pivot at each step, then it will
sort any sequence of n numbers in O(n log n) time with high
probability.
(4) With this algorithm for balancing the workload among servers,
the probability that a user has to wait more than 1 minute is
2%.
(5) There is a 60% chance of the "Big One" (large earthquake)
hitting Northern California in the next 30 years.
(6) The percentage of Californians who identify themselves as
Democrats is 44.5%.
In order to understand these statements, we have to know probability
theory.
Probability Spaces
All of the above statements are made in the context of a specific
"probability space." A probability space consists of the
following:
(1) A "random experiment," i.e. an experiment whose outcome is
"random".
EX: A single coin flip, drawing 5 cards from a deck of 52.
(2) The set of possible outcomes or "sample points." This set is
the "sample space."
EX: Heads or tails, the C(52, 5) possible 5-card hands.
(3) The probability of each possible outcome of the experiment.
EX: 1/2 for each of heads and tails, 1/C(52, 5) for each
5-card hand
EX: Experiment: A sequence of 51 flips of a fair coin. Sample
space: all 2^51 possible sequences of H and T.
Probabilities: 1/(2^51) for each sample point.
Formally, a probability space consists of a sample space (denoted
by the capital Greek letter Ω) with a probability
Pr[ω] for each sample point ω.
The probability Pr is actually given by a function
P: Ω -> [0, 1],
though we use Pr[ω] to refer to ω's probability
instead of P(ω).
The probability assignment must satisfy the following
constraints:
(1) ∀ω∈Ω . 0 <= Pr[ω] <= 1
Of course, we specified the range of P to enforce this.
(2) ∑_{ω∈Ω} Pr[ω] = 1
In other words, the probabilities of all outcomes have to add
to 1.
The simplest probability space consists of a finite set Ω
with a uniform probability assignment
∀ω∈Ω . Pr[ω] = 1/|Ω|.
This is called the "uniform distribution." The examples we saw
above all had uniform distributions.
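One way to make the definition concrete is to represent a finite probability space as a Python dictionary from sample points to probabilities. This representation and the helper names below are assumptions made for illustration, not something the notes prescribe. Exact fractions avoid floating-point rounding when checking that the probabilities sum to 1.

```python
from fractions import Fraction

def is_valid_probability_space(pr):
    # Constraint (1): every probability lies in [0, 1].
    in_range = all(0 <= p <= 1 for p in pr.values())
    # Constraint (2): the probabilities of all sample points sum to 1.
    sums_to_one = sum(pr.values()) == 1
    return in_range and sums_to_one

def uniform(sample_space):
    # Assign probability 1/|sample space| to every sample point.
    outcomes = list(sample_space)
    return {w: Fraction(1, len(outcomes)) for w in outcomes}

coin = uniform(["H", "T"])
print(coin["H"])                                   # 1/2
print(is_valid_probability_space(coin))            # True
print(is_valid_probability_space({"H": Fraction(1, 2), "T": Fraction(1, 3)}))  # False
```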
Most of the time, we are not interested in specific outcomes. For
example in scenario (1) above, we want to know the probability of
getting any 5-card flush, not a particular 5-card flush (which we
know is 1/C(52, 5)). So we want to know the probability of a set
of outcomes, i.e. a subset of the sample space. We refer to a
subset of the sample space as an "event." Naturally, the
probability of an event E is the sum of the probabilities of the
outcomes in E:
Pr[E] = ∑_{ω∈E} Pr[ω].
In the case of a uniform distribution, this simplifies to
Pr[E] = |E|/|Ω|.
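Both formulas for Pr[E] can be tried on a small uniform space. The sketch below uses 3 fair coin flips and the event "exactly two heads"; the example and the function name are chosen here for illustration. It confirms that summing Pr[ω] over E agrees with |E|/|Ω|.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 2^3 sequences of H and T, each with probability 1/8.
sample_space = list(product("HT", repeat=3))
pr = {w: Fraction(1, len(sample_space)) for w in sample_space}

def event_probability(event, pr):
    # Pr[E] is the sum of Pr[w] over the sample points w in E.
    return sum(pr[w] for w in event)

exactly_two_heads = [w for w in sample_space if w.count("H") == 2]

print(event_probability(exactly_two_heads, pr))               # 3/8
print(Fraction(len(exactly_two_heads), len(sample_space)))    # 3/8
```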
EX: What is the probability of a flush in a 5-card poker hand?
ANS: We know that this experiment has a uniform distribution with
|Ω| = C(52, 5). Let E be the event that the hand is a
flush. There are four suits, and we can pick 5 cards out of
the 13 in a suit to be a flush, so the number of outcomes in
E is |E| = 4 C(13, 5). Thus, Pr[E] = 4 C(13, 5) / C(52, 5)
or about 0.002.
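The arithmetic behind "about 0.002" can be reproduced exactly with a few lines of Python. This is a sketch using the counts from the answer above; like that answer, it counts any five cards of one suit as a flush, straight flushes included.

```python
from fractions import Fraction
from math import comb

num_hands = comb(52, 5)          # |Ω|: all 5-card hands
num_flushes = 4 * comb(13, 5)    # |E|: pick a suit, then 5 of its 13 cards

flush_probability = Fraction(num_flushes, num_hands)

print(num_flushes, num_hands)               # 5148 2598960
print(flush_probability)                    # 33/16660
print(round(float(flush_probability), 5))   # 0.00198
```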