Administrative info
Final exam tomorrow 5-8pm in 10 Evans
Two 8.5x11 double-sided cheat sheets allowed
No regrades for HW8 (not enough time) or final exam (UCB policy)
HW8, review solutions to be posted shortly
Review
We defined two sets A and B to have the same cardinality if there
is a bijection between A and B.
We defined a set A to be countable if there is a bijection between
A and some subset of N. If A is countable and infinite, then it
is countably infinite.
All countably infinite sets have the same cardinality, since they
can be put into a bijection with N.
To show that a set A is countable, it suffices to demonstrate a
(possibly infinite) enumeration of A that lists all elements of A.
Any set N∪{a}, where a∉N, is therefore countable, since
we could list N∪{a} as
N∪{a} = {a, 0, 1, 2, ...}.
(So in some sense, ∞+1 = ∞, perhaps to the chagrin of
some children.)
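The listing above can be sketched in code. This is only an illustration:
the names enumerate_n_plus and position are made up for this example, and
the extra element a is modeled as the string "a", which is assumed not to
be a natural number.

```python
from itertools import count, islice

def enumerate_n_plus(a):
    """Yield a first, then 0, 1, 2, ...; every element shows up at a finite position."""
    yield a
    yield from count(0)

def position(x, a):
    """Index of x in the listing, i.e. a bijection from N∪{a} to N."""
    return 0 if x == a else x + 1

# The first few elements of the (infinite) listing.
first_five = list(islice(enumerate_n_plus("a"), 5))
```

Every natural number n lands at position n + 1 and a lands at position 0,
so no element is missed and no position is used twice.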
We used diagonalization to show that the set of real numbers in the
interval [0, 1] is uncountable, i.e. uncountably infinite. The
technique was as follows:
(1) Assume that a set S can be enumerated.
(2) Consider an arbitrary list of all the elements of S.
(3) Use the diagonal from the list to construct a new element
t.
(4) Show that t is in S but is different from all elements in the
list and so is not in the list. Contradiction.
This shows that the original assumption that S is countable was
false, so S is uncountably infinite.
Now we can understand the difference between discrete and continuous
random variables. The range of a discrete random variable is a
countable subset of R, while that of a continuous random
variable is an uncountable subset of R. (As an exercise, show
that any interval [a, b] of real numbers, a < b, is uncountably
infinite. Hint: demonstrate a bijection between [a, b] and [0, 1].)
Let's use diagonalization to show that the set FB of all functions
from finite binary strings to {0,1} is uncountable. This is the set
of all functions of the form
f:{0,1}^*->{0,1}.
(Note that we define a function by its mapping from inputs to
outputs, not by its functional form. Thus, the following function on
real numbers
f:R->R f(x) = x
is the same as
g:R->R g(x) = x + 1 - 1.)
Let's start by defining a representation for functions in FB. We
know that BS = {0,1}^* is countable, so its elements can be listed
in some order
BS = {s0, s1, s2, ...}.
So let's represent a function f∈FB as a corresponding list of
outputs
f = (f(s0), f(s1), f(s2), ...).
This representation is infinite, like that of real numbers, so it
should be no surprise that FB is uncountable.
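As a rough sketch of this representation, the code below lists BS by
length and then lexicographically. That ordering is an assumption (the
notes only require "some order"), and the names finite_bitstrings,
output_prefix and parity are made up for the example.

```python
from itertools import count, islice, product

def finite_bitstrings():
    """Yield every finite bitstring exactly once: by length, then lexicographically."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def output_prefix(f, n):
    """First n entries of the output list (f(s0), f(s1), f(s2), ...)."""
    return tuple(f(s) for s in islice(finite_bitstrings(), n))

def parity(s):
    """An example member of FB: 1 if s contains an odd number of 1s, else 0."""
    return s.count("1") % 2

first_seven = list(islice(finite_bitstrings(), 7))
parity_prefix = output_prefix(parity, 7)
```

Only a finite prefix can ever be printed; the full representation of a
function in FB is the infinite list.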
Let's assume that FB is countable. Then its elements can be listed:
i f∈FB
0 (0, 1, 0, 1, 0, 1, ...)
1 (1, 1, 1, 0, 0, 1, ...)
2 (0, 0, 0, 0, 0, 0, ...)
3 (0, 1, 0, 0, 1, 1, ...)
4 (1, 1, 1, 1, 0, 1, ...)
5 (0, 0, 0, 1, 1, 1, ...)
... ...
Then we can construct a new function g that is different from all
of the functions in the list:
g = (1-f0(s0), 1-f1(s1), 1-f2(s2), ...)
= (1, 0, 1, 1, 1, 0, ...).
Since g∈FB but not in the list, this is a contradiction, so FB
is uncountable.
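The construction of g can be checked mechanically on the finite corner of
the list shown above. This is a sketch on the first six rows and columns
only; the name flip_diagonal is made up for the example.

```python
# The six listed functions, truncated to their first six outputs.
listed = [
    (0, 1, 0, 1, 0, 1),
    (1, 1, 1, 0, 0, 1),
    (0, 0, 0, 0, 0, 0),
    (0, 1, 0, 0, 1, 1),
    (1, 1, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1),
]

def flip_diagonal(rows):
    """Entry i of the result is 1 - f_i(s_i), the opposite of the diagonal entry."""
    return tuple(1 - rows[i][i] for i in range(len(rows)))

g = flip_diagonal(listed)

# g disagrees with the i-th listed function on input s_i, for every i.
differs_everywhere = all(g[i] != listed[i][i] for i in range(len(listed)))
```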
As might be clear from the above, our representation for functions
demonstrates a bijection between FB and the set IBS of infinite
binary strings. This immediately implies that FB is uncountable.
Computability
We defined the set FB of functions that take in finite bitstrings
as input and output 0 or 1:
FB = {f:{0,1}^*->{0,1}}.
We saw that this set is uncountable.
A function f is "computable" if there is a computer program P that
computes it, meaning that for any input bitstring s, P terminates
when run on s and outputs f(s).
Is the set CP of computer programs that take in a finite bitstring
and produce 0 or 1 countable? A computer program must be finite, so
it can be represented as a finite bitstring, implying that there is
a bijection between CP and some subset of BS, the set of finite
bitstrings. Since BS is countable, this implies that CP is
countable.
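Each program computes at most one function, so there are at most
countably many computable functions. Since FB is uncountable and CP
is countable, the cardinality of FB is strictly larger than that of
CP, implying that there are functions that are not computable.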
The above, however, is a non-constructive proof. It merely tells us
that there are uncomputable functions, without demonstrating an
example of a function that is uncomputable.
In order to demonstrate a concrete example, we first note that the
set CP x {0,1}^*, the Cartesian product of the set of computer
programs and the set of finite bitstrings, is countable. Then there
is a bijection between CP x {0,1}^* and {0,1}^*, implying that we
can represent an element (P, I) of CP x {0,1}^* as a finite
bitstring.
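One standard way to see that a product of two countable sets is countable
is the Cantor pairing function on indices. The sketch below assumes that
programs and bitstrings have each already been numbered 0, 1, 2, ...; the
names pair and unpair are chosen for this example.

```python
from math import isqrt

def pair(i, j):
    """Cantor pairing: map the index pair (i, j) to a single natural number."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(n):
    """Inverse of pair: recover (i, j) from n."""
    w = (isqrt(8 * n + 1) - 1) // 2   # w = i + j, the anti-diagonal containing n
    j = n - w * (w + 1) // 2          # offset of n within that anti-diagonal
    return w - j, j
```

If P is the i-th program and I is the j-th bitstring, then (P, I) can be
represented by the bitstring numbered pair(i, j).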
Now define the function
h: {0,1}^* -> {0,1}
h(x) = { 1 if the program P halts when run on I, where x = (P, I)
0 otherwise
Then h∈FB, the set of functions that we demonstrated is
uncountable.
Is the function h computable? Let's assume that it is computable, so
there exists a program
HaltOrNot(P, I):
if P halts when run on I then 1
else 0.
Note that in order for h to be computable, HaltOrNot must terminate
on every input. So it is not sufficient for it to call P as a
subroutine and return 1 when P halts, since it would not terminate if P
does not. (As a side note, since {0,1}^* x {0,1}^* is countable, it
has a bijection with {0,1}^*, so we can convert multiple inputs to a
program into a single input. However, it is more convenient to
explicitly write two inputs, so that is what we will do.)
Now if HaltOrNot exists, then Alan Turing argued that he could
construct the following program that calls HaltOrNot as a
subroutine:
Turing(P):
if HaltOrNot(P, P) = 1 then go into an infinite loop
else halt immediately, returning 0.
The program Turing, given another program P, calls HaltOrNot to
determine if P halts when run on itself. (Recall that a program has
a bitstring representation, so that representation can be passed
into a program itself as input.) If so, Turing does the opposite,
going into an infinite loop. Similarly, if P does not halt when run
on itself, then Turing does the opposite, halting.
What happens when we call Turing(Turing)? There are two
possibilities:
Case 1: Turing(Turing) halts. Then when Turing(Turing) runs, it
calls HaltOrNot(Turing, Turing), which will return 1 since
Turing(Turing) halts. Then Turing(Turing) will go into an
infinite loop, so it won't halt, which is a contradiction.
Case 2: Turing(Turing) doesn't halt. Then when Turing(Turing) runs,
it calls HaltOrNot(Turing, Turing), which will return 0
since Turing(Turing) doesn't halt. Then Turing(Turing) will
halt immediately, contradicting the fact that it does not
halt.
In either case, we end up with a contradiction. Thus, our original
assumption that HaltOrNot exists is false, and h is uncomputable.
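The construction can be tried against any concrete candidate for
HaltOrNot. In the sketch below, programs are passed around as Python
functions rather than as bitstrings (a simplifying assumption), and
make_turing and says_never_halts are names made up for the example.

```python
def make_turing(halt_or_not):
    """Build the program Turing on top of a claimed halting decider."""
    def turing(p):
        if halt_or_not(p, p) == 1:
            while True:        # decider says p(p) halts, so do the opposite
                pass
        return 0               # decider says p(p) loops, so halt immediately
    return turing

def says_never_halts(p, i):
    """A candidate decider (necessarily wrong): claims that nothing ever halts."""
    return 0

turing = make_turing(says_never_halts)
claimed = says_never_halts(turing, turing)   # the decider's answer for Turing(Turing)
actual = turing(turing)                      # this call returns, so Turing(Turing) halts
```

Here the candidate claims that Turing(Turing) does not halt, yet the call
returns, which is Case 2. A candidate that answers 1 instead would send
turing(turing) into an infinite loop, which is Case 1, so that variant
should not actually be run.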
Here is another way to express this proof using a modified form of
diagonalization. Since we know that the set of programs CP is
countable, we can list all its elements. Let's list them in both
dimensions of a 2D table:
P0 P1 P2 P3 P4 ...
---------------------
P0 | H H H H H ...
P1 | H H H H H ...
P2 | N N N N N ...
P3 | H N N H N ...
P4 | H H H N N ...
... ...
The table entries represent what happens when the program on the
vertical axis is run on the input on the horizontal axis. For
example, the entry in the second row and third column is what
happens when running P1(P2). Either the program halts, which
we denote by 'H', or not, which we denote by 'N'.
Now if HaltOrNot exists, we can write the program Turing that does
the opposite of the diagonal in the table. Thus, when Turing is run
on P_i, it does the opposite of P_i(P_i), halting if P_i(P_i) does
not, going into an infinite loop if it does. This implies that
Turing is different from any program P_i on the list, since its
behavior differs from that of P_i when run on P_i. But since the
list enumerates all programs in CP, this implies that Turing is not
in CP. This further implies that HaltOrNot isn't either, since we
can easily write Turing if we have HaltOrNot.
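For the five-by-five corner of the table shown above, the behavior forced
on Turing can be computed directly. This is a sketch on the visible
entries only; the name opposite_of_diagonal is made up for the example.

```python
# Row i is the behavior of P_i on inputs P0..P4: 'H' halts, 'N' does not.
rows = [
    "HHHHH",   # P0
    "HHHHH",   # P1
    "NNNNN",   # P2
    "HNNHN",   # P3
    "HHHNN",   # P4
]

def opposite_of_diagonal(table):
    """Behavior of Turing on each P_i: the opposite of what P_i does on P_i."""
    flip = {"H": "N", "N": "H"}
    return "".join(flip[table[i][i]] for i in range(len(table)))

turing_row = opposite_of_diagonal(rows)

# Turing's behavior on P_i differs from P_i's own behavior on P_i, for every i.
differs_on_diagonal = all(turing_row[i] != rows[i][i] for i in range(len(rows)))
```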
So we have shown that there is no program HaltOrNot that will tell
us whether or not another program halts. Bummer.
It gets worse. We can demonstrate that any function that answers an
interesting question about the behavior of computer programs is
uncomputable. For
example, suppose we want to know whether or not a program will print
"Hello world!" when run on a particular input. Assume that there is
a program PrintsHW that computes this:
PrintsHW(P, I):
if P prints "Hello world!" when run on I then 1
else 0.
Then we could write HaltOrNot:
HaltOrNot(P, I):
1. Remove all print statements from P.
2. Add a statement to print "Hello world!" before each halt
statement in P.
3. Return PrintsHW(P, I).
Then the modified P will halt if and only if it prints "Hello
world!", so if we can compute whether or not it prints "Hello
world!", we can compute whether or not it halts. Since we know we
can't do the latter, we can't do the former either.
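The three steps of this reduction can be sketched in a toy model. The
model is an assumption made for illustration: a program is a Python
function p(inp, out) that prints by calling out(text) and halts by
returning. The names make_halt_or_not, run_and_check and shout are made
up, and run_and_check is only a stand-in for PrintsHW that works on
programs that halt, since it simply runs them.

```python
def make_halt_or_not(prints_hw):
    """Build HaltOrNot from a (hypothetical) PrintsHW decider."""
    def halt_or_not(p, i):
        def modified(inp, out):
            p(inp, lambda text: None)   # step 1: discard everything P prints
            out("Hello world!")         # step 2: print just before halting
        return prints_hw(modified, i)   # step 3: ask PrintsHW about the modified P
    return halt_or_not

def run_and_check(p, i):
    """Stand-in for PrintsHW: valid ONLY for programs that halt, since it runs p."""
    printed = []
    p(i, printed.append)
    return 1 if "Hello world!" in printed else 0

def shout(inp, out):
    """A toy program that halts and never prints "Hello world!" itself."""
    out(inp.upper())

halt_or_not = make_halt_or_not(run_and_check)
answer = halt_or_not(shout, "hi")
```

The modified program reaches its print statement exactly when the
original P returns, so a true PrintsHW would answer 1 exactly when P
halts on I.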
We can repeat the above procedure for any interesting question about
programs, showing that it is uncomputable. As an example, we can
show that no program exists that can determine with certainty
whether or not another program is a virus. The best we can do is
approximate, using heuristics.