Administrative info
  HW3 out
  PA1 due Friday!

Review
  We have seen how to express statements precisely, how to prove them,
  and even how to analyze algorithms and prove interesting facts about
  them. We are now done with the first unit of the course and turn our
  attention to modular arithmetic and its applications.

Overview
  What do DVDs and clocks have in common?
  - They're both round.
  - They both use modular arithmetic!

  Suppose I want to talk to someone in Cairo, which is nine hours
  ahead of Pacific time. It would probably be a good idea to figure
  out what time it is in Cairo first to make sure that he isn't
  asleep. If it's 5pm right now, what time is it in Cairo? We add 9 to
  5 to get 14, and then subtract 12 to get 2. Since we've passed
  midnight, it's 2am there, and I should probably wait until tomorrow
  to make the call.

  We use mod 12 arithmetic to calculate time. We will see how such
  arithmetic works in general and look at many applications.
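  The clock calculation above is just arithmetic with a wraparound. As
  a quick sketch in Python (using a 24-hour clock so am/pm takes care
  of itself):

```python
# Cairo is 9 hours ahead of Pacific time. On a 24-hour clock,
# 5pm is hour 17; adding 9 and wrapping around 24 gives the hour there.
pacific_hour = 17                      # 5pm in 24-hour time
cairo_hour = (pacific_hour + 9) % 24
print(cairo_hour)                      # 2, i.e. 2am
```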

Standard Arithmetic
  Suppose we restrict ourselves to natural numbers. What operations
  are possible on the naturals?

  Given two natural numbers, we can add them and end up with a
  natural.

  We can also multiply them and end up with a natural.

  What about subtraction? If we subtract 5 from 4, we end up with -1,
  which is unnatural. So in order to allow subtraction, we need to
  expand our set of elements to all integers.

  OK, so now we can subtract. What about division? Well, 1 / 2 is
  clearly not an integer, so we need to expand our set to the
  rationals.

  We can continue adding operations until we end up with the reals.

  On a computer, however, all of these sets are inconvenient, since
  they can't be represented using a finite number of bits. We try to
  get around this using floats/doubles, but even they have maximum
  values, and we have to worry about roundoff, etc. So rather than
  trying to represent infinite sets, let's restrict ourselves to
  finite sets like we do when talking about the time.

Modular Arithmetic
  To represent the time, we use mod 12 arithmetic. Once we get past
  12, we wrap back around to 1. For our purposes, however, since we
  are computer scientists, we like to start at 0. So instead, we will
  start with 0 and wrap back around to it once we pass 11. If we have
  a number much bigger than 11, then we just repeatedly subtract 12
  until we get between 0 and 11.
    Ex: What is mod(22, 12)? 10.
        What is mod(146, 12)? 2.

  For non-negative x, m, we define
    mod(x, m) = x - floor(x/m) * m.
  This tells us how many multiples of m to subtract, namely
  floor(x/m).

  What if we have numbers less than 0? Then we just add a multiple of
  12 until we get between 0 and 11.
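  As a sketch, the definition above translates directly to code.
  Python's // operator is floor division, so the same formula handles
  negative x as well:

```python
def mod(x, m):
    # x - floor(x/m) * m; Python's // is floor division, so this is
    # exact for integers and lands in 0..m-1 whenever m > 0.
    return x - (x // m) * m

print(mod(22, 12))    # 10
print(mod(146, 12))   # 2
print(mod(-11, 12))   # 1: we added a multiple of 12 to get into range
```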

  We can now define equivalence classes that consist of all integers
  that are the same mod 12.
    Ex: -12, 0, 12, 24, 36 are equivalent
        -11, 1, 13, 25, 37 are equivalent

  We use x ≡ y (mod m) to denote the fact that x and y are in the same
  equivalence class mod m.
    Ex: 2 ≡ 14 (mod 12)
        3 ≡ 147 (mod 12)
  What does this actually mean? It means that 2 and 14 differ by a
  multiple of 12. More formally, (2 - 14) = 12k for some integer k.
  In general, if a ≡ b (mod m), then (a - b) = mk for some integer k,
  or a = b + mk.
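  This characterization gives an easy test for equivalence, sketched
  here with Python's built-in % operator:

```python
def congruent(x, y, m):
    # x ≡ y (mod m) exactly when x - y is a multiple of m
    return (x - y) % m == 0

print(congruent(2, 14, 12))    # True
print(congruent(3, 147, 12))   # True: 147 - 3 = 144 = 12 * 12
print(congruent(5, 14, 12))    # False
```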

  We can add, multiply, and subtract mod m: we do so as if we were
  dealing with integers, and then convert to the right equivalence
  class mod m.
    Ex: 2 * 32 ≡ 64 ≡ 4 (mod 12)

  Let's write out a few addition and multiplication tables.

  + 0 1 2   * 0 1 2   + 0 1 2 3   * 0 1 2 3
  0 0 1 2   0 0 0 0   0 0 1 2 3   0 0 0 0 0
  1 1 2 0   1 0 1 2   1 1 2 3 0   1 0 1 2 3
  2 2 0 1   2 0 2 1   2 2 3 0 1   2 0 2 0 2
                      3 3 0 1 2   3 0 3 2 1

  They're just what you would expect, with the numbers reduced mod m.
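  Tables like these can be generated mechanically. A small sketch,
  where the helper returns the rows so they can be printed or checked:

```python
def table(op, m):
    # Rows of the mod-m table for a binary operation op on 0..m-1
    return [[op(a, b) % m for b in range(m)] for a in range(m)]

for row in table(lambda a, b: a + b, 3):   # addition mod 3
    print(" ".join(map(str, row)))
for row in table(lambda a, b: a * b, 4):   # multiplication mod 4
    print(" ".join(map(str, row)))
```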

  For standard arithmetic, we have a number of useful identities. For
  example, we know that if a = c and b = d, then a + b = c + d. Does
  the same identity hold in modular arithmetic?
    If a ≡ c (mod m), then (a-c) = km for some k. Similarly, (b-d) =
    lm for some l. Then (a+b) - (c+d) = (a-c) + (b-d) = (k+l)m, so
    (a+b) ≡ (c+d) (mod m). The identity still holds.
  Similarly, we have if a ≡ c (mod m) and b ≡ d (mod m), then ab ≡ cd
  (mod m).

  This makes computing modular arithmetic expressions much easier. We
  can reduce operands before performing an operation so that we work
  with smaller numbers.
    Ex: (13+11) * 18
        ≡ (6+4) * 4 (mod 7)
        ≡ 10 * 4 (mod 7)
        ≡ 3 * 4 (mod 7)
        ≡ 12 (mod 7)
        ≡ 5 (mod 7)
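  We can check that reducing early agrees with reducing only at the
  end, using the example above:

```python
m = 7
full = ((13 + 11) * 18) % m                        # reduce at the end
reduced = ((13 % m + 11 % m) % m) * (18 % m) % m   # reduce each operand
print(full, reduced)   # 5 5
```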

  Now that we have addition, multiplication, and subtraction (which is
  really the same as addition), what about division? In the integers,
  we had to introduce rationals to be able to express the result of
  1/5, but we'd like to stick to our nice, finite set this time.

  We can actually reduce division to multiplying by a reciprocal, or
  multiplicative inverse. Thus, to divide by 5, we instead multiply by
  its inverse 1/5. Any number x, when multiplied by its inverse 1/x,
  results in 1. Thus, when it comes to modular arithmetic, the inverse
  of a number x (mod m) should give us 1 (mod m) when multiplied by x.

  Let's look at our mod 3 multiplication table. Does every number have
  an inverse? We see that 1 * 1 ≡ 1 (mod 3), so 1 is its own inverse.
  Similarly with 2. What about 0?

  What about mod 4? We see that 1 and 3 have inverses, but 0 and 2
  don't.

  So some numbers have an inverse mod m, and some don't. Can we come
  up with a general rule for when an inverse exists?

  Does 3 have an inverse mod 12? We would require a value a such that
    3a ≡ 1 (mod 12)
    3a = 12k + 1 for some integer k
  But this is impossible. No matter what k is, 12k will be a multiple
  of 3, so 12k+1 cannot be a multiple of 3.

  In general, if we want the inverse of x mod m, we need
    ax ≡ 1 (mod m)
    ax = km + 1 for some integer k
  If x and m share any prime factor p, then we have
    (multiple of p) = (multiple of p) + 1,
  which is impossible! Thus, if gcd(x, m) ≠ 1, then there is no
  inverse of x mod m.
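  A brute-force search illustrates the criterion (math.gcd is in
  Python's standard library):

```python
from math import gcd

def inverse(x, m):
    # Return a with a * x ≡ 1 (mod m), or None if no such a exists
    for a in range(m):
        if (a * x) % m == 1:
            return a
    return None

# An inverse mod 12 exists exactly when gcd(x, 12) = 1
for x in range(12):
    assert (inverse(x, 12) is not None) == (gcd(x, 12) == 1)
print(inverse(5, 12))   # 5, since 5 * 5 = 25 ≡ 1 (mod 12)
print(inverse(3, 12))   # None, since gcd(3, 12) = 3
```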

  What if we are working mod some prime, say 5. Does every non-zero
  number now have an inverse?
    We don't know yet! The statement we proved is equivalent to
      x has inverse mod m => gcd(x, m) = 1 (contrapositive)
    To conclude that
      gcd(x, m) = 1 => x has inverse mod m
    is a converse error! It turns out that it is true, but we have
    to prove it.

    It seems hard to prove. So once again, we are desperate, and what
    do we do when we are desperate? Prove something harder!

    Claim: If gcd(x, m) = 1, then the values a*x where a = 0, 1, ...,
      m-1 are all distinct modulo m.
    Note that if this is true, then the m values a*x land in m distinct
    equivalence classes, so they hit every class: for any b there is
    some a such that a*x ≡ b (mod m). In particular, there is some a
    such that a*x ≡ 1 (mod m), and that a is the inverse of x mod m.
    Proof: Suppose for the purpose of a contradiction that there are
      a1, a2 such that a1 ≢ a2 (mod m) but a1 x ≡ a2 x (mod m). Then
      we have (a1 - a2) x = km for some integer k. The RHS is a
      multiple of m, so the LHS must be a multiple of m. Since gcd(x,
      m) = 1, none of m's prime factors divide x, so (a1 - a2) must be
      a multiple of m, i.e. a1 - a2 = lm for some integer l. This
      implies that a1 ≡ a2 (mod m), which is a contradiction.
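  We can observe the claim directly: when gcd(x, m) = 1, the products
  a*x mod m are a permutation of 0, ..., m-1, and when gcd(x, m) > 1
  they collide and miss 1 entirely.

```python
m, x = 12, 5                       # gcd(5, 12) = 1
hits = sorted((a * x) % m for a in range(m))
print(hits == list(range(m)))      # True: all m products distinct mod m

m, x = 12, 3                       # gcd(3, 12) = 3, so no inverse
print(sorted({(a * x) % m for a in range(m)}))   # [0, 3, 6, 9]
```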

  So we see that gcd(x, m) = 1 <=> x has inverse mod m. Thus, GCD is
  important, and we now turn our attention to computing it.

GCD
  How can we compute gcd(x, y)? The simplest way is to just try every
  number 1, 2, ..., min(x,y) and find the largest one that divides
  both x and y. But this is really slow. As we will see in RSA, we
  need to compute GCDs of large numbers, say 128-bit ones. So if x and
  y
  are around 2^128, we would need to test around 2^128 possible
  divisors. If we do that, the answer won't matter anymore, because
  we'll all be dead before we finish. So we want something much
  faster, say on the order of 128 or 128^3 or something reasonable
  like that.

  Euclid also thought about this. He was smart, and he knew he wouldn't
  live forever, so he came up with an algorithm that wouldn't take
  forever. It relies on the following fact:
    gcd(x, y) = gcd(x - y, y) [assume x >= y >= 0]
    Proof: If d divides x, y, then x = kd, y = ld, so d divides x - y
      = (k-l)d. If d divides (x-y) and y, then x-y = kd, y = ld, and x
      = (k+l)d, so d divides x as well.

    So we know that gcd(568, 132) = gcd(436, 132), since 568-132=436.

    In fact, if we apply the above fact many times, we can show that
      gcd(x, y) = gcd(mod(x, y), y) [x >= y >= 0].

  So using the above fact, here is Euclid's algorithm:
    gcd(x, y):
      if y = 0 then:
        return x
      else:
        return gcd(y, mod(x, y))
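  The pseudocode translates to Python almost verbatim, since Python's
  % operator already computes mod(x, y) for non-negative arguments:

```python
def gcd(x, y):
    # Euclid's algorithm, for integers x, y >= 0
    if y == 0:
        return x
    return gcd(y, x % y)

print(gcd(568, 132))   # 4
```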

  Let's try to prove that the algorithm is correct for all x,y >= 0.
  We have two variables here, and we've only seen induction over a
  single variable n to prove ∀n∈N . P(n). So what variable
  do we do induction over?

  In general, determining the induction variable can be tricky when
  we have multiple variables. If we make the wrong choice, it makes
  our job a lot harder in the proof.

  Here, let's do induction over y. So we define
    P(y) = ∀x∈N . x >= y => algorithm computes gcd(x, y),
  and we want to prove ∀y∈N . P(y).

  Proof by strong induction:
    Base case: y = 0
      Algorithm returns x, which is correct since x|x and x|0.
    IH: Assume P(k) for all 0 <= k <= y.
    IS: We need to prove P(y+1). We know by our lemma above that
      gcd(x, y+1) = gcd(mod(x, y+1), y+1), which is the same as
      gcd(y+1, mod(x, y+1)). This is what the algorithm returns, and
      since mod(x, y+1) < y+1, by the IH, it computes it correctly.

  Here's an example of running the algorithm on 568, 132
    gcd(568, 132)
      gcd(132, 40)
        gcd(40, 12)
          gcd(12, 4)
            gcd(4, 0)
              4

  Notice that the numbers get quite a bit smaller in each
  iteration. In fact, after two iterations, we can prove that the
  first argument x goes down by at least a factor of 2. Thus, the
  number of iterations is logarithmic in x, i.e. linear in the number
  of bits in x. The total running time is O(n^3), where n is the
  number of bits in x, since each iteration takes O(n^2) time for the
  division.
    Proof by cases:
    Case 1: x/2 >= y
      Then the first argument in the next iteration is y <= x/2.
    Case 2: x >= y > x/2
      Then the arguments in the next iteration are (y, mod(x, y)), and
      then in the iteration after that (mod(x, y), mod(y, mod(x, y))).
      So the first argument is mod(x, y) after two iterations. But
      mod(x, y) <= x - y < x - x/2 = x/2 since y > x/2.
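  A sketch that instruments an iterative version of the algorithm
  shows the iteration count staying small even for 128-bit inputs (the
  particular large operands below are arbitrary):

```python
def gcd_steps(x, y):
    # Iterative Euclid that also counts iterations
    steps = 0
    while y != 0:
        x, y = y, x % y
        steps += 1
    return x, steps

print(gcd_steps(568, 132))             # (4, 4)
g, steps = gcd_steps(2**128 + 1, 2**127 - 5)
print(steps)                           # a few hundred at most, not 2^128
```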

  So now, we can compute gcd(x, y), so we can tell if x has an inverse
  mod y. But how do we determine what the inverse actually is? We will
  see next time.