Administrative info
  PA1 due tomorrow!
  No lecture or section on Monday
  Office hours on Monday?

Review
  We have now seen how to perform modular arithmetic and one of its
  applications, public key encryption using RSA. We saw how to encode a
  message so that only the receiver could decode it. Today, we will see
  how to share a secret message among n people such that k of them need
  to be present in order to recover it.

Secret Sharing Schemes
  Suppose we have a bank vault that requires a 4-digit PIN to open it.
  For extra security, we want the bank's president and chairman to
  both agree before it can be opened. How do we divide the
  combination so that the two of them can open the vault together,
  but neither can open it alone?

  Here are our goals for a secret sharing scheme:
  1) Robustness.
     It works, i.e. if the president and chairman cooperate, they must
     always be able to open the vault.
  2) Secrecy.
     Neither person should be able to tell anything about the PIN by
     looking at their piece of it.
  3) Minimality.
     We want to minimize the amount of information that either person
     needs to memorize.

  Here's a simple scheme: give the president the first two digits and
  the chairman the last two digits. What do you think about this
  scheme?
    Unfortunately, we compromise on secrecy here. There are 10000
    possible combinations, but since each person knows two digits,
    they only have to guess two more, requiring only 100 guesses. This
    is much less secure.

  Suppose instead that the PIN is some number s, and we give the
  president a natural number r and the chairman a natural number t
  such that s = r + t. Does this work?
    Again, we compromise on secrecy. Suppose r = 8000. Then the
    president knows that s must lie in the range [8000, 9999].

  How can we fix the problem? (We don't want to use negative numbers,
  since we're not negative people. We like to think positive.)
    Modular arithmetic! We choose r and t such that s ≡ r + t (mod
    10000).

    Does this scheme satisfy our secrecy requirement? It does: since
    the president only knows r, no matter what s is, there is some
    value t such that s ≡ r + t (mod 10000), namely t ≡ s - r (mod
    10000). Since any value of s is consistent with the information he
    has, he can't tell anything about s.
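
  The two-person scheme above can be sketched in a few lines of
  Python. This is a minimal illustration, not a hardened
  implementation; the names split_pin and recover_pin are made up
  for this example, and the standard-library secrets module is
  assumed as the source of randomness.

```python
import secrets

MODULUS = 10000  # four-digit PINs live in 0..9999


def split_pin(s):
    """Split PIN s into two shares (r, t) with s ≡ r + t (mod 10000)."""
    r = secrets.randbelow(MODULUS)  # uniformly random, independent of s
    t = (s - r) % MODULUS           # the unique share that completes the sum
    return r, t


def recover_pin(r, t):
    """Combine both shares to get the PIN back."""
    return (r + t) % MODULUS


r, t = split_pin(1234)
print(recover_pin(r, t))  # 1234
```

  Because r is drawn uniformly and t is just s - r shifted back into
  range, each share on its own is a uniformly distributed number
  mod 10000.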

  What if we had three people, say the president, chairman, and
  vice-president? Well we could just pick three numbers mod 10000 that
  add to s. But what if we only required that two out of three be
  present to open the vault?

  In general, we want to share a secret among n people such that any
  k of them can cooperate to recover the secret, k <= n, but fewer
  than k cannot. The additive scheme needs every share to be present,
  so it cannot do this when k < n.
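
  To make the limitation concrete, here is a rough sketch of the
  additive scheme extended to n people. The function names
  split_additive and recover_additive are hypothetical. All n shares
  are needed; leaving any one out gives a sum that says nothing
  about s.

```python
import secrets

MODULUS = 10000


def split_additive(s, n):
    """Return n shares that sum to s modulo 10000."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    shares.append((s - sum(shares)) % MODULUS)  # last share fixes the sum
    return shares


def recover_additive(shares):
    """Add up the shares that are present, modulo 10000."""
    return sum(shares) % MODULUS


shares = split_additive(1234, 3)
print(recover_additive(shares))      # 1234, with all three shares
print(recover_additive(shares[:2]))  # some partial sum; it does not determine the PIN
```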

Polynomials
  Let's take a detour and review polynomials before taking another
  stab at constructing a secret sharing scheme.

  Recall that a polynomial is represented as
    P(x) = a_d x^d + a_{d-1} x^{d-1} + ... + a_1 x + a_0.
  Such a polynomial has degree d if a_d ≠ 0.
    Ex: P(x) = x + 1
        P(x) = 3x^2 - 2
        P(x) = 4

  The coefficients a_i can be real numbers, rationals, or integers
  mod m, where m is prime. In the last case, we say we are working
  over a "finite field," or more specifically a "Galois field,"
  denoted by GF(m) where m is the prime modulus.

  A root of a polynomial is a value s such that P(s) = 0.
    Ex: P(x) = x + 1          s = -1
        P(x) = 3x^2 - 2       s = +/- sqrt{2/3}
        P(x) = 4              no roots!
        P(x) ≡ x + 1 (mod 5)  s ≡ 4 (mod 5)
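
  Over GF(m) there are only m candidate values, so roots can be
  found by trying each one. The sketch below does this; the helper
  names poly_eval and roots_mod_p are invented for this example, and
  coefficients are listed from highest degree to lowest.

```python
def poly_eval(coeffs, x, p):
    """Evaluate a polynomial at x modulo p using Horner's rule.

    coeffs lists coefficients from highest degree to lowest,
    so [3, 0, -2] stands for 3x^2 - 2.
    """
    result = 0
    for a in coeffs:
        result = (result * x + a) % p
    return result


def roots_mod_p(coeffs, p):
    """Try every element of GF(p) and keep the ones that evaluate to 0."""
    return [x for x in range(p) if poly_eval(coeffs, x, p) == 0]


print(roots_mod_p([1, 1], 5))  # [4], matching s ≡ 4 (mod 5) for x + 1
print(roots_mod_p([4], 5))     # [], the constant polynomial 4 has no roots
```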

  There are two very important properties of polynomials.
  1) Any nonzero polynomial P(x) of degree d has at most d distinct
     roots.
  2) Given d+1 points (x_1, y_1), ..., (x_{d+1}, y_{d+1}) with all x_i
     distinct, there is exactly one polynomial of degree <= d that
     goes through all d+1 points.

    Property 2 is a generalization of the maxim "2 points determine a
    line." A line is a polynomial of degree 1, so the property holds.

  Before we get to proving these properties, let's review how to
  divide one polynomial by another. Let's consider
  (4x^2 - 3x + 7) / (x - 3).

                   4 x + 9  r 34
          -----------------
    x - 3 ) 4x^2 - 3 x + 7
            4x^2 - 12x
            ----------
                    9x + 7
                    9x - 27
                    -------
                         34

  So we see that 4x^2-3x+7 = (x-3)(4x+9) + 34.

  We can divide polynomials in general, but we are really only
  interested in dividing by binomials of the form (x-c). If we divide
  a degree d polynomial P(x) by (x-c), we can write P(x) = (x-c) Q(x)
  + r, where Q(x) has degree d-1 and r is a constant remainder.
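
  Division by (x-c) can be carried out mechanically with synthetic
  division, which is the long division above with the bookkeeping
  stripped away. The sketch below works over the integers; the name
  divide_by_linear is made up for this example.

```python
def divide_by_linear(coeffs, c):
    """Divide a polynomial by (x - c) using synthetic division.

    coeffs lists coefficients from highest degree to lowest.
    Returns (quotient_coeffs, remainder).
    """
    carried = []
    acc = 0
    for a in coeffs:
        acc = acc * c + a      # bring down the next coefficient
        carried.append(acc)
    # Every running value except the last is a quotient coefficient;
    # the last one is the remainder r.
    return carried[:-1], carried[-1]


quotient, remainder = divide_by_linear([4, -3, 7], 3)
print(quotient, remainder)  # [4, 9] 34
```

  The running values are exactly those of Horner's rule, so the
  remainder is also the value of the polynomial at c.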

  Now we can prove property 1.
    First, we prove that s is a root of P(x) iff (x-s) evenly divides
    P(x).
      We can always divide P(x) by (x-s) to get P(x) = (x-s) Q(x) + r.
      Substituting x = s, we get P(s) = 0 * Q(s) + r = r. If s is a
      root, then P(s) = 0 by definition, so r = 0 and (x-s) divides
      P(x).

      Similarly, if (x-s) divides P(x), we have P(x) = (x-s)
      Q(x). Substituting x = s, we see that P(s) = 0, so s is a root
      of P(x).

    Now suppose for the purposes of a contradiction that P(x) has
    degree d but has d+1 distinct roots s_1, ..., s_{d+1}. Then we
    know that (x-s_i) divides P(x) for each root s_i, so P(x) must be
    a multiple of (x-s_1)(x-s_2)...(x-s_{d+1}). (Note: We would need
    to use induction to prove this rigorously. See the course reader
    for a summary of the induction strategy.) Multiplying out those terms,
    we see that P(x) has degree at least d+1, which is a
    contradiction.
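
  Property 1 can be sanity-checked by brute force over a small field.
  The sketch below enumerates every polynomial of a given degree over
  GF(5) and reports the largest root count found. It is a check on
  small cases, not a substitute for the proof, and the helper names
  are hypothetical.

```python
from itertools import product


def poly_eval(coeffs, x, p):
    """Horner evaluation mod p; coeffs run from highest degree to lowest."""
    result = 0
    for a in coeffs:
        result = (result * x + a) % p
    return result


def max_root_count(degree, p):
    """Largest number of roots of any polynomial of exactly this degree over GF(p)."""
    best = 0
    for lead in range(1, p):  # leading coefficient must be nonzero
        for rest in product(range(p), repeat=degree):
            coeffs = (lead,) + rest
            count = sum(poly_eval(coeffs, x, p) == 0 for x in range(p))
            best = max(best, count)
    return best


for d in range(4):
    print(d, max_root_count(d, 5))  # the second number never exceeds d
```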

  Let's move on to property 2. We first show that there can be no more
  than one polynomial of degree at most d that passes through the
  given d+1 points (x_1, y_1), ..., (x_{d+1}, y_{d+1}) with distinct
  x_i.
    Suppose for the purposes of a contradiction that there are two
    polynomials P(x) and Q(x) of degree at most d, P(x) ≠ Q(x), that
    pass through the d+1 points. Let R(x) = P(x) - Q(x), R(x) ≠ 0
    since P(x) ≠ Q(x). What is the degree of R(x)? It can be no more
    than d. What is the value of R(x_i) for each i? Since P(x_i) =
    Q(x_i) = y_i, R(x_i) = 0. But this implies that the nonzero
    polynomial R(x), of degree at most d, has at least d+1 distinct
    roots x_1, ..., x_{d+1}, which is impossible by property 1.
    Contradiction.
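
  For a small field, the uniqueness claim can be checked by
  exhaustive search. The sketch below tries all 125 polynomials of
  degree at most 2 over GF(5) against three sample points chosen for
  illustration; the function names are made up for this example.

```python
from itertools import product


def poly_eval(coeffs, x, p):
    """Horner evaluation mod p; coeffs run from highest degree to lowest."""
    result = 0
    for a in coeffs:
        result = (result * x + a) % p
    return result


def polys_through(points, p):
    """All coefficient tuples of degree <= len(points) - 1 that hit every point mod p."""
    d = len(points) - 1
    return [coeffs for coeffs in product(range(p), repeat=d + 1)
            if all(poly_eval(coeffs, x, p) == y % p for x, y in points)]


print(polys_through([(1, 1), (2, 3), (3, 2)], 5))  # [(1, 4, 1)], i.e. x^2 + 4x + 1
```

  The search finds a single match, consistent with the argument that
  two different polynomials cannot share all d+1 points.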

  We've shown that there is at most one polynomial that passes through
  the given points. Now we need to show that one always exists. We
  will do so by developing a procedure to construct the exact
  polynomial that goes through those points.

  Before we continue, note that all our arguments so far are valid
  when working over the reals, the rationals, or GF(m) for prime m.
  We've only used addition, subtraction, multiplication, and division
  (by nonzeros), which are all defined on each of these sets.

Polynomial Interpolation
  Let's look at a specific example of constructing a polynomial. Since
  we've been learning about modular arithmetic and we like to work
  with small numbers, let's work in GF(5), i.e. modulo 5. (We will see
  that we only require addition, subtraction, multiplication, and
  division, so the procedure we develop will work equally well over
  the reals or the rationals.)

  Suppose we want to find the polynomial P(x) over GF(5) that passes
  through the points (1, 1), (2, 3), and (3, 2). How can we do so?

  Let's make the problem a little simpler. Let's try to find a
  polynomial that passes through (1, 1), (2, 0), and (3, 0). We see
  that 2 and 3 are roots of this polynomial, so let's try
    g_1(x) ≡ (x-2)(x-3)  (mod 5).
  What's g_1(1)? We get g_1(1) ≡ (-1)(-2) ≡ 2 (mod 5). Crud. This
  isn't equal to 1.

  How can we fix this? Let's divide g_1(x) by 2. Can we do so? Yes:
  5 is prime, so 2 has an inverse mod 5, namely 3, since 2 * 3 ≡ 1.
    Δ_1(x) ≡ 2^{-1} g_1(x) ≡ 3 g_1(x) ≡ 3(x-2)(x-3)  (mod 5).
  Plugging in 1, we get Δ_1(1) ≡ 3(-1)(-2) ≡ 6 ≡ 1 (mod 5), as
  required.
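
  In Python, the inverse used above can be computed with the built-in
  three-argument pow. This is a small sketch assuming Python 3.8 or
  newer for the negative exponent; delta_1 is a name chosen for this
  example.

```python
p = 5
inv2 = pow(2, -1, p)  # modular inverse of 2 mod 5; needs Python 3.8 or newer
print(inv2)           # 3, since 2 * 3 = 6 ≡ 1 (mod 5)


def delta_1(x):
    """Δ_1(x) ≡ 2^{-1} (x-2)(x-3) (mod 5)."""
    return (inv2 * (x - 2) * (x - 3)) % p


print([delta_1(x) for x in (1, 2, 3)])  # [1, 0, 0]
```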

  Now let's find a polynomial that goes through the points (1, 0),
  (2, 1), and (3, 0). Let's follow the same procedure.
    g_2(x) ≡ (x-1)(x-3)  (mod 5)
    g_2(2) ≡ (1)(-1) ≡ -1 ≡ 4 (mod 5)
    Δ_2(x) ≡ 4^{-1} g_2(x) ≡ 4 g_2(x) ≡ 4(x-1)(x-3)  (mod 5).
  How can we turn this into a polynomial that goes through (1, 0),
  (2, 3), and (3, 0)? By multiplying by 3!

  Now let's find a polynomial that goes through the points (1, 0),
  (2, 0), and (3, 1). Let's follow the same procedure.
    g_3(x) ≡ (x-1)(x-2)  (mod 5)
    g_3(3) ≡ (2)(1) ≡ 2 (mod 5)
    Δ_3(x) ≡ 2^{-1} g_3(x) ≡ 3 g_3(x) ≡ 3(x-1)(x-2)  (mod 5).
  How can we turn this into a polynomial that goes through (1, 0),
  (2, 0), and (3, 2)? By multiplying by 2!

  Notice that (with all operations mod 5)
    Δ_1(1) ≡ 1, Δ_1(2) ≡ 0, Δ_1(3) ≡ 0
    3Δ_2(1) ≡ 0, 3Δ_2(2) ≡ 3, 3Δ_2(3) ≡ 0
    2Δ_3(1) ≡ 0, 2Δ_3(2) ≡ 0, 2Δ_3(3) ≡ 2
  Then if we let P(x) ≡ Δ_1(x) + 3Δ_2(x) + 2Δ_3(x), we
  get
    P(1) ≡ 1, P(2) ≡ 3, P(3) ≡ 2,
  as desired. So
    P(x) ≡ 3(x-2)(x-3) + 12(x-1)(x-3) + 6(x-1)(x-2)  (mod 5)
         ≡ 3(x-2)(x-3) + 2(x-1)(x-3) + (x-1)(x-2)  (mod 5)
         ≡ 3x^2 - 15x + 18 + 2x^2 - 8x + 6 + x^2 - 3x + 2  (mod 5)
         ≡ x^2 + 4x + 1  (mod 5).
  You can verify that this does indeed go through the given points.
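
  The expansion above is easy to get wrong by hand, so here is a
  short sketch that multiplies out the three terms mod 5 and adds
  them. Coefficients are listed from highest degree to lowest, and
  the helper names poly_mul and scaled_product are invented for this
  check.

```python
P = 5


def poly_mul(a, b):
    """Multiply two polynomials mod P (coefficients highest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out


def scaled_product(scale, root_a, root_b):
    """Coefficients of scale * (x - root_a)(x - root_b) mod P."""
    prod = poly_mul([1, -root_a], [1, -root_b])
    return [(scale * c) % P for c in prod]


terms = [scaled_product(3, 2, 3),   # Δ_1(x)
         scaled_product(12, 1, 3),  # 3 Δ_2(x)
         scaled_product(6, 1, 2)]   # 2 Δ_3(x)
total = [sum(col) % P for col in zip(*terms)]
print(total)  # [1, 4, 1], i.e. x^2 + 4x + 1
```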

  Can you see how to generalize this? Given d+1 points (x_1, y_1),
  ..., (x_{d+1}, y_{d+1}), we construct Δ_i as follows:
    g_i(x) ≡ (x - x_1)...(x - x_{i-1})(x - x_{i+1})...(x - x_{d+1})
    Δ_i(x) ≡ (g_i(x_i))^{-1} g_i(x).
  Then we can define P(x) to be
    P(x) ≡ y_1 * Δ_1(x) + ... + y_{d+1} * Δ_{d+1}(x).
  This works whether we are working over the reals, rationals, or
  GF(m). This procedure is known as Lagrange interpolation.

  Since the degree of each of the deltas is d, the degree of P(x) is
  at most d (the leading terms may cancel when we add them up). Thus,
  we see that a polynomial P(x) of degree at most d does exist that
  goes through the d+1 given points. This completes our proof of
  property 2 above.
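
  The general procedure can be sketched over GF(p) as follows. This
  is one possible implementation under stated assumptions: p is
  prime, the x_i are distinct mod p, coefficients are listed from
  highest degree to lowest, and Python 3.8 or newer is available for
  pow with a negative exponent. The function names are hypothetical.

```python
def poly_mul(a, b, p):
    """Multiply two polynomials mod p (coefficients highest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


def poly_eval(coeffs, x, p):
    """Horner evaluation mod p."""
    result = 0
    for a in coeffs:
        result = (result * x + a) % p
    return result


def lagrange_interpolate(points, p):
    """Coefficients of the polynomial of degree <= len(points) - 1 through points, mod p."""
    total = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        g = [1]  # g_i(x), built one factor (x - x_j) at a time
        for j, (xj, _) in enumerate(points):
            if j != i:
                g = poly_mul(g, [1, -xj], p)
        scale = yi * pow(poly_eval(g, xi, p), -1, p)  # y_i * g_i(x_i)^{-1}
        total = [(t + scale * c) % p for t, c in zip(total, g)]
    return total


print(lagrange_interpolate([(1, 1), (2, 3), (3, 2)], 5))  # [1, 4, 1]
```

  The result [1, 4, 1] is x^2 + 4x + 1, the polynomial found by hand
  in the GF(5) example.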

Shamir Secret Sharing
  Back to secret sharing. Recall that we wanted to share a secret s
  among n people such that any k of them can recover s, k <= n. We can
  do so using polynomials.

  Let m be a large prime, m >> s,n (>> means "much larger than," not
  right shift!). Now pick a random polynomial P(x) of degree at most
  k-1 over GF(m) such that P(0) ≡ s (mod m), i.e. fix a_0 = s and
  choose a_1, ..., a_{k-1} uniformly at random mod m. Give P(1) to
  the first person, P(2) to the second, ..., P(n) to the nth person.
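
  Share generation might look like the sketch below. The name
  make_shares and the choice of 10007 as the prime are assumptions
  made for illustration. Every non-constant coefficient is drawn
  uniformly mod m, so the top coefficient may happen to be 0; the
  standard-library secrets module supplies the randomness.

```python
import secrets


def make_shares(secret, k, n, m):
    """Split secret into n shares so that any k recover it, working mod prime m.

    Returns a list of points (i, P(i)) for i = 1..n.
    """
    if not (0 <= secret < m and 1 <= k <= n < m):
        raise ValueError("need 0 <= secret < m and 1 <= k <= n < m")
    # coeffs[j] is the coefficient of x^j; the constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(m) for _ in range(k - 1)]
    shares = []
    for i in range(1, n + 1):
        value = sum(c * pow(i, j, m) for j, c in enumerate(coeffs)) % m
        shares.append((i, value))
    return shares


print(make_shares(1234, 3, 5, 10007))  # five (i, P(i)) pairs; values differ on each run
```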

  Does this scheme satisfy our requirements?

  First, can k people recover the secret? Indeed they can, since
  together they have k points, and P(x) has degree k-1. Thus, they can
  perform Lagrange interpolation to recover P(x) and hence P(0) ≡ s.

  Second, can k-1 people recover the secret? No, since they only have
  k-1 points, which do not determine a polynomial of degree k-1. In
  fact, for any possible secret s' (mod m), adding the point (0, s')
  to their k-1 points gives k points, so by property 2 there is a
  polynomial of degree at most k-1 consistent with s' as well as
  their shares. Thus, they actually know nothing about the original
  secret, even though they have k-1 out of the k points required to
  recover it!
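
  The secrecy argument can be illustrated over a tiny field. In the
  sketch below, two people hold hypothetical shares (1, 2) and
  (2, 2) over GF(7) with k = 3. For every candidate secret there is
  a consistent polynomial, and each candidate predicts a different
  value for a third share. The name interpolate_at is made up, and
  Python 3.8 or newer is assumed for the modular inverse.

```python
def interpolate_at(points, x, m):
    """Value at x of the polynomial of degree <= len(points) - 1 through points, mod prime m."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % m    # numerator of Δ_i evaluated at x
                den = den * (xi - xj) % m   # g_i(x_i)
        total = (total + yi * num * pow(den, -1, m)) % m
    return total


m = 7
known = [(1, 2), (2, 2)]  # two of the three shares needed when k = 3
for guess in range(m):
    # Each guess for the secret, together with the two known shares,
    # fixes a polynomial; print the share it would imply for person 4.
    print(guess, interpolate_at([(0, guess)] + known, 4, m))
```

  Since all seven guesses remain possible, the two shares alone rule
  nothing out.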

  This scheme is known as Shamir secret sharing, after Adi Shamir,
  the "S" in RSA.

  As an example, suppose we wanted to share the secret s = 1 among n =
  5 people such that k = 3 are necessary to recover s, working over
  GF(7). First, we find a random degree k-1 = 2 polynomial P(x) such
  that P(0) ≡ s (mod 7). Let's choose P(x) ≡ 3x^2 + 5x + 1. What would
  each person get?
    Person 1 gets P(1) ≡ 3 + 5 + 1 ≡ 2 (mod 7)
    Person 2 gets P(2) ≡ 12 + 10 + 1 ≡ 2 (mod 7)
    Person 3 gets P(3) ≡ 27 + 15 + 1 ≡ 1 (mod 7)
    Person 4 gets P(4) ≡ 48 + 20 + 1 ≡ 6 (mod 7)
    Person 5 gets P(5) ≡ 75 + 25 + 1 ≡ 3 (mod 7)
  Then if persons 3, 4, and 5 agree to recover the secret, they run
  Lagrange interpolation to get:
    g_3(x) ≡ (x-4)(x-5), g_3(3) ≡ 2, Δ_3(x) ≡ 4(x-4)(x-5) (mod 7)
    g_4(x) ≡ (x-3)(x-5), g_4(4) ≡ 6, Δ_4(x) ≡ 6(x-3)(x-5) (mod 7)
    g_5(x) ≡ (x-3)(x-4), g_5(5) ≡ 2, Δ_5(x) ≡ 4(x-3)(x-4) (mod 7)
  Then P(x) ≡ Δ_3(x) + 6Δ_4(x) + 3Δ_5(x)
    ≡ 3x^2 + 5x + 1 (mod 7).
  As you can see, they retrieve the original polynomial. Plugging in 0
  results in P(0) ≡ 1 = s.
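
  Since only P(0) is needed, the recovery step can evaluate the
  Lagrange formula at x = 0 directly instead of expanding the whole
  polynomial. A minimal sketch, with recover_secret as a hypothetical
  name and Python 3.8 or newer assumed for the modular inverse:

```python
def recover_secret(shares, m):
    """Evaluate the interpolating polynomial at x = 0, working mod prime m."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * (0 - xj) % m    # numerator of Δ_i evaluated at 0
                den = den * (xi - xj) % m   # g_i(x_i)
        secret = (secret + yi * num * pow(den, -1, m)) % m
    return secret


print(recover_secret([(3, 1), (4, 6), (5, 3)], 7))  # 1, from persons 3, 4, and 5
print(recover_secret([(1, 2), (2, 2), (4, 6)], 7))  # 1, any three shares work
```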