Administrative info

  - PA1 due tomorrow!
  - No lecture or section on Monday
  - Office hours on Monday?

Review

We have now seen how to perform modular arithmetic and one of its applications, public key encryption using RSA. We saw how to encode a message so that only the receiver could decode it. Today, we will see how to share a secret message among n people such that k of them need to be present in order to recover it.

Secret Sharing Schemes

Suppose we have a bank vault that requires a 4-digit PIN to open it. For extra security, we want the bank's president and chairman to both agree before it can be opened. How do we divide the combination so that the two of them must cooperate to open it, but neither can open it alone?

Here are our goals for a secret sharing scheme:

  1) Robustness. It works, i.e. if the president and chairman cooperate, they must always be able to open the vault.
  2) Secrecy. Neither person should be able to tell anything about the PIN by looking at their piece of it.
  3) Minimality. We want to minimize the amount of information that either person needs to memorize.

Here's a simple scheme: give the president the first two digits and the chairman the last two digits. What do you think about this scheme?

Unfortunately, we compromise on secrecy here. There are 10000 possible combinations, but since each person knows two digits, they only have to guess the other two, requiring at most 100 guesses. This is much less secure.

Suppose instead that, if the PIN is some number s, we give the president a natural number r and the chairman a natural number t such that s = r + t. Again, we compromise on secrecy. Suppose r = 8000. Then the president knows that s must lie in the range [8000, 9999].

How can we fix the problem? (We don't want to use negative numbers, since we're not negative people. We like to think positive.)

Modular arithmetic! We pick r at random mod 10000 and choose t such that s ≡ r + t (mod 10000). Does this scheme satisfy our secrecy requirement?
It does: since the president only knows r, no matter what s is, there is some value t such that s ≡ r + t (mod 10000), namely t ≡ s - r (mod 10000). Since any value of s is consistent with the information he has, he can't tell anything about s. The same argument applies to the chairman, who only knows t.

What if we had three people, say the president, chairman, and vice-president? Well, we could just pick three numbers mod 10000 that add to s. But what if we only required that two out of three be present to open the vault?

In general, we want to share a secret among n people such that k of them need to cooperate to recover the secret, k <= n. Our scheme does not allow us to do so.

Polynomials

Let's take a detour and review polynomials before taking another stab at constructing a secret sharing scheme.

Recall that a polynomial is represented as

    P(x) = a_d x^d + a_{d-1} x^{d-1} + ... + a_1 x + a_0.

Such a polynomial has degree d if a_d ≠ 0.

    Ex: P(x) = x + 1
        P(x) = 3x^2 - 2
        P(x) = 4

The coefficients a_i can be real numbers, rationals, or integers mod m, where m is prime. In the last case, we say we are working over a "finite field," also called a "Galois field," denoted by GF(m), where m is the prime modulus.

A root of a polynomial is a value s such that P(s) = 0.

    Ex: P(x) = x + 1              s = -1
        P(x) = 3x^2 - 2           s = +/- sqrt{2/3}
        P(x) = 4                  no roots!
        P(x) ≡ x + 1 (mod 5)      s ≡ 4 (mod 5)

There are two very important properties of polynomials.

  1) Any nonzero degree d polynomial P(x) has at most d distinct roots.
  2) Given d+1 points (x_1, y_1), ..., (x_{d+1}, y_{d+1}) with all x_i distinct, there is exactly one polynomial of degree <= d that goes through all d+1 points.

Property 2 is a generalization of the maxim "2 points determine a line." A line is a polynomial of degree at most 1, so the maxim is the d = 1 case of the property.

Before we get to proving these properties, let's review how to divide one polynomial by another. Let's consider (4x^2 - 3x + 7) / (x - 3).
                  4x +  9     r 34
          -----------------
    x - 3 ) 4x^2 -  3x +  7
            4x^2 - 12x
            ----------
                    9x +  7
                    9x - 27
                    -------
                         34

So we see that 4x^2 - 3x + 7 = (x-3)(4x+9) + 34.

We can divide polynomials in general, but we are really only interested in dividing by binomials of the form (x-c). If we divide a degree d polynomial P(x) by (x-c), we can write P(x) = (x-c) Q(x) + r, where Q(x) has degree d-1 and r is a constant remainder.

Now we can prove property 1. First, we prove that s is a root of P(x) iff (x-s) evenly divides P(x). We can always divide P(x) by (x-s) to get P(x) = (x-s) Q(x) + r. Substituting x = s, we get P(s) = 0 * Q(s) + r = r. If s is a root, then P(s) = 0 by definition, so r = 0 and (x-s) divides P(x). Conversely, if (x-s) divides P(x), we have P(x) = (x-s) Q(x). Substituting x = s, we see that P(s) = 0, so s is a root of P(x).

Now suppose for the purposes of a contradiction that P(x) has degree d but has d+1 distinct roots s_1, ..., s_{d+1}. Then we know that (x-s_i) divides P(x) for each root s_i, so P(x) must be a multiple of (x-s_1)(x-s_2)...(x-s_{d+1}). (Note: We would need to use induction to prove this rigorously. See the course reader for a summary of the induction strategy.) Multiplying out those terms, we see that P(x) has degree at least d+1, which is a contradiction.

Let's move on to property 2. We first show that there can be no more than one polynomial of degree <= d that passes through the given d+1 points (x_1, y_1), ..., (x_{d+1}, y_{d+1}) with distinct x_i. Suppose for the purposes of a contradiction that there are two polynomials P(x) and Q(x) of degree at most d, P(x) ≠ Q(x), that pass through the d+1 points. Let R(x) = P(x) - Q(x); R(x) ≠ 0 since P(x) ≠ Q(x). What is the degree of R(x)? It can be no more than d. What is the value of R(x_i) for each i? Since P(x_i) = Q(x_i) = y_i, R(x_i) = 0. But this implies that the nonzero polynomial R(x) has at least d+1 distinct roots x_1, ..., x_{d+1}, which is impossible by property 1. Contradiction.

We've shown that there is at most one polynomial of degree <= d that passes through the given points.
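The division by (x-c) used in the proof of property 1 can be checked mechanically. Below is a minimal sketch in Python, assuming coefficients are listed from highest degree to lowest; the function name divide_by_linear and the optional modulus argument m are illustrative choices, not part of the lecture. It uses synthetic division, a standard shortcut for the long division shown above.

```python
def divide_by_linear(coeffs, c, m=None):
    """Divide P(x) by (x - c) using synthetic division.

    coeffs lists the coefficients of P from highest degree to lowest.
    If m is given, all arithmetic is done modulo m.
    Returns (quotient_coeffs, remainder); the remainder equals P(c).
    """
    running = []
    carry = 0
    for a in coeffs:
        # Bring down the next coefficient and fold in c times the previous value.
        carry = carry * c + a
        if m is not None:
            carry %= m
        running.append(carry)
    # The last value is the remainder; the rest are the quotient's coefficients.
    return running[:-1], running[-1]


# The worked example: 4x^2 - 3x + 7 = (x - 3)(4x + 9) + 34
print(divide_by_linear([4, -3, 7], 3))      # ([4, 9], 34)

# Over GF(5), 4 is a root of x + 1, so the remainder is 0
print(divide_by_linear([1, 1], 4, m=5))     # ([1], 0)
```

A remainder of 0 is exactly the statement that c is a root, which matches the "s is a root iff (x-s) divides P(x)" step of the proof.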
Now we need to show that one always exists. We will do so by developing a procedure to construct the exact polynomial that goes through those points.

Before we continue, note that all our arguments so far are valid when working over the reals, the rationals, or GF(m) for prime m. We've only used addition, subtraction, multiplication, and division (by nonzeros), which are all defined on each of these sets.

Polynomial Interpolation

Let's look at a specific example of constructing a polynomial. Since we've been learning about modular arithmetic and we like to work with small numbers, let's work in GF(5), i.e. modulo 5. (We will see that we only require addition, subtraction, multiplication, and division, so the procedure we develop will work equally well over the reals or the rationals.)

Suppose we want to find the polynomial P(x) over GF(5) that passes through the points (1, 1), (2, 3), and (3, 2). How can we do so?

Let's make the problem a little simpler. Let's try to find a polynomial that passes through (1, 1), (2, 0), and (3, 0). We see that 2 and 3 are roots of this polynomial, so let's try

    g_1(x) ≡ (x-2)(x-3) (mod 5).

What's g_1(1)? We get

    g_1(1) ≡ (-1)(-2) ≡ 2 (mod 5).

Crud. This isn't equal to 1. How can we fix this? Let's divide g_1(x) by 2. Can we do so? Yes: 2 has an inverse mod 5, namely 3, since 2 * 3 = 6 ≡ 1 (mod 5).

    Δ_1(x) ≡ 2^{-1} g_1(x) ≡ 3 g_1(x) ≡ 3(x-2)(x-3) (mod 5).

Plugging in 1, we get Δ_1(1) ≡ 3(-1)(-2) ≡ 6 ≡ 1 (mod 5), as required.

Now let's find a polynomial that goes through the points (1, 0), (2, 1), and (3, 0). Let's follow the same procedure.

    g_2(x) ≡ (x-1)(x-3) (mod 5)
    g_2(2) ≡ (1)(-1) ≡ -1 ≡ 4 (mod 5)
    Δ_2(x) ≡ 4^{-1} g_2(x) ≡ 4 g_2(x) ≡ 4(x-1)(x-3) (mod 5).

How can we turn this into a polynomial that goes through (1, 0), (2, 3), and (3, 0)? By multiplying by 3!

Now let's find a polynomial that goes through the points (1, 0), (2, 0), and (3, 1). Let's follow the same procedure.

    g_3(x) ≡ (x-1)(x-2) (mod 5)
    g_3(3) ≡ (2)(1) ≡ 2 (mod 5)
    Δ_3(x) ≡ 2^{-1} g_3(x) ≡ 3 g_3(x) ≡ 3(x-1)(x-2) (mod 5).
How can we turn this into a polynomial that goes through (1, 0), (2, 0), and (3, 2)? By multiplying by 2!

Notice that (with all operations mod 5)

     Δ_1(1) ≡ 1,     Δ_1(2) ≡ 0,     Δ_1(3) ≡ 0
    3Δ_2(1) ≡ 0,    3Δ_2(2) ≡ 3,    3Δ_2(3) ≡ 0
    2Δ_3(1) ≡ 0,    2Δ_3(2) ≡ 0,    2Δ_3(3) ≡ 2

Then if we let P(x) ≡ Δ_1(x) + 3Δ_2(x) + 2Δ_3(x), we get P(1) ≡ 1, P(2) ≡ 3, P(3) ≡ 2, as desired. So

    P(x) ≡ 3(x-2)(x-3) + 12(x-1)(x-3) + 6(x-1)(x-2)          (mod 5)
         ≡ 3(x-2)(x-3) + 2(x-1)(x-3) + (x-1)(x-2)            (mod 5)
         ≡ 3x^2 - 15x + 18 + 2x^2 - 8x + 6 + x^2 - 3x + 2    (mod 5)
         ≡ x^2 + 4x + 1                                      (mod 5).

You can verify that this does indeed go through the given points.

Can you see how to generalize this? Given d+1 points (x_1, y_1), ..., (x_{d+1}, y_{d+1}) with all x_i distinct, we construct Δ_i as follows:

    g_i(x) ≡ (x - x_1)...(x - x_{i-1})(x - x_{i+1})...(x - x_{d+1})
    Δ_i(x) ≡ (g_i(x_i))^{-1} g_i(x).

The inverse exists because the x_i are distinct, so g_i(x_i) is a product of nonzero terms. Then we can define P(x) to be

    P(x) ≡ y_1 * Δ_1(x) + ... + y_{d+1} * Δ_{d+1}(x).

This works whether we are working over the reals, rationals, or GF(m). This procedure is known as Lagrange interpolation.

Since the degree of each of the deltas is d, the degree of P(x) is at most d. Thus, we see that a polynomial P(x) of degree <= d does exist that goes through the d+1 given points. This completes our proof of property 2 above.

Shamir Secret Sharing

Back to secret sharing. Recall that we wanted to share a secret s among n people such that any k of them can recover s, k <= n. We can do so using polynomials.

Let m be a large prime, m >> s,n (>> means "much larger than," not right shift!). Now pick a random polynomial P(x) of degree at most k-1 such that P(0) ≡ s (mod m). Give P(1) to the first person, P(2) to the second, ..., P(n) to the nth person.

Does this scheme satisfy our requirements? First, can k people recover the secret? Indeed they can, since together they have k points, and P(x) has degree at most k-1. Thus, they can perform Lagrange interpolation to recover P(x) and hence P(0) ≡ s.

Second, can k-1 people recover the secret?
No, since they only have k-1 points, which are not enough to determine a polynomial of degree k-1. In fact, for any possible secret s' (mod m), they can construct a polynomial that is consistent with s' as well as their k-1 points, by interpolating through their k-1 points together with the point (0, s'). Thus, they actually know nothing about the original secret, even though they have k-1 out of the k points required to recover it!

This scheme is known as Shamir secret sharing, after Shamir of RSA fame.

As an example, suppose we wanted to share the secret s = 1 among n = 5 people such that k = 3 are necessary to recover s, working over GF(7). First, we find a random degree k-1 = 2 polynomial P(x) such that P(0) ≡ s (mod 7). Let's choose P(x) ≡ 3x^2 + 5x + 1. What would each person get?

    Person 1 gets P(1) ≡  3 +  5 + 1 ≡ 2 (mod 7)
    Person 2 gets P(2) ≡ 12 + 10 + 1 ≡ 2 (mod 7)
    Person 3 gets P(3) ≡ 27 + 15 + 1 ≡ 1 (mod 7)
    Person 4 gets P(4) ≡ 48 + 20 + 1 ≡ 6 (mod 7)
    Person 5 gets P(5) ≡ 75 + 25 + 1 ≡ 3 (mod 7)

Then if persons 3, 4, and 5 agree to recover the secret, they run Lagrange interpolation to get:

    g_3(x) ≡ (x-4)(x-5),  g_3(3) ≡ 2,  Δ_3(x) ≡ 4(x-4)(x-5) (mod 7)
    g_4(x) ≡ (x-3)(x-5),  g_4(4) ≡ 6,  Δ_4(x) ≡ 6(x-3)(x-5) (mod 7)
    g_5(x) ≡ (x-3)(x-4),  g_5(5) ≡ 2,  Δ_5(x) ≡ 4(x-3)(x-4) (mod 7)

Then P(x) ≡ Δ_3(x) + 6Δ_4(x) + 3Δ_5(x) ≡ 3x^2 + 5x + 1 (mod 7). As you can see, they retrieve the original polynomial. Plugging in 0 results in P(0) ≡ 1 = s.
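The whole scheme can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a production implementation: the function names (eval_poly, make_shares, interpolate_at, recover_secret) are made up for this example, the modulus m is assumed to be prime, and inverses are computed as a^(m-2) mod m using Fermat's little theorem. The random module is used for simplicity; it is not cryptographically secure, so a real deployment would want a secure source such as the secrets module.

```python
import random


def eval_poly(coeffs, x, m):
    """Evaluate a_0 + a_1 x + ... + a_d x^d mod m (coeffs lowest degree first)."""
    result = 0
    for a in reversed(coeffs):
        result = (result * x + a) % m
    return result


def make_shares(secret, k, n, m, rng=random):
    """Return shares (i, P(i)) for i = 1..n, where P is random with P(0) = secret."""
    # a_0 is the secret; a_1 .. a_{k-1} are chosen at random mod m.
    coeffs = [secret % m] + [rng.randrange(m) for _ in range(k - 1)]
    return [(i, eval_poly(coeffs, i, m)) for i in range(1, n + 1)]


def interpolate_at(points, x, m):
    """Lagrange interpolation: value at x of the polynomial through points, mod prime m."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = (num * (x - xj)) % m    # builds g_i(x)
                den = (den * (xi - xj)) % m   # builds g_i(x_i)
        inv = pow(den, m - 2, m)              # (g_i(x_i))^{-1}, valid since m is prime
        total = (total + yi * num * inv) % m  # adds y_i * Δ_i(x)
    return total


def recover_secret(shares, m):
    """Recover P(0) from k shares."""
    return interpolate_at(shares, 0, m)


# The GF(7) example: persons 3, 4, and 5 hold the shares 1, 6, and 3.
print(recover_secret([(3, 1), (4, 6), (5, 3)], 7))   # 1
```

Note that recover_secret evaluates the interpolating polynomial directly at 0 rather than expanding its coefficients, since only P(0) is needed to recover the secret.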