CSCE 590H -- Introduction to Cryptography I
Spring 2003
Stephen Fenner

Please note: These notes are primarily to myself, so they may be less than clear at times. I use some math conventions (borrowed from LaTeX):
  ^ -- means what follows is a superscript
  _ -- means what follows is a subscript (it also may mean underline)
  {..} -- used for grouping more than one character

2003/01/13

Basic definitions: cryptography, cryptanalysis, cryptology, confidentiality, integrity, authentication, encryption (enciphering), decryption (deciphering), plaintext (cleartext), ciphertext, secure/insecure channel, keys, Alice, Bob and Oscar.

Definition of a symmetric cipher (cryptosystem): P (plaintext space), C (ciphertext space), and K (key space) are arbitrary sets, with two functions
  e : K x P -> C ("encryption function")
  d : K x C -> P ("decryption function")
where we usually write e_k(x) for e(k,x), making e_k : P -> C for all k in K, and similarly, write d_k(y) for d(k,y), making d_k : C -> P for all k in K.

"Soundness" condition: for all x in P and k in K,
  d_k(e_k(x)) = x

Shift cipher (Caesar cipher if k=3): P = C = K = {0,...,25} (identify with letters of the alphabet),
  e_k(x) := (x + k) mod 26
  d_k(y) := (y - k) mod 26
Check soundness condition:
  d_k(e_k(x)) = d_k((x+k) mod 26) = (((x+k) mod 26) - k) mod 26 = (x + k - k) mod 26 = x mod 26 = x.
Note that we will define the remainder of a mod operation to be always nonnegative. (Important Rule: if an expression involves only +, -, and x, and its last operation is "mod n" for some n, then "mod n" can be inserted anywhere in the expression without changing its value.)

Apply e_3 to "attackgalatiaatdawn" (one letter at a time).

Problem 1: this is easy to break -- try all 26 keys to decrypt. With high probability, for exactly one key the English cleartext is recognizable. The key space is too small.
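The shift cipher and its exhaustive-search weakness can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and the convention of lowercase plaintext and uppercase ciphertext is an assumption made for readability.

```python
def shift_encrypt(plaintext: str, k: int) -> str:
    """e_k applied letter by letter: lowercase plaintext -> uppercase ciphertext."""
    return "".join(chr((ord(c) - ord("a") + k) % 26 + ord("A")) for c in plaintext)


def shift_decrypt(ciphertext: str, k: int) -> str:
    """d_k applied letter by letter: uppercase ciphertext -> lowercase plaintext."""
    # Python's % always returns a nonnegative remainder, matching the notes' convention.
    return "".join(chr((ord(c) - ord("A") - k) % 26 + ord("a")) for c in ciphertext)


def all_decryptions(ciphertext: str) -> list[str]:
    """Exhaustive key search: the candidate plaintext for each of the 26 keys."""
    return [shift_decrypt(ciphertext, k) for k in range(26)]


if __name__ == "__main__":
    y = shift_encrypt("attackgalatiaatdawn", 3)
    print(y)
    for k, candidate in enumerate(all_decryptions(y)):
        print(k, candidate)
```

Scanning the 26 printed candidates by eye is normally enough to spot the one that reads as English.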
Substitution cipher: P = C = {0,...,25}, K = { pi | pi is a permutation on P }
(A permutation on a set S is a map pi : S -> S that is one-to-one and onto.)
  e_{pi}(x) := pi(x)
  d_{pi}(y) := pi^{-1}(y)
|K| = 26!, which is about 2^{88}. Exhaustive search of the key space is not feasible, but the cipher can still be broken by frequency analysis (letter frequencies in English are not uniform; guess that the most frequently occurring ciphertext letter decrypts to 'e', etc.; can also take advantage of nonuniformity of digraphs in English).

Problem 2: Frequency analysis works for a long enough message. P and C are too small, so the message must be chopped up into pieces too small to be uniformly distributed.

2003/01/15

Notation: for integers n>1, let Z_n = {0,...,n-1} ("the integers modulo n"). We can define addition and multiplication operations on Z_n by
  a + b := (a + b) mod n
  a * b := (a * b) mod n
Both operations are closed, commutative, associative. 0 is the additive identity, and 1 is the multiplicative identity. The additive inverse of a is denoted -a and is either 0 (if a=0) or n-a (if a!=0). Define subtraction a - b as a + (-b). The distributive law holds.

Some examples: 7 + 11, 7 - 11, 7 * 11, -7, -11 in Z_{17}.

For a,b in Z (integers), and m an integer > 1, define a === b (mod m) to mean that a mod m = b mod m, or equivalently, m divides a-b (m | a-b). Usually we fix m and consider this as a binary relation on Z. It is an equivalence relation with set of representatives Z_m.

Shift cipher is a special case of substitution cipher. Another special case:

Affine cipher: P = C = Z_{26} as before.
  e_k(x) = (ax + b) mod 26,
where k = (a,b) and a,b in Z_{26}. For soundness, must have e_k be one-to-one. True for some values of a but not others. Analyze.

Define gcd, relatively prime (coprime), Z_m^*: For integers a and b, not both 0, define gcd(a,b) to be the largest integer dividing both a and b (the greatest common divisor).
[Note: "x divides y" (written "x|y") means formally that there is an integer z such that xz = y.] If gcd(a,b) = 1, we say that a and b are _relatively_prime_ or _coprime_. For integer m>0, we define Z_m^* to be the set of elements of Z_m that are coprime to m.

Example: Z_{26}^* = {1,3,5,7,9,11,15,17,19,21,23,25}.

2003/01/20 no class (MLK Day)

2003/01/22

FACT: Z_m^* is closed under multiplication in Z_m.

PROOF: let a and b be any elements of Z_m^*, and let p = ab (in Z). p is coprime to m, because a and b are, so gcd(m,p) = 1. But by the exercise on the first homework problem set, we can add to p any multiple of m without changing the gcd. That is, gcd(m,p+km) = gcd(m,p) for any integer k. Since p mod m is of the form p+km for some k (zero or negative), we have
  gcd(m,ab mod m) = gcd(m,p mod m) = gcd(m,p) = 1.
Thus, ab mod m is coprime to m. This just means that the product of a and b in Z_m is also in Z_m^*. Since this is true for any a and b in Z_m^*, Z_m^* is closed under multiplication in Z_m (i.e., multiplication mod m). QED

Multiplicative inverses in Z_m

FACT: for any a in Z_m, a has a multiplicative inverse in Z_m (an element b in Z_m such that ab === 1 (mod m)) if and only if a is in Z_m^*. Further, if a does have a multiplicative inverse, it is also in Z_m^* and is unique.

PROOF: for any integer b, gcd(m,a) \leq gcd(m,ab) (multiplication in Z), because any common divisor of m and a also divides ab. But gcd(m,ab) = gcd(m,ab mod m). Thus if a is not in Z_m^*, then 1 < gcd(m,a) \leq gcd(m,ab mod m), so there is no b such that ab mod m = 1, thus a has no multiplicative inverse if a is not in Z_m^*.

To prove the converse, fix a in Z_m^* and let f : Z_m^* -> Z_m^* be the map defined as f(x) = ax mod m. (f maps Z_m^* into Z_m^* because Z_m^* is closed under multiplication mod m.) We first show that f is one-to-one (injective). Let x and y be any elements of Z_m^* such that f(x) = f(y). So we have ax === ay (mod m), that is, m|a(x-y).
But since m and a have no common factors, it must be the case that m|x-y, that is, x === y (mod m). Since 0 \leq x,y < m, this means that x=y. Thus f is injective. Now, since f maps a finite set injectively into itself, it must also be surjective (onto) by the pigeon-hole principle. In particular, there is some element b in Z_m^* such that f(b) = 1. Thus, ab = 1 in Z_m^*, and b is unique in satisfying this equation, because f is one-to-one. QED

We let a^{-1} denote the inverse of a, for a in Z_m^*. The proof above does not provide an easy way of calculating a^{-1}. We'll show one later.

Vigen\`ere cipher: Let m>0 be an integer. Define P = C = K = (Z_{26})^m. For k = (k_1,...,k_m), define
  e_k(x_1,...,x_m) = (x_1+k_1,...,x_m+k_m)
  d_k(y_1,...,y_m) = (y_1-k_1,...,y_m-k_m).
Not secure. Do frequency analysis on every mth letter. How to determine m?

Affine-Hill cipher: combines affine cipher with Vigen\`ere cipher: Let m>0 be an integer. Define P = C = (Z_{26})^m. Define encryption as
  e_k(x_1,...,x_m) = (x_1 ... x_m)M + (b_1 ... b_m),
where M is an m-by-m matrix with entries in Z_{26}, and b_1,...,b_m are in Z_{26}. Here, all arithmetic on vector and matrix components is done in Z_{26}, that is, modulo 26. To decrypt, we must solve the equation
  (y_1 ... y_m) = (x_1 ... x_m)M + (b_1 ... b_m)
for the vector (x_1 ... x_m). We have
  (x_1 ... x_m)M = ((y_1 - b_1) ... (y_m - b_m)).
Note: the _Hill_cipher_ is the special case where (b_1,...,b_m) = (0,...,0).

We can solve for (x_1 ... x_m) if and only if we can multiply both sides of the equation with the inverse of M, that is, iff M has an inverse (a matrix N over Z_{26} such that MN = NM = I, where I is the m-by-m identity matrix over Z_{26}). To see whether M has an inverse, we consider the cofactor matrix M', whose (i,j)th entry is (-1)^{i+j}c_{i,j}, where c_{i,j} is the determinant (computed in Z_{26}) of the submatrix of M obtained by removing the i'th _column_ and j'th _row_ of M.
The standard formula
  MM' = M'M = det(M) I
holds here when doing arithmetic in Z_{26} just as it does with arithmetic in Z, so we can invert M if det(M) has an inverse in Z_{26}, that is, det(M) is in Z_{26}^*, in which case we set
  N = M^{-1} := det(M)^{-1} M',
where the inverse of det(M) is taken modulo 26. Conversely, for any two matrices A and B over Z_{26}, the formula det(AB) = det(A)det(B) holds when arithmetic is done modulo 26, so if M has an inverse N, then
  det(M)det(N) = det(MN) = det(I) = 1,
so det(M) must be invertible modulo 26. So we've established that a square matrix is invertible over Z_{26} (and is thus part of a legitimate Hill cipher key) iff its determinant is invertible in Z_{26}.

Of course, there is really nothing special about the number 26 in the results above. These same results hold modulo n for any integer n>0. The only qualitative difference between different values of n lies in which elements of Z_n are invertible in Z_n. If n is prime, then all nonzero elements of Z_n are invertible (i.e., Z_n^* = {1,...,n-1}). The Hill cipher over Z_n for prime n is better in some sense than if n is not prime, because if n is prime then all matrices with nonzero determinant are invertible.

So for prime p, we have the following laws of arithmetic in Z_p:
  1. + is commutative and associative
  2. * is commutative and associative
  3. * distributes over +
  4. there is an additive identity (denoted 0 or zero)
  5. there is a multiplicative identity (denoted 1 or one)
  6. every element a has an additive inverse (denoted -a, the negation of a)
  7. every nonzero element a has a multiplicative inverse (denoted a^{-1} or 1/a, also called the reciprocal of a)
Any set F with two binary operations +,* satisfying the seven axioms above is called a _field_. The real numbers R and the rational numbers Q (and the complex numbers C) with the usual + and * operations are fields. Now we've seen that Z_p is also a field, where + and * are modulo p.
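As a small check of the invertibility criterion for Hill cipher keys, here is a sketch restricted to 2-by-2 matrices over Z_{26}. The function names and the sample matrix are illustrative choices of mine, and `pow(d, -1, N)` assumes Python 3.8 or later.

```python
from math import gcd

N = 26  # modulus; nothing below is specific to 26


def det2(M):
    """Determinant of a 2x2 matrix, computed in Z_N."""
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % N


def inverse2(M):
    """Inverse of a 2x2 matrix over Z_N via N = det(M)^{-1} M'."""
    d = det2(M)
    if gcd(d, N) != 1:
        raise ValueError("det(M) is not in Z_N^*, so M is not invertible")
    d_inv = pow(d, -1, N)  # modular inverse (Python 3.8+)
    # For a 2x2 matrix, M' swaps the diagonal and negates the off-diagonal.
    adj = [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]
    return [[(d_inv * entry) % N for entry in row] for row in adj]


def vec_times_matrix(x, M):
    """Row vector x times matrix M over Z_N (Hill encryption with b = 0)."""
    return [sum(x[i] * M[i][j] for i in range(2)) % N for j in range(2)]


if __name__ == "__main__":
    M = [[11, 8], [3, 7]]            # det = 53 mod 26 = 1, so invertible
    M_inv = inverse2(M)
    y = vec_times_matrix([9, 20], M)
    print(M_inv, y, vec_times_matrix(y, M_inv))
```

Decryption is just multiplication by the inverse matrix, so the last value printed should be the original vector.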
Almost all the facts of algebra you learned in High School hold in any field. Also, most linear algebra facts hold when vector and matrix components are elements of any field and where arithmetic operations are done in the field. For example, it holds for any field F that a square matrix (over F) is invertible iff its determinant (computed in F) is nonzero (in F).

Euler totient function

For integer m>0, the _Euler_totient_function_ phi(m) is defined to be |Z_m^*|, that is, the number of elements in Z_m that are coprime to m. If m is a prime power, that is, m = p^k for some prime p and some k >= 1, then clearly, Z_m^* contains exactly the elements of Z_m that are not multiples of p. There are m - p^{k-1} of these, so
  phi(m) = m - p^{k-1} = p^{k-1}(p - 1).
We'll show later that if a,b>0 are coprime, then
  phi(ab) = phi(a)phi(b).
These two facts allow us to compute phi(m) given the prime factorization of m. Suppose m = p_1^{e_1} ... p_n^{e_n}, where p_1 < ... < p_n are prime and e_1,...,e_n > 0. Then the facts about phi above immediately yield the formula:
  phi(m) = product_{i=1}^n p_i^{e_i-1}(p_i - 1).

2003/01/27

Permutation cipher: Fix m > 1. Let P = C = Z_{26}^m, and let K = S_m. [For any n>0, S_n is defined as the set of permutations on {1,...,n}; it is called the _symmetric_group_ on {1,...,n}.] We encrypt with key pi as follows:
  e_{pi}(x_1,...,x_m) = (x_{pi(1)},...,x_{pi(m)})
and decrypt using rho = pi^{-1}:
  d_{pi}(y_1,...,y_m) = (y_{rho(1)},...,y_{rho(m)}).
So we don't _substitute_ letters, we just rearrange them. This is really just a special case of the Hill cipher where the matrix is a permutation matrix. [A square matrix is a permutation matrix if its entries are all 0 and 1, and every row and column has exactly one 1. A permutation matrix permutes the coordinates in a vector.]

Stream ciphers

So far, we encrypt a long message x_1x_2x_3... by
  e_k(x_1x_2x_3...) = e_k(x_1)e_k(x_2)e_k(x_3)....
These ciphers are called _block_ciphers_.
Instead, we may use k to produce a _key_stream_ z_1,z_2,..., where each z_i is in K, and encrypt by
  e_k(x_1x_2...) = e_{z_1}(x_1)e_{z_2}(x_2)....
This is called a _stream_cipher_. In general, there is a rule that computes each z_i in terms of k, i, earlier z_j's, and perhaps the plaintext up through x_i (i.e., x_1...x_i). There are also encryption and decryption functions e and d, as usual.

A block cipher is a special case of a stream cipher, where k = z_1 = z_2 = .... Sometimes, though, it is easier to view a block cipher as a stream cipher with smaller P and K. For example, the Vigen\`ere cipher can be viewed as a stream cipher, where K = Z_{26}^m, and P = C = Z_{26} as follows: given a key k = (k_1,...,k_m), the rule for z_i is
  z_i = if i <= m then k_i else z_{i-m}.
This stream cipher is _periodic_ with period m. It is also _synchronous_, that is, the keystream only depends on the key and not anything else, such as the plaintext. Next is an asynchronous cipher.

Autokey cipher: P = C = K = Z_{26}. Given a key k and plaintext message x_1x_2..., the rule for z_i is
  z_1 = k,
  z_{i+1} = x_i, for i>=1.
Encrypting is as with the shift cipher. To decrypt the ciphertext y_1y_2... corresponding to the plaintext x_1x_2..., we first compute x_1 = (y_1 - k) mod 26, then x_2 = (y_2 - x_1) mod 26, etc.

LFSRs

A Linear Feedback Shift Register (LFSR) cipher is a stream cipher over single bits. Fix m > 1. Let P = C = Z_2. A key consists of a fixed vector (c_0,...,c_{m-1}) of _coefficients_ in Z_2, and an _initial_ vector (k_1,...,k_m). The keystream z_1z_2... is defined by
  z_i = k_i if i <= m,
  z_{i+m} = ( sum_{j=0}^{m-1} c_jz_{i+j} ) mod 2, for i >= 1.
Each z_i is in Z_2. For encryption and decryption, we have
  e_{z_i}(x_i) = (x_i + z_i) mod 2,
  d_{z_i}(y_i) = (y_i + z_i) mod 2.
If the coefficients c_i are chosen correctly then, for _any_ nonzero initial vector, the cipher is periodic with period 2^m - 1.
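The LFSR recurrence above is easy to simulate. The sketch below is only an illustration: the function names are mine, and the sample coefficients (1,1,0,0) with initial vector (1,0,0,0) are chosen because m = 4 keeps the full period 2^4 - 1 = 15 visible.

```python
def lfsr_keystream(coeffs, init, n):
    """First n keystream bits z_1..z_n for coefficients (c_0..c_{m-1})
    and initial vector (k_1..k_m)."""
    m = len(coeffs)
    z = list(init)
    while len(z) < n:
        window = z[-m:]  # the bits z_i, ..., z_{i+m-1}
        z.append(sum(c * b for c, b in zip(coeffs, window)) % 2)
    return z[:n]


def xor_with_keystream(bits, keystream):
    """Encryption and decryption are the same operation: add mod 2."""
    return [(b + z) % 2 for b, z in zip(bits, keystream)]


if __name__ == "__main__":
    z = lfsr_keystream((1, 1, 0, 0), (1, 0, 0, 0), 30)
    print(z[:15])
    print(z[15:])  # repeats the first 15 bits
```

With these coefficients the recurrence is z_{i+4} = (z_i + z_{i+1}) mod 2, and the output repeats after 15 bits.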
LFSRs can be implemented in hardware very efficiently (the c_i are fixed in this case). (Draw picture.)

Some More Cryptanalysis

Four most common attack models (based on what resources Oscar has). In each case, Oscar is trying to find the key:
  Ciphertext only attack: Oscar has some ciphertext.
  Known plaintext attack: Oscar has some ciphertext and the corresponding plaintext.
  Chosen plaintext attack: Oscar has (temporary) access to the encryption machinery, chooses some plaintext to encrypt, and sees the ciphertext.
  Chosen ciphertext attack: Oscar has (temporary) access to the decryption machinery, chooses some ciphertext to decrypt, and sees the plaintext.

All the monoalphabetic ciphers (shift, affine, substitution) are susceptible to ciphertext-only attack, due to the nonuniformity of letter frequencies in English. The Vigen\`ere cipher can also be cryptanalyzed this way, by breaking up the ciphertext y_1y_2... into m subsets, corresponding to the residue mod m of the indices. To determine m, we essentially guess m = 2, 3, ...; when we have the right m, we will see highly nonuniform letter frequencies, more than with the wrong m. Could use repeated ciphertext trigrams to guess m, then autocorrelation to measure the nonuniformity of the distribution. [The autocorrelation is the chance that two samplings according to the distribution have the same outcome. For distribution {p_i}, it is
  I(p) = sum_i p_i^2.
I(p) is minimized by the uniform distribution.]

2003/01/29

The affine cipher has an easy known-plaintext attack that involves solving linear equations over Z_{26} (or more generally, Z_n). Suppose, for example, we know that "oe" encrypts to "KY". Then we get two linear equations satisfied by the unknown key (a,b):
  14a + b === 10 (mod 26),
  4a + b === 24 (mod 26).
Subtracting second from first gives
  10a === 12 (mod 26),
but unfortunately, this equation has two solutions for a: 9 and 22, giving two possible keys (9,14) and (22,14).
(The solution is not unique because 10 is not invertible mod 26.) To uniquely determine the key, we need more plaintext-ciphertext. Suppose we also know that "h" encrypts to "Z", that is,
  7a + b === 25 (mod 26).
This new equation is satisfied by (a,b) = (9,14) but not (22,14), so (9,14) is the key.

Sometimes, two equations suffice. Suppose we only know that "eh" encrypts to "YZ". Then the two equations are
  4a + b === 24 (mod 26),
  7a + b === 25 (mod 26).
Subtracting first from second gives
  3a === 1 (mod 26).
But 3 is invertible, so multiply both sides by 3^{-1} = 9 to get
  a === 9 (mod 26),
so the key is (9,14).

More general systems of linear equations in modular arithmetic can be converted into a single matrix equation Ax=b. The matrix equation is uniquely solvable iff det(A) is invertible with respect to the given modulus.

Hill cipher is somewhat difficult to attack with ciphertext only, but a known-plaintext attack is easy, and similar to the case with the affine cipher. If x and y are known m-vectors over Z_{26}, then the matrix equation y = xM yields m linear equations in the m^2 unknown entries of the matrix key M. If we have m many plaintext-ciphertext pairs, this gives us m^2 linear equations in the m^2 unknowns (over Z_{26}). These equations can be expressed as a matrix equation Aa = b, where a is an m^2-column vector of the unknown entries of M, A is a known m^2-by-m^2 matrix, and b is a known m^2-column vector. This equation is uniquely solvable provided det(A) is invertible mod 26. If not, we'll need to add in more plaintext-ciphertext pairs until we can pick out an invertible set of equations.

If we don't know m, we can guess m=2,3,... until a key is found, assuming m is not too large. If we have the wrong m, then the key we compute won't agree with additional plaintext-ciphertext pairs.

LFSRs are similarly cryptanalyzable by a known-plaintext attack using systems of linear equations over Z_2.
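The affine known-plaintext example worked earlier in this entry can be checked mechanically. Rather than solving the congruences algebraically, the sketch below simply enumerates all 26^2 pairs (a,b) and keeps those consistent with the known letter pairs; this brute-force substitute is my own choice for brevity, and the function name is hypothetical.

```python
def consistent_affine_keys(pairs, n=26):
    """All (a, b) in Z_n x Z_n with (a*x + b) mod n == y for every known (x, y).

    Note: this deliberately does not require gcd(a, n) = 1, so it returns
    every solution of the congruences, including values of a that could
    not belong to a legitimate affine key.
    """
    return [
        (a, b)
        for a in range(n)
        for b in range(n)
        if all((a * x + b) % n == y for x, y in pairs)
    ]


if __name__ == "__main__":
    # Plaintext/ciphertext letters as numbers: 14 -> 10 and 4 -> 24.
    print(consistent_affine_keys([(14, 10), (4, 24)]))
    # Adding 7 -> 25 removes the ambiguity.
    print(consistent_affine_keys([(14, 10), (4, 24), (7, 25)]))
```

The first call reports two candidates because 10 is not invertible mod 26; the extra pair singles out one of them.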
Shannon's Theory

Evaluating security of a cryptosystem:
  Computational security -- best algo for breaking the code runs in at least N operations, for some specified large N. No known cryptosystem can be proved secure under this definition. Restrict type of attack (e.g., exhaustive key search).
  Provable security -- proof of security assuming some well-studied problem (e.g., integer factorization) is computationally difficult. This is similar to showing a problem NP-complete; it is not an unconditional proof of difficulty/security.
  Unconditional security -- the system cannot be broken, even allowing unbounded computational resources for Oscar.

Cryptosystems Unconditionally Secure Against Ciphertext-Only Attacks

_Discrete_random_variable_ _X_: finite set X together with a probability distribution on X. Pr[X=x] (or just Pr[x]) is the probability that _X_ takes on the value x (i.e., the probability assigned to x by the distribution). We have 0 <= Pr[x] for all x, and sum_{x in X} Pr[x] = 1.

An _event_ E is a subset of X. Pr[x in E] (or just Pr[E]) is sum_{x in E} Pr[x].

Dice example. Two dice, S_n event "sum is n", D = "doubles" event.

Definition 2.2 is wrong. Define a probability distribution on X x Y first as the _joint_prob_distribution_, with Pr[x,y] = Pr[(x,y)]. Define derived random variables _X_ and _Y_ from this distribution.

2003/02/03

Conditional probability, independence

Bayes' Theorem: For two prob dists X and Y (and Pr[x] > 0), Pr[y|x] = Pr[x|y]*Pr[y]/Pr[x].

Corollary: X and Y are independent iff Pr[x|y] = Pr[x] for all x and all y with Pr[y] > 0.

Perfect secrecy. Assume cryptosystem (P,C,K,e,d) is specified, and a particular key k is used only for one encryption. Assume prob dist on P: use _x_ as random var. _A_priori_ prob that a plaintext x occurs is Pr[_x_=x]. Also assume prob dist on K (random var _K_), with the reasonable assumption that _K_ is independent of _x_. _x_ and _K_ induce the random var _y_ ranging over the ciphertexts:
  Pr[_y_=y] = sum_{K:y in C(K)} Pr[_K_=K]Pr[_x_=d_K(y)],
where C(K) = {e_K(x) : x in P} is the set of ciphertexts possible under key K.
Notice that
  Pr[_y_=y|_x_=x] = sum_{K:x=d_K(y)} Pr[_K_=K].
So by Bayes' Theorem, Pr[_x_=x|_y_=y] = ... can be computed by anyone who knows the probability distributions.

Definition: A cryptosystem has _perfect_secrecy_ if Pr[x|y] = Pr[x] for all x in P and y in C. That is, knowing the ciphertext does not help at all in guessing the plaintext.

Example with shift cipher. Compute probabilities.

Perfect secrecy implies |K| >= |C| (assuming Pr[y] > 0 for all y in C), since, for any fixed x in P (with Pr[x] > 0), we have
  Pr[y|x] = Pr[x|y]*Pr[y]/Pr[x] = Pr[y] > 0,
so for each y in C there must be at least one key K such that e_K(x) = y. Hence there are at least as many keys as ciphertexts.

In _any_ cryptosystem, we must have |C| >= |P|, since plaintexts must encrypt 1-1 into ciphertexts.

Theorem (Shannon): Suppose |K| = |C| = |P|. Then we have perfect secrecy iff every key is used with equal probability 1/|K|, and for all x in P and y in C, there is a unique key k such that e_k(x) = y.

Proof: By our reasonable assumption above, Pr[y]>0 for all y in C (otherwise remove y from C). Assume perfect secrecy. As observed above, for all x in P and y in C, there is at least one key k such that e_k(x)=y. Thus |C| = |{e_k(x): k in K}| <= |K|, but |C| = |K|, so |{e_k(x) : k in K}| = |K|. Thus there cannot exist two distinct keys that encrypt x to y. Let n = |K|, and let P = {x_1,...,x_n}. Fix y in C. Name keys k_i such that e_{k_i}(x_i) = y. Then,
  Pr[x_i] = Pr[x_i|y] = Pr[y|x_i]Pr[x_i]/Pr[y] = Pr[_K_=k_i]Pr[x_i]/Pr[y],
so Pr[y] = Pr[_K_=k_i] for every i, so keys are equiprobable. For the converse, just compute Pr[x|y]. QED

2003/02/05

One-time pad.

Entropy is a measure of uncertainty, or of information content (uncertainty of the outcome of an event before it happens = information conveyed by the event that removes the uncertainty). If an event occurs with probability p, the information provided by the event occurring is about log(1/p) = - log p. Imagine n equally likely possibilities.
The probability of any event is thus 1/n. You need about log n bits, then, to describe the outcome, so the information is log n = - log p, where p = 1/n is the probability of the event. If the distribution is highly skewed, say, p_0 = 1/1000 and p_1 = 999/1000, there is little information conveyed if 1 occurs, but a lot when 0 occurs.

Let X be a random var ranging over n possible values with probabilities p_1,...,p_n. Then the _entropy_of_ X, H(X) = H(p_1,...,p_n), is the average information content of an outcome of X, i.e.,
  H(X) = - sum_{i=1}^n p_i log p_i.
If some p_i = 0, we adopt the convention that 0 log 0 = 0. This is correct for two reasons: (1) we expect H to be continuous in p_1,...,p_n, and lim_{x -> 0^+} x log x = 0; (2) if p_i = 0, then that event will never occur, so we should expect to remove it from the distribution without changing the uncertainty, so p_i log p_i should not contribute anything to H(X).

H(X) = 0 iff some p_i = 1 and the rest are zero.

Example with (p_1,p_2,p_3) = (1/2,1/4,1/4).

Entropy and information compression. Huffman encodings. There is a prefix (instantaneous) code f : X -> {0,1}^* whose average length ell(f) = sum_i p_i|f(x_i)| satisfies
  H(X) <= ell(f) < H(X) + 1.

Properties of Entropy

Jensen's inequality. X is random var with prob dist p_1,...,p_n. Then H(X) <= log n, with equality iff p_1 = ... = p_n = 1/n.

H(X,Y) <= H(X) + H(Y), with equality holding iff X and Y are independent.

Conditional probability distribution X|y for all y in Y. Clearly,
  H(X|y) = - sum_x Pr[x|y] log Pr[x|y].
Define the _conditional_entropy_ H(X|Y) to be the weighted avg of this over all y, that is, H(X|Y) = sum_y Pr[y] H(X|y). Statement at bottom of page 61 is wrong; it is the opposite: H(X|Y) measures the average amount of information about X that is _left_unrevealed_ by Y. (He makes the same conceptual error on the next page as well.)

H(X,Y) = H(Y) + H(X|Y).

Venn diagram to describe entropy relationships between two random vars (doesn't work with 3 or more vars necessarily).
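The entropy definitions above translate directly into code. The following sketch uses base-2 logarithms (so entropy is measured in bits) and computes H(X|Y) through the identity H(X,Y) = H(Y) + H(X|Y); the function names and the dictionary representation of a joint distribution are assumptions of mine.

```python
from math import log2


def entropy(probs):
    """H(p_1,...,p_n) in bits, using the convention 0 log 0 = 0."""
    return -sum(p * log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(X|Y) = H(X,Y) - H(Y), for a joint distribution {(x, y): Pr[x,y]}."""
    marginal_y = {}
    for (_, y), p in joint.items():
        marginal_y[y] = marginal_y.get(y, 0.0) + p
    return entropy(joint.values()) - entropy(marginal_y.values())


if __name__ == "__main__":
    print(entropy([1 / 2, 1 / 4, 1 / 4]))   # the example distribution
    print(entropy([1 / 8] * 8))             # uniform on 8 outcomes: log 8
    # X is a copy of Y, so Y leaves nothing about X unrevealed.
    print(conditional_entropy({(0, 0): 0.5, (1, 1): 0.5}))
```

For (1/2,1/4,1/4) the sum is (1/2)(1) + (1/4)(2) + (1/4)(2) = 1.5 bits.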
2003/02/10

H(X|Y) = 0 iff X is uniquely determined by (a function of) Y. In the Venn diagram, X is a subset of Y.

Spurious keys and unicity distance

Theorem: For any cryptosystem, H(K|C) = H(K) + H(P) - H(C).

Proof: H(K,P,C) = H(C|K,P) + H(K,P), but H(C|K,P) = 0, since the key and plaintext determine the ciphertext uniquely. So H(K,P,C) = H(K,P), but K and P are independent, so H(K,P) = H(K) + H(P). Similarly, H(K,P,C) = H(K,C), since plaintext is determined uniquely by key and ciphertext. Thus,
  H(K|C) = H(K,C) - H(C) = H(K,P,C) - H(C) = H(K) + H(P) - H(C).

Entropy per letter of a natural language -- English, say.
  H(uniformly random string of letters) = log 26 = 4.70 per letter
  H(random string of letters by English frequency count) = 4.19 per letter
Interletter correlations reduce this value.

Let P^n be random var whose prob distribution is uniform over all n-grams of English plaintext, and 0 otherwise. Then the _entropy_ (per letter) of English is defined as
  H_{English} = lim_{n->infinity} H(P^n)/n,
and the _redundancy_ is
  R_{English} = 1 - H_{English}/log |P| (here, |P| = 26).
Redundancy is always between 0 and 1. Empirically, 1 <= H_{English} <= 1.5, so R_{English} = 0.75 (roughly). Redundancy is the optimal compression rate.

Prob dists on K and P^n induce a prob dist on C^n. Given y in C^n, let
  K(y) = {k in K | exists x in P^n, Pr[x]>0 and e_k(x) = y}.

BIG QUESTION: How long does the ciphertext have to be before we can be sure of a unique decryption?

Average number of keys over possible ciphertexts y in C^n is
  k_n = sum_{y in C^n} Pr[y]|K(y)|.
We have H(K|C^n) = H(K) + H(P^n) - H(C^n) by Theorem above, and H(P^n) is about n(1 - R_{English}) log |P| for large n. Certainly, H(C^n) <= n log |C|, so if |C| = |P|,
  H(K|C^n) >= H(K) - n R_E log |P|.
Now,
  H(K|C^n) = sum_{y in C^n} Pr[y]H(K|y) <= sum Pr[y]log|K(y)| <= log(sum Pr[y]|K(y)|) = log k_n,
where the last inequality is Jensen's. Thus,
  log k_n >= H(K) - nR_E log |P|,
so if K is uniform,
  k_n >= |K|/|P|^{nR_E}, for large n.
The RHS -> 0 exponentially fast as n -> infinity. Unicity distance is the value of n at which the RHS drops to 1:
  n_0 = log|K|/(R_E log|P|).
Example: for substitution cipher (with English), |P| = 26 and |K| = 26!, R_E = 0.75. Then n_0 is about 25.

Product cryptosystems

S_1 x S_2 means S_1 followed by S_2, keys chosen independently. x is associative. A cryptosystem is _endomorphic_ if P = C. Two systems are _equivalent_ if there is a correspondence of key spaces such that each cipher has the same set of possible encryption/decryption functions with the same probabilities. An endomorphic system S is _idempotent_ if S x S === S. Two systems commute if S x T === T x S. If S and T are both idempotent and commute, then S x T is also idempotent (proof).

2003/02/12

The product of noncommuting ciphers may not be idempotent, even if the individual ciphers are, so there is added security in running the product two or more times ("rounds").

Substitution Permutation Networks (SPNs)

Substitution and permutation ciphers frequently don't commute. Describe SPN in book. Draw picture.

Linear Cryptanalysis of an SPN (Matsui)

This is a known plaintext attack. We assume we have a large sample of plaintext-ciphertext pairs, all encrypted using the same key.

Define bias of a binary distribution. Piling Up Lemma: the bias of the xor of independent distributions is the product of the biases (up to a factor of a power of 2; see below). (Do n=2 case)

Consider an S-box pi with m input bits and n output bits. Assume a uniform probability distribution on the inputs, i.e., Pr[x_1,...,x_m] = 2^{-m} for all tuples (x_1,...,x_m). pi induces a joint random variable (X_1,...,X_m,Y_1,...,Y_n) on the inputs and outputs, i.e.,
  Pr[x_1,...,x_m,y_1,...,y_n] = (if y = pi(x) then 2^{-m} else 0).
For any subset of inputs and outputs, we get a binary random variable corresponding to the xor of these bits
  X_{i_1} xor ... xor X_{i_k} xor Y_{j_1} xor ... xor Y_{j_l},
which can be computed by tabulating input-output frequencies of the S-box pi.
We want to find subsets for which the bias of the above binary variable is bounded away from 0 (i.e., large absolute value; "highly biased"). In other words, we want the bits to satisfy a linear relationship most of the time.

2003/02/17

I'll call such a subset a _biased_throughput_.

Now consider the SPN starting from the plaintext up through xor'ing with K^{Nr}, that is, not including the last layer of S-boxes and the last round key. Suppose for a moment we can find a subset S of plaintext bits and output bits of this truncated SPN whose xor is highly biased. Then with high probability, we can find K^{Nr+1} by exhaustive search as follows:

Fix some trial value k for K^{Nr+1}. Then for each plaintext-ciphertext pair, run the SPN backwards starting with the ciphertext, through xor'ing with k, then backwards through the last row of S-boxes, resulting in a bit vector w. Now compute the xor of the bits of the plaintext and w that are in S. Do this for each plaintext-ciphertext pair, and tabulate the results. If k is correct, that is, if k = K^{Nr+1}, then we expect a bias in the results corresponding to the expected bias of S. But if k is _wrong_ in any of the bits that influence w, then the distribution will likely be unbiased. This way, we can recover some of the bits of K^{Nr+1}. (We really only need to search over the relevant bits of K^{Nr+1}, and let the other bits just be 0.)

How do we find such a subset S? We try to find a (sub)network of S-boxes in the truncated SPN, and biased throughputs for each, so that output bits of biased throughputs on one layer exactly correspond to inputs of biased throughputs on the next layer.
If we then take the xor of all the bits for all the biased throughputs, then the intermediate bits cancel, and we are left with an xor of
  - some initial inputs (plaintext bits),
  - some output bits at the end of the truncated SPN, and
  - lots of intermediate round key bits (which are constant)
We assume that the various throughputs in the network are independent for the sample of plaintext-ciphertext pairs that we have. This is not true in general, but seems to hold approximately in practice so that the Piling Up Lemma gives accurate results.

By the Piling Up Lemma, the bias of S is 2^{s-1}(product of the biases of the individual throughputs), where s is the number of S-boxes in the subnetwork. Note that since the key bits are constant, their xor is either the constant 0 or the constant 1, so they can be ignored in the xor when computing the bias. (They affect the _sign_ of the bias but not its absolute value.)

If epsilon is the bias of S, then the number of plaintext-ciphertext pairs in the sample should be on the order of epsilon^{-2}.

Illustrate with picture from book.

Data Encryption Standard (DES) (high-level overview)

Adopted by the NBS (now NIST) in 1977, developed by IBM in consultation with the NSA. For >20 years, the most often used cipher. Initial controversy over the S-boxes (only nonlinear part of DES), which had no published specific design criteria. Key length (56 bits) criticized as too short.

DES is a block cipher encrypting 64-bit blocks using a _Feistel_network_ of 16 rounds. In a Feistel network, the state is split into two equal sized bit vectors (L_i,R_i) and is updated as follows:
  L_{i+1} = R_i,
  R_{i+1} = L_i xor f(R_i,K_i),
where K_i is the i'th round key, and f is an _arbitrary_ function (not necessarily injective given constant K_i). Draw picture. Decryption is easy (solve for (L_i,R_i) in terms of (L_{i+1},R_{i+1}) and K_i).

2003/02/19

In DES, f uses a combination of 8 (nonlinear, noninjective) S-boxes and a bit permutation.
Each S-box maps 6 bits to 4 bits, and is different from the other seven.

Differential Cryptanalysis (Biham and Shamir)

Some similarity to linear cryptanalysis, except it is a chosen plaintext attack. We look for biases of S-boxes as before, but our distribution is now over xors of pairs of inputs and the xor of the corresponding outputs. Doing this cancels out the key.

Today, DES can be cracked rather quickly using networked, general-purpose machines, just by exhaustive key search.

Advanced Encryption Standard (AES)

NIST adopted Rijndael as the AES, a replacement for DES, in November 2001 after open competition starting in 1997. Criteria were: security, cost, algorithm and implementation characteristics.

AES has block length 128 (bits), and three allowable key lengths: 128, 192, 256. Iterated cipher with number Nr of rounds depending on the key length: 10, 12, 14, respectively. It is basically an SPN.

Encryption in AES:
  Input: plaintext x
  State := x
  State := State xor RoundKey      // operation AddRoundKey
  repeat Nr - 1 times
    SubBytes(State)                // substitution using an S-box
    ShiftRows(State)               // permutation of each row
    MixColumns(State)              // permutation/substitution on each column
    State := State xor RoundKey    // AddRoundKey
  endRepeat
  SubBytes(State)
  ShiftRows(State)
  State := State xor RoundKey      // AddRoundKey
  output ciphertext y = State
(A different RoundKey, taken from the expanded key, is used at each AddRoundKey step.)

All AES ops are byte-oriented, making fast software implementations possible on a wide range of general-purpose and embedded systems.

Write State as a 4x4 byte array [s_{i,j}]. Initially, State is bytes of plaintext in column-major order.

SubBytes does bijective substitution on each byte independently. Permutation pi_S of {0,1}^8. (pi_S can be represented by a 16x16 table of bytes.) pi_S uses a field algebraic structure on {0,1}^8.
SubBytes(z = a_7a_6...a_0):
  if z != 0 then z := z^{-1} (in the field)
  // below, a_7...a_0 are the bits of z after the inversion
  (c_7...c_0) := (01100011)
  for i := 0 to 7 do
    b_i := a_i xor a_{i+4} xor a_{i+5} xor a_{i+6} xor a_{i+7} xor c_i
  return (b_7...b_0)
(All subscripts are reduced mod 8 in the for-loop above.)

ShiftRows(State):
  for i := 0 to 3 do
    for j := 0 to 3 do
      s_{i,j} := s_{i,j+i}
(Subscripts reduced mod 4; all assignments use the old values of State.)

MixColumns(State):
  for each column c=(c_0,c_1,c_2,c_3) of State do
    for i := 0 to 3 do
      u_i := FieldMult(x,c_i) xor FieldMult(x+1,c_{i+1}) xor c_{i+2} xor c_{i+3} // indices reduced mod 4
    for i := 0 to 3 do
      c_i := u_i
(Here, x is (00000010) and x+1 is (00000011).)

KeyExpansion(key=(key[0],...,key[15])):
  z := 1 (in field)
  for i := 1 to 10 do
    RCon[i] := z ^ 000000 // ^ is concatenation with hex digits
    z := FieldMult(z,x)
  for i := 0 to 3 do
    w[i] := (key[4i],key[4i+1],key[4i+2],key[4i+3])
  for i := 4 to 43 do
    temp := w[i-1]
    if i === 0 (mod 4) then
      temp := SubWord(RotWord(temp)) xor RCon[i/4]
    w[i] := w[i-4] xor temp
  return (w[0],...,w[43])

RotWord(B_0,B_1,B_2,B_3) = (B_1,B_2,B_3,B_0) (B_i are bytes)

SubWord(B_0,...,B_3):
  for i := 0 to 3 do
    B_i' := SubBytes(B_i)
  return (B_0',...,B_3')

RCon is an array of 10 words that is constant (indep of key or texts).

To decrypt, run (inverse) operations in reverse order (note that AddRoundKey is its own inverse).

Throughputs of S-box have low bias in linear approx and difference distribution (making linear and differential cryptanalysis hard). This is a result of the algebraic properties of the S-box. Also, MixColumns makes finding linear/diff attacks using "few" S-boxes impossible (wide trail strategy). Recently (last Summer), a new cryptanalysis of reduced-round AES.

2002/02/24 Midterm Exam through 3.6

2002/02/26 Modes of Operation: ECB, CFB, CBC, OFB
  Electronic Codebook (ECB)
  Cipher Feedback (CFB)
  Cipher Block Chaining (CBC)
  Output Feedback (OFB)
So far, we've used ECB, which is the most naive (and worst). Can find information about plaintext if blocks repeat (equal plaintext blocks encrypt to equal ciphertext blocks).
Extreme case of low entropy: block is always either all 0s or all 1s. In CBC mode, to encrypt x_1...x_n into y_1...y_n, we have y_i = e_K(y_{i-1} xor x_i), where y_0 is an _initialization_vector_ (IV). Alice sends y_0...y_n to Bob. Bob decrypts: x_i = d_k(y_i) xor y_{i-1}. In OFB and CFB, generate a keystream z_1z_2... and xor it with the plaintext (so these act like stream ciphers): y_i = x_i xor z_i. OFB: z_i = e_K(z_{i-1}), and z_0 is the IV. Alice sends z_0y_1y_2.... altering some x_i only alters y_i (useful for error recovery in noisy channels, such as satellite comm). CFB: y_0 = IV and z_i = e_K(y_{i-1}). In CBC or CFB, if x_i is altered, then y_i, y_{i+1}, ... are all affected. So these are useful for authentication and integrity protection. Specif., they can be used to produce a MAC (message authentication code). Even without encryption, Alice can send IV,x_1,...,x_n,MAC=y_n. There are variants of OFB and CFB called k-bit feedback modes, where k <= size of block. Cryptographic Hash Functions (Chapter 4) Used to provide assurance of data integrity; produces a short "fingerprint" or "message digest" of the data. Fix h, a cryptographically secure hash function. h maps bit strings of arbitrary length to strings of some fixed length (often 128 or 160 bits). A _keyed_ hash function h_K depends on some key K. Alice sends (x,y=h_K(x)) to Bob. Bob can detect any alteration. Easy to transform an unkeyed hash function into a keyed hash function: h_K(x) = h(K^x). A _hash_family_ is a 4-tuple (X,Y,K,H), where 1. X is a set of possible _messages_, 2. Y is a finite set of possible _message_digests_ (or _authentication_tags_), 3. K is the _keyspace_, a finite set of possible _keys_ 4. For each k in K, there is a _hash_function_ h_k in H. Each h_k:X->Y. Always assume that |X|>=|Y|, and often that |X|>=2|Y|. (x,y) is _valid_ under key k if h_K(x) = y. We want to prevent an adversary from constructing valid pairs. 
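A minimal sketch of the prefix construction h_K(x) = h(K^x) and of checking whether a pair (x,y) is valid under a key. SHA-256 from Python's hashlib stands in for h (an assumption; the notes do not fix a particular h), and the names keyed_hash and is_valid are made up for this example. For iterated hash functions this plain prefix construction is known to permit length-extension forgeries, so treat it only as an illustration of the definitions.

```python
import hashlib


def keyed_hash(key: bytes, message: bytes) -> bytes:
    """h_K(x) = h(K ^ x), with ^ meaning concatenation and h = SHA-256 (assumed)."""
    return hashlib.sha256(key + message).digest()


def is_valid(key: bytes, message: bytes, tag: bytes) -> bool:
    """(x, y) is valid under key K iff h_K(x) = y."""
    return keyed_hash(key, message) == tag


tag = keyed_hash(b"shared key", b"attack galatia at dawn")
```

Bob, who shares the key, recomputes the tag and compares; any change to the message changes the tag unless a collision occurs.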
If |X|=N and |Y|=M, then any subset of Y^X is an _(N,M)-hash_family_. (Y^X is my own notation for the set of all functions from X to Y. |Y^X| = M^N.) An unkeyed hash function h:X->Y can be considered a hash family with only one key.

Security

Suppose h is an unkeyed hash function. The only way to produce a valid pair (x,h(x)) should be to first choose x then compute h(x). For h to be secure, the following three problems should be difficult to solve:

PREIMAGE
  Instance: A hash function h:X->Y and an element y in Y.
  Find: x in X such that h(x) = y.
  If difficult, say that h is "one-way".

SECOND PREIMAGE
  Instance: h and x.
  Find: x' != x such that h(x') = h(x).
  "second-preimage-resistant"

COLLISION
  Instance: h
  Find: x,x' such that x != x' and h(x) = h(x')
  "collision-resistant"

2003/03/03

Suppose |X|=N and |Y|=M. We want to gauge the difficulty of solving the problems above by randomized Las Vegas algorithms in the average case in the Random Oracle Model.

Random Oracle Model: assume h is a completely random member of Y^X (chosen uniformly), and we can make queries to h as an oracle (questions of the form: "What is h(x)?" for any given x). We assume each query has unit cost. We don't have any other way of computing h(x). This is an idealized model.

Las Vegas algorithm: The algorithm can make random choices (say, by flipping an internal coin), then either outputs a *correct* answer or says "fail". It may fail to find a correct answer even if one exists, but it will never output an incorrect answer. For any given input and oracle h, the algorithm will succeed with some probability, taken over its internal random choices.

Average Case: we take the average over all inputs to the algorithm (assuming they are all the same size), chosen uniformly.

An algorithm is an (epsilon,q)-algorithm if, for a given size, the algorithm makes <= q queries to h and succeeds with probability >= epsilon on average over the inputs and random choices for h (and also the algorithm's internal choices).
Algo for preimage:
  Given: y in Y, h (as oracle)
  repeat q times
    choose x in X u.a.r.
    find h(x) (one query to h)
    if h(x) = y then halt and output x
  end-repeat
  output "fail"
This algo is provably close to optimal for q queries (only improvement is to choose the x's without replacement - i.e., distinct).
  Pr[failure] = Pr[failure on 1] * ... * Pr[failure on q] (since each x is chosen independently from the others)
              = (1 - 1/M)^q
So
  Pr[success] = 1 - Pr[failure] = 1 - (1 - 1/M)^q =approx 1 - e^{-q/M} =approx q/M
(using 1 - x =approx e^{-x} for small x, twice). So this is (approximately) a (q/M,q)-algorithm.

Algo to find a collision:
  Given h (as oracle)
  Choose x_1,...,x_q in X distinct u.a.r.
  if h(x_i) = h(x_j) for some i!=j, then output (x_i,x_j)
  else output "fail"
Makes q queries; provably optimal subject to q queries. Can sort the h(x_i) values and look for adjacent equal elements.
  Pr[success] = 1 - Pr[failure] = 1 - 1((M-1)/M)((M-2)/M)...((M-q+1)/M)
This is q pingpong balls in M slots - prob that two or more are in the same slot.

2003/03/05

  Pr[success] = 1 - (1 - 1/M)(1 - 2/M)...(1 - (q-1)/M)
              >= 1 - e^{-1/M}e^{-2/M}...e^{-(q-1)/M}
              = 1 - e^{-q(q-1)/(2M)}   (since 1 + 2 + ... + (q-1) = q(q-1)/2)
              =approx q(q-1)/(2M)
              =approx q^2/(2M).
This is small provided q^2 << M.

Iterated hash functions
Merkle-Damgård construction

2003/03/17

Due to time constraints, we'll skip SHA-1. Also skipping Message Authentication Codes (MACs). We'll return to these as time permits.

Public-Key Cryptography and RSA

So far: cryptosystems are _symmetric_key_, i.e., the same key K is used for both encryption and decryption (d_K is the same as e_K or easily derived from it), and Alice and Bob must share this (secret) key. Key distribution must be secure. This may be a problem.

In _asymmetric_ or public-key cryptography, it may be computationally infeasible to find d_K given just e_K. Bob generates e_K and d_K at the same time, publishes e_K, but keeps d_K secret. Alice (or anyone else) can send an encrypted message to Bob, but only Bob can decrypt the message.
Metal box analogy.

Idea first publicly proposed by Diffie, Hellman in 1976, although it was known in 1970 by Ellis, Cocks (classified). RSA (Rivest, Shamir, Adleman) cryptosystem in 1977. Other proposed systems.

Security relies on the (presumed) computational hardness of certain problems, mostly in number theory. RSA relies on the difficulty of factoring large integers. ElGamal and elliptic curve systems depend on the difficulty of the discrete logarithm problem.

Trapdoor one-way function y = f(x) = x^b mod n (if gcd(b,phi(n)) = 1). Trapdoor is a such that y^a === x (mod n). This is essentially how RSA works. a is the multiplicative inverse of b (mod phi(n)).

Recall Euler's theorem: if x is in Z_n^*, then x^{phi(n)} === 1 (mod n). Stronger version (valid when n is squarefree, e.g., n = pq): x^{phi(n) + 1} === x (mod n) for any x in Z_n.

Extended Euclidean Algorithm (EEA; used to find modular inverses). Euclidean algorithm to find gcd(a,b). gcd(a,b) is the least positive linear combination xa + yb, where x and y range over Z. We need to find such an x and y. How to find inverses given x and y.

How RSA works

Bob (preprocessing: key generation):
  1. Generates two large distinct primes p and q in secret, at random
  2. Lets N := pq
  3. Computes m := phi(N) = (p - 1)(q - 1)
  4. Finds e in Z_m^* (not nec at random; 3 might do if 3 is in Z_m^*)
  5. Computes d := e^{-1} (mod m) using EEA
  6. Publishes N and e, keeping d private ((N,e) is Bob's public key; d is Bob's private key)
P = C = Z_N

Alice:
  1. Has plaintext message x in Z_N
  2. Computes ciphertext y := x^e mod N
  3. Sends y to Bob over channel susceptible to eavesdropping

Bob:
  1. Receives y from Alice
  2. Computes cleartext x := y^d mod N

This works because ed === 1 (mod m), so there is an integer k >= 0 such that ed = km+1. If x is in Z_N^*, then we have the following equivalences (mod N):
  y^d === (x^e)^d === x^{ed} === x^{km+1} === (x^m)^k x === 1^k x (by Euler's theorem) === x
One can show that the above equivalence holds even if x is not in Z_N^*.
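The key generation, encryption, and decryption steps above can be sketched with toy numbers. The primes 61 and 53 and the exponent e = 17 are illustrative choices only (far too small for any security), and the function names are made up for this example. The inverse d is found with an extended Euclidean algorithm that returns x, y with gcd(a0,b0) = x a0 + y b0.

```python
def eea(a0, b0):
    """Return (g, x, y) with g = gcd(a0, b0) = x*a0 + y*b0 (a0, b0 >= 0)."""
    x1, y1, x2, y2 = 1, 0, 0, 1
    a, b = a0, b0
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        # Invariant: a = x1*a0 + y1*b0 and b = x2*a0 + y2*b0
        x1, y1, x2, y2 = x2, y2, x1 - q * x2, y1 - q * y2
    return a, x1, y1


def mod_inverse(e, m):
    """Inverse of e modulo m, assuming gcd(e, m) = 1."""
    g, x, _ = eea(e, m)
    if g != 1:
        raise ValueError("e is not invertible mod m")
    return x % m


# Bob: key generation with toy primes
p, q = 61, 53
N = p * q                 # public modulus
m = (p - 1) * (q - 1)     # phi(N), kept secret
e = 17                    # public exponent, must be in Z_m^*
d = mod_inverse(e, m)     # private exponent


def rsa_encrypt(x):
    """Alice: y = x^e mod N."""
    return pow(x, e, N)


def rsa_decrypt(y):
    """Bob: x = y^d mod N."""
    return pow(y, d, N)
```

Decryption also recovers x = 61, which is not in Z_N^*, matching the remark that the equivalence holds for every x in Z_N.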
Bob can easily compute d because he knows m = phi(N). But Oscar the attacker does not know phi(N), so he cannot run EEA to find d. Claim that finding phi(N) given N is at least as hard as factoring N. Proof: Given N, suppose Oscar has an algorithm that computes phi(N) (at least with high probability on average). Then phi(N) = (p - 1)(q - 1) = pq - p - q + 1 = N - p - q + 1, so p + q = N + 1 - phi(N), and this quantity is known to Oscar. Multiply both sides by p to get p^2 + N = p(N + 1 - phi(N)), which is a quadratic equation in p with known coefficients. Oscar uses the quadratic equation to find p (and q, the other solution). So Oscar can easily factor N, which we assume is difficult. Euclidean Algorithm Uses the fact (from the homework) that gcd(x,y) = gcd(x,y+kx) for any integer k, so in particular, gcd(a,b) = gcd(b,a) = gcd(b,a mod b). gcd(a,b) // assume a,b >= 0 while b != 0 do temp := a a := b b := temp mod b endwhile return a 2003/03/19 EEA: we add stuff to the Euclidean algorithm to find x and y such that gcd(a,b) = xa + yb. EEA(a0,b0) // assume a0,b0 >= 0 x1 := 1 y1 := 0 x2 := 0 y2 := 1 a := a0 b := b0 // Loop invariants: // a = x1 a0 + y1 b0 // b = x2 a0 + y2 b0 while b != 0 do q := a div b r := a mod b // 0 <= r < b and a = qb + r // thus x1 a0 + y1 b0 = q(x2 a0 + y2 b0) + r // thus r = (x1 - q x2)a0 + (y1 - q y2)b0 (a,b) := (b,r) (x1,y1,x2,y2) := (x2,y2,x1 - q x2,y1 - q y2) endwhile return (x,y) := (x1,y1) Note that we can always have |x| <= b0 and |y| <= a0. For example, if x > b0, then subtract b0 from x and add a0 to y. Repeat as necessary. Chinese Remainder Theorem: Let m_1,...,m_k be pairwise coprime, and let M = m_1...m_k. Then the map Z_M -> Z_{m_1} x ... x Z_{m_k} given by x |-> (x mod m_1, ... , x mod m_k) is a bijection, and (in fact) is easily invertible. Equiv: for all c_1 in Z_{m_1} ... c_k in Z_{m_k} there is a (unique) x in Z_M such that x === c_1 (mod m_1) ... 
x === c_k (mod m_k)

Use CRT to show that phi(ab) = phi(a)phi(b) if a is coprime to b. Get full formula for phi(n) this way in terms of the prime factorization of n. Can also use CRT as an alternate representation of a number.

Definition of a group (abelian)
Cyclic group
Z_n^* is always a group under multiplication mod n. If n is prime then Z_n^* is also cyclic.

2003/03/24

Primitive elements of Z_p^*.

Finding two large primes

Prime Number theorem: the density of primes near N is about 1/ln N, so the density of primes among n-bit numbers is about 1/(n ln 2), where n is large. Select n-bit numbers at random, test each for primality.

Primality testing: Solovay-Strassen or Miller-Rabin. Miller-Rabin is used in practice (slightly faster). We'll do Solovay-Strassen. Each algo is Monte Carlo with one-sided error:
  Given n: Flip random coins.
  If n is prime, then algo outputs "n is prime" with probability 1.
  If n is composite, the algo outputs "n is prime" with probability <= 1/2 (<= 1/4 for Miller-Rabin).
Reduce the error by repeating some large number of times (100, say). Error prob is then <= 2^{-100}.

Quadratic residues mod p (an odd prime) (define; characterize as even exponent of some primitive element).

Legendre symbol (n/p) = if n === 0 (mod p) then 0 else if n is q.r. mod p then 1 else -1 (here, p is an odd prime).
Fact: (n/p) === n^{(p-1)/2} (mod p). (Prove using primitive element)

Jacobi symbol: a an integer, n an odd positive number. (a/n) = (a/p_1)...(a/p_k) where n = p_1...p_k, and the p_i are prime, not necessarily distinct ((a/p_i) is Legendre symbol).

Solovay-Strassen:
  Given n (odd):
  Pick a random a in Z_n - {0}
  if gcd(a,n) > 1 then output "composite"
  let x = (a/n)
  let y = a^{(n-1)/2} mod n
  if x === y (mod n) then output "prime" else output "composite"
Can show that if n is composite, then x === y for at most half the a's in Z_n^*. This is fast if we can compute (a/n) quickly (note that we don't know the prime factorization of n).

If n > 0 is odd, then
1. If m_1 === m_2 (mod n) then (m_1/n) = (m_2/n).
2. (2/n) = if n === \pm 1 (mod 8) then 1 else -1.
3.
(m_1m_2/n) = (m_1/n)(m_2/n).
4. For odd m > 0, (m/n) = if m === n === 3 (mod 4) then -(n/m) else (n/m). (Quadratic Reciprocity Theorem (Gauss))
Use these facts to compute (a/n) quickly.

2002/03/26

Square Roots Mod n

Thm: If p is odd prime and e > 0 and gcd(a,p) = 1, then y^2 === a (mod p^e) has no solutions if (a/p) = -1 and two solutions (modulo p^e) otherwise.

Cor: If n = \prod_{i=1}^l p_i^{e_i} > 1 is odd and the p_i are distinct primes, then a (coprime to n) has 2^l square roots mod n if (a/p_i) = 1 for all i, and no square roots otherwise.
Proof: Use CRT. Can obtain all 2^l square roots of a by taking all square roots of 1 and multiplying by a single fixed square root of a.

Attacks on RSA: Factoring

Pollard's Rho Method

n is (composite) number to be factored. Let p be the smallest prime factor of n. If we can find x != x' in Z_n such that x === x' (mod p), then p <= gcd(x-x',n) < n, so we get a nontrivial factor of n. Apply recursively.

For random subset X of Z_n, try gcd(n,x-x') for all distinct pairs x,x' in X. Birthday paradox analysis.

Assume f : Z_n -> Z_n is pseudorandom function. (For example, f(x) = (x^2 + a) mod n is popular, even just with a = 1.) Iterate f until a collision is found (mod some prime dividing n).
  Input: (n,x_1) // x_1 is in Z_n
  x := x_1
  x' := f(x)
  while gcd(x-x',n) = 1 do
    x := f(x)
    x' := f(x')
    x' := f(x')
  endwhile
  if gcd(x-x',n) = n then output "fail"
  else output gcd(x-x',n)
On average, x, x' get into a cycle after Theta(sqrt(n)) iterations. Reduce mod p, where p is smallest prime dividing n: x, x' get into a cycle after Theta(sqrt(p)) = O(n^{1/4}) operations. So, highly likely that collision found mod p before mod n. This still takes exponential time in log p, so RSA resists this.

Another general approach: n is the (odd) number to be factored. Assume that n is not a prime power (this is easy to check). Then every quadratic residue x in Z_n^* has at least four square roots mod n. Look for x, y in Z_n such that x^2 === y^2 (mod n) but x =/= \pm y (mod n).
Then, n | x^2 - y^2 = (x+y)(x-y), but n does not divide x+y nor x-y, so 1 < gcd(x-y,n) < n, and 1 < gcd(x+y,n) < n.

Factor Base Methods:

Let B = {p_0,p_1,...,p_k}, where p_0 = -1 and p_1,...,p_k are the first k primes. We want to find several z_i in Z_n such that z_i^2 === a product of the p_j (mod n). Suppose we find
  z_1^2 === p_0^{e_{1,0}}...p_k^{e_{1,k}}
  ...
  z_m^2 === p_0^{e_{m,0}}...p_k^{e_{m,k}}
Then we find a subset of the z_i^2 whose product (mod n) is expressible with all even exponents, say
  w = p_1^{2b_1}...p_k^{2b_k} mod n
This is always possible if m > k+1. (More on that later.) Then let u = the product of the z_i's from the selected subset, mod n. Let v = p_1^{b_1}...p_k^{b_k}. Then
  u^2 === v^2 === w (mod n)
But there is (at least!) a 50/50 chance that u =/= \pm v (mod n). In fact, the chances are at least 1 - 2^{-k} (??? check)

To find the subset of the z_i, consider the matrix E = [e_{i,j}] of exponents. Reduce E mod 2 to a matrix over Z_2. Then the subset will correspond to a linearly dependent set of rows (that add to [0,...,0] mod 2). We can use standard row reduction techniques from linear algebra to do this.

Sieve Methods: These are factor base methods where the z_i's are found by process of elimination ("sieve") from a large range of values. This can be more efficient than just trying one z at a time. Also this can be sped up easily by concurrent processing. (Effort forced RSA to increase key size about 15 years ago.) Quadratic Sieve and Number Field Sieve (the latter is the fastest known factoring algo). These are still exponential time, though.

2003/03/31

n = pq is easy to factor if p-1 or q-1 is smooth (i.e., has all small prime factors). So in choosing p and q, easiest to make them _strong_ primes (i.e., (p-1)/2 and (q-1)/2 are also prime). This has other benefits, in particular, the encryption exponent e can be 3, since gcd(3,phi(n)) = 1. Finding a strong prime of length k takes (heuristically) about Theta(k^2) probes.
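Returning to Pollard's rho method from the start of the factoring discussion, the pseudocode given there can be sketched directly. The function name pollard_rho and the default choices x_1 = 2 and a = 1 are assumptions for illustration; the method returns None where the notes output "fail".

```python
from math import gcd


def pollard_rho(n, x1=2, a=1):
    """Look for a nontrivial factor of composite n using f(x) = (x^2 + a) mod n.

    x advances one step per iteration and xp two steps, so a collision
    modulo some prime divisor of n shows up as gcd(x - xp, n) > 1.
    """
    def f(x):
        return (x * x + a) % n

    x = x1
    xp = f(x)
    while gcd(x - xp, n) == 1:
        x = f(x)
        xp = f(f(xp))
    d = gcd(x - xp, n)
    return None if d == n else d   # None plays the role of "fail"


factor = pollard_rho(8051)  # 8051 = 83 * 97
```

If the result is None, a common remedy is to retry with a different a or x_1.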
Semantic Attacks (may get some information about plaintext without finding private key):

For example, the repeated message attack, small message space attack, and the small message attack. Also, if x is the plaintext and y = x^e mod n is the ciphertext, then (x/n) = (y/n)^d = (y/n), since d (decryption exponent) is odd. So Oscar gains a single bit of information about x, namely, (x/n). Can fix all these by padding x with a random nonzero value (in the high-order bits). This padding should be chosen independently for each message.

Rabin's Cryptosystem

Bob's private key: p,q (distinct large random primes such that p === q === 3 (mod 4)). Public key: n = pq. P = C = Z_n^*
  e(x) = x^2 mod n
  d(y) = sqrt(y) mod n
Not uniquely decipherable (each quadratic residue y in Z_n^* has 4 square roots).

Bob finds a square root of a quadratic residue y (mod p) by sqrt(y) = y^{(p+1)/4} mod p. Computes similarly mod q, then uses CRT to find a square root of y (mod n). This may not be x. Bob precomputes all square roots of 1 (mod n) (using CRT again), to get all four square roots of y from one of them. Alice puts redundancy in x to let Bob distinguish it.

Oscar: computing square roots mod n is as difficult as factoring.

Discrete Log-Based Public-Key Cryptosystems (Chapter 6)

Defined discrete log function in an abelian group.

ElGamal: Let p be a prime so that Discrete Log problem in (Z_p^*,.) is infeasible, and let alpha in Z_p^* be a primitive element. Let P = Z_p^*, C = Z_p^* x Z_p^*, and define
  K = { (p,alpha,a,beta) : beta === alpha^a (mod p) }.
a is private key, the rest public key. For a secret random k in Z_{p-1},
  e_K(x,k) = (y_1,y_2), where y_1 = alpha^k mod p and y_2 = x beta^k mod p.
Bob decrypts with d_K(y_1,y_2) = y_2 (y_1^a)^{-1} mod p, since y_1^a === beta^k (mod p).

If Oscar has an algo for discrete log to the base alpha in Z_p^*, then he can compute Bob's private key a given beta. No other significantly faster way of doing this is known, so ElGamal is assumed secure. p should have at least about 300 decimal digits (1k bits).
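A small sketch of ElGamal encryption as defined above. The decryption rule x = y_2 (y_1^a)^{-1} mod p is the standard one and is an addition here if the surrounding notes do not state it. The parameters p = 2579, alpha = 2, a = 765 and the random value k = 853 are toy values for illustration only, and the function names are made up.

```python
def elgamal_public(p, alpha, a):
    """beta = alpha^a mod p, published together with p and alpha."""
    return pow(alpha, a, p)


def elgamal_encrypt(p, alpha, beta, x, k):
    """e_K(x, k) = (alpha^k mod p, x * beta^k mod p), k a fresh secret random value."""
    y1 = pow(alpha, k, p)
    y2 = (x * pow(beta, k, p)) % p
    return y1, y2


def elgamal_decrypt(p, a, y1, y2):
    """x = y2 * (y1^a)^{-1} mod p; the inverse is y1^(p-1-a) by Fermat's little theorem."""
    return (y2 * pow(y1, p - 1 - a, p)) % p


p, alpha, a = 2579, 2, 765           # toy parameters
beta = elgamal_public(p, alpha, a)
y1, y2 = elgamal_encrypt(p, alpha, beta, x=1299, k=853)
x = elgamal_decrypt(p, a, y1, y2)
```

Because k is chosen fresh for each message, the same plaintext encrypts to different ciphertexts on different runs.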
2003/04/02

Diffie-Hellman Key Exchange

Alice and Bob can agree on a (random) key K securely (conditionally). Neither Alice nor Bob can completely control the actual value of K. Communication is symmetric and can be done in parallel.

Fixed public prime p and g generator of Z_p^*.
  Alice: Private random value a in Z_{p-1}.
  Bob: Private random value b in Z_{p-1}.
  Alice -> Bob: A := g^a mod p
  Bob -> Alice: B := g^b mod p
  Alice: K := B^a mod p
  Bob: K := A^b mod p
K = g^{ab} mod p.

Oscar sees g^a and g^b. We assume the Diffie-Hellman problem (given g^a and g^b, compute g^{ab}) is infeasible. (If discrete log is easy, so is this). Partial disclosure of bits of K....

Both systems above work with an arbitrary (abelian) group (G,.), such that elements of G are represented so that . and inverses are easy, but discrete log is hard. Generalizations of these systems boil down to finding more appropriate groups.

Finite Fields

Recall: a _field_ is a structure (F,+,.,0,1) such that
  1. (F,+) is an abelian group (0 is identity, -a is the inverse of a),
  2. (F-{0},.) is an abelian group (1 is identity, a^{-1} or 1/a is the inverse of a), and
  3. . distributes over +.
C,R,Q,Z_p are fields for prime p.

Fact: If F is a finite field, then |F| = p^m for some prime p and some integer m>0. Conversely, for every prime p and m>0, there is a finite field F with |F| = p^m. This field is unique up to isomorphism. For q = p^m we write F_q or GF(q) for this field (GF = "Galois Field").

Representing Finite Fields

Z_p[x] is the set of all univariate polynomials over Z_p (i.e., with coefficients in Z_p), with the usual rules for + and . of polynomials (using these ops in Z_p for the coefficients). If P != 0, then write P uniquely as sum_{i=0}^k c_i x^i, where x is a formal variable, k >= 0, the c_i are elements of Z_p, and c_k != 0. We say k is the _degree_ of P (deg(P) = k). By convention, deg(0) = -infinity. P is _monic_ if c_k = 1.

Given (representations of) A and B in Z_p[x] (as lists of the coefficients, say).
Addition and multiplication are "easy" (standard polynomial addition and multiplication). Division is also easy: If B != 0, then there are unique Q and R in Z_p[x] such that A = QB + R and deg(R) < deg(B).

2003/04/07

We write A div B for Q and A mod B for R. We say A === B (mod P) if P | (A-B). (A | B if there is a C such that AC = B.) We can define gcd(A,B) in the usual way (assume monic for uniqueness). A and B (not both zero) are _coprime_ or _relatively_prime_ if deg(gcd(A,B)) = 0. Can compute gcd using Euclidean Algo in Z_p[x]. Can find X,Y such that gcd(A,B) = AX + BY using EEA in Z_p[x]. This is completely analogous with the case for integers.

Suppose deg(F) = m > 0. We can define Z_p[x]/(F) as "the ring of polynomials over Z_p, mod F" as follows: Elements are all polynomials of degree < m. Addition and multiplication are as in Z_p[x] except that we take the remainder mod F.

F is _irreducible_ if there do not exist G,H such that F = GH and deg(G)>0 and deg(H)>0. So an irreducible F is coprime to every nonzero polynomial of degree < m, and hence every nonzero element of Z_p[x]/(F) has a multiplicative inverse (found via EEA), making Z_p[x]/(F) a field of size p^m. There is a polynomial-time test for irreducibility. (Surprisingly, factoring polynomials is also possible in polynomial time.)

Fact: If F is a finite field, then |F| = p^m for some prime p and some m>0. Conversely, for any such p and m, there is a (unique up to isomorphism) field of size p^m. Thus every possible finite field is representable as Z_p[x]/(F) for some irreducible F.

Definition: For field F, let F^* be the multiplicative group F-{0} under multiplication.

Fact: If F is a finite field, then F^* is cyclic. Furthermore, there is a p-time test for primitivity (i.e., being a generator) in F^*. (The number of primitive elements is phi(|F^*|) = phi(p^m - 1), which is usually a large fraction of all elements of F^*.)

For most (properly chosen) finite fields F large enough, the discrete log problem in F^* is infeasible. So we can use these for ElGamal and Diffie-Hellman.
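A minimal sketch of arithmetic in Z_p[x]/(F), with polynomials stored as coefficient lists, lowest degree first. This representation, the function names, and the requirement that F be monic are assumptions made to keep the example short. The usage lines work in GF(4) = Z_2[x]/(x^2 + x + 1).

```python
def poly_mod(a, f, p):
    """Remainder of a modulo a monic polynomial f, with coefficients in Z_p."""
    a = [c % p for c in a]
    m = len(f) - 1                      # deg(f)
    for i in range(len(a) - 1, m - 1, -1):
        c = a[i]
        if c:
            # Subtract c * x^(i-m) * f to cancel the x^i term
            for j in range(m + 1):
                a[i - m + j] = (a[i - m + j] - c * f[j]) % p
    return (a + [0] * m)[:m]            # exactly m coefficients


def poly_mul_mod(a, b, f, p):
    """Product of a and b in Z_p[x]/(f)."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    return poly_mod(prod, f, p)


F = [1, 1, 1]                                   # x^2 + x + 1, irreducible over Z_2
x_squared = poly_mul_mod([0, 1], [0, 1], F, 2)  # x * x = x + 1
x_cubed = poly_mul_mod(x_squared, [0, 1], F, 2) # (x + 1) * x = 1
```

Since x^3 = 1 and x != 1, the element x has order 3 = |GF(4)^*|, so it is a primitive element of this field.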
Generally, for group (G,.) and element g, discrete log to base g in the cyclic (sub)group generated by g is feasible if the order of g is smooth (Pohlig-Hellman attack). So the order of g should have at least one large prime factor.

Elliptic Curves over a Finite Field

Discuss elliptic curves over any field (say R) with char not in {2,3}. (The characteristic char(F) of a field F is the least p>0 such that 1+...+1 (p 1's) = 0, or char(F) = 0 if no such p exists. Easy to show that if char(F) > 0, then it must be prime. char(GF(p^m)) = p, and char(C) = char(R) = char(Q) = 0.)

An _elliptic_curve_ E over a field F (with char(F) not in {2,3}) is the set of solutions (x,y) in FxF of the equation
  y^2 = x^3 + ax + b,
where a,b in F are fixed numbers such that 4a^3 + 27b^2 != 0, together with an additional point O at infinity. a and b completely determine E.

Geometrical view of elliptic curves over R.

Addition: generic case for distinct points: (x_1,y_1) + (x_2,y_2) = (x_3,y_3) where
  x_3 = lambda^2 - x_1 - x_2, and
  y_3 = lambda(x_1 - x_3) - y_1, where
  lambda = (y_2 - y_1)/(x_2 - x_1)
Generic case for 2(x_1,y_1): lambda = (3x_1^2 + a)/(2y_1) (rest same). (All operations are in F.)
If P = (x,y) != O, then -P = (x,-y).
Other cases: for any P in E, P + O = O + P = P, P + (-P) = O, and -O = O.
Can show that + is associative, commutative.

2003/04/09

Elliptic curves over Z_p (p > 3 prime)

Theorem (Hasse): If E is any elliptic curve over Z_p, then
  p + 1 - 2 sqrt(p) <= #(E) <= p + 1 + 2 sqrt(p).
There is an algorithm (due to Schoof) to compute #(E) in O((log p)^8) bit operations (O((log p)^6) operations in Z_p).

Theorem: For any elliptic curve E over Z_p, there are integers n_1, n_2 > 0 such that (E,+) is isomorphic to Z_{n_1} x Z_{n_2}, and n_2 | gcd(p-1,n_1).

If #(E) is prime, then (E,+) is cyclic, and any nonzero element of E generates E. (This is true of any group of prime order.)
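The addition rules above can be sketched over Z_p as follows. The point at infinity O is represented by None, division is done with Fermat inverses (valid because p is prime), and the curve y^2 = x^3 + x + 6 over Z_11 with the point (2,7) is a toy example chosen for illustration; the function name ec_add is made up.

```python
def ec_add(P, Q, a, p):
    """Add points P and Q on y^2 = x^3 + a*x + b over Z_p; None is the point O."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = O
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, p - 2, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, p - 2, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


a, p = 1, 11                     # toy curve y^2 = x^3 + x + 6 over Z_11
G = (2, 7)
G2 = ec_add(G, G, a, p)          # doubling
G3 = ec_add(G2, G, a, p)         # generic addition
```

This toy curve has 13 points including O, a prime number, so every point other than O generates the whole group, as noted above.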
Weak elliptic curves (where discrete log is easy): supersingular (ptime discrete log algo by Menezes, Okamoto, Vanstone); curves with "trace one," i.e., #(E) = p. Besides these two classes of curves, the cyclic subgroup of E generated by a point P is believed secure provided its order is >= about 2^{160} and has at least one large prime factor (to guard against Pohlig-Hellman).

Naive ElGamal has two problems:
  1. Expansion factor of 4
  2. No known easy way to encode plaintexts as points on the curve.

ECIES (Elliptic Curve Integrated Encryption Scheme) [simplified; full-blown scheme includes symmetric key encryption and MACs]. Plaintext x need not be first coordinate of curve point. ECIES also uses _point_compression_: store nonzero element (x,y) on curve as (x,y mod 2). If p === 3 (mod 4), then y = (\pm z^{(p+1)/4}) mod p = sqrt(z) mod p, where z = x^3 + a x + b. Parity of y determines which square root to use.

Simplified ECIES

H = the cyclic subgroup of E generated by P, such that |H| = n is prime and discrete log on H is infeasible.
P = Z_p^*, C = (Z_p x Z_2) x Z_p^*, K = {(E,P,m,Q,n) : Q = mP}
(P,Q,n) is public key; m in Z_n^* is private key.
Alice: selects secret random k in Z_n^*; x in Z_p^* is plaintext.
  e_K(x,k) = (PointCompress(kP), x x_0 mod p) where kQ = (x_0,y_0) and x_0 != 0.
Ciphertext y = (y_1,y_2) where y_1 in Z_p x Z_2 and y_2 in Z_p^*.
  d_K(y) = y_2(x_0)^{-1} mod p where (x_0,y_0) = m PointDecompress(y_1).

Algorithm to compute multiple cP of a point P: use "double and add-subtract" using signed binary representation of c:
  c = sum_{i=0}^{l-1} c_i 2^i where each c_i is in {-1,0,1}.
(Can obtain signed binary rep from strict binary rep by replacing each string of 2 or more contiguous 1s with 100...00(-1). NAF representation. This way, get no two contiguous nonzero elements.)

Algo for cP:
  Q := O
  for i := l-1 downto 0 do
    Q := 2Q
    if c_i = 1 then Q := Q+P
    else if c_i = -1 then Q := Q-P
  return Q

2003/04/14

A Semantic Attack on ElGamal

beta is a q.r. iff a is even. y_1 is a q.r. iff k is even. Thus we can tell whether beta^k === alpha^{ak} is a q.r. Thus we can determine the quadratic residuosity of x. If, for example, there are only two possible plaintexts with different quadratic residuosity, can tell them apart.

If p = 2q + 1 (q a prime), then restricting beta, y_1, and x to be q.r.'s is equivalent to ElGamal on the subgroup of q.r.'s modulo p (same as Z_q).

Bit Security of Discrete Log

(Assume alpha is primitive element of Z_p^*.) Computing lsb of log_{alpha} x is easy: if x^{(p-1)/2} === 1 (mod p) then 0 else 1. (Just the quadratic residuosity of x (mod p).) Similarly, if p-1 = 2^s t, where t is odd, then the s least significant bits of log_{alpha} x are easy to compute. However, computing the (s+1)'st least significant bit is just as hard as computing the entire discrete log.

Oracle algo in the case s=1 (i.e., p === 3 (mod 4)): Compute log_{alpha} beta. Assume L_j(beta) is the j'th lsb of log_{alpha} beta.
L_1(beta) is already easy. Assume we have an oracle for L_2.
  x_0 := L_1(beta)
  beta := beta/alpha^{x_0} mod p // beta is a q.r. mod p
  i := 1
  while beta != 1 do
    x_i := L_2(beta)
    gamma := beta^{(p+1)/4} mod p // gamma is a square root of beta (mod p)
    if L_1(gamma) = x_i then beta := gamma
    else beta := p - gamma
    beta := beta/alpha^{x_i} mod p // beta is a q.r. mod p
    i := i+1
  endwhile
  return x_{i-1},...,x_0.

Note that L_1(x) != L_1(p-x) for x in Z_p^*: Since alpha^{(p-1)/2} === -1 (mod p), we have, for some j, x === alpha^j (mod p) and -x === alpha^{j + (p-1)/2} (mod p). Since (p-1)/2 is odd, j = log_{alpha} x and j + (p-1)/2 = log_{alpha}(-x) have opposite parity. Thus we can use this inequality to distinguish which square root of beta is the right one (i.e., whose discrete log is half that of beta).

Digital Signatures

  Confidentiality (attacks: eavesdropping)
* Integrity/authentication (attacks: forging, tampering)
  Availability (attacks: denial-of-service)

A _signature_scheme_ is a 5-tuple (P,A,K,S,V) where
  1. P is finite set of possible messages
  2. A is finite set of possible signatures
  3. K (keyspace) is finite set of possible keys
  4. For every K in K, there is an algo sig_K in S and an algo ver_K in V with
       sig_K : P -> A
       ver_K : P x A -> bool
     such that, for all x in P and y in A, ver_K(x,y) = [[ y = sig_K(x) ]]
(x,y) is a _signed_message_.

RSA Signatures

Alice "decrypts" her plaintext x using her private key as her signature. Bob verifies by "encrypting" the signature using Alice's public key and checking that the result is x. Can combine confidentiality and integrity.

2003/04/16

ElGamal Signatures

K = (p,alpha,a,beta), where alpha is in Z_p^*, a is in Z_{p-1}, and beta = alpha^a mod p. x in Z_p^* is message, k is secret random element of Z_{p-1}^* (k must be invertible mod p-1).
sig_K(x,k) = (gamma,delta), where
  gamma = alpha^k mod p (element of Z_p^*)
  delta = (x - a gamma)k^{-1} mod (p-1) (element of Z_{p-1}).
ver_K(x,(gamma,delta)) = [[ beta^{gamma} gamma^{delta} === alpha^x (mod p) ]].
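The ElGamal signing and verification rules just defined can be sketched with toy numbers. The values p = 467, alpha = 2, a = 127, the message x = 100, and k = 213 are for illustration only, and the function names are made up. The modular inverse uses Python's three-argument pow with exponent -1, which requires Python 3.8 or later.

```python
def elgamal_sign(p, alpha, a, x, k):
    """sig_K(x, k) = (gamma, delta); k must be invertible mod p-1."""
    gamma = pow(alpha, k, p)
    delta = ((x - a * gamma) * pow(k, -1, p - 1)) % (p - 1)
    return gamma, delta


def elgamal_verify(p, alpha, beta, x, sig):
    """Accept iff beta^gamma * gamma^delta === alpha^x (mod p)."""
    gamma, delta = sig
    if not 0 < gamma < p:
        return False
    lhs = (pow(beta, gamma, p) * pow(gamma, delta, p)) % p
    return lhs == pow(alpha, x, p)


p, alpha, a = 467, 2, 127          # toy parameters; a is the private key
beta = pow(alpha, a, p)            # public
signature = elgamal_sign(p, alpha, a, x=100, k=213)
ok = elgamal_verify(p, alpha, beta, 100, signature)
```

A fresh k should be used for every signature and kept secret, since delta is a linear equation in a and k.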
Use the fact that a gamma + k delta === x (mod p-1). To see this, start with alpha^x === beta^{gamma} gamma^{delta} (mod p), and make the substitutions gamma := alpha^k (mod p) and beta := alpha^a (mod p) to get alpha^x === alpha^{a gamma + k delta} (mod p). Get congruence of exponents mod p-1. Solve this for delta.

Security of this scheme: Oscar tries to forge a signature for a message x without knowing a. If he chooses a gamma and tries to find the corresponding delta, he must compute log_{gamma} alpha^x beta^{-gamma}. If he first chooses delta then tries to find gamma, he is trying to solve the equation
  beta^{gamma} gamma^{delta} === alpha^x (mod p)
for the unknown gamma. This problem has no known feasible solution. Likewise, cannot choose gamma, delta together and solve for x (discrete log again).

Oscar can sign a random message x by choosing gamma, delta, and x simultaneously. (Existential forgery.) Given i,j in Z_{p-1}, s.t. gamma = alpha^i beta^j, must satisfy
  alpha^x === beta^{gamma}(alpha^i beta^j)^{delta} (mod p)
equiv
  alpha^{x - i delta} === beta^{gamma + j delta} (mod p)
This will be satisfied if x - i delta === 0 (mod p-1) and gamma + j delta === 0 (mod p-1). Can easily solve for delta and x provided gcd(j,p-1) = 1. We get
  gamma = alpha^i beta^j mod p,
  delta = -gamma j^{-1} mod p-1,
  x = -gamma i j^{-1} mod p-1.

--------

Unlike confidentiality, where secrecy of information loses value over time, signatures (of copyrighted works, for example) may have to be verified years later than when they were created, so parameters must be big enough to resist foreseeable computational power. Thus p should be >= 2^{1024}, about. This makes the ElGamal signature have about 2048 bits, which is too long for certain devices (e.g., smart cards).

Schnorr's Signature Scheme (a variant of this is DSA)

Signatures are much shorter, but forging still seems to require computation in Z_p (p >= 2^{1024}). Suppose q | p-1 and is a prime with at least about 160 bits.
Let alpha be an element of Z_p^* with order q. Then the subgroup generated by alpha is isomorphic to Z_q. Assume a secure hash function h : {0,1}^* -> Z_q.

K = (q,p,alpha,a,beta), where beta = alpha^a mod p.
a is private (Alice), all else public. k is a secret random element of Z_q chosen by Alice.
sig_K(x,k) = (gamma,delta), where
   gamma = h(x || (alpha^k mod p)),
   delta = k + a gamma mod q,
for message x in {0,1}^*. gamma, delta in Z_q are 160 bits each.
ver_K(x,(gamma,delta)) = [[ h(x || (alpha^{delta} beta^{-gamma} mod p)) = gamma ]].
(A valid signature verifies because alpha^{delta} beta^{-gamma} === alpha^{k + a gamma - a gamma} === alpha^k (mod p).)

Forging (even an existential forgery) seems to require solving for gamma inside and outside the hash function. Highly unlikely if h is secure enough.

2003/04/21 (Easter break)

2003/04/23

Digital Signature Algorithm (DSA)

Proposed in 1991. Selection process by NIST was not public. Standard developed by NSA without input from industry. L originally fixed at 512.

p is an L-bit prime where 512 <= L <= 1024 and L === 0 (mod 64), such that the discrete log problem in Z_p is intractable. q is a 160-bit prime that divides p-1. Let alpha in Z_p^* be a q'th root of 1 (mod p) other than 1 (so alpha has order q).

P = {0,1}^*, A = Z_q^* x Z_q^*, and
K = {(p,q,alpha,a,beta) : beta === alpha^a (mod p) and 0 <= a <= q-1}.
a is private, all else is public.

For a key K and a secret random number k, 1 <= k <= q-1, define sig_K(x,k) = (gamma,delta), where
   gamma = (alpha^k mod p) mod q,
   delta = (SHA-1(x) + a gamma) k^{-1} mod q.
(If gamma = 0 or delta = 0 (prob about 2^{-160}), choose a new k.)

To verify x in {0,1}^* and gamma,delta in Z_q^*, compute
   e_1 = SHA-1(x) delta^{-1} mod q,
   e_2 = gamma delta^{-1} mod q.
ver_K(x,(gamma,delta)) <==> (alpha^{e_1} beta^{e_2} mod p) mod q = gamma.

Check that the valid signature verifies: Note that alpha has order q, so alpha^i === alpha^j (mod p) iff i === j (mod q).
   alpha^{e_1} beta^{e_2} === alpha^{e_1 + a e_2} (mod p)
                          === alpha^{delta^{-1}(SHA-1(x) + a gamma)} (mod p)
                          === alpha^k (mod p)
since SHA-1(x) + a gamma === k delta (mod q).
This congruence is equivalent to the equation alpha^{e_1} beta^{e_2} mod p = alpha^k mod p. Taking both sides mod q yields ver_K(x,(gamma,delta)).

To forge a signature, Oscar must solve
   alpha^{delta^{-1} SHA-1(x)} beta^{delta^{-1} gamma} mod p === gamma (mod q)
for gamma and delta (and x if an existential attack, which is unlikely to be easier, since only the hash value of x is used).

In October 2001, NIST recommended L = 1024.

Elliptic Curve DSA (ECDSA)

FIPS 186-2 (2000)

p is a prime or a power of 2, E an elliptic curve defined over GF(p). A is a point on E with prime order q, such that the discrete log problem in the subgroup generated by A is infeasible.

P = {0,1}^*, A = Z_q^* x Z_q^*.
K = {(p,q,E,A,m,B) : B = mA and 0 <= m <= q-1}.
m is private, the rest public.

For message x in {0,1}^* and secret random k, 1 <= k <= q-1, define sig_K(x,k) = (r,s) where
   kA = (u,v),
   r = u mod q,
   s = k^{-1} (SHA-1(x) + mr) mod q.
(If either r=0 or s=0, choose a different k.)

To verify (x,(r,s)) (x in {0,1}^* and r,s in Z_q^*), compute
   w = s^{-1} mod q,
   i = w (SHA-1(x)) mod q,
   j = wr mod q,
   (u,v) = iA + jB.
ver_K(x,(r,s)) <==> u mod q = r.

Provably Secure Signature Schemes

(Proof of security conditioned on the existence of functions with certain desirable hardness properties)

One-Time Sigs (only one message signed, verified an unlimited number of times)

Lamport Signature Scheme

k > 0 and P = {0,1}^k. Suppose f : Y -> Z is a one-way function, and let A = Y^k. Let y_{i,j} in Y be chosen at random, 1 <= i <= k, and j = 0, 1, and let z_{i,j} = f(y_{i,j}). Key K is the 2k y's and 2k z's. The y's are the private key and the z's are public.

sig_K(x_1,...,x_k) = (y_{1,x_1},...,y_{k,x_k}).
ver_K((x_1,...,x_k),(a_1,...,a_k)) <==> f(a_i) = z_{i,x_i}, 1<=i<=k.

Example: f(x) = alpha^x mod p, where alpha is primitive mod p.

If two messages differing in >= 2 bit positions are signed with the same key, Oscar can sign a message that uses any combination of bits in those positions (he knows both y_{i,0} and y_{i,1}).
The other bits must be the same as those common to the original two messages.

2003/04/28

Skipping full domain hash.

Undeniable Signatures (Chaum & van Antwerpen, 1989)

Alice's signature cannot be verified without her cooperation. Protects against unauthorized copying, distribution. Uses a _challenge-response_ protocol.

We must also keep Alice from falsely disavowing a valid signature. There is a _disavowal_ protocol that Alice can use to prove (in court, say) that a signature is invalid. This protocol won't work with a valid signature.

Potential practical problems: This runs against a defendant's (Alice's) usual right to "remain silent," which puts the active burden of proof (that Alice signed something) entirely on the prosecutor. If Alice forgets her private key, she will not be able to actively prove a forgery with the disavowal protocol (as well as losing the ability to verify her signature). So this scheme may not be appropriate in all contexts.

Three components: signing algo, verification protocol, disavowal protocol.

Chaum-van Antwerpen sig scheme:

p, q are primes with p = 2q + 1, and the discrete log problem in Z_p (to *any* base!) is intractable. alpha in Z_p^* has order q. a is in Z_q^*. G, the subgroup generated by alpha, is the multiplicative subgroup of Z_p^* of order q (which consists of exactly the quadratic residues mod p). (Rationale: want |G| to be as large as possible with prime order.)

P = A = G, and K = {(p,alpha,a,beta) : beta = alpha^a mod p}.
p,alpha,beta are public, a is private.

For K = (p,alpha,a,beta) and x in G, define
   y = sig_K(x) = x^a mod p.
(Note that there is no way to verify (x,y) without knowing a, since y could be any element of G (except the identity if x != 1), independent of x (without knowledge of a). Finding a is equivalent to an instance of the discrete log problem in Z_p.)

For x,y in G, use this protocol to verify:
1. Bob chooses e_1,e_2 in Z_q^* secretly at random.
2. Bob computes c = y^{e_1} beta^{e_2} mod p and sends it to Alice (c is the "challenge" and c should be != 1). (Without knowledge of e_1 and e_2, this is a completely random element of G - {1}.) If Bob knows any base b in G to which discrete log is (even partially) feasible, he can cheat here by sending b (see the next step).
3. Alice computes d = c^{a^{-1} mod q} mod p and sends it to Bob (d is Alice's response). (Again, finding a^{-1} mod q (and hence a) is discrete log to base c.)
4. Bob accepts y as a valid signature iff d === x^{e_1} alpha^{e_2} (mod p).

Bob accepts a valid signature (if Alice cooperates): (All exponents are reduced mod q.)
   d === c^{a^{-1}} (mod p)
     === y^{e_1 a^{-1}} beta^{e_2 a^{-1}} (mod p)
     === x^{e_1} alpha^{e_2} (mod p)

Alice cannot fool Bob into accepting a fraudulent signature, except with probability 1/(q-2) (no computational assumptions!):

Each challenge c != 1 corresponds to exactly q-2 many ordered pairs (e_1,e_2) (for every e_1 there is exactly one e_2 that yields c (except when c = y^{e_1}; then there is no e_2), since y and beta both generate G; if c = 1 there are q-1 many pairs). Alice has no way to tell which of the q-2 pairs Bob used.

If y =/= x^a (mod p), then any response d Alice makes is consistent with exactly one of the q-2 possible pairs: let i,j,k,l in Z_q be unique such that c === alpha^i, d === alpha^j, x === alpha^k, and y === alpha^l (mod p). The system of congruences
   c === y^{e_1} beta^{e_2} (mod p)
   d === x^{e_1} alpha^{e_2} (mod p)
is equivalent to
   i === l e_1 + a e_2 (mod q)
   j === k e_1 + e_2 (mod q).
Assuming y =/= x^a (mod p), we have l =/= ak (mod q), that is, the coefficient matrix of the system has nonzero determinant l - ak, and the system is thus uniquely solvable for e_1 and e_2. So Alice's response is consistent with exactly one of the q-2 pairs (e_1,e_2).

Disavowal Protocol:

We run the verification protocol twice, first with (e_1,e_2), then with (f_1,f_2) independently. Bob checks that the verification fails both times.
Then Bob is satisfied that y is a forgery iff
   (d alpha^{-e_2})^{f_1} === (D alpha^{-f_2})^{e_1} (mod p),
where d is Alice's first response and D is Alice's second response. This last check is that Alice is consistently following the protocol.

We need to prove two things:
1. Alice can convince Bob that an invalid signature is a forgery.
2. Alice cannot cheat and make Bob believe that a valid signature is a forgery except with very small probability.

1. If y =/= x^a (mod p) and Bob and Alice follow the disavowal protocol, then
   (d alpha^{-e_2})^{f_1} === (c^{a^{-1}} alpha^{-e_2})^{f_1}
                          === ((y^{e_1} alpha^{a e_2})^{a^{-1}} alpha^{-e_2})^{f_1}
                          === y^{e_1 a^{-1} f_1} alpha^{e_2 f_1} alpha^{-e_2 f_1}
                          === y^{e_1 a^{-1} f_1} (mod p)
and similarly,
   (D alpha^{-f_2})^{e_1} === y^{f_1 a^{-1} e_1} (mod p),
so the two sides agree and Bob's consistency check succeeds.

2003/04/30

2. If y === x^a (mod p), Bob follows the disavowal protocol (although Alice may not), and Alice's d =/= x^{e_1} alpha^{e_2} (mod p) and her D =/= x^{f_1} alpha^{f_2} (mod p), then Bob's final check succeeds with probability only 1/(q-2): (All exponents are reduced mod q.)

Suppose all of the hypotheses above are true, and consider the event that the final consistency check succeeds. Rewrite the consistency check as
   D === d_0^{f_1} alpha^{f_2} (mod p),
where
   d_0 = d^{e_1^{-1}} alpha^{-e_2 e_1^{-1}} mod p
is a value that depends only on the first run of the verification protocol (and not on f_1, f_2). The rewritten check says that D is an accepting response for y as a signature of d_0. By the previous argument (applied to the second run with d_0 in place of x), unless y is a (the) valid signature for d_0, this happens with probability only 1/(q-2). If y is a valid signature for d_0, then since y is also a valid signature for x, we have x^a === d_0^a (mod p), which implies that x = d_0. But then
   x^{e_1} alpha^{e_2} === d_0^{e_1} alpha^{e_2} === d (mod p),
which contradicts our assumption, so the check can succeed only with probability 1/(q-2).
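The signing algorithm, verification protocol, and disavowal protocol above can be sketched together in Python. The parameters p = 467, q = 233, alpha = 4, a = 101 are toy values assumed for illustration (they are not part of these notes), all function names are hypothetical, and negative exponents in three-argument pow need Python 3.8 or later. Alice is assumed to follow the protocol honestly; the sketch does not model a cheating Alice.

```python
import secrets

# Toy parameters (assumed for illustration; far too small for real use).
p, q = 467, 233              # primes with p = 2q + 1
alpha = 4                    # a quadratic residue != 1, so it has order q
a = 101                      # Alice's private key, in Z_q^*
beta = pow(alpha, a, p)      # public

def sign(x):
    """Alice: y = x^a mod p, for x in G."""
    return pow(x, a, p)

def challenge(y):
    """Bob: pick secret e1, e2 in Z_q^* and form c = y^e1 beta^e2 mod p, c != 1."""
    while True:
        e1 = secrets.randbelow(q - 1) + 1
        e2 = secrets.randbelow(q - 1) + 1
        c = pow(y, e1, p) * pow(beta, e2, p) % p
        if c != 1:
            return e1, e2, c

def respond(c):
    """Alice: d = c^(a^-1 mod q) mod p."""
    return pow(c, pow(a, -1, q), p)

def accepts(x, e1, e2, d):
    """Bob: accept iff d === x^e1 alpha^e2 (mod p)."""
    return d == pow(x, e1, p) * pow(alpha, e2, p) % p

def disavow(x, y):
    """Two verification runs plus the consistency check.
    Returns True iff Bob is satisfied that y is a forgery."""
    e1, e2, c = challenge(y)
    d = respond(c)
    f1, f2, c2 = challenge(y)
    D = respond(c2)
    if accepts(x, e1, e2, d) or accepts(x, f1, f2, D):
        return False                     # verification must fail both times
    lhs = pow(d * pow(alpha, -e2, p) % p, f1, p)
    rhs = pow(D * pow(alpha, -f2, p) % p, e1, p)
    return lhs == rhs

x = 25                       # 5^2 mod p, hence an element of G
y = sign(x)
e1, e2, c = challenge(y)
print(accepts(x, e1, e2, respond(c)))    # verification of the valid signature
print(disavow(x, y))                     # disavowal of the valid signature
print(disavow(x, y * alpha % p))         # disavowal of a forged signature
```

With an honest Alice, verification of a forged y in G fails for every e_1 in Z_q^*, since y^{e_1 a^{-1}} === x^{e_1} would force y === x^a. The 1/(q-2) bound only matters when Alice deviates from the protocol.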
More Interesting Topics

Certificates and Certification Authorities (CAs): part of the public key infrastructure -- how does Bob know that he really receives Alice's public key, and not that of Oscar pretending to be Alice?

Alice presents her public key a and reasonable proof of identity (driver's license, corporate documents, etc.) to a reputable (trusted) third party entity, a CA. If the CA is satisfied that it is really Alice who is claiming to be "Alice", the CA will issue a _certificate_ to Alice containing:
(i) the name "Alice" (some sort of full domain name),
(ii) the public key a, and
(iii) optional other information such as expiration date, grade of certificate, type of entity Alice is (private person, corporation, government agency, etc.), etc.
The certificate is signed with the CA's private key. Bob (and everybody else) knows and trusts the CA's public key. Alice presents her certificate to Bob, and Bob verifies the CA's signature.

One CA may vouch for other CAs -- get tree (forest) structure of distributed trust. Some well-known and well-trusted CAs: Verisign, Microsoft, NSA, MIT, ...

Certificate Revocation Lists (CRLs): Needed because Alice's certificate may have to be revoked before it expires (e.g., Alice is fired from her company and is disgruntled). CRLs should be kept in a secure public directory.

Secret Sharing (threshold and otherwise):

t-n Threshold Secret Sharing: A secret quantity D (the code to launch a nuclear missile, for example) is distributed among n Army generals in such a way that any subset of generals of size at least t can cooperate to recover D, but no subset of size smaller than t can get *any* information about D.

Bit Commitment:

Committing phase: Alice chooses a message m and sends a commitment y to Bob.
Revealing phase: Alice reveals message m to Bob.

We need the protocol to satisfy two properties:
Binding: Alice cannot change m after the Committing phase.
Concealing: Bob does not have any information about m until the Revealing phase.

Useful to ensure nonadaptive contract bids, for example.

Zero-knowledge (ZK) proof-based authentication:

Alice knows a secret s, corresponding to some public key. She authenticates herself to Bob by proving that she knows s in a way that reveals no information about s to Bob (except the fact that Alice knows s). This prevents repeated-use degradation.

Fiat-Shamir protocol:

Initialization:
1. A trusted third party, Rusty, generates an RSA-like modulus n = pq, where p and q are distinct large primes. Rusty publishes n but keeps p and q secret.
2. Alice selects a secret s in Z_n^*, computes v = s^2 mod n, and registers v with Rusty as her public key.

Protocol: do the following t times:
   Alice -> Bob: x = r^2 mod n, where r is a secret random element of Z_n^* (commitment) and x is the witness
   Bob -> Alice: random e in {0,1} (challenge)
   Alice -> Bob: y = r s^e mod n (response)
   Bob accepts the round if y != 0 and y^2 === x v^e (mod n).
If Bob accepts all t rounds, then Bob accepts the proof.

Oscar, not knowing s and trying to impersonate Alice, can only be sure of correctly responding to one of the two possible e values. To be sure of answering e=0 correctly, he must send x = r^2 mod n as a witness, where he knows r. If e=1, however, Oscar must respond with a square root of x v (mod n), which is \pm r s (mod n). If Oscar wants to be sure of responding to e=1 correctly, he must send x = r^2/v, but then if e=0, he must give a square root of r^2/v, which is \pm r/s. So on each round, Oscar will fail with probability >= 1/2.

No extra knowledge of s is conveyed to Bob: If e=0, then Alice sends y=r, which is completely independent of s. If e=1, then Alice sends y = rs mod n, which looks completely random to Bob because he does not know r. Key point: acceptable communications (x,y) from Alice could be simulated by Bob himself: Bob chooses y randomly, then defines x = y^2 or x = y^2/v.
This assumes that computing square roots mod n is difficult (it is equivalent to factoring n).
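The Fiat-Shamir rounds and Bob's simulation argument can be sketched in Python. The modulus n = 1009 * 1013 is a toy value assumed for illustration (not from these notes), the function names are hypothetical, and pow(v, -e, n) needs Python 3.8 or later. The simulator shows the "key point" above: Bob can produce triples (x, e, y) that pass his own check without knowing s.

```python
import math
import secrets

def random_unit(n):
    """Return a random element of Z_n^*."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r

def fiat_shamir_round(n, s, v):
    """One round between an honest Alice (who knows s) and Bob."""
    r = random_unit(n)
    x = pow(r, 2, n)                 # Alice -> Bob: witness
    e = secrets.randbelow(2)         # Bob -> Alice: challenge
    y = r * pow(s, e, n) % n         # Alice -> Bob: response
    return y != 0 and pow(y, 2, n) == x * pow(v, e, n) % n

def authenticate(n, s, v, t=20):
    """Bob accepts the proof iff he accepts all t rounds."""
    return all(fiat_shamir_round(n, s, v) for _ in range(t))

def simulate_round(n, v):
    """Bob alone: choose e and y first, then set x = y^2 / v^e mod n."""
    e = secrets.randbelow(2)
    y = random_unit(n)
    x = pow(y, 2, n) * pow(v, -e, n) % n
    return x, e, y

# Toy modulus (assumed for illustration; far too small for real use).
n = 1009 * 1013
s = random_unit(n)                   # Alice's secret
v = pow(s, 2, n)                     # Alice's public key

print(authenticate(n, s, v))
x, e, y = simulate_round(n, v)
print(pow(y, 2, n) == x * pow(v, e, n) % n)
```

The simulator works only because it fixes e before x. Oscar, who must send x before seeing e, cannot use the same trick, which is the source of his failure probability of at least 1/2 per round.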