RSA was patented in 1983 and has since become a cornerstone of secure communications in the modern world. How does it work?
To say that RSA is an example of asymmetric (public key-private key) encryption would be stating the obvious. It is possibly the most well-known asymmetric encryption algorithm. Often used in combination with other encryption schemes, RSA is a foundation of modern secure communications. It rests on the prime factorization problem and a one-way function re-discovered by Ron Rivest, Adi Shamir, and Leonard Adleman. RSA is an acronym for the last names of those three men. Last time, we discussed asymmetric cryptography in general (link). This time, we’ll dig into RSA.
Before the 1970s, symmetric encryption ruled secure communications. Then, James H. Ellis published the first big step toward asymmetric encryption. The UK’s intelligence agency (GCHQ) quickly classified his work. At the GCHQ, Clifford Cocks and Malcolm Williamson expanded upon Ellis’s work. Several years later, Ralph Merkle published an early form of asymmetric encryption, and his work was in the public sphere. His work influenced the design of the Diffie-Helman Key Exchange. 1977 saw Rivest et. al put forward a one-way function that was difficult to invert. The pieces for RSA were in place.
Conceptually, asymmetric encryption is pretty simple. You communicate securely with two keys (the encryption and decryption keys). The keys are the inverse of each other. In other words, the decryption key unscrambles the encryption key and vice versa. Without the appropriate key, unscrambling the messages is hard. So, the encryption process is one way. How would you do that with math?
Prime factorization is the basis of asymmetric cryptography. Finding the multiple of two primes (e.g., 907 * 773 = 701,111) is easy. However, finding the two primes that multiply to a number is hard and gets harder as the number grows in size. But, the problem becomes easy again with additional information (e.g., one of the prime factors). In other words, prime factorization is a trapdoor function. We can use it to lock and secure our message, and it would be difficult to remove the lock without a key.
What good is a lock without a key? The first step in RSA is to choose two prime numbers (p and q) to find our modulus (n). In practice, to keep our message secure, we would use two random prime numbers that are very large and not close to each other. We'll use 907 and 773 for our example.
p = 907
q = 773
n = 907 * 773
= 701,111
Next up is Carmichael’s totient function. Carmichael's totient function is the smallest positive divisor of Euler's totient function that satisfies the conclusion of Euler's theorem. In other words, it is the number of positive integers less than n that are coprime with n.
λ(n) = lcm (p − 1, q − 1)
Where λ(n) is the totient function of n and lcm is the lowest common multiple. So, with our p and q, the equation is:
λ(n) = lcm (907 - 1, 773 - 1)
= 349,716
With the totient in hand, let's generate our public key. The public key is simply a prime number e and a modulus (n), such that:
c = mᵉ mod n
The message is m, c is the ciphertext, and ᵉ mod n is the public key. Since the public key is not secret, e doesn't have to be random. Let's go with 11. We have already calculated our modulus.
e = 11
n = 701,111
Let's say we want to encrypt the message 4. The equation becomes:
c = 4¹¹ mod 701,111
= 688,749
So, our message is encrypted. Only the holder of the secret key can decrypt it.
So, we have a public key that we can broadcast. Now, we need to find the private key that goes with it. The equation is:
d =1/e mod λ(n)
where d and n make up our private key. In other words, we take the modular inverse of the totient we calculated earlier to find our private key.
d =1/11 mod 349,716
= 254,339
To decrypt the ciphertext we found above, we use a simple equation:
m = cᵈ mod n
So,
m = 688,749²⁵⁴³³⁹ mod 701,111
= 4
That takes care of the math behind RSA. It is simple and elegant. However, the format of the ciphertext can give clues about the message. So an attacker can make deductions about the communication even if it is encrypted. That is, of course, undesirable. So, we must pad the message before encrypting it. That means we will add randomized data to it to help disguise the format. There are multiple ways that an RSA implementation can use to pad messages (e.g., optimal asymmetric encryption padding) and multiple standards for such tasks (e.g., PKCS).
That is RSA. It has secured emails, digital transactions, and online chats. Given its prominence, it is fair to wonder, what are its weak points?
RSA is fundamentally about the difficulty of prime factorization. As computers develop, prime factorization may get easier. For example, a 768-bit key has been broken. Secure RSA keys should be 2048-bits or longer. The recommended length will increase as computer technology progresses.
RSA is also vulnerable if the generated prime numbers are not generated randomly enough or are too close together. Additionally, the number d cannot be too small. A cryptographically secure pseudo-random number generator can help address the problem of poor key generation.