RSA algorithm

From Citizendium
Revision as of 00:16, 26 October 2008 by imported>Sandy Harris (re-order, change some heading levels)
Jump to navigation Jump to search
This article is developed but not approved.
Main Article
Discussion
Related Articles  [?]
Bibliography  [?]
External Links  [?]
Citable Version  [?]
 
This editable, developed Main Article is subject to a disclaimer.

The RSA algorithm is the best known public key encryption algorithm. Like any public key system, it can be used to create digital signatures as well as for secrecy.

It is named for its inventors Ron Rivest, Adi Shamir and Leonard Adeleman. The original paper defining it is "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" [1]. by those three authors.

The algorithm was patented U.S. Patent 4,405,829, PDF in 1983 by MIT, but released into the public domain in 2000.

The inventors started a company, RSA Laboratories, which became a major player in the information security industry. Rivest in particular went on to invent additional systems which the company marketed, including the MD4 and MD5 hash algorithms and various encryption algorithms with names of the form RCn including RC4, a widely used stream cipher.

Staff at RSA Laboratories wrote a specification for use of the RSA algorithm in Internet protocols. The current version is RFC 2437. RSA is used in many protocols, including IPsec, Open PGP and DNS security.

How it works

To generate an RSA key pair, the system first finds two distinct primes p, q and the product N = pq. Take p-1 and q-1 and find either the product T = (p-1)(q-1) or, better, the least common multiple T = lcm( p-1, q-1) ( discussion). Then choose encryption exponent e and decryption exponent d such that d*e == 1 modulo T. The public key is then the pair (N,e) and the private key (N,d).

The strength parameter of the system is the length of N in bits. As of 2008, 1024 bits is considered secure but some users choose larger sizes to be on the safe side.

Using t for plaintext and c for ciphertext, encryption is then:

 c = te  modulo N

and decryption, using m for the decrypted message, is:

 m = cd modulo N

so we have:

 m = (te)d modulo N
 m = tde modulo N

whence (via the proof below)

 m = t   modulo N

That is, the encrypt/decrypt pair of operations always give the correct result.

Proof

The proof is based on theorems proved by Fermat, back in the 17th century,

For prime p and any x:

 xp == x   modulo p

and for non-zero x:

 xp-1 == 1   modulo p

Whence, for any k and non-zero x:

 xk(p-1) == 1   modulo p

so for non-zero x:

 xk(p-1)(q-1) == 1   modulo p

and

 xk(p-1)(q-1)+1 == x   modulo p

but that also holds if x is zero modulo p, since then both sides are zero.

Similarly,

 xk(p-1)(q-1)+1 == x   modulo q

so if p and q are distinct primes

 xk(p-1)(q-1)+1 == x   modulo pq

But we have:

 de == 1 mod T
 de = k(p-1)(q-1)+1   for some k

and

 m = tde    modulo N which is modulo pq

so

 m = t      in all cases


Security

A few considerations are critical for the security of RSA:

  • Generating primes which an enemy cannot guess requires a strong random number generator; any weakness in the generator reduces security.
  • Using the system securely requires that the private key never be revealed; losing that key to an enemy instantly reduces security to zero.
  • Discovery of an efficient solution to the integer factorisation problem would break RSA. See below

For more subtle problems, see Attacks on RSA.


RSA and factoring

Given an efficient solution to the integer factorisation problem, breaking RSA would become trivial. The attacker is assumed to have the public key (N,e). If he can factor N, he gets p, q and therefore p-1, q-1, and T. He knows e and can calculate its inverse mod T using the efficient Extended Euclidean algorithm. That gives him d and he already has N, so now he knows the private key (N,d). The cryptosystem would be rendered worthless.

The problem with that is that no efficient solution for factoring is known, despite considerable effort by quite a few people over several decades to find one. It seems possible no such algorithm exists, though no-one has proven that.

However, while no really efficient (polynomial in the number of bits in N) methods are known, there are various methods that do work (see integer factorisation), those methods are improving, and computers get faster all the time (see Moore's Law). It is advisable to use large RSA moduli to provide a safety factor against these changes.

RSA Laboratories ran a factoring contest for some years, with large cash prizes, to test the security of RSA. This ended in 2007.

Implementation differences

T can be calculated in two ways. The original RSA paper, and other references such as Schneier [2] and PKCS1v1, RFC 2313, give T = (p-1)(q-1). Other references such as PKCS1v2, RFC 2437, give T = lcm(p-1, q-1). Menezes at al. give both [3]. Using the least common multiple has now become the usual implementation practice. It is slightly more efficient, but the system works either way. In one sense, the difference is not important.

However it can lead to interoperation problems if two implementations do it differently. In an example from IPsec, the FreeS/WAN implementer used the product while PGPnet used the least common multiple. About half of the signatures generated with PGPnet failed to verify on FreeS/WAN. Several users, both implementers, and some managers attempted to find the problem, without success until the specification discrepancy was noticed.

References

  1. Rivest et al. "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems"
  2. Schneier, Bruce (2nd edition, 1996,), Applied Cryptography, John Wiley & Sons, ISBN 0-471-11709-9
  3. Menezes, AJ; PC van Oorschot & SA Vanstone (Fifth Edition, 2001), Handbook of Applied Cryptography, ISBN 0-8493-8523-7