About the Cryptopocalypse

You're browsing without Javascript! If you have no idea what that means, you should ask your technical friend about it.
Otherwise - kudos. The website should work - with the exception of comments. You can learn what they are; however, by visiting /comments.py?postid=blog-cryptopocalypse_followup
If it doesn't, please contact me and let me know.

20 Aug 2013 9:11 EST

Randomness, and DSA

Last weekend, Bitcoin reported that several Android-based wallets were insecure and that bitcoin were stolen from them - all because the individual transactions (authenticated by a ECDSA signature) used a poor source of randomness. When you use a poor source of randomness in a DSA (or ECDSA) signature, you may reuse one of the values in the signature, the k value. Reusing k can expose the private key - something Sony knows well, as it was the flaw that exposed a private key in the PS3.

But the Bitcoin apps used SecureRandom, which as the name implies is actually supposed to be cryptographically secure. On Android, it's the recommended, "cryptographically secure" library... and it wasn't secure. And the results have been as catastrophic as they can be - long term private keys have been recovered. Why? Mathematically speaking, we understand why - but why do we rely on DSA, when it has this horrendous property? We know that random number generators will fail. Whether it's this instance, Mining Your P's and Q's, or Debian's openssl bug - even in seemingly highly secure or reliable instances... we still shouldn't trust the random number generator fully. While every algorithm requires strong randomness for key generation, only DSA requires strong randomness for every signature - RSA does not. It's time to stop letting DSA hold a gun to our heads.

Fortunately, we have a solution, and it doesn't necessarily require moving to another algorithm. We can incrementally fix DSA. The k value doesn't necessarily have to be random - it just must be unique and secret. RFC 6979 is a mechanism to generate a unique number based off the key and message, to ensure it's not reused. Dubbed "deterministic DSA" - you can generate signatures with this technique that are completely compatible with any DSA signature-checking implementation. With all the problems DSA has given us, deterministic DSA (or another scheme that does not rely on randomness for every single signature) really needs to be the standard way of generating signatures.

The option to do this in OpenSSL (EC_KEY_set_nonce_from_hash) has been committed (by Adam Langley) and will be available in new versions. To be pedantic, another signature scheme that creates a deterministic, unique value is the ECC-based ed25519 signature scheme; and ~~Schnorr signatures also reduce the dependence on randomness~~ - Schnorr signatures do depend on the RNG.

Cyptopocalypse, and Joux's Work

So let's talk about Cryptopocalypse. What I anticipated being a relatively low impact talk urging folks to examine how we're betting the farm in one particular area (which is always a bad idea) - has turned into a momentous media surge. A lot of folks have rightly picked on how we portrayed the applicability of Joux's work to RSA. I want to follow up with some more details for folks who weren't able to attend, as a lot of the media coverage has been second and third hand accounts.

To start off, in the talk I gave about several examples of the relationship between factoring and discrete logs. It turns out some of the parallels go back even further... to the 1920s and 1930s! "Théories des nombres" (1922) and "On Factoring Large Numbers" (1931) talk about solving Discrete Logs and Factoring respectively, both using the techniques of combining congruencies. For other historical notes, and more math, there's some excellent links worth pointing to in the discussions of the talk.

But let's talk more about the recent work. A lot of focus in the media has been on RSA. Which is understandable, because of how entrenched we are with RSA. Let me go back to the math though. Joux's record setting Discrete Logs are done in a field of small characteristic that is also highly composite. A small characteristic is not what Diffie Hellman, DSA, and ElGamal is done in; they are done in 'large characteristic' fields or more commonly 'prime fields'. (Also, to be clear, this is 'classic' DH and DSA, as opposed to the Elliptic Curve flavors: ECDH and ECDSA.)

There are roughly four steps of the Function Field Sieve:

Polynomial Selection
Sieving
Linear Algebra
the Descent

Polynomial selection is done to find a polynomial that will be used in sieving, and sieving finds relations between the logs of elements in the field. The Linear Algebra is used to solve these relations to get the logarithms of them. The output of the Linear Algebra is a set of vectors, where each vector represents the discrete log of an element. Imagine them as a system of equations. The Descent starts at the specific element you want to solve for, and recursively uses these systems of equations to rewrite the specific element into smaller and smaller terms that you can eventually solve. (That's why it's called the Descent.)

Joux chooses a specific polynomial - this is roughly analogous to the polynomial selection for factoring special numbers (using the Special Number Field Sieve). Based on the structure of the individual target, you choose a polynomial that fits well. The sieving is sped up due to the small characteristic and also because the field being worked on is highly composite. These are optimizations that don't apply directly to prime fields.

Before 2013, in what some people are calling 'classic Function Field Sieve' (which is an interesting hint about how much things have changed) - the Descent was actually pretty fast. Way less than 1% of the run time of an entire Function Field Sieve process. But now, with the new techniques, the Descent is much harder. The sieving is very fast, and you complete the linear algebra quickly - but there are much fewer relations than there were before. It's much harder to descend to them. Before, people didn't care about optimizing the Descent - now it's very important.

And the Descent has been optimized for fields of a small characteristic. It's been optimized to the point of quasi-polynomial, a fantastic achievement. In fact, as the characteristic gets larger (up to "medium" characteristic), the algorithm still applies. It gets slower however - it's basically slower and slower as the characteristic grows.

Does the new Descent algorithm apply to prime fields though? Not right now. And the reason why is very interesting. During the Descent, the major tool used is detecting smooth elements. (A smooth element is one that has small prime factors.) For fields of a small characteristic, this is fast and efficient, because the elements can be represented by polynomials, and detecting the smoothness of a polynomial is easy. For prime fields, the elements seemingly cannot be represented that way. In that case, detecting the smoothness of an element is done by... factoring it.

A major limitation of solving discrete logs in prime fields today, is the lack of an efficient factoring algorithm! Now, for prime fields, ECM is used. (EC stands for 'Elliptic Curve', in yet another all-math-is-related-but-don't-get-confused-because-we-haven't-changed-topics situation. We haven't started talking about Elliptic Curves or anything.) ECM is used because it turns out to be more efficient than the General Number Field Sieve (GNFS) when a number has small factors. (For breaking RSA keys, we use the GNFS.) ECM is being improved, and there are several papers published on the topic each year.

But to circle back home, I wanted to explain Joux's work a slight level deeper (but still quite above the actual published papers.) The recent work have gotten mathematicians very excited about discrete logs, and in essence, Joux's work is very analogous to the Special Number Field Sieve, which is used to factor large numbers that have very specific properties.

The work does not have obvious implications to prime fields (which is what DH, DSA, and ElGamal use) and it does not have obvious implications to RSA (from no less than the the man himself). But all of these problems are somewhat related, and improvements in one area can lead to improvements in another - either from indirect applications of the research, or from focused scrutiny and excitement. Advances in the General Number Field Sieve have evolved out of the Special Number Field Sieve.

We need much more flexibility in our cryptographic protocols. When advances are made, we are bit by them - again, and again, and again. MD5 was known to be weak in the mid-2000s - but it took generating a rogue certificate authority to convince browsers and CAs they needed to deprecate it right away. (And it still took them a few years.) And even after that - Stuxnet still exploited a MD5-based vulnerability. Single-DES was known to be brute-forcible in 1999, and yet we're still breaking deployed protocols relying on it today: MSCHAPv2 in 2012, SIM Cards and PEAP network authentication in 2013. RSA-1024 is known to be weak today (estimates I've seen have suggested it's doable with 3-11 TB RAM clusters, which isn't terribly large). And while (some) browsers seem to plan on distrusting 1024-bit root certs in early 2014 - we're going to be trusting 1024-bit certs for much longer than we should be. (Oh, and Apache has no option to increase the DH bit-length above 1024, so your "Perfect Forward Secrecy" DHE sessions aren't looking too good.)

If you aren't planning for the Cryptopocalypse of whatever algorithm and keysize you're trusting right now - you're doomed to repeat the same mistakes. Because we keep falling victim to known-broken algorithms again, and again, and again.

ECC & TLS

In our slides, we mentioned that TLS 1.2 was the first to include ECC options. Some ciphersuites will not work in prior versions of TLS (like GCM-based ciphersuites) but others will. Adam Langley pointed out you can use ECDH, ECDHE, and ECDSA in TLS 1.0, so long as your browser supports it (and nearly all up to date browsers do). Also, a small correction - you are able to have the CA and Certificate signing algorithm not match prior to TLS 1.2 - cross-signing is possible in TLS 1.0.

Something I'd like to be called out on, that we haven't been yet, is being able to easily buy a ECC-based SSL cert from a Certificate Authority. It sure would be nice to be able to buy an ECC certificate for my website. Now I've had one or two people ask me - if you don't trust RSA, what's the point of getting an ECC cert when the whole internet trusts tons of RSA-based CAs?

And the answer is Certificate Pinning. You can use Chrome's preloaded pins, the almost-completed Public Key Pinning header, the forthcoming TACK extension, and iOS and Android solutions to pin to your ECC cert, and not have to worry about RSA or compromised certificate authorities.

ECC is Nice

At the end of our talk we lay out a lot of recommendations for people.

If you work on an OS or a language - make ECC easy to use, and use it yourself
If you work on a browser - keep pushing TLS 1.2. (All browsers are working on this, to be fair)
If you write software - support TLS 1.2 on your webservers, make your protocols upgradable (or use TLS 1.2), and don't place all your trust for software updates in any single thing like RSA
If you're a CA - make it easy to buy ECC certs, and document the process
If you're a normal company - turn on ECDHE, survey your exposure

Making ECC built-in, accessible, and relied-on makes sense. The Prime Field curves have strong security margins, shorter ciphertext (awesome for cellular and mobile), and efficiency gains. We're having major difficulties moving off RSA 1024, and that shouldn't be trusted today. Starting supporting ECC today will make it so much less painful to migrate to it when we need to. It's much, much easier for clients (especially browsers) to disable support for a newly vulnerable feature than it is to deploy a brand-new feature entirely.

As an important footnote, Colin Percival and Dennis F pointed out that binary curves give most people pause. And the reasoning is excellent - Binary Curves have an 'inherent structure', and we seem to be pretty good at exploiting inherent structures. So staying on the P curves is prudent.

Do we have anything else?

Now I just talked about needing flexibility, and then said ECC a bunch of times. Betting the farm on ECC isn't flexibility. Flexibility comes from RSA + ECC, ideally in tandem, but if not, being able to switch between them easily. But what about things besides ECC?

As Matt Green points out, there's the McEliece Cryptosystem which only has 187 kilobit - 1 megabit keys. That's sarcasm.) And there's NTRU, which looks pretty awesome. It's efficient, well studied, and even is standardized. It is however patented. There's some flexbility there, and the patent may be expiring soon, but just as OCB mode never caught on (probably due to the patent), NTRU isn't catching on either. A shame, in my opinion, as NTRU neatly avoids the 'Quantum Cryptopocalypse' also.

I would be remiss if I did not thank Jason Papadopoulos and Pierrick Gaudry for all their work helping me understand this stuff. Any errors are entirely mine and not theirs.

Comments

/comments.py?postid=blog-cryptopocalypse_followup

Add a comment...

Name: required
E-Mail: required, hidden, gravatared
Website:
Comment: required, markdown enabled (help)

you type:	you see:
italics	italics
bold	bold
[stolen from reddit!](http://reddit.com)	stolen from reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"