mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-24 20:33:31 +01:00
183b5905bb
We don't need to explain the difference between 2nd preimage and collision: anybody who doesn't know can use wikipedia.
248 lines
11 KiB
Plaintext
248 lines
11 KiB
Plaintext
Filename: xxx-what-uses-sha1.txt
|
|
Title: Where does Tor use SHA-1 today?
|
|
Authors: Nick Mathewson, Marian
|
|
Created: 30-Dec-2008
|
|
Status: Meta
|
|
|
|
|
|
Introduction:
|
|
|
|
Tor uses SHA-1 as a message digest. SHA-1 is showing its age:
|
|
theoretical attacks for finding collisions against it get better
|
|
every year or two, and it will likely be broken in practice before
|
|
too long.
|
|
|
|
According to smart crypto people, the SHA-2 functions (SHA-256, etc)
|
|
share too much of SHA-1's structure to be very good. RIPEMD-160 is
|
|
also based on flawed past hashes. Some people think other hash
|
|
functions (e.g. Whirlpool and Tiger) are not as bad; most of these
|
|
have not seen enough analysis to be used yet.
|
|
|
|
Here is a 2006 paper about hash algorithms.
|
|
http://www.sane.nl/sane2006/program/final-papers/R10.pdf
|
|
|
|
(Todo: Ask smart crypto people.)
|
|
|
|
By 2012, the NIST SHA-3 competition will be done, and with luck we'll
|
|
have something good to switch too. But it's probably a bad idea to
|
|
wait until 2012 to figure out _how_ to migrate to a new hash
|
|
function, for two reasons:
|
|
1) It's not inconceivable we'll want to migrate in a hurry
|
|
some time before then.
|
|
2) It's likely that migrating to a new hash function will
|
|
require protocol changes, and it's easiest to make protocol
|
|
changes backward compatible if we lay the groundwork in
|
|
advance. It would suck to have to break compatibility with
|
|
a big hard-to-test "flag day" protocol change.
|
|
|
|
This document attempts to list everything Tor uses SHA-1 for today.
|
|
This is the first step in getting all the design work done to switch
|
|
to something else.
|
|
|
|
This document SHOULD NOT be a clearinghouse of what to do about our
|
|
use of SHA-1. That's better left for other individual proposals.
|
|
|
|
|
|
Why now?
|
|
|
|
The recent publication of "MD5 considered harmful today: Creating a
|
|
rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob
|
|
Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de
|
|
Weger has reminded me that:
|
|
|
|
* You can't rely on theoretical attacks to stay theoretical.
|
|
* It's quite unpleasant when theoretical attacks become practical
|
|
and public on days you were planning to leave for vacation.
|
|
* Broken hash functions (which SHA-1 is not quite yet AFAIU)
|
|
should be dropped like hot potatoes. Failure to do so can make
|
|
one look silly.
|
|
|
|
|
|
Triage
|
|
|
|
How severe are these problems? Let's divide them into these
|
|
categories, where H(x) is the SHA-1 hash of x:
|
|
PREIMAGE -- find any x such that a H(x) has a chosen value
|
|
-- A SHA-1 usage that only depends on preimage
|
|
resistance
|
|
* Also SECOND PREIMAGE. Given x, find a y not equal to
|
|
x such that H(x) = H(y)
|
|
COLLISION<role> -- A SHA-1 usage that depends on collision
|
|
resistance, but the only party who could mount a
|
|
collision-based attack is already in a trusted role
|
|
(like a distribution signer or a directory authority).
|
|
COLLISION -- find any x and y such that H(x) = H(y) -- A
|
|
SHA-1 usage that depends on collision resistance
|
|
and doesn't need the attacker to have any special keys.
|
|
|
|
There is no need to put much effort into fixing PREIMAGE and SECOND
|
|
PREIMAGE usages in the near-term: while there have been some
|
|
theoretical results doing these attacks against SHA-1, they don't
|
|
seem to be close to practical yet. To fix COLLISION<code-signing>
|
|
usages is not too important either, since anyone who has the key to
|
|
sign the code can mount far worse attacks. It would be good to fix
|
|
COLLISION<authority> usages, since we try to resist bad authorities
|
|
to a limited extent. The COLLISION usages are the most important
|
|
to fix.
|
|
|
|
Kelsey and Schneier published a theoretical second preimage attack
|
|
against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
|
|
and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes
|
|
require minimal effort.
|
|
|
|
http://www.schneier.com/paper-preimages.html
|
|
|
|
Additionally, we need to consider the impact of a successful attack
|
|
in each of these cases. SHA-1 collisions are still expensive even
|
|
if recent results are verified, and anybody with the resources to
|
|
compute one also has the resources to mount a decent Sybil attack.
|
|
|
|
Let's be pessimistic, and not assume that producing collisions of
|
|
a given format is actually any harder than producing collisions at
|
|
all.
|
|
|
|
|
|
What Tor uses hashes for today:
|
|
|
|
1. Infrastructure.
|
|
|
|
A. Our X.509 certificates are signed with SHA-1.
|
|
COLLSION
|
|
B. TLS uses SHA-1 (and MD5) internally to generate keys.
|
|
PREIMAGE?
|
|
* At least breaking SHA-1 and MD5 simultaneously is
|
|
much more difficult than breaking either
|
|
independently.
|
|
C. Some of the TLS ciphersuites we allow use SHA-1.
|
|
PREIMAGE?
|
|
D. When we sign our code with GPG, it might be using SHA-1.
|
|
COLLISION<code-signing>
|
|
* GPG 1.4 and up have writing support for SHA-2 hashes.
|
|
This blog has help for converting:
|
|
http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
|
|
E. Our GPG keys might be authenticated with SHA-1.
|
|
COLLISION<code-signing-key-signing>
|
|
F. OpenSSL's random number generator uses SHA-1, I believe.
|
|
PREIMAGE
|
|
|
|
2. The Tor protocol
|
|
|
|
A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
|
|
PREIMAGE?
|
|
B. Our CREATE cell format uses SHA-1 for: OAEP padding.
|
|
PREIMAGE?
|
|
C. Our EXTEND cells use SHA-1 to hash the identity key of the
|
|
target server.
|
|
COLLISION
|
|
D. Our CREATED cells use SHA-1 to hash the derived key data.
|
|
??
|
|
E. The data we use in CREATE_FAST cells to generate a key is the
|
|
length of a SHA-1.
|
|
NONE
|
|
F. The data we send back in a CREATED/CREATED_FAST cell is the length
|
|
of a SHA-1.
|
|
NONE
|
|
G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
|
|
value.
|
|
NONE
|
|
H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
|
|
used more as a checksum than as a strong digest.
|
|
NONE
|
|
|
|
3. Directory services
|
|
|
|
[All are COLLISION or COLLISION<authority> ]
|
|
|
|
A. All signatures are generated on the SHA-1 of their corresponding
|
|
documents, using PKCS1 padding.
|
|
* In dir-spec.txt, section 1.3, it states,
|
|
"SIGNATURE" Object contains a signature (using the signing key)
|
|
of the PKCS1-padded digest of the entire document, taken from
|
|
the beginning of the Initial item, through the newline after
|
|
the Signature Item's keyword and its arguments."
|
|
So our attacker, Malcom, could generate a collision for the hash
|
|
that is signed. Thus, a second pre-image attack is possible.
|
|
Vulnerable to regular collision attack only if key is stolen.
|
|
If the key is stolen, Malcom could distribute two different
|
|
copies of the document which have the same hash. Maybe useful
|
|
for a partitioning attack?
|
|
B. Router descriptors identify their corresponding extra-info documents
|
|
by their SHA-1 digest.
|
|
* A third party might use a second pre-image attack to generate a
|
|
false extra-info document that has the same hash. The router
|
|
itself might use a regular collision attack to generate multiple
|
|
extra-info documents with the same hash, which might be useful
|
|
for a partitioning attack.
|
|
C. Fingerprints in router descriptors are taken using SHA-1.
|
|
* The fingerprint must match the public key. Not sure what would
|
|
happen if two routers had different public keys but the same
|
|
fingerprint. There could perhaps be unpredictable behaviour.
|
|
D. In router descriptors, routers in the same "Family" may be listed
|
|
by server nicknames or hexdigests.
|
|
* Does not seem critical.
|
|
E. Fingerprints in authority certs are taken using SHA-1.
|
|
F. Fingerprints in dir-source lines of votes and consensuses are taken
|
|
using SHA-1.
|
|
G. Networkstatuses refer to routers identity keys and descriptors by their
|
|
SHA-1 digests.
|
|
H. Directory-signature lines identify which key is doing the signing by
|
|
the SHA-1 digests of the authority's signing key and its identity key.
|
|
I. The following items are downloaded by the SHA-1 of their contents:
|
|
XXXX list them
|
|
J. The following items are downloaded by the SHA-1 of an identity key:
|
|
XXXX list them too.
|
|
|
|
4. The rendezvous protocol
|
|
|
|
A. Hidden servers use SHA-1 to establish introduction points on relays,
|
|
and relays use SHA-1 to check incoming introduction point
|
|
establishment requests.
|
|
B. Hidden servers use SHA-1 in multiple places when generating hidden
|
|
service descriptors.
|
|
* The permanent-id is the first 80 bits of the SHA-1 hash of the
|
|
public key
|
|
** time-period performs caclulations using the permanent-id
|
|
* The secret-id-part is the SHA-1 has of the time period, the
|
|
descriptor-cookie, and replica.
|
|
* Hash of introduction point's identity key.
|
|
C. Hidden servers performing basic-type client authorization for their
|
|
services use SHA-1 when encrypting introduction points contained in
|
|
hidden service descriptors.
|
|
D. Hidden service directories use SHA-1 to check whether a given hidden
|
|
service descriptor may be published under a given descriptor
|
|
identifier or not.
|
|
E. Hidden servers use SHA-1 to derive .onion addresses of their
|
|
services.
|
|
* What's worse, it only uses the first 80 bits of the SHA-1 hash.
|
|
However, the rend-spec.txt says we aren't worried about arbitrary
|
|
collisons?
|
|
F. Clients use SHA-1 to generate the current hidden service descriptor
|
|
identifiers for a given .onion address.
|
|
G. Hidden servers use SHA-1 to remember digests of the first parts of
|
|
Diffie-Hellman handshakes contained in introduction requests in order
|
|
to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be
|
|
taking a hash of a hash here.
|
|
H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
|
|
a connecting client.
|
|
|
|
5. The bridge protocol
|
|
|
|
XXXX write me
|
|
|
|
A. Client may attempt to query for bridges where he knows a digest
|
|
(probably SHA-1) before a direct query.
|
|
|
|
6. The Tor user interface
|
|
|
|
A. We log information about servers based on SHA-1 hashes of their
|
|
identity keys.
|
|
COLLISION
|
|
B. The controller identifies servers based on SHA-1 hashes of their
|
|
identity keys.
|
|
COLLISION
|
|
C. Nearly all of our configuration options that list servers allow SHA-1
|
|
hashes of their identity keys.
|
|
COLLISION
|
|
E. The deprecated .exit notation uses SHA-1 hashes of identity keys
|
|
COLLISION
|