mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-24 12:23:32 +01:00
199 lines
8.0 KiB
Plaintext
199 lines
8.0 KiB
Plaintext
Filename: 158-microdescriptors.txt
|
|
Title: Clients download consensus + microdescriptors
|
|
Author: Roger Dingledine
|
|
Created: 17-Jan-2009
|
|
Status: Open
|
|
|
|
0. History
|
|
|
|
15 May 2009: Substantially revised based on discussions on or-dev
|
|
from late January. Removed the notion of voting on how to choose
|
|
microdescriptors; made it just a function of the consensus method.
|
|
(This lets us avoid the possibility of "desynchronization.")
|
|
Added suggestion to use a new consensus flavor. Specified use of
|
|
SHA256 for new hashes. -nickm
|
|
|
|
15 June 2009: Cleaned up based on comments from Roger. -nickm
|
|
|
|
1. Overview
|
|
|
|
This proposal replaces section 3.2 of proposal 141, which was
|
|
called "Fetching descriptors on demand". Rather than modifying the
|
|
circuit-building protocol to fetch a server descriptor inline at each
|
|
circuit extend, we instead put all of the information that clients need
|
|
either into the consensus itself, or into a new set of data about each
|
|
relay called a microdescriptor.
|
|
|
|
Descriptor elements that are small and frequently changing should go
|
|
in the consensus itself, and descriptor elements that are small and
|
|
relatively static should go in the microdescriptor. If we ever end up
|
|
with descriptor elements that aren't small yet clients need to know
|
|
them, we'll need to resume considering some design like the one in
|
|
proposal 141.
|
|
|
|
Note also that any descriptor element which clients need to use to
|
|
decide which servers to fetch info about, or which servers to fetch
|
|
info from, needs to stay in the consensus.
|
|
|
|
2. Motivation
|
|
|
|
See
|
|
http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
|
|
http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
|
|
http://archives.seul.org/or/dev/Nov-2008/msg00007.html
|
|
for a discussion of the options and why this is currently the best
|
|
approach.
|
|
|
|
3. Design
|
|
|
|
There are three pieces to the proposal. First, authorities will list in
|
|
their votes (and thus in the consensus) the expected hash of
|
|
microdescriptor for each relay. Second, authorities will serve
|
|
microdescriptors, directory mirrors will cache and serve
|
|
them. Third, clients will ask for them and cache them.
|
|
|
|
3.1. Consensus changes
|
|
|
|
If the authorities choose a consensus method of a given version or
|
|
later, a microdescriptor format is implicit in that version.
|
|
A microdescriptor should in every case be a pure function of the
|
|
router descriptor and the consensus method.
|
|
|
|
In votes, we need to include the hash of each expected microdescriptor
|
|
in the routerstatus section. I suggest a new "m" line for each stanza,
|
|
with the base64 of the SHA256 hash of the router's microdescriptor.
|
|
|
|
For every consensus method that an authority supports, it includes a
|
|
separate "m" line in each router section of its vote, containing:
|
|
"m" SP methods 1*(SP AlgorithmName "=" digest) NL
|
|
where methods is a comma-separated list of the consensus methods
|
|
that the authority believes will produce "digest".
|
|
|
|
(As with base64 encoding of SHA1 hashes in consensuses, let's
|
|
omit the trailing =s)
|
|
|
|
The consensus microdescriptor-elements and "m" lines are then computed
|
|
as described in Section 3.1.2 below.
|
|
|
|
(This means we need a new consensus-method that knows
|
|
how to compute the microdescriptor-elements and add "m" lines.)
|
|
|
|
The microdescriptor consensus uses the directory-signature format from
|
|
proposal 162, with the "sha256" algorithm.
|
|
|
|
|
|
3.1.1. Descriptor elements to include for now
|
|
|
|
In the first version, the microdescriptor should contain the
|
|
onion-key element, and the family element from the router descriptor,
|
|
and the exit policy summary as currently specified in dir-spec.txt.
|
|
|
|
3.1.2. Computing consensus for microdescriptor-elements and "m" lines
|
|
|
|
When we are generating a consensus, we use whichever m line
|
|
unambiguously corresponds to the descriptor digest that will be
|
|
included in the consensus.
|
|
|
|
(If different votes have different microdescriptor digests for a
|
|
single <descriptor-digest, consensus-method> pair, then at least one
|
|
of the authorities is broken. If this happens, the consensus should
|
|
contain whichever microdescriptor digest is most common. If there is
|
|
no winner, we break ties in the favor of the lexically earliest.
|
|
Either way, we should log a warning: there is definitely a bug.)
|
|
|
|
The "m" lines in a consensus contain only the digest, not a list of
|
|
consensus methods.
|
|
|
|
3.1.3. A new flavor of consensus
|
|
|
|
Rather than inserting "m" lines in the current consensus format,
|
|
they should be included in a new consensus flavor (see proposal
|
|
162).
|
|
|
|
This flavor can safely omit descriptor digests.
|
|
|
|
When we implement this voting method, we can remove the exit policy
|
|
summary from the current "ns" flavor of consensus, since no current
|
|
clients use them, and they take up about 5% of the compressed
|
|
consensus.
|
|
|
|
This new consensus flavor should be signed with the sha256 signature
|
|
format as documented in proposal 162.
|
|
|
|
3.2. Directory mirrors fetch, cache, and serve microdescriptors
|
|
|
|
Directory mirrors should fetch, catch, and serve each microdescriptor
|
|
from the authorities. (They need to continue to serve normal relay
|
|
descriptors too, to handle old clients.)
|
|
|
|
The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
|
|
available at:
|
|
http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
|
|
(We use base64 for size and for consistency with the consensus
|
|
format. We use -s instead of +s to separate these items, since
|
|
the + character is used in base64 encoding.)
|
|
|
|
All the microdescriptors from the current consensus should also be
|
|
available at:
|
|
http://<hostname>/tor/micro/all.z
|
|
so a client that's bootstrapping doesn't need to send a 70KB URL just
|
|
to name every microdescriptor it's looking for.
|
|
|
|
Microdescriptors have no header or footer.
|
|
The hash of the microdescriptor is simply the hash of the concatenated
|
|
elements.
|
|
|
|
Directory mirrors should check to make sure that the microdescriptors
|
|
they're about to serve match the right hashes (either the hashes from
|
|
the fetch URL or the hashes from the consensus, respectively).
|
|
|
|
We will probably want to consider some sort of smart data structure to
|
|
be able to quickly convert microdescriptor hashes into the appropriate
|
|
microdescriptor. Clients will want this anyway when they load their
|
|
microdescriptor cache and want to match it up with the consensus to
|
|
see what's missing.
|
|
|
|
3.3. Clients fetch them and cache them
|
|
|
|
When a client gets a new consensus, it looks to see if there are any
|
|
microdescriptors it needs to learn. If it needs to learn more than
|
|
some threshold of the microdescriptors (half?), it requests 'all',
|
|
else it requests only the missing ones. Clients MAY try to
|
|
determine whether the upload bandwidth for listing the
|
|
microdescriptors they want is more or less than the download
|
|
bandwidth for the microdescriptors they do not want.
|
|
|
|
Clients maintain a cache of microdescriptors along with metadata like
|
|
when it was last referenced by a consensus, and which identity key
|
|
it corresponds to. They keep a microdescriptor
|
|
until it hasn't been mentioned in any consensus for a week. Future
|
|
clients might cache them for longer or shorter times.
|
|
|
|
3.3.1. Information leaks from clients
|
|
|
|
If a client asks you for a set of microdescs, then you know she didn't
|
|
have them cached before. How much does that leak? What about when
|
|
we're all using our entry guards as directory guards, and we've seen
|
|
that user make a bunch of circuits already?
|
|
|
|
Fetching "all" when you need at least half is a good first order fix,
|
|
but might not be all there is to it.
|
|
|
|
Another future option would be to fetch some of the microdescriptors
|
|
anonymously (via a Tor circuit).
|
|
|
|
Another crazy option (Roger's phrasing) is to do decoy fetches as
|
|
well.
|
|
|
|
4. Transition and deployment
|
|
|
|
Phase one, the directory authorities should start voting on
|
|
microdescriptors, and putting them in the consensus.
|
|
|
|
Phase two, directory mirrors should learn how to serve them, and learn
|
|
how to read the consensus to find out what they should be serving.
|
|
|
|
Phase three, clients should start fetching and caching them instead
|
|
of normal descriptors.
|
|
|