$Id$

Tor directory protocol for 0.1.1.x series

0. Scope and preliminaries

This document should eventually be merged into tor-spec.txt and replace
the existing notes on directories.

This is not a finalized version; what we actually wind up implementing
may be very different from the system described here.

0.1. Goals

There are several problems with the way Tor handles directories right
now:
  1. Directories are very large and use a lot of bandwidth.
  2. Every directory server is a single point of failure.
  3. Requiring every client to know every server won't scale.
  4. Requiring every directory cache to know every server won't scale.
  5. Our current "verified server" system is kind of nonsensical.
  6. Getting more directory servers adds more points of failure and
     worsens possible partitioning attacks.

This design tries to solve every problem except problems 3 and 4, and to
be compatible with likely eventual solutions to problems 3 and 4.

1. Outline

There is no longer any such thing as a "signed directory". Instead,
directory servers sign a very compressed 'network status' object that
lists the current descriptors and their status, and router descriptors
continue to be self-signed by servers. Clients download network status
listings periodically, and download router descriptors as needed. ORs
upload descriptors relatively infrequently.

There are multiple directory servers. Rather than having the directory
servers do anything complicated to coordinate among themselves, clients
simply rotate through them in order, and only use servers that most of
the last several directory servers like.

2. Router descriptors

The router descriptor format is unchanged from tor-spec.txt.

ORs SHOULD generate a new router descriptor whenever any of the
following events has occurred:

  - A period of time (18 hrs by default) has passed since the last
    time a descriptor was generated.

  - A descriptor field other than bandwidth or uptime has changed.

  - Bandwidth has changed by more than +/- 50% from the last time a
    descriptor was generated, and at least a given interval of time
    (20 mins by default) has passed since then.

  - Uptime has been reset.

After generating a descriptor, ORs upload it to every directory
server they know.

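The regeneration rules above can be sketched as a single predicate. This
is only an illustration, not the Tor implementation: the function name,
its parameters, and the constant names are hypothetical, and the
treatment of a previous bandwidth of zero is an assumption that the text
does not address.

```python
# Defaults taken from the list above; the constant names are hypothetical.
REGENERATE_INTERVAL = 18 * 60 * 60        # 18 hours, in seconds
BANDWIDTH_MIN_INTERVAL = 20 * 60          # 20 minutes, in seconds
BANDWIDTH_CHANGE_FRACTION = 0.5           # +/- 50%


def should_regenerate(now, last_generated, other_field_changed,
                      old_bandwidth, new_bandwidth, uptime_reset):
    """Return True if any of the four listed events has occurred."""
    elapsed = now - last_generated
    if elapsed >= REGENERATE_INTERVAL:
        return True
    if other_field_changed or uptime_reset:
        return True
    # Relative bandwidth change since the last descriptor.
    # Assumption: any change away from zero counts as a large change.
    if old_bandwidth > 0:
        change = abs(new_bandwidth - old_bandwidth) / old_bandwidth
    else:
        change = float("inf") if new_bandwidth > 0 else 0.0
    return (change > BANDWIDTH_CHANGE_FRACTION
            and elapsed >= BANDWIDTH_MIN_INTERVAL)
```

Note that a large bandwidth change alone is not enough; the 20-minute
interval must also have passed, which keeps a fluctuating OR from
uploading descriptors continuously.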
3. Network status

Directory servers generate, sign, and compress a network-status document
as needed. As an optimization, they may rate-limit the number of such
documents generated to once every few seconds. Directory servers should
rate-limit at least to the point where these documents are generated no
faster than once per second.

The network status document contains a preamble, a set of router status
entries, and a signature, in that order.

We use the same meta-format as used for directories and router descriptors
in "tor-spec.txt".

The preamble contains:

  "network-status-version" -- A document format version. For this
     specification, the version is "2".
  "dir-source" -- The hostname, current IP address, and directory
     port of the directory server, separated by spaces.
  "fingerprint" -- A base16-encoded hash of the signing key's
     fingerprint, with no additional spaces added.
  "contact" -- An arbitrary string describing how to contact the
     directory server's administrator. Administrators should include at
     least an email address and a PGP fingerprint.
  "dir-signing-key" -- The directory server's public signing key.
  "client-versions" -- A comma-separated list of recommended client versions.
  "server-versions" -- A comma-separated list of recommended server versions.
  "published" -- The publication time for this network-status object.
  "dir-options" -- A set of flags separated by spaces:
     "Names" if this directory server performs name bindings.
     "Versions" if this directory server recommends software versions.

The dir-options entry is optional. The "-versions" entries are required if
the "Versions" flag is present. The other entries are required and must
appear exactly once. The "network-status-version" entry must appear first;
the others may appear in any order.

For each router, the router entry contains the following. (This format is
designed for conciseness.)

  "r" -- followed by the following elements, separated by spaces:
     - The OR's nickname.
     - A hash of its identity key, encoded in base64, with trailing =
       signs removed.
     - A hash of its most recent descriptor, encoded in base64, with
       trailing = signs removed. (The hash is calculated as for
       computing the signature of a descriptor.)
     - The publication time of its most recent descriptor.
     - An IP address.
     - An OR port.
     - A directory port (or "0" for none).
  "s" -- A series of space-separated status flags:
     "Exit" if the router is useful for building general-purpose exit
        circuits.
     "Stable" if the router tends to stay up for a long time.
     "Fast" if the router has high bandwidth.
     "Running" if the router is currently usable.
     "Named" if the router's identity-nickname mapping is canonical.
     "Valid" if the router has been 'validated'.
     "Authority" if the router is a directory authority.

The "r" entry for each router must appear first and is required. The
"s" entry is optional. Unrecognized flags, or extra elements on the
"r" line, must be ignored.

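Two details of the router entry are easy to get wrong: restoring the
base64 padding that was stripped from the hashes, and ignoring flags
that the reader does not recognize. A minimal sketch of both follows.
The function names are hypothetical, and the 20-byte digest in the usage
note is only an example length; this section does not name the hash
function.

```python
import base64

# Flags defined in this section; anything else on an "s" line is ignored.
KNOWN_FLAGS = {"Exit", "Stable", "Fast", "Running", "Named", "Valid",
               "Authority"}


def encode_unpadded(digest: bytes) -> str:
    """Base64-encode a hash and remove the trailing '=' signs."""
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def decode_unpadded(text: str) -> bytes:
    """Restore the removed '=' padding, then base64-decode."""
    padding = "=" * (-len(text) % 4)
    return base64.b64decode(text + padding)


def parse_s_line(line: str) -> set:
    """Return the recognized flags from an "s" line."""
    keyword, *flags = line.split()
    if keyword != "s":
        raise ValueError("not an 's' line: %r" % line)
    return {flag for flag in flags if flag in KNOWN_FLAGS}
```

For example, a 20-byte digest encodes to 27 characters once the single
padding character is removed, and decode_unpadded() recovers the
original bytes.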
The signature section contains:

  "directory-signature" -- A signature of the rest of the document using
     the directory server's signing key.

We compress the network status list with zlib before transmitting it.

4. Directory server operation

By default, directory servers remember all non-expired, non-superseded OR
descriptors that they have seen.

For each OR, a directory server remembers whether the OR was running and
functional the last time the directory server tried to connect to it, and
possibly other liveness information.

Directory server administrators may label some servers or IPs as
blacklisted, and elect not to include them in their network-status lists.

Thus, the network-status list includes all non-blacklisted,
non-expired, non-superseded descriptors for ORs that the directory has
observed at least once to be running.

Directory server administrators may decide to support name binding. If
they do, then they must maintain a file of nickname-to-identity-key
mappings, and try to keep this file consistent with other directory
servers. If they don't, they act as clients, and report bindings made by
other directory servers (name X is bound to identity Y if at least one
binding directory lists it, and no directory binds X to some other Y').

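The rule in parentheses above, for directory servers that do not bind
names themselves, can be sketched as follows. The representation of each
binding directory's opinion as a nickname-to-identity dictionary, and
the function name, are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def reported_binding(name: str,
                     binding_dirs: Iterable[Dict[str, str]]) -> Optional[str]:
    """Return the identity bound to `name`, or None if there is none.

    `name` is bound to Y if at least one binding directory lists it and
    no binding directory binds it to a different identity.
    """
    identities = {d[name] for d in binding_dirs if name in d}
    if len(identities) == 1:
        return next(iter(identities))
    # Either no directory binds the name, or the directories disagree.
    return None
```

A directory that does not mention the name at all neither supports nor
blocks the binding under this reading.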
The authoritative network-status published by a host should be available at:
   http://<hostname>/tor/status/authority.z

An authoritative network-status published by another host with fingerprint
<F> should be available at:
   http://<hostname>/tor/status/fp/<F>.z

The authoritative network-statuses published by other hosts with
fingerprints <F1>,<F2>,<F3> should be available at:
   http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z

The most recent network-status documents from all known authoritative
directories, concatenated, should be available at:
   http://<hostname>/tor/status/all.z

The most recent descriptor for a server whose identity key has a
fingerprint of <F> should be available at:
   http://<hostname>/tor/server/fp/<F>.z

The most recent descriptors for servers with fingerprints <F1>,<F2>,<F3>
should be available at:
   http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z

The descriptor for a server whose digest (in hex) is <D> should be
available at:
   http://<hostname>/tor/server/d/<D>.z

The descriptors with digests <D1>,<D2>,<D3> should be available at:
   http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z

The most recent descriptor for this server should be at:
   http://<hostname>/tor/server/authority.z

A concatenated set of the most recent descriptors for all known servers
should be available at:
   http://<hostname>/tor/server/all.z

For debugging, directories MAY expose non-compressed objects at URLs like
the above, but without the final ".z".

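As an illustration of the URL layout above, the following sketch builds
the multi-object URLs by joining fingerprints or digests with "+". The
function names are hypothetical; the paths are the ones listed in this
section, and the identifiers are passed through unchanged.

```python
from typing import Sequence


def _tor_url(hostname: str, path: str, compressed: bool = True) -> str:
    """Build http://<hostname>/tor/<path>, with ".z" unless debugging."""
    return "http://%s/tor/%s%s" % (hostname, path, ".z" if compressed else "")


def _join(items: Sequence[str]) -> str:
    if not items:
        raise ValueError("at least one fingerprint or digest is required")
    return "+".join(items)


def status_url(hostname, fingerprints, compressed=True):
    """Network-statuses published by the given authorities."""
    return _tor_url(hostname, "status/fp/" + _join(fingerprints), compressed)


def descriptor_url_by_fingerprint(hostname, fingerprints, compressed=True):
    """Most recent descriptors for the servers with these fingerprints."""
    return _tor_url(hostname, "server/fp/" + _join(fingerprints), compressed)


def descriptor_url_by_digest(hostname, digests, compressed=True):
    """Descriptors with the given hex digests."""
    return _tor_url(hostname, "server/d/" + _join(digests), compressed)
```

Passing compressed=False yields the optional debugging form without the
final ".z".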
Clients MUST handle compressed concatenated information in two forms:
  - A concatenated list of zlib-compressed objects.
  - A zlib-compressed concatenated list of objects.
Directory servers MAY generate either format: the former requires less
CPU, but the latter requires less bandwidth.

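One way for a client to accept both forms is to decompress one zlib
stream at a time and restart on whatever bytes follow the end of each
stream. The sketch below uses Python's standard zlib module; the
function name is hypothetical, and it returns the concatenated plaintext
without splitting it back into individual documents.

```python
import zlib


def decompress_concatenated(data: bytes) -> bytes:
    """Decompress one or more back-to-back zlib streams.

    Handles both a concatenated list of zlib-compressed objects and a
    single zlib-compressed concatenation of objects.
    """
    pieces = []
    remaining = data
    while remaining:
        stream = zlib.decompressobj()
        pieces.append(stream.decompress(remaining))
        pieces.append(stream.flush())
        if not stream.eof:
            raise ValueError("truncated zlib stream")
        # Bytes after the end of this stream belong to the next one.
        remaining = stream.unused_data
    return b"".join(pieces)
```

Either form yields the same bytes, so the caller can parse the result
the same way regardless of which format the directory server chose.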
4.1. Caching

Directory caches (most ORs) regularly download network status documents,
and republish them at a URL based on the directory server's identity key:
   http://<hostname>/tor/status/<identity fingerprint>.z

A concatenated list of all network-status documents should be available at:
   http://<hostname>/tor/status/all.z

4.2. Compression

5. Client operation

Every OP or OR, including directory servers, acts as a client to the
directory protocol.

Each client maintains a list of trusted directory servers. Periodically
(currently every 20 minutes), the client downloads a new network status. It
chooses the directory server from which its current information is most
out-of-date, and retries on failure until it finds a running server.

When choosing ORs to build circuits, clients proceed as follows:
  - A server is "listed" if it is listed by more than half of the "live"
    network status documents the client has downloaded. (A network
    status is "live" if it is the most recently downloaded network status
    document for a given directory server, and the server is a directory
    server trusted by the client, and the network-status document is no
    more than D (say, 10) days old.)
  - A server is "valid" if it is listed as valid by more than half of the
    "live" downloaded network-status documents.
  - A server is "running" if it is listed as running by more than
    half of the "recent" downloaded network-status documents.
    (A network status is "recent" if it was published in the last
    60 minutes. If there are fewer than 3 such documents, the most
    recently published 3 are "recent." If there are fewer than 3 in all,
    all are "recent.")

Clients store network status documents so long as they are live.

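The majority rules above can be sketched as follows. This is an
illustration under stated assumptions, not the Tor implementation: the
NetworkStatus type and all function names are hypothetical, the caller
is assumed to pass in only "live" documents, and times are plain
seconds.

```python
from typing import Dict, List, NamedTuple, Set

RECENT_WINDOW = 60 * 60   # "published in the last 60 minutes"
MIN_RECENT = 3            # fall back to the 3 most recently published


class NetworkStatus(NamedTuple):
    published: int                 # publication time, in seconds
    flags: Dict[str, Set[str]]     # router identity -> status flags


def recent_statuses(live: List[NetworkStatus], now: int) -> List[NetworkStatus]:
    recent = [ns for ns in live if now - ns.published <= RECENT_WINDOW]
    if len(recent) < MIN_RECENT:
        # Too few: use the most recently published 3 (or all, if fewer).
        by_age = sorted(live, key=lambda ns: ns.published, reverse=True)
        recent = by_age[:MIN_RECENT]
    return recent


def more_than_half(count: int, total: int) -> bool:
    return total > 0 and 2 * count > total


def is_listed(router: str, live: List[NetworkStatus]) -> bool:
    return more_than_half(sum(router in ns.flags for ns in live), len(live))


def has_flag(router: str, flag: str, statuses: List[NetworkStatus]) -> bool:
    votes = sum(flag in ns.flags.get(router, set()) for ns in statuses)
    return more_than_half(votes, len(statuses))


def is_valid(router: str, live: List[NetworkStatus]) -> bool:
    return has_flag(router, "Valid", live)


def is_running(router: str, live: List[NetworkStatus], now: int) -> bool:
    return has_flag(router, "Running", recent_statuses(live, now))
```

Exactly half is not a majority here: a router listed as valid by two of
four live documents is not considered valid.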
5.1. Scheduling network status downloads

This download scheduling algorithm implements the approach described above
in a relatively low-state fashion. It reflects the current Tor
implementation.

Clients maintain a list of authorities; each client tries to keep the same
list, in the same order.

Periodically, on startup, and on HUP, clients check whether they need to
download fresh network status documents. The approach is as follows:
  - If we have under X network status documents newer than OLD, we choose a
    member of the list at random and try to download XX documents starting
    with that member's.
  - Otherwise, if we have no network status documents newer than NEW, we
    check to see which authority's document we retrieved most recently,
    and try to retrieve the next authority's document. If we can't, we
    try the next authority in sequence, and so on.

5.2. Managing naming

In order to provide human-memorable names for individual server
identities, some directory servers bind names to IDs. Clients handle
names in two ways:

If a client is encountering a name it has not mapped before:

   If all the "binding" network-status documents the client has received
   so far claim that the name binds to some identity X, and the
   client has received at least three network-status documents, the client
   maps the name to X.

If a client is encountering a name it has mapped before:

   It uses the last-mapped identity value, unless all of the "binding"
   network-status documents bind the name to some other identity.

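The two cases above can be sketched as one function. Several points are
assumptions, since the text leaves them open: a binding document that
does not bind the name at all is treated as not agreeing, "some other
identity" is read as one single identity shared by all binding
documents, and the function and parameter names are hypothetical.

```python
from typing import Optional, Sequence

MIN_STATUSES_FOR_NEW_MAPPING = 3


def unanimous_identity(claims: Sequence[Optional[str]]) -> Optional[str]:
    """Identity that every binding document agrees on, or None.

    `claims` holds one entry per "binding" network-status document: the
    identity it binds the name to, or None if it does not bind the name.
    """
    if not claims or any(claim is None for claim in claims):
        return None
    first = claims[0]
    return first if all(claim == first for claim in claims) else None


def resolve_name(previous: Optional[str],
                 claims: Sequence[Optional[str]],
                 statuses_received: int) -> Optional[str]:
    """Return the identity to use for a name, or None if it stays unmapped."""
    agreed = unanimous_identity(claims)
    if previous is None:
        # Name not mapped before: need unanimity and at least 3 documents.
        if agreed is not None and statuses_received >= MIN_STATUSES_FOR_NEW_MAPPING:
            return agreed
        return None
    # Name mapped before: keep it unless all binding documents disagree.
    if agreed is not None and agreed != previous:
        return agreed
    return previous
```

The asymmetry is deliberate: creating a mapping requires agreement, and
so does overturning one, so a single dissenting binding document cannot
change an existing mapping.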
5.3. Notes on what we do now.

THIS SECTION SHOULD BE FOLDED INTO THE EARLIER SECTIONS; THEY ARE WRONG;
THIS IS RIGHT.

All downloaded networkstatuses are discarded once they are 10 days old (by
published date).

Authdirs download each other's networkstatus every
AUTHORITY_NS_CACHE_INTERVAL minutes (currently 10).

Directory caches download authorities' networkstatus every
NONAUTHORITY_NS_CACHE_INTERVAL minutes (currently 10).

Clients always try to replace any networkstatus received more than
NETWORKSTATUS_MAX_VALIDITY ago (currently 2 days). Also, when the most
recently received networkstatus is more than
NETWORKSTATUS_CLIENT_DL_INTERVAL (30 minutes) old, and we do not have any
open directory connections fetching a networkstatus, clients try to
download the networkstatus on their list after the most recently received
networkstatus, skipping failed networkstatuses. A networkstatus is
"failed" if NETWORKSTATUS_N_ALLOWABLE_FAILURES (3) attempts in a row have
all failed.

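The "next on the list, skipping failed" selection can be sketched as
below. The function name and the bookkeeping (a dictionary of
consecutive failure counts per authority) are hypothetical, and wrapping
around from the end of the list to its start is an assumption based on
clients rotating through the directory servers in order.

```python
from typing import Dict, Optional, Sequence

NETWORKSTATUS_N_ALLOWABLE_FAILURES = 3


def next_authority_to_fetch(authorities: Sequence[str],
                            last_received: Optional[str],
                            failures: Dict[str, int]) -> Optional[str]:
    """Pick the authority after `last_received`, skipping failed ones.

    `failures` maps an authority to its count of consecutive failed
    attempts. Returns None if every authority is currently "failed".
    """
    count = len(authorities)
    if last_received in authorities:
        start = authorities.index(last_received) + 1
    else:
        start = 0
    for offset in range(count):
        candidate = authorities[(start + offset) % count]
        if failures.get(candidate, 0) < NETWORKSTATUS_N_ALLOWABLE_FAILURES:
            return candidate
    return None
```

If every other authority has failed, the loop comes back around to the
authority that was fetched most recently.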
We do not update router statuses if we have fewer than half of the
networkstatuses.

A networkstatus is "live" if it is the most recent we have received signed
by a given trusted authority.

A networkstatus is "recent" if it is "live" and:
  - it was received in the last DEFAULT_RUNNING_INTERVAL (currently 60
    minutes)
  OR - it was one of the MIN_TO_INFLUENCE_RUNNING (3) most recently received
    networkstatuses.

Authorities always believe their own opinion as to a router's status. For
other Tors:
  - a router is valid if more than half of the live networkstatuses think
    it's valid.
  - a router is named if more than half of the live networkstatuses from
    naming authorities think it's named, and they all think it has the
    same name.
  - a router is running if more than half of the recent networkstatuses
    think it's running.

Everyone downloads router descriptors as follows:

  - If any networkstatus lists a more recently published routerdesc with a
    different descriptor digest, and no more than
    MAX_ROUTERDESC_DOWNLOAD_FAILURES attempts to retrieve that routerdesc
    have failed, then that routerdesc is "downloadable".

  - Every DirFetchInterval, or whenever a request for routerdescs returns
    no routerdescs, we launch a set of requests for all downloadable
    routerdescs. We divide the downloadable routerdescs into groups of no
    more than DL_PER_REQUEST, and send a request for each group to
    directory servers chosen independently.

  - We also launch a request as above when a request for routerdescs
    fails and we have no directory connections fetching routerdescs.

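Splitting the downloadable routerdescs into requests can be sketched as
follows. The value of DL_PER_REQUEST is not given here, so it is a
parameter; the function name is hypothetical, and choosing each server
uniformly at random is an assumption about what "chosen independently"
means.

```python
import random
from typing import List, Optional, Sequence, Tuple


def plan_descriptor_requests(
        downloadable: Sequence[str],
        servers: Sequence[str],
        dl_per_request: int,
        rng: Optional[random.Random] = None) -> List[Tuple[str, List[str]]]:
    """Return (server, digests) pairs, one per request to launch."""
    if dl_per_request < 1:
        raise ValueError("dl_per_request must be at least 1")
    if not downloadable:
        return []
    if not servers:
        raise ValueError("no directory servers to ask")
    rng = rng or random.Random()
    # Groups of no more than dl_per_request descriptors each.
    groups = [list(downloadable[i:i + dl_per_request])
              for i in range(0, len(downloadable), dl_per_request)]
    # Each group gets its own independently chosen directory server.
    return [(rng.choice(servers), group) for group in groups]
```

Because the servers are chosen independently, two groups may land on the
same directory server; nothing in the rule above forbids that.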
TODO Specify here:
  - When to 0-out failure count for networkstatus?

  - Drop fallback to download-all. Also, always split download.

  - For versions: if you're listed by more than half of live versioning
    networkstatuses, good. If less than half of networkstatuses are live,
    don't do anything. If half are live, and half or less of the
    versioning ones list you, warn. Only warn once every 24 hours.

  - For names: warn if an unnamed router is specified by nickname.
    Rate-limit these warnings.
  - Also, don't believe N->K if another naming authdir says N->K'.
  - Revise naming rule: N->K is true if any naming directory says N->K,
    and no other naming directory says N->K' or N'->K.

  - Minimum info to build circuits.

  - Revise: always split requests when we have too little info to build
    circuits.

  - Describe when router is "out of date". (Any dirserver says so.)

  - Change rule from "do not launch new connections when one exists" to
    "do not request any fingerprint that we're currently requesting."

  - Launch new connections every minute, plus whenever a download fails.
  - Reset routerdesc failure count after 60 minutes, or when
    network comes back on after absence.
  - Make "I didn't get the one I thought was most recent" a failure.
  - Retry these every 5 minutes if you're a client.
  - Mirrors should retry these harder and more often.
  - If we have a routerdesc for Bob, and he says, "I'm 0.1.0.x", don't
    fetch a new one if it was published in the last 2 hours. (??)

  - Describe what we do with old server versions.

  - If we have fewer than 16 to download, do not download unless 10 minutes
    have passed since last download.

  - Which descriptors do directory servers remember?

6. Remaining issues

Client-knowledge partitioning is worrisome. Most versions of this don't
seem to be worse than the Danezis-Murdoch tracing attack, since an
attacker can't do more than deduce probable exits from entries (or vice
versa). But what about when the client connects to A and B but in a
different order? How badly can it be partitioned based on its knowledge?