mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-10 05:03:43 +01:00
Add proposal 141: download server descriptors on demand. (Status: Draft).
svn:r15302
parent 44452c2756
commit bcd7357b71
@ -63,6 +63,7 @@ Proposals by number:
138 Remove routers that are not Running from consensus documents [CLOSED]
139 Download consensus documents only when it will be trusted [CLOSED]
140 Provide diffs between consensuses [OPEN]
141 Download server descriptors on demand [DRAFT]


Proposals by status:
@ -74,6 +75,7 @@ Proposals by status:
132 A Tor Web Service For Verifying Correct Browser Configuration
133 Incorporate Unreachable ORs into the Tor Network
134 More robust consensus voting with diverse authority sets
141 Download server descriptors on demand
OPEN:
120 Shutdown descriptors when Tor servers stop
121 Hidden Service Authentication
219
doc/spec/proposals/141-jit-sd-downloads.txt
Normal file
@ -0,0 +1,219 @@
Filename: 141-jit-sd-downloads.txt
Title: Download server descriptors on demand
Version: $Revision$
Last-Modified: $Date$
Author: Peter Palfrader
Created: 15-Jun-2008
Status: Draft

1. Overview

Downloading all server descriptors is the most expensive part
of bootstrapping a Tor client. These server descriptors currently
amount to about 1.5 Megabytes of data, and this size will grow
linearly with network size.

Fetching all these server descriptors takes a long while for people
behind slow network connections. It is also a considerable load on
our network of directory mirrors.

This document describes proposed changes to the Tor network and
directory protocol so that clients will no longer need to download
all server descriptors.

These changes consist of moving load balancing information into
network status documents, implementing a means to download server
descriptors on demand in an anonymity-preserving way, and dealing
with exit node selection.

2. What is in a server descriptor

When a Tor client starts, the first thing it will try to get is a
current network status document, a consensus signed by a majority
of directory authorities. This document is currently about 100
Kilobytes in size, though it will grow linearly with network size.
This document lists all servers currently running on the network.
The Tor client will then try to get a server descriptor for each
of the running servers. All server descriptors currently amount
to about 1.5 Megabytes of downloads.

A Tor client learns several things about a server from its descriptor.
Some of these it already learned from the network status document
published by the authorities, but the server descriptor contains them
again in a single statement signed by the server itself, not just by
the directory authorities.

Tor clients use the information from server descriptors for
different purposes, which are considered in the following sections.

#three ways: One, to determine if a server will be able to handle
#this client's request; two, to actually communicate or use the server;
#three, for load balancing decisions.
#
#These three points are considered in the following subsections.

2.1 Load balancing

The Tor load balancing mechanism is quite complex in its details, but
it has a simple goal: The more traffic a server can handle, the more
traffic it should get. That means the more traffic a server can
handle, the more likely a client will use it.

For this purpose each server descriptor has bandwidth information
which tries to convey a server's capacity to clients.

Currently we weigh servers differently for different purposes. There
is one weight for when we use a server as a guard node (our entry to
the Tor network), there is one weight we assign servers for exit
duties, and a third for when we need intermediate (middle) nodes.
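
The goal stated above (more capacity means a higher chance of being
picked, with a separate weight per position) can be sketched as a
weighted random choice. This is only an illustration: the server names,
the weight values and the pick_server() helper are made up for the
example and are not how Tor computes or stores its weights.

```python
import random

# Hypothetical per-position weights, invented for this example. In Tor
# they would be derived from the bandwidth information that servers
# publish; the exact derivation is not covered here.
SERVERS = {
    "fast":   {"guard": 0.6, "middle": 0.5, "exit": 0.0},
    "medium": {"guard": 0.3, "middle": 0.4, "exit": 0.7},
    "slow":   {"guard": 0.1, "middle": 0.1, "exit": 0.3},
}


def pick_server(servers, position, rng=random):
    """Pick one server with probability proportional to its weight
    for the given position ("guard", "middle" or "exit")."""
    names = sorted(servers)
    weights = [servers[name][position] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(pick_server(SERVERS, "middle"))
```

A server with weight zero for a position (here "fast" for exit duties)
is never chosen for that position.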

2.2 Exit information

When a Tor client wants to exit to some resource on the internet it
will build a circuit to an exit node that allows access to that
resource's IP address and TCP port.

When building that circuit the client can make sure that the circuit
ends at a server that will be able to fulfill the request because the
client already learned of all the servers' exit policies from their
descriptors.
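
As a rough sketch of what "a server that will be able to fulfill the
request" means, the following checks a destination against a list of
accept/reject rules. It makes several assumptions that are not stated
in this proposal: rules look like "accept ADDR[/MASK]:PORT" with "*"
wildcards, only IPv4 is handled, the first matching rule wins, and a
destination that matches no rule is accepted. The helper names are
made up for the example.

```python
import ipaddress


def parse_rule(line):
    """Parse e.g. 'reject 10.0.0.0/8:*' into (allow, network, lo, hi).
    IPv4 only; a network of None stands for the '*' address wildcard."""
    action, pattern = line.split()
    addr, _, ports = pattern.rpartition(":")
    network = None if addr == "*" else ipaddress.ip_network(addr, strict=False)
    if ports == "*":
        lo, hi = 1, 65535
    elif "-" in ports:
        lo, hi = (int(p) for p in ports.split("-"))
    else:
        lo = hi = int(ports)
    return action == "accept", network, lo, hi


def policy_allows(rules, ip, port):
    """Return True if the first matching rule accepts ip:port."""
    target = ipaddress.ip_address(ip)
    for allow, network, lo, hi in rules:
        if (network is None or target in network) and lo <= port <= hi:
            return allow
    return True  # assumption: nothing matched, so the request is allowed


def usable_exits(policies, ip, port):
    """Names of all servers whose policy allows exiting to ip:port."""
    return sorted(n for n, rules in policies.items()
                  if policy_allows(rules, ip, port))
```

With all exit policies at hand, a client can filter the server list
this way before it builds the circuit.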

2.3 Capability information

Server descriptors contain information about the specific versions of
the Tor protocol they understand [proposal 105].

Furthermore the server descriptor also contains the exact version of
the Tor software that the server is running, and some decisions are
made based on the server version number (for instance a Tor client
will only make conditional consensus requests [proposal from 13 Apr
2008 that never got a number] when talking to Tor servers version
0.2.1.1-alpha or later).

2.4 Contact/key information

A server descriptor lists a server's IP address and TCP ports on which
it accepts onion and directory connections. Furthermore it contains
the onion key, a short-lived RSA key to which clients encrypt CREATE
cells.

2.5 Identity information

A Tor client learns the digest of a server's key from the network
status document. Once it has a server descriptor this descriptor
contains the full RSA identity key of the server. Clients verify
that 1) the digest of the identity key matches the expected digest
it got from the consensus, and 2) the signature on the descriptor
from that key is valid.
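
The two checks can be sketched as follows. The sketch assumes that the
digest in the consensus is a SHA-1 hash over the encoded identity key;
this proposal does not spell that out. The RSA signature check is left
to a caller-supplied function because it needs a crypto library, and
both helper names are hypothetical.

```python
import hashlib
import hmac


def identity_matches(identity_key_bytes, expected_digest):
    """Check 1: the digest of the identity key from the descriptor
    equals the digest the client learned from the consensus.
    Assumes SHA-1 over the encoded key."""
    actual = hashlib.sha1(identity_key_bytes).digest()
    return hmac.compare_digest(actual, expected_digest)


def accept_descriptor(identity_key_bytes, expected_digest, signature_ok):
    """Run both checks. signature_ok is a caller-supplied callable that
    verifies the descriptor's signature with the given identity key
    (check 2); it is only called if check 1 passed."""
    if not identity_matches(identity_key_bytes, expected_digest):
        return False
    return bool(signature_ok(identity_key_bytes))
```

Checking the digest first means the signature is only ever verified
against a key that the directory authorities vouched for.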


3. Doing away with the need for all SDs

3.1 Load balancing info in consensus documents

One of the reasons why clients download all server descriptors is for
doing proper load balancing as described in 2.1. In order for
clients to not require all server descriptors this information will
have to move into the network status document.

[XXX Two open questions here:
  a) how do we arrive at a consensus weight?
  b) how to represent weights in the consensus?
     Maybe "s Guard=0.13 Exit=0.02 Middle=0.00 Stable.."
]
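
If weights were represented as in the "Maybe" line of question b), a
client could split such a line into weights and plain flags roughly as
below. The line format is only the tentative idea quoted above, and
parse_status_line() is a hypothetical helper, not an existing parser.

```python
def parse_status_line(line):
    """Split a line like 's Guard=0.13 Exit=0.02 Middle=0.00 Stable'
    into a dict of weights and a list of plain flags."""
    keyword, *tokens = line.split()
    if keyword != "s":
        raise ValueError("not a status line: %r" % line)
    weights, flags = {}, []
    for token in tokens:
        name, sep, value = token.partition("=")
        if sep:
            weights[name] = float(value)   # "Guard=0.13" style entry
        else:
            flags.append(name)             # plain flag such as "Stable"
    return weights, flags
```

Keeping plain flags and weighted entries on the same line would let
clients that ignore the weights keep reading the flags; whether that is
desirable is part of the open question.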

3.2 Fetching descriptors on demand

As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
and the onion key for a server.

A client already knows the IP address and the ports from the consensus
documents, but without the onion key it will not be able to send
CREATE/EXTEND cells for that server. Since the client needs the onion
key it needs the descriptor.

If a client only downloaded a few descriptors in an observable manner
then that would leak which nodes it was going to use.

This proposal suggests the following:

1) when connecting to a guard node for which the client does not
   yet have a cached descriptor it requests the descriptor it
   expects by hash. (The consensus document that the client holds
   has a hash for the descriptor of this server. We want exactly
   that descriptor, not a different one.)

   [XXX: How? We could either come up with a new cell type,
   RELAY_REQUEST_SD that takes only a hash (of the SD), or use
   RELAY_BEGIN_DIR. The former is probably smarter since we will
   want to use it later on as well, and there we will require
   padding.]

   A client MAY cache the descriptor of the guard node so that it does
   not need to request it every single time it contacts the guard.

2) when a client wants to extend a circuit that currently ends in
   server B to a new next server C, the client will send a
   RELAY_REQUEST_SD cell to server B. This cell contains in its
   payload the hash of a server descriptor the client would like
   to obtain (C's server descriptor). The server sends back the
   descriptor and the client can now form a valid EXTEND/CREATE cell
   encrypted to C's onion key.

   Clients MUST NOT cache such descriptors. If they did they might
   leak that they already extended to that server at least once
   before.

   Replies to RELAY_REQUEST_SD requests need to be padded to some
   constant upper limit in order to conceal a client's destination
   from anybody who might be counting cells/bytes.

   [XXX: detailed spec of RELAY_REQUEST_SD cell and its reply]
   [XXX: figure out a decent padding size]
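
To illustrate the padding requirement, here is one possible way to pad
every reply to the same length. Both the limit (PADDED_REPLY_LEN) and
the framing (a 4-byte length prefix followed by zero bytes) are
assumptions made for this example; the actual cell format and padding
size are still open, as noted above.

```python
import struct

# Hypothetical constant upper limit for a reply, in bytes.
PADDED_REPLY_LEN = 8192
_PREFIX_LEN = 4  # size of the length prefix added by this sketch


def pad_reply(descriptor):
    """Prefix the descriptor with its length and pad with zero bytes
    so that every reply is exactly PADDED_REPLY_LEN bytes long."""
    if len(descriptor) > PADDED_REPLY_LEN - _PREFIX_LEN:
        raise ValueError("descriptor exceeds the padding limit")
    body = struct.pack("!I", len(descriptor)) + descriptor
    return body.ljust(PADDED_REPLY_LEN, b"\0")


def unpad_reply(reply):
    """Recover the descriptor from a padded reply."""
    if len(reply) != PADDED_REPLY_LEN:
        raise ValueError("unexpected reply length")
    (length,) = struct.unpack("!I", reply[:_PREFIX_LEN])
    if length > PADDED_REPLY_LEN - _PREFIX_LEN:
        raise ValueError("corrupt length prefix")
    return reply[_PREFIX_LEN:_PREFIX_LEN + length]
```

Because every reply has the same size, somebody counting cells or
bytes cannot tell a short descriptor from a long one.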

3.3 Protocol versions

[XXX: find out where we need "opt protocols Link 1 2 Circuit 1"
information described in 2.3 above. If we need it, it might have
to go into the consensus document.]

[XXX: Similarly find out where we need the version number of a
remote tor server. This information is in the consensus, but
maybe we use it in some place where having it signed by the
server in question is really important?]

3.4 Exit selection

Currently finding an appropriate exit node for a user's request is
easy for a client because it has complete knowledge of all the exit
policies of all servers on the network.

[XXX: I have no finished ideas here yet.
  - if clients only rely on the current exit flag they will
    a) never use servers for exit purposes that don't have it,
    b) have a hard time finding a suitable exit node for
       their weird port that only a few servers allow.
  - the authorities could create a new summary document that
    lists all the exit policies and their nodes (by fingerprint).
    I need to find out how large that document would be.
  - can we make the "Exit" flag more useful? can we come
    up with some "standard policies" and have operators pick
    one of the standards?
]
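
The summary document idea from the list above could be explored with a
small script that groups servers by identical exit policy and gives a
rough size figure. The data layout, the text format of the summary and
both helper names are invented for this sketch; nothing here is a
specified document format.

```python
from collections import defaultdict


def summarize_policies(policies):
    """Map each distinct exit policy (a tuple of rule lines) to the
    sorted list of fingerprints of the servers that use it."""
    groups = defaultdict(list)
    for fingerprint, rules in policies.items():
        groups[tuple(rules)].append(fingerprint)
    return {rules: sorted(fps) for rules, fps in groups.items()}


def summary_size(summary):
    """Rough size in bytes if every policy line and every fingerprint
    were written on its own line of a plain-text document."""
    size = 0
    for rules, fingerprints in summary.items():
        size += sum(len(rule) + 1 for rule in rules)
        size += sum(len(fp) + 1 for fp in fingerprints)
    return size
```

The more servers share the same policy, the smaller such a document
gets compared to listing every policy per server, which is also the
motivation behind the "standard policies" idea.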

4. Future possibilities

This proposal still requires that all servers have the descriptors of
every other node in the network in order to answer RELAY_REQUEST_SD
cells. These cells are sent when a circuit is extended from ending at
node B to a new node C. In that case B would have to answer a
RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).

In order to answer that request B obviously needs a copy of C's server
descriptor. In the future we might amend RELAY_REQUEST_SD cells to
contain also the expected IP address and OR-port of the server C (the
client learns them from the network status document), so that B no
longer needs to know all the descriptors of the entire network but
instead can simply go and ask C for its descriptor before passing it
back to the client.
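
Server B's side of this could look roughly like the sketch below: B
answers from its own descriptor store when it can, and otherwise (the
future variant) asks C and checks the result against the requested
digest. The function and parameter names are hypothetical, and hashing
the whole descriptor with SHA-1 is a simplification chosen for the
example.

```python
import hashlib


def answer_request_sd(digest, known, fetch=None):
    """Return the descriptor matching digest, or None.

    known maps descriptor digests to descriptors that B already holds.
    fetch, if given, is a callable that asks server C for its current
    descriptor (the future extension); its result is only used if it
    hashes to the digest the client asked for."""
    descriptor = known.get(digest)
    if descriptor is None and fetch is not None:
        candidate = fetch()
        if candidate is not None and hashlib.sha1(candidate).digest() == digest:
            descriptor = candidate
    return descriptor
```

Checking the fetched descriptor against the digest keeps the property
from 3.2 that the client gets exactly the descriptor named in its
consensus, not a different one.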