2010-03-15 00:06:48 +01:00
|
|
|
Filename: xxx-automatic-node-promotion.txt
|
|
|
|
Title: Automatically promoting Tor clients to nodes
|
|
|
|
Author: Steven Murdoch
|
|
|
|
Created: 12-Mar-2010
|
|
|
|
Status: Draft
|
|
|
|
Target:
|
|
|
|
|
|
|
|
1. Overview
|
|
|
|
|
|
|
|
This proposal describes how Tor clients could determine when they
|
|
|
|
have sufficient bandwidth capacity and are sufficiently reliable to
|
2010-03-15 01:14:41 +01:00
|
|
|
become either bridges or Tor relays. When they meet this
|
2010-03-15 00:06:48 +01:00
|
|
|
criteria, they will automatically promote themselves, based on user
|
|
|
|
preferences. The proposal also defines the new controller messages
|
|
|
|
and options which will control this process.
|
|
|
|
|
2010-03-15 18:50:50 +01:00
|
|
|
Note that for the moment, only transitions between client and
|
|
|
|
bridge are being considered. Transitions to public relay will
|
|
|
|
be considered at a future date, but will use the same
|
|
|
|
infrastructure for measuring capacity and reliability.
|
|
|
|
|
2010-03-15 00:06:48 +01:00
|
|
|
2. Motivation and history
|
|
|
|
|
|
|
|
Tor has a growing user-base and one of the major impediments to the
|
|
|
|
quality of service offered is the lack of network capacity. This is
|
|
|
|
particularly the case for bridges, because these are gradually
|
|
|
|
being blocked, and thus no longer of use to people within some
|
|
|
|
countries. By automatically promoting Tor clients to bridges, and
|
2010-03-15 01:14:41 +01:00
|
|
|
perhaps also to full public relays, this proposal aims to solve
|
2010-03-15 00:06:48 +01:00
|
|
|
these problems.
|
|
|
|
|
|
|
|
Only Tor clients which are sufficiently useful should be promoted,
|
|
|
|
and the process of determining usefulness should be performed
|
|
|
|
without reporting the existence of the client to the central
|
|
|
|
authorities. The criteria used for determining usefulness will be
|
|
|
|
in terms of bandwidth capacity and uptime, but parameters should be
|
|
|
|
specified in the directory consensus. State stored at the client
|
|
|
|
should be in no more detail than necessary, to prevent sensitive
|
|
|
|
information being recorded.
|
|
|
|
|
|
|
|
3. Design
|
|
|
|
|
|
|
|
3.x Opt-in state model
|
|
|
|
|
|
|
|
Tor can be in one of five node-promotion states:
|
|
|
|
|
|
|
|
- off (O): Currently a client, and will stay as such
|
|
|
|
- auto (A): Currently a client, but will consider promotion
|
|
|
|
- bridge (B): Currently a bridge, and will stay as such
|
|
|
|
- auto-bridge (AB): Currently a bridge, but will consider promotion
|
2010-03-15 01:14:41 +01:00
|
|
|
- relay (R): Currently a public relay, and will stay as such
|
2010-03-15 00:06:48 +01:00
|
|
|
|
|
|
|
The state can be fully controlled from the configuration file or
|
|
|
|
controller, but the normal state transitions are as follows:
|
|
|
|
|
|
|
|
Any state -> off: User has opted out of node promotion
|
|
|
|
Off -> any state: Only permitted with user consent
|
|
|
|
|
|
|
|
Auto -> auto-bridge: Tor has detected that it is sufficiently
|
|
|
|
reliable to be a *bridge*
|
|
|
|
Auto -> bridge: Tor has detected that it is sufficiently reliable
|
2010-03-15 01:14:41 +01:00
|
|
|
to be a *relay*, but the user has chosen to remain a *bridge*
|
|
|
|
Auto -> relay: Tor has detected that it is sufficiently reliable
|
|
|
|
to be *relay*, and will skip being a *bridge*
|
|
|
|
Auto-bridge -> relay: Tor has detected that it is sufficiently
|
|
|
|
reliable to be a *relay*
|
2010-03-15 00:06:48 +01:00
|
|
|
|
2010-03-15 01:23:12 +01:00
|
|
|
Note that this model does not support automatic demotion. If this
|
|
|
|
is desirable, there should be some memory as to whether the
|
|
|
|
previous state was relay, bridge, or auto-bridge. Otherwise the
|
|
|
|
user may be prompted to become a relay, although he has opted to
|
|
|
|
only be a bridge.
|
2010-03-15 00:06:48 +01:00
|
|
|
|
|
|
|
3.x User interaction policy
|
|
|
|
|
|
|
|
There are a variety of options in how to involve the user into the
|
|
|
|
decision as to whether and when to perform node promotion. The
|
|
|
|
choice also may be different when Tor is running from Vidalia (and
|
|
|
|
thus can readily prompt the user for information), and standalone
|
|
|
|
(where Tor can only log messages, which may or may not be read).
|
|
|
|
|
|
|
|
The option requiring minimal user interaction is to automatically
|
|
|
|
promote nodes according to reliability, and allow the user to opt
|
|
|
|
out, by changing settings in the configuration file or Vidalia user
|
|
|
|
interface.
|
|
|
|
|
|
|
|
Alternatively, if a user interface is available, Tor could prompt
|
|
|
|
the user when it detects that a transition is available, and allow
|
2010-03-15 02:07:12 +01:00
|
|
|
the user to choose which of the available options to select. If
|
|
|
|
Vidalia is not available, it still may be possible to solicit an
|
|
|
|
email address on install, and contact the operator to ask whether
|
|
|
|
a transition to bridge or relay is permitted.
|
2010-03-15 00:06:48 +01:00
|
|
|
|
|
|
|
Finally, Tor could by default not make any transition, and the user
|
|
|
|
would need to opt in by stating the maximum level (bridge or
|
2010-03-15 01:14:41 +01:00
|
|
|
relay) to which the node may automatically promote itself.
|
2010-03-15 00:06:48 +01:00
|
|
|
|
2010-03-29 21:49:30 +02:00
|
|
|
3.x Performance monitoring model
|
|
|
|
|
|
|
|
To prevent a large number of clients activating as relays, but
|
|
|
|
being too unreliable to be useful, clients should measure their
|
|
|
|
performance. If this performance meets a parameterized acceptance
|
|
|
|
criteria, a client should consider promotion. To measure
|
|
|
|
reliability, this proposal adopts a simple user model:
|
|
|
|
|
|
|
|
- A user decides to use Tor at times which follow a Poisson
|
|
|
|
distribution
|
|
|
|
- At each time, the user will be happy if the bridge chosen has
|
|
|
|
adequate bandwidth and is reachable
|
|
|
|
- If the chosen bridge is down or slow too many times, the user
|
|
|
|
will consider Tor to be bad
|
|
|
|
|
|
|
|
If we additionally assume that the recent history of relay
|
|
|
|
performance matches the current performance, we can measure
|
|
|
|
reliability by simulating this simple user.
|
|
|
|
|
|
|
|
The following parameters are distributed to clients in the
|
|
|
|
directory consensus:
|
|
|
|
|
|
|
|
- min_bandwidth: Minimum self-measured bandwidth for a node to be
|
|
|
|
considered useful, in bytes per second
|
|
|
|
- check_period: How long, in seconds, to wait between checking
|
|
|
|
reachability and bandwidth (on average)
|
|
|
|
- num_samples: Number of recent samples to keep
|
|
|
|
- num_useful: Minimum number of recent samples where the node was
|
|
|
|
reachable and had at least min_bandwidth capacity, for a client
|
|
|
|
to consider promoting to a bridge
|
|
|
|
|
|
|
|
A different set of parameters may be used for considering when to
|
|
|
|
promote a bridge to a full relay, but this will be the subject of a
|
|
|
|
future revision of the proposal.
|
|
|
|
|
|
|
|
3.x Performance monitoring algorithm
|
|
|
|
|
|
|
|
The simulation described above can be implemented as follows:
|
|
|
|
|
|
|
|
Every 60 seconds:
|
|
|
|
1. Tor generates a random floating point number x in
|
|
|
|
the interval [0, 1).
|
|
|
|
2. If x > (1 / (check_period / 60)) GOTO end; otherwise:
|
|
|
|
3. Tor sets the value last_check to the current_time (in seconds)
|
|
|
|
4. Tor measures reachability
|
|
|
|
5. If the client is reachable, Tor measures its bandwidth
|
|
|
|
6. If the client is reachable and the bandwidth is >=
|
|
|
|
min_bandwidth, the test has succeeded, otherwise it has failed.
|
|
|
|
7. Tor adds the test result to the end of a ring-buffer containing
|
|
|
|
the last num_samples results: measurement_results
|
|
|
|
8. Tor saves last_check and measurements_results to disk
|
|
|
|
9. If the length of measurements_results == num_samples and
|
|
|
|
the number of successes >= num_useful, Tor should consider
|
|
|
|
promotion to a bridge
|
|
|
|
end.
|
|
|
|
|
|
|
|
When Tor starts, it must fill in the samples for which it was not
|
|
|
|
running. This can only happen once the consensus has downloaded,
|
|
|
|
because the value of check_period is needed.
|
|
|
|
|
|
|
|
1. Tor generates a random number y from the Poisson distribution [1]
|
|
|
|
with lambda = (current_time - last_check) * (1 / check_period)
|
|
|
|
2. Tor sets the value last_check to the current_time (in seconds)
|
|
|
|
3. Add y test failures to the ring buffer measurements_results
|
|
|
|
4. Tor saves last_check and measurements_results to disk
|
|
|
|
|
|
|
|
In this way, a Tor client will measure its bandwidth and
|
|
|
|
reachability every check_period seconds, on average. Provided
|
|
|
|
check_period is sufficiently greater than a minute (say, at least an
|
|
|
|
hour), the times of check will follow a Poisson distribution. [2]
|
|
|
|
|
|
|
|
While this does require that Tor does record the state of a client
|
|
|
|
over time, this does not leak much information. Only a binary
|
|
|
|
reachable/non-reachable is stored, and the timing of samples becomes
|
|
|
|
increasingly fuzzy as the data becomes less recent.
|
|
|
|
|
2010-03-30 17:57:49 +02:00
|
|
|
On IP address changes, Tor should clear the ring-buffer, because
|
|
|
|
from the perspective of users with the old IP address, this node
|
|
|
|
might as well be a new one with no history. This policy may change
|
|
|
|
once we start allowing the bridge authority to hand out new IP
|
|
|
|
addresses given the fingerprint.
|
|
|
|
|
|
|
|
3.x Bandwidth measurement
|
|
|
|
|
|
|
|
Tor needs to measure its bandwidth to test the usefulness as a
|
|
|
|
bridge. A non-intrusive way to do this would be to passively measure
|
|
|
|
the peak data transfer rate since the last reachability test. Once
|
|
|
|
this exceeds min_bandwidth, Tor can set a flag that this node
|
|
|
|
currently has sufficient bandwidth to pass the bandwidth component
|
|
|
|
of the upcoming performance measurement.
|
|
|
|
|
|
|
|
For the first version we may simply skip the bandwidth test,
|
|
|
|
because the existing reachability test sends 500 kB over several
|
|
|
|
circuits, and checks whether the node can transfer at least 50
|
|
|
|
kB/s. This is probably good enough for a bridge, so this test
|
|
|
|
might be sufficient to record a success in the ring buffer.
|
|
|
|
|
2010-03-15 00:06:48 +01:00
|
|
|
3.x New options
|
|
|
|
|
|
|
|
3.x New controller message
|
|
|
|
|
2010-03-17 12:56:59 +01:00
|
|
|
4. Migration plan
|
2010-03-15 00:06:48 +01:00
|
|
|
|
2010-03-17 12:56:59 +01:00
|
|
|
We should start by setting a high bandwidth and uptime requirement
|
|
|
|
in the consensus, so as to avoid overloading the bridge authority
|
|
|
|
with too many bridges. Once we are confident our systems can scale,
|
|
|
|
the criteria can be gradually shifted down to gain more bridges.
|
|
|
|
|
|
|
|
5. Related proposals
|
|
|
|
|
|
|
|
6. Open questions:
|
2010-03-15 02:07:12 +01:00
|
|
|
|
|
|
|
- What user interaction policy should we take?
|
|
|
|
|
|
|
|
- When (if ever) should we turn a relay into an exit relay?
|
|
|
|
|
|
|
|
- What should the rate limits be for auto-promoted bridges/relays?
|
|
|
|
Should we prompt the user for this?
|
|
|
|
|
|
|
|
- Perhaps the bridge authority should tell potential bridges
|
|
|
|
whether to enable themselves, by taking into account whether
|
|
|
|
their IP address is blocked
|
2010-03-15 18:50:50 +01:00
|
|
|
|
|
|
|
- How do we explain the possible risks of running a bridge/relay
|
|
|
|
* Use of bandwidth/congestion
|
|
|
|
* Publication of IP address
|
|
|
|
* Blocking from IRC (even for non-exit relays)
|
|
|
|
|
|
|
|
- What feedback should we give to bridge relays, to encourage then
|
|
|
|
e.g. number of recent users (what about reserve bridges)?
|
2010-03-29 21:49:30 +02:00
|
|
|
|
2010-03-30 17:57:49 +02:00
|
|
|
- Can clients back-off from doing these tests (yes, we should do
|
|
|
|
this)
|
|
|
|
|
2010-03-29 21:49:30 +02:00
|
|
|
[1] For algorithms to generate random numbers from the Poisson
|
|
|
|
distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables
|
|
|
|
[2] "The sample size n should be equal to or larger than 20 and the
|
|
|
|
probability of a single success, p, should be smaller than or equal to
|
|
|
|
.05. If n >= 100, the approximation is excellent if np is also <= 10."
|
|
|
|
http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods)
|
|
|
|
|
|
|
|
% vim: spell ai et:
|