mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-10 21:23:58 +01:00
157 lines
5.4 KiB
Plaintext
157 lines
5.4 KiB
Plaintext
Filename: 140-consensus-diffs.txt
|
|
Title: Provide diffs between consensuses
|
|
Author: Peter Palfrader
|
|
Created: 13-Jun-2008
|
|
Status: Accepted
|
|
Target: 0.2.2.x
|
|
|
|
0. History
|
|
|
|
22-May-2009: Restricted the ed format even more strictly for ease of
|
|
implementation. -nickm
|
|
|
|
1. Overview.
|
|
|
|
Tor clients and servers need a list of which relays are on the
|
|
network. This list, the consensus, is created by authorities
|
|
hourly and clients fetch a copy of it, with some delay, hourly.
|
|
|
|
This proposal suggests that clients download diffs of consensuses
|
|
once they have a consensus instead of hourly downloading a full
|
|
consensus.
|
|
|
|
2. Numbers
|
|
|
|
After implementing proposal 138 which removes nodes that are not
|
|
running from the list a consensus document is about 92 kilobytes
|
|
in size after compression.
|
|
|
|
The diff between two consecutive consensus, in ed format, is on
|
|
average 13 kilobytes compressed.
|
|
|
|
3. Proposal
|
|
|
|
3.1 Clients
|
|
|
|
If a client has a consensus that is recent enough it SHOULD
|
|
try to download a diff to get the latest consensus rather than
|
|
fetching a full one.
|
|
|
|
[XXX: what is recent enough?
|
|
time delta in hours / size of compressed diff
|
|
0 20
|
|
1 9650
|
|
2 17011
|
|
3 23150
|
|
4 29813
|
|
5 36079
|
|
6 39455
|
|
7 43903
|
|
8 48907
|
|
9 54549
|
|
10 60057
|
|
11 67810
|
|
12 71171
|
|
13 73863
|
|
14 76048
|
|
15 80031
|
|
16 84686
|
|
17 89862
|
|
18 94760
|
|
19 94868
|
|
20 94223
|
|
21 93921
|
|
22 92144
|
|
23 90228
|
|
[ size of gzip compressed "diff -e" between the consensus on
|
|
2008-06-01-00:00:00 and the following consensuses that day.
|
|
Consensuses have been modified to exclude down routers per
|
|
proposal 138. ]
|
|
|
|
Data suggests that for the first few hours diffs are very useful,
|
|
saving about 60% for the first three hours, 30% for the first 10,
|
|
and almost nothing once we are past 16 hours.
|
|
]
|
|
|
|
3.2 Servers
|
|
|
|
Directory authorities and servers need to keep up to X [XXX: depends
|
|
on how long clients try to download diffs per above] old consensus
|
|
documents so they can build diffs. They should offer a diff to the
|
|
most recent consensus at the URL
|
|
|
|
http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
|
|
|
|
where hash is the full digest of the consensus the client currently
|
|
has, and FPRLIST is a list of (abbreviated) fingerprints of
|
|
authorities the client trusts.
|
|
|
|
Servers will only return a consensus if more than half of the requested
|
|
authorities have signed the document, otherwise a 404 error will be sent
|
|
back. The fingerprints can be shortened to a length of any multiple of
|
|
two, using only the leftmost part of the encoded fingerprint. Tor uses
|
|
3 bytes (6 hex characters) of the fingerprint. (This is just like the
|
|
conditional consensus downloads that Tor supports starting with
|
|
0.1.2.1-alpha.)
|
|
|
|
If a server cannot offer a diff from the consensus identified by the
|
|
hash but has a current consensus it MUST return the full consensus.
|
|
|
|
[XXX: what should we do when the client already has the latest
|
|
consensus? I can think of the following options:
|
|
- send back 3xx not modified
|
|
- send back 200 ok and an empty diff
|
|
- send back 404 nothing newer here.
|
|
|
|
I currently lean towards the empty diff.]
|
|
|
|
4. Diff Format
|
|
|
|
Diffs start with the token "network-status-diff-version" followed by a
|
|
space and the version number, currently "1".
|
|
|
|
If a document does not start with network-status-diff it is assumed
|
|
to be a full consensus download and would therefore currently start
|
|
with "network-status-version 3".
|
|
|
|
Following the network-status-diff header line is a diff, or patch, in
|
|
limited ed format. We choose this format because it is easy to create
|
|
and process with standard tools (patch, diff -e, ed). This will help
|
|
us in developing and testing this proposal and it should make future
|
|
debugging easier.
|
|
|
|
[ If at one point in the future we decide that the space benefits from
|
|
a custom diff format outweighs these benefits we can always
|
|
introduce a new diff format and offer it at for instance
|
|
../diff2/... ]
|
|
|
|
We support the following ed commands, each on a line by itself:
|
|
- "<n1>d" Delete line n1
|
|
- "<n1>,<n2>d" Delete lines n1 through n2, including
|
|
- "<n1>c" Replace line n1 with the following block
|
|
- "<n1>,<n2>c" Replace lines n1 through n2, including, with the
|
|
following block.
|
|
- "<n1>a" Append the following block after line n1.
|
|
- "a" Append the following block after the current line.
|
|
- "s/.//" Remove the first character in the current line.
|
|
|
|
Note that line numbers always apply to the file after all previous
|
|
commands have already been applied.
|
|
|
|
The commands MUST apply to the file from back to front, such that
|
|
lines are only ever referred to by their position in the original
|
|
file.
|
|
|
|
The "current line" is either the first line of the file, if this is
|
|
the first command, the last line of a block we added in an append or
|
|
change command, or the line immediate following a set of lines we just
|
|
deleted (or the last line of the file if there are no lines after
|
|
that).
|
|
|
|
The replace and append command take blocks. These blocks are simply
|
|
appended to the diff after the line with the command. A line with
|
|
just a period (".") ends the block (and is not part of the lines
|
|
to add). Note that it is impossible to insert a line with just
|
|
a single dot. Recommended procedure is to insert a line with
|
|
two dots, then remove the first character of that line using s/.//.
|