propose a plan for 104-short-descriptors

svn:r9786
Roger Dingledine 2007-03-09 22:55:35 +00:00
parent 3d64374071
commit 5b734f5210

@@ -34,7 +34,9 @@ Proposal:
Another possible solution would be to drop these fields from descriptors,
and have them uploaded as a part of a separate "bandwidth report" to the
authorities. This could help prevent the mistake of using long descriptors
- in the place of short ones.
+ in the place of short ones. It could also be generalized later to be an
+ overall status report, to include sanitized GeoIP information and whatever
+ else comes up.
Other disposable fields:
@@ -49,11 +51,15 @@ Other disposable fields:
accept
(Apparently, exit polices are highly compressible.)
+ [Does size-on-disk matter to anybody? Some clients and servers don't
+ have much disk, or have really slow disk (e.g. USB). And we don't
+ store caches compressed right now. -RD]
Issues:
Indexing long descriptor or bandwidth reports presents an issue: right now
the way to make sure you have the same copy of a descriptor as everyone
- else is to request the descriptor by its digest, and to make sure to that
+ else is to request the descriptor by its digest, and to make sure that
the digest you request is the one that the authorities like.
Authorities should presumably list the digests of short descriptors, since
@@ -62,19 +68,21 @@ Issues:
with information nobody wants.
Possible solutions are:
- - Drop the property that you can be sure of having the same long
- descriptor as others. This seems unoptimal.
- - Have a separate extra-information-status that also gets generated by the
+ 1) Drop the property that you can be sure of having the same long
+ descriptor as others. This seems unoptimal, but if nobody caches
+ long descriptors so you have to go to the authority to get them,
+ maybe it's not so bad.
+ 2) Have a separate extra-information-status that also gets generated by the
authorities; use it to tell which long descriptors others have. Also a
pain.
- - Have short descriptors include a hash of the corresponding long
+ 3) Have short descriptors include a hash of the corresponding long
descriptor/extra-info. This would keep the same order of magnitude
performance increase (~59.2% savings as opposed to 61% savings.)
This would require longdesc/extra-info downloaders to fetch
router data before they could know which longdescs/extra info to fetch.
- - Have each authority make a signed concatenated "extra info" document,
+ 4) Have each authority make a signed concatenated "extra info" document,
and hope we never need to reconcile them.
- - ????
+ 5) ????
Migration:
@@ -83,12 +91,20 @@ Migration:
* Authorities should accept both, now, and silently drop short
descriptors.
* Routers should upload both once authorities accept them.
- * There should be a "long descriptor" url and the current "normal" URL.
+ * There should be a "long descriptor" url named
+ /tor/server/fp-detailed/ and the current "normal" URL.
+ Authorities should serve long descriptors from both URLs.
+ There's no such thing as asking for a long descriptor by
+ its digest.
* Once tools that want long descriptors support fetching them from the
"long descriptor" URL:
* Have authorities remember short descriptors, and serve them from the
'normal' URL.
+ These tools include:
+ lefkada's exit.py script.
+ tor26's noreply script and general directory cache.
+ https://nighteffect.us/tns/ for its graphs
+ and check with or-talk for the rest, once it's time.
For bandwidth info approach:
* First:
@@ -99,3 +115,30 @@ Migration:
* Once tools that want bandwidth info support fetching it:
* Have routers stop including bandwidth info in their router
descriptors.
+ Discussion:
+ Solution 4 seems like a nice plan: in many cases, the external services
+ that use read-history and write-history are directory authorities
+ themselves, so they just use their local opinion.
+ Roger thinks we should go with the long/short descriptor plan, along
+ with solution 4. We don't want to just upload a bandwidth message,
+ because that involves new data structures for every new piece of
+ information we decide to upload. I suspect we'll realize once this
+ is deployed that there is other info we want to put in the long
+ descriptors.
+ This won't solve the future sanitized GeoIP uploading question, but
+ who knows where we'll actually want to send that data, and whether
+ we'll want to handle it with the same privacy constraints as this data,
+ so let's not try to solve that yet.
+ However, we may still need some basic reconciling algorithms between
+ authorities -- otherwise, if a router uploads to four authorities
+ and fails to reach the fifth, then that fifth will never have the new
+ descriptor. This will mean that the best strategy for external tools
+ is to fetch full concatenated-style long-descriptor lists from every
+ single authority, and merge them locally. So each authority should
+ periodically fetch the list from the others and take the new ones.
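
Note on the last Discussion paragraph: the "fetch the list from the others and take the new ones" step could look roughly like the sketch below. This is only an illustration under assumptions, not anything the proposal specifies. The LongDesc record, its field names, and both function names are hypothetical, and "new" is assumed to mean "more recently published for the same router identity".

```python
from typing import Dict, Iterable, List, NamedTuple


class LongDesc(NamedTuple):
    # Hypothetical record; field names are illustrative, not Tor's format.
    fingerprint: str  # router identity
    published: int    # publication time, seconds since the epoch
    digest: str       # digest of the long descriptor


def merge_long_descs(lists: Iterable[Iterable[LongDesc]]) -> Dict[str, LongDesc]:
    """Keep the most recently published long descriptor per router."""
    newest: Dict[str, LongDesc] = {}
    for descs in lists:
        for d in descs:
            cur = newest.get(d.fingerprint)
            if cur is None or d.published > cur.published:
                newest[d.fingerprint] = d
    return newest


def missing_from(local: Iterable[LongDesc],
                 merged: Dict[str, LongDesc]) -> List[LongDesc]:
    """Descriptors that a party holding `local` would still need to take."""
    have = merge_long_descs([local])
    return sorted(
        d for fp, d in merged.items()
        if fp not in have or have[fp].published < d.published
    )
```

An external tool would call merge_long_descs on the lists fetched from every authority. An authority could call missing_from with its own list to see which descriptors it should take from its peers.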