Filename: 168-reduce-circwindow.txt
Title: Reduce default circuit window
Author: Roger Dingledine
Created: 12-Aug-2009
Status: Open
Target: 0.2.2

0. History

1. Overview

We should reduce the starting circuit "package window" from 1000 to
101. The lower package window will mean that clients will only be able
to receive 101 cells (~50KB) on a circuit before they need to send a
'sendme' acknowledgement cell to request 100 more.

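To make the accounting concrete, here is a minimal self-contained C
sketch of the window bookkeeping described above. The names and
structure are illustrative rather than Tor's actual code; only the
numbers (a 101-cell starting window, 100 more cells granted per
sendme) come from this proposal.

  /* Illustrative sketch of the per-circuit window accounting; not
   * Tor's actual implementation. */
  #include <stdio.h>

  #define START_WINDOW     101  /* proposed starting package window */
  #define SENDME_INCREMENT 100  /* cells granted by each sendme cell */

  typedef struct {
    int package_window;  /* data cells we may still send */
    int deliver_window;  /* data cells we may still receive before acking */
  } circuit_sketch_t;

  /* Sender side: package one more data cell if the window allows it. */
  static int package_cell(circuit_sketch_t *circ) {
    if (circ->package_window <= 0)
      return 0;                  /* stalled until a sendme arrives */
    circ->package_window--;
    return 1;
  }

  /* Sender side: a circuit-level sendme re-opens the window. */
  static void handle_sendme(circuit_sketch_t *circ) {
    circ->package_window += SENDME_INCREMENT;
  }

  int main(void) {
    circuit_sketch_t circ = { START_WINDOW, START_WINDOW };
    int sent = 0;
    while (package_cell(&circ))
      sent++;
    printf("sent %d cells, then stalled waiting for a sendme\n", sent);
    handle_sendme(&circ);
    printf("after one sendme the window is %d\n", circ.package_window);
    return 0;
  }

With the old 1000-cell starting window, the same loop would send 1000
cells (roughly 500KB of payload) before stalling.
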
Starting with a lower package window on exit relays should save on
buffer sizes (and thus memory requirements for the exit relay), and
should save on queue sizes (and thus latency for users).

Lowering the package window will induce an extra round-trip for every
additional 50298 bytes of the circuit. This extra step is clearly a
slow-down for large streams, but ultimately we hope that a) clients
fetching smaller streams will see better response, and b) slowing
down the large streams in this way will produce lower e2e latencies,
so the round-trips won't be so bad.

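(For reference: a relay cell carries at most 498 bytes of stream data,
and 101 * 498 = 50298, which is where the figure above comes from.)
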
2. Motivation

Karsten's torperf graphs show that the median download time for a 50KB
file over Tor in mid 2009 is 7.7 seconds, whereas the median download
times for 1MB and 5MB files are around 50s and 150s respectively. The
7.7 second figure is way too high, whereas the 50s and 150s figures
are surprisingly low.

The median round-trip latency appears to be around 2s, with 25% of
the data points taking more than 5s. That's a lot of variance.

We originally designed Tor with the goal of maximizing throughput. We
figured that would also optimize other network properties like
round-trip latency. Looks like we were wrong.

3. Design

Wherever we initialize the circuit package window, initialize it to
101 rather than 1000. Reducing it should be safe even when interacting
with old Tors: the old Tors will receive the 101 cells and send back
a sendme ack cell. They'll still have much larger deliver windows,
but the rest of those windows will go unused.

You can find the patch at arma/circwindow. It seems to work.

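For orientation, the change described above is essentially a
one-constant edit. Assuming the starting window keeps its existing
name in or.h, it would look roughly like this (a sketch of the idea,
not the arma/circwindow patch itself):

  #define CIRCWINDOW_START     101   /* previously 1000 */
  #define CIRCWINDOW_INCREMENT 100   /* unchanged: cells granted per sendme */
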
3.1. Why not 100?

Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
ack cell after 101 cells rather than the intended 100 cells.

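The proposal doesn't spell out where the off-by-one lives, but its
consequence is easy to illustrate: an old receiver acks only after the
101st cell, so a sender whose package window started at exactly 100
would stall forever waiting for an ack that never comes. A
self-contained C sketch of that behavior (illustrative, not the actual
Tor code):

  /* Old receivers start with a 1000-cell deliver window and are meant
   * to ack every 100 cells, but an off-by-one like the strict '<'
   * below delays the ack until the 101st cell.  A sender that started
   * with a 100-cell package window would then wait forever. */
  #define OLD_CIRCWINDOW_START     1000
  #define OLD_CIRCWINDOW_INCREMENT  100

  static void old_receiver_got_data_cell(int *deliver_window) {
    (*deliver_window)--;
    while (*deliver_window < OLD_CIRCWINDOW_START - OLD_CIRCWINDOW_INCREMENT) {
      /* ...queue a circuit-level sendme back toward the sender... */
      *deliver_window += OLD_CIRCWINDOW_INCREMENT;
    }
  }
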
Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
hopefully we'll have moved to some datagram protocol long before
0.2.1.19 becomes obsolete.

3.2. What about stream packaging windows?

Right now the stream packaging windows start at 500. The goal was to
set the stream window to half the circuit window, to provide a crude
load balancing between streams on the same circuit. Once we lower
the circuit packaging window, the stream packaging window basically
becomes redundant.

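For reference, the stream-level window has its own starting constant
in Tor's source; assuming it keeps its current name, the relevant line
is simply:

  #define STREAMWINDOW_START 500   /* per-stream start; now ~5x the circuit window */

Since every stream's cells also count against the shared 101-cell
circuit window, the circuit window always runs out first, which is why
the 500-cell stream window stops being the binding limit.
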
We could leave it in -- it isn't hurting much in either case. Or we
could take it out -- people building other Tor clients would thank us
for that step. Alas, people building other Tor clients are going to
have to be compatible with current Tor clients, so in practice there's
no point taking out the stream packaging windows.

3.3. What about variable circuit windows?

Once upon a time we imagined adapting the circuit package window to
the network conditions. That is, we would start the window small,
and raise it based on the latency and throughput we see.

In theory that crude imitation of TCP's windowing system would allow
us to adapt to fill the network better. In practice, I think we want
to stick with the small window and never raise it. The low cap reduces
the total throughput you can get from Tor for a given circuit. But
that's a feature, not a bug.

4. Evaluation

How do we know this change is actually smart? It seems intuitive that
it's helpful, and some smart systems people have agreed that it's
a good idea (or said another way, they were shocked at how big the
default package window was before).

To get a more concrete sense of the benefit, though, Karsten has been
running torperf side-by-side on exit relays with the old package window
vs the new one. The results are currently mixed -- it is slightly
faster for fetching 40KB files, and slightly slower for fetching 50KB
files.

I think it's going to be tough to get a clear conclusion that this is
a good design just by comparing one exit relay running the patch. The
trouble is that the other hops in the circuits are still getting bogged
down by other clients introducing too much traffic into the network.

Ultimately, we'll want to put the circwindow parameter into the
consensus so we can test a broader range of values once enough relays
have upgraded.

5. Transition and deployment

We should put the circwindow in the consensus (see proposal 167),
with an initial value of 101. Then as more exit relays upgrade,
clients should seamlessly get the better behavior.

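As a sketch of what the code side of that could look like: pick the
initial window from a consensus parameter named "circwindow", falling
back to 101 when the parameter is absent. The lookup helper below is a
placeholder for whatever proposal 167's implementation provides, not
an existing Tor function.

  #include <stdint.h>

  /* Placeholder for the consensus-parameter lookup that proposal 167's
   * implementation ends up providing; it returns the fallback here
   * just so the sketch compiles. */
  static int32_t consensus_param_sketch(const char *name, int32_t fallback) {
    (void)name;
    return fallback;
  }

  /* Initial circuit package window: take "circwindow" from the
   * consensus if present, otherwise fall back to the proposed 101. */
  static int circuit_initial_package_window_sketch(void) {
    return (int) consensus_param_sketch("circwindow", 101);
  }
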
Note that upgrading the exit relay will only affect the "download"
package window. An old client that's uploading lots of bytes will
continue to use the old package window at the client side, and we
can't throttle that window at the exit side without breaking protocol.

The real question then is what we should backport to 0.2.1. Assuming
this could be a big performance win, we can't afford to wait until
0.2.2.x comes out before starting to see the changes here. So we have
two options as I see them:

a) once clients in 0.2.2.x know how to read the value out of the
   consensus, and it's been tested for a bit, backport that part to
   0.2.1.x.
b) if it's too complex to backport, just pick a number, like 101, and
   backport that number.

Clearly choice (a) is the better one if the consensus parsing part
isn't very complex. Let's shoot for that, and fall back to (b) if the
patch turns out to be so big that we reconsider.