our pqueue implementation does bizarre unspecified things with
ordering of elements that are equal. it certainly doesn't do any
sort of "first in first out" property that i was expecting.
now make it explicit by saying that "equal-effort, added-earlier" is
higher priority.
specifically, if we have 16 in-flight rend circs, and the next
one at the top of the pqueue is lower than our suggested effort,
then don't launch it yet.
this way we always launch adequate-effort requests immediately, and
we always handle *some* low-effort requests, but we are ready at any
moment to handle a few new adequate-effort requests.
this change makes us reach the callback *after* each mainloop
run, rather than as the next event to run immediately after
activation.
with the old behavior, we were starving everything else to drain the
pqueue entirely, each time we got a new intro2 cell.
now we at least will get to other activities as well.
now we let ourselves queue up to twice as many as we expect, and when
we get to the limit we make a new pqueue and move over the first n
elements that we like most.
(the old approach, of calling SMARTLIST_DEL_CURRENT_KEEPORDER() on
elements in a pqueue, will destroy its heapify property.)
we also discard elements that are too old, either during the trimming
process or if they come up as the next request to respond to.
lastly, fix a fencepost error on how many rend reqs we would handle
per iteration.
If PoW are enabled, use a priority queue by effort for the rendezvous
requests hooked into the mainloop.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When parsing an INTRODUCE2 cell, we extract data in order to launch the
rendezvous circuit. This commit creates a data structure just for that
data so it can be used by future commits for prop327 in order to copy
that data over a priority queue instead of the whole intro data data
structure which contains pointers that could dissapear.
Signed-off-by: David Goulet <dgoulet@torproject.org>
At this commit, the tor main loop solves it. We might consider moving
this to the CPU pool at some point.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We discovered two cases where edge connections can stall during testing:
1. Due to final data sitting in the edge inbuf when it was resumed
2. Due to flag synchronization between the token bucket and XON/XOFF
The first issue has always existed in C-Tor, but we were able to tickle it
in scp testing. If the last data from the protocol is able to fit in the
inbuf, but not large enough to send, if an XOFF or connection block comes in
at exactly that point, when the edge connection resumes, there will be no
data to read from the socket, but the inbuf can just sit there, never
draining.
We noticed the second issue along the way to finding the first. It seems
wrong, but it didn't seem to affect anything in practice.
These are extremely rare in normal operation, but with conflux, XON/XOFF
activity is more common, so we hit these.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In https://gitlab.torproject.org/tpo/core/tor/-/issues/40623, we changed the
DESTROY propogation to ensure memory was freed quickly at relays. This was a
good move, but it exacerbates the condition where a stream is closed on a
circuit, and then it is immediately closed because it is dirty. This creates a
race between the DESTROY and the last data sent on the stream. This race is
visible in shadow, and does happen.
This could be backported. A better solution to these kinds of problems is to
create an ENDED cell, and not close any circuits until the ENDED comes back.
But this will also require thinking, since this ENDED cell can also get lost,
so some kind of timeout may be needed either way. The ENDED cell could just
allow us to have much longer timeouts for this case.
This adds utility functions to help stream block decisions, as well as cpath
layer_hint checks for stream cell acceptance, and syncing stream lists
for conflux circuits.
These functions are then called throughout the codebase to properly manage
conflux streams.
Streams can get blocked on a circuit in two ways:
1. When the circuit package window is full
2. When the channel's cell queue is too high
Conflux needs to decouple stream blocking from both of these conditions,
because streams can continue on another circuit, even if the primary circuit
is blocked for either of these cases.
However, both conflux and congestion control need to know if the channel's
cell queue hit the highwatermark and is still draining, because this condition
is used by those components, independent of stream state.
Therefore, this commit renames the 'streams_blocked_on_chan' variable to
signify that it refers to the cell queue state, and also refactors the actual
stream blocking bits out, so they can be handled separately if conflux is
present.
Because circuit-level sendmes are sent before relay data cells are processed,
we can safely move this to before the conflux decision. In this way,
regardless of conflux being negotiated, we still send sendmes as soon as data
cells are recieved. This avoids introducing conflux queue delay into RTT
measurement, which is important for measuring actual circuit capacity.
The circuit-level tracking must happen inside the call to send a data cell,
since that call now chooses a circuit to send on. Turns out, we were already
doing this kind of here, but only for the digest. Now we do both things here.
We use the oldest-circ-first method here, since that seems good for conflux:
queues could briefly spike, but the bad case is if they are maliciously
bloated to stick around for a long time.
The tradeoff here is that it is possible to kill old circuits on a relay
quickly, but that has always been the case with this algorithm choice.
Signed-off-by: David Goulet <dgoulet@torproject.org>