The issue is that we use the cpuworker system with relays only, so if we
start up as a client and transition to being a relay later, we'll be
sad.
This fixes bug 14901; not in any released version of Tor.
This fixes a bug where we decide to free the circuit because it isn't on
any workqueue anymore, and then the job finishes and the circuit gets
freed again.
Fixes bug #14815, not in any released version of Tor.
David Goulet finds that when he runs a busy relay for a while with the
latest version of the git code, the number of onionskins handled
slowly dwindles to zero, with total_pending_tasks wedged at its
maximum value.
I conjecture this is because the total_pending_tasks variable isn't
decremented when we successfully cancel a job. Fixed that.
Fixes bug 14741; bugfix not on any released version of tor.
Previously I used one queue per worker; now I use one queue for
everyone. The "broadcast" code is gone, replaced with an idempotent
'update' operation.
The solution I took is to not free a circuit with a pending
uncancellable work item, but rather to set its magic number to a
sentinel value. When we get a work item, we check whether the circuit
has that magic sentinel, and if so, we free it rather than processing
the reply.
To avoid having diffs turn out too big, I had replaced some unneeded
ifs and fors with if (1), so that the indentation would still work out
right. Now I might as well clean those up.
Long ago we supported systems where there was no support for
threads, or where the threading library was broken. We shouldn't
have do that any more: on every OS that matters, threads exist, and
the OS supports running threads across multiple CPUs.
This resolves tickets 9495 and 12439. It's a prerequisite to making
our workqueue code work better, since sensible workqueue
implementations don't split across multiple processes.
scan-build doesn't realize that a request can't be timed at the end
unless it's timed at the start, and so it's not possible for us to
be subtracting start from end without start being set.
Nevertheless, let's not confuse it.
If we don't, we can wind up with a wedged cpuworker, and write to it
for ages and ages.
Found by skruffy. This was a bug in 2dda97e8fd, a.k.a. svn
revision 402. It's been there since we have been using cpuworkers.
Now that circid_t is 4 bytes long, the default integer promotions will
leave it alone when sizeof(int) == 4, which will leave us formatting an
unsigned as an int. That's technically undefined behavior.
Fixes bug 8447 on bfffc1f0fc. Bug not
in any released Tor.
When we compute the estimated microseconds we need to handle our
pending onionskins, we could (in principle) overflow a uint32_t if
we ever had 4 million pending onionskins before we had any data
about how onionskins take. Nevertheless, let's compute it properly.
Fixes bug 8210; bugfix on 0.2.4.10. Found by coverity; this is CID
980651.
We need a weak RNG in a couple of places where the strong RNG is
both needless and too slow. We had been using the weak RNG from our
platform's libc implementation, but that was problematic (because
many platforms have exceptionally horrible weak RNGs -- like, ones
that only return values between 0 and SHORT_MAX) and because we were
using it in a way that was wrong for LCG-based weak RNGs. (We were
counting on the low bits of the LCG output to be as random as the
high ones, which isn't true.)
This patch adds a separate type for a weak RNG, adds an LCG
implementation for it, and uses that exclusively where we had been
using the platform weak RNG.
The right way to set "MaxOnionsPending" was to adjust it until the
processing delay was appropriate. So instead, let's measure how long
it takes to process onionskins (sampling them once we have a big
number), and then limit the queue based on its expected time to
finish.
This change is extra-necessary for ntor, since there is no longer a
reasonable way to set MaxOnionsPending without knowing what mix of
onionskins you'll get.
This patch also reserves 1/3 of the onionskin spots for ntor
handshakes, on the theory that TAP handshakes shouldn't be allowed to
starve their speedier cousins. We can change this later if need be.
Resolves 7291.
The unit of work sent to a cpuworker is now a create_cell_t; its
response is now a created_cell_t. Several of the things that call or
get called by this chain of logic now take create_cell_t or
created_cell_t too.
Since all cpuworkers are forked or spawned by Tor, they don't need a
stable wire protocol, so we can just send structs. This saves us some
insanity, and helps p
I'm going to want a generic "onionskin" type and set of wrappers, and
for that, it will be helpful to isolate the different circuit creation
handshakes. Now the original handshake is in onion_tap.[ch], the
CREATE_FAST handshake is in onion_fast.[ch], and onion.[ch] now
handles the onion queue.
This commit does nothing but move code and adjust header files.
Note: this is a squashed commit; see branch bug6465_rebased_v2 of user/andrea/tor.git for full history of the following 2 commits:
Use channel_t in cpuworker.c
Fix bug in channel_t usage in cpuworker.c that was killing relaying on channel_t-ized Tor. The tags passed to the worker now have a channel ID, not a connection ID.
The SMARTLIST_FOREACH macro is more convenient than BEGIN/END when
you have a nice short loop body, but using it for long bodies makes
your preprocessor tell the compiler that all the code is on the same
line. That causes grief, since compiler warnings and debugger lines
will all refer to that one line.
So, here's a new style rule: SMARTLIST_FOREACH blocks need to be
short.
The function is over 10 or 20% on some of Moritz's profiles, depending
on how you could.
Since it's checking for a multi-hour timeout, this is safe to do.
Fixes bug 4518.