We add a compression level argument to tor_zlib_new, and use it to
determine how much memory to allocate for the zlib object. We use the
existing level by default, but shift to smaller levels for small
requests when we have been over 3/4 of our memory usage in the past
half-hour.
Closes ticket 11791.
Also, refactor the way we handle failed handshakes so that this
warning doesn't propagate itself to "onion_skin_client_handshake
failed" and "circuit_finish_handshake failed" and
"connection_edge_process_relay_cell (at origin) failed."
Resolves warning from 9635.
Two bugs here:
1) We didn't add EXTEND2/EXTENDED2 to relay_command_to_string().
2) relay_command_to_string() didn't log the value of unrecognized
commands.
Both fixed here.
Check for consistency between the queued destroy cells and the marked
circuit IDs. Check for consistency in the count of queued destroy
cells in several ways. Check to see whether any of the marked circuit
IDs have somehow been marked longer than the channel has existed.
tor_memeq has started to show up on profiles, and this is one of the
most frequent callers of that function, appearing as it does on every
cell handled for entry or exit.
59f9097d5c introduced tor_memneq here;
it went into Tor 0.2.1.31. Fixes part of 12169.
Also, stop accepting the old kind of RESOLVED cells with no TTL
fields; they haven't been sent since 0.1.1.6-alpha.
This patch won't work without the fix to #10468 -- it will break
DNSPorts unless they set the proper ipv4/6 flags on entry_connection_t.
This contains the obvious implementation using the circuitmux data
structure. It also runs the old (slow) algorithm and compares
the results of the two to make sure that they're the same.
Needs review and testing.
In a couple of places, to implement the OOM-circuit-killer defense
against sniper attacks, we have counters to remember the age of
cells or data chunks. These timers were based on wall clock time,
which can move backwards, thus giving roll-over results for our age
calculation. This commit creates a low-budget monotonic time, based
on ratcheting gettimeofday(), so that even in the event of a time
rollback, we don't do anything _really_ stupid.
A future version of Tor should update this function to do something
even less stupid here, like employ clock_gettime() or its kin.
This patch splits out some of the functions in OOM handling so that
it's easier to check them without involving the rest of Tor or
requiring that the circuits be "wired up".
This was a mistake in the merge commit 7a2b30fe16. It
would have made the CellStatistics code give completely bogus
results. Bug not in any released Tor.
Conflicts:
src/or/or.h
src/or/relay.c
Conflicts were simple to resolve. More fixes were needed for
compilation, including: reinstating the tv_to_msec function, and renaming
*_conn_cells to *_chan_cells.
Previously, when we ran low on memory, we'd close whichever circuits
had the most queued cells. Now, we close those that have the
*oldest* queued cells, on the theory that those are most responsible
for us running low on memory, and that those are the least likely to
actually drain on their own if we wait a little longer.
Based on analysis from a forthcoming paper by Jansen, Tschorsch,
Johnson, and Scheuermann. Fixes bug 9093.
We previously used FILENAME_PRIVATE identifiers mostly for
identifiers exposed only to the unit tests... but also for
identifiers exposed to the benchmarker, and sometimes for
identifiers exposed to a similar module, and occasionally for no
really good reason at all.
Now, we use FILENAME_PRIVATE identifiers for identifiers shared by
Tor and the unit tests. They should be defined static when we
aren't building the unit test, and globally visible otherwise. (The
STATIC macro will keep us honest here.)
For identifiers used only by the unit tests and never by Tor at all,
on the other hand, we wrap them in #ifdef TOR_UNIT_TESTS.
This is not the motivating use case for the split test/non-test
build system; it's just a test example to see how it works, and to
take a chance to clean up the code a little.
This implements "algorithm 1" from my discussion of bug #9072: on OOM,
find the circuits with the longest queues, and kill them. It's also a
fix for #9063 -- without the side-effects of bug #9072.
The memory bounds aren't perfect here, and you need to be sure to
allow some slack for the rest of Tor's usage.
This isn't a perfect fix; the rest of the solutions I describe on
codeable.
I added the code to pass a destroy cell to a queueing function rather
than writing it immediately, and the code to remember that we
shouldn't reuse the circuit id until the destroy is actually sent, and
the code to release the circuit id once the destroy has been sent...
and then I finished by hooking destroy_cell_queue into the rest of
Tor.
- Move cell_command_to_string from control.c to command.c.
- Use accessor for global_circuitlist instead of extern.
- Add a struct for cell statistics by command instead of six arrays.
- Split up control_event_circuit_cell_stats by using two helper functions.
- Add TestingEnableCellStatsEvent option.
- Prepare functions for testing.
- Rename a few variables and document a few things better.
Found while investigating 8093, but probably not the cause of it,
since this bug would result in us sending too few SENDMEs, not in us
receiving SENDMEs unexpectedly.
Bugfix on the fix for 7889, which has appeared in 0.2.4.10-alpha, but
not yet in any released 0.2.3.x version.
Also, don't call the exit node 'reject *' unless our decision to pick
that node was based on a non-summarized version of that node's exit
policy.
rransom and arma came up with the ideas for this fix.
Fix for 7582; the summary-related part is a bugfix on 0.2.3.2-alpha.
In a number of places, we decrement timestamp_dirty by
MaxCircuitDirtiness in order to mark a stream as "unusable for any
new connections.
This pattern sucks for a few reasons:
* It is nonobvious.
* It is error-prone: decrementing 0 can be a bad choice indeed.
* It really wants to have a function.
It can also introduce bugs if the system time jumps backwards, or if
MaxCircuitDirtiness is increased.
So in this patch, I add an unusable_for_new_conns flag to
origin_circuit_t, make it get checked everywhere it should (I looked
for things that tested timestamp_dirty), and add a new function to
frob it.
For now, the new function does still frob timestamp_dirty (after
checking for underflow and whatnot), in case I missed any cases that
should be checking unusable_for_new_conns.
Fixes bug 6174. We first used this pattern in 516ef41ac1,
which I think was in 0.0.2pre26 (but it could have been 0.0.2pre27).
Coverity is worried that we're checking entry_conn in some cases,
but not in the case where we set entry_conn->pending_optimistic_data.
This commit should calm it down (CID 718623).
This check isn't necessary (see comment on #7801), but it took at
least two smart people a little while to see why it wasn't necessary,
so let's have it in to make the code more readable.
We need a weak RNG in a couple of places where the strong RNG is
both needless and too slow. We had been using the weak RNG from our
platform's libc implementation, but that was problematic (because
many platforms have exceptionally horrible weak RNGs -- like, ones
that only return values between 0 and SHORT_MAX) and because we were
using it in a way that was wrong for LCG-based weak RNGs. (We were
counting on the low bits of the LCG output to be as random as the
high ones, which isn't true.)
This patch adds a separate type for a weak RNG, adds an LCG
implementation for it, and uses that exclusively where we had been
using the platform weak RNG.