Accomplished via the following:
1. Use NETINFO cells to determine if both peers will agree on canonical
status. Prefer connections where they agree to those where they do not.
2. Alter channel_is_better() to prefer older orconns in the case of multiple
canonical connections, and use the orconn with more circuits on it in case
of age ties.
Also perform some hourly accounting on how many of these types of connections
there are and log it at info or notice level.
This defense will cause Cisco, Juniper, Fortinet, and other routers operating
in the default configuration to collapse netflow records that would normally
be split due to the 15 second flow idle timeout.
Collapsing these records should greatly reduce the utility of default netflow
data for correlation attacks, since all client-side records should become 30
minute chunks of total bytes sent/received, rather than creating multiple
separate records for every webpage load/ssh command interaction/XMPP chat/whatever
else happens to be inactive for more than 15 seconds.
The defense adds consensus parameters to govern the range of timeout values
for sending padding packets, as well as for keeping connections open.
The defense only sends padding when connections are otherwise inactive, and it
does not pad connections used solely for directory traffic at all. By default
it also doesn't pad inter-relay connections.
Statistics on the total padding in the last 24 hours are exported to the
extra-info descriptors.
Right now, there's only a mechanism to look for a channel where the
RSA ID matches *and* the ED ID matches. We can add a separate map
later if we want.
In particular, these functions are the ones that set the identity of
a given connection or channel, and/or confirm that we have learned
said IDs.
There's a lot of stub code here: we don't actually need to use the
new keys till we start looking up connections/channels by Ed25519
IDs. Still, we want to start passing the Ed25519 IDs in now, so it
makes sense to add these stubs as part of 15055.
The functions it warns about are:
assert, memcmp, strcat, strcpy, sprintf, malloc, free, realloc,
strdup, strndup, calloc.
Also, fix a few lingering instances of these in the code. Use other
conventions to indicate _intended_ use of assert and
malloc/realloc/etc.
This is a big-ish patch, but it's very straightforward. Under this
clang warning, we're not actually allowed to have a global variable
without a previous extern declaration for it. The cases where we
violated this rule fall into three roughly equal groups:
* Stuff that should have been static.
* Stuff that was global but where the extern was local to some
other C file.
* Stuff that was only global when built for the unit tests, that
needed a conditional extern in the headers.
The first two were IMO genuine problems; the last is a wart of how
we build tests.
Conflicts:
src/or/channel.c
src/or/circuitlist.c
src/or/connection.c
Conflicts involved removal of next_circ_id and addition of
unusable-circid tracking.
The point of the "idle timeout" for connections is to kill the
connection a while after it has no more circuits. But using "last
added a non-padding cell" as a proxy for that is wrong, since if the
last circuit is closed from the other side of the connection, we
will not have sent anything on that connection since well before the
last circuit closed.
This is part of fixing 6799.
When applied to 0.2.5, it is also a fix for 12023.
Use a per-channel ratelim_t to control the rate at which we report
failures for each channel.
Explain why I picked N=32.
Never return a zero circID.
Thanks to Andrea and to cypherpunks.
Fixes a possible root cause of 11553 by only making 64 attempts at
most to pick a circuitID. Previously, we would test every possible
circuit ID until we found one or ran out.
This algorithm succeeds probabilistically. As the comment says:
This potentially causes us to give up early if our circuit ID
space is nearly full. If we have N circuit IDs in use, then we
will reject a new circuit with probability (N / max_range) ^
MAX_CIRCID_ATTEMPTS. This means that in practice, a few percent
of our circuit ID capacity will go unused.
The alternative here, though, is to do a linear search over the
whole circuit ID space every time we extend a circuit, which is
not so great either.
This makes new vs old clients distinguishable, so we should try to
batch it with other patches that do that, like 11438.
Most of these are simple. The only nontrivial part is that our
pattern for using ENUM_BF was confusing doxygen by making declarations
that didn't look like declarations.
By calling circuit_n_chan_done() unconditionally on close, we were
closing pending connections that might not have been pending quite for
the connection we were closing. Fix for bug 9880.
Thanks to skruffy for finding this and explaining it patiently until
we understood.