To ease debugging of miscount issues, attach vanguards with --loglevel DEBUG
and obtain control port logs (or use any other control port CIRC and
CIRC_MINOR event logging mechanism).
If circuit padding wants to keep a circuit open and pathbias used to ignore
it, pathbias should continue to ignore it.
This may catch other purpose-change related miscounts (such as timeout
measurement, cannibalization, onion service circuit transitions, and
vanguards).
When a circuit is marked for close, check to see if any of our padding
machines want to take ownership of it and continue padding until the machine
hits the END state.
For safety, we also ensure that machines that do not terminate are still
closed as follows: Because padding machine timers are UINT32_MAX in size, if
some sort of network event doesn't happen on a padding-only circuit within
that time, we can conclude it is deadlocked and allow
circuit_expire_old_circuits_clientside() to close it.
If too much network activity happens, then per-machine padding limits can be
used to cease padding, which will cause network cell events to cease, on the
circuit, which will cause circpad to abandon the circuit as per the above time
limit.
We need to check here because otherwise we can try to schedule padding with no
tokens left upon the receipt of a padding event when our bins just became
empty.
Our other tests tested state lengths against padding packets, and token counts
against non-padding packets. This test checks state lengths against
non-padding packets (and also padding packets too), and checks token counts
against padding packets (and also non-padding packets too).
The next three commits are needed to make this test pass (it found 3 bugs).
Yay?
Since the reproducible RNG dumps its own seed, we don't need to do
it for it. Since tinytest can tell us if the test failed, we don't
need our own test_failed booleans.
This commit moves code that updates the state length and padding limit counts
out from the callback to its own function, for clarity.
It does not change functionality.
This commit moves the padding state limit checks and the padding rate limit
checks out of the token removal codepath, and causes all three functions to
get called from a single circpad_machine_count_nonpadding_sent() function.
It does not change functionality.
The code flow in theory can end up with a layer_hint to be NULL but in
practice it should never happen because with an origin circuit, we must have
the layer_hint.
Just in case, BUG() on it if we ever end up in this situation and recover by
closing the circuit.
Fixes#30467.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Fortunately, in 0.3.5.1-alpha we improved logging for various
failure cases involved with onion service client auth.
Unfortunately, for this one, we freed the file right before logging
its name.
Fortunately, tor_free() sets its pointer to NULL, so we didn't have
a use-after-free bug.
Unfortunately, passing NULL to %s is not defined.
Fortunately, GCC 9.1.1 caught the issue!
Unfortunately, nobody has actually tried building Tor with GCC 9.1.1
before. Or if they had, they didn't report the warning.
Fixes bug 30475; bugfix on 0.3.5.1-alpha.
The INTRODUCE1 trunnel definition file doesn't support that value so it can
not be used else it leads to an assert on the intro point side if ever tried.
Fortunately, it was impossible to reach that code path.
Part of #30454
Signed-off-by: David Goulet <dgoulet@torproject.org>
See proposal 289 section 4.3 for more details.
It describes the flow control protocol at the circuit and stream level. If
there is no FlowCtrl protocol version, tor supports the unauthenticated flow
control features from its supported Relay protocols.
At this commit, relay will start advertising FlowCtrl=1 meaning they support
authenticated SENDMEs v1.
Closes#30363
Signed-off-by: David Goulet <dgoulet@torproject.org>
- Move test-only cpath_get_n_hops() to crypt_path.c.
- Move onion_next_hop_in_cpath() and rename to cpath_get_next_non_open_hop().
The latter function was directly accessing cpath->state, and it's a first step
at hiding ->state.
Some of these functions are now public and cpath-specific so their name should
signify the fact they are part of the cpath module:
assert_cpath_layer_ok -> cpath_assert_layer_ok
assert_cpath_ok -> cpath_assert_ok
onion_append_hop -> cpath_append_hop
circuit_init_cpath_crypto -> cpath_init_circuit_crypto
circuit_free_cpath_node -> cpath_free
onion_append_to_cpath -> cpath_extend_linked_list
Now that we are using a constructor we should be more careful that we are
always using the constructor to initialize crypt_path_t, so make sure that
->private is initialized.
We are using an opaque pointer so the structure needs to be allocated on the
heap. This means we now need a constructor for crypt_path_t.
Also modify all places initializing a crypt_path_t to use the constructor.
For various reasons, this was a nontrivial movement. There are
several places in the code where we do something like "update the
flags on this routerstatus or node if we're an authority", and at
least one where we pretended to be an authority when we weren't.
I don't believe any of these represent a real timing vulnerability
(remote timing against memcmp() on a modern CPU is not easy), but
these are the ones where I believe we should be more careful.
For memeq and friends, "tor_" indicates constant-time and "fast_"
indicates optimized. I'm fine with leaving the constant-time
"safe_mem_is_zero" with its current name, but the "tor_" prefix on
the current optimized version is misleading.
Also, make the tor_digest*_is_zero() uniformly constant-time, and
add a fast_digest*_is_zero() version to use as needed.
A later commit in this branch will fix all the users of
tor_mem_is_zero().
Closes ticket 30309.
Manually fix up some reply-generating code that the Coccinelle scripts
won't match. Some more complicated ones remain -- these are mostly
ones that accumulate data to send, and then call connection_buf_add()
or connection_write_str_to_buf() directly.
Create a set of abstractions for controller commands and events to
output replies to the control channel. The control protocol has a
relatively consistent SMTP-like structure, so it's helpful when code
that implements control commands and events doesn't explicitly format
everything on its own.
Split the core reply formatting code out of control_fmt.c into
control_proto.c. The remaining code in control_format.c deals with
specific subsystems and will eventually move to join those subsystems.
When we tell the periodic event manager about an event, we are
"registering" that event. The event sits around without being
usable, however, until we "connect" the event to libevent. In the
end, we "disconnect" the event and remove its libevent parts.
Previously, we called these operations "add", "setup", and
"destroy", which led to confusion.
We need a little refactoring for this to work, since the
initialization code for the periodic events assumes that libevent is
already initialized, which it can't be until it's configured.
This change, combined with the previous ones, lets other subsystems
declare their own periodic events, without mainloop.c having to know
about them. Implements ticket 30293.
Because this function is poking within the relay_crypto_t object, move the
function to the module so we can keep it opaque as much as possible.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
We add random padding to every cell if there is room. This commit not only
fixes how we compute that random padding length/offset but also improves its
safety with helper functions and a unit test.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
We'll use it this in order to know when to hash the cell for the SENDME
instead of doing it at every cell.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
When adding random to a cell, skip the first 4 bytes and leave them zeroed. It
has been very useful in the past for us to keep bytes like this.
Some code trickery was added to make sure we have enough room for this 4 bytes
offset when adding random.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
The digest object is as large as the entire internal digest object's state,
which is often much larger than the actual set of bytes you're transmitting.
This commit makes it that we keep the digest itself which is 20 bytes.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behavior change but code had to be refactored a bit. Also, the tor_memcmp()
was changed to tor_memneq().
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
The circuit and stream level functions that update the package window have
been renamed to have a "_note_" in them to make their purpose more clear.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behavior change. Only moving code and fixing part of it in order to use the
parameters passed as pointers.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
To achieve such, this commit also changes the trunnel declaration to use a
union instead of a seperate object for the v1 data.
A constant is added for the digest length so we can use it within the SENDME
code giving us a single reference.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
In order to do so, depending on where the cell is going, we'll keep the last
cell digest that is either received inbound or sent outbound.
Then it can be used for validation.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
Now that we keep the last seen cell digests on the Exit side on the circuit
object, use that to match the SENDME v1 transforming this whole process into a
real authenticated SENDME mechanism.
Part of #26841
Signed-off-by: David Goulet <dgoulet@torproject.org>
This makes tor remember the last seen digest of a cell if that cell is the
last one before a SENDME on the Exit side.
Closes#26839
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit makes tor able to parse and handle a SENDME version 1. It will
look at the consensus parameter "sendme_accept_min_version" to know what is
the minimum version it should look at.
IMPORTANT: At this commit, the validation of the cell is not fully
implemented. For this, we need #26839 to be completed that is to match the
SENDME digest with the last cell digest.
Closes#26841
Signed-off-by: David Goulet <dgoulet@torproject.org>
This code will obey the consensus parameter "sendme_emit_min_version" to know
which SENDME version it should send. For now, the default is 0 and the
parameter is not yet used in the consensus.
This commit adds the support to send version 1 SENDMEs but aren't sent on the
wire at this commit.
Closes#26840
Signed-off-by: David Goulet <dgoulet@torproject.org>
In order to be able to deploy the authenticated SENDMEs, these two consensus
parameters are needed to control the minimum version that we can emit and
accept.
See section 4 in prop289 for more details.
Note that at this commit, the functions that return the values aren't used so
compilation fails if warnings are set to errors.
Closes#26842
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, we would only close the stream when our deliver window was
negative at the circuit-level but _not_ at the stream-level when receiving a
DATA cell.
This commit adds an helper function connection_edge_end_close() which
sends an END and then mark the stream for close for a given reason.
That function is now used both in case the deliver window goes below zero for
both circuit and stream level.
Part of #26840
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we are about to send a DATA cell, we have to decrement the package window
for both the circuit and stream level.
This commit adds helper functions to handle the package window decrement.
Part of #26288
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we get a relay DATA cell delivered, we have to decrement the deliver
window on both the circuit and stream level.
This commit adds helper functions to handle the deliver window decrement.
Part of #26840
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is a bit of a complicated commit. It moves code but also refactors part
of it. No behavior change, the idea is to split things up so we can better
handle and understand how SENDME cells are processed where ultimately it will
be easier to handle authenticated SENDMEs (prop289) using the intermediate
functions added in this commit.
The entry point for the cell arriving at the edge (Client or Exit), is
connection_edge_process_relay_cell() for which we look if it is a circuit or
stream level SENDME. This commit refactors that part where two new functions
are introduced to process each of the SENDME types.
The sendme_process_circuit_level() has basically two code paths. If we are a
Client (the circuit is origin) or we are an Exit. Depending on which, the
package window is updated accordingly. Then finally, we resume the reading on
every edge streams on the circuit.
The sendme_process_stream_level() applies on the edge connection which will
update the package window if needed and then will try to empty the inbuf if
need be because we can now deliver more cells.
Again, no behavior change but in order to split that code properly into their
own functions and outside the relay.c file, code modification was needed.
Part of #26840.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Take apart the SENDME cell specific code and put it in sendme.{c|h}. This is
part of prop289 that implements authenticated SENDMEs.
Creating those new files allow for the already huge relay.c to not grow in LOC
and makes it easier to handle and test the SENDME cells in an isolated way.
This commit only moves code. No behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The nodelist_idx for each node_t serves as a unique identifier for
the node, so we can use a bitarray to hold all the excluded
nodes, and then remove them from the smartlist.
Previously use used smartlist_subtract(sl, excluded), which is
O(len(sl)*len(excluded)).
We can use this function in other places too, but this is the one
that showed up on the profiles of 30291.
Closes ticket 30307.
This command does not fit perfectly with the others, since its
second argument is optional and may contain equal signs. Still,
it's probably better to squeeze it into the new metaformat, since
doing so allows us to remove several pieces of the old
command-parsing machinery.
The two options are mutually exclusive, since otherwise an entry
like "Foo" would be ambiguous. We want to have the ability to treat
entries like this as keys, though, since some controller commands
interpret them as flags.
(This should be all of the command that work nicely with positional
arguments only.)
Some of these commands should probably treat extra arguments as
incorrect, but for now I'm trying to be careful not to break
any existing users.
The first line break in particular was mishandled: it was discarded
if no arguments came before it, which made it impossible to
distinguish arguments from the first line of the body.
To solve this, we need to allocate a copy of the command rather than
using NUL to separate it, since we might have "COMMAND\n" as our input.
Fixes ticket 29984.
There _is_ an underlying logic to these commands, but it isn't
wholly uniform, given years of tweaks and changes. Fortunately I
think there is a superset that will work.
This commit adds a parser for some of the most basic cases -- the
ones currently handled by getargs_helper() and some of the
object-taking ones. Soon will come initial tests; then I'll start using
the parser.
After that, I'll expand the parser to handle the other cases that come
up in the controller protocol.
We have checks in various places in mainlook.c to make sure that
events are initialized before we invoke any periodic_foo() functions
on them. But now that each subsystem will own its own periodic
events, it will be cleaner if we don't assume that they are all
setup or not.
The end goal here is to move the periodic callback to their
respective modules, so that mainloop.c doesn't have to include so
many other things.
This patch doesn't actually move any of the callbacks out of
mainloop.c yet.
In this patch we lower the log level of the failures for the three calls
to unlink() in networkstatus_set_current_consensus(). These errors might
trigger on Windows because the memory mapped consensus file keeps the
file in open state even after we have close()'d it. Windows will then
error on the unlink() call with a "Permission denied" error.
The consequences of ignoring these errors is that we leave an unused
file around on the file-system, which is an easier way to fix this
problem right now than refactoring networkstatus_set_current_consensus().
See: https://bugs.torproject.org/29930
In "make test-network-all", test IPv6-only v3 single onion services,
using the chutney network single-onion-v23-ipv6-md. This test will
not pass until 23588 has been merged.
Closes ticket 27251.
Disable padding via limit check and machine condition. Limits cause us to stop
sending padding. Machine conditions cause the machines to be shut down, and
not restarted.
In 0.3.4 and later, these functions are declared in rephist.h:
STATIC uint64_t find_largest_max(bw_array_t *b);
STATIC void commit_max(bw_array_t *b);
STATIC void advance_obs(bw_array_t *b);
But in 0.2.9, they are declared in rephist.c and test_relay.c.
So compilers fail with a "must use 'struct' tag" error.
We add the missing struct typedef in test_relay.c, to match the
declarations in rephist.c.
(Merge commit 813019cc57 moves these functions into rephist.h instead.)
Fixes bug 30184; not in any released version of Tor.
When releasing OpenSSL patch-level maintenance updates,
we do not want to rebuild binaries using it.
And since they guarantee ABI stability, we do not have to.
Without this patch, warning messages were produced
that confused users:
https://bugzilla.opensuse.org/show_bug.cgi?id=1129411
Fixes bug 30190; bugfix on 0.2.4.2-alpha commit 7607ad2bec
Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
In 0.3.4 and later, we declare write_array as:
extern struct bw_array_t *write_array;
...
typedef struct bw_array_t bw_array_t;
But in 0.2.9, we declare write_array as:
typedef struct bw_array_t bw_array_t;
extern bw_array_t *write_array;
And then again in rephist.c:
typedef struct bw_array_t bw_array_t;
So some compilers fail with a duplicate declaration error.
We backport 684b396ce5, which removes the duplicate declaration.
And this commit deals with the undeclared type error.
Backports a single line from merge commit 813019cc57.
Fixes bug 30184; not in any released version of Tor.
We need to encode here instead of doing escaped(), since fwict
escaped() does not currently handle NUL bytes.
Also, use warn_if_nul_found in more cases to avoid duplication.
The smartlist functions take great care to reset unused pointers inside
the smartlist memory to NULL.
The function smartlist_remove_keeporder does not clear memory in such
way when elements have been removed. Therefore call memset after the
for-loop that removes elements. If no element is removed, it is
effectively a no-op.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The smartlist code takes great care to set all unused pointers inside
the smartlist memory to NULL. Check if this is also the case after
modifying the smartlist multiple times.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Previously, our use of abort() would break anywhere that we didn't
include stdlib.h. This was especially troublesome in case where
tor_assert_nonfatal() was used with ALL_BUGS_ARE_FATAL, since that
one seldom gets tested.
As an alternative, we could have just made this header include
stdlib.h. But that seems bloaty.
Fixes bug 30189; bugfix on 0.3.4.1-alpha.
Coverity doesn't like to see a path where we test a pointer for
NULL if we have already ready dereferenced the pointer on that
path. While in this case, the check is not needed, it's best not to
remove checks from the unit tests IMO. Instead, I'm adding an
earlier check, so that coverity, when analyzing this function, will
think that we have always checked the pointer before dereferencing
it.
Closes ticket 30180; CID 1444641.
Use a table-based lookup to find the right command handler. This
will serve as the basement for several future improvements, as we
improve the API for parsing commands.
This should please coverity, and fix CID 1415721. It didn't
understand that networkstatus_get_param() always returns a value
between its minimum and maximum values.
This should please coverity, and fix CID 1415722. It didn't
understand that networkstatus_get_param() always returns a value
between its minimum and maximum values.
This should please coverity, and fix CID 1415723. It didn't understand
that networkstatus_get_param() always returns a value between its
minimum and maximum values.
The logic here should be "use versions or free it". The "free it"
part was previously in a kind of obfuscated place, so coverity
wasn't sure it was invoked as appropriate. CID 1437436.
The function compat_getdelim_ is used for tor_getline if tor is compiled
on a system that lacks getline and getdelim. These systems should be
very rare, considering that getdelim is POSIX.
If this system is further a 32 bit architecture, it is possible to
trigger a double free with huge files.
If bufsiz has been already increased to 2 GB, the next chunk would
be 4 GB in size, which wraps around to 0 due to 32 bit limitations.
A realloc(*buf, 0) could be imagined as "free(*buf); return malloc(0);"
which therefore could return NULL. The code in question considers
that an error, but will keep the value of *buf pointing to already
freed memory.
The caller of tor_getline() would free the pointer again, therefore
leading to a double free.
This code can only be triggered in dirserv_read_measured_bandwidths
with a huge measured bandwith list file on a system that actually
allows to reach 2 GB of space through realloc.
It is not possible to trigger this on Linux with glibc or other major
*BSD systems even on unit tests, because these systems cannot reach
so much memory due to memory fragmentation.
This patch is effectively based on the penetration test report of
cure53 for curl available at https://cure53.de/pentest-report_curl.pdf
and explained under section "CRL-01-007 Double-free in aprintf() via
unsafe size_t multiplication (Medium)".
If the concatenation of connection buffer and the buffer of linked
connection exceeds INT_MAX bytes, then buf_move_to_buf returns -1 as an
error value.
This value is currently casted to size_t (variable n_read) and will
erroneously lead to an increasement of variable "max_to_read".
This in turn can be used to call connection_buf_read_from_socket to
store more data inside the buffer than expected and clogging the
connection buffer.
If the linked connection buffer was able to overflow INT_MAX, the call
of buf_move_to_buf would have previously internally triggered an integer
overflow, corrupting the state of the connection buffer.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Many buffer functions have a hard limit of INT_MAX for datalen, but
this limitation is not enforced in all functions:
- buf_move_all may exceed that limit with too many chunks
- buf_move_to_buf exceeds that limit with invalid buf_flushlen argument
- buf_new_with_data may exceed that limit (unit tests only)
This patch adds some annotations in some buf_pos_t functions to
guarantee that no out of boundary access could occur even if another
function lacks safe guards against datalen overflows.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If the concatenation of connection buffer and the buffer of linked
connection exceeds INT_MAX bytes, then buf_move_to_buf returns -1 as an
error value.
This value is currently casted to size_t (variable n_read) and will
erroneously lead to an increasement of variable "max_to_read".
This in turn can be used to call connection_buf_read_from_socket to
store more data inside the buffer than expected and clogging the
connection buffer.
If the linked connection buffer was able to overflow INT_MAX, the call
of buf_move_to_buf would have previously internally triggered an integer
overflow, corrupting the state of the connection buffer.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Many buffer functions have a hard limit of INT_MAX for datalen, but
this limitation is not enforced in all functions:
- buf_move_all may exceed that limit with too many chunks
- buf_move_to_buf exceeds that limit with invalid buf_flushlen argument
- buf_new_with_data may exceed that limit (unit tests only)
This patch adds some annotations in some buf_pos_t functions to
guarantee that no out of boundary access could occur even if another
function lacks safe guards against datalen overflows.
[This is a backport of the submitted patch to 0.2.9, where the
buf_move_to_buf and buf_new_with_data functions did not exist.]
Fixes bug 29922; bugfix on 0.2.9.3-alpha when we tried to capture
all these warnings. No need to backport any farther than 0.3.5,
though -- these warnings don't cause test failures before then.
This one was tricky to find because apparently it only happened on
_some_ windows builds.
In current NSS versions, these ciphersuites don't work with
SSL_ExportKeyingMaterial(), which was causing relays to fail when
they tried to negotiate the v3 link protocol authentication.
Fixes bug 29241; bugfix on 0.4.0.1-alpha.