Remember, you can't check to see if there are N bytes left in a
buffer by doing (buf + N < end), since the buf + N computation might
take you off the end of the buffer and result in undefined behavior.
Fixes 28202; bugfix on 0.2.0.3-alpha.
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
With the new refresh_service_descriptor() function we had both
refresh_service_descriptor() and update_service_descriptor() which is basically
the same thing.
This commit renames update_service_descriptor() to
update_service_descriptor_intro_points() to make it clear it's not a generic
refresh and it's only about intro points.
Commit changes no code.
Treat backtrace test failures as expected on NetBSD, OpenBSD, and
macOS/Darwin, until we solve bug 17808.
(FreeBSD failures have been treated as expected since 18204 in 0.2.8.)
Fixes bug 27948; bugfix on 0.2.5.2-alpha.
Before this commit, we would create the descriptor signing key certificate
when first building the descriptor.
In some extreme cases, it lead to the expiry of the certificate which triggers
a BUG() when encoding the descriptor before uploading.
Ticket #27838 details a possible scenario in which this can happen. It is an
edge case where tor losts internet connectivity, notices it and closes all
circuits. When it came back up, the HS subsystem noticed that it had no
introduction circuits, created them and tried to upload the descriptor.
However, in the meantime, if tor did lack a live consensus because it is
currently seeking to download one, we would consider that we don't need to
rotate the descriptors leading to using the expired signing key certificate.
That being said, this commit does a bit more to make this process cleaner.
There are a series of things that we need to "refresh" before uploading a
descriptor: signing key cert, intro points and revision counter.
A refresh function is added to deal with all mutable descriptor fields. It in
turn simplified a bit the code surrounding the creation of the plaintext data.
We keep creating the cert when building the descriptor in order to accomodate
the unit tests. However, it is replaced every single time the descriptor is
uploaded.
Fixes#27838
Signed-off-by: David Goulet <dgoulet@torproject.org>
When storing a descriptor in the client cache, if we are about to replace an
existing descriptor, make sure to close every introduction circuits of the old
descriptor so we don't have leftovers lying around.
Ticket 27471 describes a situation where tor is sending an INTRODUCE1 cell on
an introduction circuit for which it doesn't have a matching intro point
object (taken from the descriptor).
The main theory is that, after a new descriptor showed up, the introduction
points changed which led to selecting an introduction circuit not used by the
service anymore thus for which we are unable to find the corresponding
introduction point within the descriptor we just fetched.
Closes#27471.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It won't be used if there are no authorized client configured. We do that so
we can easily support the addition of a client with a HUP signal which allow
us to avoid more complex code path to generate that cookie if we have at least
one client auth and we had none before.
Fixes#27995
Signed-off-by: David Goulet <dgoulet@torproject.org>
Occasionally, key pinning doesn't catch a relay that shares an ed25519
ID with another relay. Log the identity fingerprints and the shared
ed25519 ID when this happens, instead of making a BUG() warning.
Fixes bug 27800; bugfix on 0.3.2.1-alpha.
Commit 488e2b00bf introduced an issue, most
likely introduced by a bad copy paste, that made us stop reading on the
connection if our write bandwidth limit was reached.
The problem is that because "read_blocked_on_bw" was never set, the connection
was never reenabled for reading.
This is most likely the cause of #27813 where bytes were accumulating in the
kernel TCP bufers because tor was not doing reads. Only relays with
RelayBandwidthRate would suffer from this but affecting all relays connecting
to them. And using that tor option is recommended and best practice so many
many relays have it enabled.
Fixes#28089.
It turns out that if _only_ the ControlPort is set and nothing else, tor would
simply not bootstrap and thus not start properly. Commit 67a41b6306
removed that requirement for tor to be considered a "client".
Unfortunately, this made the mainloop enable basically nothing if only the
ControlPort is set in the torrc.
This commit now makes it that we also consider the ControlPort when deciding
if we are a Client or not. It does not revert 67a41b6306 meaning
options_any_client_port_set() stays the same, not looking at the control port.
Fixes#27849.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Nothing should ever look at them on failure, but in some cases,
the unit tests don't check for failure, and then GCC-LTO freaks out.
Fixes part of 27772.
These confused GCC LTO, which thought they might be used
uninitialized. I'm pretty sure that as long as 'res' indicates
success, they will always be set to something, but let's unconfuse
the compiler in any case.
OpenSolaris apparently doesn't have timeradd(), so we added a
replacement, but we weren't including it here after the big
refactoring in 0.3.5.1-alpha.
Fixes bug 27963; bugfix on 0.3.5.1-alpha.
It looks to be the case that Rust's standard allocator, jemalloc, is
incompatible with sanitizers. The incompatibility, for whatever reason,
seems to cause segfaults at runtime when jemalloc is linked with
sanitizers.
Without actually trying to figure out what's going on here this commit
instead takes the hammer of "let's remove jemalloc when testing". The
`tor_allocate` crate now by default switches to the system allocator
(eventually this will want to be the tor allocator). Most crates then
link to `tor_allocate` ot pick this up, but the `smartlist` crate had to
manually switch to the system allocator in testing and the `external`
crate had to be sure to link to `tor_allocate`.
The final gotcha here is that this patch also switches to
unconditionally passing `--target` to Cargo. For weird and arcane
reasons passing `--target` with the host target of the compiler (which
Cargo otherwise uses as the default) is different than not passing
`--target` at all. This ensure that our custom `RUSTFLAGS` with
sanitizer options doesn't make its way into build scripts, just the
final testing artifacts.
This is no longer necessary with upstream rust-lang/rust changes as well
as some local tweaks. Namely:
* The `-fsanitize=address`-style options are now passed via `-C
link-args` through `RUSTFLAGS`. This obviates the need for the shell
script.
* The `-C default-linker-libraries`, disabling `-nodefaultlibs`, is
passed through `RUSTFLAGS`, which is necessary to ensure that
`-fsanitize=address` links correctly.
* The `-C linker` option is passed to ensure we're using the same C
compiler as normal C code, although it has a bit of hackery to only
get the `gcc` out of `gcc -std=c99`
Allowing this didn't do any actual harm, since there aren't any
shared structures or leakable objects here. Still, it's bad style
and might cause trouble in the future.
Closes ticket 27856.
Various places in our code try to activate these events or check
their status, so we should make sure they're initialized as early as
possible. Fixes bug 27861; bugfix on 0.3.5.1-alpha.
When freeing a configuration object from confparse.c in
dump_config(), we need to call the appropriate higher-level free
function (like or_options_free()) and not just config_free().
This only happens with options (since they're the one where
options_validate allocates extra stuff) and only when running
--dump-config with something other than minimal (since
OPTIONS_DUMP_MINIMAL doesn't hit this code).
Fixes bug 27893; bugfix on 0.3.2.1-alpha.
It differs from the rest of the rephist code in that it's actually
necessary for Tor to operate, so it should probably go somewhere
else. I'm not sure where yet, so I'll leave it in the same
directory, but give it its own file.
Since this is completely core functionality, I'm putting it in
core/mainloop, even though it depends on feature/hibernate. We'll
have to sort that out in the future.
If a tor client gets a descriptor that it can't decrypt, chances are that the
onion requires client authorization.
If a tor client is configured with client authorization for an onion but
decryption fails, it means that the configured keys aren't working anymore.
In both cases, we'll log notice the former and log warn the latter and the
rest of the decryption errors are now at info level.
Two logs statement have been removed because it was redundant and printing the
fetched descriptor in the logs when 80% of it is encrypted wat not helping.
Fixes#27550
Signed-off-by: David Goulet <dgoulet@torproject.org>
The main.c code is responsible for initialization and shutdown;
the mainloop.c code is responsible for running the main loop of Tor.
Splitting the "generic event loop" part of mainloop.c from the
event-loop-specific part is not done as part of this patch.
The parts for handling cell formats should be in src/core/or.
The parts for handling onionskin queues should be in src/core/or.
Only the crypto wrapper belongs in src/core/crypto.
When sending the INTRODUCE1 cell, we acquire the needed data for the cell but
if the RP node_t has invalid data, we'll fail the send and completely kill the
SOCKS connection.
Instead, close the rendezvous circuit and return a transient error meaning
that Tor can recover by selecting a new rendezvous point. We'll also do the
same when we are unable to encode the INTRODUCE1 cell for which at that point,
we'll simply take another shot at a new rendezvous point.
Fixes#27774
Signed-off-by: David Goulet <dgoulet@torproject.org>
The result of CString::into_raw() is not safe to free
with free() except under finicky and fragile circumstances
that we definitely don't meet right now.
This was missed in be583a34a3.
When we close a socket via tor_tls_free(), we previously had no way
for our socket accounting logic to learn about it. This meant that
the socket accounting code would think we had run out of sockets,
and freak out.
Fixes bug 27795; bugfix on 0.3.5.1-alpha.
In dirauth:
* bwauth.c reads and uses bandwidth files
* guardfraction.c reads and uses the guardfraction file
* reachability.c tests relay reachability
* recommend_pkg.c handles the recommended-packages lines.
* recv_descs.c handles fingerprint files and processing incoming
routerinfos that relays upload to us
* voteflag.c computes flag thresholds and sets those thresholds on
routerstatuses when computing votes
In control:
* fmt_serverstatus.c generates the ancient "v1 server status"
format that controllers expect.
In nodelist:
* routerstatus_fmt.c formats routerstatus entries for a consensus,
a vote, or for the controller.
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
There are now separate modules for:
* the list of router descriptors
* the list of authorities and fallbacks
* managing authority certificates
* selecting random nodes
That unit test makes sure we don't have pending SOCK request if the descriptor
turns out to be unusable.
Part of #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The trunnel functions are written under the assumption that their
allocators can fail, so GCC LTO thinks they might return NULL. In
point of fact, they're using tor_malloc() and friends, which can't
fail, but GCC won't necessarily figure that out.
Fixes part of #27772.
Instead, have it call a mockable function. We don't want
crypto_strongest_rand() to be mockable, since doing so creates a
type error when we call it from ed25519-donna, which we do not build
in a test mode.
Fixes bug 27728; bugfix on 0.3.5.1-alpha
This argument was added to match an older idea for the C api, but we
decided not to do it that way in C.
Fixes bug 27741; bugfix on 0.3.3.6 / TROVE-2018-005 fix.
Since the default cache directory is the same as the default data
directory, we don't want the default CacheDirectoryGroupReadable
value (0) to override an explicitly set "DataDirectoryGroupReadable
1".
To fix this, I'm making CacheDirectoryGroupReadable into an
autobool, and having the default (auto) value mean "Use the value of
DataDirectoryGroupReadable if the directories are the same, and 0
otherwise."
Fixes bug 26913; bugfix on 0.3.3.1-alpha when the CacheDirectory
option was introduced.
This shouldn't be a user-visible change: nobody has a 16 MB RSA
key that they're trying to use with Tor.
I'm doing this to fix CID 1439330 / ticket 27730, where coverity
complains (on 64-bit) that we are making a comparison that is never
true.
This patch moves the logic that adds the proxy headers to an earlier
point in the exit connection lifetime, which ensures that the
application data cannot be written to the outbuf before the proxy header
is added.
See: https://bugs.torproject.org/4700
This patch changes HiddenServiceExportCircuitID so instead of being a
boolean it takes a string, which is the protocol. Currently only the
'haproxy' protocol is defined.
See: https://bugs.torproject.org/4700
Without this patch we would encode the IPv6 address' last part as
::ffffffff instead of ::ffff:ffff when the GID is UINT32_MAX.
See: https://bugs.torproject.org/4700
In hs_config.c, we do validate the permission of the hidden service directory
but we do not try to create it. So, in the event that the directory doesn't
exists, we end up in the loading key code path which checks for the
permission and possibly creates the directory. On failure, don't BUG() since
there is a perfectly valid use case for that function to fail.
Fixes#27335
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is harder than with OpenSSL, since OpenSSL counts the bytes on
its own and NSS doesn't. To fix this, we need to define a new
PRFileDesc layer that has its own byte-counting support.
Closes ticket 27289.
On GCC and Clang, there's a feature to warn you about bad
conditionals like "if (a = b)", which should be "if (a == b)".
However, they don't warn you if there are extra parentheses around
"a = b".
Unfortunately, the tor_assert() macro and all of its kin have been
passing their inputs through stuff like PREDICT_UNLIKELY(expr) or
PREDICT_UNLIKELY(!(expr)), both of which expand to stuff with more
parentheses around "expr", thus suppressing these warnings.
To fix this, this patch introduces new macros that do not wrap
expr. They're only used when GCC or Clang is enabled (both define
__GNUC__), since they require GCC's "({statement expression})"
syntax extension. They're only used when we're building the
unit-test variant of the object files, since they suppress the
branch-prediction hints.
I've confirmed that tor_assert(), tor_assert_nonfatal(),
tor_assert_nonfatal_once(), BUG(), and IF_BUG_ONCE() all now give
compiler warnings when their argument is an assignment expression.
Fixes bug 27709.
Bugfix on 0.0.6, where we first introduced the "tor_assert()" macro.
.retain() would allocating a Vec of billions of integers and check them
one at a time to separate the supported versions from the unsupported.
This leads to a memory DoS.
Closes ticket 27206. Bugfix on e6625113c9.
Before 0.3.3.1-alpha, we would exit() in this case immediately. But
now that we leave tor_main() more conventionally, we need to make
sure we restore things so as not to cause a double free.
Fixes bug 27708; bugfix on 0.3.3.1-alpha.
Since we use a 32-bit approximation for millisecond conversion here,
we can't expect so much precision.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
Multiply-then-divide is more accurate, but it runs into trouble when
our input is above INT32_MAX/numerator. So when our value is too
large, do divide-then-multiply instead.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
We use an optimized but less accurate formula for converting coarse
time differences to milliseconds on 32-bit OSX platforms, so that we
can avoid 64-bit division.
The old numbers were off by 0.4%. The new numbers are off by .006%.
This should make the unit tests a bit cleaner, and our tolerances a
bit closer.
All node_get_all_orports() does is allocate and return a smartlist
with at most two tor_addr_port_t members that match ORPort's of
node configuration. This is harmful for memory efficiency, as it
allocates the same stuff every time it is called. However,
node_is_a_configured_bridge() does not need to call it, as it
already has all the information to check if there is configured
bridge for a given node.
The new code is arranged in a way that hopefully makes each succeeding
linear search through bridge_list less likely.
We determine that a cell was dropped by inspecting CIRC_BW fields. If we did
not update the delivered or overhead fields after processing the cell, the
cell was dropped/not processed.
Also emit CIRC_BW events for cases where we decide to close the circuit in
this function, so vanguards can print messages about dropped cells in those
cases, too.