Resume refusing to start with relative file paths and RunAsDaemon
set (regression from the fix for bug 22731).
Fixes bug 28298; bugfix on 0.3.3.1-alpha.
If tor terminates due to SIGNAL HALT before test_rebind.py calls
tor_process.terminate(), an OSError 3 (no such process) is thrown.
Fixes part of bug 27968 on 0.3.5.1-alpha.
Remember, you can't check to see if there are N bytes left in a
buffer by doing (buf + N < end), since the buf + N computation might
take you off the end of the buffer and result in undefined behavior.
Fixes 28202; bugfix on 0.2.0.3-alpha.
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
With the new refresh_service_descriptor() function we had both
refresh_service_descriptor() and update_service_descriptor() which is basically
the same thing.
This commit renames update_service_descriptor() to
update_service_descriptor_intro_points() to make it clear it's not a generic
refresh and it's only about intro points.
Commit changes no code.
Treat backtrace test failures as expected on NetBSD, OpenBSD, and
macOS/Darwin, until we solve bug 17808.
(FreeBSD failures have been treated as expected since 18204 in 0.2.8.)
Fixes bug 27948; bugfix on 0.2.5.2-alpha.
Before this commit, we would create the descriptor signing key certificate
when first building the descriptor.
In some extreme cases, it lead to the expiry of the certificate which triggers
a BUG() when encoding the descriptor before uploading.
Ticket #27838 details a possible scenario in which this can happen. It is an
edge case where tor losts internet connectivity, notices it and closes all
circuits. When it came back up, the HS subsystem noticed that it had no
introduction circuits, created them and tried to upload the descriptor.
However, in the meantime, if tor did lack a live consensus because it is
currently seeking to download one, we would consider that we don't need to
rotate the descriptors leading to using the expired signing key certificate.
That being said, this commit does a bit more to make this process cleaner.
There are a series of things that we need to "refresh" before uploading a
descriptor: signing key cert, intro points and revision counter.
A refresh function is added to deal with all mutable descriptor fields. It in
turn simplified a bit the code surrounding the creation of the plaintext data.
We keep creating the cert when building the descriptor in order to accomodate
the unit tests. However, it is replaced every single time the descriptor is
uploaded.
Fixes#27838
Signed-off-by: David Goulet <dgoulet@torproject.org>
When storing a descriptor in the client cache, if we are about to replace an
existing descriptor, make sure to close every introduction circuits of the old
descriptor so we don't have leftovers lying around.
Ticket 27471 describes a situation where tor is sending an INTRODUCE1 cell on
an introduction circuit for which it doesn't have a matching intro point
object (taken from the descriptor).
The main theory is that, after a new descriptor showed up, the introduction
points changed which led to selecting an introduction circuit not used by the
service anymore thus for which we are unable to find the corresponding
introduction point within the descriptor we just fetched.
Closes#27471.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It won't be used if there are no authorized client configured. We do that so
we can easily support the addition of a client with a HUP signal which allow
us to avoid more complex code path to generate that cookie if we have at least
one client auth and we had none before.
Fixes#27995
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both client and service had their own code for this. Consolidate into one
place so we avoid duplication.
Closes#27549
Signed-off-by: David Goulet <dgoulet@torproject.org>
Occasionally, key pinning doesn't catch a relay that shares an ed25519
ID with another relay. Log the identity fingerprints and the shared
ed25519 ID when this happens, instead of making a BUG() warning.
Fixes bug 27800; bugfix on 0.3.2.1-alpha.
Commit 488e2b00bf introduced an issue, most
likely introduced by a bad copy paste, that made us stop reading on the
connection if our write bandwidth limit was reached.
The problem is that because "read_blocked_on_bw" was never set, the connection
was never reenabled for reading.
This is most likely the cause of #27813 where bytes were accumulating in the
kernel TCP bufers because tor was not doing reads. Only relays with
RelayBandwidthRate would suffer from this but affecting all relays connecting
to them. And using that tor option is recommended and best practice so many
many relays have it enabled.
Fixes#28089.
It turns out that if _only_ the ControlPort is set and nothing else, tor would
simply not bootstrap and thus not start properly. Commit 67a41b6306
removed that requirement for tor to be considered a "client".
Unfortunately, this made the mainloop enable basically nothing if only the
ControlPort is set in the torrc.
This commit now makes it that we also consider the ControlPort when deciding
if we are a Client or not. It does not revert 67a41b6306 meaning
options_any_client_port_set() stays the same, not looking at the control port.
Fixes#27849.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Nothing should ever look at them on failure, but in some cases,
the unit tests don't check for failure, and then GCC-LTO freaks out.
Fixes part of 27772.
These confused GCC LTO, which thought they might be used
uninitialized. I'm pretty sure that as long as 'res' indicates
success, they will always be set to something, but let's unconfuse
the compiler in any case.
OpenSolaris apparently doesn't have timeradd(), so we added a
replacement, but we weren't including it here after the big
refactoring in 0.3.5.1-alpha.
Fixes bug 27963; bugfix on 0.3.5.1-alpha.
It looks to be the case that Rust's standard allocator, jemalloc, is
incompatible with sanitizers. The incompatibility, for whatever reason,
seems to cause segfaults at runtime when jemalloc is linked with
sanitizers.
Without actually trying to figure out what's going on here this commit
instead takes the hammer of "let's remove jemalloc when testing". The
`tor_allocate` crate now by default switches to the system allocator
(eventually this will want to be the tor allocator). Most crates then
link to `tor_allocate` ot pick this up, but the `smartlist` crate had to
manually switch to the system allocator in testing and the `external`
crate had to be sure to link to `tor_allocate`.
The final gotcha here is that this patch also switches to
unconditionally passing `--target` to Cargo. For weird and arcane
reasons passing `--target` with the host target of the compiler (which
Cargo otherwise uses as the default) is different than not passing
`--target` at all. This ensure that our custom `RUSTFLAGS` with
sanitizer options doesn't make its way into build scripts, just the
final testing artifacts.
This is no longer necessary with upstream rust-lang/rust changes as well
as some local tweaks. Namely:
* The `-fsanitize=address`-style options are now passed via `-C
link-args` through `RUSTFLAGS`. This obviates the need for the shell
script.
* The `-C default-linker-libraries`, disabling `-nodefaultlibs`, is
passed through `RUSTFLAGS`, which is necessary to ensure that
`-fsanitize=address` links correctly.
* The `-C linker` option is passed to ensure we're using the same C
compiler as normal C code, although it has a bit of hackery to only
get the `gcc` out of `gcc -std=c99`
Allowing this didn't do any actual harm, since there aren't any
shared structures or leakable objects here. Still, it's bad style
and might cause trouble in the future.
Closes ticket 27856.
Various places in our code try to activate these events or check
their status, so we should make sure they're initialized as early as
possible. Fixes bug 27861; bugfix on 0.3.5.1-alpha.
When freeing a configuration object from confparse.c in
dump_config(), we need to call the appropriate higher-level free
function (like or_options_free()) and not just config_free().
This only happens with options (since they're the one where
options_validate allocates extra stuff) and only when running
--dump-config with something other than minimal (since
OPTIONS_DUMP_MINIMAL doesn't hit this code).
Fixes bug 27893; bugfix on 0.3.2.1-alpha.
It differs from the rest of the rephist code in that it's actually
necessary for Tor to operate, so it should probably go somewhere
else. I'm not sure where yet, so I'll leave it in the same
directory, but give it its own file.
Since this is completely core functionality, I'm putting it in
core/mainloop, even though it depends on feature/hibernate. We'll
have to sort that out in the future.
If a tor client gets a descriptor that it can't decrypt, chances are that the
onion requires client authorization.
If a tor client is configured with client authorization for an onion but
decryption fails, it means that the configured keys aren't working anymore.
In both cases, we'll log notice the former and log warn the latter and the
rest of the decryption errors are now at info level.
Two logs statement have been removed because it was redundant and printing the
fetched descriptor in the logs when 80% of it is encrypted wat not helping.
Fixes#27550
Signed-off-by: David Goulet <dgoulet@torproject.org>
The main.c code is responsible for initialization and shutdown;
the mainloop.c code is responsible for running the main loop of Tor.
Splitting the "generic event loop" part of mainloop.c from the
event-loop-specific part is not done as part of this patch.
The parts for handling cell formats should be in src/core/or.
The parts for handling onionskin queues should be in src/core/or.
Only the crypto wrapper belongs in src/core/crypto.
When sending the INTRODUCE1 cell, we acquire the needed data for the cell but
if the RP node_t has invalid data, we'll fail the send and completely kill the
SOCKS connection.
Instead, close the rendezvous circuit and return a transient error meaning
that Tor can recover by selecting a new rendezvous point. We'll also do the
same when we are unable to encode the INTRODUCE1 cell for which at that point,
we'll simply take another shot at a new rendezvous point.
Fixes#27774
Signed-off-by: David Goulet <dgoulet@torproject.org>
The result of CString::into_raw() is not safe to free
with free() except under finicky and fragile circumstances
that we definitely don't meet right now.
This was missed in be583a34a3.
When we close a socket via tor_tls_free(), we previously had no way
for our socket accounting logic to learn about it. This meant that
the socket accounting code would think we had run out of sockets,
and freak out.
Fixes bug 27795; bugfix on 0.3.5.1-alpha.
In dirauth:
* bwauth.c reads and uses bandwidth files
* guardfraction.c reads and uses the guardfraction file
* reachability.c tests relay reachability
* recommend_pkg.c handles the recommended-packages lines.
* recv_descs.c handles fingerprint files and processing incoming
routerinfos that relays upload to us
* voteflag.c computes flag thresholds and sets those thresholds on
routerstatuses when computing votes
In control:
* fmt_serverstatus.c generates the ancient "v1 server status"
format that controllers expect.
In nodelist:
* routerstatus_fmt.c formats routerstatus entries for a consensus,
a vote, or for the controller.
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
There are now separate modules for:
* the list of router descriptors
* the list of authorities and fallbacks
* managing authority certificates
* selecting random nodes
That unit test makes sure we don't have pending SOCK request if the descriptor
turns out to be unusable.
Part of #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The trunnel functions are written under the assumption that their
allocators can fail, so GCC LTO thinks they might return NULL. In
point of fact, they're using tor_malloc() and friends, which can't
fail, but GCC won't necessarily figure that out.
Fixes part of #27772.
Instead, have it call a mockable function. We don't want
crypto_strongest_rand() to be mockable, since doing so creates a
type error when we call it from ed25519-donna, which we do not build
in a test mode.
Fixes bug 27728; bugfix on 0.3.5.1-alpha
This argument was added to match an older idea for the C api, but we
decided not to do it that way in C.
Fixes bug 27741; bugfix on 0.3.3.6 / TROVE-2018-005 fix.
Since the default cache directory is the same as the default data
directory, we don't want the default CacheDirectoryGroupReadable
value (0) to override an explicitly set "DataDirectoryGroupReadable
1".
To fix this, I'm making CacheDirectoryGroupReadable into an
autobool, and having the default (auto) value mean "Use the value of
DataDirectoryGroupReadable if the directories are the same, and 0
otherwise."
Fixes bug 26913; bugfix on 0.3.3.1-alpha when the CacheDirectory
option was introduced.
This shouldn't be a user-visible change: nobody has a 16 MB RSA
key that they're trying to use with Tor.
I'm doing this to fix CID 1439330 / ticket 27730, where coverity
complains (on 64-bit) that we are making a comparison that is never
true.
This patch moves the logic that adds the proxy headers to an earlier
point in the exit connection lifetime, which ensures that the
application data cannot be written to the outbuf before the proxy header
is added.
See: https://bugs.torproject.org/4700
This patch changes HiddenServiceExportCircuitID so instead of being a
boolean it takes a string, which is the protocol. Currently only the
'haproxy' protocol is defined.
See: https://bugs.torproject.org/4700
Without this patch we would encode the IPv6 address' last part as
::ffffffff instead of ::ffff:ffff when the GID is UINT32_MAX.
See: https://bugs.torproject.org/4700
In hs_config.c, we do validate the permission of the hidden service directory
but we do not try to create it. So, in the event that the directory doesn't
exists, we end up in the loading key code path which checks for the
permission and possibly creates the directory. On failure, don't BUG() since
there is a perfectly valid use case for that function to fail.
Fixes#27335
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is harder than with OpenSSL, since OpenSSL counts the bytes on
its own and NSS doesn't. To fix this, we need to define a new
PRFileDesc layer that has its own byte-counting support.
Closes ticket 27289.
On GCC and Clang, there's a feature to warn you about bad
conditionals like "if (a = b)", which should be "if (a == b)".
However, they don't warn you if there are extra parentheses around
"a = b".
Unfortunately, the tor_assert() macro and all of its kin have been
passing their inputs through stuff like PREDICT_UNLIKELY(expr) or
PREDICT_UNLIKELY(!(expr)), both of which expand to stuff with more
parentheses around "expr", thus suppressing these warnings.
To fix this, this patch introduces new macros that do not wrap
expr. They're only used when GCC or Clang is enabled (both define
__GNUC__), since they require GCC's "({statement expression})"
syntax extension. They're only used when we're building the
unit-test variant of the object files, since they suppress the
branch-prediction hints.
I've confirmed that tor_assert(), tor_assert_nonfatal(),
tor_assert_nonfatal_once(), BUG(), and IF_BUG_ONCE() all now give
compiler warnings when their argument is an assignment expression.
Fixes bug 27709.
Bugfix on 0.0.6, where we first introduced the "tor_assert()" macro.
.retain() would allocating a Vec of billions of integers and check them
one at a time to separate the supported versions from the unsupported.
This leads to a memory DoS.
Closes ticket 27206. Bugfix on e6625113c9.
Before 0.3.3.1-alpha, we would exit() in this case immediately. But
now that we leave tor_main() more conventionally, we need to make
sure we restore things so as not to cause a double free.
Fixes bug 27708; bugfix on 0.3.3.1-alpha.
Since we use a 32-bit approximation for millisecond conversion here,
we can't expect so much precision.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
Multiply-then-divide is more accurate, but it runs into trouble when
our input is above INT32_MAX/numerator. So when our value is too
large, do divide-then-multiply instead.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
We use an optimized but less accurate formula for converting coarse
time differences to milliseconds on 32-bit OSX platforms, so that we
can avoid 64-bit division.
The old numbers were off by 0.4%. The new numbers are off by .006%.
This should make the unit tests a bit cleaner, and our tolerances a
bit closer.
All node_get_all_orports() does is allocate and return a smartlist
with at most two tor_addr_port_t members that match ORPort's of
node configuration. This is harmful for memory efficiency, as it
allocates the same stuff every time it is called. However,
node_is_a_configured_bridge() does not need to call it, as it
already has all the information to check if there is configured
bridge for a given node.
The new code is arranged in a way that hopefully makes each succeeding
linear search through bridge_list less likely.
We determine that a cell was dropped by inspecting CIRC_BW fields. If we did
not update the delivered or overhead fields after processing the cell, the
cell was dropped/not processed.
Also emit CIRC_BW events for cases where we decide to close the circuit in
this function, so vanguards can print messages about dropped cells in those
cases, too.
This function tells the underlying TLS object that it shouldn't
close the fd on exit. Mostly, we hope not to have to use it, since
the NSS implementation is kludgey, but it should allow us to fix
It's possible for a unit test to report success via its pipe, but to
fail as it tries to clean up and exit. Notably, this happens on a
leak sanitizer failure.
Fixes bug 27658; bugfix on 0.2.2.4-alpha when tinytest was
introduced.
This commit makes it that the authorized clients in the descriptor are in
random order instead of ordered by how they were read on disk.
Fixes#27545
Signed-off-by: David Goulet <dgoulet@torproject.org>
Existing cached directory information can cause misleadingly high
bootstrap percentages. To improve user experience, defer reporting of
directory information progress until at least one connection has
succeeded to a relay or bridge.
Closes ticket 27169.
If a tor client gets a descriptor that it can't decrypt, chances are that the
onion requires client authorization.
If a tor client is configured with client authorization for an onion but
decryption fails, it means that the configured keys aren't working anymore.
In both cases, we'll log notice the former and log warn the latter and the
rest of the decryption errors are now at info level.
Two logs statement have been removed because it was redundant and printing the
fetched descriptor in the logs when 80% of it is encrypted wat not helping.
Fixes#27550
Signed-off-by: David Goulet <dgoulet@torproject.org>
Track bootstrap phase (enumerated by bootstrap_status_t) independently
from the bootstrap progress (which can represent intermediate
progress). This allows control_event_bootstrap_problem() to avoid
doing a linear search through the bootstrap progress space to find the
current bootstrap phase.
Move the mostly-invariant part of control_event_boostrap() into a
helper control_event_bootstrap_core(). The helper doesn't modify any
state beyond doing logging and control port notifications.
Simplify control_event_bootstrap() by making it return void again. It
is currently a fairly complicated function, and it's made more
complicated by returning an int to signal whether it logged at NOTICE
or INFO.
The callers conditionally log messages at level NOTICE based on this
return value. Change the callers to unconditionally log their verbose
human-readable messages at level INFO to keep NOTICE logs less
cluttered.
This partially reverts the changes of #14950.
One HSv3 unit test used "tor_memeq()" without checking the return value. This
commit changes that to use "tt_mem_op()" to actually make the test validate
something :).
Signed-off-by: David Goulet <dgoulet@torproject.org>
>>>> CID 1439133: Null pointer dereferences (REVERSE_INULL)
>>>> Null-checking "fields" suggests that it may be null, but it
>>>> has already been dereferenced on all paths leading to the check.
>>>> CID 1439132: Null pointer dereferences (REVERSE_INULL)
>>>> Null-checking "fields" suggests that it may be null, but it
>>>> has already been dereferenced on all paths leading to the check.
This is an attempt to work around what I think may be a bug in
OSS-Fuzz, which thinks that uninitialized data might be passed to
the curve25519 functions.
There are three reasons we use a cached_dir_t to hold a consensus:
1. to serve that consensus to a client
2. to apply a consensus diff to an existing consensus
3. to send the consensus to a controller.
But case 1 is dircache-only. Case 2 and case 3 both fall back to
networkstatus_read_cached_consensus(). So there's no reason for us
to store this as a client. Avoiding this saves about 23% of our RAM
usage, according to our experiments last month.
This is, semantically, a partial revert of e5c608e535.
Fixes bug 27247; bugfix on 0.3.0.1-alpha.
We already had fallback code for "dir/status-vote/current/consensus"
to read from disk if we didn't have a cached_dir_t available. But
there's a function in networkstatus_t that does it for us, so let's
do that.
Return a newly allocated fake client authorization object instead of taking
the object as a parameter.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When reloading tor, check if our the configured client authorization have
changed from what we previously had. If so, republish the updated descriptor.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, the validation by decoding a created descriptor was disabled
because the interface had to be entirely changed and not implemented at the
time.
This commit re-enabled it because it is now implemented.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Parse the client authorization section from the descriptor, use the client
private key to decrypt the auth clients, and then use the descriptor cookie to
decrypt the descriptor.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit refactors the existing decryption code to make it compatible with
a new logic for when the client authorization is enabled.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because this secret data building logic is not only used by the descriptor
encoding process but also by the descriptor decoding, refactor the function to
take both steps into account.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The new ClientOnionAuthDir option is introduced which is where tor looks to
find the HS v3 client authorization files containing the client private key
material.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, we encrypted the descriptor without the descriptor cookie. This
commit, when the client auth is enabled, the descriptor cookie is always used.
I also removed the code that is used to generate fake auth clients because it
will not be used anymore.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit tests that the descriptor building result, when the client
authorization is enabled, includes everything that is needed.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We need to generate all the related keys when building the descriptor, so that
we can encrypt the descriptor.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is not supported, and always fails. Some compilers warn about the
function pointer cast on 64-bit Windows.
Fixes bug 27461; bugfix on 0.2.2.23-alpha.
gcc 8 warns that extend_info_t.nickname might be truncated by strncpy().
But it doesn't know that nickname can either contain a hex id, or a
nicknames. hex ids are only used for general and HSDir circuits.
Fixes bug 27463; bugfix on 0.1.1.2-alpha.
GetProcAddress() returns FARPROC, which is (long long int(*)()) on
64-bit Windows:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms683212(v=vs.85).aspx
But GetAdaptersAddresses() is (long unsigned int(*)()), on both 32-bit
and 64-bit Windows:
https://docs.microsoft.com/en-us/windows/desktop/api/iphlpapi/nf-iphlpapi-getadaptersaddresses
So gcc 8 issues a spurious "incompatible function pointer" warning
about the cast to GetAdaptersAddresses_fn_t.
Silence this warning by casting to a void function pointer, before
the cast to GetAdaptersAddresses_fn_t.
This issue is already fixed by 26481 in 0.3.5 and later, by removing
the lookup and cast.
Fixes bug 27465; bugfix on 0.2.3.11-alpha.
This reverts commit b5fddbd241.
The commit here was supposed to be a solution for #27451 (fd
management with NSS), but instead it caused an assertion failure.
Fixes bug 27500; but not in any released Tor.
On new glibc versions, there's an explicit_bzero(). With openssl,
there's openssl_memwipe().
When no other approach works, use memwipe() and a memory barrier.
This function was a wrapper around RSA_check_key() in openssl, which
checks for invalid RSA private keys (like those where p or q are
composite, or where d is not the inverse of e, or where n != p*q).
We don't need a function like this in NSS, since unlike OpenSSL, NSS
won't let you import a bogus private key.
I've renamed the function and changed its return type to make it
more reasonable, and added a unit test for trying to read a key
where n != p*q.
This function was supposed to implement a half-duplex mode for our
TLS connections. However, nothing in Tor actually uses it (besides
some unit tests), and the implementation looks really questionable
to me. It's probably best to remove it. We can add a tested one
later if we need one in the future.
The OpenSSL "RSA" object is currently 408 bytes compares to the ASN.1 encoding
which is 140 for a 1024 RSA key.
We save 268 bytes per descriptor (routerinfo_t) *and* microdescriptor
(microdesc_t). Scaling this to 6000 relays, and considering client usually
only have microdescriptors, we save 1.608 MB of RAM which is considerable for
mobile client.
This commit makes it that we keep the RSA onion public key (used for TAP
handshake) in ASN.1 format instead of an OpenSSL RSA object.
Changes is done in both routerinfo_t and microdesc_t.
Closes#27246
Signed-off-by: David Goulet <dgoulet@torproject.org>
TRUNCATED cells were ignored while in path bias. Now they are obeyed, and
cause us to tear down the circuit. The actual impact is minimal, since we
would just wait around for a probe that would never arrive before.
This commit changes client behavior.
We allow their CONNECTEDs, RESOLVEDs, ENDs, SENDMEs, and DATA cells to not
count as dropped until the windows are empty, or we get an END.
This commit does not change behavior. It only changes CIRC_BW event field
values.
By removing Tor2Web, there is no way a client can be non anonymous so we
remove that function and the callsites.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because we just removed Tor2web support, the need_specific_rp is not needed
anymore when cannibalizing a circuit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behaviour change.
A previous fix to chutney removed v3 onion services from the
mixed+hs-v23 network, so seeing "mixed+hs-v23" in tests is
confusing.
Fixes bug 27345; bugfix on 0.3.2.1-alpha.
LinkAuth method 1 is the one where we pull the TLS master secrets
out of the OpenSSL data structures and authenticate them with
RSA. Right now we list method 1 as required for clients and relays.
That's a problem, since we can't reasonably support it with NSS. So
let's remove it as a requirement and a recommendation.
As for method 3: I'd like to recommend it it, but that would make
0.2.9 start warning. Let's not do that till at least some time
after 0.3.5 (the next LTS) is stable.
Closes ticket 27286
Instead, count exits as usable if they have the exit flag, and
present if they also have a non-reject exit policy.
Requiring a threshold of usable descriptors avoids directories trickling
exit descriptors to clients to discover their ExitNodes settings.
Part of 27236.
Previously, Tor would only check the exit flag. In small networks, Tor
could bootstrap once it received a consensus with exits, without fetching
the new descriptors for those exits.
After bootstrap, Tor delays descriptor fetches, leading to failures in
fast networks like chutney.
Fixes 27236; bugfix on 0.2.6.3-alpha.
In order to switch the default HS version from 2 to 3, we need tor to be smart
and be able to decide on the version by trying to load the service keys during
configuration validation.
Part of #27215
Signed-off-by: David Goulet <dgoulet@torproject.org>
Part of #27215, we need to call the ed_key_init_from_file function during
option_validate() which is before the global_options variable is set.
This commit make ed_key_init_from_file() stop using get_options() and instead
now has a or_options_t parameter.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Bug description: For each descriptor, its revision counter is the OPE
ciphertext of the number of seconds since the start time of its SRV value.
This bug caused us to confuse the SRV start time in the middle of the lifetime
of a descriptor in some edge-cases, which caused descriptor rejects.
Bug cause: The bug occurs when we fetch a 23:00 consensus after
midnight (e.g. at 00:08 when not all dirauths have fetched the latest 00:00
consensus). In that case, the voting schedule (which was used for SRV start
time calculation) would return a valid-after past-midnight, whereas our
consensus would be pre-midnight, and that would confuse the SRV start time
computation which is used by HS revision counters (because we would reset the
start time of SRV, without rotating descriptors).
Bug fix: We now use our local consensus time to calculate the SRV start time,
instead of the voting schedule. The voting schedule does not work as originally
envisioned in this case, because it was created for voting by dirauths and not
for scheduling stuff on clients.
We used to link both libraries at once, but now that I'm working on
TLS, there's nothing left to keep OpenSSL around for when NSS is
enabled.
Note that this patch causes a couple of places that still assumed
OpenSSL to be disabled when NSS is enabled
- tor-gencert
- pbkdf2
Also, add a stubbed-out nss version of the modules. The tests won't
pass with NSS yet since the NSS modules don't do anything.
This is a good patch to read with --color-moved.
This cleans up a lot of junk from crypto_rsa_openssl, and will
save us duplicated code in crypto_rsa_nss (when it exists).
(Actually, it already exists, but I am going to use git rebase so
that this commit precedes the creation of crypto_rsa_nss.)
Unlike the old test, this test no will no longer mess around with
the forbidden internals of any openssl data structures.
Additionally, it verifies several other behaviors of
tor_tls_cert_matches_key() that we had wanted to verify, such as
the possibility of the certificate's key not matching.
Fixes bug 27226; bugfix on 0.2.5.1-alpha.
This allows us to mock our own tor_tls_get_peer_certificate()
function in order to test ..cert_matches_key(), which will in turn
allow us to simplify test_tortls_cert_matches_key() considerably.
Prep work for the fix for 27226.
Until https://github.com/rust-lang/rust/issues/52652 is fixed,
unwinding on panic is potentially unsound in a mixed C/Rust codebase.
The codebase is supposed to be panic-free already, but just to be safe.
This started mattering at commit d1820c1516.
Fixes#27199; bugfix on tor-0.3.3.1-alpha.
It's impossible for spaces to get here, since spaces are used as
separators between individual protocol entries higher up.
And it shouldn't ignore whitespace that isn't a literal space
character, because that would differ from the C implementation.
These were added in 9925d2e687.
Fixes#27177. Bugfix on 0.3.3.5-rc.
It was parsing "1-2-3" as if it were 1-2, ignoring the 2nd hyphen
and everything after.
Introduced in d1820c1516.
Fixes#27164; bugfix on 0.3.3.1-alpha.
The function takes an already validated utf-8 string, and
it never checks if the version numbers are an empty string.
That parse error happens later.
Fix on 701c2b69f5
These are the 12 stable and documented configuration options,
set to their default values.
use_small_heuristics is only stabilized in rustfmt 0.9, so maintain
support for 0.8.x for now by commenting it out.
comment_width is unstable and did nothing, since wrap_comments defaults
to false.
Default values gotten from `rustfmt --print-config default rustfmt.toml`.
e7932fa9c2/Configurations.md
Replace master .travis.yml with 034 .travis.yml.
All the changes in master have been backported to the
034 .travis.yml already.
Replace master src/test/test_rust.sh with 034
src/test/test_rust.sh, which was backported from
master. One 033/034-specific commit needs to be
reverted.
Replace 034 .travis.yml with 033 .travis.yml.
Subsequent commits will restore 034 functionality.
Replace 034 src/test/test_rust.sh with 033
src/test/test_rust.sh, which was backported from
master.
Replace 033 .travis.yml with 032 .travis.yml.
Subsequent commits will restore 033 functionality.
src/rust/tor_util/include.am is deleted in 033.
Subsequent commits will apply 032 changes to
src/rust/tor_rust/include.am.
Replace 033 src/test/test_rust.sh with 032
src/test/test_rust.sh, which was backported from
master.
At the same time, sternly warn any person thinking about relying on
any particular format too strictly. If you do this, and your
program breaks, it is your bug, not mine.
I hope that the debian clang maintainers will look at debian bug
903709 soon. But until they do, this should keep our users and our
CI happy on sid with clang.
Closes ticket 26779.
The seccomp rule for the openat syscall checks for the AT_FDCWD
constant. Because this constant is usually a negative value, a
cast to unsigned int is necessary to make sure it does not get
converted to uint64_t used by seccomp.
More info on:
https://github.com/seccomp/libseccomp/issues/69#issuecomment-273805980
When we fixed 25939 in f7633c1fca, we
introduced a call to rescan_periodic_events() from inside the onion
service logic. But this meant that we could rescan the event list --
thereby running event callbacks! -- from inside the hidden service code.
This could cause us to run some of our event callbacks from an
inconsistent state, if we were in the middle of changing options.
A related bug (#25761) prevented us from rescanning our periodic
events as appropriate, but when we fixed THAT one, this bug reared
its ugly head.
The fix here is that "enabling" an event should cause us to run it
from the event loop, but not immediately from the point where we
enable it.
Fixes bug 27003; bugfix on 0.3.4.1-alpha.
We need this so that the tor_api user can specify some arguments,
while the tor_api implementation adds others.
This implementation detail should not be visible to tor_api users.
This change also makes tor_ersatz_socketpair() follow the same
interface as socketpair() rather than tor_socketpair(), so it now
needs to be wrapped in the same code as socketpair() does.
This is comparatively straightforward too, except for a couple of
twists:
* For as long as we're building with two crypto libraries, we
want to seed _both_ their RNGs, and use _both_ their RNGs to
improve the output of crypto_strongest_rand()
* The NSS prng will sometimes refuse to generate huge outputs.
When it does, we stretch the output with SHAKE. We only need
this for the tests.
We would usually call it through tor_cleanup(), but in some code
paths, we wouldn't. These paths would break restart-in-process,
since leaving fields uncleared would cause assertion failures on
restart.
Fixes bug 26948; bugfix on 0.3.3.1-alpha
Conditionalize the pragma that temporarily disables
-Wunused-const-variable. Some versions of gcc don't support it. We
need to do this because of an apparent bug in some libzstd headers.
Fixes bug 26785; bugfix on 0.3.2.11.
Instead, log a protocol warning when single onion services or
Tor2web clients fail to authenticate direct connections to relays.
Fixes bug 26924; bugfix on 0.2.9.1-alpha.
Stop putting ed25519 link specifiers in v3 onion service descriptors,
when the intro point doesn't support ed25519 link authentication.
Fixes bug 26627; bugfix on 0.3.2.4-alpha.
Stop sending ed25519 link specifiers in v3 onion service introduce
cells, when the rendezvous point doesn't support ed25519 link
authentication.
Fixes bug 26627; bugfix on 0.3.2.4-alpha.
Our previous definition implied that code would never keep running
if a BUG occurred (which it does), and that BUG(x) might be true
even if x was false (which it can't be).
Closes ticket 26890. Bugfix on 0.3.1.4-alpha.
This is another attempt to fix 1437668. The assertion here should
be safe, since the rules of networkstatus_get_param() keep the value
it returns in range.
The following bug was causing many issues for this branch in chutney:
In sr_state_get_start_time_of_current_protocol_run() we were using the
consensus valid-after to calculate beginning_of_current_round, but we were
using time(NULL) to calculate the current_round slot. This was causing time
sync issues when the consensus valid-after and time(NULL) were disagreeing on
what the current round is. Our fix is to use the consensus valid-after in both
places.
This also means that we are not using 'now' (aka time(NULL)) anymore in that
function, and hence we can remove that argument from the function (and its
callers). I'll do this in the next commit so that we keep things separated.
Furthermore, we fix a unittest that broke.
We only build a descriptor once, and we just re-encode it (and change its intro
points if needed) before uploading.
Hence we should set the revision counter before uploading, not during building.
The OPE cipher is tied to the current blinded key which is tied to the current
time period. Hence create the OPE cipher structure when we create a new
descriptor (and build its blinded key).
Now that the rev counter depends on the local time, we need to be more careful
in the unittests. Some unittests were breaking because they were using
consensus values from 1985, but they were not updating the local time
appropriately. That was causing the OPE module to complain that it was trying
to encrypt insanely large values.
To do so for a given descriptor, we use the "seconds since the SR protocol run"
started, for the SRV that is relevant to this descriptor. This is guaranteed to
be a positive value (since we need an SRV to be able to build a descriptor),
and it's also guaranteed to be a small value (since SRVs stop being listed on a
consensus after 48 hours).
We cannot use the "seconds since the time period started", because for the next
descriptor we use the next time period, so the timestamp would end up negative.
See [SERVICEUPLOAD] from rend-spec-v3.txt for more details.
To do so, we have to introduce a new `is_current` argument to a bunch of
functions, because to use "seconds since the SR protocol run" we need to know
if we are building the current or the next descriptor, since we use a different
SRV for each descriptor.
This is meant for use when encrypting the current time within the
period in order to get a monotonically increasing revision counter
without actually revealing our view of the time.
This scheme is far from the most state-of-the-art: don't use it for
anything else without careful analysis by somebody much smarter than
I am.
See ticket #25552 for some rationale for this logic.
If an authority is not configured with a V3BandwidthsFile, this line
SHOULD NOT appear in its vote.
If an authority is configured with a V3BandwidthsFile, but parsing
fails, this line SHOULD appear in its vote, but without any headers.
Part of 3723, implements the spec in 26799.
also add tests for bw_file_headers.
Headers are all that is found before a correct relay line or
the terminator.
Tests include:
* a empty bandwidth file
* a bandwidth file with only timestamp
* a bandwidth file with v1.0.0 headers
* a bandwidth file with v1.0.0 headers and relay lines
* a bandwidth file with v1.1.0 headers and v1.0.0 relay lines
* a bandwidth file with v1.0.0 headers, malformed relay lines and
relay lines
* a bandwidth file with v1.0.0 headers, malformed relay lines,
relay lines and malformed relay lines
* a bandwidth file with v1.1.0 headers without terminator
* a bandwidth file with v1.1.0 headers with terminator
* a bandwidth file with v1.1.0 headers without terminator and
relay lines
* a bandwidth file with v1.1.0 headers with terminator and relay
lines
* a bandwidth file with v1.1.0 headers without terminator, bad
relay lines and relay lines
* a bandwidth file with v1.1.0 headers with terminator, bad relay
lines and relay lines
If bandwidth file terminator is found, set end of headers flag
and do not store the line.
If it is not, parse a relay line and check whether it is a header
line.
* add bwlist_headers argument to dirserv_read_measured_bandwidth
in order to store all the headers found when parsing the file
* add bwlist_headers to networkstatus_t in order to store the
the headers found by the previous function
* include the bandwidth headers as string in vote documents
* add test to check that dirserv_read_measured_bandwidth generates
the bwlist_headers
Coverity rightly complains that early in the function we're checking
whether username is NULL, and later we're passing it unconditionally
to strlen().
Fixes CID 1437967. Bug not in any released Tor.
We need this in our unit tests, since otherwise NSS will notice
we've forked and start cussing us out.
I suspect we'll need a different hack for daemonizing, but this
should be enough for tinytest to work.
This patch adds two assertions in get_net_param_from_list() to ensure
that the `res` value is correctly within the range of the output domain.
Hopefully fixes Coverity CID #1415721, #1415722, and #1415723.
See: https://bugs.torproject.org/26780
The "Bifroest" bridge authority has been retired; the new bridge authority
is "Serge", and it is operated by George from the TorBSD project.
Closes ticket 26771.
These are now part of crypto_init.c. The openssl-only parts now
live in crypto_openssl_mgt.c.
I recommend reviewing this patch with -b and --color-moved.
We have to check for ERR_load_KDF_strings() here, since that's the
only one that's actually a function rather than a macro.
Fixes compilation with LibreSSL. Fixes bug 26712; bug not in
any released Tor.
That place is git-revision.c; git-revision.c now lives in lib/log.
Also fix the compilation rules so that all object files that need
micro-revision.i depend on it.
Fun fact: these files used to be called log.[ch] until we ran into
conflicts with systems having a log.h file. But now that we always
include "lib/log/log.h", we should be fine.
This function has a nasty API, since whether or not it invokes the
resolver depends on whether one of its arguments is NULL. That's a
good way for accidents to happen.
This patch incidentally makes tor-resolve support socks hosts on
IPv6.
These are now combined into an inaddr.[ch], since their purpose is
to implement functions for struct in_addr and struct in6_addr.
The definitions for in6_addr and its allies are now in a separate
header, inaddr_st.h.
Closes ticket 26532.
This commit won't build yet -- it just puts everything in a slightly
more logical place.
The reasoning here is that "src/core" will hold the stuff that every (or
nearly every) tor instance will need in order to do onion routing.
Other features (including some necessary ones) will live in
"src/feature". The "src/app" directory will hold the stuff needed
to have Tor be an application you can actually run.
This commit DOES NOT refactor the former contents of src/or into a
logical set of acyclic libraries, or change any code at all. That
will have to come in the future.
We will continue to move things around and split them in the future,
but I hope this lays a reasonable groundwork for doing so.
This is a very gentle commit that just lays the groundwork in the
build system: it puts the include files to build libtor-app.a into
src/core, and to build the tor executable into src/app. The
executable is now "src/app/tor".
This is temporary, until src/or is split.
Putting this in containers would be another logical alternative,
except that addresses depend on containers, and we don't like
cycles.
When on a private network a single directory authority is explicitly
configured, allow that DA's instace of tor to pick itself as its own
DA, by having router_pick_dirserver_generic() set the PDS_ALLOW_SELF
flag when the list of potential sources contains only one element.
This would subsequently allow router_pick_trusteddirserver_impl() to
calculate is 'requreother' condition as 'false', enabling it to pick
itself as a valid DA candidate.
Fixes#25928
Signed-off-by: Gabriel Somlo <gsomlo@gmail.com>
Recent Python3 versions seem to require this on Windows.
Fixes bug 26535; bug introduced in f4be34f70d, which
was apparently intended itself as a Python3 workaround.
When we do redefine them, use inline functions instead of #define.
This fixes a latent code problem in our redefinition of these
functions, which was exposed by our refactoring: Previously, we
would #define strcasecmp after string.h was included, so nothing bad
would happen. But when we refactored, we would sometimes #define it
first, which was a problem on mingw, whose headers contain
(approximately):
inline int strcasecmp (const char *a, const char *b)
{ return _stricmp(a,b); }
Our define turned this into:
inline int _stricmp(const char *a, const char *b)
{ return _stricmp(a,b); }
And GCC would correctly infer that this function would loop forever,
rather than actually comparing anything. This caused bug 26594.
Fixes bug 26594; bug not in any released version of Tor.
This code was in compat_threads, since it was _used_ for efficiently
notifying the main libevent thread from another thread. But in
spite of its usage, it's fundamentally a part of the network code.
The "conffile" module knows about includes and filesystem access,
whereas confline doesn't. This will make it possible to put these
functions into libraries without introducing a cycle.
For HSv3, the HSADDRESS= wasn't properly parsed for the HSPOST command. It now
correctly use it and furthermore sends back a "200 OK" in case the command is
successful for a v3 descriptor.
Fixes#26523
Signed-off-by: David Goulet <dgoulet@torproject.org>
The smartlist_core library now contains only the parts of smartlists
that are needed for the logging library. This resolves the
circularity between "container" and "log".
The "containers" library still uses the logging code, and has the
higher-level smartlist functions.
Fixes bug 26480; bug appeared when we re-enabled the geoip tests on
windows. Bug originally introduced by our fix to 25787; bug not in
any released Tor.
This patch fixes a memory leak in disk_state_parse_commits() where if
commit is NULL, we continue the internal loop, but without ever freeing
the args variable.
See: Coverity CID 1437441.
This patch fixes a memory leak in frac_nodes_with_descriptors() where
we might return without free'ing the bandwidths variable.
See: Coverity CID 1437451.
This patch fixes a memory leak in new_establish_intro_cell() that could
happen if a test assertion fails and the *cell_out value isn't properly
free'd.
See: Coverity CID 1437445
This patch fixes a memory leak in decode_link_specifiers() where the
hs_spec variable might leak if the default label is taken in the
switch/case expression.
See: Coverity CID 1437437.
This patches fixes a memory leak in client_likes_consensus() where if
consensus_cache_entry_get_voter_id_digests() would fail we would return
without having free'd the voters list.
See: Coverity CID 1437447
This patch fixes a memory leak in hs_helper_build_hs_desc_impl() where
if a test assertion would fail we would leak the storage that `desc`
points to.
See: Coverity CID 1437448
This patch fixes a memory leak in pick_hsdir_v3() where we might return
early, but forgot to free the responsible_hsdirs variable. We solve this
by not allocating storage for responsible_hsdirs until it's actually
needed.
See: Coverity CID 1437449
This patch fixes a potential memory leak in test_hs_auth_cookies() if a
test-case fails and we goto the done label where no memory clean up is
done.
See: Coverity CID 1437453
This patch fixes a potential memory leak in
hs_helper_build_intro_point() where a `goto done` is called before the
`intro_point` variable have been assigned to the value of the `ip`
variable.
See: Coverity CID 1437460
See: Coverity CID 1437456
This patch:
- introduces an fdio module for low-level fd functions that don't
need to log.
- moves the responsibility for opening files outside of torlog.c,
so it won't need to call tor_open_cloexec.
The locking code gets its own module, since it's more fundamental
than the higher-level locking code.
Extracting the logging code was the whole point here. :)
Out-of-tree builds could fail to run the rust tests if built in
offline mode. cargo expects CARGO_HOME to point to the .cargo
directory, not the directory containing .cargo.
Fixes bug 26455; bug not in any released tor.
Includes configuration files to enforce these rules on lib and
common. Of course, "common" *is* a modularity violation right now,
so these rules aren't as strict as I would like them to be.
I am calling the crypto library "crypt_ops", since I want
higher-level crypto things to be separated from lower-level ones.
This library will hold only the low-level ones, once we have it
refactored.
When I wrote the first one of these, it needed the path of the geoip
file. But that doesn't translate well in at least two cases:
- Mingw, where the compile-time path is /c/foo/bar and the
run-time path is c:\foo\bar.
- Various CI weirdnesses, where we cross-compile a test binary,
then copy it into limbo and expect it to work.
Together, these problems precluded these tests running on windows.
So, instead let's just generate some minimal files ourselves, and
test against them.
Fixes bug 25787
Fix a memory leak where directory authorities would leak a chunk of
memory for every router descriptor every time they considered voting.
This bug was taking down directory authorities in the live network due
to out-of-memory issues.
Fixes bug 26435; bugfix on 0.3.3.6.
We'd like to feature gate code that calls C from Rust, as a workaround
to several linker issues when running `cargo test` (#25386), and we
can't feature gate anything out of test code if `cargo test` is called
with `--all-features`.
* FIXES#26400: https://bugs.torproject.org/26400
Previously we had code like this for bad things happening from
signal handlers, but it makes sense to use the same logic to handle
cases when something is happening at a level too low for log.c to be
involved.
My raw_assert*() stuff now uses this code.
We had accumulated a bunch of cruft here. Now let's only include
src and src/ext. (exception: src/trunnel is autogenerated code, and
need to include src/trunnel.)
This commit will break the build hard. The next commit will fix it.
The doctests for src/rust/crypto don't compile for multiple reasons,
including some missing exports and incorrect identifier paths. Fixes
bug 26415; bugfix on 0.3.4.1-alpha.
We need this trick because some of our Rust tests depend on our C
code, which in turn depend on other native libraries, which thereby
pulls a whole mess of our build system into "cargo test".
To solve this, we add a build script (build.rs) to set most of the
options that we want based on the contents of config.rust. Some
options can't be set, and need to go to the linker directly: we use
a linker replacement (link_rust.sh) for these. Both config.rust and
link_rust.sh are generated by autoconf for us.
This patch on its own should enough to make the crypto test build,
but not necessarily enough to make it pass.
After the big or.h refactoring, one single unit test file was missing two
headers for node_t and microdesc_t.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Due to linker issues (#25386) when testing Rust code which calls C,
all tests which touch FFI code should now be feature-gated behind the
"test-c-from-rust" flag. To run this test code, cargo must be called
with `cargo test --features="test-c-from-rust"`.
* FIXES#26398: https://bugs.torproject.org/26398
Also make sure that we're actually running the test from within the right
cwd, like we do when we're building. This seems necessary to avoid
an error when running offline.
Amusingly, it appears that we had this bug before: we just weren't
noticing it, because of bug 26258.
Some versions of GCC complain that the bfn_mock_node_get_by_id
function might return NULL, but we're assuming that it won't.
(We're assuming it won't return NULL because we know in the tests
that we're passing it valid IDs.)
To make GCC happy, tt_assert() that each node_t is set before using
it.
Fixes a second case of bug26269; bugfix on 0.3.0.1-alpha.
Here is how this changes the HSv3 client-side and service-side:
For service side we already required live consensus to upload descriptors (see
9e900d1db7) so we should never get there without
a live consensus.
For the client-side we now require a live consensus to attempt to connect to
HS. While this changes the client behavior in principle, it doesn't really
change it, because we always required live consensus to set HSDir indices, so
before this patch a client with no live consensus would try to compute
responsible HSDirs without any HSDir indices and bug out. This makes the client
behavior more consistent, by requiring a live consensus (and hence a
semi-synced clock) for the client to connect to an HS entirely.
The alternative would have been to allow setting HSDir indices with a non-live
consensus, but this would cause the various problems outlined by commit
b89d2fa1db.
There are a few reasons that relays might be uploading desciptors
without saying X-Desc-Gen-Reason:
1. They are running an old version of our software, before 0.3.2.stable.
2. They are not running our software, but they are claiming they
are.
3. They are uploading through a proxy that strips X-Desc-Gen-Reason.
4. They somehow had a bug in their software.
According to the 25686 data, 1 is the most common reason. This
ticket is an attempt to diagnose case 4, or prove that case 4
doesn't actually happen.
With the work on #25500 (reducing CPU client usage), the HS service main loop
callback is enabled as soon as the HS service map changes which happens when
registering a new service.
Unfortunately, for an ephemeral service, we were building the onion address
*after* the registration leading to the "service->onion_address` to be an
empty string.
This broke the "HS_DESC CREATED" event which had no onion address in it. And
also, we were logging an empty onion address for that service.
Fixes#25939
Signed-off-by: David Goulet <dgoulet@torproject.org>
* REFACTORS `UnvalidatedProtoEntry::from_str` to place the bulk of the
splitting/parsing logic in to a new
`UnvalidatedProtoEntry::parse_protocol_and_version_str()` method (so that
both `from_str()` and `from_str_any_len()` can call it.)
* ADD a new `UnvalidatedProtoEntry::from_str_any_len()` method in order to
maintain compatibility with consensus methods older than 29.
* ADD a limit on the number of characters in a protocol name.
* FIXES part of #25517: https://bugs.torproject.org/25517
In protover.c, the `expand_protocol_list()` function expands a `smartlist_t` of
`proto_entry_t`s to their protocol name concatenated with each version number.
For example, given a `proto_entry_t` like so:
proto_entry_t *proto = tor_malloc(sizeof(proto_entry_t));
proto_range_t *range = tor_malloc_zero(sizeof(proto_range_t));
proto->name = tor_strdup("DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa");
proto->ranges = smartlist_new();
range->low = 1;
range->high = 65536;
smartlist_add(proto->ranges, range);
(Where `[19KB]` is roughly 19KB of `"a"` bytes.) This would expand in
`expand_protocol_list()` to a `smartlist_t` containing 65536 copies of the
string, e.g.:
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=1"
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=2"
[…]
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=65535"
Thus constituting a potential resource exhaustion attack.
The Rust implementation is not subject to this attack, because it instead
expands the above string into a `HashMap<String, HashSet<u32>` prior to #24031,
and a `HashMap<UnvalidatedProtocol, ProtoSet>` after). Neither Rust version is
subject to this attack, because it only stores the `String` once per protocol.
(Although a related, but apparently of too minor impact to be usable, DoS bug
has been fixed in #24031. [0])
[0]: https://bugs.torproject.org/24031
* ADDS hard limit on protocol name lengths in protover.c and checks in
parse_single_entry() and expand_protocol_list().
* ADDS tests to ensure the bug is caught.
* FIXES#25517: https://bugs.torproject.org/25517
* REFACTORS `UnvalidatedProtoEntry::from_str` to place the bulk of the
splitting/parsing logic in to a new
`UnvalidatedProtoEntry::parse_protocol_and_version_str()` method (so that
both `from_str()` and `from_str_any_len()` can call it.)
* ADD a new `UnvalidatedProtoEntry::from_str_any_len()` method in order to
maintain compatibility with consensus methods older than 29.
* ADD a limit on the number of characters in a protocol name.
* FIXES part of #25517: https://bugs.torproject.org/25517
In protover.c, the `expand_protocol_list()` function expands a `smartlist_t` of
`proto_entry_t`s to their protocol name concatenated with each version number.
For example, given a `proto_entry_t` like so:
proto_entry_t *proto = tor_malloc(sizeof(proto_entry_t));
proto_range_t *range = tor_malloc_zero(sizeof(proto_range_t));
proto->name = tor_strdup("DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa");
proto->ranges = smartlist_new();
range->low = 1;
range->high = 65536;
smartlist_add(proto->ranges, range);
(Where `[19KB]` is roughly 19KB of `"a"` bytes.) This would expand in
`expand_protocol_list()` to a `smartlist_t` containing 65536 copies of the
string, e.g.:
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=1"
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=2"
[…]
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=65535"
Thus constituting a potential resource exhaustion attack.
The Rust implementation is not subject to this attack, because it instead
expands the above string into a `HashMap<String, HashSet<u32>` prior to #24031,
and a `HashMap<UnvalidatedProtocol, ProtoSet>` after). Neither Rust version is
subject to this attack, because it only stores the `String` once per protocol.
(Although a related, but apparently of too minor impact to be usable, DoS bug
has been fixed in #24031. [0])
[0]: https://bugs.torproject.org/24031
* ADDS hard limit on protocol name lengths in protover.c and checks in
parse_single_entry() and expand_protocol_list().
* ADDS tests to ensure the bug is caught.
* FIXES#25517: https://bugs.torproject.org/25517
Apparently, even though I had tested on OpenSSL 1.1.1 with
no-deprecated, OpenSSL 1.1.0 is different enough that I should have
tested with that as well.
Fixes bug 26156; bugfix on 0.3.4.1-alpha where we first declared
support for this configuration.
Add two new files (crypto_hkdf.c, crypto_hkdf.h) as new module of crypto.[ch].
This new module includes all functions and dependencies related to HKDF
operations. Those have been removed from crypto.[ch].
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
i don't know if whitespace is ok to have before preprocessing
directives on all platforms, but anyway we almost never have it,
so now things are more uniform.
Before this commit, the control events were never triggered. It was introduced
with commit 0c19ce7bde.
Fixes#26082
Signed-off-by: David Goulet <dgoulet@torproject.org>
We alloc/free X.509 structures in three ways:
1) X509 structure allocated with X509_new() and X509_free()
2) Fake X509 structure allocated with fake_x509_malloc() and fake_x509_free()
May contain valid pointers inside.
3) Empty X509 structure shell allocated with tor_malloc_zero() and
freed with tor_free()
The specification describes the signature token to be right after a newline
(\n) then the token "signature" and then a white-space followed by the encoded
signature.
This commit makes sure that when we parse the signature from the descriptor,
we are always looking for that extra white-space at the end of the token.
It will allow us also to support future fields that might start with
"signature".
Fixes#26069
Signed-off-by: David Goulet <dgoulet@torproject.org>
There are three cases where this can happen: changes in our
controller events, changes in our DisableNetwork setting, and
changes in our hibernation state.
Closes ticket 26063.
Two new values in each direction. DELIVERED counts valid end-to-end circuit
data that is accepted by our end and OVERHEAD counts the slack unused data in
each of the relay command cells for those accepted cells.
Control port changes are in the next commit.
We still do this time update here, since we do it from all
callbacks, but it is no longer a reason to keep the once-per-second
callback enabled.
Closes ticket 26009.
Since we're going to be disabling the second-elapsed callback, we're
going to sometimes have long periods when no events file, and so the
current second is not updated. Handle that by having a better means
to detect "clock jumps" as opposed to "being idle for a while".
Tolerate far more of the latter.
Part of #26009.
The any_client_port_set() returns true if the ControlPort is set which is
wrong because we can have that port open but still not behave as a tor client
(like many relays for instance).
Fixes#26062
Signed-off-by: David Goulet <dgoulet@torproject.org>
The options_any_client_port_set() returns true if the ControlPort is set which
is wrong because we can have that port open but still not behave as a tor
client (like many relays for instance).
Fixes#26062
Signed-off-by: David Goulet <dgoulet@torproject.org>
This functions is now used outside of networkstatus.c and makes more sense to
be in config.c.
It is also renamed to options_any_client_port_set() for the config.c
namespace.
No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Now it has a function that can tell the rest of Tor whether any
once-a-second controller item should fire, and a function to fire
all the once-a-second events.
When directory authorities read a zero-byte bandwidth file, they log
a warning with the contents of an uninitialised buffer. Log a warning
about the empty file instead.
Fixes bug 26007; bugfix on 0.2.2.1-alpha.
Remove v3 optimization which made Tor not detect disabling services.
This optimization is not so needed because we only call that function after HUP
anyway.
Fixes bug #25761.
During service configuration, rend_service_prune_list_impl_() sets
rend_service_staging_list to NULL, which blocked pruning after a HUP.
This patch initializes rend_service_staging_list when needed, so that HUP can
detect disabled onion services.
Fixes bug #25761.
Previously the coverage on this function was mostly accidental,
coming as it did from test_entryconn.c. These new tests use mocking
to ensure that we actually hit the different failure and retry cases
of addressmap_get_virtual_address(), and make our test coverage a
bit more deterministic.
Closes ticket 25993.
Previously, an authority with a clock more than 60 seconds ahead could
cause a client with a correct clock to warn that the client's clock
was behind. Now the clocks of a majority of directory authorities
have to be ahead of the client before this warning will occur.
Relax the early-consensus check so that a client's clock must be 60
seconds behind the earliest time that a given sufficiently-signed
consensus could possibly be available.
Add a new unit test that calls warn_early_consensus() directly.
Fixes bug 25756; bugfix on 0.2.2.25-alpha.
construct_consensus() in test_routerlist.c created votes using a
timestamp from time(). Tests that called construct_consensus() might
have nondeterministic results if they rely on time() not changing too
much on two successive calls.
Neither existing of the two existing tests that calls
construct_consensus is likely to have a failure due to this problem.
Included crypto_dh.h in some files in order to solve DH module dependency
issues.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Add two new files (crypto_dh.c, crypto_dh.h) as new module of crypto.[ch]. This
new module includes all functions and dependencies related to DH operations.
Those have been removed from crypto.[ch].
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
crypto_log_errors() has been moved to crypto_util.[ch]. It was duplicated in
some files so they have been removed too.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Code movement for the refactoring for ticket 24658 didn't copy the
inclusion of sys/random.h, which is needed to get a prototype for
getentropy() on macOS 10.12 Sierra. It also didn't copy the inclusion
of sys/syscall.h, which might prevent the getrandom() syscall from
being properly detected. Move these inclusions. Bug not in any
released Tor.
Our previous algorithm had a nonzero probability of picking no
events to cancel, which is of course incorrect. The new code uses
Vitter's good old reservoir sampling "algorithm R" from 1985.
Fixes bug 26008; bugfix on 0.2.6.3-alpha.
This functionality was covered only accidentally by our voting-test
code, and as such wasn't actually tested at all. The tests that
called it made its coverage nondeterministic, depending on what time
of day you ran the tests.
Closes ticket 26014.
This is needed for libressl-2.6.4 compatibility, which we broke when
we merged a15b2c57e1 to fix bug 19981. Fixes bug 26005; bug
not in any released Tor.
We cleared this value in second_elapsed_callback. But what were we
using it for? For detecting if Libevent returned EINVAL too often!
We already have a way to detect too-frequent events, and that's with
a ratelim_t. Refactor the code to use that instead. Closes ticket
26016.
From Neel's latest patch on optimizing the hs_circ_service_get_intro_circ()
digest calculation, remove an extra white-space and clarify a comment of the
legacy key digest to inform when to use it.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This function must return false if the module is not compiled in. In order to
do that, we move the authdir_mode_v3() function out of router.c and into the
dirauth module new header file named mode.h.
It is always returning false if we don't have the module.
Closes#25990
Signed-off-by: David Goulet <dgoulet@torproject.org>
This test was using the current time to pick the time period number,
and a randomly generated hs key. Therefore, it sometimes picked an
index that would wrap around the example dht, and sometimes would
not.
The fix here is just to fix the time period and the public key.
Fixes bug 25997; bugfix on 0.3.2.1-alpha.
When directory authorities read a zero-byte bandwidth file, they log
a warning with the contents of an uninitialised buffer. Log a warning
about the empty file instead.
Fixes bug 26007; bugfix on 0.2.2.1-alpha.
LibreSSL, despite not having the OpenSSL 1.1 API, does define
OPENSSL_VERSION in crypto.h. Additionally, it apparently annotates
some functions as returning NULL, so that our unit tests need to be
more careful about checking for NULL so they don't get compilation
warnings.
Closes ticket 26006.
This test, in test_client_pick_intro(), will have different coverage
depending on whether it selects a good intro point the first time or
whether it has to try a few times. Since it produces the shorter
coverage with P=1/4, repeat this test 64 times so that it only
provides reduced coverage with P=1/2^128. The performance cost is
negligible.
Closes ticket 25996. This test was introduced in 0.3.2.1-alpha.
I'd prefer not to do this for randomized tests, but as things stand
with this test, it produces nondeterministic test coverage.
Closes ticket 25995; bugfix on 0.2.2.2-alpha when this test was
introduced.
This change should make it impossible for the monotonic time to roll
over from one EWMA tick to the next during this test, and make it so
that this test never invokes scale_active_circuits() (which it
doesn't test).
(Earlier changes during the 0.3.4 series should make this call even
rarer than it was before, since we fixed#25927 and removed
cached_gettimeofday. Because this test didn't update
cached_gettimeofday, the chance of rolling over a 10-second interval
was much higher.)
Closes ticket 25994; bugfix on 0.3.3.1-alpha when this test was
introduced.
Arguably, the conditions under which these events happen should be a
bit different, but the rules are complex enough here that I've tried
to have this commit be pure refactoring.
Closes ticket 25952.
Finally, before this code goes away, take a moment to look at the
amazing way that we used to try to have an event happen
every N seconds:
get_uptime() / N != (get_uptime()+seconds_elapsed) / N
Truly, it is a thing of wonder. I'm glad we didn't start using this
pattern everywhere else.
By doing so, it is renamed to voting_schedule_recalculate_timing(). This
required a lot of changes to include voting_schedule.h everywhere that this
function was used.
This effectively now makes voting_schedule.{c|h} not include dirauth/dirvote.h
for that symbol and thus no dependency on the dirauth module anymore.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This function doesn't need to be public from the dirvote common file (which
will get renamed in future commit) so move it to dirauth/dirvote.c and make it
static.
Part of #25988
Signed-off-by: David Goulet <dgoulet@torproject.org>
It makes more sense to be in networkstatus.c so move it there and rename it
with the "networkstatus_" prefix.
Part of #25988
Signed-off-by: David Goulet <dgoulet@torproject.org>
Apparently, we can decide our state is dirty before we create the
event to tell the mainloop that we should save it. That's not a
problem, except for the assertion failure.
This requires that when a log cb happens, the event for flushing
queued events is scheduled, so we also add the necessary machinery
to have that happen.
Note that this doesn't actually help with logs from outside the main
thread, but those were already suppressed: see #25987 for a ticket
tracking that issue.
Sometimes the logging system will queue a log message for later.
When it does this, the callback will either get flushed at the next
safe time, or from the second-elapsed callback.
But we're trying to eliminate the second-elapsed callback, so let's
make a way for the log system to tell its users about this.
Commit 0f3b765b3c added
tor_assert_nonfatal_unreached() to dirvote_add_vote() and
dirvote_add_signatures() when the dirauth module is disabled.
However, they need to return a value. Furthermore, the dirvote_add_vote()
needs to set the msg_out and status_out so it can be sent back. Else,
uninitialized values would be used.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Originally, it was made public outside of the dirauth module but it is no
longer needed. In doing so, we put it back in dirvote.c and reverted its name
to the original one:
dirvote_authority_cert_dup() --> authority_cert_dup()
Signed-off-by: David Goulet <dgoulet@torproject.org>
The --disable-module-* configure option removes code from the final binary but
we still build the unit tests with the disable module(s) so we can actually
test that code path all the time and not forget about it.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This code is only for dirauth so this commit moves it into the module in
dirvote.c.
No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both functions are used for directory request but they can only be used if the
running tor instance is a directory authority.
For this reason, make those symbols visible but hard assert() if they are
called when the module is disabled. This would mean we failed to safeguard the
entry point into the module.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In order to further isolate the dirauth code into its module, this moves the
handling of the directory request GET /tor/status-vote/* into the module.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In order to make sr_commit_free() only used by the dirauth module, this
commits moves the commits free from a vote object into the dirvote.c file
which is now only for the module.
The function does nothing if the module is disabled.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Adds two unittests:
- First checks the path selection of basic Tor circs.
- Second checks the path selection of vanguard circs.
There is a TODO on the second unittest that we might want to test sooner than
later, but it's not trivial to do it right now.
To do these unittests we needed the following mods:
- Make some functions STATIC.
- Add some more fields to the big fake network nodes of test_entrynodes.c
- Switch fake node nicknames to base32 (because base64 does not produce valid nicknames).
This prevents a malicious RP/IP from learning the guard node in the case that
we are using only one (because we aren't using two guards, or because one of
those two guards is temporarily down).
This ensures the "strong" version of Property #6 from
https://lists.torproject.org/pipermail/tor-dev/2018-April/013098.html
(Information about the guard(s) does not leak to the website/RP at all).
The last hop in vanguard circuits can be an RP/IP/HSDir.
Since vanguard circuits are at least 3 hops (sometimes 4) before this node,
this change will not cause A - B - A paths.
When parsing a vote in routerparse.c, only dirauth extract the commits from
the vote so move all this code into dirvote.c so we can make it specific to
the dirauth module.
If the dirauth module is disabled, the commit parsing does nothing.
Signed-off-by: David Goulet <dgoulet@torproject.org>
From dirvote.c to networkstatus.c where it makes more sense both in terms of
namespace and subsystem responsability.
This removes one less dependency on the dirauth module.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add static inline dirauth public functions used outside of the dirauth module
so they can be seen by the tor code but simply do nothing.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Move most of the shared random functions that are needed outside of the
dirauth module.
At this commit, because dirvote.c hasn't been refactor, it doesn't compile
because some SR functions need a dirvote function.
Furthermore, 5 functions haven't been touched yet because they are dirauth
only but are in used in other C files than the dirauth module ones.
No code behavior change. Only moving code around.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is a pretty big commit but it only moves these files to src/or/dirauth:
dircollate.c dirvote.c shared_random.c shared_random_state.c
dircollate.h dirvote.h shared_random.h shared_random_state.h
Then many files are modified to change the include line for those header files
that have moved into a new directory.
Without using --disable-module-dirauth, everything builds fine. When using the
flag to disable the module, tor doesn't build due to linking errors. This will
be addressed in the next commit(s).
No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Remove useless include.
Clearly identify functions that are used by other part of Tor, functions that
are only used by the dirauth subsystem and functions that are exposed for unit
tests.
This will help us in the dirauth modularization effort.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Many functions become static to the C file or exposed to the tests within the
PRIVATE define of dirvote.h.
This commit moves a function to the top. No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Don't access the AuthoritativeDir options directly. We do this so we can move
authdir_mode() to the dirauth module.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Make our build system support a disable dirauth module option. It can only be
disabled explicitly with:
$ ./configure --disable-module-dirauth
If *not* specified that is enabled, an automake conditional variable is set to
true and a defined value for the C code:
AM_CONDITIONAL: BUILD_MODULE_DIRAUTH
AC_DEFINE: HAVE_MODULE_DIRAUTH=1
This introduces the dirauth/ module directory in src/or/ for which .c files
are only compiled if the BUILD_MODULE_DIRAUTH is set.
All the header files are compiled in regardless of the support so we can use
the alternative entry point functions of the dirauth subsystem.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because we rescan the main loop event list if the global map of services has
changed, this makes sure it does work.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because ADD_ONION/DEL_ONION can modify the global service map (both for v2 and
v3), we need to rescan the event list so we either enable or disable the HS
service main loop event.
Fixees #25939
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is done because it makes our life easier with unit tests. Also, a rescan
on an uninitialized event list will result in a stacktrace.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we change the hibernation state, rescan the main loop event list because
the new state might affect the events.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Implement the ability to set flags per events which influences the set up of
the event.
This commit only adds one flag which is "need network" meaning that the event
is not enabled if tor has disabled the network or if hibernation mode.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Prior to #23100, we were not counting HS circuit build times in our
calculation of the timeout. This could lead to a condition where our timeout
was set too low, based on non HS circuit build times, and then we would
abandon all HS circuits, storing no valid timeouts in the histogram.
This commit avoids the assert.
Needed to run tests from the tarball else the geoip unit test would fail by
not finding that file.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In 25374, we created the necessary post-loop event for scheduling
connection_ap_attach_pending as needed. Before that, we were
already running this event once per mainloop. There's no reason to
also run it once per second.
Closes ticket 25933. No changes file, since the relevant change is
already in 25374. Or possibly in 17590, depending on how you look
at it.
Our main function, though accurate on all platforms, can be very
slow on 32-bit hosts. This one is faster on all 32-bit hosts, and
accurate everywhere except apple, where it will typically be off by
1%. But since 32-bit apple is a relic anyway, I think we should be
fine.
Previously were using this value to have a cheap highish-resolution
timer. But we were only using it in one place, and current dogma is
to use monotime_coarse_t for this kind of thing.
This part of the code was the only part that used "cached
getttimeofday" feature, which wasn't monotonic, which we updated at
slight expense, and which I'd rather not maintain.
The clean_consdiffmgr() callback is only for relays acting as a directory
server, not all relays.
This commit adds a role for only directory server and sets the
clean_consdiffmgr() callback to use it.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We can't end up in the removed else {} condition since we first validate the
flavor we get and then we validate the flavor we parse from the given
consensus which means we can only handle the two flavors of the if/elseif
conditions.
Fixes#25914
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, we were ignoring values _over_ EPSILON. This bug was
also causing a warning at startup because the default value is set
to -1.0.
Fixes bug 25577; bugfix on 6b1dba214d. Bug not in any released tor.
Two helper functions to enable an event and disable an event which wraps the
launch and destroy of an event but takes care of the enabled flag.
They are also idempotent that is can be called multiple time on the same event
without effect if the event was already enabled or disabled.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In case we transitionned to a new role in Tor, we need to launch and/or
destroy some periodic events.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In tor, we have a series of possible "roles" that the tor daemon can be
enabled for. They are:
Client, Bridge, Relay, Authority (directory or bridge) and Onion service.
They can be combined sometimes. For instance, a Directory Authority is also a
Relay. This adds a "roles" field to a periodic event item object which is used
to know for which roles the event is for.
The next step is to enable the event only if the roles apply. No behavior
change at this commit.
Pars of #25762
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behavior change, just to make it easier to find callbacks and for the sake
of our human brain to parse the list properly.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Consensus method 25 is the oldest one supported by any stable
version of 0.2.9, which is our current most-recent LTS. Thus, by
proposal 290, they should be removed.
This commit does not actually remove the code to implement these
methods: it only makes it so authorities will no longer support
them. I'll remove the backend code for them in later commits.
It tried to pick nodes for which only routerinfo_t items are set,
but without setting UseMicroDescriptors to 0. This won't work any
more, now that we're strict about using the right descriptor types
due to 25691/25692/25213.
In order to fix 25691 and 25692, we need to pass the "direct_conn"
flag to more places -- particularly when choosing single-hop
tunnels. The right way to do this involves having a couple more
functions accept router_crn_flags_t, rather than a big list of
boolean arguments.
This commit also makes sure that choose_good_exit_server_general()
honors the direct_conn flag, to fix 25691 and 25692.
In router_add_running_nodes_to_smartlist(), we had an inline
implementation of the logic from node_has_descriptor(), which should
be changed to node_has_preferred_descriptor().
This patch adds a new node_has_preferred_descriptor() function, and
replaces most users of node_has_descriptor() with it. That's an
important change, since as of d1874b4339 (our fix for #25213),
we are willing to say that a node has _some_ descriptor, but not the
_right_ descriptor for a particular use case.
Part of a fix for 25691 and 25692.
Now that we allow cpuworkers for dirport-only hosts (to fix 23693),
we need to allow dup_onion_keys() to succeed for them.
The change to construct_ntor_key_map() is for correctness,
but is not strictly necessary.
This is done as follows:
* Only one function (find_dl_schedule()) actually returned a
smartlist. Now it returns an int.
* The CSV_INTERVAL type has been altered to ignore everything
after the first comma, and to store the value before the first
comma in an int.
This commit won't compile. It was made with the following perl
scripts:
s/smartlist_t \*(.*)DownloadSchedule;/int $1DownloadInitialDelay;/;
s/\b(\w*)DownloadSchedule\b/$1DownloadInitialDelay/;
sizeof(ret) is the size of the pointer, not the size of what it
points to. Fortunately, we already have a function to compare
tor_addr_port_t values for equality.
Bugfix on c2c5b13e5d8a77e; bug not in any released Tor. Found by
clang's scan-build.
Now that we update our buckets on demand before reading or writing,
we no longer need to update them all every TokenBucketRefillInterval
msec.
When a connection runs out of bandwidth, we do need a way to
reenable it, however. We do this by scheduling a timer to reenable
all blocked connections for TokenBucketRefillInterval msec after a
connection becomes blocked.
(If we were using PerConnBWRate more, it might make sense to have a
per-connection timer, rather than a single timeout. But since
PerConnBWRate is currently (mostly) unused, I'm going to go for the
simpler approach here, since usually whenever one connection has
become blocked on bandwidth, most connections are blocked on
bandwidth.)
Implements ticket 25373.
Previously this was done as part of the refill callback, but there's
no real reason to do it like that. Since we're trying to remove the
refill callback completely, we can do this work as part of
record_num_bytes_transferred_impl(), which already does quite a lot
of this.
We used to do this 10x per second in connection_buckets_refill();
instead, we now do it when the bucket becomes empty. This change is
part of the work of making connection_buckets_refill() obsolete.
Closes ticket 25828; bugfix on 0.2.3.5-alpha.
We recently merged a circuit cell queue size safeguard. This commit adds the
number of killed circuits that have reached the limit to the DoS heartbeat. It
now looks like this:
[notice] DoS mitigation since startup: 0 circuits killed with too many
cells. 0 circuits rejected, 0 marked addresses. 0 connections closed. 0
single hop clients refused.
Second thing that this patch does. It makes tor always print the DoS
mitigation heartbeat line (for a relay) even though no DoS mitigation have
been enabled. The reason is because we now kill circuits that have too many
cells regardless on if it is enabled or not but also it will give the operator
a chance to learn what is enabled with the heartbeat instead of suddenly
appearing when it is enabled by let say the consensus.
Fixes#25824
Signed-off-by: David Goulet <dgoulet@torproject.org>
Unfortunately, the units passed to
monotime_coarse_stamp_units_to_approx_msec() was always 0 due to a type
conversion.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit introduces the consensus parameter "circ_max_cell_queue_size"
which controls the maximum number of cells a circuit queue should have.
The default value is currently 50000 cells which is above what should be
expected but keeps us a margin of error for padding cells.
Related to this is #9072. Back in 0.2.4.14-alpha, we've removed that limit due
to a Guard discovery attack. Ticket #25226 details why we are putting back the
limit due to the memory pressure issue on relays.
Fixes#25226
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both header and code file had some indentation issues after mass renaming.
No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Really, the uint32_t is only an optimization; any kind of unit
should work fine. Some users might want to use time_t or
monotime_coarse_t or something like that.
Begin by creating a lowest-level triple of the types needed to
implement a token bucket: a configuration, a timestamp, and the raw
bucket itself.
Note that for low-level buckets, the units of the timestamp and the
bucket itself are unspecified: each user can use a different type.
(This patch breaks check-spaces; a later patch will fix it)
This is a simple search-and-replace to rename the token bucket type
to indicate that it contains both a read and a write bucket, bundled
with their configuration. It's preliminary to refactoring the
bucket type.
This test works by having two post-loop events activate one another
in a tight loop. If the "post-loop" mechanism didn't work, this
would be enough to starve all other events.
A linked connection_t is one that gets its I/O, not from the
network, but from another connection_t. When such a connection has
something to write, we want the corresponding connection to run its
read callback ... but not immediately, to avoid infinite recursion
and/or event loop starvation.
Previously we handled this case by activating the read events
outside the event loop. Now we use the "postloop event" logic.
This lets us simplify do_main_loop_once() a little.
We've been labeling some events as happening "outside the event
loop", to avoid Libevent starvation. This patch provides a cleaner
mechanism to avoid that starvation.
For background, the problem here is that Libevent only scans for new
events once it has run all its active callbacks. So if the
callbacks keep activating new callbacks, they could potentially
starve Libevent indefinitely and keep it from ever checking for
timed, socket, or signal events.
To solve this, we add the ability to label some events as
"post-loop". The rule for a "post-loop" event is that any events
_it_ activates can only be run after libevent has re-scanned for new
events at least once.
This differs from our previous token bucket abstraction in a few
ways:
1) It is an abstraction, and not a collection of fields.
2) It is meant to be used with monotonic timestamps, which should
produce better results than calling gettimeofday over and over.
In d1874b4339, we adjusted this check so that we insist on
using routerinfos for bridges. That's almost correct... but if we
have a bridge that is also a regular relay, then we should use
insist on its routerinfo when connecting to it as a bridge
(directly), and be willing to use its microdescriptor when
connecting to it elsewhere in our circuits.
This bug is a likely cause of some (all?) of the (exit_ei == NULL)
failures we've been seeing.
Fixes bug 25691; bugfix on 0.3.3.4-alpha
When size_t is 32 bits, the unit tests can't fit anything more than
4GB-1 into a size_t.
Additionally, tt_int_op() uses "long" -- we need tt_u64_op() to
safely test uint64_t values for equality.
Bug caused by tests for #24782 fix; not in any released Tor.
This patch changes the algorithm of compute_real_max_mem_in_queues() to
use 0.4 * RAM iff the system has more than or equal to 8 GB of RAM, but
will continue to use the old value of 0.75 * RAM if the system have less
than * GB of RAM available.
This patch also adds tests for compute_real_max_mem_in_queues().
See: https://bugs.torproject.org/24782
This patch makes get_total_system_memory mockable, which allows us to
alter the return value of the function in tests.
See: https://bugs.torproject.org/24782
This one happens if for some reason you start with DirPort enabled
but server mode turned off entirely.
Fixes a case of bug 23693; bugfix on 0.3.1.1-alpha.
This roughly doubles our test coverage of the bridges.c module.
* ADD new testing module, .../src/test/test_bridges.c.
* CHANGE a few function declarations from `static` to `STATIC`.
* CHANGE one function in transports.c, transport_get_by_name(), to be
mockable.
* CLOSES#25425: https://bugs.torproject.org/25425
This patch lifts the list of default directory authorities from config.c
into their own auth_dirs.inc file, which is then included in config.c
using the C preprocessor.
Patch by beastr0.
See: https://bugs.torproject.org/24854
* ADD new /src/common/crypto_rand.[ch] module.
* ADD new /src/common/crypto_util.[ch] module (contains the memwipe()
function, since all crypto_* modules need this).
* FIXES part of #24658: https://bugs.torproject.org/24658
This module doesn't actually need to mock the libevent mainloop at
all: it can just use the regular mainloop that the test environment
sets up.
Part of ticket 23750.
This change makes cpuworker and test_workqueue no longer need to
include event2/event.h. Now workqueue.c needs to include it, but
that is at least somewhat logical here.
We are going to stop recommending 0.2.5 so there is no reason to keep the
undef statement anymore.
Fixes#20522.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Both in geoip_note_client_seen() and options_need_geoip_info(), switch from
accessing the options directly to using the should_record_bridge_info() helper
function.
Fixes#25290
Signed-off-by: David Goulet <dgoulet@torproject.org>
Next commit is addressing the circuit queue cell limit so cleanup before doing
anything else.
Part of #25226
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, the limit for MAX_PROTOCOLS_TO_EXPAND was actually being applied
in Rust to the maximum number of version (total, for all subprotocols).
Whereas in C, it was being applied to the number of subprotocols that were
allowed. This changes the Rust to match C's behaviour.
The behaviours still do not match, unsurprisingly, but now we know where a
primary difference is: the Rust is validating version ranges more than the C,
so in the C it's possible to call protover_all_supported on a ridiculous
version range like "Sleen=0-4294967294" because the C uses
MAX_PROTOCOLS_TO_EXPAND to count the number of *subprotocols* whereas the Rust
uses it to count the total number of *versions* of all subprotocols.
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The DoS potential is slightly higher in C now due to some differences to the
Rust code, see the C_RUST_DIFFERS tags in src/rust/protover/tests/protover.rs.
Also, the comment about "failing at the splitting stage" in Rust wasn't true,
since when we split, we ignore empty chunks (e.g. "1--1" parses into
"(1,None),(None,1)" and "None" can't be parsed into an integer).
Finally, the comment about "Rust seems to experience an internal error" is only
true in debug mode, where u32s are bounds-checked at runtime. In release mode,
code expressing the equivalent of this test will error with
`Err(ProtoverError::Unparseable)` because 4294967295 is too large.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999". These
both behave the same now, i.e. both return "Link=6-999".
During code review and discussion with Chelsea Komlo, she pointed out
that protover::compute_for_old_tor() was a public function whose
return type was `&'static CStr`. We both agree that C-like parts of
APIs should:
1. not be exposed publicly (to other Rust crates),
2. only be called in the appropriate FFI code,
3. not expose types which are meant for FFI code (e.g. `*mut char`,
`CString`, `*const c_int`, etc.) to the pure-Rust code of other
crates.
4. FFI code (e.g. things in `ffi.rs` modules) should _never_ be called
from pure-Rust, not even from other modules in its own crate
(i.e. do not call `protover::ffi::*` from anywhere in
`protover::protoset::*`, etc).
With that in mind, this commit makes the following changes:
* CHANGE `protover::compute_for_old_tor()` to be
visible only at the `pub(crate)` level.
* RENAME `protover::compute_for_old_tor()` to
`protover::compute_for_old_tor_cstr()` to reflect the last change.
* ADD a new `protover::compute_for_old_tor()` function wrapper which
is public and intended for other Rust code to use, which returns a
`&str`.
It was changed to take borrows instead of taking ownership.
* REFACTOR `protover::ffi::protover_is_supported_here()` to use changed method
signature on `protover::is_supported_here()`.
This includes a subtle difference in behaviour to the previous Rust
implementation, where, for each vote that we're computing over, if a single one
fails to parse, we skip it. This now matches the current behaviour in the C
implementation.
* REFACTOR `protover::ffi::protover_compute_vote()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol_or_later()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol()` to use new types
and methods.
This includes differences in behaviour to before, which should now more closely
match the C version:
- If parsing a protover `char*` from C, and the string is not parseable, this
function will return 1 early, which matches the C behaviour when protocols
are unparseable. Previously, we would parse it and its version numbers
simultaneously, i.e. there was no fail early option, causing us to spend more
time unnecessarily parsing versions.
* REFACTOR `protover::ffi::protover_all_supported()` to use new types and
methods.
Previously, the rust implementation of protover considered an empty string to be
a valid ProtoEntry, while the C version did not (it must have a "=" character).
Other differences include that unknown protocols must now be parsed as
`protover::UnknownProtocol`s, and hence their entries as
`protover::UnvalidatedProtoEntry`s, whereas before (nearly) all protoentries
could be parsed regardless of how erroneous they might be considered by the C
version.
My apologies for this somewhat messy and difficult to read commit, if any part
is frustrating to the reviewer, please feel free to ask me to split this into
smaller changes (possibly hard to do, since so much changed), or ask me to
comment on a specific line/change and clarify how/when the behaviours differ.
The tests here should more closely match the behaviours exhibited by the C
implementation, but I do not yet personally guarantee they match precisely.
* REFACTOR unittests in protover::protover.
* ADD new integration tests for previously untested behaviour.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This changes `protover::is_supported_here()` to be aware of new datatypes
(e.g. don't call `.0` on things which are no longer tuple structs) and also
changes the method signature to take borrows, making it faster, threadable, and
easier to read (i.e. the caller can know from reading the function signature
that the function won't mutate values passed into it).
* CHANGE the `protover::is_supported_here()` function to take borrows.
* REFACTOR the `protover::is_supported_here()` function to be aware of new
datatypes.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new type for votes upon `protover::ProtoEntry`s (technically, on
`protover::UnvalidatedProtoEntry`s, because the C code does not validate based
upon currently known protocols when voting, in order to maintain
future-compatibility), and converts several functions which would have operated
on this datatype into methods for ease-of-use and readability.
This also fixes a behavioural differentce to the C version of
protover_compute_vote(). The C version of protover_compute_vote() calls
expand_protocol_list() which checks if there would be too many subprotocols *or*
expanded individual version numbers, i.e. more than MAX_PROTOCOLS_TO_EXPAND, and
does this *per vote* (but only in compute_vote(), everywhere else in the C seems
to only care about the number of subprotocols, not the number of individual
versions). We need to match its behaviour in Rust and ensure we're not allowing
more than it would to get the votes to match.
* ADD new `protover::ProtoverVote` datatype.
* REMOVE the `protover::compute_vote()` function and refactor it into an
equivalent-in-behaviour albeit more memory-efficient voting algorithm based
on the new underlying `protover::protoset::ProtoSet` datatype, as
`ProtoverVote::compute()`.
* REMOVE the `protover::write_vote_to_string()` function, since this
functionality is now generated by the impl_to_string_for_proto_entry!() macro
for both `ProtoEntry` and `UnvalidatedProtoEntry` (the latter of which is the
correct type to return from a voting protocol instance, since the entity
voting may not know of all protocols being voted upon or known about by other
voting parties).
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix a difference in compute_vote() behaviour to C version.
This implements conversions from either a ProtoEntry or an UnvalidatedProtoEntry
into a String, for use in replacing such functions as
`protover::write_vote_to_string()`.
* ADD macro for implementing ToString trait for ProtoEntry and
UnvalidatedProtoEntry.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new protover::UnvalidatedProtoEntry type, which is the
UnknownProtocol variant of a ProtoEntry, and refactors several functions which
should operate on this type into methods.
This also fixes what was previously another difference to the C implementation:
if you asked the C version of protovet_compute_vote() to compute a single vote
containing "Fribble=", it would return NULL. However, the Rust version would
return "Fribble=" since it didn't check if the versions were empty before
constructing the string of differences. ("Fribble=" is technically a valid
protover string.) This is now fixed, and the Rust version in that case will,
analogous to (although safer than) C returning a NULL, return None.
* REMOVE internal `contains_only_supported_protocols()` function.
* REMOVE `all_supported()` function and refactor it into
`UnvalidatedProtoEntry::all_supported()`.
* REMOVE `parse_protocols_from_string_with_no_validation()` and
refactor it into the more rusty implementation of
`impl FromStr for UnvalidatedProtoEntry`.
* REMOVE `protover_string_supports_protocol()` and refactor it into
`UnvalidatedProtoEntry::supports_protocol()`.
* REMOVE `protover_string_supports_protocol_or_later()` and refactor
it into `UnvalidatedProtoEntry::supports_protocol_or_later()`.
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix another C/Rust different in compute_vote().
This fixes the unittest from the prior commit by checking if the versions are
empty before adding a protocol to a vote.
This replaces the `protover::SupportedProtocols` (why would you have a type just
for things which are supported?) with a new, more generic type,
`protover::ProtoEntry`, which utilises the new, more memory-efficient datatype
in protover::protoset.
* REMOVE `get_supported_protocols()` and `SupportedProtocols::tor_supported()`
(since they were never used separately) and collapse their functionality into
a single `ProtoEntry::supported()` method.
* REMOVE `SupportedProtocols::from_proto_entries()` and reimplement its
functionality as the more rusty `impl FromStr for ProtoEntry`.
* REMOVE `get_proto_and_vers()` function and refactor it into the more rusty
`impl FromStr for ProtoEntry`.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new type, protover::UnknownProtocol, so that we have greater type safety
and our protover functionality which works with unsanitised protocol names is
more clearly demarcated.
* REFACTOR protover::Proto, renaming it protover::Protocol to mirror the new
protover::UnknownProtocol type name.
* ADD a utility conversion of `impl From<Protocol> for UnknownProtocol` so that
we can easily with known protocols and unknown protocols simultaneously
(e.g. doing comparisons, checking their version numbers), while not allowing
UnknownProtocols to be accidentally used in functions which should only take
Protocols.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new protover::protoset module.
* ADD new protover::protoset::ProtoSet class for holding protover versions.
* REMOVE protover::Versions type implementation and its method
`from_version_string()`, and instead implement this behaviour in a more
rust-like manner as `impl FromStr for ProtoSet`.
* MOVE the `find_range()` utility function from protover::protover to
protover::protoset since it's only used internally in the
implementation of ProtoSet.
* REMOVE the `contract_protocol_list()` function from protover::protover and
instead refactor it (reusing nearly the entire thing, with minor superficial,
i.e. non-behavioural, changes) into a more rusty
`impl ToString for ProtoSet`.
* REMOVE the `expand_version_range()` function from protover::protover and
instead refactor it into a more rusty implementation of
`impl Into<Vec<Version>> for ProtoSet` using the new error types in
protover::errors.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This will allow us to do actual error handling intra-crate in a more
rusty manner, e.g. propogating errors in match statements, conversion
between error types, logging messages, etc.
* FIXES part of #24031: https://bugs.torproject.org/24031.
Previously, the limit for MAX_PROTOCOLS_TO_EXPAND was actually being applied
in Rust to the maximum number of version (total, for all subprotocols).
Whereas in C, it was being applied to the number of subprotocols that were
allowed. This changes the Rust to match C's behaviour.
The behaviours still do not match, unsurprisingly, but now we know where a
primary difference is: the Rust is validating version ranges more than the C,
so in the C it's possible to call protover_all_supported on a ridiculous
version range like "Sleen=0-4294967294" because the C uses
MAX_PROTOCOLS_TO_EXPAND to count the number of *subprotocols* whereas the Rust
uses it to count the total number of *versions* of all subprotocols.
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The DoS potential is slightly higher in C now due to some differences to the
Rust code, see the C_RUST_DIFFERS tags in src/rust/protover/tests/protover.rs.
Also, the comment about "failing at the splitting stage" in Rust wasn't true,
since when we split, we ignore empty chunks (e.g. "1--1" parses into
"(1,None),(None,1)" and "None" can't be parsed into an integer).
Finally, the comment about "Rust seems to experience an internal error" is only
true in debug mode, where u32s are bounds-checked at runtime. In release mode,
code expressing the equivalent of this test will error with
`Err(ProtoverError::Unparseable)` because 4294967295 is too large.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999". These
both behave the same now, i.e. both return "Link=6-999".
During code review and discussion with Chelsea Komlo, she pointed out
that protover::compute_for_old_tor() was a public function whose
return type was `&'static CStr`. We both agree that C-like parts of
APIs should:
1. not be exposed publicly (to other Rust crates),
2. only be called in the appropriate FFI code,
3. not expose types which are meant for FFI code (e.g. `*mut char`,
`CString`, `*const c_int`, etc.) to the pure-Rust code of other
crates.
4. FFI code (e.g. things in `ffi.rs` modules) should _never_ be called
from pure-Rust, not even from other modules in its own crate
(i.e. do not call `protover::ffi::*` from anywhere in
`protover::protoset::*`, etc).
With that in mind, this commit makes the following changes:
* CHANGE `protover::compute_for_old_tor()` to be
visible only at the `pub(crate)` level.
* RENAME `protover::compute_for_old_tor()` to
`protover::compute_for_old_tor_cstr()` to reflect the last change.
* ADD a new `protover::compute_for_old_tor()` function wrapper which
is public and intended for other Rust code to use, which returns a
`&str`.
It was changed to take borrows instead of taking ownership.
* REFACTOR `protover::ffi::protover_is_supported_here()` to use changed method
signature on `protover::is_supported_here()`.
This includes a subtle difference in behaviour to the previous Rust
implementation, where, for each vote that we're computing over, if a single one
fails to parse, we skip it. This now matches the current behaviour in the C
implementation.
* REFACTOR `protover::ffi::protover_compute_vote()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol_or_later()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol()` to use new types
and methods.
This includes differences in behaviour to before, which should now more closely
match the C version:
- If parsing a protover `char*` from C, and the string is not parseable, this
function will return 1 early, which matches the C behaviour when protocols
are unparseable. Previously, we would parse it and its version numbers
simultaneously, i.e. there was no fail early option, causing us to spend more
time unnecessarily parsing versions.
* REFACTOR `protover::ffi::protover_all_supported()` to use new types and
methods.
Previously, the rust implementation of protover considered an empty string to be
a valid ProtoEntry, while the C version did not (it must have a "=" character).
Other differences include that unknown protocols must now be parsed as
`protover::UnknownProtocol`s, and hence their entries as
`protover::UnvalidatedProtoEntry`s, whereas before (nearly) all protoentries
could be parsed regardless of how erroneous they might be considered by the C
version.
My apologies for this somewhat messy and difficult to read commit, if any part
is frustrating to the reviewer, please feel free to ask me to split this into
smaller changes (possibly hard to do, since so much changed), or ask me to
comment on a specific line/change and clarify how/when the behaviours differ.
The tests here should more closely match the behaviours exhibited by the C
implementation, but I do not yet personally guarantee they match precisely.
* REFACTOR unittests in protover::protover.
* ADD new integration tests for previously untested behaviour.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This changes `protover::is_supported_here()` to be aware of new datatypes
(e.g. don't call `.0` on things which are no longer tuple structs) and also
changes the method signature to take borrows, making it faster, threadable, and
easier to read (i.e. the caller can know from reading the function signature
that the function won't mutate values passed into it).
* CHANGE the `protover::is_supported_here()` function to take borrows.
* REFACTOR the `protover::is_supported_here()` function to be aware of new
datatypes.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new type for votes upon `protover::ProtoEntry`s (technically, on
`protover::UnvalidatedProtoEntry`s, because the C code does not validate based
upon currently known protocols when voting, in order to maintain
future-compatibility), and converts several functions which would have operated
on this datatype into methods for ease-of-use and readability.
This also fixes a behavioural differentce to the C version of
protover_compute_vote(). The C version of protover_compute_vote() calls
expand_protocol_list() which checks if there would be too many subprotocols *or*
expanded individual version numbers, i.e. more than MAX_PROTOCOLS_TO_EXPAND, and
does this *per vote* (but only in compute_vote(), everywhere else in the C seems
to only care about the number of subprotocols, not the number of individual
versions). We need to match its behaviour in Rust and ensure we're not allowing
more than it would to get the votes to match.
* ADD new `protover::ProtoverVote` datatype.
* REMOVE the `protover::compute_vote()` function and refactor it into an
equivalent-in-behaviour albeit more memory-efficient voting algorithm based
on the new underlying `protover::protoset::ProtoSet` datatype, as
`ProtoverVote::compute()`.
* REMOVE the `protover::write_vote_to_string()` function, since this
functionality is now generated by the impl_to_string_for_proto_entry!() macro
for both `ProtoEntry` and `UnvalidatedProtoEntry` (the latter of which is the
correct type to return from a voting protocol instance, since the entity
voting may not know of all protocols being voted upon or known about by other
voting parties).
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix a difference in compute_vote() behaviour to C version.
This implements conversions from either a ProtoEntry or an UnvalidatedProtoEntry
into a String, for use in replacing such functions as
`protover::write_vote_to_string()`.
* ADD macro for implementing ToString trait for ProtoEntry and
UnvalidatedProtoEntry.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new protover::UnvalidatedProtoEntry type, which is the
UnknownProtocol variant of a ProtoEntry, and refactors several functions which
should operate on this type into methods.
This also fixes what was previously another difference to the C implementation:
if you asked the C version of protovet_compute_vote() to compute a single vote
containing "Fribble=", it would return NULL. However, the Rust version would
return "Fribble=" since it didn't check if the versions were empty before
constructing the string of differences. ("Fribble=" is technically a valid
protover string.) This is now fixed, and the Rust version in that case will,
analogous to (although safer than) C returning a NULL, return None.
* REMOVE internal `contains_only_supported_protocols()` function.
* REMOVE `all_supported()` function and refactor it into
`UnvalidatedProtoEntry::all_supported()`.
* REMOVE `parse_protocols_from_string_with_no_validation()` and
refactor it into the more rusty implementation of
`impl FromStr for UnvalidatedProtoEntry`.
* REMOVE `protover_string_supports_protocol()` and refactor it into
`UnvalidatedProtoEntry::supports_protocol()`.
* REMOVE `protover_string_supports_protocol_or_later()` and refactor
it into `UnvalidatedProtoEntry::supports_protocol_or_later()`.
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix another C/Rust different in compute_vote().
This fixes the unittest from the prior commit by checking if the versions are
empty before adding a protocol to a vote.
This replaces the `protover::SupportedProtocols` (why would you have a type just
for things which are supported?) with a new, more generic type,
`protover::ProtoEntry`, which utilises the new, more memory-efficient datatype
in protover::protoset.
* REMOVE `get_supported_protocols()` and `SupportedProtocols::tor_supported()`
(since they were never used separately) and collapse their functionality into
a single `ProtoEntry::supported()` method.
* REMOVE `SupportedProtocols::from_proto_entries()` and reimplement its
functionality as the more rusty `impl FromStr for ProtoEntry`.
* REMOVE `get_proto_and_vers()` function and refactor it into the more rusty
`impl FromStr for ProtoEntry`.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new type, protover::UnknownProtocol, so that we have greater type safety
and our protover functionality which works with unsanitised protocol names is
more clearly demarcated.
* REFACTOR protover::Proto, renaming it protover::Protocol to mirror the new
protover::UnknownProtocol type name.
* ADD a utility conversion of `impl From<Protocol> for UnknownProtocol` so that
we can easily with known protocols and unknown protocols simultaneously
(e.g. doing comparisons, checking their version numbers), while not allowing
UnknownProtocols to be accidentally used in functions which should only take
Protocols.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new protover::protoset module.
* ADD new protover::protoset::ProtoSet class for holding protover versions.
* REMOVE protover::Versions type implementation and its method
`from_version_string()`, and instead implement this behaviour in a more
rust-like manner as `impl FromStr for ProtoSet`.
* MOVE the `find_range()` utility function from protover::protover to
protover::protoset since it's only used internally in the
implementation of ProtoSet.
* REMOVE the `contract_protocol_list()` function from protover::protover and
instead refactor it (reusing nearly the entire thing, with minor superficial,
i.e. non-behavioural, changes) into a more rusty
`impl ToString for ProtoSet`.
* REMOVE the `expand_version_range()` function from protover::protover and
instead refactor it into a more rusty implementation of
`impl Into<Vec<Version>> for ProtoSet` using the new error types in
protover::errors.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This will allow us to do actual error handling intra-crate in a more
rusty manner, e.g. propogating errors in match statements, conversion
between error types, logging messages, etc.
* FIXES part of #24031: https://bugs.torproject.org/24031.
When a relay is collecting internal statistics about how many
create cell requests it has seen of each type, accurately count the
requests from relays that temporarily fall out of the consensus.
(To be extra conservative, we were already ignoring requests from clients
in our counts, and we continue ignoring them here.)
Fixes bug 24910; bugfix on 0.2.4.17-rc.
Directory authorities no longer vote in favor of the Guard flag
for relays that don't advertise directory support.
Starting in Tor 0.3.0.1-alpha, Tor clients have been avoiding using
such relays in the Guard position, leading to increasingly broken load
balancing for the 5%-or-so of Guards that don't advertise directory
support.
Fixes bug 22310; bugfix on 0.3.0.6.
Add a missing lock acquisition around access to queued_control_events
in control_free_all(). Use the reassign-and-unlock strategy as in
queued_events_flush_all(). Fixes bug 25675. Coverity found this bug,
but only after we recently added an access to
flush_queued_event_pending.
The earlier checks in this function should ensure that components is
always nonempty. But in case somebody messes with them in the
future, let's add an extra check to make sure we aren't crashing.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
If we failed to connect at the TCP level to a relay, note it down and refuse
to connect again for another 60 seconds.
Fixes#24767
Signed-off-by: David Goulet <dgoulet@torproject.org>
This update is needed to make it consistent with the behavior of
node_awaiting_ipv6(), which doesn't believe in the addresses from
routerinfos unless it actually plans to use those routerinfos.
Fixes bug 25213; bugfix on b66b62fb75 in 0.3.3.1-alpha,
which tightened up the definition of node_awaiting_ipv6().
There was a nonfatal assertion in pathbias_should_count that would
trigger if onehop_tunnel was set, but the desired_path_length was
greater than 1. This patch fixes that. Fixes bug 24903; bugfix on
0.2.5.2-alpha.
These tests handle incoming and outgoing cells on a three-hop
circuit, and make sure that the crypto works end-to-end. They don't
yet test spec conformance, leaky-pipe, or various error cases.
Make sure we actually only report client channel to the geoip cache instead of
looking if it is a known relay. Looking if it is a known relay can be
unreliable because they come and go from the consensus.
Fixes#24904
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because of #25306 for which we are unable to reproduce nor understand how it
is possible, this commit removes the asserts() and BUG() on the missing
descriptors instead when rotating them.
This allows us to log more data on error but also to let tor recover
gracefully instead of dying.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This patch adds some additional logging to circuits_handle_oom() to give
us more information about which specific compression backend that is
using a certain amount of memory.
See: https://bugs.torproject.org/25372
These declarations need to exist unconditionally, but they were
trapped inside an "#else /* !(defined(HAVE_SYSLOG_H)) */" block.
Fixes a travis regression caused by 23881; bug not in any released tor.
Additionally, this change extracts the functions that created and
freed these elements.
These structures had common "forward&reverse stream&digest"
elements, but they were initialized and freed through cpath objects,
and different parts of the code depended on them. Now all that code
is extacted, and kept in relay_crypto.c
This should help us improve modularity, and should also make it
easier for people to experiment with other relay crypto strategies
down the road.
This commit is pure function movement.
This function is used upon receiving a cell, and only handles the
decrypting part. The encryption part is currently handled inside
circuit_package_relay_cell.
This should avoid most intermittent test failures on developer and CI machines,
but there could (and probably should) be a more elegant solution.
Also, this test was testing that the IP was created and its expiration time was
set to a time greater than or equal to `now+INTRO_POINT_LIFETIME_MIN_SECONDS+5`:
/* Time to expire MUST also be in that range. We add 5 seconds because
* there could be a gap between setting now and the time taken in
* service_intro_point_new. On ARM, it can be surprisingly slow... */
tt_u64_op(ip->time_to_expire, OP_GE,
now + INTRO_POINT_LIFETIME_MIN_SECONDS + 5);
However, this appears to be a typo, since, according to the comment above it,
adding five seconds was done because the IP creation can be slow on some
systems. But the five seconds is added to the *minimum* time we're comparing
against, and so it actually functions to make this test *more* likely to fail on
slower systems. (It should either subtract five seconds, or instead add it to
time_to_expire.)
* FIXES#25450: https://bugs.torproject.org/25450
These were meant to demonstrate old behavior, or old rust behavior.
One of them _should_ work in Rust, but won't because of
implementation details. We'll fix that up later.
The C code and the rust code had different separate integer overflow
bugs here. That suggests that we're better off just forbidding this
pathological case.
Also, add tests for expected behavior on receiving a bad protocol
list in a consensus.
Fixes another part of 25249.
I've refactored these to be a separate function, to avoid tricky
merge conflicts.
Some of these are disabled with "XXXX" comments; they should get
fixed moving forward.
This one can only be exploited if you can generate a correctly
signed consensus, so it's not as bad as 25074.
Fixes bug 25251; also tracked as TROVE-2018-004.
In some cases we had checked for it, but in others we had not. One
of these cases could have been used to remotely cause
denial-of-service against directory authorities while they attempted
to vote.
Fixes TROVE-2018-001.
* ADD includes for "torint.h" and "container.h" to crypto_digest.h.
* ADD includes for "crypto_digest.h" to a couple places in which
crypto_digest_t was then missing.
* FIXES part of #24658: https://bugs.torproject.org/24658#comment:30
Folks have found two in the past week or so; we may as well fix the
others.
Found with:
\#!/usr/bin/python3
import re
def findMulti(fname):
includes = set()
with open(fname) as f:
for line in f:
m = re.match(r'^\s*#\s*include\s+["<](\S+)[>"]', line)
if m:
inc = m.group(1)
if inc in includes:
print("{}: {}".format(fname, inc))
includes.add(m.group(1))
import sys
for fname in sys.argv[1:]:
findMulti(fname)
We moved the crypto_pk_obselete_* functions into crypto_rsa.[ch] because they fit
better with the RSA module.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
[Cherry-picked]
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
This commit takes a piece of commit af8cadf3a9 and a piece of commit
46fe353f25, with the goal of making channel_is_client() be based on what
sort of connection handshake the other side used, rather than seeing
whether the other side ever sent a create_fast cell to us.
We moved the crypto_pk_* digest functions into crypto_rsa.[ch] because they fit
better with the RSA module.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Also correct MAX_VERSIONS_TO_EXPAND to match the C.
NOTE that this patch leads to incorrect behavior: the C code allows
huge ranges; it just doesn't allow votes on them (currently). For
full compatibility, we'll need to make the rust code store ranges as
ranges natively, possibly using something like the range_map crate.
Still, this patch is smaller than a "proper" fix.
Fixes TROVE-2018-003.
Since 0.2.4, tor uses EWMA circuit policy to prioritize. The previous
algorithm, round-robin, hasn't been used since then but was still used as a
fallback.
Now that EWMA is mandatory, remove that code entirely and enforce a cmux
policy to be set.
This is part of a circuitmux cleanup to improve performance and reduce
complexity in the code. We'll be able to address future optimization with this
work.
Closes#25268
Signed-off-by: David Goulet <dgoulet@torproject.org>
The reason to do so is because these functions haven't been used in years so
since 0.2.4, every callsite is NOP.
In future commits, we'll remove the round robin circuit policy which is mostly
validated within those function.
This simplifies the code greatly and remove dead code for which we never had a
configure option in the first place nor an easy way to use them in production.
Part of #25268
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is rename to something more meaningful that explains what it does exactly
which is sets the EWMA options (currently only one exists). The new name is
cmux_ewma_set_options().
Also, remove a public function from circuitmux_ewma.h that is only used in the
C file. Make it static inline as well.
Signed-off-by: David Goulet <dgoulet@torproject.org>
To achieve this, a default value for the CircuitPriorityHalflife option was
needed. We still look in the options and then the consensus but in case no
value can be found, the default CircuitPriorityHalflifeMsec=30000 is used. It
it the value we've been using since 0.2.4.4-alpha.
This means that EWMA, our only policy, can not be disabled anymore fallbacking
to the round robin algorithm. Unneeded code to control that is removed in this
commit.
Part of #25268
Signed-off-by: David Goulet <dgoulet@torproject.org>
We had this safeguard around dos_init() but not when the consensus changes
which can modify consensus parameters and possibly enable the DoS mitigation
even if tor wasn't a public relay.
Fixes#25223
Signed-off-by: David Goulet <dgoulet@torproject.org>
Explicitly inform the operator of the rejected relay to set a valid email
address in the ContactInfo field and contact bad-relays@ mailing list.
Fixes#25170
Signed-off-by: David Goulet <dgoulet@torproject.org>
We don't expect this to come up very much, but we may as well make
sure that the value isn't predictable (as we do for the other
addresses) in case the issue ever comes up.
Spotted by teor.
This patch lowers the log-level from warning to info in the cases where
we are going to attempt another method as entropy source to hopefully
make the user feel less concerned.
See: https://bugs.torproject.org/25120
* ADD a new macro, tor_util::string::cstr!() which takes Rust strings,
concatenates them together, appends a NUL byte, and converts it into a
std::ffi::CStr for handing to C.
This is to avoid positively identifying Exit relays if tor client connection
comes from them that is reentering the network.
One thing to note is that this is done only in the DoS subsystem but we'll
still add it to the geoip cache as a "client" seen. This is done that way so
to avoid as much as possible changing the current behavior of the geoip client
cache since this is being backported.
Closes#25193
Signed-off-by: David Goulet <dgoulet@torproject.org>
Rationale: this helps for performance only, but we don't actually
have any reason to think that the checks here are
performance-critical. Let's not normalize the use of unsafe {}.
Explicitly inform the operator of the rejected relay to set a valid email
address in the ContactInfo field and contact bad-relays@ mailing list.
Fixes#25170
Signed-off-by: David Goulet <dgoulet@torproject.org>
* FIXES#25127: https://bugs.torproject.org/25127
* ADDS a new module to the Rust tor_util crate for small utilities
for working with static strings between languages.
* CHANGES the return type of protover_compute_for_old_tor to point to
immutable data.
* CHANGES the code from the previous commit to use the new static
string utilities.
At this commit, the SocksSocketsGroupWritable option is renamed to
UnixSocksGroupWritable. A deprecated warning is triggered if the old option is
used and tor will use it properly.
Fixes#24343
Signed-off-by: David Goulet <dgoulet@torproject.org>
On slow system, 1 msec between one read and the other was too tight. For
instance, it failed on armel with a 4msec gap:
https://buildd.debian.org/status/package.php?p=tor&suite=experimental
Increase to 10 msec for now to address slow system. It is important that we
keep this OP_LE test in so we make sure the msec/usec/nsec read aren't
desynchronized by huge gaps. We'll adjust again if we ever encounter a system
that goes slower than 10 msec between calls.
Fixes#25113
Signed-off-by: David Goulet <dgoulet@torproject.org>
Remove a series of connection counters that were only used when dumping the
rephist statistics with SIGUSR1 signal.
This reduces the or_history_t structure size.
Closes#25163
Signed-off-by: David Goulet <dgoulet@torproject.org>
This removes the code that tracks the extend attemps a client makes. We don't
use it and it was only used to provide statistics on a SIGUSR1 from the
rephist dump stats function.
Part of #25163
Signed-off-by: David Goulet <dgoulet@torproject.org>
Since we're making it so that unstable zstd apis can be disabled,
we need to test them. I do this by adding a variant setup/cleanup
function for the tests, and teaching it about a fake compression
method called "x-zstd:nostatic".
Services can keep rendezvous circuits for a while so don't log them if tor is
a single onion service.
Fixes#25116
Signed-off-by: David Goulet <dgoulet@torproject.org>
The HT_FOREACH() is insanely heavy on the CPU and this is part of the fast
path so make it return the nice memory size counter we added in
4d812e29b9.
Fixes#25148
Signed-off-by: David Goulet <dgoulet@torproject.org>
Included crypto_digest.[ch] into include.am in order to solve a compiling
issue. Also EOF line in crypto_digest.c added.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Included crypto_digest.h in some files in order to solve xof+digest module
dependency issues. Removed crypto.h where it isn't needed anymore.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Add two new files (crypto_digest.c, crypto_digest.h) as new module of
crypto.[ch]. This new module includes all functions and dependencies related
to digest and xof operations. Those have been removed from crypto.[ch].
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Becasue the circuit creation burst and rate can change at runtime it is
possible that between two refill of a bucket, we end up setting the bucket
value to less than there currently is.
Fixes#25128
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the cache is using 20% of our maximum allowed memory, clean 10% of it. Same
behavior as the HS descriptor cache.
Closes#25122
Signed-off-by: David Goulet <dgoulet@torproject.org>
The current code flow makes it that we can release a channel in a PENDING
state but not in the pending list. This happens while the channel is being
processed in the scheduler loop.
Fixes#25125
Signed-off-by: David Goulet <dgoulet@torproject.org>
This tests many cases of the KIST scheduler with the pending list state by
calling entry point in the scheduler while channels are scheduled or not.
Also, it adds a test for the bug #24700.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This patch makes the wording around error cases for selecting an entropy
source in Tor slightly more verbose. We also let the user know when
something goes wrong that we are trying out a fallback method instead.
See: https://bugs.torproject.org/25120
Included crypto_rsa.[ch] into include.am in order to resolve a compiling issue.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
crypto_get_rsa_padding_overhead() and crypto_get_rsa_padding() are
not static inline anymore in order to split the crypto_rsa module
from crypto.[ch].
Also included necessary modules in order to solve dependency issues.
Also made two functions in crypto.c use crypto_pk_asn1_encdoe()
instead of reaching into the crypto_pk_t struct.
This reverts commit 9a06282546.
It appears that I misunderstood how the seccomp2 filter rules
interact. It appears that `SCMP_ACT_ERRNO()` always takes
precedence over `SCMP_ACT_ALLOW()` -- I had thought instead that
earlier rules would override later ones. But this change caused bug
25115 (not in any released Tor).
The accurate address of a connection is real_addr, not the addr member.
channel_tls_get_remote_addr_method() now returns real_addr instead.
Fixes#24952; bugfix on 707c1e2 in 0.2.4.11-alpha.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Tor preemptiely builds circuits and they can be cannibalized later in their
lifetime. A Guard node can become unusable (from our guard state) but we can
still have circuits using that node opened. It is important to not pick those
circuits for any usage through the cannibalization process.
Fixes#24469
Signed-off-by: David Goulet <dgoulet@torproject.org>
These are no longer meaningful, since there's no longer an upper
limit to how many times (in the exponential-backoff world) one can
retry a download. download_status_is_ready() didn't check these any
more, and neither do we.
In 0.3.2.1-alpha, we've added notify_networkstatus_changed() in order to have
a way to notify other subsystems that the consensus just changed. The old and
new consensus are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
This commit changes the notify_networkstatus_changed() to be the "before"
function and adds an "after" notification from which the scheduler subsystem
is notified.
Fixes#24975
When we stopped looking at the "protocols" variable directly, we
broke the hs_service/build_update_descriptors test, since it didn't
actually update any of the flags.
The fix here is to call summarize_protover_flags() from that test,
and to expose summarize_protover_flags() as "STATIC" from
routerparse.c.
This is the quick fix that is keeping the channel in PENDING state so if we
ever try to reschedule the same channel, it won't happened.
Fixes#24700
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible in normal circumstances that a client fetches a descriptor
that has a lower revision counter than the one in its cache. This can happen
due to HSDir desync.
Fixes#24976
Signed-off-by: David Goulet <dgoulet@torproject.org>
Setting the default for this at 10 and the learning timeout to 3 minutes means
we will complete our cbt learning in 30 minutes, which is under the reduced
padding connection timeout window.
In 0.3.2.1-alpha, we've added this function in order to have a way to notify
other subsystems that the consensus just changed. The old consensus and the
new one are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
With this commit, notify_networkstatus_changed() has been moved _after_ the
new consensus is set globally. The main obvious reasons is to fix the bug
described above and in #24975. The other reason is that this notify function
doesn't return anything which could be allowing the possibility of refusing to
set the new consensus on error. In other words, the new consensus is set right
after the notification whatever happens.
It does no harm or change in behavior to set the new consensus first and then
notify the subsystems. The two functions currently used are for the control
port using the old and new consensus and sending the diff. The second is the
scheduler that needs the new consensus to be set globally before being called.
Of course, the function has been documented accordinly to clearly state it is
done _after_ the new consensus is set.
Fixes#24975
Signed-off-by: David Goulet <dgoulet@torproject.org>
Stop adding unneeded channel padding right after we finish flushing
to a connection that has been trying to flush for many seconds.
Instead, treat all partial or complete flushes as activity on the
channel, which will defer the time until we need to add padding.
This fix should resolve confusing and scary log messages like
"Channel padding timeout scheduled 221453ms in the past."
Fixes bug 22212; bugfix on 0.3.1.1-alpha.
I think technically we could resolve bug 22212 by adding a call to
channel_timestamp_active() only in the finished_flushing case. But I added
a call in the flushed_some case too since that seems to more accurately
reflect the notion of "active".
Using get_uptime() and reset_uptime() instead of
accessing stats_n_seconds_working directly.
stats_n_seconds_working is not extern anymore.
Ticket #25081
Because this touches too many commits at once, it is made into one single
commit.
Remove the use of "tenths" for the circuit rate to simplify things. We can
only refill the buckets at best once every second because of the use of
approx_time() and our token system is set to be 1 token = 1 circuit so make
the rate a flat integer of circuit per second.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Imagine this scenario. We had 10 connections over the 24h lifetime of a geoip
cache entry. The lifetime of the entry has been reached so it is about to get
freed but 2 connections remain for it. After the free, a third connection
comes in thus making us create a new geoip entry for that address matching the
2 previous ones that are still alive. If they end up being closed, we'll have
a concurrent count desynch from what the reality is.
To mitigate this probably very rare scenario in practice, when we free a geoip
entry and it has a concurrent count above 0, we'll go over all connections
matching the address and clear out the tracked flag. So once they are closed,
we don't try to decrement the count.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This option refuses any ESTABLISH_RENDEZVOUS cell arriving from a client
connection. Its default value is "auto" for which we can turn it on or off
with a consensus parameter. Default value is 0.
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the client address was detected as malicious, apply a defense which is at
this commit to return a DESTROY cell.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function that notifies the DoS subsystem that a new CREATE cell has
arrived. The statistics are updated accordingly and the IP address can also be
marked as malicious if it is above threshold.
At this commit, no defense is applied, just detection with a circuit creation
token bucket system.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Implement a basic connection tracking that counts the number of concurrent
connections when they open and close.
This commit also adds the circuit creation mitigation data structure that will
be needed at later commit to keep track of the circuit rate.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit introduces the src/or/dos.{c|h} files that contains the code for
the Denial of Service mitigation subsystem. It currently contains basic
functions to initialize and free the subsystem. They are used at this commit.
The torrc options and consensus parameters are defined at this commit and
getters are implemented.
Signed-off-by: David Goulet <dgoulet@torproject.org>
And fix the unsupported protover example so it uses a Link protover much
higher than 5.
Part of #25070, bugfix on 0.3.3.1-alpha, which introduced the protover crate.
I'm leaving the getsockname code in transproxy alone, since it is
comparatively isolated, rather platform-specific, and hard to test.
Implements 18105.
Add two new files (crypto_rsa.c, crypto_rsa.h) as new module of crypto.[ch].
This new module includes all functions and dependencies related to RSA
operations. Those have been removed from crypto.[ch].
All new changes related to RSA operations must be done in these files.
Follows #24658
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
The upcoming DoS mitigation subsytem needs to keep information on a per-IP
basis which is also what the geoip clientmap does.
For another subsystem to access that clientmap, this commit adds a lookup
function that returns the entry. For this, the clientmap_entry_t had to be
moved to the header file.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We'd been using crypto_digest_dup() and crypto_digest_assign() here,
but they aren't necessary. Instead we can just use the stack to
store the previous state of the SHA_CTX and avoid a malloc/free pair.
Closes ticket 24914.
In order to make the OR and dir checking functions in router.c less confusing
we renamed some functions and splitted consider_testing_reachability() into
router_should_check_reachability() and router_do_reachability_checks(). Also we
improved the documentation.
Fixes#18918.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Previously, we wouldn't do this when running with a routerinfo_t in
some cases, leading to many needless calls to the protover module.
This change also cleans up the code in nodelist.c a bit.
Fixes bug 25008; bugfix on 0.2.9.4-alpha.
Without this patch, not only will TLS1.3 not work with Tor, but
OpenSSL 1.1.1 with TLS1.3 enabled won't build any connections at
all: It requires that either TLS1.3 be disabled, or some TLS1.3
ciphersuites be listed.
Closes ticket 24978.
As we're trying not to have all the other modules in Tor, we moved the openssl
namespace includes back into crypto.c and crypto_openssl_mgt.c files.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Included crypto_openssl_mgt.[ch] into the appropiate files in order to resolve
compiling and dependencies issues.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Renamed free_openssl() to crypto_openssl_free_all(). Also we made variables and
functions static again.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
For 23847, we want Tor to be able to shut down and then restart in
the same process. Here's a patch to make the Tor binary do that.
To test it, you need to build with --enable-restart-debugging, and
then you need to set the environment variable TOR_DEBUG_RESTART.
With this option, Tor will then run for 5 seconds, then restart
itself in-process without exiting. This only happens once.
You can change the 5-second interval using
TOR_DEBUG_RESTART_AFTER_SECONDS.
Implements ticket 24583.
Fix an "off by 2" error in counting rendezvous failures on the onion
service side.
While we thought we would stop the rendezvous attempt after one failed
circuit, we were actually making three circuit attempts before giving up.
Fixes bug 24895; bugfix on 0.0.6.
Rename crypto_openssl.[ch] to crypto_openssl_mgt.[ch] because it is possible we
need crypto_openssl.[ch] in the future.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Since helper_create_introduce1_cell() checks "cell" for nullness,
scan-build is concerned that test_introduce1_validation()
dereferences it without checking it. So, add a check.
Not backporting, since this is spurious, _and_ tests-only.
Fix a set of false positives where relays would consider connections
to other relays as being client-only connections (and thus e.g.
deserving different link padding schemes) if those relays fell out
of the consensus briefly.
Now we look only at the initial handshake and whether the connection
authenticated as a relay.
Fixes bug 24898; bugfix on 0.3.1.1-alpha.
New-style (v3) onion services now obey the "max rendezvous circuit
attempts" logic.
Previously they would make as many rendezvous circuit attempts as they
could fit in the MAX_REND_TIMEOUT second window before giving up.
Fixes bug 24894; bugfix on 0.3.2.1-alpha.
Define TOR_PRIuSZ as minGW compiler doesn't support zu format specifier for
size_t type.
Fixes#24861 on ac9eebd.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
... in get_interface_addresses_ioctl().
This pointer alignment issue exists on x86_64 macOS, but is unlikely to exist
elsewhere. (i386 macOS only requires 4-byte alignment, and other OSs have
8-byte ints.)
Fixes bug 24733; not in any released version of tor.
Add free_openssl() function to free the memory allocated for OpenSSL version
management variables. It is required since OpenSSL management has been isolated
from the crypto module.
Follows #24658.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Add two new files (crypto_openssl.c, crypto_openssl.h) as new module of
crypto.[ch]. This new module includes all functions and dependencies related
to OpenSSL management. Those have been removed from crypto.[ch].
All new changes related to OpenSSL management must be done in these files.
Follows #24658
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
If we tried to move a descriptor from routerlist->old_routers
back into the current routerlist, we were preparing a buffer with
format_iso_time() on ri->cert_expiration_time, and doing it preemptively
since router_add_to_routerlist() might free ri so we wouldn't be able
to get at it later in the function.
But if the descriptor we're moving doesn't have any ed signature, then
its cert will be marked to expire at TIME_MAX, and handing TIME_MAX
to format_iso_time() generates this log warning:
correct_tm(): Bug: gmtime(9223372036854775807) failed with error Value too large for defined data type: Rounding down to 2037
The fix is to preemptively remember the expiry time, but only prepare
the buffer if we know we're going to need it.
Bugfix on commit a1b0a0b9, which came about as part of a fix for bug
20020, and which is not yet in any released version of Tor (hence no
changes file).
Using this script:
sed -i.bak $'s|^,$|/* ===== */\\\n,|' src/or/fallback_dirs.inc
(Due to embedded newlines, this script only works in bash.)
And manually add a delimiter to the end of the header, and the start of
the fallback list.
This allows us to check that the code compiles, and the unit tests pass.
And it allows downstream users stem and atlas to adapt to the new format.
The upcoming fallback rebuild will automatically generate this new format.
Follow-up to 24725.
Using this script:
sed -i.bak $'s|^,$|/* extrainfo=0 */\\\n,|' src/or/fallback_dirs.inc
(Due to embedded newlines, this script only works in bash.)
This allows us to check that the code compiles, and the unit tests pass.
And it allows downstream users stem and atlas to adapt to the new format.
The upcoming fallback rebuild will automatically generate this new format,
with actual relay extrainfo cache flags.
Follow-up to 22759.
Using this script:
sed -i.bak $'s|^,$|/* nickname= */\\\n,|' src/or/fallback_dirs.inc
(Due to embedded newlines, this script only works in bash.)
This allows us to check that the code compiles, and the unit tests pass.
And it allows downstream users stem and atlas to adapt to the new format.
The upcoming fallback rebuild will automatically generate this new format,
with actual relay nicknames.
Follow-up to 24600.
Using this script:
sed -i.bak 's/" weight=10",/,/' src/or/fallback_dirs.inc
This allows us to check that the code compiles, and the unit tests pass.
And it allows downstream users stem and atlas to adapt to the new format.
The upcoming fallback rebuild will automatically generate this new format.
Follow-up to 24679.
The code had been using c_char and i8 interchangeably, but it turns
out that c_char is only i8 on platforms where "char" is signed. On
other platforms, c_char is u8.
Fixes bug 24794; bug not on any released Tor.
These are all about local variables shadowing global
functions. That isn't normally a problem, but at least one
compiler we care about seems to treat this as a case of -Wshadow
violation, so let's fix it.
Fixes bug 24634; bugfix on 0.3.2.1-alpha.