Because the test is adding entries to the "rend_cache" directly, the
rend_cache_increment_allocation() was never called which made the
rend_cache_clean() call trigger that underflow warning:
rend_cache/clean: [forking] Nov 29 09:55:04.024 [warn] rend_cache_decrement_allocation(): Bug: Underflow in rend_cache_decrement_allocation (on Tor 0.4.0.0-alpha-dev 2240fe63feb9a8cf)
The test is still good and valid.
Fixes#28660
Signed-off-by: David Goulet <dgoulet@torproject.org>
This patch adds two new functions: buf_flush_to_pipe() and
buf_read_from_pipe(), which makes use of our new buf_flush_to_fd() and
buf_read_from_fd() functions.
See: https://bugs.torproject.org/28179
This patch refactors buf_read_from_socket() into buf_read_from_fd(), and
creates a specialized function for buf_read_from_socket(), which uses
buf_read_from_fd().
See: https://bugs.torproject.org/28179
This patch refactors buf_flush_to_socket() into buf_flush_to_fd() and
creates a specialization function for buf_flush_to_socket() that makes
use of buf_flush_to_fd().
See: https://bugs.torproject.org/28179
The DormantClientTimeout option controls how long Tor will wait before
going dormant. It also provides a way to disable the feature by setting
DormantClientTimeout to e.g. "50 years".
The DormantTimeoutDisabledByIdleStreams option controls whether open but
inactive streams count as "client activity". To implement it, I had to
make it so that reading or writing on a client stream *always* counts as
activity.
Closes ticket 28429.
Specifically, if the consensus is older than the (estimted or
measured) release date for this version of tor, we assume that the
required versions may have changed in between that consensus and
this release.
Implements ticket 27735 and proposal 297.
This patch has routers use the same canonicalization logic as
authorities when encoding their family lists. Additionally, they
now warn if any router in their list is given by nickname, since
that's error-prone.
This patch also adds some long-overdue tests for family formatting.
Prop298 says that family entries should be formatted with
$hexids in uppercase, nicknames in lower case, $hexid~names
truncated, and everything sorted lexically. These changes implement
that ordering for nodefamily.c.
We don't _strictly speaking_ need to nodefamily.c formatting use
this for prop298 microdesc generation, but it seems silly to have
two separate canonicalization algorithms.
After we clear the protover map for getting full, we need to
re-create it, since we are about to use it.
This is a bugfix for bug 28558. It is a bugfix for the code from
ticket 27225, which is not in any released Tor. Found by Google
OSS-Fuzz, as issue 11475.
To succesful compile tor-print-ed-signing-cert.exe on Windows we
sometimes need to include the @TOR_LIB_GDI@ library.
See: https://bugs.torproject.org/28485
This representation is meant to save memory in microdescriptors --
we can't use it in routerinfo_t yet, since those families need to be
encoded losslessly for directory voting to work.
This representation saves memory in three ways:
1. It uses only one allocation per family. (The old way used a
smartlist (2 allocs) plus one strdup per entry.)
2. It stores identity digests in binary, not hex.
3. It keeps families in a canonical format, memoizes, and
reference-counts them.
Part of #27359.
This event makes us become dormant if we have seen no activity in a
long time.
Note that being any kind of a server, or running an onion service,
always counts as being active.
Note that right now, just having an open stream that Tor
did not open on its own (for a directory request) counts as "being
active", so if you have an idle ssh connection, that will keep Tor
from becoming dormant.
Many of the features here should become configurable; I'd like
feedback on which.
This is part of 28422, so we don't have to call
consider_hibernation() once per second when we're dormant.
This commit does not remove delayed shutdown from hibernate.c: it
uses it as a backup shutdown mechanism, in case the regular shutdown
timer mechanism fails for some reason.
The previous "ALL" role was the OR of a bunch of other roles,
which is a mistake: it's better if "ALL" means "all".
The "NET_PARTICIPANT" role refers to the anything that is actively
building circuits, downloading directory information, and
participating in the Tor network. For now, it is set to
!net_is_disabled(), but we're going to use it to implement a new
"extra dormant mode".
Closes ticket 28336.
Correctly identify Windows 8.1, Windows 10, and Windows Server 2008
and later from their NT versions.
On recent Windows versions, the GetVersionEx() function may report
an earlier Windows version than the running OS. To avoid user
confusion, add "[or later]" to Tor's version string on affected
versions of Windows.
Remove Windows versions that were never supported by the
GetVersionEx() function.
Stop duplicating the latest Windows version in get_uname().
Fixes bug 28096; bugfix on 0.2.2.34; reported by Keifer Bly.
In conn_close_if_marked(), we can decide to keep a connection open that still
has data to flush on the wire if it is being rate limited on the write side.
However, in this process, we were also looking at the read() side which can
still have token in its bucket and thus not stop the reading. This lead to a
BUG() introduced in 0.3.4.1-alpha that was expecting the read side to be
closed due to the rate limit but which only applies on the write side.
This commit removes any bandwidth check on the read side and simply stop the
read side on the connection regardless of the bucket state. If we keep the
connection open to flush it out before close, we should not read anything.
Fixes#27750
Signed-off-by: David Goulet <dgoulet@torproject.org>
Apparently some freebsd compilers can't tell that 'c' will never
be used uninitialized.
Fixes bug 28413; bugfix on 0.2.9.3-alpha when we added support for
longer AES keys to this function.
We don't use this syscall, but openssl apparently does.
(This syscall puts a socket into a half-closed state. Don't worry:
It doesn't shut down the system or anything.)
Fixes bug 28183; bugfix on 0.2.5.1-alpha where the sandbox was
introduced.
Apparently, even though the manpage says it returns an int, it
can return a long instead and cause a warning.
Bug not in any released Tor. Part of #28399
Our tests showed that this function is responsible for a huge number
of our malloc/free() calls. It's a prime candidate for being
memoized.
Closes ticket 27225.
Resume refusing to start with relative file paths and RunAsDaemon
set (regression from the fix for bug 22731).
Fixes bug 28298; bugfix on 0.3.3.1-alpha.
If tor terminates due to SIGNAL HALT before test_rebind.py calls
tor_process.terminate(), an OSError 3 (no such process) is thrown.
Fixes part of bug 27968 on 0.3.5.1-alpha.
Remember, you can't check to see if there are N bytes left in a
buffer by doing (buf + N < end), since the buf + N computation might
take you off the end of the buffer and result in undefined behavior.
Fixes 28202; bugfix on 0.2.0.3-alpha.
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
With the new refresh_service_descriptor() function we had both
refresh_service_descriptor() and update_service_descriptor() which is basically
the same thing.
This commit renames update_service_descriptor() to
update_service_descriptor_intro_points() to make it clear it's not a generic
refresh and it's only about intro points.
Commit changes no code.
Treat backtrace test failures as expected on NetBSD, OpenBSD, and
macOS/Darwin, until we solve bug 17808.
(FreeBSD failures have been treated as expected since 18204 in 0.2.8.)
Fixes bug 27948; bugfix on 0.2.5.2-alpha.
Before this commit, we would create the descriptor signing key certificate
when first building the descriptor.
In some extreme cases, it lead to the expiry of the certificate which triggers
a BUG() when encoding the descriptor before uploading.
Ticket #27838 details a possible scenario in which this can happen. It is an
edge case where tor losts internet connectivity, notices it and closes all
circuits. When it came back up, the HS subsystem noticed that it had no
introduction circuits, created them and tried to upload the descriptor.
However, in the meantime, if tor did lack a live consensus because it is
currently seeking to download one, we would consider that we don't need to
rotate the descriptors leading to using the expired signing key certificate.
That being said, this commit does a bit more to make this process cleaner.
There are a series of things that we need to "refresh" before uploading a
descriptor: signing key cert, intro points and revision counter.
A refresh function is added to deal with all mutable descriptor fields. It in
turn simplified a bit the code surrounding the creation of the plaintext data.
We keep creating the cert when building the descriptor in order to accomodate
the unit tests. However, it is replaced every single time the descriptor is
uploaded.
Fixes#27838
Signed-off-by: David Goulet <dgoulet@torproject.org>
When storing a descriptor in the client cache, if we are about to replace an
existing descriptor, make sure to close every introduction circuits of the old
descriptor so we don't have leftovers lying around.
Ticket 27471 describes a situation where tor is sending an INTRODUCE1 cell on
an introduction circuit for which it doesn't have a matching intro point
object (taken from the descriptor).
The main theory is that, after a new descriptor showed up, the introduction
points changed which led to selecting an introduction circuit not used by the
service anymore thus for which we are unable to find the corresponding
introduction point within the descriptor we just fetched.
Closes#27471.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It won't be used if there are no authorized client configured. We do that so
we can easily support the addition of a client with a HUP signal which allow
us to avoid more complex code path to generate that cookie if we have at least
one client auth and we had none before.
Fixes#27995
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both client and service had their own code for this. Consolidate into one
place so we avoid duplication.
Closes#27549
Signed-off-by: David Goulet <dgoulet@torproject.org>
Occasionally, key pinning doesn't catch a relay that shares an ed25519
ID with another relay. Log the identity fingerprints and the shared
ed25519 ID when this happens, instead of making a BUG() warning.
Fixes bug 27800; bugfix on 0.3.2.1-alpha.
Commit 488e2b00bf introduced an issue, most
likely introduced by a bad copy paste, that made us stop reading on the
connection if our write bandwidth limit was reached.
The problem is that because "read_blocked_on_bw" was never set, the connection
was never reenabled for reading.
This is most likely the cause of #27813 where bytes were accumulating in the
kernel TCP bufers because tor was not doing reads. Only relays with
RelayBandwidthRate would suffer from this but affecting all relays connecting
to them. And using that tor option is recommended and best practice so many
many relays have it enabled.
Fixes#28089.
It turns out that if _only_ the ControlPort is set and nothing else, tor would
simply not bootstrap and thus not start properly. Commit 67a41b6306
removed that requirement for tor to be considered a "client".
Unfortunately, this made the mainloop enable basically nothing if only the
ControlPort is set in the torrc.
This commit now makes it that we also consider the ControlPort when deciding
if we are a Client or not. It does not revert 67a41b6306 meaning
options_any_client_port_set() stays the same, not looking at the control port.
Fixes#27849.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Nothing should ever look at them on failure, but in some cases,
the unit tests don't check for failure, and then GCC-LTO freaks out.
Fixes part of 27772.
These confused GCC LTO, which thought they might be used
uninitialized. I'm pretty sure that as long as 'res' indicates
success, they will always be set to something, but let's unconfuse
the compiler in any case.
OpenSolaris apparently doesn't have timeradd(), so we added a
replacement, but we weren't including it here after the big
refactoring in 0.3.5.1-alpha.
Fixes bug 27963; bugfix on 0.3.5.1-alpha.
It looks to be the case that Rust's standard allocator, jemalloc, is
incompatible with sanitizers. The incompatibility, for whatever reason,
seems to cause segfaults at runtime when jemalloc is linked with
sanitizers.
Without actually trying to figure out what's going on here this commit
instead takes the hammer of "let's remove jemalloc when testing". The
`tor_allocate` crate now by default switches to the system allocator
(eventually this will want to be the tor allocator). Most crates then
link to `tor_allocate` ot pick this up, but the `smartlist` crate had to
manually switch to the system allocator in testing and the `external`
crate had to be sure to link to `tor_allocate`.
The final gotcha here is that this patch also switches to
unconditionally passing `--target` to Cargo. For weird and arcane
reasons passing `--target` with the host target of the compiler (which
Cargo otherwise uses as the default) is different than not passing
`--target` at all. This ensure that our custom `RUSTFLAGS` with
sanitizer options doesn't make its way into build scripts, just the
final testing artifacts.
This is no longer necessary with upstream rust-lang/rust changes as well
as some local tweaks. Namely:
* The `-fsanitize=address`-style options are now passed via `-C
link-args` through `RUSTFLAGS`. This obviates the need for the shell
script.
* The `-C default-linker-libraries`, disabling `-nodefaultlibs`, is
passed through `RUSTFLAGS`, which is necessary to ensure that
`-fsanitize=address` links correctly.
* The `-C linker` option is passed to ensure we're using the same C
compiler as normal C code, although it has a bit of hackery to only
get the `gcc` out of `gcc -std=c99`
Allowing this didn't do any actual harm, since there aren't any
shared structures or leakable objects here. Still, it's bad style
and might cause trouble in the future.
Closes ticket 27856.
Various places in our code try to activate these events or check
their status, so we should make sure they're initialized as early as
possible. Fixes bug 27861; bugfix on 0.3.5.1-alpha.
When freeing a configuration object from confparse.c in
dump_config(), we need to call the appropriate higher-level free
function (like or_options_free()) and not just config_free().
This only happens with options (since they're the one where
options_validate allocates extra stuff) and only when running
--dump-config with something other than minimal (since
OPTIONS_DUMP_MINIMAL doesn't hit this code).
Fixes bug 27893; bugfix on 0.3.2.1-alpha.
It differs from the rest of the rephist code in that it's actually
necessary for Tor to operate, so it should probably go somewhere
else. I'm not sure where yet, so I'll leave it in the same
directory, but give it its own file.
Since this is completely core functionality, I'm putting it in
core/mainloop, even though it depends on feature/hibernate. We'll
have to sort that out in the future.
If a tor client gets a descriptor that it can't decrypt, chances are that the
onion requires client authorization.
If a tor client is configured with client authorization for an onion but
decryption fails, it means that the configured keys aren't working anymore.
In both cases, we'll log notice the former and log warn the latter and the
rest of the decryption errors are now at info level.
Two logs statement have been removed because it was redundant and printing the
fetched descriptor in the logs when 80% of it is encrypted wat not helping.
Fixes#27550
Signed-off-by: David Goulet <dgoulet@torproject.org>
The main.c code is responsible for initialization and shutdown;
the mainloop.c code is responsible for running the main loop of Tor.
Splitting the "generic event loop" part of mainloop.c from the
event-loop-specific part is not done as part of this patch.
The parts for handling cell formats should be in src/core/or.
The parts for handling onionskin queues should be in src/core/or.
Only the crypto wrapper belongs in src/core/crypto.
When sending the INTRODUCE1 cell, we acquire the needed data for the cell but
if the RP node_t has invalid data, we'll fail the send and completely kill the
SOCKS connection.
Instead, close the rendezvous circuit and return a transient error meaning
that Tor can recover by selecting a new rendezvous point. We'll also do the
same when we are unable to encode the INTRODUCE1 cell for which at that point,
we'll simply take another shot at a new rendezvous point.
Fixes#27774
Signed-off-by: David Goulet <dgoulet@torproject.org>
The result of CString::into_raw() is not safe to free
with free() except under finicky and fragile circumstances
that we definitely don't meet right now.
This was missed in be583a34a3.
When we close a socket via tor_tls_free(), we previously had no way
for our socket accounting logic to learn about it. This meant that
the socket accounting code would think we had run out of sockets,
and freak out.
Fixes bug 27795; bugfix on 0.3.5.1-alpha.
In dirauth:
* bwauth.c reads and uses bandwidth files
* guardfraction.c reads and uses the guardfraction file
* reachability.c tests relay reachability
* recommend_pkg.c handles the recommended-packages lines.
* recv_descs.c handles fingerprint files and processing incoming
routerinfos that relays upload to us
* voteflag.c computes flag thresholds and sets those thresholds on
routerstatuses when computing votes
In control:
* fmt_serverstatus.c generates the ancient "v1 server status"
format that controllers expect.
In nodelist:
* routerstatus_fmt.c formats routerstatus entries for a consensus,
a vote, or for the controller.
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is not enough to look at protover for v3 rendezvous support but also we
need to make sure that the curve25519 onion key is present or in other words
that the descriptor has been fetched and does contain it.
Fixes#27797.
Signed-off-by: David Goulet <dgoulet@torproject.org>
There are now separate modules for:
* the list of router descriptors
* the list of authorities and fallbacks
* managing authority certificates
* selecting random nodes
That unit test makes sure we don't have pending SOCK request if the descriptor
turns out to be unusable.
Part of #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Client side, when a descriptor is finally fetched and stored in the cache, we
then go over all pending SOCKS request for that descriptor. If it turns out
that the intro points are unusable, we close the first SOCKS request but not
the others for the same .onion.
This commit makes it that we'll close all SOCKS requests so we don't let
hanging the other ones.
It also fixes another bug which is having a SOCKS connection in RENDDESC_WAIT
state but with a descriptor in the cache. At some point, tor will expire the
intro failure cache which will make that descriptor usable again. When
retrying all SOCKS connection (retry_all_socks_conn_waiting_for_desc()), we
won't end up in the code path where we have already the descriptor for a
pending request causing a BUG().
Bottom line is that we should never have pending requests (waiting for a
descriptor) with that descriptor in the cache (even if unusable).
Fixees #27410.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The trunnel functions are written under the assumption that their
allocators can fail, so GCC LTO thinks they might return NULL. In
point of fact, they're using tor_malloc() and friends, which can't
fail, but GCC won't necessarily figure that out.
Fixes part of #27772.
Instead, have it call a mockable function. We don't want
crypto_strongest_rand() to be mockable, since doing so creates a
type error when we call it from ed25519-donna, which we do not build
in a test mode.
Fixes bug 27728; bugfix on 0.3.5.1-alpha
This argument was added to match an older idea for the C api, but we
decided not to do it that way in C.
Fixes bug 27741; bugfix on 0.3.3.6 / TROVE-2018-005 fix.
Since the default cache directory is the same as the default data
directory, we don't want the default CacheDirectoryGroupReadable
value (0) to override an explicitly set "DataDirectoryGroupReadable
1".
To fix this, I'm making CacheDirectoryGroupReadable into an
autobool, and having the default (auto) value mean "Use the value of
DataDirectoryGroupReadable if the directories are the same, and 0
otherwise."
Fixes bug 26913; bugfix on 0.3.3.1-alpha when the CacheDirectory
option was introduced.
This shouldn't be a user-visible change: nobody has a 16 MB RSA
key that they're trying to use with Tor.
I'm doing this to fix CID 1439330 / ticket 27730, where coverity
complains (on 64-bit) that we are making a comparison that is never
true.
This patch moves the logic that adds the proxy headers to an earlier
point in the exit connection lifetime, which ensures that the
application data cannot be written to the outbuf before the proxy header
is added.
See: https://bugs.torproject.org/4700
This patch changes HiddenServiceExportCircuitID so instead of being a
boolean it takes a string, which is the protocol. Currently only the
'haproxy' protocol is defined.
See: https://bugs.torproject.org/4700
Without this patch we would encode the IPv6 address' last part as
::ffffffff instead of ::ffff:ffff when the GID is UINT32_MAX.
See: https://bugs.torproject.org/4700
In hs_config.c, we do validate the permission of the hidden service directory
but we do not try to create it. So, in the event that the directory doesn't
exists, we end up in the loading key code path which checks for the
permission and possibly creates the directory. On failure, don't BUG() since
there is a perfectly valid use case for that function to fail.
Fixes#27335
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is harder than with OpenSSL, since OpenSSL counts the bytes on
its own and NSS doesn't. To fix this, we need to define a new
PRFileDesc layer that has its own byte-counting support.
Closes ticket 27289.
On GCC and Clang, there's a feature to warn you about bad
conditionals like "if (a = b)", which should be "if (a == b)".
However, they don't warn you if there are extra parentheses around
"a = b".
Unfortunately, the tor_assert() macro and all of its kin have been
passing their inputs through stuff like PREDICT_UNLIKELY(expr) or
PREDICT_UNLIKELY(!(expr)), both of which expand to stuff with more
parentheses around "expr", thus suppressing these warnings.
To fix this, this patch introduces new macros that do not wrap
expr. They're only used when GCC or Clang is enabled (both define
__GNUC__), since they require GCC's "({statement expression})"
syntax extension. They're only used when we're building the
unit-test variant of the object files, since they suppress the
branch-prediction hints.
I've confirmed that tor_assert(), tor_assert_nonfatal(),
tor_assert_nonfatal_once(), BUG(), and IF_BUG_ONCE() all now give
compiler warnings when their argument is an assignment expression.
Fixes bug 27709.
Bugfix on 0.0.6, where we first introduced the "tor_assert()" macro.
.retain() would allocating a Vec of billions of integers and check them
one at a time to separate the supported versions from the unsupported.
This leads to a memory DoS.
Closes ticket 27206. Bugfix on e6625113c9.
Before 0.3.3.1-alpha, we would exit() in this case immediately. But
now that we leave tor_main() more conventionally, we need to make
sure we restore things so as not to cause a double free.
Fixes bug 27708; bugfix on 0.3.3.1-alpha.
Since we use a 32-bit approximation for millisecond conversion here,
we can't expect so much precision.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
Multiply-then-divide is more accurate, but it runs into trouble when
our input is above INT32_MAX/numerator. So when our value is too
large, do divide-then-multiply instead.
Fixes part of bug 27139; bugfix on 0.3.4.1-alpha.
We use an optimized but less accurate formula for converting coarse
time differences to milliseconds on 32-bit OSX platforms, so that we
can avoid 64-bit division.
The old numbers were off by 0.4%. The new numbers are off by .006%.
This should make the unit tests a bit cleaner, and our tolerances a
bit closer.
All node_get_all_orports() does is allocate and return a smartlist
with at most two tor_addr_port_t members that match ORPort's of
node configuration. This is harmful for memory efficiency, as it
allocates the same stuff every time it is called. However,
node_is_a_configured_bridge() does not need to call it, as it
already has all the information to check if there is configured
bridge for a given node.
The new code is arranged in a way that hopefully makes each succeeding
linear search through bridge_list less likely.
We determine that a cell was dropped by inspecting CIRC_BW fields. If we did
not update the delivered or overhead fields after processing the cell, the
cell was dropped/not processed.
Also emit CIRC_BW events for cases where we decide to close the circuit in
this function, so vanguards can print messages about dropped cells in those
cases, too.
This function tells the underlying TLS object that it shouldn't
close the fd on exit. Mostly, we hope not to have to use it, since
the NSS implementation is kludgey, but it should allow us to fix
It's possible for a unit test to report success via its pipe, but to
fail as it tries to clean up and exit. Notably, this happens on a
leak sanitizer failure.
Fixes bug 27658; bugfix on 0.2.2.4-alpha when tinytest was
introduced.
This commit makes it that the authorized clients in the descriptor are in
random order instead of ordered by how they were read on disk.
Fixes#27545
Signed-off-by: David Goulet <dgoulet@torproject.org>
Existing cached directory information can cause misleadingly high
bootstrap percentages. To improve user experience, defer reporting of
directory information progress until at least one connection has
succeeded to a relay or bridge.
Closes ticket 27169.