In KIST, we could have a small congestion window value than the unacked
packets leading to a integer overflow which leaves the tcp_space value to be
humongous.
This has no security implications but it results in KIST scheduler allowing to
send cells on a potentially saturated connection.
Found by #24423. Fixes#24590.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, circuit_stream_is_being_handled incorrectly reported
that (1) an exit port was "handled" by a circuit regardless of
whether the circuit was already isolated in some way, and
(2) that a stream could be "handled" by a circuit even if their
isolation settings were incompatible.
As a result of (1), in Tor Browser, circuit_get_unhandled_ports was
reporting that all ports were handled even though all non-internal
circuits had already been isolated by a SOCKS username+password.
Therefore, circuit_predict_and_launch_new was declining to launch
new exit circuits. Then, when the user visited a new site in Tor
Browser, a stream with new SOCKS credentials would be initiated,
and the stream would have to wait while a new circuit with those
credentials could be built. That wait was making the
time-to-first-byte longer than it needed to be.
Now, clean, not-yet-isolated circuit(s) will be automatically
launched ahead of time and be ready for use whenever a new stream
with new SOCKS credentials (or other isolation criteria) is
initiated.
Fixes bug 18859. Thanks to Nick Mathewson for improvements.
Making errno error log more useful for getrandom() call. Adding if statement to
make difference between ENOSYS and other errors.
Fixes#24500
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
First, hs_service_intro_circ_has_closed() is now called in circuit_mark_for
close() because the HS subsystem needs to learn when an intro point is
actually not established anymore as soon as possible. There is a time window
between a close and a free.
Second, when we mark for close, we also remove it from the circuitmap because
between the close and the free, a service can launch an new circuit to that
same intro point and thus register it which only succeeds if the intro point
authentication key is not already in the map.
However, we still do a remove from the circuitmap in circuit_free() in order
to also cleanup the circuit if it wasn't marked for close prior to the free.
Fixes#23603
Signed-off-by: David Goulet <dgoulet@torproject.org>
In the KIST main loop, if the channel happens to be not opened, set its state
to IDLE so we can release it properly later on. Prior to this fix, the channel
was in PENDING state, removed from the channel pending list and then kept in
that state because it is not opened.
This bug was introduced in commit dcabf801e5 for
which we made the scheduler loop not consider unopened channel.
This has no consequences on tor except for an annoying but harmless BUG()
warning.
Fixes#24502
Signed-off-by: David Goulet <dgoulet@torproject.org>
Some platforms don't have good monotonic time support so don't warn when the
diff between the last run of the scheduler time and now is negative. The
scheduler recovers properly from this so no need to be noisy.
Fixes#23696
Signed-off-by: David Goulet <dgoulet@torproject.org>
Fortunately, use_cached_ipv4_answers was already 0, so we wouldn't
actually use this info, but it's best not to have it.
Fixes bug 24050; bugfix on 0.2.6.3-alpha
TROVE-2017-12. Severity: Medium
When choosing a random node for a circuit, directly use our router
descriptor to exclude ourself instead of the one in the global
descriptor list. That list could be empty because tor could be
downloading them which could lead to not excluding ourself.
Closes#21534
TROVE-2017-12. Severity: Medium
Thankfully, tor will close any circuits that we try to extend to
ourselves so this is not problematic but annoying.
Part of #21534.
TROVE-2017-13. Severity: High.
In the unlikely case that a hidden service could be missing intro circuit(s),
that it didn't have enough directory information to open new circuits and that
an intro point was about to expire, a use-after-free is possible because of
the intro point object being both in the retry list and expiring list at the
same time.
The intro object would get freed after the circuit failed to open and then
access a second time when cleaned up from the expiring list.
Fixes#24313
Going from 4 hours to 24 hours in order to try reduce the efficiency of guard
discovery attacks.
Closes#23856
Signed-off-by: David Goulet <dgoulet@torproject.org>
Stop checking for bridge descriptors when we actually want to know if
any bridges are usable. This avoids potential bootstrapping issues.
Fixes bug 24367; bugfix on 0.2.0.3-alpha.
Stop stalling when bridges are changed at runtime. Stop stalling when
old bridge descriptors are cached, but they are not in use.
Fixes bug 24367; bugfix on 23347 in 0.3.2.1-alpha.
Previously, if store_multiple() reported a partial success, we would
store all the handles it gave us as if they had succeeded. But it's
possible for the diff to be only partially successful -- for
example, if LZMA failed but the other compressors succeeded.
Fixes bug 24086; bugfix on 0.3.1.1-alpha.
If we can't read a file because of an FS issue, we say "we can't
read that" and move on. But if we can't read it because it's empty,
because it has no labels, or because its labels are misformatted, we
should remove it.
Fixes bug 24099; bugfix on 0.3.1.1-alpha.
A circuit with purpose C_INTRODUCING means that its state is opened but the
INTRODUCE1 cell hasn't been sent yet. We shouldn't consider that circuit when
looking for timing out "building circuit". We have to wait on the rendezvous
circuit to be opened before sending that cell so the intro circuit needs to be
kept alive for at least that period of time.
This patch makes that the purpose C_INTRODUCING is ignored in the
circuit_expire_building() which means that we let the circuit idle timeout
take care of it if we end up never using it.
Fixes#23681
Signed-off-by: David Goulet <dgoulet@torproject.org>
We don't want to allow general signals to be sent, but there's no
problem sending a kill(0) to probe whether a process is there.
Fixes bug 24198; bugfix on 0.2.5.1-alpha when the seccomp2 sandbox
was introduced.
When we close a connection via connection_close_immediately, we kill
its events immediately. But if it had been blocked on bandwidth
read/write, we could try to re-add its (nonexistent) events later
from connection_bucket_refill -- if we got to that callback before
we swept the marked connections.
Fixes bug 24167. Fortunately, this hasn't been a crash bug since we
introduced connection_check_event in 0.2.9.10, and backported it.
This is a bugfix on commit 89d422914a, I believe, which
appeared in Tor 0.1.0.1-rc.
Commit 56c5e282a7 suppressed that same log
statement in directory_info_has_arrived() for microdescriptors so do the same
for the descriptors. As the commit says, we already have the bootstrap
progress for this.
Fixes#23861
Signed-off-by: David Goulet <dgoulet@torproject.org>
evdns is allowed to give us unrecognized object types; it is allowed
to give us non-IPv4 answer types, and it is (even) allowed to give
us empty answers without an error.
Closes ticket 24097.
Due to #23662 this can happen under natural causes and does not disturb
the functionality of the service. This is a simple 0.3.2 fix for now,
and we plan to fix this properly in 0.3.3.
This function -- a mock replacement used only for fuzzing -- would
have a buffer overflow if it got an RSA key whose modulus was under
20 bytes long.
Fortunately, Tor itself does not appear to have a bug here.
Fixes bug 24247; bugfix on 0.3.0.3-alpha when fuzzing was
introduced. Found by OSS-Fuzz; this is OSS-Fuzz issue 4177.
On failure to upload, the HS_DESC event would report "UPLOAD_FAILED" as the
Action but it should have reported "FAILED" according to the spec.
Fixes#24230
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we have fewer than 15 descriptors to fetch, we will delay the
fetch for a little while. That's fine, if we can go ahead and build
circuits... but if not, it's a poor choice indeed.
Fixes bug 23985; bugfix on 0.1.1.11-alpha.
In 0.3.0.3-alpha, when we made primary guard descriptors necessary
for circuit building, this situation got worse.
When calculating the fraction of nodes that have descriptors, and all
all nodes in the network have zero bandwidths, count the number of nodes
instead.
Fixes bug 23318; bugfix on 0.2.4.10-alpha.
Back in 0.2.4.3-alpha (e106812a77), when we switched from using
double to using uint64 for selecting by bandwidth, I got the math
wrong: I should have used llround(x), or (uint64_t)(x+0.5), but
instead I wrote llround(x+0.5). That means we would always round
up, rather than rounding to the closest integer
Fixes bug 23318; bugfix on 0.2.4.3-alpha.
The flush cells process can close a channel if the connection write fails but
still return that it flushed at least one cell. This is due because the error
is not propagated up the call stack so there is no way of knowing if the flush
actually was successful or not.
Because this would require an important refactoring touching multiple
subsystems, this patch is a bandaid to avoid the KIST scheduler to handle
closed channel in its loop.
Bandaid on #23751.
Signed-off-by: David Goulet <dgoulet@torproject.org>
If it decrypts something that turns out to start with a NUL byte,
then decrypt_desc_layer() will return 0 to indicate the length of
its result. But 0 also indicates an error, which causes the result
not to be freed by decrypt_desc_layer()'s callers.
Since we're trying to stabilize 0.3.2.x, I've opted for the simpler
possible fix here and made it so that an empty decrypted string will
also count as an error.
Fixes bug 24150 and OSS-Fuzz issue 3994.
The original bug was present but unreachable in 0.3.1.1-alpha. I'm
calling this a bugfix on 0.3.2.1-alpha since that's the first version
where you could actually try to decrypt these descriptors.
The node_get_ed25519_id() warning can actually be triggered by a relay flagged
with NoEdConsensus so instead of triggering a warning on all relays of the
network, downgrade it to protocol warning.
Fixes#24025
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a BUG() occurs, this macro will print extra information about the state
of the scheduler and the given channel if any. This will help us greatly to
fix future bugs in the scheduler especially when they occur rarely.
Fixes#23753
Signed-off-by: David Goulet <dgoulet@torproject.org>
When running "make test-network-all", test that IPv6-only clients can use
microdescriptors. IPv6-only microdescriptor client support was fixed in
tor 0.3.0.1-alpha.
Requires chutney master 61c28b9 or later.
Closes ticket 24109.
When the directory information changes, callback to the HS client subsystem so
it can check if any pending SOCKS connections are waiting for a descriptor. If
yes, attempt a refetch for those.
Fixes#23762
Signed-off-by: David Goulet <dgoulet@torproject.org>
The new decryption function performs no decryption, skips the salt,
and doesn't check the mac. This allows us to fuzz the
hs_descriptor.c code using unencrypted descriptor test, and exercise
more of the code.
Related to 21509.
If the intro point supports ed25519 link authentication, make sure we don't
have a zeroed key which would lead to a failure to extend to it.
We already check for an empty key if the intro point does not support it so
this makes the check on the key more consistent and symmetric.
Fixes#24002
Signed-off-by: David Goulet <dgoulet@torproject.org>
Turns out that when reloading a tor configured with hidden service(s), we
weren't copying all the needed information between the old service object to
the new one.
For instance, the desc_is_dirty timestamp wasn't which could lead to the
service uploading its desriptor much later than it would need to.
The replaycache wasn't also moved over and some intro point information as
well.
Fixes#23790
Signed-off-by: David Goulet <dgoulet@torproject.org>
Bridge relays can use it to add a "bridge-distribution-request" line
to their bridge descriptor, which tells BridgeDB how they'd like their
bridge address to be given out.
Implements tickets 18329.
Fixes bug 23908; bugfix on 0.3.1.6-rc when we made the keypin
failure message really long.
Backport from 0.3.2's 771fb7e7ba,
where arma said "get rid of the scary 256-byte-buf landmine".
Stop attempting to unconditionally mirror the tor repository in GitLab
CI. This prevented developers from enabling GitLab CI on master
because the "update" job would attempt to run, causing an unuseful CI
failure. Fixes bug 23755.
Skip test_config_include_no_permission() when running as root, because
it will get an unexpected success from config_get_lines_include().
This affects some continuous integration setups. Fixes bug 23758.
Add more explanation in doc/HACKING about how to read gcov output,
including a reference to the gcov documentation in the GCC manual.
Also add details about how our postprocessing scripts modify gcov
output.
When we added HTTPTunnelPort, the answer that we give when you try
to use your SOCKSPort as an HTTP proxy became wrong. Now we explain
that Tor sorta _is_ an HTTP proxy, but a SOCKSPort isn't.
I have left the status line the same, in case anything is depending
on it. I have removed the extra padding for Internet Explorer,
since the message is well over 512 bytes without it.
Fixes bug 23678; bugfix on 0.3.2.1-alpha.
Without this fix, changes from client to bridge don't trigger
transition_affects_workers(), so we would never have actually
initialized the cpuworkers.
Fixes bug 23693. Bugfix on 3bcdb26267 0.2.6.3-alpha, which
fixed bug 14901 in the general case, but not on the case where
public_server_mode() did not change.
Because our monotonic time interface doesn't play well with value set to 0,
always initialize to now() the scheduler_last_run at init() of the KIST
scheduler.
Fixes#23696
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a channel is scheduled and flush cells returns 0 that is no cells to
flush, we flag it back in waiting for cells so it doesn't get stuck in a
possible infinite loop.
It has been observed on moria1 where a closed channel end up in the scheduler
where the flush process returned 0 cells but it was ultimately kept in the
scheduling loop forever. We suspect that this is due to a more deeper problem
in tor where the channel_more_to_flush() is actually looking at the wrong
queue and was returning 1 for an empty channel thus putting the channel in the
"Case 4" of the scheduler which is to go back in pending state thus
re-considered at the next iteration.
This is a fix that allows the KIST scheduler to recover properly from a not
entirelly diagnosed problem in tor.
Fixes#23676
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we added single_conn_free_bytes(), we cleared the outbuf on a
connection without setting outbuf_flushlen() to 0. This could cause
an assertion failure later on in flush_buf().
Fixes bug 23690; bugfix on 0.2.6.1-alpha.
This caused a BUG log when we noticed that the circuit had no
channel. The likeliest culprit for exposing that behavior is
d769cab3e5, where we made circuit_mark_for_close() NULL out
the n_chan and p_chan fields of the circuit.
Fixes bug 8185; bugfix on 0.2.5.4-alpha, I think.
My current theory is that this is just a marked circuit that hasn't
closed yet, but let's gather more information in case that theory is
wrong.
Diagnostic for 8185.
This patch ensures that we return TOR_COMPRESS_BUFFER_FULL in case we
have a input bytes left to process, but are out of output buffer or in
case we need to finish where the compression implementation might need
to write an epilogue.
See: https://bugs.torproject.org/23551