We had omitted some checks for whether our vanguards (second layer
guards from proposal 333) overlapped or came from the same family.
Now make sure to pick each of them to be independent.
Fixes bug 40639; bugfix on 0.4.7.1-alpha.
Remove UPTIME_TO_GUARANTEE_STABLE, MTBF_TO_GUARANTEE_STABLE,
TIME_KNOWN_TO_GUARANTEE_FAMILIAR WFU_TO_GUARANTEE_GUARD and replace each
of them with a tunnable torrc option.
Related to #40652
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, `channelpadding_get_netflow_inactive_timeout_ms` would
crash with an assertion failure if `low_timeout` was greater than
`high_timeout`. That wasn't possible in practice because of checks
in `channelpadding_update_padding_for_channel`, but it's better not
to have a function whose correctness is this tricky to prove.
Fixes#40645. Bugfix on 0.3.1.1-alpha.
Note that with this commit, TRUNCATED cells won't be used anymore that
is client and relays won't emit them.
Fixes#40623
Signed-off-by: David Goulet <dgoulet@torproject.org>
This also incidently removes a use of uninitialized stack data from the
connection_or_set_ext_or_identifier() function.
Fixes#40648
Signed-off-by: David Goulet <dgoulet@torproject.org>
The escaped() function and its kin already wrap their output in
quotes: there's no reason to do so twice.
I am _NOT_ making a corresponding change in calls that make the same
mistake in controller-related functions, however, due to the risk of
a compatibility break. :(
Closes#22723.
Once the cpath is finalized, e2e encryption setup, transfer the ccontrol
from the rendezvous circuit to the cpath.
This allows the congestion control subsystem to properly function for
both upload and download side of onion services.
Closes#40586
Signed-off-by: David Goulet <dgoulet@torproject.org>
Introduces two new consensus parameter:
exit_dns_timeout: Number of seconds before libevent should consider
the DNS request a timeout.
exit_dns_num_attempts: Number of attempts that libeven should retry a
previously failing query before calling it a timeout.
Closes#40312
Signed-off-by: David Goulet <dgoulet@torproject.org>
Due to a possible Guard subsystem recursion, when the HS client gets
notified that the directory information has changed, it must run it in a
seperate mainloop event to avoid such issue.
See the ticket for more information on the recursion. This also fixes a
fatal assert.
Fixes#40579
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible to not have the descriptor anymore by the time the
rendezvous circuit opens. Don't BUG() on that.
Instead, when sending the INTRODUCE1 cell, make sure the descriptor we
have (or have just fetched) matches what we setup in the rendezvous
circuit.
If not, the circuit is closed and another one is opened for a retry.
Fixes#40576
Signed-off-by: David Goulet <dgoulet@torproject.org>
Prometheus needs unique labels and so this bug was causing an onion
service with multiple ports to have multiple "port=" label for the
metrics requiring a port label.
Fixes#40581
Signed-off-by: David Goulet <dgoulet@torproject.org>
Prometheus needs unique labels and so this bug was causing an onion
service with multiple ports to have multiple "port=" label for the
metrics requiring a port label.
Fixes#40581
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible that a scheduled channel ended up with 0 bytes in its
outbuf after the scheduling loop and having an outbuf table entry
indicating that we need to flush bytes on the wire after the loop.
This lead to attempt to write 0 bytes up to the TLS layer that would
prevent such action.
All in all, this fixes wasted CPU cycles on attempting to flush nothing.
Fixes#40548
Signed-off-by: David Goulet <dgoulet@torproject.org>
We removed HSIntro=3 and HSDir=1 that are v2 specific. Since 0.3.5.17,
we do not support introducing or being a directory for onion service v2.
Closes#40509
Signed-off-by: David Goulet <dgoulet@torproject.org>
Change it from "timeout" to "tor_timeout" in order to indicate that the
DNS timeout is one from tor's DNS threshold and not the DNS server
itself.
Fixes#40527
Signed-off-by: David Goulet <dgoulet@torproject.org>
Tor has configure libevent to attempt up to 3 times a DNS query for a
maximum of 5 seconds each. Once that 5 seconds has elapsed, it consider
the query "Timed Out" but tor only gets a timeout if all 3 attempts have
failed.
For example, using Unbound, it has a much higher threshold of timeout.
It is well defined in
https://www.nlnetlabs.nl/documentation/unbound/info-timeout/ and has
some complexity to it. But the gist is that if it times out, it will be
much more than 5 seconds.
And so the Tor DNS timeouts are more of a "UX issue" rather than a
"network issue". For this reason, we are removing this metric from the
overload general signal.
See https://gitlab.torproject.org/tpo/network-health/team/-/issues/139
for more information.
Fixes#40527
Signed-off-by: David Goulet <dgoulet@torproject.org>
This avoids performing and then freeing a lot of small mallocs() if
the hash line has too many elements.
Fixes one case of bug 40472; resolves OSS-Fuzz 38363. Bugfix on
0.3.1.1-alpha when the consdiff parsing code was introduced.
When a new consensus method is negotiated, these values will all get
replaced with "2038-01-01 00:00:00".
This change should be safe because:
* As of 0.2.9.11 / 0.3.0.7 / 0.3.1.1-alpha, Tor takes no action
about published_on times in the future.
* The only remaining parties relying on published_on values are (we
believe) relays running 0.3.5.x, which rely on the values in a NS
consensus to see whether their descriptors are out of date. But
this patch only changes microdesc consensuses.
* The latest Tor no longer looks at this field in consensuses.
Why make this change? In experiments, replacing these values with a
fixed value made the size of compressed consensus diffs much much
smaller. (Like, by over 50%!)
Implements proposal 275; Implements #40130.
The connect failure cache had a bad interaction with retrying connections
to our guards or bridges when we go offline and then come back online --
while offline we would fail to connect and cache this result, and then
when we return we would decline to even attempt to connect, because our
failure cache said it wouldn't work.
Now only cache connect failures for relays when we connected to them
because of somebody else's EXTEND request.
Fixes bug 40499; bugfix on 0.3.3.4-alpha.
From LibreSSL versions 3.2.1 through 3.4.0, our configure script
would conclude that TLSv1.3 as supported, but it actually wasn't.
This led to annoying breakage like #40128 and #40445.
Now we give an error message if we try to build with one of those
versions.
Closes#40511.
Previously the logic was reversed, and always gave the wrong answer.
This has no other effect than to change whether we suppress
deprecated API warnings.
Fixes#40429; bugfix on 0.3.5.13.
Mingw headers sometimes like to define alternative scanf/printf
format attributes depending on whether they're using clang, UCRT,
MINGW_ANSI_STDIO, or the microsoft version of printf/scanf. This
change attempts to use the right one on the given platform.
This is an attempt to fix part of #40355.