This adds 2 histogram metrics for hidden services:
* `tor_hs_rend_circ_build_time` - the rendezvous circuit build time in milliseconds
* `tor_hs_intro_circ_build_time` - the introduction circuit build time in milliseconds
The text representation representation of the new metrics looks like this:
```
# HELP tor_hs_rend_circ_build_time The rendezvous circuit build time in milliseconds
# TYPE tor_hs_rend_circ_build_time histogram
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="1000.00"} 2
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="5000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="10000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="30000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="60000.00"} 10
tor_hs_rend_circ_build_time_bucket{onion="<elided>",le="+Inf"} 10
tor_hs_rend_circ_build_time_sum{onion="<elided>"} 10824
tor_hs_rend_circ_build_time_count{onion="<elided>"} 10
# HELP tor_hs_intro_circ_build_time The introduction circuit build time in milliseconds
# TYPE tor_hs_intro_circ_build_time histogram
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="1000.00"} 0
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="5000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="10000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="30000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="60000.00"} 6
tor_hs_intro_circ_build_time_bucket{onion="<elided>",le="+Inf"} 6
tor_hs_intro_circ_build_time_sum{onion="<elided>"} 9843
tor_hs_intro_circ_build_time_count{onion="<elided>"} 6
```
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
This adds a `reason` label to the `hs_intro_rejected_intro_req_count` and
`hs_rdv_error_count` metrics introduced in #40755.
Metric look up and intialization is now more a bit more involved. This may be
fine for now, but it will become unwieldy if/when we add more labels (and as
such will need to be refactored).
Also, in the future, we may want to introduce finer grained `reason` labels.
For example, the `invalid_introduce2` label actually covers multiple types of
errors that can happen during the processing of an INTRODUCE2 cell (such as
cell parse errors, replays, decryption errors).
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
This introduces a couple of new service side metrics:
* `hs_intro_rejected_intro_req_count`, which counts the number of introduction
requests rejected by the hidden service
* `hs_rdv_error_count`, which counts the number of rendezvous errors as seen by
the hidden service (this number includes the number of circuit establishment
failures, failed retries, end-to-end circuit setup failures)
Closes#40755. This partially addresses #40717.
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
Directory authorities now include their AuthDirMaxServersPerAddr
config option in the consensus parameter section of their vote. Now
external tools can better predict how they will behave.
In particular, the value should make its way to the
https://consensus-health.torproject.org/#consensusparams page.
Once enough dir auths vote this param, they should also compute a
consensus value for it in the consensus document. Nothing uses this
consensus value yet, but we could imagine having dir auths consult it
in the future.
Implements ticket 40753.
This updates the docs to stop suggesting `pk` can be NULL, as that doesn't seem
to be the case anymore (`tor_assert(pk)`).
Signed-off-by: Gabriela Moldovan <gabi@torproject.org>
When I copied this to arti, I messed up and thought that the default
time period was 1440 seconds for some weird testing reason. That led
to confusion.
This commit adds a test case that time period 1440 is May 20, 1973:
now arti and c tor match!
Add new liblzma enums (LZMA_SEEK_NEEDED and LZMA_RET_INTERNAL*)
conditional to the API version they arrived in. The first stable
version of liblzma this affects is 5.4.0
Fixes#40741
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
Add new liblzma enums (LZMA_SEEK_NEEDED and LZMA_RET_INTERNAL*)
conditional to the API version they arrived in. The first stable
version of liblzma this affects is 5.4.0
Fixes#40741
Signed-off-by: Micah Elizabeth Scott <beth@torproject.org>
If a circuit only sends a tiny amount of data such that its cwnd is not
full, it won't increase its cwnd above the minimum. Since slow start circuits
should never hit the minimum otherwise, we can just ignore them for RTT reset
to handle this.
Since these are derived from the number of SENDMEs in a cwnd/cc update,
and a cwnd should not exceed ~10k, there's plenty of room in uint16_t
for them, even if the network gets significantly faster.
Also provides some stickiness, so that once full, the congestion window is
considered still full for the rest of an update cycle, or the entire
congestion window.
In this way, we avoid increasing the congestion window if it is not fully
utilized, but we can still back off in this case. This substantially reduces
queue use in Shadow.
Since these are derived from the number of SENDMEs in a cwnd/cc update,
and a cwnd should not exceed ~10k, there's plenty of room in uint16_t
for them, even if the network gets significantly faster.
Also provides some stickiness, so that once full, the congestion window is
considered still full for the rest of an update cycle, or the entire
congestion window.
In this way, we avoid increasing the congestion window if it is not fully
utilized, but we can still back off in this case. This substantially reduces
queue use in Shadow.
Having no TotalBuildTimes along a positive CircuitBuildAbandonedCount
count lead to a segfault. We check for that condition and then BUG + log
warn if that is the case.
It should never happened in theory but if someone modified their state
file, it can lead to this problem so instead of segfaulting, warn.
Fixes#40437
Signed-off-by: David Goulet <dgoulet@torproject.org>
The logic was inverted. Introduced in commit
9155e08450.
This was reported through our bug bounty program on H1. It fixes the
TROVE-2022-002.
Fixes#40730
Signed-off-by: David Goulet <dgoulet@torproject.org>
We need the fallbackdir file to be the same so our release CI can
generate a new list and apply it uniformly on all series.
(Same as geoip)
Signed-off-by: David Goulet <dgoulet@torproject.org>
We need all geoip files to be the same so our release CI can generate a
new list and apply it uniformly on all series.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Rotate the relay identity key and v3 identity key for moria1. They
have been online for more than a decade, there was a known potential
compromise, and anyway refreshing keys periodically is good practice.
Advertise new ports too, to avoid confusion.
Closes ticket 40722.
This change mitigates DNS-based website oracles by making the time that
a domain name is cached uncertain (+- 4 minutes of what's measurable).
Resolves TROVE-2021-009.
Fixes#40674
We cap our number of CPU worker threads to at least 2 even if we have a
single core. But also, before we used to always add one extra thread
regardless of the number of core.
This meant that we were off when re-using the get_num_cpus() function
when calculating our onionskin work overhead because we were always off
by one.
This commit makes it that we always use the number of thread our actual
thread pool was configured with.
Fixes#40719
Signed-off-by: David Goulet <dgoulet@torproject.org>
Cap this to 2 threads always because we need a low and high priority
thread even with a single core.
Fixes#40713
Signed-off-by: David Goulet <dgoulet@torproject.org>
Created and Rejected connections are ever going up counters. While
Opened connections are gauges going up and down.
Fixes#40712
Signed-off-by: David Goulet <dgoulet@torproject.org>
This change mitigates DNS-based website oracles by making the time that
a domain name is cached uncertain (+- 4 minutes of what's measurable).
Resolves TROVE-2021-009.
Fixes#40674
This is part of the fast path so we need to cache consensus parameters
instead of querying it everytime we need to learn a value.
Part of #40704
Signed-off-by: David Goulet <dgoulet@torproject.org>
Until now, there was this magic number (64) used as the maximum number
of tasks a CPU worker can take at once.
This commit makes it a consensus parameter so our future selves can
think of a better value depending on network conditions.
Part of #40704
Signed-off-by: David Goulet <dgoulet@torproject.org>
Transform the hardcoded value ONIONQUEUE_WAIT_CUTOFF into a consensus
parameter so we can control it network wide.
Closes#40704
Signed-off-by: David Goulet <dgoulet@torproject.org>
This also incidently removes a use of uninitialized stack data from the
connection_or_set_ext_or_identifier() function.
Fixes#40648
Signed-off-by: David Goulet <dgoulet@torproject.org>
* it also uses sys/param.h to track its version;
* present that to tor_libc_get_version_str() as libc version;
while here, we also fix the return of FreeBSD version
* __FreeBSD_version is the correct var tracking the OSVERSION
* we use OSVERSION here (defined by __FreeBSD__);
* it's part of the <sys/param.h> include;
* that tracks all noteworthy changes made to the base system.
* __BSD_VISIBLE is defined by systems like FreeBSD and OpenBSD;
* that also extends to DragonFlyBSD;
* it's used on stdlib.h and ctypes.h on those systems.
This BUG() was added when the code was written to see if this callback
was ever executed after we marked the handle as EOF. It turns out, it
does, but we handle it gracefully. We can therefore remove the BUG().
Fixes tpo/core/tor#40596.
Remove a harmless "Bug" log message that can happen in
relay_addr_learn_from_dirauth() on relays during startup:
tor_bug_occurred_(): Bug: ../src/feature/relay/relay_find_addr.c:225: relay_addr_learn_from_dirauth: Non-fatal assertion !(!ei) failed. (on Tor 0.4.7.10 )
Bug: Tor 0.4.7.10: Non-fatal assertion !(!ei) failed in relay_addr_learn_from_dirauth at ../src/feature/relay/relay_find_addr.c:225. Stack trace: (on Tor 0.4.7.10 )
Finishes fixing bug 40231.
Fixes bug 40523; bugfix on 0.4.5.4-rc.
Move the retry from circuit_expire_building() to when the offending
circuit is being closed.
Fixes#40695
Signed-off-by: David Goulet <dgoulet@torproject.org>
Logic is too convoluted and we can't efficiently apply a specific
timeout depending on the purpose.
Remove it and instead rely on the right circuit cutoff instead of
keeping this flagged circuit open forever.
Part of #40694
Signed-off-by: David Goulet <dgoulet@torproject.org>
Explicitly set the S_CONNECT_REND purpose to a 4-hop cutoff.
As for the established rendezvous circuit waiting on the RENDEZVOUS2,
set one that is very long considering the possible waiting time for the
service to get the request and join our rendezvous.
Part of #40694
Signed-off-by: David Goulet <dgoulet@torproject.org>
Change it to an "unreachable" error so the intro point can be retried
and not flagged as a failure and never retried again.
Closes#40692
Signed-off-by: David Goulet <dgoulet@torproject.org>