When we don't yet have a descriptor for one of our bridges, disable
the entry guard retry schedule on that bridge. The entry guard retry
schedule and the bridge descriptor retry schedule can conflict,
e.g. where we mark a bridge as "maybe up" yet we don't try to fetch
its descriptor yet, leading Tor to wait (refusing to do anything)
until it becomes time to fetch the descriptor.
Fixes bug 40497; bugfix on 0.3.0.3-alpha.
we do a notice-level log when we decide we *don't* have enough dir
info, but in 0.3.5.1-alpha (see commit eee62e13d9, #14950) we lost our
corresponding notice-level log when things come back.
bugfix on 0.3.5.1-alpha; fixes bug 40496.
When we try to fetch a bridge descriptor and we fail, we mark
the guard as failed, but we never scheduled a re-compute for
router_have_minimum_dir_info().
So if we had already decided we needed to wait for this new descriptor,
we would just wait forever -- even if, counterintuitively, *losing* the
bridge is just what we need to *resume* using the network, if we had it
in state GUARD_REACHABLE_MAYBE and we were stalling to learn this outcome.
See bug 40396 for more details.
When we looked, this was the third most frequent message at
PROTOCOL_WARN, and doesn't actually tell us what to do about it.
Now:
* we just log it at info
* we log it only once per circuit
* we report, in the heartbeat, how many times it happens, how many
cells it happens with per circuit, and how long these circuits
have been alive (on average).
Fixes the final part of #40400.
This is due to the libevent bug
https://github.com/libevent/libevent/issues/1219 that fails to return
back the DNS record type on error.
And so, the MetricsPort now only reports the errors as a global counter
and not a per record type.
Closes#40490
Signed-off-by: David Goulet <dgoulet@torproject.org>
With this commit, we will only report a general overload state if we've
seen more than X% of DNS timeout errors over Y seconds. Previous
behavior was to report when a single timeout occured which is really too
small of a threshold.
The value X is a consensus parameters called
"overload_dns_timeout_scale_percent" which is a scaled percentage
(factor of 1000) so we can represent decimal points for X like 0.5% for
instance. Its default is 1000 which ends up being 1%.
The value Y is a consensus parameters called
"overload_dns_timeout_period_secs" which is the time period for which
will gather DNS errors and once over, we assess if that X% has been
reached ultimately triggering a general overload signal.
Closes#40491
Signed-off-by: David Goulet <dgoulet@torproject.org>
This means that at this commit, tor will stop logging that v2 is
deprecated and treat a v2 address as a bad hostname that we can't use.
Part of #40476
Signed-off-by: David Goulet <dgoulet@torproject.org>
Values greater than 100 would have had the same effect as 100, so
this doesn't actually change Tor's behavior; it just makes the
intent clearer. Fixes#40486; see also torspec#66.