When we implemented prop275 in 0.4.8.1-alpha, we changed the
behavior of networkstatus_getinfo_helper_single to omit meaningful
published_on times, replacing them with "2038-01-01". This is
necessary when we're formatting a routerstatus with no additional
info, since routerstatus objects no longer include a published_on.
But in networkstatus_getinfo_by_purpose, we do have a routerinfo
that does have a published_on. This patch uses that information
to report published_on times in our output when we're making a
"virtual" networkstatus for a big file of routerinfo_t objects.
This is mostly important for bridge authorities, since when
they dump a secret list of the bridges, they want to include
published_on times.
Closes#40855. Bugfix on 0.4.8.1-alpha.
This is part of the fast path so we need to cache consensus parameters
instead of querying it everytime we need to learn a value.
Part of #40704
Signed-off-by: David Goulet <dgoulet@torproject.org>
Until now, there was this magic number (64) used as the maximum number
of tasks a CPU worker can take at once.
This commit makes it a consensus parameter so our future selves can
think of a better value depending on network conditions.
Part of #40704
Signed-off-by: David Goulet <dgoulet@torproject.org>
Directory authorities and relays now interact properly with directory
authorities if they change addresses. In the past, they would continue
to upload votes, signatures, descriptors, etc to the hard-coded address
in the configuration. Now, if the directory authority is listed in
the consensus at a different address, they will direct queries to this
new address.
Specifically, these three activities have changed:
* Posting a vote, a signature, or a relay descriptor to all the dir auths.
* Dir auths fetching missing votes or signatures from all the dir auths.
* Dir auths fetching new descriptors from a specific dir auth when they
just learned about them from that dir auth's vote.
We already do this desired behavior (prefer the address in the consensus,
but fall back to the hard-coded dirservers info if needed) when fetching
missing certs.
There is a fifth case, in router_pick_trusteddirserver(), where clients
and relays are trying to reach a random dir auth to fetch something. I
left that case alone for now because the interaction with fallbackdirs
is complicated.
Implements ticket 40705.
Directory authorities stop voting a consensus "Measured" weight
for relays with the Authority flag. Now these relays will be
considered unmeasured, which should reserve their bandwidth
for their dir auth role and minimize distractions from other roles.
In place of the "Measured" weight, they now include a
"MeasuredButAuthority" weight (not used by anything) so the bandwidth
authority's opinion on this relay can be recorded for posterity.
Resolves ticket 40698.
This applies only for relays. Previous commit adds two new consensus
parameters that dictate how libevent is configured with DNS resolution.
And so, with a new consensus, we now look at those values in case they
ever change.
Without this, Exit relay would have to HUP or restart to apply any new
Exit DNS consensus parameters.
Related to #40312
Signed-off-by: David Goulet <dgoulet@torproject.org>
This code was heavily reused from the previous DNS timeout work done in
ticket #40491 that was removed afterall from our code.
Closes#40560
Signed-off-by: David Goulet <dgoulet@torproject.org>
Republishing is necessary to ensure that clients connect using the correct
sendme_inc upon any change. Additionally, introduction points must be
re-chosen, so that cached descriptors with old values are not usable.
We do not expect to change sendme_inc, unless cell size or TLS record size
changes, so this should be rare.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Change it from "timeout" to "tor_timeout" in order to indicate that the
DNS timeout is one from tor's DNS threshold and not the DNS server
itself.
Fixes#40527
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a new consensus method is negotiated, these values will all get
replaced with "2038-01-01 00:00:00".
This change should be safe because:
* As of 0.2.9.11 / 0.3.0.7 / 0.3.1.1-alpha, Tor takes no action
about published_on times in the future.
* The only remaining parties relying on published_on values are (we
believe) relays running 0.3.5.x, which rely on the values in a NS
consensus to see whether their descriptors are out of date. But
this patch only changes microdesc consensuses.
* The latest Tor no longer looks at this field in consensuses.
Why make this change? In experiments, replacing these values with a
fixed value made the size of compressed consensus diffs much much
smaller. (Like, by over 50%!)
Implements proposal 275; Implements #40130.
Nothing breaks here, since all non-voting users of
routerstatus_t.published_on have been adjusted or removed in
previous commits.
We have to expand the API of routerstatus_format_entry() a bit,
though, so that it can always get a published time as argument,
since it can't get it from the routerstatus any more.
This should have no effect on voter behavior.
It used to describe when the old and new routerinfos were published
when we'd decide to download a routerinfo. Now it describes what
their descriptor digests are.
The consensus voters shouldn't actually include such old routers in
the consensus anyway, so this logic shouldn't come up...
but if a client _does_ download something it wouldn't use, it won't
retry infinitely: see checks for WRA_NEVER_DOWNLOADABLE.
Previously we'd look at the routerstatus published_on field when
deciding what to dump, which really has no point. If something's in
the consensus with an ancient published date, then we do want to
keep it.
we do a notice-level log when we decide we *don't* have enough dir
info, but in 0.3.5.1-alpha (see commit eee62e13d9, #14950) we lost our
corresponding notice-level log when things come back.
bugfix on 0.3.5.1-alpha; fixes bug 40496.
The bridge descriptor fetching codes ends up fetching a lot of duplicate
bridge descriptors, because this is how we learn when the descriptor
changes.
This commit only changes comments plus whether we log that one line.
It moves us back to the old behavior, before the previous commit for
30496, where we would only log that line when the bridge descriptor
we're talking about is better than the one we already had (if any).
Without this change, if we have a working bridge, and we add a new bridge,
we will schedule the fetch attempt for that new bridge descriptor for
three hours(!) in the future.
This change is especially needed because of bug #40396, where if you have
one working bridge and one bridge whose descriptor you haven't fetched
yet, your Tor will stall until you have successfully fetched that new
descriptor -- in this case for hours.
In the old design, we would put off all further bridge descriptor fetches
once we had any working bridge descriptor. In this new design, we make the
decision per bridge based on whether we successfully got *its* descriptor.
To make this work, we need to also call learned_bridge_descriptor() every
time we get a bridge descriptor, not just when it's a novel descriptor.
Fixes bug 40396.
Also happens to fix bug 40495 (redundant descriptor fetches for every
bridge) since now we delay fetches once we succeed.
A side effect of this change is that if we have any configured bridges
that *aren't* working, we will keep trying to fetch their descriptors
on the modern directory retry schedule -- every couple of seconds for
the first half minute, then backing off after that -- which is a lot
faster than before.
This proposal implements part of Prop335; it's based on a patch
from Neel Chauhan.
When configured to do so, authorities will assign a MiddleOnly flag
to certain relays. Any relay which an authority gives this flag
will not get Exit, V2Dir, Guard, or HSDir, and might get BadExit if
the authority votes for that one.
With this commit, we will only report a general overload state if we've
seen more than X% of DNS timeout errors over Y seconds. Previous
behavior was to report when a single timeout occured which is really too
small of a threshold.
The value X is a consensus parameters called
"overload_dns_timeout_scale_percent" which is a scaled percentage
(factor of 1000) so we can represent decimal points for X like 0.5% for
instance. Its default is 1000 which ends up being 1%.
The value Y is a consensus parameters called
"overload_dns_timeout_period_secs" which is the time period for which
will gather DNS errors and once over, we assess if that X% has been
reached ultimately triggering a general overload signal.
Closes#40491
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is unfortunately massive but both functionalities were extremely
intertwined and it would have required us to actually change the HSv2 code in
order to be able to split this into multiple commits.
After this commit, there are still artefacts of v2 in the code but there is no
more support for service, intro point and HSDir.
The v2 support for rendezvous circuit is still available since that code is
the same for the v3 and we will leave it in so if a client is able to
rendezvous on v2 then it can still transfer traffic. Once the entire network
has moved away from v2, we can remove v2 rendezvous point support.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously we would warn in this case... but there's really no
justification for doing so, and it can only cause confusion.
Fixes bug #40281; bugfix on 0.4.0.1-alpha.