Because of #25306 for which we are unable to reproduce nor understand how it
is possible, this commit removes the asserts() and BUG() on the missing
descriptors instead when rotating them.
This allows us to log more data on error but also to let tor recover
gracefully instead of dying.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Folks have found two in the past week or so; we may as well fix the
others.
Found with:
\#!/usr/bin/python3
import re
def findMulti(fname):
includes = set()
with open(fname) as f:
for line in f:
m = re.match(r'^\s*#\s*include\s+["<](\S+)[>"]', line)
if m:
inc = m.group(1)
if inc in includes:
print("{}: {}".format(fname, inc))
includes.add(m.group(1))
import sys
for fname in sys.argv[1:]:
findMulti(fname)
These are all about local variables shadowing global
functions. That isn't normally a problem, but at least one
compiler we care about seems to treat this as a case of -Wshadow
violation, so let's fix it.
Fixes bug 24634; bugfix on 0.3.2.1-alpha.
Commit e80893e51b made tor call
hs_service_intro_circ_has_closed() when we mark for close a circuit.
When we cleanup intro points, we iterate over the descriptor's map of intro
points and we can possibly mark for close a circuit. This was problematic
because we would MAP_DEL_CURRENT() the intro point then free it and finally
mark for close the circuit which would lookup the intro point that we just
free in the map we are iterating over.
This can't be done and leads to a use-after-free because the intro point will
be returned successfully due to the fact that we are still in the loop
iterating. In other words, MAP_DEL_CURRENT() followed by a digest256map_get()
of the same object should never be done in the same loop.
Fixes#24595
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is groundwork for the HSPOST control port command that needs a way in the
HS subsystem to upload a service descriptor to a specific HSDir.
To do so, we add a public function that takes a series of parameters including
a fully encoded descriptor and initiate a directory request to a specific
routerstatut_t object.
It is for now not used but should be, in future commit, by the HSPOST command.
This commit has no behavior change, only refactoring.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This makes the REPLICA= field optional for the control port event. A v2
service will always pass it and v3 is ignored.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When an intro circuit has closed, do not warn anymore when we can't find the
service. It is possible to hit that condition if the service is removed before
the circuits were fully closed. This happens in the case of deleting an
ephemeral service.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The functions are now used by the ADD_ONION/DEL_ONION control port command as
well. This commits makes them fully functionnal with hidden service v3.
Part of #20699
Signed-off-by: David Goulet <dgoulet@torproject.org>
The hs_service_intro_circ_has_closed() was removing intro point objects if too
many retries.
We shouldn't cleanup those objects in that function at all but rather let
cleanup_intro_points() do its job and clean it properly.
This was causing an issue in #23603.
Furthermore, this moves the logic of remembering failing intro points in the
cleanup_intro_points() function which should really be the only function to
know when to cleanup and thus when an introduction point should be remembered
as a failed one.
Fixes#23603
Signed-off-by: David Goulet <dgoulet@torproject.org>
This will be used by the control port command "GETINFO
hs/service/desc/id/<ADDR>" which returns the encoded current descriptor for
the given onion address.
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the intro point supports ed25519 link authentication, make sure we don't
have a zeroed key which would lead to a failure to extend to it.
We already check for an empty key if the intro point does not support it so
this makes the check on the key more consistent and symmetric.
Fixes#24002
Signed-off-by: David Goulet <dgoulet@torproject.org>
The previous version of these functions had the following issues:
* they can't supply both the IPv4 and IPv6 addresses in link specifiers,
* they try to fall back to a 3-hop path when the address for a direct
connection is unreachable, but this isn't supported by
launch_rendezvous_point_circuit(), so it fails.
But we can't fix these things in a bugfix release.
Instead, always put IPv4 addresses in link specifiers.
And if a v3 single onion service can't reach any intro points, fail.
This supports v3 hidden services on IPv4, dual-stack, and IPv6, and
v3 single onion services on IPv4 only.
Part of 23820, bugfix on 0.3.2.1-alpha.
Also demote a log message that can occur under natural causes
(if the circuit subsystem is missing descriptors/consensus etc.).
The HS subsystem will naturally retry to connect to intro points,
so no need to make that log user-facing.
It is possible that two descriptor upload requests are launched in a very
short time frame which can lead to the second request finishing before the
first one and where that first one will make the HSDir send back a 400
malformed descriptor leading to a warning.
To avoid such, cancel all active directory connections for the specific
descriptor we are about to upload.
Note that this race is still possible on the HSDir side which triggers a log
info to be printed out but that is fine.
Fixes#23457
Signed-off-by: David Goulet <dgoulet@torproject.org>
Before, this function meant "can we connect to this node and
authenticate it using its ed25519 key?" Now it can additionally
mean, "when somebody else connects to this node, do we expect that
they can authenticate using the node's ed25519 key"?
This change lets us future-proof our link authentication a bit.
Closes ticket 20895. No backport needed, since ed25519 link
authentication support has not been in any LTS release yet, and
existing releases with it should be obsolete before any releases
without support for linkauth=3 are released.
There was a bug in upload_descriptor_to_all() where we picked between
first and second hsdir index based on which time segment we are. That's
not right and instead we should be uploading our two descriptors using a
different hsdir index every time. That is, upload first descriptor using
first hsdir index, and upload second descriptor using second hdsir index.
Also simplify stuff in pick_hdsir_v3() since that's only used to fetch
descriptors and hence we can just always use the fetch hsdir index.
With the latest change on how we use the HSDir index, the client and service
need to pick their responsible HSDir differently that is depending on if they
are before or after a new time period.
The overlap mode is active function has been renamed for this and test added.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because of #23387, we've realized that there is one scenario that makes
the client unable to reach the service because of a desynch in the time
period used. The scenario is as follows:
+------------------------------------------------------------------+
| |
| 00:00 12:00 00:00 12:00 00:00 12:00 |
| SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 |
| |
| $==========|-----------$===========|-----------$===========| |
| ^ ^ |
| C S |
+------------------------------------------------------------------+
In this scenario the HS has a newer consensus than the client, and the
HS just moved to the next TP but the client is still stuck on the old
one. However, the service is not in any sort of overlap mode so it
doesn't cover the old TP anymore, so the client is unable to fetch a
descriptor.
We've decided to solve this by extending the concept of overlap period
to be permanent so that the service always publishes two descriptors and
aims to cover clients with both older and newer consensuses. See the
spec patch in #23387 for more details.
Based on our #23387 findings, it seems like to maintain 24/7
reachability we need to employ different logic when computing hsdir
indices for fetching vs storing. That's to guarantee that the client
will always fetch the current descriptor, while the service will always
publish two descriptors aiming to cover all possible edge cases.
For more details see the next commit and the spec branch.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Use the valid_after time from the consensus to get the time period number else
we might get out of sync with the overlap period that uses valid_after.
Make it an optional feature since some functions require passing a
specific time (like hs_get_start_time_of_next_time_period()).
Signed-off-by: David Goulet <dgoulet@torproject.org>
When merging #20657, somehow hs_service_dir_info_changed() became unused
leading to not use the re-upload to HSDir when we were missing information
feature.
Turns out that it is not possible to pick an HSDir with a missing descriptor
because in order to compute the HSDir index, the descriptor is mandatory to
have so we can know its position on the hashring.
This commit removes that dead feature and fix the
hs_service_dir_info_changed() not being used.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We used to check if it was set to 0 which is what unused circuit have but when
the rendezvous circuit was cannibalized, the timestamp_dirty is not 0 but we
still need to reset it so we can actually use it without having the chance of
expiring the next second (or very soon).
Fixes#23123
Signed-off-by: David Goulet <dgoulet@torproject.org>
This fixes a serious bug in our hsdir set change logic:
We used to add nodes in the list of previous hsdirs everytime we
uploaded to a new hsdir and we only cleared the list when we built a new
descriptor. This means that our prev_hsdirs list could end up with 7
hsdirs, if for some reason we ended up uploading our desc to 7 hsdirs
before rebuilding our descriptor (e.g. this can happen if the set of
hsdirs changed).
After our previous hdsir set had 7 nodes, then our old algorithm would
always think that the set has changed since it was comparing a smartlist
with 7 elements against a smartlist with 6 elements.
This commit fixes this bug, by clearning the prev_hsdirs list before we
upload to all hsdirs. This makes sure that our prev_hsdirs list always
contains the latest hsdirs!