Due to #23662 this can happen under natural causes and does not disturb
the functionality of the service. This is a simple 0.3.2 fix for now,
and we plan to fix this properly in 0.3.3.
Clients add rendezvous point IPv6 addresses to introduce cell link specifiers,
when the node has a valid IPv6 address.
Also check the node's IPv4 address is valid before adding any link specifiers.
Implements #23577.
On failure to upload, the HS_DESC event would report "UPLOAD_FAILED" as the
Action but it should have reported "FAILED" according to the spec.
Fixes#24230
Signed-off-by: David Goulet <dgoulet@torproject.org>
DisableNetwork is a subset of net_is_disabled(), which is (now) a
subset of should_delay_dir_fetches().
Some of these changes are redundant with others higher or lower in
the call stack. The ones that I think are behavior-relevant are
listed in the changes file. I've also added comments in a few
places where the behavior is subtle.
Fixes bug 12062; bugfix on various versions.
Commit e67f4441eb introduced a safeguard against
using an uninitialized voting schedule object. However, the dirvote_act() code
was looking roughly at the same thing to know if it had to compute the timings
before voting with this condition:
if (!voting_schedule.voting_starts) {
...
dirvote_recalculate_timing(options, now);
}
The sr_init() function is called very early and goes through the safeguard
thus the voting schedule is always initilized before the first vote.
That first vote is a crucial one because we need to have our voting schedule
aligned to the "now" time we are about to use for voting. Then, the schedule
is updated when we publish our consensus or/and when we set a new consensus.
From that point on, we only want to update the voting schedule through that
code flow.
This "created_on_demand" is indicating that the timings have been recalculated
on demand by another subsystem so if it is flagged, we know that we need to
ignore its values before voting.
Fixes#24186
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we have fewer than 15 descriptors to fetch, we will delay the
fetch for a little while. That's fine, if we can go ahead and build
circuits... but if not, it's a poor choice indeed.
Fixes bug 23985; bugfix on 0.1.1.11-alpha.
In 0.3.0.3-alpha, when we made primary guard descriptors necessary
for circuit building, this situation got worse.
When calculating the fraction of nodes that have descriptors, and all
all nodes in the network have zero bandwidths, count the number of nodes
instead.
Fixes bug 23318; bugfix on 0.2.4.10-alpha.
Back in 0.2.4.3-alpha (e106812a77), when we switched from using
double to using uint64 for selecting by bandwidth, I got the math
wrong: I should have used llround(x), or (uint64_t)(x+0.5), but
instead I wrote llround(x+0.5). That means we would always round
up, rather than rounding to the closest integer
Fixes bug 23318; bugfix on 0.2.4.3-alpha.
The flush cells process can close a channel if the connection write fails but
still return that it flushed at least one cell. This is due because the error
is not propagated up the call stack so there is no way of knowing if the flush
actually was successful or not.
Because this would require an important refactoring touching multiple
subsystems, this patch is a bandaid to avoid the KIST scheduler to handle
closed channel in its loop.
Bandaid on #23751.
Signed-off-by: David Goulet <dgoulet@torproject.org>
dirvote_get_next_valid_after_time() is the only public function that uses the
voting schedule outside of the dirvote subsystem so if it is zeroed,
recalculate its timing if we can that is if a consensus exists.
Part of #24161
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because the HS and SR subsystems can use the voting schedule early (with the
changes in #23623 making the SR subsystem using the static voting schedule
object), we need to recalculate the schedule very early when setting the new
consensus.
Fixes#24161
Signed-off-by: David Goulet <dgoulet@torproject.org>
If it decrypts something that turns out to start with a NUL byte,
then decrypt_desc_layer() will return 0 to indicate the length of
its result. But 0 also indicates an error, which causes the result
not to be freed by decrypt_desc_layer()'s callers.
Since we're trying to stabilize 0.3.2.x, I've opted for the simpler
possible fix here and made it so that an empty decrypted string will
also count as an error.
Fixes bug 24150 and OSS-Fuzz issue 3994.
The original bug was present but unreachable in 0.3.1.1-alpha. I'm
calling this a bugfix on 0.3.2.1-alpha since that's the first version
where you could actually try to decrypt these descriptors.
The node_get_ed25519_id() warning can actually be triggered by a relay flagged
with NoEdConsensus so instead of triggering a warning on all relays of the
network, downgrade it to protocol warning.
Fixes#24025
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a BUG() occurs, this macro will print extra information about the state
of the scheduler and the given channel if any. This will help us greatly to
fix future bugs in the scheduler especially when they occur rarely.
Fixes#23753
Signed-off-by: David Goulet <dgoulet@torproject.org>
They are not yet implemented: they will upload descriptors, but won't be
able to rendezvous, because IPv6 addresses in link specifiers are ignored.
Part of #23820.
The previous version of this function had the following issues:
* it didn't check if the extend_info contained an IPv6 address,
* it didn't check if the ed25519 identity key was valid.
But we can't add IPv6 support in a bugfix release.
Instead, BUG() if the address is an IPv6 address, so we always put IPv4
addresses in link specifiers. And ignore missing ed25519 identifiers,
rather than generating an all-zero link specifier.
This supports v3 hidden services on IPv4, dual-stack, and IPv6, and
v3 single onion services on IPv4 only.
Part of 23820, bugfix on 0.3.2.1-alpha.
When the directory information changes, callback to the HS client subsystem so
it can check if any pending SOCKS connections are waiting for a descriptor. If
yes, attempt a refetch for those.
Fixes#23762
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because the HS subsystem needs the voting schedule to compute time period, we
need all tor type to do that.
Part of #23623
Signed-off-by: David Goulet <dgoulet@torproject.org>
The new decryption function performs no decryption, skips the salt,
and doesn't check the mac. This allows us to fuzz the
hs_descriptor.c code using unencrypted descriptor test, and exercise
more of the code.
Related to 21509.
The exposed get_voting_schedule() allocates and return a new object everytime
it is called leading to an awful lot of memory allocation when getting the
start time of the current round which is done for each node in the consensus.
Closes#23623
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the intro point supports ed25519 link authentication, make sure we don't
have a zeroed key which would lead to a failure to extend to it.
We already check for an empty key if the intro point does not support it so
this makes the check on the key more consistent and symmetric.
Fixes#24002
Signed-off-by: David Goulet <dgoulet@torproject.org>
The previous version of these functions had the following issues:
* they can't supply both the IPv4 and IPv6 addresses in link specifiers,
* they try to fall back to a 3-hop path when the address for a direct
connection is unreachable, but this isn't supported by
launch_rendezvous_point_circuit(), so it fails.
But we can't fix these things in a bugfix release.
Instead, always put IPv4 addresses in link specifiers.
And if a v3 single onion service can't reach any intro points, fail.
This supports v3 hidden services on IPv4, dual-stack, and IPv6, and
v3 single onion services on IPv4 only.
Part of 23820, bugfix on 0.3.2.1-alpha.
The previous version of this function has the following issues:
* it doesn't choose between IPv4 and IPv6 addresses correctly, and
* it doesn't fall back to a 3-hop path when the address for a direct
connection is unreachable.
But we can't fix these things in a bugfix release.
Instead, treat IPv6 addresses like any other unrecognised link specifier
and ignore them. If there is no IPv4 address, return NULL.
This supports v3 hidden services on IPv4, dual-stack, and IPv6, and
v3 single onion services on IPv4 only.
Part of 23820, bugfix on 0.3.2.1-alpha.
Turns out that when reloading a tor configured with hidden service(s), we
weren't copying all the needed information between the old service object to
the new one.
For instance, the desc_is_dirty timestamp wasn't which could lead to the
service uploading its desriptor much later than it would need to.
The replaycache wasn't also moved over and some intro point information as
well.
Fixes#23790
Signed-off-by: David Goulet <dgoulet@torproject.org>
Bridge relays can use it to add a "bridge-distribution-request" line
to their bridge descriptor, which tells BridgeDB how they'd like their
bridge address to be given out.
Implements tickets 18329.
Fixes bug 23908; bugfix on 0.3.1.6-rc when we made the keypin
failure message really long.
Backport from 0.3.2's 771fb7e7ba,
where arma said "get rid of the scary 256-byte-buf landmine".
It _should_ work, and I don't see a reason that it wouldn't, but
just in case, add a 10 second timer to make tor die with an
assertion failure if it's supposed to exit but it doesn't.
This function was never about 'finishing' the event loop, but rather
about making sure that the code outside the event loop would be run
at least once.
Sometimes when we call exit(), it's because the process is
completely hopeless: openssl has a broken AES-CTR implementation, or
the clock is in the 1960s, or something like that.
But sometimes, we should return cleanly from tor_main() instead, so
that embedders can keep embedding us and start another Tor process.
I've gone through all the exit() and _exit() calls to annotate them
with "exit ok" or "XXXX bad exit" -- the next step will be to fix
the bad exit()s.
First step towards 23848.
At first, we put the tor_git_revision constant in tor_main.c, so
that we wouldn't have to recompile config.o every time the git
revision changed. But putting it there had unintended side effect
of forcing every program that wanted to link libor.a (including
test, test-slow, the fuzzers, the benchmarks, etc) to declare their
own tor_git_revision instance.
That's not very nice, especially since we want to start supporting
others who want to link against Tor (see 23846).
So, create a new git_revision.c file that only contains this
constant, and remove the duplicated boilerplate from everywhere
else.
Part of implementing ticket 23845.
This feature should help programs that want to launch and manage a
Tor process, as well as programs that want to launch and manage a
Tor instance in a separate thread. Right now, they have to open a
controlport, and then connect to it, with attendant authentication
issues. This feature allows them to just start with an
authenticated connection.
Bug 23900.
Create a function that tells us if we can fetch or not the descriptor for the
given service key.
No behavior change. Mostly moving code but with a slight change so the
function can properly work by returning a boolean and also a possible fetch
status code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we added HTTPTunnelPort, the answer that we give when you try
to use your SOCKSPort as an HTTP proxy became wrong. Now we explain
that Tor sorta _is_ an HTTP proxy, but a SOCKSPort isn't.
I have left the status line the same, in case anything is depending
on it. I have removed the extra padding for Internet Explorer,
since the message is well over 512 bytes without it.
Fixes bug 23678; bugfix on 0.3.2.1-alpha.
Without this fix, changes from client to bridge don't trigger
transition_affects_workers(), so we would never have actually
initialized the cpuworkers.
Fixes bug 23693. Bugfix on 3bcdb26267 0.2.6.3-alpha, which
fixed bug 14901 in the general case, but not on the case where
public_server_mode() did not change.
Because our monotonic time interface doesn't play well with value set to 0,
always initialize to now() the scheduler_last_run at init() of the KIST
scheduler.
Fixes#23696
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a channel is scheduled and flush cells returns 0 that is no cells to
flush, we flag it back in waiting for cells so it doesn't get stuck in a
possible infinite loop.
It has been observed on moria1 where a closed channel end up in the scheduler
where the flush process returned 0 cells but it was ultimately kept in the
scheduling loop forever. We suspect that this is due to a more deeper problem
in tor where the channel_more_to_flush() is actually looking at the wrong
queue and was returning 1 for an empty channel thus putting the channel in the
"Case 4" of the scheduler which is to go back in pending state thus
re-considered at the next iteration.
This is a fix that allows the KIST scheduler to recover properly from a not
entirelly diagnosed problem in tor.
Fixes#23676
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we added single_conn_free_bytes(), we cleared the outbuf on a
connection without setting outbuf_flushlen() to 0. This could cause
an assertion failure later on in flush_buf().
Fixes bug 23690; bugfix on 0.2.6.1-alpha.
This caused a BUG log when we noticed that the circuit had no
channel. The likeliest culprit for exposing that behavior is
d769cab3e5, where we made circuit_mark_for_close() NULL out
the n_chan and p_chan fields of the circuit.
Fixes bug 8185; bugfix on 0.2.5.4-alpha, I think.
My current theory is that this is just a marked circuit that hasn't
closed yet, but let's gather more information in case that theory is
wrong.
Diagnostic for 8185.
If 6 SOCKS requests are opened at once, it would have triggered 6 fetches
which ultimately poke all 6 HSDir. We don't want that, if we have multiple
SOCKS requests for the same service, do one fetch only.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When purging last HSDir requests, we used time(NULL) for computing the
service blinded key but in all other places in our codebase we actually
use the consensus times. That can cause wrong behavior if the consensus
is in a different time period than time(NULL).
This commit is required for proper purging of HSDir requests.
The confparse field has type UINT, which corresponds to an int
type. We had uint32_t.
This shouldn't cause trouble in practice, since int happens to
4-bytes wide on every platform where an authority is running. It's
still wrong, though.
These should have been int, but we had listed them as unsigned.
That's an easy mistake to make, since "int" corresponds with either
INT or UINT in the configuration file.
This bug cannot have actually caused a problem in practice, since we
check those fields' values on load, and ensure that they are in
range 0..INT32_MAX.
New approach, suggested by Taylor: During testing builds, we
initialize a union member of an appropriate pointer type with the
address of the member field we're trying to test, so we can make
sure that the compiler doesn't warn.
My earlier approach invoked undefined behavior.
Also demote a log message that can occur under natural causes
(if the circuit subsystem is missing descriptors/consensus etc.).
The HS subsystem will naturally retry to connect to intro points,
so no need to make that log user-facing.
So we can track them more easily in the logs and match any open/close/free
with those identifiers.
Part of #23645
Signed-off-by: David Goulet <dgoulet@torproject.org>
This removes the "nickname" of the cannibalized circuit last hop as it is
useless. It now logs the n_circ_id and global identifier so we can match it
with other logging statement.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Prior to the log statement, the circuit n_circ_id value is zeroed so keep a
copy so we can log it at the end.
Part of #23645
Signed-off-by: David Goulet <dgoulet@torproject.org>
Make the "Exit" flag assignment only depend on whether the exit
policy allows connections to ports 80 and 443. Previously relays
would get the Exit flag if they allowed connections to one of
these ports and also port 6667.
Resolves ticket 23637.
Back in 0.2.4.3-alpha (e106812a77), when we switched from using
double to using uint64 for selecting by bandwidth, I got the math
wrong: I should have used llround(x), or (uint64_t)(x+0.5), but
instead I wrote llround(x+0.5). That means we would always round
up, rather than rounding to the closest integer
Fixes bug 23318; bugfix on 0.2.4.3-alpha.
The is_first_hop field should have been called used_create_fast,
but everywhere that we wanted to check it, we should have been
checking channel_is_client() instead.
The diff is confusing, but were two static scheduler functions that
needed moving to static comment block.
No code change. Thanks dgoulet for original commit
The clock_skew_warning() refactoring allowed calls from
or_state_load() to control_event_bootstrap_problem() to occur prior
bootstrap phase 0, causing an assertion failure. Initialize the
bootstrap status prior to calling clock_skew_warning() from
or_state_load().
or_state_load() was using an incorrect sign convention when calling
clock_skew_warning() to warn about state file clock skew. This caused
the wording of the warning to be incorrect about the direction of the
skew.
is_canonical doesn't mean "am I connected to the one true address of
this relay"; it means "does this relay tell me that the address I'm
connected to belong to it." The point is to prevent TCP-based MITM,
not to prevent the relay from multi-homing.
Related to 22890.
Authority IPv6 addresses were originally added in 0.2.8.1-alpha.
This leaves 3/8 directory authorities with IPv6 addresses, but there
are also 52 fallback directory mirrors with IPv6 addresses.
Resolves 19760.
Use this value instead of hardcoded values of 32 everywhere. This also
addresses the use of REND_DESC_ID_V2_LEN_BASE32 in
hs_lookup_last_hid_serv_request() for the HSDir encoded identity digest length
which is accurate but semantically wrong.
Fixes#23305.
Signed-off-by: David Goulet <dgoulet@torproject.org>
RENDEZVOUS1 cell is 84 bytes long in v3 and 168 bytes long in v2 so this
commit pads with random bytes the v3 cells up to 168 bytes so they all look
alike at the rendezvous point.
Closes#23420
Signed-off-by: David Goulet <dgoulet@torproject.org>
This warning is caused by a different tv_usec data type on macOS
compared to the system on which the patch was developed.
Fixes 23575 on 0.3.2.1-alpha.
It is highly unlikely to happen but if so, we need to know and why. The
warning with the next_run values could help.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When the KIST schedule() is called, it computes a diff value between the last
scheduler run and the current monotonic time. If tha value is below the run
interval, the libevent even is updated else the event is run.
It turned out that casting to int32_t the returned int64_t value for the very
first scheduler run (which is set to 0) was creating an overflow on the 32 bit
value leading to adding the event with a gigantic usec value. The scheduler
was simply never running for a while.
First of all, a BUG() is added for a diff value < 0 because if the time is
really monotonic, we should never have a now time that is lower than the last
scheduler run time. And we will try to recover with a diff value to 0.
Second, the diff value is changed to int64_t so we avoid this "bootstrap
overflow" and future casting overflow problems.
Fixes#23558
Signed-off-by: David Goulet <dgoulet@torproject.org>
Otherwise integer overflows can happen. Remember, doing a i32xi32
multiply doesn't actually produce a 64-bit output. You need to do
i64xi32 or i64xi64.
Coverity found this as CID 1417753
Each type of scheduler implements its own static scheduler_t object and
returns a reference to it.
This commit also makes it a const pointer that is it can only change inside
the scheduler type subsystem but not outside for extra protection.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Instead, add wrappers to do the needed action the different scheduler needs
with the libevent object.
Signed-off-by: David Goulet <dgoulet@torproject.org>
A channel can bounce in the scheduler and bounce out with the IDLE state which
means that if it came in the scheduler once, it has socket information that
needs to be freed from the global hash table.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This option is a list of possible scheduler type tor can use ordered by
priority. Its default value is "KIST,KISTLite,Vanilla" which means that KIST
will be used first and if unavailable will fallback to KISTLite and so on.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible that tor was compiled with KIST support but the running kernel
has no support for it. In that case, fallback to a naive approach and flag
that we have no kernel support.
At this commit, if the kernel support is disabled, there are no ways to come
back from it other than restarting tor with a kernel that supporst KIST.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a detection for the KIST scheduler in our build system and set
HAVE_KIST_SUPPORT if available.
Adapt the should use kist function with this new compile option.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- HT_FOREACH_FN defined in an additional place because nickm did that
in an old kist prototype
- Make channel_more_to_flush mockable for future sched tests
- Add empty scheduler_{vanilla,kist}.c files and put in include.am
Signed-off-by: David Goulet <dgoulet@torproject.org>
- massive change to src/tgest/test_options.c since the sched options
were added all over the place in it
- removing the sched options caused some tests to pass/fail in new ways
so I assumed current behavior is correct and made them pass again
- ex: "ConnLimit must be greater" lines
- ex: "Authoritative directory servers must" line
- remove test_options_validate__scheduler in prep for new sched tests
Signed-off-by: David Goulet <dgoulet@torproject.org>
An unnecessary routerlist check in the NETINFO clock skew detection in
channel_tls_process_netinfo_cell() was preventing clients from
reporting NETINFO clock skew to controllers.
This patch replaces a few calls to router_get_by_id_digest ("do we
have a routerinfo?") with connection_or_digest_is_known_relay ("do
we know this relay to be in the consensus, or have been there some
time recently?").
Found while doing the 21585 audit; fixes bug 23533. Bugfix on
0.3.0.1-alpha.
The memleak was occuring because of the way ExcludeNodes is handled in
that function. Basically, we were putting excluded intro points extend
infos in a special variable which was never freed. Also, if there were
multiple excluded intro points then that variable was overwritten
everytime leaking more memory. This commit should fix both issues.
This commit adds a pretty advanced test for the client-side making sure that
picking intro is done properly.
This unittest also reveals a memleak on the client_pick_intro() function which
is fixed by the subsequent commit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Make download status next attempts reported over the control port
consistent with the time used by tor. This issue only occurs if a
download status has not been reset before it is queried over the
control port.
Fixes 23525, not in any released version of tor.
If future code asks if there are any running bridges, without checking
if bridges are enabled, log a BUG warning rather than crashing.
Fixes 23524 on 0.3.0.1-alpha
Change the contract of control_event_bootstrap_problem() to be more
general and to take a connection_t. New function
control_event_bootstrap_prob_or() has the specific or_connection_t
funcionality previously used.
Fix the test_build_address() test and its test vectors python script.
They were both using a bogus pubkey for building an HS address which
does not validate anymore.
Also fix a few more unittests that were using bogus onion addresses
and were failing the validation. I replaced the bogus address with
the one generated from the test vector script.
Directory servers now include a "Date:" http header for response
codes other than 200. Clients starting with a skewed clock and a
recent consensus were getting "304 Not modified" responses from
directory authorities, so without a Date header the client would
never hear about a wrong clock.
Fixes bug 23499; bugfix on 0.0.8rc1.
In #23466 we discovered that cached descriptors can stay around on the
client-side for up to 72 hours. In reality we only want those descs to
get cached for the duration of the current time period, since after that
TP is gone the client needs to compute a new blinded key to use for the HS.
In this commit we start using the consensus time (if available) when
cleaning up cached client descriptor entries. That makes sense because
the client uses consensus time anyway for connecting to hidden
services (e.g. computing blinded keys and time periods).
If no recent consensus is available, we consider descriptors to be
expired since we will want to fetch new ones when we get a live
consensus to avoid the Roger bug. If we didn't do that, when Roger
desuspends his laptop there would be a race between Tor fetching a new
consensus, and Tor connecting to the HS which would still cause
reachability issues.
We also turned a rev counter check into a BUG, since we should never
receive a descriptor with a strictly smaller rev counter than the one we
already have, except if there is a bug or if the HSDir wants to mess
with us. In any case, let's turn this into a BUG so that we can detect
and debug such cases easily.
This change refactors find_dl_schedule() to only call dependent functions
as needed. In particular, directory_fetches_from_authorities() only needs
to be called on clients.
Stopping spurious directory_fetches_from_authorities() calls on every
download on public relays has the following impacts:
* fewer address resolution attempts, particularly those mentioned in 21789
* fewer descriptor rebuilds
* fewer log messages, particularly those limited in 20610
Fixes 23470 in 0.2.8.1-alpha.
The original bug was introduced in commit 35bbf2e as part of prop210.
But when clients are just starting, make them try each bridge a few times
before giving up on it.
These changes make the bridge download schedules more explicit: before
17750, they relied on undocumented behaviour and specific schedule
entries. (And between 17750 and this fix, they were broken.)
Fixes 23347, not in any released version of tor.
We were always incrementing bridge download statuses on each attempt,
but we were using the "increment on failure" functions to do it.
And we never incremented them on failure.
No behaviour change.
The download schedule tells Tor to wait 15 minutes before downloading
bridge descriptors. But 17750 made Tor ignore that and start immediately.
Since we fixed 17750, Tor waits 15 minutes for bridge client bootstrap,
like the schedule says.
This fixes the download schedule to start immediately, and to try each
bridge 3 times in the first 30 seconds. This should make bridge bootstraps
more reliable.
Fixes 23347.
It is possible that two descriptor upload requests are launched in a very
short time frame which can lead to the second request finishing before the
first one and where that first one will make the HSDir send back a 400
malformed descriptor leading to a warning.
To avoid such, cancel all active directory connections for the specific
descriptor we are about to upload.
Note that this race is still possible on the HSDir side which triggers a log
info to be printed out but that is fine.
Fixes#23457
Signed-off-by: David Goulet <dgoulet@torproject.org>
When option validation or transition is happening, there are no
"current options" -- only "old options" and "maybe new options".
Looking at get_options() is likely a mistake, so have a nonfatal
assertion let us know if we do that.
Closes 22281.
Because we can get a RENDEZVOUS2 cell before the INTRODUCE_ACK, we need to
correctly handle the circuit purpose REND_JOINED that is not change its
purpose when we get an INTRODUCE_ACK and simply close the intro circuit
normally.
Fixes#23455
Signed-off-by: David Goulet <dgoulet@torproject.org>
Before, this function meant "can we connect to this node and
authenticate it using its ed25519 key?" Now it can additionally
mean, "when somebody else connects to this node, do we expect that
they can authenticate using the node's ed25519 key"?
This change lets us future-proof our link authentication a bit.
Closes ticket 20895. No backport needed, since ed25519 link
authentication support has not been in any LTS release yet, and
existing releases with it should be obsolete before any releases
without support for linkauth=3 are released.
This test is important because it tests that upload_descriptor_to_all()
is in synch with pick_hsdir_v3(). That's not the case for the
reachability test which just compares the responsible hsdir sets.
There was a bug in upload_descriptor_to_all() where we picked between
first and second hsdir index based on which time segment we are. That's
not right and instead we should be uploading our two descriptors using a
different hsdir index every time. That is, upload first descriptor using
first hsdir index, and upload second descriptor using second hdsir index.
Also simplify stuff in pick_hdsir_v3() since that's only used to fetch
descriptors and hence we can just always use the fetch hsdir index.
This is a large and important unit test for the hidden service version
3! It tests the service reachability for a client using different
consensus timings and makes sure that the computed hashring is the same
on both side so it is actually reachable.
Signed-off-by: David Goulet <dgoulet@torproject.org>
With the latest change on how we use the HSDir index, the client and service
need to pick their responsible HSDir differently that is depending on if they
are before or after a new time period.
The overlap mode is active function has been renamed for this and test added.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because of #23387, we've realized that there is one scenario that makes
the client unable to reach the service because of a desynch in the time
period used. The scenario is as follows:
+------------------------------------------------------------------+
| |
| 00:00 12:00 00:00 12:00 00:00 12:00 |
| SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 |
| |
| $==========|-----------$===========|-----------$===========| |
| ^ ^ |
| C S |
+------------------------------------------------------------------+
In this scenario the HS has a newer consensus than the client, and the
HS just moved to the next TP but the client is still stuck on the old
one. However, the service is not in any sort of overlap mode so it
doesn't cover the old TP anymore, so the client is unable to fetch a
descriptor.
We've decided to solve this by extending the concept of overlap period
to be permanent so that the service always publishes two descriptors and
aims to cover clients with both older and newer consensuses. See the
spec patch in #23387 for more details.
Based on our #23387 findings, it seems like to maintain 24/7
reachability we need to employ different logic when computing hsdir
indices for fetching vs storing. That's to guarantee that the client
will always fetch the current descriptor, while the service will always
publish two descriptors aiming to cover all possible edge cases.
For more details see the next commit and the spec branch.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The pruning process and the deleting ephemeral service function iterates over
all circuits and were asserting on rend_data for a matching circuit. This is
not good because now we have v3 circuits without a rend_data.
Fixes#23429
Signed-off-by: David Goulet <dgoulet@torproject.org>
Use the valid_after time from the consensus to get the time period number else
we might get out of sync with the overlap period that uses valid_after.
Make it an optional feature since some functions require passing a
specific time (like hs_get_start_time_of_next_time_period()).
Signed-off-by: David Goulet <dgoulet@torproject.org>
Undeprecate it;
rename it to TestingClientDNSRejectInternalAddresses;
add the old name as an alias;
reject configurations where it is set but TestingTorNetwork is not;
change the documentation accordingly.
Closes tickets 21031 and 21522.
Version 3 hidden service needs rendezvous point that have the protocol version
HSRend >= 2 else the rendezvous cells are rejected.
Fixes#23361
Signed-off-by: David Goulet <dgoulet@torproject.org>
The chdir() call in RunAsDaemon makes the behavior here surprising,
and either way of trying to resolve the surprise seems sure to
startle a significant fraction of users. Instead, let's refuse to
guess, and refuse these configurations.
Closes ticket 22731.
I'm doing this using the Proxy-Authorization: header to support
clients that understand it, and with a new tor-specific header that
makes more sense for our use.
By convention, a function that frobs a foo_t should be called
foo_frob, and it should have a foo_t * as its first argument. But
for many of the buf_t functions, the buf_t was the final argument,
which is silly.
Our convention is that functions which manipulate a type T should be
named T_foo. But the buffer functions were super old, and followed
all kinds of conventions. Now they're uniform.
Here's the perl I used to do this:
\#!/usr/bin/perl -w -i -p
s/read_to_buf\(/buf_read_from_socket\(/;
s/flush_buf\(/buf_flush_to_socket\(/;
s/read_to_buf_tls\(/buf_read_from_tls\(/;
s/flush_buf_tls\(/buf_flush_to_tls\(/;
s/write_to_buf\(/buf_add\(/;
s/write_to_buf_compress\(/buf_add_compress\(/;
s/move_buf_to_buf\(/buf_move_to_buf\(/;
s/peek_from_buf\(/buf_peek\(/;
s/fetch_from_buf\(/buf_get_bytes\(/;
s/fetch_from_buf_line\(/buf_get_line\(/;
s/fetch_from_buf_line\(/buf_get_line\(/;
s/buf_remove_from_front\(/buf_drain\(/;
s/peek_buf_startswith\(/buf_peek_startswith\(/;
s/assert_buf_ok\(/buf_assert_ok\(/;
This lets us drop the testing-only function buf_get_first_chunk_data(),
and lets us implement proto_http and proto_socks without looking at
buf_t internals.
The service needs the latest SRV and set of relays for the best accurate
hashring to upload its descriptor to so it needs a live consensus thus don't
do anything until we have it.
Fixes#23331
Signed-off-by: David Goulet <dgoulet@torproject.org>
When merging #20657, somehow hs_service_dir_info_changed() became unused
leading to not use the re-upload to HSDir when we were missing information
feature.
Turns out that it is not possible to pick an HSDir with a missing descriptor
because in order to compute the HSDir index, the descriptor is mandatory to
have so we can know its position on the hashring.
This commit removes that dead feature and fix the
hs_service_dir_info_changed() not being used.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Based on questions and comments from dgoulet, I've tried to fill
in the reasoning about why these functions work in the way that they
do, so that it will be easier for future programmers to understand
why this code exists and works the way it does.
We used to check if it was set to 0 which is what unused circuit have but when
the rendezvous circuit was cannibalized, the timestamp_dirty is not 0 but we
still need to reset it so we can actually use it without having the chance of
expiring the next second (or very soon).
Fixes#23123
Signed-off-by: David Goulet <dgoulet@torproject.org>
The function was never returning an error code on failure to parse the
OutboundAddress* options.
In the process, it was making our test_options_validate__outbound_addresses()
not test the right thing.
Fixes#23366
Signed-off-by: David Goulet <dgoulet@torproject.org>
This fixes a serious bug in our hsdir set change logic:
We used to add nodes in the list of previous hsdirs everytime we
uploaded to a new hsdir and we only cleared the list when we built a new
descriptor. This means that our prev_hsdirs list could end up with 7
hsdirs, if for some reason we ended up uploading our desc to 7 hsdirs
before rebuilding our descriptor (e.g. this can happen if the set of
hsdirs changed).
After our previous hdsir set had 7 nodes, then our old algorithm would
always think that the set has changed since it was comparing a smartlist
with 7 elements against a smartlist with 6 elements.
This commit fixes this bug, by clearning the prev_hsdirs list before we
upload to all hsdirs. This makes sure that our prev_hsdirs list always
contains the latest hsdirs!
Our logic for detecting hsdir set changes was needlessly compicated: we
had to sort smartlists and compare them.
Instead, we can simplify things by employing the following logic:
"We should reupload our descriptor if the latest HSDir set contains
nodes that were not previously there"
Since we can't be sure that we can unlink enough files on windows
here, let's let the number of permitted entries grow huge if it
really must.
We do this by letting the storagedir hold lots of entries, but still
trying to keep the number of entries under the configured limit. We
also have to tell consdiffmgr not to freak out if it can't actually
remove enough entries.
Part of a fix for bug 22752
Some parentheses were missing making the rend_max_intro_circs_per_period()
return a lower value than it was suppose to.
The calculation is that a service at most will open a number of intro points
that it wants which is 3 by default or HiddenServiceNumIntroductionPoints. Two
extra are launched for performance reason. Finally, this can happen twice for
two descriptors for the current and next time period.
From:
2 * n_intro_wanted + 2
...which resulted in 8 for 3 intro points, this commit fixes it to:
(n_intro_wanted + 2) * 2
... resulting in 12 possible intro point circuit which is the correct maximum
intro circuit allowed per period.
Last, this commit rate limits the the log message if we ever go above that
limit else over a INTRO_CIRC_RETRY_PERIOD, we can print it often!
Fixes#22159
Signed-off-by: David Goulet <dgoulet@torproject.org>
That check was wrong:
a) We should be making sure that the size of `key` is big enough before
proceeding, since that's the buffer that we would overread with the
tor_memeq() below.
The old check used to check that `req_key_str` is big enough which is
not right, since we won't read deep into that buffer.
The new check makes sure that `key` has enough size to survive the
tor_memeq(), and if not it moves to the next element of the strmap.
b) That check shouldn't be a BUG since that strmap contains
variable-sized elements and we should not be bugging out if we happen
to compare a small sized element (v2) to a bigger one (v3).
This way, we can clear off the directory requests from our cache and thus
allow the next client to query those HSDir again at the next SOCKS connection.
Signed-off-by: David Goulet <dgoulet@torproject.org>
v3 client now cleans up the HSDir request cache when a connection to a service
was successful.
Closes#23308
Signed-off-by: David Goulet <dgoulet@torproject.org>
We used to not copy the state which means that after HUP we would forget
if we are in overlap mode or not. That caused bugs where the service
would enter overlap mode twice, and rotate its descs twice, causing all
sorts of bugs.
Apart from the fact that a newly allocated service doesn't have descriptors
thus the move condition can never be true, the service needs the descriptor
signing key to cross-certify the authentication key of each intro point so we
need to move the descriptors between services and not only the intro points.
Fixes#23056
Signed-off-by: David Goulet <dgoulet@torproject.org>
We refactor the descriptor reupload logic to be similar to the v2 logic
where we update a global 'consider_republishing_rend_descriptors' flag
and then we use that to check for hash ring changes during the global
hidden service callbacks.
This fixes bugs where we would inspect the hash ring immediately as we
receive new dirinfo (e.g. consensus) but before running the hidden
service housekeeping events. That was leaving us in an inconsistent
state wrt hsdir indices and causing bugs all around.
The problem was that when we went from overlap mode to non-overlap mode,
we were not wiping the 'desc_next' descriptor and instead we left it on
the service. This meant that all functions that iterated service
descriptors were also inspecting the useless 'desc_next' descriptor that
should have been deleted.
This commit refactors rotate_all_descriptors() so that it rotates
descriptor both when entering overlap mode and also when leaving it.
A client can re-extend up to 3 intro points on the same circuit. This happens
when we get NACKed by the intro point for which we choose a new intro and
re-extend the circuit to it.
That process can be arbitrarly long so reset the dirty timestamp of the
circuit everytime we choose to re-extend so we get a bit more time to actually
do our introduction.
This is a client circuit so it is short live once opened thus giving us a bit
more time to complete the introduction is ok.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When looking for an introduction circuit in circuit_get_best(), we log an info
message if we are about to launch a new intro circuit in parallel. However,
the condition was considering marked for close circuit leading to the function
triggering the log info even though there is actually no valid intro circuit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Only register the RP circuit when it opens and not when we send the INTRODUCE1
cell else, when re-extending to a new IP, we would register the same RP
circuit with the same cookie twice leading to the circuit being closed.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Changed the assert_intro_circ_ok() to an almost non fatal function so tor can
recover properly. We keep the anonymity assert because if that is not right,
we have much deeper problems and client should stop sending bytes to the
network immediately.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This function has been replaced by hs_client_receive_rendezvous_acked(() doing
the same exact thing for both v2 and v3 service.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The client needs to find the right intro point object from the circuit
identity digest it is opened to. This new function does that.
Signed-off-by: David Goulet <dgoulet@torproject.org>
New function named hs_cell_introduce1_data_clear() is introduced to clear off
an hs_cell_introduce1_data_t object.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a client decodes a descriptor, make sure it matches the expected blinded
key which is derived from the hidden service identity key.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Needed by the client when fetching a descriptor. This function checks the
directory purpose and hard assert if it is not for fetching.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit makes the client use the intro point state cache. It notes down
when we get a NACK from the intro point and then uses that cache to decide if
it should either close the circuits or re-extend to a new intro point.
This also introduces a very useful function that checks if an intro point is
usable that is query the state cache and checks a series of requirement.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This cache keeps track of the state of intro points which is needed when we
have failures when using them. It is similar to the failure cache of the
legacy system.
At this commit, it is unused but initialized, cleanup and freed.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This moves it to hs_client.c so it can be used by both system (legacy and
prop224). For now, only the legacy system uses it.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Don't assert() on rend_data when closing circuits to report an IP failure. The
prop224 code doesn't have yet the support for this.
Signed-off-by: David Goulet <dgoulet@torproject.org>
For now, prop224 doesn't have a mechanism to note down connection attempts so
we only do it for legacy system using rend_data.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The client can now handle RENDEZVOUS2 cell when they arrive. This consolidate
both hidden service version in one function.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The client is now able to handle an INTRODUCE_ACK cell and do the appropriate
actions.
An intro point failure cache is missing and a way to close all intro point
that were launched in parallel. Some notes are in the comment for that.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function to parse an INTRODUCE_ACK cell in hs_cell.c. Furthermore, add
an enum that lists all possible expected status code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
From an edge connection object, add a function that randomly pick an
introduction point for the requested service.
This follows the code design of rend_client_get_random_intro() and returns an
extend_info_t object ready to be used to extend to.
At this commit, it is not used yet.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When a descriptor fetch has completed and it has been successfully stored in
the client cache, this callback will take appropriate actions to attach
streams and/or launch neede circuits to connect to the service.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Client now handles a RENDEZVOUS_ESTABLISHED cell when it arrives on the
rendezvous circuit. This new function applies for both the legacy system and
prop224.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function to build the cell.
Add a the logic to send the cell when the rendezvous circuit opens.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Make a single entry point for the entire HS subsystem when a client circuit
opens (every HS version).
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function in hs_cell.{c|h} for a client to build an INTRODUCE1 cell using
an object that contains all the needed keys to do so.
Add an entry point in hs_client.c that allows a tor client to send an
INTRODUCE1 cell on a given introduction circuit.
It includes the building of the cell, sending it and the setup of the
rendezvous circuit with the circuit identifier.
The entry point function is still unused at this commit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The hs circuit file had this function that takes a list of link specifiers and
return a newly allocated extend info object. Make it public so the client side
can also use it to be able to extend to introduction point.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Put all the possible assert() we can do on a client introduction circuit in
one helper function to make sure it is valid and usable.
It is disabled for now so gcc doesn't complain that we have a unused function.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit only moves code into a function. The client code will need a way
to take a bunch of descriptor link specifier object and encode them into link
specifiers objects.
Make this a public function so it can be used outside of hs_descriptor.c.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This will be useful to the hidden service subsystem that needs to go over all
connections of a certain state to attach them to a hidden service circuit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- Add tests that ensure that SOCKS requests for v2/v3 addresses get
intercepted and handled.
- Add test that stores and lookups an HS descriptor in the client-side cache.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Once a descriptor has been successfully downloaded from an HSDir, we flag the
directory connection to "has fetched descriptor" so the connection subsystem
doesn't trigger a new fetch on success.
Same has DIR_PURPOSE_HAS_FETCHED_RENDDESC_V2 but for prop224.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- Also add tests for the hidserv_req subsystem.
- Introduce purge_v2_hidserv_req() wrapper to simplify v2 code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Specifically move the pick_hsdir() function and all the HSDir request tracking
code. We plan to use all that code both for v2 and v3.
This commit only moves code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Right now there's a single warn_if_unnamed flag for
router_get_consensus_status_by_nickname() and
node_get_by_nickname(), that is nearly always 1. I've turned it
into an 'unsigned' bitfield, and inverted its sense. I've added the
flags argument to node_get_by_hex_id() too, though it does nothing
there right now.
I've removed the router_get_consensus_status_by_nickname() function,
since it was only used in once place.
This patch changes the warning behavior of GETINFO ns/name/<name>,
since all other name lookups from the controller currently warn.
Later I'm going to add more flags, for ed25519 support.
We will need to edit this function, and it's already pretty huge. Let's make
it a bit smaller.
This commit moves code, fixes a 80 char line and add two lines at the start to
make it compile. Trivial change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We need this func so that we recognize SOCKS conns to v3 addresses.
- Also rename rend_valid_service_id() to rend_valid_v2_service_id()
- Also move parse_extended_hostname() tests to their own unittest, and
add a v3 address to the test as well.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When we enter overlap mode we start using the next hsdir index of
relays. However, we only compute the next hsdir index of relays when we
receive a consensus or their descriptor. This means that there is a
window of time between entering the overlap period and fetching the
consensus where relays have their next hsdir index uninitialized. This
patch fixes this by recomputing all hsdir indices when we first enter
the overlap period.
We want to reupload our descriptor if its set of responsible HSDirs
changed to minimize reachability issues.
This patch adds a callback everytime we get new dirinfo which checks if
the hash ring changed and reuploads descriptor if needed.
Because the HS subsystem calls it every second, change the log level to debug
so it doesn't spam the info log.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Our Windows compiler treats "time_t" as long long int but Linux likes it
long int so cast those to make Windows happy.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This change should improve overhead for downloading small numbers of
descriptors and microdescriptors by improving compression
performance and lowering directory request overhead.
Closes ticket 23220.
The old implementation would fail with super-long inputs. We never
gave it any, but still, it's nicer to dtrt here.
Reported by Guido Vranken. Fixes bug 19281.
- Fix various ssize_t/size_t confusions in the tests.
- Fix a weird memset argument:
"bad_memset: Argument -16 in memset loses precision in
memset(&desc_two->blinded_kp.pubkey.pubkey, -16, 32UL)."
- Fix check_after_deref instance in check_state_line_for_service_rev_counter():
"check_after_deref: Null-checking items suggests that it may be null,
but it has already been dereferenced on all paths leading to the
check."
Add a common function for both legacy and prop224 hidden service to increment
and decrement the rendezvous stream counter on an origin circuit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We used to have a small HS desc cert lifetime but those certs can stick
around for 36 hours if they get initialized in the beginning of overlap
period.
[warn] Bug: Non-fatal assertion !(hs_desc_encode_descriptor(desc->desc, &desc->signing_kp, &encoded_desc) < 0) failed in
upload_descriptor_to_hsdir at src/or/hs_service.c:1886. Stack trace: (on Tor 0.3.2.0-alpha-dev b4a14555597fb9b3)
I repurposed the old directory_request_set_hs_ident() into a new
directory_request_upload_set_hs_ident() which is only used for the
upload purpose and so it can assert on the dir_purpose.
When coding the client-side we can make a second function for fetch.
We used to do:
h = H(BLIND_STRING | H(A | s | B | N )
when we should be doing:
h = H(BLIND_STRING | A | s | B | N)
Change the logic so that hs_common.c does the hashing, and our ed25519
libraries just receive the hashed parameter ready-made. That's easier
than doing the hashing on the ed25519 libraries, since that means we
would have to pass them a variable-length param (depending on whether
's' is set or not).
Also fix the ed25519 test vectors since they were also double hashing.
We also had to alter the SRV functions to take a consensus as optional
input, since we might be setting our HSDir index using a consensus that
is currently being processed and won't be returned by the
networkstatus_get_live_consensus() function.
This change has two results:
a) It makes sure we are using a fresh consensus with the right SRV value
when we are calculating the HSDir hash ring.
b) It ensures that we will not use the sr_get_current/previous()
functions when we don't have a consensus which would have falsely
triggered the disaster SRV logic.
In Nick's words:
"We want to always return false if the platform is a Tor version, and it
is not as new as 0.3.0.8 -- but if the platform is not a Tor version, or
if the version is as new as 0.3.0.8, then we want to obey the protocol
list.
That way, other implementations of our protocol won't have to claim any
particular Tor version, and future versions of Tor will have the freedom
to drop this protocol in the distant future."
- Fix log message format string.
- Do extra circuit purpose check.
- wipe memory in a clear function
- Make sure we don't double add intro points in our list
- Make sure we don't double close intro circuits.
- s/tt_u64_op/tt_i64_op/
Turns out that introduction points don't care about the INTRODUCE2 cell
format as long as the top field is LEGACY_KEY_ID as expected. So let's
use a single INTRODUCE format regardless of the introduction point being
legacy or not.
This also removes the polymorphic void* situation.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We consider to be in overlap mode when we are in the period of time between a
fresh SRV and the beginning of the new time period (in the normal network this
is between 00:00 and 12:00 UTC). This commit edits that function to use the
above semantic logic instead of absolute times.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It used to be that time periods were 24 hours long even on chutney,
which made testing harder. With this commit, time periods have the same
length as a full SRV protocol run, which means that they will change
every 4 minutes in a 10-second voting interval chutney network!
Make this function public so we can use it both in hs_circuit.c and
hs_service.c to avoid code duplication.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This introduces a callback to relaunch a service rendezvous circuit when a
previous one failed to build or expired.
It unifies the legacy function rend_service_relaunch_rendezvous() with one for
specific to prop224. There is now only one entry point for that which is
hs_circ_retry_service_rendezvous_point() supporting both legacy and prop224
circuits.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Change the timing for intro point's lifetime and maximum amount of circuit we
are allowed to launch in a TestingNetwork. This is particurlarly useful for
chutney testing to test intro point rotation.
Signed-off-by: David Goulet <dgoulet@torproject.org>
When the circuit is about to be freed which has been marked close before, for
introduction circuit we now call this has_closed() callback so we can cleanup
any introduction point that have retried to many times or at least flag them
that their circuit is not established anymore.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We used to use NULL subcredential which is a terrible terrible idea. Refactor
HS unittests to use subcredentials.
Also add some non-fatal asserts to make sure that we always use subcredentials
when decoding/encoding descs.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit refactors the handle_hs_exit_conn() function introduced at a prior
commit that connects the rendezvous circuit to the edge connection used to
connect to the service virtual port requested in a BEGIN cell.
The refactor adds the support for prop224 adding the
hs_service_set_conn_addr_port() function that has the same purpose has
rend_service_set_connection_addr_port() from the legacy code.
The rend_service_set_connection_addr_port() has also been a bit refactored so
the common code can be shared between the two HS subsystems (legacy and
prop224).
In terms of functionallity, nothing has changed, we still close the circuits
in case of failure for the same reasons as the legacy system currently does.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit simply moves the code from the if condition of a rendezvous
circuit to a function to handle such a connection. No code was modified
_except_ the use or rh.stream_id changed to n_stream->stream_id so we don't
have to pass the cell header to the function.
This is groundwork for prop224 support which will break down the
handle_hs_exit_conn() depending on the version of hidden service the circuit
and edge connection is for.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Introduction point are rotated either if we get X amounts of INTRODUCE2 cells
on it or a time based expiration. This commit adds two consensus parameters
which are the min and max value bounding the random value X.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Imagine a Tor network where you have only 8 nodes available due to some
reasons. And your hidden service wants 8 introduction points. Everything is
fine but then a node goes down bringing the network to 7. The service will
retry 3 times that node and then give up but keep it in a failure cache for 5
minutes (INTRO_CIRC_RETRY_PERIOD) so it doesn't retry it non stop and exhaust
the maximum number of circuit retry.
In the real public network today, this is unlikely to happen unless the
ExcludeNodes list is extremely restrictive.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit adds a directory command function to make an upload directory
request for a service descriptor.
It is not used yet, just the groundwork.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This hsdir index value is used to give an index value to all node_t (relays)
that supports HSDir v3. An index value is then computed using the blinded key
to know where to fetch/upload the service descriptor from/to.
To avoid computing that index value everytime the client/service needs it, we
do that everytime we get a new consensus which then doesn't change until the
next one. The downside is that we need to sort them once we need to compute
the set of responsible HSDir.
Finally, the "hs_index" function is also added but not used. It will be used
in later commits to compute which node_t is a responsible HSDir for the
service we want to fetch/upload the descriptor.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add this helper function that can lookup and return all the needed object from
a circuit identifier. It is a pattern we do often so make it nicer and avoid
duplicating it everywhere.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add the entry point from the circuit subsystem of "circuit has opened" which
is for all type of hidden service circuits. For the introduction point, this
commit actually adds the support for handling those circuits when opened and
sending ESTABLISH_INTRO on a circuit.
Rendevzou point circuit aren't supported yet at this commit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit adds the functionality for a service to build its descriptor.
Also, a global call to build all descriptors for all services is added to the
service scheduled events.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add the main loop entry point to the HS service subsystem. It is run every
second and make sure that all services are in their quiescent state after that
which means valid descriptors, all needed circuits opened and latest
descriptors have been uploaded.
For now, only v2 is supported and placeholders for v3 actions for that main
loop callback.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function for both the client and service side that is building a blinded
key from a keypair (service) and from a public key (client). Those two
functions uses the current time period information to build the key.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a new and free function for hs_desc_intro_point_t so the service can use
them to setup those objects properly.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Fixes bug 23106; bugfix on 0.2.4.8-alpha.
Fortunately, we only support big-endian and little-endian platforms,
and on both of those, hton*() and ntoh*() behave the same. And if
we did start to support middle endian systems (haha, no), most of
_those_ have hton*(x) == ntoh*(x) too.
Since we start with desc_clean_since = 0, we should have been
starting with non-null desc_dirty_reason.
Fixes bug 22884; bugfix on 0.2.3.4-alpha when X-Desc-Gen-Reason was
added.
Patch from Vort; fixes bug 23081; bugfix on fd992deeea in
0.2.1.16-rc when set_main_thread() was introduced.
See the changes file for a list of all the symptoms this bug has
been causing when running Tor as a Windows Service.
The condition was always true meaning that we would reconsider updating our
directory information every 2 minutes.
If valid_until is 6am today, then now - 24h == 1pm yesterday which means that
"valid_until < (now - 24h)" is false. But at 6:01am tomorrow, "valid_until <
(now - 24h)" becomes true which is that point that we shouldn't trust the
consensus anymore.
Fixes#23091
Signed-off-by: David Goulet <dgoulet@torproject.org>
One log statement was a warning and has been forgotten. It is triggered for a
successful attempt at introducting from a client.
It has been reported here:
https://lists.torproject.org/pipermail/tor-relays/2017-August/012689.html
Three other log_warn() statement changed to protocol warning because they are
errors that basically can come from the network and thus triggered by anyone.
Fixes#23078.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Now that half the threads are permissive and half are strict, we
need to make sure we have at least two threads, so that we'll
have at least one of each kind.
A prop224 descriptor was missing the onion key for an introduction point which
is needed to extend to it by the client.
Closes#22979
Signed-off-by: David Goulet <dgoulet@torproject.org>
Remove the legacy intro point key because both service and client only uses
the ed25519 key even though the intro point chosen is a legacy one.
This also adds the CLIENT_PK key that is needed for the ntor handshake.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We need to keep these around for TAP and old-style hidden services,
but they're obsolete, and we shouldn't encourage anyone to use them.
So I've added "obsolete" to their names, and a comment explaining
what the problem is.
Closes ticket 23026.
It makes more sense to have the version in the configuration object of the
service because it is afterall a torrc option (HiddenServiceVersion).
Signed-off-by: David Goulet <dgoulet@torproject.org>
The added function frees any allocated pointers in a service configuration
object and reset all values to 0.
Signed-off-by: David Goulet <dgoulet@torproject.org>
As per nickm suggestion, an array of config handlers will not play well with
our callgraph tool.
Instead, we'll go with a switch case on the version which has a good side
effect of allowing us to control what we pass to the function intead of a fix
set of parameters.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a helper function to parse uint64_t and also does logging so we can reduce
the amount of duplicate code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This tests our hs_config.c API to properly load v3 services and register them
to the global map. It does NOT test the service object validity, that will be
the hs service unit test later on.
At this commit, we have 100% code coverage of hs_config.c.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Every hidden service option don't apply to every version so this new function
makes sure we don't have for instance an option that is only for v2 in a v3
configured service.
This works using an exclude lists for a specific version. Right now, there is
only one option that is not allowed in v3. The rest is common.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Try to load or/and generate service keys for v3. This write both the public
and private key file to disk along with the hostname file containing the onion
address.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This also adds unit test and a small python script generating a deterministic
test vector that a unit test tries to match.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit adds the support in the HS subsystem for loading a service from a
set of or_options_t and put them in a staging list.
To achieve this, service accessors have been created and a global hash map
containing service object indexed by master public key. However, this is not
used for now. It's ground work for registration process.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Introduces hs_init() located in hs_common.c which initialize the entire HS v3
subsystem. This is done _prior_ to the options being loaded because we need to
allocate global data structure before we load the configuration.
The hs_free_all() is added to release everything from tor_free_all().
Note that both functions do NOT handle v2 service subsystem but does handle
the common interface that both v2 and v3 needs such as the cache and
circuitmap.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add the hs_config.{c|h} files contains everything that the HS subsystem needs
to load and configure services. Ultimately, it should also contain client
functions such as client authorization.
This comes with a big refactoring of rend_config_services() which has now
changed to only configure a single service and it is stripped down of the
common directives which are now part of the generic handler.
This is ground work for prop224 of course but only touches version 2 services
and add XXX note for version 3.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This object is the foundation of proposal 224 service work. It will change
and be adapted as it's being used more and more in the codebase. So, this
version is just a basic skeleton one that *will* change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
These statistics were largely ununsed, and kept track of statistical information
on things like how many time we had done TLS or how many signatures we had
verified. This information is largely not useful, and would only be logged
after receiving a SIGUSR1 signal (but only if the logging severity level was
less than LOG_INFO).
* FIXES#19871.
* REMOVES note_crypto_pk_op(), dump_pk_op(), and pk_op_counts from
src/or/rephist.c.
* REMOVES every external call to these functions.
Relay operators (especially bridge operators) can use this to lower
or raise the number of consensuses that they're willing to hold for
diff generation purposes.
This enables a workaround for bug 22883.
Groundwork for more prop224 service and client code. This object contains
common data that both client and service uses.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- Move some crypto structures so that they are visible by tests.
- Introduce a func to count number of hops in cpath which will be used
by the tests.
- Mark a function as mockable.
This commit paves the way for the e2e circuit unittests.
Add a stub for the prop224 equivalent of rend_client_note_connection_attempt_ended().
That function was needed for tests, since the legacy function would get
called when we attach streams and our client-side tests would crash with
assert failures on rend_data.
This also introduces hs_client.[ch] to the codebase.
This commit adds most of the work of #21859. It introduces hs_circuit.c
functions that can handle the setup of e2e circuits for prop224 hidden
services, and also for legacy hidden service clients. Entry points are:
prop224 circuits: hs_circuit_setup_e2e_rend_circ()
legacy client-side circuits: hs_circuit_setup_e2e_rend_circ_legacy_client()
This commit swaps the old rendclient code to use the new API.
I didn't try to accomodate the legacy service-side code in this API, since
that's too tangled up and it would mess up the new API considerably IMO (all
this service_pending_final_cpath_ref stuff is complicated and I didn't want to
change it).
Signed-off-by: David Goulet <dgoulet@torproject.org>
The legacy HS circuit code uses rend_data to match between circuits and
streams. We refactor some of that code so that it understands hs_ident
as well which is used for prop224.
circuit_init_cpath_crypto() is responsible for creating the cpath of legacy
SHA1/AES128 circuits currently. We want to use it for prop224 circuits, so we
refactor it to create circuits with SHA3-256 and AES256 as well.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We want to use the circuit_init_cpath_crypto() function to setup our
cpath, and that function accepts a key array as input. So let's make our
HS ntor key expansion function also return a key array as output,
instead of a struct.
Also, we actually don't need KH from the key expansion, so the key
expansion output can be one DIGEST256_LEN shorter. See here for more
info: https://trac.torproject.org/projects/tor/ticket/22052#comment:3
When the new path selection logic went into place, I accidentally
dropped the code that considered the _family_ of the exit node when
deciding if the guard was usable, and we didn't catch that during
code review.
This patch makes the guard_restriction_t code consider the exit
family as well, and adds some (hopefully redundant) checks for the
case where we lack a node_t for a guard but we have a bridge_info_t
for it.
Fixes bug 22753; bugfix on 0.3.0.1-alpha. Tracked as TROVE-2016-006
and CVE-2017-0377.
This makes our directory code check if a client is trying to fetch a
document that matches a digest from our latest consensus document.
See: https://bugs.torproject.org/22702
This patch ensures that the published_out output parameter is set to the
current consensus cache entry's "valid after" field.
See: https://bugs.torproject.org/22702
As of ac2f6b608a in 0.2.1.19-alpha,
Sebastian fixed bug 888 by marking descriptors as "impossible" by
digest if they got rejected during the
router_load_routers_from_string() phase. This fix stopped clients
and relays from downloading the same thing over and over.
But we never made the same change for descriptors rejected during
dirserv_add_{descriptor,extrainfo}. Instead, we tried to notice in
advance that we'd reject them with dirserv_would_reject().
This notice-in-advance check stopped working once we added
key-pinning and didn't make a corresponding key-pinning change to
dirserv_would_reject() [since a routerstatus_t doesn't include an
ed25519 key].
So as a fix, let's make the dirserv_add_*() functions mark digests
as undownloadable when they are rejected.
Fixes bug 22349; I am calling this a fix on 0.2.1.19-alpha, though
you could also argue for it being a fix on 0.2.7.2-alpha.
This mistake causes two possible bugs. I believe they are both
harmless IRL.
BUG 1: memory stomping
When we call the memset, we are overwriting two 0 bytes past the end
of packed_cell_t.body. But I think that's harmless in practice,
because the definition of packed_cell_t is:
// ...
typedef struct packed_cell_t {
TOR_SIMPLEQ_ENTRY(packed_cell_t) next;
char body[CELL_MAX_NETWORK_SIZE];
uint32_t inserted_time;
} packed_cell_t;
So we will overwrite either two bytes of inserted_time, or two bytes
of padding, depending on how the platform handles alignment.
If we're overwriting padding, that's safe.
If we are overwriting the inserted_time field, that's also safe: In
every case where we call cell_pack() from connection_or.c, we ignore
the inserted_time field. When we call cell_pack() from relay.c, we
don't set or use inserted_time until right after we have called
cell_pack(). SO I believe we're safe in that case too.
BUG 2: memory exposure
The original reason for this memset was to avoid the possibility of
accidentally leaking uninitialized ram to the network. Now
remember, if wide_circ_ids is false on a connection, we shouldn't
actually be sending more than 512 bytes of packed_cell_t.body, so
these two bytes can only leak to the network if there is another bug
somewhere else in the code that sends more data than is correct.
Fortunately, in relay.c, where we allocate packed_cell_t in
packed_cell_new() , we allocate it with tor_malloc_zero(), which
clears the RAM, right before we call cell_pack. So those
packed_cell_t.body bytes can't leak any information.
That leaves the two calls to cell_pack() in connection_or.c, which
use stack-alocated packed_cell_t instances.
In or_handshake_state_record_cell(), we pass the cell's contents to
crypto_digest_add_bytes(). When we do so, we get the number of
bytes to pass using the same setting of wide_circ_ids as we passed
to cell_pack(). So I believe that's safe.
In connection_or_write_cell_to_buf(), we also use the same setting
of wide_circ_ids in both calls. So I believe that's safe too.
I introduced this bug with 1c0e87f6d8
back in 0.2.4.11-alpha; it is bug 22737 and CID 1401591
This introduces node_supports_v3_hsdir() and node_supports_ed25519_hs_intro()
that checks the routerstatus_t of a node and if not present, checks the
routerinfo_t.
This is groundwork for proposal 224 service implementation in #20657.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible that at some point in time a client will encounter unknown or
new fields for an introduction point in a descriptor so let them ignore it for
forward compatibility.
Signed-off-by: David Goulet <dgoulet@torproject.org>
If `validate_only` is true, then just validate the configuration without warning
about it. This way, we only emit warnings when the listener is actually opened.
(Otherwise, every time we parse the config we will might re-warn and we would
need to keep state; whereas the listeners are only opened once.)
* FIXES#4019.
-authdir_mode_handles_descs(options, ROUTER_PURPOSE_BRIDGE) to authdir_mode_bridge(options).
- authdir_mode_handles_descs(options, ROUTER_PURPOSE_GENERAL) to authdir_mode_v3(options).
This prevents us from calling
allowed_anonymous_connection_compression_method() on the unused
guessed method (if any), and rejecting something that was already
safe to use.
Rationale: When use a guessed compression method, we already gave a
PROTOCOL_WARN when our guess differed from the declared method,
AND we gave a PROTOCOL_WARN when the declared method failed. It is
not a protocol problem that the guessed method failed too; it's just
a recovery attempt that failed.
A cached_dir_t object (for now) is always compressed with
DEFLATE_METHOD, but in handle_get_status_vote() to we were using the
general compression-negotiation code decide what compression to
claim we were using.
This was one of the reasons behind 22502.
Fixes bug 22669; bugfix on 0.3.1.1-alpha
Move the HTTPProxy option to the deprecated list so for now it will only warn
users but feature is still in the code which will be removed in a future
stable version.
Fixes#20575
Signed-off-by: David Goulet <dgoulet@torproject.org>
On an hidden service rendezvous circuit, a BEGIN_DIR could be sent
(maliciously) which would trigger a tor_assert() because
connection_edge_process_relay_cell() thought that the circuit is an
or_circuit_t but is an origin circuit in reality.
Fixes#22494
Reported-by: Roger Dingledine <arma@torproject.org>
Signed-off-by: David Goulet <dgoulet@torproject.org>
This fixes an assertion failure in relay_send_end_cell_from_edge_() when an
origin circuit and a cpath_layer = NULL were passed.
A service rendezvous circuit could do such a thing when a malformed BEGIN cell
is received but shouldn't in the first place because the service needs to send
an END cell on the circuit for which it can not do without a cpath_layer.
Fixes#22493
Reported-by: Roger Dingledine <arma@torproject.org>
Signed-off-by: David Goulet <dgoulet@torproject.org>
Apparently, the unit tests relied on being able to make ed->x509
link certs even when they hadn't set any server flags in the
options. So instead of making "client" mean "never generate an
ed->x509 cert", we'll have it mean "it's okay not to generate an
ed->x509 cert".
(Going with a minimal fix here, since this is supposed to be a
stable version.)
It's okay to call add_ed25519_cert with a NULL argument: so,
document that. Also, add a tor_assert_nonfatal() to catch any case
where we have failed to set own_link_cert when conn_in_server_mode.
Whenever we rotate our TLS context, we change our Ed25519
Signing->Link certificate. But if we've already started a TLS
connection, then we've already sent the old X509 link certificate,
so the new Ed25519 Signing->Link certificate won't match it.
To fix this, we now store a copy of the Signing->Link certificate
when we initialize the handshake state, and send that certificate
as part of our CERTS cell.
Fixes one case of bug22460; bugfix on 0.3.0.1-alpha.
Previously we could sometimes change our signing key, but not
regenerate the certificates (signing->link and signing->auth) that
were signed with it. Also, we would regularly replace our TLS x.509
link certificate (by rotating our TLS context) but not replace our
signing->link ed25519 certificate. In both cases, the resulting
inconsistency would make other relays reject our link handshakes.
Fixes two cases of bug 22460; bugfix on 0.3.0.1-alpha.
The encrypted_data_length_is_valid() function wasn't validating correctly the
length of the encrypted data of a v3 descriptor. The side effect of this is
that an HSDir was rejecting the descriptor and ultimately not storing it.
Fixes#22447
Signed-off-by: David Goulet <dgoulet@torproject.org>
A fair number of our mock_impl declarations were messed up so that
even our special AM_ETAGSFLAGS couldn't find them.
This should be a whitespace-only patch.
When directory authorities reject a router descriptor due to keypinning,
free the router descriptor rather than leaking the memory.
Fixes bug 22370; bugfix on 0.2.7.2-alpha.
If we add the element itself, we will later free it when we free the
descriptor, and the next time we go to look at MyFamily, things will
go badly.
Fixes the rest of bug 22368; bugfix on 0.3.1.1-alpha.
If we free them here, we will still attempt to access the freed memory
later on, and also we will double-free when we are freeing the config.
Fixes part of bug 22368.
This patch lifts the return value, rv, variable to the beginning of the
function, adds a 'done' label for clean-up and function exit and makes
the rest of the function use the rv value + goto done; instead of
cleaning up in multiple places.
See: https://bugs.torproject.org/22305
We used to not set the guard state in launch_direct_bridge_descriptor_fetch().
So when a bridge descriptor fetch failed, the guard subsystem would never
learn about the fail (and hence the guard's reachability state would not
be updated).
We used to not set the guard state in launch_direct_bridge_descriptor_fetch().
So when a bridge descriptor fetch failed, the guard subsystem would never
learn about the fail (and hence the guard's reachability state would not
be updated).
Directory authorities now reject relays running versions
0.2.9.1-alpha through 0.2.9.4-alpha, because those relays
suffer from bug 20499 and don't keep their consensus cache
up-to-date.
Resolves ticket 20509.
This gives an indication in the log that Tor was built with Rust
support, as well as laying some groundwork for further string-returning
APIs to be converted to Rust
config_get_lines is now split into two functions:
- config_get_lines which is the same as before we had %include
- config_get_lines_include which actually processes %include
Before we've set our options, we can neither call get_options() nor
networkstatus_get_latest_consensus().
Fixes bug 22252; bugfix on 4d9d2553ba
in 0.2.9.3-alpha.
Replace the 177 fallbacks originally introduced in Tor 0.2.9.8 in
December 2016 (of which ~126 were still functional), with a list of
151 fallbacks (32 new, 119 existing, 58 removed) generated in May 2017.
Resolves ticket 21564.
This patch makes us use FALLBACK_COMPRESS_METHOD to try to fetch an
object from the consensus diff manager in case no mutually supported
result was found. This object, if found, is then decompressed using the
spooling system to the client.
See: https://bugs.torproject.org/21667
This patch removes the calls to spooled_resource_new() when trying to
download the consensus. All calls should now be going through the
consdiff manager.
See: https://bugs.torproject.org/21667
This patch ensures that we use the current consensus in the case where
no consensus diff was found or a consensus diff wasn't requested.
See: https://bugs.torproject.org/21667
These still won't do anything till I get the values to be filled in.
Also, I changed the API a little (with corresponding changes in
directory.c) to match things that it's easier to store.
This patch changes handle_get_current_consensus() to make it read the
current consensus document from the consensus caching subsystem.
See: https://bugs.torproject.org/21667
Failure to do this caused an assertion failure with #22246 . This
assertion failure can be triggered remotely, so we're tracking it as
medium-severity TROVE-2017-002.