This is related to ticket #40360 which found this problem when a Bridge entry
with a transport name (let say obfs4) is set without a fingerprint:
Bridge obfs4 <IP>:<PORT> cert=<...> iat-mode=0
(Notice, no fingerprint between PORT and "cert=")
Problem: commit 09c6d03246 added a check in
get_sampled_guard_for_bridge() that would return NULL if the selected bridge
did not have a valid transport name (that is the Bridge transport name that
corresponds to a ClientTransportPlugin).
Unfortuantely, this function is also used when selecting our eligible guards
which is done *before* the transport list is populated and so the added check
for the bridge<->transport name is querying an empty list of transports
resulting in always returning NULL.
For completion, the logic is: Pick eligible guards (use bridge(s) if need be)
then for those, initiate a connection to the pluggable transport proxy and
then populate the transport list once we've connected.
Back to get_sampled_guard_for_bridge(). As said earlier, it is used when
selecting our eligible guards in a way that prevents us from selecting
duplicates. In other words, if that function returns non-NULL, the selection
continues considering the bridge was sampled before. But if it returns NULL,
the relay is added to the eligible list.
This bug made it that our eligible guard list was populated with the *same*
bridge 3 times like so (remember no fingerprint):
[info] entry_guards_update_primary(): Primary entry guards have changed. New primary guard list is:
[info] entry_guards_update_primary(): 1/3: [bridge] ($0000000000000000000000000000000000000000)
[info] entry_guards_update_primary(): 2/3: [bridge] ($0000000000000000000000000000000000000000)
[info] entry_guards_update_primary(): 3/3: [bridge] ($0000000000000000000000000000000000000000)
When tor starts, it will find the bridge fingerprint by connecting to it and
will then update the primary guard list by calling
entry_guard_learned_bridge_identity() which then goes and update only 1 single
entry resulting in this list:
[debug] sampled_guards_update_consensus_presence(): Sampled guard [bridge] ($<FINGERPRINT>) is still listed.
[debug] sampled_guards_update_consensus_presence(): Sampled guard [bridge] ($0000000000000000000000000000000000000000) is still listed.
[debug] sampled_guards_update_consensus_presence(): Sampled guard [bridge] ($0000000000000000000000000000000000000000) is still listed.
And here lies the problem, now tor is stuck attempting to wait for a valid
descriptor for at least 2 guards where the second one is a bunch of zeroes and
thus tor will never fully bootstraps:
[info] I learned some more directory information, but not enough to build a
circuit: We're missing descriptors for 1/2 of our primary entry guards
(total microdescriptors: 6671/6703). That's ok. We will try to fetch missing
descriptors soon.
Now, why passing the fingerprint then works? This is because the list of
guards contains 3 times the same bridge but they all have a fingerprint and so
the descriptor can be found and tor can bootstraps.
The solution here is to entirely remove the transport name check in
get_sampled_guard_for_bridge() since the transport_list is empty at that
point. That way, the eligible guard list only gets 1 entry, the bridge, and
can then go on to bootstrap properly.
It is OK to do so since when launching a bridge descriptor fetch, we validate
that the bridge transport name is OK and thus avoid connecting to a bridge
without a ClientTransportPlugin. If we wanted to keep the check in place, we
would need to populate the transport_list much earlier and this would require
a much bigger refactoring.
Fixes#40360
Signed-off-by: David Goulet <dgoulet@torproject.org>
When seccomp sandbox is active, SAVECONF failed because it was not
able to save the backup files for torrc. This commit simplifies
the implementation of SAVECONF and sandbox by making it keep only
one backup of the configuration file.
On Linux systems, glob automatically ignores the errors ENOENT and
ENOTDIR because they are expected during glob expansion. But BSD
systems do not ignore these, resulting in glob failing when globs
expand to invalid paths. This is fixed by adding a custom error
handler that ignores only these two errors and removing the
GLOB_ERR flag as it makes glob fail even if the error handler
ignores the error and is unnecessary as the error handler will
make glob fail on all other errors anyway.
Fortunately, our tor_free() is setting the variable to NULL after so we were
in a situation where NULL was always used instead of the transport name.
This first appeared in 894ff2dc84 and results in
basically no bridge with a transport being able to use DoS defenses.
Fixes#40345
Signed-off-by: David Goulet <dgoulet@torproject.org>
We use it in router.c, where chunks are joined with "", not with
NL... so leaving off the terminating NL will lead to an unparseable
extrainfo.
Found by toralf. Bug not in any released Tor.
```
src/feature/stats/rephist.c: In function ‘overload_happened_recently’:
src/feature/stats/rephist.c:215:21: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
if (overload_time > approx_time() - 3600 * n_hours) {
```
from https://gitlab.torproject.org/tpo/core/tor/-/issues/40341#note_2729364
Clients now check whether their streams are attempting to re-enter
the Tor network (i.e. to send Tor traffic over Tor), and they close
them preemptively if they think exit relays will refuse them.
See bug 2667 for details. Resolves ticket 40271.
- Implement overload statistics structure.
- Implement function that keeps track of overload statistics.
- Implement function that writes overload statistics to descriptor.
- Unittest for the whole logic.
This option changes the time for which a bandwidth measurement period
must have been in progress before we include it when reporting our
observed bandwidth in our descriptors. Without this option, we only
consider a time period towards our maximum if it has been running
for a full day. Obviously, that's unacceptable for testing
networks, where we'd like to get results as soon as possible.
For non-testing networks, I've put a (somewhat arbitrary) 2-hour
minimum on the option, since there are traffic analysis concerns
with immediate reporting here.
Closes#40337.
We were looking for the first instance of "directory-signature "
when instead the correct behavior is to look for the first instance
of "directory-signature " at the start of a line.
Unfortunately, this can be exploited as to crash authorities while
they're voting.
Fixes#40316; bugfix on 0.2.2.4-alpha. This is TROVE-2021-002,
also tracked as CVE-2021-28090.
It was made to convert Maxmind's "mmdb" files into the older format
that we used. But now thanks to IPFire Location, we don't have to
touch Maxmind formats any more. (See ticket #40224.)
When reloading a service, we can re-register a service and thus end up again
in the metrics store initialization code path which is fine. No need to BUG()
anymore.
Fixes#40334
Signed-off-by: David Goulet <dgoulet@torproject.org>
Use find_str_at_start_of_line(), not strstr() here: we don't want
to match "MemTotal: " if it appears in the middle of a line.
Fixes#40315; bugfix on 0.2.5.4-alpha.
The directory_fetches_from_authorities() is used to know if a client or relay
should fetch data from an authority early in the boot process.
We had a condition in that function that made a relay trigger that fetch if it
didn't know its address (so we can learn it). However, when this is called,
the address discovery has not been done yet so it would always return true for
a relay.
Furthermore, it would always trigger a log notice that the IPv4 couldn't be
found which was inevitable because the address discovery process has not been
done yet (done when building our first descriptor).
It is also important to point out that starting in 0.4.5.1-alpha, asking an
authority for an address is done during address discovery time using a one-hop
circuit thus independent from the relay deciding to fetch or not documents
from an authority.
Small fix also is to reverse the "IPv(4|6)Only" flag in the notice so that if
we can't find IPv6 it would output to use IPv4Only.
Fixes#40300
Signed-off-by: David Goulet <dgoulet@torproject.org>
Fix a bug introduced in 94b56eaa75 which
overwrite the connection message line.
Furthermore, improve how we generate that line by using a smartlist and change
the format so it is clearer of what is being rejected/detected and, if
applicable, which option is disabled thus yielding no stats.
Closes#40308
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behavior change except for logging. This is so the connection related
statistics are in the right object.
Related to #40253
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is a new detection type which is that a relay can now control the rate of
client connections from a single address.
The mechanism is pretty simple, if the rate/burst is reached, the address is
marked for a period of time and any connection from that address is denied.
Closes#40253
Signed-off-by: David Goulet <dgoulet@torproject.org>
When trying to find our address to publish, we would log notice if we couldn't
find it from the cache but then we would look at the suggested cache (which
contains the address from the authorities) in which we might actually have the
address.
Thus that log notice was misplaced. Move it down after the suggested address
cache lookup.
Closes#40300
Signed-off-by: David Goulet <dgoulet@torproject.org>
Relay will always publish 0 as DirPort value in their descriptor from now on
except authorities.
Related to #40282
Signed-off-by: David Goulet <dgoulet@torproject.org>
Regular relays are about to get their DirPort removed so that reachability
test is not useful anymore
Authorities will still use the DirPort but because network reentry towards
their DirPort is now denied network wide, this test is not useful anymore and
so it should simply be considered reachable at all time.
Part of #40282
Signed-off-by: David Goulet <dgoulet@torproject.org>
This bug made the pipeline fail. It basically tries to access a service we just
freed because it's still on the service list.
It only occurs about once every 10 tests and it looks like this:
$ ./src/test/test hs_control/hs_control_add_onion_helper_add_service
hs_control/hs_control_add_onion_helper_add_service: [forking] =================================================================
==354311==ERROR: AddressSanitizer: heap-use-after-free on address 0x613000000940 at pc 0x55a159251b03 bp 0x7ffc6abb5b30 sp 0x7ffc6abb5b28
READ of size 8 at 0x613000000940 thread T0
^[[A
#0 0x55a159251b02 in hs_service_ht_HT_FIND_P_ src/feature/hs/hs_service.c:153
#1 0x55a159251b02 in hs_service_ht_HT_FIND src/feature/hs/hs_service.c:153
#2 0x55a159251b02 in find_service src/feature/hs/hs_service.c:175
#3 0x55a159251c2c in register_service src/feature/hs/hs_service.c:188
#4 0x55a159262379 in hs_service_add_ephemeral src/feature/hs/hs_service.c:3811
#5 0x55a158e865e6 in test_hs_control_add_onion_helper_add_service src/test/test_hs_control.c:847
#6 0x55a1590fe77b in testcase_run_bare_ src/ext/tinytest.c:107
#7 0x55a1590fee98 in testcase_run_forked_ src/ext/tinytest.c:201
#8 0x55a1590fee98 in testcase_run_one src/ext/tinytest.c:267
#9 0x55a1590ffb06 in tinytest_main src/ext/tinytest.c:454
#10 0x55a158b1b1a4 in main src/test/testing_common.c:420
#11 0x7f7f06f8dd09 in __libc_start_main ../csu/libc-start.c:308
#12 0x55a158b21f69 in _start (/home/f/Computers/tor/mytor/src/test/test+0x372f69)
0x613000000940 is located 64 bytes inside of 344-byte region [0x613000000900,0x613000000a58)
freed by thread T0 here:
#0 0x7f7f0774ab6f in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:123
#1 0x55a158e86508 in test_hs_control_add_onion_helper_add_service src/test/test_hs_control.c:838
#2 0x55a1590fe77b in testcase_run_bare_ src/ext/tinytest.c:107
#3 0x55a1590fee98 in testcase_run_forked_ src/ext/tinytest.c:201
#4 0x55a1590fee98 in testcase_run_one src/ext/tinytest.c:267
#5 0x55a1590ffb06 in tinytest_main src/ext/tinytest.c:454
#6 0x55a158b1b1a4 in main src/test/testing_common.c:420
#7 0x7f7f06f8dd09 in __libc_start_main ../csu/libc-start.c:308
previously allocated by thread T0 here:
#0 0x7f7f0774ae8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x55a15948b728 in tor_malloc_ src/lib/malloc/malloc.c:45
#2 0x55a15948b7c0 in tor_malloc_zero_ src/lib/malloc/malloc.c:71
#3 0x55a159261bb5 in hs_service_new src/feature/hs/hs_service.c:4290
#4 0x55a159261f49 in hs_service_add_ephemeral src/feature/hs/hs_service.c:3758
#5 0x55a158e8619f in test_hs_control_add_onion_helper_add_service src/test/test_hs_control.c:832
#6 0x55a1590fe77b in testcase_run_bare_ src/ext/tinytest.c:107
#7 0x55a1590fee98 in testcase_run_forked_ src/ext/tinytest.c:201
#8 0x55a1590fee98 in testcase_run_one src/ext/tinytest.c:267
#9 0x55a1590ffb06 in tinytest_main src/ext/tinytest.c:454
#10 0x55a158b1b1a4 in main src/test/testing_common.c:420
#11 0x7f7f06f8dd09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-use-after-free src/feature/hs/hs_service.c:153 in hs_service_ht_HT_FIND_P_
Shadow bytes around the buggy address:
0x0c267fff80d0: 00 00 00 00 00 00 00 00 00 00 00 00 fa fa fa fa
0x0c267fff80e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c267fff80f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c267fff8100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c267fff8110: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>0x0c267fff8120: fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd
0x0c267fff8130: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c267fff8140: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
0x0c267fff8150: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c267fff8160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c267fff8170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==354311==ABORTING
[Lost connection!]
[hs_control_add_onion_helper_add_service FAILED]
1/1 TESTS FAILED. (0 skipped)
Now that v2 is off the table, 'rend_cache_lookup_result' is useless in
connection_ap_handle_onion() because it can only take the ENOENT value. Let's
remove that helper variable and handle the ENOENT case specifically when we
check the cache.
Also remove the 'onion_address' helper variable.
With v2 support for HSFETCH gone, we only support v3 addresses. We don't
support v2 descriptor IDs anymore and hence we can remove that code.
The code removed would ensure that if a v2 descriptor ID was provided, the user
also had to provide HSDirs explicitly.
In the v3 case, the code should work even if no HSDirs are provided, and Tor
would find the HSDirs itself.
For a user using "HiddenServiceVersion 2", a log warning is emitted indicating
that v2 is now obsolete instead of a confusing message saying that the version
is not supported.
Also, if an introduction point gets a legacy (v2) ESTABLISH_INTRO, we'll
simply close the circuit without emitting a protocol warning log onto the
relay.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
We still keep v2 rendezvous stats since we will allow them until the network
has entirely phased out.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is unfortunately massive but both functionalities were extremely
intertwined and it would have required us to actually change the HSv2 code in
order to be able to split this into multiple commits.
After this commit, there are still artefacts of v2 in the code but there is no
more support for service, intro point and HSDir.
The v2 support for rendezvous circuit is still available since that code is
the same for the v3 and we will leave it in so if a client is able to
rendezvous on v2 then it can still transfer traffic. Once the entire network
has moved away from v2, we can remove v2 rendezvous point support.
Related to #40266
Signed-off-by: David Goulet <dgoulet@torproject.org>
It can be called with strings that should have been
length-delimited, but which in fact are not. This can cause a
CPU-DoS bug or, in a worse case, a crash.
Since this function isn't essential, the best solution for older
Tors is to just turn it off.
Fixes bug 40286; bugfix on 0.2.2.1-alpha when dump_desc() was
introduced.
Now that exit relays don't allow exit connections to directory authority
DirPorts, the follow-up step is to make directory authorities stop doing
DirPort reachability checks.
Fixes#40287
Signed-off-by: David Goulet <dgoulet@torproject.org>
Turns out, we forgot to add the METRICS connection type fo the finished
flushing handler.
Fixes#40295
Signed-off-by: David Goulet <dgoulet@torproject.org>
The comment of that specific unit test wanted 4 ORPorts but for some reasons
we tested for 3 which before the previous commit related to #40289, test would
pass but it was in fact wrong.
Now the code is correct and 4 was in fact correct expected number of ports.
Related to #40289
Signed-off-by: David Goulet <dgoulet@torproject.org>
We were just looking at the family which is not correct because it is possible
to have two explicit ORPort for the same family but different addresses. One
example is:
ORPort 127.0.0.1:9001 NoAdvertise
ORPort 1.2.3.4:9001 NoListen
Thus, this patch now ignores ports that have different addresses iff they are
both explicits. That is, if we have this example, also two different
addresses:
ORPort 9001
ORPort 127.0.0.1:9001 NoAdvertise
The first one is implicit and second one is explicit and thus we have to
consider them for removal which in this case would remove the "ORPort 9001" in
favor of the second port.
Fixes#40289
Signe-off-by: David Goulet <dgoulet@torproject.org>
Fun bug where we thought we were using the default "false" value when an
implicit address was detected but if we had an explicit address before, the
flag was set to true and then we would only use that value.
And thus, for some configurations, implicit addresses would be flagged as
explicit and then configuring ports goes bad.
Related to #40289
Signed-off-by: David Goulet <dgoulet@torproject.org>
In other words, if PublishServerDescriptor is set to 0 and AssumeReachable to
1, then allow a relay to hold a RFC1918 address.
Reasons for this are documented in #40208Fixes#40208
Signed-off-by: David Goulet <dgoulet@torproject.org>
That comes from 685c4866ac which added that
check correctly except for when we build a descriptor.
We already omit the IPv6 address, if we need to, when we encode the descriptor
but we need to keep the actual discovered address in the descriptor so we can
notice future IP changes and be able to assess that we are not publishable as
long as we don't specifically set the omit flag.
This lead to also having tor noticing that our IP changed from <nothing> (no
IPv6 in the descriptor) to a discovered one which would trigger every minute.
Fixes#40279, #40288
Signed-off-by: David Goulet <dgoulet@torproject.org>
Handle the EOF situation for a metrics connection. Furthermore, if we failed
to fetch the data from the inbuf properly, mark the socket as closed because
the caller, connection_process_inbuf(), assumes that we did so on error.
Fixes#40257
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously we would warn in this case... but there's really no
justification for doing so, and it can only cause confusion.
Fixes bug #40281; bugfix on 0.4.0.1-alpha.
In two instances we must look at this flag:
1. When we build the descriptor so the IPv6 is NOT added to the descriptor in
case we judge that we need to omit the address but still publish.
2. When we are deciding if the descriptor is publishable. This flags tells us
that the IPv6 was not found reachable but we should still publish.
Fixes#40279
Signed-off-by: David Goulet <dgoulet@torproject.org>
While trying to resolve our CI issues, the Windows build broke with an
unused function error:
src/test/test_switch_id.c:37:1: error: ‘unprivileged_port_range_start’
defined but not used [-Werror=unused-function]
We solve this by moving the `#if !defined(_WIN32)` test above the
`unprivileged_port_range_start()` function defintion such that it is
included in its body.
This is an unreviewed commit.
See: tor#40275
We currently assume that the only way for Tor to listen on ports in the
privileged port range (1 to 1023), on Linux, is if we are granted the
NET_BIND_SERVICE capability. Today on Linux, it's possible to specify
the beginning of the unprivileged port range using a sysctl
configuration option. Docker (and thus the CI service Tor uses) recently
changed this sysctl value to 0, which causes our tests to fail as they
assume that we should NOT be able to bind to a privileged port *without*
the NET_BIND_SERVICE capability.
In this patch, we read the value of the sysctl value via the /proc/sys/
filesystem iff it's present, otherwise we assume the default
unprivileged port range begins at port 1024.
See: tor#40275
The TORPROTOCOL reason causes the client to close the circuit which is not
what we want because other valid streams might be on it.
Instead, CONNECTION_REFUSED will leave it open but will not allow more streams
to be attached to it. The client then open a new circuit to the destination.
Closes#40270
Signed-off-by: David Goulet <dgoulet@torproject.org>