Really, the uint32_t is only an optimization; any kind of unit
should work fine. Some users might want to use time_t or
monotime_coarse_t or something like that.
Begin by creating a lowest-level triple of the types needed to
implement a token bucket: a configuration, a timestamp, and the raw
bucket itself.
Note that for low-level buckets, the units of the timestamp and the
bucket itself are unspecified: each user can use a different type.
(This patch breaks check-spaces; a later patch will fix it)
This is a simple search-and-replace to rename the token bucket type
to indicate that it contains both a read and a write bucket, bundled
with their configuration. It's preliminary to refactoring the
bucket type.
This test works by having two post-loop events activate one another
in a tight loop. If the "post-loop" mechanism didn't work, this
would be enough to starve all other events.
This differs from our previous token bucket abstraction in a few
ways:
1) It is an abstraction, and not a collection of fields.
2) It is meant to be used with monotonic timestamps, which should
produce better results than calling gettimeofday over and over.
When size_t is 32 bits, the unit tests can't fit anything more than
4GB-1 into a size_t.
Additionally, tt_int_op() uses "long" -- we need tt_u64_op() to
safely test uint64_t values for equality.
Bug caused by tests for #24782 fix; not in any released Tor.
This patch changes the algorithm of compute_real_max_mem_in_queues() to
use 0.4 * RAM iff the system has more than or equal to 8 GB of RAM, but
will continue to use the old value of 0.75 * RAM if the system have less
than * GB of RAM available.
This patch also adds tests for compute_real_max_mem_in_queues().
See: https://bugs.torproject.org/24782
This roughly doubles our test coverage of the bridges.c module.
* ADD new testing module, .../src/test/test_bridges.c.
* CHANGE a few function declarations from `static` to `STATIC`.
* CHANGE one function in transports.c, transport_get_by_name(), to be
mockable.
* CLOSES#25425: https://bugs.torproject.org/25425
* ADD new /src/common/crypto_rand.[ch] module.
* ADD new /src/common/crypto_util.[ch] module (contains the memwipe()
function, since all crypto_* modules need this).
* FIXES part of #24658: https://bugs.torproject.org/24658
This module doesn't actually need to mock the libevent mainloop at
all: it can just use the regular mainloop that the test environment
sets up.
Part of ticket 23750.
This change makes cpuworker and test_workqueue no longer need to
include event2/event.h. Now workqueue.c needs to include it, but
that is at least somewhat logical here.
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The DoS potential is slightly higher in C now due to some differences to the
Rust code, see the C_RUST_DIFFERS tags in src/rust/protover/tests/protover.rs.
Also, the comment about "failing at the splitting stage" in Rust wasn't true,
since when we split, we ignore empty chunks (e.g. "1--1" parses into
"(1,None),(None,1)" and "None" can't be parsed into an integer).
Finally, the comment about "Rust seems to experience an internal error" is only
true in debug mode, where u32s are bounds-checked at runtime. In release mode,
code expressing the equivalent of this test will error with
`Err(ProtoverError::Unparseable)` because 4294967295 is too large.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999". These
both behave the same now, i.e. both return "Link=6-999".
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The DoS potential is slightly higher in C now due to some differences to the
Rust code, see the C_RUST_DIFFERS tags in src/rust/protover/tests/protover.rs.
Also, the comment about "failing at the splitting stage" in Rust wasn't true,
since when we split, we ignore empty chunks (e.g. "1--1" parses into
"(1,None),(None,1)" and "None" can't be parsed into an integer).
Finally, the comment about "Rust seems to experience an internal error" is only
true in debug mode, where u32s are bounds-checked at runtime. In release mode,
code expressing the equivalent of this test will error with
`Err(ProtoverError::Unparseable)` because 4294967295 is too large.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999". These
both behave the same now, i.e. both return "Link=6-999".
These tests handle incoming and outgoing cells on a three-hop
circuit, and make sure that the crypto works end-to-end. They don't
yet test spec conformance, leaky-pipe, or various error cases.
Additionally, this change extracts the functions that created and
freed these elements.
These structures had common "forward&reverse stream&digest"
elements, but they were initialized and freed through cpath objects,
and different parts of the code depended on them. Now all that code
is extacted, and kept in relay_crypto.c
This should help us improve modularity, and should also make it
easier for people to experiment with other relay crypto strategies
down the road.
This commit is pure function movement.
This should avoid most intermittent test failures on developer and CI machines,
but there could (and probably should) be a more elegant solution.
Also, this test was testing that the IP was created and its expiration time was
set to a time greater than or equal to `now+INTRO_POINT_LIFETIME_MIN_SECONDS+5`:
/* Time to expire MUST also be in that range. We add 5 seconds because
* there could be a gap between setting now and the time taken in
* service_intro_point_new. On ARM, it can be surprisingly slow... */
tt_u64_op(ip->time_to_expire, OP_GE,
now + INTRO_POINT_LIFETIME_MIN_SECONDS + 5);
However, this appears to be a typo, since, according to the comment above it,
adding five seconds was done because the IP creation can be slow on some
systems. But the five seconds is added to the *minimum* time we're comparing
against, and so it actually functions to make this test *more* likely to fail on
slower systems. (It should either subtract five seconds, or instead add it to
time_to_expire.)
* FIXES#25450: https://bugs.torproject.org/25450
These were meant to demonstrate old behavior, or old rust behavior.
One of them _should_ work in Rust, but won't because of
implementation details. We'll fix that up later.
The C code and the rust code had different separate integer overflow
bugs here. That suggests that we're better off just forbidding this
pathological case.
Also, add tests for expected behavior on receiving a bad protocol
list in a consensus.
Fixes another part of 25249.
I've refactored these to be a separate function, to avoid tricky
merge conflicts.
Some of these are disabled with "XXXX" comments; they should get
fixed moving forward.
* ADD includes for "torint.h" and "container.h" to crypto_digest.h.
* ADD includes for "crypto_digest.h" to a couple places in which
crypto_digest_t was then missing.
* FIXES part of #24658: https://bugs.torproject.org/24658#comment:30
Folks have found two in the past week or so; we may as well fix the
others.
Found with:
\#!/usr/bin/python3
import re
def findMulti(fname):
includes = set()
with open(fname) as f:
for line in f:
m = re.match(r'^\s*#\s*include\s+["<](\S+)[>"]', line)
if m:
inc = m.group(1)
if inc in includes:
print("{}: {}".format(fname, inc))
includes.add(m.group(1))
import sys
for fname in sys.argv[1:]:
findMulti(fname)
Since 0.2.4, tor uses EWMA circuit policy to prioritize. The previous
algorithm, round-robin, hasn't been used since then but was still used as a
fallback.
Now that EWMA is mandatory, remove that code entirely and enforce a cmux
policy to be set.
This is part of a circuitmux cleanup to improve performance and reduce
complexity in the code. We'll be able to address future optimization with this
work.
Closes#25268
Signed-off-by: David Goulet <dgoulet@torproject.org>
To achieve this, a default value for the CircuitPriorityHalflife option was
needed. We still look in the options and then the consensus but in case no
value can be found, the default CircuitPriorityHalflifeMsec=30000 is used. It
it the value we've been using since 0.2.4.4-alpha.
This means that EWMA, our only policy, can not be disabled anymore fallbacking
to the round robin algorithm. Unneeded code to control that is removed in this
commit.
Part of #25268
Signed-off-by: David Goulet <dgoulet@torproject.org>
On slow system, 1 msec between one read and the other was too tight. For
instance, it failed on armel with a 4msec gap:
https://buildd.debian.org/status/package.php?p=tor&suite=experimental
Increase to 10 msec for now to address slow system. It is important that we
keep this OP_LE test in so we make sure the msec/usec/nsec read aren't
desynchronized by huge gaps. We'll adjust again if we ever encounter a system
that goes slower than 10 msec between calls.
Fixes#25113
Signed-off-by: David Goulet <dgoulet@torproject.org>
Since we're making it so that unstable zstd apis can be disabled,
we need to test them. I do this by adding a variant setup/cleanup
function for the tests, and teaching it about a fake compression
method called "x-zstd:nostatic".
If the cache is using 20% of our maximum allowed memory, clean 10% of it. Same
behavior as the HS descriptor cache.
Closes#25122
Signed-off-by: David Goulet <dgoulet@torproject.org>
The current code flow makes it that we can release a channel in a PENDING
state but not in the pending list. This happens while the channel is being
processed in the scheduler loop.
Fixes#25125
Signed-off-by: David Goulet <dgoulet@torproject.org>
This tests many cases of the KIST scheduler with the pending list state by
calling entry point in the scheduler while channels are scheduled or not.
Also, it adds a test for the bug #24700.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In 0.3.2.1-alpha, we've added notify_networkstatus_changed() in order to have
a way to notify other subsystems that the consensus just changed. The old and
new consensus are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
This commit changes the notify_networkstatus_changed() to be the "before"
function and adds an "after" notification from which the scheduler subsystem
is notified.
Fixes#24975
When we stopped looking at the "protocols" variable directly, we
broke the hs_service/build_update_descriptors test, since it didn't
actually update any of the flags.
The fix here is to call summarize_protover_flags() from that test,
and to expose summarize_protover_flags() as "STATIC" from
routerparse.c.
Since helper_create_introduce1_cell() checks "cell" for nullness,
scan-build is concerned that test_introduce1_validation()
dereferences it without checking it. So, add a check.
Not backporting, since this is spurious, _and_ tests-only.
These are all about local variables shadowing global
functions. That isn't normally a problem, but at least one
compiler we care about seems to treat this as a case of -Wshadow
violation, so let's fix it.
Fixes bug 24634; bugfix on 0.3.2.1-alpha.
Using tt_assert in these helpers was implying to scan-build that our
'new' functions might be returning NULL, which in turn would make it
warn about null-pointer use.
We've been seeing problems with destroy cells queues taking up a
huge amount of RAM. We can mitigate this, since while a full packed
destroy cell takes 514 bytes, we only need 5 bytes to remember a
circuit ID and a reason.
Fixes bug 24666. Bugfix on 0.2.5.1-alpha, when destroy cell queues
were introduced.
Using absolute_msec requires a 64-bit division operation every time
we calculate it, which gets expensive on 32-bit architectures.
Instead, just use the lazy "monotime_coarse_get()" operation, and
don't convert to milliseconds until we absolutely must.
In this case, it seemed fine to use a full monotime_coarse_t rather
than a truncated "stamp" as we did to solve this problem for the
timerstamps in buf_t and packed_cell_t: There are vastly more cells
and buffer chunks than there are channels, and using 16 bytes per
channel in the worst case is not a big deal.
There are still more millisecond operations here than strictly
necessary; let's see any divisions show up in profiles.
Retry directory downloads when we get our first bridge descriptor
during bootstrap or while reconnecting to the network. Keep retrying
every time we get a bridge descriptor, until we have a reachable bridge.
Stop delaying bridge descriptor fetches when we have cached bridge
descriptors. Instead, only delay bridge descriptor fetches when we
have at least one reachable bridge.
Fixes bug 24367; bugfix on 0.2.0.3-alpha.
networkstatus_consensus_has_ipv6() tells us whether the consensus method of
our current consensus supports IPv6 ORPorts in the consensus.
Part of #23827.
This commit was made mechanically by this perl script:
\#!/usr/bin/perl -w -i -p
next if /^#define FREE_AND_NULL/;
s/\bFREE_AND_NULL\((\w+),/FREE_AND_NULL\(${1}_t, ${1}_free_,/;
s/\bFREE_AND_NULL_UNMATCHED\(/FREE_AND_NULL\(/;
Couple things happen in this commit. First, we do not re-queue a cell back in
the circuit queue if the write packed cell failed. Currently, it is close to
impossible to have it failed but just in case, the channel is mark as closed
and we move on.
The second thing is that the channel_write_packed_cell() always took ownership
of the cell whatever the outcome. This means, on success or failure, it needs
to free it.
It turns out that that we were using the wrong free function in one case and
not freeing it in an other possible code path. So, this commit makes sure we
only free it in one place that is at the very end of
channel_write_packed_cell() which is the top layer of the channel abstraction.
This makes also channel_tls_write_packed_cell_method() return a negative value
on error.
Two unit tests had to be fixed (quite trivial) due to a double free of the
packed cell in the test since now we do free it in all cases correctly.
Part of #23709
Signed-off-by: David Goulet <dgoulet@torproject.org>
This makes sure that a non opened channel is never put back in the channel
pending list and that its state is consistent with what we expect that is
IDLE.
Test the fixes in #24502.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This introduces the test_hs_control.c file which at this commit contains basic
unit test for the HS_DESC event.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This changes the control_event_hs_descriptor_requested() call to add the hsdir
index optional value. v2 passes NULL all the time.
This commit creates hs_control.{c|h} that contains wrappers for the HS
subsystem to interact with the control port subsystem.
The descriptor REQUESTED event is implemented following proposal 284 extension
for v3.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Make control_event_hs_descriptor_received() and
control_event_hs_descriptor_failed() v2 specific because they take a
rend_data_t object and v3 will need to pass a different object.
No behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is a naming refactor mostly _except_ for a the events' function that take
a rend_data_t which will require much more refactoring.
No behavior change at this commit, cleanup and renaming stuff to not be only
v2 specific.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The functions are now used by the ADD_ONION/DEL_ONION control port command as
well. This commits makes them fully functionnal with hidden service v3.
Part of #20699
Signed-off-by: David Goulet <dgoulet@torproject.org>
Instead of using the cwd to specify the location of Cargo.toml, we
use the --manifest-path option to specify its location explicitly.
This works around the bug that isis diagnosed on our jenkins builds.
The goal here is to replace our use of msec-based timestamps with
something less precise, but easier to calculate. We're doing this
because calculating lots of msec-based timestamps requires lots of
64/32 division operations, which can be inefficient on 32-bit
platforms.
We make sure that these stamps can be calculated using only the
coarse monotonic timer and 32-bit bitwise operations.
First, that test was broken from the previous commit because the
channel_queue_cell() has been removed. This now tests the
channel_process_cell() directly.
Second, it wasn't testing much except if the channel subsystem actually went
through the cell handler. This commit adds more checks on the state of a
channel going from open, receiving a cell and closing.
Third, this and the id_map unit test are working, not the others so they've
been marked as not working and future commit will improve and fix those.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The channel_write_cell() and channel_write_var_cell() can't be possibly called
nor are used by tor. We only write on the connection outbuf packed cell coming
from the scheduler that takes them from the circuit queue.
This makes channel_write_packed_cell() the only usable function. It is
simplify and now returns a code value. The reason for this is that in the next
commit(s), we'll re-queue the cell onto the circuit queue if the write fails.
Finally, channel unit tests are being removed with this commit because they do
not match the new semantic. They will be re-written in future commits.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The channel subsystem was doing a whole lot to track and try to predict the
channel queue size but they are gone due to previous commit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
For the rationale, see ticket #23709.
This is a pretty massive commit. Those queues were everywhere in channel.c and
it turns out that it was used by lots of dead code.
The channel subsystem *never* handles variable size cell (var_cell_t) or
unpacked cells (cell_t). The variable ones are only handled in channeltls and
outbound cells are always packed from the circuit queue so this commit removes
code related to variable and unpacked cells.
However, inbound cells are unpacked (cell_t), that is untouched and is handled
via channel_process_cell() function.
In order to make the commit compile, test have been modified but not passing
at this commit. Also, many tests have been removed but better improved ones
get added in future commits.
This commit also adds a XXX: which indicates that the handling process of
outbound cells isn't fully working. This as well is fixed in a future commit.
Finally, at this commit, more dead code remains, it will be cleanup in future
commits.
Fixes#23709
Signed-off-by: David Goulet <dgoulet@torproject.org>
Stop checking for bridge descriptors when we actually want to know if
any bridges are usable. This avoids potential bootstrapping issues.
Fixes bug 24367; bugfix on 0.2.0.3-alpha.
Stop stalling when bridges are changed at runtime. Stop stalling when
old bridge descriptors are cached, but they are not in use.
Fixes bug 24367; bugfix on 23347 in 0.3.2.1-alpha.
At this commit, the key handling and generation is supported for a v3 service
(ED25519-V3). However, the service creation is not yet implemented. This only
adds the interface and code to deal with the new ED25519-V3 key type.
Tests have been updated for RSA key type but nothing yet for ED25519-v3.
Part of #20699
Signed-off-by: David Goulet <dgoulet@torproject.org>
This check makes it so we can reach "done" without setting "conn",
and so the "if (conn)" check will not be redundant, and so coverity
won't complain. Fixes CID 1422205. Not actually a bug.
This function -- a mock replacement used only for fuzzing -- would
have a buffer overflow if it got an RSA key whose modulus was under
20 bytes long.
Fortunately, Tor itself does not appear to have a bug here.
Fixes bug 24247; bugfix on 0.3.0.3-alpha when fuzzing was
introduced. Found by OSS-Fuzz; this is OSS-Fuzz issue 4177.
If it decrypts something that turns out to start with a NUL byte,
then decrypt_desc_layer() will return 0 to indicate the length of
its result. But 0 also indicates an error, which causes the result
not to be freed by decrypt_desc_layer()'s callers.
Since we're trying to stabilize 0.3.2.x, I've opted for the simpler
possible fix here and made it so that an empty decrypted string will
also count as an error.
Fixes bug 24150 and OSS-Fuzz issue 3994.
The original bug was present but unreachable in 0.3.1.1-alpha. I'm
calling this a bugfix on 0.3.2.1-alpha since that's the first version
where you could actually try to decrypt these descriptors.
When running "make test-network-all", test that IPv6-only clients can use
microdescriptors. IPv6-only microdescriptor client support was fixed in
tor 0.3.0.1-alpha.
Requires chutney master 61c28b9 or later.
Closes ticket 24109.
When the directory information changes, callback to the HS client subsystem so
it can check if any pending SOCKS connections are waiting for a descriptor. If
yes, attempt a refetch for those.
Fixes#23762
Signed-off-by: David Goulet <dgoulet@torproject.org>
The new decryption function performs no decryption, skips the salt,
and doesn't check the mac. This allows us to fuzz the
hs_descriptor.c code using unencrypted descriptor test, and exercise
more of the code.
Related to 21509.
The exposed get_voting_schedule() allocates and return a new object everytime
it is called leading to an awful lot of memory allocation when getting the
start time of the current round which is done for each node in the consensus.
Closes#23623
Signed-off-by: David Goulet <dgoulet@torproject.org>
At first, we put the tor_git_revision constant in tor_main.c, so
that we wouldn't have to recompile config.o every time the git
revision changed. But putting it there had unintended side effect
of forcing every program that wanted to link libor.a (including
test, test-slow, the fuzzers, the benchmarks, etc) to declare their
own tor_git_revision instance.
That's not very nice, especially since we want to start supporting
others who want to link against Tor (see 23846).
So, create a new git_revision.c file that only contains this
constant, and remove the duplicated boilerplate from everywhere
else.
Part of implementing ticket 23845.
Skip test_config_include_no_permission() when running as root, because
it will get an unexpected success from config_get_lines_include().
This affects some continuous integration setups. Fixes bug 23758.
Also demote a log message that can occur under natural causes
(if the circuit subsystem is missing descriptors/consensus etc.).
The HS subsystem will naturally retry to connect to intro points,
so no need to make that log user-facing.
Each type of scheduler implements its own static scheduler_t object and
returns a reference to it.
This commit also makes it a const pointer that is it can only change inside
the scheduler type subsystem but not outside for extra protection.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This option is a list of possible scheduler type tor can use ordered by
priority. Its default value is "KIST,KISTLite,Vanilla" which means that KIST
will be used first and if unavailable will fallback to KISTLite and so on.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- massive change to src/tgest/test_options.c since the sched options
were added all over the place in it
- removing the sched options caused some tests to pass/fail in new ways
so I assumed current behavior is correct and made them pass again
- ex: "ConnLimit must be greater" lines
- ex: "Authoritative directory servers must" line
- remove test_options_validate__scheduler in prep for new sched tests
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit adds a pretty advanced test for the client-side making sure that
picking intro is done properly.
This unittest also reveals a memleak on the client_pick_intro() function which
is fixed by the subsequent commit.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Using a test vector in python, test both hs_build_hsdir_index() and
hs_build_hs_index().
This commit also adds the hs_build_address.py to EXTRA_DIST which was missing.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Do two major improvements:
a) Make the client pick 6 HSDirs instead of just 1 and make sure they
all match the service's HSDirs.
b) Test two additional missing scenarios borrowed from the
test_reachability() test.
Change the contract of control_event_bootstrap_problem() to be more
general and to take a connection_t. New function
control_event_bootstrap_prob_or() has the specific or_connection_t
funcionality previously used.
Fix the test_build_address() test and its test vectors python script.
They were both using a bogus pubkey for building an HS address which
does not validate anymore.
Also fix a few more unittests that were using bogus onion addresses
and were failing the validation. I replaced the bogus address with
the one generated from the test vector script.
We enrich the test_client_cache() test in two ways:
a) We check that transitioning time periods also cleans up expired
descriptors in client memory.
b) We test hs_cache_lookup_as_client() instead of
lookup_v3_desc_as_client(). The former is a higher level function
which calls the latter and allows us to test deeper into the
subsystem.
OpenBSD doesn't like tricks where you use a too-wide sscanf argument
for a too-narrow array, even when you know the input string
statically. The fix here is just to use bigger buffers.
Fixes 15582; bugfix on a3dafd3f58 in 0.2.6.2-alpha.
But when clients are just starting, make them try each bridge a few times
before giving up on it.
These changes make the bridge download schedules more explicit: before
17750, they relied on undocumented behaviour and specific schedule
entries. (And between 17750 and this fix, they were broken.)
Fixes 23347, not in any released version of tor.
This test is important because it tests that upload_descriptor_to_all()
is in synch with pick_hsdir_v3(). That's not the case for the
reachability test which just compares the responsible hsdir sets.
Because of the latest changes on when we rotate, longer lifetime of
descriptors and no more overlap period, the tests needed to be improved to
test more functionnalities.
Signed-off-by: David Goulet <dgoulet@torproject.org>
First, this fixes#23372.
Second, the consensus timings for the build descriptor have been changed to
the current test can pass. More extensive tests of descriptor rotation are
coming in a commit near you because the rotation and time period logic has
been changed.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This is a large and important unit test for the hidden service version
3! It tests the service reachability for a client using different
consensus timings and makes sure that the computed hashring is the same
on both side so it is actually reachable.
Signed-off-by: David Goulet <dgoulet@torproject.org>
With the latest change on how we use the HSDir index, the client and service
need to pick their responsible HSDir differently that is depending on if they
are before or after a new time period.
The overlap mode is active function has been renamed for this and test added.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because of #23387, we've realized that there is one scenario that makes
the client unable to reach the service because of a desynch in the time
period used. The scenario is as follows:
+------------------------------------------------------------------+
| |
| 00:00 12:00 00:00 12:00 00:00 12:00 |
| SRV#1 TP#1 SRV#2 TP#2 SRV#3 TP#3 |
| |
| $==========|-----------$===========|-----------$===========| |
| ^ ^ |
| C S |
+------------------------------------------------------------------+
In this scenario the HS has a newer consensus than the client, and the
HS just moved to the next TP but the client is still stuck on the old
one. However, the service is not in any sort of overlap mode so it
doesn't cover the old TP anymore, so the client is unable to fetch a
descriptor.
We've decided to solve this by extending the concept of overlap period
to be permanent so that the service always publishes two descriptors and
aims to cover clients with both older and newer consensuses. See the
spec patch in #23387 for more details.
Based on our #23387 findings, it seems like to maintain 24/7
reachability we need to employ different logic when computing hsdir
indices for fetching vs storing. That's to guarantee that the client
will always fetch the current descriptor, while the service will always
publish two descriptors aiming to cover all possible edge cases.
For more details see the next commit and the spec branch.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Use the valid_after time from the consensus to get the time period number else
we might get out of sync with the overlap period that uses valid_after.
Make it an optional feature since some functions require passing a
specific time (like hs_get_start_time_of_next_time_period()).
Signed-off-by: David Goulet <dgoulet@torproject.org>
Undeprecate it;
rename it to TestingClientDNSRejectInternalAddresses;
add the old name as an alias;
reject configurations where it is set but TestingTorNetwork is not;
change the documentation accordingly.
Closes tickets 21031 and 21522.
By convention, a function that frobs a foo_t should be called
foo_frob, and it should have a foo_t * as its first argument. But
for many of the buf_t functions, the buf_t was the final argument,
which is silly.
Our convention is that functions which manipulate a type T should be
named T_foo. But the buffer functions were super old, and followed
all kinds of conventions. Now they're uniform.
Here's the perl I used to do this:
\#!/usr/bin/perl -w -i -p
s/read_to_buf\(/buf_read_from_socket\(/;
s/flush_buf\(/buf_flush_to_socket\(/;
s/read_to_buf_tls\(/buf_read_from_tls\(/;
s/flush_buf_tls\(/buf_flush_to_tls\(/;
s/write_to_buf\(/buf_add\(/;
s/write_to_buf_compress\(/buf_add_compress\(/;
s/move_buf_to_buf\(/buf_move_to_buf\(/;
s/peek_from_buf\(/buf_peek\(/;
s/fetch_from_buf\(/buf_get_bytes\(/;
s/fetch_from_buf_line\(/buf_get_line\(/;
s/fetch_from_buf_line\(/buf_get_line\(/;
s/buf_remove_from_front\(/buf_drain\(/;
s/peek_buf_startswith\(/buf_peek_startswith\(/;
s/assert_buf_ok\(/buf_assert_ok\(/;
This lets us drop the testing-only function buf_get_first_chunk_data(),
and lets us implement proto_http and proto_socks without looking at
buf_t internals.
The function was never returning an error code on failure to parse the
OutboundAddress* options.
In the process, it was making our test_options_validate__outbound_addresses()
not test the right thing.
Fixes#23366
Signed-off-by: David Goulet <dgoulet@torproject.org>
This fixes a serious bug in our hsdir set change logic:
We used to add nodes in the list of previous hsdirs everytime we
uploaded to a new hsdir and we only cleared the list when we built a new
descriptor. This means that our prev_hsdirs list could end up with 7
hsdirs, if for some reason we ended up uploading our desc to 7 hsdirs
before rebuilding our descriptor (e.g. this can happen if the set of
hsdirs changed).
After our previous hdsir set had 7 nodes, then our old algorithm would
always think that the set has changed since it was comparing a smartlist
with 7 elements against a smartlist with 6 elements.
This commit fixes this bug, by clearning the prev_hsdirs list before we
upload to all hsdirs. This makes sure that our prev_hsdirs list always
contains the latest hsdirs!
Since ssize_t is signed and might be 64 bits, we should use
tt_i64_op to make sure it's positive. Otherwise, if it is negative,
and we use tt_u64_op, we'll be treating it as a uint64_t, and we
won't detect negative values.
This fixes CID 1416338 and 1416339. Bug not in any released Tor.
We refactor the descriptor reupload logic to be similar to the v2 logic
where we update a global 'consider_republishing_rend_descriptors' flag
and then we use that to check for hash ring changes during the global
hidden service callbacks.
This fixes bugs where we would inspect the hash ring immediately as we
receive new dirinfo (e.g. consensus) but before running the hidden
service housekeeping events. That was leaving us in an inconsistent
state wrt hsdir indices and causing bugs all around.
The `test-operator-cleanup` patch, and related coccinelle patches,
don't do any checks for line length. This patch fixes the line
length issues caused by the previous commits.
This patch fixes the operator usage in src/test/*.c to use the symbolic
operators instead of the normal C comparison operators.
This patch was generated using:
./scripts/coccinelle/test-operator-cleanup src/test/*.[ch]
Only register the RP circuit when it opens and not when we send the INTRODUCE1
cell else, when re-extending to a new IP, we would register the same RP
circuit with the same cookie twice leading to the circuit being closed.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We can't trigger a valid upload because it would require us to MOCK a long
list of functions ultimately not really testing the upload because we aren't
on a running network.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Conflicts:
src/test/test_hs_service.c
- Add tests that ensure that SOCKS requests for v2/v3 addresses get
intercepted and handled.
- Add test that stores and lookups an HS descriptor in the client-side cache.
Signed-off-by: David Goulet <dgoulet@torproject.org>
- Also add tests for the hidserv_req subsystem.
- Introduce purge_v2_hidserv_req() wrapper to simplify v2 code.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Right now there's a single warn_if_unnamed flag for
router_get_consensus_status_by_nickname() and
node_get_by_nickname(), that is nearly always 1. I've turned it
into an 'unsigned' bitfield, and inverted its sense. I've added the
flags argument to node_get_by_hex_id() too, though it does nothing
there right now.
I've removed the router_get_consensus_status_by_nickname() function,
since it was only used in once place.
This patch changes the warning behavior of GETINFO ns/name/<name>,
since all other name lookups from the controller currently warn.
Later I'm going to add more flags, for ed25519 support.
We need this func so that we recognize SOCKS conns to v3 addresses.
- Also rename rend_valid_service_id() to rend_valid_v2_service_id()
- Also move parse_extended_hostname() tests to their own unittest, and
add a v3 address to the test as well.
Signed-off-by: David Goulet <dgoulet@torproject.org>
We want to reupload our descriptor if its set of responsible HSDirs
changed to minimize reachability issues.
This patch adds a callback everytime we get new dirinfo which checks if
the hash ring changed and reuploads descriptor if needed.