There are a few reasons that relays might be uploading desciptors
without saying X-Desc-Gen-Reason:
1. They are running an old version of our software, before 0.3.2.stable.
2. They are not running our software, but they are claiming they
are.
3. They are uploading through a proxy that strips X-Desc-Gen-Reason.
4. They somehow had a bug in their software.
According to the 25686 data, 1 is the most common reason. This
ticket is an attempt to diagnose case 4, or prove that case 4
doesn't actually happen.
* REFACTORS `UnvalidatedProtoEntry::from_str` to place the bulk of the
splitting/parsing logic in to a new
`UnvalidatedProtoEntry::parse_protocol_and_version_str()` method (so that
both `from_str()` and `from_str_any_len()` can call it.)
* ADD a new `UnvalidatedProtoEntry::from_str_any_len()` method in order to
maintain compatibility with consensus methods older than 29.
* ADD a limit on the number of characters in a protocol name.
* FIXES part of #25517: https://bugs.torproject.org/25517
In protover.c, the `expand_protocol_list()` function expands a `smartlist_t` of
`proto_entry_t`s to their protocol name concatenated with each version number.
For example, given a `proto_entry_t` like so:
proto_entry_t *proto = tor_malloc(sizeof(proto_entry_t));
proto_range_t *range = tor_malloc_zero(sizeof(proto_range_t));
proto->name = tor_strdup("DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa");
proto->ranges = smartlist_new();
range->low = 1;
range->high = 65536;
smartlist_add(proto->ranges, range);
(Where `[19KB]` is roughly 19KB of `"a"` bytes.) This would expand in
`expand_protocol_list()` to a `smartlist_t` containing 65536 copies of the
string, e.g.:
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=1"
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=2"
[…]
"DoSaaaaaaaaaaaaaaaaaaaaaa[19KB]aaa=65535"
Thus constituting a potential resource exhaustion attack.
The Rust implementation is not subject to this attack, because it instead
expands the above string into a `HashMap<String, HashSet<u32>` prior to #24031,
and a `HashMap<UnvalidatedProtocol, ProtoSet>` after). Neither Rust version is
subject to this attack, because it only stores the `String` once per protocol.
(Although a related, but apparently of too minor impact to be usable, DoS bug
has been fixed in #24031. [0])
[0]: https://bugs.torproject.org/24031
* ADDS hard limit on protocol name lengths in protover.c and checks in
parse_single_entry() and expand_protocol_list().
* ADDS tests to ensure the bug is caught.
* FIXES#25517: https://bugs.torproject.org/25517
The specification describes the signature token to be right after a newline
(\n) then the token "signature" and then a white-space followed by the encoded
signature.
This commit makes sure that when we parse the signature from the descriptor,
we are always looking for that extra white-space at the end of the token.
It will allow us also to support future fields that might start with
"signature".
Fixes#26069
Signed-off-by: David Goulet <dgoulet@torproject.org>
The any_client_port_set() returns true if the ControlPort is set which is
wrong because we can have that port open but still not behave as a tor client
(like many relays for instance).
Fixes#26062
Signed-off-by: David Goulet <dgoulet@torproject.org>
When directory authorities read a zero-byte bandwidth file, they log
a warning with the contents of an uninitialised buffer. Log a warning
about the empty file instead.
Fixes bug 26007; bugfix on 0.2.2.1-alpha.
Remove v3 optimization which made Tor not detect disabling services.
This optimization is not so needed because we only call that function after HUP
anyway.
Fixes bug #25761.
During service configuration, rend_service_prune_list_impl_() sets
rend_service_staging_list to NULL, which blocked pruning after a HUP.
This patch initializes rend_service_staging_list when needed, so that HUP can
detect disabled onion services.
Fixes bug #25761.
When directory authorities read a zero-byte bandwidth file, they log
a warning with the contents of an uninitialised buffer. Log a warning
about the empty file instead.
Fixes bug 26007; bugfix on 0.2.2.1-alpha.
LibreSSL, despite not having the OpenSSL 1.1 API, does define
OPENSSL_VERSION in crypto.h. Additionally, it apparently annotates
some functions as returning NULL, so that our unit tests need to be
more careful about checking for NULL so they don't get compilation
warnings.
Closes ticket 26006.
Prior to #23100, we were not counting HS circuit build times in our
calculation of the timeout. This could lead to a condition where our timeout
was set too low, based on non HS circuit build times, and then we would
abandon all HS circuits, storing no valid timeouts in the histogram.
This commit avoids the assert.
It tried to pick nodes for which only routerinfo_t items are set,
but without setting UseMicroDescriptors to 0. This won't work any
more, now that we're strict about using the right descriptor types
due to 25691/25692/25213.
In order to fix 25691 and 25692, we need to pass the "direct_conn"
flag to more places -- particularly when choosing single-hop
tunnels. The right way to do this involves having a couple more
functions accept router_crn_flags_t, rather than a big list of
boolean arguments.
This commit also makes sure that choose_good_exit_server_general()
honors the direct_conn flag, to fix 25691 and 25692.
In router_add_running_nodes_to_smartlist(), we had an inline
implementation of the logic from node_has_descriptor(), which should
be changed to node_has_preferred_descriptor().
This patch adds a new node_has_preferred_descriptor() function, and
replaces most users of node_has_descriptor() with it. That's an
important change, since as of d1874b4339 (our fix for #25213),
we are willing to say that a node has _some_ descriptor, but not the
_right_ descriptor for a particular use case.
Part of a fix for 25691 and 25692.
Now that we allow cpuworkers for dirport-only hosts (to fix 23693),
we need to allow dup_onion_keys() to succeed for them.
The change to construct_ntor_key_map() is for correctness,
but is not strictly necessary.
We recently merged a circuit cell queue size safeguard. This commit adds the
number of killed circuits that have reached the limit to the DoS heartbeat. It
now looks like this:
[notice] DoS mitigation since startup: 0 circuits killed with too many
cells. 0 circuits rejected, 0 marked addresses. 0 connections closed. 0
single hop clients refused.
Second thing that this patch does. It makes tor always print the DoS
mitigation heartbeat line (for a relay) even though no DoS mitigation have
been enabled. The reason is because we now kill circuits that have too many
cells regardless on if it is enabled or not but also it will give the operator
a chance to learn what is enabled with the heartbeat instead of suddenly
appearing when it is enabled by let say the consensus.
Fixes#25824
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit introduces the consensus parameter "circ_max_cell_queue_size"
which controls the maximum number of cells a circuit queue should have.
The default value is currently 50000 cells which is above what should be
expected but keeps us a margin of error for padding cells.
Related to this is #9072. Back in 0.2.4.14-alpha, we've removed that limit due
to a Guard discovery attack. Ticket #25226 details why we are putting back the
limit due to the memory pressure issue on relays.
Fixes#25226
Signed-off-by: David Goulet <dgoulet@torproject.org>
In d1874b4339, we adjusted this check so that we insist on
using routerinfos for bridges. That's almost correct... but if we
have a bridge that is also a regular relay, then we should use
insist on its routerinfo when connecting to it as a bridge
(directly), and be willing to use its microdescriptor when
connecting to it elsewhere in our circuits.
This bug is a likely cause of some (all?) of the (exit_ei == NULL)
failures we've been seeing.
Fixes bug 25691; bugfix on 0.3.3.4-alpha
When size_t is 32 bits, the unit tests can't fit anything more than
4GB-1 into a size_t.
Additionally, tt_int_op() uses "long" -- we need tt_u64_op() to
safely test uint64_t values for equality.
Bug caused by tests for #24782 fix; not in any released Tor.
This patch changes the algorithm of compute_real_max_mem_in_queues() to
use 0.4 * RAM iff the system has more than or equal to 8 GB of RAM, but
will continue to use the old value of 0.75 * RAM if the system have less
than * GB of RAM available.
This patch also adds tests for compute_real_max_mem_in_queues().
See: https://bugs.torproject.org/24782
This patch makes get_total_system_memory mockable, which allows us to
alter the return value of the function in tests.
See: https://bugs.torproject.org/24782
This one happens if for some reason you start with DirPort enabled
but server mode turned off entirely.
Fixes a case of bug 23693; bugfix on 0.3.1.1-alpha.
This patch lifts the list of default directory authorities from config.c
into their own auth_dirs.inc file, which is then included in config.c
using the C preprocessor.
Patch by beastr0.
See: https://bugs.torproject.org/24854
Next commit is addressing the circuit queue cell limit so cleanup before doing
anything else.
Part of #25226
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, the limit for MAX_PROTOCOLS_TO_EXPAND was actually being applied
in Rust to the maximum number of version (total, for all subprotocols).
Whereas in C, it was being applied to the number of subprotocols that were
allowed. This changes the Rust to match C's behaviour.
The behaviours still do not match, unsurprisingly, but now we know where a
primary difference is: the Rust is validating version ranges more than the C,
so in the C it's possible to call protover_all_supported on a ridiculous
version range like "Sleen=0-4294967294" because the C uses
MAX_PROTOCOLS_TO_EXPAND to count the number of *subprotocols* whereas the Rust
uses it to count the total number of *versions* of all subprotocols.
There's now no difference in these tests w.r.t. the C or Rust: both
fail miserably (well, Rust fails with nice descriptive errors, and C
gives you a traceback, because, well, C).
The DoS potential is slightly higher in C now due to some differences to the
Rust code, see the C_RUST_DIFFERS tags in src/rust/protover/tests/protover.rs.
Also, the comment about "failing at the splitting stage" in Rust wasn't true,
since when we split, we ignore empty chunks (e.g. "1--1" parses into
"(1,None),(None,1)" and "None" can't be parsed into an integer).
Finally, the comment about "Rust seems to experience an internal error" is only
true in debug mode, where u32s are bounds-checked at runtime. In release mode,
code expressing the equivalent of this test will error with
`Err(ProtoverError::Unparseable)` because 4294967295 is too large.
Previously, if "Link=1-5" was supported, and you asked protover_all_supported()
(or protover::all_supported() in Rust) if it supported "Link=3-999", the C
version would return "Link=3-999" and the Rust would return "Link=6-999". These
both behave the same now, i.e. both return "Link=6-999".
During code review and discussion with Chelsea Komlo, she pointed out
that protover::compute_for_old_tor() was a public function whose
return type was `&'static CStr`. We both agree that C-like parts of
APIs should:
1. not be exposed publicly (to other Rust crates),
2. only be called in the appropriate FFI code,
3. not expose types which are meant for FFI code (e.g. `*mut char`,
`CString`, `*const c_int`, etc.) to the pure-Rust code of other
crates.
4. FFI code (e.g. things in `ffi.rs` modules) should _never_ be called
from pure-Rust, not even from other modules in its own crate
(i.e. do not call `protover::ffi::*` from anywhere in
`protover::protoset::*`, etc).
With that in mind, this commit makes the following changes:
* CHANGE `protover::compute_for_old_tor()` to be
visible only at the `pub(crate)` level.
* RENAME `protover::compute_for_old_tor()` to
`protover::compute_for_old_tor_cstr()` to reflect the last change.
* ADD a new `protover::compute_for_old_tor()` function wrapper which
is public and intended for other Rust code to use, which returns a
`&str`.
It was changed to take borrows instead of taking ownership.
* REFACTOR `protover::ffi::protover_is_supported_here()` to use changed method
signature on `protover::is_supported_here()`.
This includes a subtle difference in behaviour to the previous Rust
implementation, where, for each vote that we're computing over, if a single one
fails to parse, we skip it. This now matches the current behaviour in the C
implementation.
* REFACTOR `protover::ffi::protover_compute_vote()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol_or_later()` to use
new types and methods.
This includes a subtle difference in behaviour, as in 4258f1e18, where we return
(matching the C impl's return behaviour) earlier than before if parsing failed,
saving us computation in parsing the versions into a
protover::protoset::ProtoSet.
* REFACTOR `protover::ffi::protover_list_supports_protocol()` to use new types
and methods.
This includes differences in behaviour to before, which should now more closely
match the C version:
- If parsing a protover `char*` from C, and the string is not parseable, this
function will return 1 early, which matches the C behaviour when protocols
are unparseable. Previously, we would parse it and its version numbers
simultaneously, i.e. there was no fail early option, causing us to spend more
time unnecessarily parsing versions.
* REFACTOR `protover::ffi::protover_all_supported()` to use new types and
methods.
Previously, the rust implementation of protover considered an empty string to be
a valid ProtoEntry, while the C version did not (it must have a "=" character).
Other differences include that unknown protocols must now be parsed as
`protover::UnknownProtocol`s, and hence their entries as
`protover::UnvalidatedProtoEntry`s, whereas before (nearly) all protoentries
could be parsed regardless of how erroneous they might be considered by the C
version.
My apologies for this somewhat messy and difficult to read commit, if any part
is frustrating to the reviewer, please feel free to ask me to split this into
smaller changes (possibly hard to do, since so much changed), or ask me to
comment on a specific line/change and clarify how/when the behaviours differ.
The tests here should more closely match the behaviours exhibited by the C
implementation, but I do not yet personally guarantee they match precisely.
* REFACTOR unittests in protover::protover.
* ADD new integration tests for previously untested behaviour.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This changes `protover::is_supported_here()` to be aware of new datatypes
(e.g. don't call `.0` on things which are no longer tuple structs) and also
changes the method signature to take borrows, making it faster, threadable, and
easier to read (i.e. the caller can know from reading the function signature
that the function won't mutate values passed into it).
* CHANGE the `protover::is_supported_here()` function to take borrows.
* REFACTOR the `protover::is_supported_here()` function to be aware of new
datatypes.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new type for votes upon `protover::ProtoEntry`s (technically, on
`protover::UnvalidatedProtoEntry`s, because the C code does not validate based
upon currently known protocols when voting, in order to maintain
future-compatibility), and converts several functions which would have operated
on this datatype into methods for ease-of-use and readability.
This also fixes a behavioural differentce to the C version of
protover_compute_vote(). The C version of protover_compute_vote() calls
expand_protocol_list() which checks if there would be too many subprotocols *or*
expanded individual version numbers, i.e. more than MAX_PROTOCOLS_TO_EXPAND, and
does this *per vote* (but only in compute_vote(), everywhere else in the C seems
to only care about the number of subprotocols, not the number of individual
versions). We need to match its behaviour in Rust and ensure we're not allowing
more than it would to get the votes to match.
* ADD new `protover::ProtoverVote` datatype.
* REMOVE the `protover::compute_vote()` function and refactor it into an
equivalent-in-behaviour albeit more memory-efficient voting algorithm based
on the new underlying `protover::protoset::ProtoSet` datatype, as
`ProtoverVote::compute()`.
* REMOVE the `protover::write_vote_to_string()` function, since this
functionality is now generated by the impl_to_string_for_proto_entry!() macro
for both `ProtoEntry` and `UnvalidatedProtoEntry` (the latter of which is the
correct type to return from a voting protocol instance, since the entity
voting may not know of all protocols being voted upon or known about by other
voting parties).
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix a difference in compute_vote() behaviour to C version.
This implements conversions from either a ProtoEntry or an UnvalidatedProtoEntry
into a String, for use in replacing such functions as
`protover::write_vote_to_string()`.
* ADD macro for implementing ToString trait for ProtoEntry and
UnvalidatedProtoEntry.
* FIXES part of #24031: https://bugs.torproject.org/24031
This adds a new protover::UnvalidatedProtoEntry type, which is the
UnknownProtocol variant of a ProtoEntry, and refactors several functions which
should operate on this type into methods.
This also fixes what was previously another difference to the C implementation:
if you asked the C version of protovet_compute_vote() to compute a single vote
containing "Fribble=", it would return NULL. However, the Rust version would
return "Fribble=" since it didn't check if the versions were empty before
constructing the string of differences. ("Fribble=" is technically a valid
protover string.) This is now fixed, and the Rust version in that case will,
analogous to (although safer than) C returning a NULL, return None.
* REMOVE internal `contains_only_supported_protocols()` function.
* REMOVE `all_supported()` function and refactor it into
`UnvalidatedProtoEntry::all_supported()`.
* REMOVE `parse_protocols_from_string_with_no_validation()` and
refactor it into the more rusty implementation of
`impl FromStr for UnvalidatedProtoEntry`.
* REMOVE `protover_string_supports_protocol()` and refactor it into
`UnvalidatedProtoEntry::supports_protocol()`.
* REMOVE `protover_string_supports_protocol_or_later()` and refactor
it into `UnvalidatedProtoEntry::supports_protocol_or_later()`.
* FIXES part of #24031: https://bugs.torproject.org/24031
rust: Fix another C/Rust different in compute_vote().
This fixes the unittest from the prior commit by checking if the versions are
empty before adding a protocol to a vote.
This replaces the `protover::SupportedProtocols` (why would you have a type just
for things which are supported?) with a new, more generic type,
`protover::ProtoEntry`, which utilises the new, more memory-efficient datatype
in protover::protoset.
* REMOVE `get_supported_protocols()` and `SupportedProtocols::tor_supported()`
(since they were never used separately) and collapse their functionality into
a single `ProtoEntry::supported()` method.
* REMOVE `SupportedProtocols::from_proto_entries()` and reimplement its
functionality as the more rusty `impl FromStr for ProtoEntry`.
* REMOVE `get_proto_and_vers()` function and refactor it into the more rusty
`impl FromStr for ProtoEntry`.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new type, protover::UnknownProtocol, so that we have greater type safety
and our protover functionality which works with unsanitised protocol names is
more clearly demarcated.
* REFACTOR protover::Proto, renaming it protover::Protocol to mirror the new
protover::UnknownProtocol type name.
* ADD a utility conversion of `impl From<Protocol> for UnknownProtocol` so that
we can easily with known protocols and unknown protocols simultaneously
(e.g. doing comparisons, checking their version numbers), while not allowing
UnknownProtocols to be accidentally used in functions which should only take
Protocols.
* FIXES part of #24031: https://bugs.torproject.org/24031
* ADD new protover::protoset module.
* ADD new protover::protoset::ProtoSet class for holding protover versions.
* REMOVE protover::Versions type implementation and its method
`from_version_string()`, and instead implement this behaviour in a more
rust-like manner as `impl FromStr for ProtoSet`.
* MOVE the `find_range()` utility function from protover::protover to
protover::protoset since it's only used internally in the
implementation of ProtoSet.
* REMOVE the `contract_protocol_list()` function from protover::protover and
instead refactor it (reusing nearly the entire thing, with minor superficial,
i.e. non-behavioural, changes) into a more rusty
`impl ToString for ProtoSet`.
* REMOVE the `expand_version_range()` function from protover::protover and
instead refactor it into a more rusty implementation of
`impl Into<Vec<Version>> for ProtoSet` using the new error types in
protover::errors.
* FIXES part of #24031: https://bugs.torproject.org/24031.
This will allow us to do actual error handling intra-crate in a more
rusty manner, e.g. propogating errors in match statements, conversion
between error types, logging messages, etc.
* FIXES part of #24031: https://bugs.torproject.org/24031.
Directory authorities no longer vote in favor of the Guard flag
for relays that don't advertise directory support.
Starting in Tor 0.3.0.1-alpha, Tor clients have been avoiding using
such relays in the Guard position, leading to increasingly broken load
balancing for the 5%-or-so of Guards that don't advertise directory
support.
Fixes bug 22310; bugfix on 0.3.0.6.
The earlier checks in this function should ensure that components is
always nonempty. But in case somebody messes with them in the
future, let's add an extra check to make sure we aren't crashing.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
Coverity found a null pointer reference in nodelist_add_microdesc().
This is almost certainly impossible assuming that the routerstatus_t
returned by router_get_consensus_status_by_descriptor_digest() always
corresponds to an entry in the nodelist. Fixes bug 25629.
If we failed to connect at the TCP level to a relay, note it down and refuse
to connect again for another 60 seconds.
Fixes#24767
Signed-off-by: David Goulet <dgoulet@torproject.org>
This update is needed to make it consistent with the behavior of
node_awaiting_ipv6(), which doesn't believe in the addresses from
routerinfos unless it actually plans to use those routerinfos.
Fixes bug 25213; bugfix on b66b62fb75 in 0.3.3.1-alpha,
which tightened up the definition of node_awaiting_ipv6().
There was a nonfatal assertion in pathbias_should_count that would
trigger if onehop_tunnel was set, but the desired_path_length was
greater than 1. This patch fixes that. Fixes bug 24903; bugfix on
0.2.5.2-alpha.
Make sure we actually only report client channel to the geoip cache instead of
looking if it is a known relay. Looking if it is a known relay can be
unreliable because they come and go from the consensus.
Fixes#24904
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because of #25306 for which we are unable to reproduce nor understand how it
is possible, this commit removes the asserts() and BUG() on the missing
descriptors instead when rotating them.
This allows us to log more data on error but also to let tor recover
gracefully instead of dying.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This patch adds some additional logging to circuits_handle_oom() to give
us more information about which specific compression backend that is
using a certain amount of memory.
See: https://bugs.torproject.org/25372
This should avoid most intermittent test failures on developer and CI machines,
but there could (and probably should) be a more elegant solution.
Also, this test was testing that the IP was created and its expiration time was
set to a time greater than or equal to `now+INTRO_POINT_LIFETIME_MIN_SECONDS+5`:
/* Time to expire MUST also be in that range. We add 5 seconds because
* there could be a gap between setting now and the time taken in
* service_intro_point_new. On ARM, it can be surprisingly slow... */
tt_u64_op(ip->time_to_expire, OP_GE,
now + INTRO_POINT_LIFETIME_MIN_SECONDS + 5);
However, this appears to be a typo, since, according to the comment above it,
adding five seconds was done because the IP creation can be slow on some
systems. But the five seconds is added to the *minimum* time we're comparing
against, and so it actually functions to make this test *more* likely to fail on
slower systems. (It should either subtract five seconds, or instead add it to
time_to_expire.)
* FIXES#25450: https://bugs.torproject.org/25450
These were meant to demonstrate old behavior, or old rust behavior.
One of them _should_ work in Rust, but won't because of
implementation details. We'll fix that up later.
The C code and the rust code had different separate integer overflow
bugs here. That suggests that we're better off just forbidding this
pathological case.
Also, add tests for expected behavior on receiving a bad protocol
list in a consensus.
Fixes another part of 25249.
I've refactored these to be a separate function, to avoid tricky
merge conflicts.
Some of these are disabled with "XXXX" comments; they should get
fixed moving forward.
This one can only be exploited if you can generate a correctly
signed consensus, so it's not as bad as 25074.
Fixes bug 25251; also tracked as TROVE-2018-004.
In some cases we had checked for it, but in others we had not. One
of these cases could have been used to remotely cause
denial-of-service against directory authorities while they attempted
to vote.
Fixes TROVE-2018-001.
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
[Cherry-picked]
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
This commit takes a piece of commit af8cadf3a9 and a piece of commit
46fe353f25, with the goal of making channel_is_client() be based on what
sort of connection handshake the other side used, rather than seeing
whether the other side ever sent a create_fast cell to us.