The clean_consdiffmgr() callback is only for relays acting as a directory
server, not all relays.
This commit adds a role for only directory server and sets the
clean_consdiffmgr() callback to use it.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, we were ignoring values _over_ EPSILON. This bug was
also causing a warning at startup because the default value is set
to -1.0.
Fixes bug 25577; bugfix on 6b1dba214d. Bug not in any released tor.
Two helper functions to enable an event and disable an event which wraps the
launch and destroy of an event but takes care of the enabled flag.
They are also idempotent that is can be called multiple time on the same event
without effect if the event was already enabled or disabled.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In case we transitionned to a new role in Tor, we need to launch and/or
destroy some periodic events.
Signed-off-by: David Goulet <dgoulet@torproject.org>
In tor, we have a series of possible "roles" that the tor daemon can be
enabled for. They are:
Client, Bridge, Relay, Authority (directory or bridge) and Onion service.
They can be combined sometimes. For instance, a Directory Authority is also a
Relay. This adds a "roles" field to a periodic event item object which is used
to know for which roles the event is for.
The next step is to enable the event only if the roles apply. No behavior
change at this commit.
Pars of #25762
Signed-off-by: David Goulet <dgoulet@torproject.org>
No behavior change, just to make it easier to find callbacks and for the sake
of our human brain to parse the list properly.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Consensus method 25 is the oldest one supported by any stable
version of 0.2.9, which is our current most-recent LTS. Thus, by
proposal 290, they should be removed.
This commit does not actually remove the code to implement these
methods: it only makes it so authorities will no longer support
them. I'll remove the backend code for them in later commits.
It tried to pick nodes for which only routerinfo_t items are set,
but without setting UseMicroDescriptors to 0. This won't work any
more, now that we're strict about using the right descriptor types
due to 25691/25692/25213.
In order to fix 25691 and 25692, we need to pass the "direct_conn"
flag to more places -- particularly when choosing single-hop
tunnels. The right way to do this involves having a couple more
functions accept router_crn_flags_t, rather than a big list of
boolean arguments.
This commit also makes sure that choose_good_exit_server_general()
honors the direct_conn flag, to fix 25691 and 25692.
In router_add_running_nodes_to_smartlist(), we had an inline
implementation of the logic from node_has_descriptor(), which should
be changed to node_has_preferred_descriptor().
This patch adds a new node_has_preferred_descriptor() function, and
replaces most users of node_has_descriptor() with it. That's an
important change, since as of d1874b4339 (our fix for #25213),
we are willing to say that a node has _some_ descriptor, but not the
_right_ descriptor for a particular use case.
Part of a fix for 25691 and 25692.
Now that we allow cpuworkers for dirport-only hosts (to fix 23693),
we need to allow dup_onion_keys() to succeed for them.
The change to construct_ntor_key_map() is for correctness,
but is not strictly necessary.
This is done as follows:
* Only one function (find_dl_schedule()) actually returned a
smartlist. Now it returns an int.
* The CSV_INTERVAL type has been altered to ignore everything
after the first comma, and to store the value before the first
comma in an int.
This commit won't compile. It was made with the following perl
scripts:
s/smartlist_t \*(.*)DownloadSchedule;/int $1DownloadInitialDelay;/;
s/\b(\w*)DownloadSchedule\b/$1DownloadInitialDelay/;
sizeof(ret) is the size of the pointer, not the size of what it
points to. Fortunately, we already have a function to compare
tor_addr_port_t values for equality.
Bugfix on c2c5b13e5d8a77e; bug not in any released Tor. Found by
clang's scan-build.
Now that we update our buckets on demand before reading or writing,
we no longer need to update them all every TokenBucketRefillInterval
msec.
When a connection runs out of bandwidth, we do need a way to
reenable it, however. We do this by scheduling a timer to reenable
all blocked connections for TokenBucketRefillInterval msec after a
connection becomes blocked.
(If we were using PerConnBWRate more, it might make sense to have a
per-connection timer, rather than a single timeout. But since
PerConnBWRate is currently (mostly) unused, I'm going to go for the
simpler approach here, since usually whenever one connection has
become blocked on bandwidth, most connections are blocked on
bandwidth.)
Implements ticket 25373.
Previously this was done as part of the refill callback, but there's
no real reason to do it like that. Since we're trying to remove the
refill callback completely, we can do this work as part of
record_num_bytes_transferred_impl(), which already does quite a lot
of this.
We used to do this 10x per second in connection_buckets_refill();
instead, we now do it when the bucket becomes empty. This change is
part of the work of making connection_buckets_refill() obsolete.
Closes ticket 25828; bugfix on 0.2.3.5-alpha.
We recently merged a circuit cell queue size safeguard. This commit adds the
number of killed circuits that have reached the limit to the DoS heartbeat. It
now looks like this:
[notice] DoS mitigation since startup: 0 circuits killed with too many
cells. 0 circuits rejected, 0 marked addresses. 0 connections closed. 0
single hop clients refused.
Second thing that this patch does. It makes tor always print the DoS
mitigation heartbeat line (for a relay) even though no DoS mitigation have
been enabled. The reason is because we now kill circuits that have too many
cells regardless on if it is enabled or not but also it will give the operator
a chance to learn what is enabled with the heartbeat instead of suddenly
appearing when it is enabled by let say the consensus.
Fixes#25824
Signed-off-by: David Goulet <dgoulet@torproject.org>
Unfortunately, the units passed to
monotime_coarse_stamp_units_to_approx_msec() was always 0 due to a type
conversion.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit introduces the consensus parameter "circ_max_cell_queue_size"
which controls the maximum number of cells a circuit queue should have.
The default value is currently 50000 cells which is above what should be
expected but keeps us a margin of error for padding cells.
Related to this is #9072. Back in 0.2.4.14-alpha, we've removed that limit due
to a Guard discovery attack. Ticket #25226 details why we are putting back the
limit due to the memory pressure issue on relays.
Fixes#25226
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both header and code file had some indentation issues after mass renaming.
No code behavior change.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Really, the uint32_t is only an optimization; any kind of unit
should work fine. Some users might want to use time_t or
monotime_coarse_t or something like that.
Begin by creating a lowest-level triple of the types needed to
implement a token bucket: a configuration, a timestamp, and the raw
bucket itself.
Note that for low-level buckets, the units of the timestamp and the
bucket itself are unspecified: each user can use a different type.
(This patch breaks check-spaces; a later patch will fix it)
This is a simple search-and-replace to rename the token bucket type
to indicate that it contains both a read and a write bucket, bundled
with their configuration. It's preliminary to refactoring the
bucket type.
This test works by having two post-loop events activate one another
in a tight loop. If the "post-loop" mechanism didn't work, this
would be enough to starve all other events.
A linked connection_t is one that gets its I/O, not from the
network, but from another connection_t. When such a connection has
something to write, we want the corresponding connection to run its
read callback ... but not immediately, to avoid infinite recursion
and/or event loop starvation.
Previously we handled this case by activating the read events
outside the event loop. Now we use the "postloop event" logic.
This lets us simplify do_main_loop_once() a little.
We've been labeling some events as happening "outside the event
loop", to avoid Libevent starvation. This patch provides a cleaner
mechanism to avoid that starvation.
For background, the problem here is that Libevent only scans for new
events once it has run all its active callbacks. So if the
callbacks keep activating new callbacks, they could potentially
starve Libevent indefinitely and keep it from ever checking for
timed, socket, or signal events.
To solve this, we add the ability to label some events as
"post-loop". The rule for a "post-loop" event is that any events
_it_ activates can only be run after libevent has re-scanned for new
events at least once.
This differs from our previous token bucket abstraction in a few
ways:
1) It is an abstraction, and not a collection of fields.
2) It is meant to be used with monotonic timestamps, which should
produce better results than calling gettimeofday over and over.
In d1874b4339, we adjusted this check so that we insist on
using routerinfos for bridges. That's almost correct... but if we
have a bridge that is also a regular relay, then we should use
insist on its routerinfo when connecting to it as a bridge
(directly), and be willing to use its microdescriptor when
connecting to it elsewhere in our circuits.
This bug is a likely cause of some (all?) of the (exit_ei == NULL)
failures we've been seeing.
Fixes bug 25691; bugfix on 0.3.3.4-alpha
When size_t is 32 bits, the unit tests can't fit anything more than
4GB-1 into a size_t.
Additionally, tt_int_op() uses "long" -- we need tt_u64_op() to
safely test uint64_t values for equality.
Bug caused by tests for #24782 fix; not in any released Tor.
This patch changes the algorithm of compute_real_max_mem_in_queues() to
use 0.4 * RAM iff the system has more than or equal to 8 GB of RAM, but
will continue to use the old value of 0.75 * RAM if the system have less
than * GB of RAM available.
This patch also adds tests for compute_real_max_mem_in_queues().
See: https://bugs.torproject.org/24782
This patch makes get_total_system_memory mockable, which allows us to
alter the return value of the function in tests.
See: https://bugs.torproject.org/24782