Because of #25306 for which we are unable to reproduce nor understand how it
is possible, this commit removes the asserts() and BUG() on the missing
descriptors instead when rotating them.
This allows us to log more data on error but also to let tor recover
gracefully instead of dying.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This patch adds some additional logging to circuits_handle_oom() to give
us more information about which specific compression backend that is
using a certain amount of memory.
See: https://bugs.torproject.org/25372
This should avoid most intermittent test failures on developer and CI machines,
but there could (and probably should) be a more elegant solution.
Also, this test was testing that the IP was created and its expiration time was
set to a time greater than or equal to `now+INTRO_POINT_LIFETIME_MIN_SECONDS+5`:
/* Time to expire MUST also be in that range. We add 5 seconds because
* there could be a gap between setting now and the time taken in
* service_intro_point_new. On ARM, it can be surprisingly slow... */
tt_u64_op(ip->time_to_expire, OP_GE,
now + INTRO_POINT_LIFETIME_MIN_SECONDS + 5);
However, this appears to be a typo, since, according to the comment above it,
adding five seconds was done because the IP creation can be slow on some
systems. But the five seconds is added to the *minimum* time we're comparing
against, and so it actually functions to make this test *more* likely to fail on
slower systems. (It should either subtract five seconds, or instead add it to
time_to_expire.)
* FIXES#25450: https://bugs.torproject.org/25450
The C code and the rust code had different separate integer overflow
bugs here. That suggests that we're better off just forbidding this
pathological case.
Also, add tests for expected behavior on receiving a bad protocol
list in a consensus.
Fixes another part of 25249.
This one can only be exploited if you can generate a correctly
signed consensus, so it's not as bad as 25074.
Fixes bug 25251; also tracked as TROVE-2018-004.
In some cases we had checked for it, but in others we had not. One
of these cases could have been used to remotely cause
denial-of-service against directory authorities while they attempted
to vote.
Fixes TROVE-2018-001.
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
[Cherry-picked]
since all it does is produce false positives
this commit should get merged into 0.2.9 and 0.3.0 *and* 0.3.1, even
though the code in the previous commit is already present in 0.3.1. sorry
for the mess.
This commit takes a piece of commit af8cadf3a9 and a piece of commit
46fe353f25, with the goal of making channel_is_client() be based on what
sort of connection handshake the other side used, rather than seeing
whether the other side ever sent a create_fast cell to us.
We had this safeguard around dos_init() but not when the consensus changes
which can modify consensus parameters and possibly enable the DoS mitigation
even if tor wasn't a public relay.
Fixes#25223
Signed-off-by: David Goulet <dgoulet@torproject.org>
Explicitly inform the operator of the rejected relay to set a valid email
address in the ContactInfo field and contact bad-relays@ mailing list.
Fixes#25170
Signed-off-by: David Goulet <dgoulet@torproject.org>
Explicitly inform the operator of the rejected relay to set a valid email
address in the ContactInfo field and contact bad-relays@ mailing list.
Fixes#25170
Signed-off-by: David Goulet <dgoulet@torproject.org>
At this commit, the SocksSocketsGroupWritable option is renamed to
UnixSocksGroupWritable. A deprecated warning is triggered if the old option is
used and tor will use it properly.
Fixes#24343
Signed-off-by: David Goulet <dgoulet@torproject.org>
On slow system, 1 msec between one read and the other was too tight. For
instance, it failed on armel with a 4msec gap:
https://buildd.debian.org/status/package.php?p=tor&suite=experimental
Increase to 10 msec for now to address slow system. It is important that we
keep this OP_LE test in so we make sure the msec/usec/nsec read aren't
desynchronized by huge gaps. We'll adjust again if we ever encounter a system
that goes slower than 10 msec between calls.
Fixes#25113
Signed-off-by: David Goulet <dgoulet@torproject.org>
Remove a series of connection counters that were only used when dumping the
rephist statistics with SIGUSR1 signal.
This reduces the or_history_t structure size.
Closes#25163
Signed-off-by: David Goulet <dgoulet@torproject.org>
Services can keep rendezvous circuits for a while so don't log them if tor is
a single onion service.
Fixes#25116
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the cache is using 20% of our maximum allowed memory, clean 10% of it. Same
behavior as the HS descriptor cache.
Closes#25122
Signed-off-by: David Goulet <dgoulet@torproject.org>
This reverts commit 9a06282546.
It appears that I misunderstood how the seccomp2 filter rules
interact. It appears that `SCMP_ACT_ERRNO()` always takes
precedence over `SCMP_ACT_ALLOW()` -- I had thought instead that
earlier rules would override later ones. But this change caused bug
25115 (not in any released Tor).
The accurate address of a connection is real_addr, not the addr member.
channel_tls_get_remote_addr_method() now returns real_addr instead.
Fixes#24952; bugfix on 707c1e2 in 0.2.4.11-alpha.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Tor preemptiely builds circuits and they can be cannibalized later in their
lifetime. A Guard node can become unusable (from our guard state) but we can
still have circuits using that node opened. It is important to not pick those
circuits for any usage through the cannibalization process.
Fixes#24469
Signed-off-by: David Goulet <dgoulet@torproject.org>
In 0.3.2.1-alpha, we've added notify_networkstatus_changed() in order to have
a way to notify other subsystems that the consensus just changed. The old and
new consensus are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
This commit changes the notify_networkstatus_changed() to be the "before"
function and adds an "after" notification from which the scheduler subsystem
is notified.
Fixes#24975
This is the quick fix that is keeping the channel in PENDING state so if we
ever try to reschedule the same channel, it won't happened.
Fixes#24700
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible in normal circumstances that a client fetches a descriptor
that has a lower revision counter than the one in its cache. This can happen
due to HSDir desync.
Fixes#24976
Signed-off-by: David Goulet <dgoulet@torproject.org>
In 0.3.2.1-alpha, we've added this function in order to have a way to notify
other subsystems that the consensus just changed. The old consensus and the
new one are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
With this commit, notify_networkstatus_changed() has been moved _after_ the
new consensus is set globally. The main obvious reasons is to fix the bug
described above and in #24975. The other reason is that this notify function
doesn't return anything which could be allowing the possibility of refusing to
set the new consensus on error. In other words, the new consensus is set right
after the notification whatever happens.
It does no harm or change in behavior to set the new consensus first and then
notify the subsystems. The two functions currently used are for the control
port using the old and new consensus and sending the diff. The second is the
scheduler that needs the new consensus to be set globally before being called.
Of course, the function has been documented accordinly to clearly state it is
done _after_ the new consensus is set.
Fixes#24975
Signed-off-by: David Goulet <dgoulet@torproject.org>
Stop adding unneeded channel padding right after we finish flushing
to a connection that has been trying to flush for many seconds.
Instead, treat all partial or complete flushes as activity on the
channel, which will defer the time until we need to add padding.
This fix should resolve confusing and scary log messages like
"Channel padding timeout scheduled 221453ms in the past."
Fixes bug 22212; bugfix on 0.3.1.1-alpha.
I think technically we could resolve bug 22212 by adding a call to
channel_timestamp_active() only in the finished_flushing case. But I added
a call in the flushed_some case too since that seems to more accurately
reflect the notion of "active".
This patch updates the HiddenServiceVersion man-page entry to only
accept either 2 or 3 as argument and not a list of multiple versions.
See: https://bugs.torproject.org/25026
Previously, we wouldn't do this when running with a routerinfo_t in
some cases, leading to many needless calls to the protover module.
This change also cleans up the code in nodelist.c a bit.
Fixes bug 25008; bugfix on 0.2.9.4-alpha.
Without this patch, not only will TLS1.3 not work with Tor, but
OpenSSL 1.1.1 with TLS1.3 enabled won't build any connections at
all: It requires that either TLS1.3 be disabled, or some TLS1.3
ciphersuites be listed.
Closes ticket 24978.
For 23847, we want Tor to be able to shut down and then restart in
the same process. Here's a patch to make the Tor binary do that.
To test it, you need to build with --enable-restart-debugging, and
then you need to set the environment variable TOR_DEBUG_RESTART.
With this option, Tor will then run for 5 seconds, then restart
itself in-process without exiting. This only happens once.
You can change the 5-second interval using
TOR_DEBUG_RESTART_AFTER_SECONDS.
Implements ticket 24583.
Fix an "off by 2" error in counting rendezvous failures on the onion
service side.
While we thought we would stop the rendezvous attempt after one failed
circuit, we were actually making three circuit attempts before giving up.
Fixes bug 24895; bugfix on 0.0.6.
Fix a set of false positives where relays would consider connections
to other relays as being client-only connections (and thus e.g.
deserving different link padding schemes) if those relays fell out
of the consensus briefly.
Now we look only at the initial handshake and whether the connection
authenticated as a relay.
Fixes bug 24898; bugfix on 0.3.1.1-alpha.
New-style (v3) onion services now obey the "max rendezvous circuit
attempts" logic.
Previously they would make as many rendezvous circuit attempts as they
could fit in the MAX_REND_TIMEOUT second window before giving up.
Fixes bug 24894; bugfix on 0.3.2.1-alpha.
Define TOR_PRIuSZ as minGW compiler doesn't support zu format specifier for
size_t type.
Fixes#24861 on ac9eebd.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
... in get_interface_addresses_ioctl().
This pointer alignment issue exists on x86_64 macOS, but is unlikely to exist
elsewhere. (i386 macOS only requires 4-byte alignment, and other OSs have
8-byte ints.)
Fixes bug 24733; not in any released version of tor.
Avoid selecting fallbacks that change their IP addresses too often.
Select more fallbacks by ignoring the Guard flag, and allowing lower
cutoffs for the Running and V2Dir flags. Also allow a lower bandwidth,
and a higher number of fallbacks per operator (5% of the list).
Implements ticket 24785.
Add the generateFallbackDirLine.py script for automatically generating
fallback directory mirror lines from relay fingerprints. No more typos!
Add the lookupFallbackDirContact.py script for automatically looking up
operator contact info from relay fingerprints.
Implements ticket 24706.
This removes some redundant repeated lines.
Ticket 24681 will maintain the current fallback weights by changing
Tor's default fallback weight to 10.
Implements ticket 24679.
These are all about local variables shadowing global
functions. That isn't normally a problem, but at least one
compiler we care about seems to treat this as a case of -Wshadow
violation, so let's fix it.
Fixes bug 24634; bugfix on 0.3.2.1-alpha.
Tor now sets IPv6 preferences on rewrite_node_address_for_bridge() even if
there is only ri or rs. It always warns about them.
Also Tor now sets the IPv6 address in rs as well as it sets the one in ri.
Fixes#24572 on 9e9edf7 in 0.2.4.5-alpha.
Fixes#24573 on c213f27 in 0.2.8.2-alpha.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Tor test now checks if "ping -6 -c 1 -W 1 ::1" works when "ping6 -c 1 -W 1 ::1"
fails on tests.
Fixes#24677; bugfix in 0.2.9.3-alpha.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
When the fascist_firewall_choose_address_ functions don't find a
reachable address, set the returned address to the null address and port.
This is a precautionary measure, because some callers do not check the
return value.
Fixes bug 24736; bugfix on 0.2.8.2-alpha.
This makes clients on the public tor network prefer to bootstrap off fallback
directory mirrors.
This is a follow-up to 24679, which removed weights from the default fallbacks.
Implements ticket 24681.
We've been seeing problems with destroy cells queues taking up a
huge amount of RAM. We can mitigate this, since while a full packed
destroy cell takes 514 bytes, we only need 5 bytes to remember a
circuit ID and a reason.
Fixes bug 24666. Bugfix on 0.2.5.1-alpha, when destroy cell queues
were introduced.
With extra_space negative, it means that the "notsent" queue is quite large so
we must consider that value with the current computed tcp_space. If we end up
to have negative space, we should not add more data to the kernel since the
notsent queue is just too filled up.
Fixes#24665
Signed-off-by: David Goulet <dgoulet@torproject.org>
Instead of using INT_MAX as a write limit for KISTLite, use the lower layer
limit which is using the specialized num_cells_writeable() of the channel that
will down the line check the connection's outbuf and limit it to 32KB
(OR_CONN_HIGHWATER).
That way we don't take the chance of bloating the connection's outbuf and we
keep the cells in the circuit queue which our OOM handler can take care of,
not the outbuf.
Finally, this commit adds a log_debug() in the update socket information
function of KIST so we can get the socket information in debug.
Fixes#24671
Signed-off-by: David Goulet <dgoulet@torproject.org>
Exposing cell_queues_get_total_allocation(), buf_get_total_allocation(),
tor_compress_get_total_allocation(), tor_compress_get_total_allocation() when
hit MaxMemInQueues threshold.
Fixes#24501
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
This patch adds support for MainloopStats that allow developers to get
main event loop statistics via Tor's heartbeat status messages. The new
status log message will show how many succesful, erroneous, and idle
event loop iterations we have had.
See: https://bugs.torproject.org/24605
Adding tor_remove_file(filename) and refactoring tor_cleanup().
Removing CookieAuthFile and ExtORPortCookieAuthFile when tor_cleanup() is
called.
Fixes#23271.
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Using absolute_msec requires a 64-bit division operation every time
we calculate it, which gets expensive on 32-bit architectures.
Instead, just use the lazy "monotime_coarse_get()" operation, and
don't convert to milliseconds until we absolutely must.
In this case, it seemed fine to use a full monotime_coarse_t rather
than a truncated "stamp" as we did to solve this problem for the
timerstamps in buf_t and packed_cell_t: There are vastly more cells
and buffer chunks than there are channels, and using 16 bytes per
channel in the worst case is not a big deal.
There are still more millisecond operations here than strictly
necessary; let's see any divisions show up in profiles.
* ADDS several `AC_MSG_RESULT`s which print the result of our checks
for our rust dependencies and a check for a suitable rustc compiler
version.
* FIXES#24612: https://bugs.torproject.org/24612
Retry directory downloads when we get our first bridge descriptor
during bootstrap or while reconnecting to the network. Keep retrying
every time we get a bridge descriptor, until we have a reachable bridge.
Stop delaying bridge descriptor fetches when we have cached bridge
descriptors. Instead, only delay bridge descriptor fetches when we
have at least one reachable bridge.
Fixes bug 24367; bugfix on 0.2.0.3-alpha.
In KIST, we could have a small congestion window value than the unacked
packets leading to a integer overflow which leaves the tcp_space value to be
humongous.
This has no security implications but it results in KIST scheduler allowing to
send cells on a potentially saturated connection.
Found by #24423. Fixes#24590.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This function leaks memory when the event_base is freed before the
event itself fires. That's not harmful, but it's annoying when
trying to debug other memory leaks.
Fixes bug 24584; bugfix on 0.2.8.1-alpha.
This patch adds support for Android's logging subsystem in Tor. When
debugging Android applications it is useful to be able to collect
information about the application running on the platform via the
various system services that is available on the platform.
This patch allows you to add "Log notice android" to your torrc and have
Tor send everything above and including the notice severity to Android's
ring buffer which can be inspected using the 'adb logcat' program.
See: https://bugs.torproject.org/24362
Also make IPv6-only clients wait for microdescs for relays, even if we were
previously using descriptors (or were using them as a bridge) and have
a cached descriptor for them.
But if node_is_a_configured_bridge(), stop waiting for its IPv6 address in
a microdescriptor, because we'll never use it.
Implements #23827.
Split hs_circuitmap_get_rend_circ_client_side(). One returns only established
circuits (hs_circuitmap_get_established_rend_circ_client_side()) and the other
returns all kinds of circuits.
Fixes#23459
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
Previously, circuit_stream_is_being_handled incorrectly reported
that (1) an exit port was "handled" by a circuit regardless of
whether the circuit was already isolated in some way, and
(2) that a stream could be "handled" by a circuit even if their
isolation settings were incompatible.
As a result of (1), in Tor Browser, circuit_get_unhandled_ports was
reporting that all ports were handled even though all non-internal
circuits had already been isolated by a SOCKS username+password.
Therefore, circuit_predict_and_launch_new was declining to launch
new exit circuits. Then, when the user visited a new site in Tor
Browser, a stream with new SOCKS credentials would be initiated,
and the stream would have to wait while a new circuit with those
credentials could be built. That wait was making the
time-to-first-byte longer than it needed to be.
Now, clean, not-yet-isolated circuit(s) will be automatically
launched ahead of time and be ready for use whenever a new stream
with new SOCKS credentials (or other isolation criteria) is
initiated.
Fixes bug 18859. Thanks to Nick Mathewson for improvements.
Instead of using the cwd to specify the location of Cargo.toml, we
use the --manifest-path option to specify its location explicitly.
This works around the bug that isis diagnosed on our jenkins builds.
Making errno error log more useful for getrandom() call. Adding if statement to
make difference between ENOSYS and other errors.
Fixes#24500
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>
First, hs_service_intro_circ_has_closed() is now called in circuit_mark_for
close() because the HS subsystem needs to learn when an intro point is
actually not established anymore as soon as possible. There is a time window
between a close and a free.
Second, when we mark for close, we also remove it from the circuitmap because
between the close and the free, a service can launch an new circuit to that
same intro point and thus register it which only succeeds if the intro point
authentication key is not already in the map.
However, we still do a remove from the circuitmap in circuit_free() in order
to also cleanup the circuit if it wasn't marked for close prior to the free.
Fixes#23603
Signed-off-by: David Goulet <dgoulet@torproject.org>
In the KIST main loop, if the channel happens to be not opened, set its state
to IDLE so we can release it properly later on. Prior to this fix, the channel
was in PENDING state, removed from the channel pending list and then kept in
that state because it is not opened.
This bug was introduced in commit dcabf801e5 for
which we made the scheduler loop not consider unopened channel.
This has no consequences on tor except for an annoying but harmless BUG()
warning.
Fixes#24502
Signed-off-by: David Goulet <dgoulet@torproject.org>
Some platforms don't have good monotonic time support so don't warn when the
diff between the last run of the scheduler time and now is negative. The
scheduler recovers properly from this so no need to be noisy.
Fixes#23696
Signed-off-by: David Goulet <dgoulet@torproject.org>
When creating a routerstatus (vote) from a routerinfo (descriptor),
set the IPv6 address to the unspecified IPv6 address, and explicitly
initialise the port to zero.
Also clarify the documentation for the function.
Fixes bug 24488; bugfix on 0.2.4.1-alpha.
Modified -Wnormalized flag to nfkc option in configure.ac to avoid source code
identifier confusion.
Fixes#24467
Signed-off-by: Fernando Fernandez Mancera <ffernandezmancera@gmail.com>