If the cache is using 20% of our maximum allowed memory, clean 10% of it. Same
behavior as the HS descriptor cache.
Closes#25122
Signed-off-by: David Goulet <dgoulet@torproject.org>
The current code flow makes it that we can release a channel in a PENDING
state but not in the pending list. This happens while the channel is being
processed in the scheduler loop.
Fixes#25125
Signed-off-by: David Goulet <dgoulet@torproject.org>
The accurate address of a connection is real_addr, not the addr member.
channel_tls_get_remote_addr_method() now returns real_addr instead.
Fixes#24952; bugfix on 707c1e2 in 0.2.4.11-alpha.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
In 0.3.2.1-alpha, we've added notify_networkstatus_changed() in order to have
a way to notify other subsystems that the consensus just changed. The old and
new consensus are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
This commit changes the notify_networkstatus_changed() to be the "before"
function and adds an "after" notification from which the scheduler subsystem
is notified.
Fixes#24975
This is the quick fix that is keeping the channel in PENDING state so if we
ever try to reschedule the same channel, it won't happened.
Fixes#24700
Signed-off-by: David Goulet <dgoulet@torproject.org>
It is possible in normal circumstances that a client fetches a descriptor
that has a lower revision counter than the one in its cache. This can happen
due to HSDir desync.
Fixes#24976
Signed-off-by: David Goulet <dgoulet@torproject.org>
In 0.3.2.1-alpha, we've added this function in order to have a way to notify
other subsystems that the consensus just changed. The old consensus and the
new one are passed to it.
Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.
With this commit, notify_networkstatus_changed() has been moved _after_ the
new consensus is set globally. The main obvious reasons is to fix the bug
described above and in #24975. The other reason is that this notify function
doesn't return anything which could be allowing the possibility of refusing to
set the new consensus on error. In other words, the new consensus is set right
after the notification whatever happens.
It does no harm or change in behavior to set the new consensus first and then
notify the subsystems. The two functions currently used are for the control
port using the old and new consensus and sending the diff. The second is the
scheduler that needs the new consensus to be set globally before being called.
Of course, the function has been documented accordinly to clearly state it is
done _after_ the new consensus is set.
Fixes#24975
Signed-off-by: David Goulet <dgoulet@torproject.org>
Because this touches too many commits at once, it is made into one single
commit.
Remove the use of "tenths" for the circuit rate to simplify things. We can
only refill the buckets at best once every second because of the use of
approx_time() and our token system is set to be 1 token = 1 circuit so make
the rate a flat integer of circuit per second.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Imagine this scenario. We had 10 connections over the 24h lifetime of a geoip
cache entry. The lifetime of the entry has been reached so it is about to get
freed but 2 connections remain for it. After the free, a third connection
comes in thus making us create a new geoip entry for that address matching the
2 previous ones that are still alive. If they end up being closed, we'll have
a concurrent count desynch from what the reality is.
To mitigate this probably very rare scenario in practice, when we free a geoip
entry and it has a concurrent count above 0, we'll go over all connections
matching the address and clear out the tracked flag. So once they are closed,
we don't try to decrement the count.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This option refuses any ESTABLISH_RENDEZVOUS cell arriving from a client
connection. Its default value is "auto" for which we can turn it on or off
with a consensus parameter. Default value is 0.
Signed-off-by: David Goulet <dgoulet@torproject.org>
If the client address was detected as malicious, apply a defense which is at
this commit to return a DESTROY cell.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Add a function that notifies the DoS subsystem that a new CREATE cell has
arrived. The statistics are updated accordingly and the IP address can also be
marked as malicious if it is above threshold.
At this commit, no defense is applied, just detection with a circuit creation
token bucket system.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Implement a basic connection tracking that counts the number of concurrent
connections when they open and close.
This commit also adds the circuit creation mitigation data structure that will
be needed at later commit to keep track of the circuit rate.
Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit introduces the src/or/dos.{c|h} files that contains the code for
the Denial of Service mitigation subsystem. It currently contains basic
functions to initialize and free the subsystem. They are used at this commit.
The torrc options and consensus parameters are defined at this commit and
getters are implemented.
Signed-off-by: David Goulet <dgoulet@torproject.org>
The upcoming DoS mitigation subsytem needs to keep information on a per-IP
basis which is also what the geoip clientmap does.
For another subsystem to access that clientmap, this commit adds a lookup
function that returns the entry. For this, the clientmap_entry_t had to be
moved to the header file.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Without this patch, not only will TLS1.3 not work with Tor, but
OpenSSL 1.1.1 with TLS1.3 enabled won't build any connections at
all: It requires that either TLS1.3 be disabled, or some TLS1.3
ciphersuites be listed.
Closes ticket 24978.
Fix an "off by 2" error in counting rendezvous failures on the onion
service side.
While we thought we would stop the rendezvous attempt after one failed
circuit, we were actually making three circuit attempts before giving up.
Fixes bug 24895; bugfix on 0.0.6.
Fix a set of false positives where relays would consider connections
to other relays as being client-only connections (and thus e.g.
deserving different link padding schemes) if those relays fell out
of the consensus briefly.
Now we look only at the initial handshake and whether the connection
authenticated as a relay.
Fixes bug 24898; bugfix on 0.3.1.1-alpha.
New-style (v3) onion services now obey the "max rendezvous circuit
attempts" logic.
Previously they would make as many rendezvous circuit attempts as they
could fit in the MAX_REND_TIMEOUT second window before giving up.
Fixes bug 24894; bugfix on 0.3.2.1-alpha.
These are all about local variables shadowing global
functions. That isn't normally a problem, but at least one
compiler we care about seems to treat this as a case of -Wshadow
violation, so let's fix it.
Fixes bug 24634; bugfix on 0.3.2.1-alpha.
When the fascist_firewall_choose_address_ functions don't find a
reachable address, set the returned address to the null address and port.
This is a precautionary measure, because some callers do not check the
return value.
Fixes bug 24736; bugfix on 0.2.8.2-alpha.
This makes clients on the public tor network prefer to bootstrap off fallback
directory mirrors.
This is a follow-up to 24679, which removed weights from the default fallbacks.
Implements ticket 24681.
We've been seeing problems with destroy cells queues taking up a
huge amount of RAM. We can mitigate this, since while a full packed
destroy cell takes 514 bytes, we only need 5 bytes to remember a
circuit ID and a reason.
Fixes bug 24666. Bugfix on 0.2.5.1-alpha, when destroy cell queues
were introduced.
With extra_space negative, it means that the "notsent" queue is quite large so
we must consider that value with the current computed tcp_space. If we end up
to have negative space, we should not add more data to the kernel since the
notsent queue is just too filled up.
Fixes#24665
Signed-off-by: David Goulet <dgoulet@torproject.org>
Instead of using INT_MAX as a write limit for KISTLite, use the lower layer
limit which is using the specialized num_cells_writeable() of the channel that
will down the line check the connection's outbuf and limit it to 32KB
(OR_CONN_HIGHWATER).
That way we don't take the chance of bloating the connection's outbuf and we
keep the cells in the circuit queue which our OOM handler can take care of,
not the outbuf.
Finally, this commit adds a log_debug() in the update socket information
function of KIST so we can get the socket information in debug.
Fixes#24671
Signed-off-by: David Goulet <dgoulet@torproject.org>
Retry directory downloads when we get our first bridge descriptor
during bootstrap or while reconnecting to the network. Keep retrying
every time we get a bridge descriptor, until we have a reachable bridge.
Stop delaying bridge descriptor fetches when we have cached bridge
descriptors. Instead, only delay bridge descriptor fetches when we
have at least one reachable bridge.
Fixes bug 24367; bugfix on 0.2.0.3-alpha.
When entry_list_is_constrained() is true, guards_retry_optimistic()
always returns true.
When entry_list_is_constrained() is false,
options->UseBridges is always false,
therefore !options->UseBridges is always true,
therefore (!options->UseBridges || ...) is always true.
Cleanup after #24367.
Commit e80893e51b made tor call
hs_service_intro_circ_has_closed() when we mark for close a circuit.
When we cleanup intro points, we iterate over the descriptor's map of intro
points and we can possibly mark for close a circuit. This was problematic
because we would MAP_DEL_CURRENT() the intro point then free it and finally
mark for close the circuit which would lookup the intro point that we just
free in the map we are iterating over.
This can't be done and leads to a use-after-free because the intro point will
be returned successfully due to the fact that we are still in the loop
iterating. In other words, MAP_DEL_CURRENT() followed by a digest256map_get()
of the same object should never be done in the same loop.
Fixes#24595
Signed-off-by: David Goulet <dgoulet@torproject.org>
In KIST, we could have a small congestion window value than the unacked
packets leading to a integer overflow which leaves the tcp_space value to be
humongous.
This has no security implications but it results in KIST scheduler allowing to
send cells on a potentially saturated connection.
Found by #24423. Fixes#24590.
Signed-off-by: David Goulet <dgoulet@torproject.org>
Previously, circuit_stream_is_being_handled incorrectly reported
that (1) an exit port was "handled" by a circuit regardless of
whether the circuit was already isolated in some way, and
(2) that a stream could be "handled" by a circuit even if their
isolation settings were incompatible.
As a result of (1), in Tor Browser, circuit_get_unhandled_ports was
reporting that all ports were handled even though all non-internal
circuits had already been isolated by a SOCKS username+password.
Therefore, circuit_predict_and_launch_new was declining to launch
new exit circuits. Then, when the user visited a new site in Tor
Browser, a stream with new SOCKS credentials would be initiated,
and the stream would have to wait while a new circuit with those
credentials could be built. That wait was making the
time-to-first-byte longer than it needed to be.
Now, clean, not-yet-isolated circuit(s) will be automatically
launched ahead of time and be ready for use whenever a new stream
with new SOCKS credentials (or other isolation criteria) is
initiated.
Fixes bug 18859. Thanks to Nick Mathewson for improvements.