This is the first half of implementing proposal 301. The
RecommendedPackages torrc option is marked as obsolete and
the test cases for the option removed. Additionally, the code relating
to generating and formatting package lines in votes is removed.
These lines may still appear in votes from other directory authorities
running earlier versions of the code and so consensuses may still
contain package lines. A new consensus method will be needed to stop
including package lines in consensuses.
Fixes: #28465
To ease debugging of miscount issues, attach vanguards with --loglevel DEBUG
and obtain control port logs (or use any other control port CIRC and
CIRC_MINOR event logging mechanism).
If circuit padding wants to keep a circuit open and pathbias used to ignore
it, pathbias should continue to ignore it.
This may catch other purpose-change related miscounts (such as timeout
measurement, cannibalization, onion service circuit transitions, and
vanguards).
Fortunately, in 0.3.5.1-alpha we improved logging for various
failure cases involved with onion service client auth.
Unfortunately, for this one, we freed the file right before logging
its name.
Fortunately, tor_free() sets its pointer to NULL, so we didn't have
a use-after-free bug.
Unfortunately, passing NULL to %s is not defined.
Fortunately, GCC 9.1.1 caught the issue!
Unfortunately, nobody has actually tried building Tor with GCC 9.1.1
before. Or if they had, they didn't report the warning.
Fixes bug 30475; bugfix on 0.3.5.1-alpha.
Some of these functions are now public and cpath-specific so their name should
signify the fact they are part of the cpath module:
assert_cpath_layer_ok -> cpath_assert_layer_ok
assert_cpath_ok -> cpath_assert_ok
onion_append_hop -> cpath_append_hop
circuit_init_cpath_crypto -> cpath_init_circuit_crypto
circuit_free_cpath_node -> cpath_free
onion_append_to_cpath -> cpath_extend_linked_list
We are using an opaque pointer so the structure needs to be allocated on the
heap. This means we now need a constructor for crypt_path_t.
Also modify all places initializing a crypt_path_t to use the constructor.
For various reasons, this was a nontrivial movement. There are
several places in the code where we do something like "update the
flags on this routerstatus or node if we're an authority", and at
least one where we pretended to be an authority when we weren't.
I don't believe any of these represent a real timing vulnerability
(remote timing against memcmp() on a modern CPU is not easy), but
these are the ones where I believe we should be more careful.
Manually fix up some reply-generating code that the Coccinelle scripts
won't match. Some more complicated ones remain -- these are mostly
ones that accumulate data to send, and then call connection_buf_add()
or connection_write_str_to_buf() directly.
Create a set of abstractions for controller commands and events to
output replies to the control channel. The control protocol has a
relatively consistent SMTP-like structure, so it's helpful when code
that implements control commands and events doesn't explicitly format
everything on its own.
Split the core reply formatting code out of control_fmt.c into
control_proto.c. The remaining code in control_format.c deals with
specific subsystems and will eventually move to join those subsystems.
When we tell the periodic event manager about an event, we are
"registering" that event. The event sits around without being
usable, however, until we "connect" the event to libevent. In the
end, we "disconnect" the event and remove its libevent parts.
Previously, we called these operations "add", "setup", and
"destroy", which led to confusion.
The nodelist_idx for each node_t serves as a unique identifier for
the node, so we can use a bitarray to hold all the excluded
nodes, and then remove them from the smartlist.
Previously use used smartlist_subtract(sl, excluded), which is
O(len(sl)*len(excluded)).
We can use this function in other places too, but this is the one
that showed up on the profiles of 30291.
Closes ticket 30307.
This command does not fit perfectly with the others, since its
second argument is optional and may contain equal signs. Still,
it's probably better to squeeze it into the new metaformat, since
doing so allows us to remove several pieces of the old
command-parsing machinery.
(This should be all of the command that work nicely with positional
arguments only.)
Some of these commands should probably treat extra arguments as
incorrect, but for now I'm trying to be careful not to break
any existing users.
The first line break in particular was mishandled: it was discarded
if no arguments came before it, which made it impossible to
distinguish arguments from the first line of the body.
To solve this, we need to allocate a copy of the command rather than
using NUL to separate it, since we might have "COMMAND\n" as our input.
Fixes ticket 29984.
There _is_ an underlying logic to these commands, but it isn't
wholly uniform, given years of tweaks and changes. Fortunately I
think there is a superset that will work.
This commit adds a parser for some of the most basic cases -- the
ones currently handled by getargs_helper() and some of the
object-taking ones. Soon will come initial tests; then I'll start using
the parser.
After that, I'll expand the parser to handle the other cases that come
up in the controller protocol.
In this patch we lower the log level of the failures for the three calls
to unlink() in networkstatus_set_current_consensus(). These errors might
trigger on Windows because the memory mapped consensus file keeps the
file in open state even after we have close()'d it. Windows will then
error on the unlink() call with a "Permission denied" error.
The consequences of ignoring these errors is that we leave an unused
file around on the file-system, which is an easier way to fix this
problem right now than refactoring networkstatus_set_current_consensus().
See: https://bugs.torproject.org/29930
We need to encode here instead of doing escaped(), since fwict
escaped() does not currently handle NUL bytes.
Also, use warn_if_nul_found in more cases to avoid duplication.
Use a table-based lookup to find the right command handler. This
will serve as the basement for several future improvements, as we
improve the API for parsing commands.
This should please coverity, and fix CID 1415721. It didn't
understand that networkstatus_get_param() always returns a value
between its minimum and maximum values.
This should please coverity, and fix CID 1415722. It didn't
understand that networkstatus_get_param() always returns a value
between its minimum and maximum values.
This should please coverity, and fix CID 1415723. It didn't understand
that networkstatus_get_param() always returns a value between its
minimum and maximum values.
And fix the documentation on the function: it does produce trailing
"="s as padding.
Also remove all checks for the return value, which were redundant anyway,
because the function never failed.
Part of 29660.
... and ed25519_public_to_base64(). Also remove all checks for the return
values, which were redundant anyway, because the functions never failed.
Part of 29960.
When we fixed 28614, our answer was "if we failed to load the
consensus on windows and it had a CRLF, retry it." But we logged
the failure at "warn", and we only logged the retry at "info".
Now we log the retry at "notice", with more useful information.
Fixes bug 30004.
Previously we used time(NULL) to set the Expires: header in our HTTP
responses. This made the actual contents of that header untestable,
since the unit tests have no good way to override time(), or to see
what time() was at the exact moment of the call to time() in
dircache.c.
This gave us a race in dir_handle_get/status_vote_next_bandwidth,
where the time() call in dircache.c got one value, and the call in
the tests got another value.
I'm applying our regular solution here: using approx_time() so that
the value stays the same between the code and the test. Since
approx_time() is updated on every event callback, we shouldn't be
losing any accuracy here.
Fixes bug 30001. Bug introduced in fb4a40c32c4a7e5; not in any
released Tor.
When a directory authority is using a bandwidth file to obtain the
bandwidth values that will be included in the next vote, serve this
bandwidth file at /tor/status-vote/next/bandwidth.z.
This is something we should think about harder, but we probably want dormant
mode to be more powerful than padding in case a client has been inactive for a
day or so. After all, there are probably no circuits open at this point and
dormant mode will not allow the client to open more circuits.
Furthermore, padding should not block dormant mode from being activated, since
dormant mode relies on SocksPort activity, and circuit padding does not mess
with that.
The previous commits introduced link_specifier_dup(), which is
implemented using trunnel's opaque interfaces. So we can now
remove hs_desc_link_specifier_dup().
Cleanup after bug 22781.
The previous commits for 23576 confused hs_desc_link_specifier_t
and link_specifier_t. Removing hs_desc_link_specifier_t fixes this
confusion.
Fixes bug 22781; bugfix on 0.3.2.1-alpha.
This patch fixes a crash bug (assertion failure) in the PT subsystem
that could get triggered if the user cancels bootstrap via the UI in
TorBrowser. This would cause Tor to call `managed_proxy_destroy()` which
called `process_free()` after it had called `process_terminate()`. This
leads to a crash when the various process callbacks returns with data
after the `process_t` have been freed using `process_free()`.
We solve this issue by ensuring that everywhere we call
`process_terminate()` we make sure to detach the `managed_proxy_t` from
the `process_t` (by calling `process_set_data(process, NULL)`) and avoid
calling `process_free()` at all in the transports code. Instead we just
call `process_terminate()` and let the process exit callback in
`managed_proxy_exit_callback()` handle the `process_free()` call by
returning true to the process subsystem.
See: https://bugs.torproject.org/29562
Remove router_update_info_send_unencrypted(), and move its code into the
relevant functions.
Then, re-use an options pointer.
Preparation for testing 29017 and 20918.
Remove some tiny static functions called by router_build_fresh_descriptor(),
and move their code into more relevant functions.
Then, give router_update_{router,extra}info_descriptor_body identical layouts.
Preparation for testing 29017 and 20918.
Make sure that these static functions aren't passed NULL.
If they are, log a BUG() warning, and return an error.
Preparation for testing 29017 and 20918.
Split the body of router_build_fresh_descriptor() into static functions,
by inserting function prologues and epilogues between existing sections.
Write a new body for router_build_fresh_descriptor() that calls the new
static functions.
Initial refactor with no changes to the body of the old
router_build_fresh_descriptor(), except for the split.
Preparation for testing 29017 and 20918.
When ExtraInfoStatistics is 0, stop including bandwidth usage statistics,
GeoIPFile hashes, ServerTransportPlugin lines, and bridge statistics
by country in extra-info documents.
Fixes bug 29018; bugfix on 0.2.4.1-alpha (and earlier versions).
Rewrite service_intro_point_new() to take a node_t. Since
node_get_link_specifier_smartlist() supports IPv6 link specifiers,
this refactor adds IPv6 addresses to onion service descriptors.
Part of 23576, implements 26992.
Also, when we log about a failure from base32_decode(), we now
say that the length is wrong or that the characters were invalid:
previously we would just say that there were invalid characters.
Follow-up on 28913 work.
Hope is this will make it easier to test on the live tor network.
Does not need to be merged if we don't want to, but will come in handy
for researchers.
Co-authored-by: George Kadianakis <desnacked@riseup.net>
Redefine the set of bootstrap phases to allow display of finer-grained
progress in the early connection stages of connecting to a relay.
This includes adding intermediate phases for proxy and PT connections.
Also add a separate new phase to indicate obtaining enough directory
info to build circuits so we can report that independently of actually
initiating an ORCONN to build the first application circuit.
Previously, we would claim to be connecting to a relay when we had
merely finished obtaining directory info.
Part of ticket 27167.
Replace a few invocations of control_event_bootstrap() with calls from
the bootstrap tracker subsystem. This mostly leaves behavior
unchanged. The actual behavior changes come in the next commit.
Part of ticket 27167.
Add a tracker for bootstrap progress, tracking events related to
origin circuit and ORCONN states. This uses the ocirc_event and
orconn_event publish-subscribe subsystems.
Part of ticket 27167.
Add a publish-subscribe subsystem to publish messages about changes to
origin circuits.
Functions in circuitbuild.c and circuitlist.c publish messages to this
subsystem.
Move circuit event constants out of control.h so that subscribers
don't have to include all of control.h to take actions based on
messages they receive.
Part of ticket 27167.
Add a publish-subscribe subsystem to publish messages about changes to
OR connections.
connection_or_change_state() in connection_or.c and
control_event_or_conn_event() in control.c publish messages to this
subsystem via helper functions.
Move state constants from connection_or.h to orconn_state.h so that
subscribers don't have to include all of connection_or.h to take
actions based on changes in OR connection state. Move event constants
from control.h for similar reasons.
Part of ticket 27167.
This patch adds support for the new STATUS message that PT's can emit
from their standard out. The STATUS message uses the `config_line_t` K/V
format that was recently added in Tor.
See: https://bugs.torproject.org/28846
This patch changes the LOG pluggable transport message to use the recent
K/V parser that landed in Tor. This allows PT's to specify the log
severity level as well as the message. A mapping between the PT log
severity levels and Tor's log serverity level is provided.
See: https://bugs.torproject.org/28846
Previously we had decoded the asn.1 to get a public key, and then
discarded the asn.1 so that we had to re-encode the key to store it
in the onion_pkey field of a microdesc_t or routerinfo_t.
Now we can just do a tor_memdup() instead, which should be loads
faster.
Previously, we would decode the PEM wrapper for keys twice: once in
get_next_token, and once later in PEM decode. Now we just do all of
the wrapper and base64 stuff in get_next_token, and store the
base64-decoded part in the token object for keys and non-keys alike.
This change should speed up parsing slightly by letting us skip a
bunch of stuff in crypto_pk_read_*from_string(), including the tag
detection parts of pem_decode(), and an extra key allocation and
deallocation pair.
Retaining the base64-decoded part in the token object will allow us
to speed up our microdesc parsing, since it is the asn1 portion that
we actually want to retain.
This patch changes our EVENT_TRANSPORT_LOG event to be EVENT_PT_LOG. The
new message includes the path to the PT executable instead of the
transport name, since one PT binary can include multiple transport they
sometimes might need to log messages that are not specific to a given
transport.
See: https://bugs.torproject.org/28179
This patch changes our process_t's exit_callback to return a boolean
value. If the returned value is true, the process subsystem will call
process_free() on the given process_t.
See: https://bugs.torproject.org/28179
This patch moves the remaining code from subprocess.{h,c} to more
appropriate places in the process.c and process_win32.c module.
We also delete the now empty subprocess module files.
See: https://bugs.torproject.org/28179
This patch adds support for the "LOG" protocol message from a pluggable
transport. This allows pluggable transport developers to relay log
messages from their binary to Tor, which will both emit them as log
messages from the Tor process itself, but also pass them on via the
control port.
See: https://bugs.torproject.org/28180
See: https://bugs.torproject.org/28181
See: https://bugs.torproject.org/28182
This patch makes the managed proxy subsystem use the process_t data
structure such that we can get events from the PT process while Tor is
running and not just when the PT process is being configured.
See: https://bugs.torproject.org/28179
The strcmp_len() function was somewhat misconceived, since we're
only using it to test whether a length+extent string is equal to a
NUL-terminated string or not. By simplifying it and making it
inlined, we should be able to make it a little faster.
(It *does* show up in profiles.)
Closes ticket 28856.
I believe we originally added this for "just in case" safety, but it
isn't actually needed -- we never copy uninitialized stack here.
What's more, this one memset is showing up on our startup profiles,
so we ought to remove it.
Closes ticket 28852.
Add the bootstrap tag name to the log messages, so people
troubleshooting connection problems can look up a symbol instead of a
number. Closes ticket 28731.
When retrying all SOCKS connection because new directory information just
arrived, do not BUG() if a connection in state AP_CONN_STATE_RENDDESC_WAIT is
found to have a usable descriptor.
There is a rare case when this can happen as detailed in #28669 so the right
thing to do is put that connection back in circuit wait state so the
descriptor can be retried.
Fixes#28669
Signed-off-by: David Goulet <dgoulet@torproject.org>
This helper function marks an entry connection as pending for a circuit and
changes its state to AP_CONN_STATE_CIRCUIT_WAIT. The timestamps are set to
now() so it can be considered as new.
No behaviour change, this helper function will be used in next commit.
Part of #28669
Signed-off-by: David Goulet <dgoulet@torproject.org>
Use the helper function connection_ap_mark_as_waiting_for_renddesc()
introduced in previous commit everywhere in the code where an AP connection
state is transitionned to AP_CONN_STATE_RENDDESC_WAIT.
Part of #28669
Signed-off-by: David Goulet <dgoulet@torproject.org>
Specifically, if the consensus is older than the (estimted or
measured) release date for this version of tor, we assume that the
required versions may have changed in between that consensus and
this release.
Implements ticket 27735 and proposal 297.
This patch has routers use the same canonicalization logic as
authorities when encoding their family lists. Additionally, they
now warn if any router in their list is given by nickname, since
that's error-prone.
This patch also adds some long-overdue tests for family formatting.
Prop298 says that family entries should be formatted with
$hexids in uppercase, nicknames in lower case, $hexid~names
truncated, and everything sorted lexically. These changes implement
that ordering for nodefamily.c.
We don't _strictly speaking_ need to nodefamily.c formatting use
this for prop298 microdesc generation, but it seems silly to have
two separate canonicalization algorithms.
This representation is meant to save memory in microdescriptors --
we can't use it in routerinfo_t yet, since those families need to be
encoded losslessly for directory voting to work.
This representation saves memory in three ways:
1. It uses only one allocation per family. (The old way used a
smartlist (2 allocs) plus one strdup per entry.)
2. It stores identity digests in binary, not hex.
3. It keeps families in a canonical format, memoizes, and
reference-counts them.
Part of #27359.
This is part of 28422, so we don't have to call
consider_hibernation() once per second when we're dormant.
This commit does not remove delayed shutdown from hibernate.c: it
uses it as a backup shutdown mechanism, in case the regular shutdown
timer mechanism fails for some reason.
With the new refresh_service_descriptor() function we had both
refresh_service_descriptor() and update_service_descriptor() which is basically
the same thing.
This commit renames update_service_descriptor() to
update_service_descriptor_intro_points() to make it clear it's not a generic
refresh and it's only about intro points.
Commit changes no code.
Before this commit, we would create the descriptor signing key certificate
when first building the descriptor.
In some extreme cases, it lead to the expiry of the certificate which triggers
a BUG() when encoding the descriptor before uploading.
Ticket #27838 details a possible scenario in which this can happen. It is an
edge case where tor losts internet connectivity, notices it and closes all
circuits. When it came back up, the HS subsystem noticed that it had no
introduction circuits, created them and tried to upload the descriptor.
However, in the meantime, if tor did lack a live consensus because it is
currently seeking to download one, we would consider that we don't need to
rotate the descriptors leading to using the expired signing key certificate.
That being said, this commit does a bit more to make this process cleaner.
There are a series of things that we need to "refresh" before uploading a
descriptor: signing key cert, intro points and revision counter.
A refresh function is added to deal with all mutable descriptor fields. It in
turn simplified a bit the code surrounding the creation of the plaintext data.
We keep creating the cert when building the descriptor in order to accomodate
the unit tests. However, it is replaced every single time the descriptor is
uploaded.
Fixes#27838
Signed-off-by: David Goulet <dgoulet@torproject.org>
When storing a descriptor in the client cache, if we are about to replace an
existing descriptor, make sure to close every introduction circuits of the old
descriptor so we don't have leftovers lying around.
Ticket 27471 describes a situation where tor is sending an INTRODUCE1 cell on
an introduction circuit for which it doesn't have a matching intro point
object (taken from the descriptor).
The main theory is that, after a new descriptor showed up, the introduction
points changed which led to selecting an introduction circuit not used by the
service anymore thus for which we are unable to find the corresponding
introduction point within the descriptor we just fetched.
Closes#27471.
Signed-off-by: David Goulet <dgoulet@torproject.org>
It won't be used if there are no authorized client configured. We do that so
we can easily support the addition of a client with a HUP signal which allow
us to avoid more complex code path to generate that cookie if we have at least
one client auth and we had none before.
Fixes#27995
Signed-off-by: David Goulet <dgoulet@torproject.org>
Both client and service had their own code for this. Consolidate into one
place so we avoid duplication.
Closes#27549
Signed-off-by: David Goulet <dgoulet@torproject.org>
When freeing a configuration object from confparse.c in
dump_config(), we need to call the appropriate higher-level free
function (like or_options_free()) and not just config_free().
This only happens with options (since they're the one where
options_validate allocates extra stuff) and only when running
--dump-config with something other than minimal (since
OPTIONS_DUMP_MINIMAL doesn't hit this code).
Fixes bug 27893; bugfix on 0.3.2.1-alpha.
It differs from the rest of the rephist code in that it's actually
necessary for Tor to operate, so it should probably go somewhere
else. I'm not sure where yet, so I'll leave it in the same
directory, but give it its own file.
Since this is completely core functionality, I'm putting it in
core/mainloop, even though it depends on feature/hibernate. We'll
have to sort that out in the future.