Applies the 6c443e987d fix to router_pick_directory_server_impl.
6c443e987d applied to directory servers chosen from the consensus,
and was:
"Tweak the 9969 fix a little
If we have busy nodes and excluded nodes, then don't retry with the
excluded ones enabled. Instead, wait for the busy ones to be nonbusy."
"Tor has included a feature to fetch the initial consensus from nodes
other than the authorities for a while now. We just haven't shipped a
list of alternate locations for clients to go to yet.
Reasons why we might want to ship tor with a list of additional places
where clients can find the consensus is that it makes authority
reachability and BW less important.
We want them to have been around and using their current key, address,
and port for a while now (120 days), and have been running, a guard,
and a v2 directory mirror for most of that time."
Features:
* whitelist and blacklist for an opt-in/opt-out trial.
* excludes BadExits, tor versions that aren't recommended, and low
consensus weight directory mirrors.
* reduces the weighting of Exits to avoid overloading them.
* places limits on the weight of any one fallback.
* includes an IPv6 address and orport for each FallbackDir, as
implemented in #17327. (Tor won't bootstrap using IPv6 fallbacks
until #17840 is merged.)
* generated output includes timestamps & Onionoo URL for traceability.
* unit test ensures that we successfully load all included default
fallback directories.
Closes ticket #15775. Patch by "teor".
OnionOO script by "weasel", "teor", "gsathya", and "karsten".
router_digest_is_fallback_dir returns 1 if the digest is in the
currently loaded list of fallback directories, and 0 otherwise.
This function is for future use.
Prop210: Add attempt-based connection schedules
Existing tor schedules increment the schedule position on failure,
then retry the connection after the scheduled time.
To make multiple simultaneous connections, we need to increment the
schedule position when making each attempt, then retry a (potentially
simultaneous) connection after the scheduled time.
(Also change find_dl_schedule_and_len to find_dl_schedule, as it no
longer takes or returns len.)
Prop210: Add multiple simultaneous consensus downloads for clients
Make connections on TestingClientBootstrapConsensus*DownloadSchedule,
incrementing the schedule each time the client attempts to connect.
Check if the number of downloads is less than
TestingClientBootstrapConsensusMaxInProgressTries before trying any
more connections.
Some functions that use digest maps did not mention that the digests are
expected to have DIGEST_LEN bytes. This lead to buffer over-reads in the
past.
Mark fallback directory mirrors as "too busy" when they return
a 503 response. Previously, the code just marked authorities as busy.
Unless clients set their own fallback directories, they will never see
this bug. (There are no default fallbacks yet.)
Fixes bug 17572; bugfix on 5c51b3f1f0 released in 0.2.4.7-alpha.
Patch by "teor".
Instead of having it call update_all_descriptor_downloads and
update_networkstatus_downloads directly, we can have it cause them to
get rescheduled and called from run_scheduled_events.
Closes ticket 16789.
Extrainfo documents are now ed-signed just as are router
descriptors, according to proposal 220. This patch also includes
some more tests for successful/failing parsing, and fixes a crash
bug in ed25519 descriptor parsing.
Routers now use TAP and ntor onion keys to sign their identity keys,
and put these signatures in their descriptors. That allows other
parties to be confident that the onion keys are indeed controlled by
the router that generated the descriptor.
Fixes bug 11454, where we would keep around a superseded descriptor
if the descriptor replacing it wasn't at least a week later. Bugfix
on 0.2.1.8-alpha.
Fixes bug 11457, where a certificate with a publication time in the
future could make us discard existing (and subsequent!) certificates
with correct publication times. Bugfix on 0.2.0.3-alpha.
Decrease minimum consensus interval to 10 seconds
when TestingTorNetwork is set. (Or 5 seconds for
the first consensus.)
Fix code that assumes larger interval values.
This assists in quickly bootstrapping a testing
Tor network.
Fixes bugs 13718 & 13823.
Instead, generate new keys, and overwrite the empty key files.
Adds FN_EMPTY to file_status_t and file_status.
Fixes bug 13111.
Related changes due to review of FN_FILE usage:
Stop generating a fresh .old RSA key file when the .old file is missing.
Avoid overwriting .old key files with empty key files.
Skip loading zero-length extra info store, router store, stats, state,
and key files.
We were only using it when smartlist_choose_node_by_bandwidth_weights
failed. But that function could only fail in the presence of
buggy/ancient authorities or in the absence of a consensus. Either
way, it's better to use sensible defaults and a nicer algorithm.
1. The test that adds things to the cache needs to set the clock back so
that the descriptors it adds are valid.
2. We split ROUTER_NOT_NEW into ROUTER_TOO_OLD, so that we can
distinguish "already had it" from "rejected because of old published
date".
3. We make extrainfo_insert() return a was_router_added_t, and we
make its caller use it correctly. This is probably redundant with
the extrainfo_is_bogus flag.
One pain point in evolving the Tor design and implementing has been
adding code that makes clients reject directory documents that they
previously would have accepted, if those descriptors actually exist.
When this happened, the clients would get the document, reject it,
and then decide to try downloading it again, ad infinitum. This
problem becomes particularly obnoxious with authorities, since if
some authorities accept a descriptor that others don't, the ones
that don't accept it would go crazy trying to re-fetch it over and
over. (See for example ticket #9286.)
This patch tries to solve this problem by tracking, if a descriptor
isn't parseable, what its digest was, and whether it is invalid
because of some flaw that applies to the portion containing the
digest. (This excludes RSA signature problems: RSA signatures
aren't included in the digest. This means that a directory
authority can still put another directory authority into a loop by
mentioning a descriptor, and then serving that descriptor with an
invalid RSA signatures. But that would also make the misbehaving
directory authority get DoSed by the server it's attacking, so it's
not much of an issue.)
We already have a mechanism to mark something undownloadable with
downloadstatus_mark_impossible(); we use that here for
microdescriptors, extrainfos, and router descriptors.
Unit tests to follow in another patch.
Closes ticket #11243.
Document usage of the NO_DIRINFO and ALL_DIRINFO flags clearly in functions
which take them as arguments. Replace 0 with NO_DIRINFO in a function call
for clarity.
Seeks to prevent future issues like 13163.
Avoid 4 null pointer errors under clang shallow analysis (the default when
building under Xcode) by using tor_assert() to prove that the pointers
aren't null. Resolves issue 13284 via minor code refactoring.
Technically, we're not allowed to take the address of a member can't
exist relative to the null pointer. That makes me wonder how any sane
compliant system implements the offsetof macro, but let's let sleeping
balrogs lie.
Fixes 13096; patch on 0.1.1.9-alpha; patch from "teor", who was using
clang -fsanitize=undefined-trap -fsanitize-undefined-trap-on-error -ftrapv
Coverity thinks that when we do "double x = int1/int2;", we probably
meant "double x = ((double)int1) / int2;". In these cases, we
didn't.
[Coverity CID 1232089 and 1232090]
Clients should always believe that v3 directory authorities serve
extra-info documents, regardless of whether their server descriptor
contains a "caches-extra-info" line or not.
Fixes part of #11683.
These options were added back in 0.1.2.5-alpha, but no longer make any
sense now that all directories support tunneled connections and
BEGIN_DIR cells. These options were on by default; now they are
always-on.
This is a fix for 10849, where TunnelDirConns 0 would break hidden
services -- and that bug arrived, I think, in 0.2.0.10-alpha.
The remaining vestige is that we continue to publish the V2dir flag,
and that, for the controller, we continue to emit v2 directory
formats when requested.
We were freeing these on exit, but when we added the dl_status_map
field to them in fddb814f, we forgot to arrange for it to be freed.
I've moved the cert_list_free() code into its own function, and added
an appropriate dsmap_free() call.
Fixes bug 9644; bugfix on 0.2.4.13-alpha.
We previously used FILENAME_PRIVATE identifiers mostly for
identifiers exposed only to the unit tests... but also for
identifiers exposed to the benchmarker, and sometimes for
identifiers exposed to a similar module, and occasionally for no
really good reason at all.
Now, we use FILENAME_PRIVATE identifiers for identifiers shared by
Tor and the unit tests. They should be defined static when we
aren't building the unit test, and globally visible otherwise. (The
STATIC macro will keep us honest here.)
For identifiers used only by the unit tests and never by Tor at all,
on the other hand, we wrap them in #ifdef TOR_UNIT_TESTS.
This is not the motivating use case for the split test/non-test
build system; it's just a test example to see how it works, and to
take a chance to clean up the code a little.
It returns the method by which we decided our public IP address
(explicitly configured, resolved from explicit hostname, guessed from
interfaces, learned by gethostname).
Now we can provide more helpful log messages when a relay guesses its IP
address incorrectly (e.g. due to unexpected lines in /etc/hosts). Resolves
ticket 2267.
While we're at it, stop sending a stray "(null)" in some cases for the
server status "EXTERNAL_ADDRESS" controller event. Resolves bug 8200.
This is meant to avoid conflict with the built-in log() function in
math.h. It resolves ticket 7599. First reported by dhill.
This was generated with the following perl script:
#!/usr/bin/perl -w -i -p
s/\blog\(LOG_(ERR|WARN|NOTICE|INFO|DEBUG)\s*,\s*/log_\L$1\(/g;
s/\blog\(/tor_log\(/g;
This is a minimal refactoring to expose the weighted bandwidth
calculations for each node so I can use them to see what fraction of
nodes, weighted by bandwidth, we have descriptors for.
This is an automatically generated commit, from the following perl script,
run with the options "-w -i -p".
s/smartlist_string_num_isin/smartlist_contains_int_as_string/g;
s/smartlist_string_isin((?:_case)?)/smartlist_contains_string$1/g;
s/smartlist_digest_isin/smartlist_contains_digest/g;
s/smartlist_isin/smartlist_contains/g;
s/digestset_isin/digestset_contains/g;
Here we try to handle curve25519 onion keys from generating them,
loading and storing them, publishing them in our descriptors, putting
them in microdescriptors, and so on.
This commit is untested and probably buggy like whoa
It's important not to call choose_array_element_by_weight and then
pass its return value unchecked to smartlist_get : it is allowed to
return -1.
Fixes bug 7756; bugfix on 4e3d07a6 (not in any released Tor)
We want to be saying fast_mem{cmp,eq,neq} when we're doing a
comparison that's allowed to exit early, or tor_mem{cmp,eq,neq} when
we need a data-invariant timing. Direct use of memcmp tends to imply
that we haven't thought about the issue.
This replaces the old FallbackConsensus notion, and should provide a
way -- assuming we pick reasonable nodes! -- to give clients
suggestions of placs to go to get their first consensus.
Now creating a dir_server_t and adding it are separate functions, and
there are frontend functions for adding a trusted dirserver and a
fallback dirserver.
We use trusted_dir_server_t for two pieces of functionality: a list of
all directory authorities, and a list of initial places to look for
a directory. With this patch we start to separate those two roles.
There is as of now no actual way to be a fallback directory without being
an authority.
When I removed version_supports_begindir, I accidentally removed the
mechanism we had been using to make a directory cache self-test its
directory port. This caused bug 6815, which caused 6814 (both in
0.2.4.2-alpha).
To fix this bug, I'm replacing the "anonymized_connection" argument to
directory_initiate_command_* with an enumeration to say how indirectly
to connect to a directory server. (I don't want to reinstate the
"version_supports_begindir" argument as "begindir_ok" or anything --
these functions already take too many arguments.)
For safety, I made sure that passing 0 and 1 for 'indirection' gives
the same result as you would have gotten before -- just in case I
missed any 0s or 1s.
With this patch, I dump the old kludge of using magic negative
numbers to indicate unknown bandwidths. I also compute each node's
weighted bandwidth exactly once, rather than computing it once in
a loop to compute the total weighted bandwidth and a second time in
a loop to find which one we picked.
Previously, we had incremented rand_bw so that when we later tested
"tmp >= rand_bw", we wouldn't have an off-by-one error. But instead,
it makes more sense to leave rand_bw alone and test "tmp > rand_bw".
Note that this is still safe. To take the example from the bug1203
writeup: Suppose that we have 3 nodes with bandwidth 1. So the
bandwidth array is { 1, 1, 1 }, and the total bandwidth is 3. We
choose rand_bw == 0, 1, or 2. With the first iteration of the loop,
tmp is now 1; with the second, tmp is 2; with the third, tmp is 3.
Now that our check is tmp > rand_bw, we will set i in the first
iteration of the loop iff rand_bw == 0; in the second iteration of
the loop iff rand_bw == 1, and in the third iff rand_bw == 2.
That's what we want.
Incidentally, this change makes the bug 6538 fix more ironclad: once
rand_bw is set to UINT64_MAX, tmp > rand_bw is obviously false
regardless of the value of tmp.
The old approach, because of its "tmp >= rand_bw &&
!i_has_been_chosen" check, would run through the second part of the
loop slightly slower than the first part. Now, we remove
i_has_been_chosen, and instead set rand_bw = UINT64_MAX, so that
every instance of the loop will do exactly the same amount of work
regardless of the initial value of rand_bw.
Fix for bug 6538.
This should make our preferred solution to #6538 easier to
implement, avoid a bunch of potential nastiness with excessive
int-vs-double math, and generally make the code there a little less
scary.
"But wait!" you say. "Is it really safe to do this? Won't the
results come out differently?"
Yes, but not much. We now round every weighted bandwidth to the
nearest byte before computing on it. This will make every node that
had a fractional part of its weighted bandwidth before either
slighty more likely or slightly less likely. Further, the rand_bw
value was only ever set with integer precision, so it can't
accurately sample routers with tiny fractional bandwidth values
anyway. Finally, doing repeated double-vs-uint64 comparisons is
just plain sad; it will involve an implicit cast to double, which is
never a fun thing.