this was causing directory authorities to send a time of 0 on all
connections they generated themselves, which means everybody reachability
test caused a time skew warning in the log for that relay.
(i didn't just revert, because the changes file has been modified by
other later commits.)
This isn't actually much of an issue, since only relays send
AUTHENTICATE cells, but while we're removing timestamps, we might as
well do this too.
Part of proposal 222. I didn't take the approach in the proposal of
using a time-based HMAC, since that was a bad-prng-mitigation hack
from SSL3, and in real life, if you don't have a good RNG, you're
hopeless as a Tor server.
For now, round down to the nearest 10 minutes. Later, eliminate entirely by
setting a consensus parameter.
(This rounding is safe because, in 0.2.2, where the timestamp mattered,
REND_REPLAY_TIME_INTERVAL was a nice generous 60 minutes.)
We were freeing these on exit, but when we added the dl_status_map
field to them in fddb814f, we forgot to arrange for it to be freed.
I've moved the cert_list_free() code into its own function, and added
an appropriate dsmap_free() call.
Fixes bug 9644; bugfix on 0.2.4.13-alpha.
The problem was that the server_identity_key_is_set() function could
return true under conditions where we don't really have an identity
key -- specifically, where we used to have one, but we stopped being a
server.
This is a fix for 6979; bugfix on 0.2.2.18-alpha where we added that
assertion to get_server_identity_key().
Whenever we had an non-option commandline arguments *and*
option-bearing commandline arguments on the commandline, we would save
only the latter across invocations of options_init_from_torrc, but
take their existence as license not to re-parse the former. Yuck!
Incidentally, this fix lets us throw away the backup_arg[gv] logic.
Fix for bug 9746; bugfix on d98dfb3746,
not in any released Tor. Found by Damian. Thanks, Damian!
This just goes to show: never cast a function pointer. Found while
testing new command line parse logic.
Bugfix on 1293835440, which implemented
6752: Not in any released tor.
Fall back to SOMAXCONN if INT_MAX doesn't work.
We'd like to do this because the actual maximum is overrideable by the
kernel, and the value in the header file might not be right at all.
All implementations I can find out about claim that this is supported.
Fix for 9716; bugfix on every Tor.
Now we explicitly check for overflow.
This approach seemed smarter than a cascade of "change int to unsigned
int and hope nothing breaks right before the release".
Nick, feel free to fix in a better way, maybe in master.
This would make us do testing circuits "even when cbt is disabled by
consensus, or when we're a directory authority, or when we've failed
to write cbt history to our state file lately." (Roger's words.)
This is a fix for 9671 and an improvement in our fix for 5049.
The original misbehavior was in 0.2.2.14-alpha; the incomplete
fix was in 0.2.3.17-beta.
(In practice they don't exist, but so long as we're making changes for
standards compliance...)
Also add several more unit tests for good and bad URL types.
There were only two functions outside of circuitstats that actually
wanted to know what was inside this. Making the structure itself
hidden should help isolation and prevent us from spaghettifying the
thing more.
The spec requires them to do so, and not doing so creates a situation
where they can't send-test because relays won't extend to them because
of the other part of bug 9546.
Fixes bug 9546; bugfix on 0.2.3.6-alpha.
The spec requires them to do so, and not doing so creates a situation
where they can't send-test because relays won't extend to them because
of the other part of bug 9546.
Fixes bug 9546; bugfix on 0.2.3.6-alpha.
(Backport to Tor 0.2.3)
Relays previously, when initiating a connection, would only send a
NETINFO after sending an AUTHENTICATE. But bridges, when receiving a
connection, would never send AUTH_CHALLENGE. So relays wouldn't
AUTHENTICATE, and wouldn't NETINFO, and then bridges would be
surprised to be receiving CREATE cells on a non-open circuit.
Fixes bug 9546.
Relays previously, when initiating a connection, would only send a
NETINFO after sending an AUTHENTICATE. But bridges, when receiving a
connection, would never send AUTH_CHALLENGE. So relays wouldn't
AUTHENTICATE, and wouldn't NETINFO, and then bridges would be
surprised to be receiving CREATE cells on a non-open circuit.
Fixes bug 9546.
Use the generic function for both the ControlPort cookie and the
ExtORPort cookie.
Also, place the global cookie variables in the heap so that we can
pass them around more easily as pointers.
Also also, fix the unit tests that broke by this change.
Conflicts:
src/or/config.h
src/or/ext_orport.c
memwipe some stack-allocated stuff
Add DOCDOC comments for state machines
Use memdup_nulterm as appropriate
Check for NULs in useraddr
Add a macro so that <= AUTH_MAX has a meaning.
- Don't leak if a transport proxy sends us a TRANSPORT command more
than once.
- Don't use smartlist_string_isin() in geoip_get_transport_history().
(pointed out by Nick)
- Use the 'join' argument of smartlist_join_strings() instead of
trying to write the separator on our own.
(pointed out by Nick)
- Document 'ext_or_transport' a bit better.
(pointed out by Nick)
- Be a bit more consistent with the types of the values of 'transport_counts'.
(pointed out by Nick)
Fortunately, later checks mean that uninitialized data can't get sent
to the network by this bug. Unfortunately, reading uninitialized heap
*can* (in some cases, with some allocators) cause a crash if you get
unlucky and go off the end of a page.
Found by asn. Bugfix on 0.2.4.1-alpha.
Both 'managed_proxy_list' and 'unconfigured_proxies_n' are global
src/or/transports.c variables that are not initialized properly when
unit tests are run.
When we moved channel_matches_target_addr_for_extend() into a separate
function, its sense was inverted from what one might expect, and we
didn't have a ! in one place where we should have.
Found by skruffy.
When we moved channel_matches_target_addr_for_extend() into a separate
function, its sense was inverted from what one might expect, and we
didn't have a ! in one place where we should have.
Found by skruffy.
I added this so I could write a unit test for ServerTransportOptions,
but it incidentally exercises the succeed-on-defaults case of
options_validate too.
Previously we would accept relative paths, but only if they contained a
slash somewhere (not at the end).
Otherwise we would silently not work. Closes: #9258. Bugfix on
0.2.3.16-alpha.
If you pass the --enable-coverage flag on the command line, we build
our testing binaries with appropriate options eo enable coverage
testing. We also build a "tor-cov" binary that has coverage enabled,
for integration tests.
On recent OSX versions, test coverage only works with clang, not gcc.
So we warn about that.
Also add a contrib/coverage script to actually run gcov with the
appropriate options to generate useful .gcov files. (Thanks to
automake, the .o files will not have the names that gcov expects to
find.)
Also, remove generated gcda and gcno files on clean.
We previously used FILENAME_PRIVATE identifiers mostly for
identifiers exposed only to the unit tests... but also for
identifiers exposed to the benchmarker, and sometimes for
identifiers exposed to a similar module, and occasionally for no
really good reason at all.
Now, we use FILENAME_PRIVATE identifiers for identifiers shared by
Tor and the unit tests. They should be defined static when we
aren't building the unit test, and globally visible otherwise. (The
STATIC macro will keep us honest here.)
For identifiers used only by the unit tests and never by Tor at all,
on the other hand, we wrap them in #ifdef TOR_UNIT_TESTS.
This is not the motivating use case for the split test/non-test
build system; it's just a test example to see how it works, and to
take a chance to clean up the code a little.
This is mainly a matter of automake trickery: we build each static
library in two versions now: one with the TOR_UNIT_TESTS macro
defined, and one without. When TOR_UNIT_TESTS is defined, we can
enable mocking and expose more functions. When it's not defined, we
can lock the binary down more.
The alternatives would be to have alternate build modes: a "testing
configuration" for building the libraries with test support, and a
"production configuration" for building them without. I don't favor
that approach, since I think it would mean more people runnning
binaries build for testing, or more people not running unit tests.
Fix a bug in the voting algorithm that could yield incorrect results
when a non-naming authority declared too many flags. Fixes bug 9200;
bugfix on 0.2.0.3-alpha.
Found by coverity scan.
This implements "algorithm 1" from my discussion of bug #9072: on OOM,
find the circuits with the longest queues, and kill them. It's also a
fix for #9063 -- without the side-effects of bug #9072.
The memory bounds aren't perfect here, and you need to be sure to
allow some slack for the rest of Tor's usage.
This isn't a perfect fix; the rest of the solutions I describe on
codeable.
In my #7912 fix, there wasn't any code to remove entries from the
(channel, circuit ID)->circuit map corresponding to queued but un-sent
DESTROYs.
Spotted by skruffy. Fixes bug 9082; bug not in any released Tor.
I added the code to pass a destroy cell to a queueing function rather
than writing it immediately, and the code to remember that we
shouldn't reuse the circuit id until the destroy is actually sent, and
the code to release the circuit id once the destroy has been sent...
and then I finished by hooking destroy_cell_queue into the rest of
Tor.
This is a reprise of the fix in bdff7e3299d78; 6905c1f6 reintroduced
that bug. Briefly: windows doesn't seem to like deleting a mapped
file. I tried adding the PROT_SHARED_DELETE flag to the createfile
all, but that didn't actually fix this issue. Fortunately, the unit
test I added in 4f4fc63fea should
prevent us from making this particular screw-up again.
This patch also tries to limit the crash potential of a failure to
write by a little bit, although it could do a better job of retaining
microdescriptor bodies.
Fix for bug 8822, bugfix on 0.2.4.12-alpha.
There's an assertion failure that can occur if a connection has
optimistic data waiting, and then the connect() call returns 0 on the
first attempt (rather than -1 and EINPROGRESS). That latter behavior
from connect() appears to be an (Open?)BSDism when dealing with remote
addresses in some cases. (At least, I've only seen it reported with
the BSDs under libevent, even when the address was 127.0.0.1. And
we've only seen this problem in Tor with OpenBSD.)
Fixes bug 9017; bugfix on 0.2.3.1-alpha, which first introduced
optimistic data. (Although you could also argue that the commented-out
connection_start_writing in 155c9b80 back in 2002 is the real source
of the issue.)
A new option TestingV3AuthVotingStartOffset is added which offsets the
starting time of the voting interval. This is possible only when
TestingTorNetwork is set.
This patch makes run_scheduled_events() check for new consensus
downloads every second when TestingTorNetwork, instead of every
minute. This should be fine, see #8532 for reasoning.
This patch also brings MIN_VOTE_SECONDS and MIN_DIST_SECONDS down from
20 to 2 seconds, unconditionally. This makes sanity checking of
misconfiguration slightly less sane.
Addresses #8532.
You can't use != to compare arbitary members of or_options_t.
(Also, generate a better error message to say which Testing* option
was set.)
Fix for bug 8992. Bugfix on b0d4ca49. Bug not in any released Tor.
- Rename n_read and n_written in origin_circuit_t to make it clear that
these are only used for CIRC_BW events.
- Extract new code in control_update_global_event_mask to new
clear_circ_bw_fields function.
- Avoid control_event_refill_global function with 13 arguments and
increase code reuse factor by moving more code from control.c to
connection.c.
- Avoid an unsafe uint32_t -> int cast.
- Add TestingEnableTbEmptyEvent option.
- Prepare functions for testing.
- Rename a few functions and improve documentation.
- Move cell_command_to_string from control.c to command.c.
- Use accessor for global_circuitlist instead of extern.
- Add a struct for cell statistics by command instead of six arrays.
- Split up control_event_circuit_cell_stats by using two helper functions.
- Add TestingEnableCellStatsEvent option.
- Prepare functions for testing.
- Rename a few variables and document a few things better.
- Move new ID= parameter in ORCONN event to end. Avoids possible trouble
from controllers that parse parameters by position, even though they
shouldn't.
Create new methods check_or_create_data_subdir() and
write_to_data_subdir() in config.c and use them throughout
rephist.c and geoip.c.
This should solve ticket #4282.
This is a fix for bug 8844, where eugenis correctly notes that there's
a sentinel value at the end of the list-of-freelists that's never
actually checked. It's a bug since the first version of the chunked
buffer code back in 0.2.0.16-alpha.
This would probably be a crash bug if it ever happens, but nobody's
ever reported something like this, so I'm unsure whether it can occur.
It would require write_to_buf, write_to_buf_zlib, read_to_buf, or
read_to_buf_tls to get an input size of more than 32K. Still, it's a
good idea to fix this kind of thing!
It appears that moria1 crashed because of one instance of this (the
one in router_counts_toward_thresholds). The other instance I fixed
won't actually have broken anything, but I think it's more clear this
way.
Fixes bug 8833; bugfix on 0.2.4.12-alpha.
We need to subtract both the current built circuits *and* the attempted
circuits from the attempt count during scaling, since *both* have already been
counted there.
It sure is a good thing we can run each test in its own process, or
else the amount of setup I needed to do to make this thing work
would have broken all the other tests.
Test mocking would have made this easier to write too.
Now we can compute the hash and signature of a dirobj before
concatenating the smartlist, and we don't need to play silly games
with sigbuf and realloc any more.
I believe this was introduced in 6bc071f765, which makes
this a fix on 0.2.0.10-alpha. But my code archeology has not extended
to actually testing that theory.
We were mixing bandwidth file entries (which are in kilobytes) with
router_get_advertised_bw() entries, which were in bytes.
Also, use router_get_advertised_bandwidth_capped() for credible_bandwidth.
Since 7536c40 only DNS results for real SOCKS requests are added to the cache,
but not DNS results for DNSPort queries or control connection RESOLVE queries.
Only cache additions would trigger ADDRMAP events on successful resolve.
Change it so that DNS results received after a RESOLVE command also generate
ADDRMAP events.
There was a bug in Tor prior to 0.2.4.10-alpha that allowed counts to
become invalid. Clipping the counts at startup allows us to rule out
log messages due to corruption from these prior Tor versions.
This was causing dirauths to emit flag weight validation warns if there
was a sufficiently large amount of badexit bandwidth to make a difference in
flag weight results.
It seems that some versions of clang that would prefer the
-Wswitch-enum compiler flag to warn about switch statements with
missing enum values, even if those switch statements have a
default.
Fixes bug 8598; bugfix on 0.2.4.10-alpha.
Found while investigating 8093, but probably not the cause of it,
since this bug would result in us sending too few SENDMEs, not in us
receiving SENDMEs unexpectedly.
Bugfix on the fix for 7889, which has appeared in 0.2.4.10-alpha, but
not yet in any released 0.2.3.x version.
It can never be NULL, since it's an array in bridge_line_t.
Introduced in 266f8cddd8. Found by coverity; this is CID 992691. Bug
not in any released Tor.
If we get a write error on a SOCKS connection, we can't send a
SOCKS reply, now can we?
This bug has been here since 36baf7219, where we added the "hey, I'm
closing an AP connection but I haven't finished the socks
handshake!" message. It's bug 8427.
Also, don't call the exit node 'reject *' unless our decision to pick
that node was based on a non-summarized version of that node's exit
policy.
rransom and arma came up with the ideas for this fix.
Fix for 7582; the summary-related part is a bugfix on 0.2.3.2-alpha.
When we're hibernating, the main reqason we can't bootstrap will
always be that we're hibernating: reporting anything else at severity
WARN is pointless.
Fixes part of 7302.
This bug affects hosts where time_t is unsigned, which AFAICT does
not include anything we currently support. (It _does_ include
OpenVMS, about a month of BSD4.2's history[1], and a lot of the 1970s.)
There are probably more bugs when time_t is unsigned. This one was
[1] http://mail-index.netbsd.org/tech-userlevel/1998/06/04/0000.html
This time, I'm checking whether our calculated offset matches our
real offset, in each case, as we go along. I don't think this is
the bug, but it can't hurt to check.
This should have been 2 bytes all along, since version numbers can
be 16 bits long. This isn't a live bug, since the call to
is_or_protocol_version_known in channel_tls_process_versions_cell
will reject any version number not in the range 1..4. Still, let's
fix this before we accidentally start supporting version 256.
Reported pseudonymously. Fixes bug 8062; bugfix on 0.2.0.10-alpha --
specifically, on commit 6fcda529, where during development I
increased the width of a version to 16 bits without changing the
type of link_proto.
Our ++ should have been += 2. This means that we'd accept version
numbers even when they started at an odd position.
This bug should be harmless in practice for so long as every version
number we allow begins with a 0 byte, but if we ever have a version
number starting with 1, 2, 3, or 4, there will be trouble here.
Fix for bug 8059, reported pseudonymously. Bugfix on 0.2.0.10-alpha
-- specifically, commit 6fcda529, where during development I
increased the width of a version to 16 bits without changing the
loop step.
I have no idea whether b0rken clients will DoS the network if the v2
authorities all turn this on or not. It's experimental. See #6783 for
a description of how to test it more or less safely, and please be
careful!
Now that circid_t is 4 bytes long, the default integer promotions will
leave it alone when sizeof(int) == 4, which will leave us formatting an
unsigned as an int. That's technically undefined behavior.
Fixes bug 8447 on bfffc1f0fc. Bug not
in any released Tor.
In a number of places, we decrement timestamp_dirty by
MaxCircuitDirtiness in order to mark a stream as "unusable for any
new connections.
This pattern sucks for a few reasons:
* It is nonobvious.
* It is error-prone: decrementing 0 can be a bad choice indeed.
* It really wants to have a function.
It can also introduce bugs if the system time jumps backwards, or if
MaxCircuitDirtiness is increased.
So in this patch, I add an unusable_for_new_conns flag to
origin_circuit_t, make it get checked everywhere it should (I looked
for things that tested timestamp_dirty), and add a new function to
frob it.
For now, the new function does still frob timestamp_dirty (after
checking for underflow and whatnot), in case I missed any cases that
should be checking unusable_for_new_conns.
Fixes bug 6174. We first used this pattern in 516ef41ac1,
which I think was in 0.0.2pre26 (but it could have been 0.0.2pre27).
Without this patch, there's no way to know what went wrong when we
fail to parse a torrc line entirely (that is, we can't turn it into
a K,V pair.) This patch introduces a new function that yields an
error message on failure, so we can at least tell the user what to
look for in their nonfunctional torrc.
(Actually, it's the same function as before with a new name:
parse_config_line_from_str is now a wrapper macro that the unit
tests use.)
Fixes bug 7950; fix on 0.2.0.16-alpha (58de695f90) which first
introduced the possibility of a torrc value not parsing correctly.
This patch moves the measured_bw field and the has_measured_bw field
into vote_routerstatus_t, since only votes have 'Measured=XX' set on
their weight line.
I also added a new bw_is_unmeasured flag to routerstatus_t to
represent the Unmeasured=1 flag on a w line. Previously, I was using
has_measured_bw for this, which was quite incorrect: has_measured_bw
means that the measured_bw field is set, and it's probably a mistake
to have it serve double duty as meaning that 'baandwidth' represents a
measured value.
While making this change,I also found a harmless but stupid bug in
dirserv_read_measured_bandwidths: It assumes that it's getting a
smartlist of routerstatus_t, when really it's getting a smartlist of
vote_routerstatus_t. C's struct layout rules mean that we could never
actually get an error because of that, but it's still quite incorrect.
I fixed that, and in the process needed to add two more sorting and
searching helpers.
Finally, I made the Unmeasured=1 flag get parsed. We don't use it for
anything yet, but someday we might.
This isn't complete yet -- the new 2286 unit test doesn't build.
Instead of capping whenever a router has fewer than 3 measurements,
we cap whenever a router has fewer than 3 measurements *AND* there
are at least 3 authorities publishing measured bandwidths.
We also generate bandwidth lines with a new "Unmeasured=1" flag,
meaning that we didn't have enough observations for a node to use
measured bandwidth values in the authority's input, whether we capped
it or not.
Stop marking every relay as having been down for one hour every
time we restart a directory authority. These artificial downtimes
were messing with our Stable and Guard flag calculations.
Fixes bug 8218 (introduced by the fix for 1035). Bugfix on 0.2.2.23-alpha.
Apparently something in the directory guard code made it possible
for the same node to get added as a guard over and over when there
were no actual running guard nodes.
Relays used to check every 10 to 60 seconds, as an accidental side effect
of calling directory_fetches_from_authorities() when considering doing
a directory fetch. The fix for bug 1992 removes that side effect. At the
same time, bridge relays never had the side effect, leading to confused
bridge operators who tried crazy tricks to get their bridges to notice
IP address changes (see ticket 1913).
The new behavior is to reinstate an every-60-seconds check for both
public relays and bridge relays, now that the side effect is gone.
For example, we were doing a resolve every time we think about doing a
directory fetch. Now we reuse the cached answer in some cases.
Fixes bugs 1992 (bugfix on 0.2.0.20-rc) and 2410 (bugfix on
0.1.2.2-alpha).
When we compute the estimated microseconds we need to handle our
pending onionskins, we could (in principle) overflow a uint32_t if
we ever had 4 million pending onionskins before we had any data
about how onionskins take. Nevertheless, let's compute it properly.
Fixes bug 8210; bugfix on 0.2.4.10. Found by coverity; this is CID
980651.
Coverity is worried that we're checking entry_conn in some cases,
but not in the case where we set entry_conn->pending_optimistic_data.
This commit should calm it down (CID 718623).