Fix a bug in the voting algorithm that could yield incorrect results
when a non-naming authority declared too many flags. Fixes bug 9200;
bugfix on 0.2.0.3-alpha.
Found by coverity scan.
This implements "algorithm 1" from my discussion of bug #9072: on OOM,
find the circuits with the longest queues, and kill them. It's also a
fix for #9063 -- without the side-effects of bug #9072.
The memory bounds aren't perfect here, and you need to be sure to
allow some slack for the rest of Tor's usage.
This isn't a perfect fix; the rest of the solutions I describe on
codeable.
doc/TODO and doc/spec/README were placeholders to tell people where to
look for the real TODO and README stuff -- we replaced them years ago,
though.
authority-policy, v3-authority-howto, and torel-design.txt belong in
torspec. I'm putting them in attic there since I think they may be in
large part obsolete, but someone can rescue them if they're not.
translations.txt is outdated, and refers to lots of programs other
than Tor. We have much better translation resources on the website
now.
tor-win32-mingw-creation.txt is pending review of a revised version
for 0.2.5 (see ticket #4520), but there's no reason to ship this one
while we're waiting for an accurate version.
the tor-rpm-creation.txt isn't obsolete AFAIK, but it belongs in
doc/contrib if anywhere.
Resolves bug #8965.
This is a reprise of the fix in bdff7e3299d78; 6905c1f6 reintroduced
that bug. Briefly: windows doesn't seem to like deleting a mapped
file. I tried adding the PROT_SHARED_DELETE flag to the createfile
all, but that didn't actually fix this issue. Fortunately, the unit
test I added in 4f4fc63fea should
prevent us from making this particular screw-up again.
This patch also tries to limit the crash potential of a failure to
write by a little bit, although it could do a better job of retaining
microdescriptor bodies.
Fix for bug 8822, bugfix on 0.2.4.12-alpha.
This reverts commit 884a0e269c.
I'm reverting this because it doesn't actually make the problem go
away. It appears that instead we need to do unmap-then-replace.
A comment by rransom on #8795 taken together with a comment by doorss
recorded on #2077 suggest that *every* attempt to replace the md cache
will fail on Vista/Win7 if we don't have the FILE_SHARE_DELETE flag
passed to CreateFile, and if we try to replace the file ourselves
before unmapping it. I'm adding the FILE_SHARE_DELETE, since that's
this simplest fix. Broken indexers (the favored #2077 hypothesis)
could still cause trouble here, but at least this patch should make us
stop stepping on our own feet.
Likely fix for #2077 and its numerous duplicates. Bugfix on
0.2.2.6-alpha, which first had a microdescriptor cache that would get
replaced before remapping it.
There's an assertion failure that can occur if a connection has
optimistic data waiting, and then the connect() call returns 0 on the
first attempt (rather than -1 and EINPROGRESS). That latter behavior
from connect() appears to be an (Open?)BSDism when dealing with remote
addresses in some cases. (At least, I've only seen it reported with
the BSDs under libevent, even when the address was 127.0.0.1. And
we've only seen this problem in Tor with OpenBSD.)
Fixes bug 9017; bugfix on 0.2.3.1-alpha, which first introduced
optimistic data. (Although you could also argue that the commented-out
connection_start_writing in 155c9b80 back in 2002 is the real source
of the issue.)
This is a fix for bug 8844, where eugenis correctly notes that there's
a sentinel value at the end of the list-of-freelists that's never
actually checked. It's a bug since the first version of the chunked
buffer code back in 0.2.0.16-alpha.
This would probably be a crash bug if it ever happens, but nobody's
ever reported something like this, so I'm unsure whether it can occur.
It would require write_to_buf, write_to_buf_zlib, read_to_buf, or
read_to_buf_tls to get an input size of more than 32K. Still, it's a
good idea to fix this kind of thing!
It appears that moria1 crashed because of one instance of this (the
one in router_counts_toward_thresholds). The other instance I fixed
won't actually have broken anything, but I think it's more clear this
way.
Fixes bug 8833; bugfix on 0.2.4.12-alpha.
I believe this was introduced in 6bc071f765, which makes
this a fix on 0.2.0.10-alpha. But my code archeology has not extended
to actually testing that theory.
It seems that some versions of clang that would prefer the
-Wswitch-enum compiler flag to warn about switch statements with
missing enum values, even if those switch statements have a
default.
Fixes bug 8598; bugfix on 0.2.4.10-alpha.
It was previously --Test in the help output and --test-commandline in
the getopt call. The man page already had --test.
(Originally by David, who resolved the tie in favor of "--test"; I
chose --test-commandline" instead so that nothing that depended
on it could break. -Nick)
If we get a write error on a SOCKS connection, we can't send a
SOCKS reply, now can we?
This bug has been here since 36baf7219, where we added the "hey, I'm
closing an AP connection but I haven't finished the socks
handshake!" message. It's bug 8427.
Also, don't call the exit node 'reject *' unless our decision to pick
that node was based on a non-summarized version of that node's exit
policy.
rransom and arma came up with the ideas for this fix.
Fix for 7582; the summary-related part is a bugfix on 0.2.3.2-alpha.
When we're hibernating, the main reqason we can't bootstrap will
always be that we're hibernating: reporting anything else at severity
WARN is pointless.
Fixes part of 7302.
Inspired by #8042.
As far as I know, OpenVMS is the only place you're likely to hit an
unsigned time_t these days, and Tor's VMS support
is... lacking. Still worth letting people know about it, though.
This should have been 2 bytes all along, since version numbers can
be 16 bits long. This isn't a live bug, since the call to
is_or_protocol_version_known in channel_tls_process_versions_cell
will reject any version number not in the range 1..4. Still, let's
fix this before we accidentally start supporting version 256.
Reported pseudonymously. Fixes bug 8062; bugfix on 0.2.0.10-alpha --
specifically, on commit 6fcda529, where during development I
increased the width of a version to 16 bits without changing the
type of link_proto.
Our ++ should have been += 2. This means that we'd accept version
numbers even when they started at an odd position.
This bug should be harmless in practice for so long as every version
number we allow begins with a 0 byte, but if we ever have a version
number starting with 1, 2, 3, or 4, there will be trouble here.
Fix for bug 8059, reported pseudonymously. Bugfix on 0.2.0.10-alpha
-- specifically, commit 6fcda529, where during development I
increased the width of a version to 16 bits without changing the
loop step.
I have no idea whether b0rken clients will DoS the network if the v2
authorities all turn this on or not. It's experimental. See #6783 for
a description of how to test it more or less safely, and please be
careful!
Now the manpages no longer refer to tsocks or tsocks.conf, and we no
longer have or ship a tor-tsocks.conf. The only remaining instances
of "tsocks" in our repository are old ChangeLog and ReleaseNotes
entries, and the torify script saying that it doesn't support tsocks.
Fixes bug 8290.
In a number of places, we decrement timestamp_dirty by
MaxCircuitDirtiness in order to mark a stream as "unusable for any
new connections.
This pattern sucks for a few reasons:
* It is nonobvious.
* It is error-prone: decrementing 0 can be a bad choice indeed.
* It really wants to have a function.
It can also introduce bugs if the system time jumps backwards, or if
MaxCircuitDirtiness is increased.
So in this patch, I add an unusable_for_new_conns flag to
origin_circuit_t, make it get checked everywhere it should (I looked
for things that tested timestamp_dirty), and add a new function to
frob it.
For now, the new function does still frob timestamp_dirty (after
checking for underflow and whatnot), in case I missed any cases that
should be checking unusable_for_new_conns.
Fixes bug 6174. We first used this pattern in 516ef41ac1,
which I think was in 0.0.2pre26 (but it could have been 0.0.2pre27).
Without this patch, there's no way to know what went wrong when we
fail to parse a torrc line entirely (that is, we can't turn it into
a K,V pair.) This patch introduces a new function that yields an
error message on failure, so we can at least tell the user what to
look for in their nonfunctional torrc.
(Actually, it's the same function as before with a new name:
parse_config_line_from_str is now a wrapper macro that the unit
tests use.)
Fixes bug 7950; fix on 0.2.0.16-alpha (58de695f90) which first
introduced the possibility of a torrc value not parsing correctly.
Instead of capping whenever a router has fewer than 3 measurements,
we cap whenever a router has fewer than 3 measurements *AND* there
are at least 3 authorities publishing measured bandwidths.
We also generate bandwidth lines with a new "Unmeasured=1" flag,
meaning that we didn't have enough observations for a node to use
measured bandwidth values in the authority's input, whether we capped
it or not.
There are two ways to use sysconf to ask about the number of
CPUs. When we're on a VM, we would sometimes get it wrong by asking
for the number of total CPUs (say, 64) when we should have been asking
for the number of CPUs online (say, 1 or 2).
Fix for bug 8002.
Stop marking every relay as having been down for one hour every
time we restart a directory authority. These artificial downtimes
were messing with our Stable and Guard flag calculations.
Fixes bug 8218 (introduced by the fix for 1035). Bugfix on 0.2.2.23-alpha.
Relays used to check every 10 to 60 seconds, as an accidental side effect
of calling directory_fetches_from_authorities() when considering doing
a directory fetch. The fix for bug 1992 removes that side effect. At the
same time, bridge relays never had the side effect, leading to confused
bridge operators who tried crazy tricks to get their bridges to notice
IP address changes (see ticket 1913).
The new behavior is to reinstate an every-60-seconds check for both
public relays and bridge relays, now that the side effect is gone.
For example, we were doing a resolve every time we think about doing a
directory fetch. Now we reuse the cached answer in some cases.
Fixes bugs 1992 (bugfix on 0.2.0.20-rc) and 2410 (bugfix on
0.1.2.2-alpha).
When we compute the estimated microseconds we need to handle our
pending onionskins, we could (in principle) overflow a uint32_t if
we ever had 4 million pending onionskins before we had any data
about how onionskins take. Nevertheless, let's compute it properly.
Fixes bug 8210; bugfix on 0.2.4.10. Found by coverity; this is CID
980651.
The refactoring in commit 471ab34032 wasn't complete enough: we
were checking the auth_len variable, but never actually setting it,
so it would never seem that authentication had been provided.
This commit also removes a bunch of unused variables from
rend_service_introduce, whose unusedness we hadn't noticed because
we were wiping them at the end of the function.
Fix for bug 8207; bugfix on 0.2.4.1-alpha.
It returns the method by which we decided our public IP address
(explicitly configured, resolved from explicit hostname, guessed from
interfaces, learned by gethostname).
Now we can provide more helpful log messages when a relay guesses its IP
address incorrectly (e.g. due to unexpected lines in /etc/hosts). Resolves
ticket 2267.
While we're at it, stop sending a stray "(null)" in some cases for the
server status "EXTERNAL_ADDRESS" controller event. Resolves bug 8200.
Right now, all our curve25519 backends ignore the high bit of the
public key. But possibly, others could treat the high bit of the
public key as encoding out-of-bounds values, or as something to be
preserved. This could be used to distinguish clients with different
backends, at the cost of killing a circuit.
As a workaround, let's just clear the high bit of each public key
indiscriminately before we use it. Fix for bug 8121, reported by
rransom. Bugfix on 0.2.4.8-alpha.
The fix is to move the two functions to format/parse base64
curve25519 public keys into a new "crypto_format.c" file. I could
have put them in crypto.c, but that's a big file worth splitting
anyway.
Fixes bug 8153; bugfix on 0.2.4.8-alpha where I did the fix for 7869.
Now as we move into a future where most bridges can handle microdescs
we will generally find ourselves using them, rather than holding back
just because one of our bridges doesn't use them.
When we first implemented TLS, we assumed in conneciton_handle_write
that a TOR_TLS_WANT_WRITE from flush_buf_tls meant that nothing had
been written. But when we moved our buffers to a ring buffer
implementation back in 0.1.0.5-rc (!), we broke that invariant: it's
possible that some bytes have been written but nothing.
That's bad. It means that if we do a sequence of TLS writes that ends
with a WANTWRITE, we don't notice that we flushed any bytes, and we
don't (I think) decrement buckets.
Fixes bug 7708; bugfix on 0.1.0.5-rc
Instead of hardcoding the minimum fraction of possible paths to 0.6, we
take it from the user, and failing that from the consensus, and
failing that we fall back to 0.6.
Previously we did this based on the fraction of descriptors we
had. But really, we should be going based on what fraction of paths
we're able to build based on weighted bandwidth, since otherwise a
directory guard or two could make us behave quite oddly.
Implementation for feature 5956