We would stash the certs in the handshake state before checking them
for validity... and then if they turned out to be invalid, we'd give
an error and free them. Then, later, we'd free them again when we
tore down the connection.
Fixes bug 4343; fix on 0.2.3.6-alpha.
This way, all of the DA operators can upgrade immediately, without nuking
every client's set of entry guards as soon as a majority of them upgrade.
Until enough guards have upgraded, a majority of dirauths should set this
config option so that there are still enough guards in the network. After
a few days pass, all dirauths should use the default.
It used to mean "Force": it would tell tor-resolve to ask tor to
resolve an address even if it ended with .onion. But when
AutomapHostsOnResolve was added, automatically refusing to resolve
.onion hosts stopped making sense. So in 0.2.1.16-rc (commit
298dc95dfd), we made tor-resolve happy to resolve anything.
The -F option stayed in, though, even though it didn't do anything.
Oddly, it never got documented.
Found while fixing GCC 4.6 "set, unused variable" warnings.
The patch for 3228 made us try to run init_keys() before we had loaded
our state file, resulting in an assert inside init_keys. We had moved
it too early in the function.
Now it's later in the function, but still above the accounting calls.
Previously we did this nearer to the end (in the old_options &&
transition_affects_workers() block). But other stuff cares about
keys being consistent with options... particularly anything which
tries to access a key, which can die in assert_identity_keys_ok().
Fixes bug 3228; bugfix on 0.2.2.18-alpha.
Conflicts:
src/or/config.c
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
Conflicts:
src/or/config.c
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
Conflicts:
src/or/main.c
src/or/router.c
SSL_read(), SSL_write() and SSL_do_handshake() can always progress the
SSL protocol instead of their normal operation, this means that we
must be checking for needless renegotiations after they return.
Introduce tor_tls_got_excess_renegotiations() which makes the
tls->server_handshake_count > 2
check for us, and use it in tor_tls_read() and tor_tls_write().
Cases that should not be handled:
* SSL_do_handshake() is only called by tor_tls_renegotiate() which is a
client-only function.
* The SSL_read() in tor_tls_shutdown() does not need to be handled,
since SSL_shutdown() will be called if SSL_read() returns an error.
From the code:
zlib 1.2.4 and 1.2.5 do some "clever" things with macros. Instead of
saying "(defined(FOO) ? FOO : 0)" they like to say "FOO-0", on the theory
that nobody will care if the compile outputs a no-such-identifier warning.
Sorry, but we like -Werror over here, so I guess we need to define these.
I hope that zlib 1.2.6 doesn't break these too.
Possible fix for bug 1526.
Since we check for naughty renegotiations using
tor_tls_t.server_handshake_count we don't need that semi-broken
function (at least till there is a way to disable rfc5746
renegotiations too).
Switch 'server_handshake_count' from a uint8_t to 2 unsigned int bits.
Since we won't ever be doing more than 3 handshakes, we don't need the
extra space.
Toggle tor_tls_t.got_renegotiate based on the server_handshake_count.
Also assert that when we've done two handshakes as a server (the initial
SSL handshake, and the renegotiation handshake) we've just
renegotiated.
Finally, in tor_tls_read() return an error if we see more than 2
handshakes.
The renegotiation callback was called only when the first Application
Data arrived, instead of when the renegotiation took place.
This happened because SSL_read() returns -1 and sets the error to
SSL_ERROR_WANT_READ when a renegotiation happens instead of reading
data [0].
I also added a commented out aggressive assert that I won't enable yet
because I don't feel I understand SSL_ERROR_WANT_READ enough.
[0]: Look at documentation of SSL_read(), SSL_get_error() and
SSL_CTX_set_mode() (SSL_MODE_AUTO_RETRY section).
Introduce tor_tls_state_changed_callback(), which handles every SSL
state change.
The new function tor_tls_got_server_hello() is called every time we
send a ServerHello during a v2 handshake, and plays the role of the
previous tor_tls_server_info_callback() function.
When you're doing malloc(sizeof(int)), something may well have gone
wrong.
This technique is a bit abusive, but we're already relying on it
working correctly in geoip.c.
To get a better idea what's going on on Tonga, add some code to report
how often the most and least frequently fetched descriptor was fetched,
as well as 25, 50, 75 percentile.
Also ensure we only count bridge descriptors here.
- Add a tor_process_get_pid() function that returns the PID of a
process_handle_t.
- Conform to make check-spaces.
- Add some more documentation.
- Improve some log messages.
We used to try to terminate the managed proxy process even if it
failed while launching. We introduce a new managed proxy state, to
represent a *broken* and *not launched* proxy.
This is used for the bridge authority currently, to get a better
intuition on how many descriptors are actually fetched from it and how
many fetches happen in total.
Implements ticket 4200.
Fixes bug 4259, bugfix on 0.2.2.25-alpha. Bugfix by "Tey'".
Original message by submitter:
Changing nodes restrictions using a controller while Tor is doing
DNS resolution could makes Tor crashes (on WinXP at least). The
problem can be repeated by trying to reach a non-existent domain
using Tor:
curl --socks4a 127.0.0.1:9050 inexistantdomain.ext
.. and changing the ExitNodes parameter through the control port
before Tor returns a DNS resolution error (of course, the following
command won't work directly if the control port is password
protected):
echo SETCONF ExitNodes=TinyTurtle | nc -v 127.0.0.1 9051
Using a non-existent domain is needed to repeat the issue so that
Tor takes a few seconds for resolving the domain (which allows us to
change the configuration). Tor will crash while processing the
configuration change.
The bug is located in the addressmap_clear_excluded_trackexithosts
method which iterates over the entries of the addresses map in order
to check whether the changes made to the configuration will impact
those entries. When a DNS resolving is in progress, the new_adress
field of the associated entry will be set to NULL. The method
doesn't expect this field to be NULL, hence the crash.
Previously, we would treat an intro circuit failure as a timeout iff the
circuit failed due to a mismatch in relay identity keys. (Due to a bug
elsewhere, we only recognize relay identity-key mismatches on the first
hop, so this isn't as bad as it could have been.)
Bugfix on commit eaed37d14c, not yet in any
release.
It's too risky to have a function where if you leave one parameter
NULL, it splits up address:port strings, but if you set it, it does
hostname resolution.
Under the new convention, having a tor_addr.*lookup function that
doesn't do hostname resolution is too close for comfort.
I used this script here, and have made no other changes.
s/tor_addr_parse_reverse_lookup_name/tor_addr_parse_PTR_name/g;
s/tor_addr_to_reverse_lookup_name/tor_addr_to_PTR_name/g;
Now let's have "lookup" indicate that there can be a hostname
resolution, and "parse" indicate that there wasn't. Previously, we
had one "lookup" function that did resolution; four "parse" functions,
half of which did resolution; and a "from_str()" function that didn't
do resolution. That's confusing and error-prone!
The code changes in this commit are exactly the result of this perl
script, run under "perl -p -i.bak" :
s/tor_addr_port_parse/tor_addr_port_lookup/g;
s/parse_addr_port(?=[^_])/addr_port_lookup/g;
s/tor_addr_from_str/tor_addr_parse/g;
This patch leaves aton and pton alone: their naming convention and
behavior is is determined by the sockets API.
More renaming may be needed.
Right now we can take the digests only of an RSA key, and only expect to
take the digests of an RSA key. The old tor_cert_get_id_digests() would
return a good set of digests for an RSA key, and an all-zero one for a
non-RSA key. This behavior is too error-prone: it carries the risk that
we will someday check two non-RSA keys for equality and conclude that
they must be equal because they both have the same (zero) "digest".
Instead, let's have tor_cert_get_id_digests() return NULL for keys we
can't handle, and make its callers explicitly test for NULL.
Our keys and x.509 certs are proliferating here. Previously we had:
An ID cert (using the main ID key), self-signed
A link cert (using a shorter-term link key), signed by the ID key
Once proposal 176 and 179 are done, we will also have:
Optionally, a presentation cert (using the link key),
signed by whomever.
An authentication cert (using a shorter-term ID key), signed by
the ID key.
These new keys are managed as part of the tls context infrastructure,
since you want to rotate them under exactly the same circumstances,
and since they need X509 certificates.
Also, define all commands > 128 as variable-length when using
v3 or later link protocol. Running into a var cell with an
unrecognized type is no longer a bug.
Without this patch, Tor wasn't sure whether it would be hibernating or
not, so it postponed opening listeners until after the privs had been
dropped. This doesn't work so well for low ports. Bug was introduced in
the fix for bug 2003. Fixes bug 4217, reported by Zax and katmagic.
Thanks!
One of its callers assumes a non-zero result indicates a permanent failure
(i.e. the current attempt to connect to this HS either has failed or is
doomed). The other caller only requires that this function's result
never equal -2.
Bug reported by Sebastian Hahn.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
This is the cherry-picked 499661524b.
Previously Tor would always claim to have been given a hostname
by the client, while actually only verifying that the client
is using SOCKS4A or SOCKS5 with hostnames. Both protocol versions
allow IP addresses, too, in which case the log messages were wrong.
Fixes#4094.
Previously, we wouldn't refetch an HS's descriptor unless we didn't
have one at all. That was equivalent to refetching iff we didn't have
a usable one, but the next commit will make us keep some non-usable HS
descriptors around in our cache.
Code bugfix on the release that introduced the v2 HS directory system,
because rend_client_refetch_v2_renddesc's documentation comment should
have described what it actually did, not what its behaviour happened
to be equivalent to; no behaviour change in this commit.
Otherwise, we can wind up munging our reference counts if we set it in
the middle of loading the nodes. This happens because
nodelist_set_consensus() and microdesc_reload_cache() are both in the
business of adjusting microdescriptors' references.
we don't need to check whether we don't have enough guards right after
concluding that we do have enough.
slight efficiency fix suggested by an anonymous fellow on irc.
We were doing "divide bandwidth by 1000, then multiply by msec", but
that would lose accuracy: instead of getting your full bandwidth,
you'd lose up to 999 bytes per sec. (Not a big deal, but every byte
helps.)
Instead, do the multiply first, then the division. This can easily
overflow a 32-bit value, so make sure to do it as a 64-bit operation.
GCC 4.2 and maybe other compilers optimize away unsigned integer
overflow checks of the form (foo + bar < foo), for all bar.
Fix one such check in `src/common/OpenBSD_malloc_Linux.c'.
With managed proxies you would always get the error message:
"You have a Bridge line using the X pluggable transport, but there
doesn't seem to be a corresponding ClientTransportPlugin line."
because the check happened directly after parse_client_transport_line()
when managed proxies were not fully configured and their transports
were not registered.
The fix is to move the validation to run_scheduled_events() and make
sure that all managed proxies are configured first.
* Create mark/sweep functions for transports.
* Create a transport_resolve_conflicts() function that tries to
resolve conflicts when registering transports.
Previously the FooPort was ignored and the default used instead,
causing Tor to bind to the wrong port if FooPort and the default
port don't match or the CONN_TYPE_FOO_LISTENER has no default port.
Fixes#3936.
Right now we only force a new descriptor upload every 18 hours.
This can make servers become unlisted if they upload a descriptor at
time T which the authorities reject as being "too similar" to one
they uploaded before. Nothing will actually make the server upload a
new descriptor later on, until another 18 hours have passed.
This patch changes the upload behavior so that the 18 hour interval
applies only when we're listed in a live consensus with a descriptor
published within the last 18 hours. Otherwise--if we're not listed
in the live consensus, or if we're listed with a publication time
over 18 hours in the past--we upload a new descriptor every 90
minutes.
This is an attempted bugfix for #3327. If we merge it, it should
obsolete #535.
Conflicts:
src/or/connection.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/dnsserv.c
Some of these were a little tricky, since they touched code that
changed because of the prop171 fixes.
On some platforms, with non-blocking IO, on EOF you first
get EAGAIN, and then on the second read you get zero bytes
and EOF is set. However on others, the EOF flag is set as
soon as the last byte is read. This patch fixes the test
case in the latter scenario.
Add a "default" state which we use until we've decided whether we're
live or hibernating. This allows us to properly track whether we're
resuming a hibernation period or not. Fixes bug 2003.
After a stream reached eof, we fclose it, but then
test_util_spawn_background_partial_read() reads from it again, which causes
an error and thus another fclose(). Some platforms are fine with this, others
(e.g. debian-sid-i386) trigger a double-free() error. The actual code used by
Tor (log_from_pipe() and tor_check_port_forwarding()) handle this case
correctly.
Mainly used for testing reading from subprocesses. To be more generic
we now pass in a pointer to a process_handle_t rather than a Windows-
specific HANDLE.
For printf, %f and %lf are synonymous, since floats are promoted to
doubles when passed as varargs. It's only for scanf that we need to
say "%lf" for doubles and "%f" for floats.
Apparenly, some older compilers think it's naughty to say %lf and like
to spew warnings about it.
Found by grarpamp.
For bufferevents, we had all of connection_buckets_decrement() stubbed
out. But that's not actually right! The rephist_* parts were
essential for, inter alia, recording our own bandwidth. This patch
splits out the rephist parts of connection_buckets_decrement() into their
own function, and makes the bufferevent code call that new function.
Fixes bug 3803, and probably 3824 and 3826 too. Bugfix on 0.2.3.1-alpha.
Previously, if you were set up to use microdescriptors, and you
weren't a cache, you'd never fetch router descriptors (except for
bridges). Now FetchUselessDescriptors causes descriptors and
mirodescs to get cached. Also, FetchUselessDescriptors changes the
behavior of "UseMicrodescriptors auto" to be off, since there's no
point in saying "UseMicrodescriptors 1" when you have full descriptors
too.
Fix for bug 3851; bugfix on 0.2.3.1-alpha.
Conventionally in Tor, structs are returned as pointers, so change
tor_spawn_background() to return the process handle in a pointer rather
than as return value.
Because tunneled connections are implemented with buffervent_pair,
writing to them can cause an immediate flush. This means that
added to them and then checking to see whether their outbuf is
empty is _not_ an adequate way to see whether you added anything.
This caused a problem in directory server connections, since they
would try spooling a little more data out, and then close the
connection if there was no queued data to send.
This fix should improve matters; it only closes the connection if
there is no more data to spool, and all of the spooling callbacks
are supposed to put the dirconn into dir_spool_none on completion.
This is bug 3814; Sebastian found it; bugfix on 0.2.3.1-alpha.
When we're doing filtering ssl bufferevents, we want the rate-limits
to apply to the lowest level of the bufferevent stack, so that we're
actually limiting bytes sent on the network. Otherwise, we'll read
from the network aggressively, and only limit stuff as we process it.
- Update configure script to test for libminiupnpc along with the
libws2_32 and libiphlpapi libraries required by libminiupnpc
- When building tor-fw-helper, link in libiphlpapi
- Link in libminiupnpc statically becasue I could not get the DLL
to link properly
- Call WSAStartup before doing network operations
- Fix up a compiler warning about uninitialized backend_state
N.B. The changes to configure.in and Makefile.am will break on non-
Windows platforms.
This behavior is normal when we want more data than the evbuffer
actually has for us. We'll ask for (say) 7 bytes, get only 5
(because that's all there is), try to parse the 5 bytes, and get
told "no, I want 7". One option would be to bail out early whenever
want_length is > buflen, but sometimes we use an over-large
want_length. So instead, let's just remove the warning here: it's
not a bug after all.
* Use strcmpstart() instead of strcmp(x,y,strlen(y)).
* Warn the user if the managed proxy failed to launch.
* Improve function documentation.
* Use smartlist_len() instead of n_unconfigured_proxies.
* Split managed_proxy_destroy() to managed_proxy_destroy()
and managed_proxy_destroy_with_transports().
* Constification.
It turns out that it wasn't enough to set the configuration to
"auto", since the correct behavior for "auto" had been disabled in
microdesc.c. :p
(Hasn't been in a release yet, so doesn't need a changes entry.)
Also remove a few other related warnings that could occur during the ssl
handshake. We do this because the relay operator can't do anything about
them, and they aren't their fault.
Now we track *which* stream with ISO_STREAM set is associated to a
particular circuit, so that we won't think that stream is incompatible
with its circuit and launch another one a second later, and we use that
same field to mark circuits which have had an ISO_STREAM stream attached
to them, so that we won't ever put a second stream on that circuit.
Fixes bug 3695.
Only write a bridge-stats string if bridge stats have been
initialized. This behavior is similar to dirreq-stats, entry-stats,
etc.
Also add a few unit tests for the bridge-stats code.
This patch separates the generation of a dirreq-stats string from
actually writing it to disk. The new geoip_format_dirreq_stats()
generates a dirreq-stats string that geoip_dirreq_stats_write() writes
to disk. All the state changing (e.g., resetting the dirreq-stats
history and initializing the next measurement interval) takes place in
geoip_dirreq_stats_write(). That allows us to finally test the
dirreq-stats code better.
Now that formatting the buffer-stats string is separate from writing
it to disk, we can also decouple the logic to extract stats from
circuits and finally write some unit tests for the history code.
The new rep_hist_format_buffer_stats() generates a buffer-stats string
that rep_hist_buffer_stats_write() writes to disk. All the state
changing (e.g., resetting the buffer-stats history and initializing
the next measurement interval) takes place in
rep_hist_buffer_stats_write(). That allows us to finally test the
buffer-stats code better.
So far, if we didn't see a single circuit, we refrained from
generating a cell-stats string and logged a warning. Nobody will
notice the warning, and people will wonder why there's no cell-stats
string in the extra-info descriptor. The better behavior is to
generate a cell-stats string with all zeros.
Right now, we append statistics to files in the stats/ directory for
half of the statistics, whereas we overwrite these files for the other
half. In particular, we append buffer, dirreq, and entry stats and
overwrite exit, connection, and bridge stats.
Appending to files was useful when we didn't include stats in extra-info
descriptors, because otherwise we'd have to copy them away to prevent
Tor from overwriting them.
But now that we include statistics in extra-info descriptors, it makes
no sense to keep the old statistics forever. We should change the
behavior to overwriting instead of appending for all statistics.
Implements #2930.
They *are* non-NUL-terminated, after all (and they have to be, since
the SOCKS5 spec allows them to contain embedded NULs. But the code
to implement proposal 171 was copying them with tor_strdup and
comparing them with strcmp_opt.
Fix for bug on 3683; bug not present in any yet-released version.
Previously we'd just looked at the connection type, but that's
always CONN_TYPE_AP. Instead, we should be looking at the type of
the listener that created the connection.
Spotted by rransom; fixes bug 3636.
We'll still need to tweak it so that it looks for includes and
libraries somewhere more sensible than "where we happened to find
them on Erinn's system"; so that tests and tools get built too;
so that it's a bit documented; and so that we actually try running
the output.
Work done with Erinn Clark.
- pid, stdout/stderr_pipe now encapsulated in process_handle
- read_all replaced by tor_read_all_from_process_stdin/stderr
- waitpid replaced by tor_get_exit_code
Untested on *nix
Previously, if tor_addr_to_str() returned NULL, we would reuse the
last value returned by fmt_addr(). (This could happen if we were
erroneously asked to format an AF_UNSPEC address.) Now instead we
return "???".
The conflicts are with the proposal 171 circuit isolation code, and
they're all trivial: they're just a matter of both branches adding
some unrelated code in the same places.
Conflicts:
src/or/circuituse.c
src/or/connection.c
The problem was that we weren't initializing want_length to 0 before
calling parse_socks() the first time, so it looked like we were
risking an infinite loop when in fact we were safe.
Fixes 3615; bugfix on 0.2.3.2-alpha.
Back when I added this logic in 20c0581a79, the rule was that whenever
a circuit finished building, we cleared its isolation info. I did that
so that we would still use the circuit even if all the streams that
had previously led us to tentatively set its isolation info had closed.
But there were problems with that approach: We could pretty easily get
into a case where S1 had led us to launch C1 and S2 had led us to
launch C2, but when C1 finished, we cleared its isolation and attached
S2 first. Since C2 was still marked in a way that made S1
unattachable to it, we'd then launch another circuit needlessly.
So instead, we try the following approach now: when a circuit is done
building, we try to attach streams to it. If it remains unused after
we try attaching streams, then we clear its isolation info, and try
again to attach streams.
Thanks to Sebastian for helping me figure this out.
One-hop dirconn streams all share a session group, and get the
ISO_SESSIONGRP flag: they may share circuits with each other and
nothing else.
Anonymized dirconn streams get a new internal-use-only ISO_STREAM
flag: they may not share circuits with anything, including each other.
The new candidate rule, which arma suggested and I like, is that
the original address as received from the client connection or as
rewritten by the controller is the address that counts.
This is mainly meant as a way to keep clients from accidentally
DOSing themselves by (e.g.) enabling IsolateDestAddr or
IsolateDestPort on a port that they use for HTTP.
Our old "do we need to launch a circuit for stream S" logic was,
more or less, that if we had a pending circuit that could handle S,
we didn't need to launch a new one.
But now that we have streams isolated from one another, we need
something stronger here: It's possible that some pending C can
handle either S1 or S2, but not both.
This patch reuses the existing isolation logic for a simple
solution: when we decide during circuit launching that some pending
C would satisfy stream S1, we "hypothetically" mark C as though S1
had been connected to it. Now if S2 is incompatible with S1, it
won't be something that can attach to C, and so we'll launch a new
stream.
When the circuit becomes OPEN for the first time (with no streams
attached to it), we reset the circuit's isolation status. I'm not
too sure about this part: I wanted some way to be sure that, if all
streams that would have used a circuit die before the circuit is
done, the circuit can still get used. But I worry that this
approach could also lead to us launching too many circuits. Careful
thought needed here.
This is the meat of proposal 171: we change circuit_is_acceptable()
to require that the connection is compatible with every connection
that has been linked to the circuit; we update circuit_is_better to
prefer attaching streams to circuits in the way that decreases the
circuits' usefulness the least; and we update link_apconn_to_circ()
to do the appropriate bookkeeping.
The "nym epoch" of a stream is defined as the number of times that
NEWNYM had been called before the stream was opened. All streams
are isolated by nym epoch.
This feature should be redundant with existing signewnym stuff, but
it provides a good belt-and-suspenders way for us to avoid ever
letting any circuit type bypass signewnym.
This patch adds fields to track how streams should be isolated, and
ensures that those fields are set correctly. It also adds fields to
track what streams can go on a circuit, and adds functions to see
whether a streams can go on a circuit and update the circuit
accordingly. Those functions aren't yet called.
Proposal 171 gives us a new syntax for parsing client port options.
You can now have as many FooPort options as you want (for Foo in
Socks, Trans, DNS, NATD), and they can have address:port arguments,
and you can specify the level of isolation on those ports.
Additionally, this patch refactors the client port parsing logic to
use a new type, port_cfg_t. Previously, ports to be bound were
half-parsed in config.c, and later re-parsed in connection.c when
we're about to bind them. Now, parsing a port means converting it
into a port_cfg_t, and binding it uses only a port_cfg_t, without
needing to parse the user-provided strings at all.
We should do a related refactoring on other port types. For
control ports, that'll be easy enough. For ORPort and DirPort,
we'll want to do this when we solve proposal 118 (letting servers
bind to and advertise multiple ports).
This implements tickets 3514 and 3515.
This adds a little code complexity: we need to remember for each
node whether it supports the right feature, and then check for each
connection whether it's exiting at such a node. We store this in a
flag in the edge_connection_t, and set that flag at link time.
- Conform to make check-spaces
- Build without warnings from passing size_t to %d
- Use connection_get_inbuf_len(), not buf_datalen (otherwise bufferevents
won't work).
- Don't log that we're using this feature at warn.
Previously we were using router_get_by_id(foo) to test "do we have a
descriptor that will let us make an anonymous circuit to foo". But
that isn't right for microdescs: we should have been using node_t.
Fixes bug 3601; bugfix on 0.2.3.1-alpha.
Instead, use compare_tor_addr_to_node_policy everywhere.
One advantage of this is that compare_tor_addr_to_node_policy can
better distinguish 0.0.0.0 from "unknown", which caused a nasty bug
with microdesc users.
Previously, we had an issue where we'd treat an unknown address as
0, which turned into "0.0.0.0", which looked like a rejected
address. This meant in practice that as soon as we started doing
comparisons of unknown uint32 addresses to short policies, we'd get
'rejected' right away. Because of the circumstances under which
this would be called, it would only happen when we had local DNS
cached entries and we were looking to launch new circuits.
* Add some utility transport functions in circuitbuild.[ch] so that we
can use them from pt.c.
* Make the accounting system consider traffic coming from proxies.
* Make sure that we only fetch bridge descriptors when all the
transports are configured.
* Create a function that will get input from a stream, so that we can
communicate with the managed proxy.
* Hackish change to tor_spawn_background() so that we can specify an
environ for our spawn.
Rationale: right now there seems to be no way for our bootstrap
status to dip under 100% once it has reached 100%. Thus, recording
broken connections after that point is useless, and wastes memory.
If at some point in the future we allow our bootstrap level to go
backwards, then we should change this rule so that we disable
recording broken connection states _as long as_ the bootstrap status
is 100%.
- We were reporting the _bottom_ N failing states, not the top N.
- With bufferevents enabled, we logged all TLS states as being "in
bufferevent", which isn't actually informative.
- When we had nothing to report, we reported nothing too loudly.
- Also, we needed documentation.
This code lets us record the state of any outgoing OR connection
that fails before it becomes open, so we can notice if they're all
dying in the same SSL state or the same OR handshake state.
More work is still needed:
- We need documentation
- We need to actually call the code that reports the failure when
we realize that we're having a hard time connecting out or
making circuits.
- We need to periodically clear out all this data -- perhaps,
whenever we build a circuit successfully?
- We'll eventually want to expose it to controllers, perhaps.
Partial implementation of feature 3116.
It's very easy for nodelist_add_node_family(sl,node) to accidentally
add 'node', and kind of hard to make sure that it omits it. Instead
of taking pains to leave 'node' out, let's instead make sure that we
always include it.
I also rename the function to nodelist_add_node_and_family, and
audit its users so that they don't add the node itself any longer,
since the function will take care of that for them.
Resolves bug 2616, which was not actually a bug.
Previously, we'd just take all the nodes in EntryNodes, see which
ones were already in the guard list, and add the ones that weren't.
There were some problems there, though:
* We'd add _every_ entry in EntryNodes, and add them in the order
they appeared in the routerlist. This wasn't a problem
until we added the ability to give country-code or IP-range
entries in the EntryNodes set, but now that we did, it is.
(Fix: We now shuffle the entry nodes before adding them; only
add up to 10*NumEntryGuards)
* We didn't screen EntryNodes for the Guard flag. That's okay
if the user has specified two or three entry nodes manually,
but if they have listed a whole subcontinent, we should
restrict ourselves to the entries that are currently guards.
(Fix: separate out the new guard from the new non-guard nodes,
and add the Guards first.)
* We'd prepend new EntryNodes _before_ the already configured
EntryNodes. This could lead to churn.
(Fix: don't prepend these.)
This patch also pre-screens EntryNodes entries for
reachableaddresses/excludenodes, even though we check for that
later. This is important now, since we cap the number of entries
we'll add.
Just looking at the "latest" consensus could give us a microdesc
consensus, if microdescs were enabled. That would make us decide
that every routerdesc was unlisted in the latest consensus and drop
them all: Ouch.
Fixes bug 3113; bugfix on 0.2.3.1-alpha.
We have an invariant that a node_t should have an md only if it has
a routerstatus. nodelist_purge tried to preserve this by removing
all nodes without a routerstatus or a routerinfo. But this left
nodes with a routerinfo and a microdesc untouched, even if they had
a routerstatus.
Bug found by frosty_un.
All of the routerset_contains*() functions return 0 if their
routerset_t argument is NULL. Therefore, there's no point in
doing "if (ExcludeNodes && routerset_contains*(ExcludeNodes...))",
for example.
This patch fixes every instance of
if (X && routerstatus_contains*(X,...))
Note that there are other patterns that _aren't_ redundant. For
example, we *don't* want to change:
if (EntryNodes && !routerstatus_contains(EntryNodes,...))
Fixes#2797. No bug here; just needless code.
Previously, we'd get a new descriptor for free when
public_server_mode() changed, since it would count as
affects_workers, which would call init_keys(), which would make us
regenerate a new descriptor. But now that we fixed bug 3263,
init_keys() is no longer necessarily a new descriptor, and so we
need to make sure that public_server_mode() counts as a descriptor
transition.
The issue was that we overlooked the possibility of reverse DNS success
at the end of connection_ap_handshake_socks_resolved(). Issue discovered
by katmagic, thanks!
Returning a tristate is needless here; we can just use the yielded
transport/proxy_type field to tell whether there's a proxy, and have
the return indicate success/failure.
Also, store the proxy_type in the or_connection_t rather than letting
it get out of sync if a configuration reload happens between launching
the or_connection and deciding what to say with it.
- const-ify some transport_t pointers
- Remove a vestigial argument to parse_bridge_line
- Make it compile without warnings on my laptop with
--enable-gcc-warnings
We were using strncpy before, which isn't our style for stuff like
this.
This isn't a bug, though: before calling strncpy, we were checking
that strlen(src) was indeed == HEX_DIGEST_LEN, which is less than
sizeof(dst), so there was no way we could fail to NUL-terminate.
Still, strncpy(a,b,sizeof(a)) is an idiom that we ought to squash
everyplace.
Fixes CID #427.
Using strncpy meant that if listenaddress were ever >=
sizeof(sockaddr_un.sun_path), we would fail to nul-terminate
sun_path. This isn't a big deal: we never read sun_path, and the
kernel is smart enough to reject the sockaddr_un if it isn't
nul-terminated. Nonetheless, it's a dumb failure mode. Instead, we
should reject addresses that don't fit in sockaddr_un.sun_path.
Coverity found this; it's CID 428. Bugfix on 0.2.0.3-alpha.
When we rejected a descriptor for not being the one we wanted, we
were letting the parsed descriptor go out of scope.
Found by Coverity; CID # 30.
Bugfix on 0.2.1.26.
(No changes file yet, since this is not in any 0.2.1.x release.)
I'm not one to insist on C's miserly stack limits, but allocating a
256K array on the stack is too much even for me.
Bugfix on 0.2.1.7-alpha. Found by coverity. Fixes CID # 450.
Every node_t has either a routerinfo_t or a routerstatus_t, so every
node_t *should* have a nickname. Nonetheless, let's make sure in
hex_digest_nickname_matches().
Should quiet CID 434.
This is a little error-prone when the local has a different type
from the parameter, and is very error-prone with both have the same
type. Let's not do this.
Fixes CID #437,438,439,440,441.
For some inexplicable reason, Coverity departs from its usual
standards of avoiding false positives here, and warns about all
sscanf usage, even when the formatting strings are totally safe.
Addresses CID # 447, 446.
Previously, fetch_from_buf_socks() might return 0 if there was still
data on the buffer and a subsequent call to fetch_from_buf_socks()
would return 1. This was making some of the socks5 unit tests
harder to write, and could potentially have caused misbehavior with
some overly verbose SOCKS implementations. Now,
fetch_from_buf_socks() does as much processing as it can, and
returns 0 only if it really needs more data. This brings it into
line with the evbuffer socks implementation.
We added this back in 0649fa14 in 2006, to deal with the case where
the client unconditionally sent us authentication data. Hopefully,
that's not needed any longer, since we now can actually parse
authentication data.
This change also requires us to add and use a pair of
allocator/deallocator functions for socks_request_t, instead of
using tor_malloc_zero/tor_free directly.
In the code as it stood, we would accept any number of socks5
username/password authentication messages, regardless of whether we
had actually negotiated username/password authentication. Instead,
we should only accept one, and only if we have really negotiated
username/password authentication.
This patch also makes some fields of socks_request_t into uint8_t,
for safety.
Multiple Bridge lines can point to the same one ClientTransportPlugin
line, and we can have multiple ClientTransportPlugin lines in our
configuration file that don't match with a bridge. We also issue a
warning when we have a Bridge line with a pluggable transport but we
can't match it to a ClientTransportPlugin line.
* Renamed transport_info_t to transport_t.
* Introduced transport_get_by_name().
* Killed match_bridges_with_transports().
We currently *don't* detect whether any bridges miss their transports,
of if any transports miss their bridges.
* Various code and aesthetic tweaks and English language changes.
A couple of places in control.c were using connection_handle_write()
to flush important stuff (the response to a SIGNAL command, an
ERR-level status event) before Tor went down. But
connection_handle_write() isn't meaningful for bufferevents, so we'd
crash.
This patch adds a new connection_flush() that works for all connection
backends, and makes control.c use that instead.
Fix for bug 3367; bugfix on 0.2.3.1-alpha.
libnatpmp-20110618 changed the API that tor-fw-helper used and for a time
tor-fw-helper could not build against the newest libnatpmp. This patch brings
support for libnatpmp to tor-fw-helper.
debug-level since it will be quite common. logged at both client
and server side. this step should help us track what's going on
with people filtering tor connections by our ssl habits.
This lets us make a lot of other stuff const, allows the compiler to
generate (slightly) better code, and will make me get slightly fewer
patches from folks who stick mutable stuff into or_options_t.
const: because not every input is an output!
Original message from bug3393:
check_private_dir() to ensure that ControlSocketsGroupWritable is
safe to use. Unfortunately, check_private_dir() only checks against
the currently running user… which can be root until privileges are
dropped to the user and group configured by the User config option.
The attached patch fixes the issue by adding a new effective_user
argument to check_private_dir() and updating the callers. It might
not be the best way to fix the issue, but it did in my tests.
(Code by lunar; changelog by nickm)
* Improved function documentation.
* Renamed find_bridge_transport_by_addrport() to
find_transport_by_bridge_addrport().
* Sanitized log severities we use.
* Ran check-spaces.
When we set a networkstatus in the non-preferred flavor, we'd check
the time in the current_consensus. But that might have been NULL,
which could produce a crash as seen in bug 3361.
George Kadianakis notes that if you give crypto_rand_int() a value
above INT_MAX, it can return a negative number, which is not what
the documentation would imply.
The simple solution is to assert that the input is in [1,INT_MAX+1].
If in the future we need a random-value function that can return
values up to UINT_MAX, we can add one.
Fixes bug 3306; bugfix on 0.2.2pre14.
When we added the check for key size, we required that the keys be
128 bytes. But RSA_size (which defers to BN_num_bytes) will return
128 for keys of length 1017..1024. This patch adds a new
crypto_pk_num_bits() that returns the actual number of significant
bits in the modulus, and uses that to enforce key sizes.
Also, credit the original bug3318 in the changes file.
UseBridges 1 now means "connect only to bridges; if you know no
bridges, don't make connections." UseBridges auto means "Use bridges
if they are known, and we have no EntryNodes set, and we aren't a
server." UseBridges 0 means "don't use bridges."
This merge was a bit nontrivial, since I had to write a new
node_is_a_configured_bridge to parallel router_is_a_configured_bridge.
Conflicts:
src/or/circuitbuild.c
options->DirPort is 0 in the unit tests, so
router_get_advertised_dir_port() would return 0 so we wouldn't pick a
dirport. This isn't what we want for the unit tests. Fixes bug
introduced in 95ac3ea594.
Previously, Tor would dereference a NULL pointer and crash if
lookup_last_hid_serv_request were called before the first call to
directory_clean_last_hid_serv_requests. As far as I can tell, that's
currently impossible, but I want that undocumented invariant to go away
in case I^Wwe break it someday.
If set to 1, Tor will attempt to prevent basic debugging
attachment attempts by other processes. (Default: 1)
Supports Mac OS X and Gnu/Linux.
Sebastian provided useful feedback and refactoring suggestions.
Signed-off-by: Jacob Appelbaum <jacob@appelbaum.net>
An elusive compile-error (MingW-gcc v4.50 on Win_XP); a missing
comma (!) and a typo ('err_msg' at line 277 changed to 'errmsg').
Aso changed the format for 'err_code' at line 293 into a "%ld" to suppress
a warning. How did this go unnoticed for ~1 month? Btw. This is my 1st ever
'git commit', so it better work.
When we introduced NEED_KEY_1024 in routerparse.c back in
0.2.0.1-alpha, I forgot to add a *8 when logging the length of a
bad-length key.
Bugfix for 3318 on 0.2.0.1-alpha.
The patch for 3228 made us try to run init_keys() before we had loaded
our state file, resulting in an assert inside init_keys. We had moved
it too early in the function.
Now it's later in the function, but still above the accounting calls.
The conflicts were mainly caused by the routerinfo->node transition.
Conflicts:
src/or/circuitbuild.c
src/or/command.c
src/or/connection_edge.c
src/or/directory.c
src/or/dirserv.c
src/or/relay.c
src/or/rendservice.c
src/or/routerlist.c
The comment fixes are trivial. The defensive programming trick is to
tolerate receiving NULL inputs on the describe functions. That should
never actually happen, but it seems like the likeliest mistake for us
to make in the future.
Previously we did this nearer to the end (in the old_options &&
transition_affects_workers() block). But other stuff cares about
keys being consistent with options... particularly anything which
tries to access a key, which can die in assert_identity_keys_ok().
Fixes bug 3228; bugfix on 0.2.2.18-alpha.
Fixes part of #1297; bugfix on 48e0228f1e,
when circuit_expire_building was changed to assume that timestamp_dirty
was set when a circuit changed purpose to _C_REND_READY. (It wasn't.)
The previous attempt was incomplete: it told us not to publish a
descriptor, but didn't stop us from generating one. Now we treat an
absent OR port the same as not knowing our address. (This means
that when we _do_ get an OR port, we need to mark the descriptor
dirty.)
More attempt to fix bug3216.
This situation can happen easily if you set 'ORPort auto' and
'AccountingMax'. Doing so means that when you have no ORPort, you
won't be able to set an ORPort in a descriptor, so instead you would
just generate lots of invalid descriptors, freaking out all the time.
Possible fix for 3216; fix on 0.2.2.26-beta.
We had all the code in place to handle this right... except that we
were unconditionally opening a PF_INET socket instead of looking at
sa_family. Ow.
Fixes bug 2574; not a bugfix on any particular version, since this
never worked before.
Most instances were dead code; for those, I removed the assignments.
Some were pieces of info we don't currently plan to use, but which
we might in the future. For those, I added an explicit cast-to-void
to indicate that we know that the thing's unused. Finally, one was
a case where we were testing the wrong variable in a unit test.
That one I fixed.
This resolves bug 3208.
It used to mean "Force": it would tell tor-resolve to ask tor to
resolve an address even if it ended with .onion. But when
AutomapHostsOnResolve was added, automatically refusing to resolve
.onion hosts stopped making sense. So in 0.2.1.16-rc (commit
298dc95dfd), we made tor-resolve happy to resolve anything.
The -F option stayed in, though, even though it didn't do anything.
Oddly, it never got documented.
Found while fixing GCC 4.6 "set, unused variable" warnings.
On win64, sockets are of type UINT_PTR; on win32 they're u_int;
elsewhere they're int. The correct windows way to check a socket for
being set is to compare it with INVALID_SOCKET; elsewhere you see if
it is negative.
On Libevent 2, all callbacks take sockets as evutil_socket_t; we've
been passing them int.
This patch should fix compilation and correctness when built for
64-bit windows. Fixes bug 3270.
We used to regenerate our descriptor whenever we'd get a sighup. This
was caused by a bug in options_transition_affects_workers() that would
return true even if the options were exactly the same. Down the call
path we'd call init_keys(), which made us make a new descriptor which
the authorities would reject, and the node would subsequently fall out
of the consensus.
This patch fixes only the first part of this bug:
options_transition_affects_workers() behaves correctly now. The second
part still wants a fix.
tor_process_monitor_new can't currently return NULL, but if it ever can,
we want that to be an explicitly fatal error, without relying on the fact
that monitor_owning_controller_process's chain of caller will exit if it
fails.
When we configure a new bridge via the controller, don't wait up to ten
seconds before trying to fetch its descriptor. This wasn't so bad when
you listed your bridges in torrc, but it's dreadful if you configure
your bridges via vidalia.
Bumped the char maximum to 512 for HTTPProxyAuthenticator &
HTTPSProxyAuthenticator. Now stripping all '\n' after base64
encoding in alloc_http_authenticator.
Rename crypto_pk_check_key_public_exponent to crypto_pk_public_exponent_ok:
it's nice to name predicates s.t. you can tell how to interpret true
and false.
Fixed a trivial conflict where this and the ControlSocketGroupWritable
code both added different functions to the same part of connection.c.
Conflicts:
src/or/connection.c
This was harmless, since we only used this for checking for lists of
port values, but it's the principle of the thing.
Fixes 3175; bugfix on 0.1.0.1-rc
This patch introduces a few new functions in router.c to produce a
more helpful description of a node than its nickame, and then tweaks
nearly all log messages taking a nickname as an argument to call these
functions instead.
There are a few cases where I left the old log messages alone: in
these cases, the nickname was that of an authority (whose nicknames
are useful and unique), or the message already included an identity
and/or an address. I might have missed a couple more too.
This is a fix for bug 3045.
We'll need this for checking permissions on the directories that hold
control sockets: if somebody says "ControlSocket ~/foo", it would be
pretty rude to do a chmod 700 on their homedir.
When running a system-wide instance of Tor on Unix-like systems, having
a ControlSocket is a quite handy mechanism to access Tor control
channel. But it would be easier if access to the Unix domain socket can
be granted by making control users members of the group running the Tor
process.
This change introduces a UnixSocketsGroupWritable option, which will
create Unix domain sockets (and thus ControlSocket) 'g+rw'. This allows
ControlSocket to offer same access control measures than
ControlPort+CookieAuthFileGroupReadable.
See <http://bugs.debian.org/552556> for more details.
This code changes it so that we don't remove bridges immediately when
we start re-parsing our configuration. Instead, we mark them all, and
remove all the marked ones after re-parsing our bridge lines. As we
add a bridge, we see if it's already in the list. If so, we just
unmark it.
This new behavior will lose the property we used to have that bridges
were in bridge_list in the same order in which they appeared in the
torrc. I took a quick look through the code, and I'm pretty sure we
didn't actually depend on that anywhere.
This is for bug 3019; it's a fix on 0.2.0.3-alpha.
rransom notes correctly that now that we aren't checking our HSDir
flag, we have no actual reason to check whether we are listed in the
consensus at all when determining if we should act like a hidden
service directory.
Previously, if they changed in torrc during a SIGHUP, all was well,
since we would just clear all transient entries from the addrmap
thanks to bug 1345. But if you changed them from the controller, Tor
would leave old mappings in place.
The VirtualAddrNetwork bug has been here since 0.1.1.19-rc; the
AutomapHosts* bug has been here since 0.2.0.1-alpha.
This bug couldn't happen when TrackExitHosts changed in torrc, since
the SIGHUP to reload the torrc would clear out all the transient
addressmap entries before. But if you used SETCONF to change
TrackExitHosts, old entries would be left alone: that's a bug, and so
this is a bugfix on Tor 0.1.0.1-rc.
If you really want to purge the client DNS cache, the TrackHostExits
mappings, and the virtual address mappings, you should be using NEWNYM
instead.
Fixes bug 1345; bugfix on Tor 0.1.0.1-rc.
Note that this needs more work: now that we aren't nuking the
transient addressmap entries on HUP, we need to make sure that
configuration changes to VirtualAddressMap and TrackHostExits actually
have a reasonable effect.
We'll eventually want to do more work here to make sure that the ports
are stable over multiple invocations. Otherwise, turning your node on
and off will get you a new DirPort/ORPort needlessly.
Otherwise, it will just immediately close any port declared with "auto"
on the grounds that it wasn't configured. Now, it will allow "auto" to
match any port.
This means FWIW if you configure a socks port with SocksPort 9999
and then transition to SocksPort auto, the original socksport will
not get closed and reopened. I'm considering this a feature.
HTTPS error code 403 is now reported as:
"The https proxy refused to allow connection".
Used a switch statement for additional error codes to be explained
in the future.
On IRC, wanoskarnet notes that if we ever do microdesc_free() on a
microdesc that's in the nodelist, we're in trouble. Also, we're in
trouble if we free one that's still in the microdesc_cache map.
This code adds a flag to microdesc_t to note where the microdesc is
referenced from, and checks those flags from microdesc_free(). I
don't believe we have any errors here now, but if we introduce some
later, let's log and recover from them rather than introducing
heisenbugs later on.
Addresses bug 3153.
The old behavior contributed to unreliability when hidden services and
hsdirs had different consensus versions, and so had different opinions
about who should be cacheing hsdir info.
Bugfix on 0.2.0.10-alpha; based on discussions surrounding bug 2732.
The new behavior is to try to rename the old file if there is one there
that we can't read. In all likelihood, that will fail too, but at least
we tried, and at least it won't crash.
Conflicts in various places, mainly node-related. Resolved them in
favor of HEAD, with copying of tor_mem* operations from bug3122_memcmp_022.
src/common/Makefile.am
src/or/circuitlist.c
src/or/connection_edge.c
src/or/directory.c
src/or/microdesc.c
src/or/networkstatus.c
src/or/router.c
src/or/routerlist.c
src/test/test_util.c
Conflicts throughout. All resolved in favor of taking HEAD and
adding tor_mem* or fast_mem* ops as appropriate.
src/common/Makefile.am
src/or/circuitbuild.c
src/or/directory.c
src/or/dirserv.c
src/or/dirvote.c
src/or/networkstatus.c
src/or/rendclient.c
src/or/rendservice.c
src/or/router.c
src/or/routerlist.c
src/or/routerparse.c
src/or/test.c
Here I looked at the results of the automated conversion and cleaned
them up as follows:
If there was a tor_memcmp or tor_memeq that was in fact "safe"[*] I
changed it to a fast_memcmp or fast_memeq.
Otherwise if there was a tor_memcmp that could turn into a
tor_memneq or tor_memeq, I converted it.
This wants close attention.
[*] I'm erring on the side of caution here, and leaving some things
as tor_memcmp that could in my opinion use the data-dependent
fast_memcmp variant.
Make that explicit by adding an assert and removing a null-check. All of
its callers currently depend on the argument being non-null anyway.
Silences a few clang complaints.
The analyzer assumed that bootstrap_percent could be less than 0 when we
call control_event_bootstrap_problem(), which would mean we're calling
log_fn() with undefined values. The assert makes it clear this can't
happen.
When configure tor with --enable-bufferevents and
--enable-static-libevent, libevent_openssl would still be linked
dynamically. Fix this and refactor src/or/Makefile.am along the way.
To make sure that a server learns if its IP has changed, the server
sometimes launches authority.z descriptor fetches from
update_router_descriptor_downloads. That's nice, but we're moving
towards a situation where update_router_descriptor_downloads doesn't
always get called. So this patch breaks the authority.z
check-and-fetch into a new function.
This function also renames last_routerdesc_download to a more
appropriate last_descriptor_download, and adds a new
update_all_descriptor_downloads() function.
(For now, this is unnecessary, since servers don't actually use
microdescriptors. But that could change, or bridges could start
using microdescriptors, and then we'll be glad this is refactored
nicely.)
To turn this on, set UseMicrodescriptors to "1" (or "auto" if you
want it on-if-you're-a-client). It should go auto-by-default once
0.2.3.1-alpha is released.
Because of our node logic, directory caches will never use
microdescriptors when they have the right routerinfo available.
See bug 2850 for rationale: it appears that on some busy exits, the OS
decides that every single port is now unusable because they have been
all used too recently.
Previously we ensured that it would get called periodically by doing
it from inside the code that added microdescriptors. That won't work
though: it would interfere with our code that tried to read microdescs
from disk initially. Instead, we should consider rebuilding the cache
periodically, and on startup.
Previously on 0.2.2, we'd never clean the cache. Now that we can
clean it, we want to add a condition to rebuild it: that should happen
whenever we have dropped enough microdescriptors that we could save a
lot of space.
No changes file, since 0.2.3 doesn't need one and 0.2.2 already has some
changes files for the backport of the microdesc_clean_cahce() function.
n_supported[i] has a random value prior to initialization, so a node
that doesn't have routerinfo available can have a random priority.
Patch contributed by wanoskarnet from #tor. Thanks!
Instead of just saying "boogity boogity!" let's actually warn people
that they need to configure stuff right to be safe, and point them
at instructions for how to do that.
Resolves bug 2474.
Clients and relays haven't used them since early 0.2.0.x. The only
remaining use by authorities learning about new relays ahead of scedule;
see proposal 147 for what we intend to do about that.
We're leaving in an option (FetchV2Networkstatus) to manually fetch v2
networkstatuses, because apparently dnsel and maybe bwauth want them.
This fixes bug 3022.
Previously it would erroneously return true if ListenAddr was set for
a client port, even if that port itself was 0. This would give false
positives, which were not previously harmful... but which were about
to become.
A v0 HS authority stores v0 HS descriptors in the same descriptor
cache that its HS client functionality uses. Thus, if the HS
authority operator clears its client HS descriptor cache, ALL v0
HS descriptors will be lost. That would be bad.
These functions can return NULL for otherwise-valid values of
time_t. Notably, the glibc gmtime manpage says it can return NULL
if the year if greater than INT_MAX, and the windows MSDN gmtime
page says it can return NULL for negative time_t values.
Also, our formatting code is not guaranteed to correctly handle
years after 9999 CE.
This patch tries to correct this by detecting NULL values from
gmtime/localtime_r, and trying to clip them to a reasonable end of
the scale. If they are in the middle of the scale, we call it a
downright error.
Arguably, it's a bug to get out-of-bounds dates like this to begin
with. But we've had bugs of this kind in the past, and warning when
we see a bug is much kinder than doing a NULL-pointer dereference.
Boboper found this one too.
If the user sent a SIGNAL NEWNYM command after we fetched a rendezvous
descriptor, while we were building the introduction-point circuit, we
would give up entirely on trying to connect to the hidden service.
Original patch by rransom slightly edited to go into 0.2.1
Previously, it would remove every trackhostexits-derived mapping
*from* xyz.<exitname>.exit; it was supposed to remove every
trackhostexits-derived mapping *to* xyz.<exitname>.exit.
Bugfix on 0.2.0.20-rc: fixes an XXX020 added while staring at bug-1090
issues.
Resolved conflicts in:
doc/tor.1.txt
src/or/circuitbuild.c
src/or/circuituse.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/directory.c
src/or/rendclient.c
src/or/routerlist.c
src/or/routerlist.h
These were mostly releated to the routerinfo_t->node_t conversion.
Now we believe it to be the case that we never build a circuit for our
stream that has an unsuitable exit, so we'll never need to use such
a circuit. The risk is that we have some code that builds the circuit,
but now we refuse to use it, meaning we just build a bazillion circuits
and ignore them all.
This looked at first like another fun way around our node selection
logic: if we had introduction circuits, and we wound up building too
many, we would turn extras into general-purpose circuits. But when we
did so, we wouldn't necessarily check whether the general-purpose
circuits conformed to our node constraints. For example, the last
node could totally be in ExcludedExitNodes and we wouldn't have cared...
...except that the circuit should already be internal, so it won't get user
streams attached to it, so the transition should generally be allowed.
Add an assert to make sure we're right about this, and have it not
check whether ExitNodes is set, since that's irrelevant to internal
circuits.
IOW, if we were using TrackExitHosts, and we added an excluded node or
removed a node from exitnodes, we wouldn't actually remove the mapping
that points us at the new node.
Also, note with an XXX022 comment a place that I think we are looking
at the wrong string.
The routerset_equal function explicitly handles NULL inputs, so
there's no need to check inputs for NULL before calling it.
Also fix a bug in routerset_equal where a non-NULL routerset with no
entries didn't get counted as equal to a NULL routerset. This was
untriggerable, I think, but potentially annoying down the road.
ExcludeExitNodes foo now means that foo.exit doesn't work. If
StrictNodes is set, then ExcludeNodes foo also overrides foo.exit.
foo.exit , however, still works even if foo is not listed in ExitNodes.
This once maybe made sense when ExitNodes meant "Here are 3 exits;
use them all", but now it more typically means "Here are 3
countries; exit from there." Using non-Fast/Stable exits created a
potential partitioning opportunity and an annoying stability
problem.
(Don't worry about the case where all of our ExitNodes are non-Fast
or non-Stable: we handle that later in the function by retrying with
need_capacity and need_uptime set to 0.)
If we're picking a random directory node, never pick an excluded one.
But if we've chosen a specific one (or all), allow it unless strictnodes
is set (in which case warn so the user knows it's their fault).
When warning that we won't connect to a strictly excluded node,
log what it was we were trying to do at that node.
When ExcludeNodes is set but StrictNodes is not set, we only use
non-excluded nodes if we can, but fall back to using excluded nodes
if none of those nodes is usable.
This is a tweak to the bug2917 fix. Basically, if we want to simulate
a signal arriving in the controller, we shouldn't have to pretend that
we're Libevent, or depend on how Tor sets up its Libevent callbacks.
The last entry of the *Maxima values in the state file was inflated by a
factor of NUM_SECS_ROLLING_MEASURE (currently 10). This could lead to
a wrong maximum value propagating through the state file history.
When reading the bw history from the state file, we'd add the 900-second
value as traffic that occured during one second. Fix that by adding the
average value to each second.
This bug was present since 0.2.0.5-alpha, but was hidden until
0.2.23-alpha when we started using the saved values.
Some tor relays would report lines like these in their extrainfo
documents:
dirreq-write-history 2011-03-14 16:46:44 (900 s)
This was confusing to some people who look at the stats. It would happen
whenever a relay first starts up, or when a relay has dirport disabled.
Change this so that lines without actual bw entries are omitted.
Implements ticket 2497.
While doing so, get rid of the now unnecessary function
control_signal_act().
Fixes bug 2917, reported by Robert Ransom. Bugfix on commit
9b4aa8d2ab. This patch is loosely based on
a patch by Robert (Changelog entry).
- Document the structure and variables.
- Make circuits_for_buffer_stats into a static variable.
- Don't die horribly if interval_length is 0.
- Remove the unused local_circ_id field.
- Reorder the fields of circ_buffer_stats_t for cleaner alignment layout.
Instead of answering GETINFO requests about our geoip usage only after
running for 24 hours, this patch makes us answer GETINFO requests
immediately. We still round and quantize as before.
Implements bug2711.
Also, refactor the heck out of the bridge usage formatting code. No
longer should we need to do a generate-parse-and-regenerate cycle to
get the controller string, and that lets us simplify the code a lot.
- Document it in the manpage
- Add a changes entry
- No need to log when it is set: we don't log for other options.
- Use doxygen to document the new flag.
- Test truth of C variables with "if (x)", not "if (x == 1)".
- Simplify a complex boolean expression by breaking it up.
We've got millisecond timers now, we might as well use them.
This change won't actually make circuits get expiered with microsecond
precision, since we only call the expiry functions once per second.
Still, it should avoid the situation where we have a circuit get
expired too early because of rounding.
A couple of the expiry functions now call tor_gettimeofday: this
should be cheap since we're only doing it once per second. If it gets
to be called more often, though, we should onsider having the current
time be an argument again.
Since svn r1475/git 5b6099e8 in tor-0.0.6, we have responded to an
exhaustion of all 65535 stream IDs on a circuit by marking that
circuit for close. That's not the right response. Instead, we
should mark the circuit as "too dirty for new circuits".
Of course in reality this isn't really right either. If somebody
has managed to cram 65535 streams onto a circuit, the circuit is
probably not going to work well for any of those streams, so maybe
we should be limiting the number of streams on an origin circuit
concurrently.
Also, closing the stream in this case is probably the wrong thing to
do as well, but fixing that can also wait.
We fixed bug 539 (where directories would say "503" but send data
anyway) back in 0.2.0.16-alpha/0.1.2.19. Because most directory
versions were affected, we added workaround to make sure that we
examined the contents of 503-replies to make sure there wasn't any
data for them to find. But now that such routers are nonexistent,
we can remove this code. (Even if somebody fired up an 0.1.2.19
directory cache today, it would still be fine to ignore data in its
erroneous 503 replies.)
The first was genuinely impossible, I think: it could only happen
when the amount we read differed from the amount we wanted to read
by more than INT_MAX.
The second is just very unlikely: it would give incorrect results to
the controller if you somehow wrote or read more than 4GB on one
edge conn in one second. That one is a bugfix on 0.1.2.8-beta.
In afe414 (tor-0.1.0.1-rc~173), when we moved to
connection_edge_end_errno(), we used it in handling errors from
connection_connect(). That's not so good, since by the time
connection_connect() returns, the socket is no longer set, and we're
supposed to be looking at the socket_errno return value from
connection_connect() instead. So do what we should've done, and
look at the socket_errno value that we get from connection_connect().
Ian's original message:
The current code actually correctly handles queued data at the
Exit; if there is queued data in a EXIT_CONN_STATE_CONNECTING
stream, that data will be immediately sent when the connection
succeeds. If the connection fails, the data will be correctly
ignored and freed. The problem with the current server code is
that the server currently drops DATA cells on streams in the
EXIT_CONN_STATE_CONNECTING state. Also, if you try to queue data
in the EXIT_CONN_STATE_RESOLVING state, bad things happen because
streams in that state don't yet have conn->write_event set, and so
some existing sanity checks (any stream with queued data is at
least potentially writable) are no longer sound.
The solution is to simply not drop received DATA cells while in
the EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME
cells in this state, so that the OP cannot send more than one
window's worth of data to be queued at the Exit. Finally, patch
the sanity checks so that streams in the EXIT_CONN_STATE_RESOLVING
state that have buffered data can pass.
[...] Here is a simple patch. It seems to work with both regular
streams and hidden services, but there may be other corner cases
I'm not aware of. (Do streams used for directory fetches, hidden
services, etc. take a different code path?)
Right now, we only consider sending stream-level SENDME cells when we
have completely flushed a connection_edge's outbuf, or when it sends
us a DATA cell. Neither of these is ideal for throughput.
This patch changes the behavior so we now call
connection_edge_consider_sending_sendme when we flush _some_ data from
an edge outbuf.
Fix for bug 2756; bugfix on svn r152.
Resolved nontrivial conflict around rewrite_x_address_for_bridge and
learned_bridge_descriptor. Now, since leanred_bridge_descriptor works
on nodes, we must make sure that rewrite_node_address_for_bridge also
works on nodes.
Conflicts:
src/or/circuitbuild.c
hid_serv_responsible_for_desc_id's return value is never negative, and
there is no need to search through the consensus to find out whether we
are responsible for a descriptor ID before we look in our cache for a
descriptor.
Name the magic value "10" rather than re-deriving it.
Comment more.
Use the pattern that works for periodic timers, not the pattern that
doesn't work. ;)
It is important to verify the uptime claim of a relay instead of just
trusting it, otherwise it becomes too easy to blackhole a specific
hidden service. rephist already has data available that we can use here.
Bugfix on 0.2.0.10-alpha.
Partial backport of daa0326aaa .
Resolves bug 2402. Bugfix on 0.2.1.15 (for the part where we switched to
git) and on 0.2.1.30 (for the part where we dumped micro-revisions.)
The calculation of when to send the logmessage was correct, but we
didn't give the correct number of relays required: We want more than
half of all authorities we know about. Fixes bug 2663.
This fixes a remotely triggerable assert on directory authorities, who
don't handle descriptors with ipv6 contents well yet. We will want to
revert this once we're ready to handle ipv6.
Issue raised by lorth on #tor, who wasn't able to use Tor anymore.
Analyzed with help from Christian Fromme. Fix suggested by arma. Bugfix
on 0.2.1.3-alpha.
We want to use the discard port correctly, so a htons() was missing.
Also we need to set it correctly depending on address family.
Review provided by danieldg
SSL_*_app_data uses ex_data index 0, which will be the first one allocated
by SSL_get_ex_new_index. Thus, if we ever started using the ex_data feature
for some other purpose, or a library linked to Tor ever started using
OpenSSL's ex_data feature, Tor would break in spectacular and mysterious
ways. Using the SSL_*_ex_data functions directly now may save us from
that particular form of breakage in the future.
But I would not be surprised if using OpenSSL's ex_data functions at all
(directly or not) comes back to bite us on our backends quite hard. The
specified behaviour of dup_func in the man page is stupid, and
crypto/ex_data.c is a horrific mess.
This should fix a bug that special ran into, where if your state file
didn't record period maxima, it would never decide that it had
successfully parsed itself unless you got lucky with your
uninitialized-variable values.
This patch also tries to improve error messags in the case where a
maximum value legitimately doesn't parse.
In private networks, the defaults for some options are changed. This
means that in options_validate(), where we're testing that the defaults
are what we think they are, we fail. Use a workaround by setting a
hidden configuration option _UsingTestingTorNetwork when we have altered
the configuration this way, so that options_validate() can do the right
thing.
Fixes bug 2250, bugfix on 0.2.1.2-alpha (the version introducing private
network options).
Changed received_netinfo_from_trusted_dir into a
tristate in order to keep track of whether we have
already tried contacting a trusted dir. So we don't
send multiple requests if we get a bunch of skews.
The underlying fix is to stop indicating requests "ns" consensuses by
putting NULL in their requested_resource field: we already had a
specialized meaning for requested_resource==NULL, which was (more or
less) "Treat a failure here as a network failure, since it's too early
to possibly be a resource or directory failure." Overloading the two
meant that very early microdesc consensus download failures would get
treated as ns consensus download failures, so the failure count there
would get incremented, but the microdesc download would get retried
immediately in an infinite loop.
Fix for bug2381. Diagnosed by mobmix.
Sets:
* Documentation
* Logging domain
* Configuration option
* Scheduled event
* Makefile
It also creates status.c and the log_heartbeat() function.
All code was written by Sebastian Hahn. Commit message was
written by me (George Kadianakis).
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
(Backport from 0.2.2's 5ed73e3807)
Patch our implementation of tor_lockfile_lock() to handle this case
correctly. Also add a note that blocking behaviour differs from windows
to *nix. Fixes bug 2504, issue pointed out by mobmix.
rransom noticed that a change of ORPort is just as bad as a change of IP
address from a client's perspective, because both mean that the relay is
not available to them while the new information hasn't propagated.
Change the bug1035 fix accordingly.
Also make sure we don't log a bridge's IP address (which might happen
when we are the bridge authority).
It is often not entirely clear what options Tor was built with, so it
might not be immediately obvious which config file Tor is using when it
found one. Log the config file at startup.
When calling circuit_build_times_shuffle_and_store_array, we were
passing a uint32_t as an int. arma is pretty sure that this can't
actually cause a bug, because of checks elsewhere in the code, but
it's best not to pass a uint32_t as an int anyway.
Found by doorss; fix on 0.2.2.4-alpha.
We detect and reject said attempts if there is no chosen exit node or
circuit: connecting to a private addr via a randomly chosen exit node
will usually fail (if all exits reject private addresses), is always
ill-defined (you're not asking for any particular host or service),
and usually an error (you've configured all requests to go over Tor
when you really wanted to configure all _remote_ requests to go over
Tor).
This can also help detect forwarding loop requests.
Found as part of bug2279.
Left circuit_build_times_get_bw_scale() uncommented because it is in the wrong
place due to an improper bug2317 fix. It needs to be moved and renamed, as it
is not a cbt parameter.
To quote arma: "So instead of stopping your CBT from screaming, you're just
going to throw it in the closet and hope you can't hear it?"
Yep. The log message can happen because at 95% point on the curve, we can be
way beyond the max timeout we've seen, if the curve has few points and is
shallow.
Also applied Nick's rule of thumb for rewriting some other notice log messages
to read like how you would explain them to a raving lunatic on #tor who was
shouting at you demanding what they meant. Hopefully the changes live up to
that standard.
If we got a signed digest that was shorter than the required digest
length, but longer than 20 bytes, we would accept it as long
enough.... and then immediately fail when we want to check it.
Fixes bug 2409; bug in 0.2.2.20-alpha; found by piebeer.
Previously if you wanted to say "All messages except network
messages", you needed to say "[*,~net]" and if you said "[~net]" by
mistake, you would get no messages at all. Now, if you say "[~net]",
you get everything except networking messages.
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
When we stopped using svn, 0.2.1.x lost the ability to notice its svn
revision and report it in the version number. However, it kept
looking at the micro-revision.i file... so if you switched to master,
built tor, then switched to 0.2.1.x, you'd get a micro-revision.i file
from master reported as an SVN tag. This patch takes out the "include
the svn tag" logic entirely.
Bugfix on 0.2.1.15-rc; fixes bug 2402.
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
We need to make sure that the worst thing that a weird consensus param
can do to us is to break our Tor (and only if the other Tors are
reliably broken in the same way) so that the majority of directory
authorities can't pull any attacks that are worse than the DoS that
they can trigger by simply shutting down.
One of these worse things was the cbtnummodes parameter, which could
lead to heap corruption on some systems if the value was sufficiently
large.
This commit fixes this particular issue and also introduces sanity
checking for all consensus parameters.
Our public key functions assumed that they were always writing into a
large enough buffer. In one case, they weren't.
(Incorporates fixes from sebastian)
In dnsserv_resolved(), we carefully made a nul-terminated copy of the
answer in a PTR RESOLVED cell... then never used that nul-terminated
copy. Ouch.
Surprisingly this one isn't as huge a security problem as it could be.
The only place where the input to dnsserv_resolved wasn't necessarily
nul-terminated was when it was called indirectly from relay.c with the
contents of a relay cell's payload. If the end of the payload was
filled with junk, eventdns.c would take the strdup() of the name [This
part is bad; we might crash there if the cell is in a bad part of the
stack or the heap] and get a name of at least length
495[*]. eventdns.c then rejects any name of length over 255, so the
bogus data would be neither transmitted nor altered.
[*] If the name was less than 495 bytes long, the client wouldn't
actually be reading off the end of the cell.
Nonetheless this is a reasonably annoying bug. Better fix it.
Found while looking at bug 2332, reported by doorss. Bugfix on
0.2.0.1-alpha.
The C standard says that INT32_MAX is supposed to be a signed
integer. On platforms that have it, we get the correct
platform-defined value. Our own replacement, however, was
unsigned. That's going to cause a bug somewhere eventually.
Previously, we only looked at up to 128 bytes. This is a bad idea
since socks messages can be at least 256+x bytes long. Now we look at
up to 512 bytes; this should be enough for 0.2.2.x to handle all valid
SOCKS messages. For 0.2.3.x, we can think about handling trickier
cases.
Fixes 2330. Bugfix on 0.2.0.16-alpha.
Right now, Tor routers don't save the maxima values from the
bw_history_t between sessions. That's no good, since we use those
values to determine bandwidth. This code adds a new BWHist.*Maximum
set of values to the state file. If they're not present, we estimate
them by taking the observed total bandwidth and dividing it by the
period length, which provides a lower bound.
This should fix bug 1863. I'm calling it a feature.
Previously, our state parsing code would fail to parse a bwhist
correctly if the Interval was anything other than the default
hardcoded 15 minutes. This change makes the parsing less incorrect,
though the resulting history array might get strange values in it if
the intervals don't match the one we're using. (That is, if stuff was
generated in 15 minute intervals, and we read it into an array that
expects 30 minute intervals, we're fine, since values can be combined
pairwise. But if we generate data at 30 minute intervals and read it
into 15 minute intervals, alternating buckets will be empty.)
Bugfix on 0.1.1.11-alpha.
The trick of looping from i=0..4 , switching on i to set up some
variables, then running some common code is much better expressed by
just calling a function 4 times with 4 sets of arguments. This should
make the code a little easier to follow and maintain here.
An object, you'll recall, is something between -----BEGIN----- and
-----END----- tags in a directory document. Some of our code, as
doorss has noted in bug 2352, could assert if one of these ever
overflowed SIZE_T_CEILING but not INT_MAX. As a solution, I'm setting
a maximum size on a single object such that neither of these limits
will ever be hit. I'm also fixing the INT_MAX checks, just to be sure.
When using libevent 2, we use evdns_base_resolve_*(). When not, we
fake evdns_base_resolve_*() using evdns_resolve_*().
Our old check was looking for negative values (like libevent 2
returns), but our eventdns.c code returns 1. This code makes the
check just test for nonzero.
Note that this broken check was not for _resolve_ failures or even for
failures to _launch_ a resolve: it was for failures to _create_ or
_encode_ a resolve request.
Bug introduced in 81eee0ecfff3dac1e9438719d2f7dc0ba7e84a71; found by
lodger; uploaded to trac by rransom. Bug 2363. Fix on 0.2.2.6-alpha.
This was originally a patch provided by pipe
(http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html) to
provide a method for controllers to query the total amount of traffic
tor has handled (this is a frequently requested piece of information
by relay operators).
C99 allows a syntax for structures whose last element is of
unspecified length:
struct s {
int elt1;
...
char last_element[];
};
Recent (last-5-years) autoconf versions provide an
AC_C_FLEXIBLE_ARRAY_MEMBER test that defines FLEXIBLE_ARRAY_MEMBER
to either no tokens (if you have c99 flexible array support) or to 1
(if you don't). At that point you just use offsetof
[STRUCT_OFFSET() for us] to see where last_element begins, and
allocate your structures like:
struct s {
int elt1;
...
char last_element[FLEXIBLE_ARRAY_MEMBER];
};
tor_malloc(STRUCT_OFFSET(struct s, last_element) +
n_elements*sizeof(char));
The advantages are:
1) It's easier to see which structures and elements are of
unspecified length.
2) The compiler and related checking tools can also see which
structures and elements are of unspecified length, in case they
wants to try weird bounds-checking tricks or something.
3) The compiler can warn us if we do something dumb, like try
to stack-allocate a flexible-length structure.
We were not decrementing "available" every time we did
++next_virtual_addr in addressmap_get_virtual_address: we left out the
--available when we skipped .00 and .255 addresses.
This didn't actually cause a bug in most cases, since the failure mode
was to keep looping around the virtual addresses until we found one,
or until available hit zero. It could have given you an infinite loop
rather than a useful message, however, if you said "VirtualAddrNetwork
127.0.0.255/32" or something broken like that.
Spotted by cypherpunks
We were decrementing "available" twice for each in-use address we ran
across. This would make us declare that we ran out of virtual
addresses when the address space was only half full.
On Windows, we never use pthreads, since it doesn't usually exist,
and when it does it tends to be a little weirdly-behaved. But some
mingw installations have a pthreads installed, so autoconf detects
pthread.h and tells us about it. This would make us include
pthread.h, which could make for trouble when the iffy pthread.h
tried to include config.h.
This patch changes compat.h so that we never include pthread.h on
Windows. Fixes bug 2313; bugfix on 0.1.0.1-rc.
evbuffer_pullup does nothing and returns NULL if the caller asks it to
linearize more data than the buffer contains.
Introduced in 9796b9bfa6.
Reported by piebeer; fixed with help from doors.
Libevent 2.0 has a "changelist" feature that avoids making redundant
syscalls if we wind up doing a lot of event_add/event_del operations
on the same fd in a row. Unfortunately, due to a weird design
choice in Linux, it doesn't work right with epoll when multiple fds
refer to the same socket (e.g., one is a dup() of the other). We
don't dup() anything we give to Libevent, though, so it is safe for
us to explicitly turn this feature on.
If a SOCKS5 client insists on authentication, allow it to
negotiate a connection with Tor's SOCKS server successfully.
Any credentials the client provides are ignored.
This allows Tor to work with SOCKS5 clients that can only
support 'authenticated' connections.
Also add a bunch of basic unit tests for SOCKS4/4a/5 support
in buffers.c.
It's all too easy in C to convert an unsigned value to a signed one,
which will (on all modern computers) give you a huge signed value. If
you have a size_t value of size greater than SSIZE_T_MAX, that is way
likelier to be an underflow than it is to be an actual request for
more than 2gb of memory in one go. (There's nothing in Tor that
should be trying to allocate >2gb chunks.)
If you had TIME_MAX > INT_MAX, and your "time_to_exhaust_bw =
accountingmax/expected_bandwidth_usage * 60" calculation managed to
overflow INT_MAX, then your time_to_consider value could underflow and
wind up being rediculously low or high. "Low" was no problem;
negative values got caught by the (time_to_consider <= 0) check.
"High", however, would get you a wakeup time somewhere in the distant
future.
The fix is to check for time_to_exhaust_bw overflowing INT_MAX, not
TIME_MAX: We don't allow any accounting interval longer than a month,
so if time_to_exhaust_bw is significantly larger than 31*24*60*60, we
can just clip it.
This is a bugfix on 0.0.9pre6, when accounting was first introduced.
It fixes bug 2146, unless there are other causes there too. The fix
is from boboper. (I tweaked it slightly by removing an assignment
that boboper marked as dead, and lowering a variable that no longer
needed to be function-scoped.)
The old logic would have us fetch from authorities if we were refusing
unknown exits and our exit policy was reject*. Instead, we want to
fetch from authorities if we're refusing unknown exits and our exit
policy is _NOT_ reject*.
Fixed by boboper. Fixes more of 2097. Bugfix on 0.2.2.16-alpha.
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
This is not the most beautiful fix for this problem, but it is the simplest.
Bugfix for 2205. Thanks to Sebastian and Mashael for finding the
bug, and boboper/cypherpunks for figuring out why it was happening
and how to fix it, and for writing a few fixes.
The reason the "streams problem" occurs is due to the complicated
interaction between Tor's congestion control and libevent. At some point
during the experiment, the circuit window is exhausted, which blocks all
edge streams. When a circuit level sendme is received at Exit, it
resumes edge reading by looping over linked list of edge streams, and
calling connection_start_reading() to inform libevent to resume reading.
When the streams are activated again, Tor gets the chance to service the
first three streams activated before the circuit window is exhausted
again, which causes all streams to be blocked again. As an experiment,
we reversed the order in which the streams are activated, and indeed the
first three streams, rather than the last three, got service, while the
others starved.
Our solution is to change the order in which streams are activated. We
choose a random edge connection from the linked list, and then we
activate streams starting from that chosen stream. When we reach the end
of the list, then we continue from the head of the list until our chosen
stream (treating the linked list as a circular linked list). It would
probably be better to actually remember which streams have received
service recently, but this way is simple and effective.
Doing so could make Libevent call Libevent from inside a Libevent
logging call, which is a recipe for reentrant confusion and
hard-to-debug crashes. This would especially hurt if Libevent
debug-level logging is enabled AND the user has a controller
watching for low-severity log messages.
Fix bug 2190; fix on 0.1.0.2-rc.
Doing so could make Libevent call Libevent from inside a Libevent
logging call, which is a recipe for reentrant confusion and
hard-to-debug crashes. This would especially hurt if Libevent
debug-level logging is enabled AND the user has a controller
watching for low-severity log messages.
Fix bug 2190; fix on 0.1.0.2-rc.
Sebastian notes (and I think correctly) that one of our ||s should
have been an &&, which simplifies a boolean expression to decide
whether to replace bridges. I'm also refactoring out the negation at
the start of the expression, to make it more readable.
Pick 5 seconds as the limit. 5 seconds is a compromise here between
making sure the user notices that the bad behaviour is (still) happening
and not spamming their log too much needlessly (the log message is
pretty long). We also keep warning every time if safesocks is
specified, because then the user presumably wants to hear about every
blocked instance.
(This is based on the original patch by Sebastian, then backported to
0.2.2 and with warnings split into their own function.)
Our checks that we don't exceed the 50 KB size limit of extra-info
descriptors apparently failed. This patch fixes these checks and reserves
another 250 bytes for appending the signature. Fixes bug 2183.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Having very long single lines with lots and lots of things in them
tends to make files hard to diff and hard to merge. Since our tools
are one-line-at-a-time, we should try to construct lists that way too,
within reason.
This incidentally turned up a few headers in configure.in that we were
for some reason searching for twice.
We would never actually enforce multiplicity rules when parsing
annotations, since the counts array never got entries added to it for
annotations in the token list that got added by earlier calls to
tokenize_string.
Found by piebeer.
We had a spelling discrepancy between the manpage and the source code
for some option. Resolve these in favor of the manpage, because it
makes more sense (for example, HTTP should be capitalized).
The code that makes use of the RunTesting option is #if 0, so setting
this option has no effect. Mark the option as obsolete for now, so that
Tor doesn't list it as an available option erroneously.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
We need filtering bufferevent_openssl so that we can wrap around
IOCP bufferevents on Windows. This patch adds a temporary option to
turn on filtering mode, so that we can test it out on non-IOCP
systems to make sure it hasn't got any surprising bugs.
It also fixes some allocation/teardown errors in using
bufferevent_openssl as a filter.
Instead of rejecting a value that doesn't divide into 1 second, round to
the nearest divisor of 1 second and warn.
Document that the option only controls the granularity written by Tor to a
file or console log. It does not (for example) "batch up" log messages to
affect times logged by a controller, times attached to syslog messages, or
the mtime fields on log files.
In the case where old_router == NULL but sdmap has an entry for the
router, we can currently safely infer that the old_router was not a
bridge. Add an assert to ensure that this remains true, and fix the
logic not to die with the tor_assert(old_router) call.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
Mainly, this comes from turning two lists that needed to be kept in
synch into a single list of structs. This should save a little RAM,
and make the code simpler.
The old comment was from before I tried a huge pile of crazy stuff to
make the inner loop faster. Short answer: GCC already knows how to
unroll loops pretty well. Other short answer: we should have made the
relay payload size an even multiple of 4, 8, or ideally 16.
There's no reason to keep a time_t and a struct timeval to represent
the same value: highres_created.tv_sec was the same as timestamp_created.
This should save a few bytes per circuit.
The short version is, "where we want to do it, we have nothing real to
chose from and we can't do it easily. Where it's easy to do, we have
no reason to do it yet."
Our old code correctly called bufferevent_flush() on linked
connections to make sure that the other side got an EOF event... but
it didn't call bufferevent_flush() when the connection wasn't
hold_open_until_flushed. Directory connections don't use
hold_open_until_flushed, so the linked exit connection never got an
EOF, so they never sent a RELAY_END cell to the client, and the
client never concluded that data had arrived.
The solution is to make the bufferevent_flush() code apply to _all_
closing linked conns whose partner is not already marked for close.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
First start of a fix for bug2001, but my test network still isn't
working: the client and the server send each other VERSIONS cells,
but never notice that they got them.
Currently the unit tests test_util_spawn_background_* assume that they
are run from the Tor build directory. This is not the case when running
make distcheck, so the test will fail. This problem is fixed by autoconf
setting BUILDDIR to be the root of the Tor build directory, and this
preprocessor variable being used to specify the absolute path to
test-child. Also, in test-child, do not print out argv[0] because this will
no longer be predictable. Found by Sebastian Hahn.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
- Responsibility of clearing hex_errno is no longer with caller
- More conservative bounds checking
- Length requirement of hex_errno documented
- Output format documented
Some of these functions only work for routerinfo-based nodes, and as
such are only usable for advisory purposes. Fortunately, our uses
of them are compatible with this limitation.
Also, make the NodeFamily option into a list of routersets. This
lets us git rid of router_in_nickname_list (or whatever it was
called) without porting it to work with nodes, and also lets people
specify country codes and IP ranges in NodeFamily
This was the only flag in routerstatus_t that we would previously
change in a routerstatus_t in a consensus. We no longer have reason
to do so -- and probably never did -- as you can now confirm more
easily than you could have done by grepping for is_running before
this patch.
The name change is to emphasize that the routerstatus_t is_running
flag is only there to tell you whether the consensus says it's
running, not whether it *you* think it's running.
A node_t is an abstraction over routerstatus_t, routerinfo_t, and
microdesc_t. It should try to present a consistent interface to all
of them. There should be a node_t for a server whenever there is
* A routerinfo_t for it in the routerlist
* A routerstatus_t in the current_consensus.
(note that a microdesc_t alone isn't enough to make a node_t exist,
since microdescriptors aren't usable on their own.)
There are three ways to get a node_t right now: looking it up by ID,
looking it up by nickname, and iterating over the whole list of
microdescriptors.
All (or nearly all) functions that are supposed to return "a router"
-- especially those used in building connections and circuits --
should return a node_t, not a routerinfo_t or a routerstatus_t.
A node_t should hold all the *mutable* flags about a node. This
patch moves the is_foo flags from routerinfo_t into node_t. The
flags in routerstatus_t remain, but they get set from the consensus
and should not change.
Some other highlights of this patch are:
* Looking up routerinfo and routerstatus by nickname is now
unified and based on the "look up a node by nickname" function.
This tries to look only at the values from current consensus,
and not get confused by the routerinfo_t->is_named flag, which
could get set for other weird reasons. This changes the
behavior of how authorities (when acting as clients) deal with
nodes that have been listed by nickname.
* I tried not to artificially increase the size of the diff here
by moving functions around. As a result, some functions that
now operate on nodes are now in the wrong file -- they should
get moved to nodelist.c once this refactoring settles down.
This moving should happen as part of a patch that moves
functions AND NOTHING ELSE.
* Some old code is now left around inside #if 0/1 blocks, and
should get removed once I've verified that I don't want it
sitting around to see how we used to do things.
There are still some unimplemented functions: these are flagged
with "UNIMPLEMENTED_NODELIST()." I'll work on filling in the
implementation here, piece by piece.
I wish this patch could have been smaller, but there did not seem to
be any piece of it that was independent from the rest. Moving flags
forces many functions that once returned routerinfo_t * to return
node_t *, which forces their friends to change, and so on.
The node_t type is meant to serve two key functions:
1) Abstracting difference between routerinfo_t and microdesc_t
so that clients can use microdesc_t instead of routerinfo_t.
2) Being a central place to hold mutable state about nodes
formerly held in routerstatus_t and routerinfo_t.
This patch implements a nodelist type that holds a node for every
router that we would consider using.
* MINIUPNPC rather than the generic UPNP
* Nick suggested a better abstraction model for tor-fw-helper
* Fix autoconf to build with either natpmp or miniupnpc
* Add AM_PROG_CC_C_O to fix automake complaint
* update spec to address nickm's concern
* refactor nat-pmp to match upnp state
* we prefer tor_snprintf to snprintf
* link properlty for tor_snprintf
* rename test_commandline_options to log_commandline_options
* cast this uint as an int
* detect possible FD_SETSIZE errors
* make note about future enhancements for natpmp
* add upnp enhancement note
* ChangeLog entry
* doxygen and check-spaces cleanup
* create tor-fw-helper.1.txt
tor-fw-helper is a command-line tool to wrap and abstract various
firewall port-forwarding tools.
This commit matches the state of Jacob's tor-fw-helper branch as of
23 September 2010.
(commit msg by Nick)
When picking bridges (or other nodes without a consensus entry (and
thus no bandwidth weights)) we shouldn't just trust the node's
descriptor. So far we believed anything between 0 and 10MB/s, where 0
would mean that a node doesn't get any use from use unless it is our
only one, and 10MB/s would be a quite siginficant weight. To make this
situation better, we now believe weights in the range from 20kB/s to
100kB/s. This should allow new bridges to get use more quickly, and
means that it will be harder for bridges to see almost all our traffic.
In the first 100 circuits, our timeout_ms and close_ms
are the same. So we shouldn't transition circuits to purpose
CIRCUIT_PURPOSE_C_MEASURE_TIMEOUT, since they will just timeout again
next time we check.
We really should ignore any timeouts that have *no* network activity for their
entire measured lifetime, now that we have the 95th percentile measurement
changes. Usually this is up to a minute, even on fast connections.
If we really want all this complexity for these stages here, we need to handle
it better for people with large timeouts. It should probably go away, though.
Rechecking the timeout condition was foolish, because it is checked on the
same codepath. It was also wrong, because we didn't round.
Also, the liveness check itself should be <, and not <=, because we only have
1 second resolution.
Specifically, a circ attempt that we'd launched while the network was
down could timeout after we've marked our entrynodes up, marking them
back down again. The fix is to annotate as bad the OR conns that were
around before we did the retry, so if a circuit that's attached to them
times out we don't do anything about it.
This is needed for IOCP, since telling the IOCP backend about all
your CPUs is a good idea. It'll also come in handy with asn's
multithreaded crypto stuff, and for people who run servers without
reading the manual.
automake 1.6 doesn't like using a conditional += to add stuff to foo_LDADD.
Instead you need to conditionally define a variable, then non-conditionally
put that variable in foo_LDADD.
This requires the latest Git version of Libevent as of 24 March 2010.
In the future, we'll just say it requires Libevent 2.0.5-alpha or
later.
Since Libevent doesn't yet support hierarchical rate limit groups,
there isn't yet support for tracking relayed-bytes separately when
using the bufferevent system. If a future version does add support
for hierarchical buckets, we can add that back in.
We want to make sure that we don't break old torrc files that might have
used something like this made-up example:
ContactInfo UberUser <uber@user.com> # /// Fake email! \\\
Log info file /home/nick.mathewson/projects/tor-info.log
And we also want to support the following style of writing your torrc:
ExcludeNodes \
# Node1337 is run by the Bavarian Illuminati
Node1337, \
# The operator of Node99 looked at me funny
Node99
The code already handles both cases, but the unit test should help prove
it.
Roger correctly pointed out that my code was broken for accounting
periods that shifted forwards, since
start_of_accounting_period_containing(interval_start_time) would not
be equal to interval_start_time, but potentially much earlier.
This function uses GetSystemDirectory() to make sure we load the version
of the library from c:\windows\system32 (or local equivalent) rather than
whatever version lives in the cwd.
The RefuseUnknownExits config option is now a tristate, with "1"
meaning "enable it no matter what the consensus says", "0" meaning
"disable it no matter what the consensus says", and "auto" meaning "do
what the consensus says". If the consensus is silent, we enable
RefuseUnknownExits.
This patch also changes the dirserv logic so that refuseunknownexits
won't make us cache unless we're an exit.
Do not double-report signatures from unrecognized authorities both as
"from unknown authority" and "not present". Fixes bug 1956, bugfix on
0.2.2.16-alpha.
Our attempt to make compilation work on old versions of Windows
again while keeping wince compatibility broke the build for Win2k+.
helix reports this patch fixes the issue for WinXP. Bugfix on
0.2.2.15-alpha; related to bug 1797.
When the CellStatistics option is off, we don't store cell insertion
times. Doing so would also not be very smart, because there seem to
still be some performance issues with this type of statistics. Nothing
harmful happens when we don't have insertion times, so we don't need to
alarm the user.
Previously[*], the function would start with the first stream on the
circuit, and let it package as many cells as it wanted before
proceeding to the next stream in turn. If a circuit had many live
streams that all wanted to package data, the oldest would get
preference, and the newest would get ignored.
Now, we figure out how many cells we're willing to send per stream,
and try to allocate them fairly.
Roger diagnosed this in the comments for bug 1298.
[*] This bug has existed since before the first-ever public release
of Tor. It was added by r152 of Tor on 26 Jan 2003, which was
the first commit to implement streams (then called "topics").
This is not the oldest bug to be fixed in 0.2.2.x: that honor
goes to the windowing bug in r54, which got fixed in e50b7768 by
Roger with diagnosis by Karsten. This is, however, the most
long-lived bug to be fixed in 0.2.2.x: the r54 bug was fixed
2580 days after it was introduced, whereas I am writing this
commit message 2787 days after r152.
I'm going to use this to implement more fairness in
circuit_resume_edge_reading_helper in an attempt to fix bug 1298.
(Updated with fixes from arma and Sebastian)
An authority should never download a consensus if it has a live one,
but when it doesn't, it should admit that it's not going to get one,
and see if anybody else can give it one.
Fixes 1300, fix on 0.2.0.9-alpha
Bridges and other relays not included in the consensus don't
necessarily have a non-zero bandwidth capacity. If all our
configured bridges had a zero bw capacity we would warn the
user. Change that.
Previously, we were also considering the time spent in
soft-hibernation. If this was a long time, we would wind up
underestimating our bandwidth by a lot, and skewing our wakeup time
towards the start of the accounting interval.
This patch also makes us store a few more fields in the state file,
including the time at which we entered soft hibernation.
Fixes bug 1789. Bugfix on 0.0.9pre5.
This will make changes for DST still work, and avoid double-spending
bytes when there are slight changes to configurations.
Fixes bug 1511; the DST issue is a bugfix on 0.0.9pre5.
When we introduced the code to close non-open OR connections after
KeepalivePeriod had passed, we replaced some code that said
if (!connection_is_open(conn)) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
with new code that said
if (!connection_is_open(conn) && past_keepalive) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
This was a mistake, since it made all the other tests start applying
to non-open connections, thus causing bug 1840, where non-open
connections get closed way early.
Fixes bug 1840. Bugfix on 0.2.1.26 (commit 67b38d50).
- Move checks for extra_info to callers
- Change argument name from failed to descs
- Use strlen("fp/") instead of a magic number
- I passed on the suggestion to rename functions from *_failed() to
*_handle_failure(). There are a lot of these so for now just follow
the house style.
It's normal when bootstrapping to have a lot of different certs
missing, so we don't want missing certs to make us warn... unless
the certs we're missing are ones that we've tried to fetch a couple
of times and failed at.
May fix bug 1145.
We frequently add cells to stream-blocked queues for valid reasons
that don't mean we need to block streams. The most obvious reason
is if the cell arrives over a circuit rather than from an edge: we
don't block circuits, no matter how full queues get. The next most
obvious reason is that we allow CONNECTED cells from a newly created
stream to get delivered just fine.
This patch changes the behavior so that we only iterate over the
streams on a circuit when the cell in question came from a stream,
and we only block the stream that generated the cell, so that other
streams can still get their CONNECTEDs in.
This should keep WinCE working (unicode always-on) and get Win98
working again (unicode never-on).
There are two places where we explicitly use ASCII-only APIs, still:
in ntmain.c and in the unit tests.
This patch also fixes a bug in windoes tor_listdir that would cause
the first file to be listed an arbitrary number of times that was
also introduced with WinCE support.
Should fix bug 1797.
Setting CookieAuthFileGroupReadable but without setting CookieAuthFile makes
no sense, because unix directory permissions for the data directory prevent
the group from accessing the file anyways.
When this happens, run through the streams on the circuit and make
sure they're all blocked. If some aren't, that's a bug: block them
all and log it! If they all are, where did the cell come from? Log
it!
(I suspect that this actually happens pretty frequently, so I'm making
these log messages appear at INFO.)
Do not start reading on exit streams when we get a SENDME unless we
have space in the appropriate circuit's cell queue.
Draft fix for bug 1653.
(commit message by nickm)
router_add_to_routerlist() is supposed to be a nice minimal function
that only touches the routerlist structures, but it included a call to
dirserv_single_reachability_test().
We have a function that gets called _after_ adding descriptors
successfully: routerlist_descriptors_added. This patch moves the
responsibility for testing there.
Because the decision of whether to test or not depends on whether
there was an old routerinfo for this router or not, we have to first
detect whether we _will_ want to run the tests if the router is added.
We make this the job of
routers_update_status_from_consensus_networkstatus().
Finally, this patch makes the code notice if a router is going from
hibernating to non-hibernating, and if so causes a reachability test
to get launched.
This solves the problem Roger noted as:
What if the router has a clock that's 5 minutes off, so it
publishes a descriptor for 5 minutes in the future, and we test it
three minutes in. In this edge case, we will continue to advertise
it as Running for the full 45 minute period.
If the voting interval was short enough, the two-minutes delay
of CONSENSUS_MIN_SECONDS_BEFORE_CACHING would confuse bridges
to the point where they would assert before downloading a consensus.
It it was even shorter (<4 minutes, I think), caches would
assert too. This patch fixes that by having replacing the
two-minutes value with MIN(2 minutes, interval/16).
Bugfix for 1141; the cache bug could occur since 0.2.0.8-alpha, so
I'm calling this a bugfix on that. Robert Hogan diagnosed this.
Done as a patch against maint-0.2.1, since it makes it hard to
run some kinds of testing networks.
requests to authorities fail due to a network error.
Bug#1138
"When a Tor client starts up using a bridge, and UpdateBridgesFromAuthority
is set, Tor will go to the authority first and look up the bridge by
fingerprint. If the bridge authority is filtered, Tor will never notice that
the bridge authority lookup failed. So it will never fall back."
Add connection_dir_bridge_routerdesc_failed(), a function for unpacking
the bridge information from a failed request, and ensure
connection_dir_request_failed() calls it if the failed request
was for a bridge descriptor.
Test:
1. for ip in `grep -iR 'router ' cached-descriptors|cut -d ' ' -f 3`;
do sudo iptables -A OUTPUT -p tcp -d $ip -j DROP; done
2. remove all files from user tor directory
3. Put the following in torrc:
UseBridges 1
UpdateBridgesFromAuthority 1
Bridge 85.108.88.19:443 7E1B28DB47C175392A0E8E4A287C7CB8686575B7
4. Launch tor - it should fall back to downloading descriptors
directly from the bridge.
Initial patch reviewed and corrected by mingw-san.
Apparently the way we handled cleaning up temporary directories with
atexit() meant that when the child process exited, it would remove the
temporary directory, thus making other tests in the main process fail.
https://trac.torproject.org/projects/tor/ticket/1525
"The codepath taken by the control port "RESOLVE" command to create a
synthetic SOCKS resolve request isn't the same as the path taken by
a real SOCKS request from 'tor-resolve'.
This prevents controllers who set LeaveStreamsUnattached=1 from
being able to attach RESOLVE streams to circuits of their choosing."
Create a new function connection_ap_rewrite_and_attach_if_allowed()
and call that when Tor needs to attach a stream to a circuit but
needs to know if the controller permits it.
No tests added.
Since the rend code doesn't like the port to be 0, we shouldn't generate
the port by declaring crypto_rand_int(65536); instead we should
say crypto_rand_int(65535)+1.
Diagnosed by Matt Edman; fixes bug 1808.
With this patch we stop scheduling when we should write statistics using a
single timestamp in run_scheduled_events(). Instead, we remember when a
statistics interval starts separately for each statistic type in geoip.c
and rephist.c. Every time run_scheduled_events() tries to write stats to
disk, it learns when it should schedule the next such attempt.
This patch also enables all statistics to be stopped and restarted at a
later time.
This patch comes with a few refactorings, some of which were not easily
doable without the patch.
We already had the country code ?? indicating an unknown country, so all we
needed to do to make unknown countries excludable was to make the ?? code
discoverable.
It's okay to get (say) a SocksPort line in the torrc, and then a
SocksPort on the command line to override it, and then a SocksPort via
a controller to override *that*. But if there are two occurrences of
SocksPort in the torrc, or on the command line, or in a single SETCONF
command, then the user is likely confused. Our old code would not
help unconfuse the user, but would instead silently ignore all but
the last occurrence.
This patch changes the behavior so that if the some option is passed
more than once to any torrc, command line, or SETCONF (each of which
coincidentally corresponds to a call to config_assign()), and the
option is not a type that allows multiple occurrences (LINELIST or
LINELIST_X), then we can warn the user.
This closes trac entry 1384.
At best, this patch helps us avoid sending queued relayed cells that
would get ignored during the time between when a destroy cell is
sent and when the circuit is finally freed. At worst, it lets us
release some memory a little earlier than it would otherwise.
Fix for bug #1184. Bugfix on 0.2.0.1-alpha.
The next series of commits begins addressing the issue that we're
currently including the complete or.h file in all of our source files.
To change that, we're splitting function definitions into new header
files (one header file per source file).
Right now it says "552 internal error" because there's no way for
getinfo_helper_*() countries to specify an error message. This
patch changes the getinfo_helper_*() interface, and makes most of the
getinfo helpers give useful error messages in response to failures.
This should prevent recurrences of bug 1699, where a missing GeoIPFile
line in the torrc made GETINFO ip-to-county/* fail in a "not obvious
how to fix" way.
V3 authorities no longer decide not to vote on Guard+Exit. The bandwidth
weights should take care of this now.
Also, lower the max threshold for WFU to 0.98, to allow more nodes to become
guards.
This should make us conflict less with system files named "log.h".
Yes, we shouldn't have been conflicting with those anyway, but some
people's compilers act very oddly.
The actual change was done with one "git mv", by editing
Makefile.am, and running
find . -name '*.[ch]' | xargs perl -i -pe 'if (/^#include.*\Wlog.h/) {s/log.h/torlog.h/; }'
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.
Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possble during live operation, due to network liveness filters
and discard logic.
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.
It's also extremely not-nice to write a time to the log as an
integer. Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
These timers behave better with non-monotonic clocks than our old
ones, and also try harder to make once-per-second events get called
one second apart, rather than one-plus-epsilon seconds apart.
This fixes bug 943 for everybody using Libevent 2.0 or later.
From the code:
zlib 1.2.4 and 1.2.5 do some "clever" things with macros. Instead of
saying "(defined(FOO) ? FOO : 0)" they like to say "FOO-0", on the theory
that nobody will care if the compile outputs a no-such-identifier warning.
Sorry, but we like -Werror over here, so I guess we need to define these.
I hope that zlib 1.2.6 doesn't break these too.
Possible fix for bug 1526.
This should never happen unless openssl is buggy or some of our
assumptions are deeply wrong, but one of those might have been the
cause of the not-yet-reproducible bug 1209. If it ever happens again,
let's get some info we can use.
We need to ensure that we close timeout measurement circuits. While
we're at it, we should close really old circuits of certain types that
aren't in use, and log really old circuits of other types.
We need to record different statistics at point of timeout, vs the point
of forcible closing.
Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
Having ~/.tor expand into /.tor is, after all, almost certainly not
what the user wanted, and it deserves a warning message.
Also, convert a guess-and-malloc-and-sprintf triple into an asprintf.
In rare cases, we could cannibalize a one-hop circuit, ending up
with a two-hop circuit. This circuit would not be actually used,
but we should prevent its creation in the first place.
Thanks to outofwords and swissknife for helping to analyse this.
Most of the changes here are switches to use APIs available on Windows
CE. The most pervasive change is that Windows CE only provides the
wide-character ("FooW") variants of most of the windows function, and
doesn't support the older ASCII verions at all.
This patch will require use of the wcecompat library to get working
versions of the posix-style fd-based file IO functions.
[commit message by nickm]
Back when we changed the idea of a connection being "too old" for new
circuits into the connection being "bad" for new circuits, we didn't
actually change the info messages. This led to telling the user that
we were labelling connections as "too old" for being worse than
connections that were actually older than them.
Found by Scott on or-talk.
There are now four ways that CBT can be disabled:
1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.
It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.
The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
what's happening here is that we're fetching certs for obsolete
authorities -- probably legacy signers in this case. but try to
remain general in the log message.
It's natural for the definition of bandwidth_rule_t to be with the functions
that actually care about its values. Unfortunately, this means declaring
bandwidth_rate_rule_to_string() out of sequence. Someday we'll just rename
reasons.c to strings.c, and put it at the end of or.h, and this will all be
better.
1) mingw doesn't have _vscprintf(); mingw instead has a working snprintf.
2) windows compilers that _do_ have a working _vscprintf spell it so; they do
not spell it _vcsprintf().
Works like the --enable-static-openssl/libevent options. Requires
--with-zlib-dir to be set. Note that other dependencies might still
pull in a dynamicly linked zlib, if you don't link them in statically
too.
Our code assumed that any version of OpenSSL before 0.9.8l could not
possibly require SSL_OP_ALLOW_UNSAFE_LEGACY_RENEGOTIATION. This is
so... except that many vendors have backported the flag from later
versions of openssl when they backported the RFC5476 renegotiation
feature.
The new behavior is particularly annoying to detect. Previously,
leaving SSL_OP_ALLOW_UNSAFE_LEGACY_RENEGOTIATION unset meant that
clients would fail to renegotiate. People noticed that one fast!
Now, OpenSSL's RFC5476 support means that clients will happily talk to
any servers there are, but servers won't accept renegotiation requests
from unpatched clients unless SSL_OP_ALLOW_etc is set. More fun:
servers send back a "no renegotiation for you!" error, which unpatched
clients respond to by stalling, and generally producing no useful
error message.
This might not be _the_ cause of bug 1346, but it is quite likely _a_
cause for bug 1346.
Everything that accepted the 'Circ' name handled it wrong, so even now
that we fixed the handling of the parameter, we wouldn't be able to
set it without making all the 0.2.2.7..0.2.2.10 relays act wonky.
This patch makes Tors accept the 'Circuit' name instead, so we can
turn on circuit priorities without confusing the versions that treated
the 'Circ' name as occasion to act weird.
I'm adding this because I can never remember what stuff like 'rule 3'
means. That's the one where if somebody goes limp or taps out, the
fight is over, right?
When you mean (a=b(c,d)) >= 0, you had better not say (a=b(c,d)>=0).
We did the latter, and so whenever CircPriorityHalflife was in the
consensus, it was treated as having a value of 1 msec (that is,
boolean true).
We need to make sure we have an event_base in dns.c before we call
anything that wants one. Make sure we always have one in dns_reset()
when we're a client. Fixes bug 1341.
Now if you're a published relay and you set RefuseUnknownExits, even
if your dirport is off, you'll fetch dir info from the authorities,
fetch it early, and cache it.
In the future, RefuseUnknownExits (or something like it) will be on
by default.
From http://archives.seul.org/tor/relays/Mar-2010/msg00006.html :
As I understand it, the bug should show up on relays that don't set
Address to an IP address (so they need to resolve their Address
line or their hostname to guess their IP address), and their
hostname or Address line fails to resolve -- at that point they'll
pick a random 4 bytes out of memory and call that their address. At
the same time, relays that *do* successfully resolve their address
will ignore the result, and only come up with a useful address if
their interface address happens to be a public IP address.
When the bandwidth-weights branch added the "directory-footer"
token, and began parsing the directory footer at the first
occurrence of "directory-footer", it made it possible to fool the
parsing algorithm into accepting unsigned data at the end of a
consensus or vote. This patch fixes that bug by treating the footer
as starting with the first "directory-footer" or the first
"directory-signature", whichever comes first.
Treat strings returned from signed_descriptor_get_body_impl() as not
NUL-terminated. Since the length of the strings is available, this is
not a big problem.
Discovered by rieo.
Don't allow anything but directory-signature tokens in a consensus after
the first directory-signature token. Fixes bug in bandwidth-weights branch.
Found by "outofwords."
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.
Thanks to ekir for discovering and reporting this bug.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
We used to only zero the first ptrsize bytes of the cipher. Since
cipher is large enough, we didn't zero too many bytes. Discovered
and fixed by ekir. Fixes bug 1254.
This means that "if (E<G) {abc} else if (E>=G) {def}" can be replaced with
"if (E<G) {abc} else {def}"
Doing the second test explicitly made my mingw gcc nervous that we might
never be initializing casename.
For my 64-bit Linux system running with GCC 4.4.3-fc12-whatever, you
can't do 'printf("%lld", (int64_t)x);' Instead you need to tell the
compiler 'printf("%lld", (long long int)x);' or else it doesn't
believe the types match. This is why we added U64_PRINTF_ARG; it
looks like we needed an I64_PRINTF_ARG too.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.
Our tor_asprintf() differs from standard asprintf() in that:
- Like our other malloc functions, it asserts on OOM.
- It works on windows.
- It always sets its return-field.
All other bandwidthrate settings are restricted to INT32_MAX, but
this check was forgotten for PerConnBWRate and PerConnBWBurst. Also
update the manpage to reflect the fact that specifying a bandwidth
in terabytes does not make sense, because that value will be too
large.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
On Windows, we don't have a notion of ~ meaning "our homedir", so we
were deliberately using an #ifdef to avoid calling expand_filename()
in multiple places. This is silly: The right place to turn a function
into a no-op on a single platform is in the function itself, not in
every single call-site.
We used to only zero the first ptrsize bytes of the cipher. Since
cipher is large enough, we didn't zero too many bytes. Discovered
and fixed by ekir. Fixes bug 1254.
Spec conformance issue: The code didn't force the network-status-version
token to be the first token in a v3 vote or consensus.
Problem discovered by Parakeep.
We need to use evdns_add_server_port_with_base() when configuring
our DNS listener, because libevent segfaults otherwise. Add a macro
in compat_libevent.h to pick the correct implementation depending
on the libevent version.
Fixes bug 1143, found by SwissTorExit
This time, set the SSL3_FLAGS_ALLOW_UNSAFE_RENEGOTIATION flag on every
version before OpenSSL 0.9.8l. I can confirm that the option value (0x0010)
wasn't reused until OpenSSL 1.0.0beta3.
The src and dest of a memcpy() call aren't supposed to overlap,
but we were sometimes calling tor_addr_copy() as a no-op.
Also, tor_addr_assign was a redundant copy of tor_addr_copy(); this patch
removes it.
We implemented ratelimiting for warnings going into the logfile, but didn't
rate-limit controller events. Now both log warnings and controller events
are rate-limited.
Tor has tor_lookup_hostname(), which prefers ipv4 addresses automatically.
Bug 1244 occured because gethostbyname() returned an ipv6 address, which
Tor cannot handle currently. Fixes bug 1244; bugfix on 0.0.2pre25.
Reported by Mike Mestnik.
The problem was that we didn't allocate enough memory on 32-bit
platforms with 64-bit time_t. The memory leak occured every time
we fetched a hidden service descriptor we've fetched before.
When calculating the is_exit flag for a routerinfo_t, we don't need
to call exit_policy_is_general_exit() if router_exit_policy_rejects_all()
tells us it definitely is an exit. This check is much cheaper than
running exit_policy_is_general_exit().
exit_policy_is_general_exit() assumed that there are no redundancies
in the passed policy, in the sense that we actively combine entries
in the policy to really get rid of any redundancy. Since we cannot
do that without massively rewriting the policy lines the relay
operators set, fix exit_policy_is_general_exit().
Fixes bug 1238, discovered by Martin Kowalczyk.
In brief: you mustn't use the SSL3_FLAG solution with anything but 0.9.8l,
and you mustn't use the SSL_OP solution with anything before 0.9.8m, and
you get in _real_ trouble if you try to set the flag in 1.0.0beta, since
they use it for something different.
For the ugly version, see my long comment in tortls.c
We need to do this because Apple doesn't update its dev-tools headers
when it updates its libraries in a security patch. On the bright
side, this might get us out of shipping a statically linked OpenSSL on
OSX.
May fix bug 1225.
[backported]
We need to do this because Apple doesn't update its dev-tools headers
when it updates its libraries in a security patch. On the bright
side, this might get us out of shipping a statically linked OpenSSL on
OSX.
May fix bug 1225.
We accidentally freed the internal buffer for bridge stats when we
were writing the bridge stats file or honoring a control port
request for said data. Change the interfaces for
geoip_get_bridge_stats* to prevent these problems, and remove the
offending free/add a tor_strdup.
Fixes bug 1208.
I believe that since we were allocating *cp while holding a mutex,
coverity deduced that *cp must be protected by that mutex, and later
flipped out when we didn't use it that way. If this is so, we can
solve our problems by moving the *cp = tor_strdup(buf) part outside of
the mutex-protected code.
It's a bit confusing to have a loop where another function,
confusingly named "*_free", is responsible for advancing the loop
variable (or rather, for altering a structure so that the next time
the loop variable's initializer is evaluated it evaluates to something
different.)
Not only has this confused people: it's also confused coverity scan.
Let's fix that.
This was freaking out some relay operators without good reason, as
it is nothing the relay operator can do anything about anyways.
Quieting this warning suggested by rieo.
We were checking for msg==NULL, but not lib or proc. This case can
only occur if we have an error whose string we somehow haven't loaded,
but it's worth coding defensively here.
Spotted by rieo on IRC.
The OutboundBindAddress option is useful for making sure that all of
your outbond connections use a given interface. But when connecting
to 127.0.0.1 (or ::1 even) it's important to actually have the
connection come _from_ localhost, since lots of programs running on
localhost use the source address to authenticate that the connection
is really coming from the same host.
Our old code always bound to OutboundBindAddress, whether connecting
to localhost or not. This would potentially break DNS servers on
localhost, and socks proxies on localhost. This patch changes the
behavior so that we only look at OutboundBindAddress when connecting
to a non-loopback address.
this case can now legitimately happen, if you have a cached v2 status
from moria1, and you run with the new list of dirservers that's missing
the old moria1. it's nothing to worry about; the file will die off in
a month or two.
The following commit:
commit e56747f9cf
Author: Nick Mathewson <nickm@torproject.org>
Date: Tue Dec 15 14:32:55 2009 -0500
Refactor a bit so that it is safe to include math.h, and mostly not needed.
introduced this line:
tor_resolve_LDADD = -lm ../common/libor.a @TOR_LIB_WS32@
which caused the build to fail, because only ../common/libor.a
(via the embedded ../common/util.o via ../common/util.c)
referenced libm's `lround' and `log' symbols, so that the
linker (GNU ld) didn't bother to import those symbols before
reading ../common/libor.a, thus leaving those symbols undefined.
The solution was to swap the order, producing the line:
tor_resolve_LDADD = ../common/libor.a -lm @TOR_LIB_WS32@
Signed-off-by: Michael Witten <mfwitten@gmail.com>
...to let us
rate-limit client connections as they enter the network. It's
controlled in the consensus so we can turn it on and off for
experiments. It's starting out off. Based on proposal 163.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.
Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.
Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.
Also, now we refresh your entry guards from EntryNode at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).
The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
We do this in too many places throughout the code; it's time to start
clamping down.
Also, refactor Karsten's patch to use strchr-then-strndup, rather than
malloc-then-strlcpy-then-strchr-then-clear.
Fix statistics on client numbers by country as seen by bridges that were
broken in 0.2.2.1-alpha. Also switch to reporting full 24-hour intervals
instead of variable 12-to-48-hour intervals.
The HSAuthorityRecordStats option was used to track statistics of overall
hidden service usage on the version 0 hidden service authorities. With the
version 2 hidden service directories being deployed and version 0
descriptors being phased out, these statistics are not as useful anymore.
Goodbye, you fine piece of software; my first major code contribution to
Tor.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log." safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
The rule is now: take the value from the CircuitPriorityHalflife
config option if it is set. If it zero, disable the cell_ewma
algorithm. If it is set, use it to calculate the scaling factor.
If it is not set, look for a CircPriorityHalflifeMsec parameter in the
consensus networkstatus. If *that* is zero, then disable the cell_ewma
algorithm; if it is set, use it to calculate the scaling factor.
If it is not set at all, disable the algorithm.
In connection_dir_client_reached_eof, we make sure that we either
return when we get an http status code of 503 or handle the problem
and set it to 200. Later we check if the status code is 503. Remove
that check.
There are two big changes here:
- We store active circuits in a priority queue for each or_conn,
rather than doing a linear search over all the active circuits
before we send each cell.
- Rather than multiplying every circuit's cell-ewma by a decay
factor every time we send a cell (thus normalizing the value of a
current cell to 1.0 and a past cell to alpha^t), we instead
only scale down the cell-ewma every tick (ten seconds atm),
normalizing so that a cell sent at the start of the tick has
value 1.0).
Each circuit is ranked in terms of how many cells from it have been
relayed recently, using a time-weighted average.
This patch has been tested this on a private Tor network on PlanetLab,
and gotten improvements of 12-35% in time it takes to fetch a small
web page while there's a simultaneous large data transfer going on
simultaneously.
[Commit msg by nickm based on mail from Ian Goldberg.]
This changes the pqueue API by requiring an additional int in every
structure that we store in a pqueue to hold the index of that structure
within the heap.
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.
This gains us consistence for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
It turns out that OpenSSL 0.9.8m is likely to take a completely
different approach for reenabling renegotiation than OpenSSL 0.9.8l
did, so we need to work with both. :p Fixes bug 1158.
(patch by coderman; commit message by nickm)
Do not segfault when writing buffer stats when we haven't observed a
single circuit to report about. This is a minor bug that would only show
up in testing environments with no traffic and with reduced stats
intervals.
Avoid crashing if the client is trying to upload many bytes and the
circuit gets torn down at the same time, or if the flip side
happens on the exit relay. Bugfix on 0.2.0.1-alpha; fixes bug 1150.
New config option "CircuitStreamTimeout" to override our internal
timeout schedule for how many seconds until we detach a stream from
a circuit and try a new circuit. If your network is particularly
slow, you might want to set this to a number like 60.
On this OSX version, there is a stub mlockall() function
that doesn't work, *and* the declaration for it is hidden by
an '#ifdef _P1003_1B_VISIBLE'. This would make autoconf
successfully find the function, but our code fail to build
when no declaration was found.
This patch adds an additional test for the declaration.
This fixes bug 1147:
bionic doesn't have an actual implementation of mlockall();
mlockall() is merely in the headers but not actually in the library.
This prevents Tor compilation with the bionic libc for Android handsets.
To fix a major security problem related to incorrect use of
SSL/TLS renegotiation, OpenSSL has turned off renegotiation by
default. We are not affected by this security problem, however,
since we do renegotiation right. (Specifically, we never treat a
renegotiated credential as authenticating previous communication.)
Nevertheless, OpenSSL's new behavior requires us to explicitly
turn renegotiation back on in order to get our protocol working
again.
Amusingly, this is not so simple as "set the flag when you create
the SSL object" , since calling connect or accept seems to clear
the flags.
For belt-and-suspenders purposes, we clear the flag once the Tor
handshake is done. There's no way to exploit a second handshake
either, but we might as well not allow it.
This commit implements a new config option: 'DisableAllSwap'
This option probably only works properly when Tor is started as root.
We added two new functions: tor_mlockall() and tor_set_max_memlock().
tor_mlockall() attempts to mlock() all current and all future memory pages.
For tor_mlockall() to work properly we set the process rlimits for memory to
RLIM_INFINITY (and beyond) inside of tor_set_max_memlock().
We behave differently from mlockall() by only allowing tor_mlockall() to be
called one single time. All other calls will result in a return code of 1.
It is not possible to change DisableAllSwap while running.
A sample configuration item was added to the torrc.complete.in config file.
A new item in the man page for DisableAllSwap was added.
Thanks to Moxie Marlinspike and Chris Palmer for their feedback on this patch.
Please note that we make no guarantees about the quality of your OS and its
mlock/mlockall implementation. It is possible that this will do nothing at all.
It is also possible that you can ulimit the mlock properties of a given user
such that root is not required. This has not been extensively tested and is
unsupported. I have included some comments for possible ways we can handle
this on win32.
If your relay can't keep up with the number of incoming create cells, it
would log one warning per failure into your logs. Limit warnings to 1 per
minute.
In 5e4d53d535 we made it so that
crypto_cipher_set_key cannot fail. The call will now
always succeed, to returning a boolean for success/failure makes
no sense.
This was left over from an early draft of the microdescriptor code; it
began to populate the signatures array of a networkstatus vote, even
though there's no actual need to do that for a vote.
In its zeal to keep me from saying memset(x, '0', sizeof(x)), Coverity
disallows memset(x, 48, sizeof(x)). Fine. I'll choose a different
magic number, see if I care!
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement. Coverity notices this now.
In some cases, we were testing arrays to make sure that an operation
we wanted to do would suceed. Those cases are now always-true.
In some cases, we were testing arrays to see if something was _set_.
Those caes are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
If all authorities restart at once right before a consensus vote, nobody
will vote about "Running", and clients will get a consensus with no usable
relays. Instead, authorities refuse to build a consensus if this happens.
The first happens on an error case when a controller wants an
impossible directory object. The second happens when we can't write
our fingerprint file.
The code for these was super-wrong, but will only break things when we
reset an option on a platform where sizeof(time_t) is different from
sizeof(int).
See task 1114. The most plausible explanation for someone sending us weak
DH keys is that they experiment with their Tor code or implement a new Tor
client. Usually, we don't care about such events, especially not on warn
level. If we really care about someone not following the Tor protocol, we
can set ProtocolWarnings to 1.
This patch introduces a new type called document_signature_t to represent the
signature of a consensus document. Now, each consensus document can have up
to one document signature per voter per digest algorithm. Also, each
detached-signatures document can have up to one signature per <voter,
algorithm, flavor>.
Previously, we insisted that a valid signature must be a signature of
the expected digest. Now we accept anything that starts with the
expected digest. This lets us include another digest later.
When we tried to use the deprecated non-threadsafe evdns
interfaces in Libevent 2 without using the also-deprecated
event_init() interface, Libevent 2 would sensibly crash, since it
has no guess where to find the Libevent library.
Here we use the evdns_base_*() functions instead if they're
present, and fake them if they aren't.
This is a possible fix for bug 1023, where if we vote (or make a v2
consensus networkstatus) right after we come online, we can call
rep_hist_note_router_unreachable() on every router we haven't connected
to yet, and thereby make all their uptime values reset.
This seems to be happening to me a lot on a garbage DSL line.
We may need to come up with 2 threshholds: a high short onehop
count and a lower longer count.
Don't count one-hop circuits when we're estimating how long it
takes circuits to build on average. Otherwise we'll set our circuit
build timeout lower than we should. Bugfix on 0.2.2.2-alpha.
Directory authorities now reject Tor relays with versions less than
0.1.2.14. This step cuts out four relays from the current network,
none of which are very big.
Previously, tor-gencert would call RSA_generate_key() directly.
This won't work on Android, which removes the (deprecated since
OpenSSL 0.9.8) function. We can't call RSA_generate_key_ex()
unconditionally either, since that didn't exist before 0.9.8.
Instead, we must call our own crypto_pk_generate_key_with_bits,
which knows how to call RSA_generate_key or RSA_generate_key_ex as
appropriate.
[Based on patch by Nathan Freitas]
Apparently the Android developers dumped OpenSSL's support for hardware
acceleration in order to save some memory, so you can't build programs using
engines on Android.
[Patch revised by nickm]
This shouldn't be necessary, but apparently the Android cross-compiler
doesn't respect -I as well as it should. (-I is supposed to add to the
*front* of the search path. Android's gcc wrapper apparently likes to add to
the end. This is broken, but we need to work around it.)
Found by coverity
test_mem_op_hex was leaking memory, which showed up in a few
tests.
Also, the dir_param test had a memleak of its own.
Found by Coverity
- Avoid memmoving 0 bytes which might lead to compiler warnings.
- Don't require relays to be entry node AND bridge at the same to time to
record clients.
- Fix a memory leak when writing dirreq-stats.
- Don't say in the stats files that measurement intervals are twice as long
as they really are.
- Reduce minimum observation time for requests to 12 hours, or we might
never record usage.
- Clear exit stats correctly after writing them, or we accumulate old stats
over time.
- Reset interval start for buffer stats, too.
The big change is to add a function to display the current SSL handshake
state, and to log it everywhere reasonable. (A failure in
SSL23_ST_CR_SRVR_HELLO_A is different from one in
SSL3_ST_CR_SESSION_TICKET_A.)
This patch also adds a new log domain for OR handshaking, so you can pull out
all the handshake log messages without having to run at debug for everything.
For example, you'd just say "log notice-err [handshake]debug-err file
tor.log".
This was the only log notice that happened during other
tor invocations, like --verify-config and --list-fingerprint.
Plus, now we think it works, so no need to hear about it.
"Tinytest" is a minimalist C unit testing framework I wrote for
Libevent. It supports some generally useful features, like being able
to run separate unit tests in their own processes.
I tried to do the refactoring to change test.c as little as possible.
Thus, we mostly don't call the tinytest macros directly. Instead, the
test.h header is now a wrapper on tinytest.h to make our existing
test_foo() macros work.
The next step(s) here will be:
- To break test.c into separate files, each with its own test group.
- To look into which things we can test
- To refactor the more fiddly tests to use the tinytest macros
directly and/or run forked.
- To see about writing unit tests for things we couldn't previously
test without forking.
If the networkstatus consensus tells us that we should use a
negative circuit package window, ignore it. Otherwise we'll
believe it and then trigger an assert.
Also, change the interface for networkstatus_get_param() so we
don't have to lookup the consensus beforehand.
A) We were considering a circuit had timed out in the special cases
where we close rendezvous circuits because the final rendezvous
circuit couldn't be built in time.
B) We were looking at the wrong timestamp_created when considering
a timeout.
Don't discard all circuits every MaxCircuitDirtiness, because the
user might legitimately have set that to a very lower number.
Also don't use up all of our idle circuits with testing circuits,
since that defeats the point of preemptive circuits.
We want it to be under our control so it doesn't mess
up initialization. This is likely the cause for
the bug the previous assert-adding commit (09a75ad) was
trying to address.
Using CircuitBuildTimeout is prone to issues with SIGHUP, etc.
Also, shuffle the circuit build times array after loading it
in so that newer measurements don't replace chunks of
similarly timed measurements.
To further attempt to fix bug 1090, make sure connection_ap_can_use_exit
always returns 0 when the chosen exit router is excluded. This should fix
bug1090.
When we excluded some Exits, we were sometimes warning the user that we
were going to use the node regardless. Many of those warnings were in
fact bogus, because the relay in question was not used to connect to
the outside world.
Based on patch by Rotor, thanks!
Tor now reads the "circwindow" parameter out of the consensus,
and uses that value for its circuit package window rather than the
default of 1000 cells. Begins the implementation of proposal 168.
This code adds a new field to vote on: "params". It consists of a list of
sorted key=int pairs. The output is computed as the median of all the
integers for any key on which anybody voted.
Improved with input from Roger.
Adding the same vote to a networkstatus consensus leads to a memory leak
on the client side. Fix that by only using the first vote from any given
voter, and ignoring the others.
Problem found by Rotor, who also helped writing the patch. Thanks!
A vote may only contain exactly one signature. Make sure we reject
votes that violate this.
Problem found by Rotor, who also helped writing the patch. Thanks!
Fix an obscure bug where hidden services on 64-bit big-endian
systems might mis-read the timestamp in v3 introduce cells, and
refuse to connect back to the client. Discovered by "rotor".
Bugfix on 0.2.1.6-alpha.
(Given that we're pretty much assuming that int is 32 bits, and given that
hex values are always unsigned, taking out the "ul" from 0xff000000 should
be fine.)
Add a "getinfo status/accepted-server-descriptor" controller
command, which is the recommended way for controllers to learn
whether our server descriptor has been successfully received by at
least on directory authority. Un-recommend good-server-descriptor
getinfo and status events until we have a better design for them.
We were telling the controller about CHECKING_REACHABILITY and
REACHABILITY_FAILED status events whenever we launch a testing
circuit or notice that one has failed. Instead, only tell the
controller when we want to inform the user of overall success or
overall failure. Bugfix on 0.1.2.6-alpha. Fixes bug 1075. Reported
by SwissTorExit.
When we added support for fractional units (like 1.5 MB) I broke
support for giving units with no space (like 2MB). This patch should
fix that. It also adds a propoer tor_parse_double().
Fix for bug 1076. Bugfix on 0.2.2.1-alpha.
We were triggering a CLOCK_SKEW controller status event whenever
we connect via the v2 connection protocol to any relay that has
a wrong clock. Instead, we should only inform the controller when
it's a trusted authority that claims our clock is wrong. Bugfix
on 0.2.0.20-rc; starts to fix bug 1074. Reported by SwissTorExit.
- Refactor geoip.c by moving duplicate code into rotate_request_period().
- Don't leak memory when cleaning up cell queues.
- Make sure that exit_(streams|bytes_(read|written)) are initialized in all
places accessing these arrays.
- Read only the last block from *stats files and ensure that its timestamp
is not more than 25 hours in the past and not more than 1 hour in the
future.
- Stop truncating the last character when reading *stats files.
The only thing that's left now is to avoid reading whole *stats files into
memory.
Note that unlike subversion revision numbers, it isn't meaningful to
compare these for anything but equality. We define a sort-order anyway,
in case one of these accidentally slips into a recommended-versions
list.
If any the v3 certs we download are unparseable, we should actually
notice the failure so we don't retry indefinitely. Bugfix on 0.2.0.x;
reported by "rotator".
Once we had called log_free_all(), anything that tried to log a
message (like a failed tor_assert()) would fail like this:
1. The logging call eventually invokes the _log() function.
2. _log() calls tor_mutex_lock(log_mutex).
3. tor_mutex_lock(m) calls tor_assert(m).
4. Since we freed the log_mutex, tor_assert() fails, and tries to
log its failure.
5. GOTO 1.
Now we allocate the mutex statically, and never destroy it on
shutdown.
Bugfix on 0.2.0.16-alpha, which introduced the log mutex.
This bug was found by Matt Edman.
(This would be everywhere running OpenSSL 0.9.7x and earlier, including
all current Macintosh users.)
The code is based on Tom St Denis's LibTomCrypt implementation,
modified to be way less general and use Tor's existing facilities. I
picked this one because it was pretty fast and pretty free, and
because Python uses it too.
The more verbose logs that were added in ee58153 also include a string
that might not have been initialized. This can lead to segfaults, e.g.,
when setting up private Tor networks. Initialize this string with NULL.
Send circuit or stream sendme cells when our window has decreased
by 100 cells, not when it has decreased by 101 cells. Bug uncovered
by Karsten when testing the "reduce circuit window" performance
patch. Bugfix on the 54th commit on Tor -- from July 2002,
before the release of Tor 0.0.0. This is the new winner of the
oldest-bug prize.
I don't think we actually use (or plan to use) strtok_r in a reentrant
way anywhere in our code, but would be nice not to have to think about
whether we're doing it.
Relays no longer publish a new server descriptor if they change
their MaxAdvertisedBandwidth config option but it doesn't end up
changing their advertised bandwidth numbers. Bugfix on 0.2.0.28-rc;
fixes bug 1026. Patch from Sebastian.
Specifically, every time we get a create cell but we have so many already
queued that we refuse it.
Bugfix on 0.2.0.19-alpha; fixes bug 1034. Reported by BarkerJr.
The problem is that clients and hidden services are receiving
relay_early cells, and they tear down the circuit.
Hack #1 is for rendezvous points to rewrite relay_early cells to
relay cells. That way there are never any incoming relay_early cells.
Hack #2 is for clients and hidden services to never send a relay_early
cell on an established rendezvous circuit. That works around rendezvous
points that haven't upgraded yet.
Hack #3 is for clients and hidden services to not tear down the circuit
when they receive an inbound relay_early cell. We already refuse extend
cells at clients.
When determining how long directory requests take or how long cells spend
in queues, we were comparing timestamps on microsecond detail only to
convert results to second or millisecond detail later on. But on 32-bit
architectures this means that 2^31 microseconds only cover time
differences of up to 36 minutes. Instead, compare timestamps on
millisecond detail.
Now that we require EntryStatistics to be 1 for counting connecting
clients, unit tests need to set that config option, too.
Reported by Sebastian Hahn.
Changes to directory request statistics:
- Rename GEOIP statistics to DIRREQ statistics, because they now include
more than only GeoIP-based statistics, whereas other statistics are
GeoIP-dependent, too.
- Rename output file from geoip-stats to dirreq-stats.
- Add new config option DirReqStatistics that is required to measure
directory request statistics.
- Clean up ChangeLog.
Also ensure that entry guards statistics have access to a local GeoIP
database.
This new option will allow clients to download the newest fresh consensus
much sooner than they normally would do so, even if they previously set
FetchDirInfoEarly. This includes a proper ChangeLog entry and an updated man
page.
Introduce a threshold of 0.01% of bytes that must be read and written per
port in order to be included in the statistics. Otherwise we cannot include
these statistics in extra-info documents, because they are too big.
Change the labels "-written" and "-read" so that the meanings are as
intended.
The internal error "could not find intro key" occurs when we want to send
an INTRODUCE1 cell over a recently finished introduction circuit and think
we built the introduction circuit with a v2 hidden service descriptor, but
cannot find the introduction key in our descriptor.
My first guess how we can end up in this situation is that we are wrong in
thinking that we built the introduction circuit based on a v2 hidden
service descriptor. This patch checks if we have a v0 descriptor, too, and
uses that instead.
o Minor features:
- If we're a relay and we change our IP address, be more verbose
about the reason that made us change. Should help track down
further bugs for relays on dynamic IP addresses.
when we write out our stability info, detect relays that have slipped
through the cracks. log about them and correct the problem.
if we continue to see a lot of these over time, it means there's another
spot where relays fall out of the routerlist without being marked as
unreachable.
Marks the control port connection for flushing before closing when
the QUIT command is issued. This allows a QUIT to be issued during
a long reply over the control port, flushing the reply and then
closing the connection. Fixes bug 1015.
rather than the bandwidth values in each relay descriptor. This approach
opens the door to more accurate bandwidth estimates once the directory
authorities start doing active measurements. Implements more of proposal
141.
arma's rationale: "I think this is a bug, since people intentionally
set DirPortFrontPage, so they really do want their relay to serve that
page when it's asked for. Having it appear only sometimes (or roughly
never in Sebastian's case) makes it way less useful."
Fixes bug 1013; bugfix on 0.2.1.8-alpha.
Added a sanity check in config.c and a check in directory.c
directory_initiate_command_rend() to catch any direct connection attempts
when a socks proxy is configured.
If the Tor is running with AutomapHostsOnResolve set, it _is_
reasonable to do a DNS lookup on a .onion address. So instead we make
tor-resolve willing to try to resolve anything. Only if Tor refuses
to resolve it do we suggest to the user that resolving a .onion
address may not work.
Fix for bug 1005.
The rest of the code was only including event.h so that it could see
EV_READ and EV_WRITE, which we were using as part of the
connection_watch_events interface for no very good reason.
This patch adds a new compat_libevent.[ch] set of files, and moves our
Libevent compatibility and utilitity functions there. We build them
into a separate .a so that nothing else in src/commmon depends on
Libevent (partially fixing bug 507).
Also, do not use our own built-in evdns copy when we have Libevent
2.0, whose evdns is finally good enough (thus fixing Bug 920).
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Hidden service clients didn't use a cached service descriptor that
was older than 15 minutes, but wouldn't fetch a new one either. Now,
use a cached descriptor no matter how old it is and only fetch a new
one when all introduction points fail. Fix for bug 997. Patch from
Marcus Griep.
Apparently all the stuff that does a linear scan over all the DNS
cache entries can get really expensive when your DNS cache is very
large. It's hard to say how much this will help performance, since
gprof doesn't count time spent in OpenSSL or zlib, but I'd guess 10%.
Also, this patch removes calls to assert_connection_ok() from inside
the read and write callbacks, which are similarly unneeded, and a
little costlier than I'm happy with.
This is probably worth backporting to 0.2.0.
Provide a useful warning when launch_circuit tries to make us use a
node we don't want to use. Just give an info message when this is a
normal and okay situation. Fix for logging issues in bug 984.
This patch adds a function to determine whether we're in the main
thread, and changes control_event_logmsg() to return immediately if
we're in a subthread. This is necessary because otherwise we will
call connection_write_to_buf, which modifies non-locked data
structures.
Bugfix on 0.2.0.x; fix for at least one of the things currently
called "bug 977".
Tas (thanks!) noticed that when *ListenAddress is set, Tor would
still warn on startup when *Port is low and hibernation is active.
The patch parses all the *ListenAddress lines, and checks the
ports. Bugfix on 0.2.1.15-rc
With the last fix of task 932 (5f03d6c), client requests are only added to
the history when they happen after the start of the current history. This
conflicts with the unit tests that insert current requests first (defining
the start of the client request history) followed by requests in the past.
The fix is to insert requests in chronological order in the unit tests.
- Write geoip stats to disk every 24 hours, not every hour.
- Remove configuration options and define reasonable defaults.
- Clear history of client requests every 24 hours (which wasn't done at
all before).
Specifically if you send SIGUSR1, it will add two lines to the log file:
May 22 07:41:59.576 [notice] Our DNS cache has 3364 entries.
May 22 07:41:59.576 [notice] Our DNS cache size is approximately 1022656
bytes.
[tweaked a bit by nickm]
Really, our idiocy was that we were calling event_set() on the same
event more than once, which sometimes led to us calling event_set() on
an event that was already inserted, thus making it look uninserted.
With this patch, we just initialize the timeout events when we create
the requests and nameservers, and we don't need to worry about
double-add and double-del cases at all.
If we ever add an event, then set it, then add it again, there will be
now two pointers to the event in the event base. If we delete one and
free it, the first pointer will still be there, and possibly cause a
crash later.
This patch adds detection for this case to the code paths in
eventdns.c, and works around it. If the warning message ever
displays, then a cleverer fix is in order.
{I am not too confident that this *is* the fix, since bug 957 is very
tricky. If it is, it is a bugfix on 0.2.0.}
When we got a descriptor that we (as an authority) rejected as totally
bad, we were freeing it, then using the digest in its RAM to look up its
download status. Caught by arma with valgrind. Bugfix on 0.2.1.9-alpha.
The trick is that we should assert that our next_mem pointer has not
run off the end of the array _before_ we realign the pointer, since
doing that could take us over the end... but only if we're on a system
where malloc() gives us ram in increments smaller than sizeof(void*).
Bridges are not supposed to publish router descriptors to the directory
authorities. It defeats the point of bridges when they are included in the
public relay directory.
This patch puts out a warning and exits when the node is configured as
a bridge and to publish v1, v2, or v3 descriptors at the same time.
Also fixes part of bug 932.
This matters because a cpuworker can close its socket when it
finishes. Cpuworker typically runs in another thread, so without a
lock here, we can have a race condition and get confused about how
many sockets are open. Possible fix for bug 939.
This might detect some possible causes of bug 930, and will at least
make sure we aren't doing some dumb memory-corruption stuff with the heap
and router-parsing.
This addresses the first part of bug 918. Users are now warned when
they try to use hibernation in combination with a port below 1024
when they're not on Windows. We don't want to die here, because
people might run Tor as root, use a capabilities system or some
other platform that will allow them to re-attach low ports.
Wording suggested by Marian
(Don't crash immediately if we have leftover chunks to free after
freeing chunks in a buffer freelist; instead log a debugging message
that might help.)
Now, when you call tor --digests, it dumps the SHA1 digest of each
source file that Tor was built with. We support both 'sha1sum' and
'openssl sha1'. If the user is building from a tarball and they
haven't edited anything, they don't need any program that calculates
SHA1. If they _have_ modified a file but they don't have a program to
calculate SHA1, we try to build so we do not output digests.
bytes (aka 20KB/s), to match our documentation. Also update
directory authorities so they always assign the Fast flag to relays
with 20KB/s of capacity. Now people running relays won't suddenly
find themselves not seeing any use, if the network gets faster
on average.
svn:r19305
IP address changes: directory mirrors were mistakenly telling them
their old address if they asked via begin_dir, so they never got
an accurate answer about their new address, so they just vanished
after a day. Should fix bugs 827, 883, and 900 -- but alas, only
after every directory mirror has upgraded.
svn:r19291
ago. This change should significantly improve client performance,
especially once more people upgrade, since relays that have been
a guard for a long time are currently overloaded.
svn:r19287
The directory authorities were refusing v3 consensus votes from
other authorities, since the votes are now 504K. Fixes bug 959;
bugfix on 0.0.2pre17 (where we raised it from 50K to 500K ;).
svn:r19194
When we used smartlist_free to free the list of succesful uploads
because we had succeeded in uploading everywhere, we did not actually
set the successful_uploads field to NULL, so later it would get freed
again in rend_service_descriptor_free. Fix for bug 948; bug
introduced in 0.2.1.6-alpha.
svn:r19073
tor_sscanf() only handles %u and %s for now, which will make it
adequate to replace sscanf() for date/time/IP parsing. We want this
to prevent attackers from constructing weirdly formed descriptors,
cells, addresses, HTTP responses, etc, that validate under some
locales but not others.
svn:r18760
It seems that 64-bit Sparc Solaris demands 64-bit-aligned access to
uint64_t, but does not 64-bit-align the stack-allocated char array we
use for cpuworker tags. So this patch adds a set/get_uint64 pair, and
uses them to access the conn_id field in the tag.
svn:r18743
It turns out that we weren't updating the _ExcludeExitNodesUnion set's
country numbers when we reloaded (or first loaded!) the IP-to-country
file. Spotted by Lark. Bugfix on 0.2.1.6-alpha.
svn:r18575
stream never finished making its connection, it would live
forever in circuit_wait state. Now we close it after SocksTimeout
seconds. Bugfix on 0.1.2.7-alpha; reported by Mike Perry.
svn:r18516
Previously, when we had the chosen_exit set but marked optional, and
we failed because we couldn't find an onion key for it, we'd just give
up on the circuit. But what we really want to do is try again, without
the forced exit node.
Spotted by rovv. Another case of bug 752. I think this might be
unreachable in our current code, but proposal 158 could change that.
svn:r18451
merge the 'bridge relay' section into the 'main relay'
section, so people stop getting confused about whether they
should fill out both sections (they shouldn't).
svn:r18348
This resolves bug 526, wherein we would crash if the following
events occurred in this order:
A: We're an OR, and one of our nameservers goes down.
B: We launch a probe to it to see if it's up again. (We do this hourly
in steady-state.)
C: Before the probe finishes, we reconfigure our nameservers,
usually because we got a SIGHUP and the resolve.conf file changed.
D: The probe reply comes back, or times out. (There is a five-second
window for this, after B has happens).
IOW, if one of our nameservers is down and our nameserver
configuration has changed, there were 5 seconds per hour where HUPing
the server was unsafe.
Bugfix on 0.1.2.1-alpha. Too obscure to backport.
svn:r18306
This fixes the last known case of bug 891, which could happen if two
hosts, A and B, disagree about how long a circuit has been open,
because of clock drift of some kind. Host A would then mark the
connection as is_bad_for_new_circs when it got too old and open a new
connection. In between when B receives a NETINFO cell on the new
conn, and when B receives a conn cell on the new circuit, the new
circuit will seem worse to B than the old one, and so B will mark it
as is_bad_for_new_circs in the second or third loop of
connection_or_group_set_badness().
Bugfix on 0.1.1.13-alpha. Bug found by rovv.
Not a backport candidate: the bug is too obscure and the fix too tricky.
svn:r18303
crypto_global_init gets called. Also have it be crypto_global_init
that calls crypto_seed_rng, so we are not dependent on OpenSSL's
RAND_poll in these fiddly cases.
Should fix bug 907. Bugfix on 0.0.9pre6. Backport candidate.
svn:r18210
are stored when the --enable-local-appdata option is configured. This
changes the Windows path from %APPDATA% to a host local
%USERPROFILE%\Local Settings\Application Data\ path (aka,
LOCAL_APPDATA).
Patch from coderman.
svn:r18122
It was dumb to have an "announce the value if it's over 0" version of
the code coexisting with an "announce the value if it's at least N"
version. Retain the latter only, with N set to 1.
Incidentally, this should fix a Coverity REVERSE_INULL warning.
svn:r18100
There was a field that _HT_FOI_INSERT was never setting. Everything that calls _HT_FOI_INSERT was setting it via tor_malloc_zero, but that's fragile.
svn:r18064
Unfortunately, old Libevents don't _put_ a version in their headers, so
this can get a little tricky. Fortunately, the only binary-compatibility
issue we care about is the size of struct event. Even more fortunately,
Libevent 2.0 will let us keep binary compatiblity forever by letting us
decouple ourselves from the structs, if we like.
svn:r18014
five days old. Otherwise if Tor is off for a long time and then
starts with cached descriptors, it will try to use the onion
keys in those obsolete descriptors when building circuits. Bugfix
on 0.2.0.x. Fixes bug 887.
svn:r17993
cell back), avoid using that OR connection anymore, and also
tell all the one-hop directory requests waiting for it that they
should fail. Bugfix on 0.2.1.3-alpha.
svn:r17984
using the wrong onion key), we were dropping it and letting the
client time out. Now actually answer with a destroy cell. Bugfix
on 0.0.2pre8.
svn:r17970
When we made bridge authorities stop serving bridge descriptors over
unencrypted links, we also broke DirPort reachability testing for
bridges. So bridges with a non-zero DirPort were printing spurious
warns to their logs. Bugfix on 0.2.0.16-alpha. Fixes bug 709.
svn:r17945
descriptors shortly after startup, and then briefly resume
after a new bandwidth test and/or after publishing a new bridge
descriptor. Bridge users that try to bootstrap from them would
get a recent networkstatus but would get descriptors from up to
18 hours earlier, meaning most of the descriptors were obsolete
already. Reported by Tas; bugfix on 0.2.0.13-alpha.
svn:r17920
discard it rather than trying to use it. In theory it could
be useful because it lists alternate directory mirrors, but in
practice it just means we spend many minutes trying directory
mirrors that are long gone from the network. Helps bug 887 a bit;
bugfix on 0.2.0.x.
svn:r17917
The subversion $Id$ fields made every commit force a rebuild of
whatever file got committed. They were not actually useful for
telling the version of Tor files in the wild.
svn:r17867
use the same download mechanism as other places.
i had to make an ugly hack around "IMPOSSIBLE_TO_DOWNLOAD+1".
we should unhack that sometime.
svn:r17834
Specifically, split compare_tor_addr_to_addr_policy() from a loop with a bunch
of complicated ifs inside into some ifs, each with a simple loop. Rearrange
router_find_exact_exit_enclave() to run a little faster. Bizarrely,
router_policy_rejects_all() shows up on oprofile, so precalculate it per
routerinfo.
svn:r17802
new from their perspective) directory download schedule abstraction.
not done yet, but i'd better get this out of my sandbox before nick
does another sweeping change. :)
svn:r17798
of which countries we've seen clients from recently. Now controllers
like Vidalia can show bridge operators that they're actually making
a difference.
svn:r17796
shahn: "Add some documentation for the WRA_* family of functions, also make
sure that (hopefully) all functions that return was_router_added_t
don't return ints directly and that they don't refer to integers in
their documentation anymore."
svn:r17731
more extreme happens. the default should be to be quiet unless
something more extreme happens.
at least, this doesn't generate complaints anymore. perhaps that
means it is working better? :)
svn:r17724
(The unfixed ones are being downgraded to regular XXXs mainly on the rationale that they don't seem to be exploding Tor, and they were apparently not showstoppers for 0.2.0.x-final.)
svn:r17682
(Many users have no idea what a resolv.conf is, and shouldn't be forced to learn. The old option will keep working for now.)
Also, document it.
svn:r17661
seconds. Warn the user if lower values are given in the
configuration. Bugfix on 0.1.0.1-rc. Patch by Sebastian.
Clip the CircuitBuildTimeout to a minimum of 30 seconds. Warn the
user if lower values are given in the configuration. Bugfix on
0.1.1.17-rc. Patch by Sebastian.
svn:r17657
"connecting" and it receives an "end" relay cell, the exit relay
would silently ignore the end cell and not close the stream. If
the client never closes the circuit, then the exit relay never
closes the TCP connection. Bug introduced in Tor 0.1.2.1-alpha;
reported by "wood".
svn:r17625
This patch makes every RELAY_COMMAND_END cell that we send pass through one of two functions: connection_edge_end and relay_send_end_cell_from_edge. Both of these functions check the circuit purpose, and change the reason to MISC if the circuit purpose means that it's for client use.
svn:r17612
This makes it easier for us to avoid errors where we we forgot to list a keyword as mandatory, and easier for Coverity to detect cases like this too.
svn:r17595
one guard from a given relay family. Otherwise we could end up with
all of our entry points into the network run by the same operator.
Suggested by Camilo Viecco. Fix on 0.1.1.11-alpha.
Not a backport candidate, since I think this might break for users
who only have a given /16 in their reachableaddresses, or something
like that.
svn:r17514
it's obsolete. which causes us to inform the user every time, even
though the user can't do anything about it other than get confused.
now it's an info-level log by default.
svn:r17206
we were complaining about no support for our one-hop streams,
when in fact choose_good_exit_server_general() has no business
caring about one-hop streams. patch from miner.
svn:r17181
dmalloc_malloc, dmalloc_realloc and dmalloc_strdup. It only calls those
functions if we're using the magic USE_DMALLOC macro. If we're not doing
that, we call the normal malloc, realloc and strdup. This is my first
night at malloc disambiguation club, so I had to disambiguate. Also, first commit, I have my commit bit now. Huzzzah!!!
svn:r17157
The "ClientDNSRejectInternalAddresses" config option wasn't being
consistently obeyed: if an exit relay refuses a stream because its
exit policy doesn't allow it, we would remember what IP address
the relay said the destination address resolves to, even if it's
an internal IP address. Bugfix on 0.2.0.7-alpha; patch by rovv.
svn:r17135
Hidden services start out building five intro circuits rather
than three, and when the first three finish they publish a service
descriptor using those. Now we publish our service descriptor much
faster after restart.
svn:r17110
and fail to start. But dangerous permissions on
$datadir/cached-status/ would cause us to open a log and complain
there. Now complain to stdout and fail to start in both cases. Fixes
bug 820, reported by seeess.
svn:r16998
reachability testing circuits to do a bandwidth test -- if
we already have a connection to the middle hop of the testing
circuit, then it could establish the last hop by using the existing
connection. Bugfix on 0.1.2.2-alpha, exposed when we made testing
circuits no longer use entry guards in 0.2.1.3-alpha.
svn:r16997
rejected them in 0.1.0.15, because back in 2005 they were commonly
misconfigured and ended up as spam targets. We hear they are better
locked down these days.
svn:r16898
If not enough of our entry guards are available so we add a new
one, we might use the new one even if it overlapped with the
current circuit's exit relay (or its family). Anonymity bugfix
pointed out by rovv.
svn:r16698
Make definition of tor_mutex_t go into compat.h, so that it is possible to inline mutexes in critical objects. Add init/uninit functions for mutexes allocated inside other structs.
svn:r16623
Make dns resolver code more robust: handle nameservers with IPv6 addresses, make sure names in replies match requested names, make sure origin address of reply matches the address we asked.
svn:r16621
Initial conversion of uint32_t addr to tor_addr_t addr in connection_t and related types. Most of the Tor wire formats using these new types are in, but the code to generate and use it is not. This is a big patch. Let me know what it breaks for you.
svn:r16435
Move n_addr, n_port, and n_conn_id_digest fields of circuit_t into a separately allocated extend_info_t. Saves 22 bytes per connected circuit_t on 32-bit platforms, and makes me more comfortable with using tor_addr_t in place of uint32_t n_addr.
svn:r16257
Allow alternate form of SMARTLIST_FOREACH with paired BEGIN and END macros. This lets the compiler tell us which line an error has occurred on.
svn:r16256
Refactor tor_addr_from_string: it didnt need most of parse_addr_mask_port_range, and its dependence on that latter function made it less flexible.
svn:r16255
Tor_addr_compare did a semantic comparison, such that ::1.2.3.4 and 1.2.3.4 were "equal". we sometimes need an exact comparison. Add a feature to do that.
svn:r16210
Make generic address manipulation functions work better. Switch address policy code to use tor_addr_t, so it can handle IPv6. That is a good place to start.
svn:r16178
Refactor the router_choose_random_node interface: any function with 10 parameters, most of which are boolean and one of which is unused, should get refactored like this.
svn:r16167
Refactor the is_vote field of networkstatus_t to add a third possibility ("opinion") in addition to vote and opinion. First part of implementing proposal 147.
svn:r16166
Patch from Christian Wilms: remove (HiddenService|Rend)(Exclude)?Nodes options. They never worked properly, and nobody seems to be using them. Resolves bug 754.
svn:r16144
In connection_edge_destroy, send a stream status control event when we have an AP connection. Previously, we would send an event when the connection was AP and non-AP at the same time. This didn't work so well. Patch from Anonymous Remailer (Austria). Backport candidate.
svn:r16143
Never allow a circuit to be created with the same circid as a circuit that has been marked for close. May be a fix for bug 779. Needs testing. Backport candidate.
svn:r16136
Add new ExcludeExitNodes option. Also add a new routerset type to handle Exclude[Exit]Nodes. It is optimized for O(1) membership tests, so as to make choosing a random router run in O(N_routers) time instead of in O(N_routers*N_Excluded_Routers).
svn:r16061
to just our our entry guards for the test circuits. Otherwise we
tend to have multiple test circuits going through a single entry
guard, which makes our bandwidth test less accurate. Fixes part
of bug 654; patch contributed by Josh Albrecht.
(Actually, modify Josh's patch to avoid doing that when you're
a bridge relay, since it would leak more than we want to leak.)
svn:r15850