This mitigates an attack proposed by wanoskarnet, in which all of a
client's bridges collude to restrict the exit nodes that the client
knows about. Fixes bug 5343.
In the distant past, connection_handle_read() could be called when there
are pending bytes in the TLS object during the main loop. The design
since then has been to always read all pending bytes immediately, so
read events only trigger when the socket actually has bytes to read.
Resolves bug 5324.
The invariant is: unconfigured_proxies_n is exactly the number of
managed_proxy_t not in state PT_PROTO_COMPLETED.
To maintain this, we need to stop overloading unconfigured_proxies_n
to also count managed_proxy_t items that are in PT_PROTO_COMPLETED but
which might need relaunching. To make it so we can detect those, we
introduce another variable.
This commit also adds a function to assert that we haven't broken the
invariant.
Fix for bug 5084; bugfix on 0.2.3.6-alpha, I think.
In some cases, we solve this by doing a SMARTLIST_DEL_CURRENT before
calling managed_proxy_destroy. But for a trickier one, we just make a
copy of the list before iterating over it, so that changes to the
manage proxy list don't hurt our iteration.
This could be related to bug 5084.
in Makefile.am, we used it without quoting it, causing build failure if
your openssl/sed/sha1sum happened to live in a directory with a space in
it (very common on windows)
If we don't do this, [::] can be interpreted to mean all v4 and all
v6 addresses. Found by dcf. Fixes bug 4760. See RFC 3493 section
5.3 for more info.
There was one MS_WINDOWS that remained because it wasn't on a macro
line; a few remaining uses (and the definition!) in configure.in;
and a now-nonsensical stanza of eventdns_tor.h that previously
defined 'WIN32' if it didn't exist.
This commit is completely mechanical; I used this perl script to make it:
#!/usr/bin/perl -w -i.bak -p
if (/^\s*\#/) {
s/MS_WINDOWS/_WIN32/g;
s/\bWIN32\b/_WIN32/g;
}
Previously the client would ask the bridge for microdescriptors, which are
only supported in 0.2.3.x and later, and then fail to bootstrap when it
didn't get the answers it wanted. Fixes bug 4013; bugfix on 0.2.3.2-alpha.
The fix here is to revert to using normal descriptors if any of our
bridges are known to not support microdescs. This is not ideal, a) because
we'll start downloading a microdesc consensus as soon as we get a bridge
descriptor, and that will waste time if we later get a bridge descriptor
that tells us we don't like microdescriptors; and b) by changing our mind
we're leaking to our other bridges that we have an old-version bridge.
The alternate fix would have been to change
we_use_microdescriptors_for_circuits() to ask if *any* of our bridges
can support microdescriptors, and then change the directory logic that
picks a bridge to only select from those that do. For people living in
the future, where 0.2.2.x is obsolete, there won't be a difference.
Note that in either of these potential fixes, we have risk of oscillation
if our one funny-looking bridges goes away / comes back.
These were found by looking for tor_snprintf() instances that were
preceeded closely by tor_malloc(), though I probably converted some
more snprintfs as well.
(In every case, make sure that the length variable (if any) is
removed, renamed, or lowered, so that anything else that might have
assumed a longer buffer doesn't exist.)
These were found by looking for tor_snprintf() instances that were
followed closely by tor_strdup(), though I probably converted some
other snprintfs as well.
(To ensure correctness, in every case, make sure that the temporary
variable is deleted, renamed, or lowered in scope, so we can't have
any bugs related to accidentally relying on the no-longer-filled
variable.)
Fixes bug #4897, not yet in any release.
Using n_circ_id alone here (and below, when n_conn is NULL) really sucks,
but that's a separate bug which will need a changes/ file.
When we have an effective bandwidthrate configured so that we cannot
exceed our bandwidth limit in one accounting interval, don't disable
advertising the dirport. Implements ticket 2434.
We used to do this as a workaround for older Tors, but now it's never
the correct thing to do (especially since anything that didn't
understand RELAY_EARLY is now deprecated hard).
This patch should make us reject every Tor that was vulnerable to
CVE-2011-0427. Additionally, it makes us reject every Tor that couldn't
handle RELAY_EARLY cells, which helps with proposal 110 (#4339).
Back in #1240, r1eo linked to information about how this could happen
with older Linux kernels in response to nmap. Bugs #4545 and #4547
are about how our approach to trying to deal with this condition was
broken and stupid. Thanks to wanoskarnet for reminding us about #1240.
This is a fix for the abovementioned bugs, and is a bugfix on
0.1.0.3-rc.
Preprocessor directives should not be put inside the arguments
of a macro. This is not supported on older GCC releases (< 3.3)
thus broke compilation on Haiku (running gcc2).
It's inefficient, but the more efficient solution (only try to attach
streams aiming for this HS) would require far more complexity for a gain
that should be tiny.
The client's anonymity when accessing a non-HS address in tor2web-mode
would be easily nuked by inserting an inline image with a .onion URL, so
don't even pretend to access non-HS addresses through Tor.
The Tor2webMode torrc option is still required to run a Tor client in
'tor2web mode', but now it can't be turned on at runtime in a normal build
of Tor. (And a tor2web build of Tor can't be used as a normal Tor client,
so we don't have to worry as much about someone distributing packages with
this particular pistol accessible to normal users.)
This resolves a loop warning on "MapAddress *.example.com
example.com", makes the rewrite log messages correct, and fixes the
behavior of "MapAddress *.a *.b" when just given "a" as an input.
MapAddress *.torproject.org torproject.org would have been interpreted
as a map from a domain to itself, and would have cleared the mapping.
Now we require not only a match of domains, but of wildcards.
Incidentally, we've got 30969 lines in master with a comma
in them, of which 1995 have a comma followed by a non-newline,
non-space character. So about 93% of our commas are right,
but we have a substantial number of "crowded" lines.
It might be nice to support this someday, but for now it would fail
with an infinite remap cycle. (If I say "remap * *.foo.exit",
then example.com ->
example.com.foo.exit ->
example.com.foo.exit.foo.exit ->
example.com.foo.exit.foo.exit.foo.exit -> ...)
In this new representation for wildcarded addresses, there are no
longer any 'magic addresses': rather, "a.b c.d", "*.a.b c.d" and
"*.a.b *.c.d" are all represented by a mapping from "a.b" to "c.d". we
now distinguish them by setting bits in the addressmap_entry_t
structure, where src_wildcard is set if the source address had a
wildcard, and dst_wildcard is set if the target address had a
wildcard.
This lets the case where "*.a.b *.c.d" or "*.a.b c.d" remap the
address "a.b" get handled trivially, and lets us simplify and improve
the addressmap_match_superdomains implementation: we can now have it
run in O(parts of address) rather than O(entries in addressmap).
1. Only allow '*.' in MapAddress expressions. Ignore '*ample.com' and '.example.com'.
This has resulted in a slight refactoring of config_register_addressmaps.
2. Add some more detail to the man page entry for AddressMap.
3. Fix initialization of a pointer to NULL rather than 0.
4. Update the unit tests to cater for the changes in 1 and test more explicitly for
recursive mapping.
1. Implement the following mapping rules:
MapAddress a.b.c d.e.f # This is what we have now
MapAddress .a.b.c d.e.f # Replaces any address ending with .a.b.c with d.e.f
MapAddress .a.b.c .d.e.f # Replaces the .a.b.c at the end of any addr with .d.e.f
(Note that 'a.b.c .d.e.f' is invalid, and will be rejected.)
2. Add tests for the new rules.
3. Allow proper wildcard annotation, i.e. '*.d.e' '.d.e' will still work.
4. Update addressmap_entry_t with an is_wildcard member.
This keeps the IP address and TCP for a given OR port together,
reducing the risk of using an address for one address family with a
port of another.
Make node_get_addr() a wrapper function for compatibility.
This is not as conservative as we could do it, f.ex. by looking at the
connection and only do this for connections to bridges. A non-bridge
should never have anything else than its primary IPv4 address set
though, so I think this is safe.
Don't touch the string representation in routerinfo_t->address.
Also, set or clear the routerinfo_t->ipv6_preferred flag based on the
address family of the bridge.
Comments below focus on changes, see diff for added code.
New type tor_addr_port_t holding an IP address and a TCP/UDP port.
New flag in routerinfo_t, ipv6_preferred. This should go in the
node_t instead but not now.
Replace node_get_addr() with
- node_get_prim_addr() for primary address, i.e. IPv4 for now
- node_get_pref_addr() for preferred address, IPv4 or IPv6.
Rename node_get_addr_ipv4h() node_get_prim_addr_ipv4h() for
consistency. The primary address will not allways be an IPv4 address.
Same for node_get_orport() -> node_get_prim_orport().
Rewrite node_is_a_configured_bridge() to take all OR ports into account.
Extend argument list to extend_info_from_node and
extend_info_from_router with a flag indicating if we want to use the
routers primary address or the preferred address. Use the preferred
address in as few situtations as possible for allowing clients to
connect to bridges over IPv6.
This code handles the new ORPort options, and incidentally makes all
remaining port types use the new port configuration systems.
There are some rough edges! It doesn't do well in the case where your
Address says one thing but you say to Advertise another ORPort. It
doesn't handle AllAddrs. It doesn't actually advertise anything besides
the first listed advertised IPv4 ORPort and DirPort. It doesn't do
port forwarding to them either.
It's not tested either, it needs more documentation, and it probably
forgets to put the milk back in the refrigerator.
Some controllers want this so they can mess with Tor's configuration
for a while via the control port before actually letting Tor out of
the house.
We do this with a new DisableNetwork option, that prevents Tor from
making any outbound connections or binding any non-control
listeners. Additionally, it shuts down the same functionality as
shuts down when we are hibernating, plus the code that launches
directory downloads.
To make sure I didn't miss anything, I added a clause straight to
connection_connect, so that we won't even try to open an outbound
socket when the network is disabled. In my testing, I made this an
assert, but since I probably missed something, I've turned it into a
BUG warning for testing.
This will mainly help distributors by giving a way to set system or package
defaults that a user can override, and that a later package can replace.
No promises about the particular future location or semantics for this:
we will probably want to tweak it some before 0.2.3.x-rc
The file is searched for in CONFDIR/torrc-defaults , which can be
overridden with the "--defaults-torrc" option on the command line.
This starts an effort to refactor torrc handling code to make it easier
to live with. It makes it possible to override exit policies from the
command line, and possible to override (rather than append to) socksport
lists from the command line.
It'll be necessary to make a "base" torrc implementation work at all.
Instead of only writing the dynamic DH prime modulus to a file, write
the whole DH parameters set for forward compatibility. At the moment
we only accept '2' as the group generator.
The DH parameters gets stored in base64-ed DER format to the
'dynamic_dh_params' file.
Let's *not* expose more cross-platform-compatibility structures, or
expect code to use them right.
Also, don't fclose() stdout_handle and stdin_handle until we do
tor_process_handle_destroy, or we risk a double-fclose.
We used to do init_keys() if DynamicDHGroups changed after a HUP, so
that the dynamic DH modulus was stored on the disk. Since we are now
doing dynamic DH modulus storing in crypto.c, we can simply initialize
the TLS context and be good with it.
Introduce a new function router_initialize_tls_context() which
initializes the TLS context and use it appropriately.
The function is over 10 or 20% on some of Moritz's profiles, depending
on how you could.
Since it's checking for a multi-hour timeout, this is safe to do.
Fixes bug 4518.
Completely disable stats if we aren't running as a relay. We won't
collect any anyway, so setting up the infrastructure for them and
logging about them is wrong. This also removes a confusing log
message that clients without a geoip db would have seen.
Fixes bug 4353.
When running with IOCP, we are in theory able to use userspace-
allocated buffers to avoid filling up the stingy amount of kernel
space allocated for sockets buffers.
The bufferevent_async implementation in Libevent provides this
ability, in theory. (There are likely to be remaining bugs). This
patch adds a new option that, when using IOCP bufferevents, sets
each socket's send and receive buffers to 0, so that we should use
this ability.
When all the bugs are worked out here, if we are right about bug 98,
this might solve or mitigate bug 98.
This option is experimental and will likely require lots of testing
and debugging.
Let's make it more obvious to the everyday reader that eventdns.c is
a) Based on Libevent's evdns.c
b) Slated for demolition
c) Supposed to keep API-compatibility with Libevent.
d) Not worth tweaking unless there's a bug.
- Add a LOG_WARN message when registering the transports of a server
managed proxy, so that the bridge operator can see in what ports the
transports spawned and notify his/her clients.
The Right Way to expire an intro point is to establish a new one to
replace it, publish a new descriptor that doesn't list any expiring intro
points, and *then*, once our upload attempts for the new descriptor have
ended (whether in success or failure), close the expiring intro points.
Unfortunately, we can't find out when the new descriptor has actually been
uploaded, so we'll have to settle for a five-minute timer.
There should be no significant behaviour changes due to this commit (only
a log-message change or two), despite the rather massive overhaul, so this
commit doesn't include a changes/ file. (The commit that teaches
intro_point_should_expire_now to return non-zero gets a changes/ file,
though.)
We would stash the certs in the handshake state before checking them
for validity... and then if they turned out to be invalid, we'd give
an error and free them. Then, later, we'd free them again when we
tore down the connection.
Fixes bug 4343; fix on 0.2.3.6-alpha.
This way, all of the DA operators can upgrade immediately, without nuking
every client's set of entry guards as soon as a majority of them upgrade.
Until enough guards have upgraded, a majority of dirauths should set this
config option so that there are still enough guards in the network. After
a few days pass, all dirauths should use the default.
The patch for 3228 made us try to run init_keys() before we had loaded
our state file, resulting in an assert inside init_keys. We had moved
it too early in the function.
Now it's later in the function, but still above the accounting calls.
Previously we did this nearer to the end (in the old_options &&
transition_affects_workers() block). But other stuff cares about
keys being consistent with options... particularly anything which
tries to access a key, which can die in assert_identity_keys_ok().
Fixes bug 3228; bugfix on 0.2.2.18-alpha.
Conflicts:
src/or/config.c
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
Conflicts:
src/or/config.c
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
Conflicts:
src/or/main.c
src/or/router.c
Since we check for naughty renegotiations using
tor_tls_t.server_handshake_count we don't need that semi-broken
function (at least till there is a way to disable rfc5746
renegotiations too).
When you're doing malloc(sizeof(int)), something may well have gone
wrong.
This technique is a bit abusive, but we're already relying on it
working correctly in geoip.c.
To get a better idea what's going on on Tonga, add some code to report
how often the most and least frequently fetched descriptor was fetched,
as well as 25, 50, 75 percentile.
Also ensure we only count bridge descriptors here.
- Add a tor_process_get_pid() function that returns the PID of a
process_handle_t.
- Conform to make check-spaces.
- Add some more documentation.
- Improve some log messages.
We used to try to terminate the managed proxy process even if it
failed while launching. We introduce a new managed proxy state, to
represent a *broken* and *not launched* proxy.
This is used for the bridge authority currently, to get a better
intuition on how many descriptors are actually fetched from it and how
many fetches happen in total.
Implements ticket 4200.
Fixes bug 4259, bugfix on 0.2.2.25-alpha. Bugfix by "Tey'".
Original message by submitter:
Changing nodes restrictions using a controller while Tor is doing
DNS resolution could makes Tor crashes (on WinXP at least). The
problem can be repeated by trying to reach a non-existent domain
using Tor:
curl --socks4a 127.0.0.1:9050 inexistantdomain.ext
.. and changing the ExitNodes parameter through the control port
before Tor returns a DNS resolution error (of course, the following
command won't work directly if the control port is password
protected):
echo SETCONF ExitNodes=TinyTurtle | nc -v 127.0.0.1 9051
Using a non-existent domain is needed to repeat the issue so that
Tor takes a few seconds for resolving the domain (which allows us to
change the configuration). Tor will crash while processing the
configuration change.
The bug is located in the addressmap_clear_excluded_trackexithosts
method which iterates over the entries of the addresses map in order
to check whether the changes made to the configuration will impact
those entries. When a DNS resolving is in progress, the new_adress
field of the associated entry will be set to NULL. The method
doesn't expect this field to be NULL, hence the crash.
Previously, we would treat an intro circuit failure as a timeout iff the
circuit failed due to a mismatch in relay identity keys. (Due to a bug
elsewhere, we only recognize relay identity-key mismatches on the first
hop, so this isn't as bad as it could have been.)
Bugfix on commit eaed37d14c, not yet in any
release.
It's too risky to have a function where if you leave one parameter
NULL, it splits up address:port strings, but if you set it, it does
hostname resolution.
Under the new convention, having a tor_addr.*lookup function that
doesn't do hostname resolution is too close for comfort.
I used this script here, and have made no other changes.
s/tor_addr_parse_reverse_lookup_name/tor_addr_parse_PTR_name/g;
s/tor_addr_to_reverse_lookup_name/tor_addr_to_PTR_name/g;
Now let's have "lookup" indicate that there can be a hostname
resolution, and "parse" indicate that there wasn't. Previously, we
had one "lookup" function that did resolution; four "parse" functions,
half of which did resolution; and a "from_str()" function that didn't
do resolution. That's confusing and error-prone!
The code changes in this commit are exactly the result of this perl
script, run under "perl -p -i.bak" :
s/tor_addr_port_parse/tor_addr_port_lookup/g;
s/parse_addr_port(?=[^_])/addr_port_lookup/g;
s/tor_addr_from_str/tor_addr_parse/g;
This patch leaves aton and pton alone: their naming convention and
behavior is is determined by the sockets API.
More renaming may be needed.
Right now we can take the digests only of an RSA key, and only expect to
take the digests of an RSA key. The old tor_cert_get_id_digests() would
return a good set of digests for an RSA key, and an all-zero one for a
non-RSA key. This behavior is too error-prone: it carries the risk that
we will someday check two non-RSA keys for equality and conclude that
they must be equal because they both have the same (zero) "digest".
Instead, let's have tor_cert_get_id_digests() return NULL for keys we
can't handle, and make its callers explicitly test for NULL.
Also, define all commands > 128 as variable-length when using
v3 or later link protocol. Running into a var cell with an
unrecognized type is no longer a bug.
Without this patch, Tor wasn't sure whether it would be hibernating or
not, so it postponed opening listeners until after the privs had been
dropped. This doesn't work so well for low ports. Bug was introduced in
the fix for bug 2003. Fixes bug 4217, reported by Zax and katmagic.
Thanks!
One of its callers assumes a non-zero result indicates a permanent failure
(i.e. the current attempt to connect to this HS either has failed or is
doomed). The other caller only requires that this function's result
never equal -2.
Bug reported by Sebastian Hahn.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
This is the cherry-picked 499661524b.
Previously Tor would always claim to have been given a hostname
by the client, while actually only verifying that the client
is using SOCKS4A or SOCKS5 with hostnames. Both protocol versions
allow IP addresses, too, in which case the log messages were wrong.
Fixes#4094.
Previously, we wouldn't refetch an HS's descriptor unless we didn't
have one at all. That was equivalent to refetching iff we didn't have
a usable one, but the next commit will make us keep some non-usable HS
descriptors around in our cache.
Code bugfix on the release that introduced the v2 HS directory system,
because rend_client_refetch_v2_renddesc's documentation comment should
have described what it actually did, not what its behaviour happened
to be equivalent to; no behaviour change in this commit.
Otherwise, we can wind up munging our reference counts if we set it in
the middle of loading the nodes. This happens because
nodelist_set_consensus() and microdesc_reload_cache() are both in the
business of adjusting microdescriptors' references.
we don't need to check whether we don't have enough guards right after
concluding that we do have enough.
slight efficiency fix suggested by an anonymous fellow on irc.
We were doing "divide bandwidth by 1000, then multiply by msec", but
that would lose accuracy: instead of getting your full bandwidth,
you'd lose up to 999 bytes per sec. (Not a big deal, but every byte
helps.)
Instead, do the multiply first, then the division. This can easily
overflow a 32-bit value, so make sure to do it as a 64-bit operation.
With managed proxies you would always get the error message:
"You have a Bridge line using the X pluggable transport, but there
doesn't seem to be a corresponding ClientTransportPlugin line."
because the check happened directly after parse_client_transport_line()
when managed proxies were not fully configured and their transports
were not registered.
The fix is to move the validation to run_scheduled_events() and make
sure that all managed proxies are configured first.
* Create mark/sweep functions for transports.
* Create a transport_resolve_conflicts() function that tries to
resolve conflicts when registering transports.
Previously the FooPort was ignored and the default used instead,
causing Tor to bind to the wrong port if FooPort and the default
port don't match or the CONN_TYPE_FOO_LISTENER has no default port.
Fixes#3936.
Right now we only force a new descriptor upload every 18 hours.
This can make servers become unlisted if they upload a descriptor at
time T which the authorities reject as being "too similar" to one
they uploaded before. Nothing will actually make the server upload a
new descriptor later on, until another 18 hours have passed.
This patch changes the upload behavior so that the 18 hour interval
applies only when we're listed in a live consensus with a descriptor
published within the last 18 hours. Otherwise--if we're not listed
in the live consensus, or if we're listed with a publication time
over 18 hours in the past--we upload a new descriptor every 90
minutes.
This is an attempted bugfix for #3327. If we merge it, it should
obsolete #535.
Conflicts:
src/or/connection.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/dnsserv.c
Some of these were a little tricky, since they touched code that
changed because of the prop171 fixes.
Add a "default" state which we use until we've decided whether we're
live or hibernating. This allows us to properly track whether we're
resuming a hibernation period or not. Fixes bug 2003.
For printf, %f and %lf are synonymous, since floats are promoted to
doubles when passed as varargs. It's only for scanf that we need to
say "%lf" for doubles and "%f" for floats.
Apparenly, some older compilers think it's naughty to say %lf and like
to spew warnings about it.
Found by grarpamp.
For bufferevents, we had all of connection_buckets_decrement() stubbed
out. But that's not actually right! The rephist_* parts were
essential for, inter alia, recording our own bandwidth. This patch
splits out the rephist parts of connection_buckets_decrement() into their
own function, and makes the bufferevent code call that new function.
Fixes bug 3803, and probably 3824 and 3826 too. Bugfix on 0.2.3.1-alpha.
Previously, if you were set up to use microdescriptors, and you
weren't a cache, you'd never fetch router descriptors (except for
bridges). Now FetchUselessDescriptors causes descriptors and
mirodescs to get cached. Also, FetchUselessDescriptors changes the
behavior of "UseMicrodescriptors auto" to be off, since there's no
point in saying "UseMicrodescriptors 1" when you have full descriptors
too.
Fix for bug 3851; bugfix on 0.2.3.1-alpha.
Because tunneled connections are implemented with buffervent_pair,
writing to them can cause an immediate flush. This means that
added to them and then checking to see whether their outbuf is
empty is _not_ an adequate way to see whether you added anything.
This caused a problem in directory server connections, since they
would try spooling a little more data out, and then close the
connection if there was no queued data to send.
This fix should improve matters; it only closes the connection if
there is no more data to spool, and all of the spooling callbacks
are supposed to put the dirconn into dir_spool_none on completion.
This is bug 3814; Sebastian found it; bugfix on 0.2.3.1-alpha.
When we're doing filtering ssl bufferevents, we want the rate-limits
to apply to the lowest level of the bufferevent stack, so that we're
actually limiting bytes sent on the network. Otherwise, we'll read
from the network aggressively, and only limit stuff as we process it.
This behavior is normal when we want more data than the evbuffer
actually has for us. We'll ask for (say) 7 bytes, get only 5
(because that's all there is), try to parse the 5 bytes, and get
told "no, I want 7". One option would be to bail out early whenever
want_length is > buflen, but sometimes we use an over-large
want_length. So instead, let's just remove the warning here: it's
not a bug after all.
* Use strcmpstart() instead of strcmp(x,y,strlen(y)).
* Warn the user if the managed proxy failed to launch.
* Improve function documentation.
* Use smartlist_len() instead of n_unconfigured_proxies.
* Split managed_proxy_destroy() to managed_proxy_destroy()
and managed_proxy_destroy_with_transports().
* Constification.
It turns out that it wasn't enough to set the configuration to
"auto", since the correct behavior for "auto" had been disabled in
microdesc.c. :p
(Hasn't been in a release yet, so doesn't need a changes entry.)
Now we track *which* stream with ISO_STREAM set is associated to a
particular circuit, so that we won't think that stream is incompatible
with its circuit and launch another one a second later, and we use that
same field to mark circuits which have had an ISO_STREAM stream attached
to them, so that we won't ever put a second stream on that circuit.
Fixes bug 3695.
Only write a bridge-stats string if bridge stats have been
initialized. This behavior is similar to dirreq-stats, entry-stats,
etc.
Also add a few unit tests for the bridge-stats code.
This patch separates the generation of a dirreq-stats string from
actually writing it to disk. The new geoip_format_dirreq_stats()
generates a dirreq-stats string that geoip_dirreq_stats_write() writes
to disk. All the state changing (e.g., resetting the dirreq-stats
history and initializing the next measurement interval) takes place in
geoip_dirreq_stats_write(). That allows us to finally test the
dirreq-stats code better.
Now that formatting the buffer-stats string is separate from writing
it to disk, we can also decouple the logic to extract stats from
circuits and finally write some unit tests for the history code.
The new rep_hist_format_buffer_stats() generates a buffer-stats string
that rep_hist_buffer_stats_write() writes to disk. All the state
changing (e.g., resetting the buffer-stats history and initializing
the next measurement interval) takes place in
rep_hist_buffer_stats_write(). That allows us to finally test the
buffer-stats code better.
So far, if we didn't see a single circuit, we refrained from
generating a cell-stats string and logged a warning. Nobody will
notice the warning, and people will wonder why there's no cell-stats
string in the extra-info descriptor. The better behavior is to
generate a cell-stats string with all zeros.
Right now, we append statistics to files in the stats/ directory for
half of the statistics, whereas we overwrite these files for the other
half. In particular, we append buffer, dirreq, and entry stats and
overwrite exit, connection, and bridge stats.
Appending to files was useful when we didn't include stats in extra-info
descriptors, because otherwise we'd have to copy them away to prevent
Tor from overwriting them.
But now that we include statistics in extra-info descriptors, it makes
no sense to keep the old statistics forever. We should change the
behavior to overwriting instead of appending for all statistics.
Implements #2930.
They *are* non-NUL-terminated, after all (and they have to be, since
the SOCKS5 spec allows them to contain embedded NULs. But the code
to implement proposal 171 was copying them with tor_strdup and
comparing them with strcmp_opt.
Fix for bug on 3683; bug not present in any yet-released version.
Previously we'd just looked at the connection type, but that's
always CONN_TYPE_AP. Instead, we should be looking at the type of
the listener that created the connection.
Spotted by rransom; fixes bug 3636.
We'll still need to tweak it so that it looks for includes and
libraries somewhere more sensible than "where we happened to find
them on Erinn's system"; so that tests and tools get built too;
so that it's a bit documented; and so that we actually try running
the output.
Work done with Erinn Clark.
The conflicts are with the proposal 171 circuit isolation code, and
they're all trivial: they're just a matter of both branches adding
some unrelated code in the same places.
Conflicts:
src/or/circuituse.c
src/or/connection.c
The problem was that we weren't initializing want_length to 0 before
calling parse_socks() the first time, so it looked like we were
risking an infinite loop when in fact we were safe.
Fixes 3615; bugfix on 0.2.3.2-alpha.
Back when I added this logic in 20c0581a79, the rule was that whenever
a circuit finished building, we cleared its isolation info. I did that
so that we would still use the circuit even if all the streams that
had previously led us to tentatively set its isolation info had closed.
But there were problems with that approach: We could pretty easily get
into a case where S1 had led us to launch C1 and S2 had led us to
launch C2, but when C1 finished, we cleared its isolation and attached
S2 first. Since C2 was still marked in a way that made S1
unattachable to it, we'd then launch another circuit needlessly.
So instead, we try the following approach now: when a circuit is done
building, we try to attach streams to it. If it remains unused after
we try attaching streams, then we clear its isolation info, and try
again to attach streams.
Thanks to Sebastian for helping me figure this out.
One-hop dirconn streams all share a session group, and get the
ISO_SESSIONGRP flag: they may share circuits with each other and
nothing else.
Anonymized dirconn streams get a new internal-use-only ISO_STREAM
flag: they may not share circuits with anything, including each other.
The new candidate rule, which arma suggested and I like, is that
the original address as received from the client connection or as
rewritten by the controller is the address that counts.
This is mainly meant as a way to keep clients from accidentally
DOSing themselves by (e.g.) enabling IsolateDestAddr or
IsolateDestPort on a port that they use for HTTP.
Our old "do we need to launch a circuit for stream S" logic was,
more or less, that if we had a pending circuit that could handle S,
we didn't need to launch a new one.
But now that we have streams isolated from one another, we need
something stronger here: It's possible that some pending C can
handle either S1 or S2, but not both.
This patch reuses the existing isolation logic for a simple
solution: when we decide during circuit launching that some pending
C would satisfy stream S1, we "hypothetically" mark C as though S1
had been connected to it. Now if S2 is incompatible with S1, it
won't be something that can attach to C, and so we'll launch a new
stream.
When the circuit becomes OPEN for the first time (with no streams
attached to it), we reset the circuit's isolation status. I'm not
too sure about this part: I wanted some way to be sure that, if all
streams that would have used a circuit die before the circuit is
done, the circuit can still get used. But I worry that this
approach could also lead to us launching too many circuits. Careful
thought needed here.
This is the meat of proposal 171: we change circuit_is_acceptable()
to require that the connection is compatible with every connection
that has been linked to the circuit; we update circuit_is_better to
prefer attaching streams to circuits in the way that decreases the
circuits' usefulness the least; and we update link_apconn_to_circ()
to do the appropriate bookkeeping.
The "nym epoch" of a stream is defined as the number of times that
NEWNYM had been called before the stream was opened. All streams
are isolated by nym epoch.
This feature should be redundant with existing signewnym stuff, but
it provides a good belt-and-suspenders way for us to avoid ever
letting any circuit type bypass signewnym.
This patch adds fields to track how streams should be isolated, and
ensures that those fields are set correctly. It also adds fields to
track what streams can go on a circuit, and adds functions to see
whether a streams can go on a circuit and update the circuit
accordingly. Those functions aren't yet called.
Proposal 171 gives us a new syntax for parsing client port options.
You can now have as many FooPort options as you want (for Foo in
Socks, Trans, DNS, NATD), and they can have address:port arguments,
and you can specify the level of isolation on those ports.
Additionally, this patch refactors the client port parsing logic to
use a new type, port_cfg_t. Previously, ports to be bound were
half-parsed in config.c, and later re-parsed in connection.c when
we're about to bind them. Now, parsing a port means converting it
into a port_cfg_t, and binding it uses only a port_cfg_t, without
needing to parse the user-provided strings at all.
We should do a related refactoring on other port types. For
control ports, that'll be easy enough. For ORPort and DirPort,
we'll want to do this when we solve proposal 118 (letting servers
bind to and advertise multiple ports).
This implements tickets 3514 and 3515.
This adds a little code complexity: we need to remember for each
node whether it supports the right feature, and then check for each
connection whether it's exiting at such a node. We store this in a
flag in the edge_connection_t, and set that flag at link time.
- Conform to make check-spaces
- Build without warnings from passing size_t to %d
- Use connection_get_inbuf_len(), not buf_datalen (otherwise bufferevents
won't work).
- Don't log that we're using this feature at warn.
Previously we were using router_get_by_id(foo) to test "do we have a
descriptor that will let us make an anonymous circuit to foo". But
that isn't right for microdescs: we should have been using node_t.
Fixes bug 3601; bugfix on 0.2.3.1-alpha.
Instead, use compare_tor_addr_to_node_policy everywhere.
One advantage of this is that compare_tor_addr_to_node_policy can
better distinguish 0.0.0.0 from "unknown", which caused a nasty bug
with microdesc users.
Previously, we had an issue where we'd treat an unknown address as
0, which turned into "0.0.0.0", which looked like a rejected
address. This meant in practice that as soon as we started doing
comparisons of unknown uint32 addresses to short policies, we'd get
'rejected' right away. Because of the circumstances under which
this would be called, it would only happen when we had local DNS
cached entries and we were looking to launch new circuits.
* Add some utility transport functions in circuitbuild.[ch] so that we
can use them from pt.c.
* Make the accounting system consider traffic coming from proxies.
* Make sure that we only fetch bridge descriptors when all the
transports are configured.
Rationale: right now there seems to be no way for our bootstrap
status to dip under 100% once it has reached 100%. Thus, recording
broken connections after that point is useless, and wastes memory.
If at some point in the future we allow our bootstrap level to go
backwards, then we should change this rule so that we disable
recording broken connection states _as long as_ the bootstrap status
is 100%.
- We were reporting the _bottom_ N failing states, not the top N.
- With bufferevents enabled, we logged all TLS states as being "in
bufferevent", which isn't actually informative.
- When we had nothing to report, we reported nothing too loudly.
- Also, we needed documentation.
This code lets us record the state of any outgoing OR connection
that fails before it becomes open, so we can notice if they're all
dying in the same SSL state or the same OR handshake state.
More work is still needed:
- We need documentation
- We need to actually call the code that reports the failure when
we realize that we're having a hard time connecting out or
making circuits.
- We need to periodically clear out all this data -- perhaps,
whenever we build a circuit successfully?
- We'll eventually want to expose it to controllers, perhaps.
Partial implementation of feature 3116.
It's very easy for nodelist_add_node_family(sl,node) to accidentally
add 'node', and kind of hard to make sure that it omits it. Instead
of taking pains to leave 'node' out, let's instead make sure that we
always include it.
I also rename the function to nodelist_add_node_and_family, and
audit its users so that they don't add the node itself any longer,
since the function will take care of that for them.
Resolves bug 2616, which was not actually a bug.
Previously, we'd just take all the nodes in EntryNodes, see which
ones were already in the guard list, and add the ones that weren't.
There were some problems there, though:
* We'd add _every_ entry in EntryNodes, and add them in the order
they appeared in the routerlist. This wasn't a problem
until we added the ability to give country-code or IP-range
entries in the EntryNodes set, but now that we did, it is.
(Fix: We now shuffle the entry nodes before adding them; only
add up to 10*NumEntryGuards)
* We didn't screen EntryNodes for the Guard flag. That's okay
if the user has specified two or three entry nodes manually,
but if they have listed a whole subcontinent, we should
restrict ourselves to the entries that are currently guards.
(Fix: separate out the new guard from the new non-guard nodes,
and add the Guards first.)
* We'd prepend new EntryNodes _before_ the already configured
EntryNodes. This could lead to churn.
(Fix: don't prepend these.)
This patch also pre-screens EntryNodes entries for
reachableaddresses/excludenodes, even though we check for that
later. This is important now, since we cap the number of entries
we'll add.
Just looking at the "latest" consensus could give us a microdesc
consensus, if microdescs were enabled. That would make us decide
that every routerdesc was unlisted in the latest consensus and drop
them all: Ouch.
Fixes bug 3113; bugfix on 0.2.3.1-alpha.
We have an invariant that a node_t should have an md only if it has
a routerstatus. nodelist_purge tried to preserve this by removing
all nodes without a routerstatus or a routerinfo. But this left
nodes with a routerinfo and a microdesc untouched, even if they had
a routerstatus.
Bug found by frosty_un.
All of the routerset_contains*() functions return 0 if their
routerset_t argument is NULL. Therefore, there's no point in
doing "if (ExcludeNodes && routerset_contains*(ExcludeNodes...))",
for example.
This patch fixes every instance of
if (X && routerstatus_contains*(X,...))
Note that there are other patterns that _aren't_ redundant. For
example, we *don't* want to change:
if (EntryNodes && !routerstatus_contains(EntryNodes,...))
Fixes#2797. No bug here; just needless code.
Previously, we'd get a new descriptor for free when
public_server_mode() changed, since it would count as
affects_workers, which would call init_keys(), which would make us
regenerate a new descriptor. But now that we fixed bug 3263,
init_keys() is no longer necessarily a new descriptor, and so we
need to make sure that public_server_mode() counts as a descriptor
transition.
The issue was that we overlooked the possibility of reverse DNS success
at the end of connection_ap_handshake_socks_resolved(). Issue discovered
by katmagic, thanks!
Returning a tristate is needless here; we can just use the yielded
transport/proxy_type field to tell whether there's a proxy, and have
the return indicate success/failure.
Also, store the proxy_type in the or_connection_t rather than letting
it get out of sync if a configuration reload happens between launching
the or_connection and deciding what to say with it.
- const-ify some transport_t pointers
- Remove a vestigial argument to parse_bridge_line
- Make it compile without warnings on my laptop with
--enable-gcc-warnings
We were using strncpy before, which isn't our style for stuff like
this.
This isn't a bug, though: before calling strncpy, we were checking
that strlen(src) was indeed == HEX_DIGEST_LEN, which is less than
sizeof(dst), so there was no way we could fail to NUL-terminate.
Still, strncpy(a,b,sizeof(a)) is an idiom that we ought to squash
everyplace.
Fixes CID #427.
Using strncpy meant that if listenaddress were ever >=
sizeof(sockaddr_un.sun_path), we would fail to nul-terminate
sun_path. This isn't a big deal: we never read sun_path, and the
kernel is smart enough to reject the sockaddr_un if it isn't
nul-terminated. Nonetheless, it's a dumb failure mode. Instead, we
should reject addresses that don't fit in sockaddr_un.sun_path.
Coverity found this; it's CID 428. Bugfix on 0.2.0.3-alpha.
When we rejected a descriptor for not being the one we wanted, we
were letting the parsed descriptor go out of scope.
Found by Coverity; CID # 30.
Bugfix on 0.2.1.26.
(No changes file yet, since this is not in any 0.2.1.x release.)
Every node_t has either a routerinfo_t or a routerstatus_t, so every
node_t *should* have a nickname. Nonetheless, let's make sure in
hex_digest_nickname_matches().
Should quiet CID 434.
This is a little error-prone when the local has a different type
from the parameter, and is very error-prone with both have the same
type. Let's not do this.
Fixes CID #437,438,439,440,441.
For some inexplicable reason, Coverity departs from its usual
standards of avoiding false positives here, and warns about all
sscanf usage, even when the formatting strings are totally safe.
Addresses CID # 447, 446.
Previously, fetch_from_buf_socks() might return 0 if there was still
data on the buffer and a subsequent call to fetch_from_buf_socks()
would return 1. This was making some of the socks5 unit tests
harder to write, and could potentially have caused misbehavior with
some overly verbose SOCKS implementations. Now,
fetch_from_buf_socks() does as much processing as it can, and
returns 0 only if it really needs more data. This brings it into
line with the evbuffer socks implementation.
We added this back in 0649fa14 in 2006, to deal with the case where
the client unconditionally sent us authentication data. Hopefully,
that's not needed any longer, since we now can actually parse
authentication data.
This change also requires us to add and use a pair of
allocator/deallocator functions for socks_request_t, instead of
using tor_malloc_zero/tor_free directly.
In the code as it stood, we would accept any number of socks5
username/password authentication messages, regardless of whether we
had actually negotiated username/password authentication. Instead,
we should only accept one, and only if we have really negotiated
username/password authentication.
This patch also makes some fields of socks_request_t into uint8_t,
for safety.
Multiple Bridge lines can point to the same one ClientTransportPlugin
line, and we can have multiple ClientTransportPlugin lines in our
configuration file that don't match with a bridge. We also issue a
warning when we have a Bridge line with a pluggable transport but we
can't match it to a ClientTransportPlugin line.
* Renamed transport_info_t to transport_t.
* Introduced transport_get_by_name().
* Killed match_bridges_with_transports().
We currently *don't* detect whether any bridges miss their transports,
of if any transports miss their bridges.
* Various code and aesthetic tweaks and English language changes.
A couple of places in control.c were using connection_handle_write()
to flush important stuff (the response to a SIGNAL command, an
ERR-level status event) before Tor went down. But
connection_handle_write() isn't meaningful for bufferevents, so we'd
crash.
This patch adds a new connection_flush() that works for all connection
backends, and makes control.c use that instead.
Fix for bug 3367; bugfix on 0.2.3.1-alpha.
debug-level since it will be quite common. logged at both client
and server side. this step should help us track what's going on
with people filtering tor connections by our ssl habits.
This lets us make a lot of other stuff const, allows the compiler to
generate (slightly) better code, and will make me get slightly fewer
patches from folks who stick mutable stuff into or_options_t.
const: because not every input is an output!
Original message from bug3393:
check_private_dir() to ensure that ControlSocketsGroupWritable is
safe to use. Unfortunately, check_private_dir() only checks against
the currently running user… which can be root until privileges are
dropped to the user and group configured by the User config option.
The attached patch fixes the issue by adding a new effective_user
argument to check_private_dir() and updating the callers. It might
not be the best way to fix the issue, but it did in my tests.
(Code by lunar; changelog by nickm)
* Improved function documentation.
* Renamed find_bridge_transport_by_addrport() to
find_transport_by_bridge_addrport().
* Sanitized log severities we use.
* Ran check-spaces.
When we set a networkstatus in the non-preferred flavor, we'd check
the time in the current_consensus. But that might have been NULL,
which could produce a crash as seen in bug 3361.
When we added the check for key size, we required that the keys be
128 bytes. But RSA_size (which defers to BN_num_bytes) will return
128 for keys of length 1017..1024. This patch adds a new
crypto_pk_num_bits() that returns the actual number of significant
bits in the modulus, and uses that to enforce key sizes.
Also, credit the original bug3318 in the changes file.
UseBridges 1 now means "connect only to bridges; if you know no
bridges, don't make connections." UseBridges auto means "Use bridges
if they are known, and we have no EntryNodes set, and we aren't a
server." UseBridges 0 means "don't use bridges."
This merge was a bit nontrivial, since I had to write a new
node_is_a_configured_bridge to parallel router_is_a_configured_bridge.
Conflicts:
src/or/circuitbuild.c
options->DirPort is 0 in the unit tests, so
router_get_advertised_dir_port() would return 0 so we wouldn't pick a
dirport. This isn't what we want for the unit tests. Fixes bug
introduced in 95ac3ea594.
Previously, Tor would dereference a NULL pointer and crash if
lookup_last_hid_serv_request were called before the first call to
directory_clean_last_hid_serv_requests. As far as I can tell, that's
currently impossible, but I want that undocumented invariant to go away
in case I^Wwe break it someday.
If set to 1, Tor will attempt to prevent basic debugging
attachment attempts by other processes. (Default: 1)
Supports Mac OS X and Gnu/Linux.
Sebastian provided useful feedback and refactoring suggestions.
Signed-off-by: Jacob Appelbaum <jacob@appelbaum.net>
When we introduced NEED_KEY_1024 in routerparse.c back in
0.2.0.1-alpha, I forgot to add a *8 when logging the length of a
bad-length key.
Bugfix for 3318 on 0.2.0.1-alpha.
The patch for 3228 made us try to run init_keys() before we had loaded
our state file, resulting in an assert inside init_keys. We had moved
it too early in the function.
Now it's later in the function, but still above the accounting calls.
The conflicts were mainly caused by the routerinfo->node transition.
Conflicts:
src/or/circuitbuild.c
src/or/command.c
src/or/connection_edge.c
src/or/directory.c
src/or/dirserv.c
src/or/relay.c
src/or/rendservice.c
src/or/routerlist.c
The comment fixes are trivial. The defensive programming trick is to
tolerate receiving NULL inputs on the describe functions. That should
never actually happen, but it seems like the likeliest mistake for us
to make in the future.
Previously we did this nearer to the end (in the old_options &&
transition_affects_workers() block). But other stuff cares about
keys being consistent with options... particularly anything which
tries to access a key, which can die in assert_identity_keys_ok().
Fixes bug 3228; bugfix on 0.2.2.18-alpha.
Fixes part of #1297; bugfix on 48e0228f1e,
when circuit_expire_building was changed to assume that timestamp_dirty
was set when a circuit changed purpose to _C_REND_READY. (It wasn't.)
The previous attempt was incomplete: it told us not to publish a
descriptor, but didn't stop us from generating one. Now we treat an
absent OR port the same as not knowing our address. (This means
that when we _do_ get an OR port, we need to mark the descriptor
dirty.)
More attempt to fix bug3216.
This situation can happen easily if you set 'ORPort auto' and
'AccountingMax'. Doing so means that when you have no ORPort, you
won't be able to set an ORPort in a descriptor, so instead you would
just generate lots of invalid descriptors, freaking out all the time.
Possible fix for 3216; fix on 0.2.2.26-beta.
We had all the code in place to handle this right... except that we
were unconditionally opening a PF_INET socket instead of looking at
sa_family. Ow.
Fixes bug 2574; not a bugfix on any particular version, since this
never worked before.
Most instances were dead code; for those, I removed the assignments.
Some were pieces of info we don't currently plan to use, but which
we might in the future. For those, I added an explicit cast-to-void
to indicate that we know that the thing's unused. Finally, one was
a case where we were testing the wrong variable in a unit test.
That one I fixed.
This resolves bug 3208.
On win64, sockets are of type UINT_PTR; on win32 they're u_int;
elsewhere they're int. The correct windows way to check a socket for
being set is to compare it with INVALID_SOCKET; elsewhere you see if
it is negative.
On Libevent 2, all callbacks take sockets as evutil_socket_t; we've
been passing them int.
This patch should fix compilation and correctness when built for
64-bit windows. Fixes bug 3270.
We used to regenerate our descriptor whenever we'd get a sighup. This
was caused by a bug in options_transition_affects_workers() that would
return true even if the options were exactly the same. Down the call
path we'd call init_keys(), which made us make a new descriptor which
the authorities would reject, and the node would subsequently fall out
of the consensus.
This patch fixes only the first part of this bug:
options_transition_affects_workers() behaves correctly now. The second
part still wants a fix.
tor_process_monitor_new can't currently return NULL, but if it ever can,
we want that to be an explicitly fatal error, without relying on the fact
that monitor_owning_controller_process's chain of caller will exit if it
fails.
When we configure a new bridge via the controller, don't wait up to ten
seconds before trying to fetch its descriptor. This wasn't so bad when
you listed your bridges in torrc, but it's dreadful if you configure
your bridges via vidalia.
Bumped the char maximum to 512 for HTTPProxyAuthenticator &
HTTPSProxyAuthenticator. Now stripping all '\n' after base64
encoding in alloc_http_authenticator.
Rename crypto_pk_check_key_public_exponent to crypto_pk_public_exponent_ok:
it's nice to name predicates s.t. you can tell how to interpret true
and false.
Fixed a trivial conflict where this and the ControlSocketGroupWritable
code both added different functions to the same part of connection.c.
Conflicts:
src/or/connection.c
This patch introduces a few new functions in router.c to produce a
more helpful description of a node than its nickame, and then tweaks
nearly all log messages taking a nickname as an argument to call these
functions instead.
There are a few cases where I left the old log messages alone: in
these cases, the nickname was that of an authority (whose nicknames
are useful and unique), or the message already included an identity
and/or an address. I might have missed a couple more too.
This is a fix for bug 3045.
When running a system-wide instance of Tor on Unix-like systems, having
a ControlSocket is a quite handy mechanism to access Tor control
channel. But it would be easier if access to the Unix domain socket can
be granted by making control users members of the group running the Tor
process.
This change introduces a UnixSocketsGroupWritable option, which will
create Unix domain sockets (and thus ControlSocket) 'g+rw'. This allows
ControlSocket to offer same access control measures than
ControlPort+CookieAuthFileGroupReadable.
See <http://bugs.debian.org/552556> for more details.
This code changes it so that we don't remove bridges immediately when
we start re-parsing our configuration. Instead, we mark them all, and
remove all the marked ones after re-parsing our bridge lines. As we
add a bridge, we see if it's already in the list. If so, we just
unmark it.
This new behavior will lose the property we used to have that bridges
were in bridge_list in the same order in which they appeared in the
torrc. I took a quick look through the code, and I'm pretty sure we
didn't actually depend on that anywhere.
This is for bug 3019; it's a fix on 0.2.0.3-alpha.
rransom notes correctly that now that we aren't checking our HSDir
flag, we have no actual reason to check whether we are listed in the
consensus at all when determining if we should act like a hidden
service directory.
Previously, if they changed in torrc during a SIGHUP, all was well,
since we would just clear all transient entries from the addrmap
thanks to bug 1345. But if you changed them from the controller, Tor
would leave old mappings in place.
The VirtualAddrNetwork bug has been here since 0.1.1.19-rc; the
AutomapHosts* bug has been here since 0.2.0.1-alpha.
This bug couldn't happen when TrackExitHosts changed in torrc, since
the SIGHUP to reload the torrc would clear out all the transient
addressmap entries before. But if you used SETCONF to change
TrackExitHosts, old entries would be left alone: that's a bug, and so
this is a bugfix on Tor 0.1.0.1-rc.
If you really want to purge the client DNS cache, the TrackHostExits
mappings, and the virtual address mappings, you should be using NEWNYM
instead.
Fixes bug 1345; bugfix on Tor 0.1.0.1-rc.
Note that this needs more work: now that we aren't nuking the
transient addressmap entries on HUP, we need to make sure that
configuration changes to VirtualAddressMap and TrackHostExits actually
have a reasonable effect.
We'll eventually want to do more work here to make sure that the ports
are stable over multiple invocations. Otherwise, turning your node on
and off will get you a new DirPort/ORPort needlessly.
Otherwise, it will just immediately close any port declared with "auto"
on the grounds that it wasn't configured. Now, it will allow "auto" to
match any port.
This means FWIW if you configure a socks port with SocksPort 9999
and then transition to SocksPort auto, the original socksport will
not get closed and reopened. I'm considering this a feature.
HTTPS error code 403 is now reported as:
"The https proxy refused to allow connection".
Used a switch statement for additional error codes to be explained
in the future.
On IRC, wanoskarnet notes that if we ever do microdesc_free() on a
microdesc that's in the nodelist, we're in trouble. Also, we're in
trouble if we free one that's still in the microdesc_cache map.
This code adds a flag to microdesc_t to note where the microdesc is
referenced from, and checks those flags from microdesc_free(). I
don't believe we have any errors here now, but if we introduce some
later, let's log and recover from them rather than introducing
heisenbugs later on.
Addresses bug 3153.
The old behavior contributed to unreliability when hidden services and
hsdirs had different consensus versions, and so had different opinions
about who should be cacheing hsdir info.
Bugfix on 0.2.0.10-alpha; based on discussions surrounding bug 2732.
The new behavior is to try to rename the old file if there is one there
that we can't read. In all likelihood, that will fail too, but at least
we tried, and at least it won't crash.
Conflicts in various places, mainly node-related. Resolved them in
favor of HEAD, with copying of tor_mem* operations from bug3122_memcmp_022.
src/common/Makefile.am
src/or/circuitlist.c
src/or/connection_edge.c
src/or/directory.c
src/or/microdesc.c
src/or/networkstatus.c
src/or/router.c
src/or/routerlist.c
src/test/test_util.c
Conflicts throughout. All resolved in favor of taking HEAD and
adding tor_mem* or fast_mem* ops as appropriate.
src/common/Makefile.am
src/or/circuitbuild.c
src/or/directory.c
src/or/dirserv.c
src/or/dirvote.c
src/or/networkstatus.c
src/or/rendclient.c
src/or/rendservice.c
src/or/router.c
src/or/routerlist.c
src/or/routerparse.c
src/or/test.c
Here I looked at the results of the automated conversion and cleaned
them up as follows:
If there was a tor_memcmp or tor_memeq that was in fact "safe"[*] I
changed it to a fast_memcmp or fast_memeq.
Otherwise if there was a tor_memcmp that could turn into a
tor_memneq or tor_memeq, I converted it.
This wants close attention.
[*] I'm erring on the side of caution here, and leaving some things
as tor_memcmp that could in my opinion use the data-dependent
fast_memcmp variant.
Make that explicit by adding an assert and removing a null-check. All of
its callers currently depend on the argument being non-null anyway.
Silences a few clang complaints.
The analyzer assumed that bootstrap_percent could be less than 0 when we
call control_event_bootstrap_problem(), which would mean we're calling
log_fn() with undefined values. The assert makes it clear this can't
happen.
When configure tor with --enable-bufferevents and
--enable-static-libevent, libevent_openssl would still be linked
dynamically. Fix this and refactor src/or/Makefile.am along the way.
To make sure that a server learns if its IP has changed, the server
sometimes launches authority.z descriptor fetches from
update_router_descriptor_downloads. That's nice, but we're moving
towards a situation where update_router_descriptor_downloads doesn't
always get called. So this patch breaks the authority.z
check-and-fetch into a new function.
This function also renames last_routerdesc_download to a more
appropriate last_descriptor_download, and adds a new
update_all_descriptor_downloads() function.
(For now, this is unnecessary, since servers don't actually use
microdescriptors. But that could change, or bridges could start
using microdescriptors, and then we'll be glad this is refactored
nicely.)
To turn this on, set UseMicrodescriptors to "1" (or "auto" if you
want it on-if-you're-a-client). It should go auto-by-default once
0.2.3.1-alpha is released.
Because of our node logic, directory caches will never use
microdescriptors when they have the right routerinfo available.
See bug 2850 for rationale: it appears that on some busy exits, the OS
decides that every single port is now unusable because they have been
all used too recently.
Previously we ensured that it would get called periodically by doing
it from inside the code that added microdescriptors. That won't work
though: it would interfere with our code that tried to read microdescs
from disk initially. Instead, we should consider rebuilding the cache
periodically, and on startup.
Previously on 0.2.2, we'd never clean the cache. Now that we can
clean it, we want to add a condition to rebuild it: that should happen
whenever we have dropped enough microdescriptors that we could save a
lot of space.
No changes file, since 0.2.3 doesn't need one and 0.2.2 already has some
changes files for the backport of the microdesc_clean_cahce() function.
n_supported[i] has a random value prior to initialization, so a node
that doesn't have routerinfo available can have a random priority.
Patch contributed by wanoskarnet from #tor. Thanks!
Instead of just saying "boogity boogity!" let's actually warn people
that they need to configure stuff right to be safe, and point them
at instructions for how to do that.
Resolves bug 2474.
Clients and relays haven't used them since early 0.2.0.x. The only
remaining use by authorities learning about new relays ahead of scedule;
see proposal 147 for what we intend to do about that.
We're leaving in an option (FetchV2Networkstatus) to manually fetch v2
networkstatuses, because apparently dnsel and maybe bwauth want them.
This fixes bug 3022.
Previously it would erroneously return true if ListenAddr was set for
a client port, even if that port itself was 0. This would give false
positives, which were not previously harmful... but which were about
to become.
A v0 HS authority stores v0 HS descriptors in the same descriptor
cache that its HS client functionality uses. Thus, if the HS
authority operator clears its client HS descriptor cache, ALL v0
HS descriptors will be lost. That would be bad.
If the user sent a SIGNAL NEWNYM command after we fetched a rendezvous
descriptor, while we were building the introduction-point circuit, we
would give up entirely on trying to connect to the hidden service.
Original patch by rransom slightly edited to go into 0.2.1
Previously, it would remove every trackhostexits-derived mapping
*from* xyz.<exitname>.exit; it was supposed to remove every
trackhostexits-derived mapping *to* xyz.<exitname>.exit.
Bugfix on 0.2.0.20-rc: fixes an XXX020 added while staring at bug-1090
issues.
Resolved conflicts in:
doc/tor.1.txt
src/or/circuitbuild.c
src/or/circuituse.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/directory.c
src/or/rendclient.c
src/or/routerlist.c
src/or/routerlist.h
These were mostly releated to the routerinfo_t->node_t conversion.
Now we believe it to be the case that we never build a circuit for our
stream that has an unsuitable exit, so we'll never need to use such
a circuit. The risk is that we have some code that builds the circuit,
but now we refuse to use it, meaning we just build a bazillion circuits
and ignore them all.
This looked at first like another fun way around our node selection
logic: if we had introduction circuits, and we wound up building too
many, we would turn extras into general-purpose circuits. But when we
did so, we wouldn't necessarily check whether the general-purpose
circuits conformed to our node constraints. For example, the last
node could totally be in ExcludedExitNodes and we wouldn't have cared...
...except that the circuit should already be internal, so it won't get user
streams attached to it, so the transition should generally be allowed.
Add an assert to make sure we're right about this, and have it not
check whether ExitNodes is set, since that's irrelevant to internal
circuits.
IOW, if we were using TrackExitHosts, and we added an excluded node or
removed a node from exitnodes, we wouldn't actually remove the mapping
that points us at the new node.
Also, note with an XXX022 comment a place that I think we are looking
at the wrong string.
The routerset_equal function explicitly handles NULL inputs, so
there's no need to check inputs for NULL before calling it.
Also fix a bug in routerset_equal where a non-NULL routerset with no
entries didn't get counted as equal to a NULL routerset. This was
untriggerable, I think, but potentially annoying down the road.
ExcludeExitNodes foo now means that foo.exit doesn't work. If
StrictNodes is set, then ExcludeNodes foo also overrides foo.exit.
foo.exit , however, still works even if foo is not listed in ExitNodes.
This once maybe made sense when ExitNodes meant "Here are 3 exits;
use them all", but now it more typically means "Here are 3
countries; exit from there." Using non-Fast/Stable exits created a
potential partitioning opportunity and an annoying stability
problem.
(Don't worry about the case where all of our ExitNodes are non-Fast
or non-Stable: we handle that later in the function by retrying with
need_capacity and need_uptime set to 0.)
If we're picking a random directory node, never pick an excluded one.
But if we've chosen a specific one (or all), allow it unless strictnodes
is set (in which case warn so the user knows it's their fault).
When warning that we won't connect to a strictly excluded node,
log what it was we were trying to do at that node.
When ExcludeNodes is set but StrictNodes is not set, we only use
non-excluded nodes if we can, but fall back to using excluded nodes
if none of those nodes is usable.
This is a tweak to the bug2917 fix. Basically, if we want to simulate
a signal arriving in the controller, we shouldn't have to pretend that
we're Libevent, or depend on how Tor sets up its Libevent callbacks.
The last entry of the *Maxima values in the state file was inflated by a
factor of NUM_SECS_ROLLING_MEASURE (currently 10). This could lead to
a wrong maximum value propagating through the state file history.
When reading the bw history from the state file, we'd add the 900-second
value as traffic that occured during one second. Fix that by adding the
average value to each second.
This bug was present since 0.2.0.5-alpha, but was hidden until
0.2.23-alpha when we started using the saved values.
Some tor relays would report lines like these in their extrainfo
documents:
dirreq-write-history 2011-03-14 16:46:44 (900 s)
This was confusing to some people who look at the stats. It would happen
whenever a relay first starts up, or when a relay has dirport disabled.
Change this so that lines without actual bw entries are omitted.
Implements ticket 2497.
While doing so, get rid of the now unnecessary function
control_signal_act().
Fixes bug 2917, reported by Robert Ransom. Bugfix on commit
9b4aa8d2ab. This patch is loosely based on
a patch by Robert (Changelog entry).
- Document the structure and variables.
- Make circuits_for_buffer_stats into a static variable.
- Don't die horribly if interval_length is 0.
- Remove the unused local_circ_id field.
- Reorder the fields of circ_buffer_stats_t for cleaner alignment layout.
Instead of answering GETINFO requests about our geoip usage only after
running for 24 hours, this patch makes us answer GETINFO requests
immediately. We still round and quantize as before.
Implements bug2711.
Also, refactor the heck out of the bridge usage formatting code. No
longer should we need to do a generate-parse-and-regenerate cycle to
get the controller string, and that lets us simplify the code a lot.
- Document it in the manpage
- Add a changes entry
- No need to log when it is set: we don't log for other options.
- Use doxygen to document the new flag.
- Test truth of C variables with "if (x)", not "if (x == 1)".
- Simplify a complex boolean expression by breaking it up.
We've got millisecond timers now, we might as well use them.
This change won't actually make circuits get expiered with microsecond
precision, since we only call the expiry functions once per second.
Still, it should avoid the situation where we have a circuit get
expired too early because of rounding.
A couple of the expiry functions now call tor_gettimeofday: this
should be cheap since we're only doing it once per second. If it gets
to be called more often, though, we should onsider having the current
time be an argument again.
Since svn r1475/git 5b6099e8 in tor-0.0.6, we have responded to an
exhaustion of all 65535 stream IDs on a circuit by marking that
circuit for close. That's not the right response. Instead, we
should mark the circuit as "too dirty for new circuits".
Of course in reality this isn't really right either. If somebody
has managed to cram 65535 streams onto a circuit, the circuit is
probably not going to work well for any of those streams, so maybe
we should be limiting the number of streams on an origin circuit
concurrently.
Also, closing the stream in this case is probably the wrong thing to
do as well, but fixing that can also wait.
We fixed bug 539 (where directories would say "503" but send data
anyway) back in 0.2.0.16-alpha/0.1.2.19. Because most directory
versions were affected, we added workaround to make sure that we
examined the contents of 503-replies to make sure there wasn't any
data for them to find. But now that such routers are nonexistent,
we can remove this code. (Even if somebody fired up an 0.1.2.19
directory cache today, it would still be fine to ignore data in its
erroneous 503 replies.)
The first was genuinely impossible, I think: it could only happen
when the amount we read differed from the amount we wanted to read
by more than INT_MAX.
The second is just very unlikely: it would give incorrect results to
the controller if you somehow wrote or read more than 4GB on one
edge conn in one second. That one is a bugfix on 0.1.2.8-beta.
In afe414 (tor-0.1.0.1-rc~173), when we moved to
connection_edge_end_errno(), we used it in handling errors from
connection_connect(). That's not so good, since by the time
connection_connect() returns, the socket is no longer set, and we're
supposed to be looking at the socket_errno return value from
connection_connect() instead. So do what we should've done, and
look at the socket_errno value that we get from connection_connect().
Ian's original message:
The current code actually correctly handles queued data at the
Exit; if there is queued data in a EXIT_CONN_STATE_CONNECTING
stream, that data will be immediately sent when the connection
succeeds. If the connection fails, the data will be correctly
ignored and freed. The problem with the current server code is
that the server currently drops DATA cells on streams in the
EXIT_CONN_STATE_CONNECTING state. Also, if you try to queue data
in the EXIT_CONN_STATE_RESOLVING state, bad things happen because
streams in that state don't yet have conn->write_event set, and so
some existing sanity checks (any stream with queued data is at
least potentially writable) are no longer sound.
The solution is to simply not drop received DATA cells while in
the EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME
cells in this state, so that the OP cannot send more than one
window's worth of data to be queued at the Exit. Finally, patch
the sanity checks so that streams in the EXIT_CONN_STATE_RESOLVING
state that have buffered data can pass.
[...] Here is a simple patch. It seems to work with both regular
streams and hidden services, but there may be other corner cases
I'm not aware of. (Do streams used for directory fetches, hidden
services, etc. take a different code path?)
Right now, we only consider sending stream-level SENDME cells when we
have completely flushed a connection_edge's outbuf, or when it sends
us a DATA cell. Neither of these is ideal for throughput.
This patch changes the behavior so we now call
connection_edge_consider_sending_sendme when we flush _some_ data from
an edge outbuf.
Fix for bug 2756; bugfix on svn r152.
Resolved nontrivial conflict around rewrite_x_address_for_bridge and
learned_bridge_descriptor. Now, since leanred_bridge_descriptor works
on nodes, we must make sure that rewrite_node_address_for_bridge also
works on nodes.
Conflicts:
src/or/circuitbuild.c
hid_serv_responsible_for_desc_id's return value is never negative, and
there is no need to search through the consensus to find out whether we
are responsible for a descriptor ID before we look in our cache for a
descriptor.
Name the magic value "10" rather than re-deriving it.
Comment more.
Use the pattern that works for periodic timers, not the pattern that
doesn't work. ;)
It is important to verify the uptime claim of a relay instead of just
trusting it, otherwise it becomes too easy to blackhole a specific
hidden service. rephist already has data available that we can use here.
Bugfix on 0.2.0.10-alpha.
Partial backport of daa0326aaa .
Resolves bug 2402. Bugfix on 0.2.1.15 (for the part where we switched to
git) and on 0.2.1.30 (for the part where we dumped micro-revisions.)
The calculation of when to send the logmessage was correct, but we
didn't give the correct number of relays required: We want more than
half of all authorities we know about. Fixes bug 2663.
This fixes a remotely triggerable assert on directory authorities, who
don't handle descriptors with ipv6 contents well yet. We will want to
revert this once we're ready to handle ipv6.
Issue raised by lorth on #tor, who wasn't able to use Tor anymore.
Analyzed with help from Christian Fromme. Fix suggested by arma. Bugfix
on 0.2.1.3-alpha.
This should fix a bug that special ran into, where if your state file
didn't record period maxima, it would never decide that it had
successfully parsed itself unless you got lucky with your
uninitialized-variable values.
This patch also tries to improve error messags in the case where a
maximum value legitimately doesn't parse.
In private networks, the defaults for some options are changed. This
means that in options_validate(), where we're testing that the defaults
are what we think they are, we fail. Use a workaround by setting a
hidden configuration option _UsingTestingTorNetwork when we have altered
the configuration this way, so that options_validate() can do the right
thing.
Fixes bug 2250, bugfix on 0.2.1.2-alpha (the version introducing private
network options).
Changed received_netinfo_from_trusted_dir into a
tristate in order to keep track of whether we have
already tried contacting a trusted dir. So we don't
send multiple requests if we get a bunch of skews.
The underlying fix is to stop indicating requests "ns" consensuses by
putting NULL in their requested_resource field: we already had a
specialized meaning for requested_resource==NULL, which was (more or
less) "Treat a failure here as a network failure, since it's too early
to possibly be a resource or directory failure." Overloading the two
meant that very early microdesc consensus download failures would get
treated as ns consensus download failures, so the failure count there
would get incremented, but the microdesc download would get retried
immediately in an infinite loop.
Fix for bug2381. Diagnosed by mobmix.
Sets:
* Documentation
* Logging domain
* Configuration option
* Scheduled event
* Makefile
It also creates status.c and the log_heartbeat() function.
All code was written by Sebastian Hahn. Commit message was
written by me (George Kadianakis).
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
(Backport from 0.2.2's 5ed73e3807)
rransom noticed that a change of ORPort is just as bad as a change of IP
address from a client's perspective, because both mean that the relay is
not available to them while the new information hasn't propagated.
Change the bug1035 fix accordingly.
Also make sure we don't log a bridge's IP address (which might happen
when we are the bridge authority).
It is often not entirely clear what options Tor was built with, so it
might not be immediately obvious which config file Tor is using when it
found one. Log the config file at startup.
When calling circuit_build_times_shuffle_and_store_array, we were
passing a uint32_t as an int. arma is pretty sure that this can't
actually cause a bug, because of checks elsewhere in the code, but
it's best not to pass a uint32_t as an int anyway.
Found by doorss; fix on 0.2.2.4-alpha.
We detect and reject said attempts if there is no chosen exit node or
circuit: connecting to a private addr via a randomly chosen exit node
will usually fail (if all exits reject private addresses), is always
ill-defined (you're not asking for any particular host or service),
and usually an error (you've configured all requests to go over Tor
when you really wanted to configure all _remote_ requests to go over
Tor).
This can also help detect forwarding loop requests.
Found as part of bug2279.
Left circuit_build_times_get_bw_scale() uncommented because it is in the wrong
place due to an improper bug2317 fix. It needs to be moved and renamed, as it
is not a cbt parameter.
To quote arma: "So instead of stopping your CBT from screaming, you're just
going to throw it in the closet and hope you can't hear it?"
Yep. The log message can happen because at 95% point on the curve, we can be
way beyond the max timeout we've seen, if the curve has few points and is
shallow.
Also applied Nick's rule of thumb for rewriting some other notice log messages
to read like how you would explain them to a raving lunatic on #tor who was
shouting at you demanding what they meant. Hopefully the changes live up to
that standard.
If we got a signed digest that was shorter than the required digest
length, but longer than 20 bytes, we would accept it as long
enough.... and then immediately fail when we want to check it.
Fixes bug 2409; bug in 0.2.2.20-alpha; found by piebeer.
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
When we stopped using svn, 0.2.1.x lost the ability to notice its svn
revision and report it in the version number. However, it kept
looking at the micro-revision.i file... so if you switched to master,
built tor, then switched to 0.2.1.x, you'd get a micro-revision.i file
from master reported as an SVN tag. This patch takes out the "include
the svn tag" logic entirely.
Bugfix on 0.2.1.15-rc; fixes bug 2402.
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
We need to make sure that the worst thing that a weird consensus param
can do to us is to break our Tor (and only if the other Tors are
reliably broken in the same way) so that the majority of directory
authorities can't pull any attacks that are worse than the DoS that
they can trigger by simply shutting down.
One of these worse things was the cbtnummodes parameter, which could
lead to heap corruption on some systems if the value was sufficiently
large.
This commit fixes this particular issue and also introduces sanity
checking for all consensus parameters.
Our public key functions assumed that they were always writing into a
large enough buffer. In one case, they weren't.
(Incorporates fixes from sebastian)
In dnsserv_resolved(), we carefully made a nul-terminated copy of the
answer in a PTR RESOLVED cell... then never used that nul-terminated
copy. Ouch.
Surprisingly this one isn't as huge a security problem as it could be.
The only place where the input to dnsserv_resolved wasn't necessarily
nul-terminated was when it was called indirectly from relay.c with the
contents of a relay cell's payload. If the end of the payload was
filled with junk, eventdns.c would take the strdup() of the name [This
part is bad; we might crash there if the cell is in a bad part of the
stack or the heap] and get a name of at least length
495[*]. eventdns.c then rejects any name of length over 255, so the
bogus data would be neither transmitted nor altered.
[*] If the name was less than 495 bytes long, the client wouldn't
actually be reading off the end of the cell.
Nonetheless this is a reasonably annoying bug. Better fix it.
Found while looking at bug 2332, reported by doorss. Bugfix on
0.2.0.1-alpha.
Previously, we only looked at up to 128 bytes. This is a bad idea
since socks messages can be at least 256+x bytes long. Now we look at
up to 512 bytes; this should be enough for 0.2.2.x to handle all valid
SOCKS messages. For 0.2.3.x, we can think about handling trickier
cases.
Fixes 2330. Bugfix on 0.2.0.16-alpha.
Right now, Tor routers don't save the maxima values from the
bw_history_t between sessions. That's no good, since we use those
values to determine bandwidth. This code adds a new BWHist.*Maximum
set of values to the state file. If they're not present, we estimate
them by taking the observed total bandwidth and dividing it by the
period length, which provides a lower bound.
This should fix bug 1863. I'm calling it a feature.
Previously, our state parsing code would fail to parse a bwhist
correctly if the Interval was anything other than the default
hardcoded 15 minutes. This change makes the parsing less incorrect,
though the resulting history array might get strange values in it if
the intervals don't match the one we're using. (That is, if stuff was
generated in 15 minute intervals, and we read it into an array that
expects 30 minute intervals, we're fine, since values can be combined
pairwise. But if we generate data at 30 minute intervals and read it
into 15 minute intervals, alternating buckets will be empty.)
Bugfix on 0.1.1.11-alpha.
The trick of looping from i=0..4 , switching on i to set up some
variables, then running some common code is much better expressed by
just calling a function 4 times with 4 sets of arguments. This should
make the code a little easier to follow and maintain here.
An object, you'll recall, is something between -----BEGIN----- and
-----END----- tags in a directory document. Some of our code, as
doorss has noted in bug 2352, could assert if one of these ever
overflowed SIZE_T_CEILING but not INT_MAX. As a solution, I'm setting
a maximum size on a single object such that neither of these limits
will ever be hit. I'm also fixing the INT_MAX checks, just to be sure.
When using libevent 2, we use evdns_base_resolve_*(). When not, we
fake evdns_base_resolve_*() using evdns_resolve_*().
Our old check was looking for negative values (like libevent 2
returns), but our eventdns.c code returns 1. This code makes the
check just test for nonzero.
Note that this broken check was not for _resolve_ failures or even for
failures to _launch_ a resolve: it was for failures to _create_ or
_encode_ a resolve request.
Bug introduced in 81eee0ecfff3dac1e9438719d2f7dc0ba7e84a71; found by
lodger; uploaded to trac by rransom. Bug 2363. Fix on 0.2.2.6-alpha.
This was originally a patch provided by pipe
(http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html) to
provide a method for controllers to query the total amount of traffic
tor has handled (this is a frequently requested piece of information
by relay operators).
C99 allows a syntax for structures whose last element is of
unspecified length:
struct s {
int elt1;
...
char last_element[];
};
Recent (last-5-years) autoconf versions provide an
AC_C_FLEXIBLE_ARRAY_MEMBER test that defines FLEXIBLE_ARRAY_MEMBER
to either no tokens (if you have c99 flexible array support) or to 1
(if you don't). At that point you just use offsetof
[STRUCT_OFFSET() for us] to see where last_element begins, and
allocate your structures like:
struct s {
int elt1;
...
char last_element[FLEXIBLE_ARRAY_MEMBER];
};
tor_malloc(STRUCT_OFFSET(struct s, last_element) +
n_elements*sizeof(char));
The advantages are:
1) It's easier to see which structures and elements are of
unspecified length.
2) The compiler and related checking tools can also see which
structures and elements are of unspecified length, in case they
wants to try weird bounds-checking tricks or something.
3) The compiler can warn us if we do something dumb, like try
to stack-allocate a flexible-length structure.
We were not decrementing "available" every time we did
++next_virtual_addr in addressmap_get_virtual_address: we left out the
--available when we skipped .00 and .255 addresses.
This didn't actually cause a bug in most cases, since the failure mode
was to keep looping around the virtual addresses until we found one,
or until available hit zero. It could have given you an infinite loop
rather than a useful message, however, if you said "VirtualAddrNetwork
127.0.0.255/32" or something broken like that.
Spotted by cypherpunks
We were decrementing "available" twice for each in-use address we ran
across. This would make us declare that we ran out of virtual
addresses when the address space was only half full.
evbuffer_pullup does nothing and returns NULL if the caller asks it to
linearize more data than the buffer contains.
Introduced in 9796b9bfa6.
Reported by piebeer; fixed with help from doors.
If a SOCKS5 client insists on authentication, allow it to
negotiate a connection with Tor's SOCKS server successfully.
Any credentials the client provides are ignored.
This allows Tor to work with SOCKS5 clients that can only
support 'authenticated' connections.
Also add a bunch of basic unit tests for SOCKS4/4a/5 support
in buffers.c.
If you had TIME_MAX > INT_MAX, and your "time_to_exhaust_bw =
accountingmax/expected_bandwidth_usage * 60" calculation managed to
overflow INT_MAX, then your time_to_consider value could underflow and
wind up being rediculously low or high. "Low" was no problem;
negative values got caught by the (time_to_consider <= 0) check.
"High", however, would get you a wakeup time somewhere in the distant
future.
The fix is to check for time_to_exhaust_bw overflowing INT_MAX, not
TIME_MAX: We don't allow any accounting interval longer than a month,
so if time_to_exhaust_bw is significantly larger than 31*24*60*60, we
can just clip it.
This is a bugfix on 0.0.9pre6, when accounting was first introduced.
It fixes bug 2146, unless there are other causes there too. The fix
is from boboper. (I tweaked it slightly by removing an assignment
that boboper marked as dead, and lowering a variable that no longer
needed to be function-scoped.)
The old logic would have us fetch from authorities if we were refusing
unknown exits and our exit policy was reject*. Instead, we want to
fetch from authorities if we're refusing unknown exits and our exit
policy is _NOT_ reject*.
Fixed by boboper. Fixes more of 2097. Bugfix on 0.2.2.16-alpha.
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
This is not the most beautiful fix for this problem, but it is the simplest.
Bugfix for 2205. Thanks to Sebastian and Mashael for finding the
bug, and boboper/cypherpunks for figuring out why it was happening
and how to fix it, and for writing a few fixes.
The reason the "streams problem" occurs is due to the complicated
interaction between Tor's congestion control and libevent. At some point
during the experiment, the circuit window is exhausted, which blocks all
edge streams. When a circuit level sendme is received at Exit, it
resumes edge reading by looping over linked list of edge streams, and
calling connection_start_reading() to inform libevent to resume reading.
When the streams are activated again, Tor gets the chance to service the
first three streams activated before the circuit window is exhausted
again, which causes all streams to be blocked again. As an experiment,
we reversed the order in which the streams are activated, and indeed the
first three streams, rather than the last three, got service, while the
others starved.
Our solution is to change the order in which streams are activated. We
choose a random edge connection from the linked list, and then we
activate streams starting from that chosen stream. When we reach the end
of the list, then we continue from the head of the list until our chosen
stream (treating the linked list as a circular linked list). It would
probably be better to actually remember which streams have received
service recently, but this way is simple and effective.
Sebastian notes (and I think correctly) that one of our ||s should
have been an &&, which simplifies a boolean expression to decide
whether to replace bridges. I'm also refactoring out the negation at
the start of the expression, to make it more readable.
Pick 5 seconds as the limit. 5 seconds is a compromise here between
making sure the user notices that the bad behaviour is (still) happening
and not spamming their log too much needlessly (the log message is
pretty long). We also keep warning every time if safesocks is
specified, because then the user presumably wants to hear about every
blocked instance.
(This is based on the original patch by Sebastian, then backported to
0.2.2 and with warnings split into their own function.)
Our checks that we don't exceed the 50 KB size limit of extra-info
descriptors apparently failed. This patch fixes these checks and reserves
another 250 bytes for appending the signature. Fixes bug 2183.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Having very long single lines with lots and lots of things in them
tends to make files hard to diff and hard to merge. Since our tools
are one-line-at-a-time, we should try to construct lists that way too,
within reason.
This incidentally turned up a few headers in configure.in that we were
for some reason searching for twice.
We would never actually enforce multiplicity rules when parsing
annotations, since the counts array never got entries added to it for
annotations in the token list that got added by earlier calls to
tokenize_string.
Found by piebeer.
We had a spelling discrepancy between the manpage and the source code
for some option. Resolve these in favor of the manpage, because it
makes more sense (for example, HTTP should be capitalized).
The code that makes use of the RunTesting option is #if 0, so setting
this option has no effect. Mark the option as obsolete for now, so that
Tor doesn't list it as an available option erroneously.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
We need filtering bufferevent_openssl so that we can wrap around
IOCP bufferevents on Windows. This patch adds a temporary option to
turn on filtering mode, so that we can test it out on non-IOCP
systems to make sure it hasn't got any surprising bugs.
It also fixes some allocation/teardown errors in using
bufferevent_openssl as a filter.
Instead of rejecting a value that doesn't divide into 1 second, round to
the nearest divisor of 1 second and warn.
Document that the option only controls the granularity written by Tor to a
file or console log. It does not (for example) "batch up" log messages to
affect times logged by a controller, times attached to syslog messages, or
the mtime fields on log files.
In the case where old_router == NULL but sdmap has an entry for the
router, we can currently safely infer that the old_router was not a
bridge. Add an assert to ensure that this remains true, and fix the
logic not to die with the tor_assert(old_router) call.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
Mainly, this comes from turning two lists that needed to be kept in
synch into a single list of structs. This should save a little RAM,
and make the code simpler.
There's no reason to keep a time_t and a struct timeval to represent
the same value: highres_created.tv_sec was the same as timestamp_created.
This should save a few bytes per circuit.
Our old code correctly called bufferevent_flush() on linked
connections to make sure that the other side got an EOF event... but
it didn't call bufferevent_flush() when the connection wasn't
hold_open_until_flushed. Directory connections don't use
hold_open_until_flushed, so the linked exit connection never got an
EOF, so they never sent a RELAY_END cell to the client, and the
client never concluded that data had arrived.
The solution is to make the bufferevent_flush() code apply to _all_
closing linked conns whose partner is not already marked for close.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
First start of a fix for bug2001, but my test network still isn't
working: the client and the server send each other VERSIONS cells,
but never notice that they got them.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
Some of these functions only work for routerinfo-based nodes, and as
such are only usable for advisory purposes. Fortunately, our uses
of them are compatible with this limitation.
Also, make the NodeFamily option into a list of routersets. This
lets us git rid of router_in_nickname_list (or whatever it was
called) without porting it to work with nodes, and also lets people
specify country codes and IP ranges in NodeFamily
This was the only flag in routerstatus_t that we would previously
change in a routerstatus_t in a consensus. We no longer have reason
to do so -- and probably never did -- as you can now confirm more
easily than you could have done by grepping for is_running before
this patch.
The name change is to emphasize that the routerstatus_t is_running
flag is only there to tell you whether the consensus says it's
running, not whether it *you* think it's running.
A node_t is an abstraction over routerstatus_t, routerinfo_t, and
microdesc_t. It should try to present a consistent interface to all
of them. There should be a node_t for a server whenever there is
* A routerinfo_t for it in the routerlist
* A routerstatus_t in the current_consensus.
(note that a microdesc_t alone isn't enough to make a node_t exist,
since microdescriptors aren't usable on their own.)
There are three ways to get a node_t right now: looking it up by ID,
looking it up by nickname, and iterating over the whole list of
microdescriptors.
All (or nearly all) functions that are supposed to return "a router"
-- especially those used in building connections and circuits --
should return a node_t, not a routerinfo_t or a routerstatus_t.
A node_t should hold all the *mutable* flags about a node. This
patch moves the is_foo flags from routerinfo_t into node_t. The
flags in routerstatus_t remain, but they get set from the consensus
and should not change.
Some other highlights of this patch are:
* Looking up routerinfo and routerstatus by nickname is now
unified and based on the "look up a node by nickname" function.
This tries to look only at the values from current consensus,
and not get confused by the routerinfo_t->is_named flag, which
could get set for other weird reasons. This changes the
behavior of how authorities (when acting as clients) deal with
nodes that have been listed by nickname.
* I tried not to artificially increase the size of the diff here
by moving functions around. As a result, some functions that
now operate on nodes are now in the wrong file -- they should
get moved to nodelist.c once this refactoring settles down.
This moving should happen as part of a patch that moves
functions AND NOTHING ELSE.
* Some old code is now left around inside #if 0/1 blocks, and
should get removed once I've verified that I don't want it
sitting around to see how we used to do things.
There are still some unimplemented functions: these are flagged
with "UNIMPLEMENTED_NODELIST()." I'll work on filling in the
implementation here, piece by piece.
I wish this patch could have been smaller, but there did not seem to
be any piece of it that was independent from the rest. Moving flags
forces many functions that once returned routerinfo_t * to return
node_t *, which forces their friends to change, and so on.
The node_t type is meant to serve two key functions:
1) Abstracting difference between routerinfo_t and microdesc_t
so that clients can use microdesc_t instead of routerinfo_t.
2) Being a central place to hold mutable state about nodes
formerly held in routerstatus_t and routerinfo_t.
This patch implements a nodelist type that holds a node for every
router that we would consider using.
When picking bridges (or other nodes without a consensus entry (and
thus no bandwidth weights)) we shouldn't just trust the node's
descriptor. So far we believed anything between 0 and 10MB/s, where 0
would mean that a node doesn't get any use from use unless it is our
only one, and 10MB/s would be a quite siginficant weight. To make this
situation better, we now believe weights in the range from 20kB/s to
100kB/s. This should allow new bridges to get use more quickly, and
means that it will be harder for bridges to see almost all our traffic.
In the first 100 circuits, our timeout_ms and close_ms
are the same. So we shouldn't transition circuits to purpose
CIRCUIT_PURPOSE_C_MEASURE_TIMEOUT, since they will just timeout again
next time we check.
We really should ignore any timeouts that have *no* network activity for their
entire measured lifetime, now that we have the 95th percentile measurement
changes. Usually this is up to a minute, even on fast connections.
If we really want all this complexity for these stages here, we need to handle
it better for people with large timeouts. It should probably go away, though.
Rechecking the timeout condition was foolish, because it is checked on the
same codepath. It was also wrong, because we didn't round.
Also, the liveness check itself should be <, and not <=, because we only have
1 second resolution.
Specifically, a circ attempt that we'd launched while the network was
down could timeout after we've marked our entrynodes up, marking them
back down again. The fix is to annotate as bad the OR conns that were
around before we did the retry, so if a circuit that's attached to them
times out we don't do anything about it.
This is needed for IOCP, since telling the IOCP backend about all
your CPUs is a good idea. It'll also come in handy with asn's
multithreaded crypto stuff, and for people who run servers without
reading the manual.
automake 1.6 doesn't like using a conditional += to add stuff to foo_LDADD.
Instead you need to conditionally define a variable, then non-conditionally
put that variable in foo_LDADD.
This requires the latest Git version of Libevent as of 24 March 2010.
In the future, we'll just say it requires Libevent 2.0.5-alpha or
later.
Since Libevent doesn't yet support hierarchical rate limit groups,
there isn't yet support for tracking relayed-bytes separately when
using the bufferevent system. If a future version does add support
for hierarchical buckets, we can add that back in.
Roger correctly pointed out that my code was broken for accounting
periods that shifted forwards, since
start_of_accounting_period_containing(interval_start_time) would not
be equal to interval_start_time, but potentially much earlier.
The RefuseUnknownExits config option is now a tristate, with "1"
meaning "enable it no matter what the consensus says", "0" meaning
"disable it no matter what the consensus says", and "auto" meaning "do
what the consensus says". If the consensus is silent, we enable
RefuseUnknownExits.
This patch also changes the dirserv logic so that refuseunknownexits
won't make us cache unless we're an exit.
Do not double-report signatures from unrecognized authorities both as
"from unknown authority" and "not present". Fixes bug 1956, bugfix on
0.2.2.16-alpha.
Our attempt to make compilation work on old versions of Windows
again while keeping wince compatibility broke the build for Win2k+.
helix reports this patch fixes the issue for WinXP. Bugfix on
0.2.2.15-alpha; related to bug 1797.
When the CellStatistics option is off, we don't store cell insertion
times. Doing so would also not be very smart, because there seem to
still be some performance issues with this type of statistics. Nothing
harmful happens when we don't have insertion times, so we don't need to
alarm the user.
Previously[*], the function would start with the first stream on the
circuit, and let it package as many cells as it wanted before
proceeding to the next stream in turn. If a circuit had many live
streams that all wanted to package data, the oldest would get
preference, and the newest would get ignored.
Now, we figure out how many cells we're willing to send per stream,
and try to allocate them fairly.
Roger diagnosed this in the comments for bug 1298.
[*] This bug has existed since before the first-ever public release
of Tor. It was added by r152 of Tor on 26 Jan 2003, which was
the first commit to implement streams (then called "topics").
This is not the oldest bug to be fixed in 0.2.2.x: that honor
goes to the windowing bug in r54, which got fixed in e50b7768 by
Roger with diagnosis by Karsten. This is, however, the most
long-lived bug to be fixed in 0.2.2.x: the r54 bug was fixed
2580 days after it was introduced, whereas I am writing this
commit message 2787 days after r152.
I'm going to use this to implement more fairness in
circuit_resume_edge_reading_helper in an attempt to fix bug 1298.
(Updated with fixes from arma and Sebastian)
An authority should never download a consensus if it has a live one,
but when it doesn't, it should admit that it's not going to get one,
and see if anybody else can give it one.
Fixes 1300, fix on 0.2.0.9-alpha
Bridges and other relays not included in the consensus don't
necessarily have a non-zero bandwidth capacity. If all our
configured bridges had a zero bw capacity we would warn the
user. Change that.
Previously, we were also considering the time spent in
soft-hibernation. If this was a long time, we would wind up
underestimating our bandwidth by a lot, and skewing our wakeup time
towards the start of the accounting interval.
This patch also makes us store a few more fields in the state file,
including the time at which we entered soft hibernation.
Fixes bug 1789. Bugfix on 0.0.9pre5.
This will make changes for DST still work, and avoid double-spending
bytes when there are slight changes to configurations.
Fixes bug 1511; the DST issue is a bugfix on 0.0.9pre5.
When we introduced the code to close non-open OR connections after
KeepalivePeriod had passed, we replaced some code that said
if (!connection_is_open(conn)) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
with new code that said
if (!connection_is_open(conn) && past_keepalive) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
This was a mistake, since it made all the other tests start applying
to non-open connections, thus causing bug 1840, where non-open
connections get closed way early.
Fixes bug 1840. Bugfix on 0.2.1.26 (commit 67b38d50).
- Move checks for extra_info to callers
- Change argument name from failed to descs
- Use strlen("fp/") instead of a magic number
- I passed on the suggestion to rename functions from *_failed() to
*_handle_failure(). There are a lot of these so for now just follow
the house style.
It's normal when bootstrapping to have a lot of different certs
missing, so we don't want missing certs to make us warn... unless
the certs we're missing are ones that we've tried to fetch a couple
of times and failed at.
May fix bug 1145.
We frequently add cells to stream-blocked queues for valid reasons
that don't mean we need to block streams. The most obvious reason
is if the cell arrives over a circuit rather than from an edge: we
don't block circuits, no matter how full queues get. The next most
obvious reason is that we allow CONNECTED cells from a newly created
stream to get delivered just fine.
This patch changes the behavior so that we only iterate over the
streams on a circuit when the cell in question came from a stream,
and we only block the stream that generated the cell, so that other
streams can still get their CONNECTEDs in.
This should keep WinCE working (unicode always-on) and get Win98
working again (unicode never-on).
There are two places where we explicitly use ASCII-only APIs, still:
in ntmain.c and in the unit tests.
This patch also fixes a bug in windoes tor_listdir that would cause
the first file to be listed an arbitrary number of times that was
also introduced with WinCE support.
Should fix bug 1797.
Setting CookieAuthFileGroupReadable but without setting CookieAuthFile makes
no sense, because unix directory permissions for the data directory prevent
the group from accessing the file anyways.
When this happens, run through the streams on the circuit and make
sure they're all blocked. If some aren't, that's a bug: block them
all and log it! If they all are, where did the cell come from? Log
it!
(I suspect that this actually happens pretty frequently, so I'm making
these log messages appear at INFO.)
Do not start reading on exit streams when we get a SENDME unless we
have space in the appropriate circuit's cell queue.
Draft fix for bug 1653.
(commit message by nickm)
router_add_to_routerlist() is supposed to be a nice minimal function
that only touches the routerlist structures, but it included a call to
dirserv_single_reachability_test().
We have a function that gets called _after_ adding descriptors
successfully: routerlist_descriptors_added. This patch moves the
responsibility for testing there.
Because the decision of whether to test or not depends on whether
there was an old routerinfo for this router or not, we have to first
detect whether we _will_ want to run the tests if the router is added.
We make this the job of
routers_update_status_from_consensus_networkstatus().
Finally, this patch makes the code notice if a router is going from
hibernating to non-hibernating, and if so causes a reachability test
to get launched.
This solves the problem Roger noted as:
What if the router has a clock that's 5 minutes off, so it
publishes a descriptor for 5 minutes in the future, and we test it
three minutes in. In this edge case, we will continue to advertise
it as Running for the full 45 minute period.
If the voting interval was short enough, the two-minutes delay
of CONSENSUS_MIN_SECONDS_BEFORE_CACHING would confuse bridges
to the point where they would assert before downloading a consensus.
It it was even shorter (<4 minutes, I think), caches would
assert too. This patch fixes that by having replacing the
two-minutes value with MIN(2 minutes, interval/16).
Bugfix for 1141; the cache bug could occur since 0.2.0.8-alpha, so
I'm calling this a bugfix on that. Robert Hogan diagnosed this.
Done as a patch against maint-0.2.1, since it makes it hard to
run some kinds of testing networks.
requests to authorities fail due to a network error.
Bug#1138
"When a Tor client starts up using a bridge, and UpdateBridgesFromAuthority
is set, Tor will go to the authority first and look up the bridge by
fingerprint. If the bridge authority is filtered, Tor will never notice that
the bridge authority lookup failed. So it will never fall back."
Add connection_dir_bridge_routerdesc_failed(), a function for unpacking
the bridge information from a failed request, and ensure
connection_dir_request_failed() calls it if the failed request
was for a bridge descriptor.
Test:
1. for ip in `grep -iR 'router ' cached-descriptors|cut -d ' ' -f 3`;
do sudo iptables -A OUTPUT -p tcp -d $ip -j DROP; done
2. remove all files from user tor directory
3. Put the following in torrc:
UseBridges 1
UpdateBridgesFromAuthority 1
Bridge 85.108.88.19:443 7E1B28DB47C175392A0E8E4A287C7CB8686575B7
4. Launch tor - it should fall back to downloading descriptors
directly from the bridge.
Initial patch reviewed and corrected by mingw-san.
https://trac.torproject.org/projects/tor/ticket/1525
"The codepath taken by the control port "RESOLVE" command to create a
synthetic SOCKS resolve request isn't the same as the path taken by
a real SOCKS request from 'tor-resolve'.
This prevents controllers who set LeaveStreamsUnattached=1 from
being able to attach RESOLVE streams to circuits of their choosing."
Create a new function connection_ap_rewrite_and_attach_if_allowed()
and call that when Tor needs to attach a stream to a circuit but
needs to know if the controller permits it.
No tests added.
With this patch we stop scheduling when we should write statistics using a
single timestamp in run_scheduled_events(). Instead, we remember when a
statistics interval starts separately for each statistic type in geoip.c
and rephist.c. Every time run_scheduled_events() tries to write stats to
disk, it learns when it should schedule the next such attempt.
This patch also enables all statistics to be stopped and restarted at a
later time.
This patch comes with a few refactorings, some of which were not easily
doable without the patch.
We already had the country code ?? indicating an unknown country, so all we
needed to do to make unknown countries excludable was to make the ?? code
discoverable.
It's okay to get (say) a SocksPort line in the torrc, and then a
SocksPort on the command line to override it, and then a SocksPort via
a controller to override *that*. But if there are two occurrences of
SocksPort in the torrc, or on the command line, or in a single SETCONF
command, then the user is likely confused. Our old code would not
help unconfuse the user, but would instead silently ignore all but
the last occurrence.
This patch changes the behavior so that if the some option is passed
more than once to any torrc, command line, or SETCONF (each of which
coincidentally corresponds to a call to config_assign()), and the
option is not a type that allows multiple occurrences (LINELIST or
LINELIST_X), then we can warn the user.
This closes trac entry 1384.
At best, this patch helps us avoid sending queued relayed cells that
would get ignored during the time between when a destroy cell is
sent and when the circuit is finally freed. At worst, it lets us
release some memory a little earlier than it would otherwise.
Fix for bug #1184. Bugfix on 0.2.0.1-alpha.
The next series of commits begins addressing the issue that we're
currently including the complete or.h file in all of our source files.
To change that, we're splitting function definitions into new header
files (one header file per source file).
Right now it says "552 internal error" because there's no way for
getinfo_helper_*() countries to specify an error message. This
patch changes the getinfo_helper_*() interface, and makes most of the
getinfo helpers give useful error messages in response to failures.
This should prevent recurrences of bug 1699, where a missing GeoIPFile
line in the torrc made GETINFO ip-to-county/* fail in a "not obvious
how to fix" way.
V3 authorities no longer decide not to vote on Guard+Exit. The bandwidth
weights should take care of this now.
Also, lower the max threshold for WFU to 0.98, to allow more nodes to become
guards.
This should make us conflict less with system files named "log.h".
Yes, we shouldn't have been conflicting with those anyway, but some
people's compilers act very oddly.
The actual change was done with one "git mv", by editing
Makefile.am, and running
find . -name '*.[ch]' | xargs perl -i -pe 'if (/^#include.*\Wlog.h/) {s/log.h/torlog.h/; }'
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.
Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possble during live operation, due to network liveness filters
and discard logic.
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.
It's also extremely not-nice to write a time to the log as an
integer. Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
These timers behave better with non-monotonic clocks than our old
ones, and also try harder to make once-per-second events get called
one second apart, rather than one-plus-epsilon seconds apart.
This fixes bug 943 for everybody using Libevent 2.0 or later.
We need to ensure that we close timeout measurement circuits. While
we're at it, we should close really old circuits of certain types that
aren't in use, and log really old circuits of other types.
We need to record different statistics at point of timeout, vs the point
of forcible closing.
Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
In rare cases, we could cannibalize a one-hop circuit, ending up
with a two-hop circuit. This circuit would not be actually used,
but we should prevent its creation in the first place.
Thanks to outofwords and swissknife for helping to analyse this.
Most of the changes here are switches to use APIs available on Windows
CE. The most pervasive change is that Windows CE only provides the
wide-character ("FooW") variants of most of the windows function, and
doesn't support the older ASCII verions at all.
This patch will require use of the wcecompat library to get working
versions of the posix-style fd-based file IO functions.
[commit message by nickm]
Back when we changed the idea of a connection being "too old" for new
circuits into the connection being "bad" for new circuits, we didn't
actually change the info messages. This led to telling the user that
we were labelling connections as "too old" for being worse than
connections that were actually older than them.
Found by Scott on or-talk.
There are now four ways that CBT can be disabled:
1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.
It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.
The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
what's happening here is that we're fetching certs for obsolete
authorities -- probably legacy signers in this case. but try to
remain general in the log message.
It's natural for the definition of bandwidth_rule_t to be with the functions
that actually care about its values. Unfortunately, this means declaring
bandwidth_rate_rule_to_string() out of sequence. Someday we'll just rename
reasons.c to strings.c, and put it at the end of or.h, and this will all be
better.
Works like the --enable-static-openssl/libevent options. Requires
--with-zlib-dir to be set. Note that other dependencies might still
pull in a dynamicly linked zlib, if you don't link them in statically
too.
Everything that accepted the 'Circ' name handled it wrong, so even now
that we fixed the handling of the parameter, we wouldn't be able to
set it without making all the 0.2.2.7..0.2.2.10 relays act wonky.
This patch makes Tors accept the 'Circuit' name instead, so we can
turn on circuit priorities without confusing the versions that treated
the 'Circ' name as occasion to act weird.
I'm adding this because I can never remember what stuff like 'rule 3'
means. That's the one where if somebody goes limp or taps out, the
fight is over, right?
When you mean (a=b(c,d)) >= 0, you had better not say (a=b(c,d)>=0).
We did the latter, and so whenever CircPriorityHalflife was in the
consensus, it was treated as having a value of 1 msec (that is,
boolean true).
We need to make sure we have an event_base in dns.c before we call
anything that wants one. Make sure we always have one in dns_reset()
when we're a client. Fixes bug 1341.
Now if you're a published relay and you set RefuseUnknownExits, even
if your dirport is off, you'll fetch dir info from the authorities,
fetch it early, and cache it.
In the future, RefuseUnknownExits (or something like it) will be on
by default.
From http://archives.seul.org/tor/relays/Mar-2010/msg00006.html :
As I understand it, the bug should show up on relays that don't set
Address to an IP address (so they need to resolve their Address
line or their hostname to guess their IP address), and their
hostname or Address line fails to resolve -- at that point they'll
pick a random 4 bytes out of memory and call that their address. At
the same time, relays that *do* successfully resolve their address
will ignore the result, and only come up with a useful address if
their interface address happens to be a public IP address.
When the bandwidth-weights branch added the "directory-footer"
token, and began parsing the directory footer at the first
occurrence of "directory-footer", it made it possible to fool the
parsing algorithm into accepting unsigned data at the end of a
consensus or vote. This patch fixes that bug by treating the footer
as starting with the first "directory-footer" or the first
"directory-signature", whichever comes first.
Treat strings returned from signed_descriptor_get_body_impl() as not
NUL-terminated. Since the length of the strings is available, this is
not a big problem.
Discovered by rieo.
Don't allow anything but directory-signature tokens in a consensus after
the first directory-signature token. Fixes bug in bandwidth-weights branch.
Found by "outofwords."
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.
Thanks to ekir for discovering and reporting this bug.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
This means that "if (E<G) {abc} else if (E>=G) {def}" can be replaced with
"if (E<G) {abc} else {def}"
Doing the second test explicitly made my mingw gcc nervous that we might
never be initializing casename.
For my 64-bit Linux system running with GCC 4.4.3-fc12-whatever, you
can't do 'printf("%lld", (int64_t)x);' Instead you need to tell the
compiler 'printf("%lld", (long long int)x);' or else it doesn't
believe the types match. This is why we added U64_PRINTF_ARG; it
looks like we needed an I64_PRINTF_ARG too.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.
Our tor_asprintf() differs from standard asprintf() in that:
- Like our other malloc functions, it asserts on OOM.
- It works on windows.
- It always sets its return-field.
All other bandwidthrate settings are restricted to INT32_MAX, but
this check was forgotten for PerConnBWRate and PerConnBWBurst. Also
update the manpage to reflect the fact that specifying a bandwidth
in terabytes does not make sense, because that value will be too
large.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
On Windows, we don't have a notion of ~ meaning "our homedir", so we
were deliberately using an #ifdef to avoid calling expand_filename()
in multiple places. This is silly: The right place to turn a function
into a no-op on a single platform is in the function itself, not in
every single call-site.
Spec conformance issue: The code didn't force the network-status-version
token to be the first token in a v3 vote or consensus.
Problem discovered by Parakeep.
We need to use evdns_add_server_port_with_base() when configuring
our DNS listener, because libevent segfaults otherwise. Add a macro
in compat_libevent.h to pick the correct implementation depending
on the libevent version.
Fixes bug 1143, found by SwissTorExit
The src and dest of a memcpy() call aren't supposed to overlap,
but we were sometimes calling tor_addr_copy() as a no-op.
Also, tor_addr_assign was a redundant copy of tor_addr_copy(); this patch
removes it.
We implemented ratelimiting for warnings going into the logfile, but didn't
rate-limit controller events. Now both log warnings and controller events
are rate-limited.
Tor has tor_lookup_hostname(), which prefers ipv4 addresses automatically.
Bug 1244 occured because gethostbyname() returned an ipv6 address, which
Tor cannot handle currently. Fixes bug 1244; bugfix on 0.0.2pre25.
Reported by Mike Mestnik.
The problem was that we didn't allocate enough memory on 32-bit
platforms with 64-bit time_t. The memory leak occured every time
we fetched a hidden service descriptor we've fetched before.
When calculating the is_exit flag for a routerinfo_t, we don't need
to call exit_policy_is_general_exit() if router_exit_policy_rejects_all()
tells us it definitely is an exit. This check is much cheaper than
running exit_policy_is_general_exit().
exit_policy_is_general_exit() assumed that there are no redundancies
in the passed policy, in the sense that we actively combine entries
in the policy to really get rid of any redundancy. Since we cannot
do that without massively rewriting the policy lines the relay
operators set, fix exit_policy_is_general_exit().
Fixes bug 1238, discovered by Martin Kowalczyk.
We accidentally freed the internal buffer for bridge stats when we
were writing the bridge stats file or honoring a control port
request for said data. Change the interfaces for
geoip_get_bridge_stats* to prevent these problems, and remove the
offending free/add a tor_strdup.
Fixes bug 1208.
It's a bit confusing to have a loop where another function,
confusingly named "*_free", is responsible for advancing the loop
variable (or rather, for altering a structure so that the next time
the loop variable's initializer is evaluated it evaluates to something
different.)
Not only has this confused people: it's also confused coverity scan.
Let's fix that.
This was freaking out some relay operators without good reason, as
it is nothing the relay operator can do anything about anyways.
Quieting this warning suggested by rieo.
The OutboundBindAddress option is useful for making sure that all of
your outbond connections use a given interface. But when connecting
to 127.0.0.1 (or ::1 even) it's important to actually have the
connection come _from_ localhost, since lots of programs running on
localhost use the source address to authenticate that the connection
is really coming from the same host.
Our old code always bound to OutboundBindAddress, whether connecting
to localhost or not. This would potentially break DNS servers on
localhost, and socks proxies on localhost. This patch changes the
behavior so that we only look at OutboundBindAddress when connecting
to a non-loopback address.
this case can now legitimately happen, if you have a cached v2 status
from moria1, and you run with the new list of dirservers that's missing
the old moria1. it's nothing to worry about; the file will die off in
a month or two.
...to let us
rate-limit client connections as they enter the network. It's
controlled in the consensus so we can turn it on and off for
experiments. It's starting out off. Based on proposal 163.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.
Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.
Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.
Also, now we refresh your entry guards from EntryNode at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).
The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
We do this in too many places throughout the code; it's time to start
clamping down.
Also, refactor Karsten's patch to use strchr-then-strndup, rather than
malloc-then-strlcpy-then-strchr-then-clear.
Fix statistics on client numbers by country as seen by bridges that were
broken in 0.2.2.1-alpha. Also switch to reporting full 24-hour intervals
instead of variable 12-to-48-hour intervals.
The HSAuthorityRecordStats option was used to track statistics of overall
hidden service usage on the version 0 hidden service authorities. With the
version 2 hidden service directories being deployed and version 0
descriptors being phased out, these statistics are not as useful anymore.
Goodbye, you fine piece of software; my first major code contribution to
Tor.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log." safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
The rule is now: take the value from the CircuitPriorityHalflife
config option if it is set. If it zero, disable the cell_ewma
algorithm. If it is set, use it to calculate the scaling factor.
If it is not set, look for a CircPriorityHalflifeMsec parameter in the
consensus networkstatus. If *that* is zero, then disable the cell_ewma
algorithm; if it is set, use it to calculate the scaling factor.
If it is not set at all, disable the algorithm.
In connection_dir_client_reached_eof, we make sure that we either
return when we get an http status code of 503 or handle the problem
and set it to 200. Later we check if the status code is 503. Remove
that check.
There are two big changes here:
- We store active circuits in a priority queue for each or_conn,
rather than doing a linear search over all the active circuits
before we send each cell.
- Rather than multiplying every circuit's cell-ewma by a decay
factor every time we send a cell (thus normalizing the value of a
current cell to 1.0 and a past cell to alpha^t), we instead
only scale down the cell-ewma every tick (ten seconds atm),
normalizing so that a cell sent at the start of the tick has
value 1.0).
Each circuit is ranked in terms of how many cells from it have been
relayed recently, using a time-weighted average.
This patch has been tested this on a private Tor network on PlanetLab,
and gotten improvements of 12-35% in time it takes to fetch a small
web page while there's a simultaneous large data transfer going on
simultaneously.
[Commit msg by nickm based on mail from Ian Goldberg.]
This changes the pqueue API by requiring an additional int in every
structure that we store in a pqueue to hold the index of that structure
within the heap.
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.
This gains us consistence for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
Do not segfault when writing buffer stats when we haven't observed a
single circuit to report about. This is a minor bug that would only show
up in testing environments with no traffic and with reduced stats
intervals.
Avoid crashing if the client is trying to upload many bytes and the
circuit gets torn down at the same time, or if the flip side
happens on the exit relay. Bugfix on 0.2.0.1-alpha; fixes bug 1150.
New config option "CircuitStreamTimeout" to override our internal
timeout schedule for how many seconds until we detach a stream from
a circuit and try a new circuit. If your network is particularly
slow, you might want to set this to a number like 60.
To fix a major security problem related to incorrect use of
SSL/TLS renegotiation, OpenSSL has turned off renegotiation by
default. We are not affected by this security problem, however,
since we do renegotiation right. (Specifically, we never treat a
renegotiated credential as authenticating previous communication.)
Nevertheless, OpenSSL's new behavior requires us to explicitly
turn renegotiation back on in order to get our protocol working
again.
Amusingly, this is not so simple as "set the flag when you create
the SSL object" , since calling connect or accept seems to clear
the flags.
For belt-and-suspenders purposes, we clear the flag once the Tor
handshake is done. There's no way to exploit a second handshake
either, but we might as well not allow it.
This commit implements a new config option: 'DisableAllSwap'
This option probably only works properly when Tor is started as root.
We added two new functions: tor_mlockall() and tor_set_max_memlock().
tor_mlockall() attempts to mlock() all current and all future memory pages.
For tor_mlockall() to work properly we set the process rlimits for memory to
RLIM_INFINITY (and beyond) inside of tor_set_max_memlock().
We behave differently from mlockall() by only allowing tor_mlockall() to be
called one single time. All other calls will result in a return code of 1.
It is not possible to change DisableAllSwap while running.
A sample configuration item was added to the torrc.complete.in config file.
A new item in the man page for DisableAllSwap was added.
Thanks to Moxie Marlinspike and Chris Palmer for their feedback on this patch.
Please note that we make no guarantees about the quality of your OS and its
mlock/mlockall implementation. It is possible that this will do nothing at all.
It is also possible that you can ulimit the mlock properties of a given user
such that root is not required. This has not been extensively tested and is
unsupported. I have included some comments for possible ways we can handle
this on win32.
If your relay can't keep up with the number of incoming create cells, it
would log one warning per failure into your logs. Limit warnings to 1 per
minute.
This was left over from an early draft of the microdescriptor code; it
began to populate the signatures array of a networkstatus vote, even
though there's no actual need to do that for a vote.
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement. Coverity notices this now.
In some cases, we were testing arrays to make sure that an operation
we wanted to do would suceed. Those cases are now always-true.
In some cases, we were testing arrays to see if something was _set_.
Those caes are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
If all authorities restart at once right before a consensus vote, nobody
will vote about "Running", and clients will get a consensus with no usable
relays. Instead, authorities refuse to build a consensus if this happens.
The first happens on an error case when a controller wants an
impossible directory object. The second happens when we can't write
our fingerprint file.
The code for these was super-wrong, but will only break things when we
reset an option on a platform where sizeof(time_t) is different from
sizeof(int).
See task 1114. The most plausible explanation for someone sending us weak
DH keys is that they experiment with their Tor code or implement a new Tor
client. Usually, we don't care about such events, especially not on warn
level. If we really care about someone not following the Tor protocol, we
can set ProtocolWarnings to 1.
This patch introduces a new type called document_signature_t to represent the
signature of a consensus document. Now, each consensus document can have up
to one document signature per voter per digest algorithm. Also, each
detached-signatures document can have up to one signature per <voter,
algorithm, flavor>.
Previously, we insisted that a valid signature must be a signature of
the expected digest. Now we accept anything that starts with the
expected digest. This lets us include another digest later.
When we tried to use the deprecated non-threadsafe evdns
interfaces in Libevent 2 without using the also-deprecated
event_init() interface, Libevent 2 would sensibly crash, since it
has no guess where to find the Libevent library.
Here we use the evdns_base_*() functions instead if they're
present, and fake them if they aren't.
This is a possible fix for bug 1023, where if we vote (or make a v2
consensus networkstatus) right after we come online, we can call
rep_hist_note_router_unreachable() on every router we haven't connected
to yet, and thereby make all their uptime values reset.
This seems to be happening to me a lot on a garbage DSL line.
We may need to come up with 2 threshholds: a high short onehop
count and a lower longer count.
Don't count one-hop circuits when we're estimating how long it
takes circuits to build on average. Otherwise we'll set our circuit
build timeout lower than we should. Bugfix on 0.2.2.2-alpha.
Directory authorities now reject Tor relays with versions less than
0.1.2.14. This step cuts out four relays from the current network,
none of which are very big.
This shouldn't be necessary, but apparently the Android cross-compiler
doesn't respect -I as well as it should. (-I is supposed to add to the
*front* of the search path. Android's gcc wrapper apparently likes to add to
the end. This is broken, but we need to work around it.)
- Avoid memmoving 0 bytes which might lead to compiler warnings.
- Don't require relays to be entry node AND bridge at the same to time to
record clients.
- Fix a memory leak when writing dirreq-stats.
- Don't say in the stats files that measurement intervals are twice as long
as they really are.
- Reduce minimum observation time for requests to 12 hours, or we might
never record usage.
- Clear exit stats correctly after writing them, or we accumulate old stats
over time.
- Reset interval start for buffer stats, too.
The big change is to add a function to display the current SSL handshake
state, and to log it everywhere reasonable. (A failure in
SSL23_ST_CR_SRVR_HELLO_A is different from one in
SSL3_ST_CR_SESSION_TICKET_A.)
This patch also adds a new log domain for OR handshaking, so you can pull out
all the handshake log messages without having to run at debug for everything.
For example, you'd just say "log notice-err [handshake]debug-err file
tor.log".
This was the only log notice that happened during other
tor invocations, like --verify-config and --list-fingerprint.
Plus, now we think it works, so no need to hear about it.
"Tinytest" is a minimalist C unit testing framework I wrote for
Libevent. It supports some generally useful features, like being able
to run separate unit tests in their own processes.
I tried to do the refactoring to change test.c as little as possible.
Thus, we mostly don't call the tinytest macros directly. Instead, the
test.h header is now a wrapper on tinytest.h to make our existing
test_foo() macros work.
The next step(s) here will be:
- To break test.c into separate files, each with its own test group.
- To look into which things we can test
- To refactor the more fiddly tests to use the tinytest macros
directly and/or run forked.
- To see about writing unit tests for things we couldn't previously
test without forking.
If the networkstatus consensus tells us that we should use a
negative circuit package window, ignore it. Otherwise we'll
believe it and then trigger an assert.
Also, change the interface for networkstatus_get_param() so we
don't have to lookup the consensus beforehand.
A) We were considering a circuit had timed out in the special cases
where we close rendezvous circuits because the final rendezvous
circuit couldn't be built in time.
B) We were looking at the wrong timestamp_created when considering
a timeout.
Don't discard all circuits every MaxCircuitDirtiness, because the
user might legitimately have set that to a very lower number.
Also don't use up all of our idle circuits with testing circuits,
since that defeats the point of preemptive circuits.
We want it to be under our control so it doesn't mess
up initialization. This is likely the cause for
the bug the previous assert-adding commit (09a75ad) was
trying to address.
Using CircuitBuildTimeout is prone to issues with SIGHUP, etc.
Also, shuffle the circuit build times array after loading it
in so that newer measurements don't replace chunks of
similarly timed measurements.
To further attempt to fix bug 1090, make sure connection_ap_can_use_exit
always returns 0 when the chosen exit router is excluded. This should fix
bug1090.
When we excluded some Exits, we were sometimes warning the user that we
were going to use the node regardless. Many of those warnings were in
fact bogus, because the relay in question was not used to connect to
the outside world.
Based on patch by Rotor, thanks!
Tor now reads the "circwindow" parameter out of the consensus,
and uses that value for its circuit package window rather than the
default of 1000 cells. Begins the implementation of proposal 168.
This code adds a new field to vote on: "params". It consists of a list of
sorted key=int pairs. The output is computed as the median of all the
integers for any key on which anybody voted.
Improved with input from Roger.
Adding the same vote to a networkstatus consensus leads to a memory leak
on the client side. Fix that by only using the first vote from any given
voter, and ignoring the others.
Problem found by Rotor, who also helped writing the patch. Thanks!
A vote may only contain exactly one signature. Make sure we reject
votes that violate this.
Problem found by Rotor, who also helped writing the patch. Thanks!
Fix an obscure bug where hidden services on 64-bit big-endian
systems might mis-read the timestamp in v3 introduce cells, and
refuse to connect back to the client. Discovered by "rotor".
Bugfix on 0.2.1.6-alpha.
Add a "getinfo status/accepted-server-descriptor" controller
command, which is the recommended way for controllers to learn
whether our server descriptor has been successfully received by at
least on directory authority. Un-recommend good-server-descriptor
getinfo and status events until we have a better design for them.
We were telling the controller about CHECKING_REACHABILITY and
REACHABILITY_FAILED status events whenever we launch a testing
circuit or notice that one has failed. Instead, only tell the
controller when we want to inform the user of overall success or
overall failure. Bugfix on 0.1.2.6-alpha. Fixes bug 1075. Reported
by SwissTorExit.
When we added support for fractional units (like 1.5 MB) I broke
support for giving units with no space (like 2MB). This patch should
fix that. It also adds a propoer tor_parse_double().
Fix for bug 1076. Bugfix on 0.2.2.1-alpha.
We were triggering a CLOCK_SKEW controller status event whenever
we connect via the v2 connection protocol to any relay that has
a wrong clock. Instead, we should only inform the controller when
it's a trusted authority that claims our clock is wrong. Bugfix
on 0.2.0.20-rc; starts to fix bug 1074. Reported by SwissTorExit.
- Refactor geoip.c by moving duplicate code into rotate_request_period().
- Don't leak memory when cleaning up cell queues.
- Make sure that exit_(streams|bytes_(read|written)) are initialized in all
places accessing these arrays.
- Read only the last block from *stats files and ensure that its timestamp
is not more than 25 hours in the past and not more than 1 hour in the
future.
- Stop truncating the last character when reading *stats files.
The only thing that's left now is to avoid reading whole *stats files into
memory.
Note that unlike subversion revision numbers, it isn't meaningful to
compare these for anything but equality. We define a sort-order anyway,
in case one of these accidentally slips into a recommended-versions
list.
If any the v3 certs we download are unparseable, we should actually
notice the failure so we don't retry indefinitely. Bugfix on 0.2.0.x;
reported by "rotator".
The more verbose logs that were added in ee58153 also include a string
that might not have been initialized. This can lead to segfaults, e.g.,
when setting up private Tor networks. Initialize this string with NULL.
Send circuit or stream sendme cells when our window has decreased
by 100 cells, not when it has decreased by 101 cells. Bug uncovered
by Karsten when testing the "reduce circuit window" performance
patch. Bugfix on the 54th commit on Tor -- from July 2002,
before the release of Tor 0.0.0. This is the new winner of the
oldest-bug prize.
I don't think we actually use (or plan to use) strtok_r in a reentrant
way anywhere in our code, but would be nice not to have to think about
whether we're doing it.
Relays no longer publish a new server descriptor if they change
their MaxAdvertisedBandwidth config option but it doesn't end up
changing their advertised bandwidth numbers. Bugfix on 0.2.0.28-rc;
fixes bug 1026. Patch from Sebastian.
Specifically, every time we get a create cell but we have so many already
queued that we refuse it.
Bugfix on 0.2.0.19-alpha; fixes bug 1034. Reported by BarkerJr.
The problem is that clients and hidden services are receiving
relay_early cells, and they tear down the circuit.
Hack #1 is for rendezvous points to rewrite relay_early cells to
relay cells. That way there are never any incoming relay_early cells.
Hack #2 is for clients and hidden services to never send a relay_early
cell on an established rendezvous circuit. That works around rendezvous
points that haven't upgraded yet.
Hack #3 is for clients and hidden services to not tear down the circuit
when they receive an inbound relay_early cell. We already refuse extend
cells at clients.
When determining how long directory requests take or how long cells spend
in queues, we were comparing timestamps on microsecond detail only to
convert results to second or millisecond detail later on. But on 32-bit
architectures this means that 2^31 microseconds only cover time
differences of up to 36 minutes. Instead, compare timestamps on
millisecond detail.
Now that we require EntryStatistics to be 1 for counting connecting
clients, unit tests need to set that config option, too.
Reported by Sebastian Hahn.
Changes to directory request statistics:
- Rename GEOIP statistics to DIRREQ statistics, because they now include
more than only GeoIP-based statistics, whereas other statistics are
GeoIP-dependent, too.
- Rename output file from geoip-stats to dirreq-stats.
- Add new config option DirReqStatistics that is required to measure
directory request statistics.
- Clean up ChangeLog.
Also ensure that entry guards statistics have access to a local GeoIP
database.
This new option will allow clients to download the newest fresh consensus
much sooner than they normally would do so, even if they previously set
FetchDirInfoEarly. This includes a proper ChangeLog entry and an updated man
page.
Introduce a threshold of 0.01% of bytes that must be read and written per
port in order to be included in the statistics. Otherwise we cannot include
these statistics in extra-info documents, because they are too big.
Change the labels "-written" and "-read" so that the meanings are as
intended.
The internal error "could not find intro key" occurs when we want to send
an INTRODUCE1 cell over a recently finished introduction circuit and think
we built the introduction circuit with a v2 hidden service descriptor, but
cannot find the introduction key in our descriptor.
My first guess how we can end up in this situation is that we are wrong in
thinking that we built the introduction circuit based on a v2 hidden
service descriptor. This patch checks if we have a v0 descriptor, too, and
uses that instead.
o Minor features:
- If we're a relay and we change our IP address, be more verbose
about the reason that made us change. Should help track down
further bugs for relays on dynamic IP addresses.
when we write out our stability info, detect relays that have slipped
through the cracks. log about them and correct the problem.
if we continue to see a lot of these over time, it means there's another
spot where relays fall out of the routerlist without being marked as
unreachable.
Marks the control port connection for flushing before closing when
the QUIT command is issued. This allows a QUIT to be issued during
a long reply over the control port, flushing the reply and then
closing the connection. Fixes bug 1015.
rather than the bandwidth values in each relay descriptor. This approach
opens the door to more accurate bandwidth estimates once the directory
authorities start doing active measurements. Implements more of proposal
141.
arma's rationale: "I think this is a bug, since people intentionally
set DirPortFrontPage, so they really do want their relay to serve that
page when it's asked for. Having it appear only sometimes (or roughly
never in Sebastian's case) makes it way less useful."
Fixes bug 1013; bugfix on 0.2.1.8-alpha.
Added a sanity check in config.c and a check in directory.c
directory_initiate_command_rend() to catch any direct connection attempts
when a socks proxy is configured.
The rest of the code was only including event.h so that it could see
EV_READ and EV_WRITE, which we were using as part of the
connection_watch_events interface for no very good reason.
This patch adds a new compat_libevent.[ch] set of files, and moves our
Libevent compatibility and utilitity functions there. We build them
into a separate .a so that nothing else in src/commmon depends on
Libevent (partially fixing bug 507).
Also, do not use our own built-in evdns copy when we have Libevent
2.0, whose evdns is finally good enough (thus fixing Bug 920).
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Hidden service clients didn't use a cached service descriptor that
was older than 15 minutes, but wouldn't fetch a new one either. Now,
use a cached descriptor no matter how old it is and only fetch a new
one when all introduction points fail. Fix for bug 997. Patch from
Marcus Griep.
Apparently all the stuff that does a linear scan over all the DNS
cache entries can get really expensive when your DNS cache is very
large. It's hard to say how much this will help performance, since
gprof doesn't count time spent in OpenSSL or zlib, but I'd guess 10%.
Also, this patch removes calls to assert_connection_ok() from inside
the read and write callbacks, which are similarly unneeded, and a
little costlier than I'm happy with.
This is probably worth backporting to 0.2.0.
Provide a useful warning when launch_circuit tries to make us use a
node we don't want to use. Just give an info message when this is a
normal and okay situation. Fix for logging issues in bug 984.
This patch adds a function to determine whether we're in the main
thread, and changes control_event_logmsg() to return immediately if
we're in a subthread. This is necessary because otherwise we will
call connection_write_to_buf, which modifies non-locked data
structures.
Bugfix on 0.2.0.x; fix for at least one of the things currently
called "bug 977".
Tas (thanks!) noticed that when *ListenAddress is set, Tor would
still warn on startup when *Port is low and hibernation is active.
The patch parses all the *ListenAddress lines, and checks the
ports. Bugfix on 0.2.1.15-rc
With the last fix of task 932 (5f03d6c), client requests are only added to
the history when they happen after the start of the current history. This
conflicts with the unit tests that insert current requests first (defining
the start of the client request history) followed by requests in the past.
The fix is to insert requests in chronological order in the unit tests.
- Write geoip stats to disk every 24 hours, not every hour.
- Remove configuration options and define reasonable defaults.
- Clear history of client requests every 24 hours (which wasn't done at
all before).
Specifically if you send SIGUSR1, it will add two lines to the log file:
May 22 07:41:59.576 [notice] Our DNS cache has 3364 entries.
May 22 07:41:59.576 [notice] Our DNS cache size is approximately 1022656
bytes.
[tweaked a bit by nickm]
Really, our idiocy was that we were calling event_set() on the same
event more than once, which sometimes led to us calling event_set() on
an event that was already inserted, thus making it look uninserted.
With this patch, we just initialize the timeout events when we create
the requests and nameservers, and we don't need to worry about
double-add and double-del cases at all.
If we ever add an event, then set it, then add it again, there will be
now two pointers to the event in the event base. If we delete one and
free it, the first pointer will still be there, and possibly cause a
crash later.
This patch adds detection for this case to the code paths in
eventdns.c, and works around it. If the warning message ever
displays, then a cleverer fix is in order.
{I am not too confident that this *is* the fix, since bug 957 is very
tricky. If it is, it is a bugfix on 0.2.0.}
When we got a descriptor that we (as an authority) rejected as totally
bad, we were freeing it, then using the digest in its RAM to look up its
download status. Caught by arma with valgrind. Bugfix on 0.2.1.9-alpha.
Bridges are not supposed to publish router descriptors to the directory
authorities. It defeats the point of bridges when they are included in the
public relay directory.
This patch puts out a warning and exits when the node is configured as
a bridge and to publish v1, v2, or v3 descriptors at the same time.
Also fixes part of bug 932.
This addresses the first part of bug 918. Users are now warned when
they try to use hibernation in combination with a port below 1024
when they're not on Windows. We don't want to die here, because
people might run Tor as root, use a capabilities system or some
other platform that will allow them to re-attach low ports.
Wording suggested by Marian
(Don't crash immediately if we have leftover chunks to free after
freeing chunks in a buffer freelist; instead log a debugging message
that might help.)
Now, when you call tor --digests, it dumps the SHA1 digest of each
source file that Tor was built with. We support both 'sha1sum' and
'openssl sha1'. If the user is building from a tarball and they
haven't edited anything, they don't need any program that calculates
SHA1. If they _have_ modified a file but they don't have a program to
calculate SHA1, we try to build so we do not output digests.
bytes (aka 20KB/s), to match our documentation. Also update
directory authorities so they always assign the Fast flag to relays
with 20KB/s of capacity. Now people running relays won't suddenly
find themselves not seeing any use, if the network gets faster
on average.
svn:r19305
IP address changes: directory mirrors were mistakenly telling them
their old address if they asked via begin_dir, so they never got
an accurate answer about their new address, so they just vanished
after a day. Should fix bugs 827, 883, and 900 -- but alas, only
after every directory mirror has upgraded.
svn:r19291
ago. This change should significantly improve client performance,
especially once more people upgrade, since relays that have been
a guard for a long time are currently overloaded.
svn:r19287
The directory authorities were refusing v3 consensus votes from
other authorities, since the votes are now 504K. Fixes bug 959;
bugfix on 0.0.2pre17 (where we raised it from 50K to 500K ;).
svn:r19194
When we used smartlist_free to free the list of succesful uploads
because we had succeeded in uploading everywhere, we did not actually
set the successful_uploads field to NULL, so later it would get freed
again in rend_service_descriptor_free. Fix for bug 948; bug
introduced in 0.2.1.6-alpha.
svn:r19073
tor_sscanf() only handles %u and %s for now, which will make it
adequate to replace sscanf() for date/time/IP parsing. We want this
to prevent attackers from constructing weirdly formed descriptors,
cells, addresses, HTTP responses, etc, that validate under some
locales but not others.
svn:r18760
It seems that 64-bit Sparc Solaris demands 64-bit-aligned access to
uint64_t, but does not 64-bit-align the stack-allocated char array we
use for cpuworker tags. So this patch adds a set/get_uint64 pair, and
uses them to access the conn_id field in the tag.
svn:r18743
It turns out that we weren't updating the _ExcludeExitNodesUnion set's
country numbers when we reloaded (or first loaded!) the IP-to-country
file. Spotted by Lark. Bugfix on 0.2.1.6-alpha.
svn:r18575
stream never finished making its connection, it would live
forever in circuit_wait state. Now we close it after SocksTimeout
seconds. Bugfix on 0.1.2.7-alpha; reported by Mike Perry.
svn:r18516
Previously, when we had the chosen_exit set but marked optional, and
we failed because we couldn't find an onion key for it, we'd just give
up on the circuit. But what we really want to do is try again, without
the forced exit node.
Spotted by rovv. Another case of bug 752. I think this might be
unreachable in our current code, but proposal 158 could change that.
svn:r18451
This resolves bug 526, wherein we would crash if the following
events occurred in this order:
A: We're an OR, and one of our nameservers goes down.
B: We launch a probe to it to see if it's up again. (We do this hourly
in steady-state.)
C: Before the probe finishes, we reconfigure our nameservers,
usually because we got a SIGHUP and the resolve.conf file changed.
D: The probe reply comes back, or times out. (There is a five-second
window for this, after B has happens).
IOW, if one of our nameservers is down and our nameserver
configuration has changed, there were 5 seconds per hour where HUPing
the server was unsafe.
Bugfix on 0.1.2.1-alpha. Too obscure to backport.
svn:r18306
This fixes the last known case of bug 891, which could happen if two
hosts, A and B, disagree about how long a circuit has been open,
because of clock drift of some kind. Host A would then mark the
connection as is_bad_for_new_circs when it got too old and open a new
connection. In between when B receives a NETINFO cell on the new
conn, and when B receives a conn cell on the new circuit, the new
circuit will seem worse to B than the old one, and so B will mark it
as is_bad_for_new_circs in the second or third loop of
connection_or_group_set_badness().
Bugfix on 0.1.1.13-alpha. Bug found by rovv.
Not a backport candidate: the bug is too obscure and the fix too tricky.
svn:r18303
crypto_global_init gets called. Also have it be crypto_global_init
that calls crypto_seed_rng, so we are not dependent on OpenSSL's
RAND_poll in these fiddly cases.
Should fix bug 907. Bugfix on 0.0.9pre6. Backport candidate.
svn:r18210
are stored when the --enable-local-appdata option is configured. This
changes the Windows path from %APPDATA% to a host local
%USERPROFILE%\Local Settings\Application Data\ path (aka,
LOCAL_APPDATA).
Patch from coderman.
svn:r18122
It was dumb to have an "announce the value if it's over 0" version of
the code coexisting with an "announce the value if it's at least N"
version. Retain the latter only, with N set to 1.
Incidentally, this should fix a Coverity REVERSE_INULL warning.
svn:r18100
Unfortunately, old Libevents don't _put_ a version in their headers, so
this can get a little tricky. Fortunately, the only binary-compatibility
issue we care about is the size of struct event. Even more fortunately,
Libevent 2.0 will let us keep binary compatiblity forever by letting us
decouple ourselves from the structs, if we like.
svn:r18014
five days old. Otherwise if Tor is off for a long time and then
starts with cached descriptors, it will try to use the onion
keys in those obsolete descriptors when building circuits. Bugfix
on 0.2.0.x. Fixes bug 887.
svn:r17993
cell back), avoid using that OR connection anymore, and also
tell all the one-hop directory requests waiting for it that they
should fail. Bugfix on 0.2.1.3-alpha.
svn:r17984
using the wrong onion key), we were dropping it and letting the
client time out. Now actually answer with a destroy cell. Bugfix
on 0.0.2pre8.
svn:r17970
When we made bridge authorities stop serving bridge descriptors over
unencrypted links, we also broke DirPort reachability testing for
bridges. So bridges with a non-zero DirPort were printing spurious
warns to their logs. Bugfix on 0.2.0.16-alpha. Fixes bug 709.
svn:r17945
descriptors shortly after startup, and then briefly resume
after a new bandwidth test and/or after publishing a new bridge
descriptor. Bridge users that try to bootstrap from them would
get a recent networkstatus but would get descriptors from up to
18 hours earlier, meaning most of the descriptors were obsolete
already. Reported by Tas; bugfix on 0.2.0.13-alpha.
svn:r17920
discard it rather than trying to use it. In theory it could
be useful because it lists alternate directory mirrors, but in
practice it just means we spend many minutes trying directory
mirrors that are long gone from the network. Helps bug 887 a bit;
bugfix on 0.2.0.x.
svn:r17917
The subversion $Id$ fields made every commit force a rebuild of
whatever file got committed. They were not actually useful for
telling the version of Tor files in the wild.
svn:r17867
use the same download mechanism as other places.
i had to make an ugly hack around "IMPOSSIBLE_TO_DOWNLOAD+1".
we should unhack that sometime.
svn:r17834
Specifically, split compare_tor_addr_to_addr_policy() from a loop with a bunch
of complicated ifs inside into some ifs, each with a simple loop. Rearrange
router_find_exact_exit_enclave() to run a little faster. Bizarrely,
router_policy_rejects_all() shows up on oprofile, so precalculate it per
routerinfo.
svn:r17802
new from their perspective) directory download schedule abstraction.
not done yet, but i'd better get this out of my sandbox before nick
does another sweeping change. :)
svn:r17798
of which countries we've seen clients from recently. Now controllers
like Vidalia can show bridge operators that they're actually making
a difference.
svn:r17796
shahn: "Add some documentation for the WRA_* family of functions, also make
sure that (hopefully) all functions that return was_router_added_t
don't return ints directly and that they don't refer to integers in
their documentation anymore."
svn:r17731
more extreme happens. the default should be to be quiet unless
something more extreme happens.
at least, this doesn't generate complaints anymore. perhaps that
means it is working better? :)
svn:r17724
(The unfixed ones are being downgraded to regular XXXs mainly on the rationale that they don't seem to be exploding Tor, and they were apparently not showstoppers for 0.2.0.x-final.)
svn:r17682
(Many users have no idea what a resolv.conf is, and shouldn't be forced to learn. The old option will keep working for now.)
Also, document it.
svn:r17661
seconds. Warn the user if lower values are given in the
configuration. Bugfix on 0.1.0.1-rc. Patch by Sebastian.
Clip the CircuitBuildTimeout to a minimum of 30 seconds. Warn the
user if lower values are given in the configuration. Bugfix on
0.1.1.17-rc. Patch by Sebastian.
svn:r17657
"connecting" and it receives an "end" relay cell, the exit relay
would silently ignore the end cell and not close the stream. If
the client never closes the circuit, then the exit relay never
closes the TCP connection. Bug introduced in Tor 0.1.2.1-alpha;
reported by "wood".
svn:r17625
This patch makes every RELAY_COMMAND_END cell that we send pass through one of two functions: connection_edge_end and relay_send_end_cell_from_edge. Both of these functions check the circuit purpose, and change the reason to MISC if the circuit purpose means that it's for client use.
svn:r17612
This makes it easier for us to avoid errors where we we forgot to list a keyword as mandatory, and easier for Coverity to detect cases like this too.
svn:r17595
one guard from a given relay family. Otherwise we could end up with
all of our entry points into the network run by the same operator.
Suggested by Camilo Viecco. Fix on 0.1.1.11-alpha.
Not a backport candidate, since I think this might break for users
who only have a given /16 in their reachableaddresses, or something
like that.
svn:r17514
it's obsolete. which causes us to inform the user every time, even
though the user can't do anything about it other than get confused.
now it's an info-level log by default.
svn:r17206
we were complaining about no support for our one-hop streams,
when in fact choose_good_exit_server_general() has no business
caring about one-hop streams. patch from miner.
svn:r17181
The "ClientDNSRejectInternalAddresses" config option wasn't being
consistently obeyed: if an exit relay refuses a stream because its
exit policy doesn't allow it, we would remember what IP address
the relay said the destination address resolves to, even if it's
an internal IP address. Bugfix on 0.2.0.7-alpha; patch by rovv.
svn:r17135
Hidden services start out building five intro circuits rather
than three, and when the first three finish they publish a service
descriptor using those. Now we publish our service descriptor much
faster after restart.
svn:r17110
and fail to start. But dangerous permissions on
$datadir/cached-status/ would cause us to open a log and complain
there. Now complain to stdout and fail to start in both cases. Fixes
bug 820, reported by seeess.
svn:r16998
reachability testing circuits to do a bandwidth test -- if
we already have a connection to the middle hop of the testing
circuit, then it could establish the last hop by using the existing
connection. Bugfix on 0.1.2.2-alpha, exposed when we made testing
circuits no longer use entry guards in 0.2.1.3-alpha.
svn:r16997
rejected them in 0.1.0.15, because back in 2005 they were commonly
misconfigured and ended up as spam targets. We hear they are better
locked down these days.
svn:r16898
If not enough of our entry guards are available so we add a new
one, we might use the new one even if it overlapped with the
current circuit's exit relay (or its family). Anonymity bugfix
pointed out by rovv.
svn:r16698
Make dns resolver code more robust: handle nameservers with IPv6 addresses, make sure names in replies match requested names, make sure origin address of reply matches the address we asked.
svn:r16621
Initial conversion of uint32_t addr to tor_addr_t addr in connection_t and related types. Most of the Tor wire formats using these new types are in, but the code to generate and use it is not. This is a big patch. Let me know what it breaks for you.
svn:r16435
Move n_addr, n_port, and n_conn_id_digest fields of circuit_t into a separately allocated extend_info_t. Saves 22 bytes per connected circuit_t on 32-bit platforms, and makes me more comfortable with using tor_addr_t in place of uint32_t n_addr.
svn:r16257
Tor_addr_compare did a semantic comparison, such that ::1.2.3.4 and 1.2.3.4 were "equal". we sometimes need an exact comparison. Add a feature to do that.
svn:r16210
Make generic address manipulation functions work better. Switch address policy code to use tor_addr_t, so it can handle IPv6. That is a good place to start.
svn:r16178
Refactor the router_choose_random_node interface: any function with 10 parameters, most of which are boolean and one of which is unused, should get refactored like this.
svn:r16167
Refactor the is_vote field of networkstatus_t to add a third possibility ("opinion") in addition to vote and opinion. First part of implementing proposal 147.
svn:r16166
Patch from Christian Wilms: remove (HiddenService|Rend)(Exclude)?Nodes options. They never worked properly, and nobody seems to be using them. Resolves bug 754.
svn:r16144
In connection_edge_destroy, send a stream status control event when we have an AP connection. Previously, we would send an event when the connection was AP and non-AP at the same time. This didn't work so well. Patch from Anonymous Remailer (Austria). Backport candidate.
svn:r16143
Never allow a circuit to be created with the same circid as a circuit that has been marked for close. May be a fix for bug 779. Needs testing. Backport candidate.
svn:r16136
Add new ExcludeExitNodes option. Also add a new routerset type to handle Exclude[Exit]Nodes. It is optimized for O(1) membership tests, so as to make choosing a random router run in O(N_routers) time instead of in O(N_routers*N_Excluded_Routers).
svn:r16061
to just our our entry guards for the test circuits. Otherwise we
tend to have multiple test circuits going through a single entry
guard, which makes our bandwidth test less accurate. Fixes part
of bug 654; patch contributed by Josh Albrecht.
(Actually, modify Josh's patch to avoid doing that when you're
a bridge relay, since it would leak more than we want to leak.)
svn:r15850
their responses even for begin_dir conns. Now clients who only ever use
begin_dir connections still have a way to learn their IP address. Should
fix bug 737. Reported by goldy.
svn:r15571
as soon as you run out of working bridges, rather than waiting
for ten failures -- which will never happen if you have less than
ten bridges.
svn:r15368
If you have more than one bridge but don't know their keys,
you would only learn a request for the descriptor of the first one
on your list. (Tor considered launching requests for the others, but
found that it already had a connection on the way for $0000...0000
so it didn't open another.)
If you have more than one bridge but don't know their keys, and the
connection to one of the bridges failed, you would cancel all
pending bridge connections. (After all, they all have the same
digest.)
svn:r15366
set starting=1 to avoid potential bugs with having it conflict with 0,
which I used to mean uninitialized, when I realized I would be writing
many more lame-sounding paragraphs in the future. Just start it at 0
and handle the bugs.
svn:r15346
run out of reachable directory mirrors. Once upon a time reloading
it would set the 'is_running' flag back to 1 for them. It hasn't
done that for a long time.
svn:r15004
we actually have the descriptor listed in the consensus, not just
any descriptor, for each relay.
don't backport this patch (yet); who knows what it might do.
svn:r14971
the "do we have enough directory info?" calculation that checked
how many relays we believed to still be running based on our own
experience. So if we went offline, we never gave up trying to make
new circuits; worse, when we came back online we didn't recognize
that we should give all the relays another chance. Bugfix on
0.2.0.9-alpha; fixes bugs 648 and 675.
svn:r14970
Fwdport Bugfix: an authority signature is "unrecognized" if we lack a dirserver entry for it, even if we have an older cached certificate that says it is recognized. This affects clients who remove entries from their dirserver list without clearing their certificate cache.
svn:r14597
Forward-port: I had apparently broken OSX and Freebsd by not initializing threading before we initialize the logging system. This patch should do so, and fix bug 671.
svn:r14430
apply patch from lodger: reject requests for reverse-dns lookup of names in private address space. make non-exits reject all dns requests. Fixes bug 619.
svn:r14410
Do not allocate excess space for named_flag and unnamed_flag in dirvote.c. Fixes bug 662. Not a dangerous bug: sizeof(int*) is at least as big as sizeof(int) everywhere.
svn:r14391
Make dumpstats() log the size and fullness of openssl-internal buffers, so I can test my hypothesis that many of them are empty, and my alternative hypothesis that many of them are mostly empty, against the null hypothesis that we really need to be burning 32K per open OR connection on this.
svn:r14350
When we remove old routers, use Bloom filters rather than a digestmap-based set in order to tell which ones we absolutely need to keep. This will save us roughly a kazillion little short-lived allocations for hash table entries.
svn:r14318
Add a new SMARTLIST_FOREACH_JOIN macro to iterate through two sorted lists in lockstep. This happens at least 3 times in the code so far, and is likely to happen more in the future. Previous attempts to do so proved touchy, tricky, and error-prone: now, we only need to get it right in one place.
svn:r14309
New --hush command-line option similar to --quiet. While --quiet disables all
logging to the console on startup, --hush limits the output to messages of
warning and error severity.
svn:r14222
Only log a notice that dmalloc has been set up if it fails. Actually, since we have not added a temp log yet, I am not sure this ever does anything.
svn:r14216
Add new stacklike, free-all-at-once memory allocation strategy. Use it when parsing directory information. This helps parsing speed, and may well help fragmentation some too. hidden-service-related stuff still uses the old tokenizing strategies.
svn:r14194
streams. so they waited 120 seconds before timing out. this
was particularly bad during bootstrapping, if an authority is
down or not answering right.
svn:r14163
attempt, notice more quickly. Some of our bootstrapping attempts have a 60
second delay while we sit there wondering why we're getting no response.
svn:r14162
Only dump all guard node status to the log when the guard node status actually changes. Downgrade the 4 most common remaining INFO log messages to DEBUG.
svn:r14069
Part of fix for bug 617: allow connection_ap_handshake_attach_circuit() to mark connections, to avoid double-mark warnings. Note that this is an incomplete refactoring.
svn:r14066
get_interface_address6() fails regardless of the allocator used,
wever logging to the original severity of 0 causes an assert
error only with the bsd allocator. weird.
svn:r14005
Forward-port: Fix the SVK version detection logic to work right on a branch: tolerate multiple "copied from" tags and only look at the first.
svn:r13959
Fix bug spotted by mwenge: a server_event should not be a sever_event. Also, fix compile errors in config.c and control.c with --enable-gcc-warnings.
svn:r13957
The LOADCONF control command allows posting a config file to Tor
over the control interface. This config file is then loaded as if
it had been read from disk. Sending a HUP signal to Tor will make
it try to load its old config from disk again, thereby forgetting
the config loaded with this command.
svn:r13948
Change options_init_from_string() so that it returns different exit codes in the
error case, depending on what went wrong. Also push the responsibility to log
the error to the caller.
svn:r13947
Part of options_init_from_torrc()'s job was looking for -f flags (to specify
an alternate config file) on the command line, complaining if more than one
is given or the given does not exist. If none is given then use the compiled-in
default location, accepting if it does not exist. This logic has been moved
into its own function in an attemped to make options_init_from_torrc() easier
to deal with.
svn:r13942
Split the argv processing loop into two poarts, one that deals with
figuring out which conffile to use, and the other that figures out
which "command" (hash fingerprint, verify config, list fpr, run tor)
the user asked for.
There is a third part further down that imports command line args
into the config but that is not touched.
svn:r13941
Implement domain-selection for logging. Source is documented; needs documentation in manpage (maybe). For now, see doxygen comment on parse_log_severity_config in log.c
svn:r13875
in options_act() we set running_tor to options->command == CMD_RUN_TOR
once and used that in all but one place. Now we use running_tor in that
place also.
svn:r13819
authorities for their first directory fetch, even if their DirPort
is off or if they don't know they're reachable yet. This will help
them bootstrap better. Bugfix on 0.2.0.18-alpha; fixes bug 609.
svn:r13688
Fix all remaining shorten-64-to-32 errors in src/common. Some were genuine problems. Many were compatibility errors with libraries (openssl, zlib) that like predate size_t. Partial backport candidate.
svn:r13665
Do the last part of arma's fix for bug 437: Track the origin of every addrmap, and use this info so we can remove all the trackhostexits-originated mappings for a given exit.
svn:r13660
example, when answering a directory request), reset the
time-to-give-up timeout every time we manage to write something
on the socket. Bugfix on 0.1.2.x.
svn:r13643
Answer one xxx020 item; move 7 other ones to a new "XXX020rc" category: they should get fixed before we cut a release candidate. arma: please review these to see whether you have fixes/answers for any. Please check out the other 14 XXX020s to see if any look critical for the release candidate.
svn:r13640
Fix a bug that kept buf_find_string_offset from finding a string at the very end of the buffer. Add a unit test for this. Also, do not save a pointer to a chunk that might get reallocated by buf_pullup().
svn:r13635
add a flag to suppress overwriting the certificates file with new certificates, so we do not overwrite all certs when starting as an authority.
svn:r13630
Apply patch from Sebastian Hahn: stop imposing an arbitrary maximum on the number of file descriptors used for busy servers. Bug reported by Olaf Selke.
svn:r13626
would stop building circuits and start refusing connections after
24 hours, since we false believed that Tor was dormant. Reported
by nwf; bugfix on 0.1.2.x.
svn:r13583
Resolved problems with (re-)fetching hidden service descriptors.
Before, v0 descriptors were not fetched at all (fix on 0.2.0.18-alpha),
re-fetching of v2 descriptors did not stop when a v0 descriptor was
received (fix on 0.2.0.18-alpha), and re-fetching of v2 descriptors did
not work in all cases (fix on 0.2.0.19-alpha).
svn:r13540
Fix all but 2 DOCDOC items; defer many XXX020s (particularly those where fixing them would fix no bugs at the risk of introducing some bugs).
svn:r13529
Bugfix from Karsten: "Reversed r13439; v2 rendezvous descriptors were only re-fetched when a directory connection did not finish, not when a directory correctly replied with an error code like 404; bug found by nwf.
svn:r13492
Re-tune mempool parametes based on testing on peacetime: use smaller chuncks, free them a little more aggressively, and try very hard to concentrate allocations on fuller chunks. Also, lots of new documentation.
svn:r13484
Fix some XXX020s in command.c, and make it not-allowed to negotiate v1 using the v2 connection protocol: it is too hard to test, and pointless to support.
svn:r13460
More protocol negotiation work. Make the negotiation actually complete and set the state to open. Fix a crash bug that occured when we forcibly stopped the connection from writing.
svn:r13434
Add a bunch more code documentation; change the interface of fetch_var_cell_from_buf() so it takes the current link protocol into account and can't get confused by weird command bytes on v1 connections.
svn:r13430
Add a couple of (currently disabled) strategies for trying to avoid using too much ram in memory pools: prefer putting new cells in almost-full chunks, and be willing to free the last empty chunk if we have not needed it for a while. Also add better output to mp_pool_log_status to track how many mallocs a given memory pool strategy is saving us, so we can tune the mempool parameters.
svn:r13428
Fix some warnings identified by building with -D_FORTIFY_SOURCE=2. Remove a redundant (and nuts) definition of _FORTIFY_SOURCE from eventdns.c.
svn:r13424
Be more thorough about memory poisoning and clearing. Add an in-place version of aes_crypt in order to remove a memcpy from relay_crypt_one_payload.
svn:r13414
The SSL portion of the revised handshake now seems to work: I just finally got a client and a server to negotiate versions. Now to make sure certificate verification is really happening, connections are getting opened, etc.
svn:r13409
Add more documentation; change the behavior of read_to_buf_tls to be more consistent. Note a longstanding problem with current read/write interfaces.
svn:r13407
even when the local resolv.conf file is missing, broken, or contains
only unusable nameservers.
Now I can run a local network on my laptop when I'm on an airplane.
svn:r13402
otherwise i'll keep asking other authorities for it, which probably
isn't the best way to get it. this made bootstrapping a new network
very hard.
svn:r13400
Initial attempts to track down bug 600, and refactor possibly offending code. 1) complain early if circuit state is set to OPEN when an onionskin is pending. 2) refactor onionskin field into one only used when n_conn is pending, and a separate onionskin field waiting for attention by a cpuworker. This might even fix the bug. More likely, it will make it fail with a more useful core.
svn:r13394
Fix some XXX020 items in control.c: add a maximum line length and note that the number of versioning authorities is no longer apparent to clients.
svn:r13390
Check for correctness of AuthDir* options in options_validate; check for possible bugs where options_validate() is happy but parse_policies_from_options() is sad.
svn:r13384
As planned, rename networkstatus_vote_t to networkstatus_t, now that v3 networkstatuses are working and standard and v2 networkstatuses are obsolete.
svn:r13383
Periodically check whether we have an expired consensus networkstatus. If we do, and we think we have enough directory info, then call router_dir_info_changed(). Fixes bug 401. This bug was deferred from 0.1.2.x, but fixing it there is nontrivial.
svn:r13342
Fix bug 597: stop telling people to email Tor-ops. Also give a better suggestion when some other identity has been assigned the nickname we are using.
svn:r13337
Tor can warn and/or refuse connections to ports commonly used with
vulnerable-plaintext protocols.
We still need to figure out some good defaults for them.
svn:r13198
Fix an assert if we post a general-purpose descriptor via the
control port but that descriptor isn't mentioned in our current
network consensus. Bug reported by Jon McLachlan; bugfix on
0.2.0.9-alpha.
svn:r13153
Basic hacks to get TLS handshakes working: remove dead code; fix post-handshake logic; keep servers from writing while the client is supposed to be renegotiating. This may work. Needs testing.
svn:r13122
If we do not serve v2 directory info, and our cached v2 networkstatus files are very old, remove them. If the directory is old, remove that too. (We already did this for obsolete routers files.)
svn:r13096
Change set_current_consensus interface to take a flags variable. Do not try to fetch certificates until after we have tried loading the fallback consensus. Should fix bug 583.
svn:r13058
Bugfix on fix for 557: Make values containing special characters work right with getconf, setconf, and saveconf. Document this in control-spec.txt
svn:r13056
Consequence of fix for 539: when a client gets a 503 response with a nontrivial body, pretend it got a 200 response. This lets clients use information erroneously sent to them by old buggy servers.
svn:r13054
Add a reverse mapping from SSL to tor_tls_t*: we need this in order to do a couple of things the sensible way from inside callbacks. Also, add a couple of missing cases in connection_or.c
svn:r13040
Fix bug 579: Count DNSPort and hidden services when checking whether Tor is going to do anything. Change "no configured ports" from fatal to warning.
svn:r13036
Push the strdups used for parsing configuration lines into parse_line_from_string(). This will make it easier to parse more complex value formats, which in turn will help fix bug 557
svn:r13020
Fix bug 575: protect the list of logs with a mutex. I couldn't find any appreciable change in logging performance on osx, but ymmv. You can undef USE_LOG_MUTEX to see if stuff gets faster for you.
svn:r13019
Use reference-counting to avoid allocating a zillion little addr_policy_t objects. (This is an old patch that had been sitting on my hard drive for a while.)
svn:r13017
Here, have some terribly clever new buffer code. It uses a mbuf-like strategy rather than a ring buffer strategy, so it should require far far less extra memory to hold any given amount of data. Also, it avoids access patterns like x=malloc(1024);x=realloc(x,1048576);x=realloc(x,1024);append_to_freelist(x) that might have been contributing to memory fragmentation. I've tested it out a little on peacetime, and it seems to work so far. If you want to benchmark it for speed, make sure to remove the #define PARANOIA; #define NOINLINE macros at the head of the module.
svn:r12983
for a v2 or v3 networkstatus object before we were prepared. This
was particularly bad for 0.2.0.13 and later bridge relays, who
would never have a v2 networkstatus and would thus always crash
when used. Bugfixes on 0.2.0.x.
Estimate the v3 networkstatus size more accurately, rather than
estimating it at zero bytes and giving it artificially high priority
compared to other directory requests. Bugfix on 0.2.0.x.
svn:r12952
When we load a bridge descriptor from the cache,
and it was previously unreachable, mark it as retriable so we won't
just ignore it. Also, try fetching a new copy immediately.
svn:r12950
Refactor circuit_launch* functions to take a bitfield of flags rather than 4 separate nonconsecutive flags arguments. Also, note a possible but in circuit_find_to_cannibalize, which seems to be ignoring its purpose argument.
svn:r12948
time you use a given introduction point for your service, but
on subsequent requests we'd be using garbage memory. Fixed by
Karsten Loesing. Bugfix on 0.2.0.12-alpha.
svn:r12913
unexpected (it used to be in our networkstatus when we started
fetching it, but it isn't in our current networkstatus), and we
aren't using bridges. Bugfix on 0.2.0.x.
svn:r12911
create the "cached-status" directory in their datadir. All Tors
used to create it. Bugfix on 0.1.2.x.
Bridge relays with DirPort set to 0 no longer cache v1 or v2
directory information; there's no point. Bugfix on trunk.
svn:r12887
When we decide to send a 503 in response to a request for server descriptors, disable spooling so that we do not then send the descriptors anyway. Fixes bug 539.
svn:r12882
AlternateBridgeAuthority, and AlternateHSAuthority) that let the
user selectively replace the default directory authorities, rather
than the all-or-nothing replacement that DirServer offers.
svn:r12777
using bridges or we have StrictEntryNodes set), don't mark relays
down when they fail a directory request. Otherwise we're too quick
to mark all our entry points down.
svn:r12755
authorities to mark certain relays as "bad directories" in the
networkstatus documents. Also supports the "!baddir" directive in
the approved-routers file.
svn:r12754
downloading new consensus documents. Bridge users now wait until
the end of the interval, so their bridge will be sure to have a
new consensus document.
svn:r12696
running an obsolete version, it used the string "OLD" to describe
it. Yet the "getinfo" interface used the string "OBSOLETE". Now use
"OBSOLETE" in both cases.
svn:r12686
on but your ORPort is off.
Add a new config option BridgeRelay that specifies you want to
be a bridge relay. Right now the only difference is that it makes
you answer begin_dir requests, and it makes you cache dir info,
even if your DirPort isn't on.
Refactor directory_caches_dir_info() into some more functions.
svn:r12668
Change tor_addr_t to be a tagged union of in_addr and in6_addr, not of sockaddr_in and sockaddr_in6. It's hardly used in the main code as it is, but let's get it right before it gets popular.
svn:r12660
server-side code (for when v2 negotiation occurred) to check for renegotiation and adjust client ID info accordingly. server-side of new TLS code is now implemented, but needs testing and debugging.
svn:r12624
Add support to get a callback invoked when the client renegotiate a connection. Also, make clients renegotiate. (not enabled yet, until they detect that the server acted like a v2 server)
svn:r12623
Start getting freaky with openssl callbacks in tortls.c: detect client ciphers, and if the list doesn't look like the list current Tors use, present only a single cert do not ask for a client cert. Also, support for client-side renegotiation. None of this is enabled unless you define V2_HANDSHAKE_SERVER.
svn:r12622
that have no introduction points. But Tor crashed when we tried
to build a descriptor with no intro points (and it would have
crashed if we had tried to parse one). Bugfix on 0.2.0.x; patch
by Karsten Loesing.
svn:r12579
enough directory information. This was causing us to always pick
two new guards on startup (bugfix on 0.2.0.9-alpha), and it was
causing us to discard all our guards on startup if we hadn't been
running for a few weeks (bugfix on 0.1.2.x). Fixes bug 448.
svn:r12570
When authorities detected more than two relays running on the same
IP address, they were clearing all the status flags but forgetting
to clear the "hsdir" flag. So clients were being told that a
given relay was the right choice for a v2 hsdir lookup, yet they
never had its descriptor because it was marked as 'not running'
in the consensus.
svn:r12515
the bridge authority could help us (for example, we don't know
a digest, or there is no bridge authority), don't be so eager to
fall back to asking the bridge authority.
svn:r12512
Nov 16 02:20:50.089 [info] launch_router_descriptor_downloads(): There are not many downloadable routerdescs, but we haven't tried downloading descriptors recently. Downloading.
Get rid of the second line.
svn:r12510
non-running hsdirs, or not give them the flag if they're not running,
or what.
When picking v2 hidden service directories, don't pick ones that
aren't listed as Running.
svn:r12509
relay's public (external) IP address too, unless
ExitPolicyRejectPrivate is turned off. We do this because too
many relays are running nearby to services that trust them based
on network address.
svn:r12459
Mess with the formula for the Guard flag again. Now it requires that you be in the most familiar 7/8 of nodes, and have above median wfu for that 7/8th. See spec for details. Also, log thresholds better.
svn:r12440
Keep track, for each OR connection, of the last time we added a non-padding cell to its outbuf. Use this timestamp, not "lastwritten" to tell if it is time to close a circuitless connection. (We can'tuse lastwritten, since lastwritten is updated when ever the connection flushes anything, and by that point we can no longer tell what is a padding cell and what is not.)
svn:r12437
Clean up log messages from bug 543 fix, and make old_routers also keep track of their indices. This will probably crash some until all the bugs are fixed.
svn:r12412
Define SHARE_DATADIR, LOCALSTATEDIR, and BINDIR in Makefile.am as autoconf recommends. Do not move CONFDIR yet, since we seem to support overriding it in a weird way. Resolves bug 542.
svn:r12376
consensus and certs cached in our datadirectory: we were
caching the consensus in consensus_waiting_for_certs but then
free'ing it right after.
more bugs remain here, i think.
svn:r12369
Try to make hidden service directory lookup functions a bit more efficient: go for fewer O(n) operations, and look at the consensus rather than the routerinfo list.
svn:r12361
Nov 03 11:15:13.129 [info] networkstatus_set_current_consensus(): Got a consensus we already have
Nov 03 11:15:13.129 [warn] Unable to load consensus directory dowloaded from server '86.59.21.38:80'
svn:r12359
As an authority, send back an X-Descriptor-Not-New header when we accept but do not store a descriptor. Partial implementation of fix for bug 535.
svn:r12310
Improved skew reporting: "You are 365 days in the duture" is more useful than "You are 525600 minutes in the future". Also, when we get something that proves we are at least an hour in the past, tell the controller "CLOCK_SKEW MIN_SKEW=-3600" rather than just "CLOCK_SKEW"
svn:r12283
Implement a FallbackNetworkstatusFile (default to $prefix/share/tor/fallback-consensus) to that we know about lots of directory servers and routers when we start up the first time.
svn:r12259
Tidy v2 hidden service descriptor format code: fix memory leaks, fix reference problems, note magic numbers, note questions, remove redundant checks, remove a possible stack smashing bug when encoding a descriptor with no protocols supported.
svn:r12255
edge_connection_t: want_onehop if it must attach to a circuit with
only one hop (e.g. for the current tunnelled connections that use
begin_dir), and use_begindir if we mean to use a BEGIN_DIR relay
command to establish the stream rather than the normal BEGIN. Now
we can make anonymized begin_dir connections for (e.g.) more secure
hidden service posting and fetching.
svn:r12244
Stop servers from crashing if they set a Family option (or
maybe in other situations too). Bugfix on 0.2.0.9-alpha; reported
by Fabian Keil.
svn:r12235
Keep circuitless TLS connections open for 1.5 x MaxCircuitDirtiness: this ensures that we don't thrash closing and repoening connections to our guards.
svn:r12218
Fix logic for downloading consensuses: make getting an duplicate or not-currently-valid consensus count as a failure. Make running out of time to get certificates count as a failure. Delay while fetching certificates.
svn:r12159
Refactor the arguments for router_pick_{directory_|trusteddir}server[_impl] so that they all take the same flags, and so that their flags have names. Fix their documentation too.
svn:r12157
minutes, but writing the "valid-after" line in our vote based
on our configured V3AuthVotingInterval: so unless the intervals
matched up, we immediately rejected our own vote because it didn't
start at the voting interval that caused us to construct a vote.
This caused log entries like:
Oct 23 01:16:16.303 [notice] Choosing expected valid-after time
as 2007-10-23 05:30:00: consensus_set=0, interval=1800
...
Oct 23 01:20:01.203 [notice] Choosing valid-after time in vote as
2007-10-23 06:00:00: consensus_set=0, interval=3600
Oct 23 01:20:01.290 [warn] Rejecting vote with valid-after time of
2007-10-23 06:00:00; we were expecting 2007-10-23 05:30:00
Oct 23 01:20:01.291 [warn] Couldn't store my own vote! (I told
myself, 'Bad valid-after time'.)
Nick, you should look at this, as it's your design. :)
svn:r12129
Make authorities start accepting (and advertising their acceptance of) consensus method 2. If all goes well, we'll have a working Unnamed flag. Otherwise, we'll have a fun backtrace.
svn:r12113
Remove an unused and unneeded layer of abstraction: we only have one store for routers. (I had thought we might need a second one for annotated routers, but that's silly.
svn:r12101
More fixes for bad behavior when downloading extrainfos: do not download an ei if we lack the key to verify it, and do not download it if we already got it and found (weirdly) that it didn't match the corresponding server descriptor.
svn:r12071
Respond to INT and TERM SIGNAL commands before we execute the
signal, in case the signal shuts us down. We had a patch in
0.1.2.1-alpha that tried to do this by queueing the response on
the connection's buffer before shutting down, but that really
isn't the same thing. Bug located by Matt Edman.
This is a bug in 0.1.2.x too, but there's no way we should backport
this fix. Speaking of which, can somebody double-check it? :)
svn:r12070
When we decode to use consensus method 2 or later, compute Unnamed and Named more or less as described in 122. Don't actually use consensus method 2 yet, so we can be sure we didn't screw up v1..
svn:r12055
Remember the valid-until time of the most recent consensus that listed
a router, and (if we are a cache) never delete the routerdesc until
that conensus is expired. This is way easier than retaining multiple
consensuses. (Of course, the info isn't retained across restarts,
but that only affects a few caches at a time.)
svn:r12041
"if (!router_get_trusted_dirservers())" is a bad test: router_get_trusted_dirservers() always returns a list. Instead, check for whether the list is empty.
svn:r12013
When a networkstatus consensus download fails, do not wait 60 seconds to decide whether to retry. (Also, log the time at which we'll try to replace the current networkstatus.)
svn:r12005
oprofile was telling me that a fair bit of our time in openssl was spent in base64_decode, so replace base64_decode with an all-at-once fairly optimized implementation. For decoding keys and digests, it seems 3-3.5x faster than calling out to openssl. (Yes, I wrote it from scratch.)
svn:r12002
Make unverified-consensus get removed when it is accepted or rejected. Make a new get_datadir_fname*() set of functions to eliminate the common code of "get the options, get the datadir, append some stuff".
svn:r12000
Implement v3 networkstatus client code. Remove v2 networkstatus client code, except as needed for caches to fetch and serve v2 networkstatues and the routers they list.
svn:r11957
Make discard_old_votes part of the consensus publishing process, so we conform to spec, and so we avoid a weird bugs where publishing sets the consensus, setting the consensus makes us reschedule, and rescheduling makes us delay vote-discarding.
svn:r11944
when we find our DirPort to be reachable but won't actually publish
it. Extra descriptors without any real changes are dropped by the
authorities, and can screw up our "publish every 18 hours" schedule.
svn:r11915