Sets:
* Documentation
* Logging domain
* Configuration option
* Scheduled event
* Makefile
It also creates status.c and the log_heartbeat() function.
All code was written by Sebastian Hahn. Commit message was
written by me (George Kadianakis).
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
(Backport from 0.2.2's 5ed73e3807)
rransom noticed that a change of ORPort is just as bad as a change of IP
address from a client's perspective, because both mean that the relay is
not available to them while the new information hasn't propagated.
Change the bug1035 fix accordingly.
Also make sure we don't log a bridge's IP address (which might happen
when we are the bridge authority).
It is often not entirely clear what options Tor was built with, so it
might not be immediately obvious which config file Tor is using when it
found one. Log the config file at startup.
When calling circuit_build_times_shuffle_and_store_array, we were
passing a uint32_t as an int. arma is pretty sure that this can't
actually cause a bug, because of checks elsewhere in the code, but
it's best not to pass a uint32_t as an int anyway.
Found by doorss; fix on 0.2.2.4-alpha.
We detect and reject said attempts if there is no chosen exit node or
circuit: connecting to a private addr via a randomly chosen exit node
will usually fail (if all exits reject private addresses), is always
ill-defined (you're not asking for any particular host or service),
and usually an error (you've configured all requests to go over Tor
when you really wanted to configure all _remote_ requests to go over
Tor).
This can also help detect forwarding loop requests.
Found as part of bug2279.
Left circuit_build_times_get_bw_scale() uncommented because it is in the wrong
place due to an improper bug2317 fix. It needs to be moved and renamed, as it
is not a cbt parameter.
To quote arma: "So instead of stopping your CBT from screaming, you're just
going to throw it in the closet and hope you can't hear it?"
Yep. The log message can happen because at 95% point on the curve, we can be
way beyond the max timeout we've seen, if the curve has few points and is
shallow.
Also applied Nick's rule of thumb for rewriting some other notice log messages
to read like how you would explain them to a raving lunatic on #tor who was
shouting at you demanding what they meant. Hopefully the changes live up to
that standard.
If we got a signed digest that was shorter than the required digest
length, but longer than 20 bytes, we would accept it as long
enough.... and then immediately fail when we want to check it.
Fixes bug 2409; bug in 0.2.2.20-alpha; found by piebeer.
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
When we stopped using svn, 0.2.1.x lost the ability to notice its svn
revision and report it in the version number. However, it kept
looking at the micro-revision.i file... so if you switched to master,
built tor, then switched to 0.2.1.x, you'd get a micro-revision.i file
from master reported as an SVN tag. This patch takes out the "include
the svn tag" logic entirely.
Bugfix on 0.2.1.15-rc; fixes bug 2402.
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
We need to make sure that the worst thing that a weird consensus param
can do to us is to break our Tor (and only if the other Tors are
reliably broken in the same way) so that the majority of directory
authorities can't pull any attacks that are worse than the DoS that
they can trigger by simply shutting down.
One of these worse things was the cbtnummodes parameter, which could
lead to heap corruption on some systems if the value was sufficiently
large.
This commit fixes this particular issue and also introduces sanity
checking for all consensus parameters.
Our public key functions assumed that they were always writing into a
large enough buffer. In one case, they weren't.
(Incorporates fixes from sebastian)
In dnsserv_resolved(), we carefully made a nul-terminated copy of the
answer in a PTR RESOLVED cell... then never used that nul-terminated
copy. Ouch.
Surprisingly this one isn't as huge a security problem as it could be.
The only place where the input to dnsserv_resolved wasn't necessarily
nul-terminated was when it was called indirectly from relay.c with the
contents of a relay cell's payload. If the end of the payload was
filled with junk, eventdns.c would take the strdup() of the name [This
part is bad; we might crash there if the cell is in a bad part of the
stack or the heap] and get a name of at least length
495[*]. eventdns.c then rejects any name of length over 255, so the
bogus data would be neither transmitted nor altered.
[*] If the name was less than 495 bytes long, the client wouldn't
actually be reading off the end of the cell.
Nonetheless this is a reasonably annoying bug. Better fix it.
Found while looking at bug 2332, reported by doorss. Bugfix on
0.2.0.1-alpha.
Previously, we only looked at up to 128 bytes. This is a bad idea
since socks messages can be at least 256+x bytes long. Now we look at
up to 512 bytes; this should be enough for 0.2.2.x to handle all valid
SOCKS messages. For 0.2.3.x, we can think about handling trickier
cases.
Fixes 2330. Bugfix on 0.2.0.16-alpha.
Right now, Tor routers don't save the maxima values from the
bw_history_t between sessions. That's no good, since we use those
values to determine bandwidth. This code adds a new BWHist.*Maximum
set of values to the state file. If they're not present, we estimate
them by taking the observed total bandwidth and dividing it by the
period length, which provides a lower bound.
This should fix bug 1863. I'm calling it a feature.
Previously, our state parsing code would fail to parse a bwhist
correctly if the Interval was anything other than the default
hardcoded 15 minutes. This change makes the parsing less incorrect,
though the resulting history array might get strange values in it if
the intervals don't match the one we're using. (That is, if stuff was
generated in 15 minute intervals, and we read it into an array that
expects 30 minute intervals, we're fine, since values can be combined
pairwise. But if we generate data at 30 minute intervals and read it
into 15 minute intervals, alternating buckets will be empty.)
Bugfix on 0.1.1.11-alpha.
The trick of looping from i=0..4 , switching on i to set up some
variables, then running some common code is much better expressed by
just calling a function 4 times with 4 sets of arguments. This should
make the code a little easier to follow and maintain here.
An object, you'll recall, is something between -----BEGIN----- and
-----END----- tags in a directory document. Some of our code, as
doorss has noted in bug 2352, could assert if one of these ever
overflowed SIZE_T_CEILING but not INT_MAX. As a solution, I'm setting
a maximum size on a single object such that neither of these limits
will ever be hit. I'm also fixing the INT_MAX checks, just to be sure.
When using libevent 2, we use evdns_base_resolve_*(). When not, we
fake evdns_base_resolve_*() using evdns_resolve_*().
Our old check was looking for negative values (like libevent 2
returns), but our eventdns.c code returns 1. This code makes the
check just test for nonzero.
Note that this broken check was not for _resolve_ failures or even for
failures to _launch_ a resolve: it was for failures to _create_ or
_encode_ a resolve request.
Bug introduced in 81eee0ecfff3dac1e9438719d2f7dc0ba7e84a71; found by
lodger; uploaded to trac by rransom. Bug 2363. Fix on 0.2.2.6-alpha.
This was originally a patch provided by pipe
(http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html) to
provide a method for controllers to query the total amount of traffic
tor has handled (this is a frequently requested piece of information
by relay operators).
C99 allows a syntax for structures whose last element is of
unspecified length:
struct s {
int elt1;
...
char last_element[];
};
Recent (last-5-years) autoconf versions provide an
AC_C_FLEXIBLE_ARRAY_MEMBER test that defines FLEXIBLE_ARRAY_MEMBER
to either no tokens (if you have c99 flexible array support) or to 1
(if you don't). At that point you just use offsetof
[STRUCT_OFFSET() for us] to see where last_element begins, and
allocate your structures like:
struct s {
int elt1;
...
char last_element[FLEXIBLE_ARRAY_MEMBER];
};
tor_malloc(STRUCT_OFFSET(struct s, last_element) +
n_elements*sizeof(char));
The advantages are:
1) It's easier to see which structures and elements are of
unspecified length.
2) The compiler and related checking tools can also see which
structures and elements are of unspecified length, in case they
wants to try weird bounds-checking tricks or something.
3) The compiler can warn us if we do something dumb, like try
to stack-allocate a flexible-length structure.
We were not decrementing "available" every time we did
++next_virtual_addr in addressmap_get_virtual_address: we left out the
--available when we skipped .00 and .255 addresses.
This didn't actually cause a bug in most cases, since the failure mode
was to keep looping around the virtual addresses until we found one,
or until available hit zero. It could have given you an infinite loop
rather than a useful message, however, if you said "VirtualAddrNetwork
127.0.0.255/32" or something broken like that.
Spotted by cypherpunks
We were decrementing "available" twice for each in-use address we ran
across. This would make us declare that we ran out of virtual
addresses when the address space was only half full.
evbuffer_pullup does nothing and returns NULL if the caller asks it to
linearize more data than the buffer contains.
Introduced in 9796b9bfa6.
Reported by piebeer; fixed with help from doors.
If a SOCKS5 client insists on authentication, allow it to
negotiate a connection with Tor's SOCKS server successfully.
Any credentials the client provides are ignored.
This allows Tor to work with SOCKS5 clients that can only
support 'authenticated' connections.
Also add a bunch of basic unit tests for SOCKS4/4a/5 support
in buffers.c.
If you had TIME_MAX > INT_MAX, and your "time_to_exhaust_bw =
accountingmax/expected_bandwidth_usage * 60" calculation managed to
overflow INT_MAX, then your time_to_consider value could underflow and
wind up being rediculously low or high. "Low" was no problem;
negative values got caught by the (time_to_consider <= 0) check.
"High", however, would get you a wakeup time somewhere in the distant
future.
The fix is to check for time_to_exhaust_bw overflowing INT_MAX, not
TIME_MAX: We don't allow any accounting interval longer than a month,
so if time_to_exhaust_bw is significantly larger than 31*24*60*60, we
can just clip it.
This is a bugfix on 0.0.9pre6, when accounting was first introduced.
It fixes bug 2146, unless there are other causes there too. The fix
is from boboper. (I tweaked it slightly by removing an assignment
that boboper marked as dead, and lowering a variable that no longer
needed to be function-scoped.)
The old logic would have us fetch from authorities if we were refusing
unknown exits and our exit policy was reject*. Instead, we want to
fetch from authorities if we're refusing unknown exits and our exit
policy is _NOT_ reject*.
Fixed by boboper. Fixes more of 2097. Bugfix on 0.2.2.16-alpha.
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
This is not the most beautiful fix for this problem, but it is the simplest.
Bugfix for 2205. Thanks to Sebastian and Mashael for finding the
bug, and boboper/cypherpunks for figuring out why it was happening
and how to fix it, and for writing a few fixes.
The reason the "streams problem" occurs is due to the complicated
interaction between Tor's congestion control and libevent. At some point
during the experiment, the circuit window is exhausted, which blocks all
edge streams. When a circuit level sendme is received at Exit, it
resumes edge reading by looping over linked list of edge streams, and
calling connection_start_reading() to inform libevent to resume reading.
When the streams are activated again, Tor gets the chance to service the
first three streams activated before the circuit window is exhausted
again, which causes all streams to be blocked again. As an experiment,
we reversed the order in which the streams are activated, and indeed the
first three streams, rather than the last three, got service, while the
others starved.
Our solution is to change the order in which streams are activated. We
choose a random edge connection from the linked list, and then we
activate streams starting from that chosen stream. When we reach the end
of the list, then we continue from the head of the list until our chosen
stream (treating the linked list as a circular linked list). It would
probably be better to actually remember which streams have received
service recently, but this way is simple and effective.
Sebastian notes (and I think correctly) that one of our ||s should
have been an &&, which simplifies a boolean expression to decide
whether to replace bridges. I'm also refactoring out the negation at
the start of the expression, to make it more readable.
Pick 5 seconds as the limit. 5 seconds is a compromise here between
making sure the user notices that the bad behaviour is (still) happening
and not spamming their log too much needlessly (the log message is
pretty long). We also keep warning every time if safesocks is
specified, because then the user presumably wants to hear about every
blocked instance.
(This is based on the original patch by Sebastian, then backported to
0.2.2 and with warnings split into their own function.)
Our checks that we don't exceed the 50 KB size limit of extra-info
descriptors apparently failed. This patch fixes these checks and reserves
another 250 bytes for appending the signature. Fixes bug 2183.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Having very long single lines with lots and lots of things in them
tends to make files hard to diff and hard to merge. Since our tools
are one-line-at-a-time, we should try to construct lists that way too,
within reason.
This incidentally turned up a few headers in configure.in that we were
for some reason searching for twice.
We would never actually enforce multiplicity rules when parsing
annotations, since the counts array never got entries added to it for
annotations in the token list that got added by earlier calls to
tokenize_string.
Found by piebeer.
We had a spelling discrepancy between the manpage and the source code
for some option. Resolve these in favor of the manpage, because it
makes more sense (for example, HTTP should be capitalized).
The code that makes use of the RunTesting option is #if 0, so setting
this option has no effect. Mark the option as obsolete for now, so that
Tor doesn't list it as an available option erroneously.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
We need filtering bufferevent_openssl so that we can wrap around
IOCP bufferevents on Windows. This patch adds a temporary option to
turn on filtering mode, so that we can test it out on non-IOCP
systems to make sure it hasn't got any surprising bugs.
It also fixes some allocation/teardown errors in using
bufferevent_openssl as a filter.
Instead of rejecting a value that doesn't divide into 1 second, round to
the nearest divisor of 1 second and warn.
Document that the option only controls the granularity written by Tor to a
file or console log. It does not (for example) "batch up" log messages to
affect times logged by a controller, times attached to syslog messages, or
the mtime fields on log files.
In the case where old_router == NULL but sdmap has an entry for the
router, we can currently safely infer that the old_router was not a
bridge. Add an assert to ensure that this remains true, and fix the
logic not to die with the tor_assert(old_router) call.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
Mainly, this comes from turning two lists that needed to be kept in
synch into a single list of structs. This should save a little RAM,
and make the code simpler.
There's no reason to keep a time_t and a struct timeval to represent
the same value: highres_created.tv_sec was the same as timestamp_created.
This should save a few bytes per circuit.
Our old code correctly called bufferevent_flush() on linked
connections to make sure that the other side got an EOF event... but
it didn't call bufferevent_flush() when the connection wasn't
hold_open_until_flushed. Directory connections don't use
hold_open_until_flushed, so the linked exit connection never got an
EOF, so they never sent a RELAY_END cell to the client, and the
client never concluded that data had arrived.
The solution is to make the bufferevent_flush() code apply to _all_
closing linked conns whose partner is not already marked for close.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
First start of a fix for bug2001, but my test network still isn't
working: the client and the server send each other VERSIONS cells,
but never notice that they got them.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
Some of these functions only work for routerinfo-based nodes, and as
such are only usable for advisory purposes. Fortunately, our uses
of them are compatible with this limitation.
Also, make the NodeFamily option into a list of routersets. This
lets us git rid of router_in_nickname_list (or whatever it was
called) without porting it to work with nodes, and also lets people
specify country codes and IP ranges in NodeFamily
This was the only flag in routerstatus_t that we would previously
change in a routerstatus_t in a consensus. We no longer have reason
to do so -- and probably never did -- as you can now confirm more
easily than you could have done by grepping for is_running before
this patch.
The name change is to emphasize that the routerstatus_t is_running
flag is only there to tell you whether the consensus says it's
running, not whether it *you* think it's running.
A node_t is an abstraction over routerstatus_t, routerinfo_t, and
microdesc_t. It should try to present a consistent interface to all
of them. There should be a node_t for a server whenever there is
* A routerinfo_t for it in the routerlist
* A routerstatus_t in the current_consensus.
(note that a microdesc_t alone isn't enough to make a node_t exist,
since microdescriptors aren't usable on their own.)
There are three ways to get a node_t right now: looking it up by ID,
looking it up by nickname, and iterating over the whole list of
microdescriptors.
All (or nearly all) functions that are supposed to return "a router"
-- especially those used in building connections and circuits --
should return a node_t, not a routerinfo_t or a routerstatus_t.
A node_t should hold all the *mutable* flags about a node. This
patch moves the is_foo flags from routerinfo_t into node_t. The
flags in routerstatus_t remain, but they get set from the consensus
and should not change.
Some other highlights of this patch are:
* Looking up routerinfo and routerstatus by nickname is now
unified and based on the "look up a node by nickname" function.
This tries to look only at the values from current consensus,
and not get confused by the routerinfo_t->is_named flag, which
could get set for other weird reasons. This changes the
behavior of how authorities (when acting as clients) deal with
nodes that have been listed by nickname.
* I tried not to artificially increase the size of the diff here
by moving functions around. As a result, some functions that
now operate on nodes are now in the wrong file -- they should
get moved to nodelist.c once this refactoring settles down.
This moving should happen as part of a patch that moves
functions AND NOTHING ELSE.
* Some old code is now left around inside #if 0/1 blocks, and
should get removed once I've verified that I don't want it
sitting around to see how we used to do things.
There are still some unimplemented functions: these are flagged
with "UNIMPLEMENTED_NODELIST()." I'll work on filling in the
implementation here, piece by piece.
I wish this patch could have been smaller, but there did not seem to
be any piece of it that was independent from the rest. Moving flags
forces many functions that once returned routerinfo_t * to return
node_t *, which forces their friends to change, and so on.
The node_t type is meant to serve two key functions:
1) Abstracting difference between routerinfo_t and microdesc_t
so that clients can use microdesc_t instead of routerinfo_t.
2) Being a central place to hold mutable state about nodes
formerly held in routerstatus_t and routerinfo_t.
This patch implements a nodelist type that holds a node for every
router that we would consider using.
When picking bridges (or other nodes without a consensus entry (and
thus no bandwidth weights)) we shouldn't just trust the node's
descriptor. So far we believed anything between 0 and 10MB/s, where 0
would mean that a node doesn't get any use from use unless it is our
only one, and 10MB/s would be a quite siginficant weight. To make this
situation better, we now believe weights in the range from 20kB/s to
100kB/s. This should allow new bridges to get use more quickly, and
means that it will be harder for bridges to see almost all our traffic.
In the first 100 circuits, our timeout_ms and close_ms
are the same. So we shouldn't transition circuits to purpose
CIRCUIT_PURPOSE_C_MEASURE_TIMEOUT, since they will just timeout again
next time we check.
We really should ignore any timeouts that have *no* network activity for their
entire measured lifetime, now that we have the 95th percentile measurement
changes. Usually this is up to a minute, even on fast connections.
If we really want all this complexity for these stages here, we need to handle
it better for people with large timeouts. It should probably go away, though.
Rechecking the timeout condition was foolish, because it is checked on the
same codepath. It was also wrong, because we didn't round.
Also, the liveness check itself should be <, and not <=, because we only have
1 second resolution.
Specifically, a circ attempt that we'd launched while the network was
down could timeout after we've marked our entrynodes up, marking them
back down again. The fix is to annotate as bad the OR conns that were
around before we did the retry, so if a circuit that's attached to them
times out we don't do anything about it.
This is needed for IOCP, since telling the IOCP backend about all
your CPUs is a good idea. It'll also come in handy with asn's
multithreaded crypto stuff, and for people who run servers without
reading the manual.
automake 1.6 doesn't like using a conditional += to add stuff to foo_LDADD.
Instead you need to conditionally define a variable, then non-conditionally
put that variable in foo_LDADD.
This requires the latest Git version of Libevent as of 24 March 2010.
In the future, we'll just say it requires Libevent 2.0.5-alpha or
later.
Since Libevent doesn't yet support hierarchical rate limit groups,
there isn't yet support for tracking relayed-bytes separately when
using the bufferevent system. If a future version does add support
for hierarchical buckets, we can add that back in.
Roger correctly pointed out that my code was broken for accounting
periods that shifted forwards, since
start_of_accounting_period_containing(interval_start_time) would not
be equal to interval_start_time, but potentially much earlier.
The RefuseUnknownExits config option is now a tristate, with "1"
meaning "enable it no matter what the consensus says", "0" meaning
"disable it no matter what the consensus says", and "auto" meaning "do
what the consensus says". If the consensus is silent, we enable
RefuseUnknownExits.
This patch also changes the dirserv logic so that refuseunknownexits
won't make us cache unless we're an exit.
Do not double-report signatures from unrecognized authorities both as
"from unknown authority" and "not present". Fixes bug 1956, bugfix on
0.2.2.16-alpha.
Our attempt to make compilation work on old versions of Windows
again while keeping wince compatibility broke the build for Win2k+.
helix reports this patch fixes the issue for WinXP. Bugfix on
0.2.2.15-alpha; related to bug 1797.
When the CellStatistics option is off, we don't store cell insertion
times. Doing so would also not be very smart, because there seem to
still be some performance issues with this type of statistics. Nothing
harmful happens when we don't have insertion times, so we don't need to
alarm the user.
Previously[*], the function would start with the first stream on the
circuit, and let it package as many cells as it wanted before
proceeding to the next stream in turn. If a circuit had many live
streams that all wanted to package data, the oldest would get
preference, and the newest would get ignored.
Now, we figure out how many cells we're willing to send per stream,
and try to allocate them fairly.
Roger diagnosed this in the comments for bug 1298.
[*] This bug has existed since before the first-ever public release
of Tor. It was added by r152 of Tor on 26 Jan 2003, which was
the first commit to implement streams (then called "topics").
This is not the oldest bug to be fixed in 0.2.2.x: that honor
goes to the windowing bug in r54, which got fixed in e50b7768 by
Roger with diagnosis by Karsten. This is, however, the most
long-lived bug to be fixed in 0.2.2.x: the r54 bug was fixed
2580 days after it was introduced, whereas I am writing this
commit message 2787 days after r152.
I'm going to use this to implement more fairness in
circuit_resume_edge_reading_helper in an attempt to fix bug 1298.
(Updated with fixes from arma and Sebastian)
An authority should never download a consensus if it has a live one,
but when it doesn't, it should admit that it's not going to get one,
and see if anybody else can give it one.
Fixes 1300, fix on 0.2.0.9-alpha
Bridges and other relays not included in the consensus don't
necessarily have a non-zero bandwidth capacity. If all our
configured bridges had a zero bw capacity we would warn the
user. Change that.
Previously, we were also considering the time spent in
soft-hibernation. If this was a long time, we would wind up
underestimating our bandwidth by a lot, and skewing our wakeup time
towards the start of the accounting interval.
This patch also makes us store a few more fields in the state file,
including the time at which we entered soft hibernation.
Fixes bug 1789. Bugfix on 0.0.9pre5.
This will make changes for DST still work, and avoid double-spending
bytes when there are slight changes to configurations.
Fixes bug 1511; the DST issue is a bugfix on 0.0.9pre5.
When we introduced the code to close non-open OR connections after
KeepalivePeriod had passed, we replaced some code that said
if (!connection_is_open(conn)) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
with new code that said
if (!connection_is_open(conn) && past_keepalive) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
This was a mistake, since it made all the other tests start applying
to non-open connections, thus causing bug 1840, where non-open
connections get closed way early.
Fixes bug 1840. Bugfix on 0.2.1.26 (commit 67b38d50).
- Move checks for extra_info to callers
- Change argument name from failed to descs
- Use strlen("fp/") instead of a magic number
- I passed on the suggestion to rename functions from *_failed() to
*_handle_failure(). There are a lot of these so for now just follow
the house style.
It's normal when bootstrapping to have a lot of different certs
missing, so we don't want missing certs to make us warn... unless
the certs we're missing are ones that we've tried to fetch a couple
of times and failed at.
May fix bug 1145.
We frequently add cells to stream-blocked queues for valid reasons
that don't mean we need to block streams. The most obvious reason
is if the cell arrives over a circuit rather than from an edge: we
don't block circuits, no matter how full queues get. The next most
obvious reason is that we allow CONNECTED cells from a newly created
stream to get delivered just fine.
This patch changes the behavior so that we only iterate over the
streams on a circuit when the cell in question came from a stream,
and we only block the stream that generated the cell, so that other
streams can still get their CONNECTEDs in.
This should keep WinCE working (unicode always-on) and get Win98
working again (unicode never-on).
There are two places where we explicitly use ASCII-only APIs, still:
in ntmain.c and in the unit tests.
This patch also fixes a bug in windoes tor_listdir that would cause
the first file to be listed an arbitrary number of times that was
also introduced with WinCE support.
Should fix bug 1797.
Setting CookieAuthFileGroupReadable but without setting CookieAuthFile makes
no sense, because unix directory permissions for the data directory prevent
the group from accessing the file anyways.
When this happens, run through the streams on the circuit and make
sure they're all blocked. If some aren't, that's a bug: block them
all and log it! If they all are, where did the cell come from? Log
it!
(I suspect that this actually happens pretty frequently, so I'm making
these log messages appear at INFO.)
Do not start reading on exit streams when we get a SENDME unless we
have space in the appropriate circuit's cell queue.
Draft fix for bug 1653.
(commit message by nickm)
router_add_to_routerlist() is supposed to be a nice minimal function
that only touches the routerlist structures, but it included a call to
dirserv_single_reachability_test().
We have a function that gets called _after_ adding descriptors
successfully: routerlist_descriptors_added. This patch moves the
responsibility for testing there.
Because the decision of whether to test or not depends on whether
there was an old routerinfo for this router or not, we have to first
detect whether we _will_ want to run the tests if the router is added.
We make this the job of
routers_update_status_from_consensus_networkstatus().
Finally, this patch makes the code notice if a router is going from
hibernating to non-hibernating, and if so causes a reachability test
to get launched.
This solves the problem Roger noted as:
What if the router has a clock that's 5 minutes off, so it
publishes a descriptor for 5 minutes in the future, and we test it
three minutes in. In this edge case, we will continue to advertise
it as Running for the full 45 minute period.
If the voting interval was short enough, the two-minutes delay
of CONSENSUS_MIN_SECONDS_BEFORE_CACHING would confuse bridges
to the point where they would assert before downloading a consensus.
It it was even shorter (<4 minutes, I think), caches would
assert too. This patch fixes that by having replacing the
two-minutes value with MIN(2 minutes, interval/16).
Bugfix for 1141; the cache bug could occur since 0.2.0.8-alpha, so
I'm calling this a bugfix on that. Robert Hogan diagnosed this.
Done as a patch against maint-0.2.1, since it makes it hard to
run some kinds of testing networks.
requests to authorities fail due to a network error.
Bug#1138
"When a Tor client starts up using a bridge, and UpdateBridgesFromAuthority
is set, Tor will go to the authority first and look up the bridge by
fingerprint. If the bridge authority is filtered, Tor will never notice that
the bridge authority lookup failed. So it will never fall back."
Add connection_dir_bridge_routerdesc_failed(), a function for unpacking
the bridge information from a failed request, and ensure
connection_dir_request_failed() calls it if the failed request
was for a bridge descriptor.
Test:
1. for ip in `grep -iR 'router ' cached-descriptors|cut -d ' ' -f 3`;
do sudo iptables -A OUTPUT -p tcp -d $ip -j DROP; done
2. remove all files from user tor directory
3. Put the following in torrc:
UseBridges 1
UpdateBridgesFromAuthority 1
Bridge 85.108.88.19:443 7E1B28DB47C175392A0E8E4A287C7CB8686575B7
4. Launch tor - it should fall back to downloading descriptors
directly from the bridge.
Initial patch reviewed and corrected by mingw-san.
https://trac.torproject.org/projects/tor/ticket/1525
"The codepath taken by the control port "RESOLVE" command to create a
synthetic SOCKS resolve request isn't the same as the path taken by
a real SOCKS request from 'tor-resolve'.
This prevents controllers who set LeaveStreamsUnattached=1 from
being able to attach RESOLVE streams to circuits of their choosing."
Create a new function connection_ap_rewrite_and_attach_if_allowed()
and call that when Tor needs to attach a stream to a circuit but
needs to know if the controller permits it.
No tests added.
With this patch we stop scheduling when we should write statistics using a
single timestamp in run_scheduled_events(). Instead, we remember when a
statistics interval starts separately for each statistic type in geoip.c
and rephist.c. Every time run_scheduled_events() tries to write stats to
disk, it learns when it should schedule the next such attempt.
This patch also enables all statistics to be stopped and restarted at a
later time.
This patch comes with a few refactorings, some of which were not easily
doable without the patch.
We already had the country code ?? indicating an unknown country, so all we
needed to do to make unknown countries excludable was to make the ?? code
discoverable.
It's okay to get (say) a SocksPort line in the torrc, and then a
SocksPort on the command line to override it, and then a SocksPort via
a controller to override *that*. But if there are two occurrences of
SocksPort in the torrc, or on the command line, or in a single SETCONF
command, then the user is likely confused. Our old code would not
help unconfuse the user, but would instead silently ignore all but
the last occurrence.
This patch changes the behavior so that if the some option is passed
more than once to any torrc, command line, or SETCONF (each of which
coincidentally corresponds to a call to config_assign()), and the
option is not a type that allows multiple occurrences (LINELIST or
LINELIST_X), then we can warn the user.
This closes trac entry 1384.
At best, this patch helps us avoid sending queued relayed cells that
would get ignored during the time between when a destroy cell is
sent and when the circuit is finally freed. At worst, it lets us
release some memory a little earlier than it would otherwise.
Fix for bug #1184. Bugfix on 0.2.0.1-alpha.
The next series of commits begins addressing the issue that we're
currently including the complete or.h file in all of our source files.
To change that, we're splitting function definitions into new header
files (one header file per source file).
Right now it says "552 internal error" because there's no way for
getinfo_helper_*() countries to specify an error message. This
patch changes the getinfo_helper_*() interface, and makes most of the
getinfo helpers give useful error messages in response to failures.
This should prevent recurrences of bug 1699, where a missing GeoIPFile
line in the torrc made GETINFO ip-to-county/* fail in a "not obvious
how to fix" way.
V3 authorities no longer decide not to vote on Guard+Exit. The bandwidth
weights should take care of this now.
Also, lower the max threshold for WFU to 0.98, to allow more nodes to become
guards.
This should make us conflict less with system files named "log.h".
Yes, we shouldn't have been conflicting with those anyway, but some
people's compilers act very oddly.
The actual change was done with one "git mv", by editing
Makefile.am, and running
find . -name '*.[ch]' | xargs perl -i -pe 'if (/^#include.*\Wlog.h/) {s/log.h/torlog.h/; }'
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.
Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possble during live operation, due to network liveness filters
and discard logic.
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.
It's also extremely not-nice to write a time to the log as an
integer. Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
These timers behave better with non-monotonic clocks than our old
ones, and also try harder to make once-per-second events get called
one second apart, rather than one-plus-epsilon seconds apart.
This fixes bug 943 for everybody using Libevent 2.0 or later.
We need to ensure that we close timeout measurement circuits. While
we're at it, we should close really old circuits of certain types that
aren't in use, and log really old circuits of other types.
We need to record different statistics at point of timeout, vs the point
of forcible closing.
Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
In rare cases, we could cannibalize a one-hop circuit, ending up
with a two-hop circuit. This circuit would not be actually used,
but we should prevent its creation in the first place.
Thanks to outofwords and swissknife for helping to analyse this.
Most of the changes here are switches to use APIs available on Windows
CE. The most pervasive change is that Windows CE only provides the
wide-character ("FooW") variants of most of the windows function, and
doesn't support the older ASCII verions at all.
This patch will require use of the wcecompat library to get working
versions of the posix-style fd-based file IO functions.
[commit message by nickm]
Back when we changed the idea of a connection being "too old" for new
circuits into the connection being "bad" for new circuits, we didn't
actually change the info messages. This led to telling the user that
we were labelling connections as "too old" for being worse than
connections that were actually older than them.
Found by Scott on or-talk.
There are now four ways that CBT can be disabled:
1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.
It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.
The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
what's happening here is that we're fetching certs for obsolete
authorities -- probably legacy signers in this case. but try to
remain general in the log message.
It's natural for the definition of bandwidth_rule_t to be with the functions
that actually care about its values. Unfortunately, this means declaring
bandwidth_rate_rule_to_string() out of sequence. Someday we'll just rename
reasons.c to strings.c, and put it at the end of or.h, and this will all be
better.
Works like the --enable-static-openssl/libevent options. Requires
--with-zlib-dir to be set. Note that other dependencies might still
pull in a dynamicly linked zlib, if you don't link them in statically
too.
Everything that accepted the 'Circ' name handled it wrong, so even now
that we fixed the handling of the parameter, we wouldn't be able to
set it without making all the 0.2.2.7..0.2.2.10 relays act wonky.
This patch makes Tors accept the 'Circuit' name instead, so we can
turn on circuit priorities without confusing the versions that treated
the 'Circ' name as occasion to act weird.
I'm adding this because I can never remember what stuff like 'rule 3'
means. That's the one where if somebody goes limp or taps out, the
fight is over, right?
When you mean (a=b(c,d)) >= 0, you had better not say (a=b(c,d)>=0).
We did the latter, and so whenever CircPriorityHalflife was in the
consensus, it was treated as having a value of 1 msec (that is,
boolean true).
We need to make sure we have an event_base in dns.c before we call
anything that wants one. Make sure we always have one in dns_reset()
when we're a client. Fixes bug 1341.
Now if you're a published relay and you set RefuseUnknownExits, even
if your dirport is off, you'll fetch dir info from the authorities,
fetch it early, and cache it.
In the future, RefuseUnknownExits (or something like it) will be on
by default.
From http://archives.seul.org/tor/relays/Mar-2010/msg00006.html :
As I understand it, the bug should show up on relays that don't set
Address to an IP address (so they need to resolve their Address
line or their hostname to guess their IP address), and their
hostname or Address line fails to resolve -- at that point they'll
pick a random 4 bytes out of memory and call that their address. At
the same time, relays that *do* successfully resolve their address
will ignore the result, and only come up with a useful address if
their interface address happens to be a public IP address.
When the bandwidth-weights branch added the "directory-footer"
token, and began parsing the directory footer at the first
occurrence of "directory-footer", it made it possible to fool the
parsing algorithm into accepting unsigned data at the end of a
consensus or vote. This patch fixes that bug by treating the footer
as starting with the first "directory-footer" or the first
"directory-signature", whichever comes first.
Treat strings returned from signed_descriptor_get_body_impl() as not
NUL-terminated. Since the length of the strings is available, this is
not a big problem.
Discovered by rieo.
Don't allow anything but directory-signature tokens in a consensus after
the first directory-signature token. Fixes bug in bandwidth-weights branch.
Found by "outofwords."
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.
Thanks to ekir for discovering and reporting this bug.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
This means that "if (E<G) {abc} else if (E>=G) {def}" can be replaced with
"if (E<G) {abc} else {def}"
Doing the second test explicitly made my mingw gcc nervous that we might
never be initializing casename.
For my 64-bit Linux system running with GCC 4.4.3-fc12-whatever, you
can't do 'printf("%lld", (int64_t)x);' Instead you need to tell the
compiler 'printf("%lld", (long long int)x);' or else it doesn't
believe the types match. This is why we added U64_PRINTF_ARG; it
looks like we needed an I64_PRINTF_ARG too.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.
Our tor_asprintf() differs from standard asprintf() in that:
- Like our other malloc functions, it asserts on OOM.
- It works on windows.
- It always sets its return-field.
All other bandwidthrate settings are restricted to INT32_MAX, but
this check was forgotten for PerConnBWRate and PerConnBWBurst. Also
update the manpage to reflect the fact that specifying a bandwidth
in terabytes does not make sense, because that value will be too
large.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
On Windows, we don't have a notion of ~ meaning "our homedir", so we
were deliberately using an #ifdef to avoid calling expand_filename()
in multiple places. This is silly: The right place to turn a function
into a no-op on a single platform is in the function itself, not in
every single call-site.
Spec conformance issue: The code didn't force the network-status-version
token to be the first token in a v3 vote or consensus.
Problem discovered by Parakeep.
We need to use evdns_add_server_port_with_base() when configuring
our DNS listener, because libevent segfaults otherwise. Add a macro
in compat_libevent.h to pick the correct implementation depending
on the libevent version.
Fixes bug 1143, found by SwissTorExit
The src and dest of a memcpy() call aren't supposed to overlap,
but we were sometimes calling tor_addr_copy() as a no-op.
Also, tor_addr_assign was a redundant copy of tor_addr_copy(); this patch
removes it.
We implemented ratelimiting for warnings going into the logfile, but didn't
rate-limit controller events. Now both log warnings and controller events
are rate-limited.
Tor has tor_lookup_hostname(), which prefers ipv4 addresses automatically.
Bug 1244 occured because gethostbyname() returned an ipv6 address, which
Tor cannot handle currently. Fixes bug 1244; bugfix on 0.0.2pre25.
Reported by Mike Mestnik.
The problem was that we didn't allocate enough memory on 32-bit
platforms with 64-bit time_t. The memory leak occured every time
we fetched a hidden service descriptor we've fetched before.
When calculating the is_exit flag for a routerinfo_t, we don't need
to call exit_policy_is_general_exit() if router_exit_policy_rejects_all()
tells us it definitely is an exit. This check is much cheaper than
running exit_policy_is_general_exit().
exit_policy_is_general_exit() assumed that there are no redundancies
in the passed policy, in the sense that we actively combine entries
in the policy to really get rid of any redundancy. Since we cannot
do that without massively rewriting the policy lines the relay
operators set, fix exit_policy_is_general_exit().
Fixes bug 1238, discovered by Martin Kowalczyk.
We accidentally freed the internal buffer for bridge stats when we
were writing the bridge stats file or honoring a control port
request for said data. Change the interfaces for
geoip_get_bridge_stats* to prevent these problems, and remove the
offending free/add a tor_strdup.
Fixes bug 1208.
It's a bit confusing to have a loop where another function,
confusingly named "*_free", is responsible for advancing the loop
variable (or rather, for altering a structure so that the next time
the loop variable's initializer is evaluated it evaluates to something
different.)
Not only has this confused people: it's also confused coverity scan.
Let's fix that.
This was freaking out some relay operators without good reason, as
it is nothing the relay operator can do anything about anyways.
Quieting this warning suggested by rieo.
The OutboundBindAddress option is useful for making sure that all of
your outbond connections use a given interface. But when connecting
to 127.0.0.1 (or ::1 even) it's important to actually have the
connection come _from_ localhost, since lots of programs running on
localhost use the source address to authenticate that the connection
is really coming from the same host.
Our old code always bound to OutboundBindAddress, whether connecting
to localhost or not. This would potentially break DNS servers on
localhost, and socks proxies on localhost. This patch changes the
behavior so that we only look at OutboundBindAddress when connecting
to a non-loopback address.
this case can now legitimately happen, if you have a cached v2 status
from moria1, and you run with the new list of dirservers that's missing
the old moria1. it's nothing to worry about; the file will die off in
a month or two.
...to let us
rate-limit client connections as they enter the network. It's
controlled in the consensus so we can turn it on and off for
experiments. It's starting out off. Based on proposal 163.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.
Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.
Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.
Also, now we refresh your entry guards from EntryNode at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).
The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
We do this in too many places throughout the code; it's time to start
clamping down.
Also, refactor Karsten's patch to use strchr-then-strndup, rather than
malloc-then-strlcpy-then-strchr-then-clear.
Fix statistics on client numbers by country as seen by bridges that were
broken in 0.2.2.1-alpha. Also switch to reporting full 24-hour intervals
instead of variable 12-to-48-hour intervals.
The HSAuthorityRecordStats option was used to track statistics of overall
hidden service usage on the version 0 hidden service authorities. With the
version 2 hidden service directories being deployed and version 0
descriptors being phased out, these statistics are not as useful anymore.
Goodbye, you fine piece of software; my first major code contribution to
Tor.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log." safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
The rule is now: take the value from the CircuitPriorityHalflife
config option if it is set. If it zero, disable the cell_ewma
algorithm. If it is set, use it to calculate the scaling factor.
If it is not set, look for a CircPriorityHalflifeMsec parameter in the
consensus networkstatus. If *that* is zero, then disable the cell_ewma
algorithm; if it is set, use it to calculate the scaling factor.
If it is not set at all, disable the algorithm.
In connection_dir_client_reached_eof, we make sure that we either
return when we get an http status code of 503 or handle the problem
and set it to 200. Later we check if the status code is 503. Remove
that check.
There are two big changes here:
- We store active circuits in a priority queue for each or_conn,
rather than doing a linear search over all the active circuits
before we send each cell.
- Rather than multiplying every circuit's cell-ewma by a decay
factor every time we send a cell (thus normalizing the value of a
current cell to 1.0 and a past cell to alpha^t), we instead
only scale down the cell-ewma every tick (ten seconds atm),
normalizing so that a cell sent at the start of the tick has
value 1.0).
Each circuit is ranked in terms of how many cells from it have been
relayed recently, using a time-weighted average.
This patch has been tested this on a private Tor network on PlanetLab,
and gotten improvements of 12-35% in time it takes to fetch a small
web page while there's a simultaneous large data transfer going on
simultaneously.
[Commit msg by nickm based on mail from Ian Goldberg.]
This changes the pqueue API by requiring an additional int in every
structure that we store in a pqueue to hold the index of that structure
within the heap.
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.
This gains us consistence for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
Do not segfault when writing buffer stats when we haven't observed a
single circuit to report about. This is a minor bug that would only show
up in testing environments with no traffic and with reduced stats
intervals.
Avoid crashing if the client is trying to upload many bytes and the
circuit gets torn down at the same time, or if the flip side
happens on the exit relay. Bugfix on 0.2.0.1-alpha; fixes bug 1150.
New config option "CircuitStreamTimeout" to override our internal
timeout schedule for how many seconds until we detach a stream from
a circuit and try a new circuit. If your network is particularly
slow, you might want to set this to a number like 60.
To fix a major security problem related to incorrect use of
SSL/TLS renegotiation, OpenSSL has turned off renegotiation by
default. We are not affected by this security problem, however,
since we do renegotiation right. (Specifically, we never treat a
renegotiated credential as authenticating previous communication.)
Nevertheless, OpenSSL's new behavior requires us to explicitly
turn renegotiation back on in order to get our protocol working
again.
Amusingly, this is not so simple as "set the flag when you create
the SSL object" , since calling connect or accept seems to clear
the flags.
For belt-and-suspenders purposes, we clear the flag once the Tor
handshake is done. There's no way to exploit a second handshake
either, but we might as well not allow it.
This commit implements a new config option: 'DisableAllSwap'
This option probably only works properly when Tor is started as root.
We added two new functions: tor_mlockall() and tor_set_max_memlock().
tor_mlockall() attempts to mlock() all current and all future memory pages.
For tor_mlockall() to work properly we set the process rlimits for memory to
RLIM_INFINITY (and beyond) inside of tor_set_max_memlock().
We behave differently from mlockall() by only allowing tor_mlockall() to be
called one single time. All other calls will result in a return code of 1.
It is not possible to change DisableAllSwap while running.
A sample configuration item was added to the torrc.complete.in config file.
A new item in the man page for DisableAllSwap was added.
Thanks to Moxie Marlinspike and Chris Palmer for their feedback on this patch.
Please note that we make no guarantees about the quality of your OS and its
mlock/mlockall implementation. It is possible that this will do nothing at all.
It is also possible that you can ulimit the mlock properties of a given user
such that root is not required. This has not been extensively tested and is
unsupported. I have included some comments for possible ways we can handle
this on win32.
If your relay can't keep up with the number of incoming create cells, it
would log one warning per failure into your logs. Limit warnings to 1 per
minute.
This was left over from an early draft of the microdescriptor code; it
began to populate the signatures array of a networkstatus vote, even
though there's no actual need to do that for a vote.
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement. Coverity notices this now.
In some cases, we were testing arrays to make sure that an operation
we wanted to do would suceed. Those cases are now always-true.
In some cases, we were testing arrays to see if something was _set_.
Those caes are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
If all authorities restart at once right before a consensus vote, nobody
will vote about "Running", and clients will get a consensus with no usable
relays. Instead, authorities refuse to build a consensus if this happens.
The first happens on an error case when a controller wants an
impossible directory object. The second happens when we can't write
our fingerprint file.
The code for these was super-wrong, but will only break things when we
reset an option on a platform where sizeof(time_t) is different from
sizeof(int).
See task 1114. The most plausible explanation for someone sending us weak
DH keys is that they experiment with their Tor code or implement a new Tor
client. Usually, we don't care about such events, especially not on warn
level. If we really care about someone not following the Tor protocol, we
can set ProtocolWarnings to 1.
This patch introduces a new type called document_signature_t to represent the
signature of a consensus document. Now, each consensus document can have up
to one document signature per voter per digest algorithm. Also, each
detached-signatures document can have up to one signature per <voter,
algorithm, flavor>.
Previously, we insisted that a valid signature must be a signature of
the expected digest. Now we accept anything that starts with the
expected digest. This lets us include another digest later.