When calling circuit_build_times_shuffle_and_store_array, we were
passing a uint32_t as an int. arma is pretty sure that this can't
actually cause a bug, because of checks elsewhere in the code, but
it's best not to pass a uint32_t as an int anyway.
Found by doorss; fix on 0.2.2.4-alpha.
We detect and reject said attempts if there is no chosen exit node or
circuit: connecting to a private addr via a randomly chosen exit node
will usually fail (if all exits reject private addresses), is always
ill-defined (you're not asking for any particular host or service),
and usually an error (you've configured all requests to go over Tor
when you really wanted to configure all _remote_ requests to go over
Tor).
This can also help detect forwarding loop requests.
Found as part of bug2279.
Left circuit_build_times_get_bw_scale() uncommented because it is in the wrong
place due to an improper bug2317 fix. It needs to be moved and renamed, as it
is not a cbt parameter.
To quote arma: "So instead of stopping your CBT from screaming, you're just
going to throw it in the closet and hope you can't hear it?"
Yep. The log message can happen because at 95% point on the curve, we can be
way beyond the max timeout we've seen, if the curve has few points and is
shallow.
Also applied Nick's rule of thumb for rewriting some other notice log messages
to read like how you would explain them to a raving lunatic on #tor who was
shouting at you demanding what they meant. Hopefully the changes live up to
that standard.
If we got a signed digest that was shorter than the required digest
length, but longer than 20 bytes, we would accept it as long
enough.... and then immediately fail when we want to check it.
Fixes bug 2409; bug in 0.2.2.20-alpha; found by piebeer.
When we added support for separate client tls certs on bridges in
a2bb0bfdd5 we forgot to correctly initialize this when changing
from relay to bridge or vice versa while Tor is running. Fix that
by always initializing keys when the state changes.
Fixes bug 2433.
When we stopped using svn, 0.2.1.x lost the ability to notice its svn
revision and report it in the version number. However, it kept
looking at the micro-revision.i file... so if you switched to master,
built tor, then switched to 0.2.1.x, you'd get a micro-revision.i file
from master reported as an SVN tag. This patch takes out the "include
the svn tag" logic entirely.
Bugfix on 0.2.1.15-rc; fixes bug 2402.
Our regular DH parameters that we use for circuit and rendezvous
crypto are unchanged. This is yet another small step on the path of
protocol fingerprinting resistance.
We need to make sure that the worst thing that a weird consensus param
can do to us is to break our Tor (and only if the other Tors are
reliably broken in the same way) so that the majority of directory
authorities can't pull any attacks that are worse than the DoS that
they can trigger by simply shutting down.
One of these worse things was the cbtnummodes parameter, which could
lead to heap corruption on some systems if the value was sufficiently
large.
This commit fixes this particular issue and also introduces sanity
checking for all consensus parameters.
Our public key functions assumed that they were always writing into a
large enough buffer. In one case, they weren't.
(Incorporates fixes from sebastian)
In dnsserv_resolved(), we carefully made a nul-terminated copy of the
answer in a PTR RESOLVED cell... then never used that nul-terminated
copy. Ouch.
Surprisingly this one isn't as huge a security problem as it could be.
The only place where the input to dnsserv_resolved wasn't necessarily
nul-terminated was when it was called indirectly from relay.c with the
contents of a relay cell's payload. If the end of the payload was
filled with junk, eventdns.c would take the strdup() of the name [This
part is bad; we might crash there if the cell is in a bad part of the
stack or the heap] and get a name of at least length
495[*]. eventdns.c then rejects any name of length over 255, so the
bogus data would be neither transmitted nor altered.
[*] If the name was less than 495 bytes long, the client wouldn't
actually be reading off the end of the cell.
Nonetheless this is a reasonably annoying bug. Better fix it.
Found while looking at bug 2332, reported by doorss. Bugfix on
0.2.0.1-alpha.
Previously, we only looked at up to 128 bytes. This is a bad idea
since socks messages can be at least 256+x bytes long. Now we look at
up to 512 bytes; this should be enough for 0.2.2.x to handle all valid
SOCKS messages. For 0.2.3.x, we can think about handling trickier
cases.
Fixes 2330. Bugfix on 0.2.0.16-alpha.
Right now, Tor routers don't save the maxima values from the
bw_history_t between sessions. That's no good, since we use those
values to determine bandwidth. This code adds a new BWHist.*Maximum
set of values to the state file. If they're not present, we estimate
them by taking the observed total bandwidth and dividing it by the
period length, which provides a lower bound.
This should fix bug 1863. I'm calling it a feature.
Previously, our state parsing code would fail to parse a bwhist
correctly if the Interval was anything other than the default
hardcoded 15 minutes. This change makes the parsing less incorrect,
though the resulting history array might get strange values in it if
the intervals don't match the one we're using. (That is, if stuff was
generated in 15 minute intervals, and we read it into an array that
expects 30 minute intervals, we're fine, since values can be combined
pairwise. But if we generate data at 30 minute intervals and read it
into 15 minute intervals, alternating buckets will be empty.)
Bugfix on 0.1.1.11-alpha.
The trick of looping from i=0..4 , switching on i to set up some
variables, then running some common code is much better expressed by
just calling a function 4 times with 4 sets of arguments. This should
make the code a little easier to follow and maintain here.
An object, you'll recall, is something between -----BEGIN----- and
-----END----- tags in a directory document. Some of our code, as
doorss has noted in bug 2352, could assert if one of these ever
overflowed SIZE_T_CEILING but not INT_MAX. As a solution, I'm setting
a maximum size on a single object such that neither of these limits
will ever be hit. I'm also fixing the INT_MAX checks, just to be sure.
When using libevent 2, we use evdns_base_resolve_*(). When not, we
fake evdns_base_resolve_*() using evdns_resolve_*().
Our old check was looking for negative values (like libevent 2
returns), but our eventdns.c code returns 1. This code makes the
check just test for nonzero.
Note that this broken check was not for _resolve_ failures or even for
failures to _launch_ a resolve: it was for failures to _create_ or
_encode_ a resolve request.
Bug introduced in 81eee0ecfff3dac1e9438719d2f7dc0ba7e84a71; found by
lodger; uploaded to trac by rransom. Bug 2363. Fix on 0.2.2.6-alpha.
This was originally a patch provided by pipe
(http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html) to
provide a method for controllers to query the total amount of traffic
tor has handled (this is a frequently requested piece of information
by relay operators).
C99 allows a syntax for structures whose last element is of
unspecified length:
struct s {
int elt1;
...
char last_element[];
};
Recent (last-5-years) autoconf versions provide an
AC_C_FLEXIBLE_ARRAY_MEMBER test that defines FLEXIBLE_ARRAY_MEMBER
to either no tokens (if you have c99 flexible array support) or to 1
(if you don't). At that point you just use offsetof
[STRUCT_OFFSET() for us] to see where last_element begins, and
allocate your structures like:
struct s {
int elt1;
...
char last_element[FLEXIBLE_ARRAY_MEMBER];
};
tor_malloc(STRUCT_OFFSET(struct s, last_element) +
n_elements*sizeof(char));
The advantages are:
1) It's easier to see which structures and elements are of
unspecified length.
2) The compiler and related checking tools can also see which
structures and elements are of unspecified length, in case they
wants to try weird bounds-checking tricks or something.
3) The compiler can warn us if we do something dumb, like try
to stack-allocate a flexible-length structure.
We were not decrementing "available" every time we did
++next_virtual_addr in addressmap_get_virtual_address: we left out the
--available when we skipped .00 and .255 addresses.
This didn't actually cause a bug in most cases, since the failure mode
was to keep looping around the virtual addresses until we found one,
or until available hit zero. It could have given you an infinite loop
rather than a useful message, however, if you said "VirtualAddrNetwork
127.0.0.255/32" or something broken like that.
Spotted by cypherpunks
We were decrementing "available" twice for each in-use address we ran
across. This would make us declare that we ran out of virtual
addresses when the address space was only half full.
evbuffer_pullup does nothing and returns NULL if the caller asks it to
linearize more data than the buffer contains.
Introduced in 9796b9bfa6.
Reported by piebeer; fixed with help from doors.
If a SOCKS5 client insists on authentication, allow it to
negotiate a connection with Tor's SOCKS server successfully.
Any credentials the client provides are ignored.
This allows Tor to work with SOCKS5 clients that can only
support 'authenticated' connections.
Also add a bunch of basic unit tests for SOCKS4/4a/5 support
in buffers.c.
If you had TIME_MAX > INT_MAX, and your "time_to_exhaust_bw =
accountingmax/expected_bandwidth_usage * 60" calculation managed to
overflow INT_MAX, then your time_to_consider value could underflow and
wind up being rediculously low or high. "Low" was no problem;
negative values got caught by the (time_to_consider <= 0) check.
"High", however, would get you a wakeup time somewhere in the distant
future.
The fix is to check for time_to_exhaust_bw overflowing INT_MAX, not
TIME_MAX: We don't allow any accounting interval longer than a month,
so if time_to_exhaust_bw is significantly larger than 31*24*60*60, we
can just clip it.
This is a bugfix on 0.0.9pre6, when accounting was first introduced.
It fixes bug 2146, unless there are other causes there too. The fix
is from boboper. (I tweaked it slightly by removing an assignment
that boboper marked as dead, and lowering a variable that no longer
needed to be function-scoped.)
The old logic would have us fetch from authorities if we were refusing
unknown exits and our exit policy was reject*. Instead, we want to
fetch from authorities if we're refusing unknown exits and our exit
policy is _NOT_ reject*.
Fixed by boboper. Fixes more of 2097. Bugfix on 0.2.2.16-alpha.
We use a hash of the identity key to seed a prng to tell when an
accounting period should end. But thanks to the bug998 changes,
clients no longer have server-identity keys to use as a long-term seed
in accounting calculations. In any case, their identity keys (as used
in TLS) were never never fixed. So we can just set the wakeup time
from a random seed instead there. Still open is whether everybody
should be random.
This patch fixes bug 2235, which was introduced in 0.2.2.18-alpha.
Diagnosed with help from boboper on irc.
This is not the most beautiful fix for this problem, but it is the simplest.
Bugfix for 2205. Thanks to Sebastian and Mashael for finding the
bug, and boboper/cypherpunks for figuring out why it was happening
and how to fix it, and for writing a few fixes.
The reason the "streams problem" occurs is due to the complicated
interaction between Tor's congestion control and libevent. At some point
during the experiment, the circuit window is exhausted, which blocks all
edge streams. When a circuit level sendme is received at Exit, it
resumes edge reading by looping over linked list of edge streams, and
calling connection_start_reading() to inform libevent to resume reading.
When the streams are activated again, Tor gets the chance to service the
first three streams activated before the circuit window is exhausted
again, which causes all streams to be blocked again. As an experiment,
we reversed the order in which the streams are activated, and indeed the
first three streams, rather than the last three, got service, while the
others starved.
Our solution is to change the order in which streams are activated. We
choose a random edge connection from the linked list, and then we
activate streams starting from that chosen stream. When we reach the end
of the list, then we continue from the head of the list until our chosen
stream (treating the linked list as a circular linked list). It would
probably be better to actually remember which streams have received
service recently, but this way is simple and effective.
Sebastian notes (and I think correctly) that one of our ||s should
have been an &&, which simplifies a boolean expression to decide
whether to replace bridges. I'm also refactoring out the negation at
the start of the expression, to make it more readable.
Pick 5 seconds as the limit. 5 seconds is a compromise here between
making sure the user notices that the bad behaviour is (still) happening
and not spamming their log too much needlessly (the log message is
pretty long). We also keep warning every time if safesocks is
specified, because then the user presumably wants to hear about every
blocked instance.
(This is based on the original patch by Sebastian, then backported to
0.2.2 and with warnings split into their own function.)
Our checks that we don't exceed the 50 KB size limit of extra-info
descriptors apparently failed. This patch fixes these checks and reserves
another 250 bytes for appending the signature. Fixes bug 2183.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Sending a log message to a control port can cause Tor to allocate a buffer,
thereby changing the length of the freelist behind buf_shrink_freelists's back,
thereby causing an assertion to fail.
Fixes bug #1125.
Having very long single lines with lots and lots of things in them
tends to make files hard to diff and hard to merge. Since our tools
are one-line-at-a-time, we should try to construct lists that way too,
within reason.
This incidentally turned up a few headers in configure.in that we were
for some reason searching for twice.
We would never actually enforce multiplicity rules when parsing
annotations, since the counts array never got entries added to it for
annotations in the token list that got added by earlier calls to
tokenize_string.
Found by piebeer.
We had a spelling discrepancy between the manpage and the source code
for some option. Resolve these in favor of the manpage, because it
makes more sense (for example, HTTP should be capitalized).
The code that makes use of the RunTesting option is #if 0, so setting
this option has no effect. Mark the option as obsolete for now, so that
Tor doesn't list it as an available option erroneously.
Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
We need filtering bufferevent_openssl so that we can wrap around
IOCP bufferevents on Windows. This patch adds a temporary option to
turn on filtering mode, so that we can test it out on non-IOCP
systems to make sure it hasn't got any surprising bugs.
It also fixes some allocation/teardown errors in using
bufferevent_openssl as a filter.
Instead of rejecting a value that doesn't divide into 1 second, round to
the nearest divisor of 1 second and warn.
Document that the option only controls the granularity written by Tor to a
file or console log. It does not (for example) "batch up" log messages to
affect times logged by a controller, times attached to syslog messages, or
the mtime fields on log files.
In the case where old_router == NULL but sdmap has an entry for the
router, we can currently safely infer that the old_router was not a
bridge. Add an assert to ensure that this remains true, and fix the
logic not to die with the tor_assert(old_router) call.
In a2bb0bf we started using a separate client identity key. When we are
in "public server mode" (that means not a bridge) we will use the same
key. Reusing the key without doing the proper refcounting leads to a
segfault on cleanup during shutdown. Fix that.
Also introduce an assert that triggers if our refcount falls below 0.
That should never happen.
We now require that:
- Only actual servers should ever call get_server_identity_key
- If you're being a client or bridge, the client and server keys should
differ.
- If you're being a public relay, the client and server keys
should be the same.
When intro->extend_info is created for an introduction point, it
only starts out with a nickname, not necessarily an identity digest.
Thus, doing router_get_by_digest isn't necessarily safe.
https://trac.torproject.org/projects/tor/ticket/1859
Use router_get_by_digest() instead of router_get_by_hexdigest()
in circuit_discard_optional_exit_enclaves() and
rend_client_get_random_intro(), per Nick's comments.
Using router_get_by_digest() in rend_client_get_random_intro() will
break hidden services published by Tor versions pre 0.1.2.18 and
0.2.07-alpha as they only publish by nickname. This is acceptable
however as these versions only publish to authority tor26 and
don't work for versions in the 0.2.2.x series anyway.
Mainly, this comes from turning two lists that needed to be kept in
synch into a single list of structs. This should save a little RAM,
and make the code simpler.
There's no reason to keep a time_t and a struct timeval to represent
the same value: highres_created.tv_sec was the same as timestamp_created.
This should save a few bytes per circuit.
Our old code correctly called bufferevent_flush() on linked
connections to make sure that the other side got an EOF event... but
it didn't call bufferevent_flush() when the connection wasn't
hold_open_until_flushed. Directory connections don't use
hold_open_until_flushed, so the linked exit connection never got an
EOF, so they never sent a RELAY_END cell to the client, and the
client never concluded that data had arrived.
The solution is to make the bufferevent_flush() code apply to _all_
closing linked conns whose partner is not already marked for close.
https://trac.torproject.org/projects/tor/ticket/1859
There are two problems in this bug:
1. When an OP makes a .exit request specifying itself as the exit, and the exit
is not yet listed, Tor gets all the routerinfos needed for the circuit but
discovers in circuit_is_acceptable() that its own routerinfo is not in the
routerdigest list and cannot be used. Tor then gets locked in a cycle of
repeating these two steps. When gathering the routerinfos for a circuit,
specifically when the exit has been chosen by .exit notation, Tor needs to
apply the same rules it uses later on when deciding if it can build a
circuit with those routerinfos.
2. A different bug arises in the above situation when the Tor instance's
routerinfo *is* listed in the routerlist, it shares its nickname with a
number of other Tor nodes, and it does not have 'Named' rights to its
nickname.
So for example, if (i) there are five nodes named Bob in the network, (ii) I
am running one of them but am flagged as 'Unnamed' because someone else
claimed the 'Bob' nickname first, and (iii) I run my Tor as both client
and exit the following can happen to me:
- I go to www.evil.com
- I click on a link www.evil.com.bob.exit
- My request will exit through my own Tor node rather than the 'Named'
node Bob or any of the others.
- www.evil.com now knows I am actually browsing from the same computer
that is running my 'Bob' node
So to solve both issues we need to ensure:
- When fulfilling a .exit request we only choose a routerinfo if it exists in
the routerlist, even when that routerinfo is ours.
- When getting a router by nickname we only return our own router information
if it is not going to be used for building a circuit.
We ensure this by removing the special treatment afforded our own router in
router_get_by_nickname(). This means the function will only return the
routerinfo of our own router if it is in the routerlist built from authority
info and has a unique nickname or is bound to a non-unique nickname.
There are some uses of router_get_by_nickname() where we are looking for the
router by name because of a configuration directive, specifically local
declaration of NodeFamilies and EntryNodes and other routers' declaration of
MyFamily. In these cases it is not at first clear if we need to continue
returning our own routerinfo even if our router is not listed and/or has a
non-unique nickname with the Unnamed flag.
The patch treats each of these cases as follows:
Other Routers' Declaration of MyFamily
This happens in routerlist_add_family(). If another router declares our router
in its family and our router has the Unnamed flag or is not in the routerlist
yet, should we take advantage of the fact that we know our own routerinfo to
add us in anyway? This patch says 'no, treat our own router just like any
other'. This is a safe choice because it ensures our client has the same view
of the network as other clients. We also have no good way of knowing if our
router is Named or not independently of the authorities, so we have to rely on
them in this.
Local declaration of NodeFamilies
Again, we have no way of knowing if the declaration 'NodeFamilies
Bob,Alice,Ringo' refers to our router Bob or the Named router Bob, so we have
to defer to the authorities and treat our own router like any other.
Local declaration of NodeFamilies
Again, same as above. There's also no good reason we would want our client to
choose it's own router as an entry guard if it does not meet the requirements
expected of any other router on the network.
In order to reduce the possibility of error, the patch also replaces two
instances where we were using router_get_by_nickname() with calls to
router_get_by_hexdigest() where the identity digest of the router
is available.
First start of a fix for bug2001, but my test network still isn't
working: the client and the server send each other VERSIONS cells,
but never notice that they got them.
* Make tor_tls_context_new internal to tortls.c, and return the new
tor_tls_context_t from it.
* Add a public tor_tls_context_init wrapper function to replace it.
Some of these functions only work for routerinfo-based nodes, and as
such are only usable for advisory purposes. Fortunately, our uses
of them are compatible with this limitation.
Also, make the NodeFamily option into a list of routersets. This
lets us git rid of router_in_nickname_list (or whatever it was
called) without porting it to work with nodes, and also lets people
specify country codes and IP ranges in NodeFamily
This was the only flag in routerstatus_t that we would previously
change in a routerstatus_t in a consensus. We no longer have reason
to do so -- and probably never did -- as you can now confirm more
easily than you could have done by grepping for is_running before
this patch.
The name change is to emphasize that the routerstatus_t is_running
flag is only there to tell you whether the consensus says it's
running, not whether it *you* think it's running.
A node_t is an abstraction over routerstatus_t, routerinfo_t, and
microdesc_t. It should try to present a consistent interface to all
of them. There should be a node_t for a server whenever there is
* A routerinfo_t for it in the routerlist
* A routerstatus_t in the current_consensus.
(note that a microdesc_t alone isn't enough to make a node_t exist,
since microdescriptors aren't usable on their own.)
There are three ways to get a node_t right now: looking it up by ID,
looking it up by nickname, and iterating over the whole list of
microdescriptors.
All (or nearly all) functions that are supposed to return "a router"
-- especially those used in building connections and circuits --
should return a node_t, not a routerinfo_t or a routerstatus_t.
A node_t should hold all the *mutable* flags about a node. This
patch moves the is_foo flags from routerinfo_t into node_t. The
flags in routerstatus_t remain, but they get set from the consensus
and should not change.
Some other highlights of this patch are:
* Looking up routerinfo and routerstatus by nickname is now
unified and based on the "look up a node by nickname" function.
This tries to look only at the values from current consensus,
and not get confused by the routerinfo_t->is_named flag, which
could get set for other weird reasons. This changes the
behavior of how authorities (when acting as clients) deal with
nodes that have been listed by nickname.
* I tried not to artificially increase the size of the diff here
by moving functions around. As a result, some functions that
now operate on nodes are now in the wrong file -- they should
get moved to nodelist.c once this refactoring settles down.
This moving should happen as part of a patch that moves
functions AND NOTHING ELSE.
* Some old code is now left around inside #if 0/1 blocks, and
should get removed once I've verified that I don't want it
sitting around to see how we used to do things.
There are still some unimplemented functions: these are flagged
with "UNIMPLEMENTED_NODELIST()." I'll work on filling in the
implementation here, piece by piece.
I wish this patch could have been smaller, but there did not seem to
be any piece of it that was independent from the rest. Moving flags
forces many functions that once returned routerinfo_t * to return
node_t *, which forces their friends to change, and so on.
The node_t type is meant to serve two key functions:
1) Abstracting difference between routerinfo_t and microdesc_t
so that clients can use microdesc_t instead of routerinfo_t.
2) Being a central place to hold mutable state about nodes
formerly held in routerstatus_t and routerinfo_t.
This patch implements a nodelist type that holds a node for every
router that we would consider using.
When picking bridges (or other nodes without a consensus entry (and
thus no bandwidth weights)) we shouldn't just trust the node's
descriptor. So far we believed anything between 0 and 10MB/s, where 0
would mean that a node doesn't get any use from use unless it is our
only one, and 10MB/s would be a quite siginficant weight. To make this
situation better, we now believe weights in the range from 20kB/s to
100kB/s. This should allow new bridges to get use more quickly, and
means that it will be harder for bridges to see almost all our traffic.
In the first 100 circuits, our timeout_ms and close_ms
are the same. So we shouldn't transition circuits to purpose
CIRCUIT_PURPOSE_C_MEASURE_TIMEOUT, since they will just timeout again
next time we check.
We really should ignore any timeouts that have *no* network activity for their
entire measured lifetime, now that we have the 95th percentile measurement
changes. Usually this is up to a minute, even on fast connections.
If we really want all this complexity for these stages here, we need to handle
it better for people with large timeouts. It should probably go away, though.
Rechecking the timeout condition was foolish, because it is checked on the
same codepath. It was also wrong, because we didn't round.
Also, the liveness check itself should be <, and not <=, because we only have
1 second resolution.
Specifically, a circ attempt that we'd launched while the network was
down could timeout after we've marked our entrynodes up, marking them
back down again. The fix is to annotate as bad the OR conns that were
around before we did the retry, so if a circuit that's attached to them
times out we don't do anything about it.
This is needed for IOCP, since telling the IOCP backend about all
your CPUs is a good idea. It'll also come in handy with asn's
multithreaded crypto stuff, and for people who run servers without
reading the manual.
automake 1.6 doesn't like using a conditional += to add stuff to foo_LDADD.
Instead you need to conditionally define a variable, then non-conditionally
put that variable in foo_LDADD.
This requires the latest Git version of Libevent as of 24 March 2010.
In the future, we'll just say it requires Libevent 2.0.5-alpha or
later.
Since Libevent doesn't yet support hierarchical rate limit groups,
there isn't yet support for tracking relayed-bytes separately when
using the bufferevent system. If a future version does add support
for hierarchical buckets, we can add that back in.
Roger correctly pointed out that my code was broken for accounting
periods that shifted forwards, since
start_of_accounting_period_containing(interval_start_time) would not
be equal to interval_start_time, but potentially much earlier.
The RefuseUnknownExits config option is now a tristate, with "1"
meaning "enable it no matter what the consensus says", "0" meaning
"disable it no matter what the consensus says", and "auto" meaning "do
what the consensus says". If the consensus is silent, we enable
RefuseUnknownExits.
This patch also changes the dirserv logic so that refuseunknownexits
won't make us cache unless we're an exit.
Do not double-report signatures from unrecognized authorities both as
"from unknown authority" and "not present". Fixes bug 1956, bugfix on
0.2.2.16-alpha.
Our attempt to make compilation work on old versions of Windows
again while keeping wince compatibility broke the build for Win2k+.
helix reports this patch fixes the issue for WinXP. Bugfix on
0.2.2.15-alpha; related to bug 1797.
When the CellStatistics option is off, we don't store cell insertion
times. Doing so would also not be very smart, because there seem to
still be some performance issues with this type of statistics. Nothing
harmful happens when we don't have insertion times, so we don't need to
alarm the user.
Previously[*], the function would start with the first stream on the
circuit, and let it package as many cells as it wanted before
proceeding to the next stream in turn. If a circuit had many live
streams that all wanted to package data, the oldest would get
preference, and the newest would get ignored.
Now, we figure out how many cells we're willing to send per stream,
and try to allocate them fairly.
Roger diagnosed this in the comments for bug 1298.
[*] This bug has existed since before the first-ever public release
of Tor. It was added by r152 of Tor on 26 Jan 2003, which was
the first commit to implement streams (then called "topics").
This is not the oldest bug to be fixed in 0.2.2.x: that honor
goes to the windowing bug in r54, which got fixed in e50b7768 by
Roger with diagnosis by Karsten. This is, however, the most
long-lived bug to be fixed in 0.2.2.x: the r54 bug was fixed
2580 days after it was introduced, whereas I am writing this
commit message 2787 days after r152.
I'm going to use this to implement more fairness in
circuit_resume_edge_reading_helper in an attempt to fix bug 1298.
(Updated with fixes from arma and Sebastian)
An authority should never download a consensus if it has a live one,
but when it doesn't, it should admit that it's not going to get one,
and see if anybody else can give it one.
Fixes 1300, fix on 0.2.0.9-alpha
Bridges and other relays not included in the consensus don't
necessarily have a non-zero bandwidth capacity. If all our
configured bridges had a zero bw capacity we would warn the
user. Change that.
Previously, we were also considering the time spent in
soft-hibernation. If this was a long time, we would wind up
underestimating our bandwidth by a lot, and skewing our wakeup time
towards the start of the accounting interval.
This patch also makes us store a few more fields in the state file,
including the time at which we entered soft hibernation.
Fixes bug 1789. Bugfix on 0.0.9pre5.
This will make changes for DST still work, and avoid double-spending
bytes when there are slight changes to configurations.
Fixes bug 1511; the DST issue is a bugfix on 0.0.9pre5.
When we introduced the code to close non-open OR connections after
KeepalivePeriod had passed, we replaced some code that said
if (!connection_is_open(conn)) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
with new code that said
if (!connection_is_open(conn) && past_keepalive) {
/* let it keep handshaking forever */
} else if (do other tests here) {
...
This was a mistake, since it made all the other tests start applying
to non-open connections, thus causing bug 1840, where non-open
connections get closed way early.
Fixes bug 1840. Bugfix on 0.2.1.26 (commit 67b38d50).
- Move checks for extra_info to callers
- Change argument name from failed to descs
- Use strlen("fp/") instead of a magic number
- I passed on the suggestion to rename functions from *_failed() to
*_handle_failure(). There are a lot of these so for now just follow
the house style.
It's normal when bootstrapping to have a lot of different certs
missing, so we don't want missing certs to make us warn... unless
the certs we're missing are ones that we've tried to fetch a couple
of times and failed at.
May fix bug 1145.
We frequently add cells to stream-blocked queues for valid reasons
that don't mean we need to block streams. The most obvious reason
is if the cell arrives over a circuit rather than from an edge: we
don't block circuits, no matter how full queues get. The next most
obvious reason is that we allow CONNECTED cells from a newly created
stream to get delivered just fine.
This patch changes the behavior so that we only iterate over the
streams on a circuit when the cell in question came from a stream,
and we only block the stream that generated the cell, so that other
streams can still get their CONNECTEDs in.
This should keep WinCE working (unicode always-on) and get Win98
working again (unicode never-on).
There are two places where we explicitly use ASCII-only APIs, still:
in ntmain.c and in the unit tests.
This patch also fixes a bug in windoes tor_listdir that would cause
the first file to be listed an arbitrary number of times that was
also introduced with WinCE support.
Should fix bug 1797.
Setting CookieAuthFileGroupReadable but without setting CookieAuthFile makes
no sense, because unix directory permissions for the data directory prevent
the group from accessing the file anyways.
When this happens, run through the streams on the circuit and make
sure they're all blocked. If some aren't, that's a bug: block them
all and log it! If they all are, where did the cell come from? Log
it!
(I suspect that this actually happens pretty frequently, so I'm making
these log messages appear at INFO.)
Do not start reading on exit streams when we get a SENDME unless we
have space in the appropriate circuit's cell queue.
Draft fix for bug 1653.
(commit message by nickm)
router_add_to_routerlist() is supposed to be a nice minimal function
that only touches the routerlist structures, but it included a call to
dirserv_single_reachability_test().
We have a function that gets called _after_ adding descriptors
successfully: routerlist_descriptors_added. This patch moves the
responsibility for testing there.
Because the decision of whether to test or not depends on whether
there was an old routerinfo for this router or not, we have to first
detect whether we _will_ want to run the tests if the router is added.
We make this the job of
routers_update_status_from_consensus_networkstatus().
Finally, this patch makes the code notice if a router is going from
hibernating to non-hibernating, and if so causes a reachability test
to get launched.
This solves the problem Roger noted as:
What if the router has a clock that's 5 minutes off, so it
publishes a descriptor for 5 minutes in the future, and we test it
three minutes in. In this edge case, we will continue to advertise
it as Running for the full 45 minute period.
If the voting interval was short enough, the two-minutes delay
of CONSENSUS_MIN_SECONDS_BEFORE_CACHING would confuse bridges
to the point where they would assert before downloading a consensus.
It it was even shorter (<4 minutes, I think), caches would
assert too. This patch fixes that by having replacing the
two-minutes value with MIN(2 minutes, interval/16).
Bugfix for 1141; the cache bug could occur since 0.2.0.8-alpha, so
I'm calling this a bugfix on that. Robert Hogan diagnosed this.
Done as a patch against maint-0.2.1, since it makes it hard to
run some kinds of testing networks.
requests to authorities fail due to a network error.
Bug#1138
"When a Tor client starts up using a bridge, and UpdateBridgesFromAuthority
is set, Tor will go to the authority first and look up the bridge by
fingerprint. If the bridge authority is filtered, Tor will never notice that
the bridge authority lookup failed. So it will never fall back."
Add connection_dir_bridge_routerdesc_failed(), a function for unpacking
the bridge information from a failed request, and ensure
connection_dir_request_failed() calls it if the failed request
was for a bridge descriptor.
Test:
1. for ip in `grep -iR 'router ' cached-descriptors|cut -d ' ' -f 3`;
do sudo iptables -A OUTPUT -p tcp -d $ip -j DROP; done
2. remove all files from user tor directory
3. Put the following in torrc:
UseBridges 1
UpdateBridgesFromAuthority 1
Bridge 85.108.88.19:443 7E1B28DB47C175392A0E8E4A287C7CB8686575B7
4. Launch tor - it should fall back to downloading descriptors
directly from the bridge.
Initial patch reviewed and corrected by mingw-san.
https://trac.torproject.org/projects/tor/ticket/1525
"The codepath taken by the control port "RESOLVE" command to create a
synthetic SOCKS resolve request isn't the same as the path taken by
a real SOCKS request from 'tor-resolve'.
This prevents controllers who set LeaveStreamsUnattached=1 from
being able to attach RESOLVE streams to circuits of their choosing."
Create a new function connection_ap_rewrite_and_attach_if_allowed()
and call that when Tor needs to attach a stream to a circuit but
needs to know if the controller permits it.
No tests added.
With this patch we stop scheduling when we should write statistics using a
single timestamp in run_scheduled_events(). Instead, we remember when a
statistics interval starts separately for each statistic type in geoip.c
and rephist.c. Every time run_scheduled_events() tries to write stats to
disk, it learns when it should schedule the next such attempt.
This patch also enables all statistics to be stopped and restarted at a
later time.
This patch comes with a few refactorings, some of which were not easily
doable without the patch.
We already had the country code ?? indicating an unknown country, so all we
needed to do to make unknown countries excludable was to make the ?? code
discoverable.
It's okay to get (say) a SocksPort line in the torrc, and then a
SocksPort on the command line to override it, and then a SocksPort via
a controller to override *that*. But if there are two occurrences of
SocksPort in the torrc, or on the command line, or in a single SETCONF
command, then the user is likely confused. Our old code would not
help unconfuse the user, but would instead silently ignore all but
the last occurrence.
This patch changes the behavior so that if the some option is passed
more than once to any torrc, command line, or SETCONF (each of which
coincidentally corresponds to a call to config_assign()), and the
option is not a type that allows multiple occurrences (LINELIST or
LINELIST_X), then we can warn the user.
This closes trac entry 1384.
At best, this patch helps us avoid sending queued relayed cells that
would get ignored during the time between when a destroy cell is
sent and when the circuit is finally freed. At worst, it lets us
release some memory a little earlier than it would otherwise.
Fix for bug #1184. Bugfix on 0.2.0.1-alpha.
The next series of commits begins addressing the issue that we're
currently including the complete or.h file in all of our source files.
To change that, we're splitting function definitions into new header
files (one header file per source file).
Right now it says "552 internal error" because there's no way for
getinfo_helper_*() countries to specify an error message. This
patch changes the getinfo_helper_*() interface, and makes most of the
getinfo helpers give useful error messages in response to failures.
This should prevent recurrences of bug 1699, where a missing GeoIPFile
line in the torrc made GETINFO ip-to-county/* fail in a "not obvious
how to fix" way.
V3 authorities no longer decide not to vote on Guard+Exit. The bandwidth
weights should take care of this now.
Also, lower the max threshold for WFU to 0.98, to allow more nodes to become
guards.
This should make us conflict less with system files named "log.h".
Yes, we shouldn't have been conflicting with those anyway, but some
people's compilers act very oddly.
The actual change was done with one "git mv", by editing
Makefile.am, and running
find . -name '*.[ch]' | xargs perl -i -pe 'if (/^#include.*\Wlog.h/) {s/log.h/torlog.h/; }'
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.
Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possble during live operation, due to network liveness filters
and discard logic.
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.
It's also extremely not-nice to write a time to the log as an
integer. Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
These timers behave better with non-monotonic clocks than our old
ones, and also try harder to make once-per-second events get called
one second apart, rather than one-plus-epsilon seconds apart.
This fixes bug 943 for everybody using Libevent 2.0 or later.
We need to ensure that we close timeout measurement circuits. While
we're at it, we should close really old circuits of certain types that
aren't in use, and log really old circuits of other types.
We need to record different statistics at point of timeout, vs the point
of forcible closing.
Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
In rare cases, we could cannibalize a one-hop circuit, ending up
with a two-hop circuit. This circuit would not be actually used,
but we should prevent its creation in the first place.
Thanks to outofwords and swissknife for helping to analyse this.
Most of the changes here are switches to use APIs available on Windows
CE. The most pervasive change is that Windows CE only provides the
wide-character ("FooW") variants of most of the windows function, and
doesn't support the older ASCII verions at all.
This patch will require use of the wcecompat library to get working
versions of the posix-style fd-based file IO functions.
[commit message by nickm]
Back when we changed the idea of a connection being "too old" for new
circuits into the connection being "bad" for new circuits, we didn't
actually change the info messages. This led to telling the user that
we were labelling connections as "too old" for being worse than
connections that were actually older than them.
Found by Scott on or-talk.
There are now four ways that CBT can be disabled:
1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.
It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.
The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
what's happening here is that we're fetching certs for obsolete
authorities -- probably legacy signers in this case. but try to
remain general in the log message.
It's natural for the definition of bandwidth_rule_t to be with the functions
that actually care about its values. Unfortunately, this means declaring
bandwidth_rate_rule_to_string() out of sequence. Someday we'll just rename
reasons.c to strings.c, and put it at the end of or.h, and this will all be
better.
Works like the --enable-static-openssl/libevent options. Requires
--with-zlib-dir to be set. Note that other dependencies might still
pull in a dynamicly linked zlib, if you don't link them in statically
too.
Everything that accepted the 'Circ' name handled it wrong, so even now
that we fixed the handling of the parameter, we wouldn't be able to
set it without making all the 0.2.2.7..0.2.2.10 relays act wonky.
This patch makes Tors accept the 'Circuit' name instead, so we can
turn on circuit priorities without confusing the versions that treated
the 'Circ' name as occasion to act weird.
I'm adding this because I can never remember what stuff like 'rule 3'
means. That's the one where if somebody goes limp or taps out, the
fight is over, right?
When you mean (a=b(c,d)) >= 0, you had better not say (a=b(c,d)>=0).
We did the latter, and so whenever CircPriorityHalflife was in the
consensus, it was treated as having a value of 1 msec (that is,
boolean true).
We need to make sure we have an event_base in dns.c before we call
anything that wants one. Make sure we always have one in dns_reset()
when we're a client. Fixes bug 1341.
Now if you're a published relay and you set RefuseUnknownExits, even
if your dirport is off, you'll fetch dir info from the authorities,
fetch it early, and cache it.
In the future, RefuseUnknownExits (or something like it) will be on
by default.
From http://archives.seul.org/tor/relays/Mar-2010/msg00006.html :
As I understand it, the bug should show up on relays that don't set
Address to an IP address (so they need to resolve their Address
line or their hostname to guess their IP address), and their
hostname or Address line fails to resolve -- at that point they'll
pick a random 4 bytes out of memory and call that their address. At
the same time, relays that *do* successfully resolve their address
will ignore the result, and only come up with a useful address if
their interface address happens to be a public IP address.
When the bandwidth-weights branch added the "directory-footer"
token, and began parsing the directory footer at the first
occurrence of "directory-footer", it made it possible to fool the
parsing algorithm into accepting unsigned data at the end of a
consensus or vote. This patch fixes that bug by treating the footer
as starting with the first "directory-footer" or the first
"directory-signature", whichever comes first.
Treat strings returned from signed_descriptor_get_body_impl() as not
NUL-terminated. Since the length of the strings is available, this is
not a big problem.
Discovered by rieo.
Don't allow anything but directory-signature tokens in a consensus after
the first directory-signature token. Fixes bug in bandwidth-weights branch.
Found by "outofwords."
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.
Thanks to ekir for discovering and reporting this bug.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
This means that "if (E<G) {abc} else if (E>=G) {def}" can be replaced with
"if (E<G) {abc} else {def}"
Doing the second test explicitly made my mingw gcc nervous that we might
never be initializing casename.
For my 64-bit Linux system running with GCC 4.4.3-fc12-whatever, you
can't do 'printf("%lld", (int64_t)x);' Instead you need to tell the
compiler 'printf("%lld", (long long int)x);' or else it doesn't
believe the types match. This is why we added U64_PRINTF_ARG; it
looks like we needed an I64_PRINTF_ARG too.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.
Our tor_asprintf() differs from standard asprintf() in that:
- Like our other malloc functions, it asserts on OOM.
- It works on windows.
- It always sets its return-field.
All other bandwidthrate settings are restricted to INT32_MAX, but
this check was forgotten for PerConnBWRate and PerConnBWBurst. Also
update the manpage to reflect the fact that specifying a bandwidth
in terabytes does not make sense, because that value will be too
large.
Fix a dereference-then-NULL-check sequence. This bug wasn't triggered
in the wild, but we should fix it anyways in case it ever happens.
Also make sure users get a note about this being a bug when they
see it in their log.
Thanks to ekir for discovering and reporting this bug.
On Windows, we don't have a notion of ~ meaning "our homedir", so we
were deliberately using an #ifdef to avoid calling expand_filename()
in multiple places. This is silly: The right place to turn a function
into a no-op on a single platform is in the function itself, not in
every single call-site.
Spec conformance issue: The code didn't force the network-status-version
token to be the first token in a v3 vote or consensus.
Problem discovered by Parakeep.
We need to use evdns_add_server_port_with_base() when configuring
our DNS listener, because libevent segfaults otherwise. Add a macro
in compat_libevent.h to pick the correct implementation depending
on the libevent version.
Fixes bug 1143, found by SwissTorExit
The src and dest of a memcpy() call aren't supposed to overlap,
but we were sometimes calling tor_addr_copy() as a no-op.
Also, tor_addr_assign was a redundant copy of tor_addr_copy(); this patch
removes it.
We implemented ratelimiting for warnings going into the logfile, but didn't
rate-limit controller events. Now both log warnings and controller events
are rate-limited.
Tor has tor_lookup_hostname(), which prefers ipv4 addresses automatically.
Bug 1244 occured because gethostbyname() returned an ipv6 address, which
Tor cannot handle currently. Fixes bug 1244; bugfix on 0.0.2pre25.
Reported by Mike Mestnik.
The problem was that we didn't allocate enough memory on 32-bit
platforms with 64-bit time_t. The memory leak occured every time
we fetched a hidden service descriptor we've fetched before.
When calculating the is_exit flag for a routerinfo_t, we don't need
to call exit_policy_is_general_exit() if router_exit_policy_rejects_all()
tells us it definitely is an exit. This check is much cheaper than
running exit_policy_is_general_exit().
exit_policy_is_general_exit() assumed that there are no redundancies
in the passed policy, in the sense that we actively combine entries
in the policy to really get rid of any redundancy. Since we cannot
do that without massively rewriting the policy lines the relay
operators set, fix exit_policy_is_general_exit().
Fixes bug 1238, discovered by Martin Kowalczyk.
We accidentally freed the internal buffer for bridge stats when we
were writing the bridge stats file or honoring a control port
request for said data. Change the interfaces for
geoip_get_bridge_stats* to prevent these problems, and remove the
offending free/add a tor_strdup.
Fixes bug 1208.
It's a bit confusing to have a loop where another function,
confusingly named "*_free", is responsible for advancing the loop
variable (or rather, for altering a structure so that the next time
the loop variable's initializer is evaluated it evaluates to something
different.)
Not only has this confused people: it's also confused coverity scan.
Let's fix that.
This was freaking out some relay operators without good reason, as
it is nothing the relay operator can do anything about anyways.
Quieting this warning suggested by rieo.
The OutboundBindAddress option is useful for making sure that all of
your outbond connections use a given interface. But when connecting
to 127.0.0.1 (or ::1 even) it's important to actually have the
connection come _from_ localhost, since lots of programs running on
localhost use the source address to authenticate that the connection
is really coming from the same host.
Our old code always bound to OutboundBindAddress, whether connecting
to localhost or not. This would potentially break DNS servers on
localhost, and socks proxies on localhost. This patch changes the
behavior so that we only look at OutboundBindAddress when connecting
to a non-loopback address.
this case can now legitimately happen, if you have a cached v2 status
from moria1, and you run with the new list of dirservers that's missing
the old moria1. it's nothing to worry about; the file will die off in
a month or two.
...to let us
rate-limit client connections as they enter the network. It's
controlled in the consensus so we can turn it on and off for
experiments. It's starting out off. Based on proposal 163.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.
Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.
Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.
Also, now we refresh your entry guards from EntryNode at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).
The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
We do this in too many places throughout the code; it's time to start
clamping down.
Also, refactor Karsten's patch to use strchr-then-strndup, rather than
malloc-then-strlcpy-then-strchr-then-clear.
Fix statistics on client numbers by country as seen by bridges that were
broken in 0.2.2.1-alpha. Also switch to reporting full 24-hour intervals
instead of variable 12-to-48-hour intervals.
The HSAuthorityRecordStats option was used to track statistics of overall
hidden service usage on the version 0 hidden service authorities. With the
version 2 hidden service directories being deployed and version 0
descriptors being phased out, these statistics are not as useful anymore.
Goodbye, you fine piece of software; my first major code contribution to
Tor.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log." safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
The rule is now: take the value from the CircuitPriorityHalflife
config option if it is set. If it zero, disable the cell_ewma
algorithm. If it is set, use it to calculate the scaling factor.
If it is not set, look for a CircPriorityHalflifeMsec parameter in the
consensus networkstatus. If *that* is zero, then disable the cell_ewma
algorithm; if it is set, use it to calculate the scaling factor.
If it is not set at all, disable the algorithm.
In connection_dir_client_reached_eof, we make sure that we either
return when we get an http status code of 503 or handle the problem
and set it to 200. Later we check if the status code is 503. Remove
that check.
There are two big changes here:
- We store active circuits in a priority queue for each or_conn,
rather than doing a linear search over all the active circuits
before we send each cell.
- Rather than multiplying every circuit's cell-ewma by a decay
factor every time we send a cell (thus normalizing the value of a
current cell to 1.0 and a past cell to alpha^t), we instead
only scale down the cell-ewma every tick (ten seconds atm),
normalizing so that a cell sent at the start of the tick has
value 1.0).
Each circuit is ranked in terms of how many cells from it have been
relayed recently, using a time-weighted average.
This patch has been tested this on a private Tor network on PlanetLab,
and gotten improvements of 12-35% in time it takes to fetch a small
web page while there's a simultaneous large data transfer going on
simultaneously.
[Commit msg by nickm based on mail from Ian Goldberg.]
This changes the pqueue API by requiring an additional int in every
structure that we store in a pqueue to hold the index of that structure
within the heap.
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.
This gains us consistence for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
Do not segfault when writing buffer stats when we haven't observed a
single circuit to report about. This is a minor bug that would only show
up in testing environments with no traffic and with reduced stats
intervals.
Avoid crashing if the client is trying to upload many bytes and the
circuit gets torn down at the same time, or if the flip side
happens on the exit relay. Bugfix on 0.2.0.1-alpha; fixes bug 1150.
New config option "CircuitStreamTimeout" to override our internal
timeout schedule for how many seconds until we detach a stream from
a circuit and try a new circuit. If your network is particularly
slow, you might want to set this to a number like 60.
To fix a major security problem related to incorrect use of
SSL/TLS renegotiation, OpenSSL has turned off renegotiation by
default. We are not affected by this security problem, however,
since we do renegotiation right. (Specifically, we never treat a
renegotiated credential as authenticating previous communication.)
Nevertheless, OpenSSL's new behavior requires us to explicitly
turn renegotiation back on in order to get our protocol working
again.
Amusingly, this is not so simple as "set the flag when you create
the SSL object" , since calling connect or accept seems to clear
the flags.
For belt-and-suspenders purposes, we clear the flag once the Tor
handshake is done. There's no way to exploit a second handshake
either, but we might as well not allow it.
This commit implements a new config option: 'DisableAllSwap'
This option probably only works properly when Tor is started as root.
We added two new functions: tor_mlockall() and tor_set_max_memlock().
tor_mlockall() attempts to mlock() all current and all future memory pages.
For tor_mlockall() to work properly we set the process rlimits for memory to
RLIM_INFINITY (and beyond) inside of tor_set_max_memlock().
We behave differently from mlockall() by only allowing tor_mlockall() to be
called one single time. All other calls will result in a return code of 1.
It is not possible to change DisableAllSwap while running.
A sample configuration item was added to the torrc.complete.in config file.
A new item in the man page for DisableAllSwap was added.
Thanks to Moxie Marlinspike and Chris Palmer for their feedback on this patch.
Please note that we make no guarantees about the quality of your OS and its
mlock/mlockall implementation. It is possible that this will do nothing at all.
It is also possible that you can ulimit the mlock properties of a given user
such that root is not required. This has not been extensively tested and is
unsupported. I have included some comments for possible ways we can handle
this on win32.
If your relay can't keep up with the number of incoming create cells, it
would log one warning per failure into your logs. Limit warnings to 1 per
minute.
This was left over from an early draft of the microdescriptor code; it
began to populate the signatures array of a networkstatus vote, even
though there's no actual need to do that for a vote.
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement. Coverity notices this now.
In some cases, we were testing arrays to make sure that an operation
we wanted to do would suceed. Those cases are now always-true.
In some cases, we were testing arrays to see if something was _set_.
Those caes are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
If all authorities restart at once right before a consensus vote, nobody
will vote about "Running", and clients will get a consensus with no usable
relays. Instead, authorities refuse to build a consensus if this happens.
The first happens on an error case when a controller wants an
impossible directory object. The second happens when we can't write
our fingerprint file.
The code for these was super-wrong, but will only break things when we
reset an option on a platform where sizeof(time_t) is different from
sizeof(int).
See task 1114. The most plausible explanation for someone sending us weak
DH keys is that they experiment with their Tor code or implement a new Tor
client. Usually, we don't care about such events, especially not on warn
level. If we really care about someone not following the Tor protocol, we
can set ProtocolWarnings to 1.
This patch introduces a new type called document_signature_t to represent the
signature of a consensus document. Now, each consensus document can have up
to one document signature per voter per digest algorithm. Also, each
detached-signatures document can have up to one signature per <voter,
algorithm, flavor>.
Previously, we insisted that a valid signature must be a signature of
the expected digest. Now we accept anything that starts with the
expected digest. This lets us include another digest later.
When we tried to use the deprecated non-threadsafe evdns
interfaces in Libevent 2 without using the also-deprecated
event_init() interface, Libevent 2 would sensibly crash, since it
has no guess where to find the Libevent library.
Here we use the evdns_base_*() functions instead if they're
present, and fake them if they aren't.
This is a possible fix for bug 1023, where if we vote (or make a v2
consensus networkstatus) right after we come online, we can call
rep_hist_note_router_unreachable() on every router we haven't connected
to yet, and thereby make all their uptime values reset.
This seems to be happening to me a lot on a garbage DSL line.
We may need to come up with 2 threshholds: a high short onehop
count and a lower longer count.
Don't count one-hop circuits when we're estimating how long it
takes circuits to build on average. Otherwise we'll set our circuit
build timeout lower than we should. Bugfix on 0.2.2.2-alpha.
Directory authorities now reject Tor relays with versions less than
0.1.2.14. This step cuts out four relays from the current network,
none of which are very big.
This shouldn't be necessary, but apparently the Android cross-compiler
doesn't respect -I as well as it should. (-I is supposed to add to the
*front* of the search path. Android's gcc wrapper apparently likes to add to
the end. This is broken, but we need to work around it.)
- Avoid memmoving 0 bytes which might lead to compiler warnings.
- Don't require relays to be entry node AND bridge at the same to time to
record clients.
- Fix a memory leak when writing dirreq-stats.
- Don't say in the stats files that measurement intervals are twice as long
as they really are.
- Reduce minimum observation time for requests to 12 hours, or we might
never record usage.
- Clear exit stats correctly after writing them, or we accumulate old stats
over time.
- Reset interval start for buffer stats, too.
The big change is to add a function to display the current SSL handshake
state, and to log it everywhere reasonable. (A failure in
SSL23_ST_CR_SRVR_HELLO_A is different from one in
SSL3_ST_CR_SESSION_TICKET_A.)
This patch also adds a new log domain for OR handshaking, so you can pull out
all the handshake log messages without having to run at debug for everything.
For example, you'd just say "log notice-err [handshake]debug-err file
tor.log".
This was the only log notice that happened during other
tor invocations, like --verify-config and --list-fingerprint.
Plus, now we think it works, so no need to hear about it.
"Tinytest" is a minimalist C unit testing framework I wrote for
Libevent. It supports some generally useful features, like being able
to run separate unit tests in their own processes.
I tried to do the refactoring to change test.c as little as possible.
Thus, we mostly don't call the tinytest macros directly. Instead, the
test.h header is now a wrapper on tinytest.h to make our existing
test_foo() macros work.
The next step(s) here will be:
- To break test.c into separate files, each with its own test group.
- To look into which things we can test
- To refactor the more fiddly tests to use the tinytest macros
directly and/or run forked.
- To see about writing unit tests for things we couldn't previously
test without forking.
If the networkstatus consensus tells us that we should use a
negative circuit package window, ignore it. Otherwise we'll
believe it and then trigger an assert.
Also, change the interface for networkstatus_get_param() so we
don't have to lookup the consensus beforehand.
A) We were considering a circuit had timed out in the special cases
where we close rendezvous circuits because the final rendezvous
circuit couldn't be built in time.
B) We were looking at the wrong timestamp_created when considering
a timeout.
Don't discard all circuits every MaxCircuitDirtiness, because the
user might legitimately have set that to a very lower number.
Also don't use up all of our idle circuits with testing circuits,
since that defeats the point of preemptive circuits.
We want it to be under our control so it doesn't mess
up initialization. This is likely the cause for
the bug the previous assert-adding commit (09a75ad) was
trying to address.
Using CircuitBuildTimeout is prone to issues with SIGHUP, etc.
Also, shuffle the circuit build times array after loading it
in so that newer measurements don't replace chunks of
similarly timed measurements.
To further attempt to fix bug 1090, make sure connection_ap_can_use_exit
always returns 0 when the chosen exit router is excluded. This should fix
bug1090.
When we excluded some Exits, we were sometimes warning the user that we
were going to use the node regardless. Many of those warnings were in
fact bogus, because the relay in question was not used to connect to
the outside world.
Based on patch by Rotor, thanks!
Tor now reads the "circwindow" parameter out of the consensus,
and uses that value for its circuit package window rather than the
default of 1000 cells. Begins the implementation of proposal 168.
This code adds a new field to vote on: "params". It consists of a list of
sorted key=int pairs. The output is computed as the median of all the
integers for any key on which anybody voted.
Improved with input from Roger.
Adding the same vote to a networkstatus consensus leads to a memory leak
on the client side. Fix that by only using the first vote from any given
voter, and ignoring the others.
Problem found by Rotor, who also helped writing the patch. Thanks!
A vote may only contain exactly one signature. Make sure we reject
votes that violate this.
Problem found by Rotor, who also helped writing the patch. Thanks!
Fix an obscure bug where hidden services on 64-bit big-endian
systems might mis-read the timestamp in v3 introduce cells, and
refuse to connect back to the client. Discovered by "rotor".
Bugfix on 0.2.1.6-alpha.
Add a "getinfo status/accepted-server-descriptor" controller
command, which is the recommended way for controllers to learn
whether our server descriptor has been successfully received by at
least on directory authority. Un-recommend good-server-descriptor
getinfo and status events until we have a better design for them.
We were telling the controller about CHECKING_REACHABILITY and
REACHABILITY_FAILED status events whenever we launch a testing
circuit or notice that one has failed. Instead, only tell the
controller when we want to inform the user of overall success or
overall failure. Bugfix on 0.1.2.6-alpha. Fixes bug 1075. Reported
by SwissTorExit.
When we added support for fractional units (like 1.5 MB) I broke
support for giving units with no space (like 2MB). This patch should
fix that. It also adds a propoer tor_parse_double().
Fix for bug 1076. Bugfix on 0.2.2.1-alpha.
We were triggering a CLOCK_SKEW controller status event whenever
we connect via the v2 connection protocol to any relay that has
a wrong clock. Instead, we should only inform the controller when
it's a trusted authority that claims our clock is wrong. Bugfix
on 0.2.0.20-rc; starts to fix bug 1074. Reported by SwissTorExit.
- Refactor geoip.c by moving duplicate code into rotate_request_period().
- Don't leak memory when cleaning up cell queues.
- Make sure that exit_(streams|bytes_(read|written)) are initialized in all
places accessing these arrays.
- Read only the last block from *stats files and ensure that its timestamp
is not more than 25 hours in the past and not more than 1 hour in the
future.
- Stop truncating the last character when reading *stats files.
The only thing that's left now is to avoid reading whole *stats files into
memory.
Note that unlike subversion revision numbers, it isn't meaningful to
compare these for anything but equality. We define a sort-order anyway,
in case one of these accidentally slips into a recommended-versions
list.
If any the v3 certs we download are unparseable, we should actually
notice the failure so we don't retry indefinitely. Bugfix on 0.2.0.x;
reported by "rotator".
The more verbose logs that were added in ee58153 also include a string
that might not have been initialized. This can lead to segfaults, e.g.,
when setting up private Tor networks. Initialize this string with NULL.
Send circuit or stream sendme cells when our window has decreased
by 100 cells, not when it has decreased by 101 cells. Bug uncovered
by Karsten when testing the "reduce circuit window" performance
patch. Bugfix on the 54th commit on Tor -- from July 2002,
before the release of Tor 0.0.0. This is the new winner of the
oldest-bug prize.
I don't think we actually use (or plan to use) strtok_r in a reentrant
way anywhere in our code, but would be nice not to have to think about
whether we're doing it.
Relays no longer publish a new server descriptor if they change
their MaxAdvertisedBandwidth config option but it doesn't end up
changing their advertised bandwidth numbers. Bugfix on 0.2.0.28-rc;
fixes bug 1026. Patch from Sebastian.
Specifically, every time we get a create cell but we have so many already
queued that we refuse it.
Bugfix on 0.2.0.19-alpha; fixes bug 1034. Reported by BarkerJr.
The problem is that clients and hidden services are receiving
relay_early cells, and they tear down the circuit.
Hack #1 is for rendezvous points to rewrite relay_early cells to
relay cells. That way there are never any incoming relay_early cells.
Hack #2 is for clients and hidden services to never send a relay_early
cell on an established rendezvous circuit. That works around rendezvous
points that haven't upgraded yet.
Hack #3 is for clients and hidden services to not tear down the circuit
when they receive an inbound relay_early cell. We already refuse extend
cells at clients.
When determining how long directory requests take or how long cells spend
in queues, we were comparing timestamps on microsecond detail only to
convert results to second or millisecond detail later on. But on 32-bit
architectures this means that 2^31 microseconds only cover time
differences of up to 36 minutes. Instead, compare timestamps on
millisecond detail.
Now that we require EntryStatistics to be 1 for counting connecting
clients, unit tests need to set that config option, too.
Reported by Sebastian Hahn.
Changes to directory request statistics:
- Rename GEOIP statistics to DIRREQ statistics, because they now include
more than only GeoIP-based statistics, whereas other statistics are
GeoIP-dependent, too.
- Rename output file from geoip-stats to dirreq-stats.
- Add new config option DirReqStatistics that is required to measure
directory request statistics.
- Clean up ChangeLog.
Also ensure that entry guards statistics have access to a local GeoIP
database.
This new option will allow clients to download the newest fresh consensus
much sooner than they normally would do so, even if they previously set
FetchDirInfoEarly. This includes a proper ChangeLog entry and an updated man
page.
Introduce a threshold of 0.01% of bytes that must be read and written per
port in order to be included in the statistics. Otherwise we cannot include
these statistics in extra-info documents, because they are too big.
Change the labels "-written" and "-read" so that the meanings are as
intended.
The internal error "could not find intro key" occurs when we want to send
an INTRODUCE1 cell over a recently finished introduction circuit and think
we built the introduction circuit with a v2 hidden service descriptor, but
cannot find the introduction key in our descriptor.
My first guess how we can end up in this situation is that we are wrong in
thinking that we built the introduction circuit based on a v2 hidden
service descriptor. This patch checks if we have a v0 descriptor, too, and
uses that instead.
o Minor features:
- If we're a relay and we change our IP address, be more verbose
about the reason that made us change. Should help track down
further bugs for relays on dynamic IP addresses.
when we write out our stability info, detect relays that have slipped
through the cracks. log about them and correct the problem.
if we continue to see a lot of these over time, it means there's another
spot where relays fall out of the routerlist without being marked as
unreachable.
Marks the control port connection for flushing before closing when
the QUIT command is issued. This allows a QUIT to be issued during
a long reply over the control port, flushing the reply and then
closing the connection. Fixes bug 1015.
rather than the bandwidth values in each relay descriptor. This approach
opens the door to more accurate bandwidth estimates once the directory
authorities start doing active measurements. Implements more of proposal
141.
arma's rationale: "I think this is a bug, since people intentionally
set DirPortFrontPage, so they really do want their relay to serve that
page when it's asked for. Having it appear only sometimes (or roughly
never in Sebastian's case) makes it way less useful."
Fixes bug 1013; bugfix on 0.2.1.8-alpha.
Added a sanity check in config.c and a check in directory.c
directory_initiate_command_rend() to catch any direct connection attempts
when a socks proxy is configured.
The rest of the code was only including event.h so that it could see
EV_READ and EV_WRITE, which we were using as part of the
connection_watch_events interface for no very good reason.
This patch adds a new compat_libevent.[ch] set of files, and moves our
Libevent compatibility and utilitity functions there. We build them
into a separate .a so that nothing else in src/commmon depends on
Libevent (partially fixing bug 507).
Also, do not use our own built-in evdns copy when we have Libevent
2.0, whose evdns is finally good enough (thus fixing Bug 920).
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Fix an edge case where a malicious exit relay could convince a
controller that the client's DNS question resolves to an internal IP
address. Bug found and fixed by "optimist"; bugfix on 0.1.2.8-beta.
Hidden service clients didn't use a cached service descriptor that
was older than 15 minutes, but wouldn't fetch a new one either. Now,
use a cached descriptor no matter how old it is and only fetch a new
one when all introduction points fail. Fix for bug 997. Patch from
Marcus Griep.
Apparently all the stuff that does a linear scan over all the DNS
cache entries can get really expensive when your DNS cache is very
large. It's hard to say how much this will help performance, since
gprof doesn't count time spent in OpenSSL or zlib, but I'd guess 10%.
Also, this patch removes calls to assert_connection_ok() from inside
the read and write callbacks, which are similarly unneeded, and a
little costlier than I'm happy with.
This is probably worth backporting to 0.2.0.
Provide a useful warning when launch_circuit tries to make us use a
node we don't want to use. Just give an info message when this is a
normal and okay situation. Fix for logging issues in bug 984.
This patch adds a function to determine whether we're in the main
thread, and changes control_event_logmsg() to return immediately if
we're in a subthread. This is necessary because otherwise we will
call connection_write_to_buf, which modifies non-locked data
structures.
Bugfix on 0.2.0.x; fix for at least one of the things currently
called "bug 977".
Tas (thanks!) noticed that when *ListenAddress is set, Tor would
still warn on startup when *Port is low and hibernation is active.
The patch parses all the *ListenAddress lines, and checks the
ports. Bugfix on 0.2.1.15-rc
With the last fix of task 932 (5f03d6c), client requests are only added to
the history when they happen after the start of the current history. This
conflicts with the unit tests that insert current requests first (defining
the start of the client request history) followed by requests in the past.
The fix is to insert requests in chronological order in the unit tests.
- Write geoip stats to disk every 24 hours, not every hour.
- Remove configuration options and define reasonable defaults.
- Clear history of client requests every 24 hours (which wasn't done at
all before).
Specifically if you send SIGUSR1, it will add two lines to the log file:
May 22 07:41:59.576 [notice] Our DNS cache has 3364 entries.
May 22 07:41:59.576 [notice] Our DNS cache size is approximately 1022656
bytes.
[tweaked a bit by nickm]
Really, our idiocy was that we were calling event_set() on the same
event more than once, which sometimes led to us calling event_set() on
an event that was already inserted, thus making it look uninserted.
With this patch, we just initialize the timeout events when we create
the requests and nameservers, and we don't need to worry about
double-add and double-del cases at all.
If we ever add an event, then set it, then add it again, there will be
now two pointers to the event in the event base. If we delete one and
free it, the first pointer will still be there, and possibly cause a
crash later.
This patch adds detection for this case to the code paths in
eventdns.c, and works around it. If the warning message ever
displays, then a cleverer fix is in order.
{I am not too confident that this *is* the fix, since bug 957 is very
tricky. If it is, it is a bugfix on 0.2.0.}
When we got a descriptor that we (as an authority) rejected as totally
bad, we were freeing it, then using the digest in its RAM to look up its
download status. Caught by arma with valgrind. Bugfix on 0.2.1.9-alpha.
Bridges are not supposed to publish router descriptors to the directory
authorities. It defeats the point of bridges when they are included in the
public relay directory.
This patch puts out a warning and exits when the node is configured as
a bridge and to publish v1, v2, or v3 descriptors at the same time.
Also fixes part of bug 932.
This addresses the first part of bug 918. Users are now warned when
they try to use hibernation in combination with a port below 1024
when they're not on Windows. We don't want to die here, because
people might run Tor as root, use a capabilities system or some
other platform that will allow them to re-attach low ports.
Wording suggested by Marian
(Don't crash immediately if we have leftover chunks to free after
freeing chunks in a buffer freelist; instead log a debugging message
that might help.)
Now, when you call tor --digests, it dumps the SHA1 digest of each
source file that Tor was built with. We support both 'sha1sum' and
'openssl sha1'. If the user is building from a tarball and they
haven't edited anything, they don't need any program that calculates
SHA1. If they _have_ modified a file but they don't have a program to
calculate SHA1, we try to build so we do not output digests.
bytes (aka 20KB/s), to match our documentation. Also update
directory authorities so they always assign the Fast flag to relays
with 20KB/s of capacity. Now people running relays won't suddenly
find themselves not seeing any use, if the network gets faster
on average.
svn:r19305
IP address changes: directory mirrors were mistakenly telling them
their old address if they asked via begin_dir, so they never got
an accurate answer about their new address, so they just vanished
after a day. Should fix bugs 827, 883, and 900 -- but alas, only
after every directory mirror has upgraded.
svn:r19291
ago. This change should significantly improve client performance,
especially once more people upgrade, since relays that have been
a guard for a long time are currently overloaded.
svn:r19287
The directory authorities were refusing v3 consensus votes from
other authorities, since the votes are now 504K. Fixes bug 959;
bugfix on 0.0.2pre17 (where we raised it from 50K to 500K ;).
svn:r19194
When we used smartlist_free to free the list of succesful uploads
because we had succeeded in uploading everywhere, we did not actually
set the successful_uploads field to NULL, so later it would get freed
again in rend_service_descriptor_free. Fix for bug 948; bug
introduced in 0.2.1.6-alpha.
svn:r19073
tor_sscanf() only handles %u and %s for now, which will make it
adequate to replace sscanf() for date/time/IP parsing. We want this
to prevent attackers from constructing weirdly formed descriptors,
cells, addresses, HTTP responses, etc, that validate under some
locales but not others.
svn:r18760
It seems that 64-bit Sparc Solaris demands 64-bit-aligned access to
uint64_t, but does not 64-bit-align the stack-allocated char array we
use for cpuworker tags. So this patch adds a set/get_uint64 pair, and
uses them to access the conn_id field in the tag.
svn:r18743
It turns out that we weren't updating the _ExcludeExitNodesUnion set's
country numbers when we reloaded (or first loaded!) the IP-to-country
file. Spotted by Lark. Bugfix on 0.2.1.6-alpha.
svn:r18575
stream never finished making its connection, it would live
forever in circuit_wait state. Now we close it after SocksTimeout
seconds. Bugfix on 0.1.2.7-alpha; reported by Mike Perry.
svn:r18516
Previously, when we had the chosen_exit set but marked optional, and
we failed because we couldn't find an onion key for it, we'd just give
up on the circuit. But what we really want to do is try again, without
the forced exit node.
Spotted by rovv. Another case of bug 752. I think this might be
unreachable in our current code, but proposal 158 could change that.
svn:r18451
This resolves bug 526, wherein we would crash if the following
events occurred in this order:
A: We're an OR, and one of our nameservers goes down.
B: We launch a probe to it to see if it's up again. (We do this hourly
in steady-state.)
C: Before the probe finishes, we reconfigure our nameservers,
usually because we got a SIGHUP and the resolve.conf file changed.
D: The probe reply comes back, or times out. (There is a five-second
window for this, after B has happens).
IOW, if one of our nameservers is down and our nameserver
configuration has changed, there were 5 seconds per hour where HUPing
the server was unsafe.
Bugfix on 0.1.2.1-alpha. Too obscure to backport.
svn:r18306
This fixes the last known case of bug 891, which could happen if two
hosts, A and B, disagree about how long a circuit has been open,
because of clock drift of some kind. Host A would then mark the
connection as is_bad_for_new_circs when it got too old and open a new
connection. In between when B receives a NETINFO cell on the new
conn, and when B receives a conn cell on the new circuit, the new
circuit will seem worse to B than the old one, and so B will mark it
as is_bad_for_new_circs in the second or third loop of
connection_or_group_set_badness().
Bugfix on 0.1.1.13-alpha. Bug found by rovv.
Not a backport candidate: the bug is too obscure and the fix too tricky.
svn:r18303
crypto_global_init gets called. Also have it be crypto_global_init
that calls crypto_seed_rng, so we are not dependent on OpenSSL's
RAND_poll in these fiddly cases.
Should fix bug 907. Bugfix on 0.0.9pre6. Backport candidate.
svn:r18210
are stored when the --enable-local-appdata option is configured. This
changes the Windows path from %APPDATA% to a host local
%USERPROFILE%\Local Settings\Application Data\ path (aka,
LOCAL_APPDATA).
Patch from coderman.
svn:r18122
It was dumb to have an "announce the value if it's over 0" version of
the code coexisting with an "announce the value if it's at least N"
version. Retain the latter only, with N set to 1.
Incidentally, this should fix a Coverity REVERSE_INULL warning.
svn:r18100
Unfortunately, old Libevents don't _put_ a version in their headers, so
this can get a little tricky. Fortunately, the only binary-compatibility
issue we care about is the size of struct event. Even more fortunately,
Libevent 2.0 will let us keep binary compatiblity forever by letting us
decouple ourselves from the structs, if we like.
svn:r18014
five days old. Otherwise if Tor is off for a long time and then
starts with cached descriptors, it will try to use the onion
keys in those obsolete descriptors when building circuits. Bugfix
on 0.2.0.x. Fixes bug 887.
svn:r17993
cell back), avoid using that OR connection anymore, and also
tell all the one-hop directory requests waiting for it that they
should fail. Bugfix on 0.2.1.3-alpha.
svn:r17984
using the wrong onion key), we were dropping it and letting the
client time out. Now actually answer with a destroy cell. Bugfix
on 0.0.2pre8.
svn:r17970
When we made bridge authorities stop serving bridge descriptors over
unencrypted links, we also broke DirPort reachability testing for
bridges. So bridges with a non-zero DirPort were printing spurious
warns to their logs. Bugfix on 0.2.0.16-alpha. Fixes bug 709.
svn:r17945
descriptors shortly after startup, and then briefly resume
after a new bandwidth test and/or after publishing a new bridge
descriptor. Bridge users that try to bootstrap from them would
get a recent networkstatus but would get descriptors from up to
18 hours earlier, meaning most of the descriptors were obsolete
already. Reported by Tas; bugfix on 0.2.0.13-alpha.
svn:r17920
discard it rather than trying to use it. In theory it could
be useful because it lists alternate directory mirrors, but in
practice it just means we spend many minutes trying directory
mirrors that are long gone from the network. Helps bug 887 a bit;
bugfix on 0.2.0.x.
svn:r17917
The subversion $Id$ fields made every commit force a rebuild of
whatever file got committed. They were not actually useful for
telling the version of Tor files in the wild.
svn:r17867
use the same download mechanism as other places.
i had to make an ugly hack around "IMPOSSIBLE_TO_DOWNLOAD+1".
we should unhack that sometime.
svn:r17834
Specifically, split compare_tor_addr_to_addr_policy() from a loop with a bunch
of complicated ifs inside into some ifs, each with a simple loop. Rearrange
router_find_exact_exit_enclave() to run a little faster. Bizarrely,
router_policy_rejects_all() shows up on oprofile, so precalculate it per
routerinfo.
svn:r17802
new from their perspective) directory download schedule abstraction.
not done yet, but i'd better get this out of my sandbox before nick
does another sweeping change. :)
svn:r17798
of which countries we've seen clients from recently. Now controllers
like Vidalia can show bridge operators that they're actually making
a difference.
svn:r17796
shahn: "Add some documentation for the WRA_* family of functions, also make
sure that (hopefully) all functions that return was_router_added_t
don't return ints directly and that they don't refer to integers in
their documentation anymore."
svn:r17731
more extreme happens. the default should be to be quiet unless
something more extreme happens.
at least, this doesn't generate complaints anymore. perhaps that
means it is working better? :)
svn:r17724
(The unfixed ones are being downgraded to regular XXXs mainly on the rationale that they don't seem to be exploding Tor, and they were apparently not showstoppers for 0.2.0.x-final.)
svn:r17682