Fixes bug 4259, bugfix on 0.2.2.25-alpha. Bugfix by "Tey'".
Original message by submitter:
Changing nodes restrictions using a controller while Tor is doing
DNS resolution could makes Tor crashes (on WinXP at least). The
problem can be repeated by trying to reach a non-existent domain
using Tor:
curl --socks4a 127.0.0.1:9050 inexistantdomain.ext
.. and changing the ExitNodes parameter through the control port
before Tor returns a DNS resolution error (of course, the following
command won't work directly if the control port is password
protected):
echo SETCONF ExitNodes=TinyTurtle | nc -v 127.0.0.1 9051
Using a non-existent domain is needed to repeat the issue so that
Tor takes a few seconds for resolving the domain (which allows us to
change the configuration). Tor will crash while processing the
configuration change.
The bug is located in the addressmap_clear_excluded_trackexithosts
method which iterates over the entries of the addresses map in order
to check whether the changes made to the configuration will impact
those entries. When a DNS resolving is in progress, the new_adress
field of the associated entry will be set to NULL. The method
doesn't expect this field to be NULL, hence the crash.
It's too risky to have a function where if you leave one parameter
NULL, it splits up address:port strings, but if you set it, it does
hostname resolution.
Under the new convention, having a tor_addr.*lookup function that
doesn't do hostname resolution is too close for comfort.
I used this script here, and have made no other changes.
s/tor_addr_parse_reverse_lookup_name/tor_addr_parse_PTR_name/g;
s/tor_addr_to_reverse_lookup_name/tor_addr_to_PTR_name/g;
Now let's have "lookup" indicate that there can be a hostname
resolution, and "parse" indicate that there wasn't. Previously, we
had one "lookup" function that did resolution; four "parse" functions,
half of which did resolution; and a "from_str()" function that didn't
do resolution. That's confusing and error-prone!
The code changes in this commit are exactly the result of this perl
script, run under "perl -p -i.bak" :
s/tor_addr_port_parse/tor_addr_port_lookup/g;
s/parse_addr_port(?=[^_])/addr_port_lookup/g;
s/tor_addr_from_str/tor_addr_parse/g;
This patch leaves aton and pton alone: their naming convention and
behavior is is determined by the sockets API.
More renaming may be needed.
Conflicts:
src/or/connection.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/dnsserv.c
Some of these were a little tricky, since they touched code that
changed because of the prop171 fixes.
Now we track *which* stream with ISO_STREAM set is associated to a
particular circuit, so that we won't think that stream is incompatible
with its circuit and launch another one a second later, and we use that
same field to mark circuits which have had an ISO_STREAM stream attached
to them, so that we won't ever put a second stream on that circuit.
Fixes bug 3695.
They *are* non-NUL-terminated, after all (and they have to be, since
the SOCKS5 spec allows them to contain embedded NULs. But the code
to implement proposal 171 was copying them with tor_strdup and
comparing them with strcmp_opt.
Fix for bug on 3683; bug not present in any yet-released version.
Previously we'd just looked at the connection type, but that's
always CONN_TYPE_AP. Instead, we should be looking at the type of
the listener that created the connection.
Spotted by rransom; fixes bug 3636.
The conflicts are with the proposal 171 circuit isolation code, and
they're all trivial: they're just a matter of both branches adding
some unrelated code in the same places.
Conflicts:
src/or/circuituse.c
src/or/connection.c
Back when I added this logic in 20c0581a79, the rule was that whenever
a circuit finished building, we cleared its isolation info. I did that
so that we would still use the circuit even if all the streams that
had previously led us to tentatively set its isolation info had closed.
But there were problems with that approach: We could pretty easily get
into a case where S1 had led us to launch C1 and S2 had led us to
launch C2, but when C1 finished, we cleared its isolation and attached
S2 first. Since C2 was still marked in a way that made S1
unattachable to it, we'd then launch another circuit needlessly.
So instead, we try the following approach now: when a circuit is done
building, we try to attach streams to it. If it remains unused after
we try attaching streams, then we clear its isolation info, and try
again to attach streams.
Thanks to Sebastian for helping me figure this out.
One-hop dirconn streams all share a session group, and get the
ISO_SESSIONGRP flag: they may share circuits with each other and
nothing else.
Anonymized dirconn streams get a new internal-use-only ISO_STREAM
flag: they may not share circuits with anything, including each other.
The new candidate rule, which arma suggested and I like, is that
the original address as received from the client connection or as
rewritten by the controller is the address that counts.
Our old "do we need to launch a circuit for stream S" logic was,
more or less, that if we had a pending circuit that could handle S,
we didn't need to launch a new one.
But now that we have streams isolated from one another, we need
something stronger here: It's possible that some pending C can
handle either S1 or S2, but not both.
This patch reuses the existing isolation logic for a simple
solution: when we decide during circuit launching that some pending
C would satisfy stream S1, we "hypothetically" mark C as though S1
had been connected to it. Now if S2 is incompatible with S1, it
won't be something that can attach to C, and so we'll launch a new
stream.
When the circuit becomes OPEN for the first time (with no streams
attached to it), we reset the circuit's isolation status. I'm not
too sure about this part: I wanted some way to be sure that, if all
streams that would have used a circuit die before the circuit is
done, the circuit can still get used. But I worry that this
approach could also lead to us launching too many circuits. Careful
thought needed here.
The "nym epoch" of a stream is defined as the number of times that
NEWNYM had been called before the stream was opened. All streams
are isolated by nym epoch.
This feature should be redundant with existing signewnym stuff, but
it provides a good belt-and-suspenders way for us to avoid ever
letting any circuit type bypass signewnym.
This patch adds fields to track how streams should be isolated, and
ensures that those fields are set correctly. It also adds fields to
track what streams can go on a circuit, and adds functions to see
whether a streams can go on a circuit and update the circuit
accordingly. Those functions aren't yet called.
This adds a little code complexity: we need to remember for each
node whether it supports the right feature, and then check for each
connection whether it's exiting at such a node. We store this in a
flag in the edge_connection_t, and set that flag at link time.
- Conform to make check-spaces
- Build without warnings from passing size_t to %d
- Use connection_get_inbuf_len(), not buf_datalen (otherwise bufferevents
won't work).
- Don't log that we're using this feature at warn.
Instead, use compare_tor_addr_to_node_policy everywhere.
One advantage of this is that compare_tor_addr_to_node_policy can
better distinguish 0.0.0.0 from "unknown", which caused a nasty bug
with microdesc users.
All of the routerset_contains*() functions return 0 if their
routerset_t argument is NULL. Therefore, there's no point in
doing "if (ExcludeNodes && routerset_contains*(ExcludeNodes...))",
for example.
This patch fixes every instance of
if (X && routerstatus_contains*(X,...))
Note that there are other patterns that _aren't_ redundant. For
example, we *don't* want to change:
if (EntryNodes && !routerstatus_contains(EntryNodes,...))
Fixes#2797. No bug here; just needless code.
The issue was that we overlooked the possibility of reverse DNS success
at the end of connection_ap_handshake_socks_resolved(). Issue discovered
by katmagic, thanks!
Previously, fetch_from_buf_socks() might return 0 if there was still
data on the buffer and a subsequent call to fetch_from_buf_socks()
would return 1. This was making some of the socks5 unit tests
harder to write, and could potentially have caused misbehavior with
some overly verbose SOCKS implementations. Now,
fetch_from_buf_socks() does as much processing as it can, and
returns 0 only if it really needs more data. This brings it into
line with the evbuffer socks implementation.
This lets us make a lot of other stuff const, allows the compiler to
generate (slightly) better code, and will make me get slightly fewer
patches from folks who stick mutable stuff into or_options_t.
const: because not every input is an output!
The conflicts were mainly caused by the routerinfo->node transition.
Conflicts:
src/or/circuitbuild.c
src/or/command.c
src/or/connection_edge.c
src/or/directory.c
src/or/dirserv.c
src/or/relay.c
src/or/rendservice.c
src/or/routerlist.c
This patch introduces a few new functions in router.c to produce a
more helpful description of a node than its nickame, and then tweaks
nearly all log messages taking a nickname as an argument to call these
functions instead.
There are a few cases where I left the old log messages alone: in
these cases, the nickname was that of an authority (whose nicknames
are useful and unique), or the message already included an identity
and/or an address. I might have missed a couple more too.
This is a fix for bug 3045.