Change the default values for collecting directory request statistics and
inlcuding them in extra-info descriptors to 1.
Don't break if we are configured to collect directory request or entry
statistics and don't have a GeoIP database. Instead, print out a notice
and skip initializing the affected statistics code.
This is the cherry-picked 499661524b.
Previously Tor would always claim to have been given a hostname
by the client, while actually only verifying that the client
is using SOCKS4A or SOCKS5 with hostnames. Both protocol versions
allow IP addresses, too, in which case the log messages were wrong.
Fixes#4094.
Previously, we wouldn't refetch an HS's descriptor unless we didn't
have one at all. That was equivalent to refetching iff we didn't have
a usable one, but the next commit will make us keep some non-usable HS
descriptors around in our cache.
Code bugfix on the release that introduced the v2 HS directory system,
because rend_client_refetch_v2_renddesc's documentation comment should
have described what it actually did, not what its behaviour happened
to be equivalent to; no behaviour change in this commit.
Otherwise, we can wind up munging our reference counts if we set it in
the middle of loading the nodes. This happens because
nodelist_set_consensus() and microdesc_reload_cache() are both in the
business of adjusting microdescriptors' references.
we don't need to check whether we don't have enough guards right after
concluding that we do have enough.
slight efficiency fix suggested by an anonymous fellow on irc.
We were doing "divide bandwidth by 1000, then multiply by msec", but
that would lose accuracy: instead of getting your full bandwidth,
you'd lose up to 999 bytes per sec. (Not a big deal, but every byte
helps.)
Instead, do the multiply first, then the division. This can easily
overflow a 32-bit value, so make sure to do it as a 64-bit operation.
With managed proxies you would always get the error message:
"You have a Bridge line using the X pluggable transport, but there
doesn't seem to be a corresponding ClientTransportPlugin line."
because the check happened directly after parse_client_transport_line()
when managed proxies were not fully configured and their transports
were not registered.
The fix is to move the validation to run_scheduled_events() and make
sure that all managed proxies are configured first.
* Create mark/sweep functions for transports.
* Create a transport_resolve_conflicts() function that tries to
resolve conflicts when registering transports.
Previously the FooPort was ignored and the default used instead,
causing Tor to bind to the wrong port if FooPort and the default
port don't match or the CONN_TYPE_FOO_LISTENER has no default port.
Fixes#3936.
Right now we only force a new descriptor upload every 18 hours.
This can make servers become unlisted if they upload a descriptor at
time T which the authorities reject as being "too similar" to one
they uploaded before. Nothing will actually make the server upload a
new descriptor later on, until another 18 hours have passed.
This patch changes the upload behavior so that the 18 hour interval
applies only when we're listed in a live consensus with a descriptor
published within the last 18 hours. Otherwise--if we're not listed
in the live consensus, or if we're listed with a publication time
over 18 hours in the past--we upload a new descriptor every 90
minutes.
This is an attempted bugfix for #3327. If we merge it, it should
obsolete #535.
Conflicts:
src/or/connection.c
src/or/connection_edge.c
src/or/connection_edge.h
src/or/dnsserv.c
Some of these were a little tricky, since they touched code that
changed because of the prop171 fixes.
Add a "default" state which we use until we've decided whether we're
live or hibernating. This allows us to properly track whether we're
resuming a hibernation period or not. Fixes bug 2003.
For printf, %f and %lf are synonymous, since floats are promoted to
doubles when passed as varargs. It's only for scanf that we need to
say "%lf" for doubles and "%f" for floats.
Apparenly, some older compilers think it's naughty to say %lf and like
to spew warnings about it.
Found by grarpamp.
For bufferevents, we had all of connection_buckets_decrement() stubbed
out. But that's not actually right! The rephist_* parts were
essential for, inter alia, recording our own bandwidth. This patch
splits out the rephist parts of connection_buckets_decrement() into their
own function, and makes the bufferevent code call that new function.
Fixes bug 3803, and probably 3824 and 3826 too. Bugfix on 0.2.3.1-alpha.
Previously, if you were set up to use microdescriptors, and you
weren't a cache, you'd never fetch router descriptors (except for
bridges). Now FetchUselessDescriptors causes descriptors and
mirodescs to get cached. Also, FetchUselessDescriptors changes the
behavior of "UseMicrodescriptors auto" to be off, since there's no
point in saying "UseMicrodescriptors 1" when you have full descriptors
too.
Fix for bug 3851; bugfix on 0.2.3.1-alpha.
Because tunneled connections are implemented with buffervent_pair,
writing to them can cause an immediate flush. This means that
added to them and then checking to see whether their outbuf is
empty is _not_ an adequate way to see whether you added anything.
This caused a problem in directory server connections, since they
would try spooling a little more data out, and then close the
connection if there was no queued data to send.
This fix should improve matters; it only closes the connection if
there is no more data to spool, and all of the spooling callbacks
are supposed to put the dirconn into dir_spool_none on completion.
This is bug 3814; Sebastian found it; bugfix on 0.2.3.1-alpha.
When we're doing filtering ssl bufferevents, we want the rate-limits
to apply to the lowest level of the bufferevent stack, so that we're
actually limiting bytes sent on the network. Otherwise, we'll read
from the network aggressively, and only limit stuff as we process it.
This behavior is normal when we want more data than the evbuffer
actually has for us. We'll ask for (say) 7 bytes, get only 5
(because that's all there is), try to parse the 5 bytes, and get
told "no, I want 7". One option would be to bail out early whenever
want_length is > buflen, but sometimes we use an over-large
want_length. So instead, let's just remove the warning here: it's
not a bug after all.
* Use strcmpstart() instead of strcmp(x,y,strlen(y)).
* Warn the user if the managed proxy failed to launch.
* Improve function documentation.
* Use smartlist_len() instead of n_unconfigured_proxies.
* Split managed_proxy_destroy() to managed_proxy_destroy()
and managed_proxy_destroy_with_transports().
* Constification.
It turns out that it wasn't enough to set the configuration to
"auto", since the correct behavior for "auto" had been disabled in
microdesc.c. :p
(Hasn't been in a release yet, so doesn't need a changes entry.)
Now we track *which* stream with ISO_STREAM set is associated to a
particular circuit, so that we won't think that stream is incompatible
with its circuit and launch another one a second later, and we use that
same field to mark circuits which have had an ISO_STREAM stream attached
to them, so that we won't ever put a second stream on that circuit.
Fixes bug 3695.
Only write a bridge-stats string if bridge stats have been
initialized. This behavior is similar to dirreq-stats, entry-stats,
etc.
Also add a few unit tests for the bridge-stats code.
This patch separates the generation of a dirreq-stats string from
actually writing it to disk. The new geoip_format_dirreq_stats()
generates a dirreq-stats string that geoip_dirreq_stats_write() writes
to disk. All the state changing (e.g., resetting the dirreq-stats
history and initializing the next measurement interval) takes place in
geoip_dirreq_stats_write(). That allows us to finally test the
dirreq-stats code better.
Now that formatting the buffer-stats string is separate from writing
it to disk, we can also decouple the logic to extract stats from
circuits and finally write some unit tests for the history code.
The new rep_hist_format_buffer_stats() generates a buffer-stats string
that rep_hist_buffer_stats_write() writes to disk. All the state
changing (e.g., resetting the buffer-stats history and initializing
the next measurement interval) takes place in
rep_hist_buffer_stats_write(). That allows us to finally test the
buffer-stats code better.
So far, if we didn't see a single circuit, we refrained from
generating a cell-stats string and logged a warning. Nobody will
notice the warning, and people will wonder why there's no cell-stats
string in the extra-info descriptor. The better behavior is to
generate a cell-stats string with all zeros.
Right now, we append statistics to files in the stats/ directory for
half of the statistics, whereas we overwrite these files for the other
half. In particular, we append buffer, dirreq, and entry stats and
overwrite exit, connection, and bridge stats.
Appending to files was useful when we didn't include stats in extra-info
descriptors, because otherwise we'd have to copy them away to prevent
Tor from overwriting them.
But now that we include statistics in extra-info descriptors, it makes
no sense to keep the old statistics forever. We should change the
behavior to overwriting instead of appending for all statistics.
Implements #2930.
They *are* non-NUL-terminated, after all (and they have to be, since
the SOCKS5 spec allows them to contain embedded NULs. But the code
to implement proposal 171 was copying them with tor_strdup and
comparing them with strcmp_opt.
Fix for bug on 3683; bug not present in any yet-released version.
Previously we'd just looked at the connection type, but that's
always CONN_TYPE_AP. Instead, we should be looking at the type of
the listener that created the connection.
Spotted by rransom; fixes bug 3636.
We'll still need to tweak it so that it looks for includes and
libraries somewhere more sensible than "where we happened to find
them on Erinn's system"; so that tests and tools get built too;
so that it's a bit documented; and so that we actually try running
the output.
Work done with Erinn Clark.
The conflicts are with the proposal 171 circuit isolation code, and
they're all trivial: they're just a matter of both branches adding
some unrelated code in the same places.
Conflicts:
src/or/circuituse.c
src/or/connection.c
The problem was that we weren't initializing want_length to 0 before
calling parse_socks() the first time, so it looked like we were
risking an infinite loop when in fact we were safe.
Fixes 3615; bugfix on 0.2.3.2-alpha.
Back when I added this logic in 20c0581a79, the rule was that whenever
a circuit finished building, we cleared its isolation info. I did that
so that we would still use the circuit even if all the streams that
had previously led us to tentatively set its isolation info had closed.
But there were problems with that approach: We could pretty easily get
into a case where S1 had led us to launch C1 and S2 had led us to
launch C2, but when C1 finished, we cleared its isolation and attached
S2 first. Since C2 was still marked in a way that made S1
unattachable to it, we'd then launch another circuit needlessly.
So instead, we try the following approach now: when a circuit is done
building, we try to attach streams to it. If it remains unused after
we try attaching streams, then we clear its isolation info, and try
again to attach streams.
Thanks to Sebastian for helping me figure this out.
One-hop dirconn streams all share a session group, and get the
ISO_SESSIONGRP flag: they may share circuits with each other and
nothing else.
Anonymized dirconn streams get a new internal-use-only ISO_STREAM
flag: they may not share circuits with anything, including each other.
The new candidate rule, which arma suggested and I like, is that
the original address as received from the client connection or as
rewritten by the controller is the address that counts.
This is mainly meant as a way to keep clients from accidentally
DOSing themselves by (e.g.) enabling IsolateDestAddr or
IsolateDestPort on a port that they use for HTTP.
Our old "do we need to launch a circuit for stream S" logic was,
more or less, that if we had a pending circuit that could handle S,
we didn't need to launch a new one.
But now that we have streams isolated from one another, we need
something stronger here: It's possible that some pending C can
handle either S1 or S2, but not both.
This patch reuses the existing isolation logic for a simple
solution: when we decide during circuit launching that some pending
C would satisfy stream S1, we "hypothetically" mark C as though S1
had been connected to it. Now if S2 is incompatible with S1, it
won't be something that can attach to C, and so we'll launch a new
stream.
When the circuit becomes OPEN for the first time (with no streams
attached to it), we reset the circuit's isolation status. I'm not
too sure about this part: I wanted some way to be sure that, if all
streams that would have used a circuit die before the circuit is
done, the circuit can still get used. But I worry that this
approach could also lead to us launching too many circuits. Careful
thought needed here.
This is the meat of proposal 171: we change circuit_is_acceptable()
to require that the connection is compatible with every connection
that has been linked to the circuit; we update circuit_is_better to
prefer attaching streams to circuits in the way that decreases the
circuits' usefulness the least; and we update link_apconn_to_circ()
to do the appropriate bookkeeping.
The "nym epoch" of a stream is defined as the number of times that
NEWNYM had been called before the stream was opened. All streams
are isolated by nym epoch.
This feature should be redundant with existing signewnym stuff, but
it provides a good belt-and-suspenders way for us to avoid ever
letting any circuit type bypass signewnym.
This patch adds fields to track how streams should be isolated, and
ensures that those fields are set correctly. It also adds fields to
track what streams can go on a circuit, and adds functions to see
whether a streams can go on a circuit and update the circuit
accordingly. Those functions aren't yet called.
Proposal 171 gives us a new syntax for parsing client port options.
You can now have as many FooPort options as you want (for Foo in
Socks, Trans, DNS, NATD), and they can have address:port arguments,
and you can specify the level of isolation on those ports.
Additionally, this patch refactors the client port parsing logic to
use a new type, port_cfg_t. Previously, ports to be bound were
half-parsed in config.c, and later re-parsed in connection.c when
we're about to bind them. Now, parsing a port means converting it
into a port_cfg_t, and binding it uses only a port_cfg_t, without
needing to parse the user-provided strings at all.
We should do a related refactoring on other port types. For
control ports, that'll be easy enough. For ORPort and DirPort,
we'll want to do this when we solve proposal 118 (letting servers
bind to and advertise multiple ports).
This implements tickets 3514 and 3515.
This adds a little code complexity: we need to remember for each
node whether it supports the right feature, and then check for each
connection whether it's exiting at such a node. We store this in a
flag in the edge_connection_t, and set that flag at link time.
- Conform to make check-spaces
- Build without warnings from passing size_t to %d
- Use connection_get_inbuf_len(), not buf_datalen (otherwise bufferevents
won't work).
- Don't log that we're using this feature at warn.
Previously we were using router_get_by_id(foo) to test "do we have a
descriptor that will let us make an anonymous circuit to foo". But
that isn't right for microdescs: we should have been using node_t.
Fixes bug 3601; bugfix on 0.2.3.1-alpha.
Instead, use compare_tor_addr_to_node_policy everywhere.
One advantage of this is that compare_tor_addr_to_node_policy can
better distinguish 0.0.0.0 from "unknown", which caused a nasty bug
with microdesc users.
Previously, we had an issue where we'd treat an unknown address as
0, which turned into "0.0.0.0", which looked like a rejected
address. This meant in practice that as soon as we started doing
comparisons of unknown uint32 addresses to short policies, we'd get
'rejected' right away. Because of the circumstances under which
this would be called, it would only happen when we had local DNS
cached entries and we were looking to launch new circuits.