Commit Graph

649 Commits

Author SHA1 Message Date
Mike Perry
c8f731fabb Comment network liveness and change detection behavior. 2010-09-29 19:35:40 -07:00
Roger Dingledine
7f10707c42 refactor and recomment; no actual changes 2010-09-29 18:01:22 -04:00
Roger Dingledine
474e4d2722 Merge commit 'mikeperry/bug1740' into maint-0.2.2 2010-09-29 17:05:38 -04:00
Mike Perry
4324bb1b21 Cap the circuit build timeout to the max time we've seen.
Also, cap the measurement timeout to 2X the max we've seen.
2010-09-29 11:49:43 -07:00
Mike Perry
11910cf5b3 Do away with the complexity of the network liveness detection.
We really should ignore any timeouts that have *no* network activity for their
entire measured lifetime, now that we have the 95th percentile measurement
changes. Usually this is up to a minute, even on fast connections.
2010-09-29 11:49:43 -07:00
Mike Perry
0744a175af Fix state checks on liveness handling.
If we really want all this complexity for these stages here, we need to handle
it better for people with large timeouts. It should probably go away, though.
2010-09-29 11:49:43 -07:00
Mike Perry
9a77743b7b Fix non-live condition checks.
Rechecking the timeout condition was foolish, because it is checked on the
same codepath. It was also wrong, because we didn't round.

Also, the liveness check itself should be <, and not <=, because we only have
1 second resolution.
2010-09-29 11:49:31 -07:00
Mike Perry
5aa4564ab9 Only count timeout data for 3 hop circuits.
Use 4/3 of this timeout value for 4 hop circuits, and use half of it for
canabalized circuits.
2010-09-29 11:41:27 -07:00
Roger Dingledine
512433346f improve code comments, based on comments from nick 2010-09-28 23:27:00 -04:00
Roger Dingledine
9997676802 handle ugly edge case in retrying entrynodes
Specifically, a circ attempt that we'd launched while the network was
down could timeout after we've marked our entrynodes up, marking them
back down again. The fix is to annotate as bad the OR conns that were
around before we did the retry, so if a circuit that's attached to them
times out we don't do anything about it.
2010-09-28 22:32:38 -04:00
Roger Dingledine
7de1caa33f Actually notice when our last entrynode goes down
Otherwise we'd never set have_minimum_dir_info to false, so the
"optimistic retry" would never trigger.
2010-09-28 21:59:31 -04:00
Roger Dingledine
bb22360bad optimistically retry EntryNodes on socks request
We used to mark all our known bridges up when they're all down and we
get a new socks request. Now do that when we've set EntryNodes too.
2010-09-28 19:10:23 -04:00
Roger Dingledine
8bac188572 remove a redundant assert 2010-09-28 19:10:22 -04:00
Roger Dingledine
127f37ad29 refactor; no actual changes 2010-09-28 19:10:22 -04:00
Roger Dingledine
09a715bb72 Merge branch 'maint-0.2.1' into maint-0.2.2 2010-09-28 18:37:55 -04:00
Roger Dingledine
339993b409 actually retry bridges when your network goes away 2010-09-28 18:36:15 -04:00
Nick Mathewson
c9cb4f0a0e Rename has_completed_circuit to can_complete_circuit
Also redocument it.  Related to #1362.
2010-09-22 01:52:57 -04:00
Roger Dingledine
47b23bd03e A start at a patch for bug 1943 (alignment issues) 2010-09-20 18:40:32 -04:00
Nick Mathewson
01c6b01137 I hear we are close to a release. Clean up the whitespace. 2010-09-16 15:44:14 -04:00
Nick Mathewson
f89323afda Fix behavior of adding a cell to a blocked queue.
We frequently add cells to stream-blocked queues for valid reasons
that don't mean we need to block streams.  The most obvious reason
is if the cell arrives over a circuit rather than from an edge: we
don't block circuits, no matter how full queues get.  The next most
obvious reason is that we allow CONNECTED cells from a newly created
stream to get delivered just fine.

This patch changes the behavior so that we only iterate over the
streams on a circuit when the cell in question came from a stream,
and we only block the stream that generated the cell, so that other
streams can still get their CONNECTEDs in.
2010-09-02 15:26:17 -04:00
Nick Mathewson
b51f1a64e4 Make Sebastian's bug1831 branch build with --enable-gcc-warnings 2010-08-15 23:46:09 -04:00
Sebastian Hahn
561ca9b987 Fix misplaced labels 2010-08-16 00:46:44 +02:00
Sebastian Hahn
4c49d3c27e Refactor circuit_build_times_parse_state
Remove the msg parameter to pass an error message out. This
wasn't needed and made it harder to detect a memory leak.
2010-08-16 00:45:32 +02:00
Sebastian Hahn
70f0ba1495 Fix a memory leak in circuit_build_times_parse_state
Thanks weasel for noticing.
2010-08-16 00:33:29 +02:00
Sebastian Hahn
05072723cb Create routerparse.h 2010-07-27 10:00:46 +02:00
Sebastian Hahn
df9d42cef5 Create rephist.h 2010-07-27 10:00:46 +02:00
Sebastian Hahn
b0cd4551ab Create relay.h 2010-07-27 10:00:45 +02:00
Sebastian Hahn
7bd8dee463 Create policies.h 2010-07-27 10:00:45 +02:00
Sebastian Hahn
f6852fe031 Create onion.h 2010-07-27 10:00:45 +02:00
Sebastian Hahn
69fcbbaa89 Create networkstatus.h 2010-07-27 07:58:16 +02:00
Sebastian Hahn
0f1548ab18 Create main.h 2010-07-27 07:58:16 +02:00
Sebastian Hahn
bec1c838ca Create directory.h 2010-07-27 07:58:15 +02:00
Sebastian Hahn
0bfa34e1f6 Create control.h 2010-07-27 07:58:15 +02:00
Sebastian Hahn
0d33120c26 Create connection_or.h 2010-07-27 07:58:15 +02:00
Sebastian Hahn
78b6a4650b Create connection_edge.h 2010-07-27 07:58:14 +02:00
Sebastian Hahn
2a74101f7a Create connection.h 2010-07-27 07:58:14 +02:00
Sebastian Hahn
c4f8f1316e Create config.h 2010-07-27 07:58:14 +02:00
Sebastian Hahn
01c7b60a80 Create circuituse.h 2010-07-27 07:58:14 +02:00
Sebastian Hahn
174a88dd79 Create circuitlist.h 2010-07-27 07:58:13 +02:00
Sebastian Hahn
21155204c6 Create circuitbuild.h 2010-07-27 07:58:13 +02:00
Sebastian Hahn
cbee969f40 Create routerlist.h 2010-07-27 07:56:25 +02:00
Sebastian Hahn
c53b6cc831 Create router.h 2010-07-27 07:56:25 +02:00
Roger Dingledine
1d6656fcb3 mike needs DEFAULT_ROUTE_LEN in other files 2010-07-21 09:30:26 -04:00
Roger Dingledine
66d5ce266e naked constants are bad 2010-07-20 08:07:44 -04:00
Nick Mathewson
0b4b51314f Make the controller act more usefully when GETINFO fails
Right now it says "552 internal error" because there's no way for
getinfo_helper_*() countries to specify an error message.  This
patch changes the getinfo_helper_*() interface, and makes most of the
getinfo helpers give useful error messages in response to failures.

This should prevent recurrences of bug 1699, where a missing GeoIPFile
line in the torrc made GETINFO ip-to-county/* fail in a "not obvious
how to fix" way.
2010-07-18 17:05:58 +02:00
Mike Perry
a9edb0b4f6 More gracefully handle corrupt state files.
Save a backup if we get odd circuitbuildtimes and other state info.

In the case of circuit build times, we no longer assert, and reset our state.
2010-07-06 12:11:22 -07:00
Mike Perry
7bbdf71a82 Fix unittest failure in bug 1660.
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.

Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possble during live operation, due to network liveness filters
and discard logic.
2010-07-06 12:11:13 -07:00
Nick Mathewson
741ab2a47a Fix bugs with assuming time_t can be implicitly cast to long
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.

It's also extremely not-nice to write a time to the log as an
integer.  Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
2010-06-29 19:55:10 -04:00
Nick Mathewson
485cab869d Merge remote branch 'public/rand_double2' 2010-06-29 18:57:59 -04:00
Nick Mathewson
bea55766af Merge remote branch 'mikeperry/cbt-bugfixes3' 2010-06-29 18:57:50 -04:00
Nick Mathewson
b111a7cd9c Make cbt_generate_sample use crypto_rand_double()
Possible workaround for bug 1139, if anybody cares.
2010-06-25 21:33:22 -04:00
Mike Perry
2abe1ceccf Add CLOSE_MS and CLOSE_RATE keywords to buildtimeout event. 2010-06-15 20:04:49 -07:00
Mike Perry
c6c8fbf852 Split the circuit timeout and close codepaths.
We need to record different statistics at point of timeout, vs the point
of forcible closing.

Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
2010-06-15 20:04:42 -07:00
Mike Perry
f528a6e62b Fix initialization and reset issues with close_ms.
Also clean up some log messages.
2010-06-15 16:41:24 -07:00
Mike Perry
c96206090e Keep circuits open until the greater of 95th CDF percentile or 60s.
This is done to provide better data to our right-censored Pareto model.

We do this by simply marking them with a new purpose.
2010-06-09 00:22:39 -07:00
Mike Perry
f897154b26 Make the Xm mode selection a consensus parameter. 2010-06-09 00:22:39 -07:00
Mike Perry
38770dd6a5 Add timeout count state variable. 2010-06-09 00:22:34 -07:00
Mike Perry
848d9f8b43 Remove synthetic timeout code in favor of better Pareto model. 2010-06-09 00:22:17 -07:00
Mike Perry
d76ebb79aa Improve log message about large timeouts and fix some typos. 2010-06-09 00:22:13 -07:00
Roger Dingledine
7e300cbba3 Let bridge users use the non-primary address of a multi-homed bridge 2010-06-03 20:29:29 -04:00
valerino
afe58cfa89 Don't use "try" as an identifier
C allows try, but some windows CE headers like to redefine 'try' to be
a reserved word.
2010-05-20 22:50:37 -04:00
Mike Perry
d9be6f3845 Fix CBT unit tests. 2010-05-12 15:31:22 -07:00
Mike Perry
a5ac96b58d Fix comments from Sebastian + Nick's code review.
Check for overflow in one place, and be consistent about type usage.
2010-05-10 19:56:27 -07:00
Mike Perry
29e0d70814 Bug 1296: Add option+logic to disable CBT learning.
There are now four ways that CBT can be disabled:

1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
2010-05-10 13:11:48 -07:00
Mike Perry
0a6191cf70 Bug 1357: Store the suspended timeout value to resume.
This prevents a spurious warning where we have a timeout just after
deciding our network came back online.
2010-05-10 13:11:47 -07:00
Mike Perry
728e946efd Bug 1245: Ignore negative and large timeouts.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
2010-05-10 13:11:46 -07:00
Mike Perry
e40e35507e Bump timeout calculation message to notice when timeout changes. 2010-05-10 13:01:25 -07:00
Mike Perry
eecdd94dec Add consensus parameter for max synthetic quantile.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
2010-05-10 13:00:34 -07:00
Mike Perry
835ab53102 Add a TIMEOUT_RATE keyword to buildtimeout event. 2010-05-10 12:59:05 -07:00
Mike Perry
3bbc3e2137 Bug 1335: Implement filtering step to remove+prevent high timeouts.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.

It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.

The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
2010-05-10 12:58:10 -07:00
Mike Perry
cc2a48f1be Bug 1335: Alter Xm calculation to be weighted avg of top N=3 modes.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
2010-05-10 12:46:49 -07:00
Nick Mathewson
927425150b Merge branch 'asprintf' 2010-04-02 12:30:46 -04:00
Nick Mathewson
b006e3279f Merge remote branch 'origin/maint-0.2.1'
Conflicts:
	src/common/test.h
	src/or/test.c
2010-02-27 17:16:31 -05:00
Nick Mathewson
c3e63483b2 Update Tor Project copyright years 2010-02-27 17:14:21 -05:00
Nick Mathewson
937b5cdd41 Merge remote branch 'origin/maint-0.2.1'
Conflicts:
	ChangeLog
	src/or/routerparse.c
2010-02-27 15:34:02 -05:00
Sebastian Hahn
86828e2004 Proper NULL checking in circuit_list_path_impl()
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.

Thanks to ekir for discovering and reporting this bug.
2010-02-26 05:53:26 +01:00
Nick Mathewson
6fa8dacb97 Add a tor_asprintf() function, and use it in a couple of places.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.

Our tor_asprintf() differs from standard asprintf() in that:
  - Like our other malloc functions, it asserts on OOM.
  - It works on windows.
  - It always sets its return-field.
2010-02-25 16:09:10 -05:00
Mike Perry
f4d6315afa Remove misc unnecessary newlines found by new check. 2010-02-22 16:52:11 -08:00
Mike Perry
245be159af Always weight routers by bandwidth.
Also always predict that we need a high capacity circuit or internal
circuit.
2010-02-22 16:52:11 -08:00
Mike Perry
2b95d1c0ee Describe the recent timeouts reallocation behavior. 2010-02-18 09:08:32 -08:00
Mike Perry
2258125e1a Move CBT params into consensus. 2010-02-18 09:08:31 -08:00
Mike Perry
f459388c29 Add an event for a case where we drop guards.
Also add a comment about an odd CBT timeout edgecase.
2010-02-18 09:08:31 -08:00
Mike Perry
8512e33773 Add BUILDTIMEOUT_SET event for CBT stress testing. 2010-02-18 09:08:31 -08:00
Roger Dingledine
8d84b4bfa1 Merge branch 'maint-0.2.1'
Conflicts:

	ChangeLog
2010-01-19 17:54:41 -05:00
Roger Dingledine
1fc94bfd0e spread guard rotation out throughout the month 2010-01-19 17:52:52 -05:00
Roger Dingledine
0642ab2428 weight guard choice by bandwidth; discard old guards 2010-01-19 17:30:52 -05:00
Roger Dingledine
7d832cc988 make the os x tiger compiler shut up
it's wrong, but that's our problem not its problem
2009-12-21 04:58:03 -05:00
Roger Dingledine
2138b05f17 Use nodes in ExitNodes even if they're not fast/stable 2009-12-21 03:52:33 -05:00
Roger Dingledine
cc73bc3853 Use nodes in EntryNodes even if they're not fast/stable 2009-12-21 03:52:33 -05:00
Roger Dingledine
7346804ec6 instrument entry_is_live to tell why our guard isn't live 2009-12-21 03:52:33 -05:00
Roger Dingledine
ef81649d2f Be more willing to use an unsuitable circuit for exit.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.

Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.

Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
2009-12-21 03:52:32 -05:00
Roger Dingledine
1a65bdd232 Make EntryNodes config option much more aggressive.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.

Also, now we refresh your entry guards from EntryNode at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).

The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
2009-12-21 03:52:31 -05:00
Roger Dingledine
580066f2f6 Switch to a StrictNodes config option.
This is step one of handling ExcludedNodes better. This first
step is just to make EntryNodes and ExitNodes do what they did
before.
2009-12-21 03:52:31 -05:00
Nick Mathewson
350181529e Merge branch 'safelogging2'
Conflicts:
	ChangeLog
2009-12-15 17:26:09 -05:00
Nick Mathewson
fcbd65b45c Refactor the safe_str_*() API to make more sense.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log."  safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
2009-12-15 17:25:34 -05:00
Nick Mathewson
e56747f9cf Refactor a bit so that it is safe to include math.h, and mostly not needed. 2009-12-15 14:40:49 -05:00
Nick Mathewson
0c1b3070cf Now that FOO_free(NULL) always works, remove checks before calling it. 2009-12-12 02:07:59 -05:00
Sebastian Hahn
3807db001d *_free functions now accept NULL
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.

This gains us consistence for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
2009-12-12 03:29:44 +01:00
Sebastian Hahn
f258647433 Allow SafeLogging to exclude client related information 2009-12-12 02:26:11 +01:00
Nick Mathewson
5e4d53d535 Remove checks for array existence. (CID 410..415)
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement.  Coverity notices this now.

In some cases, we were testing arrays to make sure that an operation
we wanted to do would suceed.  Those cases are now always-true.

In some cases, we were testing arrays to see if something was _set_.
Those caes are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
2009-10-26 22:40:41 -04:00