Commit Graph

604 Commits

Author SHA1 Message Date
Mike Perry
a9edb0b4f6 More gracefully handle corrupt state files.
Save a backup if we get odd circuitbuildtimes and other state info.

In the case of circuit build times, we no longer assert, and reset our state.
2010-07-06 12:11:22 -07:00
Mike Perry
7bbdf71a82 Fix unittest failure in bug 1660.
We now record large times as abandoned, to prevent a filter step from
happening and skewing our results.

Also, issue a warn for a rare case that can happen for funky values of Xm or
too many abandoned circuits. Can happen (very rarely) during unit tests, but
should not be possible during live operation, due to network liveness filters
and discard logic.
2010-07-06 12:11:13 -07:00
Nick Mathewson
741ab2a47a Fix bugs with assuming time_t can be implicitly cast to long
Many friendly operating systems have 64-bit times, and it's not nice
to pass them to an %ld format.

It's also extremely not-nice to write a time to the log as an
integer.  Most people think it's 2010 June 29 23:57 UTC+epsilon, not
1277855805+epsilon.
2010-06-29 19:55:10 -04:00
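
A minimal sketch of the two fixes this message describes, assuming C99 `snprintf` and the standard `gmtime`/`strftime`. The helper names `format_raw_time` and `format_utc_time` are hypothetical and are not Tor's own functions.

```c
#include <stdio.h>
#include <time.h>

/* Write the raw value of t into buf. Casting to long long avoids passing
 * a possibly 64-bit time_t to a %ld conversion.
 * Returns 0 on success, -1 on error or truncation. */
int format_raw_time(char *buf, size_t buflen, time_t t)
{
  int n = snprintf(buf, buflen, "%lld", (long long)t);
  return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Write t into buf as "YYYY-MM-DD HH:MM:SS" in UTC, which is what a
 * person reading the log expects. Returns 0 on success, -1 on failure.
 * Note: gmtime() uses static storage and is not thread-safe. */
int format_utc_time(char *buf, size_t buflen, time_t t)
{
  struct tm *tm = gmtime(&t);
  if (!tm)
    return -1;
  if (strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", tm) == 0)
    return -1;
  return 0;
}
```

With the timestamp quoted in the message, `format_utc_time` yields `2010-06-29 23:56:45`, while `format_raw_time` yields `1277855805`.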
Nick Mathewson
485cab869d Merge remote branch 'public/rand_double2' 2010-06-29 18:57:59 -04:00
Nick Mathewson
bea55766af Merge remote branch 'mikeperry/cbt-bugfixes3' 2010-06-29 18:57:50 -04:00
Nick Mathewson
b111a7cd9c Make cbt_generate_sample use crypto_rand_double()
Possible workaround for bug 1139, if anybody cares.
2010-06-25 21:33:22 -04:00
Mike Perry
2abe1ceccf Add CLOSE_MS and CLOSE_RATE keywords to buildtimeout event. 2010-06-15 20:04:49 -07:00
Mike Perry
c6c8fbf852 Split the circuit timeout and close codepaths.
We need to record different statistics at point of timeout, vs the point
of forcible closing.

Also, give some better names to constants and state file variables
to indicate they are not dealing with timeouts, but abandoned circuits.
2010-06-15 20:04:42 -07:00
Mike Perry
f528a6e62b Fix initialization and reset issues with close_ms.
Also clean up some log messages.
2010-06-15 16:41:24 -07:00
Mike Perry
c96206090e Keep circuits open until the greater of 95th CDF percentile or 60s.
This is done to provide better data to our right-censored Pareto model.

We do this by simply marking them with a new purpose.
2010-06-09 00:22:39 -07:00
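
For reference, the cutoff described in this message can be written out under the standard Pareto parameterization with scale $X_m$ and shape $\alpha$. The symbol $t_{\text{close}}$ is introduced here for illustration only and does not come from the source.

```latex
F(x) = 1 - \left(\frac{X_m}{x}\right)^{\alpha}, \quad x \ge X_m
\qquad\Longrightarrow\qquad
F^{-1}(q) = \frac{X_m}{(1-q)^{1/\alpha}}

t_{\text{close}} = \max\bigl(F^{-1}(0.95),\ 60\,\text{s}\bigr)
```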
Mike Perry
f897154b26 Make the Xm mode selection a consensus parameter. 2010-06-09 00:22:39 -07:00
Mike Perry
38770dd6a5 Add timeout count state variable. 2010-06-09 00:22:34 -07:00
Mike Perry
848d9f8b43 Remove synthetic timeout code in favor of better Pareto model. 2010-06-09 00:22:17 -07:00
Mike Perry
d76ebb79aa Improve log message about large timeouts and fix some typos. 2010-06-09 00:22:13 -07:00
Roger Dingledine
7e300cbba3 Let bridge users use the non-primary address of a multi-homed bridge 2010-06-03 20:29:29 -04:00
valerino
afe58cfa89 Don't use "try" as an identifier
C allows try, but some windows CE headers like to redefine 'try' to be
a reserved word.
2010-05-20 22:50:37 -04:00
Mike Perry
d9be6f3845 Fix CBT unit tests. 2010-05-12 15:31:22 -07:00
Mike Perry
a5ac96b58d Fix comments from Sebastian + Nick's code review.
Check for overflow in one place, and be consistent about type usage.
2010-05-10 19:56:27 -07:00
Mike Perry
29e0d70814 Bug 1296: Add option+logic to disable CBT learning.
There are now four ways that CBT can be disabled:

1. Network-wide, with the cbtdisabled consensus param.
2. Via config, with "LearnCircuitBuildTimeout 0"
3. Via config, with "AuthoritativeDirectory 1"
4. Via a state file write failure.
2010-05-10 13:11:48 -07:00
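
The four conditions listed above can be sketched as a single predicate. This is an illustration only: the struct `cbt_inputs`, its field names, and the function `cbt_learning_disabled` are hypothetical and do not reflect how Tor actually stores or checks these values.

```c
/* Hypothetical snapshot of the four inputs named in the commit message. */
struct cbt_inputs {
  int consensus_cbtdisabled;       /* nonzero if the "cbtdisabled" consensus param is set */
  int learn_circuit_build_timeout; /* value of the LearnCircuitBuildTimeout option */
  int authoritative_directory;     /* value of the AuthoritativeDirectory option */
  int state_write_failed;          /* nonzero after a state file write failure */
};

/* Return 1 if any of the four conditions disables CBT learning, else 0. */
int cbt_learning_disabled(const struct cbt_inputs *in)
{
  return in->consensus_cbtdisabled ||
         !in->learn_circuit_build_timeout ||
         in->authoritative_directory ||
         in->state_write_failed;
}
```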
Mike Perry
0a6191cf70 Bug 1357: Store the suspended timeout value to resume.
This prevents a spurious warning where we have a timeout just after
deciding our network came back online.
2010-05-10 13:11:47 -07:00
Mike Perry
728e946efd Bug 1245: Ignore negative and large timeouts.
This should prevent some asserts and storage of incorrect build times
for the cases where Tor is suspended during a circuit construction, or
just after completing a circuit. The idea is that if the circuit
build time is much greater than we would have cut it off at, we probably
had a suspend event along this codepath, and we should discard the
value.
2010-05-10 13:11:46 -07:00
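
A rough sketch of the discard rule described above. The message does not say how much greater than the cutoff a build time must be, so the `max_multiple` parameter is an assumption, as is the function name `build_time_is_usable`.

```c
/* Return 1 if a measured build time should be recorded, 0 if it should
 * be discarded. Negative values and values more than max_multiple times
 * the current cutoff are treated as artifacts of a suspend event.
 * max_multiple is an assumed tuning knob, not a value from the source. */
int build_time_is_usable(long build_time_ms, long cutoff_ms, long max_multiple)
{
  if (build_time_ms < 0)
    return 0;
  /* Widen before multiplying so a large cutoff cannot overflow long. */
  if ((long long)build_time_ms > (long long)cutoff_ms * max_multiple)
    return 0;
  return 1;
}
```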
Mike Perry
e40e35507e Bump timeout calculation message to notice when timeout changes. 2010-05-10 13:01:25 -07:00
Mike Perry
eecdd94dec Add consensus parameter for max synthetic quantile.
In case we decide that the timeout rate is now too high due to our
change of the max synthetic quantile value, this consensus parameter
will allow us to restore it to the previous value.
2010-05-10 13:00:34 -07:00
Mike Perry
835ab53102 Add a TIMEOUT_RATE keyword to buildtimeout event. 2010-05-10 12:59:05 -07:00
Mike Perry
3bbc3e2137 Bug 1335: Implement filtering step to remove+prevent high timeouts.
This is for the other issue we saw in Bug 1335. A large number of high
timeouts were causing the timeout calculation to slowly drift upwards,
especially in conditions of load. This fix repeatedly regenerates all of
our synthetic timeouts whenever the timeout changes, to try to prevent
drift.

It also lowers the timeout cap to help for some cases of Bug 1245, where
some timeout values were so large that we ended up allocating a ton of
scratch memory to count the histogram bins.

The downside is that lowering this cap is affecting our timeout rate.
Unfortunately, the buildtimeout quantile is now higher than the actual
completion rate by what appears to be about 7-10%, which probably
represents the skew in the distribution due to lowering this synthetic
cap.
2010-05-10 12:58:10 -07:00
Mike Perry
cc2a48f1be Bug 1335: Alter Xm calculation to be weighted avg of top N=3 modes.
In my state files, I was seeing several peaks, probably due to different
guards having different latency. This change is meant to better capture
this behavior and generate more reasonable timeouts when it happens. It
is improving the timeout values for my collection of state files.
2010-05-10 12:46:49 -07:00
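
One way to read "weighted avg of top N=3 modes" is sketched below: take the three most populated histogram bins and average their midpoints, weighted by bin count. The histogram layout, the function name `weighted_xm`, and the use of bin midpoints are assumptions made for illustration. Ties go to the higher (slower) bin, following the "Resolve mode ties" commit later in this log.

```c
#include <stddef.h>

#define NUM_MODES 3

/* bins[i] counts build times in [i*bin_width_ms, (i+1)*bin_width_ms).
 * Returns the count-weighted average midpoint of the NUM_MODES most
 * populated bins, or 0 if the histogram is empty. */
unsigned long weighted_xm(const unsigned *bins, size_t nbins,
                          unsigned bin_width_ms)
{
  size_t picked[NUM_MODES];
  size_t npicked = 0, i, k;
  unsigned long weighted_sum = 0, total = 0;

  while (npicked < NUM_MODES) {
    size_t best = nbins; /* sentinel: nothing found yet */
    for (i = 0; i < nbins; i++) {
      int already = 0;
      if (bins[i] == 0)
        continue;
      for (k = 0; k < npicked; k++)
        if (picked[k] == i)
          already = 1;
      if (already)
        continue;
      /* ">=" resolves ties in favor of the higher (slower) bin. */
      if (best == nbins || bins[i] >= bins[best])
        best = i;
    }
    if (best == nbins)
      break; /* fewer than NUM_MODES non-empty bins */
    picked[npicked++] = best;
  }

  for (k = 0; k < npicked; k++) {
    unsigned long mid = (unsigned long)picked[k] * bin_width_ms
                        + bin_width_ms / 2;
    weighted_sum += mid * bins[picked[k]];
    total += bins[picked[k]];
  }
  return total ? weighted_sum / total : 0;
}
```

For example, with 50 ms bins and counts `{0, 5, 1, 10, 0, 5, 2}`, the three selected bins have midpoints 175, 275 and 75 ms with weights 10, 5 and 5, giving an Xm of 175 ms.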
Nick Mathewson
927425150b Merge branch 'asprintf' 2010-04-02 12:30:46 -04:00
Nick Mathewson
b006e3279f Merge remote branch 'origin/maint-0.2.1'
Conflicts:
	src/common/test.h
	src/or/test.c
2010-02-27 17:16:31 -05:00
Nick Mathewson
c3e63483b2 Update Tor Project copyright years 2010-02-27 17:14:21 -05:00
Nick Mathewson
937b5cdd41 Merge remote branch 'origin/maint-0.2.1'
Conflicts:
	ChangeLog
	src/or/routerparse.c
2010-02-27 15:34:02 -05:00
Sebastian Hahn
86828e2004 Proper NULL checking in circuit_list_path_impl()
Another dereference-then-NULL-check sequence. No reports of this bug
triggered in the wild. Fixes bugreport 1256.

Thanks to ekir for discovering and reporting this bug.
2010-02-26 05:53:26 +01:00
Nick Mathewson
6fa8dacb97 Add a tor_asprintf() function, and use it in a couple of places.
asprintf() is a GNU extension that some BSDs have picked up: it does a printf
into a newly allocated chunk of RAM.

Our tor_asprintf() differs from standard asprintf() in that:
  - Like our other malloc functions, it asserts on OOM.
  - It works on windows.
  - It always sets its return-field.
2010-02-25 16:09:10 -05:00
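
A minimal sketch of an asprintf-style helper with the first and third properties listed above (assert on OOM, always set the result). It assumes a C99-conforming `vsnprintf`, which is not a given on every Windows toolchain, so it does not show how Tor achieves portability. The name `sketch_asprintf` is hypothetical and this is not Tor's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a newly allocated string stored in *strp.
 * Returns the string length, or -1 on a formatting error.
 * *strp is always set: to the new string on success, NULL on error. */
int sketch_asprintf(char **strp, const char *fmt, ...)
{
  va_list ap, ap2;
  int len;

  *strp = NULL; /* the result field is set even on failure */

  va_start(ap, fmt);
  va_copy(ap2, ap);
  len = vsnprintf(NULL, 0, fmt, ap); /* C99: returns the needed length */
  va_end(ap);
  if (len < 0) {
    va_end(ap2);
    return -1;
  }

  *strp = malloc((size_t)len + 1);
  assert(*strp != NULL); /* assert on OOM rather than return an error */
  vsnprintf(*strp, (size_t)len + 1, fmt, ap2);
  va_end(ap2);
  return len;
}
```

The caller owns the returned buffer and releases it with `free()`.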
Mike Perry
f4d6315afa Remove misc unnecessary newlines found by new check. 2010-02-22 16:52:11 -08:00
Mike Perry
245be159af Always weight routers by bandwidth.
Also always predict that we need a high capacity circuit or internal
circuit.
2010-02-22 16:52:11 -08:00
Mike Perry
2b95d1c0ee Describe the recent timeouts reallocation behavior. 2010-02-18 09:08:32 -08:00
Mike Perry
2258125e1a Move CBT params into consensus. 2010-02-18 09:08:31 -08:00
Mike Perry
f459388c29 Add an event for a case where we drop guards.
Also add a comment about an odd CBT timeout edgecase.
2010-02-18 09:08:31 -08:00
Mike Perry
8512e33773 Add BUILDTIMEOUT_SET event for CBT stress testing. 2010-02-18 09:08:31 -08:00
Roger Dingledine
8d84b4bfa1 Merge branch 'maint-0.2.1'
Conflicts:

	ChangeLog
2010-01-19 17:54:41 -05:00
Roger Dingledine
1fc94bfd0e spread guard rotation out throughout the month 2010-01-19 17:52:52 -05:00
Roger Dingledine
0642ab2428 weight guard choice by bandwidth; discard old guards 2010-01-19 17:30:52 -05:00
Roger Dingledine
7d832cc988 make the os x tiger compiler shut up
it's wrong, but that's our problem not its problem
2009-12-21 04:58:03 -05:00
Roger Dingledine
2138b05f17 Use nodes in ExitNodes even if they're not fast/stable 2009-12-21 03:52:33 -05:00
Roger Dingledine
cc73bc3853 Use nodes in EntryNodes even if they're not fast/stable 2009-12-21 03:52:33 -05:00
Roger Dingledine
7346804ec6 instrument entry_is_live to tell why our guard isn't live 2009-12-21 03:52:33 -05:00
Roger Dingledine
ef81649d2f Be more willing to use an unsuitable circuit for exit.
Specifically, there are two cases: a) are we willing to start a new
circuit at a node not in your ExitNodes config option, and b) are we
willing to make use of a circuit that's already established but has an
unsuitable exit.

Now we discard all your circuits when you set ExitNodes, so the only
way you could end up with an exit circuit that ends at an unsuitable
place is if we explicitly ran out of exit nodes, StrictNodes was 0,
and we built this circuit to solve a stream that needs solving.

Fixes bug in dc322931, which would ignore the just-built circuit because
it has an unsuitable exit.
2009-12-21 03:52:32 -05:00
Roger Dingledine
1a65bdd232 Make EntryNodes config option much more aggressive.
Before it would prepend your requested entrynodes to your list of guard
nodes, but feel free to use others after that. Now it chooses only
from your EntryNodes if any of those are available, and only falls back
to others if a) they're all down and b) StrictNodes is not set.

Also, now we refresh your entry guards from EntryNodes at each consensus
fetch (rather than just at startup and then they slowly rot as the
network changes).

The goal here is to make users less likely to set StrictNodes, since
it's doing closer to what they expect it should be doing.
2009-12-21 03:52:31 -05:00
Roger Dingledine
580066f2f6 Switch to a StrictNodes config option.
This is step one of handling ExcludedNodes better. This first
step is just to make EntryNodes and ExitNodes do what they did
before.
2009-12-21 03:52:31 -05:00
Nick Mathewson
350181529e Merge branch 'safelogging2'
Conflicts:
	ChangeLog
2009-12-15 17:26:09 -05:00
Nick Mathewson
fcbd65b45c Refactor the safe_str_*() API to make more sense.
The new rule is: safe_str_X() means "this string is a piece of X
information; make it safe to log."  safe_str() on its own means
"this string is a piece of who-knows-what; make it safe to log".
2009-12-15 17:25:34 -05:00
Nick Mathewson
e56747f9cf Refactor a bit so that it is safe to include math.h, and mostly not needed. 2009-12-15 14:40:49 -05:00
Nick Mathewson
0c1b3070cf Now that FOO_free(NULL) always works, remove checks before calling it. 2009-12-12 02:07:59 -05:00
Sebastian Hahn
3807db001d *_free functions now accept NULL
Some *_free functions threw asserts when passed NULL. Now all of them
accept NULL as input and perform no action when called that way.

This gains us consistency for our free functions, and allows some
code simplifications where an explicit null check is no longer necessary.
2009-12-12 03:29:44 +01:00
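
The pattern this message describes, sketched with a made-up type. `example_item_t`, `example_item_new` and `example_item_free` are hypothetical names used only to show the shape of a NULL-tolerant free function.

```c
#include <stdlib.h>
#include <string.h>

typedef struct example_item_t {
  char *name;
} example_item_t;

/* Allocate an item holding a copy of name. Returns NULL on OOM. */
example_item_t *example_item_new(const char *name)
{
  size_t len = strlen(name) + 1;
  example_item_t *item = malloc(sizeof(*item));
  if (!item)
    return NULL;
  item->name = malloc(len);
  if (!item->name) {
    free(item);
    return NULL;
  }
  memcpy(item->name, name, len);
  return item;
}

/* Release an item. Passing NULL is allowed and does nothing, so callers
 * no longer need their own "if (item)" check before calling. */
void example_item_free(example_item_t *item)
{
  if (!item)
    return;
  free(item->name);
  free(item);
}
```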
Sebastian Hahn
f258647433 Allow SafeLogging to exclude client related information 2009-12-12 02:26:11 +01:00
Nick Mathewson
5e4d53d535 Remove checks for array existence. (CID 410..415)
In C, the code "char x[10]; if (x) {...}" always takes the true branch of
the if statement.  Coverity notices this now.

In some cases, we were testing arrays to make sure that an operation
we wanted to do would succeed.  Those cases are now always-true.

In some cases, we were testing arrays to see if something was _set_.
Those cases are now tests for strlen(s), or tests for
!tor_mem_is_zero(d,len).
2009-10-26 22:40:41 -04:00
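
A small sketch of the replacement tests this message mentions. The struct and function names are hypothetical, and `mem_is_zero` is a stand-in written for this example rather than Tor's `tor_mem_is_zero`.

```c
#include <stddef.h>
#include <string.h>

struct example_record {
  char nickname[20]; /* NUL-terminated string, empty when unset */
  char digest[20];   /* raw bytes, all zero when unset */
};

/* Stand-in helper: return 1 if all len bytes at mem are zero. */
int mem_is_zero(const char *mem, size_t len)
{
  size_t i;
  for (i = 0; i < len; i++)
    if (mem[i])
      return 0;
  return 1;
}

/* "if (r->nickname)" would always be true, because an array member
 * never evaluates to NULL. Test the contents instead. */
int nickname_is_set(const struct example_record *r)
{
  return strlen(r->nickname) > 0;
}

int digest_is_set(const struct example_record *r)
{
  return !mem_is_zero(r->digest, sizeof(r->digest));
}
```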
Roger Dingledine
2394336426 read the "circwindow" parameter from the consensus
backport of c43859c5c1
backport of 0d13e0ed14
2009-10-14 17:07:32 -04:00
Roger Dingledine
9d6c79cbbb fix compile on windows 2009-10-11 17:23:47 -04:00
Sebastian Hahn
e35f9414d6 Fix a memleak when throwing away some build times
This was introduced in f7e6e852e8.
Found by Coverity
2009-10-10 13:41:44 +02:00
Mike Perry
18689317e4 Tweak an assert that shouldn't fire either way.
There were however other places where we used to call this
function that might have caused this to fire. Better
safe than sorry now.
2009-10-07 13:05:28 -07:00
Mike Perry
ec05e64a68 Tweak values for when to discard all of our history.
This seems to be happening to me a lot on a garbage DSL line.
We may need to come up with 2 thresholds: a high short onehop
count and a lower longer count.
2009-10-07 12:49:13 -07:00
Mike Perry
b918cd8f04 Remove another overzealous assert.
Pretimeouts may have build time data, just no timeout data.
2009-10-07 12:24:40 -07:00
Roger Dingledine
b4e0d09202 try to stem the 'sea of fail' 2009-10-01 05:35:24 -04:00
Roger Dingledine
9325b9269c Ignore one-hop circuits for circuit timeout calc
Don't count one-hop circuits when we're estimating how long it
takes circuits to build on average. Otherwise we'll set our circuit
build timeout lower than we should. Bugfix on 0.2.2.2-alpha.
2009-10-01 04:15:45 -04:00
Mike Perry
f7e6e852e8 Fix 1108: Handle corrupt or large build times state.
1108 was actually just a fencepost error in an assert,
but making the state file handling code resilient is a
good idea.
2009-09-29 14:07:04 -04:00
Sebastian Hahn
7f1f6984da Fix memory leak
Some memory could be lost in the error case of
circuit_build_times_parse_state.

Found by Coverity
2009-09-27 12:00:02 -04:00
Mike Perry
fd7454f9e3 Fix Bug 1103.
Don't pass in a quantile that is too high during pretimeout
calculation.
2009-09-21 20:01:20 -07:00
Mike Perry
134266b984 Change the condition on the nonlive timeout counting.
Try to clarify things in the comment too.
2009-09-20 18:20:10 -07:00
Roger Dingledine
cf2afcd707 Fix typos and comments, plus two bugs
A) We were considering a circuit had timed out in the special cases
where we close rendezvous circuits because the final rendezvous
circuit couldn't be built in time.
B) We were looking at the wrong timestamp_created when considering
a timeout.
2009-09-20 19:50:44 -04:00
Mike Perry
f39bedf250 Implement and document new network liveness algorithm.
Based on irc discussion with arma.
2009-09-20 14:51:30 -07:00
Mike Perry
6700e528be Fix some precision-related asserts in unit tests.
Mostly by storing the timeout as milliseconds and not seconds
internally.
2009-09-20 14:43:45 -07:00
Sebastian Hahn
335b67a354 Fix compile on freebsd 2009-09-18 02:43:45 +02:00
Roger Dingledine
ee89061ef2 give proposal 151 a changelog and other touchups 2009-09-17 01:42:33 -04:00
Mike Perry
43c18746bd Clarify use of magic number 0.98 with #define. 2009-09-16 18:41:22 -07:00
Sebastian Hahn
1aac7de1ea Fix unit tests and compile issues on Snow Leopard 2009-09-16 17:22:21 -07:00
Mike Perry
e2c2fa7a1f Change liveness value to be a function of the timeout.
And also the number of recent circuits used to decide
when the network changes.
2009-09-16 17:20:34 -07:00
Mike Perry
e4e0ce94f0 Add log message so we have accurate build time values. 2009-09-16 17:20:34 -07:00
Mike Perry
5bd60d8a41 Address nickm's issues from his review #1. 2009-09-16 17:20:29 -07:00
Mike Perry
0352d43917 Move circuitbuildtimeout config check.
We want it to be under our control so it doesn't mess
up initialization. This is likely the cause for
the bug the previous assert-adding commit (09a75ad) was
trying to address.
2009-09-16 15:58:42 -07:00
Mike Perry
09a75ad316 Time for some debugging by asserts.
Got a negative timeout value on startup. Need to narrow it down.
2009-09-16 15:55:51 -07:00
Mike Perry
742e08046f Fix bugs relating to not counting timeouts as circuit builds.
Also use bin midpoints for time values.
2009-09-16 15:55:51 -07:00
Mike Perry
67cee75ca2 Document functions and constants. 2009-09-16 15:55:50 -07:00
Mike Perry
c9363df09f Remove an assert.
It seems to fire because of precision issues. Added
more debug info to the warn to try to figure out for sure.
2009-09-16 15:55:50 -07:00
Mike Perry
63be2df84f Fix issues found by arma in review. 2009-09-16 15:55:36 -07:00
Roger Dingledine
672e2f6908 space/indent cleanups, plus point out three bugs 2009-09-16 15:55:32 -07:00
Mike Perry
4b3bc714a3 Woops. Fix a couple memory leaks.
Also change the max timeout quantile to 0.98, so we can
avoid huge synthetic timeout values.
2009-09-16 15:54:37 -07:00
Karsten Loesing
b508e4748f Remove trailing spaces. As if bytes were free...
Also correct some typos.
2009-09-16 15:52:05 -07:00
Mike Perry
535423a3bb Resolve mode ties in favor of the higher (slower) mode. 2009-09-16 15:52:04 -07:00
Mike Perry
8210336182 More detail for some log msgs. 2009-09-16 15:52:04 -07:00
Mike Perry
6eba08e22f Use our variable directly for timeout.
Using CircuitBuildTimeout is prone to issues with SIGHUP, etc.
Also, shuffle the circuit build times array after loading it
in so that newer measurements don't replace chunks of
similarly timed measurements.
2009-09-16 15:52:04 -07:00
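
The shuffle mentioned above could look like a plain Fisher-Yates pass over the loaded array. This sketch uses the C library's `rand()` purely for illustration; the random source Tor uses here is not stated in the message, and the name `shuffle_build_times` is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Shuffle times[0..n-1] in place so that samples loaded from one
 * histogram bin are not stored next to each other. Later overwrites of
 * the oldest slots then remove a mix of values, not one chunk of
 * similarly timed measurements.
 * rand() % (i + 1) has a slight modulo bias; acceptable for a sketch. */
void shuffle_build_times(unsigned *times, size_t n)
{
  size_t i;
  if (n < 2)
    return;
  for (i = n - 1; i > 0; i--) {
    size_t j = (size_t)rand() % (i + 1);
    unsigned tmp = times[i];
    times[i] = times[j];
    times[j] = tmp;
  }
}
```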
Mike Perry
fca8446949 Fix a couple of assert bugs. 2009-09-16 15:52:03 -07:00
Mike Perry
c4e6b3eadb Fix timeout edge case when we get enough samples.
Also switch Xm calculation to mode, not min.
2009-09-16 15:52:03 -07:00
Mike Perry
95735e5478 Fix the math.h log() conflict.
It was compiling, but causing segfaults.

Also, adjust when the timer starts for new test circs
and save state every 25 circuits.
2009-09-16 15:51:17 -07:00
Mike Perry
7ac9a66c8f Recover from changing network connections.
Also add code to keep creating circuits every minute until we
hit our minimum threshold.
2009-09-16 15:51:16 -07:00
Mike Perry
411b60325b Factor out the pretimeout handling code.
We need to also call it if we're going to calculate alpha
after a normal circuit build.
2009-09-16 15:51:15 -07:00
Mike Perry
b52bce91fc Write unit tests and fix issues they uncovered. 2009-09-16 15:51:10 -07:00
Mike Perry
04414830fe Implement the pareto fitting and timeout calculating bits. 2009-09-16 15:48:52 -07:00
Mike Perry
7750bee21d Clean up Fallon's partially complete GSoC project.
The code actually isn't that bad. It's a shame she didn't finish.
Using it as the base for this feature.
2009-09-16 15:48:51 -07:00
Nick Mathewson
ed7283d283 Merge commit 'origin/maint-0.2.1'
Resolved conflicts in:
	src/or/circuitbuild.c
2009-09-15 19:37:26 -04:00
Sebastian Hahn
113ba0e727 make some bug 1090 warnings go away
When we excluded some Exits, we were sometimes warning the user that we
were going to use the node regardless. Many of those warnings were in
fact bogus, because the relay in question was not used to connect to
the outside world.

Based on patch by Rotor, thanks!
2009-09-16 01:17:51 +02:00
Sebastian Hahn
5e01a86b42 some cleanups:
documentation fix for get_uint64
remove extra "." from a log line
fix a long line
2009-09-15 07:12:12 -04:00