Commit Graph

64 Commits

Author SHA1 Message Date
Nick Mathewson
fb0019daf9 Update copyrights to 2018. 2018-06-20 08:13:28 -04:00
Nick Mathewson
19c34b4658 Move or_connection_t to its own header. 2018-06-15 10:48:50 -04:00
Nick Mathewson
871ff0006d Add an API for a scheduled/manually activated event in the mainloop
Using this API lets us remove event2/event.h usage from half a dozen
modules, to better isolate libevent.  Implements part of ticket
23750.
2018-04-05 12:35:11 -04:00
Roger Dingledine
a7440d9c9d more fixes for typos, grammar, whitespace, etc
some of these ought to have been noticed by the "misspell" tool,
so if anybody is debugging it, here are some bug reports :)
2018-02-07 12:22:29 -05:00
David Goulet
e1a40535ea Merge branch 'bug24700_032_01' into bug24700_033_01 2018-02-01 16:39:04 -05:00
Nick Mathewson
cb5654f300 sched: Use the sched_heap_idx field to double-check our fix for 24700.
Signed-off-by: David Goulet <dgoulet@torproject.org>
2018-02-01 16:00:59 -05:00
Nick Mathewson
ca85d66217 Merge branch 'maint-0.3.2' 2018-02-01 08:15:09 -05:00
David Goulet
fbc455cbd2 ns: Add a before and after consensus has changed notification
In 0.3.2.1-alpha, we've added notify_networkstatus_changed() in order to have
a way to notify other subsystems that the consensus just changed. The old and
new consensus are passed to it.

Before this patch, this was done _before_ the new consensus was set globally
(thus NOT accessible by getting the latest consensus). The scheduler
notification was assuming that it was set and select_scheduler() is looking at
the latest consensus to get the parameters it might needs. This was very wrong
because at that point it is still the old consensus set globally.

This commit changes the notify_networkstatus_changed() to be the "before"
function and adds an "after" notification from which the scheduler subsystem
is notified.

Fixes #24975
2018-01-31 14:15:02 -05:00
David Goulet
adaf3e9b89 sched: Avoid adding the same channel twice to the KIST pending list
This is the quick fix that is keeping the channel in PENDING state so if we
ever try to reschedule the same channel, it won't happened.

Fixes #24700

Signed-off-by: David Goulet <dgoulet@torproject.org>
2018-01-31 13:46:31 -05:00
Nick Mathewson
943134e886 Merge remote-tracking branch 'pastly2/ticket24531_033_01' 2018-01-03 11:56:35 -05:00
Nick Mathewson
2b8a06a2ef Merge branch 'maint-0.3.2' 2017-12-21 11:16:00 -05:00
Nick Mathewson
6cd567d797 Merge remote-tracking branch 'dgoulet/bug24671_032_01' into maint-0.3.2 2017-12-21 11:13:33 -05:00
Nick Mathewson
bcc96c77de Merge branch 'maint-0.3.2' 2017-12-21 10:27:39 -05:00
Nick Mathewson
c38157be9d clarify a comment 2017-12-21 10:27:37 -05:00
Nick Mathewson
d0c5fe257b Merge branch 'maint-0.3.2' 2017-12-21 10:20:35 -05:00
David Goulet
885ba513ff sched: Consider extra_space even if negative in KIST
With extra_space negative, it means that the "notsent" queue is quite large so
we must consider that value with the current computed tcp_space. If we end up
to have negative space, we should not add more data to the kernel since the
notsent queue is just too filled up.

Fixes #24665

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-12-21 09:32:04 -05:00
David Goulet
fdfa4a5a14 sched: Use lower layer cell limit with KISTLite
Instead of using INT_MAX as a write limit for KISTLite, use the lower layer
limit which is using the specialized num_cells_writeable() of the channel that
will down the line check the connection's outbuf and limit it to 32KB
(OR_CONN_HIGHWATER).

That way we don't take the chance of bloating the connection's outbuf and we
keep the cells in the circuit queue which our OOM handler can take care of,
not the outbuf.

Finally, this commit adds a log_debug() in the update socket information
function of KIST so we can get the socket information in debug.

Fixes #24671

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-12-20 14:17:18 -05:00
Nick Mathewson
252db6ad26 Merge branch 'maint-0.3.2' 2017-12-11 16:02:10 -05:00
David Goulet
057139d383 sched: Avoid integer overflow when computing tcp_space
In KIST, we could have a small congestion window value than the unacked
packets leading to a integer overflow which leaves the tcp_space value to be
humongous.

This has no security implications but it results in KIST scheduler allowing to
send cells on a potentially saturated connection.

Found by #24423. Fixes #24590.

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-12-11 15:45:19 -05:00
Matt Traudt
667f931177 Make get_scheduler_state_string available to scheduler*.c 2017-12-11 09:43:09 -05:00
Matt Traudt
8797c8fbd3 Remove now-duplicate log_debug lines 2017-12-11 09:43:09 -05:00
Matt Traudt
273325e216 Add all the missed scheduler_state assignments 2017-12-11 09:43:08 -05:00
Nick Mathewson
5ee0cccd49 Merge branch 'macro_free_v2_squashed' 2017-12-08 14:58:43 -05:00
Nick Mathewson
fa0d24286b Convert remaining function (mostly static) to new free style 2017-12-08 14:47:19 -05:00
David Goulet
97702c69b0 sched: Set channel scheduler state to IDLE when not opened
In the KIST main loop, if the channel happens to be not opened, set its state
to IDLE so we can release it properly later on. Prior to this fix, the channel
was in PENDING state, removed from the channel pending list and then kept in
that state because it is not opened.

This bug was introduced in commit dcabf801e5 for
which we made the scheduler loop not consider unopened channel.

This has no consequences on tor except for an annoying but harmless BUG()
warning.

Fixes #24502

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-12-04 14:48:15 -05:00
David Goulet
ff6c8cf861 sched: Downgrade warning log to info in KIST
Some platforms don't have good monotonic time support so don't warn when the
diff between the last run of the scheduler time and now is negative. The
scheduler recovers properly from this so no need to be noisy.

Fixes #23696

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-12-04 12:22:02 -05:00
David Goulet
dcabf801e5 sched: Ignore closed channel after flushing cells
The flush cells process can close a channel if the connection write fails but
still return that it flushed at least one cell. This is due because the error
is not propagated up the call stack so there is no way of knowing if the flush
actually was successful or not.

Because this would require an important refactoring touching multiple
subsystems, this patch is a bandaid to avoid the KIST scheduler to handle
closed channel in its loop.

Bandaid on #23751.

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-11-08 09:44:39 -05:00
Nick Mathewson
cb42c62c9e Merge branch 'dgoulet_ticket23753_032_02_squashed' into maint-0.3.2 2017-11-02 10:30:42 -04:00
Matt Traudt
52050bb2c6 sched: Add another SCHED_BUG() callsite 2017-11-02 10:30:33 -04:00
David Goulet
3931a6f264 sched: Use SCHED_BUG() macro in scheduler
When a BUG() occurs, this macro will print extra information about the state
of the scheduler and the given channel if any. This will help us greatly to
fix future bugs in the scheduler especially when they occur rarely.

Fixes #23753

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-11-02 10:30:33 -04:00
Nick Mathewson
8652f3e9e8 Fix memory leak when freeing socket_table in KIST. 2017-10-17 13:40:31 -04:00
Nick Mathewson
dddae36f5e Merge remote-tracking branch 'dgoulet/ticket23696_032_01' 2017-09-29 17:46:50 -04:00
David Goulet
070064de89 sched: Always initialize scheduler_last_run to now
Because our monotonic time interface doesn't play well with value set to 0,
always initialize to now() the scheduler_last_run at init() of the KIST
scheduler.

Fixes #23696

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-29 14:07:55 -04:00
Matt Traudt
3ef7e6f187 sched: Don't get KIST stuck in an infinite loop
When a channel is scheduled and flush cells returns 0 that is no cells to
flush, we flag it back in waiting for cells so it doesn't get stuck in a
possible infinite loop.

It has been observed on moria1 where a closed channel end up in the scheduler
where the flush process returned 0 cells but it was ultimately kept in the
scheduling loop forever. We suspect that this is due to a more deeper problem
in tor where the channel_more_to_flush() is actually looking at the wrong
queue and was returning 1 for an empty channel thus putting the channel in the
"Case 4" of the scheduler which is to go back in pending state thus
re-considered at the next iteration.

This is a fix that allows the KIST scheduler to recover properly from a not
entirelly diagnosed problem in tor.

Fixes #23676

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-29 11:06:31 -04:00
Matt Traudt
7bbc29b0f2 sched: make interval a plain int; initialize with macro 2017-09-25 11:11:30 -04:00
David Goulet
ef2a449cce sched: Make KISTSchedRunInterval non negative
Fixes #23539.

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-25 11:11:30 -04:00
Matt Traudt
22699e3f16 sched: only log when scheduler type changes
Closes 23552. Thanks dgoulet for original impl
2017-09-22 08:51:22 -04:00
Matt Traudt
4a3b61a5b3 sched: reorder code to fit comment bodies; comment typos
Closes 23560
2017-09-19 20:13:28 -04:00
Nick Mathewson
290274dbb5 Add a cast so that 32-bit compilation goes without errors 2017-09-18 12:44:26 -04:00
David Goulet
77cc97cf0a sched: Don't cast to int32_t the monotime_diff_msec() result
When the KIST schedule() is called, it computes a diff value between the last
scheduler run and the current monotonic time. If tha value is below the run
interval, the libevent even is updated else the event is run.

It turned out that casting to int32_t the returned int64_t value for the very
first scheduler run (which is set to 0) was creating an overflow on the 32 bit
value leading to adding the event with a gigantic usec value. The scheduler
was simply never running for a while.

First of all, a BUG() is added for a diff value < 0 because if the time is
really monotonic, we should never have a now time that is lower than the last
scheduler run time. And we will try to recover with a diff value to 0.

Second, the diff value is changed to int64_t so we avoid this "bootstrap
overflow" and future casting overflow problems.

Fixes #23558

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-18 10:55:54 -04:00
Nick Mathewson
4519b7b469 kist_no_kernel_support is unused when we didn't detect it.
So, remove it.
2017-09-15 16:44:05 -04:00
Nick Mathewson
c1deabd3b0 Run our #else/#endif annotator on our source code. 2017-09-15 16:24:44 -04:00
Nick Mathewson
a01e4a1a95 kist: Cast, then do operations on int32.
Otherwise integer overflows can happen.  Remember, doing a i32xi32
multiply doesn't actually produce a 64-bit output.  You need to do
i64xi32 or i64xi64.

Coverity found this as CID 1417753
2017-09-15 14:30:19 -04:00
Nick Mathewson
03e102c1bb Make netinet/tcp include conditional too: windows lacks it. 2017-09-15 14:08:51 -04:00
Nick Mathewson
44fa866621 fix some 32-bit warnings about printf arguments 2017-09-15 14:02:08 -04:00
Nick Mathewson
fea2d84ce3 wrap a wide comment 2017-09-15 12:03:08 -04:00
Matt Traudt
ae9e6b547d sched: add comment about how we determine if a socket should write 2017-09-15 11:40:59 -04:00
Matt Traudt
501c58187d sched: add more per-socket limit documentation; int fix 2017-09-15 11:40:59 -04:00
David Goulet
1033e14a69 sched: Define SCHEDULER_KIST_PRIVATE for more encapsulation
Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-15 11:40:59 -04:00
David Goulet
734dbfa590 sched: Make the scheduler object static
Each type of scheduler implements its own static scheduler_t object and
returns a reference to it.

This commit also makes it a const pointer that is it can only change inside
the scheduler type subsystem but not outside for extra protection.

Signed-off-by: David Goulet <dgoulet@torproject.org>
2017-09-15 11:40:59 -04:00