sched: Ignore closed channel after flushing cells

The flush cells process can close a channel if the connection write fails but
still return that it flushed at least one cell. This is due because the error
is not propagated up the call stack so there is no way of knowing if the flush
actually was successful or not.

Because this would require an important refactoring touching multiple
subsystems, this patch is a bandaid to avoid the KIST scheduler to handle
closed channel in its loop.

Bandaid on #23751.

Signed-off-by: David Goulet <dgoulet@torproject.org>
This commit is contained in:
David Goulet 2017-11-08 09:44:39 -05:00
parent e5a83062ed
commit dcabf801e5
2 changed files with 15 additions and 0 deletions

6
changes/bug23751 Normal file
View File

@ -0,0 +1,6 @@
o Minor bugfixes (scheduler, channel):
- Ignore channels that have been closed while flushing cells. This can
happen if the write on the connection fails leading to the channel being
closed while in the scheduler loop. This is not a complete fix, it is a
bandaid until we are able to refactor those interactions. Fixes bug
23751; bugfix on 0.3.2.1-alpha.

View File

@ -598,6 +598,15 @@ kist_scheduler_run(void)
if (socket_can_write(&socket_table, chan)) {
/* flush to channel queue/outbuf */
flush_result = (int)channel_flush_some_cells(chan, 1); // 1 for num cells
/* XXX: While flushing cells, it is possible that the connection write
* fails leading to the channel to be closed which triggers a release
* and free its entry in the socket table. And because of a engineering
* design issue, the error is not propagated back so we don't get an
* error at this poin. So before we continue, make sure the channel is
* open and if not just ignore it. See #23751. */
if (!CHANNEL_IS_OPEN(chan)) {
continue;
}
/* flush_result has the # cells flushed */
if (flush_result > 0) {
update_socket_written(&socket_table, chan, flush_result *