Filename: 174-optimistic-data-server.txt
Title: Optimistic Data for Tor: Server Side
Author: Ian Goldberg
Created: 2-Aug-2010
Status: Open

Overview:

When a SOCKS client opens a TCP connection through Tor (for an HTTP
request, for example), the query latency is about 1.5x higher than it
needs to be.  Simply, the problem is that the sequence of data flows
is this:

1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command
3. The OP sends a BEGIN cell to the Exit
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
7. The SOCKS client sends some data (the GET request, for example)
8. The OP sends a DATA cell to the Exit
9. The Exit sends the GET to the server
10. The Server returns the HTTP result to the Exit
11. The Exit sends the DATA cells to the OP
12. The OP returns the HTTP result to the SOCKS client

Note that the Exit node knows that the connection to the Server was
successful at the end of step 4, but is unable to send the HTTP query to
the server until step 9.

This proposal (as well as its upcoming sibling concerning the client
side) aims to reduce the latency by allowing:

1. SOCKS clients to optimistically send data before they are notified
   that the SOCKS connection has completed successfully
2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
   state
3. Exit nodes to accept and queue DATA cells while in the
   EXIT_CONN_STATE_CONNECTING state

This particular proposal deals with #3.

In this way, the flow would be as follows:

1. The SOCKS client opens a TCP connection to the OP
2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
   by data (such as the GET request)
3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
   cells
4. The Exit opens a TCP connection to the Server
5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
   request to the Server
6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
   and the Server returns the HTTP result to the Exit
7. The Exit sends the DATA cells to the OP
8. The OP returns the HTTP result to the SOCKS client

Motivation:

This change will save one OP<->Exit round trip (down to one from two).
There are still two SOCKS Client<->OP round trips (negligible time) and
two Exit<->Server round trips.  Depending on the ratio of the
Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
decrease the latency by 25 to 50 percent.  (Writing T for the Tor RTT
and I for the Internet RTT, the latency falls from 2T+2I to T+2I, a
saving of T/(2T+2I): 50 percent when I is negligible next to T, and
25 percent when I equals T.)  Experiments validate these predictions.
[Goldberg, PETS 2010 rump session; see
https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]

Design:

The current code actually correctly handles queued data at the Exit; if
there is queued data in an EXIT_CONN_STATE_CONNECTING stream, that data
will be immediately sent when the connection succeeds.  If the
connection fails, the data will be correctly ignored and freed.  The
problem with the current server code is that the server currently drops
DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.  Also, if
you try to queue data in the EXIT_CONN_STATE_RESOLVING state, bad things
happen because streams in that state don't yet have conn->write_event
set, and so some existing sanity checks (any stream with queued data is
at least potentially writable) are no longer sound.

The solution is to simply not drop received DATA cells while in the
EXIT_CONN_STATE_CONNECTING state.  Also do not send SENDME cells in this
state, so that the OP cannot send more than one window's worth of data
to be queued at the Exit.  Finally, patch the sanity checks so that
streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
can pass.

If no clients ever send such optimistic data, the new code will never be
executed, and the behaviour of Tor will not change.
When clients begin to send optimistic data, the performance of those
clients' streams will improve.

After discussion with nickm, it seems best to just have the server
version number be the indicator of whether a particular Exit supports
optimistic data.  (If a client sends optimistic data to an Exit which
does not support it, the data will be dropped, and the client's request
will fail to complete.)  What do version numbers for hypothetical future
protocol-compatible implementations look like, though?

Security implications:

Servers (for sure the Exit, and possibly others, by watching the pattern
of packets) will be able to tell that a particular client is using
optimistic data.  This will be discussed more in the sibling proposal.

On the Exit side, servers will be queueing a little bit extra data, but
no more than one window.  Clients today can cause Exits to queue that
much data anyway, simply by establishing a Tor connection to a slow
machine, and sending one window of data.

Specification:

tor-spec section 6.2 currently says:

    The OP waits for a RELAY_CONNECTED cell before sending any data.
    Once a connection has been established, the OP and exit node
    package stream data in RELAY_DATA cells, and upon receiving such
    cells, echo their contents to the corresponding TCP stream.
    RELAY_DATA cells sent to unrecognized streams are dropped.

It is not clear exactly what an "unrecognized" stream is, but this last
sentence would be changed to say that RELAY_DATA cells received on a
stream that has processed a RELAY_BEGIN cell and has not yet issued a
RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
immediately after a RELAY_CONNECTED cell is issued for the stream, or
freed after a RELAY_END cell is issued for the stream.

The earlier part of this section will be addressed in the sibling
proposal.

Compatibility:

There are compatibility issues, as mentioned above.  OPs MUST NOT send
optimistic data to Exit nodes whose version numbers predate (something).
OPs MAY send optimistic data to Exit nodes whose version numbers match
or follow that value.  (But see the question about independent server
reimplementations, above.)

Implementation:

Here is a simple patch.  It seems to work with both regular streams and
hidden services, but there may be other corner cases I'm not aware of.
(Do streams used for directory fetches, hidden services, etc. take a
different code path?)

diff --git a/src/or/connection.c b/src/or/connection.c
index 7b1493b..f80cd6e 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
     return;
   }
 
-  connection_start_writing(conn);
+  /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
+   * state, we don't want to try to write it right away, since
+   * conn->write_event won't be set yet.  Otherwise, write data from
+   * this conn as the socket is available. */
+  if (conn->state != EXIT_CONN_STATE_RESOLVING) {
+    connection_start_writing(conn);
+  }
   if (zlib) {
     conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
   } else {
@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
     tor_assert(conn->s < 0);
 
   if (conn->outbuf_flushlen > 0) {
-    tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
+    /* With optimistic data, we may have queued data in
+     * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
+     * */
+    tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
+            connection_is_writing(conn) || conn->write_blocked_on_bw ||
             (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
   }
 
diff --git a/src/or/relay.c b/src/or/relay.c
index fab2d88..e45ff70 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   relay_header_t rh;
   unsigned domain = layer_hint?LD_APP:LD_EXIT;
   int reason;
+  int optimistic_data = 0; /* Set to 1 if we receive data on a stream
+                              that's in the EXIT_CONN_STATE_RESOLVING
+                              or EXIT_CONN_STATE_CONNECTING states.*/
 
   tor_assert(cell);
   tor_assert(circ);
@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
   /* either conn is NULL, in which case we've got a control cell, or else
    * conn points to the recognized stream. */
 
-  if (conn && !connection_state_is_open(TO_CONN(conn)))
-    return connection_edge_process_relay_cell_not_open(
-             &rh, cell, circ, conn, layer_hint);
+  if (conn && !connection_state_is_open(TO_CONN(conn))) {
+    if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
+         conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
+        rh.command == RELAY_COMMAND_DATA) {
+      /* We're going to allow DATA cells to be delivered to an exit
+       * node in state EXIT_CONN_STATE_CONNECTING or
+       * EXIT_CONN_STATE_RESOLVING.  This speeds up HTTP, for example. */
+      log_warn(domain, "Optimistic data received.");
+      optimistic_data = 1;
+    } else {
+      return connection_edge_process_relay_cell_not_open(
+               &rh, cell, circ, conn, layer_hint);
+    }
+  }
 
   switch (rh.command) {
     case RELAY_COMMAND_DROP:
@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       log_debug(domain,"circ deliver_window now %d.", layer_hint ?
                 layer_hint->deliver_window : circ->deliver_window);
 
-      circuit_consider_sending_sendme(circ, layer_hint);
+      if (!optimistic_data) {
+        circuit_consider_sending_sendme(circ, layer_hint);
+      }
 
       if (!conn) {
         log_info(domain,"data cell dropped, unknown stream (streamid %d).",
@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
       stats_n_data_bytes_received += rh.length;
       connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
                               rh.length, TO_CONN(conn));
-      connection_edge_consider_sending_sendme(conn);
+      if (!optimistic_data) {
+        connection_edge_consider_sending_sendme(conn);
+      }
       return 0;
     case RELAY_COMMAND_END:
       reason = rh.length > 0 ?

Performance and scalability notes:

There may be more RAM used at Exit nodes, as mentioned above, but it is
transient.
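
Illustrative sketch (not part of the patch):

To make the "no more than one window" bound concrete, the queueing rules
from the Design section can be modelled as a small standalone state
machine.  This is a toy sketch, not Tor code: every name below
(toy_stream, toy_data, and so on) is hypothetical, and the three
constants are assumptions taken from the usual tor-spec defaults (a
500-cell stream window, 50 cells per SENDME, 498 data bytes per DATA
cell) rather than from this proposal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names are hypothetical; none are Tor identifiers. */
#define TOY_WINDOW_START 500     /* assumed initial stream window (cells) */
#define TOY_WINDOW_INCREMENT 50  /* assumed cells acknowledged per SENDME */
#define TOY_CELL_PAYLOAD 498     /* assumed max data bytes per DATA cell */

enum toy_state { TOY_CONNECTING, TOY_OPEN, TOY_CLOSED };

struct toy_stream {
  enum toy_state state;
  int deliver_window;    /* DATA cells the OP may still send */
  unsigned char *queue;  /* optimistic data waiting for the connect */
  size_t queued;         /* bytes currently held in queue */
  size_t to_server;      /* bytes handed to the server so far */
  int sendmes;           /* SENDME cells issued so far */
};

/* A BEGIN cell arrived: the stream starts out connecting. */
void toy_begin(struct toy_stream *s) {
  assert(s != NULL);
  memset(s, 0, sizeof(*s));
  s->state = TOY_CONNECTING;
  s->deliver_window = TOY_WINDOW_START;
}

/* Issue SENDMEs until the window is back above the refill threshold. */
static void toy_consider_sendme(struct toy_stream *s) {
  while (s->deliver_window <= TOY_WINDOW_START - TOY_WINDOW_INCREMENT) {
    s->deliver_window += TOY_WINDOW_INCREMENT;
    s->sendmes++;
  }
}

/* Handle one DATA cell.  Returns 0 if accepted, -1 if refused. */
int toy_data(struct toy_stream *s, const unsigned char *p, size_t len) {
  assert(s != NULL && len <= TOY_CELL_PAYLOAD);
  if (s->state == TOY_CLOSED || s->deliver_window <= 0)
    return -1;
  if (s->state == TOY_CONNECTING) {
    /* Queue the data and send no SENDME, so the window only shrinks. */
    if (len > 0) {
      unsigned char *q = realloc(s->queue, s->queued + len);
      if (q == NULL)
        return -1;
      memcpy(q + s->queued, p, len);
      s->queue = q;
      s->queued += len;
    }
    s->deliver_window--;
    return 0;
  }
  /* Open stream: forward at once and replenish the window as usual. */
  s->deliver_window--;
  s->to_server += len;
  toy_consider_sendme(s);
  return 0;
}

/* The connect succeeded: flush the queue, then resume sending SENDMEs. */
void toy_connected(struct toy_stream *s) {
  assert(s != NULL);
  if (s->state != TOY_CONNECTING)
    return;
  s->state = TOY_OPEN;
  s->to_server += s->queued;
  free(s->queue);
  s->queue = NULL;
  s->queued = 0;
  toy_consider_sendme(s);
}

/* The connect failed or the stream ended: free any queued data. */
void toy_end(struct toy_stream *s) {
  assert(s != NULL);
  free(s->queue);
  s->queue = NULL;
  s->queued = 0;
  s->state = TOY_CLOSED;
}
```

Under those assumed constants, feeding 500 full cells into a connecting
stream queues 249000 bytes and issues no SENDME, and a 501st cell is
refused; that is the transient per-stream RAM cost referred to above.
Calling toy_connected() then flushes the queue and reopens the window,
while toy_end() discards the queued bytes without sending them.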