mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-10 21:23:58 +01:00
Revise proposal 171 from start to finish
The big semantic change is to make the IsolateFoo options exist on a per-client-port basis.
This commit is contained in:
parent
c4d2a55a88
commit
a1e46c5393
@ -1,99 +1,350 @@
|
|||||||
Filename: 171-separate-streams-by-port-or-host.txt
|
Filename: 171-separate-streams.txt
|
||||||
Title: Separate streams across circuits by destination port or destination host
|
Title: Separate streams across circuits by connection metadata
|
||||||
Author: Robert Hogan, Jacob Appelbaum, Damon McCoy
|
Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson
|
||||||
Created: 21-Oct-2008
|
Created: 21-Oct-2008
|
||||||
Modified: 30-Aug-2010
|
Modified: 7-Dec-2010
|
||||||
Status: Draft
|
Status: Open
|
||||||
|
|
||||||
|
Summary:
|
||||||
|
|
||||||
|
We propose a new set of options to isolate unrelated streams from one
|
||||||
|
another, putting them on separate circuits so that semantically
|
||||||
|
unrelated traffic is not inadvertently made linkable.
|
||||||
|
|
||||||
Motivation:
|
Motivation:
|
||||||
|
|
||||||
Streams are currently attached to circuits without regard to their content,
|
Currently, Tor attaches regular streams (that is, ones not carrying
|
||||||
destination host, or destination port. We propose three options,
|
rendezvous or directory traffic) to circuits based only on whether Tor
|
||||||
IsolateBySOCKSUser, IsolateStreamsByPort and IsolateStreamsByHost to change the
|
circuit's current exit node supports the destination, and whether the
|
||||||
default behavior.
|
circuit has been dirty (that is, in use) for too long.
|
||||||
|
|
||||||
The contents of some streams will always have revealing plain text information;
|
This means that traffic that would otherwise be unrelated sometimes
|
||||||
these streams should be treated differently than other streams that may or may
|
gets sent over the same circuit, allowing the exit node to link such
|
||||||
not have unencrypted PII content. DNS, with the exception of DNSCurve, is
|
streams with certainty, and allowing other parties to link such
|
||||||
always unencrypted. It is reasonable to assume that other protocols may exist
|
streams probabilistically.
|
||||||
that have a similar issue and may cause user concern. It is also the case that
|
|
||||||
we must balance network load issues and stream privacy. The Tor network will not
|
|
||||||
currently scale to one circuit per application connection nor should it anytime
|
|
||||||
soon.
|
|
||||||
|
|
||||||
Circuits are currently created with a few constraints and are rotated within
|
Older versions of onion routing tried to address this problem by
|
||||||
a reasonable time window. This allows a rogue exit node to correlate all
|
sending every stream over a separate circuit; performance issues made
|
||||||
streams on a given circuit.
|
this unfeasible. Moreover, in the presence of a localized adversary,
|
||||||
|
separating streams by circuits increases the odds that, for any given
|
||||||
|
linked set of streams, at least one will go over a compromised
|
||||||
|
circuit.
|
||||||
|
|
||||||
|
Therefore we ought to look for ways to allow streams that ought to be
|
||||||
|
linked to travel over a single circuit, while keeping streams that
|
||||||
|
ought not be linked isolated to separate circuits.
|
||||||
|
|
||||||
|
Discussion:
|
||||||
|
|
||||||
|
Let's call a series of inherently-linked streams (like a set of
|
||||||
|
streams downloading objects from the same webpage, or a browsing
|
||||||
|
session where the user requests several related webpages) a "Session".
|
||||||
|
|
||||||
|
"Sessions" are a necessarily a fuzzy concept. While users typically
|
||||||
|
consider some activities as wholly unrelated to each other ("My IM
|
||||||
|
session has nothing to do with my web browsing!"), the boundaries
|
||||||
|
between activities are sometimes hard to determine. If I'm reading
|
||||||
|
lolcats in one browser tab and reading about treatments for an
|
||||||
|
embarrassing disease in another, those are probably separate sessions.
|
||||||
|
If I search for a forum, log in, read it for a while, and post a few
|
||||||
|
messages on unrelated topics, that's probably all the same session.
|
||||||
|
|
||||||
|
So with the proviso that no automated process can identify sessions
|
||||||
|
100% accurately, let's see which options we have available.
|
||||||
|
|
||||||
|
Generally, all the streams on a session come from a single
|
||||||
|
application. Unfortunately, isolating streams by application
|
||||||
|
automatically isn't feasible, given the lack of any nice
|
||||||
|
cross-platform way to tell which local process originated a given
|
||||||
|
connection. (Yes, lsof works. But a quick review of the lsof code
|
||||||
|
should be sufficient to scare you away from thinking there is a
|
||||||
|
portable option, much less a portable O(1) option.) So instead, we'll
|
||||||
|
have to use some other aspect of a Tor request as a proxy for the
|
||||||
|
application.
|
||||||
|
|
||||||
|
Generally, traffic from separate applications is not in the same
|
||||||
|
session.
|
||||||
|
|
||||||
|
With some applications (IRC, for example), each stream is a session.
|
||||||
|
|
||||||
|
Some applications (most notably web browsing) can't be meaningfully
|
||||||
|
split into sessions without inspecting the traffic itself and
|
||||||
|
maintaining a lot of state.
|
||||||
|
|
||||||
|
How well do ports correspond to sessions? Early versions of this
|
||||||
|
proposal focused on using destination ports as a proxy for
|
||||||
|
application, since a connection to port 22 for SSH is probably not in
|
||||||
|
the same session as one to port 80. This only works with some
|
||||||
|
applications better than others, though: while SSH users typically
|
||||||
|
know when they're on port 22 and when they aren't, a web browser can
|
||||||
|
be coaxed (though img urls or any number of releated tricks) into
|
||||||
|
connecting to any port at all. Moreover, when Tor gets a DNS lookup
|
||||||
|
request, it doesn't know in advance which port the resulting address
|
||||||
|
will be used to connect to.
|
||||||
|
|
||||||
|
So in summary, each kind of traffic wants to follow different rules,
|
||||||
|
and assuming the existence of a web browser and a hostile web page or
|
||||||
|
exit node, we can't tell one kind of traffic from another by simply
|
||||||
|
looking at the destination:port of the traffic.
|
||||||
|
|
||||||
|
Fortunately, we're not doomed.
|
||||||
|
|
||||||
Design:
|
Design:
|
||||||
|
|
||||||
We propose two options for isolation of streams that lessen the observability
|
When a stream arrives at Tor, we have the following data to examine:
|
||||||
and linkability of the Tor client's traffic.
|
1) The destination address
|
||||||
|
2) The destination port (unless this a DNS lookup)
|
||||||
|
3) The protocol used by the application to send the stream to Tor:
|
||||||
|
SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy"
|
||||||
|
mechanism the kernel gives us.
|
||||||
|
4) The port used by the application to send the stream to Tor --
|
||||||
|
that is, the SOCKSListenAddress or TransListenAddress that the
|
||||||
|
application used, if we have more than one.
|
||||||
|
5) The SOCKS username and password, if any.
|
||||||
|
6) The source address and port for the application.
|
||||||
|
|
||||||
IsolateStreamsByPort will take a list of ports or optionally the keyword 'All'
|
We propose to use 3, 4, and 5 as a backchannel for applications to
|
||||||
in place of a port list. The use of the keyword 'All' will ensure that all
|
tell Tor about different sessions. Rather than running only one
|
||||||
application connections attached to streams will be isolated to separate
|
SOCKSPort, a Tor user who would prefer better session isolation should
|
||||||
circuits by port number.
|
run multiple SOCKSPorts/TransPorts, and configure different
|
||||||
|
applications to use separate ports. Applications that support SOCKS
|
||||||
|
authentication can further be separated on a single port by their
|
||||||
|
choice of username/password. Streams sent to separate ports or using
|
||||||
|
different authentication information should never be sent over the
|
||||||
|
same circuit. We allow each port to have its own settings for
|
||||||
|
isolation based on destination port, destination address, or both.
|
||||||
|
|
||||||
IsolateStreamsByHost will take a boolean value. When enabled, all application
|
Handling DNS can be a challenge. We can get hostnames by one of three
|
||||||
connections, regardless of port number will be isolated with separate circuits
|
means:
|
||||||
per host. If this option is enabled, we should ensure that the client has a
|
|
||||||
reasonable number of pre-built circuits to ensure perceived performance. This
|
|
||||||
should also intentionally limit the total number of circuits a client will
|
|
||||||
build to ten circuits to prevent abuse and load on the network. This is a
|
|
||||||
trade-off of performance for anonymity. Tor will issue a warning if a client
|
|
||||||
encounters this limit.
|
|
||||||
|
|
||||||
IsolateBySOCKSUser will take a boolean value. When enabled, all application
|
A) A SOCKS4a request, or a SOCKS5 request with a hostname. This
|
||||||
connections, regardless of port number will be isolated with separate circuits
|
case is handled trivially using the rules above.
|
||||||
per SOCKS username. This options ensures that any two streams that were created
|
B) A RESOLVE request on a SOCKSPort. This case is handled using the
|
||||||
with different SOCKS usernames will be sent over different circuits. The empty
|
rules above, except that port isolation can't work to isolate
|
||||||
username will be treated as its own username different from all other usernames.
|
RESOLVE requests into a proper session, since we don't know which
|
||||||
|
port will eventually be used when we connect to the returned
|
||||||
|
address.
|
||||||
|
C) A request on a DNSPort. We have no way of knowing which
|
||||||
|
address/port will be used to connect to the requested address.
|
||||||
|
|
||||||
Security implications:
|
When B or C is required but problematic, we could favor the use of
|
||||||
|
AutomapHostsOnResolve.
|
||||||
|
|
||||||
It is believed that the proposed changes will improve the anonymity for end
|
Interface:
|
||||||
user stream privacy. The end user will no longer link all streams at a single
|
|
||||||
exit node during a given time window.
|
|
||||||
|
|
||||||
There is a possible attack where a hostile web page possibly in collusion with
|
We propose that {SOCKS,Natd,Trans,DNS}ListenAddr be deprecated in
|
||||||
an exit node contains image links for images at (say) "evil.example.com:53" and
|
favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:
|
||||||
"evil.example.com:31337", and thereby (if they're lucky) correlate port-80
|
|
||||||
circuits with port-53 and port-31337 circuits.
|
ClientPortLine = OptionName SP (Addr ":")? Port (SP Options?)
|
||||||
|
OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
|
||||||
|
Addr = An IPv4 address / an IPv6 address surrounded by brackets.
|
||||||
|
If optional, we default to 127.0.0.1
|
||||||
|
Port = An integer from 1 through 65535 inclusive
|
||||||
|
Options = Option
|
||||||
|
Options = Options SP Option
|
||||||
|
Option = IsolateOption / GroupOption
|
||||||
|
GroupOption = "SessionGroup=" UINT
|
||||||
|
IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" /
|
||||||
|
"IsolateSOCKSUser"/ "IsolateClientProtocol" /
|
||||||
|
"IsolateClientAddr") OptPlural
|
||||||
|
OptNo = "No" ?
|
||||||
|
OptPlural = "s" ?
|
||||||
|
SP = " "
|
||||||
|
UINT = An unsigned integer
|
||||||
|
|
||||||
|
All options are case-insensitive.
|
||||||
|
|
||||||
|
The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by
|
||||||
|
default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively
|
||||||
|
turn them off. The IsolateDestPort and IsolateDestAddr and
|
||||||
|
IsolateClientProtocol options are off by default. NoIsolateDestPort and
|
||||||
|
NoIsolateDestAddr and NoIsolateClientProtocol have no effect.
|
||||||
|
|
||||||
|
Given a set of ClientPortLines, streams must NOT be placed on the same
|
||||||
|
circuit if ANY of the following hold:
|
||||||
|
|
||||||
|
* They were sent to two different client ports, unless the two
|
||||||
|
client ports both specify a "SessionGroup" option with the same
|
||||||
|
integer value.
|
||||||
|
* At least one was sent to a client port with the IsolateDestPort
|
||||||
|
active, and they have different destination ports.
|
||||||
|
* At least one was sent to a client port with IsolateDestAddr
|
||||||
|
active, and they have different destination addresses.
|
||||||
|
* At least one was sent to a client port with IsolateClientProtocol
|
||||||
|
active, and they use different protocols (where SOCKS4, SOCKS4a,
|
||||||
|
SOCKS5, TransPort, NatdPort, and DNS are the protocols in question)
|
||||||
|
* At least one was sent to a client port with IsolateSOCKSUser
|
||||||
|
active, and they have different SOCKS username/password values
|
||||||
|
configurations. (For the purposes of this option, the
|
||||||
|
username/password pair of ""/"" is distinct from SOCKS without
|
||||||
|
authentication, and both are distinct from any non-SOCKS client's
|
||||||
|
non-authentication.)
|
||||||
|
* At least one was sent to a client port with IsolateClientAddr
|
||||||
|
active, and they came from different client addresses. (For the
|
||||||
|
purpose of this option, any local interface counts as the same
|
||||||
|
address. So if the host is configured with addresses 10.0.0.1,
|
||||||
|
192.0.32.10, and 127.0.0.1, then traffic from those addresses can
|
||||||
|
leave on the same circuit, but traffic to from 10.0.0.2 (for
|
||||||
|
example) could not share a circuit with any of them.)
|
||||||
|
|
||||||
|
These rules apply regardless of whether the streams are active at the
|
||||||
|
same time. In other words, if the rules say that streams A and B must
|
||||||
|
not be on the same circuit, and stream A is attached to circuit X,
|
||||||
|
then stream B must never be attached to stream X, even if stream A is
|
||||||
|
closed first.
|
||||||
|
|
||||||
|
Alternative Interface:
|
||||||
|
|
||||||
|
We're cramming a lot onto one line in the design above. Perhaps
|
||||||
|
instead it would be a better idea to have grouped lines of the form:
|
||||||
|
|
||||||
|
StreamGroup 1
|
||||||
|
SOCKSPort 9050
|
||||||
|
TransPort 9051
|
||||||
|
IsolateDestPort 1
|
||||||
|
IsolateClientProtocol 0
|
||||||
|
EndStreamGroup
|
||||||
|
|
||||||
|
StreamGroup 2
|
||||||
|
SOCKSPort 9052
|
||||||
|
DNSPort 9053
|
||||||
|
IsolateDestAddr 1
|
||||||
|
EndStreamGroup
|
||||||
|
|
||||||
|
This would be equivalent to:
|
||||||
|
SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
|
||||||
|
TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
|
||||||
|
SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
|
||||||
|
DNSPort 9053 SessionGroup=2 IsolateDestAddr
|
||||||
|
|
||||||
|
But it would let us extend range of allowed options later without
|
||||||
|
having client port lines group without bound. For example, we might
|
||||||
|
give different circuit building parameters to different session
|
||||||
|
groups.
|
||||||
|
|
||||||
|
Example of use:
|
||||||
|
|
||||||
|
Suppose that we want to use a web browser, an IRC client, and a SSH
|
||||||
|
client all at the same time. Let's assume that we want web traffic to
|
||||||
|
be isolated from all other traffic, even if the browser makes
|
||||||
|
connections to ports usually used for IRC or SSH. Let's also assume
|
||||||
|
that IRC and SSH are both used for relatively long-lived connections,
|
||||||
|
and we want to keep all IRC/SSH sessions separate from one another.
|
||||||
|
|
||||||
|
In this case, we could say:
|
||||||
|
|
||||||
|
SOCKSPort 9050
|
||||||
|
SOCKSPort 9051 IsolateDestAddr IsolateDestPort
|
||||||
|
|
||||||
|
We would then configure our browser to use 9050 and our IRC/SSH
|
||||||
|
clients to use 9051.
|
||||||
|
|
||||||
|
Advanced example of use, #2:
|
||||||
|
|
||||||
|
Suppose that we have a bunch of applications, and we launch them all
|
||||||
|
using torsocks, and we want to keep each applications isolated from
|
||||||
|
one another. We just create a shell script, "torlaunch":
|
||||||
|
#!/bin/bash
|
||||||
|
export TORSOCKS_USERNAME="$1"
|
||||||
|
exec torsocks $@
|
||||||
|
And we configure our SOCKSPort with IsolateSOCKSUser.
|
||||||
|
|
||||||
|
Or if we're on Linux and we want to isolate by application invocation,
|
||||||
|
we would change the TORSOCKS_USERNAME line to:
|
||||||
|
|
||||||
|
export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"
|
||||||
|
|
||||||
|
Advanced example of use, #2:
|
||||||
|
|
||||||
|
Now suppose that we want to achieve the benefits of the first example
|
||||||
|
of use, but we are stuck using transparent proxies. Let's suppose
|
||||||
|
this is Linux.
|
||||||
|
|
||||||
|
TransPort 9090
|
||||||
|
TransPort 9091 IsolateDestAddr IsolateDestPort
|
||||||
|
DNSPort 5353
|
||||||
|
AutomapHostsOnResolve 1
|
||||||
|
|
||||||
|
Here we use the iptables --cmd-owner filter to distinguish which
|
||||||
|
command is originating the packets, directing traffic from our irc
|
||||||
|
client and our SSH client to port 9091, and directing other traffic to
|
||||||
|
9090. Using AutomapHostsOnResolve will confuse ssh in its default
|
||||||
|
configuration; we'll need to find a way around that.
|
||||||
|
|
||||||
|
Security Risks:
|
||||||
|
|
||||||
|
Disabling IsolateClientAddr is a pretty bad idea.
|
||||||
|
|
||||||
|
Setting up a set of applications to use this system effectively is a
|
||||||
|
big problem. It's likely that lots of people who try to do this will
|
||||||
|
mess it up. We should try to see which setups are sensible, and see
|
||||||
|
if we can provide good feedback to explain which streams are isolated
|
||||||
|
how.
|
||||||
|
|
||||||
|
Performance Risks:
|
||||||
|
|
||||||
|
This proposal will result in clients building many more circuits than
|
||||||
|
they do today. To avoid accidentally hammering the network, we should
|
||||||
|
have in-process limits on the maximum circuit creation rate and the
|
||||||
|
total maximum client circuits.
|
||||||
|
|
||||||
Specification:
|
Specification:
|
||||||
|
|
||||||
The Tor client circuit selection process is not entirely specified. Any client
|
The Tor client circuit selection process is not entirely specified.
|
||||||
circuit specification must take these changes into account.
|
Any client circuit specification must take these changes into account.
|
||||||
|
|
||||||
Compatibility:
|
|
||||||
|
|
||||||
The proposed changes should not create any compatibility issues. New Tor clients
|
|
||||||
will be able to take advantage of this without any modification to the network.
|
|
||||||
|
|
||||||
Implementation:
|
|
||||||
|
|
||||||
It is further proposed that IsolateStreamsByPort will be enabled by default
|
|
||||||
for port 22, 53, and port 80.
|
|
||||||
|
|
||||||
It is further proposed that IsolateStreamsByHost will be disabled by default.
|
|
||||||
|
|
||||||
Implementation notes:
|
Implementation notes:
|
||||||
|
|
||||||
The implementation of this option may want to consider cases where the same
|
The more obvious ways to implement the "find a good circuit to attach
|
||||||
exit node is shared by two or more circuits and IsolateStreamsByPort is in
|
to" part of this proposal involve doing an O(n_circuits) operation
|
||||||
force. Since the purpose of the option is to reduce the opportunity of Exit
|
every time we have a stream to attach. We already do such an
|
||||||
Nodes to attack traffic from the same source on multiple ports, the
|
operation, so it's not as if we need to hunt for fancy ways to make it
|
||||||
implementation may need to ensure that circuits reserved for the exclusive use
|
O(1). What will be harder is implementing the "launch circuits as
|
||||||
of given ports do not share the same exit node.
|
needed" part of the proposal. Still, it should come down to "a simple
|
||||||
|
matter of programming."
|
||||||
|
|
||||||
Circuits should not be shared by unique clients. Tor should check to ensure
|
The SOCKS4 spec has the client provide authentication info when it
|
||||||
that peer IP addresses are identical when they connect to the SOCKS listener or
|
connects; accepting such info is no problem. But the SOCKS5 spec has
|
||||||
the TransPort listener before sharing a circuit. If the addresses are not
|
the client send a list of known auth methods, then has the server send
|
||||||
identical, Tor should ensure that the circuits are not shared.
|
back the authentication method it chooses. We'll need to update the
|
||||||
|
SOCKS5 implementation so it can accept user/password authentication if
|
||||||
|
it's offered.
|
||||||
|
|
||||||
Performance and scalability notes:
|
If we use the second syntax for describing these options, we'll want
|
||||||
|
to add a new "section-based" entry type for the configuration parser.
|
||||||
|
Not a huge deal; we already have kludged up something similar for
|
||||||
|
hidden service configurations.
|
||||||
|
|
||||||
It is further proposed that IsolateStreamsByPort will be enabled by default for
|
Opening circuits for predicted ports has the potential to get a little
|
||||||
all ports after a reasonable assessment is performed. Specifically, we should
|
more complicated; we can probably get away with the existing
|
||||||
determine the impact this option has on Tor clients and the Tor network.
|
algorithm, though, to see where its weak points are and look for
|
||||||
|
better ones.
|
||||||
|
|
||||||
|
Perhaps we can get our next-gen HTTP proxy to communicate browser tab
|
||||||
|
or session into to tor via authentication, or have torbutton do it
|
||||||
|
directly. More design is needed here, though.
|
||||||
|
|
||||||
|
Alternative designs:
|
||||||
|
|
||||||
|
The implementation of this option may want to consider cases where the
|
||||||
|
same exit node is shared by two or more circuits and
|
||||||
|
IsolateStreamsByPort is in force. Since one possible use of the option
|
||||||
|
is to reduce the opportunity of Exit Nodes to attack traffic from the
|
||||||
|
same source on multiple ports, the implementation may need to ensure
|
||||||
|
that circuits reserved for the exclusive use of given ports do not
|
||||||
|
share the same exit node. On the other hand, if our goal is only that
|
||||||
|
streams should be unlinkable, deliberately shunting them to different
|
||||||
|
exit nodes is unnecessary and slightly counterproductive.
|
||||||
|
|
||||||
|
Earlier versions of this design included a mechanism to isolate
|
||||||
|
_particular_ destination ports and addresses, so that traffic sent to,
|
||||||
|
say, port 22 would never share a port with any traffic *not* sent to
|
||||||
|
port 22. You can achieve this here by having all applications that
|
||||||
|
send traffic to one of these ports use a separate SOCKSPort, and
|
||||||
|
then setting IsolateDestPorts on that SOCKSPort.
|
||||||
|
|
||||||
|
Lingering questions:
|
||||||
|
|
||||||
|
I suspect there are issues remaining with DNS and TransPort users, and
|
||||||
|
that my "just use AutomapHostsOnResolve" suggestion may be
|
||||||
|
insufficient.
|
||||||
|
Loading…
Reference in New Issue
Block a user