Edit analysis and attacks and defenses.

svn:r710
Nick Mathewson 2003-11-02 03:58:05 +00:00
parent 0ead73a78e
commit a91c6d27bf

@ -1524,6 +1524,7 @@ In this section, we discuss how well Tor meets our stated design goals
and its resistance to attacks.
\SubSection{Meeting Basic Goals}
% None of these seem to say very much. Should this subsection be removed?
\begin{tightlist}
\item [Basic Anonymity:] Because traffic is encrypted, changing in
appearance, and can flow from anywhere to anywhere within the
@ -1532,9 +1533,8 @@ and its resistance to attacks.
the network will not be able to link the initiator and responder.
Nor is it possible to directly correlate any two communication
sessions as coming from a single source without additional
information. Resistance to specific anonymity threats will be discussed
below.
information. Resistance to more sophisticated anonymity threats is
discussed below.
\item[Deployability:] Tor requires no specialized hardware. Tor
requires no kernel modifications; it runs in user space (currently
on Linux, various BSDs, and Windows). All of these imply a low
@ -1542,17 +1542,21 @@ and its resistance to attacks.
Tor nodes have good relatively persistent net connectivity
(currently T1 or better);
% Is that reasonable to say? We haven't really discussed it -P.S.
% Roger thinks otherwise; he will fix this. -NM
however, there is no padding overhead, and operators can limit
bandwidth on any link. Tor is freely available under the modified
BSD license, and operators are able to choose there own exit
strategies. These reduce legal and social liability barriers to
BSD license, and operators are able to choose their own exit
policies, thus reducing legal and social barriers to
running a node.
\item[Usability:] As noted, Tor runs in user space. So does the onion
proxy, which is easy to install and run. And SOCKS aware
applications require nothing more than to be pointed at this proxy.
proxy, which is comparatively easy to install and run. SOCKS-aware
applications require nothing more than to be pointed at the onion
proxy; other applications can be redirected to use SOCKS for their
outgoing TCP connections by drop-in libraries such as tsocks.
\item[Flexibility:] Tor's design and implementation is modular. So,
\item[Flexibility:] Tor's design and implementation is fairly modular,
so that,
for example, a scalable P2P replacement for the directory servers
would not substantially impact other aspects of the system. Tor
runs on top of TCP, so design options that could not easily do so
@ -1562,26 +1566,28 @@ and its resistance to attacks.
two systems, which seems to be relatively straightforward. This will
allow testing and direct comparison of the two rather different
designs.
% Do we want to say this? I don't think we should talk about this
% kind of discussion till we have more positive results.
\item[Conservative design:] Tor opts for practicality when there is no
clear resolution of anonymity tradeoffs or practical means to
achieve resolution. Thus, we do not currently pad or mix; although
it would be easy to add either of these. Indeed, our system allows
longrange and variable padding if this should ever be shown to have
long-range and variable padding if this should ever be shown to have
a clear advantage. Similarly, we do not currently attempt to
resolve such issues as pseudospoofing to dominate the network except
resolve such issues as Sybil attacks to dominate the network except
by such direct means as personal familiarity of director operators
with all node operators.
\end{tightlist}
\SubSection{Attacks and Defenses}
\label{sec:attacks}
Below we summarize a variety of attacks and how well our design withstands
them.
[XXX Note that some of these attacks are outside our threat model! -NM]
\subsubsection*{Passive attacks}
\begin{tightlist}
\item \emph{Observing user traffic patterns.} Observations of connection
@ -1599,143 +1605,157 @@ them.
websites may not be. Further, a responding website may itself be
considered an adversary. Filtering content is not a primary goal of
Onion Routing; nonetheless, Tor can directly make use of Privoxy and
related services via SOCKS and thus provide their application data
stream anonymization.
related filtering services via SOCKS and thus anonymize their
application data streams.
\item \emph{Option distinguishability.} Configuration options can be a
source of distinguishable patterns. In general there is economic
incentive to allow preferential services \cite{econymics}, and some
degree of configuration choice is a factor in attracting large
numbers of users to provide anonymity. We offer a standardized set
of client option configurations to maximize attractiveness of the
system while minimizing affect on anonymity set size.
% This needs to go into the spec at least, yes? How else are we
% making this true? -PS
numbers of users to provide anonymity. So far, however, we have
not found a compelling use case in Tor for any client-configurable
options. Thus, clients are currently distinguishable only by their
behavior.
\item \emph{End-to-end Timing correlation.} Onion Routing only
minimally hides end-to-end timing correlations. If an attacker
suspects communication between a given initiator and responder, and
can watch patterns of traffic at the initiator end and the responder
end, then he will be able to confirm the correspondence with high
probability. The greatest protection currently against such
confirmation is if the connection between the onion proxy and the
first Tor node is hidden, e.g., because it is local or behind a
firewall. Except for obscuring multiple users behind one such
firewall, this just requires the observer to separate the traffic
that terminates at the onion router from that which passes through
it, and to filter the greater volume of terminating traffic than a
single initiator would multiplex. We do not expect that to be a
large problem for an attacker who can observe traffic at both ends
of an application connection.
\item \emph{End-to-end Timing correlation.} Tor only minimally hides
end-to-end timing correlations. If an attacker can watch patterns of
traffic at the initiator end and the responder end, then he will be
able to confirm the correspondence with high probability. The
greatest protection currently against such confirmation is if the
connection between the onion proxy and the first Tor node is hidden,
possibly because it is local or behind a firewall. This approach
requires an observer to separate traffic originating at the onion
router from traffic passing through it. We still do not, however,
predict this approach to be a large problem for an attacker who can
observe traffic at both ends of an application connection.
\item \emph{End-to-end Size correlation.} Simple packet counting
without timing consideration will also be somewhat effective in
confirming endpoints of a connection through Onion Routing; although
slightly less so. This is because, even without padding, the leaky
pipe topology means different numbers of packets may enter one end
of a circuit than exit at the other.
without timing consideration will also be effective in confirming
endpoints of a connection through Onion Routing; although slightly
less so. This is because, even without padding, the leaky pipe
topology means different numbers of packets may enter one end of a
circuit than exit at the other.
\item \emph{Website fingerprinting.} All the above passive
attacks that are at all effective are traffic confirmation attacks.
This puts them outside our general design goals. There is also
passive traffic analysis attack that is potentially effective.
Instead of searching far end connections for timing and volume
a passive traffic analysis attack that is potentially effective.
Instead of searching exit connections for timing and volume
correlations it is possible to build up a database of
``fingerprints'' for large numbers of websites. If one now wants to
``fingerprints'' containing file sizes and access patterns for a
large number of interesting websites. If one now wants to
monitor the activity of a user, it may be possible to confirm a
connection to a site simply by consulting the database. This has
connection to a site simply by consulting the database. This attack has
been shown to be effective against SafeWeb \cite{hintz-pet02}. Onion
Routing is not as vulnerable as SafeWeb to this attack: There is the
possibility that multiple streams are exiting the circuit at
different places concurrently. Also, fingerprinting is limited to
different places concurrently. Also, fingerprinting will be limited to
the granularity of cells, currently 256 bytes. Larger cell sizes
and/or minimal padding schemes that group websites into large sets
are possible responses. But this remains an open problem. Note that
are possible responses. But this remains an open problem. Link
padding or long-range dummies may also make fingerprints harder to
detect. (Note that
such fingerprinting should not be confused with the latency attacks
of \cite{back01}. Those require a fingerprint of the latencies of
all circuits through the network, combined with those from the
network edges to the targetted user and the responder website. While
network edges to the targeted user and the responder website. While
these are in principle feasible and surprises are always possible,
these constitute a much more complicated attack, and there is no
current evidence of their practicality.
current evidence of their practicality.)
\item Content analysis. Not our main thing, but, Privoxy to
anonymization of data stream.
\item \emph{Content analysis.} Tor explicitly provides no content
rewriting for any protocol at a higher level than TCP. When
protocol cleaners are available, however (as Privoxy is for HTTP),
Tor can integrate them in order to address these attacks.
\end{tightlist}
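As a rough illustration of the cell-granularity point in the website-fingerprinting item above, the following Python sketch shows how fixed-size cells coarsen what a size-based fingerprint can distinguish. It is not taken from the Tor implementation: the function names, the example site names, and the byte counts are hypothetical, and it assumes for simplicity that an entire 256-byte cell carries payload (real cells also carry header bytes).

```python
from typing import Dict, List

CELL_SIZE = 256  # bytes; the cell size quoted in the text above


def cells_needed(payload_bytes: int, cell_size: int = CELL_SIZE) -> int:
    """Return how many fixed-size cells are needed to carry a payload.

    Assumption: the whole cell is payload (header overhead is ignored).
    """
    if payload_bytes < 0 or cell_size <= 0:
        raise ValueError("payload must be >= 0 and cell size must be > 0")
    # Integer ceiling division, avoiding floating-point rounding.
    return -(-payload_bytes // cell_size)


def bucket_by_cells(site_sizes: Dict[str, int],
                    cell_size: int = CELL_SIZE) -> Dict[int, List[str]]:
    """Group sites by the cell count an observer would see for each one."""
    buckets: Dict[int, List[str]] = {}
    for site, size in site_sizes.items():
        buckets.setdefault(cells_needed(size, cell_size), []).append(site)
    return buckets


# Hypothetical response sizes in bytes, for illustration only.
example_sizes = {"a.example": 1000, "b.example": 1020, "c.example": 1300}
fine = bucket_by_cells(example_sizes)          # 256-byte cells
coarse = bucket_by_cells(example_sizes, 2048)  # hypothetical larger cells
```

Under these assumptions, the two sites whose responses differ by only a few bytes fall into the same bucket, and raising `cell_size` merges all three. That is the effect that larger cells or padding schemes grouping websites into large sets would aim for.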
\subsubsection*{Active attacks}
\begin{tightlist}
\item \emph{Key compromise.} Onion Routing makes use of several kinds
of keys. Links between Tor nodes are protected by TLS negotiated
session keys over which all traffic is multiplexed. Long-term
signature keys sign information about Tor nodes, directory servers
and the like. Medium-term encryption keys are used to send a
Diffie-Hellman key from an onion proxy to an onion router. And,
session keys encrypt traffic between onion routers and the onion
proxy. Session key compromise will obviate for the lifetime of the
circuit the change in appearance of cells on a circuit passing
through a specific onion router if that compromise is done by the
immediate neighboring onion routers in a circuit. Compromise of the
mid-term keys will result in a similar compromise of all session
keys until the mid-term key changes. Note that, because of perfect
forward secrecy, this does not affect previously established keys or
indeed any session keys unless the node is also compromised.
Compromise of a long-term key means that all information about a
node can be forged following the compromise. This includes what the
correct mid-term keys are, and in the case of directory servers,
information about which nodes are in the network, which keys they
are current for those nodes, etc.
\item \emph{Key compromise.} We consider the impact of a compromise
for each type of key in turn, from the shortest- to the
longest-lived. If a circuit session key is compromised, the
attacker can unwrap a single layer of encryption from the relay
cells traveling along that circuit. (Only nodes on the circuit can
see these cells.) If a TLS session key is compromised, an attacker
can view all the cells on the TLS connection until the key is
renegotiated. (These cells are themselves encrypted.) If a TLS
private key is compromised, the attacker can fool others into
thinking that he is the affected OR, but still cannot accept any
connections. If an onion private key is compromised, the attacker
can impersonate the OR in circuits, but only if the attacker has
also compromised the OR's TLS private key, or is running the
previous OR in the circuit. (This compromise affects newly created
circuits, but because of perfect forward secrecy, the attacker
cannot hijack old circuits without compromising their session keys.)
In any case, an attacker can only take advantage of a compromise in
these mid-term private keys until they expire. Only by
compromising a node's identity key can an attacker replace that
node indefinitely, by sending new forged mid-term keys to the
directories. Finally, an attacker who can compromise a
\emph{directory's} identity key can influence every client's view
of the network---but only to the degree made possible by gaining a
vote with the rest of the directory servers.
\item \emph{Iterated compromise.} A roving adversary who can
compromise ORs (by system intrusion, legal coercion, or extralegal
coercion) could march down the length of a circuit compromising the
nodes until he reaches the end. Unless the adversary can complete
this attack within the lifetime of the circuit, however, the ORs
will have discarded the necessary information before the attack can
be completed. (Thanks to the perfect forward secrecy of session
keys, the attacker cannot force nodes to decrypt recorded
traffic once the circuits have been closed.)
\item \emph{Iterated subpoena.} A roving adversary can march down the
length of a circuit compromising the nodes until he reaches both of
the endpoints. In \cite{or-pet00} the algorithmic structure of this
attack was described. But, only the unlikely case of compromise
during the lifetime of a circuit was considered. Far more likely is
that nodes in a circuit will be compromised after the fact, by legal
means, rubber-hose cryptanalysis, etc. Perfect forward secrecy of
session keys makes this attack ineffective against Tor as long as
Diffie-Hellman keys are discarded as soon as they are no longer
needed.
\item \emph{Run recipient.} By running a Web server, an adversary can
try to identify the initiator of connections to it and possibly also
attract users to itself by providing attractive content. There is
always a danger that the application protocols and associated
programs can be induced to reveal information about the initiator's
system. This is not directly in Onion Routing's protection area, so,
to the extent it is a concern, we are dependent on Privoxy and
others to keep up with the issue. A Web server can also attempt to
provide recognizable volume and timing signatures. This is simply a
stronger version of the passive confirmation adversary against which
we already acknowledged vulnerability.
\item \emph{Run a recipient.} By running a Web server, an adversary
trivially learns the timing patterns of those connecting to it, and
can introduce arbitrary patterns in its responses. This can greatly
facilitate end-to-end attacks: If the adversary can induce certain
users to connect to his webserver (perhaps by providing
content targeted at those users), he now holds one end of their
connection. Additionally, there is a danger that the application
protocols and associated programs can be induced to reveal
information about the initiator. This is not directly in Onion
Routing's protection area, so we are dependent on Privoxy and
similar protocol cleaners to solve the problem.
\item \emph{Run an onion proxy.} It is expected that end users will
nearly always run their own local onion proxy. However, in some
settings, it may be necessary for the proxy to run remotely.
Typically this would be in a secure setting where it was necessary
to monitor the activity of those connecting to the proxy. But, if
the onion proxy is compromised, then all future connections through
it are completely compromised.
settings, it may be necessary for the proxy to run
remotely---typically, in an institutional setting where it was
necessary to monitor the activity of those connecting to the proxy.
The drawback, of course, is that if the onion proxy is compromised,
then all future connections through it are completely compromised.
\item \emph{DoS non-observed nodes.} An observer who can observe some
of the Tor network can increase the value of this traffic analysis
if it can attack non-observed nodes to shut them down, reduce
their reliability, or persuade users that they are not trustworthy.
The best defense here is robustness.
\item \emph{Run a hostile node.} A hostile node can reveal everything
about circuits passing through it. It can also create circuits
through itself to affect traffic at other nodes. Its ability to
directly DoS a neighbor is now limited by bandwidth throttling. It
can enhance the amount of network traffic it can see by attacking
other nodes sufficiently to shut them down or greatly reduce their
service. Nonetheless, in terms of compromising anonymity of the
endpoints of a circuit by its observations, a hostile node is only
significant if it is immediately adjacent to that endpoint.
\item \emph{Run a hostile node.} In addition to the abilities of a
local observer, an isolated hostile node can create circuits through
itself, or alter traffic patterns, in order to affect traffic at
other nodes. Its ability to directly DoS a neighbor is now limited
by bandwidth throttling. Nonetheless, in order to compromise the
anonymity of the endpoints of a circuit by its observations, a
hostile node is only significant if it is immediately adjacent to
that endpoint.
\item \emph{Run multiple hostile nodes.} If an adversary is able to
run multiple ORs, and is able to persuade the directory servers
that those ORs are trustworthy and independent, then occasionally
some user will choose one of those ORs for the start and another of
those ORs as the end of a circuit. When this happens, the user's
anonymity is compromised for those circuits. If an adversary can
control $m$ out of $N$ nodes, he will be able to correlate at most
$\frac{m}{N}$ of the traffic in this way.
\item \emph{Compromise entire path.} Anyone compromising both
endpoints of a circuit can confirm this with high probability. If
the entire path is compromised, this becomes a certainty; however,
the added benefit to the adversary of such an attack is such that it
is most likely only as a coincidence.
the added benefit to the adversary of such an attack is small in
relation to the difficulty.
\item \emph{Run a hostile directory server.} Directory servers control
admission to the network. However, because the network directory
@ -1746,9 +1766,7 @@ them.
bandwidth limited; however, it is possible to open up sufficient
numbers of circuits that converge at a single onion router to
overwhelm its network connection, its ability to process new
circuits or both. This threat is diminished by router twins since
now the attack must be run on all twins of the attacked node to be
successful.
circuits or both.
%OK so I noticed that twins are completely removed from the paper above,
% but it's after 5 so I'll leave that problem to you guys. -PS
@ -1758,11 +1776,10 @@ them.
\item \emph{Tagging attacks.} A hostile node could try to ``tag'' a
cell by altering it. This would render it unreadable, but if the
connection is, e.g., an unencrypted one to a Web site, the garbled
content coming out at the appropriate time could confirm the
association. However, integrity checks on cells will prevent this
from succeeding.
connection is, for example, an unencrypted request to a Web site,
the garbled content coming out at the appropriate time could confirm
the association. However, integrity checks on cells prevent
this attack from succeeding.
[XXXX Damn it's 5:10. So, I'm stopping here. Good luck with what's left
tonight. Hopefully less than it looks. -PS]
@ -1827,9 +1844,6 @@ Pull attacks and defenses into analysis as a subsection
\Section{Open Questions in Low-latency Anonymity}
\label{sec:maintaining-anonymity}
% There must be a better intro than this! -NM
In addition to the open problems discussed in
section~\ref{subsec:non-goals}, many other questions remain to be