Assorted tweaks fixes, etc. to abstract et passim

svn:r3559
Paul Syverson 2005-02-04 18:32:40 +00:00
parent 6e7007bfba
commit 7240950230

@ -25,15 +25,17 @@
\begin{abstract} \begin{abstract}
We describe our experiences with deploying Tor, a low-latency anonymous We describe our experiences with deploying Tor, a low-latency
communication system that has been funded both by the U.S.~Navy anonymous general purpose communication system that has been funded
and also by the Electronic Frontier Foundation. by the U.S.~Navy, DARPA, and the Electronic Frontier Foundation. The
basic Tor design supports most applications that run over TCP (those
that are SOCKS compliant).
Because of its simplified threat model, Tor does not aim to defend %Because of its simplified threat model, Tor does not aim to defend
against many of the attacks in the literature. %against many of the attacks in the literature.
We describe both policy issues that have come up from operating the We describe both policy issues that have come up from operating the
network and technical challenges in building a more sustainable and network and technical challenges to building a more sustainable and
scalable network. scalable network.
\end{abstract} \end{abstract}
@ -73,8 +75,8 @@ who don't want to reveal information to their competitors, and law
enforcement and government intelligence agencies who need enforcement and government intelligence agencies who need
to do operations on the Internet without being noticed. to do operations on the Internet without being noticed.
Tor research and development has been funded by the U.S.~Navy, for use Tor research and development has been funded by the U.S.~Navy and DARPA
in securing government for use in securing government
communications, and also by the Electronic Frontier Foundation, for use communications, and also by the Electronic Frontier Foundation, for use
in maintaining civil liberties for ordinary citizens online. The Tor in maintaining civil liberties for ordinary citizens online. The Tor
protocol is one of the leading choices protocol is one of the leading choices
@ -298,24 +300,25 @@ dispersal and diversity. Murdoch and Danezis describe an attack
\cite{attack-tor-oak05} that lets an attacker determine the nodes used \cite{attack-tor-oak05} that lets an attacker determine the nodes used
in a circuit; yet s/he cannot identify the initiator or responder, in a circuit; yet s/he cannot identify the initiator or responder,
e.g., client or web server, through this attack. So the endpoints e.g., client or web server, through this attack. So the endpoints
remain secure, which is the goal. On the other hand we can imagine an remain secure, which is the goal. It is conceivable that an
adversary that could attack or set up observation of all connections adversary could attack or set up observation of all connections
to an arbitrary Tor node in only a few minutes. If such an adversary to an arbitrary Tor node in only a few minutes. If such an adversary
were to exist, s/he could use this probing to remotely identify a node were to exist, s/he could use this probing to remotely identify a node
for further attack. Also, the enclave model seems particularly for further attack. Of more likely immediate practical concern
threatened by this attack, since it identifies endpoints when they're an adversary with active access to the responder traffic
also nodes in the Tor network: see Section~\ref{subsec:helper-nodes}
for discussion of some ways to address this issue.
[*****Suppose an adversary with active access to the responder traffic
wants to keep a circuit alive long enough to attack an identified wants to keep a circuit alive long enough to attack an identified
node. Could s/he do this without the overt cooperation of the client node. Thus it is important to prevent the responding end of the circuit
proxy? More immediately, someone could identify nodes in this way and from keeping it open indefinitely.
if in their jurisdiction, immediately get a subpoena (if they even Also, someone could identify nodes in this way and if in their
need one) and tell the node operator(s) that she must retain all the jurisdiction, immediately get a subpoena (if they even need one)
active circuit data she now has at that moment. That \emph{can} be telling the node operator(s) that she must retain all the active
done in real time.********** We should say something about this circuit data she now has.
here or later in the paper -pfs] Further, the enclave model, which had previously looked to be the most
generally secure, seems particularly threatened by this attack, since
it identifies endpoints when they're also nodes in the Tor network:
see Section~\ref{subsec:helper-nodes} for discussion of some ways to
address this issue.
see \ref{subsec:routing-zones} for discussion of larger see \ref{subsec:routing-zones} for discussion of larger
adversaries and our dispersal goals. adversaries and our dispersal goals.
@ -605,7 +608,7 @@ possible major problem with the blocking of Tor is that it's not just
the decision of the individual server administrator whose deciding if the decision of the individual server administrator whose deciding if
he wants to post to Wikipedia from his Tor node address or allow he wants to post to Wikipedia from his Tor node address or allow
people to read Wikipedia anonymously through his Tor node. (Wikipedia people to read Wikipedia anonymously through his Tor node. (Wikipedia
has blocked all posting from all Tor nodes based in IP address.) If e.g., has blocked all posting from all Tor nodes based on IP address.) If e.g.,
s/he comes through a campus or corporate NAT, then the decision must s/he comes through a campus or corporate NAT, then the decision must
be to have the entire population behind it able to have a Tor exit be to have the entire population behind it able to have a Tor exit
node or to have write access to Wikipedia. This is a loss for both of us (Tor node or to have write access to Wikipedia. This is a loss for both of us (Tor
@ -726,8 +729,8 @@ characterize the exit policies and let clients parse them to decide
which nodes will allow which packets to exit. which nodes will allow which packets to exit.
\item \emph{The Tor-internal name spaces would need to be redesigned.} We \item \emph{The Tor-internal name spaces would need to be redesigned.} We
support hidden service {\tt{.onion}} addresses, and other special addresses support hidden service {\tt{.onion}} addresses, and other special addresses
like {\tt{.exit}} (see Section~\ref{subsec:}), by intercepting the addresses like {\tt{.exit}} (see Section~\ref{subsec:hidden-services}),
when they are passed to the Tor client. by intercepting the addresses when they are passed to the Tor client.
\end{enumerate} \end{enumerate}
This list is discouragingly long right now, but we recognize that it This list is discouragingly long right now, but we recognize that it
@ -833,7 +836,7 @@ This is likely to entail high variability and massive storage since
% Hintz stuff and the Back et al. stuff from Info Hiding 01. I've % Hintz stuff and the Back et al. stuff from Info Hiding 01. I've
% separated the two and added the references. -PFS % separated the two and added the references. -PFS
routes through the network to each site will be random even if they routes through the network to each site will be random even if they
have relatively unique latency characteristics. So the do have relatively unique latency characteristics. So this does
not seem an immediate practical threat. Further along similar lines, not seem an immediate practical threat. Further along similar lines,
the same paper suggested a ``clogging attack''. A version of this the same paper suggested a ``clogging attack''. A version of this
was demonstrated to be practical in was demonstrated to be practical in
@ -854,18 +857,31 @@ monitor the responder stream, in order of decreasing attack
effectiveness. So, another way to slow some of these attacks effectiveness. So, another way to slow some of these attacks
would be to cache responses at exit servers where possible, as it is with would be to cache responses at exit servers where possible, as it is with
DNS lookups and cacheable HTTP responses. Caching would, however, DNS lookups and cacheable HTTP responses. Caching would, however,
create threats of its own. create threats of its own. First, a Tor network is expected to contain
hostile nodes. If one of these is the repository of a cache, the
attack is still possible. Though more work to set up a Tor node and
cache repository, the payoff of such an attack is potentially
higher.
%To be %To be
%useful, such caches would need to be distributed to any likely exit %useful, such caches would need to be distributed to any likely exit
%nodes of recurred requests for the same data. %nodes of recurred requests for the same data.
% Even local caches could be useful, I think. -NM % Even local caches could be useful, I think. -NM
Aside from the logistic %
difficulties and overhead, caches would constitute a %Added some clarification -PFS
record of destinations and data visited by Tor users. While Besides allowing any other insider attacks, caching nodes would hold a
limited to network insiders, given the need for wide distribution record of destinations and data visited by Tor users reducing forward
they could serve as useful data to an attacker deciding which locations anonymity. Worse, for the cache to be widely useful much beyond the
to target for confirmation. A way to counter this distribution client that caused it there would have to either be a new mechanism to
threat might be to only cache at certain semitrusted helper nodes. distribute cache information around the network and a way for clients
to make use of it or the caches themselves would need to be
distributed widely. Either way the record of visited sites and
downloaded information is made automatically available to an attacker
without having to actively gather it himself. Besides its inherent
value, this could serve as useful data to an attacker deciding which
locations to target for confirmation. A way to counter this
distribution threat might be to only cache at certain semitrusted
helper nodes. This might help specific clients, but it would limit
the general value of caching.
%Does that cacheing discussion belong in low-latency? %Does that cacheing discussion belong in low-latency?
@ -984,7 +1000,10 @@ certain that it has not missed the first node in the circuit. Also,
the attack does not identify the order of nodes in a route, so the the attack does not identify the order of nodes in a route, so the
longer the route, the greater the uncertainty about which node might longer the route, the greater the uncertainty about which node might
be first. It may be possible to extend the attack to learn the route be first. It may be possible to extend the attack to learn the route
node order, but it is not clear that this is practically feasible. node order, but has not been shown whether this is practically feasible.
If so, the incompleteness uncertainty engendered by random lengths would
remain, but once the complete set of nodes in the route were identified
the initiating node would also be identified.
Another way to reduce the threats to both enclaves and simple Tor Another way to reduce the threats to both enclaves and simple Tor
clients is to have helper nodes. Helper nodes were introduced clients is to have helper nodes. Helper nodes were introduced
@ -993,7 +1012,8 @@ of the initiator of a communication in various anonymity protocols.
The idea is to use a single trusted node as the first one you go to, The idea is to use a single trusted node as the first one you go to,
that way an attacker cannot ever attack the first nodes you connect that way an attacker cannot ever attack the first nodes you connect
to and do some form of intersection attack. This will not affect the to and do some form of intersection attack. This will not affect the
Danezis-Murdoch attack at all. Danezis-Murdoch attack at all if the attacker can time latencies to
both the helper node and the enclave node.
We have to pick the path length so adversary can't distinguish client from We have to pick the path length so adversary can't distinguish client from
server (how many hops is good?). server (how many hops is good?).
@ -1054,6 +1074,7 @@ force their users to switch helper nodes more frequently.
%big stuff. %big stuff.
\subsection{Location-hidden services} \subsection{Location-hidden services}
\label{subsec:hidden-services}
While most of the discussions about have been about forward anonymity While most of the discussions about have been about forward anonymity
with Tor, it also provides support for \emph{rendezvous points}, which with Tor, it also provides support for \emph{rendezvous points}, which
@ -1174,9 +1195,9 @@ Scaling Tor involves three main challenges. First is safe server
discovery, both bootstrapping -- how a Tor client can robustly find an discovery, both bootstrapping -- how a Tor client can robustly find an
initial server list -- and ongoing -- how a Tor client can learn about initial server list -- and ongoing -- how a Tor client can learn about
a fair sample of honest servers and not let the adversary control his a fair sample of honest servers and not let the adversary control his
circuits (see Section x). Second is detecting and handling the speed circuits (see Section~\ref{}). Second is detecting and handling the speed
and reliability of the variety of servers we must use if we want to and reliability of the variety of servers we must use if we want to
accept many servers (see Section y). accept many servers (see Section~\ref{}).
Since the speed and reliability of a circuit is limited by its worst link, Since the speed and reliability of a circuit is limited by its worst link,
we must learn to track and predict performance. Finally, in order to get we must learn to track and predict performance. Finally, in order to get
a large set of servers in the first place, we must address incentives a large set of servers in the first place, we must address incentives