Edits, cleanups, and clarifications in 8 and 9.

svn:r761
Nick Mathewson 2003-11-05 00:12:18 +00:00
parent 5c9e0685e6
commit bfa8831c18

@ -1519,7 +1519,7 @@ by attacking non-observed nodes to shut them down, reduce
their reliability, or persuade users that they are not trustworthy. their reliability, or persuade users that they are not trustworthy.
The best defense here is robustness. The best defense here is robustness.
\emph{Run a hostile node.} In addition to the abilities of a \emph{Run a hostile node.} In addition to being a
local observer, an isolated hostile node can create circuits through local observer, an isolated hostile node can create circuits through
itself, or alter traffic patterns, to affect traffic at itself, or alter traffic patterns, to affect traffic at
other nodes. Its ability to directly DoS a neighbor is now limited other nodes. Its ability to directly DoS a neighbor is now limited
@ -1536,7 +1536,7 @@ control $m$ out of $N$ nodes, he should be able to correlate at most
$\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an $\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an
adversary adversary
could possibly attract a disproportionately large amount of traffic could possibly attract a disproportionately large amount of traffic
by running an exit node with an unusually permissive exit policy. by running an OR with an unusually permissive exit policy.
\emph{Run a hostile directory server.} Directory servers control \emph{Run a hostile directory server.} Directory servers control
admission to the network. However, because the network directory admission to the network. However, because the network directory
@ -1678,7 +1678,7 @@ by the session key shared by the client and server.
\Section{Open Questions in Low-latency Anonymity} \Section{Open Questions in Low-latency Anonymity}
\label{sec:maintaining-anonymity} \label{sec:maintaining-anonymity}
In addition to the open problems discussed in In addition to the non-goals in
Section~\ref{subsec:non-goals}, many other questions must be solved Section~\ref{subsec:non-goals}, many other questions must be solved
before we can be confident of Tor's security. before we can be confident of Tor's security.
@ -1686,25 +1686,33 @@ Many of these open issues are questions of balance. For example,
how often should users rotate to fresh circuits? Frequent rotation how often should users rotate to fresh circuits? Frequent rotation
is inefficient, expensive, and may lead to intersection attacks and is inefficient, expensive, and may lead to intersection attacks and
predecessor attacks \cite{wright03}, but infrequent rotation makes the predecessor attacks \cite{wright03}, but infrequent rotation makes the
user's traffic linkable. Along with opening a fresh circuit, clients can user's traffic linkable. Besides opening fresh circuits, clients can
also limit linkability by exiting from a middle point of the circuit, also exit from the middle of the circuit,
or by truncating and re-extending the circuit; but more analysis is or truncate and re-extend the circuit. More analysis is
needed to determine the proper tradeoff. needed to determine the proper tradeoff.
A similar question surrounds timing of directory operations: how often %% Duplicated by 'Better directory distribution' in section 9.
should directories be updated? Clients that update infrequently receive %
an inaccurate picture of the network, but frequent updates can overload %A similar question surrounds timing of directory operations: how often
the directory servers. More generally, we must find more %should directories be updated? Clients that update infrequently receive
decentralized yet practical ways to distribute up-to-date snapshots of %an inaccurate picture of the network, but frequent updates can overload
network status without introducing new attacks. %the directory servers. More generally, we must find more
%decentralized yet practical ways to distribute up-to-date snapshots of
%network status without introducing new attacks.
How should we choose path lengths? If she uses only two hops, then both How should we choose path lengths? If Alice only ever uses two hops,
these nodes are certain that by colluding they will learn about Alice then both ORs can be certain that by colluding they will learn about
and Bob. Our current approach is that Alice always chooses at least three Alice and Bob. In our current approach, Alice always chooses at least
nodes unrelated to herself and her destination. Thus normally she chooses three nodes unrelated to herself and her destination.
three nodes, but if she is running an OR and her destination is on an OR, %% This point is subtle, but not IMO necessary. Anybody who thinks
she uses five. Should Alice choose a nondeterministic path length (say, %% about it will see that it's implied by the above sentence; anybody
increasing it from a geometric distribution), to foil an attacker who %% who doesn't think about it is safe in his ignorance.
%
%Thus normally she chooses
%three nodes, but if she is running an OR and her destination is on an OR,
%she uses five.
Should Alice choose a nondeterministic path length (say,
increasing it from a geometric distribution) to foil an attacker who
uses timing to learn that he is the fifth hop and thus concludes that uses timing to learn that he is the fifth hop and thus concludes that
both Alice and the responder are on ORs? both Alice and the responder are on ORs?
@ -1716,40 +1724,46 @@ are high enough, and if users' habits are sufficiently distinct
\cite{limits-open,statistical-disclosure}. Can anything be done to \cite{limits-open,statistical-disclosure}. Can anything be done to
make low-latency systems resist these attacks as well as high-latency make low-latency systems resist these attacks as well as high-latency
systems? Tor already makes some effort to conceal the starts and ends of systems? Tor already makes some effort to conceal the starts and ends of
streams by wrapping all long-range control commands in identical-looking streams by wrapping long-range control commands in identical-looking
relay cells. Link padding could frustrate passive observers who count relay cells. Link padding could frustrate passive observers who count
packets; long-range padding could work against observers who own the packets; long-range padding could work against observers who own the
first hop in a circuit. But more research remains to find an efficient first hop in a circuit. But more research remains to find an efficient
and practical approach. Volunteers prefer not to run constant-bandwidth and practical approach. Volunteers prefer not to run constant-bandwidth
padding; but no convincing traffic shaping approach has ever been padding; but no convincing traffic shaping approach has been
specified. Recent work on long-range padding \cite{defensive-dropping} specified. Recent work on long-range padding \cite{defensive-dropping}
shows promise. One could also try to reduce correlation in packet timing shows promise. One could also try to reduce correlation in packet timing
by batching and re-ordering packets, but it is unclear whether this could by batching and re-ordering packets, but it is unclear whether this could
improve anonymity without introducing so much latency as to render the improve anonymity without introducing so much latency as to render the
network unusable. network unusable.
Common wisdom suggests that Alice should run her own onion router for best A cascade topology may better defend against traffic confirmation by a
anonymity, because traffic coming through her node could plausibly have large adversary through aggregating users, and making padding and
come from elsewhere. How much mixing do we need before this is actually mixing more affordable. Does the hydra topology (many input nodes,
effective, or is it immediately beneficial because many real-world few output nodes) work better against some adversaries? Are we going
adversaries won't be able to observe Alice's router? to get a hydra anyway because most nodes will be middleman nodes?
Common wisdom suggests that Alice should run her own OR for best
anonymity, because traffic coming from her node could plausibly have
come from elsewhere. How much mixing does this approach need? Is it
immediately beneficial because of real-world adversaries that can't
observe Alice's router, but can run routers of their own?
To scale to many users, and to prevent an attacker from observing the To scale to many users, and to prevent an attacker from observing the
whole network at once, it may be necessary for low-latency anonymity whole network at once, it may be necessary
systems to support far more servers than Tor currently anticipates. to support far more servers than Tor currently anticipates.
This introduces several issues. First, if approval by a centralized set This introduces several issues. First, if approval by a centralized set
of directory servers is no longer feasible, what mechanism should be used of directory servers is no longer feasible, what mechanism should be used
to prevent adversaries from signing up many colluding servers? Second, to prevent adversaries from signing up many colluding servers? Second,
if clients can no longer have a complete picture of the network at all if clients can no longer have a complete picture of the network at all
times, how can they perform discovery while preventing attackers from times, how can they perform discovery while preventing attackers from
manipulating or exploiting gaps in client knowledge? Third, if there manipulating or exploiting gaps in their knowledge? Third, if there
are too many servers for every server to constantly communicate with are too many servers for every server to constantly communicate with
every other, what kind of non-clique topology should the network use? every other, what kind of non-clique topology should the network use?
Restricted-route topologies promise comparable anonymity with better (Restricted-route topologies promise comparable anonymity with better
scalability \cite{danezis-pets03}, but whatever topology we choose, we scalability \cite{danezis-pets03}, but whatever topology we choose, we
need some way to keep attackers from manipulating their position within need some way to keep attackers from manipulating their position within
it \cite{casc-rep}. Fourth, since no centralized authority is tracking it \cite{casc-rep}.) Fourth, since no centralized authority is tracking
server reliability, How do we prevent unreliable servers from rendering server reliability, how do we prevent unreliable servers from rendering
the network unusable? Fifth, do clients receive so much anonymity benefit the network unusable? Fifth, do clients receive so much anonymity benefit
from running their own servers that we should expect them all to do so from running their own servers that we should expect them all to do so
\cite{econymics}, or do we need to find another incentive structure to \cite{econymics}, or do we need to find another incentive structure to
@ -1757,18 +1771,12 @@ motivate them? Tarzan and MorphMix present possible solutions.
% advogato, captcha % advogato, captcha
A cascade topology with long-range padding and mixing may provide more
defense against traffic confirmation against a large adversary, because
it aggregates many users. Does the hydra topology (many input nodes,
few output nodes) work better against some adversaries? Are we going to
get a hydra anyway because most nodes will be middleman nodes?
When a Tor node goes down, all its circuits (and thus streams) must break. When a Tor node goes down, all its circuits (and thus streams) must break.
Do users abandon the system because of this brittleness? How well Will users abandon the system because of this brittleness? How well
does the method in Section~\ref{subsec:dos} allow streams to survive does the method in Section~\ref{subsec:dos} allow streams to survive
node failure? If affected users rebuild circuits immediately, how much node failure? If affected users rebuild circuits immediately, how much
anonymity is lost? It seems the problem is even worse in a peer-to-peer anonymity is lost? It seems the problem is even worse in a peer-to-peer
environment---so far such systems don't provide an incentive for peers to environment---such systems don't yet provide an incentive for peers to
stay connected when they're done retrieving content, so we would expect stay connected when they're done retrieving content, so we would expect
a higher churn rate. a higher churn rate.
@ -1778,21 +1786,22 @@ a higher churn rate.
\label{sec:conclusion} \label{sec:conclusion}
Tor brings together many innovations into a unified deployable system. The Tor brings together many innovations into a unified deployable system. The
immediate next steps include: next immediate steps include:
\emph{Scalability:} Tor's emphasis on design simplicity and deployability \emph{Scalability:} Tor's emphasis on deployability and design simplicity
has led us to adopt a clique topology, a semi-centralized model for has led us to adopt a clique topology, semi-centralized
directories and trusts, and a full-network-visibility model for client directories, and a full-network-visibility model for client
knowledge. These properties will not scale past a few hundred servers. knowledge. These properties will not scale past a few hundred servers.
Section~\ref{sec:maintaining-anonymity} describes some promising Section~\ref{sec:maintaining-anonymity} describes some promising
approaches, but more deployment experience will be helpful in learning approaches, but more deployment experience will be helpful in learning
the relative importance of these bottlenecks. the relative importance of these bottlenecks.
\emph{Bandwidth classes:} In this paper we assume all onion routers have \emph{Bandwidth classes:} This paper assumes that all ORs have
good bandwidth and latency. We should adapt the Morphmix model, good bandwidth and latency. We should instead adopt the Morphmix model,
where nodes advertise their bandwidth level (DSL, T1, T3), and where nodes advertise their bandwidth level (DSL, T1, T3), and
Alice avoids bottlenecks in her path by choosing nodes that match or Alice avoids bottlenecks by choosing nodes that match or
exceed her bandwidth. In this way DSL users can join the Tor network. exceed her bandwidth. In this way DSL users can usefully join the Tor
network.
\emph{Incentives:} Volunteers who run nodes are rewarded with publicity \emph{Incentives:} Volunteers who run nodes are rewarded with publicity
and possibly better anonymity \cite{econymics}. More nodes means increased and possibly better anonymity \cite{econymics}. More nodes means increased
@ -1801,7 +1810,7 @@ examining the incentive structures for participating in Tor.
\emph{Cover traffic:} Currently Tor avoids cover traffic because its costs \emph{Cover traffic:} Currently Tor avoids cover traffic because its costs
in performance and bandwidth are clear, whereas its security benefits are in performance and bandwidth are clear, whereas its security benefits are
not well-understood. We must pursue more research on both link-level cover not well understood. We must pursue more research on both link-level cover
traffic and long-range cover traffic to determine some simple padding traffic and long-range cover traffic to determine some simple padding
schemes that offer provable protection against our chosen adversary. schemes that offer provable protection against our chosen adversary.
@ -1810,14 +1819,15 @@ schemes that offer provable protection against our chosen adversary.
%%size cannot be optimal for both types of traffic. %%size cannot be optimal for both types of traffic.
% This should go in the spec and todo, but not the paper yet. -RD % This should go in the spec and todo, but not the paper yet. -RD
\emph{Caching at exit nodes:} We should run a caching web proxy at each \emph{Caching at exit nodes:} Perhaps each exit node should run a
exit node, to provide anonymity for cached pages (Alice's request never caching web proxy, to improve anonymity for cached pages (Alice's request never
leaves the Tor network), to improve speed, and to reduce bandwidth cost. leaves the Tor network), to improve speed, and to reduce bandwidth cost.
%XXX and to have a layer to block to block funny stuff out of port 80. %XXX and to have a layer to block to block funny stuff out of port 80.
% is that a useful thing to say? % is that a useful thing to say?
On the other hand, forward security is weakened because routers have the % No; we already said it in the exit abuse section. - NM.
pages in their cache. We must find the right balance between usability On the other hand, forward security is weakened because caches
and security. constitute a record of retrieved files. We must find the right
balance between usability and security.
\emph{Better directory distribution:} Directory retrieval presents \emph{Better directory distribution:} Directory retrieval presents
a scaling problem, since clients currently download a description of a scaling problem, since clients currently download a description of
@ -1830,15 +1840,15 @@ Section~\ref{sec:rendezvous} has not yet been implemented. While doing
so we are likely to encounter additional issues that must be resolved, so we are likely to encounter additional issues that must be resolved,
both in terms of usability and anonymity. both in terms of usability and anonymity.
\emph{Further specification review:} Although we have a public, \emph{Further specification review:} Although we have a public
byte-level specification for the Tor protocols, this document has byte-level specification for the Tor protocols, it needs
not received extensive external review. We hope that as Tor extensive external review. We hope that as Tor
becomes more widely deployed, more people will examine its is more widely deployed, more people will examine its
specification. specification.
\emph{Multisystem interoperability:} We are currently working with the \emph{Multisystem interoperability:} We are currently working with the
designer of MorphMix to make the common elements of our two systems designer of MorphMix to unify the specification and implementation of
share a common specification and implementation. So far, this seems the common elements of our two systems. So far, this seems
to be relatively straightforward. Interoperability will allow testing to be relatively straightforward. Interoperability will allow testing
and direct comparison of the two designs for trust and scalability. and direct comparison of the two designs for trust and scalability.