Edits, cleanups, and clarifications in 8 and 9.

svn:r761
2024-11-10 21:23:58 +01:00 · 2003-11-05 00:12:18 +00:00 · 2003-11-05 00:12:18 +00:00 · bfa8831c18
commit bfa8831c18
parent 5c9e0685e6
1 changed file with 70 additions and 60 deletions
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -1519,7 +1519,7 @@ by attacking non-observed nodes to shut them down, reduce
 their reliability, or persuade users that they are not trustworthy.
 The best defense here is robustness.
-\emph{Run a hostile node.}  In addition to the abilities of a
+\emph{Run a hostile node.}  In addition to being a
 local observer, an isolated hostile node can create circuits through
 itself, or alter traffic patterns, to affect traffic at
 other nodes. Its ability to directly DoS a neighbor is now limited
@@ -1536,7 +1536,7 @@ control $m$ out of $N$ nodes, he should be able to correlate at most
 $\left(\frac{m}{N}\right)^2$ of the traffic in this way---although an 
 adversary
 could possibly attract a disproportionately large amount of traffic
-by running an exit node with an unusually permissive exit policy.
+by running an OR with an unusually permissive exit policy.
 \emph{Run a hostile directory server.} Directory servers control
 admission to the network. However, because the network directory
@@ -1678,7 +1678,7 @@ by the session key shared by the client and server.
 \Section{Open Questions in Low-latency Anonymity}
 \label{sec:maintaining-anonymity}
-In addition to the open problems discussed in
+In addition to the non-goals in
 Section~\ref{subsec:non-goals}, many other questions must be solved
 before we can be confident of Tor's security.
@@ -1686,25 +1686,33 @@ Many of these open issues are questions of balance. For example,
 how often should users rotate to fresh circuits? Frequent rotation
 is inefficient, expensive, and may lead to intersection attacks and
 predecessor attacks \cite{wright03}, but infrequent rotation makes the
-user's traffic linkable. Along with opening a fresh circuit, clients can
+user's traffic linkable. Besides opening fresh circuits, clients can
-also limit linkability by exiting from a middle point of the circuit,
+also exit from the middle of the circuit,
-or by truncating and re-extending the circuit; but more analysis is
+or truncate and re-extend the circuit. More analysis is
 needed to determine the proper tradeoff.
-A similar question surrounds timing of directory operations: how often
+%% Duplicated by 'Better directory distribution' in section 9.
-should directories be updated?  Clients that update infrequently receive
+%
-an inaccurate picture of the network, but frequent updates can overload
+%A similar question surrounds timing of directory operations: how often
-the directory servers. More generally, we must find more
+%should directories be updated?  Clients that update infrequently receive
-decentralized yet practical ways to distribute up-to-date snapshots of
+%an inaccurate picture of the network, but frequent updates can overload
-network status without introducing new attacks.
+%the directory servers. More generally, we must find more
 %decentralized yet practical ways to distribute up-to-date snapshots of
 %network status without introducing new attacks.
-How should we choose path lengths? If she uses only two hops, then both
+How should we choose path lengths? If Alice only ever uses two hops,
-these nodes are certain that by colluding they will learn about Alice
+then both ORs can be certain that by colluding they will learn about
-and Bob. Our current approach is that Alice always chooses at least three
+Alice and Bob. In our current approach, Alice always chooses at least
-nodes unrelated to herself and her destination. Thus normally she chooses
+three nodes unrelated to herself and her destination.
-three nodes, but if she is running an OR and her destination is on an OR,
+%% This point is subtle, but not IMO necessary.  Anybody who thinks
-she uses five. Should Alice choose a nondeterministic path length (say,
+%% about it will see that it's implied by the above sentence; anybody
-increasing it from a geometric distribution), to foil an attacker who
+%% who doesn't think about it is safe in his ignorance.
 %
 %Thus normally she chooses
 %three nodes, but if she is running an OR and her destination is on an OR,
 %she uses five. 
 Should Alice choose a nondeterministic path length (say,
 increasing it from a geometric distribution) to foil an attacker who
 uses timing to learn that he is the fifth hop and thus concludes that
 both Alice and the responder are on ORs?
@@ -1716,40 +1724,46 @@ are high enough, and if users' habits are sufficiently distinct
 \cite{limits-open,statistical-disclosure}. Can anything be done to
 make low-latency systems resist these attacks as well as high-latency
 systems? Tor already makes some effort to conceal the starts and ends of
-streams by wrapping all long-range control commands in identical-looking
+streams by wrapping long-range control commands in identical-looking
 relay cells. Link padding could frustrate passive observers who count
 packets; long-range padding could work against observers who own the
 first hop in a circuit. But more research remains to find an efficient
 and practical approach. Volunteers prefer not to run constant-bandwidth
-padding; but no convincing traffic shaping approach has ever been
+padding; but no convincing traffic shaping approach has been
 specified. Recent work on long-range padding \cite{defensive-dropping}
 shows promise. One could also try to reduce correlation in packet timing
 by batching and re-ordering packets, but it is unclear whether this could
 improve anonymity without introducing so much latency as to render the
 network unusable.
-Common wisdom suggests that Alice should run her own onion router for best
+A cascade topology may better defend against traffic confirmation by a
-anonymity, because traffic coming through her node could plausibly have
+large adversary through aggregating users, and making padding and
-come from elsewhere. How much mixing do we need before this is actually
+mixing more affordable.  Does the hydra topology (many input nodes,
-effective, or is it immediately beneficial because many real-world
+few output nodes) work better against some adversaries? Are we going
-adversaries won't be able to observe Alice's router?
+to get a hydra anyway because most nodes will be middleman nodes?
 Common wisdom suggests that Alice should run her own OR for best
 anonymity, because traffic coming from her node could plausibly have
 come from elsewhere. How much mixing does this approach need?  Is it
 immediately beneficial because of real-world adversaries that can't
 observe Alice's router, but can run routers of their own?
 To scale to many users, and to prevent an attacker from observing the
-whole network at once, it may be necessary for low-latency anonymity
+whole network at once, it may be necessary 
-systems to support far more servers than Tor currently anticipates.
+to support far more servers than Tor currently anticipates.
 This introduces several issues.  First, if approval by a centralized set
 of directory servers is no longer feasible, what mechanism should be used
 to prevent adversaries from signing up many colluding servers? Second,
 if clients can no longer have a complete picture of the network at all
 times, how can they perform discovery while preventing attackers from
-manipulating or exploiting gaps in client knowledge?  Third, if there
+manipulating or exploiting gaps in their knowledge?  Third, if there
 are too many servers for every server to constantly communicate with
 every other, what kind of non-clique topology should the network use?
-Restricted-route topologies promise comparable anonymity with better
+(Restricted-route topologies promise comparable anonymity with better
 scalability \cite{danezis-pets03}, but whatever topology we choose, we
 need some way to keep attackers from manipulating their position within
-it \cite{casc-rep}. Fourth, since no centralized authority is tracking
+it \cite{casc-rep}.) Fourth, since no centralized authority is tracking
-server reliability, How do we prevent unreliable servers from rendering
+server reliability, how do we prevent unreliable servers from rendering
 the network unusable?  Fifth, do clients receive so much anonymity benefit
 from running their own servers that we should expect them all to do so
 \cite{econymics}, or do we need to find another incentive structure to
@@ -1757,18 +1771,12 @@ motivate them?  Tarzan and MorphMix present possible solutions.
 % advogato, captcha
 A cascade topology with long-range padding and mixing may provide more
 defense against traffic confirmation against a large adversary, because
 it aggregates many users. Does the hydra topology (many input nodes,
 few output nodes) work better against some adversaries? Are we going to
 get a hydra anyway because most nodes will be middleman nodes?
 When a Tor node goes down, all its circuits (and thus streams) must break.
-Do users abandon the system because of this brittleness? How well
+Will users abandon the system because of this brittleness? How well
 does the method in Section~\ref{subsec:dos} allow streams to survive
 node failure? If affected users rebuild circuits immediately, how much
 anonymity is lost? It seems the problem is even worse in a peer-to-peer
-environment---so far such systems don't provide an incentive for peers to
+environment---such systems don't yet provide an incentive for peers to
 stay connected when they're done retrieving content, so we would expect
 a higher churn rate.
@@ -1778,21 +1786,22 @@ a higher churn rate.
 \label{sec:conclusion}
 Tor brings together many innovations into a unified deployable system. The
-immediate next steps include:
+next immediate steps include:
-\emph{Scalability:} Tor's emphasis on design simplicity and deployability
+\emph{Scalability:} Tor's emphasis on deployability and design simplicity
-has led us to adopt a clique topology, a semi-centralized model for
+has led us to adopt a clique topology, semi-centralized 
-directories and trusts, and a full-network-visibility model for client
+directories, and a full-network-visibility model for client
 knowledge. These properties will not scale past a few hundred servers.
 Section~\ref{sec:maintaining-anonymity} describes some promising
 approaches, but more deployment experience will be helpful in learning
 the relative importance of these bottlenecks.
-\emph{Bandwidth classes:} In this paper we assume all onion routers have
+\emph{Bandwidth classes:} This paper assumes that all ORs have
-good bandwidth and latency. We should adapt the Morphmix model,
+good bandwidth and latency. We should instead adopt the MorphMix model,
 where nodes advertise their bandwidth level (DSL, T1, T3), and
-Alice avoids bottlenecks in her path by choosing nodes that match or
+Alice avoids bottlenecks by choosing nodes that match or
-exceed her bandwidth. In this way DSL users can join the Tor network.
+exceed her bandwidth. In this way DSL users can usefully join the Tor
 network.
 \emph{Incentives:} Volunteers who run nodes are rewarded with publicity
 and possibly better anonymity \cite{econymics}. More nodes means increased
@@ -1801,7 +1810,7 @@ examining the incentive structures for participating in Tor.
 \emph{Cover traffic:} Currently Tor avoids cover traffic because its costs
 in performance and bandwidth are clear, whereas its security benefits are
-not well-understood. We must pursue more research on both link-level cover
+not well understood. We must pursue more research on both link-level cover
 traffic and long-range cover traffic to determine some simple padding
 schemes that offer provable protection against our chosen adversary.
@@ -1810,14 +1819,15 @@ schemes that offer provable protection against our chosen adversary.
 %%size cannot be optimal for both types of traffic.
 % This should go in the spec and todo, but not the paper yet. -RD
-\emph{Caching at exit nodes:} We should run a caching web proxy at each
+\emph{Caching at exit nodes:} Perhaps each exit node should run a
-exit node, to provide anonymity for cached pages (Alice's request never
+caching web proxy, to improve anonymity for cached pages (Alice's request never
 leaves the Tor network), to improve speed, and to reduce bandwidth cost.
 %XXX and to have a layer to block funny stuff out of port 80.
 % is that a useful thing to say?
-On the other hand, forward security is weakened because routers have the
+%     No; we already said it in the exit abuse section. - NM.
-pages in their cache. We must find the right balance between usability
+On the other hand, forward security is weakened because caches
-and security.
+constitute a record of retrieved files.  We must find the right
 balance between usability and security.
 \emph{Better directory distribution:} Directory retrieval presents
 a scaling problem, since clients currently download a description of
@@ -1830,15 +1840,15 @@ Section~\ref{sec:rendezvous} has not yet been implemented.  While doing
 so we are likely to encounter additional issues that must be resolved,
 both in terms of usability and anonymity.
-\emph{Further specification review:} Although we have a public,
+\emph{Further specification review:} Although we have a public
-byte-level specification for the Tor protocols, this document has
+byte-level specification for the Tor protocols, it needs
-not received extensive external review.  We hope that as Tor
+extensive external review.  We hope that as Tor
-becomes more widely deployed, more people will examine its
+is more widely deployed, more people will examine its
 specification.
 \emph{Multisystem interoperability:} We are currently working with the
-designer of MorphMix to make the common elements of our two systems
+designer of MorphMix to unify the specification and implementation of
-share a common specification and implementation. So far, this seems
+the common elements of our two systems. So far, this seems
 to be relatively straightforward.  Interoperability will allow testing
 and direct comparison of the two designs for trust and scalability.
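
Note on the $\left(\frac{m}{N}\right)^2$ bound in the hunk at line 1536: the short sketch below is not part of the commit. It illustrates why "at most" is the right wording under two assumptions that the paper text does not state: the first and last hops are distinct nodes, and each is drawn uniformly at random. The function names are hypothetical. The uniform assumption also ignores the caveat in the same paragraph that a permissive exit policy can attract a disproportionate share of traffic.

```python
from fractions import Fraction


def both_ends_hostile(m: int, n: int) -> Fraction:
    """Chance that the first and last hop are both hostile.

    Assumption (not from the paper): the two hops are distinct nodes
    drawn uniformly at random from n nodes, m of which are hostile.
    """
    if n < 2 or not 0 <= m <= n:
        raise ValueError("need n >= 2 and 0 <= m <= n")
    # First hop hostile: m/n. Last hop hostile given that: (m-1)/(n-1).
    return Fraction(m, n) * Fraction(m - 1, n - 1)


def paper_bound(m: int, n: int) -> Fraction:
    """The (m/N)^2 upper bound quoted in the text."""
    return Fraction(m, n) ** 2


if __name__ == "__main__":
    m, n = 10, 100
    print("exact:", both_ends_hostile(m, n))
    print("bound:", paper_bound(m, n))
```

Under these assumptions the exact value never exceeds the bound, because $(m-1)/(N-1) \le m/N$ whenever $m \le N$.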