Tweaks and typos throughout. Nearly there.

svn:r3586
Paul Syverson 2005-02-08 20:34:57 +00:00
parent 4518e7e642
commit 1d569eb492


@ -6,11 +6,11 @@
\usepackage{amsmath} \usepackage{amsmath}
\usepackage{epsfig} \usepackage{epsfig}
\setlength{\textwidth}{6in} \setlength{\textwidth}{6.1in}
\setlength{\textheight}{8in} \setlength{\textheight}{8.5in}
\setlength{\topmargin}{.5in} \setlength{\topmargin}{1cm}
\setlength{\oddsidemargin}{1cm} \setlength{\oddsidemargin}{.5cm}
\setlength{\evensidemargin}{1cm} \setlength{\evensidemargin}{.5cm}
\newenvironment{tightlist}{\begin{list}{$\bullet$}{ \newenvironment{tightlist}{\begin{list}{$\bullet$}{
\setlength{\itemsep}{0mm} \setlength{\itemsep}{0mm}
@ -28,7 +28,7 @@
Nick Mathewson\inst{1} \and Nick Mathewson\inst{1} \and
Paul Syverson\inst{2}} Paul Syverson\inst{2}}
\institute{The Free Haven Project \email{<\{arma,nickm\}@freehaven.net>} \and \institute{The Free Haven Project \email{<\{arma,nickm\}@freehaven.net>} \and
Naval Research Lab \email{<syverson@itd.nrl.navy.mil>}} Naval Research Laboratory \email{<syverson@itd.nrl.navy.mil>}}
\maketitle \maketitle
\pagestyle{plain} \pagestyle{plain}
@ -77,14 +77,15 @@ made it possible for Tor to serve many thousands of users and attract
funding from diverse sources whose goals range from security on a funding from diverse sources whose goals range from security on a
national scale down to the liberties of each individual. national scale down to the liberties of each individual.
While the Tor design paper~\cite{tor-design} gives an overall view of Tor's While~\cite{tor-design} gives an overall view of Tor's
design and goals, this paper describes some policy, social, and technical design and goals, this paper describes policy, social, and technical
issues that we face as we continue deployment. issues that we face as we continue deployment.
Rather than trying to provide complete solutions to every problem here, we Rather than trying to provide complete solutions to every problem here, we
lay out the assumptions and constraints that we have observed while lay out the assumptions and constraints that we have observed while
deploying Tor in the wild. In doing so, we aim to create a research agenda deploying Tor in the wild. In doing so, we aim to create a research agenda
for others to help in addressing these issues. We believe that the issues for others to help in addressing these issues. We believe that the issues
described here will be of general interest to projects attempting to build described here will be of general interest to any and all
projects attempting to build
and deploy practical, useable anonymity networks in the wild. and deploy practical, useable anonymity networks in the wild.
%While the Tor design paper~\cite{tor-design} gives an overall view its %While the Tor design paper~\cite{tor-design} gives an overall view its
@ -132,7 +133,7 @@ Tor nodes on the network. The circuit is extended one hop at a time, and
each node along the way knows only which node gave it data and which each node along the way knows only which node gave it data and which
node it is giving data to. No individual Tor node ever knows the complete node it is giving data to. No individual Tor node ever knows the complete
path that a data packet has taken. The client negotiates a separate set path that a data packet has taken. The client negotiates a separate set
of encryption keys for each hop along the circuit.% to ensure that each of encryption keys for each hop along the circuit. % to ensure that each
%hop can't trace these connections as they pass through. %hop can't trace these connections as they pass through.
Because each node sees no more than one hop in the Because each node sees no more than one hop in the
circuit, neither an eavesdropper nor a compromised node can use traffic circuit, neither an eavesdropper nor a compromised node can use traffic
@ -140,7 +141,7 @@ analysis to link the connection's source and destination.
For efficiency, the Tor software uses the same circuit for all the TCP For efficiency, the Tor software uses the same circuit for all the TCP
connections that happen within the same short period. connections that happen within the same short period.
Later requests use a new Later requests use a new
circuit, to prevent long-term linkability between different actions by circuit, to complicate long-term linkability between different actions by
a single user. a single user.
Tor also makes it possible for users to hide their locations while Tor also makes it possible for users to hide their locations while
@ -152,25 +153,25 @@ identity.
Tor attempts to anonymize the transport layer, not the application layer, so Tor attempts to anonymize the transport layer, not the application layer, so
application protocols that include personally identifying information need application protocols that include personally identifying information need
additional application-level scrubbing proxies, such as additional application-level scrubbing proxies, such as
Privoxy~\cite{privoxy} for HTTP. Furthermore, Tor does not permit arbitrary Privoxy~\cite{privoxy} for HTTP\@. Furthermore, Tor does not permit arbitrary
IP packets; it only anonymizes TCP streams and DNS requests, and only supports IP packets; it only anonymizes TCP streams and DNS requests, and only supports
connections via SOCKS (see Section~\ref{subsec:tcp-vs-ip}). connections via SOCKS (see Section~\ref{subsec:tcp-vs-ip}).
Most node operators do not want to allow arbitrary TCP connections to leave Most node operators do not want to allow arbitrary TCP connections to leave
their server. To address this, Tor provides \emph{exit policies} so that their server. To address this, Tor provides \emph{exit policies} so that
each exit node can block the IP addresses and ports it is unwilling to allow. each exit node can block the IP addresses and ports it is unwilling to allow.
TRs advertise their exit policies to the directory servers, so that Tor nodes advertise their exit policies to the directory servers, so that
clients can tell which nodes will support their connections. clients can tell which nodes will support their connections.
As of January 2005, the Tor network has grown to around a hundred nodes As of January 2005, the Tor network has grown to around a hundred nodes
on four continents, with a total capacity exceeding 1Gbit/s. Appendix A on four continents, with a total capacity exceeding 1Gbit/s. Appendix A
shows a graph of the number of working nodes over time, as well as a shows a graph of the number of working nodes over time, as well as a
vgraph of the number of bytes being handled by the network over time. At graph of the number of bytes being handled by the network over time. At
this point the network is sufficiently diverse for further development this point the network is sufficiently diverse for further development
and testing; but of course we always encourage and welcome new nodes and testing; but of course we always encourage and welcome new nodes
to join the network. to join the network.
Tor research and development has been funded by the U.S.~Navy and DARPA Tor research and development has been funded by ONR and DARPA
for use in securing government for use in securing government
communications, and by the Electronic Frontier Foundation, for use communications, and by the Electronic Frontier Foundation, for use
in maintaining civil liberties for ordinary citizens online. The Tor in maintaining civil liberties for ordinary citizens online. The Tor
@ -257,8 +258,8 @@ that an outside attacker can trace a stream through the Tor network
while a stream is still active simply by observing the latency of his while a stream is still active simply by observing the latency of his
own traffic sent through various Tor nodes. These attacks do not show own traffic sent through various Tor nodes. These attacks do not show
the client address, only the first node within the Tor network, making the client address, only the first node within the Tor network, making
helper nodes all the more worthy of exploration (cf., helper nodes all the more worthy of exploration. (See
Section~\ref{subsec:helper-nodes}). Section~\ref{subsec:helper-nodes}.)
Against internal attackers who sign up Tor nodes, the situation is more Against internal attackers who sign up Tor nodes, the situation is more
complicated. In the simplest case, if an adversary has compromised $c$ of complicated. In the simplest case, if an adversary has compromised $c$ of
@ -277,8 +278,8 @@ complicating factors:
(3)~Users do not in fact choose nodes with uniform probability; they (3)~Users do not in fact choose nodes with uniform probability; they
favor nodes with high bandwidth or uptime, and exit nodes that favor nodes with high bandwidth or uptime, and exit nodes that
permit connections to their favorite services. permit connections to their favorite services.
See Section~\ref{subsec:routing-zones} for discussion of larger (See Section~\ref{subsec:routing-zones} for discussion of how larger
adversaries and our dispersal goals. adversaries affect our dispersal goals.)
%\begin{tightlist} %\begin{tightlist}
%\item If the user continues to build random circuits over time, an adversary %\item If the user continues to build random circuits over time, an adversary
@ -360,10 +361,10 @@ and operations of that agency would be easier, not harder, to distinguish.
Instead, to protect our networks from traffic analysis, we must Instead, to protect our networks from traffic analysis, we must
collaboratively blend the traffic from many organizations and private collaboratively blend the traffic from many organizations and private
citizens, so that an eavesdropper can't tell which users are which, citizens, so that an eavesdropper can't tell which users are which,
and who is looking for what information. By bringing more users onto and who is looking for what information. %By bringing more users onto
the network, all users become more secure~\cite{econymics}. %the network, all users become more secure~\cite{econymics}.
[XXX I feel uncomfortable saying this last sentence now. -RD] %[XXX I feel uncomfortable saying this last sentence now. -RD]
%[So, I took it out. I think we can do without it. -PFS]
Naturally, organizations will not want to depend on others for their Naturally, organizations will not want to depend on others for their
security. If most participating providers are reliable, Tor tolerates security. If most participating providers are reliable, Tor tolerates
some hostile infiltration of the network. For maximum protection, some hostile infiltration of the network. For maximum protection,
@ -430,13 +431,12 @@ system design and technology development. In particular, the
Tor project's \emph{image} with respect to its users and the rest of Tor project's \emph{image} with respect to its users and the rest of
the Internet impacts the security it can provide. the Internet impacts the security it can provide.
% No image, no sustainability -NM % No image, no sustainability -NM
With this image issue in mind, this section discusses the Tor user base and With this image issue in mind, this section discusses the Tor user base and
Tor's interaction with other services on the Internet. Tor's interaction with other services on the Internet.
\subsection{Communicating security} \subsection{Communicating security}
A growing field of papers argue that usability for anonymity systems Usability for anonymity systems
contributes directly to their security, because how usable the system contributes directly to their security, because how usable the system
is impacts the possible anonymity set~\cite{econymics,back01}. Or is impacts the possible anonymity set~\cite{econymics,back01}. Or
conversely, an unusable system attracts few users and thus can't provide conversely, an unusable system attracts few users and thus can't provide
@ -481,13 +481,15 @@ Like Tor, the current JAP implementation does not pad connections
JAP's cascade-based network topology may be even more vulnerable to these JAP's cascade-based network topology may be even more vulnerable to these
attacks, because the network has fewer edges. JAP was born out of attacks, because the network has fewer edges. JAP was born out of
the ISDN mix design~\cite{isdn-mixes}, where padding made sense because the ISDN mix design~\cite{isdn-mixes}, where padding made sense because
every user had a fixed bandwidth allocation, but in its current context every user had a fixed bandwidth allocation and altering the timing
pattern of packets could be immediately detected, but in its current context
as a general Internet web anonymizer, adding sufficient padding to JAP as a general Internet web anonymizer, adding sufficient padding to JAP
would be prohibitively expensive.\footnote{Even if JAP could would be prohibitively expensive and probably ineffective against a
minimally active attacker.\footnote{Even if JAP could
fund higher-capacity nodes indefinitely, our experience fund higher-capacity nodes indefinitely, our experience
suggests that many users would not accept the increased per-user suggests that many users would not accept the increased per-user
bandwidth requirements, leading to an overall much smaller user base. But bandwidth requirements, leading to an overall much smaller user base. But
cf.\ Section \ref{subsec:mid-latency}.} Therefore, since under this threat cf.\ Section~\ref{subsec:mid-latency}.} Therefore, since under this threat
model the number of concurrent users does not seem to have much impact model the number of concurrent users does not seem to have much impact
on the anonymity provided, we suggest that JAP's anonymity meter is not on the anonymity provided, we suggest that JAP's anonymity meter is not
accurately communicating security levels to its users. accurately communicating security levels to its users.
@ -611,9 +613,9 @@ wants to provide high bandwidth, but no more than a certain amount in a
given billing cycle, to become dormant once its bandwidth is exhausted, and given billing cycle, to become dormant once its bandwidth is exhausted, and
to reawaken at a random offset into the next billing cycle. This feature has to reawaken at a random offset into the next billing cycle. This feature has
interesting policy implications, however; see interesting policy implications, however; see
Section~\ref{subsec:bandwidth-and-file-sharing} below. the next section below.
Exit policies help to limit administrative costs by limiting the frequency of Exit policies help to limit administrative costs by limiting the frequency of
abuse complaints. abuse complaints. (See Section~\ref{subsec:tor-and-blacklists}.)
%[XXXX say more. Why else would you run a node? What else can we do/do we %[XXXX say more. Why else would you run a node? What else can we do/do we
% already do to make running a node more attractive?] % already do to make running a node more attractive?]
@ -696,6 +698,7 @@ file-sharing protocols that have separate control and data channels.
%your computer is doing that behavior. %your computer is doing that behavior.
\subsection{Tor and blacklists} \subsection{Tor and blacklists}
\label{subsec:tor-and-blacklists}
It was long expected that, alongside Tor's legitimate users, it would also It was long expected that, alongside Tor's legitimate users, it would also
attract troublemakers who exploited Tor in order to abuse services on the attract troublemakers who exploited Tor in order to abuse services on the
@ -730,7 +733,7 @@ and Wikipedia. We don't want to compete for (or divvy up) the NAT
protected entities of the world. protected entities of the world.
Worse, many IP blacklists are not terribly fine-grained. Worse, many IP blacklists are not terribly fine-grained.
No current IP blacklist, for example, allow a service provider to blacklist No current IP blacklist, for example, allows a service provider to blacklist
only those Tor nodes that allow access to a specific IP or port, even only those Tor nodes that allow access to a specific IP or port, even
though this information is readily available. One IP blacklist even bans though this information is readily available. One IP blacklist even bans
every class C network that contains a Tor node, and recommends banning SMTP every class C network that contains a Tor node, and recommends banning SMTP
@ -758,7 +761,7 @@ tolerably well for them in practice.
But of course, we would prefer that legitimate anonymous users be able to But of course, we would prefer that legitimate anonymous users be able to
access abuse-prone services. One conceivable approach would be to require access abuse-prone services. One conceivable approach would be to require
would-be IRC users, for instance, to register accounts if they wanted to would-be IRC users, for instance, to register accounts if they wanted to
access the IRC network from Tor. But in practice, this would not access the IRC network from Tor. In practice this would not
significantly impede abuse if creating new accounts were easily automatable; significantly impede abuse if creating new accounts were easily automatable;
this is why services use IP blocking. In order to deter abuse, pseudonymous this is why services use IP blocking. In order to deter abuse, pseudonymous
identities need to require a significant switching cost in resources or human identities need to require a significant switching cost in resources or human
@ -908,14 +911,21 @@ cable-modem nodes and more nodes in distant continents. Perhaps we can
harness this increased latency to improve anonymity rather than just harness this increased latency to improve anonymity rather than just
reduce usability. Further, if we let clients label certain circuits as reduce usability. Further, if we let clients label certain circuits as
mid-latency as they are constructed, we could handle both types of traffic mid-latency as they are constructed, we could handle both types of traffic
on the same network, giving users a choice between speed and security. on the same network, giving users a choice between speed and security---and
giving researchers a chance to experiment with parameters to improve the
quality of those choices.
\subsection{Enclaves and helper nodes} \subsection{Enclaves and helper nodes}
\label{subsec:helper-nodes} \label{subsec:helper-nodes}
It has long been thought that the best anonymity comes from running your It has long been thought that the best anonymity comes from running your
own node~\cite{tor-design,or-pet00}. This is called using Tor in an own node~\cite{tor-design,or-ih96,or-pet00}. This is called using Tor in an
\emph{enclave} configuration. Of course, Tor's default path length of \emph{enclave} configuration. By running Tor clients only on Tor nodes
at the enclave perimeter, enclave configuration can also permit anonymity
protection even when policy or other requirements prevent individual machines
within the enclave from running Tor clients~\cite{or-jsac98,or-discex00}.
Of course, Tor's default path length of
three is insufficient for these enclaves, since the entry and/or exit three is insufficient for these enclaves, since the entry and/or exit
themselves are sensitive. Tor thus increments the path length by one themselves are sensitive. Tor thus increments the path length by one
for each sensitive endpoint in the circuit. for each sensitive endpoint in the circuit.
@ -1034,14 +1044,14 @@ distributed trust to spread each transaction over multiple jurisdictions.
But how do we decide whether two nodes are in related locations? But how do we decide whether two nodes are in related locations?
Feamster and Dingledine defined a \emph{location diversity} metric Feamster and Dingledine defined a \emph{location diversity} metric
in \cite{feamster:wpes2004}, and began investigating a variant of location in~\cite{feamster:wpes2004}, and began investigating a variant of location
diversity based on the fact that the Internet is divided into thousands of diversity based on the fact that the Internet is divided into thousands of
independently operated networks called {\em autonomous systems} (ASes). independently operated networks called {\em autonomous systems} (ASes).
The key insight from their paper is that while we typically think of a The key insight from their paper is that while we typically think of a
connection as going directly from the Tor client to her first Tor node, connection as going directly from the Tor client to the first Tor node,
actually it traverses many different ASes on each hop. An adversary at actually it traverses many different ASes on each hop. An adversary at
any of these ASes can monitor or influence traffic. Specifically, given any of these ASes can monitor or influence traffic. Specifically, given
plausible initiators and recipients and path random path selection, plausible initiators and recipients, and given random path selection,
some ASes in the simulation were able to observe 10\% to 30\% of the some ASes in the simulation were able to observe 10\% to 30\% of the
transactions (that is, learn both the origin and the destination) on transactions (that is, learn both the origin and the destination) on
the deployed Tor network (33 nodes as of June 2004). the deployed Tor network (33 nodes as of June 2004).
@ -1049,10 +1059,10 @@ the deployed Tor network (33 nodes as of June 2004).
The paper concludes that for best protection against the AS-level The paper concludes that for best protection against the AS-level
adversary, nodes should be in ASes that have the most links to other ASes: adversary, nodes should be in ASes that have the most links to other ASes:
Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction
is safest when it starts or ends in a Tier-1 ISP. Therefore, assuming is safest when it starts or ends in a Tier-1 ISP\@. Therefore, assuming
initiator and responder are both in the U.S., it actually \emph{hurts} initiator and responder are both in the U.S., it actually \emph{hurts}
our location diversity to add far-flung nodes in continents like Asia our location diversity to enter or exit from far-flung nodes in
or South America. continents like Asia or South America.
Many open questions remain. First, it will be an immense engineering Many open questions remain. First, it will be an immense engineering
challenge to get an entire BGP routing table to each Tor client, or to challenge to get an entire BGP routing table to each Tor client, or to
@ -1071,7 +1081,8 @@ network at all. What about taking advantage of caches like Akamai or
Google~\cite{shsm03}? (Note that they're also well-positioned as global Google~\cite{shsm03}? (Note that they're also well-positioned as global
adversaries.) adversaries.)
% %
Third, if we follow the paper's recommendations and tailor path selection Third, if we follow the recommendations in~\cite{feamster:wpes2004}
and tailor path selection
to avoid choosing endpoints in similar locations, how much are we hurting to avoid choosing endpoints in similar locations, how much are we hurting
anonymity against larger real-world adversaries who can take advantage anonymity against larger real-world adversaries who can take advantage
of knowing our algorithm? of knowing our algorithm?
@ -1150,7 +1161,7 @@ accept many nodes (see Section~\ref{subsec:performance}).
Since the speed and reliability of a circuit is limited by its worst link, Since the speed and reliability of a circuit is limited by its worst link,
we must learn to track and predict performance. Finally, in order to get we must learn to track and predict performance. Finally, in order to get
a large set of nodes in the first place, we must address incentives a large set of nodes in the first place, we must address incentives
for users to carry traffic for others (see Section incentives). for users to carry traffic for others.
\subsection{Incentives by Design} \subsection{Incentives by Design}
@ -1168,10 +1179,9 @@ seti@home. We further explain to users that they can get plausible
deniability for any traffic emerging from the same address as a Tor deniability for any traffic emerging from the same address as a Tor
exit node, and they can use their own Tor node exit node, and they can use their own Tor node
as entry or exit point and be confident it's not run by the adversary. as entry or exit point and be confident it's not run by the adversary.
Further, users who need to be able to communicate anonymously Further, users may run a node simply because they need such a network
may run a node simply because their need to increase to be persistently available and usable.
expectation that such a network continues to be available to them And, the value of supporting this exceeds any countervailing costs.
and usable exceeds any countervailing costs.
Finally, we can improve the usability and feature set of the software: Finally, we can improve the usability and feature set of the software:
rate limiting support and easy packaging decrease the hassle of rate limiting support and easy packaging decrease the hassle of
maintaining a node, and our configurable exit policies allow each maintaining a node, and our configurable exit policies allow each
@ -1197,8 +1207,8 @@ fairness of provided anonymity. An adversary can attract more traffic
by performing well or can provide targeted differential performance to by performing well or can provide targeted differential performance to
individual users to undermine their anonymity. Typically a user who individual users to undermine their anonymity. Typically a user who
chooses evenly from all options is most resistant to an adversary chooses evenly from all options is most resistant to an adversary
targeting him, but that approach prevents from handling heterogeneous targeting him, but that approach precludes the efficient use
nodes. of heterogeneous nodes.
%When a node (call him Steve) performs well for Alice, does Steve gain %When a node (call him Steve) performs well for Alice, does Steve gain
%reputation with the entire system, or just with Alice? If the entire %reputation with the entire system, or just with Alice? If the entire
@ -1236,14 +1246,15 @@ further study.
The published Tor design adopted a deliberately simplistic design for The published Tor design adopted a deliberately simplistic design for
authorizing new nodes and informing clients about Tor nodes and their status. authorizing new nodes and informing clients about Tor nodes and their status.
In the early Tor designs, all nodes periodically uploaded a signed description In preliminary Tor designs, all nodes periodically uploaded a
signed description
of their locations, keys, and capabilities to each of several well-known {\it of their locations, keys, and capabilities to each of several well-known {\it
directory servers}. These directory servers constructed a signed summary directory servers}. These directory servers constructed a signed summary
of all known Tor nodes (a ``directory''), and a signed statement of which of all known Tor nodes (a ``directory''), and a signed statement of which
nodes they nodes they
believed to be operational at any given time (a ``network status''). Clients believed to be operational at any given time (a ``network status''). Clients
periodically downloaded a directory in order to learn the latest nodes and periodically downloaded a directory in order to learn the latest nodes and
keys, and more frequently downloaded a network status to learn which nodes are keys, and more frequently downloaded a network status to learn which nodes were
likely to be running. Tor nodes also operate as directory caches, in order to likely to be running. Tor nodes also operate as directory caches, in order to
lighten the bandwidth on the authoritative directory servers. lighten the bandwidth on the authoritative directory servers.
@ -1258,7 +1269,7 @@ directory administrators performed little actual verification, and tended to
approve any Tor node whose operator could compose a coherent email. approve any Tor node whose operator could compose a coherent email.
This procedure This procedure
may have prevented trivial automated Sybil attacks, but would do little may have prevented trivial automated Sybil attacks, but would do little
against a clever attacker. against a clever and determined attacker.
There are a number of flaws in this system that need to be addressed as we There are a number of flaws in this system that need to be addressed as we
move forward. They include: move forward. They include:
@ -1283,7 +1294,7 @@ network capacity in order to support more users, we could simply
adopt even stricter validation requirements, and reduce the number of adopt even stricter validation requirements, and reduce the number of
nodes in the network to a trusted minimum. nodes in the network to a trusted minimum.
But, we can only do that if we can simultaneously make node capacity But, we can only do that if we can simultaneously make node capacity
scale much more than we anticipate feasible soon, and if we can find scale much more than we anticipate to be feasible soon, and if we can find
entities willing to run such nodes, an equally daunting prospect. entities willing to run such nodes, an equally daunting prospect.
@ -1355,7 +1366,8 @@ reveal the path taken by large traffic flows under low-usage circumstances.
\subsection{Non-clique topologies} \subsection{Non-clique topologies}
Tor's comparatively weak model makes it easier to scale than other mix net Tor's comparatively weak threat model makes it easier to scale than
other mix net
designs. High-latency mix networks need to avoid partitioning attacks, where designs. High-latency mix networks need to avoid partitioning attacks, where
network splits prevent users of the separate partitions from providing cover network splits prevent users of the separate partitions from providing cover
for each other. In Tor, however, we assume that the adversary cannot for each other. In Tor, however, we assume that the adversary cannot
@ -1381,7 +1393,7 @@ scaling include restricting the number of sockets and the amount of bandwidth
used by each node. The number of sockets is determined by the network's used by each node. The number of sockets is determined by the network's
connectivity and the number of users, while bandwidth capacity is determined connectivity and the number of users, while bandwidth capacity is determined
by the total bandwidth of nodes on the network. The simplest solution to by the total bandwidth of nodes on the network. The simplest solution to
bandwidth capacity is to add more nodes, since adding a tor node of any bandwidth capacity is to add more nodes, since adding a Tor node of any
feasible bandwidth will increase the traffic capacity of the network. So as feasible bandwidth will increase the traffic capacity of the network. So as
a first step to scaling, we should focus on making the network tolerate more a first step to scaling, we should focus on making the network tolerate more
nodes, by reducing the interconnectivity of the nodes; later we can reduce nodes, by reducing the interconnectivity of the nodes; later we can reduce
@ -1403,7 +1415,7 @@ a sparse network.
To make matters simpler, Tor may not need an expander graph per se: it To make matters simpler, Tor may not need an expander graph per se: it
may be enough to have a single subnet that is highly connected. As an may be enough to have a single subnet that is highly connected. As an
example, assume fifty nodes of relatively high traffic capacity. This example, assume fifty nodes of relatively high traffic capacity. This
\emph{center} forms are a clique. Assume each center node can each \emph{center} forms a clique. Assume each center node can
handle 200 connections to other nodes (including the other ones in the handle 200 connections to other nodes (including the other ones in the
center). Assume every noncenter node connects to three nodes in the center). Assume every noncenter node connects to three nodes in the
center and anyone out of the center that they want to. Then the center and anyone out of the center that they want to. Then the
@ -1413,16 +1425,16 @@ is distributed (presumably information about the center nodes could
be given to any new nodes with their codebase), whether center nodes be given to any new nodes with their codebase), whether center nodes
will need to function as a `backbone', etc. As above the point is will need to function as a `backbone', etc. As above the point is
that this would create problems for the expected anonymity for a mixnet, that this would create problems for the expected anonymity for a mixnet,
but for an onion routing network where anonymity derives largely from but for a low-latency network where anonymity derives largely from
the edges, it may be feasible. the edges, it may be feasible.
Another point is that we already have a non-clique topology. Another point is that we already have a non-clique topology.
Individuals can set up and run Tor nodes without informing the Individuals can set up and run Tor nodes without informing the
directory servers. This will allow, e.g., dissident groups to run a directory servers. This will allow, e.g., dissident groups to run a
local Tor network of such nodes that connects to the public Tor local Tor network of such nodes that connects to the public Tor
network. This network is hidden behind the Tor network and its network. This network is hidden behind the Tor network, and its
only visible connection to Tor at those points where it connects. only visible connection to Tor is at those points where it connects.
As far as the public network is concerned or anyone observing it, As far as the public network, or anyone observing it, is concerned,
they are running clients. they are running clients.
\section{The Future} \section{The Future}
@ -1442,7 +1454,7 @@ network: as Tor grows more popular, other groups who need an overlay
network on the Internet are starting to adapt Tor to their needs. network on the Internet are starting to adapt Tor to their needs.
% %
Second, Tor is only one of many components that preserve privacy online. Second, Tor is only one of many components that preserve privacy online.
To keep identifying information out of application traffic, we must build To keep identifying information out of application traffic, someone must build
more and better protocol-aware proxies that are usable by ordinary people. more and better protocol-aware proxies that are usable by ordinary people.
% %
Third, we need to gain a reputation for social good, and learn how to Third, we need to gain a reputation for social good, and learn how to