lots more cleanups. people should check these over.

svn:r3593
Roger Dingledine 2005-02-09 04:34:50 +00:00
parent c5c46d6fb6
commit e4989f33c9

@ -82,7 +82,7 @@ design and goals. Here we describe some policy, social, and technical
issues that we face as we continue deployment. issues that we face as we continue deployment.
Rather than providing complete solutions to every problem, we Rather than providing complete solutions to every problem, we
instead lay out the challenges and constraints that we have observed while instead lay out the challenges and constraints that we have observed while
deploying Tor in the wild. In doing so, we aim to provide a research agenda deploying Tor. In doing so, we aim to provide a research agenda
of general interest to projects attempting to build of general interest to projects attempting to build
and deploy practical, usable anonymity networks in the wild. and deploy practical, usable anonymity networks in the wild.
@ -179,10 +179,9 @@ for use in securing government
communications, and by the Electronic Frontier Foundation for use communications, and by the Electronic Frontier Foundation for use
in maintaining civil liberties for ordinary citizens online. The Tor in maintaining civil liberties for ordinary citizens online. The Tor
protocol is one of the leading choices protocol is one of the leading choices
for anonymizing layer in the European Union's PRIME directive to for the anonymizing layer in the European Union's PRIME directive to
help maintain privacy in Europe. help maintain privacy in Europe.
% XXXX We should credit the specific group, not the whole university. The AN.ON project in Germany
The University of Dresden in Germany
has integrated an independent implementation of the Tor protocol into has integrated an independent implementation of the Tor protocol into
their popular Java Anon Proxy anonymizing client. their popular Java Anon Proxy anonymizing client.
% This wide variety of % This wide variety of
@ -220,14 +219,16 @@ of intra-network~\cite{back01,attack-tor-oak05,flow-correlation04} and
end-to-end~\cite{danezis-pet2004,SS03} anonymity-breaking attacks. end-to-end~\cite{danezis-pet2004,SS03} anonymity-breaking attacks.
Tor does not attempt to defend against a global observer. In general, an Tor does not attempt to defend against a global observer. In general, an
attacker who can observe both ends of a connection through the Tor network attacker who can measure both ends of a connection through the Tor network
% I say 'measure' rather than 'observe', to encompass murdoch-danezis
% style attacks. -RD
can correlate the timing and volume of data on that connection as it enters can correlate the timing and volume of data on that connection as it enters
and leaves the network, and so link communication partners. and leaves the network, and so link communication partners.
Known solutions to this attack would seem to require introducing a Known solutions to this attack would seem to require introducing a
prohibitive degree of traffic padding between the user and the network, or prohibitive degree of traffic padding between the user and the network, or
introducing an unacceptable degree of latency (but see Section introducing an unacceptable degree of latency (but see Section
\ref{subsec:mid-latency}). Also, it is not clear that these methods would \ref{subsec:mid-latency}). Also, it is not clear that these methods would
work at all against even a minimally active adversary who could introduce timing work at all against a minimally active adversary who could introduce timing
patterns or additional traffic. Thus, Tor only attempts to defend against patterns or additional traffic. Thus, Tor only attempts to defend against
external observers who cannot observe both sides of a user's connections. external observers who cannot observe both sides of a user's connections.
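
The end-to-end correlation described in this hunk can be illustrated with a small sketch. It is not taken from the paper or from any Tor code: the function names, the one-second bin width, and the use of a plain Pearson coefficient are all assumptions chosen for illustration. It assumes an attacker who has packet timestamps for one flow entering the network and for several candidate flows leaving it.

```python
from math import sqrt


def bin_counts(timestamps, bin_width, n_bins):
    """Count packets per time bin (timestamps in seconds from a shared start)."""
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_match(entry_times, exit_candidates, bin_width=1.0, n_bins=10):
    """Return the candidate exit flow whose volume pattern best matches the entry flow."""
    entry = bin_counts(entry_times, bin_width, n_bins)
    scores = {
        name: pearson(entry, bin_counts(times, bin_width, n_bins))
        for name, times in exit_candidates.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical data: flow_a is the entry flow delayed by 0.3 s, flow_b is steady traffic.
entry_times = [0.1, 0.2, 0.3, 3.1, 3.2, 7.5, 7.6, 7.7, 7.8]
candidates = {
    "flow_a": [t + 0.3 for t in entry_times],
    "flow_b": [0.5 + i for i in range(10)],
}
best, scores = best_match(entry_times, candidates)
```

Even this crude volume-per-bin comparison separates the bursty matching flow from steady background traffic, which is why the text treats an adversary who can measure both ends as outside Tor's threat model.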
@ -267,7 +268,7 @@ responders.
%However, it is still essentially confirming %However, it is still essentially confirming
%suspected communicants where the responder suspects are ``stored'' rather %suspected communicants where the responder suspects are ``stored'' rather
%than observed at the same time as the client. %than observed at the same time as the client.
Similarly latencies of going through various routes can be Similarly, latencies of going through various routes can be
cataloged~\cite{back01} to connect endpoints. cataloged~\cite{back01} to connect endpoints.
% XXX hintz-pet02 just looked at data volumes of the sites. this % XXX hintz-pet02 just looked at data volumes of the sites. this
% doesn't require much variability or storage. I think it works % doesn't require much variability or storage. I think it works
@ -286,18 +287,17 @@ rather than halt the attacks in the cases where they succeed.
%routes through the network to each site will be random even if they %routes through the network to each site will be random even if they
%have relatively unique latency characteristics. So this does not seem %have relatively unique latency characteristics. So this does not seem
%an immediate practical threat. %an immediate practical threat.
Along similar lines, the same Along similar lines, the same paper suggests a ``clogging
paper suggested a ``clogging attack''. In \cite{attack-tor-oak05}, a attack''. Murdoch and Danezis~\cite{attack-tor-oak05} show a practical
version of this was demonstrated to be practical against portions of clogging attack against portions of
the fifty node Tor network as deployed in mid 2004. There it was shown the fifty node Tor network as deployed in mid 2004.
that an outside attacker can trace a stream through the Tor network An outside attacker can actively trace a circuit through the Tor network
while a stream is still active by observing the latency of his by observing changes in the latency of his
own traffic sent through various Tor nodes. These attacks do not show own traffic sent through various Tor nodes. These attacks only reveal
client and server addresses, only the first and last nodes within the Tor the Tor nodes in the circuit, not initiator and responder addresses,
network, so it is still necessary to observe those nodes to complete the so it is still necessary to discover the endpoints to complete the
attacks. This may make attacks. Increasing the size and diversity of the Tor network may
helper nodes all the more worthy of exploration (see help counter these attacks.
Section~\ref{subsec:helper-nodes}).
%discuss $\frac{c^2}{n^2}$, except how in practice the chance of owning %discuss $\frac{c^2}{n^2}$, except how in practice the chance of owning
%the last hop is not $c/n$ since that doesn't take the destination (website) %the last hop is not $c/n$ since that doesn't take the destination (website)
@ -389,18 +389,18 @@ handles only web browsing rather than arbitrary TCP\@.
Zero-Knowledge Systems' commercial Freedom Zero-Knowledge Systems' commercial Freedom
network~\cite{freedom21-security} was even more flexible than Tor in network~\cite{freedom21-security} was even more flexible than Tor in
transporting arbitrary IP packets, and also supported transporting arbitrary IP packets, and also supported
pseudonymous in addition to anonymity; but it has pseudonymity in addition to anonymity; but it has
a different approach to sustainability (collecting money from users a different approach to sustainability (collecting money from users
and paying ISPs to run Tor nodes), and was eventually shut down due to financial and paying ISPs to run Tor nodes), and was eventually shut down due to financial
load. Finally, potentially load. Finally,
more scalable peer-to-peer designs like Tarzan~\cite{tarzan:ccs02} and more scalable peer-to-peer designs like Tarzan~\cite{tarzan:ccs02} and
MorphMix~\cite{morphmix:fc04} have been proposed in the literature, but MorphMix~\cite{morphmix:fc04} have been proposed in the literature, but
have not yet been fielded. These systems differ somewhat have not been fielded. These systems differ somewhat
in threat model and presumably practical resistance to threats. in threat model and presumably practical resistance to threats.
MorphMix is close to Tor in circuit setup, and, by separating Note that MorphMix and Tor differ only in
node discovery from route selection from circuit setup, Tor is node discovery and circuit setup; so Tor's architecture is flexible
flexible enough to potentially contain a MorphMix experiment within enough to contain a MorphMix experiment.
it. We direct the interested reader We direct the interested reader
to~\cite{tor-design} for a more in-depth review of related work. to~\cite{tor-design} for a more in-depth review of related work.
Tor also differs from other deployed systems for traffic analysis resistance Tor also differs from other deployed systems for traffic analysis resistance
@ -440,8 +440,8 @@ Tor's interaction with other services on the Internet.
\subsection{Communicating security} \subsection{Communicating security}
Usability for anonymity systems Usability for anonymity systems
contributes directly to their security, because usability contributes to their security, because usability
effects the possible anonymity set~\cite{econymics,back01}. affects the possible anonymity set~\cite{econymics,back01}.
Conversely, an unusable system attracts few users and thus can't provide Conversely, an unusable system attracts few users and thus can't provide
much anonymity. much anonymity.
@ -483,10 +483,10 @@ the initiator to her destination.% This is why Tor's threat model is
Like Tor, the current JAP implementation does not pad connections Like Tor, the current JAP implementation does not pad connections
apart from using small fixed-size cells for transport. In fact, apart from using small fixed-size cells for transport. In fact,
JAP's cascade-based network topology may be more vulnerable to these JAP's cascade-based network topology may be more vulnerable to these
attacks, because the network has fewer edges. JAP was born out of attacks, because its network has fewer edges. JAP was born out of
the ISDN mix design~\cite{isdn-mixes}, where padding made sense because the ISDN mix design~\cite{isdn-mixes}, where padding made sense because
every user had a fixed bandwidth allocation and altering the timing every user had a fixed bandwidth allocation and altering the timing
pattern of packets could be immediately detected, but in its current context pattern of packets could be immediately detected. But in its current context
as a general Internet web anonymizer, adding sufficient padding to JAP as a general Internet web anonymizer, adding sufficient padding to JAP
would probably be prohibitively expensive and ineffective against a would probably be prohibitively expensive and ineffective against a
minimally active attacker.\footnote{Even if JAP could minimally active attacker.\footnote{Even if JAP could
@ -498,10 +498,6 @@ model the number of concurrent users does not seem to have much impact
on the anonymity provided, we suggest that JAP's anonymity meter is not on the anonymity provided, we suggest that JAP's anonymity meter is not
accurately communicating security levels to its users. accurately communicating security levels to its users.
% because more users don't help anonymity much, we need to rely more
% on other incentive schemes, both policy-based (see sec x) and
% technically enforced (see sec y)
On the other hand, while the number of active concurrent users may not On the other hand, while the number of active concurrent users may not
matter as much as we'd like, it still helps to have some other users matter as much as we'd like, it still helps to have some other users
on the network. We investigate this issue next. on the network. We investigate this issue next.
@ -666,8 +662,8 @@ So when letters arrive, operators are likely to face
pressure to block file-sharing applications entirely, in order to avoid the pressure to block file-sharing applications entirely, in order to avoid the
hassle. hassle.
But blocking file-sharing would not necessarily be easy; many popular But blocking file-sharing is not easy: many popular
protocols have evolved to run on a non-standard ports in order to protocols have evolved to run on non-standard ports to
get around other port-based bans. Thus, exit node operators who want to get around other port-based bans. Thus, exit node operators who want to
block file-sharing would have to find some way to integrate Tor with a block file-sharing would have to find some way to integrate Tor with a
protocol-aware exit filter. This could be a technically expensive protocol-aware exit filter. This could be a technically expensive
@ -706,29 +702,27 @@ file-sharing protocols that have separate control and data channels.
It was long expected that, alongside legitimate users, Tor would also It was long expected that, alongside legitimate users, Tor would also
attract troublemakers who exploited Tor in order to abuse services on the attract troublemakers who exploited Tor in order to abuse services on the
Internet with vandalism, rude mail, and so on. Internet with vandalism, rude mail, and so on.
%[XXX we're not talking bandwidth abuse here, we're talking vandalism,
%hate mails via hotmail, attacks, etc.]
Our initial answer to this situation was to use ``exit policies'' Our initial answer to this situation was to use ``exit policies''
to allow individual Tor nodes to block access to specific IP/port ranges. to allow individual Tor nodes to block access to specific IP/port ranges.
This approach aims to make operators more willing to run Tor by allowing This approach aims to make operators more willing to run Tor by allowing
them to prevent their nodes from being used for abusing particular them to prevent their nodes from being used for abusing particular
services. For example, all Tor nodes currently block SMTP (port 25), in services. For example, all Tor nodes currently block SMTP (port 25),
order to avoid being used for spam. to avoid being used for spam.
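
As a rough illustration of how an exit policy of this kind might be evaluated, the sketch below checks a destination against an ordered list of rules and applies the first one that matches. The rule syntax is a simplified form of what the text describes (an action plus an IP/port range); the function names and the default of accepting when no rule matches are assumptions made for this example, not a description of Tor's implementation.

```python
import ipaddress


def parse_rule(rule):
    """Parse 'accept|reject ADDR:PORT'.

    ADDR is '*' or an IPv4 address/CIDR network; PORT is '*', N, or LOW-HIGH.
    """
    action, target = rule.split()
    addr, port = target.rsplit(":", 1)
    network = ipaddress.ip_network("0.0.0.0/0" if addr == "*" else addr)
    if port == "*":
        ports = (1, 65535)
    elif "-" in port:
        low, high = port.split("-")
        ports = (int(low), int(high))
    else:
        ports = (int(port), int(port))
    return action == "accept", network, ports


def exit_allowed(policy, ip, port):
    """Apply the first matching rule; accept by default (an assumption of this sketch)."""
    address = ipaddress.ip_address(ip)
    for accept, network, (low, high) in map(parse_rule, policy):
        if address in network and low <= port <= high:
            return accept
    return True


# Hypothetical policy: refuse SMTP everywhere, refuse a private range, allow the rest.
policy = ["reject *:25", "reject 10.0.0.0/8:*", "accept *:*"]
```

With this policy, `exit_allowed(policy, "192.0.2.7", 25)` is refused while port 80 on the same address is allowed. Because each node publishes its own list, a service could in principle block only the nodes whose policy reaches it.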
This approach is useful, but is insufficient for two reasons. First, since Exit policies are useful, but are insufficient for two reasons. First, since
it is not possible to force all nodes to block access to any given service, it is not possible to force all nodes to block access to any given service,
many of those services try to block Tor instead. More broadly, while being many of those services try to block Tor instead. More broadly, while being
blockable is important to being good netizens, we would like to encourage blockable is important to being good netizens, we would like to encourage
services to allow anonymous access; services should not need to decide services to allow anonymous access. Services should not need to decide
between blocking legitimate anonymous use and allowing unlimited abuse. between blocking legitimate anonymous use and allowing unlimited abuse.
This is potentially a bigger problem than it may appear. This is potentially a bigger problem than it may appear.
On the one hand, people should be allowed to refuse connections to On the one hand, services should be allowed to refuse connections from
their services. But, it's not just sources of possible abuse.
for himself that a node administrator is deciding when he decides But when a Tor node administrator decides whether he prefers to be able
whether he prefers to be able to post to Wikipedia from his Tor node address, to post to Wikipedia from his IP address, or to allow people to read
or to allow Wikipedia anonymously through his Tor node, he is making the decision
people to read Wikipedia anonymously through his Tor node. (Wikipedia for others as well. (Wikipedia
has blocked all posting from all Tor nodes based on IP addresses.) If has blocked all posting from all Tor nodes based on IP addresses.) If
the Tor node shares an address with a campus or corporate NAT, the Tor node shares an address with a campus or corporate NAT,
then the decision can prevent the entire population from posting. then the decision can prevent the entire population from posting.
@ -736,10 +730,9 @@ This is a loss for both Tor
and Wikipedia: we don't want to compete for (or divvy up) the and Wikipedia: we don't want to compete for (or divvy up) the
NAT-protected entities of the world. NAT-protected entities of the world.
Worse, many IP blacklists are not terribly fine-grained. Worse, many IP blacklists are coarse-grained. Some
No current IP blacklist, for example, allows a service provider to blacklist ignore Tor's exit policies, preferring to punish
only those Tor nodes that allow access to a specific IP or port, even all Tor nodes. One IP blacklist even bans
though this information is readily available. One IP blacklist even bans
every class C network that contains a Tor node, and recommends banning SMTP every class C network that contains a Tor node, and recommends banning SMTP
from these networks even though Tor does not allow SMTP at all. This from these networks even though Tor does not allow SMTP at all. This
coarse-grained approach is typically a strategic decision to discourage the coarse-grained approach is typically a strategic decision to discourage the
@ -751,6 +744,7 @@ to shut it down in order to get unblocked themselves.
%[XXX Mention: it's not dumb, it's strategic!] %[XXX Mention: it's not dumb, it's strategic!]
%[XXX Mention: for some servops, any blacklist is a blacklist too many, %[XXX Mention: for some servops, any blacklist is a blacklist too many,
% because it is risky. (Guy lives in apt _building_ with one IP.)] % because it is risky. (Guy lives in apt _building_ with one IP.)]
%XXX roger should add more
Problems of abuse occur mainly with services such as IRC networks and Problems of abuse occur mainly with services such as IRC networks and
Wikipedia, which rely on IP blocking to ban abusive users. While at first Wikipedia, which rely on IP blocking to ban abusive users. While at first
@ -771,7 +765,7 @@ this is why services use IP blocking. In order to deter abuse, pseudonymous
identities need to require a significant switching cost in resources or human identities need to require a significant switching cost in resources or human
time. Some popular webmail applications time. Some popular webmail applications
impose cost with Reverse Turing Tests, but these may not be costly enough to impose cost with Reverse Turing Tests, but these may not be costly enough to
deter abusers. Freedom solved this using blind signatures to limit deter abusers. Freedom used blind signatures to limit
the number of pseudonyms for each paying account, but Tor has neither the the number of pseudonyms for each paying account, but Tor has neither the
ability nor the desire to collect payment. ability nor the desire to collect payment.
@ -779,7 +773,7 @@ ability nor the desire to collect payment.
%non-anonymous costly identification mechanism to allow access to a %non-anonymous costly identification mechanism to allow access to a
%blind-signature pseudonym protocol. This would effectively create costly %blind-signature pseudonym protocol. This would effectively create costly
%pseudonyms, which services could require in order to allow anonymous access. %pseudonyms, which services could require in order to allow anonymous access.
%This approach has difficulties in practise, however: %This approach has difficulties in practice, however:
%\begin{tightlist} %\begin{tightlist}
%\item Unlike Freedom, Tor is not a commercial service. Therefore, it would %\item Unlike Freedom, Tor is not a commercial service. Therefore, it would
% be a shame to require payment in order to make Tor useful, or to make % be a shame to require payment in order to make Tor useful, or to make
@ -828,21 +822,21 @@ at the IP layer. Before this could be done, many issues need to be resolved:
IP-level packet normalization, to stop things like TCP fingerprinting IP-level packet normalization, to stop things like TCP fingerprinting
attacks. %There likely exist libraries that can help with this. attacks. %There likely exist libraries that can help with this.
This is unlikely to be a trivial task, given the diversity and complexity of This is unlikely to be a trivial task, given the diversity and complexity of
various TCP stacks. TCP stacks.
\item \emph{Application-level streams still need scrubbing.} We still need \item \emph{Application-level streams still need scrubbing.} We still need
Tor to be easy to integrate with user-level application-specific proxies Tor to be easy to integrate with user-level application-specific proxies
such as Privoxy. So it's not just a matter of capturing packets and such as Privoxy. So it's not just a matter of capturing packets and
anonymizing them at the IP layer. anonymizing them at the IP layer.
\item \emph{Certain protocols will still leak information.} For example, we \item \emph{Certain protocols will still leak information.} For example, we
must rewrite DNS requests so they are delivered to an unlinkable DNS server must rewrite DNS requests so they are delivered to an unlinkable DNS server
rather than a DNS server at a user's ISP;thus, we must understand the rather than the DNS server at a user's ISP; thus, we must understand the
protocols we are transporting. protocols we are transporting.
\item \emph{The crypto is unspecified.} First we need a block-level encryption \item \emph{The crypto is unspecified.} First we need a block-level encryption
approach that can provide security despite approach that can provide security despite
packet loss and out-of-order delivery. Freedom allegedly had one, but it was packet loss and out-of-order delivery. Freedom allegedly had one, but it was
never publicly specified. never publicly specified.
Also, TLS over UDP is not yet implemented or Also, TLS over UDP is not yet implemented or
specified, though some early work has begun on that~\cite{dtls}. specified, though some early work has begun~\cite{dtls}.
\item \emph{We'll still need to tune network parameters.} Since the above \item \emph{We'll still need to tune network parameters.} Since the above
encryption system will likely need sequence numbers (and maybe more) to do encryption system will likely need sequence numbers (and maybe more) to do
replay detection, handle duplicate frames, and so on, we will be reimplementing replay detection, handle duplicate frames, and so on, we will be reimplementing
@ -863,8 +857,8 @@ which nodes will allow which packets to exit.
support hidden service {\tt{.onion}} addresses (and other special addresses, support hidden service {\tt{.onion}} addresses (and other special addresses,
like {\tt{.exit}} which lets the user request a particular exit node), like {\tt{.exit}} which lets the user request a particular exit node),
by intercepting the addresses when they are passed to the Tor client. by intercepting the addresses when they are passed to the Tor client.
Doing so at the IP level would require more complex interface between Doing so at the IP level would require a more complex interface between
Tor and local DNS resolver. Tor and the local DNS resolver.
\end{enumerate} \end{enumerate}
This list is discouragingly long, but being able to transport more This list is discouragingly long, but being able to transport more
@ -930,14 +924,13 @@ quality of those choices.
\subsection{Enclaves and helper nodes} \subsection{Enclaves and helper nodes}
\label{subsec:helper-nodes} \label{subsec:helper-nodes}
It has long been thought that users can improve their It has long been thought that users can improve their anonymity by
anonymity by running their running their own node~\cite{tor-design,or-ih96,or-pet00}, and using
own node~\cite{tor-design,or-ih96,or-pet00}, and using it in an it in an \emph{enclave} configuration, where all their circuits begin
\emph{enclave} configuration, where all their circuits begin at the node at the node under their control. Running Tor clients or servers at
under their control. By running Tor clients only on Tor nodes the enclave perimeter is useful when policy or other requirements
at the enclave perimeter, enclave configuration can also permit anonymity prevent individual machines within the enclave from running Tor
protection even when policy or other requirements prevent individual machines clients~\cite{or-jsac98,or-discex00}.
within the enclave from running Tor clients~\cite{or-jsac98,or-discex00}.
Of course, Tor's default path length of Of course, Tor's default path length of
three is insufficient for these enclaves, since the entry and/or exit three is insufficient for these enclaves, since the entry and/or exit
@ -1041,8 +1034,8 @@ News sites like Bloggers Without Borders (www.b19s.org) are advertising
a hidden-service address on their front page. Doing this can provide a hidden-service address on their front page. Doing this can provide
increased robustness if they use the dual-IP approach we describe increased robustness if they use the dual-IP approach we describe
in~\cite{tor-design}, in~\cite{tor-design},
but in practice they do it first to increase visibility but in practice they do it to increase visibility
of the Tor project and their support for privacy, and second to offer of the Tor project and their support for privacy, and to offer
a way for their users, using unmodified software, to get end-to-end a way for their users, using unmodified software, to get end-to-end
encryption and authentication to their website. encryption and authentication to their website.
@ -1077,8 +1070,11 @@ adversary, nodes should be in ASes that have the most links to other ASes:
Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction Tier-1 ISPs such as AT\&T and Abovenet. Further, a given transaction
is safest when it starts or ends in a Tier-1 ISP\@. Therefore, assuming is safest when it starts or ends in a Tier-1 ISP\@. Therefore, assuming
initiator and responder are both in the U.S., it actually \emph{hurts} initiator and responder are both in the U.S., it actually \emph{hurts}
our location diversity to enter or exit from far-flung nodes in our location diversity to use far-flung nodes in
continents like Asia or South America. continents like Asia or South America.
% it's not just entering or exiting from them. using them as the middle
% hop reduces your effective path length, which you presumably don't
% want because you chose that path length for a reason.
Many open questions remain. First, it will be an immense engineering Many open questions remain. First, it will be an immense engineering
challenge to get an entire BGP routing table to each Tor client, or to challenge to get an entire BGP routing table to each Tor client, or to
@ -1089,9 +1085,11 @@ and MorphMix~\cite{morphmix:fc04} suggest that we compare IP prefixes to
determine location diversity; but the above paper showed that in practice determine location diversity; but the above paper showed that in practice
many of the Mixmaster nodes that share a single AS have entirely different many of the Mixmaster nodes that share a single AS have entirely different
IP prefixes. When the network has scaled to thousands of nodes, does IP IP prefixes. When the network has scaled to thousands of nodes, does IP
prefix comparison become a more useful approximation? Alternatively, can prefix comparison become a more useful approximation? % Alternatively, can
relevant parts of the routing tables be summarized centrally and delivered to %relevant parts of the routing tables be summarized centrally and delivered to
clients in a less verbose format? %clients in a less verbose format?
%% i already said "or to summarize is sufficiently" above. is that not
%% enough? -RD
% %
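
The gap between IP-prefix comparison and AS-level diversity can be made concrete with a short sketch. The /16 prefix length, the function names, and the address-to-AS table are hypothetical; a real client would need routing data of the kind the text says is hard to distribute.

```python
import ipaddress


def prefix_of(ip, prefix_len=16):
    """Return the enclosing network of the given length for an IPv4 address."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def path_is_prefix_diverse(node_ips, prefix_len=16):
    """True if no two nodes on the path share an IP prefix."""
    prefixes = [prefix_of(ip, prefix_len) for ip in node_ips]
    return len(set(prefixes)) == len(prefixes)


def path_is_as_diverse(node_ips, as_of):
    """True if no two nodes on the path sit in the same AS (as_of maps IP -> AS number)."""
    ases = [as_of[ip] for ip in node_ips]
    return len(set(ases)) == len(ases)


# Hypothetical path: the first two nodes have unrelated prefixes but share one AS.
path = ["192.0.2.1", "198.51.100.7", "203.0.113.9"]
as_of = {"192.0.2.1": 64500, "198.51.100.7": 64500, "203.0.113.9": 64501}
```

Here the prefix check passes while the AS check fails, which is the situation the text reports for Mixmaster nodes that share an AS despite having entirely different prefixes.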
Second, we can take advantage of caching certain content at the Second, we can take advantage of caching certain content at the
exit nodes, to limit the number of requests that need to leave the exit nodes, to limit the number of requests that need to leave the
@ -1106,7 +1104,7 @@ anonymity against larger real-world adversaries who can take advantage
of knowing our algorithm? of knowing our algorithm?
% %
Fourth, can we use this knowledge to figure out which gaps in our network Fourth, can we use this knowledge to figure out which gaps in our network
most effect our robustness to this class of attack, and go recruit most affect our robustness to this class of attack, and go recruit
new nodes with those ASes in mind? new nodes with those ASes in mind?
%Tor's security relies in large part on the dispersal properties of its %Tor's security relies in large part on the dispersal properties of its
@ -1141,7 +1139,7 @@ to the users inside the country, and give them software to use them,
without letting the censors also enumerate this list and block each without letting the censors also enumerate this list and block each
relay. Anonymizer solves this by buying lots of seemingly-unrelated IP relay. Anonymizer solves this by buying lots of seemingly-unrelated IP
addresses (or having them donated), abandoning old addresses as they are addresses (or having them donated), abandoning old addresses as they are
`used up', and telling a few users about the new ones. Distributed `used up,' and telling a few users about the new ones. Distributed
anonymizing networks again have an advantage here, in that we already anonymizing networks again have an advantage here, in that we already
have tens of thousands of separate IP addresses whose users might have tens of thousands of separate IP addresses whose users might
volunteer to provide this service since they've already installed and use volunteer to provide this service since they've already installed and use
@ -1152,7 +1150,7 @@ to generate node descriptors and send them to a special directory
server that gives them out to dissidents who need to get around blocks. server that gives them out to dissidents who need to get around blocks.
Of course, this still doesn't prevent the adversary Of course, this still doesn't prevent the adversary
from enumerating and preemtively blocking the volunteer relays. from enumerating and preemptively blocking the volunteer relays.
Perhaps a tiered-trust system could be built where a few individuals are Perhaps a tiered-trust system could be built where a few individuals are
given relays' locations, and they recommend other individuals by telling them given relays' locations, and they recommend other individuals by telling them
those addresses, thus providing a built-in incentive to avoid letting the those addresses, thus providing a built-in incentive to avoid letting the
@ -1169,15 +1167,17 @@ help address censorship; we wish them success.
Tor is running today with hundreds of nodes and tens of thousands of Tor is running today with hundreds of nodes and tens of thousands of
users, but it will certainly not scale to millions. users, but it will certainly not scale to millions.
Scaling Tor involves three main challenges. First is safe node discovery, Scaling Tor involves four main challenges. First, in order to get a
both while bootstrapping (how does Tor client robustly find an initial node large set of nodes in the first place, we must address incentives for
list?) and later (how does Tor client can learn about a fair sample of honest users to carry traffic for others. Next is safe node discovery, both
nodes and not let the adversary control his circuits?) Second is detecting while bootstrapping (how does a Tor client robustly find an initial
and handling the speed and reliability of the variety of nodes as the network node list?) and later (how does a Tor client learn about a fair sample
becomes increasingly heterogeneous: since the speed and reliability of a of honest nodes and not let the adversary control his circuits?).
circuit is limited by its worst link, we must learn to track and predict We must also detect and handle node speed and reliability as the network
performance. Third, in order to get a large set of nodes in the first becomes increasingly heterogeneous: since the speed and reliability
place, we must address incentives for users to carry traffic for others. of a circuit is limited by its worst link, we must learn to track and
predict performance. Finally, we must stop assuming that all points on
the network can connect to all other points.
\subsection{Incentives by Design} \subsection{Incentives by Design}
@ -1246,17 +1246,6 @@ large set of nodes that meet some minimum service threshold
without opening Alice up as much to attacks. All of this requires without opening Alice up as much to attacks. All of this requires
further study. further study.
%XXX rewrite the above so it sounds less like a grant proposal and
%more like a "if somebody were to try to solve this, maybe this is a
%good first step".
%We should implement the above incentive scheme in the
%deployed Tor network, in conjunction with our plans to add the necessary
%associated scalability mechanisms. We will do experiments (simulated
%and/or real) to determine how much the incentive system improves
%efficiency over baseline, and also to determine how far we are from
%optimal efficiency (what we could get if we ignored the anonymity goals).
\subsection{Trust and discovery} \subsection{Trust and discovery}
\label{subsec:trust-and-discovery} \label{subsec:trust-and-discovery}