Numerous notes of stuff to do from mtg with Roger; add outline for design section.

svn:r671
Nick Mathewson 2003-10-24 21:18:38 +00:00
parent 28e93f3aa3
commit d4ad3bde8c


@ -50,8 +50,8 @@
\begin{abstract} \begin{abstract}
We present Tor, a connection-based low-latency anonymous communication We present Tor, a connection-based low-latency anonymous communication
system. It is intended as an update and replacement for onion routing system. It is intended as an update and replacement for Onion Routing
and addresses many limitations in the original onion routing design. and addresses many limitations in the original Onion Routing design.
Tor works in a real-world Internet environment, Tor works in a real-world Internet environment,
requires little synchronization or coordination between nodes, and requires little synchronization or coordination between nodes, and
protects against known anonymity-breaking attacks as well protects against known anonymity-breaking attacks as well
@ -73,10 +73,10 @@ and instant messaging. Users choose a path through the network and
build a \emph{virtual circuit}, in which each node in the path knows its build a \emph{virtual circuit}, in which each node in the path knows its
predecessor and successor, but no others. Traffic flowing down the circuit predecessor and successor, but no others. Traffic flowing down the circuit
is sent in fixed-size \emph{cells}, which are unwrapped by a symmetric key is sent in fixed-size \emph{cells}, which are unwrapped by a symmetric key
at each node, revealing the downstream node. The original onion routing at each node, revealing the downstream node. The original Onion Routing
project published several design and analysis papers project published several design and analysis papers
\cite{or-jsac98,or-discex00,or-ih96,or-pet00}. While there was briefly \cite{or-jsac98,or-discex00,or-ih96,or-pet00}. While there was briefly
a wide area onion routing network, a wide area Onion Routing network,
% how long is briefly? a day, a month? -RD % how long is briefly? a day, a month? -RD
the only long-running and publicly accessible the only long-running and publicly accessible
implementation was a fragile proof-of-concept that ran on a single implementation was a fragile proof-of-concept that ran on a single
@ -84,11 +84,13 @@ machine. Many critical design and deployment issues were never implemented,
and the design has not been updated in several years. and the design has not been updated in several years.
Here we describe Tor, a protocol for asynchronous, loosely Here we describe Tor, a protocol for asynchronous, loosely
federated onion routers that provides the following improvements over federated onion routers that provides the following improvements over
the old onion routing design: the old Onion Routing design:
% Also itemize improvements over Freedom.
\begin{tightlist} \begin{tightlist}
\item \textbf{Perfect forward secrecy:} The original onion routing \item \textbf{Perfect forward secrecy:} The original Onion Routing
design is vulnerable to a single hostile node recording traffic and later design is vulnerable to a single hostile node recording traffic and later
forcing successive nodes in the circuit to decrypt it. Rather than using forcing successive nodes in the circuit to decrypt it. Rather than using
onions to lay the circuits, Tor uses an incremental or \emph{telescoping} onions to lay the circuits, Tor uses an incremental or \emph{telescoping}
@ -98,7 +100,7 @@ necessary, and the process of building circuits is more reliable, since
the initiator knows which hop failed and can try extending to a new node. the initiator knows which hop failed and can try extending to a new node.
\item \textbf{Applications talk to the onion proxy via Socks:} \item \textbf{Applications talk to the onion proxy via Socks:}
The original onion routing design required a separate proxy for each The original Onion Routing design required a separate proxy for each
supported application protocol, resulting in a lot of extra code --- most supported application protocol, resulting in a lot of extra code --- most
of which was never written, so most applications were not supported. of which was never written, so most applications were not supported.
Tor uses the unified and standard Socks Tor uses the unified and standard Socks
@ -106,15 +108,15 @@ Tor uses the unified and standard Socks
program without modification. program without modification.
\item \textbf{Many applications can share one circuit:} The original \item \textbf{Many applications can share one circuit:} The original
onion routing design built one circuit for each request. Aside from the Onion Routing design built one circuit for each request. Aside from the
performance issues of doing public key operations for every request, it performance issues of doing public key operations for every request, it
also turns out that regular communications patterns mean building lots also turns out that regular communications patterns mean building lots
of circuits, which can endanger anonymity. of circuits, which can endanger anonymity.
The very first onion routing design \cite{or-ih96} protected against The very first Onion Routing design \cite{or-ih96} protected against
this to some extent by hiding network access behind an onion this to some extent by hiding network access behind an onion
router/firewall that was also forwarding traffic from other nodes. router/firewall that was also forwarding traffic from other nodes.
However, even if this meant complete protection, many users can However, even if this meant complete protection, many users can
benefit from onion routing for which neither running one's own node benefit from Onion Routing for which neither running one's own node
nor such firewall configurations are adequately convenient to be nor such firewall configurations are adequately convenient to be
feasible. Those users, especially if they engage in certain unusual feasible. Those users, especially if they engage in certain unusual
communication behaviors, may be identifiable \cite{wright03}. To communication behaviors, may be identifiable \cite{wright03}. To
@ -123,7 +125,7 @@ connections down each circuit, but still rotates the circuit
periodically to avoid too much linkability from requests on a single periodically to avoid too much linkability from requests on a single
circuit. circuit.
\item \textbf{No mixing or traffic shaping:} The original onion routing \item \textbf{No mixing or traffic shaping:} The original Onion Routing
design called for full link padding both between onion routers and between design called for full link padding both between onion routers and between
onion proxies (that is, users) and onion routers \cite{or-jsac98}. The onion proxies (that is, users) and onion routers \cite{or-jsac98}. The
later analysis paper \cite{or-pet00} suggested \emph{traffic shaping} later analysis paper \cite{or-pet00} suggested \emph{traffic shaping}
@ -187,12 +189,19 @@ are critical in a volunteer-based distributed infrastructure, because
each operator is comfortable with allowing different types of traffic each operator is comfortable with allowing different types of traffic
to exit the Tor network from his node. to exit the Tor network from his node.
\item \textbf{Implementable in user-space}.
\item \textbf{Rendezvous points and location-protected servers:} Tor \item \textbf{Rendezvous points and location-protected servers:} Tor
provides an integrated mechanism for responder-anonymity provides an integrated mechanism for responder-anonymity
location-protected servers location-protected servers.
[XXX Mention that reply onions are out because they're brittle and don't give PFS.]
\end{tightlist} \end{tightlist}
[XXX carefully mention implementation, emphasizing that experience
deploying isn't there yet, and not all features are implemented.
Mention that it runs, is kinda alpha, kinda deployed, runs on win32.]
We review previous work in Section \ref{sec:background}, describe We review previous work in Section \ref{sec:background}, describe
our goals and assumptions in Section \ref{sec:assumptions}, our goals and assumptions in Section \ref{sec:assumptions},
and then address the above list of improvements in Sections and then address the above list of improvements in Sections
@ -242,8 +251,8 @@ been run for many years (the Java Anon Proxy, aka Web MIXes,
\cite{web-mix}). \cite{web-mix}).
Another low latency design that was proposed independently and at Another low latency design that was proposed independently and at
about the same time as onion routing was PipeNet \cite{pipenet}. about the same time as Onion Routing was PipeNet \cite{pipenet}.
This provided anonymity protections that were stronger than onion routing's, This provided anonymity protections that were stronger than Onion Routing's,
but at the cost of allowing a single user to shut down the network simply but at the cost of allowing a single user to shut down the network simply
by not sending. It was also never implemented or formally published. by not sending. It was also never implemented or formally published.
@ -261,7 +270,7 @@ requires public-key cryptography, whereas relaying packets along a tunnel is
comparatively inexpensive. Because a tunnel crosses several servers, no comparatively inexpensive. Because a tunnel crosses several servers, no
single server can learn the user's communication partners. single server can learn the user's communication partners.
Systems such as earlier versions of Freedom and onion routing Systems such as earlier versions of Freedom and Onion Routing
build the anonymous channel all at once (using an onion). Later build the anonymous channel all at once (using an onion). Later
designs of Freedom and onion routing as described herein build designs of Freedom and onion routing as described herein build
the channel in stages as does AnonNet the channel in stages as does AnonNet
@ -307,29 +316,19 @@ jondos on any one net- work (using IP address), the attacker would be
forced to launch jondos using many different identities and on many forced to launch jondos using many different identities and on many
different networks to succeed'' \cite{crowds-tissec}. different networks to succeed'' \cite{crowds-tissec}.
Tor is not primarily designed for censorship resistance but rather
Many systems have been designed for censorship resistant publishing. for anonymous communication. However, Tor's rendezvous points, which
The first of these was the Eternity Service \cite{eternity}. Since enable connections between mutually anonymous entities, also
then, there have been many alternatives and refinements, of which we note facilitate connections to hidden servers. These building blocks to
but a few censorship resistance and other capabilities are described in
\cite{eternity,gap-pets03,freenet-pets00,freehaven-berk,publius,tangler,taz}. Section~\ref{sec:rendezvous}. Location-hidden servers are an
From the beginning, traffic analysis resistant communication has been essential component for anonymous publishing systems such as
recognized as an important element of censorship resistance because of Publius\cite{publius}, Free Haven\cite{freehaven-berk}, and
the relation between the ability to censor material and the ability to Tangler\cite{tangler}.
find its distribution source.
Tor is not primarily for censorship resistance but for anonymous
communication. However, Tor's rendezvous points, which enable
connections between mutually anonymous entities, also facilitate
connections to hidden servers. These building blocks to censorship
resistance and other capabilities are described in
Section~\ref{sec:rendezvous}.
[XXX I'm considering the subsection as ended here for now. I'm leaving the [XXX I'm considering the subsection as ended here for now. I'm leaving the
following notes in case we want to revisit any of them. -PS] following notes in case we want to revisit any of them. -PS]
Channel-based anonymizing systems also differ in their use of dummy traffic. Channel-based anonymizing systems also differ in their use of dummy traffic.
[XXX] [XXX]
@ -338,25 +337,11 @@ communication. Crowds and [XXX] provide anonymity for HTTP requests; [...]
[XXX Mention error recovery?] [XXX Mention error recovery?]
STILL NOT MENTIONED:
anonymizer\\
pipenet\\
freedom v1\\
freedom v2\\
onion routing v1\\
isdn-mixes\\ isdn-mixes\\
crowds\\ real-time mixes\\
real-time mixes, web mixes\\
anonnet (marc rennhard's stuff)\\
morphmix\\
P5\\
gnunet\\
rewebbers\\ rewebbers\\
tarzan\\ cebolla\\
herbivore\\
hordes\\
cebolla (?)\\
[XXX Close by mentioning where Tor fits.] [XXX Close by mentioning where Tor fits.]
@ -379,7 +364,8 @@ provide); designs that place a heavy liability burden on operators
(for example, by allowing attackers to implicate operators in illegal (for example, by allowing attackers to implicate operators in illegal
activities); and designs that are difficult or expensive to implement activities); and designs that are difficult or expensive to implement
(for example, by requiring kernel patches to many operating systems, (for example, by requiring kernel patches to many operating systems,
or ). or ). [Only anon people need to run special software! Look at minion
reviews]
Second, the system must be {\bf usable}. A hard-to-use system has Second, the system must be {\bf usable}. A hard-to-use system has
fewer users --- and because anonymity systems hide users among users, a fewer users --- and because anonymity systems hide users among users, a
@ -599,6 +585,50 @@ shape of the traffic they send and receive.
\Section{The Tor Design} \Section{The Tor Design}
\label{sec:design} \label{sec:design}
high-level intro: overlay network of onion routers with long-term TLS
connections. (Every OR connects to every other.) Users run local
software (onion proxies) that establish a path over the network and
construct a virtual circuit. (Users know about all ORs from the Directory.)
OPs accept TCP streams and multiplex them across the virtual circuit. The OR
on the other side of the circuit connects to the destinations of the
TCP streams and continues to relay TCP sessions.
Describe connection protocol. Link-to-link rate limiting. Link
padding.
Describe cells. Control versus Relay. Cell structure.
Describe how circuits work and how relay cells get passed along,
decrypted etc. This will include mentioning leaky-pipe circuit
topology and end-to-end integrity checking. (Mention tagging.)
Describe how circuits get built, extended, truncated.
Describe how TCP connections get opened. (Mention DNS issues)
Describe closing TCP connections and 2-END handshake to mirror TCP
close handshake.
Describe how data is transmitted.
Describe circuit-level and stream-level congestion control issues and
solutions.
Describe circuit-level and stream-level fairness issues; cite Marc's
anonnet stuff.
Describe DoS prevention.
Mention twins, what they do, what they can't.
How we should do sequencing and acking like TCP so that we can better
tolerate lost data cells.
[XXX mention that designers have to choose what you send across your
circuit: wrapped IP packets, wrapped stream data, etc. [Dispel
TCP-over-TCP misconception.]]
[XXX Mention that OR-to-OR connections should be highly reliable. If
they aren't, everything can stall.]
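
The layered handling of relay cells noted above can be illustrated with a
minimal sketch. It is not the Tor implementation: the cell size, the helper
names (\texttt{keystream}, \texttt{xor\_layer}, \texttt{pad\_cell},
\texttt{wrap}), and the SHA-256 counter-mode keystream are all assumptions
made for illustration, and the toy cipher provides no integrity checking.
It only shows a proxy adding one symmetric layer per hop and each hop
removing its own layer from a fixed-size cell.

```python
import hashlib

CELL_SIZE = 512  # assumed cell size in bytes; the outline does not fix one


def keystream(key, length):
    """Toy keystream: SHA-256 in counter mode. Not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(key, cell):
    """Add or remove one layer; XOR with a keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(cell, keystream(key, len(cell))))


def pad_cell(payload):
    """Pad a payload with zero bytes up to the fixed cell size."""
    if len(payload) > CELL_SIZE:
        raise ValueError("payload does not fit in one cell")
    return payload + b"\x00" * (CELL_SIZE - len(payload))


def wrap(cell, hop_keys):
    """Proxy side: apply one layer per hop (last hop's layer first)."""
    for key in reversed(hop_keys):
        cell = xor_layer(key, cell)
    return cell


# Usage: three hypothetical per-hop keys; each hop strips its own layer.
keys = [b"hop1-key", b"hop2-key", b"hop3-key"]
cell = wrap(pad_cell(b"GET / HTTP/1.0"), keys)
for key in keys:
    cell = xor_layer(key, cell)
```

Because every cell has the same length at every hop, an observer on one
link cannot tell from size alone how many layers remain. A real design
would replace the toy keystream with a standard cipher and add the
end-to-end integrity check mentioned in the outline.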
\Section{Other design decisions} \Section{Other design decisions}
@ -681,6 +711,12 @@ The JAP cascade model is really nice because they only need one node to
take the heat per cascade. On the other hand, a hydra scheme could work take the heat per cascade. On the other hand, a hydra scheme could work
better (it's still hard to watch all the clients). better (it's still hard to watch all the clients).
Discuss importance of public perception, and how abuse affects it.
``Usability is a security parameter''. ``Public Perception is also a
security parameter.''
Discuss smear attacks.
\SubSection{Directory Servers} \SubSection{Directory Servers}
\label{subsec:dirservers} \label{subsec:dirservers}
@ -706,6 +742,14 @@ state and router lists (a \emph{directory}), and so other onion routers
can upload a signed summary of their keys, address, bandwidth, exit can upload a signed summary of their keys, address, bandwidth, exit
policy, etc (\emph{server descriptors}. policy, etc (\emph{server descriptors}.
[[mention that descriptors are signed with long-term keys; ORs publish
regularly to dirservers; policies for generating directories; key
rotation (link, onion, identity); Everybody already knows directory
keys; how to approve new nodes (advogato, sybil, captcha (RTT));
policy for handling connections with unknown ORs; diff-based
retrieval; diff-based consensus; separate liveness from descriptor
list]]
Of course, a variety of attacks remain. An adversary who controls a Of course, a variety of attacks remain. An adversary who controls a
directory server can track certain clients by providing different directory server can track certain clients by providing different
information --- perhaps by listing only nodes under its control information --- perhaps by listing only nodes under its control
@ -878,9 +922,23 @@ is also designed with authentication/authorization in mind -- if the
client doesn't include the right cookie with its request for service, client doesn't include the right cookie with its request for service,
the server doesn't even acknowledge its existence. the server doesn't even acknowledge its existence.
\Section{Analysis}
How well do we resist our chosen adversary?
How well do we meet stated goals?
Mention jurisdictional arbitrage.
Pull attacks and defenses into analysis as a subsection
\Section{Maintaining anonymity sets} \Section{Maintaining anonymity sets}
\label{sec:maintaining-anonymity} \label{sec:maintaining-anonymity}
[Put as much of this as possible under open issues.]
[what's an anonymity set?]
packet counting attacks work great against initiators. need to do some packet counting attacks work great against initiators. need to do some
level of obfuscation for that. standard link padding for passive link level of obfuscation for that. standard link padding for passive link
observers. long-range padding for people who own the first hop. are observers. long-range padding for people who own the first hop. are
@ -921,12 +979,15 @@ confirmation? does the hydra (many inputs, few outputs) topology work
better? are we going to get a hydra anyway because most nodes will be better? are we going to get a hydra anyway because most nodes will be
middleman nodes? middleman nodes?
using a circuit many times is good because it's less cpu work using a circuit many times is good because it's less cpu work.
good because of predecessor attacks with path rebuilding good because of predecessor attacks with path rebuilding.
bad because predecessor attacks can be more likely to link you with a bad because predecessor attacks can be more likely to link you with a
previous circuit since you're so verbose previous circuit since you're so verbose.
bad because each thing you do on that circuit is linked to the other bad because each thing you do on that circuit is linked to the other
things you do on that circuit things you do on that circuit.
how often to rotate?
how to decide when to exit from middle?
when to truncate and re-extend versus when to start new circuit?
Because Tor runs over TCP, when one of the servers goes down it seems Because Tor runs over TCP, when one of the servers goes down it seems
that all the circuits (and thus streams) going over that server must that all the circuits (and thus streams) going over that server must
@ -939,6 +1000,12 @@ done browsing, so we would expect a much higher churn rate than for
onion routing. Are there ways of allowing streams to survive the loss onion routing. Are there ways of allowing streams to survive the loss
of a node in the path? of a node in the path?
discuss topologies. Cite George's non-freeroutes paper. Maybe this
graf goes elsewhere.
discuss attracting users; incentives; usability.
Choosing paths and path lengths.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -984,6 +1051,8 @@ it could give you a bad IP that sends you somewhere else.
\Section{Future Directions and Open Problems} \Section{Future Directions and Open Problems}
\label{sec:conclusion} \label{sec:conclusion}
% Mention that we need to do TCP over Tor for reliability.
Tor brings together many innovations into Tor brings together many innovations into
a unified deployable system. But there are still several attacks that a unified deployable system. But there are still several attacks that
work quite well, as well as a number of sustainability and run-time work quite well, as well as a number of sustainability and run-time
@ -1048,7 +1117,7 @@ deploying a wider network. We will see what happens!
% since Middle English.] % since Middle English.]
% 'nymserver' % 'nymserver'
% 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer' % 'Cypherpunk', 'Cypherpunks', 'Cypherpunk remailer'
% 'Onion Routing design', 'onion router' [note capitalization]
% %
% 'Whenever you are tempted to write 'Very', write 'Damn' instead, so % 'Whenever you are tempted to write 'Very', write 'Damn' instead, so
% your editor will take it out for you.' -- Misquoted from Mark Twain % your editor will take it out for you.' -- Misquoted from Mark Twain