many small changes throughout

svn:r714
Roger Dingledine 2003-11-02 09:56:52 +00:00
parent 27dd67e3a0
commit b6d5a56e84

@ -124,7 +124,7 @@ proxy'' for each
supported application protocol---most supported application protocol---most
of which were never written, so many applications were never supported. of which were never written, so many applications were never supported.
Tor uses the standard and near-ubiquitous SOCKS Tor uses the standard and near-ubiquitous SOCKS
\cite{socks4,socks5} proxy interface, allowing us to support most TCP-based \cite{socks4} proxy interface, allowing us to support most TCP-based
programs without modification. This design change allows Tor to programs without modification. This design change allows Tor to
use the filtering features of privacy-enhancing use the filtering features of privacy-enhancing
application-level proxies such as Privoxy \cite{privoxy} without having to application-level proxies such as Privoxy \cite{privoxy} without having to
@ -142,7 +142,7 @@ circuit, to improve efficiency and anonymity.
\item \textbf{No mixing, padding, or traffic shaping:} The original \item \textbf{No mixing, padding, or traffic shaping:} The original
Onion Routing design called for batching and reordering the cells arriving Onion Routing design called for batching and reordering the cells arriving
from each circuit and the ability to do padding between onion routers and, from each circuit. It also included padding between onion routers and,
in a later design, between onion in a later design, between onion
proxies (that is, users) and onion routers \cite{or-ih96,or-jsac98}. proxies (that is, users) and onion routers \cite{or-ih96,or-jsac98}.
The tradeoff between padding protection and cost was discussed, but no The tradeoff between padding protection and cost was discussed, but no
@ -645,24 +645,33 @@ used, and expire old used circuits that are no longer in use. Thus
even heavy users spend a negligible amount of time and CPU in even heavy users spend a negligible amount of time and CPU in
building circuits, but only a limited number of requests can be linked building circuits, but only a limited number of requests can be linked
to each other by a given exit node. Also, because circuits are built to each other by a given exit node. Also, because circuits are built
in the background, failed routers do not affects user experience. in the background, failed routers do not affect user experience.
\subsubsection{Constructing a circuit} \subsubsection{Constructing a circuit}
Users construct each incrementally, negotiating a symmetric key with Users construct a circuit incrementally, negotiating a symmetric key with
each hop one at a time. To begin creating a new circuit, the user each hop one at a time. To begin creating a new circuit, the user
(call her Alice) sends a \emph{create} cell to the first node in her (call her Alice) sends a \emph{create} cell to the first node in her
chosen path. The cell's payload is the first half of the chosen path. The cell's payload is the first half of the
Diffie-Hellman handshake, encrypted to the onion key of the OR (call Diffie-Hellman handshake, encrypted to the onion key of the OR (call
him Bob). Bob responds with a \emph{created} cell containg the second him Bob). Bob responds with a \emph{created} cell containing the second
half of the DH handshake, along with a hash of the negotiated key half of the DH handshake, along with a hash of the negotiated key
$K=g^{xy}$. This protocol tries to achieve unilateral entity $K=g^{xy}$.
To extend a circuit past the first hop, Alice sends a \emph{relay extend}
cell to the last node in the circuit, specifying the address of the new
OR and an encrypted $g^x$ for it. That node copies the half-handshake
into a \emph{create} cell, and passes it to the new OR to extend the
circuit. When it responds with a \emph{created} cell, the penultimate OR
copies the payload into a \emph{relay extended} cell and passes it back.
% Nick: please fix my "that OR" pronouns -RD
The onion-level handshake protocol achieves unilateral entity
authentication (Alice knows she's handshaking with Bob, Bob doesn't authentication (Alice knows she's handshaking with Bob, Bob doesn't
care who is opening the circuit---Alice has no key and is trying to care who is opening the circuit---Alice has no key and is trying to
remain anonymous); unilateral key authentication (Alice and Bob remain anonymous) and unilateral key authentication (Alice and Bob
agree on a key, and Alice knows Bob is the only other person who could agree on a key, and Alice knows Bob is the only other person who should
know it). We also want perfect forward know it). We also want perfect forward secrecy and key freshness.
secrecy, key freshness, etc.
\begin{equation} \begin{equation}
\begin{aligned} \begin{aligned}
@ -679,19 +688,6 @@ don't have enough room in a single cell for a public key and also a
signature. Preliminary analysis with the NRL protocol analyzer \cite{meadows96} signature. Preliminary analysis with the NRL protocol analyzer \cite{meadows96}
shows the above protocol to be secure (including providing PFS) under the shows the above protocol to be secure (including providing PFS) under the
traditional Dolev-Yao model. traditional Dolev-Yao model.
% cite Cathy? -RD
% did I use the buzzwords correctly? -RD
% Hm. I think that this paragraph could go earlier in expository
% order: we describe how to build whole circuit, then explain the
% protocol in more detail. -NM
To extend a circuit past the first hop, Alice sends a \emph{relay extend}
cell to the last node in the circuit, specifying the address of the new
OR and an encrypted $g^x$ for it. That node copies the half-handshake
into a \emph{create} cell, and passes it to the new OR to extend the
circuit. When it responds with a \emph{created} cell, the penultimate OR
copies the payload into a \emph{relay extended} cell and passes it back.
% Nick: please fix my "that OR" pronouns -RD
\subsubsection{Relay cells} \subsubsection{Relay cells}
Once Alice has established the circuit (so she shares a key with each Once Alice has established the circuit (so she shares a key with each
@ -773,37 +769,36 @@ but are still willing to read.
\SubSection{Integrity checking on streams} \SubSection{Integrity checking on streams}
In the old Onion Routing design, traffic was vulnerable to a Because the old Onion Routing design used a stream cipher, traffic was
malleability attack: an attacker could make changes to an encrypted vulnerable to a malleability attack: even though the attacker could not
decrypt cells, he could make changes to an encrypted
cell to create corresponding changes to the data leaving the network. cell to create corresponding changes to the data leaving the network.
(Even an external adversary could do this, despite link encryption!) (Even an external adversary could do this, despite link encryption!)
This weakness allowed an adversary to change a create cell to a destroy This weakness allowed an adversary to change a padding cell to a destroy
cell; change the destination address in a relay begin cell to the cell; change the destination address in a relay begin cell to the
adversary's webserver; or change a user on an ftp connection from adversary's webserver; or change a user on an ftp connection from
typing ``dir'' to typing ``delete *''. Any node or observer along the typing ``dir'' to typing ``delete *''. Any node or external adversary
path could introduce such corruption in a stream. along the circuit could introduce such corruption in a stream.
Tor prevents external adversaries by mounting this attack simply by Tor prevents external adversaries from mounting this attack simply by
using TLS. Addressing the insider malleability attack, however, is using TLS. Addressing the insider malleability attack, however, is
more complex. more complex.
Rather than doing integrity checking of the relay cells at each hop, We could do integrity checking of the relay cells at each hop, either
which would increase packet size by including hashes or by using a cipher mode like EAX \cite{eax}.
by a function of path length\footnote{This is also the argument against But we don't want the added message-expansion overhead at each hop, and
using recent cipher modes like EAX \cite{eax}---we don't want the added we don't want to leak the path length (or pad to some max path length).
message-expansion overhead at each hop, and we don't want to leak the path Because we've already accepted that our design is vulnerable to end-to-end
length (or pad to some max path length).}, we choose to timing attacks, we can perform integrity checking only at the edges of
% accept passive timing attacks, the circuit without introducing any new anonymity attacks. When Alice
% (How? I don't get it. Do we mean end-to-end traffic negotiates a key
% confirmation attacks? -NM) with each hop, they both start a SHA-1 with some derivative of that key,
and perform integrity % Not just the exit hop, but each hop: any hop can be an exit node. -RD
checking only at the edges of the circuit. When Alice negotiates a key
with the exit hop, they both start a SHA-1 with some derivative of that key,
thus starting out with randomness that only the two of them know. From thus starting out with randomness that only the two of them know. From
then on they each incrementally add all the data bytes flowing across then on they each incrementally add to the SHA-1 all the data bytes
the stream to the SHA-1, and each relay cell includes the first 4 bytes entering or exiting from the circuit, and each such relay cell includes
of the current value of the hash. the first 4 bytes of the current value of the hash.
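
The running-hash check described in this hunk can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tor implementation: the class name `RunningDigest`, the `key_seed` argument (standing in for "some derivative of that key", which the text does not specify), and the choice to leave the hash state untouched when a tag fails to verify are all hypothetical.

```python
import hashlib
import hmac

DIGEST_BYTES = 4  # the text says each relay cell carries the first 4 bytes


class RunningDigest:
    """One edge of a circuit's integrity check (hypothetical helper)."""

    def __init__(self, key_seed: bytes) -> None:
        # Assumption: key_seed is key-derived material known only to the
        # two edges; the derivation itself is not given in the text.
        self._sha = hashlib.sha1(key_seed)

    def tag(self, payload: bytes) -> bytes:
        """Absorb an outgoing payload and return the 4-byte tag for its cell."""
        self._sha.update(payload)
        return self._sha.digest()[:DIGEST_BYTES]

    def verify(self, payload: bytes, tag: bytes) -> bool:
        """Check an incoming payload; absorb it only if the tag matches."""
        trial = self._sha.copy()
        trial.update(payload)
        if not hmac.compare_digest(trial.digest()[:DIGEST_BYTES], tag):
            return False  # assumption: a bad cell leaves the state unchanged
        self._sha = trial
        return True
```

Both edges construct a `RunningDigest` from the same seed. Because each tag depends on every byte absorbed so far, a cell altered in transit fails verification unless the attacker guesses the 4-byte tag.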
The attacker must be able to guess all previous bytes between Alice The attacker must be able to guess all previous bytes between Alice
and Bob on that circuit (including the pseudorandomness from the key and Bob on that circuit (including the pseudorandomness from the key
@ -812,7 +807,6 @@ cell. Attacks on SHA-1 where the adversary can incrementally add to a
hash to produce a new valid hash don't work, hash to produce a new valid hash don't work,
because all hashes are end-to-end encrypted across the circuit. because all hashes are end-to-end encrypted across the circuit.
The computational overhead isn't so bad, compared to doing an AES The computational overhead isn't so bad, compared to doing an AES
% XXX We never say we use AES. Say it somewhere above? -RD
crypt at each hop in the circuit. We use only four bytes per cell to crypt at each hop in the circuit. We use only four bytes per cell to
minimize overhead; the chance that an adversary will correctly guess a minimize overhead; the chance that an adversary will correctly guess a
valid hash, plus the payload of the current cell, is acceptably low, given valid hash, plus the payload of the current cell, is acceptably low, given
@ -962,6 +956,9 @@ new to the networking literature, some proposed approaches are a poor
fit to anonymous networks. For example, solutions based on backtracking fit to anonymous networks. For example, solutions based on backtracking
harmful traffic \cite{XXX} could allow an anonymity-breaking harmful traffic \cite{XXX} could allow an anonymity-breaking
adversary to exploit the backtracking mechanism. adversary to exploit the backtracking mechanism.
% XXX I don't see how you would do DDoS through Tor. And even if you
% did, it seems ok to track you down. Should we remove this
% paragraph? -RD
Attackers also have an opportunity to attack the Tor network by mounting Attackers also have an opportunity to attack the Tor network by mounting
attacks on its hosts and network links. Disrupting a single circuit or attacks on its hosts and network links. Disrupting a single circuit or
@ -1166,51 +1163,52 @@ signs its current opinion, and broadcasts it to the other directory
servers; then in round two, each server rebroadcasts all the signed servers; then in round two, each server rebroadcasts all the signed
opinions it has received. At this point all directory servers check opinions it has received. At this point all directory servers check
to see whether any server has signed multiple opinions in the same to see whether any server has signed multiple opinions in the same
period. If so, the server is either broken or cheating, so protocol period. If so, the server is either broken or cheating, so the protocol
stops and notifies the administrators, who either remove the cheater stops and notifies the administrators, who either remove the cheater
or wait for the broken server to be fixed. If there are no or wait for the broken server to be fixed. If there are no
discrepancies, each directory server then locally computes algorithm discrepancies, each directory server then locally computes an algorithm
(described below)
on the set of opinions, resulting in a uniform shared directory. In on the set of opinions, resulting in a uniform shared directory. In
round three servers sign this directory and broadcast it; and finally round three servers sign this directory and broadcast it; and finally
in round four the servers rebroadcast the directory and all the in round four the servers rebroadcast the directory and all the
signatures. If any directory server drops out of the network, its signatures. If any directory server drops out of the network, its
signature is not included on the file directory. signature is not included on the final directory.
The rebroadcast steps ensure that a directory server is heard by The rebroadcast steps ensure that a directory server is heard by
either all of the other servers or none of them, assuming that any two either all of the other servers or none of them, assuming that any two
directories can talk directly, or via a third directory (some of the directory servers can talk directly, or via a third directory server (some of the
links between directory servers may be down). Broadcasts are feasible links between directory servers may be down). Broadcasts are feasible
because there are relatively few directory servers (currently 3, but we expect because there are relatively few directory servers (currently 3, but we expect
to use as many as 9 as the network scales). The actual local algorithm to transition to 9 as the network scales). The actual local algorithm
for computing the shared directory is a straightforward threshold for computing the shared directory is a straightforward threshold
voting process: we include an OR if a majority of directory servers voting process: we include an OR if a majority of directory servers
believe it to be good. believe it to be good.
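
The threshold vote described above can be sketched as follows. This is a minimal illustration, assuming each directory server's opinion is reduced to the set of ORs it believes to be good. The function name `consensus_directory` and the dictionary layout are hypothetical, and signing and the broadcast rounds are left out.

```python
from collections import Counter


def consensus_directory(opinions: dict[str, set[str]]) -> set[str]:
    """Return the ORs that a strict majority of directory servers call good.

    `opinions` maps a directory server's name to the set of ORs it
    believes to be good (a hypothetical simplification of an opinion).
    """
    votes = Counter(router for good in opinions.values() for router in good)
    majority = len(opinions) // 2 + 1  # strict majority of the servers
    return {router for router, count in votes.items() if count >= majority}
```

With three directory servers, an OR is included once two of them list it. Every server that runs this on the same set of opinions computes the same directory, which is why the rebroadcast rounds matter.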
To avoid attacks where a router connects to all the directory servers
but refuses to relay traffic from other routers, the directory servers
must build circuits and use them to anonymously test router reliability
\cite{mix-acc}.
When a client Alice retrieves a consensus directory, she uses it if it When a client Alice retrieves a consensus directory, she uses it if it
is signed by a majority of the directory servers she knows. is signed by a majority of the directory servers she knows.
Using directory servers rather than flooding provides simplicity and Using directory servers rather than flooding provides simplicity and
flexibility. For example, they don't complicate the analysis when we flexibility. For example, they don't complicate the analysis when we
start experimenting with non-clique network topologies. And because start experimenting with non-clique network topologies. And because
the directories are signed, they can be cached by other onion routers, the directories are signed, they can be cached by other onion routers.
or indeed by any server. Thus directory servers are not a performance Thus directory servers are not a performance
bottleneck when we have many users, and do not aid traffic analysis by bottleneck when we have many users, and do not aid traffic analysis by
forcing clients to periodically announce their existence to any forcing clients to periodically announce their existence to any
central point. central point.
% Mention Hydra as an example of non-clique topologies. -NM, from RD % Mention Hydra as an example of non-clique topologies. -NM, from RD
% also find some place to integrate that dirservers have to actually
% lay test circuits and use them, otherwise routers could connect to
% the dirservers but discard all other traffic.
% in some sense they're like reputation servers in \cite{mix-acc} -RD
\Section{Rendezvous points: location privacy} \Section{Rendezvous points: location privacy}
\label{sec:rendezvous} \label{sec:rendezvous}
Rendezvous points are a building block for \emph{location-hidden Rendezvous points are a building block for \emph{location-hidden
services} (also known as ``responder anonymity'') in the Tor services} (also known as ``responder anonymity'') in the Tor
network. Location-hidden services allow a server Bob to a TCP network. Location-hidden services allow a server Bob to offer a TCP
service, such as a webserver, without revealing the IP of his service. service, such as a webserver, without revealing the IP of his service.
Besides allowing Bob to provide services anonymously, location Besides allowing Bob to provide services anonymously, location
privacy also seeks to provide some protection against DDoS attacks: privacy also seeks to provide some protection against DDoS attacks:
@ -1219,15 +1217,14 @@ rather than just Bob's IP.
\subsection{Goals for rendezvous points} \subsection{Goals for rendezvous points}
\label{subsec:rendezvous-goals} \label{subsec:rendezvous-goals}
In addition to our other goals, have tried to provide the following Our design for location-hidden servers has the following properties:
properties in our design for location-hidden servers:
\begin{tightlist} \begin{tightlist}
\item[Flood-proof:] An attacker should not be able to flood Bob with traffic \item[Flood-proof:] An attacker should not be able to flood Bob with traffic
simply by sending may requests to Bob's public location. Thus, Bob needs a simply by sending many requests to talk to Bob. Thus, Bob needs a
way to filter incoming requests. way to filter incoming requests.
\item[Robust:] Bob should be able to maintain a long-term pseudonymous \item[Robust:] Bob should be able to maintain a long-term pseudonymous
identity even in the presence of OR failure. Thus, Bob's identity must not identity even in the presence of router failure. Thus, Bob's identity
be tied to a single OR. must not be tied to a single OR.
\item[Smear-resistant:] An attacker should not be able to use rendezvous \item[Smear-resistant:] An attacker should not be able to use rendezvous
points to smear an OR. That is, if a social attacker tries to host a points to smear an OR. That is, if a social attacker tries to host a
location-hidden service that is illegal or disreputable, it should not location-hidden service that is illegal or disreputable, it should not