mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-24 04:13:28 +01:00
identical FAQ and HACKING files, now in /doc
svn:r194
parent f9c541bfcf
commit 8fb1056a7c

doc/FAQ (normal file, 111 lines added)
@ -0,0 +1,111 @@
The Onion Routing (TOR) Frequently Asked Questions
--------------------------------------------------

1. General.

1.1. What is tor?

Tor is an implementation of version 2 of Onion Routing.

Onion Routing is a connection-oriented anonymizing communication
service. Users build a layered block of asymmetric encryptions
(an "onion") which describes a source-routed path through a set of
nodes. Those nodes build a "virtual circuit" through the network, in which
each node knows its predecessor and successor, but no others. Traffic
flowing down the circuit is unwrapped by a symmetric key at each node,
which reveals the downstream node.

Basically, tor provides a distributed network of servers ("onion
routers"). Users bounce their TCP streams (web traffic, ftp, ssh, etc.)
around the routers, and recipients, observers, and even the routers
themselves have difficulty tracking the source of the stream.

1.2. Why's it called tor?

Because tor is the onion routing system. I kept telling people I was
working on onion routing, and they said "Neat. Which one?" Even if onion
routing has become a standard household term, this is the actual onion
routing project, started out of the Naval Research Lab.

(Theories about recursive acronyms are ok too.)


2. Compiling and installing.

[Read the README file for now; check back here once we've got packages,
etc. for you.]


3. Running tor.

3.1. What's this about roles? What kind of server should I run?

The same executable ("or") functions as both client and server, depending
on the value of the config variable named 'Role'. Role represents the
combination of tasks that this particular tor server will do. The default
Role (role 15) is an onion router: it listens for onion routers, listens
for onion proxies, listens for application proxies, and connects to
all other onion routers it learns about. A directory server (role 63)
does all of the above and also serves directory requests. A simple
onion proxy (role 8), on the other hand, only listens for application
proxies. See part 3.1 of the HACKING document for more technical details.

3.2. So I can just run a full onion router and join the network?

No. Users should run just an onion proxy (use the 'oprc' config file).
If you start up a full onion router, the rest of the routers in the
system won't recognize you, so they will reject your handshake attempts.

3.3. How do I join the network then?

If you just want to use the onion routing network, you can run a proxy
and you're all set. If you want to run a router, you must convince
the directory server operators (currently arma@mit.edu) that you're a
trustworthy person. From there, the operators add you to the directory,
which propagates out to the rest of the network. All nodes will know
about you within an hour.

3.4. I want to run a directory server too.

If you run a very reliable node, you plan to be around for a long time,
and you want to spend some time ensuring that router operators are
people we know and like, we may want you to run a directory server
too. We must manually add you to the 'dirservers' file that's part of
the distribution; users will only know about you when they upgrade to
a new version. Of course, you can always just start up your router as a
directory server too --- but users won't know to ask you for directories,
and more importantly, you'll never learn from the real directory servers
about recently joined routers.


4. Development.

4.1. Who's doing this?

4.2. Can I help?

4.3. I've got a bug.


5. Anonymity.

5.1. So I'm totally anonymous if I use tor?

5.2. Where can I learn more about anonymity?


6. Comparison to related projects.

6.1. Onion Routing.

Tor *is* onion routing.

6.2. Freedom.


7. Protocol and application support.

7.1. http? ftp? udp? socks? mozilla?

doc/HACKING (normal file, 117 lines added)
@ -0,0 +1,117 @@
0. Intro.
Onion Routing is still very much in the development stage. This document
aims to get you started in the right direction if you want to understand
the code, add features, fix bugs, etc.

Read the README file first, so you can get familiar with the basics.

1. The programs.

1.1. "or". This is the main program here. It functions as both a server
and a client, depending on which config file you give it. ...

2. The pieces.

2.1. Routers. Onion routers, as far as the 'or' program is concerned,
are a bunch of data items that are loaded into the router_array when
the program starts. After it's loaded, the router information is never
changed. When a new OR connection is started (see below), the relevant
information is copied from the router struct to the connection struct.

2.2. Connections. A connection is a long-standing TCP socket between
nodes. A connection is named based on what it's connected to -- an "OR
connection" has an onion router on the other end, an "OP connection" has
an onion proxy on the other end, an "exit connection" has a website or
other server on the other end, and an "AP connection" has an application
proxy (and thus a user) on the other end.

2.3. Circuits. A circuit is a single conversation between two
participants over the onion routing network. One end of the circuit has
an AP connection, and the other end has an exit connection. AP and exit
connections have only one circuit associated with them (and thus these
connection types are closed when the circuit is closed), whereas OP and
OR connections multiplex many circuits at once, and stay standing even
when there are no circuits running over them.

2.4. Cells. Some connections, specifically OR and OP connections, speak
"cells". This means that data over that connection is bundled into
128-byte packets (8 bytes of header and 120 bytes of payload). Each cell
has a type, or "command", which indicates what it's for.
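
The cell sizes above can be sketched in C as follows. This is only an
illustration, not the layout used by the 'or' program: the text does not
say how the 8 header bytes are divided up, so the header is kept opaque
here, and storing the command in header[0] is an assumption. The names
cell_t, cells_needed and cell_pack are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CELL_HEADER_SIZE  8    /* bytes of header, from the text */
#define CELL_PAYLOAD_SIZE 120  /* bytes of payload, from the text */

/* Hypothetical cell layout. Only the two sizes come from the document;
 * the position of the command byte is an assumption. */
typedef struct {
  uint8_t header[CELL_HEADER_SIZE];   /* header[0] holds the command here */
  uint8_t payload[CELL_PAYLOAD_SIZE];
} cell_t;

/* Number of cells needed to carry len bytes of data. */
size_t cells_needed(size_t len) {
  return (len + CELL_PAYLOAD_SIZE - 1) / CELL_PAYLOAD_SIZE;
}

/* Fill one cell with up to CELL_PAYLOAD_SIZE bytes from data, padding the
 * rest of the payload with zeros. Returns the number of bytes consumed. */
size_t cell_pack(cell_t *cell, uint8_t command,
                 const uint8_t *data, size_t len) {
  size_t n = len < CELL_PAYLOAD_SIZE ? len : CELL_PAYLOAD_SIZE;
  assert(cell != NULL);
  memset(cell, 0, sizeof(*cell));
  cell->header[0] = command;
  if (n > 0)
    memcpy(cell->payload, data, n);
  return n;
}
```

A caller with a longer buffer would call cell_pack repeatedly, advancing
by the returned count each time, until all cells_needed(len) cells have
been filled.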


3. Important parameters in the code.

3.1. Role.
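
The FAQ describes Role as a combination of tasks and gives three values:
15 for an onion router, 63 for a directory server, and 8 for a simple
onion proxy. Those numbers are consistent with a bitmask, which the
sketch below illustrates. The flag names and the assignment of tasks to
individual bits are assumptions made for this example (the FAQ only pins
down that 8 alone means "listen for application proxies"), so check the
source before relying on them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical flag names and bit values. Only the combined values
 * 8, 15 and 63 appear in the FAQ. */
enum {
  ROLE_OR_LISTEN      = 1,   /* listen for onion routers */
  ROLE_OR_CONNECT_ALL = 2,   /* connect to all known onion routers */
  ROLE_OP_LISTEN      = 4,   /* listen for onion proxies */
  ROLE_AP_LISTEN      = 8,   /* listen for application proxies */
  ROLE_DIR_LISTEN     = 16,  /* assumed: listen for directory requests */
  ROLE_DIR_SERVER     = 32   /* assumed: act as a directory server */
};

/* Return nonzero if every bit of task is set in role. */
int role_has(int role, int task) {
  assert(task != 0);
  return (role & task) == task;
}

/* Print the tasks that a Role value enables, one per line. */
void role_describe(int role) {
  if (role_has(role, ROLE_OR_LISTEN))      puts("listens for onion routers");
  if (role_has(role, ROLE_OR_CONNECT_ALL)) puts("connects to all onion routers");
  if (role_has(role, ROLE_OP_LISTEN))      puts("listens for onion proxies");
  if (role_has(role, ROLE_AP_LISTEN))      puts("listens for application proxies");
  if (role_has(role, ROLE_DIR_LISTEN))     puts("listens for directory requests");
  if (role_has(role, ROLE_DIR_SERVER))     puts("acts as a directory server");
}
```

Under these assumed values, role_describe(8) prints only the application
proxy line, and role_describe(15) prints the first four lines.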


4. Robustness features.

4.1. Bandwidth throttling. Each cell-speaking connection has a maximum
bandwidth it can use, as specified in the routers.or file. Bandwidth
throttling occurs on both the sending side and the receiving side. The
sending side sends cells at regularly spaced intervals (e.g., a connection
with a bandwidth of 12800B/s would queue one 128-byte cell every 10ms).
The receiving side protects against misbehaving servers that send cells
more frequently by using a simple token bucket:

Each connection has a token bucket with a specified capacity. Tokens are
added to the bucket each second (when the bucket is full, new tokens
are discarded). Each token represents permission to receive one byte
from the network --- to receive a byte, the connection must remove a
token from the bucket. Thus if the bucket is empty, that connection must
wait until more tokens arrive. The number of tokens we add each second
enforces a long-term average rate of incoming bytes, yet we still permit
short-term bursts above the allowed bandwidth. Currently bucket sizes are
set to ten seconds' worth of traffic.
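
A minimal sketch of such a bucket, together with the sender-side cell
spacing from the 12800B/s example, might look like this in C. The type
and function names are hypothetical, and starting with a full bucket is
an assumption; the text only fixes the one-token-per-byte rule, the
once-per-second refill, and the ten-second capacity.

```c
#include <assert.h>
#include <stddef.h>

#define CELL_SIZE     128  /* bytes per cell, from section 2.4 */
#define BURST_SECONDS 10   /* bucket holds ten seconds of traffic */

typedef struct {
  size_t rate;      /* tokens (bytes) added per second */
  size_t capacity;  /* maximum tokens the bucket can hold */
  size_t tokens;    /* tokens currently available */
} token_bucket_t;

/* Set up a bucket for a connection limited to bytes_per_sec.
 * Starting full is an assumption of this sketch. */
void bucket_init(token_bucket_t *b, size_t bytes_per_sec) {
  assert(b != NULL);
  b->rate = bytes_per_sec;
  b->capacity = bytes_per_sec * BURST_SECONDS;
  b->tokens = b->capacity;
}

/* Call once per second: add tokens, discarding any that overflow. */
void bucket_refill(token_bucket_t *b) {
  b->tokens += b->rate;
  if (b->tokens > b->capacity)
    b->tokens = b->capacity;
}

/* Ask to read want bytes. Returns how many may be read right now and
 * removes that many tokens; 0 means the connection must wait. */
size_t bucket_take(token_bucket_t *b, size_t want) {
  size_t n = want < b->tokens ? want : b->tokens;
  b->tokens -= n;
  return n;
}

/* Sender side: milliseconds between cells at a given bandwidth. */
unsigned long cell_interval_ms(size_t bytes_per_sec) {
  assert(bytes_per_sec > 0);
  return 1000UL * CELL_SIZE / bytes_per_sec;
}
```

With a rate of 12800B/s the bucket holds 128000 tokens, so a burst of up
to 128000 bytes can be read at once, after which reads are limited to
12800 bytes per refill.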

The bandwidth throttling uses TCP to push back when we stop reading.
We extend it with token buckets to allow more flexibility for traffic
bursts.

4.2. Data congestion control. Even with the above bandwidth throttling,
we still need to worry about congestion, either accidental or intentional.
If a lot of people make circuits into the same node, and they all come out
through the same connection, then that connection may become saturated
(be unable to send out data cells as quickly as it wants to). An adversary
can make a 'put' request through the onion routing network to a webserver
he owns, and then refuse to read any of the bytes at the webserver end
of the circuit. These bottlenecks can propagate back through the entire
network, mucking up everything.

To handle this congestion, each circuit starts out with a receive
window at each node of 100 cells -- it is willing to receive at most 100
cells on that circuit. (It handles each direction separately, so that's
really 100 cells forward and 100 cells back.) The edge of the circuit
is willing to create at most 100 cells from data coming from outside the
onion routing network. Nodes in the middle of the circuit will tear down
the circuit if a data cell arrives when the receive window is 0. When
data has traversed the network, the edge node buffers it on its outbuf,
and evaluates whether to respond with a 'sendme' acknowledgement: if its
outbuf is not too full, and its receive window is less than 90, then it
queues a 'sendme' cell backwards in the circuit. Each node that receives
the sendme increments its window by 10 and passes the cell onward.
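
The window bookkeeping for one direction of one circuit can be sketched
as below. The names are hypothetical, the outbuf_too_full flag stands in
for a fullness test that the text does not specify, and having the edge
node bump its own window when it queues a sendme is an assumption of
this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_START     100  /* initial receive window, in cells */
#define WINDOW_INCREMENT 10   /* cells granted by one sendme */

/* Receive window for one direction of one circuit at one node. */
typedef struct {
  int receive_window;
} circ_window_t;

void window_init(circ_window_t *w) {
  assert(w != NULL);
  w->receive_window = WINDOW_START;
}

/* A data cell arrived. Returns 0 if it is accepted, or -1 if the window
 * was already 0 and the circuit should be torn down. */
int window_on_data_cell(circ_window_t *w) {
  if (w->receive_window <= 0)
    return -1;
  w->receive_window--;
  return 0;
}

/* A sendme arrived at this node: open the window by 10 cells. The
 * caller then passes the sendme onward. */
void window_on_sendme(circ_window_t *w) {
  w->receive_window += WINDOW_INCREMENT;
}

/* Edge node, after buffering data on its outbuf. Returns 1 if a sendme
 * should be queued backwards in the circuit, else 0. Bumping the edge's
 * own window here is an assumption. */
int edge_maybe_sendme(circ_window_t *w, int outbuf_too_full) {
  if (outbuf_too_full ||
      w->receive_window >= WINDOW_START - WINDOW_INCREMENT)
    return 0;
  w->receive_window += WINDOW_INCREMENT;
  return 1;
}
```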

In practice, all the nodes in the circuit maintain a receive window
close to 100 except the exit node, which stays around 0, periodically
receiving a sendme and reading 10 more data cells from the webserver.
In this way we can use pretty much all of the available bandwidth for
data, but gracefully back off when faced with multiple circuits (a new
sendme arrives only after some cells have traversed the entire network),
stalled network connections, or attacks.

We don't need to reimplement full TCP windows, with sequence numbers,
the ability to drop cells when we're full, etc., because the TCP streams
already guarantee in-order delivery of each cell. Rather than trying
to build some sort of TCP-on-TCP scheme, we implement this minimal data
congestion control; so far it's enough.

4.3. Router twins. In many cases when we ask for a router with a given
address and port, we really mean a router that knows a given key. Router
twins are two or more routers that all share the same private key. We thus
give routers extra flexibility in choosing the next hop in the circuit:
if some of the twins are down or slow, they can choose the more available
ones.

Currently the code tries for the primary router first, and if it's down,
chooses the first available twin.
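
That selection rule can be sketched as follows. The router_t record and
its fields are hypothetical and do not reflect the real router struct;
in particular, key_id is a plain string standing in for "shares the same
private key", and is_up stands in for whatever availability check the
code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical router record, for illustration only. */
typedef struct {
  const char *address;
  int port;
  const char *key_id;  /* twins have equal key_id strings */
  int is_up;           /* nonzero if the router is reachable */
} router_t;

/* Pick the next hop: the primary router if it is up, otherwise the
 * first available router sharing its key. Returns NULL if neither the
 * primary nor any twin is up. */
const router_t *choose_next_hop(const router_t *routers, size_t n,
                                size_t primary) {
  size_t i;
  assert(routers != NULL && primary < n);
  if (routers[primary].is_up)
    return &routers[primary];
  for (i = 0; i < n; i++) {
    if (i != primary && routers[i].is_up &&
        strcmp(routers[i].key_id, routers[primary].key_id) == 0)
      return &routers[i];
  }
  return NULL;
}
```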