mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-27 22:03:31 +01:00
Merge branch 'doxygen_libs'
commit
8933789fef
@ -256,6 +256,8 @@ TAB_SIZE = 8
|
|||||||
|
|
||||||
ALIASES =
|
ALIASES =
|
||||||
|
|
||||||
|
ALIASES += refdir{1}="\ref @top_srcdir@/src/\1 \"\1\""
|
||||||
|
|
||||||
# This tag can be used to specify a number of word-keyword mappings (TCL only).
|
# This tag can be used to specify a number of word-keyword mappings (TCL only).
|
||||||
# A mapping has the form "name=value". For example adding "class=itcl::class"
|
# A mapping has the form "name=value". For example adding "class=itcl::class"
|
||||||
# will allow you to use the command class in the itcl::class meaning.
|
# will allow you to use the command class in the itcl::class meaning.
|
||||||
|
@ -1,124 +1,6 @@
|
|||||||
|
|
||||||
## Overview ##
|
## Overview ##
|
||||||
|
|
||||||
This document describes the general structure of the Tor codebase, how
|
|
||||||
it fits together, what functionality is available for extending Tor,
|
|
||||||
and gives some notes on how Tor got that way.
|
|
||||||
|
|
||||||
Tor remains a work in progress: We've been working on it for nearly two
|
|
||||||
decades, and we've learned a lot about good coding since we first
|
|
||||||
started. This means, however, that some of the older pieces of Tor will
|
|
||||||
have some "code smell" in them that could stand a brisk
|
|
||||||
refactoring. So when I describe a piece of code, I'll sometimes give a
|
|
||||||
note on how it got that way, and whether I still think that's a good
|
|
||||||
idea.
|
|
||||||
|
|
||||||
The first drafts of this document were written in the Summer and Fall of
|
|
||||||
2015, when Tor 0.2.6 was the most recent stable version, and Tor 0.2.7
|
|
||||||
was under development. There is a revision in progress (as of late
|
|
||||||
2019), to bring it up to pace with Tor as of version 0.4.2. If you're
|
|
||||||
reading this far in the future, some things may have changed. Caveat
|
|
||||||
haxxor!
|
|
||||||
|
|
||||||
This document is not an overview of the Tor protocol. For that, see the
|
|
||||||
design paper and the specifications at https://spec.torproject.org/ .
|
|
||||||
|
|
||||||
For more information about Tor's coding standards and some helpful
|
|
||||||
development tools, see doc/HACKING in the Tor repository.
|
|
||||||
|
|
||||||
|
|
||||||
### The very high level ###
|
|
||||||
|
|
||||||
Ultimately, Tor runs as an event-driven network daemon: it responds to
|
|
||||||
network events, signals, and timers by sending and receiving things over
|
|
||||||
the network. Clients, relays, and directory authorities all use the
|
|
||||||
same codebase: the Tor process will run as a client, relay, or authority
|
|
||||||
depending on its configuration.
|
|
||||||
|
|
||||||
Tor has a few major dependencies, including Libevent (used to tell which
|
|
||||||
sockets are readable and writable), OpenSSL or NSS (used for many encryption
|
|
||||||
functions, and to implement the TLS protocol), and zlib (used to
|
|
||||||
compress and uncompress directory information).
|
|
||||||
|
|
||||||
Most of Tor's work today is done in a single event-driven main thread.
|
|
||||||
Tor also spawns one or more worker threads to handle CPU-intensive
|
|
||||||
tasks. (Right now, this only includes circuit encryption and the more
|
|
||||||
expensive compression algorithms.)
|
|
||||||
|
|
||||||
On startup, Tor initializes its libraries, reads and responds to its
|
|
||||||
configuration files, and launches a main event loop. At first, the only
|
|
||||||
events that Tor listens for are a few signals (like TERM and HUP), and
|
|
||||||
one or more listener sockets (for different kinds of incoming
|
|
||||||
connections). Tor also configures several timers to handle periodic
|
|
||||||
events. As Tor runs over time, other events will open, and new events
|
|
||||||
will be scheduled.
|
|
||||||
|
|
||||||
The codebase is divided into a few top-level subdirectories, each of
|
|
||||||
which contains several sub-modules.
|
|
||||||
|
|
||||||
* `src/ext` -- Code maintained elsewhere that we include in the Tor
|
|
||||||
source distribution.
|
|
||||||
|
|
||||||
* `src/lib` -- Lower-level utility code, not necessarily tor-specific.
|
|
||||||
|
|
||||||
* `src/trunnel` -- Automatically generated code (from the Trunnel
|
|
||||||
tool): used to parse and encode binary formats.
|
|
||||||
|
|
||||||
* `src/core` -- Networking code that implements the central parts of
|
|
||||||
the Tor protocol and main loop.
|
|
||||||
|
|
||||||
* `src/feature` -- Aspects of Tor (like directory management, running a
|
|
||||||
relay, running a directory authority, managing a list of nodes,
|
|
||||||
running and using onion services) that are built on top of the
|
|
||||||
mainloop code.
|
|
||||||
|
|
||||||
* `src/app` -- Highest-level functionality; responsible for setting up
|
|
||||||
and configuring the Tor daemon, making sure all the lower-level
|
|
||||||
modules start up when required, and so on.
|
|
||||||
|
|
||||||
* `src/tools` -- Binaries other than Tor that we produce. Currently this
|
|
||||||
is tor-resolve, tor-gencert, and the tor_runner.o helper module.
|
|
||||||
|
|
||||||
* `src/test` -- unit tests, regression tests, and a few integration
|
|
||||||
tests.
|
|
||||||
|
|
||||||
In theory, the above parts of the codebase are sorted from lowest-level to
|
|
||||||
highest-level, where high-level code is only allowed to invoke lower-level
|
|
||||||
code, and lower-level code never includes or depends on code of a higher
|
|
||||||
level. In practice, this refactoring is incomplete: The modules in `src/lib`
|
|
||||||
are well-factored, but there are many layer violations ("upward
|
|
||||||
dependencies") in `src/core` and `src/feature`. We aim to eliminate those
|
|
||||||
over time.
|
|
||||||
|
|
||||||
### Some key high-level abstractions ###
|
|
||||||
|
|
||||||
The most important abstractions at Tor's high level are Connections,
|
|
||||||
Channels, Circuits, and Nodes.
|
|
||||||
|
|
||||||
A 'Connection' represents a stream-based information flow. Most
|
|
||||||
connections are TCP connections to remote Tor servers and clients. (But
|
|
||||||
as a shortcut, a relay will sometimes make a connection to itself
|
|
||||||
without actually using a TCP connection. More details later on.)
|
|
||||||
Connections exist in different varieties, depending on what
|
|
||||||
functionality they provide. The principal types of connection are
|
|
||||||
"edge" (e.g., a SOCKS connection or a connection from an exit relay to a
|
|
||||||
destination), "OR" (a TLS stream connecting to a relay), "Directory" (an
|
|
||||||
HTTP connection to learn about the network), and "Control" (a connection
|
|
||||||
from a controller).
|
|
||||||
|
|
||||||
A 'Circuit' is a persistent tunnel through the Tor network, established
|
|
||||||
with public-key cryptography, and used to send cells one or more hops.
|
|
||||||
Clients keep track of multi-hop circuits, and the cryptography
|
|
||||||
associated with each hop. Relays, on the other hand, keep track only of
|
|
||||||
their hop of each circuit.
|
|
||||||
|
|
||||||
A 'Channel' is an abstract view of sending cells to and from a Tor
|
|
||||||
relay. Currently, all channels are implemented using OR connections.
|
|
||||||
If we switch to other strategies in the future, we'll have more
|
|
||||||
connection types.
|
|
||||||
|
|
||||||
A 'Node' is a view of a Tor instance's current knowledge and opinions
|
|
||||||
about a Tor relay or bridge.
|
|
||||||
|
|
||||||
### The rest of this document. ###
|
### The rest of this document. ###
|
||||||
|
|
||||||
|
@ -1,171 +0,0 @@
|
|||||||
|
|
||||||
## Library code in Tor.
|
|
||||||
|
|
||||||
Most of Tor's utility code is in modules in the `src/lib` subdirectory. In
|
|
||||||
general, this code is not necessarily Tor-specific, but is instead possibly
|
|
||||||
useful for other applications.
|
|
||||||
|
|
||||||
This code includes:
|
|
||||||
|
|
||||||
* Compatibility wrappers, to provide a uniform API across different
|
|
||||||
platforms.
|
|
||||||
|
|
||||||
* Library wrappers, to provide a tor-like API over different libraries
|
|
||||||
that Tor uses for things like compression and cryptography.
|
|
||||||
|
|
||||||
* Containers, to implement some general-purpose data container types.
|
|
||||||
|
|
||||||
The modules in `src/lib` are currently well-factored: each one depends
|
|
||||||
only on lower-level modules. You can see an up-to-date list of the
|
|
||||||
modules sorted from lowest to highest level by running
|
|
||||||
`./scripts/maint/practracker/includes.py --toposort`.
|
|
||||||
|
|
||||||
As of this writing, the library modules are (from lowest to highest
|
|
||||||
level):
|
|
||||||
|
|
||||||
* `lib/cc` -- Macros for managing the C compiler and
|
|
||||||
language. Includes macros for improving compatibility and clarity
|
|
||||||
across different C compilers.
|
|
||||||
|
|
||||||
* `lib/version` -- Holds the current version of Tor.
|
|
||||||
|
|
||||||
* `lib/testsupport` -- Helpers for making test-only code and test
|
|
||||||
mocking support.
|
|
||||||
|
|
||||||
* `lib/defs` -- Lowest-level constants used in many places across the
|
|
||||||
code.
|
|
||||||
|
|
||||||
* `lib/subsys` -- Types used for declaring a "subsystem". A subsystem
|
|
||||||
is a module with support for initialization, shutdown,
|
|
||||||
configuration, and so on.
|
|
||||||
|
|
||||||
* `lib/conf` -- Types and macros used for declaring configuration
|
|
||||||
options.
|
|
||||||
|
|
||||||
* `lib/arch` -- Compatibility functions and macros for handling
|
|
||||||
differences in CPU architecture.
|
|
||||||
|
|
||||||
* `lib/err` -- Lowest-level error handling code: responsible for
|
|
||||||
generating stack traces, handling raw assertion failures, and
|
|
||||||
otherwise reporting problems that might not be safe to report
|
|
||||||
via the regular logging module.
|
|
||||||
|
|
||||||
* `lib/malloc` -- Wrappers and utilities for memory management.
|
|
||||||
|
|
||||||
* `lib/intmath` -- Utilities for integer mathematics.
|
|
||||||
|
|
||||||
* `lib/fdio` -- Utilities and compatibility code for reading and
|
|
||||||
writing data on file descriptors (and on sockets, for platforms
|
|
||||||
where a socket is not a kind of fd).
|
|
||||||
|
|
||||||
* `lib/lock` -- Compatibility code for declaring and using locks.
|
|
||||||
Lower-level than the rest of the threading code.
|
|
||||||
|
|
||||||
* `lib/ctime` -- Constant-time implementations for data comparison
|
|
||||||
and table lookup, used to avoid timing side-channels from standard
|
|
||||||
implementations of memcmp() and so on.
|
|
||||||
|
|
||||||
* `lib/string` -- Low-level compatibility wrappers and utility
|
|
||||||
functions for string manipulation.
|
|
||||||
|
|
||||||
* `lib/wallclock` -- Compatibility and utility functions for
|
|
||||||
inspecting and manipulating the current (UTC) time.
|
|
||||||
|
|
||||||
* `lib/osinfo` -- Functions for inspecting the version and
|
|
||||||
capabilities of the operating system.
|
|
||||||
|
|
||||||
* `lib/smartlist_core` -- The bare-bones pieces of our dynamic array
|
|
||||||
("smartlist") implementation. There are higher-level pieces, but
|
|
||||||
these ones are used by (and therefore cannot use) the logging code.
|
|
||||||
|
|
||||||
* `lib/log` -- Implements the logging system used by all higher-level
|
|
||||||
Tor code. You can think of this as the logical "midpoint" of the
|
|
||||||
library code: much of the higher-level code is higher-level
|
|
||||||
_because_ it uses the logging module, and much of the lower-level
|
|
||||||
code is specifically written to avoid having to log, because the
|
|
||||||
logging module depends on it.
|
|
||||||
|
|
||||||
* `lib/container` -- General purpose containers, including dynamic arrays
|
|
||||||
("smartlists"), hashtables, bit arrays, weak-reference-like "handles",
|
|
||||||
bloom filters, and a bit more.
|
|
||||||
|
|
||||||
* `lib/trace` -- A general-purpose API for introducing
|
|
||||||
function-tracing functionality into Tor. Currently not much used.
|
|
||||||
|
|
||||||
* `lib/thread` -- Threading compatibility and utility functionality,
|
|
||||||
other than low-level locks (which are in `lib/lock`) and
|
|
||||||
workqueue/threadpool code (which belongs in `lib/evloop`).
|
|
||||||
|
|
||||||
* `lib/term` -- Code for terminal manipulation functions (like
|
|
||||||
reading a password from the user).
|
|
||||||
|
|
||||||
* `lib/memarea` -- A data structure for a fast "arena" style allocator,
|
|
||||||
where the data is freed all at once. Used for parsing.
|
|
||||||
|
|
||||||
* `lib/encoding` -- Implementations for encoding data in various
|
|
||||||
formats, datatypes, and transformations.
|
|
||||||
|
|
||||||
* `lib/dispatch` -- A general-purpose in-process message delivery
|
|
||||||
system. Used by `lib/pubsub` to implement our inter-module
|
|
||||||
publish/subscribe system.
|
|
||||||
|
|
||||||
* `lib/sandbox` -- Our Linux seccomp2 sandbox implementation.
|
|
||||||
|
|
||||||
* `lib/pubsub` -- Code and macros to implement our publish/subscribe
|
|
||||||
message passing system.
|
|
||||||
|
|
||||||
* `lib/fs` -- Utility and compatibility code for manipulating files,
|
|
||||||
filenames, directories, and so on.
|
|
||||||
|
|
||||||
* `lib/confmgt` -- Code to parse, encode, and manipulate our
|
|
||||||
configuration files, state files, and so forth.
|
|
||||||
|
|
||||||
* `lib/crypt_ops` -- Cryptographic operations. This module contains
|
|
||||||
wrappers around the cryptographic libraries that we support,
|
|
||||||
and implementations for some higher-level cryptographic
|
|
||||||
constructions that we use.
|
|
||||||
|
|
||||||
* `lib/meminfo` -- Functions for inspecting our memory usage, if the
|
|
||||||
malloc implementation exposes that to us.
|
|
||||||
|
|
||||||
* `lib/time` -- Higher level time functions, including fine-grained and
|
|
||||||
monotonic timers.
|
|
||||||
|
|
||||||
* `lib/math` -- Floating-point mathematical utilities, including
|
|
||||||
compatibility code, and probability distributions.
|
|
||||||
|
|
||||||
* `lib/buf` -- A general purpose queued buffer implementation,
|
|
||||||
similar to the BSD kernel's "mbuf" structure.
|
|
||||||
|
|
||||||
* `lib/net` -- Networking code, including address manipulation,
|
|
||||||
and compatibility wrappers.
|
|
||||||
|
|
||||||
* `lib/compress` -- A compatibility wrapper around several
|
|
||||||
compression libraries, currently including zlib, zstd, and lzma.
|
|
||||||
|
|
||||||
* `lib/geoip` -- Utilities to manage geoip (IP to country) lookups
|
|
||||||
and formats.
|
|
||||||
|
|
||||||
* `lib/tls` -- Compatibility wrappers around the library (NSS or
|
|
||||||
OpenSSL, depending on configuration) that Tor uses to implement the
|
|
||||||
TLS link security protocol.
|
|
||||||
|
|
||||||
* `lib/evloop` -- Tools to manage the event loop and related
|
|
||||||
functionality, in order to implement asynchronous networking,
|
|
||||||
timers, periodic events, and other scheduling tasks.
|
|
||||||
|
|
||||||
* `lib/process` -- Utilities and compatibility code to launch and
|
|
||||||
manage subprocesses.
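
To make one of the entries above concrete: the `lib/ctime` description says that a standard `memcmp()` can leak timing information. A minimal sketch of a constant-time equality check follows. The name `ex_memeq_ct` is made up for this example and is not Tor's API; the point is only that the loop always visits every byte and the result is computed without a data-dependent branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if the first `len` bytes of `a` and `b`
 * are equal, 0 otherwise. The running time depends on `len` only, not on
 * where the first differing byte is. */
static int
ex_memeq_ct(const void *a, const void *b, size_t len)
{
  assert(a != NULL && b != NULL);
  const uint8_t *x = a;
  const uint8_t *y = b;
  uint32_t any_difference = 0;

  for (size_t i = 0; i < len; ++i)
    any_difference |= (uint32_t)(x[i] ^ y[i]);

  /* any_difference is 0 when the inputs match, and 1..255 otherwise.
   * Subtracting 1 wraps 0 around to 0xFFFFFFFF, so bit 8 is set only in
   * the "equal" case. */
  return (int)(1 & ((any_difference - 1) >> 8));
}
```

An optimizing compiler is in principle free to reintroduce a branch, so treat this as an illustration of the idea rather than a hardened implementation.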
|
|
||||||
|
|
||||||
### What belongs in lib?
|
|
||||||
|
|
||||||
In general, if you can imagine some program wanting the functionality
|
|
||||||
you're writing, even if that program had nothing to do with Tor, your
|
|
||||||
functionality belongs in lib.
|
|
||||||
|
|
||||||
If it falls into one of the existing "lib" categories, your
|
|
||||||
functionality belongs in lib.
|
|
||||||
|
|
||||||
If you are using platform-specific `#ifdef`s to manage compatibility
|
|
||||||
issues among platforms, you should probably consider whether you can
|
|
||||||
put your code into lib.
|
|
@ -1,103 +0,0 @@
|
|||||||
|
|
||||||
## Memory management
|
|
||||||
|
|
||||||
### Heap-allocation functions: lib/malloc/malloc.h
|
|
||||||
|
|
||||||
Tor imposes a few light wrappers over C's native malloc and free
|
|
||||||
functions, to improve convenience, and to allow wholescale replacement
|
|
||||||
of malloc and free as needed.
|
|
||||||
|
|
||||||
You should never use 'malloc', 'calloc', 'realloc', or 'free' on their
|
|
||||||
own; always use the variants prefixed with 'tor_'.
|
|
||||||
They are the same as the standard C functions, with the following
|
|
||||||
exceptions:
|
|
||||||
|
|
||||||
* `tor_free(NULL)` is a no-op.
|
|
||||||
* `tor_free()` is a macro that takes an lvalue as an argument and sets it to
|
|
||||||
NULL after freeing it. To avoid this behavior, you can use `tor_free_()`
|
|
||||||
instead.
|
|
||||||
* `tor_malloc()` and friends fail with an assertion if they are asked to
|
|
||||||
allocate a value so large that it is probably an underflow.
|
|
||||||
* It is always safe to `tor_malloc(0)`, regardless of whether your libc
|
|
||||||
allows it.
|
|
||||||
* `tor_malloc()`, `tor_realloc()`, and friends are never allowed to fail.
|
|
||||||
Instead, Tor will die with an assertion. This means that you never
|
|
||||||
need to check their return values. See the next subsection for
|
|
||||||
information on why we think this is a good idea.
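
The rules in the list above can be sketched in a few lines of standalone C. This is not Tor's implementation: `ex_malloc`, `ex_free`, and `EX_SIZE_CEILING` are hypothetical names, and the ceiling value is an arbitrary assumption chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical ceiling: a request this large is assumed to be a bug,
 * such as a negative length converted to size_t. */
#define EX_SIZE_CEILING (SIZE_MAX / 16)

/* Allocate `size` bytes. Never returns NULL: dies instead. */
static void *
ex_malloc(size_t size)
{
  assert(size < EX_SIZE_CEILING);
  if (size == 0)
    size = 1;                 /* some libcs return NULL for malloc(0) */
  void *result = malloc(size);
  if (result == NULL) {
    fputs("Out of memory; aborting.\n", stderr);
    abort();
  }
  return result;
}

/* Free `p` and set it to NULL. `p` must be an lvalue; note that this
 * simple version evaluates it twice. */
#define ex_free(p) \
  do {             \
    free(p);       \
    (p) = NULL;    \
  } while (0)

/* Usage: no NULL check after allocating, no dangling pointer after
 * freeing. Returns 1 if both properties held. */
static int
example_alloc(void)
{
  char *buf = ex_malloc(0);   /* zero-size requests are safe */
  int ok = (buf != NULL);
  ex_free(buf);               /* buf is now NULL */
  ex_free(buf);               /* freeing NULL is a no-op */
  return ok && buf == NULL;
}
```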
|
|
||||||
|
|
||||||
We define additional general-purpose memory allocation functions as well:
|
|
||||||
|
|
||||||
* `tor_malloc_zero(x)` behaves as `calloc(1, x)`, except that it makes clear
|
|
||||||
the intent to allocate a single zeroed-out value.
|
|
||||||
* `tor_reallocarray(x,y)` behaves as the OpenBSD reallocarray function.
|
|
||||||
Use it for cases when you need to realloc() in a multiplication-safe
|
|
||||||
way.
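
What "multiplication-safe" means is easiest to see in code. The sketch below is an assumption-laden stand-in, not Tor's function: `ex_reallocarray` and `ex_mul_overflows` are hypothetical names, and the three-argument shape follows OpenBSD's `reallocarray(ptr, nmemb, size)`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if nmemb * size does not fit in a size_t, else 0. */
static int
ex_mul_overflows(size_t nmemb, size_t size)
{
  return size != 0 && nmemb > SIZE_MAX / size;
}

/* Resize `ptr` to hold `nmemb` elements of `size` bytes each. Refuses
 * (by assertion) to compute a product that would wrap around, and dies
 * rather than returning NULL. */
static void *
ex_reallocarray(void *ptr, size_t nmemb, size_t size)
{
  assert(!ex_mul_overflows(nmemb, size));
  size_t total = nmemb * size;
  if (total == 0)
    total = 1;
  void *result = realloc(ptr, total);
  if (result == NULL)
    abort();
  return result;
}

/* Usage: grow an int array from 4 to 8 elements. Returns 1 if the old
 * contents survived the resize. */
static int
example_reallocarray(void)
{
  int *v = ex_reallocarray(NULL, 4, sizeof(int));
  v[3] = 42;
  v = ex_reallocarray(v, 8, sizeof(int));
  int ok = (v[3] == 42);
  free(v);
  return ok;
}
```

Without the check, a caller-controlled `nmemb` could make `nmemb * size` wrap to a small number, and the subsequent writes would run off the end of a too-small buffer.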
|
|
||||||
|
|
||||||
And specific-purpose functions as well:
|
|
||||||
|
|
||||||
* `tor_strdup()` and `tor_strndup()` behave as the underlying libc
|
|
||||||
functions, but use `tor_malloc()` instead of the underlying function.
|
|
||||||
* `tor_memdup()` copies a chunk of memory of a given size.
|
|
||||||
* `tor_memdup_nulterm()` copies a chunk of memory of a given size, then
|
|
||||||
NUL-terminates it just to be safe.
|
|
||||||
|
|
||||||
#### Why assert on allocation failure?
|
|
||||||
|
|
||||||
Why don't we allow `tor_malloc()` and its allies to return NULL?
|
|
||||||
|
|
||||||
First, it's error-prone. Many programmers forget to check for NULL return
|
|
||||||
values, and testing for `malloc()` failures is a major pain.
|
|
||||||
|
|
||||||
Second, it's not necessarily a great way to handle OOM conditions. It's
|
|
||||||
probably better (we think) to have a memory target where we dynamically free
|
|
||||||
things ahead of time in order to stay under the target. Trying to respond to
|
|
||||||
an OOM at the point of `tor_malloc()` failure, on the other hand, would involve
|
|
||||||
a rare operation invoked from deep in the call stack. (Again, that's
|
|
||||||
error-prone and hard to debug.)
|
|
||||||
|
|
||||||
Third, thanks to the rise of Linux and other operating systems that allow
|
|
||||||
memory to be overcommitted, you can't actually ever rely on getting a NULL
|
|
||||||
from `malloc()` when you're out of memory; instead you have to use an approach
|
|
||||||
closer to tracking the total memory usage.
|
|
||||||
|
|
||||||
#### Conventions for your own allocation functions.
|
|
||||||
|
|
||||||
Whenever you create a new type, the convention is to give it a pair of
|
|
||||||
`x_new()` and `x_free_()` functions, named after the type.
|
|
||||||
|
|
||||||
Calling `x_free(NULL)` should always be a no-op.
|
|
||||||
|
|
||||||
There should additionally be an `x_free()` macro, defined in terms of
|
|
||||||
`x_free_()`. This macro should set its lvalue to NULL. You can define it
|
|
||||||
using the FREE_AND_NULL macro, as follows:
|
|
||||||
|
|
||||||
```
|
|
||||||
#define x_free(ptr) FREE_AND_NULL(x_t, x_free_, (ptr))
|
|
||||||
```
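
For a fuller picture, here is a standalone sketch of the whole convention for a made-up type `widget_t`. `EX_FREE_AND_NULL` is a stand-in written for this example so that it compiles without Tor's headers; it is only assumed to behave like the real `FREE_AND_NULL`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in macro (assumption): call `freefn` on the pointer stored in
 * `var`, then set `var` to NULL. `var` is evaluated only once. */
#define EX_FREE_AND_NULL(type, freefn, var) \
  do {                                      \
    type **ex_tmp_ptr_ = &(var);            \
    freefn(*ex_tmp_ptr_);                   \
    *ex_tmp_ptr_ = NULL;                    \
  } while (0)

/* Hypothetical example type. */
typedef struct widget_t {
  char *name;
} widget_t;

/* Allocate a new widget holding a copy of `name`. */
static widget_t *
widget_new(const char *name)
{
  assert(name != NULL);
  widget_t *w = calloc(1, sizeof(*w));
  if (!w)
    abort();
  size_t len = strlen(name) + 1;
  w->name = malloc(len);
  if (!w->name)
    abort();
  memcpy(w->name, name, len);
  return w;
}

/* Release a widget and everything it owns. NULL is accepted. */
static void
widget_free_(widget_t *w)
{
  if (!w)
    return;
  free(w->name);
  free(w);
}

#define widget_free(w) EX_FREE_AND_NULL(widget_t, widget_free_, (w))

/* Usage: after widget_free(), the variable is NULL, so a second call is
 * harmless. Returns 1 if the pointer was cleared. */
static int
example_widget(void)
{
  widget_t *w = widget_new("demo");
  widget_free(w);
  widget_free(w);
  return w == NULL;
}
```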
|
|
||||||
|
|
||||||
|
|
||||||
### Grow-only memory allocation: lib/memarea
|
|
||||||
|
|
||||||
It's often handy to allocate a large number of tiny objects, all of which
|
|
||||||
need to disappear at the same time. You can do this in tor using the
|
|
||||||
memarea.c abstraction, which uses a set of grow-only buffers for allocation,
|
|
||||||
and only supports a single "free" operation at the end.
|
|
||||||
|
|
||||||
Using memareas also helps you avoid memory fragmentation. You see, some libc
|
|
||||||
malloc implementations perform badly on the case where a large number of
|
|
||||||
small temporary objects are allocated at the same time as a few long-lived
|
|
||||||
objects of similar size. But if you use tor_malloc() for the long-lived ones
|
|
||||||
and a memarea for the temporary object, the malloc implementation is likelier
|
|
||||||
to do better.
|
|
||||||
|
|
||||||
To create a new memarea, use `memarea_new()`. To drop all the storage from a
|
|
||||||
memarea, and invalidate its pointers, use `memarea_drop_all()`.
|
|
||||||
|
|
||||||
The allocation functions `memarea_alloc()`, `memarea_alloc_zero()`,
|
|
||||||
`memarea_memdup()`, `memarea_strdup()`, and `memarea_strndup()` are analogous
|
|
||||||
to the similarly-named malloc() functions. There is intentionally no
|
|
||||||
`memarea_free()` or `memarea_realloc()`.
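
To show the idea behind a grow-only allocator, here is a small standalone arena. It is a sketch under our own assumptions, not the memarea code: the `ex_arena_*` names, the 4096-byte chunk size, and the 1 GiB sanity limit are all invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define EX_CHUNK_SIZE 4096
#define EX_ALIGN _Alignof(max_align_t)

typedef struct ex_chunk_t {
  struct ex_chunk_t *next;  /* previously filled chunk, or NULL */
  size_t used;              /* bytes handed out so far */
  size_t capacity;          /* usable bytes in mem */
  char *mem;                /* from malloc(), so suitably aligned */
} ex_chunk_t;

typedef struct ex_arena_t {
  ex_chunk_t *head;         /* chunk currently being filled */
} ex_arena_t;

static ex_chunk_t *
ex_chunk_new(size_t min_size, ex_chunk_t *next)
{
  size_t cap = min_size > EX_CHUNK_SIZE ? min_size : EX_CHUNK_SIZE;
  ex_chunk_t *c = malloc(sizeof(*c));
  if (!c)
    abort();
  c->mem = malloc(cap);
  if (!c->mem)
    abort();
  c->next = next;
  c->used = 0;
  c->capacity = cap;
  return c;
}

static ex_arena_t *
ex_arena_new(void)
{
  ex_arena_t *a = malloc(sizeof(*a));
  if (!a)
    abort();
  a->head = ex_chunk_new(0, NULL);
  return a;
}

/* Hand out `size` bytes. There is no per-object free. */
static void *
ex_arena_alloc(ex_arena_t *a, size_t size)
{
  assert(size < ((size_t)1 << 30));     /* arbitrary sanity limit */
  size = (size + EX_ALIGN - 1) / EX_ALIGN * EX_ALIGN;
  if (a->head->capacity - a->head->used < size)
    a->head = ex_chunk_new(size, a->head);
  void *result = a->head->mem + a->head->used;
  a->head->used += size;
  return result;
}

static char *
ex_arena_strdup(ex_arena_t *a, const char *s)
{
  size_t len = strlen(s) + 1;
  return memcpy(ex_arena_alloc(a, len), s, len);
}

/* Release every chunk at once; all pointers from the arena die here. */
static void
ex_arena_drop_all(ex_arena_t *a)
{
  ex_chunk_t *c = a->head;
  while (c) {
    ex_chunk_t *next = c->next;
    free(c->mem);
    free(c);
    c = next;
  }
  free(a);
}

/* Usage: many small allocations, one teardown. Returns 1 if the first
 * string is still intact after the arena has grown. */
static int
example_arena(void)
{
  ex_arena_t *a = ex_arena_new();
  char *first = ex_arena_strdup(a, "hello");
  for (int i = 0; i < 10000; ++i)
    (void) ex_arena_alloc(a, 16);       /* forces extra chunks */
  int ok = (strcmp(first, "hello") == 0);
  ex_arena_drop_all(a);
  return ok;
}
```

Because chunks are never resized or moved, earlier pointers stay valid as the arena grows, which is why a realloc-style operation has no natural place in this design.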
|
|
||||||
|
|
||||||
### Special allocation: lib/malloc/map_anon.h
|
|
||||||
|
|
||||||
TODO: WRITEME.
|
|
@ -1,45 +0,0 @@
|
|||||||
|
|
||||||
## Collections in tor
|
|
||||||
|
|
||||||
### Smartlists: Neither lists, nor especially smart.
|
|
||||||
|
|
||||||
For historical reasons, we call our dynamically allocated array type
|
|
||||||
`smartlist_t`. It can grow or shrink as elements are added and removed.
|
|
||||||
|
|
||||||
All smartlists hold an array of `void *`. Whenever you expose a smartlist
|
|
||||||
in an API you *must* document which types its pointers actually hold.
|
|
||||||
|
|
||||||
<!-- It would be neat to fix that, wouldn't it? -NM -->
|
|
||||||
|
|
||||||
Smartlists are created empty with `smartlist_new()` and freed with
|
|
||||||
`smartlist_free()`. See the `containers.h` module documentation for more
|
|
||||||
information; there are many convenience functions for commonly needed
|
|
||||||
operations.
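
If you have not met this kind of container before, the following standalone sketch shows the shape of a growable array of `void *` and why documenting the element type matters. The `ex_list_*` names are invented here; this is not the smartlist API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct ex_list_t {
  void **items;   /* the elements; their real type is up to the caller */
  size_t len;     /* number of elements in use */
  size_t cap;     /* number of elements allocated */
} ex_list_t;

static ex_list_t *
ex_list_new(void)
{
  ex_list_t *l = calloc(1, sizeof(*l));
  if (!l)
    abort();
  return l;
}

/* Append `item`, doubling the storage when it is full. */
static void
ex_list_add(ex_list_t *l, void *item)
{
  if (l->len == l->cap) {
    size_t new_cap = l->cap ? l->cap * 2 : 8;
    void **items = realloc(l->items, new_cap * sizeof(void *));
    if (!items)
      abort();
    l->items = items;
    l->cap = new_cap;
  }
  l->items[l->len++] = item;
}

static void *
ex_list_get(const ex_list_t *l, size_t idx)
{
  assert(idx < l->len);
  return l->items[idx];
}

/* Free the list itself. The elements are not owned by the list. */
static void
ex_list_free(ex_list_t *l)
{
  if (!l)
    return;
  free(l->items);
  free(l);
}

/* Usage. Documented element type: this list holds `const char *`
 * strings. Returns the combined length of the strings. */
static size_t
example_total_length(void)
{
  ex_list_t *names = ex_list_new();
  ex_list_add(names, "guard");
  ex_list_add(names, "middle");
  ex_list_add(names, "exit");
  size_t total = 0;
  for (size_t i = 0; i < names->len; ++i)
    total += strlen((const char *) ex_list_get(names, i));
  ex_list_free(names);
  return total;
}
```

The cast in the loop is the price of storing `void *`: the compiler cannot tell you if the list actually holds something else, so the comment is the only type check there is.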
|
|
||||||
|
|
||||||
<!-- TODO: WRITE more about what you can do with smartlists. -->
|
|
||||||
|
|
||||||
### Digest maps, string maps, and more.
|
|
||||||
|
|
||||||
Tor makes frequent use of maps from 160-bit digests, 256-bit digests,
|
|
||||||
or nul-terminated strings to `void *`. These types are `digestmap_t`,
|
|
||||||
`digest256map_t`, and `strmap_t` respectively. See the containers.h
|
|
||||||
module documentation for more information.
|
|
||||||
|
|
||||||
### Intrusive lists and hashtables
|
|
||||||
|
|
||||||
For performance-sensitive cases, we sometimes want to use "intrusive"
|
|
||||||
collections: ones where the bookkeeping pointers are stuck inside the
|
|
||||||
structures that belong to the collection. If you've used the
|
|
||||||
BSD-style sys/queue.h macros, you'll be familiar with these.
|
|
||||||
|
|
||||||
Unfortunately, the `sys/queue.h` macros vary significantly between the
|
|
||||||
platforms that have them, so we provide our own variants in
|
|
||||||
`src/ext/tor_queue.h`.
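
For readers who have not seen the pattern, here is a hand-rolled intrusive list in standalone C. It does not use the `tor_queue.h` macros; `job_t` and `job_list_t` are made-up types, and the list is a simple LIFO for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* The element carries its own link, so adding it to the list needs no
 * separate allocation. */
typedef struct job_t {
  int priority;
  struct job_t *next_in_list;   /* bookkeeping lives inside the element */
} job_t;

typedef struct job_list_t {
  job_t *head;
} job_list_t;

/* Insert `job` at the front of the list. */
static void
job_list_push(job_list_t *l, job_t *job)
{
  job->next_in_list = l->head;
  l->head = job;
}

/* Remove and return the front element, or NULL if the list is empty. */
static job_t *
job_list_pop(job_list_t *l)
{
  job_t *job = l->head;
  if (job) {
    l->head = job->next_in_list;
    job->next_in_list = NULL;
  }
  return job;
}

/* Usage: two stack-allocated jobs are linked without any malloc().
 * Returns the sum of their priorities. */
static int
example_intrusive(void)
{
  job_list_t list = { NULL };
  job_t a = { 1, NULL };
  job_t b = { 2, NULL };
  job_list_push(&list, &a);
  job_list_push(&list, &b);

  int sum = 0;
  for (job_t *j = list.head; j; j = j->next_in_list)
    sum += j->priority;

  job_t *first = job_list_pop(&list);
  assert(first == &b);            /* last pushed, first popped */
  return sum;
}
```

The trade-off is that an element can be in only as many lists as it has link fields, which is why this style is reserved for performance-sensitive cases.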
|
|
||||||
|
|
||||||
We also provide an intrusive hashtable implementation in `src/ext/ht.h`.
|
|
||||||
When you're using it, you'll need to define your own hash
|
|
||||||
functions. If attacker-induced collisions are a worry here, use the
|
|
||||||
cryptographic siphash24g function to extract hashes.
|
|
||||||
|
|
||||||
<!-- TODO: WRITE about bloom filters, namemaps, bit-arrays, order functions.
|
|
||||||
-->
|
|
@ -1,132 +1,4 @@
|
|||||||
|
|
||||||
## Lower-level cryptography functionality in Tor ##
|
|
||||||
|
|
||||||
Generally speaking, Tor code shouldn't be calling OpenSSL (or any
|
|
||||||
other crypto library) directly. Instead, we should indirect through
|
|
||||||
one of the functions in src/common/crypto\*.c or src/common/tortls.c.
|
|
||||||
|
|
||||||
Cryptography functionality that's available is described below.
|
|
||||||
|
|
||||||
### RNG facilities ###
|
|
||||||
|
|
||||||
The most basic RNG capability in Tor is the crypto_rand() family of
|
|
||||||
functions. These currently use OpenSSL's RAND_() backend, but may use
|
|
||||||
something faster in the future.
|
|
||||||
|
|
||||||
In addition to crypto_rand(), which fills in a buffer with random
|
|
||||||
bytes, we also have functions to produce random integers in certain
|
|
||||||
ranges; to produce random hostnames; to produce random doubles, etc.
|
|
||||||
|
|
||||||
When you're creating a long-term cryptographic secret, you might want
|
|
||||||
to use crypto_strongest_rand() instead of crypto_rand(). It takes the
|
|
||||||
operating system's entropy source and combines it with output from
|
|
||||||
crypto_rand(). This is a pure paranoia measure, but it might help us
|
|
||||||
someday.
|
|
||||||
|
|
||||||
You can use smartlist_choose() to pick a random element from a smartlist
|
|
||||||
and smartlist_shuffle() to randomize the order of a smartlist. Both are
|
|
||||||
potentially a bit slow.
|
|
||||||
|
|
||||||
### Cryptographic digests and related functions ###
|
|
||||||
|
|
||||||
We treat digests as separate types based on the length of their
|
|
||||||
outputs. We support one 160-bit digest (SHA1), two 256-bit digests
|
|
||||||
(SHA256 and SHA3-256), and two 512-bit digests (SHA512 and SHA3-512).
|
|
||||||
|
|
||||||
You should not use SHA1 for anything new.
|
|
||||||
|
|
||||||
The crypto_digest\*() family of functions manipulates digests. You
|
|
||||||
can either compute a digest of a chunk of memory all at once using
|
|
||||||
crypto_digest(), crypto_digest256(), or crypto_digest512(). Or you
|
|
||||||
can create a crypto_digest_t object with
|
|
||||||
crypto_digest{,256,512}_new(), feed information to it in chunks using
|
|
||||||
crypto_digest_add_bytes(), and then extract the final digest using
|
|
||||||
crypto_digest_get_digest(). You can copy the state of one of these
|
|
||||||
objects using crypto_digest_dup() or crypto_digest_assign().
|
|
||||||
|
|
||||||
We support the HMAC hash-based message authentication code
|
|
||||||
instantiated using SHA256. See crypto_hmac_sha256. (You should not
|
|
||||||
add any HMAC users with SHA1, and HMAC is not necessary with SHA3.)
|
|
||||||
|
|
||||||
We also support the SHA3 cousins, SHAKE128 and SHAKE256. Unlike
|
|
||||||
digests, these are extendable output functions (or XOFs) where you can
|
|
||||||
get any amount of output. Use the crypto_xof_\*() functions to access
|
|
||||||
these.
|
|
||||||
|
|
||||||
We have several ways to derive keys from cryptographically strong secret
|
|
||||||
inputs (like Diffie-Hellman outputs). The old
|
|
||||||
crypto_expand_key_material_TAP() performs an ad-hoc KDF based on SHA1 -- you
|
|
||||||
shouldn't use it for implementing anything but old versions of the Tor
|
|
||||||
protocol. You can use HKDF-SHA256 (as defined in RFC5869) for more modern
|
|
||||||
protocols. Also consider SHAKE256.
|
|
||||||
|
|
||||||
If your input is potentially weak, like a password or passphrase, use a salt
|
|
||||||
along with the secret_to_key() functions as defined in crypto_s2k.c. Prefer
|
|
||||||
scrypt over other hashing methods when possible. If you're using a password
|
|
||||||
to encrypt something, see the "boxed file storage" section below.
|
|
||||||
|
|
||||||
Finally, in order to store objects in hash tables, Tor includes the
|
|
||||||
randomized SipHash 2-4 function. Call it via the siphash24g() function in
|
|
||||||
src/ext/siphash.h whenever you're creating a hashtable whose keys may be
|
|
||||||
manipulated by an attacker in order to DoS you with collisions.
|
|
||||||
|
|
||||||
|
|
||||||
### Stream ciphers ###
|
|
||||||
|
|
||||||
You can create instances of a stream cipher using crypto_cipher_new().
|
|
||||||
These are stateful objects of type crypto_cipher_t. Note that these
|
|
||||||
objects only support AES-128 right now; a future version should add
|
|
||||||
support for AES-256 and/or ChaCha20.
|
|
||||||
|
|
||||||
You can encrypt/decrypt with crypto_cipher_encrypt or
|
|
||||||
crypto_cipher_decrypt. The crypto_cipher_crypt_inplace function performs
|
|
||||||
an encryption without a copy.
|
|
||||||
|
|
||||||
Note that sensible people should not use raw stream ciphers; they should
|
|
||||||
probably be using some kind of AEAD. Sorry.
|
|
||||||
|
|
||||||
### Public key functionality ###
|
|
||||||
|
|
||||||
We support four public key algorithms: DH1024, RSA, Curve25519, and
|
|
||||||
Ed25519.
|
|
||||||
|
|
||||||
We support DH1024 over two prime groups. You access these via the
|
|
||||||
crypto_dh_\*() family of functions.
|
|
||||||
|
|
||||||
We support RSA in many bit sizes for signing and encryption. You access
|
|
||||||
it via the crypto_pk_*() family of functions. Note that a crypto_pk_t
|
|
||||||
may or may not include a private key. See the crypto_pk_* functions in
|
|
||||||
crypto.c for a full list of functions here.
|
|
||||||
|
|
||||||
For Curve25519 functionality, see the functions and types in
|
|
||||||
crypto_curve25519.c. Curve25519 is generally suitable for when you need
|
|
||||||
a secure, fast elliptic-curve Diffie-Hellman implementation. When
|
|
||||||
designing new protocols, prefer it over DH in Z_p.
|
|
||||||
|
|
||||||
For Ed25519 functionality, see the functions and types in
|
|
||||||
crypto_ed25519.c. Ed25519 is generally suitable as a secure fast
|
|
||||||
elliptic curve signature method. For new protocols, prefer it over RSA
|
|
||||||
signatures.
|
|
||||||
|
|
||||||
### Metaformats for storage ###
|
|
||||||
|
|
||||||
When OpenSSL manages the storage of some object, we use whatever format
|
|
||||||
OpenSSL provides -- typically, some kind of PEM-wrapped base 64 encoding
|
|
||||||
that starts with "----- BEGIN CRYPTOGRAPHIC OBJECT ----".
|
|
||||||
|
|
||||||
When we manage the storage of some cryptographic object, we prefix the
|
|
||||||
object with a 32-byte NUL-padded prefix in order to avoid accidental
|
|
||||||
object confusion; see the crypto_read_tagged_contents_from_file() and
|
|
||||||
crypto_write_tagged_contents_to_file() functions for manipulating
|
|
||||||
these. The prefix is "== type: tag ==", where type describes the object
|
|
||||||
and its encoding, and tag indicates which one it is.
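
The prefix layout described above can be sketched as follows. This is an illustration, not the code behind the functions named in the text: `ex_format_prefix` and `ex_prefix_matches` are hypothetical helpers, the strings "demo-key" and "v1" are made up, and requiring at least one trailing NUL is our own assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EX_PREFIX_LEN 32

/* Fill `out` with "== type: tag ==", padded with NUL bytes to exactly
 * EX_PREFIX_LEN bytes. Returns 0 on success, or -1 if the text does not
 * fit (in which case `out` is left all-zero). */
static int
ex_format_prefix(char out[EX_PREFIX_LEN], const char *type, const char *tag)
{
  memset(out, 0, EX_PREFIX_LEN);
  int n = snprintf(out, EX_PREFIX_LEN, "== %s: %s ==", type, tag);
  if (n < 0 || n >= EX_PREFIX_LEN) {
    memset(out, 0, EX_PREFIX_LEN);
    return -1;
  }
  return 0;
}

/* Return 1 if `stored` is exactly the prefix for (type, tag), else 0.
 * Comparing all 32 bytes means a file written for one kind of object
 * is rejected when read back as another. */
static int
ex_prefix_matches(const char stored[EX_PREFIX_LEN],
                  const char *type, const char *tag)
{
  char expected[EX_PREFIX_LEN];
  if (ex_format_prefix(expected, type, tag) < 0)
    return 0;
  return memcmp(stored, expected, EX_PREFIX_LEN) == 0;
}
```

A reader would call `ex_prefix_matches()` on the first 32 bytes of the file before trusting anything that follows, which is the "accidental object confusion" check in miniature.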
|
|
||||||
|
|
||||||
### Boxed-file storage ###
|
|
||||||
|
|
||||||
When managing keys, you frequently want to have some way to write a
|
|
||||||
secret object to disk, encrypted with a passphrase. The crypto_pwbox
|
|
||||||
and crypto_unpwbox functions do so in a way that's likely to be
|
|
||||||
readable by future versions of Tor.
|
|
||||||
|
|
||||||
### Certificates ###
|
### Certificates ###
|
||||||
|
|
||||||
@ -153,17 +25,3 @@ napkin.
|
|||||||
documents that include keys and which are signed by keys. You can
|
documents that include keys and which are signed by keys. You can
|
||||||
consider these documents to be an additional kind of certificate if you
|
consider these documents to be an additional kind of certificate if you
|
||||||
want.)
|
want.)
|
||||||
|
|
||||||
### TLS ###
|
|
||||||
|
|
||||||
Tor's TLS implementation is more tightly coupled to OpenSSL than we'd
|
|
||||||
prefer. You can read most of it in tortls.c.
|
|
||||||
|
|
||||||
Unfortunately, TLS's state machine and our requirement for nonblocking
|
|
||||||
IO support means that using TLS in practice is a bit hairy, since
|
|
||||||
logical writes can block on physical reads, and vice versa.
|
|
||||||
|
|
||||||
If you are lucky, you will never have to look at the code here.
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
@ -1,95 +1,6 @@
|
|||||||
|
|
||||||
## Tor's modules ##
|
## Tor's modules ##
|
||||||
|
|
||||||
### Generic modules ###
|
|
||||||
|
|
||||||
`buffers.c`
|
|
||||||
: Implements the `buf_t` buffered data type for connections, and several
|
|
||||||
low-level data handling functions to handle network protocols on it.
|
|
||||||
|
|
||||||
`channel.c`
|
|
||||||
: Generic channel implementation. Channels handle sending and receiving cells
|
|
||||||
among tor nodes.
|
|
||||||
|
|
||||||
`channeltls.c`
|
|
||||||
: Channel implementation for TLS-based OR connections. Uses `connection_or.c`.
|
|
||||||
|
|
||||||
`circuitbuild.c`
|
|
||||||
: Code for constructing circuits and choosing their paths. (*Note*:
|
|
||||||
this module could plausibly be split into handling the client side,
|
|
||||||
the server side, and the path generation aspects of circuit building.)
|
|
||||||
|
|
||||||
`circuitlist.c`
|
|
||||||
: Code for maintaining and navigating the global list of circuits.
|
|
||||||
|
|
||||||
`circuitmux.c`
|
|
||||||
: Generic circuitmux implementation. A circuitmux handles deciding, for a
|
|
||||||
particular channel, which circuit should write next.
|
|
||||||
|
|
||||||
`circuitmux_ewma.c`
|
|
||||||
: A circuitmux implementation based on the EWMA (exponentially
|
|
||||||
weighted moving average) algorithm.
|
|
||||||
|
|
||||||
`circuituse.c`
|
|
||||||
: Code to actually send and receive data on circuits.
|
|
||||||
|
|
||||||
`command.c`
|
|
||||||
: Handles incoming cells on channels.
|
|
||||||
|
|
||||||
`config.c`
|
|
||||||
: Parses options from torrc, and uses them to configure the rest of Tor.
|
|
||||||
|
|
||||||
`confparse.c`
|
|
||||||
: Generic torrc-style parser. Used to parse torrc and state files.
|
|
||||||
|
|
||||||
`connection.c`
|
|
||||||
: Generic and common connection tools, and implementation for the simpler
|
|
||||||
connection types.
|
|
||||||
|
|
||||||
`connection_edge.c`
|
|
||||||
: Implementation for entry and exit connections.
|
|
||||||
|
|
||||||
`connection_or.c`
|
|
||||||
: Implementation for OR connections (the ones that send cells over TLS).
|
|
||||||
|
|
||||||
`main.c`
|
|
||||||
: Principal entry point, main loops, scheduled events, and network
|
|
||||||
management for Tor.
|
|
||||||
|
|
||||||
`ntmain.c`
|
|
||||||
: Implements Tor as a Windows service. (Not very well.)
|
|
||||||
|
|
||||||
`onion.c`
|
|
||||||
: Generic code for generating and responding to CREATE and CREATED
|
|
||||||
cells, and performing the appropriate onion handshakes. Also contains
|
|
||||||
code to manage the server-side onion queue.
|
|
||||||
|
|
||||||
`onion_fast.c`
|
|
||||||
: Implements the old SHA1-based CREATE_FAST/CREATED_FAST circuit
|
|
||||||
creation handshake. (Now deprecated.)
|
|
||||||
|
|
||||||
`onion_ntor.c`
|
|
||||||
: Implements the Curve25519-based NTOR circuit creation handshake.
|
|
||||||
|
|
||||||
`onion_tap.c`
|
|
||||||
: Implements the old RSA1024/DH1024-based TAP circuit creation handshake. (Now
|
|
||||||
deprecated.)
|
|
||||||
|
|
||||||
`relay.c`
|
|
||||||
: Handles particular types of relay cells, and provides code to receive,
|
|
||||||
encrypt, route, and interpret relay cells.
|
|
||||||
|
|
||||||
`scheduler.c`
|
|
||||||
: Decides which channel/circuit pair is ready to receive the next cell.
|
|
||||||
|
|
||||||
`statefile.c`
|
|
||||||
: Handles loading and storing Tor's state file.
|
|
||||||
|
|
||||||
`tor_main.c`
|
|
||||||
: Contains the actual `main()` function. (This is placed in a separate
|
|
||||||
file so that the unit tests can have their own `main()`.)
|
|
||||||
|
|
||||||
|
|
||||||
### Node-status modules ###
|
### Node-status modules ###
|
||||||
|
|
||||||
`directory.c`
|
`directory.c`
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir app
|
@dir /app
|
||||||
@brief app: top-level entry point for Tor
|
@brief app: top-level entry point for Tor
|
||||||
|
|
||||||
The "app" directory has Tor's main entry point and configuration logic,
|
The "app" directory has Tor's main entry point and configuration logic,
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir app/config
|
@dir /app/config
|
||||||
@brief app/config
|
@brief app/config: Top-level configuration code
|
||||||
|
|
||||||
|
Refactoring this module is a work in progress, see
|
||||||
|
[ticket 29211](https://trac.torproject.org/projects/tor/ticket/29211).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir app/main
|
@dir /app/main
|
||||||
@brief app/main
|
@brief app/main: Entry point for tor.
|
||||||
**/
|
**/
|
||||||
|
@ -1,8 +1,20 @@
|
|||||||
/**
|
/**
|
||||||
@dir core
|
@dir /core
|
||||||
@brief core: main loop and onion routing functionality
|
@brief core: main loop and onion routing functionality
|
||||||
|
|
||||||
The "core" directory has the central protocols for Tor, which every
|
The "core" directory has the central protocols for Tor, which every
|
||||||
client and relay must implement in order to perform onion routing.
|
client and relay must implement in order to perform onion routing.
|
||||||
|
|
||||||
|
It is divided into three lower-level pieces:
|
||||||
|
|
||||||
|
- \refdir{core/crypto} -- Tor-specific cryptography.
|
||||||
|
|
||||||
|
- \refdir{core/proto} -- Protocol encoding/decoding.
|
||||||
|
|
||||||
|
- \refdir{core/mainloop} -- A connection-oriented asynchronous mainloop.
|
||||||
|
|
||||||
|
and one high-level piece:
|
||||||
|
|
||||||
|
- \refdir{core/or} -- Implements onion routing itself.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir core/crypto
|
@dir /core/crypto
|
||||||
@brief core/crypto
|
@brief core/crypto: Tor-specific cryptography
|
||||||
|
|
||||||
|
This module implements Tor's circuit-construction crypto and Tor's
|
||||||
|
relay crypto.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,12 @@
|
|||||||
/**
|
/**
|
||||||
@dir core/mainloop
|
@dir /core/mainloop
|
||||||
@brief core/mainloop
|
@brief core/mainloop: Non-onion-routing mainloop functionality
|
||||||
|
|
||||||
|
This module uses the event-loop code of \refdir{lib/evloop} to implement an
|
||||||
|
asynchronous connection-oriented protocol handler.
|
||||||
|
|
||||||
|
The layering here is imperfect: the code here was split from \refdir{core/or}
|
||||||
|
without refactoring how the two modules call one another. Probably many
|
||||||
|
functions should be moved and refactored.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,62 @@
|
|||||||
/**
|
/**
|
||||||
@dir core/or
|
@dir /core/or
|
||||||
@brief core/or
|
@brief core/or: *Onion routing happens here*.
|
||||||
**/
|
|
||||||
|
This is the central part of Tor that handles the core tasks of onion routing:
|
||||||
|
building circuits, handling circuits, attaching circuits to streams, moving
|
||||||
|
data around, and so forth.
|
||||||
|
|
||||||
|
Some aspects of this module should probably be refactored into others.
|
||||||
|
|
||||||
|
Notable files here include:
|
||||||
|
|
||||||
|
`channel.c`
|
||||||
|
: Generic channel implementation. Channels handle sending and receiving cells
|
||||||
|
among tor nodes.
|
||||||
|
|
||||||
|
`channeltls.c`
|
||||||
|
: Channel implementation for TLS-based OR connections. Uses `connection_or.c`.
|
||||||
|
|
||||||
|
`circuitbuild.c`
|
||||||
|
: Code for constructing circuits and choosing their paths. (*Note*:
|
||||||
|
this module could plausibly be split into handling the client side,
|
||||||
|
the server side, and the path generation aspects of circuit building.)
|
||||||
|
|
||||||
|
`circuitlist.c`
|
||||||
|
: Code for maintaining and navigating the global list of circuits.
|
||||||
|
|
||||||
|
`circuitmux.c`
|
||||||
|
: Generic circuitmux implementation. A circuitmux handles deciding, for a
|
||||||
|
particular channel, which circuit should write next.
|
||||||
|
|
||||||
|
`circuitmux_ewma.c`
|
||||||
|
: A circuitmux implementation based on the EWMA (exponentially
|
||||||
|
weighted moving average) algorithm.
|
||||||
|
|
||||||
|
`circuituse.c`
|
||||||
|
: Code to actually send and receive data on circuits.
|
||||||
|
|
||||||
|
`command.c`
|
||||||
|
: Handles incoming cells on channels.
|
||||||
|
|
||||||
|
`connection.c`
|
||||||
|
: Generic and common connection tools, and implementation for the simpler
|
||||||
|
connection types.
|
||||||
|
|
||||||
|
`connection_edge.c`
|
||||||
|
: Implementation for entry and exit connections.
|
||||||
|
|
||||||
|
`connection_or.c`
|
||||||
|
: Implementation for OR connections (the ones that send cells over TLS).
|
||||||
|
|
||||||
|
`onion.c`
|
||||||
|
: Generic code for generating and responding to CREATE and CREATED
|
||||||
|
cells, and performing the appropriate onion handshakes. Also contains
|
||||||
|
code to manage the server-side onion queue.
|
||||||
|
|
||||||
|
`relay.c`
|
||||||
|
: Handles particular types of relay cells, and provides code to receive,
|
||||||
|
encrypt, route, and interpret relay cells.
|
||||||
|
|
||||||
|
`scheduler.c`
|
||||||
|
: Decides which channel/circuit pair is ready to receive the next cell.
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir core/proto
|
@dir /core/proto
|
||||||
@brief core/proto
|
@brief core/proto: Protocol encoding/decoding
|
||||||
|
|
||||||
|
These functions should (but do not always) exist at a lower level than most
|
||||||
|
of the rest of core.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/api
|
@dir /feature/api
|
||||||
@brief feature/api
|
@brief feature/api: In-process interface to starting/stopping Tor.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,7 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/client
|
@dir /feature/client
|
||||||
@brief feature/client
|
@brief feature/client: Client-specific code
|
||||||
|
|
||||||
|
(There is also a bunch of client-specific code in other modules.)
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,10 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/control
|
@dir /feature/control
|
||||||
@brief feature/control
|
@brief feature/control: Controller API.
|
||||||
|
|
||||||
|
The Controller API is a text-based protocol that another program (or another
|
||||||
|
thread, if you're running Tor in-process) can use to configure and control
|
||||||
|
Tor while it is running. The current protocol is documented in
|
||||||
|
[control-spec.txt](https://gitweb.torproject.org/torspec.git/tree/control-spec.txt).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,11 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/dirauth
|
@dir /feature/dirauth
|
||||||
@brief feature/dirauth
|
@brief feature/dirauth: Directory authority implementation.
|
||||||
|
|
||||||
|
This module handles running Tor as a directory authority.
|
||||||
|
|
||||||
|
The directory protocol is specified in
|
||||||
|
[dir-spec.txt](https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt).
|
||||||
|
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/dircache
|
@dir /feature/dircache
|
||||||
@brief feature/dircache
|
@brief feature/dircache: Run as a directory cache server
|
||||||
|
|
||||||
|
This module handles the directory caching functionality that all relays may
|
||||||
|
provide, for serving cached directory objects to clients.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/dirclient
|
@dir /feature/dirclient
|
||||||
@brief feature/dirclient
|
@brief feature/dirclient: Directory client implementation.
|
||||||
|
|
||||||
|
The code here is used by all Tor instances that need to download directory
|
||||||
|
information. Currently, that is all of them, since even authorities need to
|
||||||
|
launch downloads to learn about relays that other authorities have listed.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/dircommon
|
@dir /feature/dircommon
|
||||||
@brief feature/dircommon
|
@brief feature/dircommon: Directory client and server shared code
|
||||||
|
|
||||||
|
This module has the code that directory clients (anybody who downloads
|
||||||
|
information about relays) and directory servers (anybody who serves such
|
||||||
|
information) share in common.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,10 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/dirparse
|
@dir /feature/dirparse
|
||||||
@brief feature/dirparse
|
@brief feature/dirparse: Parsing Tor directory objects
|
||||||
|
|
||||||
|
We define a number of "directory objects" in
|
||||||
|
[dir-spec.txt](https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt),
|
||||||
|
all of them using a common line-oriented meta-format. This module is used by
|
||||||
|
other parts of Tor to parse them.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature
|
@dir /feature
|
||||||
@brief feature: domain-specific modules
|
@brief feature: domain-specific modules
|
||||||
|
|
||||||
The "feature" directory has modules that Tor uses only for a particular
|
The "feature" directory has modules that Tor uses only for a particular
|
||||||
|
@ -1,4 +1,16 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/hibernate
|
@dir /feature/hibernate
|
||||||
@brief feature/hibernate
|
@brief feature/hibernate: Bandwidth accounting and hibernation (!)
|
||||||
|
|
||||||
|
This module implements two features that are only somewhat related, and
|
||||||
|
should probably be separated in the future. One feature is bandwidth
|
||||||
|
accounting (making sure we use no more than so many gigabytes in a day) and
|
||||||
|
hibernation (avoiding network activity while we have used up all/most of our
|
||||||
|
configured gigabytes). The other feature is clean shutdown, where we stop
|
||||||
|
accepting new connections for a while and give the old ones time to close.
|
||||||
|
|
||||||
|
The two features are related only in the sense that "soft hibernation" (being
|
||||||
|
almost out of bandwidth) is very close to the "shutting down" state. But it would be
|
||||||
|
better in the long run to make the two completely separate.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,10 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/hs
|
@dir /feature/hs
|
||||||
@brief feature/hs
|
@brief feature/hs: v3 (current) onion service protocol
|
||||||
|
|
||||||
|
This directory implements the v3 onion service protocol,
|
||||||
|
as specified in
|
||||||
|
[rend-spec-v3.txt](https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt).
|
||||||
|
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/hs_common
|
@dir /feature/hs_common
|
||||||
@brief feature/hs_common
|
@brief feature/hs_common: Common to v2 (old) and v3 (current) onion services
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/keymgt
|
@dir /feature/keymgt
|
||||||
@brief feature/keymgt
|
@brief feature/keymgt: Store keys for relays, authorities, etc.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/nodelist
|
@dir /feature/nodelist
|
||||||
@brief feature/nodelist
|
@brief feature/nodelist: Download and manage a list of relays
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,6 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/relay
|
@dir /feature/relay
|
||||||
@brief feature/relay
|
@brief feature/relay: Relay-specific code
|
||||||
|
|
||||||
|
(There is also a bunch of relay-specific code in other modules.)
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/rend
|
@dir /feature/rend
|
||||||
@brief feature/rend
|
@brief feature/rend: version 2 (old) hidden services
|
||||||
|
|
||||||
|
This directory implements the v2 onion service protocol,
|
||||||
|
as specified in
|
||||||
|
[rend-spec-v2.txt](https://gitweb.torproject.org/torspec.git/tree/rend-spec-v2.txt).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,12 @@
|
|||||||
/**
|
/**
|
||||||
@dir feature/stats
|
@dir /feature/stats
|
||||||
@brief feature/stats
|
@brief feature/stats: Relay statistics. Also, port prediction.
|
||||||
|
|
||||||
|
This module collects anonymized relay statistics in order to publish them in
|
||||||
|
relays' routerinfo and extrainfo documents.
|
||||||
|
|
||||||
|
Additionally, it contains predict_ports.c, which remembers which ports we've
|
||||||
|
visited recently as a client, so we can make sure we have open circuits that
|
||||||
|
support them.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/arch
|
@dir /lib/arch
|
||||||
@brief lib/arch
|
@brief lib/arch: Compatibility code for handling different CPU architectures.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,15 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/buf
|
@dir /lib/buf
|
||||||
@brief lib/buf
|
@brief lib/buf: An efficient byte queue.
|
||||||
|
|
||||||
|
This module defines the buf_t type, which is used throughout our networking
|
||||||
|
code. The implementation is a singly-linked queue of buffer chunks, similar
|
||||||
|
to the BSD kernel's
|
||||||
|
["mbuf"](https://www.freebsd.org/cgi/man.cgi?query=mbuf&sektion=9) structure.
|
||||||
|
|
||||||
|
The buf_t type is also reasonable for use in constructing long strings.
|
||||||
|
|
||||||
|
See \refdir{lib/net} for networking code that uses buf_t, and
|
||||||
|
\refdir{lib/tls} for cryptographic code that uses buf_t.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/cc
|
@dir /lib/cc
|
||||||
@brief lib/cc
|
@brief lib/cc: Macros for managing the C compiler and language.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/compress
|
@dir /lib/compress
|
||||||
@brief lib/compress
|
@brief lib/compress: Wraps several compression libraries
|
||||||
|
|
||||||
|
Currently supported are zlib (mandatory), zstd (optional), and lzma
|
||||||
|
(optional).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/conf
|
@dir /lib/conf
|
||||||
@brief lib/conf
|
@brief lib/conf: Types and macros for declaring configuration options.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/confmgt
|
@dir /lib/confmgt
|
||||||
@brief lib/confmgt
|
@brief lib/confmgt: Parse, encode, manipulate configuration files.
|
||||||
|
|
||||||
|
This logic is used in common by our state files (statefile.c) and
|
||||||
|
configuration files (config.c) to manage a set of named, typed fields,
|
||||||
|
reading and writing them to disk and to the controller.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,51 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/container
|
@dir /lib/container
|
||||||
@brief lib/container
|
@brief lib/container: Hash tables, dynamic arrays, bit arrays, etc.
|
||||||
|
|
||||||
|
### Smartlists: Neither lists, nor especially smart.
|
||||||
|
|
||||||
|
For historical reasons, we call our dynamically allocated array type
|
||||||
|
`smartlist_t`. It can grow or shrink as elements are added and removed.
|
||||||
|
|
||||||
|
All smartlists hold an array of `void *`. Whenever you expose a smartlist
|
||||||
|
in an API you *must* document which types its pointers actually hold.
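
To show why that documentation matters, here is a minimal sketch of a
growable array of `void *`. It is not the `smartlist_t` implementation:
`ptrvec_sketch`, `ptrvec_add_sketch()`, and `total_name_len_sketch()` are
hypothetical names used only for this illustration. Nothing in the container
records the element type, so the consumer has to trust the comment on the
function that takes it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical growable array of untyped pointers (not Tor's smartlist_t).
typedef struct ptrvec_sketch {
  void **elts;   // heap-allocated array of element pointers
  size_t len;    // number of slots in use
  size_t cap;    // number of slots allocated
} ptrvec_sketch;

// Append `elt` to `v`, doubling the allocation when it is full.
// Returns 0 on success and -1 if the allocation fails.
int
ptrvec_add_sketch(ptrvec_sketch *v, void *elt)
{
  assert(v != NULL);
  if (v->len == v->cap) {
    size_t new_cap = v->cap ? v->cap * 2 : 8;
    void **p = realloc(v->elts, new_cap * sizeof(void *));
    if (p == NULL)
      return -1;
    v->elts = p;
    v->cap = new_cap;
  }
  v->elts[v->len++] = elt;
  return 0;
}

// Return the summed length of every name in `names`.
// `names` must hold NUL-terminated `const char *` entries: the container
// cannot check this, so the contract lives in this comment.
size_t
total_name_len_sketch(const ptrvec_sketch *names)
{
  size_t total = 0;
  size_t i;
  assert(names != NULL);
  for (i = 0; i < names->len; ++i)
    total += strlen((const char *)names->elts[i]);
  return total;
}
```

Passing a vector of anything other than strings to
`total_name_len_sketch()` would compile cleanly and fail at run time, which
is the hazard the documentation rule guards against.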
|
||||||
|
|
||||||
|
<!-- It would be neat to fix that, wouldn't it? -NM -->
|
||||||
|
|
||||||
|
Smartlists are created empty with `smartlist_new()` and freed with
|
||||||
|
`smartlist_free()`. See the `containers.h` header documentation for more
|
||||||
|
information; there are many convenience functions for commonly needed
|
||||||
|
operations.
|
||||||
|
|
||||||
|
For low-level operations on smartlists, see also
|
||||||
|
\refdir{lib/smartlist_core}.
|
||||||
|
|
||||||
|
<!-- TODO: WRITE more about what you can do with smartlists. -->
|
||||||
|
|
||||||
|
### Digest maps, string maps, and more.
|
||||||
|
|
||||||
|
Tor makes frequent use of maps from 160-bit digests, 256-bit digests,
|
||||||
|
or nul-terminated strings to `void *`. These types are `digestmap_t`,
|
||||||
|
`digest256map_t`, and `strmap_t` respectively. See the containers.h
|
||||||
|
module documentation for more information.
|
||||||
|
|
||||||
|
### Intrusive lists and hashtables
|
||||||
|
|
||||||
|
For performance-sensitive cases, we sometimes want to use "intrusive"
|
||||||
|
collections: ones where the bookkeeping pointers are stuck inside the
|
||||||
|
structures that belong to the collection. If you've used the
|
||||||
|
BSD-style sys/queue.h macros, you'll be familiar with these.
|
||||||
|
|
||||||
|
Unfortunately, the `sys/queue.h` macros vary significantly between the
|
||||||
|
platforms that have them, so we provide our own variants in
|
||||||
|
`ext/tor_queue.h`.
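
As a rough illustration of what "intrusive" means, the sketch below builds a
singly-linked list by hand instead of using the `tor_queue.h` macros. All of
the names in it (`ilink_sketch`, `ILINK_CONTAINER`, `conn_sketch`, and the
two functions) are hypothetical and exist only for this example. The link
lives inside the element, so adding an element to the list allocates
nothing.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical link type: embed one in any struct that should be listable.
typedef struct ilink_sketch {
  struct ilink_sketch *next;
} ilink_sketch;

// Recover a pointer to the containing struct from a pointer to its link.
#define ILINK_CONTAINER(ptr, type, member) \
  ((type *)((char *)(ptr) - offsetof(type, member)))

// Example element: the bookkeeping pointer is stored inside the element.
typedef struct conn_sketch {
  int id;
  ilink_sketch link;
} conn_sketch;

// Push `elt` onto the front of the list whose head pointer is `*head`.
void
ilist_push_front_sketch(ilink_sketch **head, ilink_sketch *elt)
{
  assert(head != NULL && elt != NULL);
  elt->next = *head;
  *head = elt;
}

// Walk the list and add up the `id` field of every conn_sketch on it.
int
conn_list_sum_ids_sketch(ilink_sketch *head)
{
  int sum = 0;
  ilink_sketch *p;
  for (p = head; p != NULL; p = p->next) {
    conn_sketch *c = ILINK_CONTAINER(p, conn_sketch, link);
    sum += c->id;
  }
  return sum;
}
```

The trade-off is that an element can be on only as many lists as it has
embedded links, and the list code must know the offset of the link within
the element.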
|
||||||
|
|
||||||
|
We also provide an intrusive hashtable implementation in `ext/ht.h`.
|
||||||
|
When you're using it, you'll need to define your own hash
|
||||||
|
functions. If attacker-induced collisions are a worry here, use the
|
||||||
|
cryptographic siphash24g function to extract hashes.
|
||||||
|
|
||||||
|
<!-- TODO: WRITE about bloom filters, namemaps, bit-arrays, order functions.
|
||||||
|
-->
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,139 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/crypt_ops
|
@dir /lib/crypt_ops
|
||||||
@brief lib/crypt_ops
|
@brief lib/crypt_ops: Cryptographic operations.
|
||||||
|
|
||||||
|
This module contains wrappers around the cryptographic libraries that we
|
||||||
|
support, and implementations for some higher-level cryptographic
|
||||||
|
constructions that we use.
|
||||||
|
|
||||||
|
It wraps our two major cryptographic backends (OpenSSL or NSS, as configured
|
||||||
|
by the user), and also wraps other cryptographic code in src/ext.
|
||||||
|
|
||||||
|
Generally speaking, Tor code shouldn't be calling OpenSSL or NSS
|
||||||
|
(or any other crypto library) directly. Instead, we should indirect through
|
||||||
|
one of the functions in this directory, or through \refdir{lib/tls}.
|
||||||
|
|
||||||
|
Cryptography functionality that's available is described below.
|
||||||
|
|
||||||
|
### RNG facilities ###
|
||||||
|
|
||||||
|
The most basic RNG capability in Tor is the crypto_rand() family of
|
||||||
|
functions. These currently use OpenSSL's RAND_() backend, but may use
|
||||||
|
something faster in the future.
|
||||||
|
|
||||||
|
In addition to crypto_rand(), which fills in a buffer with random
|
||||||
|
bytes, we also have functions to produce random integers in certain
|
||||||
|
ranges; to produce random hostnames; to produce random doubles, etc.
|
||||||
|
|
||||||
|
When you're creating a long-term cryptographic secret, you might want
|
||||||
|
to use crypto_strongest_rand() instead of crypto_rand(). It takes the
|
||||||
|
operating system's entropy source and combines it with output from
|
||||||
|
crypto_rand(). This is a pure paranoia measure, but it might help us
|
||||||
|
someday.
|
||||||
|
|
||||||
|
You can use smartlist_choose() to pick a random element from a smartlist
|
||||||
|
and smartlist_shuffle() to randomize the order of a smartlist. Both are
|
||||||
|
potentially a bit slow.
|
||||||
|
|
||||||
|
### Cryptographic digests and related functions ###
|
||||||
|
|
||||||
|
We treat digests as separate types based on the length of their
|
||||||
|
outputs. We support one 160-bit digest (SHA1), two 256-bit digests
|
||||||
|
(SHA256 and SHA3-256), and two 512-bit digests (SHA512 and SHA3-512).
|
||||||
|
|
||||||
|
You should not use SHA1 for anything new.
|
||||||
|
|
||||||
|
The crypto_digest\*() family of functions manipulates digests. You
|
||||||
|
can either compute a digest of a chunk of memory all at once using
|
||||||
|
crypto_digest(), crypto_digest256(), or crypto_digest512(). Or you
|
||||||
|
can create a crypto_digest_t object with
|
||||||
|
crypto_digest{,256,512}_new(), feed information to it in chunks using
|
||||||
|
crypto_digest_add_bytes(), and then extract the final digest using
|
||||||
|
crypto_digest_get_digest(). You can copy the state of one of these
|
||||||
|
objects using crypto_digest_dup() or crypto_digest_assign().
|
||||||
|
|
||||||
|
We support the HMAC hash-based message authentication code
|
||||||
|
instantiated using SHA256. See crypto_hmac_sha256. (You should not
|
||||||
|
add any HMAC users with SHA1, and HMAC is not necessary with SHA3.)
|
||||||
|
|
||||||
|
We also support the SHA3 cousins, SHAKE128 and SHAKE256. Unlike
|
||||||
|
digests, these are extendable output functions (or XOFs) where you can
|
||||||
|
get any amount of output. Use the crypto_xof_\*() functions to access
|
||||||
|
these.
|
||||||
|
|
||||||
|
We have several ways to derive keys from cryptographically strong secret
|
||||||
|
inputs (like Diffie-Hellman outputs). The old
|
||||||
|
crypto_expand_key_material_TAP() performs an ad-hoc KDF based on SHA1 -- you
|
||||||
|
shouldn't use it for implementing anything but old versions of the Tor
|
||||||
|
protocol. You can use HKDF-SHA256 (as defined in RFC5869) for more modern
|
||||||
|
protocols. Also consider SHAKE256.
|
||||||
|
|
||||||
|
If your input is potentially weak, like a password or passphrase, use a salt
|
||||||
|
along with the secret_to_key() functions as defined in crypto_s2k.c. Prefer
|
||||||
|
scrypt over other hashing methods when possible. If you're using a password
|
||||||
|
to encrypt something, see the "boxed file storage" section below.
|
||||||
|
|
||||||
|
Finally, in order to store objects in hash tables, Tor includes the
|
||||||
|
randomized SipHash 2-4 function. Call it via the siphash24g() function in
|
||||||
|
src/ext/siphash.h whenever you're creating a hashtable whose keys may be
|
||||||
|
manipulated by an attacker in order to DoS you with collisions.
|
||||||
|
|
||||||
|
|
||||||
|
### Stream ciphers ###
|
||||||
|
|
||||||
|
You can create instances of a stream cipher using crypto_cipher_new().
|
||||||
|
These are stateful objects of type crypto_cipher_t. Note that these
|
||||||
|
objects only support AES-128 right now; a future version should add
|
||||||
|
support for AES-256 and/or ChaCha20.
|
||||||
|
|
||||||
|
You can encrypt/decrypt with crypto_cipher_encrypt or
|
||||||
|
crypto_cipher_decrypt. The crypto_cipher_crypt_inplace function performs
|
||||||
|
an encryption without a copy.
|
||||||
|
|
||||||
|
Note that sensible people should not use raw stream ciphers; they should
|
||||||
|
probably be using some kind of AEAD. Sorry.
|
||||||
|
|
||||||
|
### Public key functionality ###
|
||||||
|
|
||||||
|
We support four public key algorithms: DH1024, RSA, Curve25519, and
|
||||||
|
Ed25519.
|
||||||
|
|
||||||
|
We support DH1024 over two prime groups. You access these via the
|
||||||
|
crypto_dh_\*() family of functions.
|
||||||
|
|
||||||
|
We support RSA in many bit sizes for signing and encryption. You access
|
||||||
|
it via the crypto_pk_*() family of functions. Note that a crypto_pk_t
|
||||||
|
may or may not include a private key. See the crypto_pk_* functions in
|
||||||
|
crypto.c for a full list of functions here.
|
||||||
|
|
||||||
|
For Curve25519 functionality, see the functions and types in
|
||||||
|
crypto_curve25519.c. Curve25519 is generally suitable for when you need
|
||||||
|
a secure fast elliptic-curve Diffie-Hellman implementation. When
|
||||||
|
designing new protocols, prefer it over DH in Z_p.
|
||||||
|
|
||||||
|
For Ed25519 functionality, see the functions and types in
|
||||||
|
crypto_ed25519.c. Ed25519 is generally suitable as a secure fast
|
||||||
|
elliptic curve signature method. For new protocols, prefer it over RSA
|
||||||
|
signatures.
|
||||||
|
|
||||||
|
### Metaformats for storage ###
|
||||||
|
|
||||||
|
When OpenSSL manages the storage of some object, we use whatever format
|
||||||
|
OpenSSL provides -- typically, some kind of PEM-wrapped base 64 encoding
|
||||||
|
that starts with "----- BEGIN CRYPTOGRAPHIC OBJECT ----".
|
||||||
|
|
||||||
|
When we manage the storage of some cryptographic object, we prefix the
|
||||||
|
object with a 32-byte NUL-padded prefix in order to avoid accidental
|
||||||
|
object confusion; see the crypto_read_tagged_contents_from_file() and
|
||||||
|
crypto_write_tagged_contents_to_file() functions for manipulating
|
||||||
|
these. The prefix is "== type: tag ==", where type describes the object
|
||||||
|
and its encoding, and tag indicates which one it is.
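
The prefix layout described above can be sketched in a few lines of C. This
is an illustration under stated assumptions, not the code behind
crypto_write_tagged_contents_to_file(): the helper names and the
`"demo-key"` / `"v1"` strings in the usage note are hypothetical. The sketch
formats "== type: tag ==", refuses anything longer than 32 bytes, and pads
the remainder with NUL bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAGGED_PREFIX_LEN 32

// Hypothetical helper: fill `out` with "== type: tag ==" followed by NUL
// padding up to 32 bytes. Returns 0 on success, or -1 if the formatted
// text does not fit (in which case `out` is left untouched).
int
make_tagged_prefix_sketch(unsigned char out[TAGGED_PREFIX_LEN],
                          const char *type, const char *tag)
{
  char text[TAGGED_PREFIX_LEN + 1];
  int n;

  assert(out != NULL && type != NULL && tag != NULL);
  n = snprintf(text, sizeof(text), "== %s: %s ==", type, tag);
  if (n < 0 || n > TAGGED_PREFIX_LEN)
    return -1;                          // encoding error or text too long
  memset(out, 0, TAGGED_PREFIX_LEN);    // NUL padding
  memcpy(out, text, (size_t)n);
  return 0;
}

// Hypothetical check: does `data` begin with the prefix for (type, tag)?
// The prefix is not secret, so an ordinary memcmp() is acceptable here.
int
has_tagged_prefix_sketch(const unsigned char *data, size_t len,
                         const char *type, const char *tag)
{
  unsigned char want[TAGGED_PREFIX_LEN];
  if (len < TAGGED_PREFIX_LEN)
    return 0;
  if (make_tagged_prefix_sketch(want, type, tag) < 0)
    return 0;
  return memcmp(data, want, TAGGED_PREFIX_LEN) == 0;
}
```

A reader that expects a `"demo-key"` object tagged `"v1"` would reject a
file tagged `"v2"` before parsing any of the payload, which is the object
confusion the fixed-width prefix is meant to prevent.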
|
||||||
|
|
||||||
|
### Boxed-file storage ###
|
||||||
|
|
||||||
|
When managing keys, you frequently want to have some way to write a
|
||||||
|
secret object to disk, encrypted with a passphrase. The crypto_pwbox
|
||||||
|
and crypto_unpwbox functions do so in a way that's likely to be
|
||||||
|
readable by future versions of Tor.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,16 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/ctime
|
@dir /lib/ctime
|
||||||
@brief lib/ctime
|
@brief lib/ctime: Constant-time code to avoid side-channels.
|
||||||
|
|
||||||
|
This module contains constant-time implementations of various
|
||||||
|
data comparison and table lookup functions. We use these in preference to
|
||||||
|
memcmp() and so forth, since memcmp() can leak information about its inputs
|
||||||
|
based on how fast it returns. In general, your code should call tor_memeq()
|
||||||
|
and tor_memneq(), not memcmp().
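
To make the difference from memcmp() concrete, here is a minimal sketch of a
constant-time equality check. It is not the tor_memeq() implementation, and
`ct_memeq_sketch` is a hypothetical name used only here. The technique is to
fold every byte difference into an accumulator and inspect it once at the
end, so the loop never exits early on the first mismatch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not tor_memeq()): return 1 if the first `len` bytes
// of `a` and `b` are equal, and 0 otherwise. Every byte is always read, so
// the running time depends on `len` and not on where the buffers differ.
int
ct_memeq_sketch(const void *a, const void *b, size_t len)
{
  const volatile uint8_t *x = a;
  const volatile uint8_t *y = b;
  uint32_t diff = 0;
  size_t i;

  assert(a != NULL && b != NULL);
  for (i = 0; i < len; ++i)
    diff |= (uint32_t)(x[i] ^ y[i]);   // accumulate any differing bits

  // diff is in [0, 255]. (diff - 1) wraps to 0xFFFFFFFF only when diff is
  // 0, so bit 8 of (diff - 1) is set exactly when the buffers were equal.
  return (int)(1 & ((diff - 1) >> 8));
}
```

The final line avoids a data-dependent branch when converting the
accumulator to a boolean. A compiler is still free to transform code like
this, so a sketch of this kind is no substitute for the reviewed functions
in this module.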
|
||||||
|
|
||||||
|
We also define some _non_-constant-time wrappers for memcmp() here: Since we
|
||||||
|
consider calls to memcmp() to be in error, we require that code that actually
|
||||||
|
doesn't need to be constant-time to use the fast_memeq() / fast_memneq() /
|
||||||
|
fast_memcmp() aliases instead.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/defs
|
@dir /lib/defs
|
||||||
@brief lib/defs
|
@brief lib/defs: Lowest-level constants, used in many places.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,16 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/dispatch
|
@dir /lib/dispatch
|
||||||
@brief lib/dispatch
|
@brief lib/dispatch: In-process message delivery.
|
||||||
|
|
||||||
|
This module provides a general in-process "message dispatch" system in which
|
||||||
|
typed messages are sent on channels. The dispatch.h header has far more
|
||||||
|
information.
|
||||||
|
|
||||||
|
It is used by \refdir{lib/pubsub} to implement our general
|
||||||
|
inter-module publish/subscribe system.
|
||||||
|
|
||||||
|
This is not a fancy multi-threaded many-to-many dispatcher as you may be used
|
||||||
|
to from more sophisticated architectures: this dispatcher is intended only
|
||||||
|
for use in improving Tor's architecture.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/encoding
|
@dir /lib/encoding
|
||||||
@brief lib/encoding
|
@brief lib/encoding: Encoding data in various forms, types, and transformations
|
||||||
|
|
||||||
|
Here we have time formats (timefmt.c), quoted strings (qstring.c), C strings
|
||||||
|
(string.c), base-16/32/64 (binascii.c), and more.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,15 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/err
|
@dir /lib/err
|
||||||
@brief lib/err
|
@brief lib/err: Lowest-level error handling code.
|
||||||
|
|
||||||
|
This module is responsible for generating stack traces, handling raw
|
||||||
|
assertion failures, and otherwise reporting problems that might not be
|
||||||
|
safe to report via the regular logging module.
|
||||||
|
|
||||||
|
There are three kinds of users for the functions in this module:
|
||||||
|
* Code that needs a way to assert(), but which cannot use the regular
|
||||||
|
`tor_assert()` macros in the logging module.
|
||||||
|
* Code that needs signal-safe error reporting.
|
||||||
|
* Higher-level error handling code.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/evloop
|
@dir /lib/evloop
|
||||||
@brief lib/evloop
|
@brief lib/evloop: Low-level event loop.
|
||||||
|
|
||||||
|
This module has tools to manage the [libevent](https://libevent.org/) event
|
||||||
|
loop and related functionality, in order to implement asynchronous
|
||||||
|
networking, timers, periodic events, and other scheduling tasks.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,7 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/fdio
|
@dir /lib/fdio
|
||||||
@brief lib/fdio
|
@brief lib/fdio: Code to read/write on file descriptors.
|
||||||
|
|
||||||
|
(This module also handles sockets, on platforms where a socket is not a kind
|
||||||
|
of fd.)
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,11 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/fs
|
@dir /lib/fs
|
||||||
@brief lib/fs
|
@brief lib/fs: Files, filenames, directories, etc.
|
||||||
|
|
||||||
|
This module is mostly a set of compatibility wrappers around
|
||||||
|
operating-system-specific filesystem access.
|
||||||
|
|
||||||
|
It also contains a set of convenience functions for safely writing to files,
|
||||||
|
creating directories, and so on.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/geoip
|
@dir /lib/geoip
|
||||||
@brief lib/geoip
|
@brief lib/geoip: IP-to-country mapping
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/intmath
|
@dir /lib/intmath
|
||||||
@brief lib/intmath
|
@brief lib/intmath: Integer mathematics.
|
||||||
**/
|
**/
|
||||||
|
131
src/lib/lib.dox
@ -1,8 +1,133 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib
|
@dir /lib
|
||||||
@brief lib: low-level functionality.
|
@brief lib: low-level functionality.
|
||||||
|
|
||||||
The "lib" directory contains low-level functionality, most of it not
|
The "lib" directory contains low-level functionality. In general, this
|
||||||
necessarily Tor-specific.
|
code is not necessarily Tor-specific, but is instead possibly useful for
|
||||||
|
other applications.
|
||||||
|
|
||||||
|
The modules in `lib` are currently well-factored: each one depends
|
||||||
|
only on lower-level modules. You can see an up-to-date list of the
|
||||||
|
modules, sorted from lowest to highest level, by running
|
||||||
|
`./scripts/maint/practracker/includes.py --toposort`.
|
||||||
|
|
||||||
|
As of this writing, the library modules are (from lowest to highest
|
||||||
|
level):
|
||||||
|
|
||||||
|
- \refdir{lib/cc} -- Macros for managing the C compiler and
|
||||||
|
language.
|
||||||
|
|
||||||
|
- \refdir{lib/version} -- Holds the current version of Tor.
|
||||||
|
|
||||||
|
- \refdir{lib/testsupport} -- Helpers for making
|
||||||
|
test-only code, and test mocking support.
|
||||||
|
|
||||||
|
- \refdir{lib/defs} -- Lowest-level constants.
|
||||||
|
|
||||||
|
- \refdir{lib/subsys} -- Types used for declaring a
|
||||||
|
"subsystem". (_A subsystem is a module with support for initialization,
|
||||||
|
shutdown, configuration, and so on._)
|
||||||
|
|
||||||
|
- \refdir{lib/conf} -- For declaring configuration options.
|
||||||
|
|
||||||
|
- \refdir{lib/arch} -- For handling differences in CPU
|
||||||
|
architecture.
|
||||||
|
|
||||||
|
- \refdir{lib/err} -- Lowest-level error handling code.
|
||||||
|
|
||||||
|
- \refdir{lib/malloc} -- Memory management.
|
||||||
|
|
||||||
|
- \refdir{lib/intmath} -- Integer mathematics.
|
||||||
|
|
||||||
|
- \refdir{lib/fdio} -- For
|
||||||
|
reading and writing on file descriptors.
|
||||||
|
|
||||||
|
- \refdir{lib/lock} -- Simple locking support.
|
||||||
|
(_Lower-level than the rest of the threading code._)
|
||||||
|
|
||||||
|
- \refdir{lib/ctime} -- Constant-time code to avoid
|
||||||
|
side-channels.
|
||||||
|
|
||||||
|
- \refdir{lib/string} -- Low-level string manipulation.
|
||||||
|
|
||||||
|
- \refdir{lib/wallclock} --
|
||||||
|
For inspecting and manipulating the current (UTC) time.
|
||||||
|
|
||||||
|
- \refdir{lib/osinfo} -- For inspecting the OS version
|
||||||
|
and capabilities.
|
||||||
|
|
||||||
|
- \refdir{lib/smartlist_core} -- The bare-bones
|
||||||
|
pieces of our dynamic array ("smartlist") implementation.
|
||||||
|
|
||||||
|
- \refdir{lib/log} -- Log messages to files, syslogs, etc.
|
||||||
|
|
||||||
|
- \refdir{lib/container} -- General purpose containers,
|
||||||
|
including dynamic arrays ("smartlists"), hashtables, bit arrays,
|
||||||
|
etc.
|
||||||
|
|
||||||
|
- \refdir{lib/trace} -- A general-purpose API
|
||||||
|
for function-tracing functionality in Tor. (_Currently not much used._)
|
||||||
|
|
||||||
|
- \refdir{lib/thread} -- Mid-level threading.
|
||||||
|
|
||||||
|
- \refdir{lib/term} -- Terminal manipulation
|
||||||
|
(like reading a password from the user).
|
||||||
|
|
||||||
|
- \refdir{lib/memarea} -- A fast
|
||||||
|
"arena" style allocator, where the data is freed all at once.
|
||||||
|
|
||||||
|
- \refdir{lib/encoding} -- Encoding
|
||||||
|
data in various formats, datatypes, and transformations.
|
||||||
|
|
||||||
|
- \refdir{lib/dispatch} -- A general-purpose in-process
|
||||||
|
message delivery system.
|
||||||
|
|
||||||
|
- \refdir{lib/sandbox} -- Our Linux seccomp2 sandbox
|
||||||
|
implementation.
|
||||||
|
|
||||||
|
- \refdir{lib/pubsub} -- A publish/subscribe message passing system.
|
||||||
|
|
||||||
|
- \refdir{lib/fs} -- Files, filenames, directories, etc.
|
||||||
|
|
||||||
|
- \refdir{lib/confmgt} -- Parse, encode, and manipulate configuration files.
|
||||||
|
|
||||||
|
- \refdir{lib/crypt_ops} -- Cryptographic operations.
|
||||||
|
|
||||||
|
- \refdir{lib/meminfo} -- Functions for inspecting our
|
||||||
|
memory usage, if the malloc implementation exposes that to us.
|
||||||
|
|
||||||
|
- \refdir{lib/time} -- Higher level time functions, including
|
||||||
|
fine-grained and monotonic timers.
|
||||||
|
|
||||||
|
- \refdir{lib/math} -- Floating-point mathematical utilities.
|
||||||
|
|
||||||
|
- \refdir{lib/buf} -- An efficient byte queue.
|
||||||
|
|
||||||
|
- \refdir{lib/net} -- Networking code, including address
|
||||||
|
manipulation, compatibility wrappers, etc.
|
||||||
|
|
||||||
|
- \refdir{lib/compress} -- Wraps several compression libraries.
|
||||||
|
|
||||||
|
- \refdir{lib/geoip} -- IP-to-country mapping.
|
||||||
|
|
||||||
|
- \refdir{lib/tls} -- TLS library wrappers.
|
||||||
|
|
||||||
|
- \refdir{lib/evloop} -- Low-level event-loop.
|
||||||
|
|
||||||
|
- \refdir{lib/process} -- Launch and manage subprocesses.
|
||||||
|
|
||||||
|
### What belongs in lib?
|
||||||
|
|
||||||
|
In general, if you can imagine some program wanting the functionality
|
||||||
|
you're writing, even if that program had nothing to do with Tor, your
|
||||||
|
functionality belongs in lib.
|
||||||
|
|
||||||
|
If it falls into one of the existing "lib" categories, your
|
||||||
|
functionality belongs in lib.
|
||||||
|
|
||||||
|
If you are using platform-specific `ifdef`s to manage compatibility
|
||||||
|
issues among platforms, you should probably consider whether you can
|
||||||
|
put your code into lib.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/lock
|
@dir /lib/lock
|
||||||
@brief lib/lock
|
@brief lib/lock: Simple locking support.
|
||||||
|
|
||||||
|
This module is more low-level than the rest of the threading code, since it
|
||||||
|
is needed by more intermediate-level modules.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,12 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/log
|
@dir /lib/log
|
||||||
@brief lib/log
|
@brief lib/log: Log messages to files, syslogs, etc.
|
||||||
|
|
||||||
|
You can think of this as the logical "midpoint" of the
|
||||||
|
\refdir{lib} code: much of the higher-level code is higher-level
|
||||||
|
_because_ it uses the logging module, and much of the lower-level code is
|
||||||
|
specifically written to avoid having to log, because the logging module
|
||||||
|
depends on it.
|
||||||
|
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,78 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/malloc
|
@dir /lib/malloc
|
||||||
@brief lib/malloc
|
@brief lib/malloc: Wrappers and utilities for memory management.
|
||||||
|
|
||||||
|
|
||||||
|
Tor imposes a few light wrappers over C's native malloc and free
|
||||||
|
functions, to improve convenience, and to allow wholesale replacement
|
||||||
|
of malloc and free as needed.
|
||||||
|
|
||||||
|
You should never use 'malloc', 'calloc', 'realloc', or 'free' on their
|
||||||
|
own; always use the variants prefixed with 'tor_'.
|
||||||
|
They are the same as the standard C functions, with the following
|
||||||
|
exceptions:
|
||||||
|
|
||||||
|
* `tor_free(NULL)` is a no-op.
|
||||||
|
* `tor_free()` is a macro that takes an lvalue as an argument and sets it to
|
||||||
|
NULL after freeing it. To avoid this behavior, you can use `tor_free_()`
|
||||||
|
instead.
|
||||||
|
* tor_malloc() and friends fail with an assertion if they are asked to
|
||||||
|
allocate a value so large that it is probably an underflow.
|
||||||
|
* It is always safe to `tor_malloc(0)`, regardless of whether your libc
|
||||||
|
allows it.
|
||||||
|
* `tor_malloc()`, `tor_realloc()`, and friends are never allowed to fail.
|
||||||
|
Instead, Tor will die with an assertion. This means that you never
|
||||||
|
need to check their return values. See the next subsection for
|
||||||
|
information on why we think this is a good idea.
|
||||||
|
|
||||||
|
We define additional general-purpose memory allocation functions as well:
|
||||||
|
|
||||||
|
* `tor_malloc_zero(x)` behaves as `calloc(1, x)`, except that it makes clear
|
||||||
|
the intent to allocate a single zeroed-out value.
|
||||||
|
* `tor_reallocarray(x,y)` behaves as the OpenBSD reallocarray function.
|
||||||
|
Use it for cases when you need to realloc() in a multiplication-safe
|
||||||
|
way.
|
||||||
|
|
||||||
|
And specific-purpose functions as well:
|
||||||
|
|
||||||
|
* `tor_strdup()` and `tor_strndup()` behave as the underlying libc
|
||||||
|
functions, but use `tor_malloc()` instead of the underlying function.
|
||||||
|
* `tor_memdup()` copies a chunk of memory of a given size.
|
||||||
|
* `tor_memdup_nulterm()` copies a chunk of memory of a given size, then
|
||||||
|
NUL-terminates it just to be safe.
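The wrapper behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Tor's implementation: `demo_malloc()`, `demo_reallocarray()`, and `demo_free()` are hypothetical stand-ins for the `tor_`-prefixed functions, and the sketch calls `abort()` where the text says the real wrappers fail with an assertion.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for tor_malloc(): never returns NULL. */
static void *
demo_malloc(size_t size)
{
  /* Some libcs return NULL for malloc(0); always ask for at least 1 byte. */
  if (size == 0)
    size = 1;
  void *result = malloc(size);
  if (result == NULL)
    abort(); /* Die instead of handing NULL back to the caller. */
  return result;
}

/* Hypothetical stand-in for tor_reallocarray(): the multiplication of the
 * two sizes is checked for overflow before it is performed. */
static void *
demo_reallocarray(void *ptr, size_t n_members, size_t member_size)
{
  if (member_size != 0 && n_members > SIZE_MAX / member_size)
    abort(); /* n_members * member_size would overflow. */
  size_t total = n_members * member_size;
  if (total == 0)
    total = 1;
  void *result = realloc(ptr, total);
  if (result == NULL)
    abort();
  return result;
}

/* Hypothetical stand-in for tor_free(): frees the pointer and then sets
 * the caller's lvalue to NULL, so a second call is harmless. */
#define demo_free(p) \
  do {               \
    free(p);         \
    (p) = NULL;      \
  } while (0)

/* Usage: allocate, grow, use, and free twice without crashing. */
static int
demo_roundtrip(void)
{
  int *values = demo_malloc(0);
  values = demo_reallocarray(values, 4, sizeof(int));
  values[3] = 7;
  int seen = values[3];
  demo_free(values);
  demo_free(values); /* Safe: values is NULL, and free(NULL) is a no-op. */
  return values == NULL ? seen : -1;
}
```

Because the allocation helpers never return NULL, `demo_roundtrip()` contains no error checks at all, which is the convenience the wrappers are meant to provide.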
|
||||||
|
|
||||||
|
#### Why assert on allocation failure?
|
||||||
|
|
||||||
|
Why don't we allow `tor_malloc()` and its allies to return NULL?
|
||||||
|
|
||||||
|
First, it's error-prone. Many programmers forget to check for NULL return
|
||||||
|
values, and testing for `malloc()` failures is a major pain.
|
||||||
|
|
||||||
|
Second, it's not necessarily a great way to handle OOM conditions. It's
|
||||||
|
probably better (we think) to have a memory target where we dynamically free
|
||||||
|
things ahead of time in order to stay under the target. Trying to respond to
|
||||||
|
an OOM at the point of `tor_malloc()` failure, on the other hand, would involve
|
||||||
|
a rare operation invoked from deep in the call stack. (Again, that's
|
||||||
|
error-prone and hard to debug.)
|
||||||
|
|
||||||
|
Third, thanks to the rise of Linux and other operating systems that allow
|
||||||
|
memory to be overcommitted, you can't actually ever rely on getting a NULL
|
||||||
|
from `malloc()` when you're out of memory; instead you have to use an approach
|
||||||
|
closer to tracking the total memory usage.
|
||||||
|
|
||||||
|
#### Conventions for your own allocation functions.
|
||||||
|
|
||||||
|
Whenever you create a new type, the convention is to give it a pair of
|
||||||
|
`x_new()` and `x_free_()` functions, named after the type.
|
||||||
|
|
||||||
|
Calling `x_free(NULL)` should always be a no-op.
|
||||||
|
|
||||||
|
There should additionally be an `x_free()` macro, defined in terms of
|
||||||
|
`x_free_()`. This macro should set its lvalue to NULL. You can define it
|
||||||
|
using the FREE_AND_NULL macro, as follows:
|
||||||
|
|
||||||
|
```
|
||||||
|
#define x_free(ptr) FREE_AND_NULL(x_t, x_free_, (ptr))
|
||||||
|
```
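A fuller sketch of this convention for a made-up type might look like the following. The type `widget_t` and its functions are hypothetical, and `DEMO_FREE_AND_NULL` is a simplified stand-in written for this example; Tor's own `FREE_AND_NULL` macro may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for FREE_AND_NULL (an assumption of this sketch):
 * call the free function on the pointer, then set the lvalue to NULL. */
#define DEMO_FREE_AND_NULL(type, free_fn, var) \
  do {                                          \
    type *tmp_ = (var);                         \
    free_fn(tmp_);                              \
    (var) = NULL;                               \
  } while (0)

/* Hypothetical type, used only for illustration. */
typedef struct widget_t {
  int id;
} widget_t;

/* Allocate and initialize a new widget; never returns NULL. */
static widget_t *
widget_new(int id)
{
  widget_t *w = calloc(1, sizeof(*w));
  if (w == NULL)
    abort();
  w->id = id;
  return w;
}

/* Release a widget. Passing NULL is a no-op. */
static void
widget_free_(widget_t *w)
{
  if (w == NULL)
    return;
  free(w);
}

/* The macro that callers actually use: frees and NULLs the lvalue. */
#define widget_free(w) DEMO_FREE_AND_NULL(widget_t, widget_free_, (w))

/* Usage: the second widget_free() call sees NULL and does nothing. */
static int
widget_demo(void)
{
  widget_t *w = widget_new(3);
  int id = w->id;
  widget_free(w);
  widget_free(w);
  return (w == NULL) ? id : -1;
}
```

Routing every caller through the macro means a freed pointer cannot be reused by accident: after `widget_free(w)`, `w` is NULL rather than dangling.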
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/math
|
@dir /lib/math
|
||||||
@brief lib/math
|
@brief lib/math: Floating-point math utilities.
|
||||||
|
|
||||||
|
This module includes a bunch of floating-point compatibility code, and
|
||||||
|
implementations for several probability distributions.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,30 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/memarea
|
@dir /lib/memarea
|
||||||
@brief lib/memarea
|
@brief lib/memarea: A fast arena-style allocator.
|
||||||
|
|
||||||
|
This module has a fast "arena" style allocator, where memory is freed all at
|
||||||
|
once. This kind of allocation is very fast and avoids fragmentation, at the
|
||||||
|
expense of requiring all the data to be freed at the same time. We use this
|
||||||
|
for parsing and diff calculations.
|
||||||
|
|
||||||
|
It's often handy to allocate a large number of tiny objects, all of which
|
||||||
|
need to disappear at the same time. You can do this in tor using the
|
||||||
|
memarea.c abstraction, which uses a set of grow-only buffers for allocation,
|
||||||
|
and only supports a single "free" operation at the end.
|
||||||
|
|
||||||
|
Using memareas also helps you avoid memory fragmentation. You see, some libc
|
||||||
|
malloc implementations perform badly on the case where a large number of
|
||||||
|
small temporary objects are allocated at the same time as a few long-lived
|
||||||
|
objects of similar size. But if you use tor_malloc() for the long-lived ones
|
||||||
|
and a memarea for the temporary objects, the malloc implementation is likelier
|
||||||
|
to do better.
|
||||||
|
|
||||||
|
To create a new memarea, use `memarea_new()`. To drop all the storage from a
|
||||||
|
memarea, and invalidate its pointers, use `memarea_drop_all()`.
|
||||||
|
|
||||||
|
The allocation functions `memarea_alloc()`, `memarea_alloc_zero()`,
|
||||||
|
`memarea_memdup()`, `memarea_strdup()`, and `memarea_strndup()` are analogous
|
||||||
|
to the similarly-named malloc() functions. There is intentionally no
|
||||||
|
`memarea_free()` or `memarea_realloc()`.
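The idea of grow-only buffers with a single "free" at the end can be sketched as below. This is an illustrative arena written for this note, not the code in memarea.c: all `arena_*` names, the chunk size, and the alignment strategy are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARENA_CHUNK_SIZE 4096

/* Union whose size is a safe allocation granularity for common types. */
typedef union arena_align_t {
  long long ll;
  long double ld;
  void *p;
} arena_align_t;
#define ARENA_ALIGN sizeof(arena_align_t)

/* One grow-only buffer. New chunks are pushed on the front of a list. */
typedef struct arena_chunk_t {
  struct arena_chunk_t *next;
  size_t used;
  size_t capacity;
  arena_align_t mem[]; /* Flexible array member; suitably aligned. */
} arena_chunk_t;

typedef struct arena_t {
  arena_chunk_t *first;
} arena_t;

static arena_chunk_t *
arena_chunk_new(size_t capacity, arena_chunk_t *next)
{
  arena_chunk_t *c = malloc(sizeof(*c) + capacity);
  if (c == NULL)
    abort();
  c->next = next;
  c->used = 0;
  c->capacity = capacity;
  return c;
}

static arena_t *
arena_new(void)
{
  arena_t *a = malloc(sizeof(*a));
  if (a == NULL)
    abort();
  a->first = arena_chunk_new(ARENA_CHUNK_SIZE, NULL);
  return a;
}

/* Hand out `size` bytes by bumping an offset; there is no per-object free. */
static void *
arena_alloc(arena_t *a, size_t size)
{
  if (size > SIZE_MAX / 2)
    abort(); /* Implausibly large request. */
  size_t rounded = (size + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
  if (rounded == 0)
    rounded = ARENA_ALIGN;
  arena_chunk_t *c = a->first;
  if (c->capacity - c->used < rounded) {
    size_t cap = rounded > ARENA_CHUNK_SIZE ? rounded : ARENA_CHUNK_SIZE;
    c = arena_chunk_new(cap, a->first);
    a->first = c;
  }
  void *result = (char *)c->mem + c->used;
  c->used += rounded;
  return result;
}

static char *
arena_strdup(arena_t *a, const char *s)
{
  size_t len = strlen(s) + 1;
  char *copy = arena_alloc(a, len);
  memcpy(copy, s, len);
  return copy;
}

/* Free every chunk and the arena itself; all pointers become invalid. */
static void
arena_drop_all(arena_t *a)
{
  arena_chunk_t *c = a->first;
  while (c != NULL) {
    arena_chunk_t *next = c->next;
    free(c);
    c = next;
  }
  free(a);
}

/* Usage: many tiny allocations, one release at the end. */
static int
arena_demo(void)
{
  arena_t *a = arena_new();
  int ok = 1;
  for (int i = 0; i < 1000; ++i) {
    char *s = arena_strdup(a, "tiny object");
    if (strcmp(s, "tiny object") != 0)
      ok = 0;
  }
  arena_drop_all(a);
  return ok;
}
```

Each allocation is only a bounds check and a pointer bump, which is why this style is fast; the cost is that nothing can be released until the whole arena is dropped.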
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,7 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/meminfo
|
@dir /lib/meminfo
|
||||||
@brief lib/meminfo
|
@brief lib/meminfo: Inspecting malloc() usage.
|
||||||
|
|
||||||
|
Only available when malloc() provides mallinfo() or something similar.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/net
|
@dir /lib/net
|
||||||
@brief lib/net
|
@brief lib/net: Low-level network-related code.
|
||||||
|
|
||||||
|
This module includes address manipulation, compatibility wrappers,
|
||||||
|
convenience functions, and so on.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,10 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/osinfo
|
@dir /lib/osinfo
|
||||||
@brief lib/osinfo
|
@brief lib/osinfo: For inspecting the OS version and capabilities.
|
||||||
|
|
||||||
|
In general, we use this module when we're telling the user what operating
|
||||||
|
system they are running. We shouldn't make decisions based on the output of
|
||||||
|
these checks: instead, we should have more specific checks, either at compile
|
||||||
|
time or run time, based on the observed system behavior.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/process
|
@dir /lib/process
|
||||||
@brief lib/process
|
@brief lib/process: Launch and manage subprocesses.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,16 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/pubsub
|
@dir /lib/pubsub
|
||||||
@brief lib/pubsub
|
@brief lib/pubsub: Publish-subscribe message passing.
|
||||||
|
|
||||||
|
This module wraps the \refdir{lib/dispatch} module, to provide a more
|
||||||
|
ergonomic and type-safe approach to message passing.
|
||||||
|
|
||||||
|
In general, we favor this mechanism for cases where higher-level modules
|
||||||
|
need to be notified when something happens in lower-level modules. (The
|
||||||
|
alternative would be calling up from the lower-level modules, which
|
||||||
|
would be error-prone; or maintaining lists of function-pointers, which
|
||||||
|
would be clumsy and tend to complicate the call graph.)
|
||||||
|
|
||||||
|
See pubsub.c for more information.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,17 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/sandbox
|
@dir /lib/sandbox
|
||||||
@brief lib/sandbox
|
@brief lib/sandbox: Linux seccomp2-based sandbox.
|
||||||
|
|
||||||
|
This module uses Linux's seccomp2 facility via the
|
||||||
|
[`libseccomp` library](https://github.com/seccomp/libseccomp), to restrict
|
||||||
|
the set of system calls that Tor is allowed to invoke while it is running.
|
||||||
|
|
||||||
|
Because there are many libc versions that invoke different system calls, and
|
||||||
|
because handling strings is quite complex, this module is more complex and
|
||||||
|
less portable than it needs to be.
|
||||||
|
|
||||||
|
A better architecture would put the responsibility for invoking tricky system
|
||||||
|
calls (like open()) in another, less restricted process, and give that
|
||||||
|
process responsibility for enforcing our sandbox rules.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,12 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/smartlist_core
|
@dir /lib/smartlist_core
|
||||||
@brief lib/smartlist_core
|
@brief lib/smartlist_core: Minimal dynamic array implementation
|
||||||
|
|
||||||
|
A `smartlist_t` is a dynamic array type for holding `void *`. We use it
|
||||||
|
throughout the rest of the codebase.
|
||||||
|
|
||||||
|
There are higher-level pieces in \refdir{lib/container} but
|
||||||
|
the ones in lib/smartlist_core are used by the logging code, and therefore
|
||||||
|
cannot use the logging code.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +0,0 @@
|
|||||||
/**
|
|
||||||
@dir lib/stats
|
|
||||||
@brief lib/stats
|
|
||||||
**/
|
|
@ -1,4 +1,15 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/string
|
@dir /lib/string
|
||||||
@brief lib/string
|
@brief lib/string: Low-level string manipulation.
|
||||||
|
|
||||||
|
We have a number of compatibility functions here: some are for handling
|
||||||
|
functionality that is not implemented (or not implemented the same) on every
|
||||||
|
platform; some are for providing locale-independent versions of libc
|
||||||
|
functions that would otherwise be defined differently for different users.
|
||||||
|
|
||||||
|
Other functions here are for common string-manipulation operations that we do
|
||||||
|
in the rest of the codebase.
|
||||||
|
|
||||||
|
Any string function high-level enough to need logging belongs in a
|
||||||
|
higher-level module.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,34 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/subsys
|
@dir /lib/subsys
|
||||||
@brief lib/subsys
|
@brief lib/subsys: Types for declaring a "subsystem".
|
||||||
|
|
||||||
|
## Subsystems in Tor
|
||||||
|
|
||||||
|
A subsystem is a module with support for initialization, shutdown,
|
||||||
|
configuration, and so on.
|
||||||
|
|
||||||
|
Many parts of Tor can be initialized, cleaned up, and configured somewhat
|
||||||
|
independently through a table-driven mechanism. Each such part is called a
|
||||||
|
"subsystem".
|
||||||
|
|
||||||
|
To declare a subsystem, make a global `const` instance of the `subsys_fns_t`
|
||||||
|
type, filling in the function pointer fields that you require with ones
|
||||||
|
corresponding to your subsystem. Any function pointers left as "NULL" will
|
||||||
|
be a no-op. Each system must have a name and a "level", which corresponds to
|
||||||
|
the order in which it is initialized. (See `app/main/subsystem_list.c` for a
|
||||||
|
list of current subsystems and their levels.)
|
||||||
|
|
||||||
|
Then, insert your subsystem in the list in `app/main/subsystem_list.c`. It
|
||||||
|
will need to occupy a position corresponding to its level.
|
||||||
|
|
||||||
|
At this point, your subsystem will be handled like the others: it will get
|
||||||
|
initialized at startup, torn down at exit, and so on.
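The table-driven mechanism can be sketched as follows. This is a simplified model, not Tor's code: `demo_subsys_fns_t` is a hypothetical, reduced version of `subsys_fns_t` with only four fields, and the choice to shut down in reverse order of initialization is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for subsys_fns_t. */
typedef struct demo_subsys_fns_t {
  const char *name;
  int level;                /* Lower levels are initialized first. */
  int (*initialize)(void);  /* NULL means "nothing to do". */
  void (*shutdown)(void);   /* NULL means "nothing to do". */
} demo_subsys_fns_t;

/* Record the order of calls so the behavior can be observed. */
static char demo_trace[16];
static size_t demo_trace_len = 0;

static void
demo_trace_add(char c)
{
  if (demo_trace_len + 1 < sizeof(demo_trace)) {
    demo_trace[demo_trace_len++] = c;
    demo_trace[demo_trace_len] = '\0';
  }
}

static int low_init(void) { demo_trace_add('a'); return 0; }
static void low_shutdown(void) { demo_trace_add('A'); }
static int high_init(void) { demo_trace_add('b'); return 0; }

/* Each subsystem is a global const instance of the table type. */
static const demo_subsys_fns_t sys_low = {
  .name = "low",
  .level = -10,
  .initialize = low_init,
  .shutdown = low_shutdown,
};
static const demo_subsys_fns_t sys_high = {
  .name = "high",
  .level = 20,
  .initialize = high_init,
  /* .shutdown left NULL: treated as a no-op. */
};

/* The central list, kept sorted by level. */
static const demo_subsys_fns_t *const demo_subsystems[] = {
  &sys_low,
  &sys_high,
};
#define N_DEMO_SUBSYSTEMS \
  (sizeof(demo_subsystems) / sizeof(demo_subsystems[0]))

static int
demo_subsystems_init(void)
{
  for (size_t i = 0; i < N_DEMO_SUBSYSTEMS; ++i) {
    const demo_subsys_fns_t *sys = demo_subsystems[i];
    if (i > 0)
      assert(sys->level >= demo_subsystems[i - 1]->level);
    if (sys->initialize != NULL && sys->initialize() < 0)
      return -1;
  }
  return 0;
}

static void
demo_subsystems_shutdown(void)
{
  for (size_t i = N_DEMO_SUBSYSTEMS; i > 0; --i) {
    const demo_subsys_fns_t *sys = demo_subsystems[i - 1];
    if (sys->shutdown != NULL)
      sys->shutdown();
  }
}

/* Usage: bring everything up, tear it down, and report the call order. */
static const char *
demo_run(void)
{
  if (demo_subsystems_init() < 0)
    return "init failed";
  demo_subsystems_shutdown();
  return demo_trace;
}
```

Adding a subsystem in this model means defining one more `const` table and inserting it into the list at the position its level requires; no startup or shutdown code elsewhere has to change.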
|
||||||
|
|
||||||
|
Historical note: Not all of Tor's code is currently handled as
|
||||||
|
subsystems. As you work with older code, you may see some parts of the code
|
||||||
|
that are initialized from `tor_init()` or `run_tor_main_loop()` or
|
||||||
|
`tor_run_main()`; and torn down from `tor_cleanup()`. We aim to migrate
|
||||||
|
these to subsystems over time; please don't add any new code that follows
|
||||||
|
this pattern.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/term
|
@dir /lib/term
|
||||||
@brief lib/term
|
@brief lib/term: Terminal operations (password input).
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/testsupport
|
@dir /lib/testsupport
|
||||||
@brief lib/testsupport
|
@brief lib/testsupport: Helpers for test-only code and for function mocking.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,9 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/thread
|
@dir /lib/thread
|
||||||
@brief lib/thread
|
@brief lib/thread: Mid-level threading.
|
||||||
|
|
||||||
|
This module contains compatibility and convenience code for multithreading,
|
||||||
|
except for low-level locks (which are in \refdir{lib/lock}) and
|
||||||
|
workqueue/threadpool code (which belongs in \refdir{lib/evloop}).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,11 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/time
|
@dir /lib/time
|
||||||
@brief lib/time
|
@brief lib/time: Higher-level time functions
|
||||||
|
|
||||||
|
This includes both fine-grained timers and monotonic timers, along with
|
||||||
|
wrappers for them to try to improve efficiency.
|
||||||
|
|
||||||
|
For "what time is it" in UTC, see \refdir{lib/wallclock}. For parsing and
|
||||||
|
encoding times and dates, see \refdir{lib/encoding}.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,13 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/tls
|
@dir /lib/tls
|
||||||
@brief lib/tls
|
@brief lib/tls: TLS library wrappers
|
||||||
|
|
||||||
|
This module has compatibility wrappers around the library (NSS or OpenSSL,
|
||||||
|
depending on configuration) that Tor uses to implement the TLS link security
|
||||||
|
protocol.
|
||||||
|
|
||||||
|
It also implements the logic for some legacy TLS protocol usage we used to
|
||||||
|
support in old versions of Tor, involving conditional delivery of certificate
|
||||||
|
chains (v1 link protocol) and conditional renegotiation (v2 link protocol).
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,8 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/trace
|
@dir /lib/trace
|
||||||
@brief lib/trace
|
@brief lib/trace: Function-tracing functionality API.
|
||||||
|
|
||||||
|
This module is used for adding "trace" support (low-granularity function
|
||||||
|
logging) to Tor. Right now it doesn't have many users.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/version
|
@dir /lib/version
|
||||||
@brief lib/version
|
@brief lib/version: holds the current version of Tor.
|
||||||
**/
|
**/
|
||||||
|
@ -1,4 +1,13 @@
|
|||||||
/**
|
/**
|
||||||
@dir lib/wallclock
|
@dir /lib/wallclock
|
||||||
@brief lib/wallclock
|
@brief lib/wallclock: Inspect and manipulate the current time.
|
||||||
|
|
||||||
|
This module handles our concept of "what time is it" or "what time does the
|
||||||
|
world agree it is?" Generally, if you want something derived from UTC, this
|
||||||
|
is the module for you.
|
||||||
|
|
||||||
|
For versions of the time that are more local, more monotonic, or more
|
||||||
|
accurate, see \refdir{lib/time}. For parsing and encoding times and dates,
|
||||||
|
see \refdir{lib/encoding}.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
121
src/mainpage.dox
@ -1,11 +1,122 @@
|
|||||||
/**
|
/**
|
||||||
@mainpage Tor source reference
|
@mainpage Tor source reference
|
||||||
|
|
||||||
@section intro Getting to know Tor
|
@section intro Welcome to Tor
|
||||||
|
|
||||||
Welcome to the Tor source code documentation! Here we have documentation for
|
This documentation describes the general structure of the Tor codebase, how
|
||||||
nearly every function, type, and module in the Tor source code. The high-level
|
it fits together, what functionality is available for extending Tor, and
|
||||||
documentation is a work in progress. For now, have a look at the source code
|
gives some notes on how Tor got that way. It also includes a reference for
|
||||||
overview in doc/HACKING/design.
|
nearly every function, type, file, and module in the Tor source code. The
|
||||||
|
high-level documentation is a work in progress.
|
||||||
|
|
||||||
|
Tor itself remains a work in progress too: We've been working on it for
|
||||||
|
nearly two decades, and we've learned a lot about good coding since we first
|
||||||
|
started. This means, however, that some of the older pieces of Tor will have
|
||||||
|
some "code smell" in them that could stand a brisk refactoring. So when we
|
||||||
|
describe a piece of code, we'll sometimes give a note on how it got that way,
|
||||||
|
and whether we still think that's a good idea.
|
||||||
|
|
||||||
|
This document is not an overview of the Tor protocol. For that, see the
|
||||||
|
design paper and the specifications at https://spec.torproject.org/ .
|
||||||
|
|
||||||
|
For more information about Tor's coding standards and some helpful
|
||||||
|
development tools, see
|
||||||
|
[doc/HACKING](https://gitweb.torproject.org/tor.git/tree/doc/HACKING) in the
|
||||||
|
Tor repository.
|
||||||
|
|
||||||
|
@section highlevel The very high level
|
||||||
|
|
||||||
|
Ultimately, Tor runs as an event-driven network daemon: it responds to
|
||||||
|
network events, signals, and timers by sending and receiving things over
|
||||||
|
the network. Clients, relays, and directory authorities all use the
|
||||||
|
same codebase: the Tor process will run as a client, relay, or authority
|
||||||
|
depending on its configuration.
|
||||||
|
|
||||||
|
Tor has a few major dependencies, including Libevent (used to tell which
|
||||||
|
sockets are readable and writable), OpenSSL or NSS (used for many encryption
|
||||||
|
functions, and to implement the TLS protocol), and zlib (used to
|
||||||
|
compress and uncompress directory information).
|
||||||
|
|
||||||
|
Most of Tor's work today is done in a single event-driven main thread.
|
||||||
|
Tor also spawns one or more worker threads to handle CPU-intensive
|
||||||
|
tasks. (Right now, this only includes circuit encryption and the more
|
||||||
|
expensive compression algorithms.)
|
||||||
|
|
||||||
|
On startup, Tor initializes its libraries, reads and responds to its
|
||||||
|
configuration files, and launches a main event loop. At first, the only
|
||||||
|
events that Tor listens for are a few signals (like TERM and HUP), and
|
||||||
|
one or more listener sockets (for different kinds of incoming
|
||||||
|
connections). Tor also configures several timers to handle periodic
|
||||||
|
events. As Tor runs over time, other events will open, and new events
|
||||||
|
will be scheduled.
|
||||||
|
|
||||||
|
The codebase is divided into a few top-level subdirectories, each of
|
||||||
|
which contains several sub-modules.
|
||||||
|
|
||||||
|
- `ext` -- Code maintained elsewhere that we include in the Tor
|
||||||
|
source distribution.
|
||||||
|
|
||||||
|
- \refdir{lib} -- Lower-level utility code, not necessarily
|
||||||
|
tor-specific.
|
||||||
|
|
||||||
|
- `trunnel` -- Automatically generated code (from the Trunnel
|
||||||
|
tool): used to parse and encode binary formats.
|
||||||
|
|
||||||
|
- \refdir{core} -- Networking code that implements the central
|
||||||
|
parts of the Tor protocol and main loop.
|
||||||
|
|
||||||
|
- \refdir{feature} -- Aspects of Tor (like directory management,
|
||||||
|
running a relay, running a directory authority, managing a list of
|
||||||
|
nodes, running and using onion services) that are built on top of the
|
||||||
|
mainloop code.
|
||||||
|
|
||||||
|
- \refdir{app} -- Highest-level functionality; responsible for setting
|
||||||
|
up and configuring the Tor daemon, making sure all the lower-level
|
||||||
|
modules start up when required, and so on.
|
||||||
|
|
||||||
|
- \refdir{tools} -- Binaries other than Tor that we produce.
|
||||||
|
Currently this is tor-resolve, tor-gencert, and the tor_runner.o helper
|
||||||
|
module.
|
||||||
|
|
||||||
|
- `test` -- unit tests, regression tests, and a few integration
|
||||||
|
tests.
|
||||||
|
|
||||||
|
In theory, the above parts of the codebase are sorted from lowest-level to
|
||||||
|
highest-level, where high-level code is only allowed to invoke lower-level
|
||||||
|
code, and lower-level code never includes or depends on code of a higher
|
||||||
|
level. In practice, this refactoring is incomplete: The modules in
|
||||||
|
\refdir{lib} are well-factored, but there are many layer violations ("upward
|
||||||
|
dependencies") in \refdir{core} and \refdir{feature}.
|
||||||
|
We aim to eliminate those over time.
|
||||||
|
|
||||||
|
@section keyabstractions Some key high-level abstractions
|
||||||
|
|
||||||
|
The most important abstractions at Tor's high level are Connections,
|
||||||
|
Channels, Circuits, and Nodes.
|
||||||
|
|
||||||
|
A 'Connection' (connection_t) represents a stream-based information flow.
|
||||||
|
Most connections are TCP connections to remote Tor servers and clients. (But
|
||||||
|
as a shortcut, a relay will sometimes make a connection to itself without
|
||||||
|
actually using a TCP connection. More details later on.) Connections exist
|
||||||
|
in different varieties, depending on what functionality they provide. The
|
||||||
|
principal types of connection are edge_connection_t (e.g., a SOCKS connection or
|
||||||
|
a connection from an exit relay to a destination), or_connection_t (a TLS
|
||||||
|
stream connecting to a relay), dir_connection_t (an HTTP connection to learn
|
||||||
|
about the network), and control_connection_t (a connection from a
|
||||||
|
controller).
|
||||||
|
|
||||||
|
A 'Circuit' (circuit_t) is a persistent tunnel through the Tor network,
|
||||||
|
established with public-key cryptography, and used to send cells one or more
|
||||||
|
hops. Clients keep track of multi-hop circuits (origin_circuit_t), and the
|
||||||
|
cryptography associated with each hop. Relays, on the other hand, keep track
|
||||||
|
only of their hop of each circuit (or_circuit_t).
|
||||||
|
|
||||||
|
A 'Channel' (channel_t) is an abstract view of sending cells to and from a
|
||||||
|
Tor relay. Currently, all channels are implemented using OR connections
|
||||||
|
(channel_tls_t). If we switch to other strategies in the future, we'll have
|
||||||
|
more connection types.
|
||||||
|
|
||||||
|
A 'Node' (node_t) is a view of a Tor instance's current knowledge and opinions
|
||||||
|
about a Tor relay or bridge.
|
||||||
|
|
||||||
**/
|
**/
|
||||||
|
@ -1,5 +1,5 @@
|
|||||||
/**
|
/**
|
||||||
@dir tools
|
@dir /tools
|
||||||
@brief tools: other command-line tools for use with Tor.
|
@brief tools: other command-line tools for use with Tor.
|
||||||
|
|
||||||
The "tools" directory has a few other programs that use Tor, but are not part
|
The "tools" directory has a few other programs that use Tor, but are not part
|
||||||
|