mirror of
https://gitlab.torproject.org/tpo/core/tor.git
synced 2024-11-25 04:43:31 +01:00
b56985f022
Closes tpo/core/team#1 .
544 lines
21 KiB
Markdown
544 lines
21 KiB
Markdown
# Coding conventions for Tor
|
|
|
|
tl;dr:
|
|
|
|
- Run configure with `--enable-fatal-warnings`
|
|
- Document your functions
|
|
- Write unit tests
|
|
- Run `make check` before submitting a patch
|
|
- Run `make distcheck` if you have made changes to build system components
|
|
- Add a file in `changes` for your branch.
|
|
|
|
## Patch checklist
|
|
|
|
If possible, send your patch as one of these (in descending order of
|
|
preference)
|
|
|
|
- A git branch we can pull from
|
|
- Patches generated by git format-patch
|
|
- A unified diff
|
|
|
|
Did you remember...
|
|
|
|
- To build your code while configured with `--enable-fatal-warnings`?
|
|
- To run `make check-docs` to see whether all new options are on
|
|
the manpage?
|
|
- To write unit tests, as possible?
|
|
- To run `make test-full` to test against all unit and integration tests (or
|
|
`make test-full-online` if you have a working connection to the internet)?
|
|
- To test that the distribution will actually work via `make distcheck`?
|
|
- To base your code on the appropriate branch?
|
|
- To include a file in the `changes` directory as appropriate?
|
|
|
|
If you are submitting a major patch or new feature, or want to in the future...
|
|
|
|
- Set up Chutney and Stem, see `doc/HACKING/WritingTests.md`
|
|
- Run `make test-full` to test against all unit and integration tests.
|
|
|
|
If you have changed build system components:
|
|
- Please run `make distcheck`
|
|
- For example, if you have changed Makefiles, autoconf files, or anything
|
|
else that affects the build system.
|
|
|
|
## License issues
|
|
|
|
Tor is distributed under the license terms in the LICENSE -- in
|
|
brief, the "3-clause BSD license". If you send us code to
|
|
distribute with Tor, it needs to be code that we can distribute
|
|
under those terms. Please don't send us patches unless you agree
|
|
to allow this.
|
|
|
|
Some compatible licenses include:
|
|
|
|
- 3-clause BSD
|
|
- 2-clause BSD
|
|
- CC0 Public Domain Dedication
|
|
|
|
## How we use Git branches
|
|
|
|
Each main development series (like 0.2.1, 0.2.2, etc) has its main work
|
|
applied to a single branch. At most one series can be the development series
|
|
at a time; all other series are maintenance series that get bug-fixes only.
|
|
The development series is built in a git branch called "master"; the
|
|
maintenance series are built in branches called "maint-0.2.0", "maint-0.2.1",
|
|
and so on. We regularly merge the active maint branches forward.
|
|
|
|
For all series except the development series, we also have a "release" branch
|
|
(as in "release-0.2.1"). The release series is based on the corresponding
|
|
maintenance series, except that it deliberately lags the maint series for
|
|
most of its patches, so that bugfix patches are not typically included in a
|
|
maintenance release until they've been tested for a while in a development
|
|
release. Occasionally, we'll merge an urgent bugfix into the release branch
|
|
before it gets merged into maint, but that's rare.
|
|
|
|
If you're working on a bugfix for a bug that occurs in a particular version,
|
|
base your bugfix branch on the "maint" branch for the first supported series
|
|
that has that bug. (As of June 2013, we're supporting 0.2.3 and later.)
|
|
|
|
If you're working on a new feature, base it on the master branch. If you're
|
|
working on a new feature and it will take a while to implement and/or you'd
|
|
like to avoid the possibility of unrelated bugs in Tor while you're
|
|
implementing your feature, consider branching off of the latest maint- branch.
|
|
_Never_ branch off a relase- branch. Don't branch off a tag either: they come
|
|
from release branches. Doing so will likely produce a nightmare of merge
|
|
conflicts in the ChangeLog when it comes time to merge your branch into Tor.
|
|
Best advice: don't try to keep an independent branch forked for more than 6
|
|
months and expect it to merge cleanly. Try to merge pieces early and often.
|
|
|
|
## How we log changes
|
|
|
|
When you do a commit that needs a ChangeLog entry, add a new file to
|
|
the `changes` toplevel subdirectory. It should have the format of a
|
|
one-entry changelog section from the current ChangeLog file, as in
|
|
|
|
o Major bugfixes (security):
|
|
- Fix a potential buffer overflow. Fixes bug 99999; bugfix on
|
|
0.3.1.4-beta.
|
|
o Minor features (performance):
|
|
- Make tor faster. Closes ticket 88888.
|
|
|
|
To write a changes file, first categorize the change. Some common categories
|
|
are:
|
|
o Minor bugfixes (subheading):
|
|
o Major bugfixes (subheading):
|
|
o Minor features (subheading):
|
|
o Major features (subheading):
|
|
o Code simplifications and refactoring:
|
|
o Testing:
|
|
o Documentation:
|
|
|
|
The subheading is a particular area within Tor. See the ChangeLog for
|
|
examples.
|
|
|
|
Then say what the change does. If it's a bugfix, mention what bug it fixes
|
|
and when the bug was introduced. To find out which Git tag the change was
|
|
introduced in, you can use `git describe --contains <sha1 of commit>`.
|
|
If you don't know the commit, you can search the git diffs (-S) for the first
|
|
instance of the feature (--reverse).
|
|
|
|
For example, for #30224, we wanted to know when the bridge-distribution-request
|
|
feature was introduced into Tor:
|
|
|
|
```console
|
|
$ git log -S bridge-distribution-request --reverse commit ebab521525
|
|
Author: Roger Dingledine <arma@torproject.org>
|
|
Date: Sun Nov 13 02:39:16 2016 -0500
|
|
|
|
Add new BridgeDistribution config option
|
|
|
|
$ git describe --contains ebab521525
|
|
tor-0.3.2.3-alpha~15^2~4
|
|
```
|
|
|
|
If you need to know all the Tor versions that contain a commit, use:
|
|
|
|
```console
|
|
$ git tag --contains 9f2efd02a1 | sort -V
|
|
tor-0.2.5.16
|
|
tor-0.2.8.17
|
|
tor-0.2.9.14
|
|
tor-0.2.9.15
|
|
...
|
|
tor-0.3.0.13
|
|
tor-0.3.1.9
|
|
tor-0.3.1.10
|
|
...
|
|
```
|
|
|
|
If a bug was introduced before the oldest currently supported release series
|
|
of Tor, and it's hard to track down where it was introduced, you may say
|
|
"bugfix on all supported versions of Tor."
|
|
|
|
If at all possible, try to create the changes file in the same commit where
|
|
you are making the change. Please give it a distinctive name that no other
|
|
branch will use for the lifetime of your change. We usually use "ticketNNNNN"
|
|
or "bugNNNNN", where NNNNN is the ticket number. To verify the format of the
|
|
changes file, you can use `make check-changes`. This is run automatically as
|
|
part of `make check` -- if it fails, we must fix it as soon as possible, so
|
|
that our CI passes. These checks are implemented in
|
|
`scripts/maint/lintChanges.py`.
|
|
|
|
Changes file style guide:
|
|
* Make everything terse.
|
|
|
|
* Write from the user's point of view: describe the user-visible changes
|
|
right away.
|
|
|
|
* Mention configuration options by name. If they're rare or unusual,
|
|
remind people what they're for.
|
|
|
|
* Describe changes in the present tense and in the imperative: not past.
|
|
|
|
* Every bugfix should have a sentence of the form "Fixes bug 1234; bugfix
|
|
on 0.1.2.3-alpha", describing what bug was fixed and where it came from.
|
|
|
|
* "Relays", not "servers", "nodes", or "Tor relays".
|
|
|
|
When we go to make a release, we will concatenate all the entries
|
|
in changes to make a draft changelog, and clear the directory. We'll
|
|
then edit the draft changelog into a nice readable format.
|
|
|
|
What needs a changes file?
|
|
|
|
* A not-exhaustive list: Anything that might change user-visible
|
|
behavior. Anything that changes internals, documentation, or the build
|
|
system enough that somebody could notice. Big or interesting code
|
|
rewrites. Anything about which somebody might plausibly wonder "when
|
|
did that happen, and/or why did we do that" 6 months down the line.
|
|
|
|
What does not need a changes file?
|
|
|
|
* Bugfixes for code that hasn't shipped in any released version of Tor
|
|
* Any change to a file that is not distributed in the tarball. This
|
|
includes:
|
|
* Any change to our CI configuration that does not affect the distributed
|
|
source.
|
|
* Any change to developer-only tools, unless those tools are distributed
|
|
in the tarball.
|
|
* Non-functional code movement.
|
|
* Identifier re-namings, comment edits, spelling fixes, and so on.
|
|
|
|
Why use changes files instead of Git commit messages?
|
|
|
|
* Git commit messages are written for developers, not users, and they
|
|
are nigh-impossible to revise after the fact.
|
|
|
|
Why use changes files instead of entries in the ChangeLog?
|
|
|
|
* Having every single commit touch the ChangeLog file tended to create
|
|
zillions of merge conflicts.
|
|
|
|
## Whitespace and C conformance
|
|
|
|
Tor's C code is written in accordance with the C99 standard. Invoke `make
|
|
check-spaces` from time to time, so it can tell you about deviations from our C
|
|
whitespace style. Generally, we use:
|
|
|
|
- Unix-style line endings
|
|
- K&R-style indentation
|
|
- No space before newlines
|
|
- Never more than one blank line in a row
|
|
- Always spaces, never tabs
|
|
- No more than 79-columns per line.
|
|
- Two spaces per indent.
|
|
- A space between control keywords and their corresponding paren
|
|
`if (x)`, `while (x)`, and `switch (x)`, never `if(x)`, `while(x)`, or
|
|
`switch(x)`.
|
|
- A space between anything and an open brace.
|
|
- No space between a function name and an opening paren. `puts(x)`, not
|
|
`puts (x)`.
|
|
- Function declarations at the start of the line.
|
|
- Use `void foo(void)` to declare a function with no arguments. Saying
|
|
`void foo()` is C++ syntax.
|
|
- Use `const` for new APIs.
|
|
- Variables should be initialized when declared, rather than declared at the
|
|
top of a scope.
|
|
|
|
If you use an editor that has plugins for editorconfig.org, the file
|
|
`.editorconfig` will help you to conform this coding style.
|
|
|
|
We try hard to build without warnings everywhere. In particular, if
|
|
you're using gcc, you should invoke the configure script with the
|
|
option `--enable-fatal-warnings`. This will tell the compiler
|
|
to make all warnings into errors.
|
|
|
|
## Functions to use; functions not to use
|
|
|
|
We have some wrapper functions like `tor_malloc`, `tor_free`, `tor_strdup`, and
|
|
`tor_gettimeofday;` use them instead of their generic equivalents. (They
|
|
always succeed or exit.)
|
|
|
|
Specifically, Don't use `malloc`, `realloc`, `calloc`, `free`, or
|
|
`strdup`. Use `tor_malloc`, `tor_realloc`, `tor_calloc`, `tor_free`, or
|
|
`tor_strdup`.
|
|
|
|
Don't use `tor_realloc(x, y\*z)`. Use `tor_reallocarray(x, y, z)` instead.;
|
|
|
|
You can get a full list of the compatibility functions that Tor provides by
|
|
looking through `src/lib/*/*.h`. You can see the
|
|
available containers in `src/lib/containers/*.h`. You should probably
|
|
familiarize yourself with these modules before you write too much code, or
|
|
else you'll wind up reinventing the wheel.
|
|
|
|
We don't use `strcat` or `strcpy` or `sprintf` of any of those notoriously
|
|
broken old C functions. We also avoid `strncat` and `strncpy`. Use
|
|
`strlcat`, `strlcpy`, or `tor_snprintf/tor_asprintf` instead.
|
|
|
|
We don't call `memcmp()` directly. Use `fast_memeq()`, `fast_memneq()`,
|
|
`tor_memeq()`, or `tor_memneq()` for most purposes. If you really need a
|
|
tristate return value, use `tor_memcmp()` or `fast_memcmp()`.
|
|
|
|
Don't call `assert()` directly. For hard asserts, use `tor_assert()`. For
|
|
soft asserts, use `tor_assert_nonfatal()` or `BUG()`. If you need to print
|
|
debug information in assert error message, consider using `tor_assertf()` and
|
|
`tor_assertf_nonfatal()`. If you are writing code that is too low-level to
|
|
use the logging subsystem, use `raw_assert()`.
|
|
|
|
Don't use `toupper()` and `tolower()` functions. Use `TOR_TOUPPER` and
|
|
`TOR_TOLOWER` macros instead. Similarly, use `TOR_ISALPHA`, `TOR_ISALNUM` et.
|
|
al. instead of `isalpha()`, `isalnum()`, etc.
|
|
|
|
When allocating new string to be added to a smartlist, use
|
|
`smartlist_add_asprintf()` to do both at once.
|
|
|
|
Avoid calling BSD socket functions directly. Use portable wrappers to work
|
|
with sockets and socket addresses. Also, sockets should be of type
|
|
`tor_socket_t`.
|
|
|
|
Don't use any of these functions: they aren't portable. Use the
|
|
version prefixed with `tor_` instead: strtok_r, memmem, memstr,
|
|
asprintf, localtime_r, gmtime_r, inet_aton, inet_ntop, inet_pton,
|
|
getpass, ntohll, htonll. (This list is incomplete.)
|
|
|
|
## What code can use what other code?
|
|
|
|
We're trying to simplify Tor's structure over time. In the long run, we want
|
|
Tor to be structured as a set of modules with *no circular dependencies*.
|
|
|
|
This property is currently provided by the modules in src/lib, but not
|
|
throughout the rest of Tor. In general, higher-level libraries may use
|
|
lower-level libraries, but never the reverse.
|
|
|
|
To prevent new circular dependencies from landing, we have a tool that
|
|
you can invoke with `make check-includes`, and which is run
|
|
automatically as part of `make check`. This tool will verify that, for
|
|
every source directory with a `.may_include` file, no local headers are
|
|
included except those specifically permitted by the `.may_include` file.
|
|
When editing one of these files, please make sure that you are not
|
|
introducing any cycles into Tor's dependency graph.
|
|
|
|
## Floating point math is hard
|
|
|
|
Floating point arithmetic as typically implemented by computers is
|
|
very counterintuitive. Failure to adequately analyze floating point
|
|
usage can result in surprising behavior and even security
|
|
vulnerabilities!
|
|
|
|
General advice:
|
|
|
|
- Don't use floating point.
|
|
- If you must use floating point, document how the limits of
|
|
floating point precision and calculation accuracy affect function
|
|
outputs.
|
|
- Try to do as much as possible of your calculations using integers
|
|
(possibly acting as fixed-point numbers) and convert to floating
|
|
point for display.
|
|
- If you must send floating point numbers on the wire, serialize
|
|
them in a platform-independent way. Tor avoids exchanging
|
|
floating-point values, but when it does, it uses ASCII numerals,
|
|
with a decimal point (".").
|
|
- Binary fractions behave very differently from decimal fractions.
|
|
Make sure you understand how these differences affect your
|
|
calculations.
|
|
- Every floating point arithmetic operation is an opportunity to
|
|
lose precision, overflow, underflow, or otherwise produce
|
|
undesired results. Addition and subtraction tend to be worse
|
|
than multiplication and division (due to things like catastrophic
|
|
cancellation). Try to arrange your calculations to minimize such
|
|
effects.
|
|
- Changing the order of operations changes the results of many
|
|
floating-point calculations. Be careful when you simplify
|
|
calculations! If the order is significant, document it using a
|
|
code comment.
|
|
- Comparing most floating point values for equality is unreliable.
|
|
Avoid using `==`, instead, use `>=` or `<=`. If you use an
|
|
epsilon value, make sure it's appropriate for the ranges in
|
|
question.
|
|
- Different environments (including compiler flags and per-thread
|
|
state on a single platform!) can get different results from the
|
|
same floating point calculations. This means you can't use
|
|
floats in anything that needs to be deterministic, like consensus
|
|
generation. This also makes reliable unit tests of
|
|
floating-point outputs hard to write.
|
|
|
|
For additional useful advice (and a little bit of background), see
|
|
[What Every Programmer Should Know About Floating-Point
|
|
Arithmetic](https://floating-point-gui.de/).
|
|
|
|
A list of notable (and surprising) facts about floating point
|
|
arithmetic is at [Floating-point
|
|
complexities](https://randomascii.wordpress.com/2012/04/05/floating-point-complexities/).
|
|
Most of that [series of posts on floating
|
|
point](https://randomascii.wordpress.com/category/floating-point/) is
|
|
helpful.
|
|
|
|
For more detailed (and math-intensive) background, see [What Every
|
|
Computer Scientist Should Know About Floating-Point
|
|
Arithmetic](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html).
|
|
|
|
## Other C conventions
|
|
|
|
The `a ? b : c` trinary operator only goes inside other expressions;
|
|
don't use it as a replacement for if. (You can ignore this inside macro
|
|
definitions when necessary.)
|
|
|
|
Assignment operators shouldn't nest inside other expressions. (You can
|
|
ignore this inside macro definitions when necessary.)
|
|
|
|
## Binary data and wire formats
|
|
|
|
Use pointer to `char` when representing NUL-terminated string. To represent
|
|
arbitrary binary data, use pointer to `uint8_t`. (Many older Tor APIs ignore
|
|
this rule.)
|
|
|
|
Refrain from attempting to encode integers by casting their pointers to byte
|
|
arrays. Use something like `set_uint32()`/`get_uint32()` instead and don't
|
|
forget about endianness.
|
|
|
|
Try to never hand-write new code to parse or generate binary
|
|
formats. Instead, use trunnel if at all possible. See
|
|
|
|
https://gitweb.torproject.org/trunnel.git/tree
|
|
|
|
for more information about trunnel.
|
|
|
|
For information on adding new trunnel code to Tor, see src/trunnel/README
|
|
|
|
## Calling and naming conventions
|
|
|
|
Whenever possible, functions should return -1 on error and 0 on success.
|
|
|
|
For multi-word identifiers, use lowercase words combined with
|
|
underscores. (e.g., `multi_word_identifier`). Use ALL_CAPS for macros and
|
|
constants.
|
|
|
|
Typenames should end with `_t`.
|
|
|
|
Function names should be prefixed with a module name or object name. (In
|
|
general, code to manipulate an object should be a module with the same name
|
|
as the object, so it's hard to tell which convention is used.)
|
|
|
|
Functions that do things should have imperative-verb names
|
|
(e.g. `buffer_clear`, `buffer_resize`); functions that return booleans should
|
|
have predicate names (e.g. `buffer_is_empty`, `buffer_needs_resizing`).
|
|
|
|
If you find that you have four or more possible return code values, it's
|
|
probably time to create an enum. If you find that you are passing three or
|
|
more flags to a function, it's probably time to create a flags argument that
|
|
takes a bitfield.
|
|
|
|
## What To Optimize
|
|
|
|
Don't optimize anything if it's not in the critical path. Right now, the
|
|
critical path seems to be AES, logging, and the network itself. Feel free to
|
|
do your own profiling to determine otherwise.
|
|
|
|
## Log conventions
|
|
|
|
[FAQ - Log Levels](https://www.torproject.org/docs/faq#LogLevel)
|
|
|
|
No error or warning messages should be expected during normal OR or OP
|
|
operation.
|
|
|
|
If a library function is currently called such that failure always means ERR,
|
|
then the library function should log WARN and let the caller log ERR.
|
|
|
|
Every message of severity INFO or higher should either (A) be intelligible
|
|
to end-users who don't know the Tor source; or (B) somehow inform the
|
|
end-users that they aren't expected to understand the message (perhaps
|
|
with a string like "internal error"). Option (A) is to be preferred to
|
|
option (B).
|
|
|
|
## Assertions In Tor
|
|
|
|
Assertions should be used for bug-detection only. Don't use assertions to
|
|
detect bad user inputs, network errors, resource exhaustion, or similar
|
|
issues.
|
|
|
|
Tor is always built with assertions enabled, so try to only use
|
|
`tor_assert()` for cases where you are absolutely sure that crashing is the
|
|
least bad option. Many bugs have been caused by use of `tor_assert()` when
|
|
another kind of check would have been safer.
|
|
|
|
If you're writing an assertion to test for a bug that you _can_ recover from,
|
|
use `tor_assert_nonfatal()` in place of `tor_assert()`. If you'd like to
|
|
write a conditional that incorporates a nonfatal assertion, use the `BUG()`
|
|
macro, as in:
|
|
|
|
```c
|
|
if (BUG(ptr == NULL))
|
|
return -1;
|
|
```
|
|
|
|
## Allocator conventions
|
|
|
|
By convention, any tor type with a name like `abc_t` should be allocated
|
|
by a function named `abc_new()`. This function should never return
|
|
NULL.
|
|
|
|
Also, a type named `abc_t` should be freed by a function named `abc_free_()`.
|
|
Don't call this `abc_free_()` function directly -- instead, wrap it in a
|
|
macro called `abc_free()`, using the `FREE_AND_NULL` macro:
|
|
|
|
```c
|
|
void abc_free_(abc_t *obj);
|
|
#define abc_free(obj) FREE_AND_NULL(abc_t, abc_free_, (obj))
|
|
```
|
|
|
|
This macro will free the underlying `abc_t` object, and will also set
|
|
the object pointer to NULL.
|
|
|
|
You should define all `abc_free_()` functions to accept NULL inputs:
|
|
|
|
```c
|
|
void
|
|
abc_free_(abc_t *obj)
|
|
{
|
|
if (!obj)
|
|
return;
|
|
tor_free(obj->name);
|
|
thing_free(obj->thing);
|
|
tor_free(obj);
|
|
}
|
|
```
|
|
|
|
If you need a free function that takes a `void *` argument (for example,
|
|
to use it as a function callback), define it with a name like
|
|
`abc_free_void()`:
|
|
|
|
```c
|
|
static void
|
|
abc_free_void_(void *obj)
|
|
{
|
|
abc_free_(obj);
|
|
}
|
|
```
|
|
|
|
When deallocating, don't say e.g. `if (x) tor_free(x)`. The convention is to
|
|
have deallocators do nothing when NULL pointer is passed.
|
|
|
|
## Doxygen comment conventions
|
|
|
|
Say what functions do as a series of one or more imperative sentences, as
|
|
though you were telling somebody how to be the function. In other words, DO
|
|
NOT say:
|
|
|
|
```c
|
|
/** The strtol function parses a number.
|
|
*
|
|
* nptr -- the string to parse. It can include whitespace.
|
|
* endptr -- a string pointer to hold the first thing that is not part
|
|
* of the number, if present.
|
|
* base -- the numeric base.
|
|
* returns: the resulting number.
|
|
*/
|
|
long strtol(const char *nptr, char **nptr, int base);
|
|
```
|
|
|
|
Instead, please DO say:
|
|
|
|
```c
|
|
/** Parse a number in radix <b>base</b> from the string <b>nptr</b>,
|
|
* and return the result. Skip all leading whitespace. If
|
|
* <b>endptr</b> is not NULL, set *<b>endptr</b> to the first character
|
|
* after the number parsed.
|
|
**/
|
|
long strtol(const char *nptr, char **nptr, int base);
|
|
```
|
|
|
|
Doxygen comments are the contract in our abstraction-by-contract world: if
|
|
the functions that call your function rely on it doing something, then your
|
|
function should mention that it does that something in the documentation. If
|
|
you rely on a function doing something beyond what is in its documentation,
|
|
then you should watch out, or it might do something else later.
|