2020-03-26 16:19:42 +01:00
|
|
|
# Tracing
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2017-04-25 16:40:27 +02:00
|
|
|
This document describes how the event tracing subsystem works in tor so
|
|
|
|
developers can add events to the code base but also hook them to an event
|
2020-02-11 21:06:50 +01:00
|
|
|
tracing framework (i.e. tracer).
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-03-11 17:26:10 +01:00
|
|
|
## WARNING ##
|
|
|
|
|
|
|
|
Tracing the tor daemon **always** generates sensitive data if used in
|
|
|
|
production (on the public network).
|
|
|
|
|
|
|
|
It **is** ethical for researchers to use tracing for their own tor client (for
|
|
|
|
example: building paths, timings, or performance).
|
|
|
|
|
|
|
|
It is **NOT** ethical to archive, publish or keep data containing other users'
|
|
|
|
activity such as relay data or anything that handles users' traffic. This
|
|
|
|
of course includes any logs below notice level.
|
|
|
|
|
|
|
|
Publishing analysis of tracing data containing user traffic is **NOT** safe
|
|
|
|
either.
|
|
|
|
|
|
|
|
In other words, tracing data that contains other users's activity is **NOT**
|
|
|
|
safe to publish in any form.
|
|
|
|
|
|
|
|
## Basics ###
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
Tracing is separated in two different concepts. The tracing API and the
|
|
|
|
tracing probes.
|
|
|
|
|
|
|
|
The API is in `src/lib/trace/` which defines how to call tracepoints in the
|
2020-07-13 11:16:51 +02:00
|
|
|
tor code. Every C files should include `src/lib/trace/events.h` if they want
|
2020-02-11 21:06:50 +01:00
|
|
|
to call a tracepoint.
|
|
|
|
|
|
|
|
The probes are what actually record the tracepoint data. Because they often
|
|
|
|
need to access specific subsystem objects, the probes are within each
|
|
|
|
subsystem. They are defined in the `trace-probes-<subsystem>.c` files.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-03-26 16:19:42 +01:00
|
|
|
### Events
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
A trace event is basically a function from which we can pass any data that we
|
|
|
|
want to collect. In addition, we specify a context for the event such as the
|
|
|
|
subsystem and an event name.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
|
|
|
A trace event in tor has the following standard format:
|
|
|
|
|
2020-07-13 11:16:51 +02:00
|
|
|
```c
|
2020-07-14 16:09:01 +02:00
|
|
|
tor_trace(subsystem, event_name, args...);
|
2020-07-13 11:16:51 +02:00
|
|
|
```
|
2017-01-23 20:28:10 +01:00
|
|
|
|
|
|
|
The `subsystem` parameter is the name of the subsytem the trace event is in.
|
|
|
|
For example that could be "scheduler" or "vote" or "hs". The idea is to add
|
|
|
|
some context to the event so when we collect them we know where it's coming
|
2020-02-11 21:06:50 +01:00
|
|
|
from.
|
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
The `event_name` is the name of the event which adds better semantic to the
|
2020-02-11 21:06:50 +01:00
|
|
|
event.
|
|
|
|
|
|
|
|
The `args` can be any number of arguments we want to collect.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
|
|
|
Here is an example of a possible tracepoint in main():
|
|
|
|
|
2020-07-13 11:16:51 +02:00
|
|
|
```c
|
|
|
|
tor_trace(main, init_phase, argc);
|
|
|
|
```
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
The above is a tracepoint in the `main` subsystem with `init_phase` as the
|
2020-02-11 21:06:50 +01:00
|
|
|
event name and the `int argc` is passed to the event as one argument.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
|
|
|
How `argc` is collected or used has nothing to do with the instrumentation
|
|
|
|
(adding trace events to the code). It is the work of the tracer so this is why
|
|
|
|
the trace events and collection framework (tracer) are decoupled. You _can_
|
|
|
|
have trace events without a tracer.
|
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
### Instrumentation ###
|
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
In `src/lib/trace/events.h`, we map the high level `tor_trace()` macro to one
|
2020-02-11 21:06:50 +01:00
|
|
|
or many enabled instrumentation.
|
|
|
|
|
|
|
|
Currently, we have 3 types of possible instrumentation:
|
|
|
|
|
|
|
|
1. Debug
|
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
This will map every tracepoint to `log_debug()`. However, none of the
|
2020-02-11 21:06:50 +01:00
|
|
|
arguments will be passed on because we don't know their type nor the string
|
|
|
|
format of the debug log. The output is standardized like this:
|
|
|
|
|
2020-07-13 11:16:51 +02:00
|
|
|
```
|
|
|
|
[debug] __FUNC__: Tracepoint <event_name> from subsystem <subsystem> hit.
|
|
|
|
```
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
2. USDT
|
|
|
|
|
|
|
|
User Statically-Defined Tracing (USDT) is a kind of probe which can be
|
|
|
|
handled by a variety of tracers such as SystemTap, DTrace, perf, eBPF and
|
|
|
|
ftrace.
|
|
|
|
|
|
|
|
For each tracer, one will need to define the ABI in order for the tracer to
|
|
|
|
be able to extract the data from the tracepoint objects. For instance, the
|
2020-07-14 16:09:01 +02:00
|
|
|
tracer needs to know how to print the circuit state of a `circuit_t`
|
2020-02-11 21:06:50 +01:00
|
|
|
object.
|
|
|
|
|
|
|
|
3. LTTng-UST
|
|
|
|
|
|
|
|
LTTng Userspace is a tracer that has it own type of instrumentation. The
|
|
|
|
probe definitions are created within the C code and is strongly typed.
|
|
|
|
|
|
|
|
For more information, see https://lttng.org/docs.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-03-26 16:19:42 +01:00
|
|
|
## Build System
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
This section describes how the instrumentation is integrated into the build
|
|
|
|
system of tor.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
By default, every tracing events are disabled in tor that is `tor_trace()` is
|
2020-02-11 21:06:50 +01:00
|
|
|
a NOP thus has no execution cost time.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
To enable a specific instrumentation, there are configure options:
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
1. Debug: `--enable-tracing-instrumentation-debug`
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
2. USDT: `--enable-tracing-instrumentation-usdt`
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
3. LTTng: `--enable-tracing-instrumentation-lttng`
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
They can all be used together or independently. If one of them is set,
|
2020-07-14 16:09:01 +02:00
|
|
|
`HAVE_TRACING` define is set. And for each instrumentation, a
|
|
|
|
`USE_TRACING_INSTRUMENTATION_<type>` is set.
|
2020-02-11 21:06:50 +01:00
|
|
|
|
|
|
|
## Adding a Tracepoint ##
|
2017-01-23 20:28:10 +01:00
|
|
|
|
|
|
|
This is pretty easy. Let's say you want to add a trace event in
|
2020-02-11 21:06:50 +01:00
|
|
|
`src/feature/rend/rendcache.c`, you first need to include this file:
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-07-13 11:16:51 +02:00
|
|
|
```c
|
2020-07-14 16:09:01 +02:00
|
|
|
#include "lib/trace/events.h"
|
2020-07-13 11:16:51 +02:00
|
|
|
```
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-07-14 16:09:01 +02:00
|
|
|
Then, the `tor_trace()` macro can be used with the specific format detailled
|
2020-02-11 21:06:50 +01:00
|
|
|
before in a previous section. As an example:
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-07-13 11:16:51 +02:00
|
|
|
```c
|
|
|
|
tor_trace(hs, store_desc_as_client, desc, desc_id);
|
|
|
|
```
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
For `Debug` instrumentation, you have nothing else to do.
|
|
|
|
|
|
|
|
For `USDT`, instrumentation, you will need to define the probes in a way the
|
|
|
|
specific tracer can understand. For instance, SystemTap requires you to define
|
|
|
|
a `tapset` for each tracepoints.
|
|
|
|
|
|
|
|
For `LTTng`, you will need to define the probes in the
|
|
|
|
`trace-probes-<subsystem>.{c|h}` file. See the `trace-probes-circuit.{c|h}`
|
|
|
|
file as an example and https://lttng.org/docs/v2.11/#doc-instrumenting.
|
|
|
|
|
|
|
|
## Performance ##
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
A word about performance when a tracepoint is enabled. One of the goal of a
|
|
|
|
tracepoint (USDT, LTTng-UST, ...) is that they can be enabled or disabled. By
|
|
|
|
default, they are disabled which means the tracer will not record the data but
|
|
|
|
it has to do a check thus the cost is basically the one of a `branch`.
|
2017-01-23 20:28:10 +01:00
|
|
|
|
2020-02-11 21:06:50 +01:00
|
|
|
If enabled, then the performance depends on the tracer. In the case of
|
|
|
|
LTTng-UST, the event costs around 110nsec.
|