2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
Tor Incentives Design Brainstorms
|
|
|
|
|
|
|
|
1. Goals: what do we want to achieve with an incentive scheme?
|
|
|
|
|
|
|
|
1.1. Encourage users to provide good relay service (throughput, latency).
|
|
|
|
1.2. Encourage users to allow traffic to exit the Tor network from
|
|
|
|
their node.
|
|
|
|
|
|
|
|
2. Approaches to learning who should get priority.
|
|
|
|
|
|
|
|
2.1. "Hard" or quantitative reputation tracking.
|
|
|
|
|
|
|
|
In this design, we track the number of bytes and throughput in and
|
|
|
|
out of nodes we interact with. When a node asks to send or receive
|
|
|
|
bytes, we provide service proportional to our current record of the
|
|
|
|
node's value. One approach is to let each circuit be either a normal
|
|
|
|
circuit or a premium circuit, and nodes can "spend" their value by
|
|
|
|
sending and receiving bytes on premium circuits: see section 4.1 for
|
|
|
|
details of this design. Another approach (section 4.2) would treat
|
|
|
|
all traffic from the node with the same priority class, and so nodes
|
|
|
|
that provide resources will get and provide better service on average.
|
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
This approach could be complemented with an anonymous e-cash
|
|
|
|
implementation to let people spend reputations gained in one context
|
|
|
|
in another context.
|
|
|
|
|
2006-01-31 10:10:13 +01:00
|
|
|
2.2. "Soft" or qualitative reputation tracking.
|
|
|
|
|
|
|
|
Rather than accounting for every byte (if I owe you a byte, I don't
|
|
|
|
owe it anymore once you've spent it), instead I keep a general opinion
|
|
|
|
about each server: my opinion increases when they do good work for me,
|
|
|
|
and it decays with time, but it does not decrease as they send traffic.
|
|
|
|
Therefore we reward servers who provide value to the system without
|
|
|
|
nickle and diming them at each step. We also let them benefit from
|
|
|
|
relaying traffic for others without having to "reserve" some of the
|
|
|
|
payment for their own use. See section 4.3 for a possible design.
|
|
|
|
|
|
|
|
2.3. Centralized opinions from the reputation servers.
|
|
|
|
|
|
|
|
The above approaches are complex and we don't have all the answers
|
|
|
|
for them yet. A simpler approach is just to let some central set
|
|
|
|
of trusted servers (say, the Tor directory servers) measure whether
|
|
|
|
people are contributing to the network, and provide a signal about
|
|
|
|
which servers should be rewarded. They can even do the measurements
|
|
|
|
via Tor so servers can't easily perform only when they're being
|
|
|
|
tested. See section 4.4.
|
|
|
|
|
|
|
|
2.4. Reputation servers that aggregate opinions.
|
|
|
|
|
|
|
|
The option above has the directory servers doing all of the
|
|
|
|
measurements. This doesn't scale. We can set it up so we have "deputy
|
|
|
|
testers" -- trusted other nodes that do performance testing and report
|
|
|
|
their results. If we want to be really adventurous, we could even
|
|
|
|
accept claims from every Tor user and build a complex weighting /
|
|
|
|
reputation system to decide which claims are "probably" right.
|
|
|
|
|
|
|
|
3. Related issues we need to keep in mind.
|
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
3.1. Relay and exit configuration needs to be easy and usable.
|
2006-02-01 07:45:15 +01:00
|
|
|
|
|
|
|
Implicit in all of the above designs is the need to make it easy to
|
|
|
|
run a Tor server out of the box. We need to make it stable on all
|
|
|
|
common platforms (including XP), it needs to detect its available
|
|
|
|
bandwidth and not overreach that, and it needs to help the operator
|
|
|
|
through opening up ports on his firewall. Then we need a slick GUI
|
|
|
|
that lets people click a button or two rather than editing text files.
|
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
Once we've done all this, we'll hit our first big question: is
|
2006-02-01 07:45:15 +01:00
|
|
|
most of the barrier to growth caused by the unusability of the current
|
|
|
|
software? If so, are the rest of these incentive schemes superfluous?
|
|
|
|
|
|
|
|
3.2. The network effect: how many nodes will you interact with?
|
2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
One of the concerns with pairwise reputation systems is that as the
|
|
|
|
network gets thousands of servers, the chance that you're going to
|
2006-02-12 11:34:31 +01:00
|
|
|
interact with a given server decreases. So if 90% of interactions
|
|
|
|
don't have any prior information, the "local" incentive schemes above
|
2006-01-31 10:10:13 +01:00
|
|
|
are going to degrade. This doesn't mean they're pointless -- it just
|
|
|
|
means we need to be aware that this is a limitation, and plan in the
|
2006-02-12 11:34:31 +01:00
|
|
|
background for what step to take next. (It seems that e-cash solutions
|
|
|
|
would scale better, though they have issues of their own.)
|
2006-01-31 10:10:13 +01:00
|
|
|
|
2006-02-01 07:45:15 +01:00
|
|
|
3.3. Guard nodes
|
2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
As of Tor 0.1.1.11, Tor users pick from a small set of semi-permanent
|
|
|
|
"guard nodes" for their first hop of each circuit. This seems to have
|
|
|
|
a big impact on pairwise reputation systems since you will only be
|
|
|
|
cashing in on your reputation to a few people, and it is unlikely
|
2006-02-12 11:34:31 +01:00
|
|
|
that a given pair of nodes will use each other as guard nodes.
|
2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
What does this imply? For one, it means that we don't care at all
|
|
|
|
about the opinions of most of the servers out there -- we should
|
|
|
|
focus on keeping our guard nodes happy with us.
|
|
|
|
|
|
|
|
One conclusion from that is that our design needs to judge performance
|
|
|
|
not just through direct interaction (beginning of the circuit) but
|
|
|
|
also through indirect interaction (middle of the circuit). That way
|
|
|
|
you can never be sure when your guards are measuring you.
|
|
|
|
|
2006-02-01 07:45:15 +01:00
|
|
|
3.4. Restricted topology: benefits and roadmap.
|
2006-01-31 10:10:13 +01:00
|
|
|
|
2006-02-01 11:50:23 +01:00
|
|
|
As the Tor network continues to grow, we will need to make design
|
|
|
|
changes to the network topology so that each node does not need
|
|
|
|
to maintain connections to an unbounded number of other nodes. For
|
2006-02-12 11:34:31 +01:00
|
|
|
anonymity's sake, we may partition the network such that all
|
2006-02-01 11:50:23 +01:00
|
|
|
the nodes have the same belief about the divisions and each node is
|
|
|
|
in only one partition. (The alternative is that every user fetches
|
|
|
|
his own random subset of the overall node list -- this is bad because
|
|
|
|
of intersection attacks.)
|
|
|
|
|
|
|
|
Therefore the "network horizon" for each user will stay bounded,
|
|
|
|
which helps against the above issues in 3.2 and 3.3.
|
|
|
|
|
|
|
|
It could be that the core of long-lived servers will all get to know
|
|
|
|
each other, and so the critical point that decides whether you get
|
|
|
|
good service is whether the core likes you. Or perhaps it will turn
|
|
|
|
out to work some other way.
|
|
|
|
|
|
|
|
A special case here is the social network, where the network isn't
|
|
|
|
partitioned randomly but instead based on some external properties.
|
2006-02-12 11:34:31 +01:00
|
|
|
Social network topologies can provide incentives in other ways, because
|
|
|
|
people may be more inclined to help out their friends, and more willing
|
|
|
|
to relay traffic if only their friends are relaying through them. It
|
|
|
|
also opens the door for out-of-band incentive schemes because of the
|
|
|
|
out-of-band links in the graph.
|
2006-02-01 07:45:15 +01:00
|
|
|
|
|
|
|
3.5. Profit-maximizing vs. Altruism.
|
2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
There are some interesting game theory questions here.
|
|
|
|
|
|
|
|
First, in a volunteer culture, success is measured in public utility
|
|
|
|
or in public esteem. If we add a reward mechanism, there's a risk that
|
|
|
|
reward-maximizing behavior will surpass utility- or esteem-maximizing
|
|
|
|
behavior.
|
|
|
|
|
|
|
|
Specifically, if most of our servers right now are relaying traffic
|
|
|
|
for the good of the community, we may actually *lose* those volunteers
|
|
|
|
if we turn the act of relaying traffic into a selfish act.
|
|
|
|
|
|
|
|
I am not too worried about this issue for now, since we're aiming
|
|
|
|
for an incentive scheme so effective that it produces thousands of
|
|
|
|
new servers.
|
|
|
|
|
2006-02-01 11:50:23 +01:00
|
|
|
3.6. What part of the node's performance do you measure?
|
|
|
|
|
|
|
|
We keep referring to having a node measure how well the other nodes
|
2006-02-12 11:34:31 +01:00
|
|
|
receive bytes. But don't leeching clients receive bytes just as well
|
|
|
|
as servers?
|
|
|
|
|
|
|
|
Further, many transactions in Tor involve fetching lots of
|
2006-02-01 11:50:23 +01:00
|
|
|
bytes and not sending very many. So it seems that we want to turn
|
|
|
|
things around: we need to measure how quickly a node can _send_
|
|
|
|
us bytes, and then only send it bytes in proportion to that.
|
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
However, a sneaky user could simply connect to a node and send some
|
|
|
|
traffic through it, and voila, he has performed for the network. This
|
|
|
|
is no good. The first fix is that we only count if you're receiving
|
|
|
|
bytes "backwards" in the circuit. Now the sneaky user needs to
|
|
|
|
construct a circuit such that his node appears later in the circuit,
|
|
|
|
and then send some bytes back quickly.
|
2006-02-01 11:50:23 +01:00
|
|
|
|
|
|
|
Maybe that complexity is sufficient to deter most lazy users. Or
|
|
|
|
maybe it's an argument in favor of a more penny-counting reputation
|
|
|
|
approach.
|
|
|
|
|
2006-02-09 04:44:13 +01:00
|
|
|
3.7. What is the appropriate resource balance for servers vs. clients?
|
|
|
|
|
|
|
|
If we build a good incentive system, we'll still need to tune it
|
|
|
|
to provide the right bandwidth allocation -- if we reserve too much
|
|
|
|
bandwidth for fast servers, then we're wasting some potential, but we
|
|
|
|
if we reserve too little, then fewer people will opt to become servers.
|
2006-02-12 11:34:31 +01:00
|
|
|
In fact, finding an optimum balance is especially hard because it's
|
|
|
|
a moving target: the better our incentive mechanism (and the lower
|
|
|
|
the barrier to setup), the more servers there will be. How do we find
|
|
|
|
the right balance?
|
2006-02-09 04:44:13 +01:00
|
|
|
|
|
|
|
One answer is that it doesn't have to be perfect: we can err on the
|
2006-02-12 11:34:31 +01:00
|
|
|
side of providing extra resources to servers. Then we will achieve our
|
|
|
|
desired goal -- when people complain about speed, we can tell them to
|
|
|
|
run a server, and they will in fact get better performance.
|
2006-02-09 04:44:13 +01:00
|
|
|
|
|
|
|
3.8. Anonymity attack: fast connections probably come from good servers.
|
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
If only fast servers can consistently get good performance in the
|
|
|
|
network, they will stand out. "Oh, that connection probably came from
|
|
|
|
one of the top ten servers in the network." Intersection attacks over
|
|
|
|
time can improve the certainty of the attack.
|
2006-02-09 04:44:13 +01:00
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
I'm not too worried about this. First, in periods of low activity,
|
|
|
|
many different people might be getting good performance. This dirties
|
|
|
|
the intersection attack. Second, with many of these schemes, we will
|
|
|
|
still be uncertain whether the fast node originated the traffic, or
|
|
|
|
was the entry node for some other lucky user -- and we already accept
|
|
|
|
this level of attack in other cases such as the Murdoch-Danezis attack
|
|
|
|
(http://freehaven.net/anonbib/#torta05).
|
2006-02-09 04:44:13 +01:00
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
3.9. How do we allocate bandwidth over the course of a second?
|
2006-02-09 04:44:13 +01:00
|
|
|
|
2006-02-12 11:34:31 +01:00
|
|
|
This may be a simple matter of engineering, but it still needs to be
|
|
|
|
addressed. Our current token bucket design refills each bucket once a
|
|
|
|
second. If we have N tokens in our bucket, and we don't know ahead of
|
|
|
|
time how many connections are going to want to send how many bytes,
|
|
|
|
how do we balance providing quick service to the traffic that is
|
|
|
|
already here compared to providing service to potential high-importance
|
|
|
|
future traffic?
|
|
|
|
|
|
|
|
If we have only two classes of service, here is a simple design:
|
|
|
|
At each point, when we are 1/t through the second, the total number
|
|
|
|
of non-priority bytes we are willing to accept is N/t. Thus if N
|
|
|
|
priority bytes arrive at the beginning of the second, we drain our
|
|
|
|
whole bucket then, and otherwise we provide some delayed service to
|
|
|
|
the non-priority bytes.
|
|
|
|
|
|
|
|
Does this design expand to cover the case of three priority classes?
|
|
|
|
Ideally we'd give each remote server its own priority number. Or
|
|
|
|
hopefully there's an easy design in the literature to point to --
|
|
|
|
this is clearly not my field.
|
2006-02-09 04:44:13 +01:00
|
|
|
|
2006-02-21 04:58:05 +01:00
|
|
|
3.10. Different-priority cells arriving on the same TCP connection.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
2006-01-31 10:10:13 +01:00
|
|
|
4. Sample designs.
|
|
|
|
|
|
|
|
4.1. Two classes of service for circuits.
|
|
|
|
|
|
|
|
4.2. Treat all the traffic from the node with the same service;
|
|
|
|
hard reputation system.
|
|
|
|
|
|
|
|
4.3. Treat all the traffic from the node with the same service;
|
|
|
|
soft reputation system.
|
|
|
|
|
2006-02-01 11:50:23 +01:00
|
|
|
Rather than a guaranteed system with accounting (as 4.1 and 4.2),
|
|
|
|
we instead try for a best-effort system. All bytes are in the same
|
|
|
|
class of service. You keep track of other Tors by key, and give them
|
|
|
|
service proportional to the service they have given you. That is, in
|
|
|
|
the past when you have tried to push bytes through them, you track the
|
|
|
|
number of bytes and the average bandwidth, and use that to weight the
|
|
|
|
priority of their connections if they try to push bytes through you.
|
|
|
|
|
|
|
|
Now you're going to get minimum service if you don't ever push bytes
|
|
|
|
for other people, and you get increasingly improved service the more
|
|
|
|
active you are. We should have memories fade over time (we'll have
|
|
|
|
to tune that, which could be quite hard).
|
|
|
|
|
|
|
|
Pro: Sybil attacks are pointless because new identities get lowest
|
|
|
|
priority.
|
|
|
|
|
|
|
|
Pro: Smoothly handles periods of both low and high network load. Rather
|
|
|
|
than keeping track of the ratio/difference between what he's done for
|
|
|
|
you and what you've done for him, simply keep track of what he's done
|
|
|
|
for you, and give him priority based on that.
|
|
|
|
|
|
|
|
Based on 3.3 above, it seems we should reward all the nodes in our
|
|
|
|
path, not just the first one -- otherwise the node can provide good
|
|
|
|
service only to its guards. On the other hand, there might be a
|
|
|
|
second-order effect where you want nodes to like you so that *when*
|
|
|
|
your guards choose you for a circuit, they'll be able to get good
|
|
|
|
performance. This tradeoff needs more simulation/analysis.
|
|
|
|
|
|
|
|
This approach focuses on incenting people to relay traffic, but it
|
|
|
|
doesn't do much for incenting them to allow exits. It may help in
|
|
|
|
one way through: if there are few exits, then they will attract a
|
|
|
|
lot of use, so lots of people will like them, so when they try to
|
|
|
|
use the network they will find their first hop to be particularly
|
|
|
|
pleasant. After that they're like the rest of the world though.
|
|
|
|
|
|
|
|
Pro: this is a pretty easy design to add; and it can be phased in
|
|
|
|
incrementally simply by having new nodes behave differently.
|
|
|
|
|
2006-01-31 10:10:13 +01:00
|
|
|
4.4. Centralized opinions from the reputation servers.
|
|
|
|
|
2006-02-01 11:50:23 +01:00
|
|
|
Have a set of official measurers who spot-check servers from the
|
|
|
|
directory to see if they really do offer roughly the bandwidth
|
|
|
|
they advertise. Include these observations in the directory. (For
|
|
|
|
simplicity, the directory servers could be the measurers.) Then Tor
|
2006-02-13 11:43:29 +01:00
|
|
|
servers give priority to other servers. We'd like to weight the
|
|
|
|
priority by advertised bandwidth to encourage people to donate more,
|
|
|
|
but it seems hard to distinguish between a slow server and a busy
|
|
|
|
server.
|
|
|
|
|
|
|
|
The spot-checking can be done anonymously to prevent selectively
|
|
|
|
performing only for the measurers, because hey, we have an anonymity
|
|
|
|
network.
|
2006-02-01 11:50:23 +01:00
|
|
|
|
|
|
|
We could also reward exit nodes by giving them better priority, but
|
|
|
|
like above this only will affect their first hop. Another problem
|
|
|
|
is that it's darn hard to spot-check whether a server allows exits
|
2006-02-13 11:43:29 +01:00
|
|
|
to all the pieces of the Internet that it claims to. If necessary,
|
|
|
|
perhaps this can be solved by a distributed reporting mechanism,
|
|
|
|
where clients that can reach a site from one exit but not another
|
|
|
|
anonymously submit that site to the measurers, who verify.
|
|
|
|
|
|
|
|
A last problem is that since directory servers will be doing their
|
|
|
|
tests directly (easy to detect) or indirectly (through other Tor
|
|
|
|
servers), then we know that we can get away with poor performance for
|
|
|
|
people that aren't listed in the directory. Maybe we can turn this
|
|
|
|
around and call it a feature though -- another reason to get listed
|
|
|
|
in the directory.
|
2006-02-01 11:50:23 +01:00
|
|
|
|
2006-02-09 04:44:13 +01:00
|
|
|
5. Recommendations and next steps.
|
|
|
|
|
2006-01-31 10:10:13 +01:00
|
|
|
|
|
|
|
|