2007-12-08 01:53:43 +01:00
|
|
|
|
|
|
|
How to hand out bridges.
|
|
|
|
|
|
|
|
Divide bridges into 'strategies' as they come in. Do this uniformly
|
|
|
|
at random for now.
|
|
|
|
|
|
|
|
For each strategy, we'll hand out bridges in a different way to
|
|
|
|
clients. This document describes two strategies: email-based and
|
|
|
|
IP-based.
|
|
|
|
|
|
|
|
0. Notation:
|
|
|
|
|
|
|
|
HMAC(k,v) : an HMAC of v using the key k.
|
|
|
|
|
|
|
|
A|B: The string A concatenated with the string B.
|
|
|
|
|
|
|
|
|
|
|
|
1. Email-based.
|
|
|
|
|
|
|
|
Goal: bootstrap based on one or more popular email service's sybil
|
|
|
|
prevention algorithms.
|
|
|
|
|
|
|
|
|
|
|
|
Parameters:
|
|
|
|
HMAC -- an HMAC function
|
|
|
|
P -- a time period
|
|
|
|
K -- the number of bridges to send in a period.
|
|
|
|
|
|
|
|
Setup: Generate two nonces, N and M.
|
|
|
|
|
|
|
|
As bridges arrive, put them into a ring according to HMAC(N,ID)
|
|
|
|
where ID is the bridges's identity digest.
|
|
|
|
|
|
|
|
Divide time into divisions of length P.
|
|
|
|
|
|
|
|
When we get an email:
|
|
|
|
|
|
|
|
If it's not from a supported email service, reject it.
|
|
|
|
|
|
|
|
If we already sent a response to that email address (normalized)
|
|
|
|
in this period, send _exactly_ the same response.
|
|
|
|
|
|
|
|
If it is from a supported service, generate X = HMAC(M,PS|E) where E
|
|
|
|
is the lowercased normalized email address for the user, and
|
|
|
|
where PS is the start of the currrent period. Send
|
|
|
|
the first K bridges in the ring after point X.
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
[If we want to make sure that repeat queries are given exactly the
|
|
|
|
same results, then we can't let the ring change during the
|
|
|
|
time period. For a long time period like a month, that's quite a
|
|
|
|
hassle. How about instead just keeping a replay cache of addresses
|
|
|
|
that have been answered, and sending them a "sorry, you already got
|
|
|
|
your addresses for the time period; perhaps you should try these
|
|
|
|
other fine distribution strategies while you wait?" response? This
|
|
|
|
approach would also resolve the "Make sure you can't construct a
|
|
|
|
distinct address to match an existing one" note below. -RD]
|
|
|
|
|
2008-08-03 17:34:28 +02:00
|
|
|
[I think, if we get a replay, we need to send back the same
|
2007-12-12 00:23:25 +01:00
|
|
|
answer as we did the first time, not say "try again."
|
|
|
|
Otherwise we need to worry that an attacker can keep people
|
|
|
|
from getting bridges by preemtively asking for them,
|
|
|
|
or that an attacker may force them to prove they haven't
|
|
|
|
gotten any bridges by asking. -NM]
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
[While we're at it, if we do the replay cache thing and don't need
|
|
|
|
repeatable answers, we could just pick K random answers from the
|
|
|
|
pool. Is it beneficial that a bridge user who knows about a clump of
|
|
|
|
nodes will be sharing them with other users who know about a similar
|
|
|
|
(overlapping) clump? One good aspect is against an adversary who
|
|
|
|
learns about a clump this way and watches those bridges to learn
|
|
|
|
other users and discover *their* bridges: he doesn't learn about
|
|
|
|
as many new bridges as he might if they were randomly distributed.
|
|
|
|
A drawback is against an adversary who happens to pick two email
|
|
|
|
addresses in P that include overlapping answers: he can measure
|
|
|
|
the difference in clumps and estimate how quickly the bridge pool
|
|
|
|
is growing. -RD]
|
|
|
|
|
2007-12-12 00:23:25 +01:00
|
|
|
[Random is one more darn thing to implement; rings are already
|
|
|
|
there. -NM]
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
[If we make the period P be mailbox-specific, and make it a random
|
|
|
|
value around some mean, then we make it harder for an attacker to
|
|
|
|
know when to try using his small army of gmail addresses to gather
|
|
|
|
another harvest. But we also make it harder for users to know when
|
|
|
|
they can try again. -RD]
|
|
|
|
|
2007-12-12 00:23:25 +01:00
|
|
|
[Letting the users know about when they can try again seems
|
|
|
|
worthwhile. Otherwise users and attackers will all probe and
|
|
|
|
probe and probe until they get an answer. No additional
|
|
|
|
security will be achieved, but bandwidth will be lost. -NM]
|
|
|
|
|
2007-12-08 01:53:43 +01:00
|
|
|
To normalize an email address:
|
|
|
|
Start with the RFC822 address. Consider only the mailbox {???}
|
2007-12-08 05:13:07 +01:00
|
|
|
portion of the address (username@domain). Put this into lowercase
|
2007-12-08 01:53:43 +01:00
|
|
|
ascii.
|
|
|
|
|
|
|
|
Questions:
|
|
|
|
What to do with weird character encodings? Look up the RFC.
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
Make sure that you can't force a single email address to appear
|
|
|
|
in lots of different ways. IOW, if nickm@freehaven.net and
|
|
|
|
NICKM@freehaven.net aren't treated the same, then I can get lots
|
|
|
|
more bridges than I should.
|
|
|
|
|
|
|
|
Make sure you can't construct a distinct address to match an
|
|
|
|
existing one. IOW, if we treat nickm@X and nickm@Y as the same
|
|
|
|
user, then anybody can register nickm@Z and use it to tell which
|
|
|
|
bridges nickm@X got (or would get).
|
|
|
|
|
|
|
|
Make sure that we actually check headers so we can't be trivially
|
2007-12-08 05:13:07 +01:00
|
|
|
used to spam people.
|
2007-12-08 01:53:43 +01:00
|
|
|
|
|
|
|
|
|
|
|
2. IP-based.
|
|
|
|
|
|
|
|
Goal: avoid handing out all the bridges to users in a similar IP
|
|
|
|
space and time.
|
|
|
|
|
|
|
|
Parameters:
|
|
|
|
|
|
|
|
T_Flush -- how long it should take a user on a single network to
|
|
|
|
see a whole cluster of bridges.
|
|
|
|
|
|
|
|
N_C
|
|
|
|
|
|
|
|
K -- the number of bridges we hand out in response to a single
|
|
|
|
request.
|
|
|
|
|
|
|
|
Setup: using an AS map or a geoip map or some other flawed input
|
|
|
|
source, divide IP space into "areas" such that surveying a large
|
2007-12-08 05:13:07 +01:00
|
|
|
collection of "areas" is hard. For v0, use /24 address blocks.
|
2007-12-08 01:53:43 +01:00
|
|
|
|
|
|
|
Group areas into N_C clusters.
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
Generate secrets L, M, N.
|
2007-12-08 01:53:43 +01:00
|
|
|
|
|
|
|
Set the period P such that P*(bridges-per-cluster/K) = T_flush.
|
|
|
|
Don't set P to greater than a week, or less than three hours.
|
|
|
|
|
|
|
|
When we get a bridge:
|
|
|
|
|
|
|
|
Based on HMAC(L,ID), assign the bridge to a cluster. Within each
|
|
|
|
cluster, keep the bridges in a ring based on HMAC(M,ID).
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
[Should we re-sort the rings for each new time period, so the ring
|
|
|
|
for a given cluster is based on HMAC(M,PS|ID)? -RD]
|
|
|
|
|
2007-12-08 01:53:43 +01:00
|
|
|
When we get a connection:
|
|
|
|
|
|
|
|
If it's http, redirect it to https.
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
Let area be the incoming IP network. Let PS be the current
|
|
|
|
period. Compute X = HMAC(N, PS|area). Return the next K bridges
|
2007-12-08 01:53:43 +01:00
|
|
|
in the ring after X.
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
[Don't we want to compute C = HMAC(key, area) to learn what cluster
|
2007-12-12 00:23:25 +01:00
|
|
|
to answer from, and then X = HMAC(key, PS|area) to pick a point in
|
|
|
|
that ring? -RD]
|
|
|
|
|
2007-12-08 05:13:07 +01:00
|
|
|
|
|
|
|
Need to clarify that some HMACs are for rings, and some are for
|
|
|
|
partitions. How rings scale is clear. How do we grow the number of
|
|
|
|
partitions? Looking at successive bits from the HMAC output is one way.
|
|
|
|
|
2007-12-08 01:53:43 +01:00
|
|
|
3. Open issues
|
|
|
|
|
|
|
|
Denial of service attacks
|
|
|
|
A good view of network topology
|
|
|
|
|
2007-12-12 18:02:47 +01:00
|
|
|
at some point we should learn some reliability stats on our bridges. when
|
|
|
|
we say above 'give out k bridges', we might give out 2 reliable ones and
|
|
|
|
k-2 others. we count around the ring the same way we do now, to find them.
|
|
|
|
|