convert draft pluggable transport proposal to plaintext

Nick Mathewson 2010-12-10 14:34:26 -05:00
parent 4e9f9a4ee8
commit 1fb3a60f54


@ -0,0 +1,430 @@
Filename: xxx-pluggable-transport.txt
Title: Pluggable transports for circumvention
Author: Jacob Appelbaum, Nick Mathewson
Created: 15-Oct-2010
Status: Draft
Overview
This is a document about transport plugins; it does not cover
discovery, or bridgedb improvements. Each transport plugin
specification should make clear any external requirements but those
are generally out of scope if they fall into discovery or
infrastructure components.
We should include a description of how to write a good set of plugins,
how to evaluate one, and how to classify one. For example, if a plugin
is hard to detect on the wire even for someone who knows what it is and
how it works, its specification should say so. If it is easy to detect,
it may still work on a given network, but perhaps it is not well hidden
or it will get filtered automatically. Detection and blocking are not
always the same thing. In both cases, a plugin should be quite clear
about its security claims.
Target use-cases[a][b]
Here's some stuff we want to be able to support. We're listing these
in the draft to try to define the problem space. We won't put this
section in the final version.
1. The 'obfuscated SSH' superencipherment:
http://github.com/brl/obfuscated-openssh/blob/master/README.obfuscation
2. Big P2P-network style transports where instead of connecting to a
bridge at a known IP, you connect to a bridge by a username, a public
key, or whatever.
1. We need the ability to have two kinds of proxies - one for
incoming connections and one for outgoing connections. [Sure, but
that's about how we implement stuff arg arg dumb touchpad -NM]
1. Probably we want to have the ability to get connections
any way we'll take them
2. So, bridges use the incoming kind, and clients use the outgoing
kind? Sounds right.-N
1. Probably also we're a multi-plexed incoming kind of Tor
relay - so we should take connections from say localhost's
little helper and also, we should take connections from
external ips. This would be useful to identify though. I think
this is how we would already work as of today.
1. You mean, regular non-bridge relays should support this
too? I hadn't considered that. it has seemed pointless
because of IP blocking, but if we have a p2p transport, it
would be useful for regular relays to allow it. Yes -io
1. Also it would be nice for stats purposes to ensure that
we know what kinds of connections we're handling, even if
we basically treat them exactly the same. Perhaps Karsten
wants to weigh in on how we should have Tor handle these
things? I guess we'll really fuck up his stats collection
if all of a sudden he's getting lots of connections from
127.0.0.1...
1. Various protocol-impersonation tools
1. NSTX, iodine, OzymanDNS or such, for the lulz.
1. DNS tunneling of many types - eg: TXT records or the NULL
protocol trick
1. HTTP -- many kinds are possible, some may even be right
1. HTTP POST requests are implemented in Firepass
1. FTP
1. Perhaps some kind of anonymous ftp login with sending and
receiving of data would be useful?
1. Lots to think about before designing off the cuff crappy
protocol covert channels
1. NTP
1. Hardly anyone knows about NTP these days - it's almost always
outbound allowed and it's usually not well inspected
1. That makes it good for short-term circumvention, but bad
for long-term hiding.
1. Triangle-boy
2. IPSec look-alike
3. UDP
4. IPv6
1. A forged-RST-ignoring tool
1. A forged-RST-ignoring tool that pretends that it is getting all
of its connections closed and retrying all the time, when really
it is just carrying on with business as usual. Hooray for
crypto.
1. Perhaps it's a good idea to mention CCTT?
1. What else goes here?
1. We should ask Nextgens about protocol filters from Freenet
2. http://gray-world.net/papers.shtml
3. http://gray-world.net/pr_cook_cc.shtml
4. http://gray-world.net/pr_firepass.shtml
5. We should ensure we cover the topics and lessons learned from
"FIREWALL RESISTANCE TO METAFEROGRAPHY IN NETWORK COMMUNICATIONS"
- see
https://ritdml.rit.edu/bitstream/handle/1850/12272/RSavacoolThesis5-21-2010.pdf
Here's some stuff that seems out-of-scope:
1. A generic firewall-breaker that works with all Tor nodes and
bridges. Like, if you're using a VPN to get through your firewall,
and it lets you connect to any Tor node, you can just use it without
any special plug-in support. I think this spec is just for stuff
that requires buy-in from the server side of the connection. Agreed?
1. Yeah - I think we should simply codify the proxy stuff to ensure
that we plan to remain pluggable for incoming and outgoing connections
in some formal way.
I'm uncertain if we want to support stuff like:
1. An ssh tunnel that uses openssh to tunnel raw tor packets, with no
actual TLS going on underneath. Promising, but risky. -NM
1. I think there isn't much to gain by doing this but perhaps so - we
are too dependent on TLS and our certs are trivial to fingerprint -io
1. Also, Tor-over-TLS-tunneled-over-SSH looks even weirder than
Tor-over-SSH. -N
2. It might be nice to allow certs [cn] fields to be configurable by
bridge nodes? -io
1. If we allowed "raw traffic" transports, a transport could get this
trivially by implementing TLS with the right certs. -NM
1. perhaps we just want a "raw traffic port" where we connect to pass
around cells? thoughts?
1. A bridge-discovery-and-round-robin p2p tool that connects you to a
randomly chosen one of an unknown number of bridges.
1. Stackable plugins
1. Tor over DNS over HTTP Post over Obfuscated Tor to reach the Tor
network to read a copy of uncensored Google News.
1. Christ, what the fuck world are we building? Or even more,
what kind of world are we resisting?
1. More like RST-drop plus sshobfs over HTTP over VPN.
Goals & Motivation
Frequently, people want to try a novel circumvention method to help
users connect to Tor bridges. Some of these methods are already
pretty easy to deploy: if the user knows an unblocked VPN or open
SOCKS proxy, they can just use that with the Tor client today.
Less easy to deploy are methods that require participation by both the
client and the bridge. In order of increasing sophistication, we
might want to support:
1. A protocol obfuscation tool that transforms the output of a TLS
connection into something that looks like HTTP as it leaves the client,
and back to TLS as it arrives at the bridge.
2. An additional authentication step that a client would need to
perform for a given bridge before being allowed to connect.
3. An information passing system that uses a side-channel in some
existing protocol to convey traffic between a client and a bridge
without the two of them ever communicating directly.
4. A set of clients to tunnel client->bridge traffic over an existing
large p2p network, such that the bridge is known by an identifier
in that network rather than by an IP address.
We could in theory support these almost fine with Tor as it stands
today: every Tor client can take a SOCKS proxy to use for its outgoing
traffic, so a suitable client proxy could handle the client's traffic
and connections on its behalf, while a corresponding program on the
bridge side could handle the bridge's side of the protocol
transformation. Nevertheless, there are some reasons to add support
for transport plugins to Tor itself:
1. It would be good for bridges to have a standard way to advertise
which transports they support, so that clients can have multiple
local transport proxies, and automatically use the right one for
the right bridge.
2. There are some changes to our architecture that we'll need for a
system like this to work. For testing purposes, if a bridge blocks
off its regular ORPort and instead has an obfuscated ORPort, the
bridge authority has no way to test it. Also, unless the bridge
has some way to tell that the bridge-side proxy at 127.0.0.1 is not
the origin of all the connections it is relaying, it might decide
that there are too many connections from 127.0.0.1, and start
paring them down to avoid a DoS.
3.
4. (what else?)
Non-Goals
We're not going to talk about automatic verification of plugin
correctness and safety via sandboxing, proof-carrying code, or
whatever.
We need to do more with discovery and distribution, but that's not
what this proposal is about. We're pretty convinced that the problems
are sufficiently orthogonal that we should be fine so long as we don't
preclude a single program from implementing both transport and
discovery extensions.
This proposal is not about what transport plugins are the best ones
for people to write.
We've considered issues involved with completely replacing Tor's TLS
with another encryption layer, rather than layering it inside the
obfuscation layer. We describe how to do this in an appendix to the
current proposal, though we are not currently sure whether it's a good
idea to implement.
Design overview
Clients run one or more "Transport client" programs that act like
SOCKS proxies. They accept connections on localhost on different
ports. Each one implements one or more transport methods. Parameters
are passed from Tor inside the regular username/password parts of the
SOCKS protocol.
Bridges (and maybe relays) run one or more programs that act like
stunnel-server (or whatever the option is): they get connections from
the network (typically by listening for connections on the network)
and relay them to the Bridge's real ORPort.
1. The bridge needs to know which methods these servers support
1. The bridge needs to advertise this fact some way that the clients
will find out about it--probably by sticking it in its bridge
descriptor so that the bridgedb can find out and see that the clients
get informed.
2. Somebody needs to launch these programs
3. The bridge may want to just not have a public ORPort at all.
4. The bridge may not want to advertise a real IP at all
5. The bridge will want to find out from the program any client
identification information it can get (IP, etc) to implement rules
about max clients at once
Any methods that are wildly successful, we can bake into Tor.
Proposed terminology:
Transport protocol:
Transport proxy:
Specifications: Client behavior
Bridge lines can now follow the extended format "bridge method
address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect
to such a bridge, a client must open a local connection to the SOCKS
proxy for "method", and ask it to connect to address:port. If
[id-fingerprint] is provided, it should expect the public identity key
on the TLS connection to match the digest provided in
[id-fingerprint]. If any [k=v] items are provided, they are
configuration parameters for the proxy: Tor should separate them with
NUL bytes and put them in the user and password fields of the request,
splitting them across the fields as necessary. The "id-fingerprint"
field is always provided in a field named "keyid", if it was given.
Example: if the bridge line is "bridge trebuchet www.example.com:3333
rocks=20 height=5.6m" and the Tor client knows that the 'trebuchet'
method is provided by a SOCKS5 proxy on 127.0.0.1:19999, it should
connect to that proxy, ask it to connect to www.example.com:3333, and
provide the string "rocks=20\0height=5.6m" as the username, as the
password, or split across the username and password.
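A minimal Python sketch of the client-side handling described above,
to make the encoding concrete. The function names are hypothetical.
The 255-byte limit per field comes from the SOCKS5 username/password
sub-negotiation; filling the username first and spilling the rest into
the password is only one possible reading of "splitting them across
the fields as necessary", and treating a bare third word as the
fingerprint is likewise an assumption.

```python
from typing import Dict, Tuple

# SOCKS5 username and password fields each hold at most 255 bytes.
MAX_FIELD = 255


def parse_bridge_line(line: str) -> Tuple[str, str, int, Dict[str, str]]:
    """Parse 'bridge method address:port [[keyid=]fingerprint] [k=v]...'."""
    words = line.split()
    if len(words) < 3 or words[0].lower() != "bridge":
        raise ValueError("not an extended bridge line")
    method, addrport = words[1], words[2]
    host, _, port = addrport.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError("expected address:port")
    params: Dict[str, str] = {}
    for word in words[3:]:
        if "=" in word:
            key, _, value = word.partition("=")
            params[key] = value
        else:
            # Assumption: a bare word is the identity fingerprint,
            # which is always passed on under the name "keyid".
            params["keyid"] = word
    return method, host, int(port), params


def socks_credentials(params: Dict[str, str]) -> Tuple[bytes, bytes]:
    """Join k=v items with NUL bytes and split them over user/password."""
    blob = b"\0".join(f"{k}={v}".encode("ascii") for k, v in params.items())
    if len(blob) > 2 * MAX_FIELD:
        raise ValueError("parameters do not fit in the SOCKS5 auth fields")
    # Assumption: fill the username first, then overflow into the password.
    return blob[:MAX_FIELD], blob[MAX_FIELD:]
```

For the trebuchet line above, this yields the username
"rocks=20\0height=5.6m" and an empty password. What to send when a
field ends up empty is not settled by this draft.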
There are two ways to tell Tor clients about protocol proxies:
external proxies and managed proxies. An external proxy is configured
with "Transport trebuchet socks5 127.0.0.1:9999". This tells Tor that
another program is already running to handle 'trebuchet' connections,
and Tor doesn't need to worry about it. A managed proxy is configured
with "Transport trebuchet /usr/libexec/tor-proxies/trebuchet
[options]", and tells Tor to launch an external program on-demand to
provide a SOCKS proxy for 'trebuchet' connections. The Tor client only
launches one instance of each external program, even if the same
executable is listed for more than one method.
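As a rough illustration of the two configuration forms, here is a
Python sketch that tells them apart and groups managed methods by
executable so that each program is launched once. The names
TransportConfig, parse_transport_line and programs_to_launch are
hypothetical, accepting "socks4" alongside "socks5" is an assumption,
and what should happen when the same executable is listed with
different options is not specified by this draft.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class TransportConfig:
    method: str
    managed: bool
    socks_version: Optional[str] = None  # external proxies only
    address: Optional[str] = None        # external proxies only, host:port
    argv: List[str] = field(default_factory=list)  # managed proxies only


def parse_transport_line(line: str) -> TransportConfig:
    """Classify a 'Transport ...' line as external or managed."""
    words = shlex.split(line)
    if len(words) < 3 or words[0] != "Transport":
        raise ValueError("not a Transport line")
    method, rest = words[1], words[2:]
    # Assumption: a SOCKS version keyword marks an external proxy;
    # anything else is the path of a program for Tor to launch.
    if rest[0].lower() in ("socks4", "socks5"):
        if len(rest) != 2:
            raise ValueError("external proxy takes exactly one address")
        return TransportConfig(method, False, rest[0].lower(), rest[1])
    return TransportConfig(method, True, argv=rest)


def programs_to_launch(configs: Iterable[TransportConfig]) -> Dict[str, List[str]]:
    """Map each managed executable to the methods it should provide."""
    by_program: Dict[str, List[str]] = {}
    for cfg in configs:
        if cfg.managed:
            by_program.setdefault(cfg.argv[0], []).append(cfg.method)
    return by_program
```

Grouping by executable path mirrors the rule that only one instance of
each external program is launched, even when it serves several methods.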
The same program can implement a managed or an external proxy: it just
needs to take an argument saying which one to be.
[I don't like the terminology here. We should pick better words before
this "external/managed" stuff catches on. Also, to most users a
"proxy" is a computer that relays stuff for them, not a local program
on their computer. -NM I think we should go with Helper of some kind
as it's less technically overloaded and more friendly feeling - io
"Helper" is too overloaded already. -NM]
Client proxy behavior
When launched from the command-line by a Tor client, a transport
proxy needs to tell Tor which methods and ports it supports. It does
this by printing one or more CMETHOD: lines to its stdout. These look
like CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS:rocks,height
OPT-ARGS:tensile-strength
The ARGS field lists mandatory parameters that must appear in every
bridge line for this method. The OPT-ARGS field lists optional
parameters. If no ARGS or OPT-ARGS field is provided, Tor should not
check the parameters in bridge lines for this method.
The proxy should print a single "METHODS:DONE" line after it is
finished telling Tor about the methods it provides.
[Should methods be versionable? Can they be? -nm I think probably?
-io Then how? -nm]
The transport proxy MUST exit cleanly when it receives a SIGTERM from
Tor.
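The proxy side of this handshake might look roughly like the Python
sketch below: print the method lines, print the terminating line, then
run until SIGTERM arrives and exit with status 0. The helper names are
hypothetical, the sketch does not actually open a SOCKS listener, and
signal delivery differs on Windows, so treat it as an outline rather
than a reference implementation.

```python
import signal
import sys
import threading
from typing import Iterable, Sequence, TextIO

# Set by the SIGTERM handler; the main loop polls it.
stop_requested = threading.Event()


def format_cmethod(name: str, socks: str, addr: str,
                   args: Sequence[str] = (),
                   opt_args: Sequence[str] = ()) -> str:
    """Build one method line in the format shown in this section."""
    parts = ["CMETHOD:", name, socks, addr]
    if args:
        parts.append("ARGS:" + ",".join(args))
    if opt_args:
        parts.append("OPT-ARGS:" + ",".join(opt_args))
    return " ".join(parts)


def announce(lines: Iterable[str], out: TextIO = sys.stdout) -> None:
    """Print every method line, then the terminating METHODS:DONE line."""
    for line in lines:
        out.write(line + "\n")
    out.write("METHODS:DONE\n")
    out.flush()  # Tor reads from a pipe, so do not leave this buffered


def install_sigterm_handler() -> None:
    """Ask the main loop to stop when Tor sends SIGTERM."""
    signal.signal(signal.SIGTERM, lambda signum, frame: stop_requested.set())


def main() -> None:
    install_sigterm_handler()
    # A real proxy would start its SOCKS listener before announcing it.
    announce([format_cmethod("trebuchet", "SOCKS5", "127.0.0.1:19999",
                             ["rocks", "height"], ["tensile-strength"])])
    while not stop_requested.wait(0.5):
        pass
    sys.exit(0)  # clean exit after SIGTERM
```

A managed proxy would call main() when launched by Tor. Polling with a
short timeout keeps the wait responsive to signals on every platform.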
The Tor client MUST ignore lines beginning with a keyword and a colon
if it does not recognize the keyword.
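On the Tor side, reading the proxy's output could be sketched in
Python as follows. ClientMethod, parse_proxy_output and
check_bridge_params are hypothetical names. Two readings of the text
are assumed here: parameter checking is skipped only when neither
ARGS nor OPT-ARGS is present, and a bridge line carrying a parameter
listed in neither field is rejected. The "keyid" parameter is left out
of the check because Tor adds it itself.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class ClientMethod:
    name: str
    socks: str
    addr: str
    args: Optional[List[str]] = None      # None means the field was absent
    opt_args: Optional[List[str]] = None


def parse_proxy_output(lines: Iterable[str]) -> Dict[str, ClientMethod]:
    """Collect CMETHOD lines until METHODS:DONE, skipping unknown keywords."""
    methods: Dict[str, ClientMethod] = {}
    for raw in lines:
        keyword, sep, rest = raw.strip().partition(":")
        if not sep:
            continue  # not a "keyword: ..." line at all
        if keyword == "METHODS" and rest.strip() == "DONE":
            break
        if keyword != "CMETHOD":
            continue  # unrecognized keyword: ignore it
        fields = rest.split()
        if len(fields) < 3:
            continue  # malformed method line
        method = ClientMethod(*fields[:3])
        for extra in fields[3:]:
            if extra.startswith("ARGS:"):
                method.args = [a for a in extra[5:].split(",") if a]
            elif extra.startswith("OPT-ARGS:"):
                method.opt_args = [a for a in extra[9:].split(",") if a]
        methods[method.name] = method
    return methods


def check_bridge_params(method: ClientMethod, params: Dict[str, str]) -> bool:
    """Check a bridge line's k=v items against ARGS and OPT-ARGS."""
    if method.args is None and method.opt_args is None:
        return True  # nothing advertised, so nothing to check
    given = set(params) - {"keyid"}
    required = set(method.args or [])
    allowed = required | set(method.opt_args or [])
    return required <= given <= allowed
```

Skipping unknown keywords instead of failing lets later versions of
the protocol add new line types without breaking older clients.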
In the future, if we need a control mechanism, we can use the
stdin/stdout from Tor to the transport proxy.
Transport proxy requirements
A transport proxy MUST handle SOCKS connect requests using the SOCKS
version it advertises.
Server proxy behavior
[So, we can have this work like client proxies, where the bridge
launches some programs, and they tell the bridge, "I am giving you
method X with parameters Y"? Do you have to take all the methods? If
not, which do you specify?]
[Do we allow programs that get started independently?]
[We'll need to figure out how this works with port forwarding. Is
port forwarding the bridge's problem, the proxy's problem, or some
combination of the two?]
[If we're using the bridge authority/bridgedb system for distributing
bridge info, the right place to advertise bridge lines is probably
the extrainfo document. We also need a way to tell the bridge
authority "don't give out a default bridge line for me"]
Server behavior
Bridge authority behavior
Implementation plan
Finish the design work here.
Clean up all the inline conversations to just get summarized by the
conclusions they arrived at.
Turn this into a draft proposal
Circulate and discuss on or-dev
(Use Cinderblock Of Loving Correction to reeducate anybody who tries
to divert discussion of how pluggable transports should work into
discussion of what is the best possible transport, or whatever.)
We should ship a couple of null plugin implementations in one or two
popular, portable languages so that people get an idea of how to
write the stuff.
1. We should have one that's just a proof of concept that does
nothing but transfer bytes back and forth.
1. We should not do a rot13 one.
2. We should implement a basic proxy that does not transform the bytes at all
1. We should implement DNS or HTTP using other software (as goodell
did years ago with DNS) as an example of wrapping existing code into
our plugin model.
2. The obfuscated-ssh superencipherment is pretty trivial and pretty
useful. It makes the protocol stringwise unfingerprintable.
1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
superencipherment too badly
1. Go ahead, bikeshed my day
1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
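To give a feel for how small the proof-of-concept plugin could be,
here is a Python asyncio sketch of a relay that does nothing but
transfer bytes back and forth, as a bridge-side proxy forwarding to an
ORPort might. The function names and the fixed target are assumptions
for illustration. It does not speak SOCKS and does not print the
method lines a real transport proxy needs, and it closes both
directions as soon as either side reaches end-of-file.

```python
import asyncio


async def pump(reader: asyncio.StreamReader,
               writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)  # bytes are passed through untouched
            await writer.drain()
    except (ConnectionError, OSError):
        pass  # the peer went away; just stop copying
    finally:
        writer.close()


async def handle(client_reader: asyncio.StreamReader,
                 client_writer: asyncio.StreamWriter,
                 target_host: str, target_port: int) -> None:
    """Connect to the target and relay in both directions."""
    try:
        up_reader, up_writer = await asyncio.open_connection(
            target_host, target_port)
    except OSError:
        client_writer.close()
        return
    await asyncio.gather(
        pump(client_reader, up_writer),
        pump(up_reader, client_writer),
    )


async def start_null_proxy(listen_host: str, listen_port: int,
                           target_host: str, target_port: int):
    """Start listening; every accepted connection is relayed to the target."""
    async def on_connect(reader, writer):
        await handle(reader, writer, target_host, target_port)
    return await asyncio.start_server(on_connect, listen_host, listen_port)
```

A caller would await start_null_proxy(...) and then serve_forever() on
the returned server. A transforming plugin would change only the body
of pump().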
Appendix: recommendations for transports
Be free/open-source software. Also, if you think your code might
someday do so well at circumvention that it should be implemented
inside Tor, it should use the same license as Tor.
Use libraries that Tor already requires. (You can rely on openssl and
libevent being present if current Tor is present.)
Be portable: most Tor users are on Windows, and most Tor developers
are not, so designing your code for just one of these platforms will
leave it with either a small userbase or poor auditing.
Think secure: if your code is in a C-like language, and it's hard to
read it and become convinced it's safe, then it's probably not safe.
Think small: we want to minimize the bytes that a Windows user needs
to download for a transport client.
Specify: if you can't come up with a good explanation
Avoid security-through-obscurity if possible. Specify.
Resist trivial fingerprinting: There should be no good string or regex
to search for to distinguish your protocol from protocols permitted by
censors.
Imitate a real profile: There are many ways to implement most
protocols -- and in many cases, most possible variants of a given
protocol won't actually exist in the wild.
Appendix: Raw-traffic transports
This section describes an optional extension to the proposal above.
[a]I agree that we should remove this section - perhaps we should also save the links and move them to the possible plugin examples? - ioerror
[b]This whole section should get removed from the final thing. I tried to summarize broad themes in the Motivations section below. - NM
[c]That doesn't really help - does it? Or do you mean that the Tor should set the CN to be say, the IP or hostname of the relay? - ioerror
The "Address" field when we have it. After that, the hostname if we know it. After that, do a PTR lookup on our IP. After that, use our IP. -NM