diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt
new file mode 100644
index 0000000000..e4fddce3c0
--- /dev/null
+++ b/doc/spec/proposals/159-exit-scanning.txt
@@ -0,0 +1,144 @@
+Filename: 159-exit-scanning.txt
+Title: Exit Scanning
+Version: $Revision$
+Last-Modified: $Date$
+Author: Mike Perry
+Created: 13-Feb-2009
+Status: Open
+
+Overview:
+
+This proposal describes the implementation and integration of an
+automated exit node scanner for scanning the Tor network for malicious,
+misconfigured, firewalled or filtered nodes.
+
+Motivation:
+
+Tor exit nodes can be run by anyone with an Internet connection. Often,
+these users aren't fully aware of limitations of their networking
+setup. Content filters, antivirus software, advertisements injected by
+their service providers, malicious upstream providers, and the resource
+limitations of their computer or networking equipment have all been
+observed on the current Tor network.
+
+It is also possible that some nodes exist purely for malicious
+purposes. In the past, there have been intermittent instances of
+nodes spoofing SSH keys, as well as nodes being used for purposes of
+plaintext surveillance.
+
+While it is not realistic to expect to catch extremely targeted or
+completely passive malicious adversaries, the goal is to prevent
+malicious adversaries from deploying dragnet attacks against large
+segments of the Tor userbase.
+
+
+Scanning methodology:
+
+The first scans to be implemented are HTTP, HTML, Javascript, and
+SSL scans.
+
+The HTTP scan scrapes Google for common filetype urls such as exe, msi,
+doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and
+compares the SHA1 hashes of the resulting content.
+
+The SSL scan downloads certificates for all IPs a domain will locally
+resolve to and compares these certificates to those seen over Tor. The
+scanner notes if a domain had rotated certificates locally in the
+results for each scan.
+
+The HTML scan checks HTML, Javascript, and plugin content for
+modifications. Because of the dynamic nature of most of the web, the
+scanner has a number of mechanisms built in to filter out false
+positives that are used when a change is noticed between Tor and
+Non-Tor.
+
+All tests also share a URL-based false positive filter that
+automatically removes results retroactively if the number of failures
+exceeds a certain percentage of nodes tested with the URL.
+
+
+Deployment Stages:
+
+To avoid instances where bugs cause us to mark exit nodes as BadExit
+improperly, it is proposed that we begin use of the scanner in stages.
+
+1. Manual Review:
+
+   In the first stage, basic scans will be run by a small number of
+   people while we stabilize the scanner. The scanner has the ability
+   to resume crashed scans, and to rescan nodes that fail various
+   tests.
+
+2. Human Review:
+
+   In the second stage, results will be automatically mailed to
+   an email list of interested parties for review. We will also begin
+   classifying failure types into three to four different severity
+   levels, based on both the reliability of the test and the nature of
+   the failure.
+
+3. Automatic BadExit Marking:
+
+   In the final stage, the scanner will begin marking exits depending
+   on the failure severity level in one of three different ways: by
+   node idhex, by node IP, or by node IP mask. A potential fourth, less
+   severe category of results may still be delivered via email only for
+   review.
+
+   BadExit markings will be delivered in batches upon completion
+   of whole-network scans, so that the final false positive
+   filter has an opportunity to filter out URLs that exhibit
+   dynamic content beyond what we can filter.
+
+
+Specification of Exit Marking:
+
+Technically, BadExit could be marked via SETCONF AuthDirBadExit over
+the control port, but this would allow full access to the directory
+authority configuration and operation.
+
+The approved-routers file could also be used, but currently it only
+supports fingerprints, and it also contains other data unrelated to
+exit scanning that would be difficult to coordinate.
+
+Instead, we propose a new badexit-routers file with the following
+keywords:
+
+  BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
+  BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
+
+BadExitNet lines would follow the codepaths used by AuthDirBadExit to
+set authdir_badexit_policy, and BadExitFP would follow the codepaths
+from the approved-routers file's !badexit lines.
+
+The scanner would have exclusive ability to write, append, rewrite,
+and modify this file. Prior to building a new consensus vote, a
+participating Tor authority would read in a fresh copy.
+
+
+Security Implications:
+
+Aside from evading the scanner's detection, there are two additional
+high-level security considerations:
+
+1. Ensure nodes cannot be marked BadExit by an adversary at will
+
+It is possible individual website owners will be able to target certain
+Tor nodes, but once they begin to attempt to fail more than the URL
+filter percentage of the exits, their sites will be automatically
+discarded.
+
+Failing specific nodes is possible, but scanned results are fully
+reproducible, and BadExits should be rare enough that humans are never
+fully removed from the loop.
+
+State (cookies, cache, etc.) does not otherwise persist in the scanner
+between exit nodes, so one exit node cannot bias the results of a
+later one.
+
+2. Ensure that scanner compromise does not yield authority compromise
+
+Having a separate file that is under the exclusive control of the
+scanner allows us to heavily isolate the scanner from the Tor
+authority, potentially even running them on separate machines.
+
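
Review note: the batch step described under "Automatic BadExit Marking" and the badexit-routers format from "Specification of Exit Marking" might fit together roughly as in the sketch below. This is not the scanner's actual code. The function names, the result tuple layout, and the threshold value are hypothetical; the proposal only says "a certain percentage". Emitting one item per line and checking that a hexdigest is 40 hex characters are also assumptions, not requirements stated in the proposal.

```python
import re
from collections import defaultdict

# Hypothetical value: the proposal does not name a concrete percentage.
URL_FAILURE_THRESHOLD = 0.05

# Assumption: a hexdigest is a 40-character hex-encoded identity fingerprint.
_HEXDIGEST = re.compile(r"^[0-9A-Fa-f]{40}$")


def discard_noisy_urls(results, threshold=URL_FAILURE_THRESHOLD):
    """Apply the URL-based false positive filter to a finished scan.

    results is an iterable of (url, node_fingerprint, failed) tuples.
    A URL is discarded when the fraction of tested nodes that failed it
    exceeds threshold. Returns the set of fingerprints that still have
    at least one failure on a surviving URL.
    """
    tested = defaultdict(set)
    failed = defaultdict(set)
    for url, fingerprint, did_fail in results:
        tested[url].add(fingerprint)
        if did_fail:
            failed[url].add(fingerprint)

    flagged = set()
    for url, nodes in tested.items():
        failure_rate = len(failed[url]) / len(nodes)
        if failure_rate > threshold:
            # Too many exits fail this URL: treat it as dynamic content.
            continue
        flagged |= failed[url]
    return flagged


def render_badexit_lines(fingerprints=(), patterns=()):
    """Render badexit-routers lines, one item per line (an assumption)."""
    lines = [f"BadExitNet {pattern}" for pattern in patterns]
    for fingerprint in sorted(fingerprints):
        if not _HEXDIGEST.match(fingerprint):
            raise ValueError(f"not a 40-character hexdigest: {fingerprint!r}")
        lines.append(f"BadExitFP {fingerprint.upper()}")
    return lines
```

Running the filter only after a whole-network scan matches the proposal's reasoning: the failure rate for a URL is only meaningful once most exits have been tested against it.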