Filename: 121-hidden-service-authentication.txt
Title: Hidden Service Authentication
Version: $Revision$
Last-Modified: $Date$
Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger, Christoph Weingarten
Created: 10-Sep-2007
Status: Open

Change history:

  26-Sep-2007  Initial proposal for or-dev
  08-Dec-2007  Incorporated comments by Nick posted to or-dev on 10-Oct-2007
  15-Dec-2007  Rewrote complete proposal for better readability, modified authentication protocol, merged in personal notes
  24-Dec-2007  Replaced misleading term "authentication" by "authorization" and added some clarifications (comments by Sven Kaffille)
  28-Apr-2008  Updated most parts of the concrete authorization protocol

Overview:

This proposal deals with a general infrastructure for performing authorization (not necessarily implying authentication) of requests to hidden services at three points: (1) when downloading and decrypting parts of the hidden service descriptor, (2) at the introduction point, and (3) at Bob's Tor client before contacting the rendezvous point. A service provider will be able to restrict access to his service at these three points to authorized clients only. Further, the proposal contains a first instance of an authorization protocol for the presented infrastructure.

This proposal is based on v2 hidden service descriptors as described in proposal 114 and introduced in version 0.2.0.10-alpha.

The proposal is structured as follows: The next section motivates the integration of authorization mechanisms in the hidden service protocol. Then we describe a general infrastructure for authorization in hidden services, followed by a specific authorization protocol for this infrastructure. At the end we discuss a number of attacks and non-attacks as well as compatibility issues.

Motivation:

The majority of hidden services do not require client authorization now and will not do so in the future.
On the contrary, many clients would not want to be (pseudonymously) identifiable by the service (though this is unavoidable to some extent), but rather use the service anonymously. These services are not addressed by this proposal.

However, there may be certain services which are intended to be accessed by a limited set of clients only. A possible application might be a wiki or forum that should only be accessible to a closed user group. Another, less intuitive example might be a real-time communication service, where someone provides a presence and messaging service only to his buddies. Finally, a possible application would be a personal home server that should be remotely accessed by its owner.

Performing authorization for a hidden service within the Tor network, as proposed here, offers a range of advantages compared to allowing all client connections in the first instance and deferring authorization to the transported protocol:

(1) Reduced traffic: Unauthorized requests would be rejected as early as possible, thereby reducing the overall traffic in the network generated by establishing circuits and sending cells.
(2) Better protection of service location: Unauthorized clients could not force Bob to create circuits to their rendezvous points, thus preventing the attack described by Øverlier and Syverson in their paper "Locating Hidden Servers" even without the need for guards.
(3) Hiding activity: Apart from performing the actual authorization, a service provider could also hide the mere presence of his service from unauthorized clients by not providing hidden service descriptors to them, by rejecting unauthorized requests already at the introduction point (ideally without leaking presence information at any of these points), or by not answering unauthorized introduction requests.
(4) Better protection of introduction points: When providing hidden service descriptors to authorized clients only and encrypting the introduction points as described in proposal 114, the introduction points would be unknown to unauthorized clients and thereby protected from DoS attacks.
(5) Protocol independence: Authorization could be performed for all transported protocols, regardless of their own capabilities to do so.
(6) Ease of administration: A service provider running multiple hidden services would be able to configure access at a single place uniformly instead of doing so for all services separately.
(7) Optional QoS support: Bob could adapt his node selection algorithm for building the circuit to Alice's rendezvous point depending on a previously guaranteed QoS level, thus providing better latency or bandwidth for selected clients.

A disadvantage of performing authorization within the Tor network is that a hidden service cannot make use of authorization data in the transported protocol. Tor hidden services were designed to be independent of the transported protocol. Therefore it is only possible to either grant or deny access to the whole service, but not to specific resources of the service.

Authorization often implies authentication, i.e. proving one's identity. However, when performing authorization within the Tor network, untrusted points should not gain any useful information about the identities of communicating parties, neither server nor client. A crucial challenge is to remain anonymous towards directory servers and introduction points. However, trying to hide identity from the hidden service is a futile task, because a client would never know if he is the only authorized client and therefore perfectly identifiable. Therefore, hiding client identity from the hidden service is not an aim of this proposal.

The current implementation of hidden services does not provide any kind of authorization.
The hidden service descriptor version 2, introduced by proposal 114, was designed to use a descriptor cookie for downloading and decrypting parts of the descriptor content, but this feature is not yet in use. Further, most relevant cell formats specified in rend-spec contain fields for authorization data, but those fields are neither implemented nor do they suffice entirely.

Details:

1. General infrastructure for authorization to hidden services

We identified three possible authorization points in the hidden service protocol: (1) when downloading and decrypting parts of the hidden service descriptor, (2) at the introduction point, and (3) at Bob's Tor client before contacting the rendezvous point. The general idea of this proposal is to allow service providers to restrict access to some or all of these points to authorized clients only.

1.1. Client authorization at directory

Since the implementation of proposal 114 it is possible to combine a hidden service descriptor with a so-called descriptor cookie. If this is done, the descriptor cookie becomes part of the descriptor ID, thus having an effect on the storage location of the descriptor. Someone who has learned about a service, but is not aware of the descriptor cookie, won't be able to determine the descriptor ID and download the current hidden service descriptor; he won't even know whether the service has uploaded a descriptor recently. Descriptor IDs are calculated as follows (see section 1.2 of rend-spec for the complete specification of v2 hidden service descriptors):

  descriptor-id = H(service-id | H(time-period | descriptor-cookie | replica))

Currently, service-id is equivalent to permanent-id, which is calculated as in the following formula. But in principle it could be any public key.

  permanent-id = H(permanent-key)[:10]

The second purpose of the descriptor cookie is to encrypt the list of introduction points, including optional authorization data.
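The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation: SHA-1 as H, a 4-octet big-endian time period, and a 1-octet replica are assumptions of this sketch (rend-spec section 1.2 is the authority on the exact encodings), and the key bytes and cookies below are hypothetical values.

```python
import hashlib
import struct


def H(data: bytes) -> bytes:
    """Hash function H; SHA-1 is an assumption of this sketch."""
    return hashlib.sha1(data).digest()


def permanent_id(permanent_key: bytes) -> bytes:
    # permanent-id = H(permanent-key)[:10]
    return H(permanent_key)[:10]


def descriptor_id(service_id: bytes, time_period: int,
                  descriptor_cookie: bytes, replica: int) -> bytes:
    # Inner hash: H(time-period | descriptor-cookie | replica).
    # The field widths used here are assumptions of this sketch.
    secret_part = H(struct.pack(">I", time_period)
                    + descriptor_cookie
                    + struct.pack(">B", replica))
    # descriptor-id = H(service-id | inner hash)
    return H(service_id + secret_part)


# Hypothetical inputs, for illustration only.
service_id = permanent_id(b"example public key bytes")
with_cookie = descriptor_id(service_id, 13000, b"\x01" * 16, 0)
other_cookie = descriptor_id(service_id, 13000, b"\x02" * 16, 0)
```

Because the cookie enters the inner hash, `with_cookie` and `other_cookie` differ even though the service and time period are identical, so a party without the cookie cannot derive the ID under which the descriptor is stored.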
Because the introduction points are encrypted with the descriptor cookie, the hidden service directories won't learn any introduction information from storing a hidden service descriptor. This feature is implemented but unused at the moment, so this proposal will harness the advantages of proposal 114.

The descriptor cookie can be used for authorization by keeping it secret from everyone but authorized clients. A service could then decide whether to publish hidden service descriptors using that descriptor cookie later on. An authorized client being aware of the descriptor cookie would be able to download and decrypt the hidden service descriptor.

The number of concurrently used descriptor cookies for one hidden service is not restricted. A service could use a single descriptor cookie for all users, a distinct cookie per user, or something in between, like one cookie per group of users. This depends on the specific protocol and on how a service provider applies it.

Although this part of the proposal is meant to describe a general infrastructure for authorization, changing the way of using the descriptor cookie to look up hidden service descriptors, e.g. applying some sort of asymmetric crypto system, would require in-depth changes that would be incompatible with v2 hidden service descriptors. In contrast, using another key for en-/decrypting the introduction point part of a hidden service descriptor, e.g. a different symmetric key or asymmetric encryption, would be easy to implement and compatible with v2 hidden service descriptors as understood by hidden service directories (clients and servers would have to be upgraded anyway for using the new features).

1.2. Client authorization at introduction point

The next possible authorization point after downloading and decrypting a hidden service descriptor is the introduction point. It may be important for authorization, because it offers the last chance of hiding the presence of a hidden service from unauthorized clients.
Further, performing authorization at the introduction point might reduce traffic in the network, because unauthorized requests would not be passed to the hidden service. This applies to those clients who are aware of a descriptor cookie and thereby of the hidden service descriptor, but do not have authorization data to pass the introduction point or access the service (such a situation might occur when authorization data for authorization at the directory is not issued on a per-user basis, as opposed to authorization data for authorization at the introduction point).

It is important to note that the introduction point must be considered untrustworthy, and therefore cannot replace authorization at the hidden service itself. Nor should the introduction point learn any sensitive identifiable information from either server or client.

In order to perform authorization at the introduction point, three message formats need to be modified: (1) v2 hidden service descriptors, (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.

A v2 hidden service descriptor needs to contain authorization data that is introduction-point-specific and sometimes also authorization data that is introduction-point-independent. Therefore, v2 hidden service descriptors as specified in section 1.2 of rend-spec already contain two reserved fields "intro-authorization" and "service-authorization" (originally, the names of these fields were "...-authentication") containing an authorization type number and arbitrary authorization data. We propose that authorization data consists of base64 encoded objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and "-----END MESSAGE-----". This will increase the size of hidden service descriptors, which is acceptable because there is no strict upper limit.

The current ESTABLISH_INTRO cells as described in section 1.3 of rend-spec do not contain either authorization data or version information.
Therefore, we propose a new version 1 of the ESTABLISH_INTRO cells adding these two items as follows:

  V      Format byte: set to 255               [1 octet]
  V      Version byte: set to 1                [1 octet]
  KL     Key length                            [2 octets]
  PK     Bob's public key                      [KL octets]
  HS     Hash of session info                  [20 octets]
  AUTHT  The auth type that is supported       [1 octet]
  AUTHL  Length of auth data                   [2 octets]
  AUTHD  Auth data                             [variable]
  SIG    Signature of above information        [variable]

From the format it is possible to determine the maximum allowed size for authorization data: given the fact that cells are 512 octets long, of which 498 octets are usable (see section 6.1 of tor-spec), and assuming 1024 bit = 128 octet long keys, there are 215 octets left for authorization data (498 octets, minus 27 octets of fixed-size fields, minus 128 octets for PK, minus 128 octets for SIG). Hence, authorization protocols are bound to use no more than these 215 octets, regardless of the number of clients that shall be authenticated at the introduction point. Otherwise, one would need to send multiple ESTABLISH_INTRO cells or split them up, which we do not specify here.

In order to understand a v1 ESTABLISH_INTRO cell, a relay must run a sufficiently recent Tor version, which would probably be some 0.2.1.x. Hidden services need to be able to distinguish relays that are capable of understanding the new v1 cell formats and of performing authorization. We propose to use the version number that is contained in networkstatus documents to find capable introduction points.

The current INTRODUCE1 cells as described in section 1.8 of rend-spec are not designed to carry authorization data and have no version number either. We propose the following version 1 of INTRODUCE1 cells:

  Cleartext
    V      Version byte: set to 1                [1 octet]
    PK_ID  Identifier for Bob's PK               [20 octets]
    AUTHT  The auth type that is supported       [1 octet]
    AUTHL  Length of auth data                   [2 octets]
    AUTHD  Auth data                             [variable]
  Encrypted to Bob's PK:
    (RELAY_INTRODUCE2 cell)

The maximum length of contained authorization data depends on the length of the contained INTRODUCE2 cell.
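The cleartext part of the proposed v1 INTRODUCE1 cell can be sketched as follows. This is a minimal illustration, not the Tor implementation: the function names are hypothetical, big-endian encoding of AUTHL is assumed, and the encrypted INTRODUCE2 part is treated as an opaque byte string.

```python
import struct

# V (1 octet), PK_ID (20 octets), AUTHT (1 octet), AUTHL (2 octets);
# big-endian byte order is an assumption of this sketch.
INTRODUCE1_V1_HEADER = struct.Struct(">B20sBH")
MAX_PAYLOAD = 498  # usable octets per cell, as stated in the text


def pack_introduce1_v1(pk_id: bytes, auth_type: int, auth_data: bytes,
                       encrypted_introduce2: bytes) -> bytes:
    """Build the payload: cleartext header, AUTHD, then the encrypted part."""
    if len(pk_id) != 20:
        raise ValueError("PK_ID must be 20 octets")
    header = INTRODUCE1_V1_HEADER.pack(1, pk_id, auth_type, len(auth_data))
    payload = header + auth_data + encrypted_introduce2
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload does not fit into one cell")
    return payload


def unpack_introduce1_v1(payload: bytes):
    """Split a payload into (PK_ID, AUTHT, AUTHD, encrypted part)."""
    version, pk_id, auth_type, auth_len = \
        INTRODUCE1_V1_HEADER.unpack_from(payload)
    if version != 1:
        raise ValueError("not a v1 INTRODUCE1 cell")
    start = INTRODUCE1_V1_HEADER.size
    auth_data = payload[start:start + auth_len]
    if len(auth_data) != auth_len:
        raise ValueError("truncated auth data")
    return pk_id, auth_type, auth_data, payload[start + auth_len:]
```

The fixed cleartext fields occupy `INTRODUCE1_V1_HEADER.size` = 24 octets; everything beyond that is shared between AUTHD and the encrypted INTRODUCE2 part.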
A calculation of this maximum length follows below, in the description of the INTRODUCE2 cell format we propose to use.

Unfortunately, v0 INTRODUCE1 cells consist only of a fixed-size, seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This makes it impossible to distinguish v0 INTRODUCE1 cells from any later format. In particular, it is not possible to introduce some kind of format and version byte for newer versions of this cell. That's probably where the comment "[XXX011 want to put intro-level auth info here, but no version. crap. -RD]" that was part of rend-spec some time ago comes from.

Processing of v1 INTRODUCE1 cells therefore requires knowledge about the context in which they are used. As a result, we propose that after receiving a v1 ESTABLISH_INTRO cell, an introduction point only accepts v1 INTRODUCE1 cells on that introduction circuit. Hence, the same introduction point cannot be used to accept both v0 and v1 INTRODUCE1 cells for the same service. (Another solution would be to distinguish v0 and v1 INTRODUCE1 cells by their size, as v0 INTRODUCE1 cells can only have specific cell sizes, depending on the version of the contained INTRODUCE2 cell; however, this approach does not appear very clean.)

1.3. Client authorization at hidden service

The time when a hidden service receives an INTRODUCE2 cell constitutes the last possible authorization point during the hidden service protocol. Performing authorization here is easier than at the other two authorization points, because there are no possibly untrusted entities involved.

In general, a client that is successfully authorized at the introduction point should be granted access at the hidden service, too. Otherwise, the client would receive a positive INTRODUCE_ACK cell from the introduction point and conclude that it may connect to the service, but the request would be dropped without notice. This would appear as a failure to clients.
Therefore, the number of cases in which a client successfully passes the introduction point but fails at the hidden service should be zero. However, this does not imply that the authorization data used at the introduction point and at the hidden service must be the same, only that both should lead to the same authorization result.

Authorization data is transmitted from client to server via an INTRODUCE2 cell that is forwarded by the introduction point. There are versions 0 to 2 specified in section 1.8 of rend-spec, but none of these contains fields for carrying authorization data. We propose a slightly modified version of the v3 INTRODUCE2 cell that is specified in section 1.8.1 and was not implemented as of December 2007. In contrast to the specified v3 we avoid specifying (and implementing) IPv6 capabilities, because Tor relays will be required to support IPv4 addresses for a long time in the future, so that this seems unnecessary at the moment. The proposed format of v3 INTRODUCE2 cells is as follows:

  VER    Version byte: set to 3.               [1 octet]
  AUTHT  The auth type that is supported       [1 octet]
  AUTHL  Length of auth data                   [2 octets]
  AUTHD  Auth data                             [variable]
  IP     Rendezvous point's address            [4 octets]
  PORT   Rendezvous point's OR port            [2 octets]
  ID     Rendezvous point identity ID          [20 octets]
  KLEN   Length of onion key                   [2 octets]
  KEY    Rendezvous point onion key            [KLEN octets]
  RC     Rendezvous cookie                     [20 octets]
  g^x    Diffie-Hellman data, part 1           [128 octets]

The maximum possible length of authorization data is related to the enclosing INTRODUCE1 cell. A v3 INTRODUCE2 cell with 1024 bit = 128 octets long public keys without any authorization data occupies 306 octets (AUTHL is only used when AUTHT has a value != 0), plus 58 octets for hybrid public key encryption (see section 5.1 of tor-spec on hybrid encryption of CREATE cells). The surrounding v1 INTRODUCE1 cell requires 24 octets.
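The sizes quoted above can be checked field by field. The following sketch only adds up the field lengths given in the two cell formats and the constants stated in the text (498 usable octets, 58 octets of hybrid encryption overhead); the variable names are chosen for illustration.

```python
RELAY_PAYLOAD = 498    # usable octets per cell, as stated in the text
HYBRID_OVERHEAD = 58   # hybrid public key encryption, as stated in the text

# v3 INTRODUCE2 fields without authorization data (AUTHT = 0, no AUTHL/AUTHD),
# assuming 1024-bit = 128-octet keys as in the text.
introduce2_fields = {
    "VER": 1, "AUTHT": 1, "IP": 4, "PORT": 2, "ID": 20,
    "KLEN": 2, "KEY": 128, "RC": 20, "g^x": 128,
}

# Fixed cleartext fields of the v1 INTRODUCE1 cell.
introduce1_header = {"V": 1, "PK_ID": 20, "AUTHT": 1, "AUTHL": 2}

introduce2_size = sum(introduce2_fields.values())
introduce1_size = sum(introduce1_header.values())
remaining = (RELAY_PAYLOAD - introduce2_size
             - HYBRID_OVERHEAD - introduce1_size)

print(introduce2_size, introduce1_size, remaining)  # prints "306 24 110"
```

Note that when AUTHT is non-zero in the INTRODUCE2 cell, its 2-octet AUTHL field also has to come out of the remaining octets.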
Together, these sizes leave only 110 of the 498 available octets free, which must be shared between authorization data to the introduction point _and_ to the hidden service.

When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has provided valid authorization data to him. Only if so will he build a circuit to the provided rendezvous point; otherwise he will drop the cell.

There might be several attacks based on the idea of replaying existing cells to the hidden service. In particular, someone (the introduction point or an evil authenticated client) might replay valid INTRODUCE2 cells to make the hidden service build an arbitrary number of circuits to (maybe long gone) rendezvous points. Therefore, we propose that hidden services maintain a history of INTRODUCE2 cells received within the last hour and only accept INTRODUCE2 cells matching the following rules: (1) a maximum of 3 cells coming from the same client and containing the same rendezvous cookie, and (2) a maximum of 10 cells coming from the same client with different rendezvous cookies. This allows a client to retry connection establishment using the same rendezvous point up to 3 times, and a total of 10 connection establishments (not requests in the transported protocol) per hour.

1.4. Summary of authorization data fields

In summary, the proposed descriptor format and cell formats provide the following fields for carrying authorization data:

(1) The v2 hidden service descriptor contains:
    - a descriptor cookie that is used for the lookup process, and
    - an arbitrary encryption schema to ensure authorization to access introduction information (currently symmetric encryption with the descriptor cookie).
(2) For performing authorization at the introduction point we can use:
    - the fields intro-authorization and service-authorization in hidden service descriptors,
    - a maximum of 215 octets in the ESTABLISH_INTRO cell, and
    - one part of 110 octets in the INTRODUCE1 cell.
(3) For performing authorization at the hidden service we can use:
    - the fields intro-authorization and service-authorization in hidden service descriptors,
    - the other part of 110 octets in the INTRODUCE2 cell.

It will also still be possible to access a hidden service without any authorization or to use only a part of the authorization infrastructure. However, this requires considering all parts of the infrastructure. For example, authorization at the introduction point relying on confidential intro-authorization data transported in the hidden service descriptor cannot be performed without using an encryption schema for introduction information.

1.5. Managing authorization data at servers and clients

In order to provide authorization data at the hidden server and the authenticated clients, we propose to use files, either the tor configuration file or separate files. The exact format of these special files depends on the authorization protocol used.

Currently, rend-spec contains the proposition to encode client-side authorization data in the URL, like in x.y.z.onion. This was never used and is also a bad idea, because in case of HTTP the requested URL may be contained in the Host and Referer fields.

2. An authorization protocol based on group and user passwords

In the following we discuss an authorization protocol for the proposed authorization architecture that performs authorization at the directory and the hidden service, but not at the introduction point. The protocol relies on a distinct asymmetric key (client-key) and a distinct symmetric key (descriptor-cookie) for each client. The asymmetric key replaces the service's permanent key and the symmetric key is used as descriptor cookie as described above.

2.1. Client authorization at directory

The symmetric key of 128 bits length is used as descriptor cookie for publishing/fetching hidden service descriptors and for encrypting/decrypting the contained introduction points.
Further, the asymmetric key replaces the service's permanent key that is used to encode and sign a v2 hidden service descriptor. The result is a v2 hidden service descriptor with the following format:

  descriptor-id = H(H(client-key)[:10] | H(time-period | descriptor-cookie | replica))

  descriptor-content = {
    descriptor-id,
    version,
    client-key,
    H(time-period | descriptor-cookie | replica),
    timestamp,
    protocol-versions,
    { introduction-points } encrypted with descriptor-cookie
  } signed with private-key

Whenever a server decides to remove authorization for a client, he can simply stop publishing hidden service descriptors using the descriptor cookie.

The fact that there needs to be a separate hidden service descriptor for each user leads to a large number of such descriptors. However, this is the only way for a service provider to remove a client's authorization completely. We assume that distributing the directory of hidden service descriptors as implemented by proposal 114 provides the necessary scalability to do so.

2.2. Client authorization at introduction point

There is no need to perform authorization at the introduction point in this protocol. Only authorized clients can decrypt the introduction point part of a hidden service descriptor. This part contains the introduction key that was introduced by proposal 114 and that is required to get an INTRODUCE1 cell passed at the introduction point.

2.3. Client authorization at hidden service

Authorization at the hidden service also makes use of the descriptor cookie. The client includes this descriptor cookie in the INTRODUCE2 cells that it sends to the server. The server compares the authorization data of incoming INTRODUCE2 cells with the locally stored value that it would expect. The authorization type number of this protocol for INTRODUCE2 cells is "1".

2.4. Providing authorization data

The Tor client of a hidden service needs to know the client keys and descriptor cookies of all authorized clients.
We decided to create a new configuration option that specifies a comma-separated list of human-readable client names:

  HiddenServiceAuthorizeClient client-name,client-name,...

When a hidden service is configured, the client keys and descriptor cookies for all configured client names are either read from a file or generated and appended to that file. The file format is:

  "client-name" human-readable client identifier NL
  "descriptor-cookie" 128-bit key, i.e. 22 base64 chars NL
  "client-key" NL
  a public key in PEM format

On the client side, we propose to add a new configuration option that contains a service name, the service identifier (H(client-key)[:10]), and the descriptor cookie that are required to access a hidden service. The configuration option has the following syntax:

  HidServAuth service-name service-address descriptor-cookie

Whenever the user tries to access the given onion address, the given descriptor cookie is used for authorization.

Security implications:

In the following we discuss possible attacks by dishonest entities in the presented infrastructure and specific protocol. These security implications would have to be verified once more when adding another protocol. The dishonest entities (theoretically) include the hidden server itself, the authenticated clients, hidden service directory nodes, introduction points, and rendezvous points. The relays that are part of circuits used during protocol execution, but never learn about the exchanged descriptors or cells by design, are not considered. Obviously, this list makes no claim to be complete. The discussed attacks are sorted by the difficulty of performing them, in ascending order, starting with roles that everyone could attempt to take and ending with partially trusted entities abusing the trust put in them.
(1) A hidden service directory could attempt to conclude presence of a server from the existence of a locally stored hidden service descriptor: This passive attack is possible only for a single client-service relation, because descriptors need to contain a publicly visible signature of the server using the client key. A possible protection would be to increase the number of hidden service directories in the network.
(2) A hidden service directory could try to break the descriptor cookies of locally stored descriptors: This attack can be performed offline. The only useful countermeasure against it might be using safe passwords that are generated by Tor.
(3) An introduction point could try to identify the pseudonym of the hidden service on behalf of which it operates: This is impossible by design, because the service uses a fresh public key for every establishment of an introduction point (see proposal 114) and the introduction point receives a fresh introduction cookie, so that there is no identifiable information about the service that the introduction point could learn. The introduction point cannot even tell if client accesses belong to the same client or not, nor can it know the total number of authorized clients. The only information might be the pattern of anonymous client accesses, but that is hardly enough to reliably identify a specific service.
(4) An introduction point could want to learn the identities of accessing clients: This is also impossible by design, because all clients use the same introduction cookie for authorization at the introduction point.
(5) An introduction point could try to replay a correct INTRODUCE1 cell to other introduction points of the same service, e.g. in order to force the service to create a huge number of useless circuits: This attack is not possible by design, because INTRODUCE1 cells are encrypted using a freshly created introduction key that is only known to authorized clients.
(6) An introduction point could attempt to replay a correct INTRODUCE2 cell to the hidden service, e.g. for the same reason as in the last attack: This attack is very limited by the fact that a server will only accept 3 INTRODUCE2 cells containing the same rendezvous cookie and drop all further replayed cells.
(7) An introduction point could block client requests by sending either positive or negative INTRODUCE_ACK cells back to the client, but without forwarding INTRODUCE2 cells to the server: This attack is an annoyance for clients, because they might wait for a timeout to elapse until trying another introduction point. However, this attack is not introduced by performing authorization and it cannot be targeted towards a specific client. A countermeasure might be for the server to periodically perform introduction requests to his own service to see if introduction points are working correctly.
(8) The rendezvous point could attempt to identify either server or client: This remains impossible as it was before, because the rendezvous cookie does not contain any identifiable information.
(9) An authenticated client could swamp the server with valid INTRODUCE1 and INTRODUCE2 cells, e.g. in order to force the service to create useless circuits to rendezvous points; as opposed to an introduction point replaying the same INTRODUCE2 cell, a client could include a new rendezvous cookie for every request: The countermeasure for this attack is the restriction to 10 connection establishments per client and hour.

Compatibility:

An implementation of this proposal would require changes to hidden servers and clients to process authorization data and encode and understand the new formats. However, both servers and clients would remain compatible with regular hidden services without authorization.

Implementation:

The implementation of this proposal can be divided into a number of changes to the hidden service and client side.
There are no changes necessary on directory, introduction, or rendezvous nodes. All changes are marked with either [service] or [client] to denote on which side they need to be made.

/1/ Configure client authorization [service]

- Parse configuration option HiddenServiceAuthorizeClient containing authorized client names.
- Load previously created client keys and descriptor cookies.
- Generate missing client keys and descriptor cookies, add them to client_keys file.
- Rewrite the hostname file.
- Keep client keys and descriptor cookies of authorized clients in memory.
[- In case of reconfiguration, mark which client authorizations were added and whether any were removed. This can be used later when deciding whether to rebuild introduction points and publish new hidden service descriptors. Not implemented yet.]

/2/ Publish hidden service descriptors [service]

- Create and upload hidden service descriptors for all authorized clients.
[- See /1/ for the case of reconfiguration.]

/3/ Configure permission for hidden services [client]

- Parse configuration option HidServAuth containing service authorization, store authorization data in memory.

/5/ Fetch hidden service descriptors [client]

- Look up client authorization upon receiving a hidden service request.
- Request hidden service descriptor ID including client key and descriptor cookie. Only request v2 descriptors, no v0.

/6/ Process hidden service descriptor [client]

- Decrypt introduction points with descriptor cookie.

/7/ Create introduction request [client]

- Include descriptor cookie in INTRODUCE2 cell to introduction point.
- Pass descriptor cookie around between involved connections and circuits.

/8/ Process introduction request [service]

- Read descriptor cookie from INTRODUCE2 cell.
- Check whether descriptor cookie is authorized for access, including checking access counters.
- Log access for accountability.
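Step /8/ combines the descriptor cookie check of section 2.3 with the access counters of section 1.3 (at most 3 cells with the same rendezvous cookie and at most 10 different rendezvous cookies per client within one hour). The following is a minimal sketch of that check, not the Tor implementation: the class and method names are hypothetical, clients are identified by their per-client descriptor cookie, the history is kept in memory only, and logging is omitted.

```python
import hmac
import time
from collections import defaultdict

WINDOW_SECONDS = 3600       # history covers the last hour
MAX_SAME_COOKIE = 3         # cells with the same rendezvous cookie
MAX_DISTINCT_COOKIES = 10   # different rendezvous cookies per client


class IntroductionGate:
    """Hypothetical helper deciding whether to act on an INTRODUCE2 cell."""

    def __init__(self, authorized):
        # authorized: mapping of client name -> descriptor cookie (bytes)
        self.authorized = dict(authorized)
        # client name -> list of (timestamp, rendezvous cookie)
        self.history = defaultdict(list)

    def _client_for(self, descriptor_cookie):
        for name, cookie in self.authorized.items():
            # Constant-time comparison of the presented cookie.
            if hmac.compare_digest(cookie, descriptor_cookie):
                return name
        return None

    def accept(self, descriptor_cookie, rendezvous_cookie, now=None):
        now = time.time() if now is None else now
        client = self._client_for(descriptor_cookie)
        if client is None:
            return False  # unknown cookie: drop the cell
        # Forget entries older than one hour.
        recent = [(t, rc) for t, rc in self.history[client]
                  if now - t < WINDOW_SECONDS]
        self.history[client] = recent
        same = sum(1 for _, rc in recent if rc == rendezvous_cookie)
        distinct = {rc for _, rc in recent}
        if same >= MAX_SAME_COOKIE:
            return False  # rule (1): too many retries with this cookie
        if (rendezvous_cookie not in distinct
                and len(distinct) >= MAX_DISTINCT_COOKIES):
            return False  # rule (2): too many connection establishments
        recent.append((now, rendezvous_cookie))
        return True
```

A service would call `accept()` once per received INTRODUCE2 cell and build a circuit to the rendezvous point only when it returns `True`. Rejected cells are not recorded in the history in this sketch; whether they should count against the limits is a design choice the proposal leaves open.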