Client-specific tls-crypt keys (--tls-crypt-v2)
===============================================

This document describes the ``--tls-crypt-v2`` option, which enables OpenVPN
to use client-specific ``--tls-crypt`` keys.

Rationale
---------

``--tls-auth`` and ``tls-crypt`` use a pre-shared group key, which is shared
among all clients and servers in an OpenVPN deployment. If any client or
server is compromised, the attacker will have access to this shared key, and it
will no longer provide any security. To reduce the risk of losing pre-shared
keys, ``tls-crypt-v2`` adds the ability to supply each client with a unique
tls-crypt key. This allows large organisations and VPN providers to profit
from the same DoS and TLS stack protection that small deployments can already
achieve using ``tls-auth`` or ``tls-crypt``.

Also, for ``tls-crypt``, even if all these peers succeed in keeping the key
secret, the key lifetime is limited to roughly 8000 years, divided by the
number of clients (see the ``--tls-crypt`` section of the man page). Using
client-specific keys, we lift this lifetime requirement to roughly 8000 years
for each client key (which "Should Be Enough For Everybody (tm)").

Introduction
------------

``tls-crypt-v2`` uses an encrypted cookie mechanism to introduce
client-specific tls-crypt keys without introducing a lot of server-side state.
The client-specific key is encrypted using a server key. The server key is the
same for all servers in a group. When a client connects, it first sends the
encrypted key to the server, such that the server can decrypt the key and all
messages can thereafter be encrypted using the client-specific key.

A wrapped (encrypted and authenticated) client-specific key can also contain
metadata. The metadata is wrapped together with the key, and can be used to
allow servers to identify clients and/or key validity.
This allows the server to abort the connection immediately after receiving the
first packet, rather than performing an entire TLS handshake. Aborting the
connection this early greatly improves the DoS resilience and reduces attack
surface against malicious clients that have the ``tls-crypt`` or ``tls-auth``
key. This is particularly relevant for large deployments (think lost key or
disgruntled employee) and VPN providers (clients are not trusted).

To allow for a smooth transition, ``tls-crypt-v2`` is designed such that a
server can enable both ``tls-crypt-v2`` and either ``tls-crypt`` or
``tls-auth``. This is achieved by introducing a
P_CONTROL_HARD_RESET_CLIENT_V3 opcode, which indicates that the client wants
to use ``tls-crypt-v2`` for the current connection.

For an exact specification and more details, read the Implementation section.

Implementation
--------------

When setting up a tls-crypt-v2 group (similar to generating a tls-crypt or
tls-auth key previously):

1. Generate a tls-crypt-v2 server key using OpenVPN's
   ``--genkey tls-crypt-v2-server``. This key contains two 512-bit keys, of
   which we use:

   * the first 256 bits of key 1 as AES-256-CTR encryption key ``Ke``
   * the first 256 bits of key 2 as HMAC-SHA-256 authentication key ``Ka``

   This format is similar to the format for regular ``tls-crypt``/``tls-auth``
   and data channel keys, which allows us to reuse code.

2. Add the tls-crypt-v2 server key to all server configs
   (``tls-crypt-v2 /path/to/server.key``)

When provisioning a client, create a client-specific tls-crypt key:

1. Generate a 2048-bit client-specific key ``Kc`` using OpenVPN's
   ``--genkey tls-crypt-v2-client``

2. Optionally generate metadata

   The first byte of the metadata determines the type. The initial
   implementation supports the following types:

   0x00 (USER):      User-defined free-form data.
   0x01 (TIMESTAMP): 64-bit network order unix timestamp of key generation.

   The timestamp can be used to reject too-old tls-crypt-v2 client keys.

   User metadata could for example contain the user's certificate serial, such
   that the incoming connection can be verified against a CRL.

   If no metadata is supplied during key generation, openvpn defaults to the
   TIMESTAMP metadata type.

3. Create a wrapped client key ``WKc``, using the same nonce-misuse-resistant
   SIV construction we use for tls-crypt:

   ``len = len(WKc)`` (16 bit, network byte order)

   ``T = HMAC-SHA256(Ka, len || Kc || metadata)``

   ``IV = 128 most significant bits of T``

   ``WKc = T || AES-256-CTR(Ke, IV, Kc || metadata) || len``

   Note that the length of ``WKc`` can be computed before composing ``WKc``,
   because the length of each component is known (and AES-256-CTR does not add
   any padding).

4. Create a tls-crypt-v2 client key: PEM-encode ``Kc || WKc`` and store in a
   file, using the header ``-----BEGIN OpenVPN tls-crypt-v2 client key-----``
   and the footer ``-----END OpenVPN tls-crypt-v2 client key-----``. (The PEM
   format is simple, and following PEM allows us to use the crypto lib function
   for en/decoding.)

5. Add the tls-crypt-v2 client key to the client config
   (``tls-crypt-v2 /path/to/client-specific.key``)

When setting up the openvpn connection:

1. The client reads the tls-crypt-v2 key from its config, and:

   1. loads ``Kc`` as its tls-crypt key,
   2. stores ``WKc`` in memory for sending to the server.

2. To start the connection, the client creates a
   P_CONTROL_HARD_RESET_CLIENT_V3 message, wraps it with tls-crypt using ``Kc``
   as the key, and appends ``WKc``. (``WKc`` must not be encrypted, to prevent
   a chicken-and-egg problem.)

3. The server receives the P_CONTROL_HARD_RESET_CLIENT_V3 message, and

   1. reads the ``WKc`` length field from the end of the message, and extracts
      ``WKc`` from the message
   2. unwraps ``WKc``
   3. uses unwrapped ``Kc`` to verify the remaining
      P_CONTROL_HARD_RESET_CLIENT_V3 message's (encryption and) authentication.

   The message is dropped and no error response is sent when either 3.1, 3.2 or
   3.3 fails (DoS protection).

4. Server optionally checks metadata using a ``--tls-crypt-v2-verify`` script

   This allows early abort of connection, *before* we expose any of the
   notoriously dangerous TLS, X.509 and ASN.1 parsers and thereby reduces the
   attack surface of the server.

   The metadata is checked *after* the OpenVPN three-way handshake has
   completed, to prevent DoS attacks. (That is, once the client has proved to
   the server that it possesses ``Kc``, by authenticating a packet that
   contains the session ID picked by the server.)

   A server should not send back any error messages if metadata verification
   fails, to reduce attack surface and maximize DoS resilience.

5. Client and server use ``Kc`` for (un)wrapping any following control channel
   messages.

HMAC Cookie support
-------------------

To avoid exhaustion attacks and to avoid keeping state for connections that
fail to complete the three-way handshake, the OpenVPN server will use its own
session id as a challenge that the client must repeat in the third packet of
the handshake. This introduces a problem. If the server does not keep the
wrapped client key from the initial packet, the server cannot decode the third
packet. Therefore, tls-crypt-v2 in 2.6 allows resending the wrapped key in the
third packet of the handshake with the P_CONTROL_WKC_V1 message. The modified
handshake is as follows (the rest of the handshake is unmodified):

1. The client creates the P_CONTROL_HARD_RESET_CLIENT_V3 message as before
   but indicates that it supports resending the wrapped key. This is done
   by setting the packet id of the replay id to 0x0f000000. The first byte
   indicates the early negotiation support and the next byte the flags.
   All tls-crypt-v2 implementations that support early negotiation must
   also support resending the wrapped key. The flags byte is therefore empty.

2. The server responds with a P_CONTROL_HARD_RESET_V2 message. Instead of the
   usual empty payload, the payload consists of TLV (type (uint16),
   length (uint16), value) packets. TLV was chosen to allow extensibility in
   the future. Currently only the following TLV is defined:

   flags - type 0x01, length 2.

   Bit 1 indicates that the client needs to resend the ``WKc`` in the third
   packet.

3. Instead of a normal P_ACK_V1 or P_CONTROL_V1 packet, the client will send a
   P_CONTROL_WKC_V1 packet. The P_CONTROL_WKC_V1 packet is identical to a
   normal P_CONTROL_V1 packet but with the ``WKc`` appended.

   Normally the first message of the client is either P_ACK_V1, directly
   followed by a P_CONTROL_V1 message that contains the TLS Client Hello, or
   just a P_CONTROL_V1 message. Instead of a P_ACK_V1 message the client should
   send a P_CONTROL_WKC_V1 message with an empty payload. This message must
   also include an ACK for the P_CONTROL_HARD_RESET_V2 message.

   When directly sending the TLS Client Hello message in the P_CONTROL_WKC_V1
   message, the client must ensure that the resulting P_CONTROL_WKC_V1 message
   with the appended ``WKc`` does not exceed the maximum control message
   length.

Considerations
--------------

To allow for a smooth transition, the server implementation allows
``tls-crypt`` or ``tls-auth`` to be used simultaneously with ``tls-crypt-v2``.
This specification does not allow simultaneously using ``tls-crypt-v2`` and
connections without any control channel wrapping, because that would break DoS
resilience.

``WKc`` includes a length field, so we leave the option for future extension of
the P_CONTROL_HARD_RESET_CLIENT_V3 message open. (E.g. add payload to the
reset to indicate low-level protocol features.)

``tls-crypt-v2`` uses fixed crypto algorithms, because:

* The crypto is used before we can do any negotiation, so the algorithms have
  to be predefined.
* The crypto primitives are chosen conservatively, making problems with these
  primitives unlikely.
* Making anything configurable adds complexity, both in implementation and
  usage. We should not add any more complexity than is absolutely necessary.

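
As an illustration of two framing details described in this document (the
trailing length field of ``WKc`` and the TLV payload of the
P_CONTROL_HARD_RESET_V2 reply), here is a minimal Python sketch. It is not
taken from the OpenVPN sources. The function names, the minimum-length check,
the network byte order of the TLV header fields, and the reading of "Bit 1" as
the least significant bit (mask 0x0001) are all assumptions made for the
example.

```python
import struct
from typing import Dict, Tuple

TAG_LEN = 32              # HMAC-SHA256 output size in bytes (T in WKc)
LEN_FIELD = 2             # 16-bit length, network byte order
TLV_TYPE_FLAGS = 0x01     # "flags" TLV type from the text
FLAG_RESEND_WKC = 0x0001  # ASSUMPTION: "Bit 1" read as the least significant bit


def split_wkc(message: bytes) -> Tuple[bytes, bytes]:
    """Split a message into (remaining message, WKc) via the trailing length."""
    if len(message) < LEN_FIELD:
        raise ValueError("message too short to hold a WKc length field")
    (wkc_len,) = struct.unpack("!H", message[-LEN_FIELD:])
    # len covers all of WKc: T, the ciphertext and the length field itself.
    # ASSUMPTION: anything shorter than T plus the length field is rejected.
    if wkc_len < TAG_LEN + LEN_FIELD or wkc_len > len(message):
        raise ValueError("implausible WKc length")
    return message[:-wkc_len], message[-wkc_len:]


def parse_tlvs(payload: bytes) -> Dict[int, bytes]:
    """Parse (type uint16, length uint16, value) records into a dict."""
    tlvs: Dict[int, bytes] = {}
    offset = 0
    while offset < len(payload):
        if len(payload) - offset < 4:
            raise ValueError("truncated TLV header")
        # ASSUMPTION: both header fields use network byte order.
        tlv_type, tlv_len = struct.unpack_from("!HH", payload, offset)
        offset += 4
        if len(payload) - offset < tlv_len:
            raise ValueError("truncated TLV value")
        tlvs[tlv_type] = payload[offset:offset + tlv_len]
        offset += tlv_len
    return tlvs


def server_requests_wkc_resend(payload: bytes) -> bool:
    """Return True if the flags TLV asks the client to resend WKc."""
    flags = parse_tlvs(payload).get(TLV_TYPE_FLAGS)
    if flags is None or len(flags) != 2:
        return False
    (value,) = struct.unpack("!H", flags)
    return bool(value & FLAG_RESEND_WKC)
```

Where this sketch raises ``ValueError``, a server following this document
would drop the packet without sending an error response.
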

Potential ``tls-crypt-v2`` risks:

* Slightly more work on first connection (``WKc`` unwrap + hard reset unwrap)
  than with ``tls-crypt`` (hard reset unwrap) or ``tls-auth`` (hard reset
  auth).
* Flexible metadata allows mistakes. (So we should make it easy to do it
  right: provide tooling to create client keys based on cert serial + CA
  fingerprint, and provide a script that uses a CRL (if available) to drop
  revoked keys.)
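
The metadata layout from the Implementation section is small enough that
tooling can handle it mechanically. The following Python sketch encodes and
decodes the two metadata types and applies a maximum-age check to TIMESTAMP
metadata. It is illustrative only and not OpenVPN's implementation: the helper
names, the unsigned interpretation of the 64-bit timestamp, and the policy of
never treating USER metadata as expired are assumptions.

```python
import struct
import time
from typing import Optional, Tuple, Union

METADATA_TYPE_USER = 0x00       # user-defined free-form data
METADATA_TYPE_TIMESTAMP = 0x01  # 64-bit network order unix timestamp


def encode_user_metadata(data: bytes) -> bytes:
    """Prefix free-form data with the USER type byte."""
    return bytes([METADATA_TYPE_USER]) + data


def encode_timestamp_metadata(now: Optional[int] = None) -> bytes:
    """Prefix a key generation time with the TIMESTAMP type byte."""
    ts = int(time.time()) if now is None else now
    # ASSUMPTION: the timestamp is packed as an unsigned 64-bit integer.
    return bytes([METADATA_TYPE_TIMESTAMP]) + struct.pack("!Q", ts)


def decode_metadata(metadata: bytes) -> Tuple[int, Union[int, bytes]]:
    """Return (type, value); value is an int for TIMESTAMP, bytes for USER."""
    if not metadata:
        raise ValueError("empty metadata")
    mtype, body = metadata[0], metadata[1:]
    if mtype == METADATA_TYPE_TIMESTAMP:
        if len(body) != 8:
            raise ValueError("TIMESTAMP metadata must carry exactly 8 bytes")
        return mtype, struct.unpack("!Q", body)[0]
    if mtype == METADATA_TYPE_USER:
        return mtype, body
    raise ValueError("unknown metadata type")


def key_is_too_old(metadata: bytes, max_age_seconds: int, now: int) -> bool:
    """Return True if TIMESTAMP metadata is older than max_age_seconds."""
    mtype, value = decode_metadata(metadata)
    if mtype != METADATA_TYPE_TIMESTAMP:
        # ASSUMPTION (policy of this sketch): USER metadata never expires here.
        return False
    return now - value > max_age_seconds
```

A verify script built along these lines could, for example, call
``key_is_too_old(metadata, 365 * 86400, int(time.time()))`` and reject the
connection when it returns ``True``.
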