Internet Engineering Task Force G. D'Amore, Ed. Internet-Draft Intended status: Informational M. Sustrik Expires: April 29, 2017 October 26, 2016 UDP Mapping for Scalability Protocols sp-udp-mapping-01 Abstract This document defines the UDP mapping for scalability protocols. Status of This Memo This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet- Drafts is at http://datatracker.ietf.org/drafts/current/. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." This Internet-Draft will expire on April 29, 2017. Copyright Notice Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. D'Amore & Sustrik Expires April 29, 2017 [Page 1] Internet-Draft UDP mapping for SPs October 2016 1. Underlying protocol This mapping should be layered directly on the top of UDP. There's no fixed UDP port to use for the communication. Instead, port numbers are assigned to individual services by the user. 2. Message delimitation Each UDP packet maps to exactly one SP message. There is no way to split one SP message into multiple UDP packets and therefore SP messages larger than existing path MTU will be dropped silently. There is also no way to pack multiple SP messages into a single UDP packet. 3. Packet layout Each packet consists of an 8-byte header followed by the opaque message payload: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 0x00 | 0x53 | 0x50 | opcode | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | type | reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | payload ... +-+-+-+-+-+-+-+-+-+-+-+-+- The first three bytes of the protocol header are used to make sure that the peer's protocol is compatible with the protocol used by the local endpoint. Keep in mind that this protocol is designed to run on an arbitrary UDP port, thus the standard compatibility check -- if it runs on port X and protocol Y is assigned to X by IANA, it speaks protocol Y -- does not apply. We have to use an alternative mechanism. The first three bytes of the protocol header MUST be set to 0x00, 0x53 and 0x50. If the protocol header received from the peer differs, the UDP packet MUST be ignored. The fact that the first byte of the protocol header is binary zero eliminates any text-based protocols that were accidentally sending to the endpoint. Subsequent two bytes make the check even more D'Amore & Sustrik Expires April 29, 2017 [Page 2] Internet-Draft UDP mapping for SPs October 2016 rigorous. At the same time they can be used as a debugging hint to indicate that the connection is supposed to use one of the scalability protocols -- ASCII representation of these bytes is 'SP' that can be easily spotted in when capturing the network traffic. 3.1. Opcode The fourth byte is an operation code (opcode), which is used to indicate a transport-specific type of operation. The following operations are defined: 0x00 -- DATA: The packet carries SP payload data. 0x01 -- CREQ: The packet requests (initiates) a logical connection. 0x02 -- CACK: The packet confirms creation of a logical connection. 0x03 -- DISC: The packet terminates a logical connection. It is intended that if the protocol must be revised, that additional op codes can be assigned, obviating the need for an explicit version field. These operation codes are used in the mangement of logical connections, and are described in more detail in the Logical Connections section below. 3.2. SP Type The fifth and sixth bytes of the header form a 16-bit unsigned integer in network byte order representing the type of SP endpoint on the layer above. The value SHOULD NOT be interpreted by the mapping, rather the interpretation should be delegated to the scalability protocol above the mapping. For informational purposes, it should be noted that the field encodes information such as SP protocol ID, protocol version and the role of endpoint within the protocol. Individual values are assigned by IANA. 3.3. Reserved Field Finally, the last two bytes of the protocol header are reserved for future use and must be set to binary zeroes. If the protocol header from the peer contains anything else than zeroes in this field, the implementation MUST ignore the UDP packet. D'Amore & Sustrik Expires April 29, 2017 [Page 3] Internet-Draft UDP mapping for SPs October 2016 3.4. Payload The packet header is followed by the opaque message payload which spans all the way to the end of the packet. The payload contents are the SP message content itself, except in the case of DISC messages. 4. Unicast Only This mapping operates only over Unicast UDP. Messages with source or destination addresses that are not unicast MUST be discarded. Furthermore, there is an assumption that bidirectional communication is possible. Various SP protocols have this requirement anyway. 5. Logical Connections This mapping of the SP protocols is layered on top of a connectionless transport layer. However the SP protocols require use of logical connections in order to pass "connection IDs" in some of the fields (such as the backtrace information used for the Request/ Reply Scalability Protocol. Hence it is desirable to create logical connections, and to be able to ensure that the connections are kept active either by monitoring activity passively, or by active probing. It is also desirable for a peer to be able to indicate that it's partner may release resources used for tracking the connection. The logical connection itself is still best-effort only, and packets may be corrupted, lost, or reordered. The packets are also subject to inspecion, modification, and replay, if the underlying IP transport is not secured. 5.1. Connection roles As with other transports like TCP, we intentionally create the notion of a connection initiator, and a connection accepter. The initator is always the party who initiates the connection, and is the party responsible for sending CREQ messages. The accepter is the party that receives the CREQ messages, and replies with CACK (or possibly DISC). The role of initiator and accepter do not change during the course of a single logical connection. D'Amore & Sustrik Expires April 29, 2017 [Page 4] Internet-Draft UDP mapping for SPs October 2016 5.2. Keeping the Connection Alive In order to keep session state, implementations will generally store data for each session. In order to prevent a stale session from consuming these resources forever, and in order to keep any intervening state active (e.g. NAT rules), a CACK message may be sent at any time by the initator. In the event of an error, the accepter MAY reply with an DISC message. Otherwise it MUST reply with a CREQ message. An accepter MUST NOT send a CACK message. The initiator MUST send CREQ messages every T1 seconds. An accepter that has not received a CREQ message for at least T2 seconds, or an initiator that has not received a CACK message for at least T2 seconds, SHOULD assume the connection has failed, and and may discard any state associated with the logical connection. It is recommended that T1 and T2 be configurable, with recommended default values of 30 and 300. This means that sessions that go inexplicably silent for 5 minutes will be closed, and their resources reclaimed. Note that values for T1 and T2 must match on both initiator and accepter for correct operation. Failure to do so will likely result in permature connection termination. 6. Message Types 6.1. DATA messages DATA messages (opcode 0x00) are used on an established logical, connection and, and carry SP messages. The payload part of the message is the SP payload. Data messages are unacknowledged. A DATA message that is received by a party which does not have a corresponding logical connection set up SHOULD be replied to with a DISC message, using error code 0x04 (not connected). DATA messages may only be sent on an active connection, and may be sent at any time by either party. D'Amore & Sustrik Expires April 29, 2017 [Page 5] Internet-Draft UDP mapping for SPs October 2016 6.2. CREQ messages CREQ messages are used to establish a logical connection. The initiator sends a CREQ message to the listener. On success, the accepter will record the full UDP source and destination addresses (including IP addresses and port numbers), creating a local entry in a list of logical connections. If this is successful, the accepter MUST respond with an CACK message. Otherwise it SHOULD respond with an appropriately formed DISC message. CREQ messages have no payload. A CREQ message may be sent at any time, and is idempotent. (If the receiver of a CREQ already has the logical connection established, it need simply reply with the CACK. 6.3. CACK messages CACK messages are sent in response to a CREQ, when the connection is either already established, or after a new connection is established. As noted above, CREQ is meant to be idempotent, and indeed may be sent periodically to keep a connection from idling out, and so a CACK message is also expected to be sent over the network periodically. 6.4. DISC messages DISC messages are sent by one side of a logical connection to terminate the logical connection. This notification allows the remote partner to either terminate an existing connection, such as when it is shutting down, or to reject a connection attempt made with CREQ. DISC messages carry in their payload, a single byte indicating a reason for the disconnect, along with a human readable reason as an ASCII byte array. (This MAY not be present, and SHOULD NOT be zero terminated.) The following reasons for DISC are defined: 0x00 - normal close of a socket or endpoint 0x01 - connection rejected 0x02 - SP protocol invalid 0x03 - invalid address (such as multicast) 0x04 - not connected D'Amore & Sustrik Expires April 29, 2017 [Page 6] Internet-Draft UDP mapping for SPs October 2016 0xff - Other error 7. Latency Considerations A full round trip is required to establish a logical connection. However, a particularly optimistic initiator could send a CREQ and a DATA message in rapid succession, and hope that the DATA was received. Since these protocols are best effort only, this might not be a terrible thing to do. A future extension might be to offer a new opcode, allowing DATA to be combined with a CREQ or CACK. However, given that typical uses of these protocols are for long-lived use cases, there does not seem to be any particular urgency in doing this. 8. IANA Considerations This memo includes no request to IANA. 9. Security Considerations The mapping isn't intended to provide any additional security in addition to UDP. If this is a concern, it is recommended that IPsec or similar approaches be used to secure the underlying transport. Authors' Addresses Garrett D'Amore (editor) Email: garrett@damore.org Martin Sustrik Email: sustrik@250bpm.com D'Amore & Sustrik Expires April 29, 2017 [Page 7]