{ "metadata": { "name": "", "signature": "sha256:9da24394819694e63dd995f75a99df1a888d9b2a682c232c1b17784e5fc8e919" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Introducing IPFIX and the Python `ipfix` module\n", "\n", "IP Flow Information Export (IPFIX) (see [RFC7011](http://tools.ietf.org/html/7011)) is the IETF standard for export of network traffic flow data, and comprises:\n", "\n", "1. a unidirectional **protocol** for data export;\n", "2. a **data format** providing efficient record-level self-description for this protocol; and\n", "3. an **information model** providing the vocabulary for this data format.\n", "\n", "The Python IPFIX module (`pip install ipfix`) provides access to the data format and information model for bridging between IPFIX Messages and Python objects, and is useful for building implementations of the protocol, as well as for manipulating data stored in IPFIX files (see [RFC5655](http://tools.ietf.org/html/5655)). Documentation for the module is available on [GitHub](https://britram.github.io/python-ipfix). The package also contains an undocumented module for *visualizing* IPFIX Messages as SVG graphics representing bitfields. 
We'll use this functionality in this notebook to explore the structure of IPFIX Messages.\n", "\n", "First, execute the following code to import and define a few functions we'll need:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import ipfix\n", "import ipfix.vis\n", "\n", "from ipaddress import ip_address\n", "from datetime import datetime\n", "from datetime import timezone\n", "from IPython.display import SVG\n", "\n", "def iso8601(x):\n", "    return datetime.strptime(x, \"%Y-%m-%d %H:%M:%S.%f\")\n", "\n", "def draw_message(msg, length=256):\n", "    return SVG(ipfix.vis.MessageBufferRenderer(msg, raster=4).render(length=length))\n", "\n", "def draw_template(tmpl):\n", "    ofd = ipfix.vis.OctetFieldDrawing(raster=4)\n", "    ipfix.vis.draw_template(ofd, tmpl)\n", "    return SVG(ofd.render((90,30)))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "IPFIX identifies data record structures using Information Elements (IEs). The basic IEs are defined in the IANA [IPFIX Information Element Registry](http://www.iana.org/assignments/ipfix), and a mechanism for applying these IEs to export bidirectional flow data is defined in [RFC 5103](http://tools.ietf.org/html/5103). In order to work with these IEs, we'll need to tell the `ipfix` module to use them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ipfix.ie.use_iana_default()\n", "ipfix.ie.use_5103_default()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we're ready to begin.\n", "\n", "## Templates and Messages\n", "\n", "A Template is an ordered sequence of IEs that describes the structure of a type of Data Record. 
If you're familiar with relational databases, you can think of a Template as defining a table, and an IE as a defined column name with a standard meaning, such that records containing the same IEs describe the same type of data.\n", "\n", "Let's consider a template defining a simple IPv4 flow record, with start and end timestamps, source and destination addresses and ports, protocol identifier, octet and packet delta counts:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "tmpl = ipfix.template.for_specs(261, \"flowStartMilliseconds\", \n", " \"flowEndMilliseconds\", \n", " \"sourceIPv4Address\", \n", " \"destinationIPv4Address\",\n", " \"sourceTransportPort\",\n", " \"destinationTransportPort\",\n", " \"protocolIdentifier\",\n", " \"octetDeltaCount\", \n", " \"packetDeltaCount\")\n", "draw_template(tmpl)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "svg": [ "01230x00x40x80xc0x100x140x180x1c0x200x24ID: 261Count: 9IE: flowSt...onds(152)Len: 8IE: flowEn...onds(153)Len: 8IE: source...ress(8)Len: 4IE: destin...ress(12)Len: 4IE: source...Port(7)Len: 2IE: destin...Port(11)Len: 2IE: protoc...fier(4)Len: 1IE: octetD...ount(1)Len: 8IE: packet...ount(2)Len: 8" ], "text": [ "" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we see that the use of an IE registry, which maps these numbers to names, allows IPFIX to represent this type information very efficiently: the template consists of a 16-bit template ID, a 16-bit field count, and 16 bits for the ID and 16 bits for the length for each IE.\n", "\n", "To illustrate how this template is used to encode a record, let's add this template to a message, along with a record encoded using the template:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "msg = ipfix.message.MessageBuffer()\n", "msg.begin_export(8304) # Observation Domain ID\n", "msg.add_template(tmpl)\n", "msg.export_new_set(261) # The 
set ID here refers to the Template ID of the template\n", "msg.export_namedict({ 'flowStartMilliseconds': iso8601('2012-10-22 09:29:07.170000'),\n", " 'flowEndMilliseconds': iso8601('2012-10-22 09:29:33.916000'),\n", " 'sourceIPv4Address': ip_address('192.0.2.11'),\n", " 'destinationIPv4Address': ip_address('192.0.2.212'),\n", " 'sourceTransportPort': 32798,\n", " 'destinationTransportPort': 80,\n", " 'protocolIdentifier': 6,\n", " 'packetDeltaCount': 17,\n", " 'octetDeltaCount': 3329})\n", "draw_message(msg)\n" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "svg": [ "01230x00x40x80xc0x100x140x180x1c0x200x240x280x2c0x300x340x380x3c0x400x440x480x4c0x500x540x580x5c0x5d0x610x650x690x6dVersion: 10Length: 109Export Time: 2014-07-08 13:48:13.000000 Sequence: 0Observation Domain: 8304Set ID: 2Set Length: 44ID: 261Count: 9IE: flowSt...onds(152)Len: 8IE: flowEn...onds(153)Len: 8IE: source...ress(8)Len: 4IE: destin...ress(12)Len: 4IE: source...Port(7)Len: 2IE: destin...Port(11)Len: 2IE: protoc...fier(4)Len: 1IE: octetD...ount(1)Len: 8IE: packet...ount(2)Len: 8Set ID: 261Set Length: 49flowStar...iseconds: 2012-10-22 09:29:07.170000 flowEndMilliseconds: 2012-10-22 09:29:33.915999 sourceIPv4Address: 192.0.2.11destinat...4Address: 192.0.2.212sourceTr...Port: 32798destinat...Port: 80prot...fier: 6octetDeltaCount: 3329packetDeltaCount: 17" ], "text": [ "" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "First we see a message header, containing the **version** (10) and **length** of the message in bytes, followed by an **export time**, a **sequence number** (which here advertises that 0 records have been sent for this observation domain ID in this file or session), and an **observation domain ID**. 
Observation domains separate parts of a metering infrastructure into logical domains within which a given packet is reported as observed at most once, and map to sets of components coordinated in measurement of passing flows (e.g., a line card).\n", "\n", "Notice that the set ID matches the template ID, and the Information Elements in the Template appear in the same order as the fields in the Data Records in the Message; this is how IPFIX handles self-description.\n", "\n", "Templates are persistent within a session; subsequent messages can refer to templates sent in previous messages:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "msg.begin_export(8304)\n", "msg.export_new_set(261)\n", "msg.export_namedict({ 'flowStartMilliseconds': iso8601('2012-10-22 09:30:01.912000'),\n", " 'flowEndMilliseconds': iso8601('2012-10-22 09:31:15.009000'),\n", " 'sourceIPv4Address': ip_address('192.0.2.212'),\n", " 'destinationIPv4Address': ip_address('192.0.2.11'),\n", " 'sourceTransportPort': 80,\n", " 'destinationTransportPort': 32801,\n", " 'protocolIdentifier': 6,\n", " 'packetDeltaCount': 83,\n", " 'octetDeltaCount': 97501})\n", "msg.export_namedict({ 'flowStartMilliseconds': iso8601('2012-10-22 09:30:08.182000'),\n", " 'flowEndMilliseconds': iso8601('2012-10-22 09:31:16.012000'),\n", " 'sourceIPv4Address': ip_address('192.0.2.212'),\n", " 'destinationIPv4Address': ip_address('192.0.2.11'),\n", " 'sourceTransportPort': 80,\n", " 'destinationTransportPort': 32802,\n", " 'protocolIdentifier': 6,\n", " 'packetDeltaCount': 99,\n", " 'octetDeltaCount': 136172})\n", "draw_message(msg)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "svg": [ "01230x00x40x80xc0x100x140x180x1c0x200x240x280x2c0x300x310x350x390x3d0x410x450x490x4d0x510x550x590x5d0x5e0x620x660x6a0x6eVersion: 10Length: 110Export Time: 2014-07-08 13:48:18.000000 Sequence: 1Observation Domain: 8304Set ID: 261Set Length: 94flowStar...iseconds: 
2012-10-22 09:30:01.911999 flowEndMilliseconds: 2012-10-22 09:31:15.009000 sourceIPv4Address: 192.0.2.212destinat...4Address: 192.0.2.11sourceTr...Port: 80destinat...Port: 32801prot...fier: 6octetDeltaCount: 97501packetDeltaCount: 83flowStar...iseconds: 2012-10-22 09:30:08.181999 flowEndMilliseconds: 2012-10-22 09:31:16.012000 sourceIPv4Address: 192.0.2.212destinat...4Address: 192.0.2.11sourceTr...Port: 80destinat...Port: 32802prot...fier: 6octetDeltaCount: 136172packetDeltaCount: 99" ], "text": [ "" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To see how IPFIX encodes reduced-length and variable-length IEs, let's define a new template including the 802.11 SSID, encoded as a UTF-8 string. Here, note that the octetDeltaCount and packetDeltaCount IEs are exported using 4 bytes instead of the native 8, as well, to illustrate reduced-length encoding." ] }, { "cell_type": "code", "collapsed": false, "input": [ "vtmpl = ipfix.template.for_specs(262, \"flowStartMilliseconds\", \n", " \"flowEndMilliseconds\", \n", " \"sourceIPv6Address\", \n", " \"destinationIPv6Address\",\n", " \"octetDeltaCount[4]\", \n", " \"packetDeltaCount[4]\",\n", " \"wlanSSID\")\n", "msg.begin_export(8304)\n", "msg.add_template(vtmpl)\n", "msg.export_new_set(262)\n", "msg.export_namedict({'flowStartMilliseconds': iso8601('2012-10-22 09:31:54.903000'),\n", " 'flowEndMilliseconds': iso8601('2012-10-22 09:41:52.627000'),\n", " 'sourceIPv6Address': ip_address('2001:db8:c0:ffee::2'),\n", " 'destinationIPv6Address': ip_address('2001:bd8:b:ea75::3'),\n", " 'packetDeltaCount': 212,\n", " 'octetDeltaCount': 553290,\n", " 'wlanSSID': 'ietf-a-v6only'})\n", "draw_message(msg)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "svg": [ "01230x00x40x80xc0x100x140x180x1c0x200x240x280x2c0x300x340x380x3c0x400x440x480x4c0x500x540x580x5c0x600x640x680x6c0x700x710x750x790x7dVersion: 10Length: 126Export Time: 
2014-07-08 13:48:20.000000 Sequence: 3Observation Domain: 8304Set ID: 2Set Length: 36ID: 262Count: 7IE: flowSt...onds(152)Len: 8IE: flowEn...onds(153)Len: 8IE: source...ress(27)Len: 16IE: destin...ress(28)Len: 16IE: octetD...ount(1)Len: 4IE: packet...ount(2)Len: 4IE: wlanSSID(147)Len: 65535Set ID: 262Set Length: 74flowStar...iseconds: 2012-10-22 09:31:54.903000 flowEndMilliseconds: 2012-10-22 09:41:52.627000 sourceIPv6Address: 2001:db8:c0:ffee::2destinat...6Address: 2001:bd8:b:ea75::3octetDeltaCount: 553290packetDeltaCount: 212varlen: 13wlanSSID: ietf-a-v6only" ], "text": [ "" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we see the octetDeltaCount and packetDeltaCount fields taking up only 4 bytes as opposed to 8. In addition, note that the wlanSSID field is prefixed with a `varlen` byte, counting the number of subsequent bytes containing the value. IPFIX uses this length-prefixing since no delimiter can be guaranteed never to appear in a variable-length binary value.\n", "\n", "## Beyond Flow Export\n", "\n", "So far, we've explored the use of IPFIX for flow export; however, the protocol is useful for any application to which the following applies:\n", "\n", "- The application's data flow is fundamentally unidirectional. IPFIX is a \"push\" protocol, supporting only the export of information from a sender (an Exporting Process) to a receiver (a Collecting Process). Request-response interactions are not supported by IPFIX.\n", "- The application handles discrete event information, or information to be periodically reported. IPFIX is particularly well suited to representing events, which can be scoped in time.\n", "- The application handles information about network entities. 
IPFIX's information model is network-oriented, so network management applications have many opportunities for information model reuse.\n", "- The application requires a small number of arrangements of data structures relative to the number of records it handles. The template-driven self-description mechanism used by IPFIX excels at handling large volumes of identically structured data, compared to representations that define structure inline with data (such as XML).\n", "\n", "To take an example, let's use IPFIX to solve a common problem during meetings, conferences, and classes: meeting room temperature. Let's say we have a table mapping observation point identifiers to meeting room names, and use IPFIX to periodically export ambient temperature information for a given machine. In an ideal world, we'd write a proper description for the Information Element and send it to the Internet Assigned Numbers Authority (IANA), the body that maintains the Information Element Registry, as follows:\n", "\n", "```\n", "Name: ambientTemperatureCelsius\n", "Description: The ambient temperature measured in the environment. 
\n", " May be associated with an Observation Point or a Metering Process; \n", " otherwise, taken to be the ambient temperature in the environment \n", " of the Exporting Process.\n", "Abstract Data Type: float32\n", "Units: degrees Celsius\n", "Range: -273.15 - +infinity\n", "```\n", "\n", "But this is just a two-hour tutorial, so we'll define a new **enterprise-specific IE** instead:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ipfix.ie.for_spec(\"ambientTemperatureCelsius(35566/2)<float32>[4]\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "InformationElement('ambientTemperatureCelsius', 35566, 2, ipfix.types.for_name('float32'), 4)" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now, as above, a template, a record, and a message:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ttmpl = ipfix.template.for_specs(263, \"observationTimeMilliseconds\", \n", " \"observationPointId[4]\",\n", " \"ambientTemperatureCelsius\")\n", "msg = ipfix.message.MessageBuffer()\n", "msg.begin_export(8304)\n", "msg.add_template(ttmpl)\n", "msg.export_new_set(263)\n", "msg.export_namedict({'observationTimeMilliseconds': datetime.utcnow(),\n", " 'observationPointId': 1,\n", " 'ambientTemperatureCelsius': 22.3})\n", "draw_message(msg)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "svg": [ "01230x00x40x80xc0x100x140x180x1c0x200x240x280x2c0x300x340x38Version: 10Length: 60Export Time: 2014-07-08 13:48:23.000000 Sequence: 0Observation Domain: 8304Set ID: 2Set Length: 24ID: 263Count: 3IE: observ...onds(323)Len: 8IE: observ...ntId(138)Len: 4IE: ambien...sius(32770)Len: 4PEN: 35566Set ID: 263Set Length: 20observat...iseconds: 2014-07-08 13:48:23.489000 observationPointId: 1ambientT...eCelsius: 22.299999237060547" ], "text": [ "" ] } ], "prompt_number": 8 }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "Of course, physical measurements are more interesting if they're real. We've attached a cheap temperature-and-humidity sensor to a Raspberry Pi connected to the display laptop to demonstrate two things:\n", "\n", "1. IPFIX can be applied to any push-based event data export task, and\n", "2. the protocol itself is lightweight enough to implement anywhere you realistically have a TCP/IP stack.\n", "\n", "Here, we took an off-the-shelf user-space driver for the sensor attached to the GPIO pins, and added some C code to encode the result as IPFIX, attaching a static template and message header. This program doesn't even handle network communication; for that, we pipe its binary IPFIX output to `nc`, which acts as a TCP exporting process given an IPFIX message stream.\n", "\n", "This sensor handles relative humidity as well as temperature, so we'll need an IE for that, too:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ipfix.ie.for_spec(\"relativeHumidityPercent(35566/3)[4]\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we'll need a way to collect from an external exporting process. Recall that the `ipfix` module only handles the data format and the information model, not the full protocol stack. Fortunately, the protocol's dynamics are relatively simple: IPFIX over TCP, for instance, carries a stream of data more or less identical to what would be found in an IPFIX file. Therefore, we can use Python's `socketserver` module to build a quick-and-dirty IPFIX Collecting Process that draws SVG representations of the received messages and stores them for later display." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "import socketserver\n", "import ipfix.reader\n", "import threading\n", "\n", "msg_length = 512 # Maximum number of bytes per message to render\n", "svgbuf = []\n", "svgbuf_mtx = threading.Lock()\n", "\n", "class StreamRendererHandler(socketserver.StreamRequestHandler):\n", "    def handle(self):\n", "        global svgbuf\n", "        print(\"connection from \"+str(self.client_address)+\".\")\n", "        msr = ipfix.vis.MessageStreamRenderer(self.rfile, scale=(90,30), raster=4)\n", "\n", "        while True:\n", "            try:\n", "                # Render before taking the lock, so a failure cannot leave it held\n", "                svg = msr.render_next_message(msg_length)\n", "            except Exception:\n", "                break\n", "            # The with-block always releases the lock\n", "            with svgbuf_mtx:\n", "                svgbuf.append(svg)\n", "\n", "        print(\"connection from \"+str(self.client_address)+\" terminated.\")\n", "\n", "\n", "class ThreadingTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):\n", "    pass\n", "\n", "srv = None # shut down old server, if any, through loss of reference\n", "srv = ThreadingTCPServer((\"\", 4739), StreamRendererHandler)\n", "srvt = threading.Thread(target=srv.serve_forever)\n", "srvt.daemon = True\n", "srvt.start()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*If you're not attending the course, and don't have a Raspberry Pi with the appropriate sensor attached, run the following code to simulate the connected device:*" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pushd ../raspi\n", "!./run_sim.sh localhost 4739\n", "%popd" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once we're done receiving messages, we can shut the server down:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "srv.shutdown()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we can draw the messages buffered by the server:" ] }, { "cell_type": "code", "collapsed": 
false, "input": [ "SVG(svgbuf[0])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "SVG(svgbuf[1])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "SVG(svgbuf[2])" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- - - - - -\n", "This notebook is © 2013-2014 Brian Trammell, and is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/)." ] } ], "metadata": {} } ] }