{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# NFStream: a Flexible Network Data Analysis Framework" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6.5.3\n" ] } ], "source": [ "import nfstream\n", "print(nfstream.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[**NFStream**][repo] is a multiplatform Python framework providing fast, flexible, and expressive data structures designed to make \n", "working with **online** or **offline** network data easy and intuitive. It aims to be Python's fundamental high-level \n", "building block for doing practical, **real-world** network flow data analysis. Additionally, it has the broader \n", "goal of becoming **a unifying network data analytics framework for researchers** providing data reproducibility \n", "across experiments.\n", "\n", "* **Performance:** NFStream is designed to be fast: [**AF_PACKET_V3/FANOUT**][packet] on Linux, multiprocessing, a native \n", "[**CFFI-based**][cffi] computation engine, and full [**PyPy**][pypy] support.\n", "* **Encrypted layer-7 visibility:** NFStream deep packet inspection is based on [**nDPI**][ndpi]. \n", "It allows NFStream to perform [**reliable**][reliable] encrypted application identification and metadata \n", "fingerprinting (e.g. TLS, SSH, DHCP, HTTP).\n", "* **System visibility:** NFStream probes the monitored system's kernel to obtain information on open Internet sockets \n", "and collects guaranteed ground-truth (process name, PID, etc.) at the application level.\n", "* **Statistical feature extraction:** NFStream provides state-of-the-art flow-based statistical feature extraction. \n", "It includes post-mortem statistical features (e.g., minimum, mean, standard deviation, and maximum of packet size and \n", "inter-arrival time) and early flow features (e.g. 
sequence of the first n packet sizes, inter-arrival times, and directions).\n", "* **Flexibility:** NFStream is easily extensible using [**NFPlugins**][nfplugin]. It allows the creation of a new flow \n", "feature within a few lines of Python.\n", "* **Machine Learning oriented:** NFStream aims to make machine learning approaches for network traffic management \n", "reproducible and deployable. By using NFStream as a common framework, researchers ensure that models are trained using \n", "the same feature computation logic, and thus, a fair comparison is possible. Moreover, trained models can be deployed \n", "and evaluated on live networks using [**NFPlugins**][nfplugin]. \n", "\n", "\n", "In this notebook, we demonstrate a subset of features provided by [**NFStream**][repo].\n", "\n", "[ndpi]: https://github.com/ntop/nDPI\n", "[nfplugin]: https://nfstream.github.io/docs/api#nfplugin\n", "[reliable]: http://people.ac.upc.edu/pbarlet/papers/ground-truth.pam2014.pdf\n", "[repo]: https://nfstream.org/\n", "[pypy]: https://www.pypy.org/\n", "[cffi]: https://cffi.readthedocs.io/en/latest/index.html" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from nfstream import NFStreamer, NFPlugin\n", "import pandas as pd\n", "pd.set_option('display.max_columns', 500)\n", "pd.set_option('display.max_rows', 500)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Flow aggregation made simple" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following, we are going to use the main object provided by nfstream, `NFStreamer`, which has the following parameters:\n", "\n", "* `source` [default=None]: Packet capture source. 
Pcap file path or network interface name.\n", "* `decode_tunnels` [default=True]: Enable/Disable GTP/TZSP tunnels decoding.\n", "* `bpf_filter` [default=None]: Specify a [BPF filter][bpf] to select the traffic to capture.\n", "* `promiscuous_mode` [default=True]: Enable/Disable promiscuous capture mode.\n", "* `snapshot_length` [default=1500]: Control packet slicing size (truncation) in bytes.\n", "* `idle_timeout` [default=120]: Flows that are idle (no packets received) for more than this value in seconds are expired.\n", "* `active_timeout` [default=1800]: Flows that are active for more than this value in seconds are expired.\n", "* `accounting_mode` [default=0]: Specify the accounting mode that will be used to report bytes related features (0: Link layer, 1: IP layer, 2: Transport layer, 3: Payload).\n", "* `udps` [default=None]: Specify user defined NFPlugins used to extend NFStreamer.\n", "* `n_dissections` [default=20]: Number of per flow packets to dissect for L7 visibility feature. When set to 0, L7 visibility feature is disabled.\n", "* `statistical_analysis` [default=False]: Enable/Disable post-mortem flow statistical analysis.\n", "* `splt_analysis` [default=0]: Specify the sequence of first packets length for early statistical analysis. When set to 0, splt_analysis is disabled.\n", "* `max_nflows` [default=0]: Specify the maximum number of flows to capture before returning. Unset when equal to 0.\n", "* `n_meters` [default=0]: Specify the number of parallel metering processes. When set to 0, NFStreamer will automatically scale metering according to available physical cores on the running host.\n", "* `performance_report` [default=0]: [**Performance report**](https://github.com/nfstream/nfstream/blob/master/assets/PERFORMANCE_REPORT.md) interval in seconds. Disabled when set to 0. 
Ignored for offline capture.\n", "* `system_visibility_mode` [default=0]: Enable system process mapping by probing the host machine.\n", "* `system_visibility_poll_ms` [default=100]: Set the polling interval in milliseconds for system process mapping feature (0 is the maximum achievable rate).\n", "\n", "`NFStreamer` returns a flow iterator. We can iterate over the flows or convert them directly to a pandas DataFrame using the `to_pandas()` method.\n", "\n", "[bpf]: https://biot.com/capstats/bpf.html" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "df = NFStreamer(source=\"pcap/instagram.pcap\").to_pandas()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id expiration_id src_ip src_mac src_oui src_port \\\n", "0 0 0 192.168.0.103 40:f3:08:c3:8e:e1 40:f3:08 33936 \n", "1 1 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "2 2 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "3 3 0 192.168.0.1 00:1b:2f:f0:7e:b4 00:1b:2f 520 \n", "4 4 0 192.168.0.103 00:00:00:00:00:00 00:00:00 0 \n", "\n", " dst_ip dst_mac dst_oui dst_port protocol \\\n", "0 31.13.93.52 00:1b:2f:f0:7e:b4 00:1b:2f 443 6 \n", "1 255.255.255.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "2 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "3 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 520 17 \n", "4 192.168.0.103 00:00:00:00:00:00 00:00:00 0 1 \n", "\n", " ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 4 0 0 1436720898386 \n", "1 4 0 0 1436720906017 \n", "2 4 0 0 1436720906022 \n", "3 4 0 0 1436720906025 \n", "4 4 0 0 1436720908464 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1436720908442 10056 \n", "1 1436720906024 7 \n", "2 1436720906022 0 \n", "3 1436720906025 0 \n", "4 1436720911139 2675 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 68 45688 1436720898386 \n", "1 4 580 1436720906017 \n", "2 1 145 1436720906022 \n", "3 1 66 1436720906025 \n", "4 5 510 1436720908464 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1436720908442 10056 34 5555 \n", "1 1436720906024 7 4 580 \n", "2 1436720906022 0 1 145 \n", "3 1436720906025 0 1 66 \n", "4 1436720911139 2675 5 510 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1436720898475 1436720908442 9967 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_packets dst2src_bytes application_name application_category_name \\\n", "0 34 40133 TLS Web \n", "1 0 0 Dropbox Cloud \n", "2 0 0 Dropbox Cloud \n", "3 0 0 Unknown Unspecified \n", "4 0 0 ICMP Network \n", "\n", " 
 application_is_guessed application_confidence requested_server_name \\\n", "0 0 6 NaN \n", "1 0 6 NaN \n", "2 0 6 NaN \n", "3 0 0 NaN \n", "4 0 6 NaN \n", "\n", " client_fingerprint server_fingerprint user_agent content_type \n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(38, 38)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can enable post-mortem statistical flow feature extraction as follows:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df = NFStreamer(source=\"pcap/instagram.pcap\", statistical_analysis=True).to_pandas()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id expiration_id src_ip src_mac src_oui src_port \\\n", "0 0 0 192.168.0.103 40:f3:08:c3:8e:e1 40:f3:08 33936 \n", "1 1 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "2 2 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "3 3 0 192.168.0.1 00:1b:2f:f0:7e:b4 00:1b:2f 520 \n", "4 4 0 192.168.0.103 00:00:00:00:00:00 00:00:00 0 \n", "\n", " dst_ip dst_mac dst_oui dst_port protocol \\\n", "0 31.13.93.52 00:1b:2f:f0:7e:b4 00:1b:2f 443 6 \n", "1 255.255.255.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "2 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "3 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 520 17 \n", "4 192.168.0.103 00:00:00:00:00:00 00:00:00 0 1 \n", "\n", " ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 4 0 0 1436720898386 \n", "1 4 0 0 1436720906017 \n", "2 4 0 0 1436720906022 \n", "3 4 0 0 1436720906025 \n", "4 4 0 0 1436720908464 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1436720908442 10056 \n", "1 1436720906024 7 \n", "2 1436720906022 0 \n", "3 1436720906025 0 \n", "4 1436720911139 2675 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 68 45688 1436720898386 \n", "1 4 580 1436720906017 \n", "2 1 145 1436720906022 \n", "3 1 66 1436720906025 \n", "4 5 510 1436720908464 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1436720908442 10056 34 5555 \n", "1 1436720906024 7 4 580 \n", "2 1436720906022 0 1 145 \n", "3 1436720906025 0 1 66 \n", "4 1436720911139 2675 5 510 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1436720898475 1436720908442 9967 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_packets dst2src_bytes bidirectional_min_ps \\\n", "0 34 40133 66 \n", "1 0 0 145 \n", "2 0 0 145 \n", "3 0 0 66 \n", "4 0 0 102 \n", "\n", " bidirectional_mean_ps bidirectional_stddev_ps bidirectional_max_ps \\\n", "0 671.882353 
661.76184 1464 \n", "1 145.000000 0.00000 145 \n", "2 145.000000 0.00000 145 \n", "3 66.000000 0.00000 66 \n", "4 102.000000 0.00000 102 \n", "\n", " src2dst_min_ps src2dst_mean_ps src2dst_stddev_ps src2dst_max_ps \\\n", "0 66 163.382353 322.650107 1431 \n", "1 145 145.000000 0.000000 145 \n", "2 145 145.000000 0.000000 145 \n", "3 66 66.000000 0.000000 66 \n", "4 102 102.000000 0.000000 102 \n", "\n", " dst2src_min_ps dst2src_mean_ps dst2src_stddev_ps dst2src_max_ps \\\n", "0 66 1180.382353 502.204535 1464 \n", "1 0 0.000000 0.000000 0 \n", "2 0 0.000000 0.000000 0 \n", "3 0 0.000000 0.000000 0 \n", "4 0 0.000000 0.000000 0 \n", "\n", " bidirectional_min_piat_ms bidirectional_mean_piat_ms \\\n", "0 0 150.089552 \n", "1 1 2.333333 \n", "2 0 0.000000 \n", "3 0 0.000000 \n", "4 0 668.750000 \n", "\n", " bidirectional_stddev_piat_ms bidirectional_max_piat_ms \\\n", "0 951.791862 7669 \n", "1 1.527525 4 \n", "2 0.000000 0 \n", "3 0.000000 0 \n", "4 1173.672122 2420 \n", "\n", " src2dst_min_piat_ms src2dst_mean_piat_ms src2dst_stddev_piat_ms \\\n", "0 0 304.727273 1349.724098 \n", "1 1 2.333333 1.527525 \n", "2 0 0.000000 0.000000 \n", "3 0 0.000000 0.000000 \n", "4 0 668.750000 1173.672122 \n", "\n", " src2dst_max_piat_ms dst2src_min_piat_ms dst2src_mean_piat_ms \\\n", "0 7669 0 302.030303 \n", "1 4 0 0.000000 \n", "2 0 0 0.000000 \n", "3 0 0 0.000000 \n", "4 2420 0 0.000000 \n", "\n", " dst2src_stddev_piat_ms dst2src_max_piat_ms bidirectional_syn_packets \\\n", "0 1358.385703 7709 0 \n", "1 0.000000 0 0 \n", "2 0.000000 0 0 \n", "3 0.000000 0 0 \n", "4 0.000000 0 0 \n", "\n", " bidirectional_cwr_packets bidirectional_ece_packets \\\n", "0 0 0 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_urg_packets bidirectional_ack_packets \\\n", "0 0 68 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_psh_packets bidirectional_rst_packets \\\n", "0 10 0 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " 
 bidirectional_fin_packets src2dst_syn_packets src2dst_cwr_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " src2dst_ece_packets src2dst_urg_packets src2dst_ack_packets \\\n", "0 0 0 34 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " src2dst_psh_packets src2dst_rst_packets src2dst_fin_packets \\\n", "0 3 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_syn_packets dst2src_cwr_packets dst2src_ece_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_urg_packets dst2src_ack_packets dst2src_psh_packets \\\n", "0 0 34 7 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_rst_packets dst2src_fin_packets application_name \\\n", "0 0 0 TLS \n", "1 0 0 Dropbox \n", "2 0 0 Dropbox \n", "3 0 0 Unknown \n", "4 0 0 ICMP \n", "\n", " application_category_name application_is_guessed application_confidence \\\n", "0 Web 0 6 \n", "1 Cloud 0 6 \n", "2 Cloud 0 6 \n", "3 Unspecified 0 0 \n", "4 Network 0 6 \n", "\n", " requested_server_name client_fingerprint server_fingerprint user_agent \\\n", "0 NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN \n", "\n", " content_type \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can enable early statistical flow feature extraction as follows:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "df = NFStreamer(source=\"pcap/instagram.pcap\", splt_analysis=10).to_pandas()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id expiration_id src_ip src_mac src_oui src_port \\\n", "0 0 0 192.168.0.103 40:f3:08:c3:8e:e1 40:f3:08 33936 \n", "1 1 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "2 2 0 192.168.0.106 00:16:44:1f:59:66 00:16:44 17500 \n", "3 3 0 192.168.0.1 00:1b:2f:f0:7e:b4 00:1b:2f 520 \n", "4 4 0 192.168.0.103 40:f3:08:c3:8e:e1 40:f3:08 38816 \n", "\n", " dst_ip dst_mac dst_oui dst_port protocol \\\n", "0 31.13.93.52 00:1b:2f:f0:7e:b4 00:1b:2f 443 6 \n", "1 255.255.255.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "2 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 17500 17 \n", "3 192.168.0.255 ff:ff:ff:ff:ff:ff ff:ff:ff 520 17 \n", "4 46.33.70.160 00:1b:2f:f0:7e:b4 00:1b:2f 80 6 \n", "\n", " ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 4 0 0 1436720898386 \n", "1 4 0 0 1436720906017 \n", "2 4 0 0 1436720906022 \n", "3 4 0 0 1436720906025 \n", "4 4 0 0 1436720900684 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1436720908442 10056 \n", "1 1436720906024 7 \n", "2 1436720906022 0 \n", "3 1436720906025 0 \n", "4 1436720900750 66 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 68 45688 1436720898386 \n", "1 4 580 1436720906017 \n", "2 1 145 1436720906022 \n", "3 1 66 1436720906025 \n", "4 52 58994 1436720900684 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1436720908442 10056 34 5555 \n", "1 1436720906024 7 4 580 \n", "2 1436720906022 0 1 145 \n", "3 1436720906025 0 1 66 \n", "4 1436720900750 66 13 1118 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1436720898475 1436720908442 9967 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 1436720900716 1436720900744 28 \n", "\n", " dst2src_packets dst2src_bytes splt_direction \\\n", "0 34 40133 [0, 1, 1, 0, 0, 1, 1, 0, 1, 0] \n", "1 0 0 [0, 0, 0, 0, -1, -1, -1, -1, -1, -1] \n", "2 0 0 [0, -1, -1, -1, -1, -1, -1, -1, -1, -1] \n", "3 0 0 [0, 
-1, -1, -1, -1, -1, -1, -1, -1, -1] \n", "4 39 57876 [0, 1, 0, 1, 1, 1, 1, 1, 1, 1] \n", "\n", " splt_ps \\\n", "0 [1431, 66, 679, 66, 1063, 66, 1464, 66, 209, 66] \n", "1 [145, 145, 145, 145, -1, -1, -1, -1, -1, -1] \n", "2 [145, -1, -1, -1, -1, -1, -1, -1, -1, -1] \n", "3 [66, -1, -1, -1, -1, -1, -1, -1, -1, -1] \n", "4 [326, 1484, 66, 1484, 1484, 1484, 1484, 1484, ... \n", "\n", " splt_piat_ms application_name \\\n", "0 [0, 89, 76, 0, 1523, 50, 340, 0, 2, 0] TLS \n", "1 [0, 2, 1, 4, -1, -1, -1, -1, -1, -1] Dropbox \n", "2 [0, -1, -1, -1, -1, -1, -1, -1, -1, -1] Dropbox \n", "3 [0, -1, -1, -1, -1, -1, -1, -1, -1, -1] Unknown \n", "4 [0, 32, 1, 0, 1, 2, 2, 0, 0, 0] HTTP.Instagram \n", "\n", " application_category_name application_is_guessed application_confidence \\\n", "0 Web 0 6 \n", "1 Cloud 0 6 \n", "2 Cloud 0 6 \n", "3 Unspecified 0 0 \n", "4 SocialNetwork 0 6 \n", "\n", " requested_server_name client_fingerprint server_fingerprint \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 photos-h.ak.instagram.com NaN NaN \n", "\n", " user_agent content_type \n", "0 NaN NaN \n", "1 NaN NaN \n", "2 NaN NaN \n", "3 NaN NaN \n", "4 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can anonymize selected columns (here, the IP and MAC addresses) as follows:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "df = NFStreamer(source=\"pcap/instagram.pcap\", \n", " statistical_analysis=True).to_pandas(columns_to_anonymize=[\"src_ip\", \"src_mac\", \"dst_ip\", \"dst_mac\"])" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idexpiration_idsrc_ipsrc_macsrc_ouisrc_portdst_ipdst_macdst_ouidst_portprotocolip_versionvlan_idtunnel_idbidirectional_first_seen_msbidirectional_last_seen_msbidirectional_duration_msbidirectional_packetsbidirectional_bytessrc2dst_first_seen_mssrc2dst_last_seen_mssrc2dst_duration_mssrc2dst_packetssrc2dst_bytesdst2src_first_seen_msdst2src_last_seen_msdst2src_duration_msdst2src_packetsdst2src_bytesbidirectional_min_psbidirectional_mean_psbidirectional_stddev_psbidirectional_max_pssrc2dst_min_pssrc2dst_mean_pssrc2dst_stddev_pssrc2dst_max_psdst2src_min_psdst2src_mean_psdst2src_stddev_psdst2src_max_psbidirectional_min_piat_msbidirectional_mean_piat_msbidirectional_stddev_piat_msbidirectional_max_piat_mssrc2dst_min_piat_mssrc2dst_mean_piat_mssrc2dst_stddev_piat_mssrc2dst_max_piat_msdst2src_min_piat_msdst2src_mean_piat_msdst2src_stddev_piat_msdst2src_max_piat_msbidirectional_syn_packetsbidirectional_cwr_packetsbidirectional_ece_packetsbidirectional_urg_packetsbidirectional_ack_packetsbidirectional_psh_packetsbidirectional_rst_packetsbidirectional_fin_packetssrc2dst_syn_packetssrc2dst_cwr_packetssrc2dst_ece_packetssrc2dst_urg_packetssrc2dst_ack_packetssrc2dst_psh_packetssrc2dst_rst_packetssrc2dst_fin_packetsdst2src_syn_packetsdst2src_cwr_packetsdst2src_ece_packetsdst2src_urg_packetsdst2src_ack_packetsdst2src_psh_packetsdst2src_rst_packetsdst2src_fin_packetsapplication_nameapplication_category_nameapplication_is_guessedapplication_confidencerequested_server_nameclient_fingerprintserver_fingerprintuser_agentcontent_type
0005885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:08579363a44c94fd7c9aefa07df278f016460d75aa809c94a571c...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f8064001436720900687143672090120051358502201436720900687143672090120051324183714367209007441436720901200456344838366865.862069696.73948514846676.54166751.6434093191861423.029412252.360311148409.00000045.124035321022.30434870.131976322013.81818258.73109323000058400000024100000034300HTTP.InstagramSocialNetwork06photos-g.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN
1105885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0838816c26866c915aaa410921c4fc309477eb0ceba2caec77bcf...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f80640014367209006841436720900750665258994143672090068414367209007506613111814367209007161436720900744283957876661134.500000612.25777914846686.00000072.11102632614841484.0000000.000000148401.2941184.4957503205.5000009.1701103300.7368420.685142000052100000013100000039000HTTP.InstagramSocialNetwork06photos-h.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN
2205885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:083735068df6e56301d6c238c302eaa732d2075ef802914771e7b...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f8064001436720901262143672090126201324143672090126214367209012620132400000324324.0000000.000000324324324.0000000.00000032400.0000000.000000000.0000000.000000000.0000000.000000000.0000000.000000000011000000110000000000HTTP.InstagramSocialNetwork06photos-a.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN
3305885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0833603107d5f2f69c5e2f3da41be7ce2a59c0d818947212d6ef0...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f531740014367209085241436720908575512298143672090852414367209085240189143672090857514367209085750120989149.00000084.8528142098989.0000000.00000089209209.0000000.0000002095151.0000000.0000005100.0000000.000000000.0000000.000000000000000000000000000000DNS.InstagramNetwork06igcdn-photos-a-a.akamaihd.netNaNNaNNaNNaN
440d7feda5309e4f8477aac71903e83486f9f13566cc836f8...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f805885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0840855640014367209526111436720952611021401436720952611143672095261101741436720952611143672095261101666670.0000005.656854747474.0000000.000000746666.0000000.0000006600.0000000.000000000.0000000.000000000.0000000.000000100020001000100000001000HTTPWeb11NaNNaNNaNNaNNaN
\n", "
" ], "text/plain": [ " id expiration_id src_ip \\\n", "0 0 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "1 1 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "2 2 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "3 3 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "4 4 0 d7feda5309e4f8477aac71903e83486f9f13566cc836f8... \n", "\n", " src_mac src_oui src_port \\\n", "0 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 57936 \n", "1 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 38816 \n", "2 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 37350 \n", "3 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 33603 \n", "4 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "\n", " dst_ip \\\n", "0 3a44c94fd7c9aefa07df278f016460d75aa809c94a571c... \n", "1 c26866c915aaa410921c4fc309477eb0ceba2caec77bcf... \n", "2 68df6e56301d6c238c302eaa732d2075ef802914771e7b... \n", "3 107d5f2f69c5e2f3da41be7ce2a59c0d818947212d6ef0... \n", "4 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "\n", " dst_mac dst_oui dst_port \\\n", "0 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "1 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "2 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "3 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 53 \n", "4 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 
40:f3:08 40855 \n", "\n", " protocol ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 6 4 0 0 1436720900687 \n", "1 6 4 0 0 1436720900684 \n", "2 6 4 0 0 1436720901262 \n", "3 17 4 0 0 1436720908524 \n", "4 6 4 0 0 1436720952611 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1436720901200 513 \n", "1 1436720900750 66 \n", "2 1436720901262 0 \n", "3 1436720908575 51 \n", "4 1436720952611 0 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 58 50220 1436720900687 \n", "1 52 58994 1436720900684 \n", "2 1 324 1436720901262 \n", "3 2 298 1436720908524 \n", "4 2 140 1436720952611 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1436720901200 513 24 1837 \n", "1 1436720900750 66 13 1118 \n", "2 1436720901262 0 1 324 \n", "3 1436720908524 0 1 89 \n", "4 1436720952611 0 1 74 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1436720900744 1436720901200 456 \n", "1 1436720900716 1436720900744 28 \n", "2 0 0 0 \n", "3 1436720908575 1436720908575 0 \n", "4 1436720952611 1436720952611 0 \n", "\n", " dst2src_packets dst2src_bytes bidirectional_min_ps \\\n", "0 34 48383 66 \n", "1 39 57876 66 \n", "2 0 0 324 \n", "3 1 209 89 \n", "4 1 66 66 \n", "\n", " bidirectional_mean_ps bidirectional_stddev_ps bidirectional_max_ps \\\n", "0 865.862069 696.739485 1484 \n", "1 1134.500000 612.257779 1484 \n", "2 324.000000 0.000000 324 \n", "3 149.000000 84.852814 209 \n", "4 70.000000 5.656854 74 \n", "\n", " src2dst_min_ps src2dst_mean_ps src2dst_stddev_ps src2dst_max_ps \\\n", "0 66 76.541667 51.643409 319 \n", "1 66 86.000000 72.111026 326 \n", "2 324 324.000000 0.000000 324 \n", "3 89 89.000000 0.000000 89 \n", "4 74 74.000000 0.000000 74 \n", "\n", " dst2src_min_ps dst2src_mean_ps dst2src_stddev_ps dst2src_max_ps \\\n", "0 186 1423.029412 252.360311 1484 \n", "1 1484 1484.000000 0.000000 1484 \n", "2 0 0.000000 0.000000 0 \n", 
"3 209 209.000000 0.000000 209 \n", "4 66 66.000000 0.000000 66 \n", "\n", " bidirectional_min_piat_ms bidirectional_mean_piat_ms \\\n", "0 0 9.000000 \n", "1 0 1.294118 \n", "2 0 0.000000 \n", "3 51 51.000000 \n", "4 0 0.000000 \n", "\n", " bidirectional_stddev_piat_ms bidirectional_max_piat_ms \\\n", "0 45.124035 321 \n", "1 4.495750 32 \n", "2 0.000000 0 \n", "3 0.000000 51 \n", "4 0.000000 0 \n", "\n", " src2dst_min_piat_ms src2dst_mean_piat_ms src2dst_stddev_piat_ms \\\n", "0 0 22.304348 70.131976 \n", "1 0 5.500000 9.170110 \n", "2 0 0.000000 0.000000 \n", "3 0 0.000000 0.000000 \n", "4 0 0.000000 0.000000 \n", "\n", " src2dst_max_piat_ms dst2src_min_piat_ms dst2src_mean_piat_ms \\\n", "0 322 0 13.818182 \n", "1 33 0 0.736842 \n", "2 0 0 0.000000 \n", "3 0 0 0.000000 \n", "4 0 0 0.000000 \n", "\n", " dst2src_stddev_piat_ms dst2src_max_piat_ms bidirectional_syn_packets \\\n", "0 58.73109 323 0 \n", "1 0.68514 2 0 \n", "2 0.00000 0 0 \n", "3 0.00000 0 0 \n", "4 0.00000 0 1 \n", "\n", " bidirectional_cwr_packets bidirectional_ece_packets \\\n", "0 0 0 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_urg_packets bidirectional_ack_packets \\\n", "0 0 58 \n", "1 0 52 \n", "2 0 1 \n", "3 0 0 \n", "4 0 2 \n", "\n", " bidirectional_psh_packets bidirectional_rst_packets \\\n", "0 4 0 \n", "1 1 0 \n", "2 1 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_fin_packets src2dst_syn_packets src2dst_cwr_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 1 0 \n", "\n", " src2dst_ece_packets src2dst_urg_packets src2dst_ack_packets \\\n", "0 0 0 24 \n", "1 0 0 13 \n", "2 0 0 1 \n", "3 0 0 0 \n", "4 0 0 1 \n", "\n", " src2dst_psh_packets src2dst_rst_packets src2dst_fin_packets \\\n", "0 1 0 0 \n", "1 1 0 0 \n", "2 1 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_syn_packets dst2src_cwr_packets dst2src_ece_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_urg_packets 
dst2src_ack_packets dst2src_psh_packets \\\n", "0 0 34 3 \n", "1 0 39 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 1 0 \n", "\n", " dst2src_rst_packets dst2src_fin_packets application_name \\\n", "0 0 0 HTTP.Instagram \n", "1 0 0 HTTP.Instagram \n", "2 0 0 HTTP.Instagram \n", "3 0 0 DNS.Instagram \n", "4 0 0 HTTP \n", "\n", " application_category_name application_is_guessed application_confidence \\\n", "0 SocialNetwork 0 6 \n", "1 SocialNetwork 0 6 \n", "2 SocialNetwork 0 6 \n", "3 Network 0 6 \n", "4 Web 1 1 \n", "\n", " requested_server_name client_fingerprint server_fingerprint \\\n", "0 photos-g.ak.instagram.com NaN NaN \n", "1 photos-h.ak.instagram.com NaN NaN \n", "2 photos-a.ak.instagram.com NaN NaN \n", "3 igcdn-photos-a-a.akamaihd.net NaN NaN \n", "4 NaN NaN NaN \n", "\n", " user_agent content_type \n", "0 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "1 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "2 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "3 NaN NaN \n", "4 NaN NaN " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have our DataFrame, we can analyze it like any other data. For example, we can compute additional features:\n", "\n", "* Compute the data ratio in both directions (src2dst and dst2src)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "df[\"src2dst_bytes_data_ratio\"] = df['src2dst_bytes'] / df['bidirectional_bytes']\n", "df[\"dst2src_bytes_data_ratio\"] = df['dst2src_bytes'] / df['bidirectional_bytes']" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idexpiration_idsrc_ipsrc_macsrc_ouisrc_portdst_ipdst_macdst_ouidst_portprotocolip_versionvlan_idtunnel_idbidirectional_first_seen_msbidirectional_last_seen_msbidirectional_duration_msbidirectional_packetsbidirectional_bytessrc2dst_first_seen_mssrc2dst_last_seen_mssrc2dst_duration_mssrc2dst_packetssrc2dst_bytesdst2src_first_seen_msdst2src_last_seen_msdst2src_duration_msdst2src_packetsdst2src_bytesbidirectional_min_psbidirectional_mean_psbidirectional_stddev_psbidirectional_max_pssrc2dst_min_pssrc2dst_mean_pssrc2dst_stddev_pssrc2dst_max_psdst2src_min_psdst2src_mean_psdst2src_stddev_psdst2src_max_psbidirectional_min_piat_msbidirectional_mean_piat_msbidirectional_stddev_piat_msbidirectional_max_piat_mssrc2dst_min_piat_mssrc2dst_mean_piat_mssrc2dst_stddev_piat_mssrc2dst_max_piat_msdst2src_min_piat_msdst2src_mean_piat_msdst2src_stddev_piat_msdst2src_max_piat_msbidirectional_syn_packetsbidirectional_cwr_packetsbidirectional_ece_packetsbidirectional_urg_packetsbidirectional_ack_packetsbidirectional_psh_packetsbidirectional_rst_packetsbidirectional_fin_packetssrc2dst_syn_packetssrc2dst_cwr_packetssrc2dst_ece_packetssrc2dst_urg_packetssrc2dst_ack_packetssrc2dst_psh_packetssrc2dst_rst_packetssrc2dst_fin_packetsdst2src_syn_packetsdst2src_cwr_packetsdst2src_ece_packetsdst2src_urg_packetsdst2src_ack_packetsdst2src_psh_packetsdst2src_rst_packetsdst2src_fin_packetsapplication_nameapplication_category_nameapplication_is_guessedapplication_confidencerequested_server_nameclient_fingerprintserver_fingerprintuser_agentcontent_typesrc2dst_bytes_data_ratiodst2src_bytes_data_ratio
0005885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:08579363a44c94fd7c9aefa07df278f016460d75aa809c94a571c...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f8064001436720900687143672090120051358502201436720900687143672090120051324183714367209007441436720901200456344838366865.862069696.73948514846676.54166751.6434093191861423.029412252.360311148409.00000045.124035321022.30434870.131976322013.81818258.73109323000058400000024100000034300HTTP.InstagramSocialNetwork06photos-g.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN0.0365790.963421
1105885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0838816c26866c915aaa410921c4fc309477eb0ceba2caec77bcf...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f80640014367209006841436720900750665258994143672090068414367209007506613111814367209007161436720900744283957876661134.500000612.25777914846686.00000072.11102632614841484.0000000.000000148401.2941184.4957503205.5000009.1701103300.7368420.685142000052100000013100000039000HTTP.InstagramSocialNetwork06photos-h.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN0.0189510.981049
2205885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:083735068df6e56301d6c238c302eaa732d2075ef802914771e7b...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f8064001436720901262143672090126201324143672090126214367209012620132400000324324.0000000.000000324324324.0000000.00000032400.0000000.000000000.0000000.000000000.0000000.000000000.0000000.000000000011000000110000000000HTTP.InstagramSocialNetwork06photos-a.ak.instagram.comNaNNaNInstagram 7.1.1 Android (19/4.4.2; 480dpi; 108...NaN1.0000000.000000
3305885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0833603107d5f2f69c5e2f3da41be7ce2a59c0d818947212d6ef0...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f531740014367209085241436720908575512298143672090852414367209085240189143672090857514367209085750120989149.00000084.8528142098989.0000000.00000089209209.0000000.0000002095151.0000000.0000005100.0000000.000000000.0000000.000000000000000000000000000000DNS.InstagramNetwork06igcdn-photos-a-a.akamaihd.netNaNNaNNaNNaN0.2986580.701342
440d7feda5309e4f8477aac71903e83486f9f13566cc836f8...7f6b3b13330898c4dcf505e44b642c996b9674139831e0...00:1b:2f805885370fbc1de250a4570351f2679e915e15245a5534bd...b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175...40:f3:0840855640014367209526111436720952611021401436720952611143672095261101741436720952611143672095261101666670.0000005.656854747474.0000000.000000746666.0000000.0000006600.0000000.000000000.0000000.000000000.0000000.000000100020001000100000001000HTTPWeb11NaNNaNNaNNaNNaN0.5285710.471429
\n", "
" ], "text/plain": [ " id expiration_id src_ip \\\n", "0 0 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "1 1 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "2 2 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "3 3 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "4 4 0 d7feda5309e4f8477aac71903e83486f9f13566cc836f8... \n", "\n", " src_mac src_oui src_port \\\n", "0 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 57936 \n", "1 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 38816 \n", "2 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 37350 \n", "3 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 33603 \n", "4 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "\n", " dst_ip \\\n", "0 3a44c94fd7c9aefa07df278f016460d75aa809c94a571c... \n", "1 c26866c915aaa410921c4fc309477eb0ceba2caec77bcf... \n", "2 68df6e56301d6c238c302eaa732d2075ef802914771e7b... \n", "3 107d5f2f69c5e2f3da41be7ce2a59c0d818947212d6ef0... \n", "4 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "\n", " dst_mac dst_oui dst_port \\\n", "0 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "1 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "2 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 80 \n", "3 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 53 \n", "4 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 
40:f3:08 40855 \n", "\n", " protocol ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 6 4 0 0 1436720900687 \n", "1 6 4 0 0 1436720900684 \n", "2 6 4 0 0 1436720901262 \n", "3 17 4 0 0 1436720908524 \n", "4 6 4 0 0 1436720952611 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1436720901200 513 \n", "1 1436720900750 66 \n", "2 1436720901262 0 \n", "3 1436720908575 51 \n", "4 1436720952611 0 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 58 50220 1436720900687 \n", "1 52 58994 1436720900684 \n", "2 1 324 1436720901262 \n", "3 2 298 1436720908524 \n", "4 2 140 1436720952611 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1436720901200 513 24 1837 \n", "1 1436720900750 66 13 1118 \n", "2 1436720901262 0 1 324 \n", "3 1436720908524 0 1 89 \n", "4 1436720952611 0 1 74 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1436720900744 1436720901200 456 \n", "1 1436720900716 1436720900744 28 \n", "2 0 0 0 \n", "3 1436720908575 1436720908575 0 \n", "4 1436720952611 1436720952611 0 \n", "\n", " dst2src_packets dst2src_bytes bidirectional_min_ps \\\n", "0 34 48383 66 \n", "1 39 57876 66 \n", "2 0 0 324 \n", "3 1 209 89 \n", "4 1 66 66 \n", "\n", " bidirectional_mean_ps bidirectional_stddev_ps bidirectional_max_ps \\\n", "0 865.862069 696.739485 1484 \n", "1 1134.500000 612.257779 1484 \n", "2 324.000000 0.000000 324 \n", "3 149.000000 84.852814 209 \n", "4 70.000000 5.656854 74 \n", "\n", " src2dst_min_ps src2dst_mean_ps src2dst_stddev_ps src2dst_max_ps \\\n", "0 66 76.541667 51.643409 319 \n", "1 66 86.000000 72.111026 326 \n", "2 324 324.000000 0.000000 324 \n", "3 89 89.000000 0.000000 89 \n", "4 74 74.000000 0.000000 74 \n", "\n", " dst2src_min_ps dst2src_mean_ps dst2src_stddev_ps dst2src_max_ps \\\n", "0 186 1423.029412 252.360311 1484 \n", "1 1484 1484.000000 0.000000 1484 \n", "2 0 0.000000 0.000000 0 \n", 
"3 209 209.000000 0.000000 209 \n", "4 66 66.000000 0.000000 66 \n", "\n", " bidirectional_min_piat_ms bidirectional_mean_piat_ms \\\n", "0 0 9.000000 \n", "1 0 1.294118 \n", "2 0 0.000000 \n", "3 51 51.000000 \n", "4 0 0.000000 \n", "\n", " bidirectional_stddev_piat_ms bidirectional_max_piat_ms \\\n", "0 45.124035 321 \n", "1 4.495750 32 \n", "2 0.000000 0 \n", "3 0.000000 51 \n", "4 0.000000 0 \n", "\n", " src2dst_min_piat_ms src2dst_mean_piat_ms src2dst_stddev_piat_ms \\\n", "0 0 22.304348 70.131976 \n", "1 0 5.500000 9.170110 \n", "2 0 0.000000 0.000000 \n", "3 0 0.000000 0.000000 \n", "4 0 0.000000 0.000000 \n", "\n", " src2dst_max_piat_ms dst2src_min_piat_ms dst2src_mean_piat_ms \\\n", "0 322 0 13.818182 \n", "1 33 0 0.736842 \n", "2 0 0 0.000000 \n", "3 0 0 0.000000 \n", "4 0 0 0.000000 \n", "\n", " dst2src_stddev_piat_ms dst2src_max_piat_ms bidirectional_syn_packets \\\n", "0 58.73109 323 0 \n", "1 0.68514 2 0 \n", "2 0.00000 0 0 \n", "3 0.00000 0 0 \n", "4 0.00000 0 1 \n", "\n", " bidirectional_cwr_packets bidirectional_ece_packets \\\n", "0 0 0 \n", "1 0 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_urg_packets bidirectional_ack_packets \\\n", "0 0 58 \n", "1 0 52 \n", "2 0 1 \n", "3 0 0 \n", "4 0 2 \n", "\n", " bidirectional_psh_packets bidirectional_rst_packets \\\n", "0 4 0 \n", "1 1 0 \n", "2 1 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " bidirectional_fin_packets src2dst_syn_packets src2dst_cwr_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 1 0 \n", "\n", " src2dst_ece_packets src2dst_urg_packets src2dst_ack_packets \\\n", "0 0 0 24 \n", "1 0 0 13 \n", "2 0 0 1 \n", "3 0 0 0 \n", "4 0 0 1 \n", "\n", " src2dst_psh_packets src2dst_rst_packets src2dst_fin_packets \\\n", "0 1 0 0 \n", "1 1 0 0 \n", "2 1 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_syn_packets dst2src_cwr_packets dst2src_ece_packets \\\n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "\n", " dst2src_urg_packets 
dst2src_ack_packets dst2src_psh_packets \\\n", "0 0 34 3 \n", "1 0 39 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 1 0 \n", "\n", " dst2src_rst_packets dst2src_fin_packets application_name \\\n", "0 0 0 HTTP.Instagram \n", "1 0 0 HTTP.Instagram \n", "2 0 0 HTTP.Instagram \n", "3 0 0 DNS.Instagram \n", "4 0 0 HTTP \n", "\n", " application_category_name application_is_guessed application_confidence \\\n", "0 SocialNetwork 0 6 \n", "1 SocialNetwork 0 6 \n", "2 SocialNetwork 0 6 \n", "3 Network 0 6 \n", "4 Web 1 1 \n", "\n", " requested_server_name client_fingerprint server_fingerprint \\\n", "0 photos-g.ak.instagram.com NaN NaN \n", "1 photos-h.ak.instagram.com NaN NaN \n", "2 photos-a.ak.instagram.com NaN NaN \n", "3 igcdn-photos-a-a.akamaihd.net NaN NaN \n", "4 NaN NaN NaN \n", "\n", " user_agent content_type \\\n", "0 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "1 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "2 Instagram 7.1.1 Android (19/4.4.2; 480dpi; 108... NaN \n", "3 NaN NaN \n", "4 NaN NaN \n", "\n", " src2dst_bytes_data_ratio dst2src_bytes_data_ratio \n", "0 0.036579 0.963421 \n", "1 0.018951 0.981049 \n", "2 1.000000 0.000000 \n", "3 0.298658 0.701342 \n", "4 0.528571 0.471429 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Filter data according to some criteria:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id expiration_id src_ip \\\n", "5 5 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "8 8 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "9 9 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "11 11 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "15 15 0 5885370fbc1de250a4570351f2679e915e15245a5534bd... \n", "\n", " src_mac src_oui src_port \\\n", "5 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 33763 \n", "8 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 41181 \n", "9 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 58690 \n", "11 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 56382 \n", "15 b5d836f0b4088481bd22d1bcdbf78c8bb4ed6c5b5a3175... 40:f3:08 41182 \n", "\n", " dst_ip \\\n", "5 6c662e3d71901ed2227ad7c8bd2e074e240ca921855da1... \n", "8 c9f555cb103cf7cb79a76242fe4b121682293ebb993677... \n", "9 7e269e3e2e4c87e46547c961422353c0034227df8cb6e5... \n", "11 8296dd65fdefef3bde6b1205337bd3270f4960d0747ae6... \n", "15 c9f555cb103cf7cb79a76242fe4b121682293ebb993677... \n", "\n", " dst_mac dst_oui dst_port \\\n", "5 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 443 \n", "8 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 443 \n", "9 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 443 \n", "11 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 00:1b:2f 443 \n", "15 7f6b3b13330898c4dcf505e44b642c996b9674139831e0... 
00:1b:2f 443 \n", "\n", " protocol ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "5 6 4 0 0 1436720908466 \n", "8 6 4 0 0 1436720908576 \n", "9 6 4 0 0 1436720952561 \n", "11 6 4 0 0 1436720898354 \n", "15 6 4 0 0 1436720908577 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "5 1436720910950 2484 \n", "8 1436720908733 157 \n", "9 1436720952561 0 \n", "11 1436720899158 804 \n", "15 1436720908737 160 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "5 11 5397 1436720908466 \n", "8 14 5567 1436720908576 \n", "9 2 169 1436720952561 \n", "11 17 2647 1436720898354 \n", "15 14 5567 1436720908577 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "5 1436720908723 257 5 1279 \n", "8 1436720908733 157 8 896 \n", "9 1436720952561 0 2 169 \n", "11 1436720899158 804 9 1583 \n", "15 1436720908737 160 8 896 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "5 1436720908518 1436720910950 2432 \n", "8 1436720908615 1436720908662 47 \n", "9 0 0 0 \n", "11 1436720898499 1436720899122 623 \n", "15 1436720908616 1436720908665 49 \n", "\n", " dst2src_packets dst2src_bytes bidirectional_min_ps \\\n", "5 6 4118 66 \n", "8 6 4671 66 \n", "9 0 0 66 \n", "11 8 1064 66 \n", "15 6 4671 66 \n", "\n", " bidirectional_mean_ps bidirectional_stddev_ps bidirectional_max_ps \\\n", "5 490.636364 588.172640 1464 \n", "8 397.642857 566.204041 1484 \n", "9 84.500000 26.162951 103 \n", "11 155.705882 128.137994 530 \n", "15 397.642857 566.204041 1484 \n", "\n", " src2dst_min_ps src2dst_mean_ps src2dst_stddev_ps src2dst_max_ps \\\n", "5 66 255.800000 424.405702 1015 \n", "8 66 112.000000 86.328277 292 \n", "9 66 84.500000 26.162951 103 \n", "11 66 175.888889 164.147376 530 \n", "15 66 112.000000 86.328277 292 \n", "\n", " dst2src_min_ps dst2src_mean_ps dst2src_stddev_ps dst2src_max_ps \\\n", "5 66 686.333333 668.351006 1464 \n", "8 66 778.500000 720.057706 
1484 \n", "9 0 0.000000 0.000000 0 \n", "11 66 133.000000 74.989523 231 \n", "15 66 778.500000 720.057706 1484 \n", "\n", " bidirectional_min_piat_ms bidirectional_mean_piat_ms \\\n", "5 0 248.400000 \n", "8 0 12.076923 \n", "9 0 0.000000 \n", "11 0 50.250000 \n", "15 0 12.307692 \n", "\n", " bidirectional_stddev_piat_ms bidirectional_max_piat_ms \\\n", "5 698.093945 2227 \n", "8 22.746654 71 \n", "9 0.000000 0 \n", "11 72.009722 181 \n", "15 23.346608 71 \n", "\n", " src2dst_min_piat_ms src2dst_mean_piat_ms src2dst_stddev_piat_ms \\\n", "5 0 64.250000 126.502635 \n", "8 0 22.428571 28.564880 \n", "9 0 0.000000 0.000000 \n", "11 0 100.500000 83.513900 \n", "15 1 22.857143 28.439577 \n", "\n", " src2dst_max_piat_ms dst2src_min_piat_ms dst2src_mean_piat_ms \\\n", "5 254 0 486.4 \n", "8 71 0 9.4 \n", "9 0 0 0.0 \n", "11 183 0 89.0 \n", "15 71 0 9.8 \n", "\n", " dst2src_stddev_piat_ms dst2src_max_piat_ms bidirectional_syn_packets \\\n", "5 976.910078 2227 0 \n", "8 17.728508 41 2 \n", "9 0.000000 0 0 \n", "11 84.261498 183 2 \n", "15 20.801442 47 2 \n", "\n", " bidirectional_cwr_packets bidirectional_ece_packets \\\n", "5 0 0 \n", "8 0 0 \n", "9 0 0 \n", "11 0 0 \n", "15 0 0 \n", "\n", " bidirectional_urg_packets bidirectional_ack_packets \\\n", "5 0 11 \n", "8 0 13 \n", "9 0 2 \n", "11 0 16 \n", "15 0 13 \n", "\n", " bidirectional_psh_packets bidirectional_rst_packets \\\n", "5 3 0 \n", "8 4 0 \n", "9 1 0 \n", "11 8 0 \n", "15 4 0 \n", "\n", " bidirectional_fin_packets src2dst_syn_packets src2dst_cwr_packets \\\n", "5 0 0 0 \n", "8 0 1 0 \n", "9 1 0 0 \n", "11 0 1 0 \n", "15 0 1 0 \n", "\n", " src2dst_ece_packets src2dst_urg_packets src2dst_ack_packets \\\n", "5 0 0 5 \n", "8 0 0 7 \n", "9 0 0 2 \n", "11 0 0 8 \n", "15 0 0 7 \n", "\n", " src2dst_psh_packets src2dst_rst_packets src2dst_fin_packets \\\n", "5 1 0 0 \n", "8 2 0 0 \n", "9 1 0 1 \n", "11 4 0 0 \n", "15 2 0 0 \n", "\n", " dst2src_syn_packets dst2src_cwr_packets dst2src_ece_packets \\\n", "5 0 0 0 \n", "8 1 0 
0 \n", "9 0 0 0 \n", "11 1 0 0 \n", "15 1 0 0 \n", "\n", " dst2src_urg_packets dst2src_ack_packets dst2src_psh_packets \\\n", "5 0 6 2 \n", "8 0 6 2 \n", "9 0 0 0 \n", "11 0 8 4 \n", "15 0 6 2 \n", "\n", " dst2src_rst_packets dst2src_fin_packets application_name \\\n", "5 0 0 TLS \n", "8 0 0 TLS.Instagram \n", "9 0 0 TLS \n", "11 0 0 TLS.Instagram \n", "15 0 0 TLS.Instagram \n", "\n", " application_category_name application_is_guessed application_confidence \\\n", "5 Web 0 6 \n", "8 SocialNetwork 0 6 \n", "9 Web 0 6 \n", "11 SocialNetwork 0 6 \n", "15 SocialNetwork 0 6 \n", "\n", " requested_server_name client_fingerprint \\\n", "5 NaN NaN \n", "8 igcdn-photos-a-a.akamaihd.net 54ae5fcb0159e2ddf6a50e149221c7c7 \n", "9 NaN NaN \n", "11 telegraph-ash.instagram.com 54ae5fcb0159e2ddf6a50e149221c7c7 \n", "15 igcdn-photos-a-a.akamaihd.net 54ae5fcb0159e2ddf6a50e149221c7c7 \n", "\n", " server_fingerprint user_agent content_type \\\n", "5 NaN NaN NaN \n", "8 34d6f0ad0a79e4cfdf145e640cc93f78 NaN NaN \n", "9 NaN NaN NaN \n", "11 acb741bcdffb787c5a52654c78645bdf NaN NaN \n", "15 34d6f0ad0a79e4cfdf145e640cc93f78 NaN NaN \n", "\n", " src2dst_bytes_data_ratio dst2src_bytes_data_ratio \n", "5 0.236984 0.763016 \n", "8 0.160948 0.839052 \n", "9 1.000000 0.000000 \n", "11 0.598036 0.401964 \n", "15 0.160948 0.839052 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df[\"dst_port\"] == 443].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extend nfstream" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In some use cases, we need to add features that are computed at the packet level. NFStream handles such scenarios through [**NFPlugin**][nfplugin].\n", "\n", "[nfplugin]: https://nfstream.github.io/docs/api#nfplugin\n", "\n", "* Let's suppose that we want a per-flow counter of bidirectional packets whose IP size is exactly 40 bytes."
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "class Packet40Count(NFPlugin):\n", "    def on_init(self, pkt, flow):  # flow creation with the first packet\n", "        if pkt.ip_size == 40:\n", "            flow.udps.packet_with_40_ip_size = 1\n", "        else:\n", "            flow.udps.packet_with_40_ip_size = 0\n", "\n", "    def on_update(self, pkt, flow):  # flow update with each packet belonging to the flow\n", "        if pkt.ip_size == 40:\n", "            flow.udps.packet_with_40_ip_size += 1" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "df = NFStreamer(source=\"pcap/google_ssl.pcap\", udps=[Packet40Count()]).to_pandas()" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id expiration_id src_ip src_mac src_oui src_port \\\n", "0 0 0 172.31.3.224 80:c6:ca:00:9e:9f 80:c6:ca 42835 \n", "\n", " dst_ip dst_mac dst_oui dst_port protocol \\\n", "0 216.58.212.100 00:0e:8e:4d:b4:a8 00:0e:8e 443 6 \n", "\n", " ip_version vlan_id tunnel_id bidirectional_first_seen_ms \\\n", "0 4 0 0 1434443394683 \n", "\n", " bidirectional_last_seen_ms bidirectional_duration_ms \\\n", "0 1434443401353 6670 \n", "\n", " bidirectional_packets bidirectional_bytes src2dst_first_seen_ms \\\n", "0 28 9108 1434443394683 \n", "\n", " src2dst_last_seen_ms src2dst_duration_ms src2dst_packets src2dst_bytes \\\n", "0 1434443401353 6670 16 1512 \n", "\n", " dst2src_first_seen_ms dst2src_last_seen_ms dst2src_duration_ms \\\n", "0 1434443394717 1434443401308 6591 \n", "\n", " dst2src_packets dst2src_bytes application_name application_category_name \\\n", "0 12 7596 TLS Web \n", "\n", " application_is_guessed application_confidence requested_server_name \\\n", "0 1 1 NaN \n", "\n", " client_fingerprint server_fingerprint user_agent content_type \\\n", "0 NaN NaN NaN NaN \n", "\n", " udps.packet_with_40_ip_size \n", "0 14 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our DataFrame now has a new column named `udps.packet_with_40_ip_size`." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }