{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MSTICPy Pivot Functions\n", "\n", "## What are Pivot Functions?\n", "\n", "MSTICPy has a lot of functionality distributed across many classes and modules. \n", "However, there is no simple way to discover where these functions are and what types\n", "of data each function is relevant to.\n", "\n", "Pivot functions bring this functionality together, grouped around Entities.\n", "\n", "Entities are representations of real-world objects commonly found in CyberSec investigations.\n", "Some examples are: IpAddress, Host, Account, URL.\n", "\n", "```python\n", ">>> IpAddress.util.ip_type(ip_str=\"157.53.1.1\")\n", "ip \tresult\n", "157.53.1.1 \tPublic\n", "\n", ">>> IpAddress.util.whois(\"157.53.1.1\")\n", "asn \tasn_cidr \tasn_country_code \tasn_date \tasn_description \tasn_registry \tnets \tnir \tquery \traw \traw_referral \treferral\n", "NA \tNA \tUS \t2015-04-01 \tNA \tarin \t[{'cidr': '157.53.0.0/16', 'name': 'NETACTUATE-MDN-04', 'handle': 'NET-157-53-0-0-1', 'range': '... 
\tNone \t157.53.1.1 \tNone \tNone \tNone\n", " \n", ">>> IpAddress.util.geoloc(value=\"157.53.1.1\")\n", "CountryCode \tCountryName \tState \tCity \tLongitude \tLatitude \tAsn \tedges \tType \tAdditionalData \tIpAddress\n", "US \tUnited States \tNone \tNone \t-97.822 \t37.751 \tNone \t{} \tgeolocation \t{} \t157.53.1.1\n", " \n", ">>> Host.MSSentinel.list_host_logons(host_name=\"VictimPc\")\n", "Account \tEventID \tTimeGenerated \tSourceComputerId \tComputer \tSubjectUserName \tSubjectDomainName\n", "NT AUTHORITY\\SYSTEM \t4624 \t2020-10-01 22:39:36.987000+00:00 \tf6638b82-98a5-4542-8bec-6bc0977f793f \tVictimPc.Contoso.Azure \tVictimPc$ \tCONTOSO\n", "NT AUTHORITY\\SYSTEM \t4624 \t2020-10-01 22:39:37.220000+00:00 \tf6638b82-98a5-4542-8bec-6bc0977f793f \tVictimPc.Contoso.Azure \tVictimPc$ \tCONTOSO\n", "NT AUTHORITY\\SYSTEM \t4624 \t2020-10-01 22:39:42.603000+00:00 \tf6638b82-98a5-4542-8bec-6bc0977f793f \tVictimPc.Contoso.Azure \tVictimPc$ \tCONTOSO\n", "\n", "```\n", "\n", "You can also chain pivot functions together to create a processing\n", "pipeline that performs multiple operations on data:\n", "```python\n", ">>> (\n", "    suspicious_ips_df\n", "    # Lookup IPs at VT\n", "    .mp_pivot.run(IpAddress.ti.lookup_ipv4_VirusTotal, column=\"IPAddress\")\n", "    # Filter on high severity\n", "    .query(\"Severity == 'high'\")\n", "    .mp_pivot.run(IpAddress.util.whois, column=\"Ioc\", join=\"left\")\n", "    # Query IPs that have login attempts\n", "    .mp_pivot.run(IpAddress.MSSentinel.list_aad_signins_for_ip, ip_address_list=\"Ioc\")\n", "    # Send the output of this to a plot\n", "    .mp_plot.timeline(\n", "        title=\"High Severity IPs with Logon attempts\",\n", "        source_columns=[\"UserPrincipalName\", \"IPAddress\", \"ClientAppUsed\", \"Location\"],\n", "        group_by=\"UserPrincipalName\"\n", "    )\n", ")\n", "\n", "```\n", "\n", "> We'll see examples of how to do these pivoting queries later in the notebook.\n", "\n", "MSTICPy has had entity classes from the very early days but, until 
now, these\n", "have only been used sporadically in the rest of the package.\n", "\n", "The pivot functionality exposes operations relevant to a particular\n", "entity as methods of that entity. These operations can include:\n", "\n", "- Data queries\n", "- Threat intelligence lookups\n", "- Other data lookups such as GeoLocation or domain resolution\n", "- Other local functionality\n", "\n", "## What is Pivoting?\n", "\n", "The name comes from the common practice of cybersecurity investigators navigating\n", "between related entities. For example, an entity/investigation chain might\n", "look like the following:\n", "\n", "\n", "| Step | Source | Operation | Target |\n", "| :--: | :----------------- | :----------------- | :----------------- |\n", "| 1 | Alert | Review alert -> | Source IP(A) |\n", "| 2 | Source IP(A) | Lookup TI -> | Related URLs |\n", "| | | | Malware names |\n", "| 3 | URL | Query web logs -> | Requesting hosts |\n", "| 4 | Host | Query host logons -> | Accounts |\n", "\n", "\n", "At each step there are one or more directions that you can take to\n", "follow the chain of related indicators of activity in a possible attack.\n", "\n", "Bringing these functions into a few, well-known locations makes it easier to\n", "use MSTICPy to carry out this common pivoting pattern in Jupyter notebooks." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Getting started" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import msticpy as mp\n", "mp.init_notebook();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pivoting subsystem will automatically load functions from\n", "*MSTICPy* that do not need any authentication.\n", "\n", "This is the case with providers such as Threat Intelligence (TILookup) and GeoIP.\n", "If not initialized before running `init_notebook`, they will be loaded with\n", "the defaults as specified in your *msticpyconfig.yaml*.\n", "\n", "For query providers, pivot functions are added dynamically as\n", "you connect/authenticate to the data provider." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### What happens at initialization?\n", "\n", "- The TI provider is loaded and entity-specific lookups (e.g. IP, Url, File)\n", " are added as pivot functions.\n", "- Miscellaneous MSTICPy functions and classes (e.g. GeoIP, IpType,\n", " Domain utils) are added as pivot functions to the appropriate entity.\n", "\n", "You can add custom functions as pivot functions by creating a\n", "registration template and importing the function.\n", "Details of this are covered later in the document." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load one or more data providers" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Please wait. Loading Kqlmagic extension...done\n", "Connecting... " ] }, { "data": { "text/html": [ "\n", " \n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "connected\n" ] } ], "source": [ "az_provider = QueryProvider(\"MSSentinel\")\n", "az_provider.connect(workspace=\"Default\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pivot function list\n", "\n", "You can view the pivot functions loaded for\n", "an entity. Note that because we've loaded an\n", "MS Sentinel query provider a subset of the queries\n", "(those that have a `host_name` parameter) are\n", "also loaded.\n", "\n", "> Note entity classes are also automatically imported\n", "> by `init_notebook`, so you do not need to import\n", "> them manually." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSentinel.VMComputer_vmcomputer',\n", " 'MSSentinel.auditd_auditd_all',\n", " 'MSSentinel.az_nsg_interface',\n", " 'MSSentinel.az_nsg_net_flows',\n", " 'MSSentinel.az_nsg_net_flows_depr',\n", " 'MSSentinel.heartbeat',\n", " 'MSSentinel.heartbeat_for_host_depr',\n", " 'MSSentinel.sec_alerts',\n", " 'MSSentinel.sent_bookmarks',\n", " 'MSSentinel.syslog_all_syslog',\n", " 'MSSentinel.syslog_cron_activity',\n", " 'MSSentinel.syslog_logon_failures',\n", " 'MSSentinel.syslog_logons',\n", " 'MSSentinel.syslog_squid_activity',\n", " 'MSSentinel.syslog_sudo_activity',\n", " 'MSSentinel.syslog_user_group_activity',\n", " 'MSSentinel.syslog_user_logon',\n", " 'MSSentinel.wevt_all_events',\n", " 'MSSentinel.wevt_events_by_id',\n", " 'MSSentinel.wevt_get_process_tree',\n", " 'MSSentinel.wevt_list_other_events',\n", " 'MSSentinel.wevt_logon_attempts',\n", " 'MSSentinel.wevt_logon_failures',\n", " 'MSSentinel.wevt_logon_session',\n", " 'MSSentinel.wevt_logons',\n", " 'MSSentinel.wevt_parent_process',\n", " 'MSSentinel.wevt_process_session',\n", " 'MSSentinel.wevt_processes',\n", " 'RiskIQ.articles',\n", " 'RiskIQ.artifacts',\n", " 'RiskIQ.certificates',\n", " 
'RiskIQ.components',\n", " 'RiskIQ.cookies',\n", " 'RiskIQ.hostpair_children',\n", " 'RiskIQ.hostpair_parents',\n", " 'RiskIQ.malware',\n", " 'RiskIQ.projects',\n", " 'RiskIQ.reputation',\n", " 'RiskIQ.resolutions',\n", " 'RiskIQ.summary',\n", " 'RiskIQ.trackers',\n", " 'RiskIQ.whois',\n", " 'dns_is_resolvable',\n", " 'dns_resolve',\n", " 'util.dns_components',\n", " 'util.dns_in_abuse_list',\n", " 'util.dns_is_resolvable',\n", " 'util.dns_resolve',\n", " 'util.dns_validate_tld']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.pivots()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### See the list of providers loaded by the Pivot class\n", "\n", "Notice that TILookup was loaded even though we did not create an instance of TILookup beforehand." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'TILookup': ,\n", " 'MSSentinel': }" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mp.pivot.providers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pivot functions are grouped into containers\n", "\n", "Data queries are grouped into a container with the name of the data provider to which they belong.\n", "E.g. MSSentinel queries are in a container of that name, Splunk queries would be in a \"Splunk\" container.\n", "\n", "> Note: if you have multiple instances of a provider type\n", "> the name of the provider container will have a suffix\n", "> with the instance name (e.g. the Sentinel Workspace name).\n", "\n", "TI lookups are put into a \"ti\" container.\n", "\n", "All other built-in functions are added to the \"util\" container.\n", "\n", "The containers themselves are callable and will return a list of their contents. 
\n", "Containers are also iterable - each iteration returns a tuple (pair) of name/function values.\n", "\n", "In notebooks/IPython you can also use tab completion to get to the right function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can pass a substring to the `pivots` (or `get_pivot_list`) function" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSentinel.syslog_logon_failures',\n", " 'MSSentinel.syslog_logons',\n", " 'MSSentinel.syslog_user_logon',\n", " 'MSSentinel.wevt_logon_attempts',\n", " 'MSSentinel.wevt_logon_failures',\n", " 'MSSentinel.wevt_logon_session',\n", " 'MSSentinel.wevt_logons']" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.pivots(\"logon\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the Pivot Browser\n", "\n", "Pivot also has a utility that allows you to browse entities and the \n", "pivot functions attached to them. You can search for functions with\n", "desired keywords, view help for the specific function and copy the function\n", "signature to paste into a code cell.\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "57bc3a3465d048f4a59d1bf27f235b36", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HBox(children=(VBox(children=(HTML(value='Entities'), Select(description='entity', layou…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "Pivot.browse()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Running a pivot function\n", "\n", "Pivot functions have flexible input types. They can be used with the following types of parameters:\n", "\n", "- entity instances (e.g. where you have an IpAddress entity with a populated address field)\n", "- single values (e.g. a DNS domain name)\n", "- lists of values (e.g. 
a list of IpAddresses)\n", "- pandas DataFrames (where one or more of the columns contains the input parameter data)\n", "\n", "Pivot functions normally return results as a dataframe (although some complex functions, such as Notebooklets,\n", "can return composite result objects containing multiple dataframes and other object types).\n" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "List 'utility' pivot functions for IpAddress\n", "\n", "whois function\n", "ip_type function\n", "ip_rev_resolve function\n", "geoloc function\n", "geoloc_ips function\n", "\n", "-------------------------------\n", "\n", "Print help for a function - IpAddress.util.ip_type\n", "\n", "Help on function get_ip_type in module msticpy.context.ip_utils:\n", "\n", "get_ip_type(ip: str = None, ip_str: str = None) -> str\n", " Validate value is an IP address and determine IPType category.\n", " \n", " (IPAddress category is e.g. Private/Public/Multicast).\n", " \n", " Parameters\n", " ----------\n", " ip : str\n", " The string of the IP Address\n", " ip_str : str\n", " The string of the IP Address - alias for `ip`\n", " \n", " Returns\n", " -------\n", " str\n", " Returns ip type string using ip address module\n", "\n" ] } ], "source": [ "print(\"List 'utility' pivot functions for IpAddress\\n\")\n", "IpAddress.util()\n", "print()\n", "print(\"-------------------------------\\n\")\n", "print(\"Print help for a function - IpAddress.util.ip_type\\n\")\n", "help(IpAddress.util.ip_type)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Parameter names\n", "- Positional parameter - If the function only accepts one parameter you can usually just supply it without a name - as a positional parameter (see first and third examples below)\n", "- Native parameter - You can also use the native parameter name - i.e. 
the name that the underlying function expects and that will be shown in the help(function) output\n", "- Generic parameter - You can also use the generic parameter name \"value\" in most cases.\n", "\n", "If in doubt, use help(entity.container.func) or entity.container.func?" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 10.1.1.1 Private 0" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.ip_type(\"10.1.1.1\")" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 10.1.1.1 Private 0" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 157.53.1.1 Public 0" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " asn asn_cidr asn_country_code asn_date asn_description asn_registry \\\n", "0 NA NA US 2015-04-01 NA arin \n", "\n", " nets \\\n", "0 [{'cidr': '157.53.0.0/16', 'name': 'NETACTUATE-MDN-04', 'handle': 'NET-157-53-0-0-1', 'range': '... \n", "\n", " nir query raw raw_referral referral \n", "0 None 157.53.1.1 None None None " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " CountryCode CountryName Longitude Latitude TimeGenerated \\\n", "0 US United States -97.822 37.751 2022-06-08 19:38:16.207819 \n", "\n", " Type IpAddress \n", "0 geolocation 157.53.1.1 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(IpAddress.util.ip_type(\"10.1.1.1\"))\n", "display(IpAddress.util.ip_type(ip_str=\"157.53.1.1\"))\n", "display(IpAddress.util.whois(\"157.53.1.1\"))\n", "display(IpAddress.util.geoloc(value=\"157.53.1.1\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using an entity as a parameter\n", "\n", "Behind the scenes the Pivot api is using a mapping of\n", "entity attributes to supply the right value to the function parameter." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 10.1.1.1 Private 0" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 157.53.1.1 Public 0" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " asn asn_cidr asn_country_code asn_date asn_description asn_registry \\\n", "0 NA NA US 2015-04-01 NA arin \n", "\n", " nets \\\n", "0 [{'cidr': '157.53.0.0/16', 'name': 'NETACTUATE-MDN-04', 'handle': 'NET-157-53-0-0-1', 'range': '... \n", "\n", " nir query raw raw_referral referral \n", "0 None 157.53.1.1 None None None " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " CountryCode CountryName Longitude Latitude TimeGenerated \\\n", "0 US United States -97.822 37.751 2022-06-08 19:38:19.432876 \n", "\n", " Type IpAddress \n", "0 geolocation 157.53.1.1 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ip1 = IpAddress(Address=\"10.1.1.1\")\n", "ip2 = IpAddress(Address=\"157.53.1.1\")\n", "\n", "display(IpAddress.util.ip_type(ip1))\n", "display(IpAddress.util.ip_type(ip2))\n", "display(IpAddress.util.whois(ip2))\n", "display(IpAddress.util.geoloc(ip2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using a list (or other iterable) as a parameter\n", "\n", "Many of the underlying functions will accept either single values or collections \n", "(usually in DataFrames) of values as input.\n", "Even in cases where the underlying function does not accept iterables as parameters, the\n", "Pivot library will usually be able to iterate through each value and collate the results\n", "to hand you back a single dataframe.\n", "\n", "> Note: there are some exceptions to this - usually where the underlying function
\n", "> is long-running or expensive and has opted not to accept iterated calls.
\n", "> Notebooklets are an example of these.
\n", "\n", "Where the function has multiple parameters you can supply a mixture of iterables and single values.\n", "\n", "- In this case, the single-valued parameters are re-used on each call, paired with the item\n", " in the list(s) taken from the multi-valued parameters.\n", " \n", "You can also use multiple iterables for multiple parameters.\n", "- In this case, the iterables *should* be the same length. \n", " If they are different lengths, the iterations stop after the shortest list/iterable is exhausted.\n", " \n", "For example:\n", "```\n", " list_1 = [1, 2, 3, 4]\n", " list_2 = [\"a\", \"b\", \"c\"]\n", " entity.util.func(p1=list_1, p2=list_2)\n", "```\n", "\n", "The function will execute with the pairings (1, \"a\"), (2, \"b\") and (3, \"c\") - (4, \\_) will be ignored." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "Use the `txt2df` magic function to convert a pasted-in list to a DataFrame." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " AllExtIPs\n", "9 172.217.15.99\n", "10 40.85.232.64\n", "11 20.38.98.100\n", "12 23.96.64.84\n", "13 65.55.44.108\n", "14 131.107.147.209\n", "15 10.0.3.4\n", "16 10.0.3.5\n", "17 13.82.152.48" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%txt2df --headers --name ip_df1\n", "AllExtIPs\n", "9, 172.217.15.99\n", "10, 40.85.232.64\n", "11, 20.38.98.100\n", "12, 23.96.64.84\n", "13, 65.55.44.108\n", "14, 131.107.147.209\n", "15, 10.0.3.4\n", "16, 10.0.3.5\n", "17, 13.82.152.48" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 23.96.64.84 Public 0\n", "1 65.55.44.108 Public 1\n", "2 131.107.147.209 Public 2\n", "3 10.0.3.4 Private 3\n", "4 10.0.3.5 Private 4\n", "5 13.82.152.48 Public 5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 23.96.64.84 Public 0\n", "1 65.55.44.108 Public 1\n", "2 131.107.147.209 Public 2\n", "3 10.0.3.4 Private 3\n", "4 10.0.3.5 Private 4\n", "5 13.82.152.48 Public 5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " nir asn_registry asn asn_cidr asn_country_code asn_date \\\n", "0 NaN arin 8075 23.96.0.0/14 US 2013-06-18 \n", "1 NaN arin 8075 65.52.0.0/14 US 2001-02-14 \n", "2 NaN arin 3598 131.107.0.0/16 US 1988-11-11 \n", "3 NaN NaN NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN NaN \n", "5 NaN arin 8075 13.64.0.0/11 US 2015-03-26 \n", "\n", " asn_description query \\\n", "0 MICROSOFT-CORP-MSN-AS-BLOCK, US 23.96.64.84 \n", "1 MICROSOFT-CORP-MSN-AS-BLOCK, US 65.55.44.108 \n", "2 MICROSOFT-CORP-AS, US 131.107.147.209 \n", "3 NaN NaN \n", "4 NaN NaN \n", "5 MICROSOFT-CORP-MSN-AS-BLOCK, US 13.82.152.48 \n", "\n", " nets \\\n", "0 [{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23.... \n", "1 [{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52... \n", "2 [{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10... \n", "3 NaN \n", "4 NaN \n", "5 [{'cidr': '13.96.0.0/13, 13.104.0.0/14, 13.64.0.0/11', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-... \n", "\n", " raw referral raw_referral \n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "5 NaN NaN NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " CountryCode CountryName State City Longitude \\\n", "0 US United States Virginia Tappahannock -76.8545 \n", "1 US United States Virginia Boydton -78.3750 \n", "2 US United States Washington Redmond -122.1257 \n", "3 NaN Private address NaN location unknown NaN \n", "4 NaN Private address NaN location unknown NaN \n", "5 US United States Virginia Tappahannock -76.8545 \n", "\n", " Latitude TimeGenerated Type IpAddress \n", "0 37.9273 2022-06-08 19:38:37.428898 geolocation 23.96.64.84 \n", "1 36.6534 2022-06-08 19:38:37.429901 geolocation 65.55.44.108 \n", "2 47.6722 2022-06-08 19:38:37.429901 geolocation 131.107.147.209 \n", "3 NaN 2022-06-08 19:38:37.429901 geolocation 10.0.3.4 \n", "4 NaN 2022-06-08 19:38:37.429901 geolocation 10.0.3.5 \n", "5 37.9273 2022-06-08 19:38:37.430901 geolocation 13.82.152.48 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ip_list1 = ip_df1.AllExtIPs.values[-6:]\n", "\n", "display(IpAddress.util.ip_type(ip_list1))\n", "display(IpAddress.util.ip_type(ip_str=list(ip_list1)))\n", "display(IpAddress.util.whois(value=tuple(ip_list1)))\n", "display(IpAddress.util.geoloc(ip_list1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using DataFrames as input\n", "\n", "Using a dataframe as input requires a slightly different syntax since you not\n", "only need to pass the dataframe as a parameter but also tell the function\n", "which column to use for input.\n", "\n", "To specify the column to use, you can use the name of the parameter that the\n", "underlying function expects or one of these generic names:\n", "\n", "- column\n", "- input_column\n", "- input_col\n", "- src_column\n", "- src_col\n", "\n", "> Note these generic names are not shown in the function help" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 172.217.15.99 Public 0\n", "1 40.85.232.64 Public 1\n", "2 20.38.98.100 Public 2\n", "3 23.96.64.84 Public 3\n", "4 65.55.44.108 Public 4\n", "5 131.107.147.209 Public 5\n", "6 10.0.3.4 Private 6\n", "7 10.0.3.5 Private 7\n", "8 13.82.152.48 Public 8" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
" ], "text/plain": [ " ip result src_row_index\n", "0 172.217.15.99 Public 0\n", "1 40.85.232.64 Public 1\n", "2 20.38.98.100 Public 2\n", "3 23.96.64.84 Public 3\n", "4 65.55.44.108 Public 4\n", "5 131.107.147.209 Public 5\n", "6 10.0.3.4 Private 6\n", "7 10.0.3.5 Private 7\n", "8 13.82.152.48 Public 8" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
nirasn_registryasnasn_cidrasn_country_codeasn_dateasn_descriptionquerynetsrawreferralraw_referral
9NaNarin15169172.217.15.0/24US2012-04-16GOOGLE, US172.217.15.99[{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0...NaNNaNNaN
10NaNarin807540.80.0.0/12US2015-02-23MICROSOFT-CORP-MSN-AS-BLOCK, US40.85.232.64[{'cidr': '40.96.0.0/12, 40.74.0.0/15, 40.124.0.0/16, 40.80.0.0/12, 40.125.0.0/17, 40.112.0.0/13...NaNNaNNaN
11NaNarin807520.36.0.0/14US2017-10-18MICROSOFT-CORP-MSN-AS-BLOCK, US20.38.98.100[{'cidr': '20.40.0.0/13, 20.33.0.0/16, 20.64.0.0/10, 20.128.0.0/16, 20.34.0.0/15, 20.36.0.0/14, ...NaNNaNNaN
12NaNarin807523.96.0.0/14US2013-06-18MICROSOFT-CORP-MSN-AS-BLOCK, US23.96.64.84[{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23....NaNNaNNaN
13NaNarin807565.52.0.0/14US2001-02-14MICROSOFT-CORP-MSN-AS-BLOCK, US65.55.44.108[{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52...NaNNaNNaN
14NaNarin3598131.107.0.0/16US1988-11-11MICROSOFT-CORP-AS, US131.107.147.209[{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10...NaNNaNNaN
15NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
16NaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN
17NaNarin807513.64.0.0/11US2015-03-26MICROSOFT-CORP-MSN-AS-BLOCK, US13.82.152.48[{'cidr': '13.96.0.0/13, 13.104.0.0/14, 13.64.0.0/11', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-...NaNNaNNaN
\n", "
" ], "text/plain": [ " nir asn_registry asn asn_cidr asn_country_code asn_date \\\n", "9 NaN arin 15169 172.217.15.0/24 US 2012-04-16 \n", "10 NaN arin 8075 40.80.0.0/12 US 2015-02-23 \n", "11 NaN arin 8075 20.36.0.0/14 US 2017-10-18 \n", "12 NaN arin 8075 23.96.0.0/14 US 2013-06-18 \n", "13 NaN arin 8075 65.52.0.0/14 US 2001-02-14 \n", "14 NaN arin 3598 131.107.0.0/16 US 1988-11-11 \n", "15 NaN NaN NaN NaN NaN NaN \n", "16 NaN NaN NaN NaN NaN NaN \n", "17 NaN arin 8075 13.64.0.0/11 US 2015-03-26 \n", "\n", " asn_description query \\\n", "9 GOOGLE, US 172.217.15.99 \n", "10 MICROSOFT-CORP-MSN-AS-BLOCK, US 40.85.232.64 \n", "11 MICROSOFT-CORP-MSN-AS-BLOCK, US 20.38.98.100 \n", "12 MICROSOFT-CORP-MSN-AS-BLOCK, US 23.96.64.84 \n", "13 MICROSOFT-CORP-MSN-AS-BLOCK, US 65.55.44.108 \n", "14 MICROSOFT-CORP-AS, US 131.107.147.209 \n", "15 NaN NaN \n", "16 NaN NaN \n", "17 MICROSOFT-CORP-MSN-AS-BLOCK, US 13.82.152.48 \n", "\n", " nets \\\n", "9 [{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0... \n", "10 [{'cidr': '40.96.0.0/12, 40.74.0.0/15, 40.124.0.0/16, 40.80.0.0/12, 40.125.0.0/17, 40.112.0.0/13... \n", "11 [{'cidr': '20.40.0.0/13, 20.33.0.0/16, 20.64.0.0/10, 20.128.0.0/16, 20.34.0.0/15, 20.36.0.0/14, ... \n", "12 [{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23.... \n", "13 [{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52... \n", "14 [{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10... \n", "15 NaN \n", "16 NaN \n", "17 [{'cidr': '13.96.0.0/13, 13.104.0.0/14, 13.64.0.0/11', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-... 
\n", "\n", " raw referral raw_referral \n", "9 NaN NaN NaN \n", "10 NaN NaN NaN \n", "11 NaN NaN NaN \n", "12 NaN NaN NaN \n", "13 NaN NaN NaN \n", "14 NaN NaN NaN \n", "15 NaN NaN NaN \n", "16 NaN NaN NaN \n", "17 NaN NaN NaN " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountryCodeCountryNameLongitudeLatitudeTimeGeneratedTypeIpAddressStateCity
0USUnited States-97.822037.75102022-06-08 19:38:44.907125geolocation172.217.15.99NaNNaN
1CACanada-79.362343.65472022-06-08 19:38:44.907125geolocation40.85.232.64OntarioToronto
2USUnited States-76.854537.92732022-06-08 19:38:44.907125geolocation20.38.98.100VirginiaTappahannock
3USUnited States-76.854537.92732022-06-08 19:38:44.908124geolocation23.96.64.84VirginiaTappahannock
4USUnited States-78.375036.65342022-06-08 19:38:44.908124geolocation65.55.44.108VirginiaBoydton
5USUnited States-122.125747.67222022-06-08 19:38:44.909124geolocation131.107.147.209WashingtonRedmond
6NaNPrivate addressNaNNaN2022-06-08 19:38:44.909124geolocation10.0.3.4NaNlocation unknown
7NaNPrivate addressNaNNaN2022-06-08 19:38:44.909124geolocation10.0.3.5NaNlocation unknown
8USUnited States-76.854537.92732022-06-08 19:38:44.909124geolocation13.82.152.48VirginiaTappahannock
\n", "
" ], "text/plain": [ " CountryCode CountryName Longitude Latitude \\\n", "0 US United States -97.8220 37.7510 \n", "1 CA Canada -79.3623 43.6547 \n", "2 US United States -76.8545 37.9273 \n", "3 US United States -76.8545 37.9273 \n", "4 US United States -78.3750 36.6534 \n", "5 US United States -122.1257 47.6722 \n", "6 NaN Private address NaN NaN \n", "7 NaN Private address NaN NaN \n", "8 US United States -76.8545 37.9273 \n", "\n", " TimeGenerated Type IpAddress State \\\n", "0 2022-06-08 19:38:44.907125 geolocation 172.217.15.99 NaN \n", "1 2022-06-08 19:38:44.907125 geolocation 40.85.232.64 Ontario \n", "2 2022-06-08 19:38:44.907125 geolocation 20.38.98.100 Virginia \n", "3 2022-06-08 19:38:44.908124 geolocation 23.96.64.84 Virginia \n", "4 2022-06-08 19:38:44.908124 geolocation 65.55.44.108 Virginia \n", "5 2022-06-08 19:38:44.909124 geolocation 131.107.147.209 Washington \n", "6 2022-06-08 19:38:44.909124 geolocation 10.0.3.4 NaN \n", "7 2022-06-08 19:38:44.909124 geolocation 10.0.3.5 NaN \n", "8 2022-06-08 19:38:44.909124 geolocation 13.82.152.48 Virginia \n", "\n", " City \n", "0 NaN \n", "1 Toronto \n", "2 Tappahannock \n", "3 Tappahannock \n", "4 Boydton \n", "5 Redmond \n", "6 location unknown \n", "7 location unknown \n", "8 Tappahannock " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(IpAddress.util.ip_type(data=ip_df1, input_col=\"AllExtIPs\"))\n", "display(IpAddress.util.ip_type(data=ip_df1, ip=\"AllExtIPs\"))\n", "display(IpAddress.util.whois(data=ip_df1, column=\"AllExtIPs\"))\n", "display(IpAddress.util.geoloc(data=ip_df1, src_col=\"AllExtIPs\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Joining input to output data\n", "\n", "You might want to return a data set that is joined to your input set.\n", "To do that use the \"join\" parameter.\n", "\n", "The value of join can be:\n", "- inner\n", "- left\n", "- right\n", "- outer\n", "\n", "To preserve all rows from the input, use a \"left\" 
join.\n", "To keep only rows that have a valid result from the function, use \"inner\" or \"right\".\n", "\n", "> Note: while most functions return only a single output row for each input row,
\n", "> some return multiple rows. Be cautious using \"outer\" in these cases." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AllExtIPsCountryCodeCountryNameLongitudeLatitudeTimeGeneratedTypeIpAddressStateCity
0172.217.15.99USUnited States-97.822037.75102022-06-08 23:18:07.276418geolocation172.217.15.99NaNNaN
140.85.232.64CACanada-79.362343.65472022-06-08 23:18:07.277418geolocation40.85.232.64OntarioToronto
220.38.98.100USUnited States-76.854537.92732022-06-08 23:18:07.277418geolocation20.38.98.100VirginiaTappahannock
323.96.64.84USUnited States-76.854537.92732022-06-08 23:18:07.277418geolocation23.96.64.84VirginiaTappahannock
465.55.44.108USUnited States-78.375036.65342022-06-08 23:18:07.278418geolocation65.55.44.108VirginiaBoydton
5131.107.147.209USUnited States-122.125747.67222022-06-08 23:18:07.278418geolocation131.107.147.209WashingtonRedmond
610.0.3.4NaNPrivate addressNaNNaN2022-06-08 23:18:07.278418geolocation10.0.3.4NaNlocation unknown
710.0.3.5NaNPrivate addressNaNNaN2022-06-08 23:18:07.278418geolocation10.0.3.5NaNlocation unknown
813.82.152.48USUnited States-76.854537.92732022-06-08 23:18:07.278418geolocation13.82.152.48VirginiaTappahannock
\n", "
" ], "text/plain": [ " AllExtIPs CountryCode CountryName Longitude Latitude \\\n", "0 172.217.15.99 US United States -97.8220 37.7510 \n", "1 40.85.232.64 CA Canada -79.3623 43.6547 \n", "2 20.38.98.100 US United States -76.8545 37.9273 \n", "3 23.96.64.84 US United States -76.8545 37.9273 \n", "4 65.55.44.108 US United States -78.3750 36.6534 \n", "5 131.107.147.209 US United States -122.1257 47.6722 \n", "6 10.0.3.4 NaN Private address NaN NaN \n", "7 10.0.3.5 NaN Private address NaN NaN \n", "8 13.82.152.48 US United States -76.8545 37.9273 \n", "\n", " TimeGenerated Type IpAddress State \\\n", "0 2022-06-08 23:18:07.276418 geolocation 172.217.15.99 NaN \n", "1 2022-06-08 23:18:07.277418 geolocation 40.85.232.64 Ontario \n", "2 2022-06-08 23:18:07.277418 geolocation 20.38.98.100 Virginia \n", "3 2022-06-08 23:18:07.277418 geolocation 23.96.64.84 Virginia \n", "4 2022-06-08 23:18:07.278418 geolocation 65.55.44.108 Virginia \n", "5 2022-06-08 23:18:07.278418 geolocation 131.107.147.209 Washington \n", "6 2022-06-08 23:18:07.278418 geolocation 10.0.3.4 NaN \n", "7 2022-06-08 23:18:07.278418 geolocation 10.0.3.5 NaN \n", "8 2022-06-08 23:18:07.278418 geolocation 13.82.152.48 Virginia \n", "\n", " City \n", "0 NaN \n", "1 Toronto \n", "2 Tappahannock \n", "3 Tappahannock \n", "4 Boydton \n", "5 Redmond \n", "6 location unknown \n", "7 location unknown \n", "8 Tappahannock " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "display(IpAddress.util.geoloc(data=ip_df1, src_col=\"AllExtIPs\", join=\"left\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DataQuery Pivot functions\n", "\n", "A significant difference between the functions that we've seen so far\n", "and data query functions is that the latter **do not accept generic parameter names.**\n", "\n", "When you use a named parameter in a data query pivot, you must specify\n", "the name that the query function is expecting. 
If in doubt, append \"?\" to the function name to show the function help.\n", "\n", "Example:\n", "```\n", " Host.MSSentinel.wevt_events_by_id?\n", "```" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connecting... connected\n" ] }, { "data": { "text/html": [ "\n", " \n", "
\n", " FKAH67GNV Copy code to clipboard and authenticate\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "qry_prov2 = QueryProvider(\"MSSentinel\")\n", "qry_prov2.connect(workspace=\"CyberSecuritySOC\")" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSentinel.VMComputer_vmcomputer',\n", " 'MSSentinel.auditd_auditd_all',\n", " 'MSSentinel.az_nsg_interface',\n", " 'MSSentinel.az_nsg_net_flows',\n", " 'MSSentinel.az_nsg_net_flows_depr',\n", " 'MSSentinel.heartbeat',\n", " 'MSSentinel.heartbeat_for_host_depr',\n", " 'MSSentinel.sec_alerts',\n", " 'MSSentinel.sent_bookmarks',\n", " 'MSSentinel.syslog_all_syslog',\n", " 'MSSentinel.syslog_cron_activity',\n", " 'MSSentinel.syslog_logon_failures',\n", " 'MSSentinel.syslog_logons',\n", " 'MSSentinel.syslog_squid_activity',\n", " 'MSSentinel.syslog_sudo_activity',\n", " 'MSSentinel.syslog_user_group_activity',\n", " 'MSSentinel.syslog_user_logon',\n", " 'MSSentinel.wevt_all_events',\n", " 'MSSentinel.wevt_events_by_id',\n", " 'MSSentinel.wevt_get_process_tree',\n", " 'MSSentinel.wevt_list_other_events',\n", " 'MSSentinel.wevt_logon_attempts',\n", " 'MSSentinel.wevt_logon_failures',\n", " 'MSSentinel.wevt_logon_session',\n", " 'MSSentinel.wevt_logons',\n", " 'MSSentinel.wevt_parent_process',\n", " 'MSSentinel.wevt_process_session',\n", " 'MSSentinel.wevt_processes',\n", " 'MSSentinel_cybersecuritysoc.VMComputer_vmcomputer',\n", " 'MSSentinel_cybersecuritysoc.auditd_auditd_all',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_interface',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_net_flows',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_net_flows_depr',\n", " 'MSSentinel_cybersecuritysoc.heartbeat',\n", " 'MSSentinel_cybersecuritysoc.heartbeat_for_host_depr',\n", " 'MSSentinel_cybersecuritysoc.sec_alerts',\n", " 'MSSentinel_cybersecuritysoc.sent_bookmarks',\n", " 
'MSSentinel_cybersecuritysoc.syslog_all_syslog',\n", " 'MSSentinel_cybersecuritysoc.syslog_cron_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_logon_failures',\n", " 'MSSentinel_cybersecuritysoc.syslog_logons',\n", " 'MSSentinel_cybersecuritysoc.syslog_squid_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_sudo_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_user_group_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_user_logon',\n", " 'MSSentinel_cybersecuritysoc.wevt_all_events',\n", " 'MSSentinel_cybersecuritysoc.wevt_events_by_id',\n", " 'MSSentinel_cybersecuritysoc.wevt_get_process_tree',\n", " 'MSSentinel_cybersecuritysoc.wevt_list_other_events',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_attempts',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_failures',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_session',\n", " 'MSSentinel_cybersecuritysoc.wevt_logons',\n", " 'MSSentinel_cybersecuritysoc.wevt_parent_process',\n", " 'MSSentinel_cybersecuritysoc.wevt_process_session',\n", " 'MSSentinel_cybersecuritysoc.wevt_processes',\n", " 'RiskIQ.articles',\n", " 'RiskIQ.artifacts',\n", " 'RiskIQ.certificates',\n", " 'RiskIQ.components',\n", " 'RiskIQ.cookies',\n", " 'RiskIQ.hostpair_children',\n", " 'RiskIQ.hostpair_parents',\n", " 'RiskIQ.malware',\n", " 'RiskIQ.projects',\n", " 'RiskIQ.reputation',\n", " 'RiskIQ.resolutions',\n", " 'RiskIQ.summary',\n", " 'RiskIQ.trackers',\n", " 'RiskIQ.whois',\n", " 'dns_is_resolvable',\n", " 'dns_resolve',\n", " 'util.dns_components',\n", " 'util.dns_in_abuse_list',\n", " 'util.dns_is_resolvable',\n", " 'util.dns_resolve',\n", " 'util.dns_validate_tld']" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.pivots()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSentinel.VMComputer_vmcomputer',\n", " 'MSSentinel.auditd_auditd_all',\n", " 'MSSentinel.az_nsg_interface',\n", " 
'MSSentinel.az_nsg_net_flows',\n", " 'MSSentinel.az_nsg_net_flows_depr',\n", " 'MSSentinel.heartbeat',\n", " 'MSSentinel.heartbeat_for_host_depr',\n", " 'MSSentinel.sec_alerts',\n", " 'MSSentinel.sent_bookmarks',\n", " 'MSSentinel.syslog_all_syslog']" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.pivots()[:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting time parameters for queries interactively\n", "\n", "Use the `edit_query_time` function to set or change the time range used by queries.\n", "\n", "With no parameters it defaults to a period of \\[*UtcNow - 1 day*\\] to \\[*UtcNow*\\].\n", "\n", "Alternatively, pass a TimeSpan object to pre-populate the time range shown in the editor.\n", "Changes that you make to the time range take effect immediately." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on method edit_query_time in module msticpy.init.pivot:\n", "\n", "edit_query_time(timespan: Optional[msticpy.common.timespan.TimeSpan] = None) method of msticpy.init.pivot.Pivot instance\n", " Display a QueryTime widget to get the timespan.\n", " \n", " Parameters\n", " ----------\n", " timespan : Optional[TimeSpan], optional\n", " Pre-populate the timespan shown by the QueryTime editor,\n", " by default None\n", "\n" ] } ], "source": [ "help(mp.pivot.edit_query_time)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "25fe2dc79c034eb7b7e10ffcff3d5e2a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='

Set time range for pivot functions.

'), HBox(children=(DatePicker(value=dat…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mp.pivot.edit_query_time()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting the timespan programmatically\n", "You can also just set the timespan directly on the pivot object" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "from msticpy.common.timespan import TimeSpan\n", "ts = TimeSpan(start=\"2020-10-01\", period=\"1d\")\n", "mp.pivot.timespan = ts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What queries do we have?" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sec_alerts function\n", "VMComputer_vmcomputer function\n", "sent_bookmarks function\n", "az_nsg_net_flows_depr function\n", "az_nsg_interface function\n", "heartbeat function\n", "az_nsg_net_flows function\n", "heartbeat_for_host_depr function\n", "auditd_auditd_all function\n", "syslog_sudo_activity function\n", "syslog_cron_activity function\n", "syslog_user_group_activity function\n", "syslog_all_syslog function\n", "syslog_squid_activity function\n", "syslog_user_logon function\n", "syslog_logons function\n", "syslog_logon_failures function\n", "wevt_all_events function\n", "wevt_events_by_id function\n", "wevt_list_other_events function\n", "wevt_logon_session function\n", "wevt_logons function\n", "wevt_logon_failures function\n", "wevt_logon_attempts function\n", "wevt_processes function\n", "wevt_get_process_tree function\n", "wevt_parent_process function\n", "wevt_process_session function\n" ] } ], "source": [ "Host.MSSentinel()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TenantIdSourceSystemTimeGeneratedMGManagementGroupNameSourceComputerIdComputerIPComputerCategoryOSTypeOSNameOSMajorVersionOSMinorVersionVersionSCAgentChannelIsGatewayInstalledRemoteIPLongitudeRemoteIPLatitudeRemoteIPCountrySubscriptionIdResourceGroupResourceProviderResourceResourceIdResourceTypeComputerEnvironmentSolutionsVMUUIDComputerPrivateIPsType_ResourceId
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [TenantId, SourceSystem, TimeGenerated, MG, ManagementGroupName, SourceComputerId, ComputerIP, Computer, Category, OSType, OSName, OSMajorVersion, OSMinorVersion, Version, SCAgentChannel, IsGatewayInstalled, RemoteIPLongitude, RemoteIPLatitude, RemoteIPCountry, SubscriptionId, ResourceGroup, ResourceProvider, Resource, ResourceId, ResourceType, ComputerEnvironment, Solutions, VMUUID, ComputerPrivateIPs, Type, _ResourceId]\n", "Index: []" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "host = Host(HostName=\"VictimPc\")\n", "Host.MSSentinel.heartbeat(host)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TenantIdAccountEventIDTimeGeneratedSourceComputerIdComputerSubjectUserNameSubjectDomainNameSubjectUserSidTargetUserNameTargetDomainNameTargetUserSidTargetLogonIdLogonProcessNameLogonTypeLogonTypeNameAuthenticationPackageNameStatusIpAddressWorkstationNameTimeCreatedUtc
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [TenantId, Account, EventID, TimeGenerated, SourceComputerId, Computer, SubjectUserName, SubjectDomainName, SubjectUserSid, TargetUserName, TargetDomainName, TargetUserSid, TargetLogonId, LogonProcessName, LogonType, LogonTypeName, AuthenticationPackageName, Status, IpAddress, WorkstationName, TimeCreatedUtc]\n", "Index: []" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.MSSentinel.wevt_logons(host_name=\"VictimPc\").head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding additional parameters\n", "\n", "The example below shows using the host entity as an initial parameter\n", "(Pivot is using the attribute mapping assign the `host_name` function parameter the value of `host.fqdn`).\n", "\n", "The second parameter is a list of event IDs specified explicitly." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mSignature:\u001b[0m \u001b[0mHost\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mMSSentinel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mwevt_logons\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m->\u001b[0m \u001b[0mUnion\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mpandas\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcore\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mframe\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mAny\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mDocstring:\u001b[0m\n", "Retrieves the logon events on the host\n", "\n", "Parameters\n", "----------\n", "add_query_items: str (optional)\n", " Additional query clauses\n", "end: datetime\n", " Query end time\n", "event_filter: str (optional)\n", " Event subset\n", " (default value 
is: | where EventID == 4624)\n", "host_name: str\n", " Name of host\n", "query_project: str (optional)\n", " Column project statement\n", " (default value is: | project TenantId, Account, EventID, TimeGenerat...)\n", "start: datetime\n", " Query start time\n", "subscription_filter: str (optional)\n", " Optional subscription/tenant filter expression\n", " (default value is: true)\n", "table: str (optional)\n", " Table name\n", " (default value is: SecurityEvent)\n", "\u001b[1;31mFile:\u001b[0m f:\\anaconda\\envs\\msticpy\\lib\\functools.py\n", "\u001b[1;31mType:\u001b[0m function\n" ] } ], "source": [ "Host.MSSentinel.wevt_logons?" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Computer
EventIDActivity
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [Computer]\n", "Index: []" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", " Host.MSSentinel.wevt_events_by_id( # Pivot query returns DataFrame\n", " host, event_list=[4624, 4625, 4672]\n", " )\n", " [[\"Computer\", \"EventID\", \"Activity\"]] # we could have save the output to a dataframe\n", " .groupby([\"EventID\", \"Activity\"]) # variable but we can also use pandas\n", " .count() # functions/syntax directly on the output\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using iterables as parameters to data queries\n", "\n", "Some data queries accept \"list\" items as parameters (e.g. many of the IP queries accept a\n", "list of IP addresses). These work as expected, with a single query calling sending the whole list\n", "as a single parameter." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mSignature:\u001b[0m \u001b[0mIpAddress\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mMSSentinel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0maad_signins\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m->\u001b[0m \u001b[0mUnion\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mpandas\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcore\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mframe\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mAny\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mDocstring:\u001b[0m\n", "Lists Azure AD Signins for an IP Address\n", "\n", "Parameters\n", "----------\n", "add_query_items: str (optional)\n", " Additional query clauses\n", "end: datetime\n", " Query end time\n", "ip_address_list: list\n", " The IP Address or list of Addresses\n", 
"start: datetime\n", " Query start time\n", "table: str (optional)\n", " Table name\n", " (default value is: SigninLogs)\n", "\u001b[1;31mFile:\u001b[0m f:\\anaconda\\envs\\msticpy\\lib\\functools.py\n", "\u001b[1;31mType:\u001b[0m function\n" ] } ], "source": [ "ip_list = [\n", " \"203.23.68.64\",\n", " \"67.10.68.45\",\n", " \"182.69.173.164\",\n", " \"79.176.167.161\",\n", " \"167.220.197.230\",\n", "]\n", "\n", "IpAddress.MSSentinel.aad_signins?" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "pandas.core.frame.DataFrame" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(IpAddress.MSSentinel.aad_signins(ip_address_list=ip_list).head(5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using iterable values where the query function was designed to only accept single values\n", "\n", "In this case the pivot function will iterate through the values of the\n", "iterable, making a separate query for each and then joining the results.\n", "\n", "We can see that this function only accepts a single value for \"account_name\"." 
] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mSignature:\u001b[0m \u001b[0mAccount\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mMSSentinel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0maad_signins\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m->\u001b[0m \u001b[0mUnion\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mpandas\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcore\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mframe\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mAny\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mDocstring:\u001b[0m\n", "Lists Azure AD Signins for Account\n", "\n", "Parameters\n", "----------\n", "account_id: str (optional)\n", " Azure user ID to find\n", " (default value is: !!DEFAULT!!)\n", "account_name: str (optional)\n", " The account name to find\n", " (default value is: !!DEFAULT!!)\n", "add_query_items: str (optional)\n", " Additional query clauses\n", "end: datetime\n", " Query end time\n", "start: datetime\n", " Query start time\n", "table: str (optional)\n", " Table name\n", " (default value is: SigninLogs)\n", "\u001b[1;31mFile:\u001b[0m f:\\anaconda\\envs\\msticpy\\lib\\functools.py\n", "\u001b[1;31mType:\u001b[0m function\n" ] } ], "source": [ "Account.MSSentinel.aad_signins?" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TenantIdSourceSystemTimeGeneratedResourceIdOperationNameOperationVersionCategoryResultTypeResultSignatureResultDescriptionDurationMsCorrelationIdResourceResourceGroupResourceProviderIdentityLevelLocationAlternateSignInNameAppDisplayNameAppIdAuthenticationContextClassReferencesAuthenticationDetailsAuthenticationMethodsUsedAuthenticationProcessingDetails...RiskStateResourceDisplayNameResourceIdentityResourceServicePrincipalIdServicePrincipalIdServicePrincipalNameStatusTokenIssuerNameTokenIssuerTypeUserAgentUserDisplayNameUserIdUserPrincipalNameAADTenantIdUserTypeFlaggedForReviewIPAddressFromResourceProviderSignInIdentifierSignInIdentifierTypeResourceTenantIdHomeTenantIdUniqueTokenIdentifierSessionLifetimePoliciesAutonomousSystemNumberType
\n", "

0 rows × 71 columns

\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [TenantId, SourceSystem, TimeGenerated, ResourceId, OperationName, OperationVersion, Category, ResultType, ResultSignature, ResultDescription, DurationMs, CorrelationId, Resource, ResourceGroup, ResourceProvider, Identity, Level, Location, AlternateSignInName, AppDisplayName, AppId, AuthenticationContextClassReferences, AuthenticationDetails, AuthenticationMethodsUsed, AuthenticationProcessingDetails, AuthenticationRequirement, AuthenticationRequirementPolicies, ClientAppUsed, ConditionalAccessPolicies, ConditionalAccessStatus, CreatedDateTime, DeviceDetail, IsInteractive, Id, IPAddress, IsRisky, LocationDetails, MfaDetail, NetworkLocationDetails, OriginalRequestId, ProcessingTimeInMilliseconds, RiskDetail, RiskEventTypes, RiskEventTypes_V2, RiskLevelAggregated, RiskLevelDuringSignIn, RiskState, ResourceDisplayName, ResourceIdentity, ResourceServicePrincipalId, ServicePrincipalId, ServicePrincipalName, Status, TokenIssuerName, TokenIssuerType, UserAgent, UserDisplayName, UserId, UserPrincipalName, AADTenantId, UserType, FlaggedForReview, IPAddressFromResourceProvider, SignInIdentifier, SignInIdentifierType, ResourceTenantId, HomeTenantId, UniqueTokenIdentifier, SessionLifetimePolicies, AutonomousSystemNumber, Type]\n", "Index: []\n", "\n", "[0 rows x 71 columns]" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "accounts = [\n", " \"ofshezaf\",\n", " \"moshabi\",\n", "]\n", "Account.MSSentinel.aad_signins(account_name=accounts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Combining multiple iterables and single-valued parameters\n", "\n", "The same rules as outline earlier for multiple parameters of different types apply to data queries" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
UserPrincipalNameIdentity
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [UserPrincipalName, Identity]\n", "Index: []" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "project = \"| project UserPrincipalName, Identity\"\n", "Account.MSSentinel.aad_signins(account_name=accounts, add_query_items=project)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using DataFrames as input\n", "\n", "This is similar to using dataframes for other pivot functions.\n", "\n", "We must use the `data` parameter to specify the input dataframe.\n", "You supply the column name from your input dataframe as the value of\n", "the parameters expected by the function." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
User
0ofshezaf
1moshabi
\n", "
" ], "text/plain": [ " User\n", "0 ofshezaf\n", "1 moshabi" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "account_df = pd.DataFrame(accounts, columns=[\"User\"])\n", "display(account_df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have our dataframe:\n", "\n", "- we specify `account_df` as the value of the `data` parameter.\n", "- in our source (input) dataframe, the column that we want to use as the input value for each query is `User`\n", "- we specify that column name as the value of the function parameter\n", "\n", "On each iteration, the column value from a subsequent row will be extracted and \n", "given as the parameter value for the function parameter.\n", "\n", "> Note:
\n", "> If the function parameter type is a \"list\" type - i.e. it expects a list of values
\n", "> the parameter value will be sent as a list and only a single query is executed.
\n", "> If the query function has multiple \"list\" type parameters, these will be
\n", "> populated in the same way.\n", "\n", "> Note2:
\n", "> If you have multiple parameters fed by multiple input columns AND one or more
\n", "> of the function parameters *is not* a list type, the query will be broken
\n", "> into queries for each row. Each sub-query getting its values from a single row
\n", "> of the input dataframe." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
UserPrincipalNameIdentity
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [UserPrincipalName, Identity]\n", "Index: []" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Account.MSSentinel.aad_signins(data=account_df, account_name=\"User\", add_query_items=project)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Threat Intelligence Lookups\n", "\n", "These work in the same way as the functions described earlier. However,\n", "there are a few peculiarities of the Threat Intel functions:\n", "\n", "### IPV4 and IPV6\n", "Some providers treat these interchangably and use the same endpoint for both.\n", "Other providers do not explicitly support IPV6 (e.g. the Tor exit nodes provider).\n", "Still others (notably OTX) use different endpoints for IPv4 and IPv6.\n", "\n", "If you are querying IPv4 you can use either the `lookup_ip` function or one\n", "of the `lookup_ipv4` functions. In most cases, you can also use these functions\n", "for a mixture of IPv4 and v6 addresses. However, in cases where a provider\n", "does not support IPv6 or uses a different endpoint for IPv6 queries you\n", "will get no responses.\n", "\n", "### Entity mapping to IoC Types\n", "This table shows the mapping between and entity type\n", "and IoC Types:\n", "\n", "| Entity | IoCType |\n", "| :--------- | :----------------- |\n", "| IpAddress | ipv4, ipv6 |\n", "| Dns | domain |\n", "| File | filehash (incl |\n", "| | md5, sha1, sha256) |\n", "| Url | url |\n", "\n", "
\n", "\n", "> Note: Where you are using a File entity as a parameter, there is a complication.
\n", "> A file entity can have multiple hash values (md5, sha1, sha256 and even sha256 authenticode).
\n", "> The `file_hash` attribute of File is used as the default parameter.
\n", "> In cases where a file has multiple hashes the highest priority hash (in order
\n", "> sha256, sha1, md5, sha256ac) is returned.
\n", "> If you are not using file entities as parameters (and specifying the input values
\n", "> explicitly or via a Dataframe or iterable) you can ignore this." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "lookup_ip function\n", "lookup_ipv4 function\n", "lookup_ipv6 function\n" ] } ], "source": [ "IpAddress.ti()" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IocIocTypeSafeIocQuerySubtypeProviderResultSeverityDetailsRawResultReferenceStatus
0fkksjobnn43.orgdnsfkksjobnn43.orgNoneOTXTruehigh{'pulse_count': 36, 'names': ['Vertek - Jaff Ransomware', 'Jaff - Malware Domain Feed V2', 'Jaff...{'sections': ['general', 'geo', 'url_list', 'passive_dns', 'malware', 'whois', 'http_scans'], 'w...https://otx.alienvault.com/api/v1/indicators/domain/fkksjobnn43.org/general0
0fkksjobnn43.orgdnsNoneOPRTruewarning{'rank': None, 'error': 'Domain not found'}{'status_code': 404, 'error': 'Domain not found', 'page_rank_integer': 0, 'page_rank_decimal': 0...https://openpagerank.com/api/v1.0/getPageRank?domains[0]=fkksjobnn43.org0
0fkksjobnn43.orgdnsfkksjobnn43.orgNoneRiskIQTruehigh{'summary': {'resolutions': 7, 'certificates': 0, 'malware_hashes': 79, 'projects': 0, 'articles...{'summary': {'resolutions': 7, 'certificates': 0, 'malware_hashes': 79, 'projects': 0, 'articles...https://community.riskiq.com0
0fkksjobnn43.orgdnsfkksjobnn43.orgNoneVirusTotalTrueinformation{'verbose_msg': 'Domain found in dataset', 'response_code': 1, 'positives': 0, 'detected_urls': ...{'Sophos category': 'command and control', 'undetected_downloaded_samples': [], 'whois_timestamp...https://www.virustotal.com/vtapi/v2/domain/report0
0fkksjobnn43.orgdnsfkksjobnn43.orgNoneXForceFalseinformationAuthorization failed. Check account and key details.<Response [401 Unauthorized]>https://api.xforce.ibmcloud.com/url/fkksjobnn43.org401
\n", "
" ], "text/plain": [ " Ioc IocType SafeIoc QuerySubtype Provider Result \\\n", "0 fkksjobnn43.org dns fkksjobnn43.org None OTX True \n", "0 fkksjobnn43.org dns None OPR True \n", "0 fkksjobnn43.org dns fkksjobnn43.org None RiskIQ True \n", "0 fkksjobnn43.org dns fkksjobnn43.org None VirusTotal True \n", "0 fkksjobnn43.org dns fkksjobnn43.org None XForce False \n", "\n", " Severity \\\n", "0 high \n", "0 warning \n", "0 high \n", "0 information \n", "0 information \n", "\n", " Details \\\n", "0 {'pulse_count': 36, 'names': ['Vertek - Jaff Ransomware', 'Jaff - Malware Domain Feed V2', 'Jaff... \n", "0 {'rank': None, 'error': 'Domain not found'} \n", "0 {'summary': {'resolutions': 7, 'certificates': 0, 'malware_hashes': 79, 'projects': 0, 'articles... \n", "0 {'verbose_msg': 'Domain found in dataset', 'response_code': 1, 'positives': 0, 'detected_urls': ... \n", "0 Authorization failed. Check account and key details. \n", "\n", " RawResult \\\n", "0 {'sections': ['general', 'geo', 'url_list', 'passive_dns', 'malware', 'whois', 'http_scans'], 'w... \n", "0 {'status_code': 404, 'error': 'Domain not found', 'page_rank_integer': 0, 'page_rank_decimal': 0... \n", "0 {'summary': {'resolutions': 7, 'certificates': 0, 'malware_hashes': 79, 'projects': 0, 'articles... \n", "0 {'Sophos category': 'command and control', 'undetected_downloaded_samples': [], 'whois_timestamp... 
\n", "0 \n", "\n", " Reference \\\n", "0 https://otx.alienvault.com/api/v1/indicators/domain/fkksjobnn43.org/general \n", "0 https://openpagerank.com/api/v1.0/getPageRank?domains[0]=fkksjobnn43.org \n", "0 https://community.riskiq.com \n", "0 https://www.virustotal.com/vtapi/v2/domain/report \n", "0 https://api.xforce.ibmcloud.com/url/fkksjobnn43.org \n", "\n", " Status \n", "0 0 \n", "0 0 \n", "0 0 \n", "0 0 \n", "0 401 " ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Dns.ti.lookup_dns(value=\"fkksjobnn43.org\")" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IocIocTypeSafeIocQuerySubtypeProviderResultSeverityDetailsRawResultReferenceStatus
002a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbdsha256_hash02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbdNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 51, 'res...{'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware2'...https://www.virustotal.com/vtapi/v2/file/report0
106b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdafsha256_hash06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdafNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 55, 'res...{'scans': {'Bkav': {'detected': False, 'version': '1.3.0.9899', 'result': None, 'update': '20201...https://www.virustotal.com/vtapi/v2/file/report0
206c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ffsha256_hash06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ffNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 53, 'res...{'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware1'...https://www.virustotal.com/vtapi/v2/file/report0
\n", "
" ], "text/plain": [ " Ioc \\\n", "0 02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd \n", "1 06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf \n", "2 06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff \n", "\n", " IocType \\\n", "0 sha256_hash \n", "1 sha256_hash \n", "2 sha256_hash \n", "\n", " SafeIoc \\\n", "0 02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd \n", "1 06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf \n", "2 06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff \n", "\n", " QuerySubtype Provider Result Severity \\\n", "0 None VirusTotal True high \n", "1 None VirusTotal True high \n", "2 None VirusTotal True high \n", "\n", " Details \\\n", "0 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 51, 'res... \n", "1 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 55, 'res... \n", "2 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 53, 'res... \n", "\n", " RawResult \\\n", "0 {'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware2'... \n", "1 {'scans': {'Bkav': {'detected': False, 'version': '1.3.0.9899', 'result': None, 'update': '20201... \n", "2 {'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware1'... 
\n", "\n", " Reference Status \n", "0 https://www.virustotal.com/vtapi/v2/file/report 0 \n", "1 https://www.virustotal.com/vtapi/v2/file/report 0 \n", "2 https://www.virustotal.com/vtapi/v2/file/report 0 " ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hashes = [\n", " \"02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd\",\n", " \"06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf\",\n", " \"06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff\",\n", "]\n", "\n", "File.ti.lookup_file_hash_VirusTotal(hashes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Lookup from a DataFrame\n", "\n", "To specify the source column you can use either \"column\" or \"obs_column\"" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
hashrefdesc
002a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbditem_0stuff
106b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdafitem_1stuff
206c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ffitem_2stuff
\n", "
" ], "text/plain": [ " hash ref \\\n", "0 02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd item_0 \n", "1 06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf item_1 \n", "2 06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff item_2 \n", "\n", " desc \n", "0 stuff \n", "1 stuff \n", "2 stuff " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IocIocTypeSafeIocQuerySubtypeProviderResultSeverityDetailsRawResultReferenceStatus
002a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbdsha256_hash02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbdNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 51, 'res...{'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware2'...https://www.virustotal.com/vtapi/v2/file/report0
106b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdafsha256_hash06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdafNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 55, 'res...{'scans': {'Bkav': {'detected': False, 'version': '1.3.0.9899', 'result': None, 'update': '20201...https://www.virustotal.com/vtapi/v2/file/report0
206c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ffsha256_hash06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ffNoneVirusTotalTruehigh{'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 53, 'res...{'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware1'...https://www.virustotal.com/vtapi/v2/file/report0
\n", "
" ], "text/plain": [ " Ioc \\\n", "0 02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd \n", "1 06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf \n", "2 06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff \n", "\n", " IocType \\\n", "0 sha256_hash \n", "1 sha256_hash \n", "2 sha256_hash \n", "\n", " SafeIoc \\\n", "0 02a7977d1faf7bfc93a4b678a049c9495ea663e7065aa5a6caf0f69c5ff25dbd \n", "1 06b020a3fd3296bc4c7bf53307fe7b40638e7f445bdd43fac1d04547a429fdaf \n", "2 06c676bf8f5c6af99172c1cf63a84348628ae3f39df9e523c42447e2045e00ff \n", "\n", " QuerySubtype Provider Result Severity \\\n", "0 None VirusTotal True high \n", "1 None VirusTotal True high \n", "2 None VirusTotal True high \n", "\n", " Details \\\n", "0 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 51, 'res... \n", "1 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 55, 'res... \n", "2 {'verbose_msg': 'Scan finished, information embedded', 'response_code': 1, 'positives': 53, 'res... \n", "\n", " RawResult \\\n", "0 {'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware2'... \n", "1 {'scans': {'Bkav': {'detected': False, 'version': '1.3.0.9899', 'result': None, 'update': '20201... \n", "2 {'scans': {'Bkav': {'detected': True, 'version': '1.3.0.9899', 'result': 'W32.AIDetect.malware1'... 
\n", "\n", " Reference Status \n", "0 https://www.virustotal.com/vtapi/v2/file/report 0 \n", "1 https://www.virustotal.com/vtapi/v2/file/report 0 \n", "2 https://www.virustotal.com/vtapi/v2/file/report 0 " ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hashes_df = pd.DataFrame(\n", " [(fh, f\"item_{idx}\", \"stuff\") for idx, fh in enumerate(hashes)],\n", " columns=[\"hash\", \"ref\", \"desc\"],\n", ")\n", "display(hashes_df)\n", "File.ti.lookup_file_hash_VirusTotal(data=hashes_df, column=\"hash\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Chaining pivot and other functions\n", "\n", "Because pivot functions can take dataframes as inputs and return them\n", "as outputs, you can create chains of pivot functions.\n", "You can also add other items to the chain that input or output\n", "dataframes.\n", "\n", "For example, you could build a chain that included the following:\n", "- take IP addresses from firewall alerts\n", "- look up the IPs in Threat Intel providers, keeping only those that have high severity\n", "- look up any remote logon events sourced at those IPs\n", "- display a timeline of the logons\n", "\n", "To make building these types of pipelines easier we've implemented some\n", "pandas helper functions. These are available in the `mp_pivot`\n", "property of pandas DataFrames, once Pivot is imported.\n", "\n", "### mp_pivot.run\n", "\n", "`run` lets you run a pivot function as a pandas pipeline operation.\n", " \n", "Let's take an example of a simple pivot function using a dataframe as input:\n", "```\n", " IpAddress.util.whois(data=my_df, column=\"Ioc\")\n", "```\n", "\n", "We can use `mp_pivot.run` to do this:\n", "```\n", " (\n", " my_df\n", " .query(\"UserCount > 1\")\n", " .mp_pivot.run(IpAddress.util.whois, column=\"Ioc\")\n", " .drop_duplicates()\n", " )\n", "```\n", "The pandas extension takes care of the `data=my_df` parameter. 
We still have\n", "to add any other required parameters (like the column specification in this case).\n", "When it runs it returns its output as a DataFrame and the next operation\n", "(drop_duplicates()) runs on this output.\n", "\n", "Depending on the scenario you might want to preserve the existing dataframe\n", "contents (most of the pivot functions only return the results of their specific\n", "operation - e.g. whois returns ASN information for an IP address). You\n", "can carry the columns of the input dataframe over to the output from \n", "the pivot function by adding a `join` parameter to the mp_pivot.run() call.\n", "Use a \"left\" join to keep all of the input rows regardless of whether the pivot\n", "function returned a result for that row.\n", "Use an \"inner\" join to return only rows where the input had a positive result\n", "in the pivot function.\n", "```\n", " .mp_pivot.run(IpAddress.util.whois, column=\"Ioc\", join=\"inner\")\n", "```\n", "\n", "There are also a few convenience functions. These only work in\n", "an IPython/Jupyter environment.\n", "\n", "### mp_pivot.display\n", "\n", "`mp_pivot.display` will display the intermediate results of the dataframe in the middle\n", "of a pipeline. It does not change the data at all, but does give you the \n", "chance to display a view of the data partway through processing. This\n", "is useful for debugging but its main purpose is to give you a way to\n", "show partial results without having to break the pipeline into pieces\n", "and create unnecessary throw-away variables that will add bulk to your\n", "code and clutter to your memory.\n", "\n", "`display` supports some options that you can use to modify the displayed\n", "output:\n", "\n", "- title - displays a title above the data\n", "- cols - a list of columns to display (others are hidden)\n", "- query - you can filter the output using a df.query() string. 
See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html?highlight=query#pandas.DataFrame.query\n", " for more details\n", "- head - limits the display to the first `head` rows\n", "\n", "These options do not affect the data being passed through the pipeline -\n", "only how the intermediate output is displayed.\n", "\n", "### mp_pivot.tee\n", "`mp_pivot.tee` behaves a little like the Linux \"tee\" command. It allows the\n", "data to pass through unchanged but allows you to create a variable that\n", "is a snapshot of the data at that point in the pipeline. It takes\n", "a parameter `var_name` and assigns the current DataFrame instance\n", "to that name. So, when your pipeline has run you can access partial results (again,\n", "without having to break up your pipeline to do so).\n", "\n", "By default, it will not overwrite an existing variable of the same name\n", "unless you specify `clobber=True` in the call to `tee`.\n", "\n", "### mp_pivot.tee_exec\n", "`mp_pivot.tee_exec` behaves similarly to the \"tee\" function above except that it\n", "will try to execute a named DataFrame accessor function on the input\n", "DataFrame. The name of the function (as a string) can be passed as the value of the\n", "`df_func` named parameter, or as the first positional argument.\n", "The function **must** be a method of a pandas DataFrame - this includes\n", "built-in functions such as `.query`, `.sort_values` or a custom function\n", "added as a custom pd accessor function (see \n", "[Extending pandas](https://pandas.pydata.org/pandas-docs/stable/development/extending.html?highlight=accessor))\n", "\n", "`mp_pivot.tee_exec` allows the input\n", "data to pass through unchanged but will also send\n", "a snapshot of the data at that point in the pipeline to the named function.\n", "You can also pass arbitrary other named arguments to `tee_exec`. 
These arguments will be passed to the `df_func` function.\n", "\n", "### Example\n", "The example below shows the use of mp_pivot.run and mp_pivot.display.\n", "\n", "This takes an existing DataFrame - suspicious_ips_df - and:\n", "\n", "- displays the top 5 rows of the dataframe\n", "- checks for threat intelligence reports on any of the IP addresses\n", "- uses pandas `query` to filter only the high severity hits\n", "- calls the whois pivot function to obtain ownership information for these IPs\n", " (note that we join the results of the previous step here using `join='left'`\n", " so our output will be all TI result data plus whois data)\n", "- calls a pivot data query to check for Azure Active Directory logins that\n", " have an IP address source that matches any of these addresses.\n", " \n", "The final step uses another MSTICPy pandas extension to plot the login attempts\n", "on a timeline chart." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "suspicious_ips = [\n", " \"212.109.217.155\",\n", " \"103.141.68.38\",\n", " \"165.255.70.149\",\n", " \"206.1.228.141\",\n", "]\n", "suspicious_ips_df = pd.DataFrame(suspicious_ips, columns=[\"IPAddress\"])" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/html": [ "

Initial IPs 4

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IPAddress
0113.190.36.2
1118.163.135.17
2118.163.135.18
3118.163.97.19
4125.34.240.33
\n", "
" ], "text/plain": [ " IPAddress\n", "0 113.190.36.2\n", "1 118.163.135.17\n", "2 118.163.135.18\n", "3 118.163.97.19\n", "4 125.34.240.33" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "

TI High Severity IPs

" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IocIocTypeSafeIocQuerySubtypeProviderResultSeverityDetailsRawResultReferenceStatusnirasn_registryasnasn_cidrasn_country_codeasn_dateasn_descriptionquerynetsrawreferralraw_referral
0113.190.36.2ipv4113.190.36.2NoneOTXTruehigh{'pulse_count': 46, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio...{'whois': 'http://whois.domaintools.com/113.190.36.2', 'reputation': 0, 'indicator': '113.190.36...https://otx.alienvault.com/api/v1/indicators/IPv4/113.190.36.2/general0Noneapnic45899113.190.32.0/20VN2008-11-04VNPT-AS-VN VNPT Corp, VN113.190.36.2[{'cidr': '113.160.0.0/11', 'name': 'VNPT-VN', 'handle': 'PTH13-AP', 'range': '113.160.0.0 - 113...NoneNoneNone
1118.163.135.17ipv4118.163.135.17NoneOTXTruehigh{'pulse_count': 50, 'names': ['IOCs - 20201122247 - ANIA Threat Feeds - IP Segment 0', 'IOCs - 2...{'whois': 'http://whois.domaintools.com/118.163.135.17', 'reputation': 0, 'indicator': '118.163....https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.135.17/general0Noneapnic3462118.163.0.0/16TW2007-10-04HINET Data Communication Business Group, TW118.163.135.17[{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1...NoneNoneNone
2118.163.135.18ipv4118.163.135.18NoneOTXTruehigh{'pulse_count': 50, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio...{'whois': 'http://whois.domaintools.com/118.163.135.18', 'reputation': 0, 'indicator': '118.163....https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.135.18/general0Noneapnic3462118.163.0.0/16TW2007-10-04HINET Data Communication Business Group, TW118.163.135.18[{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1...NoneNoneNone
3118.163.97.19ipv4118.163.97.19NoneOTXTruehigh{'pulse_count': 50, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio...{'whois': 'http://whois.domaintools.com/118.163.97.19', 'reputation': 0, 'indicator': '118.163.9...https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.97.19/general0Noneapnic3462118.163.0.0/16TW2007-10-04HINET Data Communication Business Group, TW118.163.97.19[{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1...NoneNoneNone
4125.34.240.33ipv4125.34.240.33NoneOTXTruehigh{'pulse_count': 50, 'names': ['IOCs - 20223111348 - ANIA Threat Feeds - IP Segment 4', 'IOCs - 2...{'whois': 'http://whois.domaintools.com/125.34.240.33', 'reputation': 0, 'indicator': '125.34.24...https://otx.alienvault.com/api/v1/indicators/IPv4/125.34.240.33/general0Noneapnic4837125.34.240.0/24CN2006-01-09CHINA169-BACKBONE CHINA UNICOM China169 Backbone, CN125.34.240.33[{'cidr': '125.34.0.0/16', 'name': 'UNICOM-BJ', 'handle': 'CH1302-AP', 'range': '125.34.0.0 - 12...NoneNoneNone
\n", "
" ], "text/plain": [ " Ioc IocType SafeIoc QuerySubtype Provider Result \\\n", "0 113.190.36.2 ipv4 113.190.36.2 None OTX True \n", "1 118.163.135.17 ipv4 118.163.135.17 None OTX True \n", "2 118.163.135.18 ipv4 118.163.135.18 None OTX True \n", "3 118.163.97.19 ipv4 118.163.97.19 None OTX True \n", "4 125.34.240.33 ipv4 125.34.240.33 None OTX True \n", "\n", " Severity \\\n", "0 high \n", "1 high \n", "2 high \n", "3 high \n", "4 high \n", "\n", " Details \\\n", "0 {'pulse_count': 46, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio... \n", "1 {'pulse_count': 50, 'names': ['IOCs - 20201122247 - ANIA Threat Feeds - IP Segment 0', 'IOCs - 2... \n", "2 {'pulse_count': 50, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio... \n", "3 {'pulse_count': 50, 'names': ['Analysis Report (AR21-013A) - Strengthening Security Configuratio... \n", "4 {'pulse_count': 50, 'names': ['IOCs - 20223111348 - ANIA Threat Feeds - IP Segment 4', 'IOCs - 2... \n", "\n", " RawResult \\\n", "0 {'whois': 'http://whois.domaintools.com/113.190.36.2', 'reputation': 0, 'indicator': '113.190.36... \n", "1 {'whois': 'http://whois.domaintools.com/118.163.135.17', 'reputation': 0, 'indicator': '118.163.... \n", "2 {'whois': 'http://whois.domaintools.com/118.163.135.18', 'reputation': 0, 'indicator': '118.163.... \n", "3 {'whois': 'http://whois.domaintools.com/118.163.97.19', 'reputation': 0, 'indicator': '118.163.9... \n", "4 {'whois': 'http://whois.domaintools.com/125.34.240.33', 'reputation': 0, 'indicator': '125.34.24... 
\n", "\n", " Reference \\\n", "0 https://otx.alienvault.com/api/v1/indicators/IPv4/113.190.36.2/general \n", "1 https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.135.17/general \n", "2 https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.135.18/general \n", "3 https://otx.alienvault.com/api/v1/indicators/IPv4/118.163.97.19/general \n", "4 https://otx.alienvault.com/api/v1/indicators/IPv4/125.34.240.33/general \n", "\n", " Status nir asn_registry asn asn_cidr asn_country_code \\\n", "0 0 None apnic 45899 113.190.32.0/20 VN \n", "1 0 None apnic 3462 118.163.0.0/16 TW \n", "2 0 None apnic 3462 118.163.0.0/16 TW \n", "3 0 None apnic 3462 118.163.0.0/16 TW \n", "4 0 None apnic 4837 125.34.240.0/24 CN \n", "\n", " asn_date asn_description \\\n", "0 2008-11-04 VNPT-AS-VN VNPT Corp, VN \n", "1 2007-10-04 HINET Data Communication Business Group, TW \n", "2 2007-10-04 HINET Data Communication Business Group, TW \n", "3 2007-10-04 HINET Data Communication Business Group, TW \n", "4 2006-01-09 CHINA169-BACKBONE CHINA UNICOM China169 Backbone, CN \n", "\n", " query \\\n", "0 113.190.36.2 \n", "1 118.163.135.17 \n", "2 118.163.135.18 \n", "3 118.163.97.19 \n", "4 125.34.240.33 \n", "\n", " nets \\\n", "0 [{'cidr': '113.160.0.0/11', 'name': 'VNPT-VN', 'handle': 'PTH13-AP', 'range': '113.160.0.0 - 113... \n", "1 [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... \n", "2 [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... \n", "3 [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... \n", "4 [{'cidr': '125.34.0.0/16', 'name': 'UNICOM-BJ', 'handle': 'CH1302-AP', 'range': '125.34.0.0 - 12... \n", "\n", " raw referral raw_referral \n", "0 None None None \n", "1 None None None \n", "2 None None None \n", "3 None None None \n", "4 None None None " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", " \n", " Loading BokehJS ...\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": "\n(function(root) {\n function now() {\n return new Date();\n }\n\n const force = true;\n\n if (typeof root._bokeh_onload_callbacks === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n const JS_MIME_TYPE = 'application/javascript';\n const HTML_MIME_TYPE = 'text/html';\n const EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n const CLASS_NAME = 'output_bokeh rendered_html';\n\n /**\n * Render data to the DOM node\n */\n function render(props, node) {\n const script = document.createElement(\"script\");\n node.appendChild(script);\n }\n\n /**\n * Handle when an output is cleared or removed\n */\n function handleClearOutput(event, handle) {\n const cell = handle.cell;\n\n const id = cell.output_area._bokeh_element_id;\n const server_id = cell.output_area._bokeh_server_id;\n // Clean up Bokeh references\n if (id != null && id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n\n if (server_id !== undefined) {\n // Clean up Bokeh references\n const cmd_clean = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n cell.notebook.kernel.execute(cmd_clean, {\n iopub: {\n output: function(msg) {\n const id = msg.content.text.trim();\n if (id in Bokeh.index) {\n Bokeh.index[id].model.document.clear();\n delete Bokeh.index[id];\n }\n }\n }\n });\n // Destroy server and session\n const cmd_destroy = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n cell.notebook.kernel.execute(cmd_destroy);\n }\n }\n\n /**\n * Handle when a new output is added\n */\n function handleAddOutput(event, handle) {\n const output_area = handle.output_area;\n const output = handle.output;\n\n // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n if 
((output.output_type != \"display_data\") || (!Object.prototype.hasOwnProperty.call(output.data, EXEC_MIME_TYPE))) {\n return\n }\n\n const toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n\n if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n toinsert[toinsert.length - 1].firstChild.textContent = output.data[JS_MIME_TYPE];\n // store reference to embed id on output_area\n output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n }\n if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n const bk_div = document.createElement(\"div\");\n bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n const script_attrs = bk_div.children[0].attributes;\n for (let i = 0; i < script_attrs.length; i++) {\n toinsert[toinsert.length - 1].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n toinsert[toinsert.length - 1].firstChild.textContent = bk_div.children[0].textContent\n }\n // store reference to server id on output_area\n output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n }\n }\n\n function register_renderer(events, OutputArea) {\n\n function append_mime(data, metadata, element) {\n // create a DOM node to render to\n const toinsert = this.create_output_subarea(\n metadata,\n CLASS_NAME,\n EXEC_MIME_TYPE\n );\n this.keyboard_manager.register_events(toinsert);\n // Render to node\n const props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n render(props, toinsert[toinsert.length - 1]);\n element.append(toinsert);\n return toinsert\n }\n\n /* Handle when an output is cleared or removed */\n events.on('clear_output.CodeCell', handleClearOutput);\n events.on('delete.Cell', handleClearOutput);\n\n /* Handle when a new output is added */\n events.on('output_added.OutputArea', handleAddOutput);\n\n /**\n * Register the mime type and append_mime function with output_area\n */\n OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n /* Is output safe? 
*/\n safe: true,\n /* Index of renderer in `output_area.display_order` */\n index: 0\n });\n }\n\n // register the mime type if in Jupyter Notebook environment and previously unregistered\n if (root.Jupyter !== undefined) {\n const events = require('base/js/events');\n const OutputArea = require('notebook/js/outputarea').OutputArea;\n\n if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n register_renderer(events, OutputArea);\n }\n }\n\n \n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n const NB_LOAD_WARNING = {'data': {'text/html':\n \"
\\n\"+\n \"

\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"

\\n\"+\n \"
    \\n\"+\n \"
  • re-rerun `output_notebook()` to attempt to load from CDN again, or
  • \\n\"+\n \"
  • use INLINE resources instead, as so:
  • \\n\"+\n \"
\\n\"+\n \"\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"
\"}};\n\n function display_loaded() {\n const el = document.getElementById(\"1002\");\n if (el != null) {\n el.textContent = \"BokehJS is loading...\";\n }\n if (root.Bokeh !== undefined) {\n if (el != null) {\n el.textContent = \"BokehJS \" + root.Bokeh.version + \" successfully loaded.\";\n }\n } else if (Date.now() < root._bokeh_timeout) {\n setTimeout(display_loaded, 100)\n }\n }\n\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) {\n if (callback != null)\n callback();\n });\n } finally {\n delete root._bokeh_onload_callbacks\n }\n console.debug(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(css_urls, js_urls, callback) {\n if (css_urls == null) css_urls = [];\n if (js_urls == null) js_urls = [];\n\n root._bokeh_onload_callbacks.push(callback);\n if (root._bokeh_is_loading > 0) {\n console.debug(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls == null || js_urls.length === 0) {\n run_callbacks();\n return null;\n }\n console.debug(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n root._bokeh_is_loading = css_urls.length + js_urls.length;\n\n function on_load() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.debug(\"Bokeh: all BokehJS libraries/stylesheets loaded\");\n run_callbacks()\n }\n }\n\n function on_error(url) {\n console.error(\"failed to load \" + url);\n }\n\n for (let i = 0; i < css_urls.length; i++) {\n const url = css_urls[i];\n const element = document.createElement(\"link\");\n element.onload = on_load;\n element.onerror = on_error.bind(null, url);\n element.rel = \"stylesheet\";\n element.type = \"text/css\";\n element.href = url;\n console.debug(\"Bokeh: injecting link tag for BokehJS stylesheet: \", url);\n document.body.appendChild(element);\n }\n\n for (let i = 0; i < js_urls.length; i++) {\n const url = js_urls[i];\n const element = document.createElement('script');\n 
element.onload = on_load;\n element.onerror = on_error.bind(null, url);\n element.async = false;\n element.src = url;\n console.debug(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.head.appendChild(element);\n }\n };\n\n function inject_raw_css(css) {\n const element = document.createElement(\"style\");\n element.appendChild(document.createTextNode(css));\n document.body.appendChild(element);\n }\n\n \n const js_urls = [\"https://cdn.bokeh.org/bokeh/release/bokeh-2.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-gl-2.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-widgets-2.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-tables-2.4.2.min.js\", \"https://cdn.bokeh.org/bokeh/release/bokeh-mathjax-2.4.2.min.js\"];\n const css_urls = [];\n \n\n const inline_js = [\n function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\n function(Bokeh) {\n \n \n }\n ];\n\n function run_inline_js() {\n \n if (root.Bokeh !== undefined || force === true) {\n \n for (let i = 0; i < inline_js.length; i++) {\n inline_js[i].call(root, root.Bokeh);\n }\n if (force === true) {\n display_loaded();\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n } else if (force !== true) {\n const cell = $(document.getElementById(\"1002\")).parents('.cell').data().cell;\n cell.output_area.append_execute_result(NB_LOAD_WARNING)\n }\n\n }\n\n if (root._bokeh_is_loading === 0) {\n console.debug(\"Bokeh: BokehJS loaded, going straight to plotting\");\n run_inline_js();\n } else {\n load_libs(css_urls, js_urls, function() {\n console.debug(\"Bokeh: BokehJS plotting callback run at\", now());\n run_inline_js();\n });\n }\n}(window));", "application/vnd.bokehjs_load.v0+json": "" }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", 
"text": [ "No data to plot.\n" ] }, { "data": { "text/html": [ "
Figure(id='1003', ...)
\n", "\n" ], "text/plain": [ "Figure(id='1003', ...)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", " suspicious_ips_df\n", " .mp_pivot.display(title=f\"Initial IPs {len(suspicious_ips)}\", head=5)\n", " # Lookup IPs at OTX\n", " .mp_pivot.run(IpAddress.ti.lookup_ipv4_OTX, column=\"IPAddress\")\n", " # Filter on high severity\n", " .query(\"Severity == 'high'\")\n", " .mp_pivot.run(IpAddress.util.whois, column=\"Ioc\", join=\"left\")\n", " .mp_pivot.display(title=\"TI High Severity IPs\", head=5)\n", " # Query IPs that have login attempts\n", " .mp_pivot.run(IpAddress.MSSentinel.aad_signins, ip_address_list=\"Ioc\")\n", " # Send the output of this to a plot\n", " .mp_plot.timeline(\n", " title=\"High Severity IPs with Logon attempts\",\n", " source_columns=[\"UserPrincipalName\", \"IPAddress\", \"ResultType\", \"ClientAppUsed\", \"UserAgent\", \"Location\"],\n", " group_by=\"UserPrincipalName\"\n", " )\n", ")" ] }, { "attachments": { "0f98df03-f8db-4ddc-a8cd-2535ff315af5.png": { "image/png": "" } }, "cell_type": "markdown", "metadata": {}, "source": [ "### Example output from pipelined functions\n", "\n", "This is what the pipelined functions should output (although the results\n", "will obviously not be the same for your environment).\n", "\n", "![image.png](attachment:0f98df03-f8db-4ddc-a8cd-2535ff315af5.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding custom functions to the pivot interface\n", "\n", "To do this you need the following information:\n", "\n", "| Item | Description | Required |\n", "| :--------------------- | :-------------------------- | :---------- |\n", "| src_module | The module containing the class or function | Yes |\n", "| class | The class containing the function | No |\n", "| src_func_name | The name of the function to wrap | Yes |\n", "| func_new_name | Rename the function | No |\n", "| input_type | The input type that the wrapped function expects 
(dataframe, iterable or value) | Yes |\n", "| entity_map | Mapping of entity and attribute used for function | Yes |\n", "| func_df_param_name | The param name that the function uses as input param for DataFrame | If DF input |\n", "| func_df_col_param_name | The param name that the function uses to identify the input column name | If DF input |\n", "| func_out_column_name | Name of the column in the output DF to use as a key to join | If DF output |\n", "| func_static_params | dict of static name/value params always sent to the function | No |\n", "| func_input_value_arg | Name of the param that the wrapped function uses for its input value | No |\n", "| can_iterate | True if the function supports being called multiple times | No |\n", "| entity_container_name | The name of the container in the entity where the func will appear | No |\n", "\n", "\n", "The entity_map controls where the pivot function will be added. Each entry\n", "requires an Entity name (see msticpy.datamodel.entities) and an entity\n", "attribute name. The attribute name is only used if an instance of the entity is used\n", "as a parameter to the function. For `IpAddress` in the example below,\n", "the pivot function will try to extract the value of the `Address` attribute\n", "when an instance of IpAddress is used as a function parameter.\n", "\n", "```yaml\n", " entity_map:\n", " IpAddress: Address\n", " Host: HostName\n", " Account: Name\n", "```\n", "\n", "This means that you can specify different attributes of the same entity\n", "for different functions (or even for two instances of the same function).\n", "\n", "The `func_df_param_name` and `func_df_col_param_name` are needed only if\n", "the source function takes a dataframe and column name as input parameters.\n", "\n", "`func_out_column_name` is relevant if the source function returns a\n", "dataframe. In order to join input data with output data this needs to\n", "be the column in the output that has the same value as the function\n", "input (e.g. 
if you are processing IP addresses and the column name\n", "in the output DF containing the IP is named \"ip_addr\", put \"ip_addr\" here.)\n", "\n", "When you have this information, add it to a new or existing yaml file\n", "with the top-level element `pivot_providers`.\n", "\n", "Example from the msticpy ip_utils `who_is` function:\n", "```yaml\n", "pivot_providers:\n", " ...\n", " who_is:\n", " src_module: msticpy.sectools.ip_utils\n", " src_func_name: get_whois_df\n", " func_new_name: whois\n", " input_type: dataframe\n", " entity_map:\n", " IpAddress: Address\n", " func_df_param_name: data\n", " func_df_col_param_name: ip_column\n", " func_out_column_name: ip\n", " func_static_params:\n", " whois_col: whois_result\n", " func_input_value_arg: ip_address\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once you have your yaml definition file you can call:\n", "```python\n", " Pivot.register_pivot_providers(\n", " pivot_reg_path=path_to_your_yaml,\n", " namespace=globals(),\n", " def_container=\"my_container\",\n", " force_container=True\n", " )\n", "```\n", "\n", "Note: this registration is not persistent. You will need to call this each time you\n", "start a new session." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### register_pivot_providers docstring\n", "\n", "```python\n", "Pivot.register_pivot_providers(\n", " pivot_reg_path: str,\n", " namespace: Dict[str, Any] = None,\n", " def_container: str = 'custom',\n", " force_container: bool = False,\n", ")\n", "Docstring:\n", "Register pivot functions from configuration file.\n", "\n", "Parameters\n", "----------\n", "pivot_reg_path : str\n", " Path to config yaml file\n", "namespace : Dict[str, Any], optional\n", " Namespace to search for existing instances of classes, by default None\n", "def_container : str, optional\n", " Container name to use for entity pivot functions, by default \"other\"\n", "force_container : bool, optional\n", " Force `container` value to be used even if entity definitions have\n", " specific setting for a container name, by default False\n", "```" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mSignature:\u001b[0m\n", "\u001b[0mPivot\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mregister_pivot_providers\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m\n", "\u001b[0m \u001b[0mpivot_reg_path\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mstr\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\n", "\u001b[0m \u001b[0mnamespace\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mDict\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mstr\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mAny\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\n", "\u001b[0m \u001b[0mdef_container\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mstr\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;34m'custom'\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\n", "\u001b[0m \u001b[0mforce_container\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mbool\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;32mFalse\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;33m\n", 
"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mDocstring:\u001b[0m\n", "Register pivot functions from configuration file.\n", "\n", "Parameters\n", "----------\n", "pivot_reg_path : str\n", " Path to config yaml file\n", "namespace : Dict[str, Any], optional\n", " Namespace to search for existing instances of classes, by default None\n", "def_container : str, optional\n", " Container name to use for entity pivot functions, by default \"other\"\n", "force_container : bool, optional\n", " Force `container` value to be used even if entity definitions have\n", " specific setting for a container name, by default False\n", "\n", "Raises\n", "------\n", "ValueError\n", " An entity specified in the config file is not recognized.\n", "\u001b[1;31mFile:\u001b[0m e:\\src\\msticpy\\msticpy\\init\\pivot.py\n", "\u001b[1;31mType:\u001b[0m function\n" ] } ], "source": [ "Pivot.register_pivot_providers?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding ad hoc pivot functions\n", "\n", "You can also add ad hoc functions as pivot functions. This is\n", "probably a less common scenario but may be useful for testing and\n", "development.\n", "\n", "You can either create a PivotRegistration object and supply that (along\n", "with the `func` parameter), to this method." 
] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "from msticpy.init.pivot import PivotRegistration\n", "\n", "def my_func(input: str):\n", " return input.upper()\n", "\n", "piv_reg = PivotRegistration(\n", " input_type=\"value\",\n", " entity_map={\"Host\": \"HostName\"},\n", " func_input_value_arg=\"input\",\n", " func_new_name=\"upper_name\"\n", ")\n", "\n", "Pivot.add_pivot_function(my_func, piv_reg, container=\"change_case\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can supply the\n", "pivot registration parameters as keyword arguments:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "def my_func(input: str):\n", " return input.upper()\n", "\n", "Pivot.add_pivot_function(\n", " func=my_func,\n", " container=\"change_case\",\n", " input_type=\"value\",\n", " entity_map={\"Host\": \"HostName\"},\n", " func_input_value_arg=\"input\",\n", " func_new_name=\"upper_name\",\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving and re-using pipelines as yaml\n", "\n", "```yaml\n", "pipelines:\n", " pipeline1:\n", " description: Pipeline 1 description\n", " steps:\n", " - name: get_logons\n", " step_type: pivot\n", " function: util.whois\n", " entity: IpAddress\n", " comment: Standard pivot function\n", " params:\n", " column: IpAddress\n", " join: inner\n", " - name: disp_logons\n", " step_type: pivot_display\n", " comment: Pivot display\n", " params:\n", " title: \"The title\"\n", " cols:\n", " - Computer\n", " - Account\n", " query: Computer.str.startswith('MSTICAlerts')\n", " head: 10\n", " - name: tee_logons\n", " step_type: pivot_tee\n", " comment: Pivot tee\n", " params:\n", " var_name: var_df\n", " clobber: True\n", " - name: tee_logons_disp\n", " step_type: pivot_tee_exec\n", " comment: Pivot tee_exec with mp_plot.timeline\n", " function: mp_plot.timeline\n", " params:\n", " source_columns:\n", " - 
Computer\n", " - Account\n", " - name: logons_timeline\n", " step_type: pd_accessor\n", " comment: Standard accessor with mp_plot.timeline\n", " function: mp_plot.timeline\n", " params:\n", " source_columns:\n", " - Computer\n", " - Account\n", " pipeline2:\n", " description: Pipeline 2 description\n", " steps:\n", " - name: get_logons\n", " step_type: pivot\n", " function: util.whois\n", " entity: IpAddress\n", " comment: Standard pivot function\n", " params:\n", " column: IpAddress\n", " join: inner\n", " - name: disp_logons\n", " step_type: pivot_display\n", " comment: Pivot display\n", " params:\n", " title: \"The title\"\n", " cols:\n", " - Computer\n", " - Account\n", " query: Computer.str.startswith('MSTICAlerts')\n", " head: 10\n", " - name: tee_logons\n", " step_type: pivot_tee\n", " comment: Pivot tee\n", " params:\n", " var_name: var_df\n", " clobber: True\n", "```" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "from msticpy.init.pivot_core.pivot_pipeline import Pipeline\n", "\n", "pipelines_yml = \"\"\"\n", "pipelines:\n", " pipeline1:\n", " description: Pipeline 1 description\n", " steps:\n", " - name: get_ip_type\n", " step_type: pivot\n", " function: util.ip_type\n", " entity: IpAddress\n", " comment: Get IP Type\n", " params:\n", " column: IPAddress\n", " join: inner\n", " - name: filter_public\n", " step_type: pd_accessor\n", " comment: Filter to only public IPs\n", " function: query\n", " pos_params:\n", " - result == \"Public\"\n", " - name: whois\n", " step_type: pivot\n", " function: util.whois\n", " entity: IpAddress\n", " comment: Get Whois info\n", " params:\n", " column: IPAddress\n", " join: inner\n", " \n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# Pipeline 1 description\n", "(\n", " input_df\n", " # Get IP Type\n", " .mp_pivot.run(IpAddress.util.ip_type, column='IPAddress', 
join='inner')\n", " # Filter to only public IPs\n", " .query('result == \"Public\"')\n", " # Get Whois info\n", " .mp_pivot.run(IpAddress.util.whois, column='IPAddress', join='inner')\n", ")\n" ] } ], "source": [ "pipelines = list(Pipeline.from_yaml(pipelines_yml))\n", "print(pipelines[0].print_pipeline())" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Steps: 0%| | 0/3 [00:00, 'column': 'IPAddress', 'join': 'inner'}, text=\".mp_pivot.run(IpAddress.util.ip_type, column='IPAddress', join='inner')\", comment='Get IP Type', step_type='pivot')\n", "step = filter_public \n", " PipelineExecStep(accessor='query', pos_params=['result == \"Public\"'], params={}, text='.query(\\'result == \"Public\"\\')', comment='Filter to only public IPs', step_type='pd_accessor')\n", "step = whois \n", " PipelineExecStep(accessor='mp_pivot.run', pos_params=[], params={'func': , 'column': 'IPAddress', 'join': 'inner'}, text=\".mp_pivot.run(IpAddress.util.whois, column='IPAddress', join='inner')\", comment='Get Whois info', step_type='pivot')\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Steps: 100%|██████████| 3/3 [00:00<00:00, 3.49it/s]\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
| | IPAddress | ip | result | nir | asn_registry | asn | asn_cidr | asn_country_code | asn_date | asn_description | query | nets | raw | referral | raw_referral |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 0 | 113.190.36.2 | 113.190.36.2 | Public | None | apnic | 45899 | 113.190.32.0/20 | VN | 2008-11-04 | VNPT-AS-VN VNPT Corp, VN | 113.190.36.2 | [{'cidr': '113.160.0.0/11', 'name': 'VNPT-VN', 'handle': 'PTH13-AP', 'range': '113.160.0.0 - 113... | None | None | None |
| 1 | 118.163.135.17 | 118.163.135.17 | Public | None | apnic | 3462 | 118.163.0.0/16 | TW | 2007-10-04 | HINET Data Communication Business Group, TW | 118.163.135.17 | [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... | None | None | None |
| 2 | 118.163.135.18 | 118.163.135.18 | Public | None | apnic | 3462 | 118.163.0.0/16 | TW | 2007-10-04 | HINET Data Communication Business Group, TW | 118.163.135.18 | [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... | None | None | None |
\n", "
" ], "text/plain": [ " IPAddress ip result nir asn_registry asn \\\n", "0 113.190.36.2 113.190.36.2 Public None apnic 45899 \n", "1 118.163.135.17 118.163.135.17 Public None apnic 3462 \n", "2 118.163.135.18 118.163.135.18 Public None apnic 3462 \n", "\n", " asn_cidr asn_country_code asn_date \\\n", "0 113.190.32.0/20 VN 2008-11-04 \n", "1 118.163.0.0/16 TW 2007-10-04 \n", "2 118.163.0.0/16 TW 2007-10-04 \n", "\n", " asn_description query \\\n", "0 VNPT-AS-VN VNPT Corp, VN 113.190.36.2 \n", "1 HINET Data Communication Business Group, TW 118.163.135.17 \n", "2 HINET Data Communication Business Group, TW 118.163.135.18 \n", "\n", " nets \\\n", "0 [{'cidr': '113.160.0.0/11', 'name': 'VNPT-VN', 'handle': 'PTH13-AP', 'range': '113.160.0.0 - 113... \n", "1 [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... \n", "2 [{'cidr': '118.160.0.0/13', 'name': 'HINET-NET', 'handle': 'AT939-AP', 'range': '118.160.0.0 - 1... \n", "\n", " raw referral raw_referral \n", "0 None None None \n", "1 None None None \n", "2 None None None " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pipeline1 = pipelines[0]\n", "result_df = pipeline1.run(data=suspicious_ips_df)\n", "result_df.head(3)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.7 ('msticpy')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "vscode": { "interpreter": { "hash": "0f1a8e166ce5c1ec1911a36e4fdbd34b2f623e2a3442791008b8ac429a1d6070" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }