{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MSTICPy Pivot Functions\n", "\n", "We recently released a new version of *MSTICPy* with a feature called **Pivot functions**.\n", "You must have msticpy installed to run this notebook:\n", "```\n", "%pip install --upgrade msticpy\n", "```\n", "\n", "This notebook requires *MSTICPy* version 1.0.0 or later.\n", "\n", "This feature has three main goals:\n", "- Making it easy to discover and invoke *MSTICPy* functionality\n", "- Creating a standardized way to call pivotable functions\n", "- Letting you assemble multiple functions into re-usable pipelines.\n", "\n", "Here are a few examples that call different kinds of\n", "enrichment functions from the IpAddress entity:\n", "\n", "```python\n", "\n", " >>> from msticpy.datamodel.entities import IpAddress, Host\n", " >>> IpAddress.util.ip_type(ip_str=\"157.53.1.1\")\n", " ip result\n", " 157.53.1.1 Public\n", "\n", " >>> IpAddress.util.whois(\"157.53.1.1\")\n", " asn asn_cidr asn_country_code asn_date asn_description asn_registry nets .....\n", " NA NA US 2015-04-01 NA arin [{'cidr': '157.53.0.0/16'...\n", "\n", " >>> IpAddress.util.geoloc(value=\"157.53.1.1\")\n", " CountryCode CountryName State City Longitude Latitude Asn...\n", " US United States None None -97.822 37.751 None...\n", "```\n", "\n", "This second example shows a pivot function that does a data query for host\n", "logon events from a Host entity.\n", "\n", "```python\n", " >>> Host.AzureSentinel.list_host_logons(host_name=\"VictimPc\")\n", " Account EventID TimeGenerated Computer SubjectUserName SubjectDomainName\n", " NT AUTHORITY\\SYSTEM 4624 2020-10-01 22:39:36.987000+00:00 VictimPc.Contoso.Azure VictimPc$ CONTOSO\n", " NT AUTHORITY\\SYSTEM 4624 2020-10-01 22:39:37.220000+00:00 VictimPc.Contoso.Azure VictimPc$ CONTOSO\n", " NT AUTHORITY\\SYSTEM 4624 2020-10-01 22:39:42.603000+00:00 VictimPc.Contoso.Azure VictimPc$ CONTOSO\n", "```\n", "\n", "The pivot functionality exposes operations relevant to a
particular\n", "entity as methods (or functions) of that entity. These operations include:\n", "\n", "- Data queries\n", "- Threat intelligence lookups\n", "- Other data lookups such as geo-location or domain resolution\n", "- Other local functionality\n", "\n", "You can also add other functions from 3rd party Python packages or\n", "ones you write yourself as pivot functions.\n", "\n", "\n", "## Terminology\n", "Before we get into things let's clear up a few terms.\n", "\n", "### Entities\n", "These are Python classes that represent real-world objects\n", "commonly encountered in CyberSec investigations and hunting. E.g. Host,\n", "URL, IP Address, Account, etc.\n", "\n", "### Pivoting\n", "This comes from the common practice in CyberSec investigations\n", "of navigating from one suspect entity to another. E.g. you might start\n", "with an alert identifying a potentially malicious IP Address. From there you\n", "'pivot' to see which hosts or accounts were communicating with that\n", "address. You might then pivot again to look at processes running on\n", "the host or Office activity for the account." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Background Reading" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This article is available in Notebook form so that you can try out the examples. [TODO]\n", "\n", "There is also full documentation of the Pivot functionality on our [Read the Docs page](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "## Life before pivot functions\n", "\n", "Before Pivot functions your ability to use the various bits of\n", "functionality in *MSTICPy* was always bounded by your knowledge of\n", "where a certain function was (or your enthusiasm for reading the docs).\n", "\n", "For example, suppose you had an IP address that you wanted to do\n", "some simple enrichment on." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "ip_addr = \"20.72.193.242\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First you'd need to locate and import the functions. There\n", "might also be (as in the GeoLiteLookup class) some initialization\n", "step you'd need to do before using the functionality." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "from msticpy.context.ip_utils import get_ip_type\n", "from msticpy.context.ip_utils import get_whois_info\n", "from msticpy.context.geoip import GeoLiteLookup\n", "geoip = GeoLiteLookup()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next you might have to check the help for each function to\n", "work out its parameters." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function get_ip_type in module msticpy.context.ip_utils:\n", "\n", "get_ip_type(ip: str = None, ip_str: str = None) -> str\n", " Validate value is an IP address and determine IPType category.\n", " \n", " (IPAddress category is e.g. 
Private/Public/Multicast).\n", " \n", " Parameters\n", " ----------\n", " ip : str\n", " The string of the IP Address\n", " ip_str : str\n", " The string of the IP Address - alias for `ip`\n", " \n", " Returns\n", " -------\n", " str\n", " Returns ip type string using ip address module\n", "\n" ] } ], "source": [ "help(get_ip_type)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then finally run the functions" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Public'" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_ip_type(ip_addr)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('MICROSOFT-CORP-MSN-AS-BLOCK, US',\n", " {'nir': None,\n", " 'asn_registry': 'arin',\n", " 'asn': '8075',\n", " 'asn_cidr': '20.64.0.0/10',\n", " 'asn_country_code': 'US',\n", " 'asn_date': '2017-10-18',\n", " 'asn_description': 'MICROSOFT-CORP-MSN-AS-BLOCK, US',\n", " 'query': '20.72.193.242',\n", " 'nets': [{'cidr': '20.34.0.0/15, 20.48.0.0/12, 20.36.0.0/14, 20.40.0.0/13, 20.33.0.0/16, 20.128.0.0/16, 20.64.0.0/10',\n", " 'name': 'MSFT',\n", " 'handle': 'NET-20-33-0-0-1',\n", " 'range': '20.33.0.0 - 20.128.255.255',\n", " 'description': 'Microsoft Corporation',\n", " 'country': 'US',\n", " 'state': 'WA',\n", " 'city': 'Redmond',\n", " 'address': 'One Microsoft Way',\n", " 'postal_code': '98052',\n", " 'emails': ['msndcc@microsoft.com',\n", " 'IOC@microsoft.com',\n", " 'abuse@microsoft.com'],\n", " 'created': '2017-10-18',\n", " 'updated': '2017-10-18'}],\n", " 'raw': None,\n", " 'referral': None,\n", " 'raw_referral': None})" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_whois_info(ip_addr)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "([{'continent': {'code': 'NA',\n", " 'geoname_id': 
6255149,\n", " 'names': {'de': 'Nordamerika',\n", " 'en': 'North America',\n", " 'es': 'Norteamérica',\n", " 'fr': 'Amérique du Nord',\n", " 'ja': '北アメリカ',\n", " 'pt-BR': 'América do Norte',\n", " 'ru': 'Северная Америка',\n", " 'zh-CN': '北美洲'}},\n", " 'country': {'geoname_id': 6252001,\n", " 'iso_code': 'US',\n", " 'names': {'de': 'USA',\n", " 'en': 'United States',\n", " 'es': 'Estados Unidos',\n", " 'fr': 'États-Unis',\n", " 'ja': 'アメリカ合衆国',\n", " 'pt-BR': 'Estados Unidos',\n", " 'ru': 'США',\n", " 'zh-CN': '美国'}},\n", " 'location': {'accuracy_radius': 1000,\n", " 'latitude': 47.6032,\n", " 'longitude': -122.3412,\n", " 'time_zone': 'America/Los_Angeles'},\n", " 'registered_country': {'geoname_id': 6252001,\n", " 'iso_code': 'US',\n", " 'names': {'de': 'USA',\n", " 'en': 'United States',\n", " 'es': 'Estados Unidos',\n", " 'fr': 'États-Unis',\n", " 'ja': 'アメリカ合衆国',\n", " 'pt-BR': 'Estados Unidos',\n", " 'ru': 'США',\n", " 'zh-CN': '美国'}},\n", " 'subdivisions': [{'geoname_id': 5815135,\n", " 'iso_code': 'WA',\n", " 'names': {'en': 'Washington',\n", " 'es': 'Washington',\n", " 'fr': 'Washington',\n", " 'ja': 'ワシントン州',\n", " 'ru': 'Вашингтон',\n", " 'zh-CN': '华盛顿州'}}],\n", " 'traits': {'ip_address': '20.72.193.242', 'prefix_len': 18}}],\n", " [IpAddress(Address=20.72.193.242, Location={ 'AdditionalData': {},\n", " 'CountryCode': 'US',\n", " ...)])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "geoip.lookup_ip(ip_addr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At which point you'd discover that the output from each\n", "function was somewhat raw and it would take a bit more\n", "work if you wanted to combine it in any way (say in a single table).\n", "\n", "We'll see how pivot functions address these problems in the remainder\n", "of the notebook." 
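, "

The extra work mentioned above might look something like the following minimal sketch. It uses stand-in values trimmed from the raw outputs shown earlier rather than live calls to `get_ip_type`, `get_whois_info` and `geoip.lookup_ip`. The `combine_results` helper and its column names are hypothetical and are not part of *MSTICPy*.

```python
# Stand-in values trimmed from the raw outputs shown earlier.
# In a live session they would come from get_ip_type(), get_whois_info()
# and the first raw dictionary returned by geoip.lookup_ip().
ip_addr = '20.72.193.242'
ip_type = 'Public'
whois_result = (
    'MICROSOFT-CORP-MSN-AS-BLOCK, US',
    {'asn': '8075', 'asn_cidr': '20.64.0.0/10', 'asn_country_code': 'US'},
)
geo_result = {
    'country': {'iso_code': 'US'},
    'location': {'latitude': 47.6032, 'longitude': -122.3412},
}


def combine_results(ip, ip_type, whois_result, geo_result):
    # Hypothetical helper: flatten three differently shaped results into one row.
    asn_description, whois_dict = whois_result
    location = geo_result.get('location', {})
    return {
        'ip': ip,
        'ip_type': ip_type,
        'asn': whois_dict.get('asn'),
        'asn_cidr': whois_dict.get('asn_cidr'),
        'asn_description': asn_description,
        'country_code': geo_result.get('country', {}).get('iso_code'),
        'latitude': location.get('latitude'),
        'longitude': location.get('longitude'),
    }


row = combine_results(ip_addr, ip_type, whois_result, geo_result)
print(row)
```

The resulting `row` dictionary could be passed to `pandas.DataFrame([row])` to get a one-row table. Each function needs its own unpacking logic, which is the kind of glue code that pivot functions are meant to remove.
"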
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting Started with Pivot functions\n", "Typically we use *MSTICPy*'s `init_notebook` function, which handles\n", "checking versions and importing some commonly-used packages and modules\n", "(both *MSTICPy* and 3rd party packages like *pandas*)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import msticpy as mp\n", "mp.init_notebook(verbosity=0);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Pivot subsystem is loaded as part of the `init_notebook`\n", "process. This also imports entities such as IpAddress, Host, Url, etc.\n", "into the notebook namespace.\n", "\n", "One class of pivot functions that is not added to entities\n", "in `init_notebook` is data queries. These are loaded when you\n", "create and connect to a QueryProvider.\n", "\n", "Let's load our data query provider for MS Sentinel" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "qry_prov = QueryProvider(\"MSSentinel\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`init_notebook` loads and instantiates the Pivot class.\n", "\n", "You can do that manually, if needed:\n", "```python\n", "from msticpy.init.pivot import Pivot\n", "pivot = Pivot(namespace=globals())\n", "```\n", "\n", "Why do we need to pass `namespace=globals()`?\n", "Pivot searches through the current objects defined in the Python/notebook\n", "namespace. This is most relevant for QueryProviders. In most other cases\n", "(like GeoIP and ThreatIntel providers), it will create new ones if it\n", "can't find existing ones." 
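, "

To illustrate why a namespace is needed, here is a minimal sketch of the general idea of scanning a namespace dictionary for objects of a given type. It is not the actual Pivot implementation: `ProviderStandIn` and `find_instances` are hypothetical names used only for illustration.

```python
class ProviderStandIn:
    # Hypothetical stand-in for a provider class; not a msticpy class.
    def __init__(self, name):
        self.name = name


def find_instances(namespace, wanted_type):
    # Return {variable_name: object} for each object of wanted_type.
    return {
        var_name: obj
        for var_name, obj in namespace.items()
        if isinstance(obj, wanted_type)
    }


# A hand-built namespace so the result is predictable;
# in a notebook you would pass globals() instead.
demo_namespace = {
    'qry_prov': ProviderStandIn('MSSentinel'),
    'ip_addr': '20.72.193.242',
}
found = find_instances(demo_namespace, ProviderStandIn)
print(sorted(found))
```

In a notebook, `globals()` returns a dictionary of this shape that maps variable names to objects, which is why a provider created in an earlier cell can be located by name and type.
"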
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Easy discovery of functionality" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Find the entity name you need\n", "\n", "The simplest way to do this is to use the `entities.find_entity`\n", "function.\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Match found 'IpAddress'\n" ] }, { "data": { "text/plain": [ "msticpy.datamodel.entities.ip_address.IpAddress" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "entities.find_entity(\"ip\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No exact match found for 'azure'. \n", "Closest matches are 'AzureResource', 'Url', 'Malware'\n" ] } ], "source": [ "entities.find_entity(\"azure\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Listing pivot functions available for an entity\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once you have the entity you can use the `pivots()`\n", "function to see which pivot functions are available for it." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['RiskIQ.articles',\n", " 'RiskIQ.artifacts',\n", " 'RiskIQ.certificates',\n", " 'RiskIQ.components',\n", " 'RiskIQ.cookies',\n", " 'RiskIQ.hostpair_children',\n", " 'RiskIQ.hostpair_parents',\n", " 'RiskIQ.malware',\n", " 'RiskIQ.projects',\n", " 'RiskIQ.reputation',\n", " 'RiskIQ.resolutions',\n", " 'RiskIQ.services',\n", " 'RiskIQ.summary',\n", " 'RiskIQ.trackers',\n", " 'RiskIQ.whois',\n", " 'VT.vt_communicating_files',\n", " 'VT.vt_historical_ssl_certificates',\n", " 'VT.vt_historical_whois',\n", " 'VT.vt_referrer_files',\n", " 'VT.vt_resolutions',\n", " 'VT.vt_subdomains',\n", " 'geoloc',\n", " 'ip_type',\n", " 'ti.lookup_ip',\n", " 'tilookup_ip',\n", " 'util.geoloc',\n", " 'util.geoloc_ips',\n", " 'util.ip_rev_resolve',\n", " 'util.ip_type',\n", " 'util.whois',\n", " 'vt_communicating_files',\n", " 'vt_historical_ssl_certificates',\n", " 'vt_historical_whois',\n", " 'vt_referrer_files',\n", " 'vt_resolutions',\n", " 'vt_subdomains',\n", " 'whois']" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.pivots()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some of the function names are a little unwieldy but, in\n", "many cases, this is necessary to avoid name collisions.\n", "You might notice from the list that the functions are\n", "grouped into containers such as \"ti\" and \"util\" in\n", "the above example.\n", "\n", "Although this makes the function names even longer, we think\n", "that it helps to keep related functionality together, so\n", "you don't get a TI lookup when you thought you were running\n", "a query.\n", "\n", "Fortunately Jupyter notebooks/IPython support tab completion\n", "so you should not normally have to remember these names.\n" ] }, { "attachments": { "64c1580e-21f7-4ed3-af12-4f59ced1d67b.png": { "image/png": "" } }, "cell_type": "markdown", "metadata": 
{}, "source": [ "![image.png](attachment:64c1580e-21f7-4ed3-af12-4f59ced1d67b.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The containers (\"util\", etc.) are also callable\n", "functions - they just return the list of functions they contain." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "geoloc (pivot function)\n", "geoloc_ips (pivot function)\n", "ip_rev_resolve (pivot function)\n", "ip_type (pivot function)\n", "whois (pivot function)\n" ] } ], "source": [ "IpAddress.util()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we're ready to run any of the functions for this entity" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
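<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>ip</th><th>result</th><th>src_row_index</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>Public</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>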
" ], "text/plain": [ " ip result src_row_index\n", "0 20.72.193.242 Public 0" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ip_addr = \"20.72.193.242\"\n", "IpAddress.util.ip_type(ip_addr)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>asn</th><th>asn_cidr</th><th>asn_country_code</th><th>asn_date</th><th>asn_description</th><th>asn_registry</th><th>nets</th><th>nir</th><th>query</th><th>raw</th><th>raw_referral</th><th>referral</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>8075</td><td>20.64.0.0/10</td><td>US</td><td>2017-10-18</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>arin</td><td>[{'cidr': '20.40.0.0/13, 20.34.0.0/15, 20.48.0.0/12, 20.64.0.0/10, 20.33.0.0/16, 20.128.0.0/16, ...</td><td>None</td><td>20.72.193.242</td><td>None</td><td>None</td><td>None</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " asn asn_cidr asn_country_code asn_date \\\n", "0 8075 20.64.0.0/10 US 2017-10-18 \n", "\n", " asn_description asn_registry \\\n", "0 MICROSOFT-CORP-MSN-AS-BLOCK, US arin \n", "\n", " nets \\\n", "0 [{'cidr': '20.40.0.0/13, 20.34.0.0/15, 20.48.0.0/12, 20.64.0.0/10, 20.33.0.0/16, 20.128.0.0/16, ... \n", "\n", " nir query raw raw_referral referral \n", "0 None 20.72.193.242 None None None " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.whois(ip_addr)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>qname</th><th>rdtype</th><th>response</th><th>ip_address</th><th>src_row_index</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>PTR</td><td>None of DNS query names exist: 20.72.193.242., 20.72.193.242.corp.microsoft.com.</td><td>20.72.193.242</td><td>0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " qname rdtype \\\n", "0 20.72.193.242 PTR \n", "\n", " response \\\n", "0 None of DNS query names exist: 20.72.193.242., 20.72.193.242.corp.microsoft.com. \n", "\n", " ip_address src_row_index \n", "0 20.72.193.242 0 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.ip_rev_resolve(ip_addr)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>CountryCode</th><th>CountryName</th><th>State</th><th>Longitude</th><th>Latitude</th><th>TimeGenerated</th><th>Type</th><th>IpAddress</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>US</td><td>United States</td><td>Washington</td><td>-122.3414</td><td>47.6034</td><td>2022-04-22 03:03:14.422813</td><td>geolocation</td><td>20.72.193.242</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " CountryCode CountryName State Longitude Latitude \\\n", "0 US United States Washington -122.3414 47.6034 \n", "\n", " TimeGenerated Type IpAddress \n", "0 2022-04-22 03:03:14.422813 geolocation 20.72.193.242 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.geoloc(ip_addr)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Ioc</th><th>IocType</th><th>SafeIoc</th><th>QuerySubtype</th><th>Provider</th><th>Result</th><th>Severity</th><th>Details</th><th>RawResult</th><th>Reference</th><th>Status</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>ipv4</td><td>20.72.193.242</td><td>None</td><td>RiskIQ</td><td>True</td><td>high</td><td>{'summary': {'resolutions': 0, 'certificates': 0, 'malware_hashes': 0, 'projects': 0, 'articles'...</td><td>{'summary': {'resolutions': 0, 'certificates': 0, 'malware_hashes': 0, 'projects': 0, 'articles'...</td><td>https://community.riskiq.com</td><td>0</td></tr>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>ipv4</td><td>20.72.193.242</td><td>None</td><td>Tor</td><td>True</td><td>information</td><td>Not found.</td><td>None</td><td>https://check.torproject.org/exit-addresses</td><td>0</td></tr>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>ipv4</td><td>20.72.193.242</td><td>None</td><td>VirusTotal</td><td>True</td><td>information</td><td>{'verbose_msg': 'IP address in dataset', 'response_code': 1, 'positives': 0, 'detected_urls': []}</td><td>{'detected_urls': [], 'asn': 8075, 'country': 'US', 'response_code': 1, 'as_owner': 'MICROSOFT-C...</td><td>https://www.virustotal.com/vtapi/v2/ip-address/report</td><td>0</td></tr>\n",
"    <tr><th>0</th><td>20.72.193.242</td><td>ipv4</td><td>20.72.193.242</td><td>None</td><td>XForce</td><td>False</td><td>information</td><td>Authorization failed. Check account and key details.</td><td>&lt;Response [401 Unauthorized]&gt;</td><td>https://api.xforce.ibmcloud.com/ipr/20.72.193.242</td><td>401</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " Ioc IocType SafeIoc QuerySubtype Provider Result \\\n", "0 20.72.193.242 ipv4 20.72.193.242 None RiskIQ True \n", "0 20.72.193.242 ipv4 20.72.193.242 None Tor True \n", "0 20.72.193.242 ipv4 20.72.193.242 None VirusTotal True \n", "0 20.72.193.242 ipv4 20.72.193.242 None XForce False \n", "\n", " Severity \\\n", "0 high \n", "0 information \n", "0 information \n", "0 information \n", "\n", " Details \\\n", "0 {'summary': {'resolutions': 0, 'certificates': 0, 'malware_hashes': 0, 'projects': 0, 'articles'... \n", "0 Not found. \n", "0 {'verbose_msg': 'IP address in dataset', 'response_code': 1, 'positives': 0, 'detected_urls': []} \n", "0 Authorization failed. Check account and key details. \n", "\n", " RawResult \\\n", "0 {'summary': {'resolutions': 0, 'certificates': 0, 'malware_hashes': 0, 'projects': 0, 'articles'... \n", "0 None \n", "0 {'detected_urls': [], 'asn': 8075, 'country': 'US', 'response_code': 1, 'as_owner': 'MICROSOFT-C... \n", "0 \n", "\n", " Reference Status \n", "0 https://community.riskiq.com 0 \n", "0 https://check.torproject.org/exit-addresses 0 \n", "0 https://www.virustotal.com/vtapi/v2/ip-address/report 0 \n", "0 https://api.xforce.ibmcloud.com/ipr/20.72.193.242 401 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.ti.lookup_ip(ip_addr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we didn't need to worry about either the parameter\n", "name or format (more on this in the next section). Also, \n", "whatever the function, the output is always returned\n", "as a pandas DataFrame." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### For Data query functions you *do* need to worry about the parameter name\n", "Data query functions are a little more complex than most other functions\n", "and specifically often support many parameters. 
Rather than try\n", "to guess which parameter you meant, we require you to be explicit.\n", "\n", "To use a data query, we need to authenticate to the provider." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connecting... " ] }, { "data": { "text/html": [ "\n", " \n", "
\n", " popup schema 8ecf8077-cf51-4820-aadd-14040956f35d@loganalytics\n", "
\n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "connected\n" ] } ], "source": [ "qry_prov.connect(workspace=\"CyberSecuritySoc\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We should now have many more data query pivots\n", "attached to our entities" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['MSSentinel_cybersecuritysoc.VMComputer_vmcomputer',\n", " 'MSSentinel_cybersecuritysoc.auditd_auditd_all',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_interface',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_net_flows',\n", " 'MSSentinel_cybersecuritysoc.az_nsg_net_flows_depr',\n", " 'MSSentinel_cybersecuritysoc.heartbeat',\n", " 'MSSentinel_cybersecuritysoc.heartbeat_for_host_depr',\n", " 'MSSentinel_cybersecuritysoc.sec_alerts',\n", " 'MSSentinel_cybersecuritysoc.sent_bookmarks',\n", " 'MSSentinel_cybersecuritysoc.syslog_all_syslog',\n", " 'MSSentinel_cybersecuritysoc.syslog_cron_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_logon_failures',\n", " 'MSSentinel_cybersecuritysoc.syslog_logons',\n", " 'MSSentinel_cybersecuritysoc.syslog_squid_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_sudo_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_user_group_activity',\n", " 'MSSentinel_cybersecuritysoc.syslog_user_logon',\n", " 'MSSentinel_cybersecuritysoc.wevt_all_events',\n", " 'MSSentinel_cybersecuritysoc.wevt_events_by_id',\n", " 'MSSentinel_cybersecuritysoc.wevt_get_process_tree',\n", " 'MSSentinel_cybersecuritysoc.wevt_list_other_events',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_attempts',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_failures',\n", " 'MSSentinel_cybersecuritysoc.wevt_logon_session',\n", " 'MSSentinel_cybersecuritysoc.wevt_logons',\n", " 'MSSentinel_cybersecuritysoc.wevt_parent_process',\n", " 'MSSentinel_cybersecuritysoc.wevt_process_session',\n", " 
'MSSentinel_cybersecuritysoc.wevt_processes',\n", " 'RiskIQ.articles',\n", " 'RiskIQ.artifacts',\n", " 'RiskIQ.certificates',\n", " 'RiskIQ.components',\n", " 'RiskIQ.cookies',\n", " 'RiskIQ.hostpair_children',\n", " 'RiskIQ.hostpair_parents',\n", " 'RiskIQ.malware',\n", " 'RiskIQ.projects',\n", " 'RiskIQ.reputation',\n", " 'RiskIQ.resolutions',\n", " 'RiskIQ.summary',\n", " 'RiskIQ.trackers',\n", " 'RiskIQ.whois',\n", " 'dns_is_resolvable',\n", " 'dns_resolve',\n", " 'util.dns_components',\n", " 'util.dns_in_abuse_list',\n", " 'util.dns_is_resolvable',\n", " 'util.dns_resolve',\n", " 'util.dns_validate_tld']" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.pivots()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you are not sure of the parameters required by the query\n", "you can use the built-in help" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1;31mSignature:\u001b[0m \u001b[0mHost\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mMSSentinel_cybersecuritysoc\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msec_alerts\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m*\u001b[0m\u001b[0margs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m->\u001b[0m \u001b[0mUnion\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mpandas\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcore\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mframe\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mDataFrame\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mAny\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mDocstring:\u001b[0m\n", "Retrieves list of alerts with a common host, account or process\n", "\n", "Parameters\n", "----------\n", "account_name: str (optional)\n", " The account name to find\n", "add_query_items: str (optional)\n", " Additional query clauses\n", "end: datetime\n", 
" Query end time\n", "host_name: str (optional)\n", " The hostname to find\n", "path_separator: str (optional)\n", " Path separator\n", " (default value is: \\\\)\n", "process_name: str (optional)\n", " The process name to find\n", "query_project: str (optional)\n", " Column project statement\n", " (default value is: | project-rename StartTimeUtc = StartTime, EndTim...)\n", "start: datetime\n", " Query start time\n", "table: str (optional)\n", " Table name\n", " (default value is: SecurityAlert)\n", "\u001b[1;31mFile:\u001b[0m f:\\anaconda\\envs\\msticpy\\lib\\functools.py\n", "\u001b[1;31mType:\u001b[0m function\n" ] } ], "source": [ "Host.MSSentinel_cybersecuritysoc.sec_alerts?" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "application/javascript": "try {IPython.notebook.kernel.execute(\"NOTEBOOK_URL = '\" + window.location + \"'\");} catch(err) {;}", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TenantIdTimeGeneratedAlertDisplayNameAlertNameSeverityDescriptionProviderNameVendorNameVendorOriginalIdSystemAlertIdResourceIdSourceComputerIdAlertTypeConfidenceLevelConfidenceScoreIsIncidentStartTimeUtcEndTimeUtcProcessingEndTimeRemediationStepsExtendedPropertiesEntitiesSourceSystemWorkspaceSubscriptionIdWorkspaceResourceGroupExtendedLinksProductNameProductComponentNameAlertLinkStatusCompromisedEntityTacticsTypeComputersrc_hostnamesrc_accountnamesrc_procnamehost_matchacct_matchproc_match
08ecf8077-cf51-4820-aadd-14040956f35d2021-03-11 12:05:14.355000+00:00Suspected credential theft activitySuspected credential theft activityMediumThis program exhibits suspect characteristics potentially associated with credential theft. Onc...MDATPMicrosoftda637509097413415122_-841817867bf226b1b-8bda-31f7-c848-1f8bbb5f5922WindowsDefenderAtpNaNFalse2021-03-09 17:56:55.275000+00:002021-03-09 17:56:55.275000+00:002021-03-11 12:05:13.759000+00:00[\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc...{\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga...[\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict...DetectionMicrosoft Defender Advanced Threat Protectionhttps://securitycenter.microsoft.com/alert/da637509097413415122_-841817867?tid=4b2462a4-bbee-495...Newvictim00.na.contosohotels.comCredentialAccessSecurityAlertvictim00victim00TrueFalseFalse
18ecf8077-cf51-4820-aadd-14040956f35d2021-03-11 13:24:53.495000+00:00'Mimikatz' hacktool was detected'Mimikatz' hacktool was detectedLowReadily available tools, such as hacking programs, can be used by unauthorized individuals to sp...MDATPMicrosoftda637510393722104539_-1180405651ef04126b-2683-0a98-d01c-77ee6b1115acWindowsDefenderAvNaNFalse2021-03-11 06:00:14.083000+00:002021-03-11 06:00:14.083000+00:002021-03-11 13:24:53.379000+00:00[\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc...{\\r\\n \"MicrosoftDefenderAtp.Category\": \"Malware\",\\r\\n \"MicrosoftDefenderAtp.InvestigationId\": ...[\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict...DetectionMicrosoft Defender Advanced Threat Protectionhttps://securitycenter.microsoft.com/alert/da637510393722104539_-1180405651?tid=4b2462a4-bbee-49...Newvictim00.na.contosohotels.comUnknownSecurityAlertvictim00victim00TrueFalseFalse
28ecf8077-cf51-4820-aadd-14040956f35d2021-03-11 13:24:53.490000+00:00Suspected credential theft activitySuspected credential theft activityMediumThis program exhibits suspect characteristics potentially associated with credential theft. Onc...MDATPMicrosoftda637509097413415122_-841817867bf226b1b-8bda-31f7-c848-1f8bbb5f5922WindowsDefenderAtpNaNFalse2021-03-09 17:56:55.275000+00:002021-03-09 17:56:55.275000+00:002021-03-11 13:24:53.363000+00:00[\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc...{\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga...[\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict...DetectionMicrosoft Defender Advanced Threat Protectionhttps://securitycenter.microsoft.com/alert/da637509097413415122_-841817867?tid=4b2462a4-bbee-495...Newvictim00.na.contosohotels.comCredentialAccessSecurityAlertvictim00victim00TrueFalseFalse
38ecf8077-cf51-4820-aadd-14040956f35d2021-03-11 13:19:42.521000+00:00Malicious credential theft tool execution detectedMalicious credential theft tool execution detectedHighA known credential theft tool execution command line was detected.\\nEither the process itself or...MDATPMicrosoftda637508847019595161_-562481393753680a5-4d20-2726-61b4-9c36e620ea26WindowsDefenderAtpNaNFalse2021-03-09 10:56:58.134000+00:002021-03-09 10:56:58.134000+00:002021-03-11 13:19:42.289000+00:00[\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc...{\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga...[\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict...DetectionMicrosoft Defender Advanced Threat Protectionhttps://securitycenter.microsoft.com/alert/da637508847019595161_-562481393?tid=4b2462a4-bbee-495...Newvictim00.na.contosohotels.comCredentialAccessSecurityAlertvictim00victim00TrueFalseFalse
48ecf8077-cf51-4820-aadd-14040956f35d2021-03-11 14:30:14.730000+00:00'Mimikatz' hacktool was detected'Mimikatz' hacktool was detectedLowReadily available tools, such as hacking programs, can be used by unauthorized individuals to sp...MDATPMicrosoftda637510393722104539_-1180405651ef04126b-2683-0a98-d01c-77ee6b1115acWindowsDefenderAvNaNFalse2021-03-11 06:00:14.083000+00:002021-03-11 06:00:14.083000+00:002021-03-11 14:30:14.450000+00:00[\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc...{\\r\\n \"MicrosoftDefenderAtp.Category\": \"Malware\",\\r\\n \"MicrosoftDefenderAtp.InvestigationId\": ...[\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict...DetectionMicrosoft Defender Advanced Threat Protectionhttps://securitycenter.microsoft.com/alert/da637510393722104539_-1180405651?tid=4b2462a4-bbee-49...Newvictim00.na.contosohotels.comUnknownSecurityAlertvictim00victim00TrueFalseFalse
\n", "
" ], "text/plain": [ " TenantId TimeGenerated \\\n", "0 8ecf8077-cf51-4820-aadd-14040956f35d 2021-03-11 12:05:14.355000+00:00 \n", "1 8ecf8077-cf51-4820-aadd-14040956f35d 2021-03-11 13:24:53.495000+00:00 \n", "2 8ecf8077-cf51-4820-aadd-14040956f35d 2021-03-11 13:24:53.490000+00:00 \n", "3 8ecf8077-cf51-4820-aadd-14040956f35d 2021-03-11 13:19:42.521000+00:00 \n", "4 8ecf8077-cf51-4820-aadd-14040956f35d 2021-03-11 14:30:14.730000+00:00 \n", "\n", " AlertDisplayName \\\n", "0 Suspected credential theft activity \n", "1 'Mimikatz' hacktool was detected \n", "2 Suspected credential theft activity \n", "3 Malicious credential theft tool execution detected \n", "4 'Mimikatz' hacktool was detected \n", "\n", " AlertName Severity \\\n", "0 Suspected credential theft activity Medium \n", "1 'Mimikatz' hacktool was detected Low \n", "2 Suspected credential theft activity Medium \n", "3 Malicious credential theft tool execution detected High \n", "4 'Mimikatz' hacktool was detected Low \n", "\n", " Description \\\n", "0 This program exhibits suspect characteristics potentially associated with credential theft. Onc... \n", "1 Readily available tools, such as hacking programs, can be used by unauthorized individuals to sp... \n", "2 This program exhibits suspect characteristics potentially associated with credential theft. Onc... \n", "3 A known credential theft tool execution command line was detected.\\nEither the process itself or... \n", "4 Readily available tools, such as hacking programs, can be used by unauthorized individuals to sp... 
\n", "\n", " ProviderName VendorName VendorOriginalId \\\n", "0 MDATP Microsoft da637509097413415122_-841817867 \n", "1 MDATP Microsoft da637510393722104539_-1180405651 \n", "2 MDATP Microsoft da637509097413415122_-841817867 \n", "3 MDATP Microsoft da637508847019595161_-562481393 \n", "4 MDATP Microsoft da637510393722104539_-1180405651 \n", "\n", " SystemAlertId ResourceId SourceComputerId \\\n", "0 bf226b1b-8bda-31f7-c848-1f8bbb5f5922 \n", "1 ef04126b-2683-0a98-d01c-77ee6b1115ac \n", "2 bf226b1b-8bda-31f7-c848-1f8bbb5f5922 \n", "3 753680a5-4d20-2726-61b4-9c36e620ea26 \n", "4 ef04126b-2683-0a98-d01c-77ee6b1115ac \n", "\n", " AlertType ConfidenceLevel ConfidenceScore IsIncident \\\n", "0 WindowsDefenderAtp NaN False \n", "1 WindowsDefenderAv NaN False \n", "2 WindowsDefenderAtp NaN False \n", "3 WindowsDefenderAtp NaN False \n", "4 WindowsDefenderAv NaN False \n", "\n", " StartTimeUtc EndTimeUtc \\\n", "0 2021-03-09 17:56:55.275000+00:00 2021-03-09 17:56:55.275000+00:00 \n", "1 2021-03-11 06:00:14.083000+00:00 2021-03-11 06:00:14.083000+00:00 \n", "2 2021-03-09 17:56:55.275000+00:00 2021-03-09 17:56:55.275000+00:00 \n", "3 2021-03-09 10:56:58.134000+00:00 2021-03-09 10:56:58.134000+00:00 \n", "4 2021-03-11 06:00:14.083000+00:00 2021-03-11 06:00:14.083000+00:00 \n", "\n", " ProcessingEndTime \\\n", "0 2021-03-11 12:05:13.759000+00:00 \n", "1 2021-03-11 13:24:53.379000+00:00 \n", "2 2021-03-11 13:24:53.363000+00:00 \n", "3 2021-03-11 13:19:42.289000+00:00 \n", "4 2021-03-11 14:30:14.450000+00:00 \n", "\n", " RemediationSteps \\\n", "0 [\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc... \n", "1 [\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc... \n", "2 [\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc... \n", "3 [\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc... 
\n", "4 [\\r\\n \"1. Make sure the machine is completely updated and all your software has the latest patc... \n", "\n", " ExtendedProperties \\\n", "0 {\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga... \n", "1 {\\r\\n \"MicrosoftDefenderAtp.Category\": \"Malware\",\\r\\n \"MicrosoftDefenderAtp.InvestigationId\": ... \n", "2 {\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga... \n", "3 {\\r\\n \"MicrosoftDefenderAtp.Category\": \"CredentialAccess\",\\r\\n \"MicrosoftDefenderAtp.Investiga... \n", "4 {\\r\\n \"MicrosoftDefenderAtp.Category\": \"Malware\",\\r\\n \"MicrosoftDefenderAtp.InvestigationId\": ... \n", "\n", " Entities \\\n", "0 [\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict... \n", "1 [\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict... \n", "2 [\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict... \n", "3 [\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict... \n", "4 [\\r\\n {\\r\\n \"$id\": \"4\",\\r\\n \"DnsDomain\": \"na.contosohotels.com\",\\r\\n \"HostName\": \"vict... \n", "\n", " SourceSystem WorkspaceSubscriptionId WorkspaceResourceGroup ExtendedLinks \\\n", "0 Detection \n", "1 Detection \n", "2 Detection \n", "3 Detection \n", "4 Detection \n", "\n", " ProductName ProductComponentName \\\n", "0 Microsoft Defender Advanced Threat Protection \n", "1 Microsoft Defender Advanced Threat Protection \n", "2 Microsoft Defender Advanced Threat Protection \n", "3 Microsoft Defender Advanced Threat Protection \n", "4 Microsoft Defender Advanced Threat Protection \n", "\n", " AlertLink \\\n", "0 https://securitycenter.microsoft.com/alert/da637509097413415122_-841817867?tid=4b2462a4-bbee-495... 
\n", "1 https://securitycenter.microsoft.com/alert/da637510393722104539_-1180405651?tid=4b2462a4-bbee-49... \n", "2 https://securitycenter.microsoft.com/alert/da637509097413415122_-841817867?tid=4b2462a4-bbee-495... \n", "3 https://securitycenter.microsoft.com/alert/da637508847019595161_-562481393?tid=4b2462a4-bbee-495... \n", "4 https://securitycenter.microsoft.com/alert/da637510393722104539_-1180405651?tid=4b2462a4-bbee-49... \n", "\n", " Status CompromisedEntity Tactics Type \\\n", "0 New victim00.na.contosohotels.com CredentialAccess SecurityAlert \n", "1 New victim00.na.contosohotels.com Unknown SecurityAlert \n", "2 New victim00.na.contosohotels.com CredentialAccess SecurityAlert \n", "3 New victim00.na.contosohotels.com CredentialAccess SecurityAlert \n", "4 New victim00.na.contosohotels.com Unknown SecurityAlert \n", "\n", " Computer src_hostname src_accountname src_procname host_match acct_match \\\n", "0 victim00 victim00 True False \n", "1 victim00 victim00 True False \n", "2 victim00 victim00 True False \n", "3 victim00 victim00 True False \n", "4 victim00 victim00 True False \n", "\n", " proc_match \n", "0 False \n", "1 False \n", "2 False \n", "3 False \n", "4 False " ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.MSSentinel_cybersecuritysoc.sec_alerts(host_name=\"victim00\").head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Browse and search for Pivot functions with the pivot browse" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7802c79e943c4b289ae27f59572698e3", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HBox(children=(VBox(children=(HTML(value='Entities'), Select(description='entity', layou…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "Pivot.browse()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 
Standardized way of calling Pivot functions\n", "\n", "Due to various factors (historical, underlying data,\n", "developer laziness and forgetfulness, etc.) the functionality\n", "in *MSTICPy* can be inconsistent in the way it uses input\n", "parameters.\n", "\n", "Also, many functions will only accept inputs as a single\n", "value, or a list or a DataFrame or some unpredictable combination\n", "of these.\n", "\n", "Pivot functions allow you to largely forget about this - you\n", "can use the same function whether you have:\n", "- a single value\n", "- a list (or any iterable) of values\n", "- a DataFrame with the input value in one of the columns.\n", "\n", "Let's take an example. \n", "\n", "Suppose we have a set of IP addresses pasted\n", "from somewhere that we want to use as input." ] }, { "cell_type": "raw", "metadata": {}, "source": [ "0, 172.217.15.99, Public\n", "1, 40.85.232.64, Public\n", "2, 20.38.98.100, Public\n", "3, 23.96.64.84, Public\n", "4, 65.55.44.108, Public\n", "5, 131.107.147.209, Public\n", "6, 10.0.3.4, Private\n", "7, 10.0.3.5, Private\n", "8, 13.82.152.48, Public" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to convert this into a Python data object of some sort.\n", "To do this we can use another Pivot utility `%%txt2df`. This is a\n", "Jupyter/IPython magic function so you can just paste your data in\n", "a cell.\n", "Use `%%txt2df --help` in an empty cell to see the full syntax.\n", "\n", "In the example below we specify a comma separator, indicate that the\n", "data has a header row, and save the converted data as\n", "a DataFrame named \"ip_df\".\n", "\n", "> Warning: this will overwrite any existing variable of this\n", "name." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>idx</th><th>ip</th><th>type</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>0</td><td>172.217.15.99</td><td>Public</td></tr>\n",
"<tr><th>1</th><td>1</td><td>40.85.232.64</td><td>Public</td></tr>\n",
"<tr><th>2</th><td>2</td><td>20.38.98.100</td><td>Public</td></tr>\n",
"<tr><th>3</th><td>3</td><td>23.96.64.84</td><td>Public</td></tr>\n",
"<tr><th>4</th><td>4</td><td>65.55.44.108</td><td>Public</td></tr>\n",
"<tr><th>5</th><td>5</td><td>131.107.147.209</td><td>Public</td></tr>\n",
"<tr><th>6</th><td>6</td><td>10.0.3.4</td><td>Private</td></tr>\n",
"<tr><th>7</th><td>7</td><td>10.0.3.5</td><td>Private</td></tr>\n",
"<tr><th>8</th><td>8</td><td>13.82.152.48</td><td>Public</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " idx ip type\n", "0 0 172.217.15.99 Public\n", "1 1 40.85.232.64 Public\n", "2 2 20.38.98.100 Public\n", "3 3 23.96.64.84 Public\n", "4 4 65.55.44.108 Public\n", "5 5 131.107.147.209 Public\n", "6 6 10.0.3.4 Private\n", "7 7 10.0.3.5 Private\n", "8 8 13.82.152.48 Public" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%txt2df --sep , --headers --name ip_df\n", "idx, ip, type\n", "0, 172.217.15.99, Public\n", "1, 40.85.232.64, Public\n", "2, 20.38.98.100, Public\n", "3, 23.96.64.84, Public\n", "4, 65.55.44.108, Public\n", "5, 131.107.147.209, Public\n", "6, 10.0.3.4, Private\n", "7, 10.0.3.5, Private\n", "8, 13.82.152.48, Public\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our example we'll also create a standard Python list\n", "from the ip column." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['172.217.15.99', '40.85.232.64', '20.38.98.100', '23.96.64.84', '65.55.44.108', '131.107.147.209', '10.0.3.4', '10.0.3.5', '13.82.152.48']\n" ] } ], "source": [ "ip_list = list(ip_df.ip)\n", "print(ip_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How did this work before?\n", "\n", "If you recall the earlier example of `get_ip_type`, passing it\n", "a list or DataFrame doesn't result in anything useful." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['172.217.15.99', '40.85.232.64', '20.38.98.100', '23.96.64.84', '65.55.44.108', '131.107.147.209', '10.0.3.4', '10.0.3.5', '13.82.152.48'] does not appear to be an IPv4 or IPv6 address\n" ] }, { "data": { "text/plain": [ "'Unspecified'" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "get_ip_type(ip_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pivot versions are much more forgiving" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pivotized version of get_ip_type can accept and correctly process\n", "a list" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>ip</th><th>result</th><th>src_row_index</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>172.217.15.99</td><td>Public</td><td>0</td></tr>\n",
"<tr><th>1</th><td>40.85.232.64</td><td>Public</td><td>1</td></tr>\n",
"<tr><th>2</th><td>20.38.98.100</td><td>Public</td><td>2</td></tr>\n",
"<tr><th>3</th><td>23.96.64.84</td><td>Public</td><td>3</td></tr>\n",
"<tr><th>4</th><td>65.55.44.108</td><td>Public</td><td>4</td></tr>\n",
"<tr><th>5</th><td>131.107.147.209</td><td>Public</td><td>5</td></tr>\n",
"<tr><th>6</th><td>10.0.3.4</td><td>Private</td><td>6</td></tr>\n",
"<tr><th>7</th><td>10.0.3.5</td><td>Private</td><td>7</td></tr>\n",
"<tr><th>8</th><td>13.82.152.48</td><td>Public</td><td>8</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " ip result src_row_index\n", "0 172.217.15.99 Public 0\n", "1 40.85.232.64 Public 1\n", "2 20.38.98.100 Public 2\n", "3 23.96.64.84 Public 3\n", "4 65.55.44.108 Public 4\n", "5 131.107.147.209 Public 5\n", "6 10.0.3.4 Private 6\n", "7 10.0.3.5 Private 7\n", "8 13.82.152.48 Public 8" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.ip_type(ip_list)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When using a DataFrame as an input to pivot, things are a little more\n", "complicated.\n", "We have to pass the DataFrame to the function and also supply \n", "the name of the column that contains the input data." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>nir</th><th>asn_registry</th><th>asn</th><th>asn_cidr</th><th>asn_country_code</th><th>asn_date</th><th>asn_description</th><th>query</th><th>nets</th><th>raw</th><th>referral</th><th>raw_referral</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>NaN</td><td>arin</td><td>15169</td><td>172.217.15.0/24</td><td>US</td><td>2012-04-16</td><td>GOOGLE, US</td><td>172.217.15.99</td><td>[{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>NaN</td><td>arin</td><td>8075</td><td>40.80.0.0/12</td><td>US</td><td>2015-02-23</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>40.85.232.64</td><td>[{'cidr': '40.80.0.0/12, 40.124.0.0/16, 40.74.0.0/15, 40.76.0.0/14, 40.120.0.0/14, 40.125.0.0/17...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>NaN</td><td>arin</td><td>8075</td><td>20.36.0.0/14</td><td>US</td><td>2017-10-18</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>20.38.98.100</td><td>[{'cidr': '20.128.0.0/16, 20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.64.0.0/10, 20.40.0.0/13, ...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>NaN</td><td>arin</td><td>8075</td><td>23.96.0.0/14</td><td>US</td><td>2013-06-18</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>23.96.64.84</td><td>[{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23....</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>NaN</td><td>arin</td><td>8075</td><td>65.52.0.0/14</td><td>US</td><td>2001-02-14</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>65.55.44.108</td><td>[{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>5</th><td>NaN</td><td>arin</td><td>3598</td><td>131.107.0.0/16</td><td>US</td><td>1988-11-11</td><td>MICROSOFT-CORP-AS, US</td><td>131.107.147.209</td><td>[{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>6</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>7</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>8</th><td>NaN</td><td>arin</td><td>8075</td><td>13.64.0.0/11</td><td>US</td><td>2015-03-26</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>13.82.152.48</td><td>[{'cidr': '13.64.0.0/11, 13.96.0.0/13, 13.104.0.0/14', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " nir asn_registry asn asn_cidr asn_country_code asn_date \\\n", "0 NaN arin 15169 172.217.15.0/24 US 2012-04-16 \n", "1 NaN arin 8075 40.80.0.0/12 US 2015-02-23 \n", "2 NaN arin 8075 20.36.0.0/14 US 2017-10-18 \n", "3 NaN arin 8075 23.96.0.0/14 US 2013-06-18 \n", "4 NaN arin 8075 65.52.0.0/14 US 2001-02-14 \n", "5 NaN arin 3598 131.107.0.0/16 US 1988-11-11 \n", "6 NaN NaN NaN NaN NaN NaN \n", "7 NaN NaN NaN NaN NaN NaN \n", "8 NaN arin 8075 13.64.0.0/11 US 2015-03-26 \n", "\n", " asn_description query \\\n", "0 GOOGLE, US 172.217.15.99 \n", "1 MICROSOFT-CORP-MSN-AS-BLOCK, US 40.85.232.64 \n", "2 MICROSOFT-CORP-MSN-AS-BLOCK, US 20.38.98.100 \n", "3 MICROSOFT-CORP-MSN-AS-BLOCK, US 23.96.64.84 \n", "4 MICROSOFT-CORP-MSN-AS-BLOCK, US 65.55.44.108 \n", "5 MICROSOFT-CORP-AS, US 131.107.147.209 \n", "6 NaN NaN \n", "7 NaN NaN \n", "8 MICROSOFT-CORP-MSN-AS-BLOCK, US 13.82.152.48 \n", "\n", " nets \\\n", "0 [{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0... \n", "1 [{'cidr': '40.80.0.0/12, 40.124.0.0/16, 40.74.0.0/15, 40.76.0.0/14, 40.120.0.0/14, 40.125.0.0/17... \n", "2 [{'cidr': '20.128.0.0/16, 20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.64.0.0/10, 20.40.0.0/13, ... \n", "3 [{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23.... \n", "4 [{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52... \n", "5 [{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10... \n", "6 NaN \n", "7 NaN \n", "8 [{'cidr': '13.64.0.0/11, 13.96.0.0/13, 13.104.0.0/14', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-... 
\n", "\n", " raw referral raw_referral \n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "5 NaN NaN NaN \n", "6 NaN NaN NaN \n", "7 NaN NaN NaN \n", "8 NaN NaN NaN " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.whois(ip_df, column=\"ip\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: for most functions you can ignore the parameter\n", "name and just specify it as a positional parameter.\n", "You can also use the original parameter name of the underlying\n", "function or the placeholder name \"value\".\n", "\n", "The following are all equivalent:\n", "```python\n", "IpAddress.util.ip_type(ip_list)\n", "IpAddress.util.ip_type(ip_str=ip_list)\n", "IpAddress.util.ip_type(value=ip_list)\n", "IpAddress.util.ip_type(data=ip_list)\n", "```\n", "\n", "When passing both a DataFrame and column name use:\n", "```python\n", "IpAddress.util.ip_type(data=ip_df, column=\"col_name\")\n", "```\n", "You can also pass an instance of an entity\n", "as an input parameter. The pivot code knows which attribute\n", "or attributes of an entity will provide the input value." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>ip</th><th>result</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>40.85.232.64</td><td>Public</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " ip result\n", "0 40.85.232.64 Public" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ip_entity = IpAddress(Address=\"40.85.232.64\")\n", "IpAddress.util.ip_type(ip_entity)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Iterable/DataFrame inputs and single-value functions\n", "\n", "Many of the underlying functions only accept single values\n", "as inputs. Examples of these are the data query functions - typically\n", "they expect a single host name, IP address, etc.\n", "\n", "Pivot knows about the type of parameters that the function accepts.\n", "It will adjust the input to match the expectations of the underlying\n", "function. If a list or DataFrame is passed as input to a single-value\n", "function, Pivot will split the input and call the function once for\n", "each value. It then combines the output into a single DataFrame\n", "before returning the results. \n", "\n", "You can read a bit more about how this is done in the Appendix TODO" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data queries - where does the time range come from?\n", "\n", "The Pivot class has a built-in time range. This is used by\n", "default for all queries. 
Don't worry - you can change it easily" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "TimeSpan(start=2022-06-08 22:10:15.959575+00:00, end=2022-06-09 22:10:15.959575+00:00, period=1 day, 0:00:00)" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mp.pivot.timespan" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can edit the time range interactively" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fed253eb36b248d295e2469285be0dc7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='

Set time range for pivot functions.

'), HBox(children=(DatePicker(value=dat…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mp.pivot.edit_query_time()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or by setting the timespan property directly" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "from msticpy.common.timespan import TimeSpan\n", "# TimeSpan accepts datetimes or datestrings\n", "timespan = TimeSpan(start=\"02/01/2021\", end=\"02/15/2021\")\n", "mp.pivot.timespan = timespan" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is also a convenience function\n", "for setting the time directly with Python datetimes or date strings" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "mp.pivot.current.set_timespan(start=\"2020-02-06 03:00:00\", end=\"2021-02-15 01:42:42\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also override the built-in time settings by specifying\n", "`start` and `end` as parameters." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>TenantId</th><th>TimeGenerated</th><th>AlertDisplayName</th><th>AlertName</th><th>Severity</th><th>Description</th><th>ProviderName</th><th>VendorName</th><th>VendorOriginalId</th><th>SystemAlertId</th><th>ResourceId</th><th>SourceComputerId</th><th>AlertType</th><th>ConfidenceLevel</th><th>ConfidenceScore</th><th>IsIncident</th><th>StartTimeUtc</th><th>EndTimeUtc</th><th>ProcessingEndTime</th><th>RemediationSteps</th><th>ExtendedProperties</th><th>Entities</th><th>SourceSystem</th><th>WorkspaceSubscriptionId</th><th>WorkspaceResourceGroup</th><th>ExtendedLinks</th><th>ProductName</th><th>ProductComponentName</th><th>AlertLink</th><th>Status</th><th>CompromisedEntity</th><th>Tactics</th><th>Techniques</th><th>Type</th><th>Computer</th><th>src_hostname</th><th>src_accountname</th><th>src_procname</th><th>host_match</th><th>acct_match</th><th>proc_match</th></tr>\n",
"</thead>\n", "<tbody>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [TenantId, TimeGenerated, AlertDisplayName, AlertName, Severity, Description, ProviderName, VendorName, VendorOriginalId, SystemAlertId, ResourceId, SourceComputerId, AlertType, ConfidenceLevel, ConfidenceScore, IsIncident, StartTimeUtc, EndTimeUtc, ProcessingEndTime, RemediationSteps, ExtendedProperties, Entities, SourceSystem, WorkspaceSubscriptionId, WorkspaceResourceGroup, ExtendedLinks, ProductName, ProductComponentName, AlertLink, Status, CompromisedEntity, Tactics, Techniques, Type, Computer, src_hostname, src_accountname, src_procname, host_match, acct_match, proc_match]\n", "Index: []" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dt1 = mp.pivot.timespan.start\n", "dt2 = mp.pivot.timespan.end\n", "Host.MSSentinel_cybersecuritysoc.sec_alerts(host_name=\"victim00\", start=dt1, end=dt2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Supplying extra parameters\n", "\n", "The Pivot layer will pass any unused keyword parameters to the\n", "underlying function. This *does not* usually apply to positional parameters -\n", "if you want parameters to get to the function, you have to name them\n", "explicitly.\n", "In this example the `add_query_items` parameter is passed to the underlying\n", "query function" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "application/javascript": "try {IPython.notebook.kernel.execute(\"NOTEBOOK_URL = '\" + window.location + \"'\");} catch(err) {;}", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>LogonType</th><th>count_</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>5</td><td>21650</td></tr>\n",
"<tr><th>1</th><td>3</td><td>6808</td></tr>\n",
"<tr><th>2</th><td>4</td><td>9426</td></tr>\n",
"<tr><th>3</th><td>2</td><td>109</td></tr>\n",
"<tr><th>4</th><td>10</td><td>44</td></tr>\n",
"<tr><th>5</th><td>0</td><td>7</td></tr>\n",
"<tr><th>6</th><td>9</td><td>8</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " LogonType count_\n", "0 5 21650\n", "1 3 6808\n", "2 4 9426\n", "3 2 109\n", "4 10 44\n", "5 0 7\n", "6 9 8" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Host.MSSentinel_cybersecuritysoc.wevt_logons(\n", " host_name=\"victimPc\",\n", " add_query_items=\"| summarize count() by LogonType\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pivot Pipelines\n", "\n", "Because all pivot functions accept DataFrames as input\n", "and produce DataFrames as output, it is possible\n", "to chain pivot functions into a pipeline.\n", "\n", "### Joining input to output\n", "You can join the input to the output. This usually only makes sense\n", "when the input is a DataFrame. It\n", "lets you keep the previously accumulated results and tag on the\n", "additional columns produced by the pivot function you are calling.\n", "\n", "The `join` parameter supports \"inner\", \"left\", \"right\" and \"outer\"\n", "joins (be careful with the last of these, though!).\n", "See [pivot joins documentation](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html#joining-input-to-output-data)\n", "\n", "Although joining is useful in pipelines you can use it on\n", "any function whether in a pipeline or not." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>idx</th><th>ip</th><th>type</th><th>nir</th><th>asn_registry</th><th>asn</th><th>asn_cidr</th><th>asn_country_code</th><th>asn_date</th><th>asn_description</th><th>query</th><th>nets</th><th>raw</th><th>referral</th><th>raw_referral</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>0</td><td>172.217.15.99</td><td>Public</td><td>NaN</td><td>arin</td><td>15169</td><td>172.217.15.0/24</td><td>US</td><td>2012-04-16</td><td>GOOGLE, US</td><td>172.217.15.99</td><td>[{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>1</td><td>40.85.232.64</td><td>Public</td><td>NaN</td><td>arin</td><td>8075</td><td>40.80.0.0/12</td><td>US</td><td>2015-02-23</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>40.85.232.64</td><td>[{'cidr': '40.80.0.0/12, 40.124.0.0/16, 40.74.0.0/15, 40.76.0.0/14, 40.120.0.0/14, 40.125.0.0/17...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>2</td><td>20.38.98.100</td><td>Public</td><td>NaN</td><td>arin</td><td>8075</td><td>20.36.0.0/14</td><td>US</td><td>2017-10-18</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>20.38.98.100</td><td>[{'cidr': '20.128.0.0/16, 20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.64.0.0/10, 20.40.0.0/13, ...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>3</td><td>23.96.64.84</td><td>Public</td><td>NaN</td><td>arin</td><td>8075</td><td>23.96.0.0/14</td><td>US</td><td>2013-06-18</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>23.96.64.84</td><td>[{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23....</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>4</td><td>65.55.44.108</td><td>Public</td><td>NaN</td><td>arin</td><td>8075</td><td>65.52.0.0/14</td><td>US</td><td>2001-02-14</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>65.55.44.108</td><td>[{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>5</th><td>5</td><td>131.107.147.209</td><td>Public</td><td>NaN</td><td>arin</td><td>3598</td><td>131.107.0.0/16</td><td>US</td><td>1988-11-11</td><td>MICROSOFT-CORP-AS, US</td><td>131.107.147.209</td><td>[{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>6</th><td>8</td><td>13.82.152.48</td><td>Public</td><td>NaN</td><td>arin</td><td>8075</td><td>13.64.0.0/11</td><td>US</td><td>2015-03-26</td><td>MICROSOFT-CORP-MSN-AS-BLOCK, US</td><td>13.82.152.48</td><td>[{'cidr': '13.64.0.0/11, 13.96.0.0/13, 13.104.0.0/14', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-...</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " idx ip type nir asn_registry asn asn_cidr \\\n", "0 0 172.217.15.99 Public NaN arin 15169 172.217.15.0/24 \n", "1 1 40.85.232.64 Public NaN arin 8075 40.80.0.0/12 \n", "2 2 20.38.98.100 Public NaN arin 8075 20.36.0.0/14 \n", "3 3 23.96.64.84 Public NaN arin 8075 23.96.0.0/14 \n", "4 4 65.55.44.108 Public NaN arin 8075 65.52.0.0/14 \n", "5 5 131.107.147.209 Public NaN arin 3598 131.107.0.0/16 \n", "6 8 13.82.152.48 Public NaN arin 8075 13.64.0.0/11 \n", "\n", " asn_country_code asn_date asn_description \\\n", "0 US 2012-04-16 GOOGLE, US \n", "1 US 2015-02-23 MICROSOFT-CORP-MSN-AS-BLOCK, US \n", "2 US 2017-10-18 MICROSOFT-CORP-MSN-AS-BLOCK, US \n", "3 US 2013-06-18 MICROSOFT-CORP-MSN-AS-BLOCK, US \n", "4 US 2001-02-14 MICROSOFT-CORP-MSN-AS-BLOCK, US \n", "5 US 1988-11-11 MICROSOFT-CORP-AS, US \n", "6 US 2015-03-26 MICROSOFT-CORP-MSN-AS-BLOCK, US \n", "\n", " query \\\n", "0 172.217.15.99 \n", "1 40.85.232.64 \n", "2 20.38.98.100 \n", "3 23.96.64.84 \n", "4 65.55.44.108 \n", "5 131.107.147.209 \n", "6 13.82.152.48 \n", "\n", " nets \\\n", "0 [{'cidr': '172.217.0.0/16', 'name': 'GOOGLE', 'handle': 'NET-172-217-0-0-1', 'range': '172.217.0... \n", "1 [{'cidr': '40.80.0.0/12, 40.124.0.0/16, 40.74.0.0/15, 40.76.0.0/14, 40.120.0.0/14, 40.125.0.0/17... \n", "2 [{'cidr': '20.128.0.0/16, 20.33.0.0/16, 20.34.0.0/15, 20.36.0.0/14, 20.64.0.0/10, 20.40.0.0/13, ... \n", "3 [{'cidr': '23.96.0.0/13', 'name': 'MSFT', 'handle': 'NET-23-96-0-0-1', 'range': '23.96.0.0 - 23.... \n", "4 [{'cidr': '65.52.0.0/14', 'name': 'MICROSOFT-1BLK', 'handle': 'NET-65-52-0-0-1', 'range': '65.52... \n", "5 [{'cidr': '131.107.0.0/16', 'name': 'MICROSOFT', 'handle': 'NET-131-107-0-0-1', 'range': '131.10... \n", "6 [{'cidr': '13.64.0.0/11, 13.96.0.0/13, 13.104.0.0/14', 'name': 'MSFT', 'handle': 'NET-13-64-0-0-... 
\n", "\n", " raw referral raw_referral \n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 NaN NaN NaN \n", "3 NaN NaN NaN \n", "4 NaN NaN NaN \n", "5 NaN NaN NaN \n", "6 NaN NaN NaN " ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "IpAddress.util.whois(ip_df, column=\"ip\", join=\"inner\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "### Pipelines \n", "\n", "Pivot pipelines are implemented as pandas custom accessors.\n", "Read more about [Extending pandas here](https://pandas.pydata.org/pandas-docs/stable/development/extending.html).\n", "\n", "When you load Pivot it adds the `mp_pivot` accessor to the pandas\n", "`DataFrame` class. This\n", "appears as an attribute of DataFrames.\n", "\n", "```python\n", ">>> ips_df.mp_pivot\n", "\n", "```\n", "\n", "The main pipelining function `run` is a method of `mp_pivot`.\n", "`run` requires two parameters - the pivot function to run and\n", "the column to use as input. See the [mp_pivot.run documentation](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html#mp-pivot-run).\n" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "# Create a dataframe for input\n", "ip_list = [\n", " \"192.168.40.32\",\n", " \"192.168.1.216\",\n", " \"192.168.153.17\",\n", " \"3.88.48.125\",\n", " \"10.200.104.20\",\n", " \"192.168.90.101\",\n", " \"192.168.150.50\",\n", " \"172.16.100.31\",\n", " \"192.168.30.189\",\n", " \"10.100.199.10\",\n", "]\n", "ips_df = pd.DataFrame(ip_list, columns=[\"IP\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pipeline example\n", "\n", "Here is an example of using `mp_pivot` to call 4 pivot functions, each\n", "using the output of the previous function as input and using\n", "the `join` parameter to accumulate the results from each\n", "stage.\n", "\n", "Let's step through it line by line.\n", "1. The whole thing is surrounded by a pair of parentheses - this is just\n", " to let us split the whole expression over multiple lines without\n", " Python complaining.\n", "2. Next we have `ips_df` - this is just the starting DataFrame, our input data.\n", "3. 
Next we call the `mp_pivot.run()` accessor method on this dataframe.\n", " We pass it the pivot function that we want to run and the input column name.\n", " This column name is the column in `ips_df` where our input IP addresses are.\n", " We've also specified a `join` type of `inner`. In this case the join type doesn't\n", " really matter since we know we get exactly one output row for every input row.\n", "4. We're using the pandas `query` function to filter out unwanted entries\n", " from the previous stage. In this case we only want Public IP addresses. \n", " This illustrates that you can intersperse standard pandas functions\n", " in the same pipeline. We could have also added a column selector expression\n", " ([[\"col1\", \"col2\"...]]) if we wanted to filter the columns passed to the \n", " next stage.\n", "5. We are calling a further pivot function - `whois`. Remember the \"column\" parameter\n", " always refers to the input column, i.e. the column from the previous stage\n", " that we want to use in this stage.\n", "6. We are calling `geoloc` to get geo location details joining with a left\n", " join - this preserves the input data rows and adds null columns in any cases\n", " where the pivot function returned no result.\n", "7. This is the same as step 6 except that it is a data query to see if we have any alerts\n", " that contain these IP addresses. Remember, in the case of data queries\n", " we have to name the specific query parameter that we want the input to \n", " go to. In this case, each row value in the \"ip\" column from the previous\n", " stage will be sent to the query.\n", "8. Finally we close the parentheses to form a valid Python expression.\n", " The whole expression returns a DataFrame so we can add further pandas\n", " operations here (like the `.head(5)` shown here)."
] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "application/javascript": "try {IPython.notebook.kernel.execute(\"NOTEBOOK_URL = '\" + window.location + \"'\");} catch(err) {;}", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
5 rows × 63 columns
" ], "text/plain": [ " IP ip result asn asn_cidr asn_country_code \\\n", "0 3.88.48.125 3.88.48.125 Public 14618 3.80.0.0/12 US \n", "1 3.88.48.125 3.88.48.125 Public 14618 3.80.0.0/12 US \n", "2 3.88.48.125 3.88.48.125 Public 14618 3.80.0.0/12 US \n", "3 3.88.48.125 3.88.48.125 Public 14618 3.80.0.0/12 US \n", "4 3.88.48.125 3.88.48.125 Public 14618 3.80.0.0/12 US \n", "\n", " asn_date asn_description asn_registry \\\n", "0 2017-12-20 AMAZON-AES, US arin \n", "1 2017-12-20 AMAZON-AES, US arin \n", "2 2017-12-20 AMAZON-AES, US arin \n", "3 2017-12-20 AMAZON-AES, US arin \n", "4 2017-12-20 AMAZON-AES, US arin \n", "\n", " nets \\\n", "0 [{'cidr': '3.0.0.0/9', 'name': 'AT-88-Z', 'handle': 'NET-3-0-0-0-1', 'range': '3.0.0.0 - 3.127.2... \n", "1 [{'cidr': '3.0.0.0/9', 'name': 'AT-88-Z', 'handle': 'NET-3-0-0-0-1', 'range': '3.0.0.0 - 3.127.2... \n", "2 [{'cidr': '3.0.0.0/9', 'name': 'AT-88-Z', 'handle': 'NET-3-0-0-0-1', 'range': '3.0.0.0 - 3.127.2... \n", "3 [{'cidr': '3.0.0.0/9', 'name': 'AT-88-Z', 'handle': 'NET-3-0-0-0-1', 'range': '3.0.0.0 - 3.127.2... \n", "4 [{'cidr': '3.0.0.0/9', 'name': 'AT-88-Z', 'handle': 'NET-3-0-0-0-1', 'range': '3.0.0.0 - 3.127.2... \n", "\n", " nir query raw raw_referral referral CountryCode CountryName \\\n", "0 None 3.88.48.125 None None None US United States \n", "1 None 3.88.48.125 None None None US United States \n", "2 None 3.88.48.125 None None None US United States \n", "3 None 3.88.48.125 None None None US United States \n", "4 None 3.88.48.125 None None None US United States \n", "\n", " State City Longitude Latitude Asn edges Type_x \\\n", "0 Virginia Ashburn -77.4728 39.0481 None {} geolocation \n", "1 Virginia Ashburn -77.4728 39.0481 None {} geolocation \n", "2 Virginia Ashburn -77.4728 39.0481 None {} geolocation \n", "3 Virginia Ashburn -77.4728 39.0481 None {} geolocation \n", "4 Virginia Ashburn -77.4728 39.0481 None {} geolocation \n", "\n", " AdditionalData ... \\\n", "0 {} ... \n", "1 {} ... \n", "2 {} ... 
\n", "3 {} ... \n", "4 {} ... \n", "\n", " AlertType \\\n", "0 8ecf8077-cf51-4820-aadd-14040956f35d_8a369bd2-97b6-4fe2-922a-cd170faf25bc \n", "1 ThreatIntelligence \n", "2 ThreatIntelligence \n", "3 ThreatIntelligence \n", "4 ThreatIntelligence \n", "\n", " ConfidenceLevel ConfidenceScore IsIncident StartTimeUtc \\\n", "0 NaN False 2020-12-19 13:04:59+00:00 \n", "1 83 NaN False 2020-12-23 13:48:23+00:00 \n", "2 83 NaN False 2020-12-23 13:48:23+00:00 \n", "3 83 NaN False 2020-12-23 13:48:23+00:00 \n", "4 83 NaN False 2020-12-23 13:48:23+00:00 \n", "\n", " EndTimeUtc ProcessingEndTime RemediationSteps \\\n", "0 2020-12-19 19:04:59+00:00 2020-12-19 19:10:17+00:00 \n", "1 2020-12-23 13:48:23+00:00 2020-12-23 14:08:15+00:00 \n", "2 2020-12-23 13:48:23+00:00 2020-12-23 14:08:15+00:00 \n", "3 2020-12-23 13:48:23+00:00 2020-12-23 14:08:15+00:00 \n", "4 2020-12-23 13:48:23+00:00 2020-12-23 14:08:16+00:00 \n", "\n", " ExtendedProperties \\\n", "0 {\\r\\n \"Query\": \"// The query_now parameter (in UTC format) was prepended to the query to reflec... \n", "1 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "2 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "3 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "4 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "\n", " Entities \\\n", "0 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"Address\": \"3.88.48.125\",\\r\\n \"Type\": \"ip\"\\r\\n }\\r\\n] \n", "1 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "2 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... 
\n", "3 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "4 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "\n", " SourceSystem WorkspaceSubscriptionId WorkspaceResourceGroup \\\n", "0 Detection d1d8779d-38d7-4f06-91db-9cbc8de0176f soc \n", "1 Detection d1d8779d-38d7-4f06-91db-9cbc8de0176f soc \n", "2 Detection d1d8779d-38d7-4f06-91db-9cbc8de0176f soc \n", "3 Detection d1d8779d-38d7-4f06-91db-9cbc8de0176f soc \n", "4 Detection d1d8779d-38d7-4f06-91db-9cbc8de0176f soc \n", "\n", " ExtendedLinks ProductName ProductComponentName \\\n", "0 Azure Sentinel Scheduled Alerts \n", "1 Azure Sentinel Microsoft Threat Intelligence Analytics \n", "2 Azure Sentinel Microsoft Threat Intelligence Analytics \n", "3 Azure Sentinel Microsoft Threat Intelligence Analytics \n", "4 Azure Sentinel Microsoft Threat Intelligence Analytics \n", "\n", " AlertLink Status CompromisedEntity Tactics Type_y \\\n", "0 New CommandAndControl SecurityAlert \n", "1 New 3.88.48.125 Unknown SecurityAlert \n", "2 New 3.88.48.125 Unknown SecurityAlert \n", "3 New 3.88.48.125 Unknown SecurityAlert \n", "4 New 3.88.48.125 Unknown SecurityAlert \n", "\n", " SystemAlertId1 \\\n", "0 fdc54c12-efba-38b0-8379-f06d7fbfd34a \n", "1 625ff9af-dddc-0cf8-9d4b-e79067fa2e71 \n", "2 c977f904-ab30-d57e-986f-9d6ebf72771b \n", "3 9ee547e4-cba1-47d1-e1f9-87247b693a52 \n", "4 83a0e08a-1adb-ef75-1c56-f6c9ce25ca69 \n", "\n", " ExtendedProperties1 \\\n", "0 {\\r\\n \"Query\": \"// The query_now parameter (in UTC format) was prepended to the query to reflec... \n", "1 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "2 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... 
\n", "3 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "4 {\\r\\n \"Query\": \"CommonSecurityLog| where RequestURL hasprefix(\\\"www.arboretum.hu\\\") | where Tim... \n", "\n", " Entities1 \\\n", "0 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"Address\": \"3.88.48.125\",\\r\\n \"Type\": \"ip\"\\r\\n }\\r\\n] \n", "1 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "2 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "3 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "4 [\\r\\n {\\r\\n \"$id\": \"3\",\\r\\n \"DnsDomain\": \"www.arboretum.hu\",\\r\\n \"HostName\": \"www.arbo... \n", "\n", " MatchingIps \n", "0 [3.88.48.125] \n", "1 [3.88.48.125] \n", "2 [3.88.48.125] \n", "3 [3.88.48.125] \n", "4 [3.88.48.125] \n", "\n", "[5 rows x 63 columns]" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(\n", " ips_df\n", " .mp_pivot.run(IpAddress.util.ip_type, column=\"IP\", join=\"inner\")\n", " .query(\"result == 'Public'\").head(10)\n", " .mp_pivot.run(IpAddress.util.whois, column=\"ip\", join=\"left\")\n", " .mp_pivot.run(IpAddress.util.geoloc, column=\"ip\", join=\"left\")\n", " .mp_pivot.run(IpAddress.MSSentinel_cybersecuritysoc.sec_list_alerts_for_ip, source_ip_list=\"ip\", join=\"left\")\n", ").head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other pipeline functions\n", "\n", "In addition to `run`, the `mp_pivot` accessor also \n", "has the following functions:\n", "- `display` - this simply displays the data at the point called in\n", " the pipeline. You can add an optional title, filtering and the number\n", " of rows to display.\n", "- `tee` - this forks a copy of the dataframe at the point it is\n", " called in the pipeline. 
It will assign the forked copy to the name\n", " given in the `var_name` parameter. If there is an existing variable of\n", " the same name it will not overwrite it unless you add the `clobber=True`\n", " parameter.\n", " \n", "In both cases the pipelined data is passed through unchanged.\n", "See [Pivot functions help](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html#mp-pivot-display)\n", "for more details.\n", "\n", "Use of these is shown below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", " ...\n", " .mp_pivot.run(entities.IpAddress.util.geoloc, column=\"ip\", join=\"left\")\n", " .mp_pivot.display(title=\"Geo Lookup\", cols=[\"IP\", \"City\"]) # << display an intermediate result\n", " .mp_pivot.tee(var_name=\"geoip_df\", clobber=True) # << save a copy called 'geoip_df'\n", " .mp_pivot.run(entities.IpAddress.AzureSentinel.SecurityAlert_list_alerts_for_ip, source_ip_list=\"ip\", join=\"left\")\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the next release we've also implemented:\n", "- `tee_exec` - this executes a function on a forked copy of the DataFrame.\n", " The function must be a pandas function or custom accessor. A\n", " good example of the use of this might be creating a plot or summary\n", " table to display partway through the pipeline." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extending Pivot - adding your own (or someone else's) functions\n", "\n", "You can add pivot functions of your own. You need to supply:\n", "- the function\n", "- some metadata that describes where the function can be found\n", " and how the function works\n", "\n", "\n", "Full details of this are [described here](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html#adding-custom-functions-to-the-pivot-interface)."
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "from hashlib import md5\n", "\n", "def my_func2(input: str):\n", " # Hash the supplied value; render the digest as dash-separated hex bytes\n", " md5_hash = \"-\".join(hex(b)[2:] for b in md5(input.encode(\"utf-8\")).digest())\n", " return {\n", " \"Title\": input.upper(),\n", " \"Hash\": md5_hash\n", " }\n", "\n", "\n", "mp.Pivot.add_pivot_function(\n", " func=my_func2,\n", " container=\"cyber\", # which container it will appear in on the entity\n", " input_type=\"value\",\n", " entity_map={\"Host\": \"HostName\"},\n", " func_input_value_arg=\"input\",\n", " func_new_name=\"il_upper_hash_name\",\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Now run the function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "Host.cyber.il_upper_hash_name(\"host_name\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "We've taken a short tour through the *MSTICPy* Pivot functions, looking at how\n", "they make the functionality in the package easier to discover\n", "and use.\n", "I'm particularly excited about the pipeline functionality.\n", "In the next release we're going to make it possible to define\n", "reusable pipelines in configuration files and execute them\n", "with a single function call. This should help streamline\n", "some common patterns in notebooks for cyber hunting and investigation.\n", "\n", "Please send any feedback or suggestions for improvements\n", "to msticpy@microsoft.com or create an issue on https://github.com/microsoft/msticpy.\n", "\n", "Happy hunting!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Appendix - how do pivot wrappers work?\n", "\n", "In Python you can create functions that return other functions.\n", "On the way they can change how the arguments and output are\n", "processed.\n", "\n", "Take this simple function that just applies proper capitalization\n", "to an input string."
] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello\n" ] } ], "source": [ "def print_me(arg):\n", " print(arg.capitalize())\n", " \n", "print_me(\"hello\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we try to pass a list to this function we get an \n", "expected exception about lists not supporting `capitalize`.\n", "(The saved output below shows a `NameError` because `print_me` was not\n", "defined when the cell was last run; with `print_me` defined, Python raises\n", "`AttributeError: 'list' object has no attribute 'capitalize'`.)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'print_me' is not defined", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mprint_me\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m\"hello\"\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m\"world\"\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mNameError\u001b[0m: name 'print_me' is not defined" ] } ], "source": [ "print_me([\"hello\", \"world\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We could create a wrapper function that checked the\n", "input and iterated over the individual items if `arg` is a list.\n", "This works but we don't want to have to do this for every \n", "function that we want to have flexible input!"
] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello\n", "How\n", "Are\n", "You\n", "?\n" ] } ], "source": [ "def print_me_list(arg):\n", " if isinstance(arg, list):\n", " for item in arg:\n", " print_me(item)\n", " else:\n", " print_me(arg)\n", " \n", "print_me_list(\"hello\")\n", "print_me_list([\"how\", \"are\", \"you\", \"?\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead we can create a function wrapper. The outer function\n", "`dont_care_func` defines an inner function, `list_or_str`, and then\n", "returns this function. The inner function `list_or_str` is what\n", "implements the same \"is-this-a-string-or-list\" logic that we \n", "saw in the previous example. \n", "Crucially though, it isn't hard-coded to call `print_me` but\n", "calls whatever function was passed to the outer function\n", "`dont_care_func`." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "# Our magic wrapper\n", "\n", "def dont_care_func(func):\n", " \n", " def list_or_str(arg):\n", " if isinstance(arg, list):\n", " for item in arg:\n", " func(item)\n", " else:\n", " func(arg)\n", " return list_or_str" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How do we use this?\n", "\n", "We simply pass the function that we want to wrap to\n", "`dont_care_func`. Recall that this function just returns\n", "an instance of the inner function. In this particular instance\n", "the value `func` will have been replaced by the actual function\n", "`print_me`." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "print_stuff = dont_care_func(print_me)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we have a wrapped version of `print_me` that can\n", "handle different types of input. Magic!"
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Hello\n", "How\n", "Are\n", "You\n", "?\n" ] } ], "source": [ "print_stuff(\"hello\")\n", "print_stuff([\"how\", \"are\", \"you\", \"?\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also define further functions and create wrapped\n", "versions of those by passing them to `dont_care_func`." ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [], "source": [ "def shout_me(arg):\n", " print(arg.upper(), \"\\U0001F92C!\", end=\" \")\n", " \n", "shout_stuff = dont_care_func(shout_me)" ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "HELLO 🤬! HOW 🤬! ARE 🤬! YOU 🤬! ? 🤬! " ] } ], "source": [ "shout_stuff(\"hello\")\n", "shout_stuff([\"how\", \"are\", \"you\", \"?\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The wrapper functionality in Pivot is a bit more complex than\n", "this but essentially operates this way." 
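] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
To give a flavour of what 'a bit more complex' can mean, here is a minimal sketch of a wrapper that returns its results instead of printing them, and that keeps the wrapped function's name and docstring by using `functools.wraps`. This is not the MSTICPy implementation; the names `flexible_input` and `proper_case` are hypothetical and exist only for this illustration.

```python
import functools


def flexible_input(func):
    '''Wrap func so it accepts a single value or a list/tuple of values.'''

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(arg):
        if isinstance(arg, (list, tuple)):
            # Call func once per item and collect the results in order
            return [func(item) for item in arg]
        # Single value: still return a list so the return type is uniform
        return [func(arg)]

    return wrapper


@flexible_input
def proper_case(text):
    '''Return text with its first letter capitalized.'''
    return text.capitalize()


single_result = proper_case('hello')
list_result = proper_case(['how', 'are', 'you'])
print(single_result)
print(list_result)
```

Returning a uniform list is just one design choice made for this sketch; a real wrapper could equally return a DataFrame so that the output can be joined back to the input.
"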
] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.7 ('msticpy')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "vscode": { "interpreter": { "hash": "0f1a8e166ce5c1ec1911a36e4fdbd34b2f623e2a3442791008b8ac429a1d6070" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": { "2aeaa3525526453282e5e1934aa4a923": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "width": "95%" } }, "d24faa135bc84d798e4028cbbf8a91b9": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "width": "95%" } } }, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }