{ "cells": [ { "cell_type": "markdown", "id": "fdab3e77", "metadata": {}, "source": [ "# Guided Hunting - Azure Resource Explorer" ] }, { "cell_type": "markdown", "id": "2c5df446", "metadata": {}, "source": [ "
\n", "  Details...\n", " \n", "**Notebook Version:** 1.0
\n", "**Python Version:** Python 3.7 (including Python 3.6 - AzureML)
\n", "**Required Packages**: kqlmagic, msticpy, pandas, numpy, matplotlib, networkx, ipywidgets, ipython
\n", "**Platforms Supported**:\n", "- Azure Notebooks Free Compute\n", "- Azure Notebooks DSVM\n", "- OS Independent\n", "- Azure Machine Learning Notebooks\n", "\n", "**Data Sources Required**:\n", "- Log Analytics\n", "    - SecurityAlert\n", "    - SigninLogs\n", "    - AzureActivity\n", "- ResourceGraph\n", "    - Resources\n", "\n", "- (Optional)\n", "    - VirusTotal (with API key)\n", "    - AlienVault OTX (with API key)\n", "    - IBM X-Force (with API key)\n", "
\n", "\n", "This notebook guides you through an investigation of an Azure resource of your choice and lets you pivot to related resources using Azure Resource Graph. The notebook uses the SecurityAlert, SigninLogs, and AzureActivity logs.\n", "\n", "You can begin with a resource or a security alert that you want to investigate, or use the queries below to find one of interest.\n", "\n", "The goal of the notebook is to help you better understand potentially malicious behavior across your Azure resources and to pivot to resources of interest as you hunt." ] }, { "cell_type": "markdown", "id": "aba25c62", "metadata": {}, "source": [ "
\n", " \n", "
" ] }, { "cell_type": "markdown", "id": "46137334", "metadata": {}, "source": [ "---\n", "## Notebook initialization\n", "The next cell:\n", "- Checks for the correct Python version\n", "- Checks versions and optionally installs required packages\n", "- Imports the required packages into the notebook\n", "- Sets a number of configuration options.\n", "\n", "This should complete without errors. If you encounter errors or warnings look at the following two notebooks:\n", "- [TroubleShootingNotebooks](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/TroubleShootingNotebooks.ipynb)\n", "- [ConfiguringNotebookEnvironment](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)\n", "\n", "If you are running in the Azure Sentinel Notebooks environment (Azure Notebooks or Azure ML) you can run live versions of these notebooks:\n", "- [Run TroubleShootingNotebooks](./TroubleShootingNotebooks.ipynb)\n", "- [Run ConfiguringNotebookEnvironment](./ConfiguringNotebookEnvironment.ipynb)\n", "\n", "You may also need to do some additional configuration to successfully use functions such as Threat Intelligence service lookup and Geo IP lookup. 
\n", "There are more details about this in the `ConfiguringNotebookEnvironment` notebook and in these documents:\n", "- [msticpy configuration](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html)\n", "- [Threat intelligence provider configuration](https://msticpy.readthedocs.io/en/latest/data_acquisition/TIProviders.html#configuration-file)" ] }, { "cell_type": "code", "execution_count": null, "id": "cf3e106f", "metadata": { "scrolled": true }, "outputs": [], "source": [ "from pathlib import Path\n", "import os\n", "import sys\n", "import warnings\n", "from IPython.display import display, HTML, Markdown\n", "REQ_PYTHON_VER=(3, 6)\n", "REQ_MSTICPY_VER=(0, 6, 0)\n", "\n", "display(HTML(\"Checking for msticpy update\"))\n", "\n", "%pip install --upgrade msticpy\n", "\n", "import msticpy\n", "\n", "msticpy.init_notebook(namespace=globals())" ] }, { "cell_type": "markdown", "id": "5f796300", "metadata": {}, "source": [ "## Get WorkspaceId and Authenticate to Log Analytics and ResourceGraph" ] }, { "cell_type": "markdown", "id": "1e70f47f", "metadata": {}, "source": [ "Run the cells below to connect to your Log Analytics workspace. If you haven't already, please fill in the relevant information in `msticpyconfig.yaml`. This file is located in the [Azure Sentinel Notebooks folder](https://github.com/Azure/Azure-Sentinel-Notebooks) that contains this notebook. There is more information on how to do this in the Notebook initialization section above. 
You may need to restart the kernel after doing so and rerun any cells you've already run so that they pick up the new settings.\n", "\n", "If you are unfamiliar with connecting to Log Analytics or want a more in-depth walkthrough, check out the [Getting Started with Azure Sentinel Notebook](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/A%20Getting%20Started%20Guide%20For%20Azure%20Sentinel%20Notebooks.ipynb).\n", "\n", "If you are running this notebook locally, you may also need to install [Azure CLI](https://docs.microsoft.com/cli/azure/install-azure-cli). After installing it, restart your computer and relaunch the notebook." ] }, { "cell_type": "markdown", "id": "b24f5577", "metadata": {}, "source": [ "### Log in to Azure" ] }, { "cell_type": "markdown", "id": "1a8e703e", "metadata": {}, "source": [ "Log in to your Azure account by running the following cell." ] }, { "cell_type": "code", "execution_count": null, "id": "ae8f9a09", "metadata": { "scrolled": true }, "outputs": [], "source": [ "!az login" ] }, { "cell_type": "markdown", "id": "9ac19c26", "metadata": {}, "source": [ "### Connect to your Azure Workspace" ] }, { "cell_type": "code", "execution_count": null, "id": "156618e6", "metadata": {}, "outputs": [], "source": [ "# See if we have an Azure Sentinel Workspace defined in our config file.\n", "# If not, let the user specify Workspace and Tenant IDs\n", "\n", "ws_config = WorkspaceConfig()\n", "if not ws_config.config_loaded:\n", "    ws_config.prompt_for_ws()" ] }, { "cell_type": "markdown", "id": "3e4fea8f", "metadata": {}, "source": [ "### Connect to ResourceGraph and LogAnalytics" ] }, { "cell_type": "code", "execution_count": null, "id": "3aee51ba", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Connect to Resource Graph\n", "\n", "qp_RG = QueryProvider(\"ResourceGraph\")\n", "qp_RG.connect(ws_config)" ] }, { "cell_type": "code", "execution_count": null, "id": "eeaabf65", "metadata": {}, "outputs": [], "source": [ 
"# Connect to Log Analytics\n", "\n", "qp_LA = QueryProvider(\"LogAnalytics\")\n", "qp_LA.connect(ws_config)" ] }, { "cell_type": "markdown", "id": "fa9d7f4b", "metadata": {}, "source": [ "## Select Resource to Investigate" ] }, { "cell_type": "markdown", "id": "45a2faec", "metadata": {}, "source": [ "### Select Time Range" ] }, { "cell_type": "markdown", "id": "f1c1721e", "metadata": {}, "source": [ "This time range will be used in all queries that follow in this notebook to retrieve any related alerts connected to your chosen resource." ] }, { "cell_type": "code", "execution_count": null, "id": "fb642595", "metadata": { "scrolled": true }, "outputs": [], "source": [ "q_times = nbwidgets.QueryTime(units='day', max_before=20, before=5, max_after=1)\n", "q_times.display()" ] }, { "cell_type": "markdown", "id": "97079292", "metadata": {}, "source": [ "### Select Resource" ] }, { "cell_type": "markdown", "id": "ed2f2bb5", "metadata": {}, "source": [ "#### Enter ResourceID" ] }, { "cell_type": "markdown", "id": "2e748c1c", "metadata": {}, "source": [ "If you already know which resource you want to investigate, enter its resource ID in the text box after running the following cell. \n", "\n", "Skip this cell if you would like to use related alerts to select a resource to investigate. The below cells will provide some context on related alerts and offer you a chance to select a resource directly." 
] }, { "cell_type": "code", "execution_count": null, "id": "31db1128", "metadata": {}, "outputs": [], "source": [ "selected_resourceName = widgets.Text(\n", " placeholder='insert resource ID',\n", " description='Resource ID:',\n", " disabled=False\n", ")\n", "\n", "display(selected_resourceName)" ] }, { "cell_type": "markdown", "id": "6b32562c", "metadata": {}, "source": [ "#### Gather related alert information and select resource" ] }, { "cell_type": "markdown", "id": "644b1b6a", "metadata": {}, "source": [ "Run the following cells for a summary table of alert activity in your workspace. Resources with more SecurityAlert results may be more likely to be victims of malicious activity." ] }, { "cell_type": "code", "execution_count": null, "id": "b12029e9", "metadata": { "scrolled": true }, "outputs": [], "source": [ "alert_query = f\"\"\"\n", "SecurityAlert\n", "| where TimeGenerated >= datetime(\"{q_times.start}\")\n", "| where TimeGenerated <= datetime(\"{q_times.end}\")\n", "| where isnotempty(ResourceId)\n", "| extend json_extendProp = parse_json(ExtendedProperties)\n", "| extend UserName = json_extendProp['User Name'], ServiceId = json_extendProp['ServiceId'], WdatpTenantId = json_extendProp['WdatpTenantId'], FileName = json_extendProp['File Name'], resourceType = json_extendProp['resourceType'], AttackerSourceIP = json_extendProp['Attacker source IP'], numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'], numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'], numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'], topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'], RDPSessionInitiated = json_extendProp['Was RDP session initiated'], attackerSourceComputerName = json_extendProp['Attacker source computer name'] \n", "| project-away json_extendProp\n", 
"\"\"\"\n", "\n", "alert_df = qp_LA.exec_query(alert_query)\n", "\n", "\n", "sum_alert_query = f\"\"\"\n", "SecurityAlert\n", "| where TimeGenerated >= datetime(\"{q_times.start}\")\n", "| where TimeGenerated <= datetime(\"{q_times.end}\")\n", "| where isnotempty(ResourceId)\n", "| extend json_extendProp = parse_json(ExtendedProperties)\n", "| extend UserName = json_extendProp['User Name'], ServiceId = json_extendProp['ServiceId'], WdatpTenantId = json_extendProp['WdatpTenantId'], FileName = json_extendProp['File Name'], resourceType = json_extendProp['resourceType'], AttackerSourceIP = json_extendProp['Attacker source IP'], numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'], numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'], numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'], topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'], RDPSessionInitiated = json_extendProp['Was RDP session initiated'], attackerSourceComputerName = json_extendProp['Attacker source computer name'] \n", "| project-away json_extendProp\n", "| summarize count() by AlertName, AlertSeverity, CompromisedEntity, tostring(resourceType)\n", "| sort by count_\n", "\"\"\"\n", "\n", "sum_alert_df = qp_LA.exec_query(sum_alert_query)\n", "display(sum_alert_df)" ] }, { "cell_type": "markdown", "id": "5e3703de", "metadata": {}, "source": [ "Run the cell below to see a dropdown listing all resources involved in the alerts shown. Select one that you would like to investigate. Skip this section if you have already entered a ResourceID of interest above." 
] }, { "cell_type": "code", "execution_count": null, "id": "05522b73", "metadata": {}, "outputs": [], "source": [ "resource_types = [i if i else \"N/A\" for i in alert_df.resourceType]\n", "resources = set(zip(alert_df.CompromisedEntity, resource_types))\n", "resources = [i for i in resources if i[0]]\n", "resources = [str(i).replace('(','').replace(')','').replace(\"'\", '') for i in resources]\n", "resource_dropdown = widgets.Dropdown(options = resources, description='Resource:')\n", "display(resource_dropdown)" ] }, { "cell_type": "markdown", "id": "5470d294", "metadata": {}, "source": [ "## View Resource Graph " ] }, { "cell_type": "markdown", "id": "00a795c8", "metadata": {}, "source": [ "This section of the notebook allows you to investigate resources related to the resource you have chosen and better understand your resource graph environment by generating a visual representation of the graph. You can reselect the resource you want to investigate in the sections above at any time. Rerun the below cells to generate a new graph if you select a different resource.\n", "\n", "Run the following cells to generate the resource graph." 
] }, { "cell_type": "markdown", "id": "2ce08b2e", "metadata": {}, "source": [ "#### Import required graph libraries" ] }, { "cell_type": "code", "execution_count": null, "id": "11ed31bb", "metadata": {}, "outputs": [], "source": [ "# Import libraries\n", "\n", "import networkx as nx\n", "from bokeh.io import output_notebook, show, save\n", "from bokeh.models import (BoxSelectTool, Circle, EdgesAndLinkedNodes, HoverTool,\n", " MultiLine, NodesAndLinkedEdges, Plot, Range1d, TapTool, ColumnDataSource, LabelSet)\n", "from bokeh.plotting import figure\n", "from bokeh.plotting import from_networkx\n", "from bokeh.palettes import Blues8, Reds8, Purples8, Oranges8, Viridis8, Spectral8, Blues256\n", "from bokeh.transform import linear_cmap, factor_cmap\n", "from networkx.algorithms import community\n", "from ipywidgets import interact, interactive, fixed, interact_manual\n", "from bokeh.io import push_notebook, show, output_notebook\n", "\n", "output_notebook()" ] }, { "cell_type": "markdown", "id": "99673c7f", "metadata": {}, "source": [ "#### Validate selected resource" ] }, { "cell_type": "markdown", "id": "c6784a4c", "metadata": {}, "source": [ "The following cell will confirm if the resource you selected exists and is valid for generating the investigation graph. If the resource is not found, feel free to use the dropdown or text box to enter a different resource and return to this cell." 
] }, { "cell_type": "code", "execution_count": null, "id": "664cf0fc", "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Query ResourceGraph for resource info\n", "if selected_resourceName.value == '':\n", "    print(\"SELECTED: \", resource_dropdown.value.split(',')[0])\n", "    rg_query = f\"\"\"\n", "    Resources\n", "    | where name == \"{resource_dropdown.value.split(',')[0]}\"\n", "    \"\"\"\n", "else:\n", "    print(\"SELECTED: \", selected_resourceName.value)\n", "    rg_query = f\"\"\"\n", "    Resources\n", "    | where name == \"{selected_resourceName.value}\"\n", "    \"\"\"\n", "\n", "rg_df = qp_RG.exec_query(rg_query)\n", "\n", "try:\n", "    # Displaying the first row is inside the try block so that an empty\n", "    # result prints the message below instead of raising an IndexError\n", "    display(pd.DataFrame(rg_df.iloc[0].T))\n", "    resource_id_list = [rg_df['id'][0]]\n", "    rg = rg_df['resourceGroup'][0]\n", "    print(\"Resource found!\")\n", "\n", "    # Collect every resource that shares the selected resource's resource group\n", "    related_rg_query = f\"\"\"\n", "    Resources\n", "    | where resourceGroup == \"{rg}\"\n", "    \"\"\"\n", "\n", "    related_rg_df = qp_RG.exec_query(related_rg_query)\n", "    resource_id_list.extend(list(related_rg_df['id']))\n", "    related_rg_df['managedByVal'] = related_rg_df['managedBy'].str.split('/').str[-1]\n", "\n", "except Exception:\n", "    print(\"No results for that resource. Please select a different resource above.\")\n", "\n", "#print(\"You can select a different resource here and run the cell again.\")\n", "#resource_dropdown = widgets.Dropdown(options = resources, description='Resource:')\n", "#display(resource_dropdown)" ] }, { "cell_type": "markdown", "id": "55153a84", "metadata": {}, "source": [ "#### Generate graph" ] }, { "cell_type": "markdown", "id": "636079f6", "metadata": {}, "source": [ "The following cells generate a NetworkX graph of your resource environment. Run each cell in order; each one prints a confirmation message when it finishes." 
] }, { "cell_type": "code", "execution_count": null, "id": "cd7cee8e", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Parse for relationships between resource types\n", "\n", "network_rg_df = related_rg_df.loc[related_rg_df['managedByVal'] != '']\n", "vm_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.compute/virtualmachines']\n", "nsg_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/networksecuritygroups']\n", "ip_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/publicipaddresses']\n", "\n", "# Get associated NIC to a given VM\n", "def get_associated_nic(vm_name):\n", "    nic_query = f\"\"\" Resources\n", "    | where name == \"{vm_name}\"\n", "    | extend d=parse_json(properties)\n", "    | project result = d.networkProfile['networkInterfaces'][0][\"id\"]\n", "    \"\"\"\n", "    nic_id = qp_RG.exec_query(nic_query)['result']\n", "    # Fall back to the second row when the first has no NIC id\n", "    if nic_id[0] is None:\n", "        final_nic_id = nic_id[1]\n", "    else:\n", "        final_nic_id = nic_id[0]\n", "\n", "    nic_name_query = f\"\"\"Resources\n", "    | where id == \"{final_nic_id}\"\n", "    | project name\n", "    \"\"\"\n", "    nic_name = qp_RG.exec_query(nic_name_query)['name'][0]\n", "\n", "    return nic_name\n", "\n", "\n", "# Get associated NIC to a given NSG\n", "def get_associated_nic_nsg(nsg_name):\n", "    nic_query = f\"\"\" Resources\n", "    | where name == \"{nsg_name}\"\n", "    | extend d=parse_json(properties)\n", "    | project result = d.networkInterfaces[0]['id']\n", "    \"\"\"\n", "    nic_id = qp_RG.exec_query(nic_query)['result'][0]\n", "\n", "    nic_name_query = f\"\"\"Resources\n", "    | where id == \"{nic_id}\"\n", "    | project name\n", "    \"\"\"\n", "    nic_name = qp_RG.exec_query(nic_name_query)['name'][0]\n", "\n", "    return nic_name\n", "\n", "\n", "vm_nic_pairs = []\n", "vm_nic_dict = {}\n", "\n", "# Look up each association once and reuse the result for both the list and the dict\n", "for vm in vm_rg_df['name']:\n", "    nic = get_associated_nic(vm)\n", "    vm_nic_pairs.append((vm, nic))\n", "    vm_nic_dict[vm] = nic\n", "\n", "vm_nic_df = pd.DataFrame(vm_nic_pairs, 
columns =['name', 'nic'])\n", "\n", "nic_nsg_pairs = []\n", "nic_nsg_dict = {}\n", "\n", "for nsg in nsg_rg_df['name']:\n", "    nic = get_associated_nic_nsg(nsg)\n", "    nic_nsg_pairs.append((nsg, nic))\n", "    nic_nsg_dict[nsg] = nic\n", "nic_nsg_df = pd.DataFrame(nic_nsg_pairs, columns =['nsg', 'nic'])\n", "\n", "\n", "# Get associated NIC to a given IP\n", "def get_associated_nic_ip(ip_name):\n", "    nic_query = f\"\"\" Resources\n", "    | where name == \"{ip_name}\"\n", "    | extend d=parse_json(properties)\n", "    | project result = d.ipConfiguration['id']\n", "    \"\"\"\n", "    result = qp_RG.exec_query(nic_query)['result']\n", "    try:\n", "        nic_name = result[0].split('/')[-3]\n", "    except Exception:\n", "        # First row unusable; fall back to the second row\n", "        nic_name = result[1].split('/')[-3]\n", "    return nic_name\n", "\n", "nic_ip_pairs = []\n", "nic_ip_dict = {}\n", "\n", "for ip in ip_rg_df['name']:\n", "    nic = get_associated_nic_ip(ip)\n", "    nic_ip_pairs.append((ip, nic))\n", "    nic_ip_dict[ip] = nic\n", "\n", "nic_ip_df = pd.DataFrame(nic_ip_pairs, columns =['ip', 'nic'])\n", "\n", "storage_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.storage/storageaccounts']\n", "vnet_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/virtualnetworks']\n", "endpt_rg_df = related_rg_df.loc[related_rg_df['type'] == 'microsoft.network/privateendpoints']\n", "\n", "# Get associated Vnet for a given Endpt\n", "def get_associated_vnet(endpt_name):\n", "    vnet_query = f\"\"\"Resources\n", "    | where name == \"{endpt_name}\"\n", "    | extend d=parse_json(properties)\n", "    | project result = d.subnet['id']\n", "    \"\"\"\n", "    result = qp_RG.exec_query(vnet_query)['result']\n", "    try:\n", "        vnet_id = result[0].split('/')[-3]\n", "    except Exception:\n", "        # First row unusable; fall back to the second row\n", "        vnet_id = result[1].split('/')[-3]\n", "    return vnet_id\n", "\n", "vnet_endpt_pairs = []\n", "vnet_endpt_dict = {}\n", "\n", "for endpt in endpt_rg_df['name']:\n", "    vnet = get_associated_vnet(endpt)\n", "    vnet_endpt_pairs.append((endpt, vnet))\n", "    
vnet_endpt_dict[endpt] = vnet\n", "\n", "# Each pair is (endpoint, vnet), so the columns are labeled in that order\n", "vnet_endpt_df = pd.DataFrame(vnet_endpt_pairs, columns =['endpt', 'vnet'])\n", "\n", "print(\"Associations complete\")" ] }, { "cell_type": "code", "execution_count": null, "id": "9b42156e", "metadata": { "scrolled": false }, "outputs": [], "source": [ "# Create Networkx graph and add nodes\n", "g = nx.MultiGraph()\n", "\n", "resource_list_order = [(vm_rg_df, \"resourceGroup\", \"name\"), (vm_nic_df, \"nic\", \"name\"), (nic_nsg_df, \"nsg\", \"nic\"), (nic_ip_df, \"ip\", \"nic\"),\n", "                       (network_rg_df, \"resourceGroup\", \"managedByVal\"), (storage_rg_df, \"name\", \"resourceGroup\"), (vnet_rg_df, \"name\", \"resourceGroup\"),\n", "                       (vnet_endpt_df, \"endpt\", \"vnet\"), (related_rg_df, \"resourceGroup\", \"name\")]\n", "\n", "for r in resource_list_order:\n", "    g.add_nodes_from(nx.from_pandas_edgelist(r[0], r[1], r[2]))\n", "\n", "# Add edges between nodes based on the hierarchical associations determined in the previous cell.\n", "# Iterate over a snapshot of the nodes because add_edge may add new nodes to the graph.\n", "for node in list(g.nodes):\n", "    if node in vm_rg_df['name'].values:\n", "        g.add_edge(node, vm_nic_dict[node])\n", "    elif node in nic_nsg_df['nsg'].values:\n", "        g.add_edge(node, nic_nsg_dict[node])\n", "    elif node not in vm_nic_df['nic'].values and node in nic_nsg_df['nic'].values:\n", "        g.add_edge(node, rg)\n", "    elif node in network_rg_df['name'].values:\n", "        g.add_edge(node, network_rg_df.loc[network_rg_df['name'] == node, 'managedByVal'].item())\n", "    elif node in vnet_rg_df['name'].values:\n", "        g.add_edge(node, rg)\n", "    elif node in endpt_rg_df['name'].values:\n", "        g.add_edge(node, vnet_endpt_dict[node])\n", "    elif node in storage_rg_df['name'].values:\n", "        g.add_edge(node, rg)\n", "    elif node in nic_ip_df['ip'].values:\n", "        g.add_edge(node, nic_ip_dict[node])\n", "    elif node not in vm_nic_df['nic'].values and node not in nic_nsg_df['nic'].values:\n", "        g.add_edge(node, rg)\n", "\n", "#nx.draw(g)\n", "print(\"NetworkX done\")" ] }, { "cell_type": "code", "execution_count": 
null, "id": "8c46e53f", "metadata": {}, "outputs": [], "source": [ "# Set graph node (resource) attributes\n", "def get_resource_alert_count(resource_name):\n", "    resource_alert_sev_query = f\"\"\"\n", "    SecurityAlert\n", "    | where TimeGenerated >= datetime(\"{q_times.start}\")\n", "    | where TimeGenerated <= datetime(\"{q_times.end}\")\n", "    | where ResourceId contains \"{resource_name}\"\n", "    | summarize count()\n", "    \"\"\"\n", "    resource_alert_sev_df = qp_LA.exec_query(resource_alert_sev_query)\n", "    return resource_alert_sev_df[\"count_\"][0]\n", "\n", "def get_resource_type(resource_name):\n", "    resource_type_query = f\"\"\"\n", "    Resources\n", "    | where name == \"{resource_name}\"\n", "    | project type\n", "    \"\"\"\n", "    resource_type_df = qp_RG.exec_query(resource_type_query)\n", "    return resource_type_df[\"type\"][0]\n", "\n", "num_alert_dict = {}\n", "resource_type_dict = {}\n", "selected_resource_dict = {}\n", "selected_resource_color_dict = {}\n", "show_or_hide_dict = {}\n", "for node in g:\n", "    show_or_hide_dict[node] = \"show\"\n", "    # The alert count is offset by 20 because it is also used as the node size in the plot\n", "    num_alert_dict[node] = get_resource_alert_count(node) + 20\n", "    if node != rg:\n", "        if node == resource_dropdown.value.split(',')[0]:\n", "            selected_resource_dict[node] = 1\n", "            selected_resource_color_dict[node] = Spectral8[1]\n", "        else:\n", "            selected_resource_dict[node] = 0\n", "            selected_resource_color_dict[node] = Spectral8[3]\n", "        resource_type_dict[node] = get_resource_type(node)\n", "    else:\n", "        resource_type_dict[node] = \"ResourceGroup\"\n", "\n", "nx.set_node_attributes(g, name='num_alerts', values=num_alert_dict)\n", "nx.set_node_attributes(g, name='resource_type', values=resource_type_dict)\n", "nx.set_node_attributes(g, name='selected_resource', values=selected_resource_dict)\n", "nx.set_node_attributes(g, name='selected_resource_color', values=selected_resource_color_dict)\n", "nx.set_node_attributes(g, name='show_or_hide', values=show_or_hide_dict)\n", "\n", "print(\"Graph nodes successfully 
generated\")" ] }, { "cell_type": "markdown", "id": "54adbec6", "metadata": {}, "source": [ "### Show Graph" ] }, { "cell_type": "markdown", "id": "205ca313", "metadata": {}, "source": [ "The following cell renders the graph built by the cells above. Keep the following in mind for optimal viewing:\n", "- The size of each circle reflects how many alerts are related to the resource it represents. The resource you selected above is shown in a darker green than the rest.\n", "- Hover over a circle to see its name, type, and the number of alerts associated with it.\n", "- Use the selector tool to choose which resource types are displayed. The graph does not refresh until you also move the slider after changing the selection.\n", "- Use the slider to filter by the number of alerts. Click a position on the slider rather than dragging it; dragging redraws the graph for every value the slider passes over, which is slow." ] }, { "cell_type": "code", "execution_count": null, "id": "ec827d0e", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Create graph\n", "# Define size and color attributes\n", "size_by_this_attribute = 'num_alerts'\n", "color_by_this_attribute = 'selected_resource_color'\n", "color_palette = Blues8\n", "\n", "#Choose colors for node and edge highlighting\n", "node_highlight_color = 'white'\n", "edge_highlight_color = 'black'\n", "\n", "def create_graph(g_copy, show_graph):\n", "    #Choose a title\n", "    title = 'Azure Resource Graph'\n", "\n", "    #Hover categories\n", "    HOVER_TOOLTIPS = [(\"Resource Name\", \"@index\"),\n", "                      (\"Num Alerts\", \"@num_alerts\"),\n", "                      (\"Type\", \"@resource_type\")]\n", "\n", "    #Set dimensions, title, toolbar\n", "    plot = figure(tooltips = HOVER_TOOLTIPS,\n", "                  tools=\"pan,wheel_zoom,save,reset\", active_scroll='wheel_zoom', title=title, width=900, height=700)\n", "\n", "    plot.add_tools(HoverTool(tooltips=None), TapTool(), BoxSelectTool())\n", "
    #Create graph\n", "    network_graph = from_networkx(g_copy, nx.spring_layout, scale=20, center=(0, 0))\n", "\n", "    #Set node sizes and colors according to num alerts and type\n", "    network_graph.node_renderer.glyph = Circle(size=size_by_this_attribute, fill_color=color_by_this_attribute)\n", "\n", "    #Set highlight colors\n", "    network_graph.node_renderer.hover_glyph = Circle(size=size_by_this_attribute, fill_color=node_highlight_color, line_width=2)\n", "    network_graph.node_renderer.selection_glyph = Circle(size=size_by_this_attribute, fill_color=\"black\", line_width=2)\n", "\n", "    #Set edge opacity and width\n", "    network_graph.edge_renderer.glyph = MultiLine(line_alpha=0.5, line_width=1)\n", "\n", "    #Set edge highlight colors\n", "    network_graph.edge_renderer.selection_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)\n", "    network_graph.edge_renderer.hover_glyph = MultiLine(line_color=edge_highlight_color, line_width=2)\n", "\n", "    #Highlight nodes and edges\n", "    network_graph.selection_policy = NodesAndLinkedEdges()\n", "    network_graph.inspection_policy = NodesAndLinkedEdges()\n", "\n", "    #Add Labels\n", "    x, y = zip(*network_graph.layout_provider.graph_layout.values())\n", "    node_labels = list(g_copy.nodes())\n", "    source = ColumnDataSource({'x': x, 'y': y, 'name': [node_labels[i] for i in range(len(x))]})\n", "    labels = LabelSet(x='x', y='y', text='name', source=source, background_fill_color='white', text_font_size='10px', background_fill_alpha=.7)\n", "    plot.renderers.append(labels)\n", "\n", "    #Add network graph to the plot\n", "    plot.renderers.append(network_graph)\n", "\n", "    show(plot)\n", "\n", "output_notebook()\n", "resource_names = set(resource_type_dict.values())\n", "resource_names.remove(\"ResourceGroup\")\n", "sel_sub = nbwidgets.SelectSubset(source_items=resource_names, default_selected=[\"microsoft.compute/virtualmachines\"])\n", "\n", "def filter_graph(alert_limit):\n", "    g_copy = g.copy()\n", "    att_dict_alerts = 
nx.get_node_attributes(g_copy,'num_alerts')\n", "    att_dict_type = nx.get_node_attributes(g_copy, 'resource_type')\n", "    kept_alerts = dict(filter(lambda elem: elem[1] > alert_limit, att_dict_alerts.items()))\n", "    kept_types = dict(filter(lambda elem: elem[1] in sel_sub.selected_items, att_dict_type.items()))\n", "    alert_keep = list(kept_alerts.keys())\n", "    type_keep = list(kept_types.keys())\n", "    list_keep = [x for x in alert_keep if x in type_keep]\n", "\n", "    # Drop every node that fails either filter, but always keep the resource group node\n", "    for node in g:\n", "        if node != rg and node not in list_keep:\n", "            g_copy.remove_node(node)\n", "\n", "    #g = g_copy\n", "\n", "    create_graph(g_copy, False)\n", "\n", "    push_notebook()\n", "\n", "interact(filter_graph, alert_limit = (0, max(num_alert_dict.values())))\n" ] }, { "cell_type": "markdown", "id": "094c5579", "metadata": {}, "source": [ "## Resource Investigation" ] }, { "cell_type": "markdown", "id": "be6e039a", "metadata": {}, "source": [ "The following sections provide context around the resource you selected." ] }, { "cell_type": "markdown", "id": "d5e83936", "metadata": {}, "source": [ "### Related Alerts" ] }, { "cell_type": "markdown", "id": "9cb81705", "metadata": {}, "source": [ "The following cell shows SecurityAlert entries related to the resource you selected.\n", "\n", "This includes alerts in which the Compromised Entity is the selected resource, as well as alerts that share an attacker IP address with those alerts. Where IOC data is available, a threat intelligence (TI) lookup is run against it." 
] }, { "cell_type": "code", "execution_count": null, "id": "3d1aaf0c", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Alerts from the chosen resource\n", "\n", "related_alerts_df = alert_df[alert_df['CompromisedEntity'] == resource_dropdown.value.split(',')[0]].copy()\n", "\n", "# parse for IP address\n", "def ip_splitter(ip):\n", "    if ip is not None and \"IP Address:\" in ip:\n", "        return ip.split(\":\")[1].strip()\n", "    return ip\n", "\n", "related_alerts_df[\"AttackerSourceIP\"] = related_alerts_df[\"AttackerSourceIP\"].apply(ip_splitter)\n", "\n", "# add TI Data column; ti_results is passed in because it only exists inside print_related_alerts\n", "def getTIData(col, ti_results):\n", "    sev = []\n", "    if col in ti_results[\"Ioc\"].values:\n", "        sev.append((col, ti_results.loc[ti_results['Ioc'] == col, 'Severity'].item()))\n", "    else:\n", "        sev.append((\"n/a\", \"n/a\"))\n", "    return sev\n", "\n", "severity_values = {'information': 0, 'high': 3}\n", "def getHighestSev(col):\n", "    sev = []\n", "    for i in range(len(col)):\n", "        if 'n/a' in col[i][0]:\n", "            sev.append('n/a')\n", "        else:\n", "            sev.append(col[i][0][1])\n", "    return sev\n", "\n", "all_ips = set(related_alerts_df[\"AttackerSourceIP\"].values)\n", "\n", "def print_related_alerts(related_alerts_df):\n", "    # Skip empty values so they do not end up in the KQL filter or the TI lookup\n", "    attacker_source_ips = [ip for ip in set(related_alerts_df['AttackerSourceIP'].values) if ip]\n", "    attacker_source_ips_str = str(attacker_source_ips).replace('[', '(').replace(']', ')')\n", "    ip_alert_query = f\"\"\"\n", "    SecurityAlert\n", "    | where TimeGenerated >= datetime(\"{q_times.start}\")\n", "    | where TimeGenerated <= datetime(\"{q_times.end}\")\n", "    | where isnotempty(ResourceId)\n", "    | extend json_extendProp = parse_json(ExtendedProperties)\n", "    | extend UserName = json_extendProp['User Name'], ServiceId = json_extendProp['ServiceId'], WdatpTenantId = json_extendProp['WdatpTenantId'], FileName = json_extendProp['File Name'], resourceType = json_extendProp['resourceType'], AttackerSourceIP = json_extendProp['Attacker source IP'], 
numFailedAuthAttemptsToHost = json_extendProp['Number of failed authentication attempts to host'], numExistingAccountsUsedBySource = json_extendProp['Number of existing accounts used by source to sign in'], numNonExistentAccountsUsedBySource = json_extendProp['Number of nonexistent accounts used by source to sign in'], topAccountsWithFailedSignInAttempts = json_extendProp['Top accounts with failed sign in attempts (count)'], RDPSessionInitiated = json_extendProp['Was RDP session initiated'], attackerSourceComputerName = json_extendProp['Attacker source computer name'] \n", "    | project-away json_extendProp\n", "    | where AttackerSourceIP has_any {attacker_source_ips_str}\n", "    \"\"\"\n", "    ip_alert_df = qp_LA.exec_query(ip_alert_query)\n", "    related_alerts_df = pd.concat([ip_alert_df, related_alerts_df]).drop_duplicates().reset_index(drop=True)\n", "    related_alerts_df[\"AttackerSourceIP\"] = related_alerts_df[\"AttackerSourceIP\"].apply(ip_splitter)\n", "    ti_lookup = TILookup()\n", "    ti_results = ti_lookup.lookup_iocs(data=attacker_source_ips)\n", "    related_alerts_df[\"TIData\"] = related_alerts_df['AttackerSourceIP'].apply(lambda ip: getTIData(ip, ti_results))\n", "    related_alerts_df[\"TISeverity\"] = getHighestSev(list(related_alerts_df['TIData'].values))\n", "    display(related_alerts_df[['TimeGenerated', 'AlertName', 'AlertSeverity', 'TISeverity', 'AttackerSourceIP', 'ResourceId', 'TIData', 'ProductName', 'resourceType', 'numNonExistentAccountsUsedBySource', 'topAccountsWithFailedSignInAttempts', 'attackerSourceComputerName']])\n", "    # Return the enriched frame so later cells can use the TISeverity column\n", "    return related_alerts_df\n", "\n", "if len(all_ips) == 0 or (len(all_ips) == 1 and None in all_ips):\n", "    print(\"No data for TI search\")\n", "    display(related_alerts_df[['TimeGenerated', 'AlertName', 'AlertSeverity', 'ResourceId', 'ProductName', 'resourceType', 'numNonExistentAccountsUsedBySource', 'topAccountsWithFailedSignInAttempts', 'attackerSourceComputerName']])\n", "else:\n", "    related_alerts_df = print_related_alerts(related_alerts_df)" ] }, { "cell_type": "markdown", "id": "d3365cc6", 
"metadata": {}, "source": [ "#### Investigate further!\n", "If you would like to pivot further on a certain entity, please check out our Entity Explorer series:\n", "- [IP Addresses](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/Entity%20Explorer%20-%20IP%20Address.ipynb)\n", "- [Windows Host](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/Entity%20Explorer%20-%20Windows%20Host.ipynb)\n", "- [Linux Host](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/Entity%20Explorer%20-%20Linux%20Host.ipynb)\n", "- [Domain and URL](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/Entity%20Explorer%20-%20Domain%20and%20URL.ipynb)\n", "- [Account](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/Entity%20Explorer%20-%20Account.ipynb)" ] }, { "cell_type": "markdown", "id": "cfd4c816", "metadata": {}, "source": [ "#### Timeline of related alerts" ] }, { "cell_type": "code", "execution_count": null, "id": "267fee84", "metadata": { "scrolled": true }, "outputs": [], "source": [ "# density timeline - all on one line, or at least high on top\n", "\n", "if 'TISeverity' in related_alerts_df.columns:\n", " nbdisplay.display_timeline(related_alerts_df,\n", " time_column=\"TimeGenerated\",\n", " group_by=\"TISeverity\",\n", " source_columns=[\"AlertName\", \"Description\", \"AlertSeverity\", \"TISeverity\", \"ProviderName\"])\n", "else:\n", " nbdisplay.display_timeline(related_alerts_df,\n", " time_column=\"TimeGenerated\",\n", " group_by=\"AlertSeverity\",\n", " source_columns=[\"AlertName\", \"Description\", \"AlertSeverity\", \"ProviderName\"]) " ] }, { "cell_type": "markdown", "id": "7b5fcf66", "metadata": {}, "source": [ "### Parse ResourceGraph" ] }, { "cell_type": "markdown", "id": "7a9dddd0", "metadata": {}, "source": [ "From the dropdown below, pick a resource of interest from the resource graph then run the cell below it to view all information gathered on it." 
] }, { "cell_type": "code", "execution_count": null, "id": "e9f2dd0b", "metadata": {}, "outputs": [], "source": [ "rg = rg_df['resourceGroup'][0]\n", "\n", "related_rg_query = f\"\"\"\n", "Resources\n", "| where resourceGroup == \"{rg}\"\n", "\"\"\"\n", "\n", "related_rg_df = qp_RG.exec_query(related_rg_query)\n", "resource_id_list.extend(list(related_rg_df['id']))\n", "\n", "all_resources = [i for i in g]\n", "all_resource_dropdown = widgets.Dropdown(options=all_resources, description='Resources:')\n", "display(all_resource_dropdown)" ] }, { "cell_type": "code", "execution_count": null, "id": "e078e74c", "metadata": {}, "outputs": [], "source": [ "# Parse all info\n", "\n", "chosen_resource_query = f\"\"\"\n", "Resources\n", "| where name == \"{all_resource_dropdown.value}\"\n", "\"\"\"\n", "try:\n", "    # Fetch every property recorded for the selected resource\n", "    chosen_resource_df = qp_RG.exec_query(chosen_resource_query)\n", "    display(chosen_resource_df.transpose().style.set_properties(**{'text-align': 'left'}))\n", "except Exception:\n", "    print(\"No results. Please select another resource.\")" ] }, { "cell_type": "markdown", "id": "903c8626", "metadata": {}, "source": [ "#### Investigate further!\n", "To look further into a user's access, please check out our [Guided Analysis - User Security Metadata notebook](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/BehaviorAnalytics/UserSecurityMetadata/Guided%20Analysis%20-%20User%20Security%20Metadata.ipynb)." ] }, { "cell_type": "markdown", "id": "6e3d6a39", "metadata": {}, "source": [ "### Location and Resource Type Counts" ] }, { "cell_type": "markdown", "id": "92483220", "metadata": {}, "source": [ "The following cell summarizes the resources returned by the resource group query above, counting them by location and by resource type."
] }, { "cell_type": "code", "execution_count": null, "id": "5b759a77", "metadata": {}, "outputs": [], "source": [ "print(\"LOCATIONS:\")\n", "print(related_rg_df['location'].value_counts())\n", "\n", "print(\"\\n\\nRESOURCE TYPE COUNTS:\")\n", "print(related_rg_df['type'].value_counts())" ] }, { "cell_type": "markdown", "id": "21455184", "metadata": {}, "source": [ "### Related AzureActivityLogs Activity" ] }, { "cell_type": "markdown", "id": "9f155cca", "metadata": {}, "source": [ "In the following cell, we use a KQL query to find AzureActivity log entries related to the resource you selected. You can pivot on the results and check the caller and client IP addresses against threat intelligence (TI) sources." ] }, { "cell_type": "code", "execution_count": null, "id": "5fc20b7a", "metadata": { "scrolled": true }, "outputs": [], "source": [ "azure_activity_query = f\"\"\"\n", "AzureActivity\n", "//| where TimeGenerated >= datetime(\"{q_times.start}\")\n", "//| where TimeGenerated <= datetime(\"{q_times.end}\")\n", "| where Resource =~ \"{resource_dropdown.value.split(',')[0]}\"\n", "| extend json_prop = parse_json(Properties)\n", "| extend isComplianceCheck = json_prop['isComplianceCheck'], ancestors = json_prop['ancestors'], message = json_prop['message']\n", "| extend json_auth = parse_json(Authorization)\n", "| extend action = json_auth['action'], scope = json_auth['scope']\n", "| extend json_http = parse_json(HTTPRequest)\n", "| extend clientRequestId = json_http['clientRequestId'], clientIpAddress = json_http['clientIpAddress'], method = json_http['method']\n", "| project-away json_prop, json_auth, json_http\n", "| summarize count() by OperationName, Caller, CallerIpAddress, tostring(clientIpAddress)\n", "| sort by count_\n", "\"\"\"\n", "\n", "azure_activity_df = qp_LA.exec_query(azure_activity_query)\n", "\n", "# get TI data\n", "callIpAddressList = list(azure_activity_df['CallerIpAddress'].unique())\n", "cliIpAddressList = list(azure_activity_df['clientIpAddress'].unique())\n",
"callIpAddressList.extend(cliIpAddressList)\n", "# De-duplicate the combined list and drop empty values\n", "callIpAddressList = list(set([i for i in callIpAddressList if i]))\n", "aa_full_list = callIpAddressList\n", "\n", "# Severity ranking used to compare TI results.\n", "# Assumes TILookup reports Severity as strings; unrecognized values rank lowest.\n", "severity_values = {'unknown': -1, 'information': 0, 'warning': 1, 'high': 3}\n", "\n", "def sev_rank(sev):\n", "    return severity_values.get(sev, -1)\n", "\n", "# add TI column\n", "def getTIData(col):\n", "    # Return [(ioc, severity)], or [(\"n/a\", \"n/a\")] when the IP has no TI result\n", "    matches = aa_results.loc[aa_results['Ioc'] == col, 'Severity']\n", "    if matches.empty:\n", "        return [(\"n/a\", \"n/a\")]\n", "    # Several providers can report the same IoC, so keep the highest severity\n", "    return [(col, max(matches, key=sev_rank))]\n", "\n", "def getHighestSev(call, cli):\n", "    # Row by row, take the higher of the caller and client IP severities\n", "    sev = []\n", "    for i in range(len(call)):\n", "        found = [s for s in (call[i][0][1], cli[i][0][1]) if s != 'n/a']\n", "        sev.append(max(found, key=sev_rank) if found else 'n/a')\n", "    return sev\n", "\n", "\n", "if len(aa_full_list) == 0:\n", "    print(\"No data for TI search\")\n", "else:\n", "    ti_lookup = TILookup()\n", "    aa_results = ti_lookup.lookup_iocs(data=aa_full_list)\n", "    azure_activity_df[\"TIData_caller\"] = azure_activity_df['CallerIpAddress'].apply(getTIData)\n", "    azure_activity_df[\"TIData_client\"] = azure_activity_df['clientIpAddress'].apply(getTIData)\n", "    azure_activity_df[\"Severity\"] = getHighestSev(list(azure_activity_df['TIData_caller'].values), list(azure_activity_df['TIData_client'].values))\n", "\n", "display(azure_activity_df)" ] }, { "cell_type": "markdown", "id": "4f1ebcf4", "metadata": {}, "source": [ "#### AzureActivity Timeline" ] }, { "cell_type": "markdown", "id": "c869feb3", "metadata": {}, "source": [ "The following cell retrieves the individual AzureActivity entries for the resource you selected so they can be placed on a timeline. It also looks up the caller and client IP addresses in your connected TI sources and adds the results to the output."
] }, { "cell_type": "code", "execution_count": null, "id": "c95cc6da", "metadata": {}, "outputs": [], "source": [ "all_azure_activity_query = f\"\"\"\n", "AzureActivity\n", "//| where TimeGenerated >= datetime(\"{q_times.start}\")\n", "//| where TimeGenerated <= datetime(\"{q_times.end}\")\n", "| where Resource =~ \"{resource_dropdown.value.split(',')[0]}\"\n", "| extend json_prop = parse_json(Properties)\n", "| extend isComplianceCheck = json_prop['isComplianceCheck'], ancestors = json_prop['ancestors'], message = json_prop['message']\n", "| extend json_auth = parse_json(Authorization)\n", "| extend action = json_auth['action'], scope = json_auth['scope']\n", "| extend json_http = parse_json(HTTPRequest)\n", "| extend clientRequestId = json_http['clientRequestId'], clientIpAddress = json_http['clientIpAddress'], method = json_http['method']\n", "| project-away json_prop, json_auth, json_http\n", "\"\"\"\n", "all_azure_activity_df = qp_LA.exec_query(all_azure_activity_query)\n", "\n", "if len(aa_full_list) == 0:\n", " print(\"No data for TI search\")\n", " display(all_azure_activity_df)\n", "else:\n", " ti_lookup = TILookup()\n", " aa_results = ti_lookup.lookup_iocs(data=aa_full_list)\n", " all_azure_activity_df[\"TIData_caller\"] = all_azure_activity_df['CallerIpAddress'].apply(getTIData)\n", " all_azure_activity_df[\"TIData_client\"] = all_azure_activity_df['clientIpAddress'].apply(getTIData)\n", " all_azure_activity_df[\"TISeverity\"] = getHighestSev(list(all_azure_activity_df['TIData_caller'].values), list(all_azure_activity_df['TIData_client'].values))\n", " display(all_azure_activity_df[['TimeGenerated', 'OperationName', 'Level', 'ActivityStatus', 'TISeverity', 'TIData_caller', 'TIData_client', 'CorrelationId', 'Caller', 'clientRequestId']])" ] }, { "cell_type": "markdown", "id": "4f959c7b", "metadata": {}, "source": [ "#### Show Timeline" ] }, { "cell_type": "code", "execution_count": null, "id": "66451178", "metadata": {}, "outputs": [], "source": [ "if 
'TISeverity' in all_azure_activity_df.columns:\n", "    nbdisplay.display_timeline(all_azure_activity_df,\n", "                               time_column=\"TimeGenerated\",\n", "                               group_by=\"TISeverity\",\n", "                               source_columns=[\"OperationName\", \"Level\", \"CorrelationId\", \"Caller\", \"CallerIpAddress\"])\n", "else:\n", "    # No TI data was added, so group the same AzureActivity entries by log level instead\n", "    nbdisplay.display_timeline(all_azure_activity_df,\n", "                               time_column=\"TimeGenerated\",\n", "                               group_by=\"Level\",\n", "                               source_columns=[\"OperationName\", \"Level\", \"CorrelationId\", \"Caller\", \"CallerIpAddress\"])\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8 - AzureML", "language": "python", "name": "python38-azureml" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }