{ "cells": [ { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-25T19:53:48.349636Z", "start_time": "2019-09-25T19:53:48.344638Z" } }, "source": [ "# Entity Explorer - Domain and URL\n", "
\n", "  Details...\n", "\n", " **Notebook Version:** 2.0
\n", " **Python Version:** Python 3.8 (including Python 3.8 - AzureML)
\n", " **Required Packages**: msticpy, msticnb
\n", "\n", " **Data Sources Required**:\n", " - Log Analytics - SecurityAlerts, Bookmarks, DnsEvents, CommonSecurityLog, DeviceNetworkEvents
\n", "**TI Proviers Used**\n", " - VirusTotal, Open Page Rank, BrowShot(all required for certain elements), AlienVault OTX, IBM XForce (optional) - all providers require accounts and API keys\n", "
\n", "\n", "This Notebooks brings together a series of tools and techniques to enable threat hunting within the context of a domain name or URL that has been identified as of interest. It provides a series of techniques to assist in determining whether a domain or URL is malicious. Once this has been established it provides an overview of the scope of the domain or URL across an environment, along with indicators of areas for further investigation such as hosts of interest. " ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hunting Hypothesis: \n", "Our broad initial hunting hypothesis is that a particular URL or domain might be malicious.
\n", "This notebook is designed to help you explore the data and identify if the URL is malicious and where the URL appears within the environment." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "### Notebook initialization\n", "The next cell:\n", "- Checks for the correct Python version\n", "- Checks versions and optionally installs required packages\n", "- Imports the required packages into the notebook\n", "- Sets a number of configuration options.\n", "\n", "
\n", " More details...\n", "\n", "This should complete without errors. If you encounter errors or warnings look at the following two notebooks:\n", "- [TroubleShootingNotebooks](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/TroubleShootingNotebooks.ipynb)\n", "- [ConfiguringNotebookEnvironment](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)\n", "\n", "If you are running in the Microsoft Sentinel Notebooks environment (Azure Notebooks or Azure ML) you can run live versions of these notebooks:\n", "- [Run TroubleShootingNotebooks](./TroubleShootingNotebooks.ipynb)\n", "- [Run ConfiguringNotebookEnvironment](./ConfiguringNotebookEnvironment.ipynb)\n", "\n", "You may also need to do some additional configuration to successfully use functions such as Threat Intelligence service lookup and Geo IP lookup. \n", "There are more details about this in the `ConfiguringNotebookEnvironment` notebook and in these documents:\n", "- [msticpy configuration](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html)\n", "- [Threat intelligence provider configuration](https://msticpy.readthedocs.io/en/latest/data_acquisition/TIProviders.html#configuration-file)\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2020-05-15T23:19:52.320806Z", "start_time": "2020-05-15T23:19:48.201597Z" } }, "outputs": [], "source": [ "from IPython.display import display, HTML, Image\n", "from datetime import datetime, timedelta, timezone\n", "\n", "\n", "display(HTML(\"

Starting Notebook setup...

\"))\n", "\n", "# If not using Azure Notebooks, install msticpy and msticnb with\n", "# %pip install msticpy --upgrade\n", "# %pip install msticnb --upgrade\n", "\n", "import msticpy as mp\n", "extra_imports = [\n", " \"msticpy.nbtools, observationlist\",\n", " \"msticpy.sectools, domain_utils\",\n", " \"pyvis.network, Network\",\n", "]\n", "mp.init_notebook(\n", " extra_imports=extra_imports,\n", " additional_packages=[\"msticnb>=1.0\"],\n", ");" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "# papermill default parameters\n", "ws_name = \"Default\"\n", "url = \"\"\n", "end = datetime.now(timezone.utc)\n", "start = end - timedelta(days=2)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-25T20:20:04.563899Z", "start_time": "2019-09-25T20:20:04.507874Z" } }, "source": [ "### Authentication\n", "
\n", " Details...\n", "If you are using user/device authentication, run the following cell. \n", "- Click the 'Copy code to clipboard and authenticate' button.\n", "- This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard. \n", "- Select the text box and paste (Ctrl-V/Cmd-V) the copied value. \n", "- You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.\n", "\n", "Use the following syntax if you are authenticating using an Azure Active Directory AppId and Secret:\n", "```\n", "%kql loganalytics://tenant(aad_tenant).workspace(WORKSPACE_ID).clientid(client_id).clientsecret(client_secret)\n", "```\n", "instead of\n", "```\n", "%kql loganalytics://code().workspace(WORKSPACE_ID)\n", "```\n", "\n", "Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.
\n", "On successful authentication you should see a ```popup schema``` button.\n", "To find your Workspace Id go to [Log Analytics](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces). Look at the workspace properties to find the ID.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\n", " \"Configured workspaces: \",\n", " \", \".join(msticpy.settings.get_config(\"AzureSentinel.Workspaces\").keys()),\n", ")\n", "import ipywidgets as widgets\n", "\n", "ws_param = widgets.Combobox(\n", " description=\"Workspace Name\",\n", " value=ws_name,\n", " options=list(msticpy.settings.get_config(\"AzureSentinel.Workspaces\").keys()),\n", ")\n", "ws_param" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from msticpy.common.timespan import TimeSpan\n", "\n", "# Authentication\n", "qry_prov = QueryProvider(data_environment=\"MSSentinel\")\n", "qry_prov.connect(WorkspaceConfig(workspace=ws_param.value))\n", "\n", "nb_timespan = TimeSpan(start, end)\n", "qry_prov.query_time.timespan = nb_timespan\n", "md(\"
\")\n", "md(\"Confirm time range to search\", \"bold\")\n", "qry_prov.query_time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Authentication and Configuration Problems\n", "\n", "
\n", "
\n", " Click for details about configuring your authentication parameters\n", " \n", "The notebook is expecting your Microsoft Sentinel Tenant ID and Workspace ID to be configured in one of the following places:\n", "- `config.json` in the current folder\n", "- `msticpyconfig.yaml` in the current folder or location specified by `MSTICPYCONFIG` environment variable.\n", " \n", "For help with setting up your `config.json` file (if this hasn't been done automatically) see the [`ConfiguringNotebookEnvironment`](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb) notebook in the root folder of your Azure-Sentinel-Notebooks project. This shows you how to obtain your Workspace and Subscription IDs from the Microsoft Sentinel Portal. You can use the SubscriptionID to find your Tenant ID). To view the current `config.json` run the following in a code cell.\n", "\n", "```%pfile config.json```\n", "\n", "For help with setting up your `msticpyconfig.yaml` see the [Setup](#Setup) section at the end of this notebook and the [ConfigureNotebookEnvironment notebook](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import and initialize notebooklets\n", "\n", "This imports the **msticnb** package and the notebooklets classes.\n", "\n", "These are needed for the notebook's operation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import msticnb as nb\n", "\n", "nb.init(query_provider=qry_prov)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select the domain or URL to investigate\n", "Enter the domain or URL you wish to investigate. e.g. www.microsoft.com/index.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2020-05-15T23:20:48.657499Z", "start_time": "2020-05-15T23:20:48.634498Z" } }, "outputs": [], "source": [ "url_txt = nbwidgets.GetText(\n", " prompt=\"Enter the URL to investigate:\", value=url\n", ")\n", "display(url_txt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# URL Overview\n", "\n", "The following cell runs the URL Summary notebooklet the collects relevant information about the URL, its domain, and associated IPs.
\n", "Use the output to understand the context of the URL and where it appears in the environment. From here you can identify further areas and scopes to hunt on." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url_nb = nb.nblts.azsent.url.URLSummary()\n", "url_result = url_nb.run(value=url_txt.value, timespan=qry_prov.query_time.timespan, options=[\"+screenshot\"])" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-27T00:28:22.232839Z", "start_time": "2019-09-27T00:28:22.229839Z" } }, "source": [ "# The URL in the Environment\n", "Once we have determined the nature of the domain or URL under investigation we want to see what the scope of impact is in our environment but identifying any presence of the domain or URL in our datasets.
\n", "If the domain has a high page rank score it is likely that it will be highly prevalent in a large environment, therefore you may wish to consider whether to run these cells for such a domain due to the data volumes involved." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2020-05-15T23:23:46.219660Z", "start_time": "2020-05-15T23:23:46.205659Z" } }, "outputs": [], "source": [ "if not url_result.dns_results.empty:\n", " url_result.dns_results.mp_plot.timeline(title=f\"DNS lookup events for {url_txt.value}\")\n", "else:\n", " md(f\"No dns events found relating to {url_txt.value}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Related Alerts\n", "Understanding where a URL has appeared in alerts can help provide context on when a URL was first seen in an environment and its link to malicious activity." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2020-05-15T23:24:32.411730Z", "start_time": "2020-05-15T23:24:30.075322Z" } }, "outputs": [], "source": [ "url_result.display_alert_timeline()\n", "url_result.browse_alerts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Related Bookmarks\n", "As with alerts, understanding where a URL has appeared in bookmarks can help provide context on when a URL has previously been investigated within an environment." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if not url_result.bookmarks.empty:\n", " url_result.bookmarks.mp_plot.timeline()\n", " display(url_result.bookmarks)\n", "else:\n", " md(f\"No bookmarks found relating to {url_txt.value}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Hosts Observed Communicating with the URL\n", "During the cells executed above we have identified hosts communicating with the URL in question. This cell provides a summary of these hosts.
\n", "These hosts are potential candidates for further investigation using Microsoft Sentinel or via the host entity explorer Notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2020-05-15T23:25:44.248494Z", "start_time": "2020-05-15T23:25:44.234496Z" } }, "outputs": [], "source": [ "if url_result.hosts:\n", " md(f\"{len(url_result.hosts)} hosts observed connecting with {url_txt.value}\")\n", " for host in url_result.hosts:\n", " md(host)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Additional environment data\n", "\n", "To dig further into data regarding the URL in the environment there are a number of data sources returned by the URL Summary notebooklet.
\n", "There are:
\n", "- `dns_results` : shows DNS lookup events for the domain.\n", "- `flows` : network flow logs for connections to the URL. \n", "\n", "They can be accessed by using `url_result.[data_source]` in the cells below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "## Use other notebooklets and pivots functions to drill down on other entities\n", "\n", "You may want to drill down on other entities in the Host data.\n", "You can use methods of the IpAddress or Host entities, for example,\n", "to look at these in more detail.\n", "\n", "Run the ip_address_summary notebooklet pivot\n", "```python\n", "IpAddress = entities.IpAddress\n", "ip_result = IpAddress.nblt.ip_address_summary(\"157.56.162.53\")\n", "```\n", "\n", "View the TI results\n", "```python\n", "ip_result.browse_ti_results()\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "# More information:\n", "\n", "## Notebooklets and Pivots\n", "[Notebooklets](https://msticnb.readthedocs.io/en/latest/)\n", "\n", "[Pivot functions](https://msticpy.readthedocs.io/en/latest/data_analysis/PivotFunctions.html)\n", "\n", "## Notebook/MSTICPy configuration\n", "[Getting Started](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/A%20Getting%20Started%20Guide%20For%20Azure%20Sentinel%20ML%20Notebooks.ipynb)
\n", "[MSTICPy Configuration guide](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html)\n", "\n", "[ConfigureNotebookEnvironment notebook](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)" ] } ], "metadata": { "celltoolbar": "Tags", "hide_input": false, "kernelspec": { "display_name": "Python 3.8 - AzureML", "language": "python", "name": "python38-azureml" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "352.33px" }, "toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false }, "vscode": { "interpreter": { "hash": "245974e5c299ad9c888cb47ad7d7f097035e3d545bc92549ebde7e73bd6f3a70" } }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 4 }