{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "Stolen Szechuan Sauce - Analysis.ipynb", "private_outputs": true, "provenance": [], "collapsed_sections": [], "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "id": "STgOOx_i9NKV" }, "source": [ "# The Case of The Stolen Szechuan Sauce\n", "\n", "This is a simple colab demonstrating one way of analyzing data from the Stolen Szechuan Sauce challenge (found [here](https://dfirmadness.com/the-stolen-szechuan-sauce/)).\n", "\n", "This colab will not go into any of the data upload. It assumes that all data is already collected and uploaded to Timesketch. To see one way of uploading the data to Timesketch, use [this colab](https://colab.research.google.com/github/google/timesketch/blob/master/notebooks/Stolen_Szechuan_Sauce_Data_Upload.ipynb)\n", "\n", "For a more generic instructions of Colab can be [found here](https://colab.research.google.com/github/google/timesketch/blob/master/notebooks/colab-timesketch-demo.ipynb)" ] }, { "cell_type": "markdown", "metadata": { "id": "uy3o_dS2T6hg" }, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": { "id": "VlUAyi73BJUI" }, "source": [ "If you are running this on a cloud runtime you'll need to install these dependencies:" ] }, { "cell_type": "code", "metadata": { "id": "YywEaQSTBOjH" }, "source": [ "# @markdown Only execute if not already installed and running a cloud runtime\n", "!pip install -q timesketch_api_client\n", "!pip install -q vt-py nest_asyncio pandas\n", "!pip install -q picatrix" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "fi9-7n--lXOV", "cellView": "form" }, "source": [ "# @title Import libraries\n", "# @markdown This cell will import all the libraries needed for the 
running of this colab.\n", "\n", "import re\n", "import requests\n", "\n", "import pandas as pd\n", "\n", "from timesketch_api_client import config\n", "from picatrix import notebook_init\n", "\n", "import vt\n", "import nest_asyncio # https://github.com/VirusTotal/vt-py/issues/21\n", "\n", "nest_asyncio.apply()\n", "notebook_init.init()" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "MQg0I0Cl6ecu", "cellView": "form" }, "source": [ "# @title VirusTotal Configuration\n", "# @markdown In order to be able to look up domains/IPs/samples using VirusTotal we need to get an API key.\n", "# @markdown\n", "# @markdown If you don't have an API key you must sign up to [VirusTotal Community](https://www.virustotal.com/gui/join-us).\n", "# @markdown Once you have a valid VirusTotal Community account you will find your personal API key in your settings section.\n", "\n", "VT_API_KEY = '' # @param {type: \"string\"}\n", "\n", "# @markdown If you don't have the API key you will not be able to use the VirusTotal API\n", "# @markdown to look up information." ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "REUKmoy_G1p_", "cellView": "form" }, "source": [ "# @title Declare functions\n", "\n", "# @markdown This cell will define a few functions that we will use throughout\n", "# @markdown this colab. 
These functions would be better defined outside of the notebook\n", "# @markdown in an importable library, but we keep them here for now.\n", "\n", "def print_dict(my_dict, space_before=0):\n", "  \"\"\"Print the content of a dictionary.\"\"\"\n", "  max_len = max([len(x) for x in my_dict.keys()])\n", "  spaces = ' '*space_before\n", "  format_str = f'{spaces}{{key:{max_len}s}} = {{value}}'\n", "  for key, value in my_dict.items():\n", "    if isinstance(value, dict):\n", "      print(format_str.format(key=key, value=''))\n", "      print_dict(value, space_before=space_before + 8)\n", "    elif isinstance(value, list):\n", "      value_str = ', '.join(value)\n", "      print(format_str.format(key=key, value=value_str))\n", "    else:\n", "      print(format_str.format(key=key, value=value))\n", "\n", "\n", "def ip_info(address):\n", "  \"\"\"Print out information about an IP address using the VT API.\"\"\"\n", "  url = 'https://www.virustotal.com/vtapi/v2/ip-address/report'\n", "  params = {\n", "      'apikey': VT_API_KEY,\n", "      'ip': address}\n", "\n", "  response = requests.get(url, params=params)\n", "  j_obj = response.json()\n", "\n", "  def _print_stuff(part):\n", "    print('')\n", "    header = part.replace('_', ' ').capitalize()\n", "    print(f'{header}:')\n", "    for item in j_obj.get(part, []):\n", "      print_dict(item, 2)\n", "\n", "  _print_stuff('resolutions')\n", "  _print_stuff('detected_urls')\n", "  _print_stuff('detected_referrer_samples')\n", "  _print_stuff('detected_communicating_samples')\n", "  _print_stuff('detected_downloaded_samples')" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "FiPTLA-qlkWQ", "cellView": "form" }, "source": [ "# @markdown Get a copy of the Timesketch client object.\n", "# @markdown Parameters to configure the client:\n", "# @markdown + host_uri: https://demo.timesketch.org\n", "# @markdown + username: demo\n", "# @markdown + auth_mode: timesketch (username/password)\n", "# @markdown + password: demo\n", "\n", "ts_client = 
config.get_client(confirm_choices=True)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "hkQ98MnC-x3p" }, "source": [ "Now that we've got a copy of the TS client we need to get to the sketch." ] }, { "cell_type": "code", "metadata": { "id": "RN5fKCshls9L" }, "source": [ "for sketch in ts_client.list_sketches():\n", "  if not sketch.name.startswith('Szechuan'):\n", "    continue\n", "\n", "  print('We found the sketch to use')\n", "  print(f'[{sketch.id}] {sketch.name} - {sketch.description}')\n", "  break" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Bpz29qgAGLFt" }, "source": [ "OK, sketch number 6 is the one that we are after; let's set that as the active sketch. The Timesketch picatrix magics expect the active sketch to be set first; after that, the magics don't need an explicit sketch definition." ] }, { "cell_type": "code", "metadata": { "id": "pO76nz3TGZAH" }, "source": [ "%timesketch_set_active_sketch 6" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "R6kRDR2fToLC" }, "source": [ "To learn more about picatrix and how it works, use the magic `%picatrixmagics` to see what magics are available, and then use `%magic --help` or `magic_func?` to see more information about a particular magic.\n", "\n", "One such example could be:" ] }, { "cell_type": "code", "metadata": { "id": "RJheDxzhTxS2" }, "source": [ "timesketch_list_saved_searches_func?" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "2Ir9TevSLGBq" }, "source": [ "## Pre-Thoughts" ] }, { "cell_type": "markdown", "metadata": { "id": "xgwQlv1FLHzx" }, "source": [ "Timesketch analyzers can provide quite a lot of value to any analysis. They can do pretty much everything that can be achieved in a colab like this, or in the Timesketch UI, just programmatically. 
In this case, one of the very valuable analyzers is the `logon` analyzer. That analyzer will look for evidence of logons, then extract values out of the logon entries and add them to the dataset.\n", "\n", "Another potentially valuable analyzer is the browser search analyzer. To get a history of which analyzers have been run you can visit [this page](https://demo.timesketch.org/sketch/6/manage/timelines) or run the following code snippet:\n" ] }, { "cell_type": "code", "metadata": { "id": "kIjGq8dmMYJv" }, "source": [ "for status in sketch.get_analyzer_status():\n", "  print(f'Analyzer: {status[\"analyzer\"]} - status: {status[\"status\"]}')\n", "  print(f'Results: {status[\"results\"]}')\n", "  print('')" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "lLlaWPgi_o1r" }, "source": [ "From there you can get a glance at what analysis has been done on the dataset and what the results were, for instance that `login` was completed and it found several logon and logoff entries.\n", "\n", "Now we can start answering the questions.\n", "\n", "## Questions" ] }, { "cell_type": "markdown", "metadata": { "id": "kNyG9uur_1xK" }, "source": [ "### What’s the Operating System of the Server?\n", "\n", "Let's start exploring. OS information is stored in the registry. 
Let's query it:" ] }, { "cell_type": "code", "metadata": { "id": "_gDu_58uk3Go" }, "source": [ "search_query = timesketch_query_func(\n", "    'parser:\"winreg/windows_version\"',\n", "    fields='datetime,key_path,data_type,message,timestamp_desc,parser,display_name,product_name,hostname'\n", ")\n", "cur_df = search_query.table" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "hfAIKnt-EKak" }, "source": [ "cur_df[['hostname', 'product_name']]" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "mQ4RxzwnDrbu" }, "source": [ "So now we have all the data; we can read the answer from the table or do one more filtering step to get it:" ] }, { "cell_type": "code", "metadata": { "id": "x01scNWNG9Zn" }, "source": [ "cur_df[cur_df.hostname == 'CITADEL-DC01'].product_name.value_counts()" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "2ksIYrrI_8Be" }, "source": [ "### What’s the Operating System of the Desktop?\n", "\n", "We can use the same data we collected before:" ] }, { "cell_type": "code", "metadata": { "id": "jDRnhdlUHG4S" }, "source": [ "cur_df[cur_df.hostname == 'DESKTOP-SDN1RPT'].product_name.value_counts()" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "wrpH_DttAejS" }, "source": [ "### What was the local time of the Server?\n", "\n", "To answer that we first need to get the current control set:" ] }, { "cell_type": "code", "metadata": { "id": "XRo0pe4eJyMi" }, "source": [ "cur_df = timesketch_query_func(\n", "    'HKEY_LOCAL_MACHINE*System*Select AND hostname:\"CITADEL-DC01\"',\n", "    fields=(\n", "        'datetime,key_path,data_type,message,timestamp_desc,parser,display_name,'\n", "        'product_name,hostname,values')\n", ").table" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "oPECs_jt_T_H" }, "source": [ "Now let's look at what the 
value of this key is." ] }, { "cell_type": "code", "metadata": { "id": "1TwaMYZBJ2ce" }, "source": [ "for key, value in cur_df[['key_path', 'values']].values:\n", "  print(f'Key: {key}')\n", "  print(f'Value: {value}')" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "1taojYh2_kDX" }, "source": [ "We can parse this out a bit more if we want to, or simply read off that the current value is 1:" ] }, { "cell_type": "code", "metadata": { "id": "MaRYpTIN_on9" }, "source": [ "cur_df['current_value'] = cur_df['values'].str.extract(r'Current: \\[[A-Z_]+\\] (\\d) ')\n", "\n", "cur_df[['key_path', 'current_value']]" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "VSAo1MWHJ-vP" }, "source": [ "The current control set is set to 1." ] }, { "cell_type": "code", "metadata": { "id": "duOb8qURKJpn" }, "source": [ "cur_df = timesketch_query_func(\n", "    'TimeZoneInformation AND hostname:\"CITADEL-DC01\"',\n", "    fields='datetime,key_path,data_type,message,timestamp_desc,parser,display_name,product_name,hostname,configuration'\n", ").table\n", "cur_df" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "VTkHkocEB9Iy" }, "source": [ "Let's increase the column width in pandas; that will make it easier to read columns with longer text in them." ] }, { "cell_type": "code", "metadata": { "id": "LQIF_bVFLYLt" }, "source": [ "pd.set_option('display.max_colwidth', 400)" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "3ZeK5sfCKcwi" }, "source": [ "cur_df[cur_df.key_path.str.contains('ControlSet001')][['configuration']]" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "IJMrF5txAjHm" }, "source": [ "So we need to extract what is in `TimeZoneKeyName`; there are several ways to do this. 
For now we can read the configuration field, split it into a dict, and construct a new DataFrame from these fields; that is, take a line of the form `key1: value1 key2: value2 ...` and create a data frame with `key1, key2, ...` as the column names." ] }, { "cell_type": "code", "metadata": { "id": "eBSQDEI33wdH" }, "source": [ "lines = []\n", "\n", "for value in cur_df[cur_df.key_path.str.contains('ControlSet001')]['configuration'].values:\n", "  items = value.split(':')\n", "  line_dict = {}\n", "  key = items[0]\n", "  for item in items[1:-1]:\n", "    *values, new_key = item.split()\n", "\n", "    line_dict[key] = ' '.join(values)\n", "    key = new_key\n", "\n", "  line_dict[key] = items[-1].strip()\n", "  lines.append(line_dict)\n", "\n", "time_df = pd.DataFrame(lines)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "LG_9D9mzCBrX" }, "source": [ "Let's look at the newly constructed data frame:" ] }, { "cell_type": "code", "metadata": { "id": "cgtctQ_hCEfi" }, "source": [ "time_df" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "9f4lPffKEByw" }, "source": [ "Now we've got the time zone of the server, which is `Pacific Standard Time`." ] }, { "cell_type": "markdown", "metadata": { "id": "oNZXvnc5Ajs1" }, "source": [ "### What was the initial entry vector (how did they get in)?\n", "\n", "If we assume they got in from the outside, doing some statistics on the network data might be useful. 
For that we need to do some aggregations.\n", "\n", "First, to understand what aggregators are available and how to use them, let's list the available aggregators; this produces a data frame with the names of the aggregators and the parameters they need for configuration.\n" ] }, { "cell_type": "code", "metadata": { "id": "y76Xnu7AEmO8" }, "source": [ "%timesketch_available_aggregators" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "QHok29TaExGy" }, "source": [ "Now that we know what aggregators are available, let's start with aggregating the field `Source`, and get the top 10.\n", "\n", "For that we need to use the `field_bucket` aggregator, and configure it using the parameters `field`, `limit` and `supported_charts`.\n", "\n", "The charts that are available are:\n", " + barchart\n", " + hbarchart\n", " + table\n", " + circlechart\n", " + linechart\n", "\n", "For this let's use a horizontal bar chart, `hbarchart`:" ] }, { "cell_type": "code", "metadata": { "id": "gLMtyrWtLqLj" }, "source": [ "params = {\n", "    'field': 'Source',\n", "    'limit': 10,\n", "    'supported_charts': 'hbarchart',\n", "    'chart_title': 'Top 10 Source IP',\n", "}\n", "\n", "aggregation = timesketch_run_aggregator_func(\n", "    'field_bucket', parameters=params\n", ")\n", "aggregation.chart" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "X0oGT9mk0B5D" }, "source": [ "If you are viewing this in Colab but connecting to a local runtime you may need to enable this in order to be able to view the charts:\n", "\n", "(if it doesn't work, uncomment the code that is applicable to you and then re-run the aggregation cell)" ] }, { "cell_type": "code", "metadata": { "id": "0VqBw5ovz4p5" }, "source": [ "# Note: these calls assume altair has been imported, e.g. import altair as alt.\n", "\n", "# Remove the comment and run this code if you are running in colab\n", "# but have a local Jupyter kernel running:\n", "# alt.renderers.enable('colab')\n", "\n", "# Remove this comment if you are 
running in Jupyter and the chart is not displayed\n", "# alt.renderers.enable('notebook')" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "GHNhwNsGFSUH" }, "source": [ "If you prefer to get the data frame instead of the chart you can call `aggregation.table`:" ] }, { "cell_type": "code", "metadata": { "id": "ZFnInNAnFZYB" }, "source": [ "aggregation.table" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "0EsPZ5wyFbxY" }, "source": [ "Now let's look at the `Destination` field, same as before:" ] }, { "cell_type": "code", "metadata": { "id": "WNauKP1OL1Ps" }, "source": [ "params = {\n", "    'field': 'Destination',\n", "    'limit': 10,\n", "    'supported_charts': 'hbarchart',\n", "    'chart_title': 'Top 10 Destination IP',\n", "}\n", "\n", "aggregation = timesketch_run_aggregator_func('field_bucket', parameters=params)\n", "aggregation.chart" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "Dm5dT3ZyL3rU" }, "source": [ "We can clearly see that ```194.61.24.102``` sticks out, so let's try to understand what this IP did. Also note that it is not common for a system on the internet to try to connect to an intranet IP." ] }, { "cell_type": "markdown", "metadata": { "id": "w-QQPvb8MIgr" }, "source": [ "#### A Look at IP 194.61.24.102" ] }, { "cell_type": "code", "metadata": { "id": "Os8SYJiMM96R" }, "source": [ "attacker_dst = timesketch_query_func(\n", "    'Source:\"194.61.24.102\" AND data_type:\"pcap:wireshark:entry\"',\n", "    fields='datetime,message,timestamp_desc,Destination,DST port,Source,Protocol,src port').table\n", "attacker_dst.head(10)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "NQarJWFW_K4h" }, "source": [ "OK, we can see that the API says we got 40k records returned but the search actually produced 128,328 records, so let's increase our max entries..." 
] }, { "cell_type": "code", "metadata": { "id": "CwIEey96_TUA" }, "source": [ "search_obj = timesketch_query_func(\n", " 'Source:\"194.61.24.102\" AND data_type:\"pcap:wireshark:entry\"',\n", " fields='datetime,message,timestamp_desc,Destination,DST port,Source,Protocol,src port')\n", "\n", "search_obj.max_entries = 150000\n", "attacker_dst = search_obj.table\n", "attacker_dst.head(10)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "yHK4xt7mGAz1" }, "source": [ "We got a fairly large table, let's look at the size:" ] }, { "cell_type": "code", "metadata": { "id": "2_MxJxRqGE6L" }, "source": [ "attacker_dst.shape" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "XFmjxrW2GHyf" }, "source": [ "We will now need to do some aggregation on the data that we got, let's use pandas for that. For that there is a function called `groupby` where we can run aggregations.\n", "\n", "We want to group based on `DST port` and `Destination`, so we only need those two columns + one more to store the count/sum." ] }, { "cell_type": "code", "metadata": { "id": "D0uY7rFeGW4u" }, "source": [ "attacker_group = attacker_dst[['DST port','Destination', 'Protocol']].groupby(\n", " ['DST port','Destination'], as_index=False)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "RWDgHXLdG37E" }, "source": [ "Now we got a group, and to get a count, we can use the `count()` function of the group." 
] }, { "cell_type": "code", "metadata": { "id": "GZ2OEKNmG7NY" }, "source": [ "attacker_dst_mytable = attacker_group.count()\n", "attacker_dst_mytable.rename(columns={'Protocol': 'Count'}, inplace=True)\n", "attacker_dst_mytable.sort_values(by=['Count'], ascending=False)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "0TWTkbGONNkj" }, "source": [ "So we can already point out that there is a lot of traffic from this ip to ```10.42.85.10``` on port ```3389```which is used for Remote Desktop Protocol (RDP)\n", "\n", "Let's now look at the IP traffic as it was parsed by scapy" ] }, { "cell_type": "code", "metadata": { "id": "pmqx3AFHNlCc" }, "source": [ "attacker_dst = timesketch_query_func(\n", " '194.61.24.102 AND data_type:\"scapy:pcap:entry\"',\n", " fields='datetime,message,timestamp_desc,ip_flags,ip_dst,ip_src,payload,tcp_flags,tcp_seq,tcp_sport,tcp_dport,tcp_window').table" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "QCbCRWmxH7ML" }, "source": [ "Let's look at a few entries here:" ] }, { "cell_type": "code", "metadata": { "id": "esIgaBjeOwCN" }, "source": [ "attacker_dst.head(10)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "-VfhmF4eIW2Q" }, "source": [ "What we can see here is that quite a bit of the information is in the message field that we need to decode.\n", "\n", "We also see that the `evil` bit is set... we could query for that as well. Let's start there, to do an aggregation based on that." 
] }, { "cell_type": "code", "metadata": { "id": "fYpGly-JIe26" }, "source": [ "params = {\n", " 'field': 'ip_src',\n", " 'query_string': 'ip_flags:\"evil\"',\n", " 'supported_charts': 'hbarchart',\n", " 'chart_title': 'Source IPs with \"evil\" bit set',\n", "}\n", "\n", "aggregation = timesketch_run_aggregator_func('query_bucket', parameters=params)\n", "aggregation.table" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "7m7YfTJ5JZbw" }, "source": [ "We could even save this (if you have write access to the sketch, which the demo user does not have)" ] }, { "cell_type": "code", "metadata": { "id": "GVfPCwpcJdjV" }, "source": [ "name = 'Source IPs with \"evil\" bit set'\n", "aggregation.name = name\n", "aggregation.title = name\n", "aggregation.save()" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "u4zTQKnpJlm1" }, "source": [ "And now we could use this in a story for instance.\n", "\n", "But let's move on and parse the message field:\n", "\n", "First let's look at a single entry. To see how it is constructed:" ] }, { "cell_type": "code", "metadata": { "id": "-l2C19JEJsSp" }, "source": [ "attacker_dst.iloc[0].message" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "ZYX6y6yPPef3" }, "source": [ "Now that we know that, let's first remove the `