# Guided Triage - Incidents
<details>
    <summary> <u>Details...</u></summary>
Notebook Version: 1.0<br>

**Data Sources Used**:<br>
- Microsoft Sentinel
    - Incidents
<br>
- Threat Intelligence Providers
    - OTX (https://otx.alienvault.com/)
    - VirusTotal (https://www.virustotal.com/)
    - XForce (https://www.ibm.com/security/xforce)
    - GreyNoise (https://www.greynoise.io)
</details>

This notebook takes you through a guided triage of a Microsoft Sentinel incident. The triage focuses on investigating the entities attached to the incident. You can extend the notebook with additional triage steps based on your own processes and workflows.

---
### Notebook initialization
The next cell:
- Checks for the correct Python version
- Checks versions and optionally installs required packages
- Imports the required packages into the notebook
- Sets a number of configuration options.

The cell should complete without errors. If you encounter errors or warnings, consult the following two notebooks:
- [TroubleShootingNotebooks](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/TroubleShootingNotebooks.ipynb)
- [ConfiguringNotebookEnvironment](https://github.com/Azure/Azure-Sentinel-Notebooks/blob/master/ConfiguringNotebookEnvironment.ipynb)

If you are running in the Microsoft Sentinel Notebooks environment (Azure Notebooks or Azure ML) you can run live versions of these notebooks:
- [Run TroubleShootingNotebooks](./TroubleShootingNotebooks.ipynb)
- [Run ConfiguringNotebookEnvironment](./ConfiguringNotebookEnvironment.ipynb)

You may also need to do some additional configuration to successfully use functions such as Threat Intelligence service lookup and Geo IP lookup. 
There are more details about this in the `ConfiguringNotebookEnvironment` notebook and in these documents:
- [msticpy configuration](https://msticpy.readthedocs.io/en/latest/getting_started/msticpyconfig.html)
- [Threat intelligence provider configuration](https://msticpy.readthedocs.io/en/latest/data_acquisition/TIProviders.html#configuration-file)


In [None]:
# Install MSTICPy and MSTICNb
%pip install msticpy --upgrade
%pip install msticnb --upgrade

extra_imports = [
    "json",
    "collections,Counter",
    "datetime,datetime",
    "datetime,timedelta",
    "matplotlib.pyplot,,plt",
    "numpy,,np",
    "pandas,,pd",
    "bokeh.plotting,show",
    "whois,whois",
    "pytz",
    "msticnb,,nb",
    "msticpy.common.azure_auth,az_connect",
    "msticpy.common.timespan,TimeSpan",
    "msticpy.nbtools.nbwidgets,SelectAlert",
    "msticpy.nbtools.nbwidgets,Progress",
    "msticpy.data.azure_sentinel,AzureSentinel",
    "msticpy.nbtools.foliummap,FoliumMap",
    "msticpy.vis.entity_graph_tools, EntityGraph",
    "msticpy.data.azure_data, AzureData",
    "msticpy.data.azure, MicrosoftSentinel"
]

from msticpy.common.exceptions import MsticpyDataQueryError
import msticpy

msticpy.init_notebook(
    namespace=globals(),
    extra_imports=extra_imports,
    additional_packages=["python-whois>=0.7.3"],
)

<div class="alert alert-block alert-info">
<b>Note:</b> The following cell creates some helper functions used later in the notebook. This cell has no output.
</div>

In [None]:
def check_ent(items, entity):
    """Check if entity is present"""
    for item in items:
        if item[0].casefold() == entity.casefold():
            return True
    return False


def ti_color_cells(val):
    """Color cells of output dataframe based on severity"""
    color = "none"
    if isinstance(val, str):
        if val.casefold() == "high":
            color = "Red"
        elif val.casefold() == "warning" or val.casefold() == "medium":
            color = "Orange"
        elif val.casefold() == "information" or val.casefold() == "low":
            color = "Green"
    return f"background-color: {color}"


def ent_color_cells(val):
    """Color table cells based on values in the cells"""
    if isinstance(val, int):
        color = "yellow" if val < 3 else "none"
    elif isinstance(val, float):
        color = "yellow" if val > 4.30891 or val < 2.72120 else "none"
    else:
        color = "none"
    return "background-color: %s" % color

def ent_alerts(ent_val):
    """Display a timeline of alerts whose Entities column contains ent_val"""
    query = f" SecurityAlert | where TimeGenerated between(datetime({alert_q_times.start})..datetime({alert_q_times.end})) | where Entities contains '{ent_val}'"
    alerts_df = qry_prov.exec_query(query)
    if isinstance(alerts_df, pd.DataFrame) and not alerts_df.empty:
        nbdisplay.display_timeline(
            data=alerts_df,
            source_columns=["DisplayName", "AlertSeverity", "ProviderName"],
            title=f"Alerts involving {ent_val}",
            group_by="AlertSeverity",
            height=300,
            time_column="TimeGenerated",
        )


### Authenticate to Microsoft Sentinel APIs and Select Subscriptions

This cell connects to the Microsoft Sentinel APIs and lists the subscriptions you have access to. You need at least read permissions on the Microsoft Sentinel workspace to use it.
In the drop-down, select the name of the subscription that contains the Microsoft Sentinel workspace you want to triage incidents from.

In [None]:
az_data = AzureData()
az_data.connect()
subs = az_data.get_subscriptions()
active_subs = subs[subs["State"] == "Enabled"]
ws = WorkspaceConfig()

if ws.list_workspaces()["Default"] and "SubscriptionId" in ws.list_workspaces()["Default"]:
    default_sub = active_subs[active_subs['Subscription ID'] == ws.list_workspaces()['Default']["SubscriptionId"]]["Display Name"].iloc[0]
else:
    default_sub = ""

sub_picker = nbwidgets.SelectItem(
    description="Select subscription:",
    item_list=active_subs["Display Name"].unique().tolist(),
    value=default_sub
)
display(sub_picker)

Now select the name of the Microsoft Sentinel workspace in the subscription you want to triage incidents from.

In [None]:
ws_id = None
sub_value = active_subs[active_subs["Display Name"] == sub_picker.value][
    "Subscription ID"
].iloc[0]

azure_resources = az_data.get_resources(sub_id=sub_value)
wspcs = azure_resources[
                (azure_resources["resource_type"] == "Microsoft.OperationsManagement/solutions")
                & (azure_resources["name"].str.startswith("SecurityInsights"))
        ]       

if ws.list_workspaces():
    default_wss = list(set(ws.list_workspaces()).intersection(wspcs))
    default_ws = default_wss[0] if default_wss else ""
else:
    default_ws = ""

if isinstance(wspcs, pd.DataFrame) and len(wspcs.index) > 1:
    wspc_picker = nbwidgets.SelectItem(
        description="Select your Microsoft Sentinel workspace:",
        item_list=list(wspcs['name']),
        value = default_ws,
    )
    display(wspc_picker)
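elif isinstance(wspcs, pd.DataFrame) and not wspcs.empty:
    # A DataFrame has no unambiguous truth value, so test for emptiness explicitly
    ws_id = wspcs["resource_id"].iloc[0]
    md(f"Only one workspace found, selecting {wspcs['name'].iloc[0]}")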
else:
    md(
        f"{sub_picker.value} has no Microsoft Sentinel Workspaces, please pick another subscription"
    )


### Authenticate to Microsoft Sentinel, TI providers and load Notebooklets
<details>
    <summary> <u>Details...</u></summary>
If you are using user/device authentication, run the following cell. 
- Click the 'Copy code to clipboard and authenticate' button.
- This will pop up an Azure Active Directory authentication dialog (in a new tab or browser window). The device code will have been copied to the clipboard. 
- Select the text box and paste (Ctrl-V/Cmd-V) the copied value. 
- You should then be redirected to a user authentication page where you should authenticate with a user account that has permission to query your Log Analytics workspace.

Note: you may occasionally see a JavaScript error displayed at the end of the authentication - you can safely ignore this.<br>
On successful authentication you should see a ```popup schema``` button.
To find your Workspace Id go to [Log Analytics](https://ms.portal.azure.com/#blade/HubsExtension/Resources/resourceType/Microsoft.OperationalInsights%2Fworkspaces). Look at the workspace properties to find the ID.
    
Note that you may see a warning relating to the IPStack service when running this cell. This can be safely ignored as it's not used in this case.
</details>

In [None]:
if not ws_id:
    ws_id = az_data.get_resource_details(
                    sub_id=sub_value, resource_id=wspcs[wspcs["name"]==wspc_picker.value]["resource_id"].iloc[0]  # type: ignore
                )['properties']['workspaceResourceId']

ms_sent = MicrosoftSentinel(ws_id)
ms_sent.connect()
# Set up notebooklets
qry_prov = QueryProvider("AzureSentinel")
qry_prov.connect(WorkspaceConfig(workspace=ws_id.split("/")[-1]))
ti = TILookup()
nb.init(qry_prov)
timespan = TimeSpan(start=datetime.now() - timedelta(days=7))

Use the time selector below to set the time range you wish to get incidents from.

In [None]:
md("Select the time range to get incidents from:", "bold")
alert_q_times = nbwidgets.QueryTime(units="hours", max_before=72, max_after=1, before=24)
alert_q_times.display()


### Incident Timeline
This timeline shows all incidents in the selected workspace that were created in the chosen time range, grouped by incident severity. If no incidents fall within the range, all incidents are shown.
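
The time-window filter used by the cell below can be sketched with the standard library alone. This is a minimal illustration, not the notebook's implementation: `filter_by_window` and the sample incidents are hypothetical, and plain dictionaries stand in for the incident DataFrame.

```python
from datetime import datetime, timedelta, timezone


def filter_by_window(incidents, start, end):
    """Return incidents whose 'created' time falls within [start, end].

    Falls back to the full list when nothing matches, mirroring the
    notebook cell so the timeline is never empty.
    """
    # Naive datetimes cannot be compared with timezone-aware ones, so
    # attach a timezone first (astimezone() treats naive values as local time).
    start, end = start.astimezone(), end.astimezone()
    matched = [inc for inc in incidents if start <= inc["created"] <= end]
    return matched or incidents


# Hypothetical sample data; real incidents come from the Sentinel API.
now = datetime.now(timezone.utc)
sample = [
    {"title": "Old incident", "created": now - timedelta(days=10)},
    {"title": "Recent incident", "created": now - timedelta(hours=2)},
]
recent = filter_by_window(sample, datetime.now() - timedelta(days=1), datetime.now())
print([inc["title"] for inc in recent])
```

Converting the widget's naive start and end times before comparing avoids the `TypeError` that Python raises when naive and aware datetimes are compared.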

In [None]:
incidents = ms_sent.list_incidents()
# Adding timezones to allow comparison with UTC incident times
start = alert_q_times.start.astimezone()
end = alert_q_times.end.astimezone()
if isinstance(incidents, pd.DataFrame) and not incidents.empty:
    incidents["date"] = pd.to_datetime(incidents["properties.createdTimeUtc"], utc=True)
    in_range = incidents[incidents["date"].between(start, end)]
    # Fall back to all incidents if none fall within the selected range
    filtered_incidents = in_range if not in_range.empty else incidents
    filtered_incidents.mp_plot.timeline(
        source_columns=["properties.title", "properties.status"],
        title="Incidents over time - grouped by severity",
        height=300,
        group_by="properties.severity",
        time_column="date",
    )
else:
    md("No incidents found")


### Select Incident to Triage
From the table below select the incident you wish to triage.

In [None]:
md("Select an incident to triage:", "bold")


def display_incident(incident):
    details = f"""
            <h3>Selected Incident: {incident['properties.title']},</h3>
            <b>Incident time: </b> {incident['properties.createdTimeUtc']} - 
            <b>Severity: </b> {incident['properties.severity']} - 
            <b>Assigned to: </b>{incident['properties.owner.userPrincipalName']} - 
            <b>Status: </b> {incident['properties.status']}
            """
    new_idx = [idx.split(".")[-1] for idx in incident.index]
    # set_axis returns a new Series (the inplace argument was removed in pandas 2.0)
    incident = incident.set_axis(new_idx)
    return HTML(details), pd.DataFrame(incident)


filtered_incidents["short_id"] = filtered_incidents["id"].apply(
    lambda x: x.split("/")[-1]
)

alert_sel = SelectAlert(
    alerts=filtered_incidents,
    columns=["properties.title", "properties.severity", "properties.status"],
    time_col="properties.createdTimeUtc",
    id_col="short_id",
    action=display_incident,
)
alert_sel.display()

The cell below shows you key details relating to the incident, including the associated entities and the graph of the relationships between these entities.
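
As a rough sketch of the grouping step, the example below groups entities by type using only the standard library. It assumes entities arrive as `(type, properties)` pairs, which is the shape the cell below iterates over; `group_entities` and the sample values are hypothetical.

```python
from collections import defaultdict


def group_entities(entities):
    """Group (entity_type, properties) pairs by entity type."""
    grouped = defaultdict(list)
    for ent_type, props in entities:
        grouped[ent_type].append(props)
    return dict(grouped)


# Hypothetical entities using documentation-only IP addresses.
sample_entities = [
    ("Ip", {"address": "203.0.113.10"}),
    ("Host", {"hostName": "workstation01", "dnsDomain": "contoso.com"}),
    ("Ip", {"address": "198.51.100.7"}),
]
for ent_type, items in group_entities(sample_entities).items():
    print(ent_type, len(items))
```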

In [None]:
incident_details = ms_sent.get_incident(
    alert_sel.selected_alert.id.split("/")[-1], entities=True, alerts=True
)
ent_dfs = []
for ent in incident_details["Entities"][0]:
    ent_df = pd.json_normalize(ent[1])
    ent_df["Type"] = ent[0]
    ent_dfs.append(ent_df)

md("Incident Entities:", "bold")
if ent_dfs:
    new_df = pd.concat(ent_dfs, axis=0, ignore_index=True)
    grp_df = new_df.groupby("Type")
    for grp in grp_df:
        md(grp[0], "bold")
        display(grp[1].dropna(axis=1))

alert_out = []
if "Alerts" in incident_details.columns:
    for alert in incident_details.iloc[0]["Alerts"]:
        qry = f"SecurityAlert | where TimeGenerated between((datetime({alert_q_times.start})-7d)..datetime({alert_q_times.end})) | where SystemAlertId == '{alert['ID']}'"
        df = qry_prov.exec_query(qry)
        display(df)
        if df.empty or not df["Entities"].iloc[0]:
            alert_full = {"ID": alert["ID"], "Name": alert["Name"], "Entities": None}
        else:
            alert_full = {
                "ID": alert["ID"],
                "Name": alert["Name"],
                "Entities": json.loads(df["Entities"].iloc[0]),
            }
        alert_out.append(alert_full)

    incident_details["Alerts"] = [alert_out]

md("Graph of incident entities:", "bold")
graph = EntityGraph(incident_details.iloc[0])
graph.plot(timeline=True)

### Entity Analysis
Below is an analysis of the incident's entities that appear in threat intelligence sources.
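
The overall TI severity is taken from the highest severity returned for any entity. One way to express that ranking is sketched below, assuming the severities are plain strings such as `"high"`, `"warning"` and `"information"`; `overall_severity` and `SEVERITY_ORDER` are illustrative names, not part of msticpy.

```python
SEVERITY_ORDER = ["information", "warning", "high"]  # lowest to highest


def overall_severity(severities):
    """Return the highest severity seen, or "None" if there were no hits."""
    ranked = [s.casefold() for s in severities if s.casefold() in SEVERITY_ORDER]
    if not ranked:
        return "None"
    # The position in SEVERITY_ORDER decides which value ranks highest.
    return max(ranked, key=SEVERITY_ORDER.index).capitalize()


print(overall_severity(["information", "high", "warning"]))
print(overall_severity([]))
```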

In [None]:
sev = []
ip_ti_lookup_results = pd.DataFrame()

# For each entity look it up in Threat Intelligence data
md("Looking up entities in TI feeds...")
prog = Progress(completed_len=len(incident_details["Entities"].iloc[0]))
i = 0
for ent in incident_details["Entities"].iloc[0]:
    i += 1
    prog.update_progress(i)
    if ent[0] == "Ip":
        resp = ti.lookup_ioc(observable=ent[1]["address"], ioc_type="ipv4")
        # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
        ip_ti_lookup_results = pd.concat(
            [ip_ti_lookup_results, ti.result_to_df(resp)], ignore_index=True
        )
        for response in resp[1]:
            sev.append(response[1].severity)
    if ent[0] in ("Url", "DnsResolution"):
        if "url" in ent[1]:
            lkup_dom = ent[1]["url"]
        else:
            lkup_dom = ent[1]["domainName"]
        resp = ti.lookup_ioc(lkup_dom, ioc_type="url")
        ip_ti_lookup_results = pd.concat(
            [ip_ti_lookup_results, ti.result_to_df(resp)], ignore_index=True
        )
        for response in resp[1]:
            sev.append(response[1].severity)
    if ent[0] == "FileHash":
        resp = ti.lookup_ioc(ent[1]["hashValue"])
        ip_ti_lookup_results = pd.concat(
            [ip_ti_lookup_results, ti.result_to_df(resp)], ignore_index=True
        )
        for response in resp[1]:
            sev.append(response[1].severity)

# Take overall severity of the entities based on the highest score
if "high" in sev:
    severity = "High"
elif "warning" in sev:
    severity = "Warning"
elif "information" in sev:
    severity = "Information"
else:
    severity = "None"

md("Checking to see if incident entities appear in TI data...")

incident_details["TI Severity"] = severity
# Output the incident and TI results if any entity had a TI hit (high, warning or information)
if (
    incident_details["TI Severity"].iloc[0] == "High"
    or incident_details["TI Severity"].iloc[0] == "Warning"
    or incident_details["TI Severity"].iloc[0] == "Information"
):
    print("Incident:")
    display(
        incident_details[
            [
                "properties.createdTimeUtc",
                "properties.incidentNumber",
                "properties.title",
                "properties.status",
                "properties.severity",
                "TI Severity",
            ]
        ]
        .style.applymap(ti_color_cells)
        .hide_index()
    )
    md("TI Results:", "bold")
    display(
        ip_ti_lookup_results[["Ioc", "IocType", "Provider", "Severity", "Details"]]
        .sort_values(by="Severity")
        .style.applymap(ti_color_cells)
        .hide_index()
    )
else:
    md("None of the Entities appeared in TI data", "bold")

### IP Entity Analysis
Below is an analysis of all IP entities attached to the incident.

In [None]:
# Enrich IP entities using the IP Summary notebooklet
ip_ent_nb = nb.nblts.azsent.network.IpAddressSummary()

def display_ipnb_details(item, nb_out, header = ""):
    if nb_out.check_valid_result_data(item, False):
        md(header, "bold")
        display(getattr(nb_out, item))

def display_ip_map(nb_out):
    if nb_out.check_valid_result_data("location", False):
        md(f"Geo IP details for {ip_addr}", "bold")
        folium_map.add_ip_cluster([nb_out.ip_entity])
        display(folium_map)

def display_ip_alerts(nb_out):
    if nb_out.check_valid_result_data("related_alerts", False):
        md(f"Alerts for {ip_addr}", "bold")
        nb_out.related_alerts.mp_plot.timeline(
            source_columns=["AlertName", "Severity"],
            title=f"Alerts associated with {ip_addr}",
            height=300,
        )

def display_ip_hosts(nb_out):
    if nb_out.host_entity["IpAddresses"]:
        md(f"{ip_addr} belongs to a known host", "bold")
        display(
            pd.DataFrame.from_records(
                [
                    {
                        x: nb_out.host_entity[x]
                        for x in nb_out.host_entity.__iter__()
                    }
                ]
            )
        )

if not ip_ti_lookup_results.empty and "ipv4" in ip_ti_lookup_results["IocType"].unique():
    for ip_addr in ip_ti_lookup_results[ip_ti_lookup_results["IocType"] == "ipv4"]["Ioc"].unique():
        folium_map = FoliumMap(width="50%", height="50%")
        try:
            display(HTML(f"<h1>Summary of Activity Related to {ip_addr}:</h1>"))
            ip_ent_nb_out = ip_ent_nb.run(value=ip_addr, timespan=timespan, silent=True)
            md(
                f"{ip_addr} - {ip_ent_nb_out.ip_origin} - {ip_ent_nb_out.ip_type}",
                "bold",
            )
            display_ipnb_details("whois", ip_ent_nb_out, f"Whois information for {ip_addr}")
            display_ip_map(ip_ent_nb_out)
            display_ipnb_details("passive_dns", ip_ent_nb_out, f"Passive DNS results for {ip_addr}")
            display_ipnb_details("vps_network", ip_ent_nb_out, f"{ip_addr} belongs to a known VPS provider")
            display_ip_alerts(ip_ent_nb_out)
            display_ipnb_details("ti_results", ip_ent_nb_out, f"TI results for {ip_addr}")
            display_ip_hosts(ip_ent_nb_out)
            md("<hr>")
            md("<br><br>")
        except Exception:
            md(f"Error processing {ip_addr}", "bold")
else:
    md("No IP entities present", "bold")

### Domain Entity Analysis
Below is an analysis of all Domain/URL entities attached to the incident.
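
The cell below scores each domain name with Shannon entropy, among other checks. The standard-library sketch below shows the calculation on its own. `shannon_entropy` is an illustrative name and the sample domains are made up.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1 / p) over the frequency p of each distinct character.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# A repetitive name scores low; a random-looking name scores higher.
for name in ("aaaa.com", "contoso.com", "x9k2qv7pzl4w.com"):
    print(name, round(shannon_entropy(name), 4))
```

Names with very low or very high entropy are worth a closer look, because machine-generated domains often use a wider spread of characters than human-chosen names do.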

In [None]:
# Enrich Domain entities
domain_items = [
    "name",
    "org",
    "city",
    "state",
    "country",
    "registrar",
    "status",
    "creation_date",
    "expiration_date",
    "updated_date",
    "name_servers",
    "dnssec",
]


def Entropy(data):
    """Calculate entropy of string"""
    # np.float was removed in NumPy 1.24, so use the built-in float
    s, lens = Counter(data), float(len(data))
    return -sum(count / lens * np.log2(count / lens) for count in s.values())


domain_records = pd.DataFrame()
if not ip_ti_lookup_results.empty and "url" in ip_ti_lookup_results["IocType"].unique():
    md("Domain entity enrichment", "bold")
    for url in ip_ti_lookup_results[ip_ti_lookup_results["IocType"] == "url"]["Ioc"].unique():
        display(HTML(f"<h1>Summary of Activity Related to {url}:</h1>"))
        wis = whois(url)
        if not wis.domain_name:
            continue
        if isinstance(wis["domain_name"], list):
            domain = wis["domain_name"][0]
        else:
            domain = wis["domain_name"]
        # Create domain record from whois data
        dom_rec = {}
        for key in wis.keys():
            if key in domain_items:
                dom_rec[key] = [wis[key]]
        dom_rec["domain"] = domain
        dom_record = pd.DataFrame(dom_rec)
        page_rank = ti.result_to_df(
            ti.lookup_ioc(observable=domain, providers=["OPR"])
        )
        page_rank_score = page_rank["RawResult"][0]["response"][0][
            "page_rank_integer"
        ]
        dom_record["Page Rank"] = [page_rank_score]
        dom_ent = Entropy(domain)
        dom_record["Entropy"] = [dom_ent]
        # Highlight Page Rank or Entropy scores of note
        display(
            dom_record.T.style.applymap(
                ent_color_cells, subset=pd.IndexSlice[["Page Rank", "Entropy"], 0]
            )
        )
        md("If Page Rank or Domain Entropy are highlighted this indicates that their values are outside the expected values of a legitimate website")
        md("The average entropy for the 1M most popular domains is 3.2675")
        md("<hr>")
        md("<br><br>")
else:
    md("No Domain entities present", "bold")

### User Entity Analysis
Below is an analysis of all User entities attached to the incident.
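
The account name passed to the notebooklet is built from whichever identifier the entity provides. A minimal sketch of that selection follows, assuming the property names used in the cell below (`accountName`, `aadUserId`, `upn`, `upnSuffix`). `resolve_account_name` is a hypothetical helper, and unlike the cell it skips empty values rather than only checking that a key is present.

```python
def resolve_account_name(props):
    """Build a lookup name from an Account or Mailbox entity's properties."""
    name = None
    # Try the identifiers in the same priority order as the notebook cell.
    for key in ("accountName", "aadUserId", "upn"):
        if props.get(key):
            name = props[key]
            break
    if name and props.get("upnSuffix"):
        return f"{name}@{props['upnSuffix']}"
    return name


print(resolve_account_name({"accountName": "alice", "upnSuffix": "contoso.com"}))
print(resolve_account_name({"aadUserId": "1234"}))
```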

In [None]:
# Enrich Account entities using the AccountSummary notebooklet
timespan = TimeSpan(
    start=pd.to_datetime(incident_details.iloc[0]["properties.firstActivityTimeUtc"]) - timedelta(days=1),
    end=pd.to_datetime(incident_details.iloc[0]["properties.lastActivityTimeUtc"]) + timedelta(days=1)
)
account_nb = nb.nblts.azsent.account.AccountSummary()
user = None
uent = None

def display_accnb_details(item, nb_out, header = ""):
    if nb_out.check_valid_result_data(item, False):
        md(header, "bold")
        display(getattr(nb_out, item))

if check_ent(incident_details["Entities"][0], "account") or check_ent(
    incident_details["Entities"][0], "mailbox"
):
    md("Account entity enrichment", "bold")
    for ent in incident_details["Entities"][0]:
        if ent[0] == "Account" or ent[0] == "Mailbox":
            if "accountName" in ent[1].keys():
                uent = ent[1]["accountName"]
            elif "aadUserId" in ent[1].keys():
                uent = ent[1]["aadUserId"]
            elif "upn" in ent[1].keys():
                uent = ent[1]["upn"]
            if "upnSuffix" in ent[1].keys():
                user = uent + "@" + ent[1]["upnSuffix"]
            else:
                user = uent
            if user:
                try:
                    display(HTML(f"<h1>Summary of Activity Related to {user}:</h1>"))
                    ac_nb = account_nb.run(timespan=timespan, value=user, silent=True)
                    if ac_nb.account_selector is not None:
                        md("Warning: multiple matching accounts found")
                        display(ac_nb.account_selector)
                    display_accnb_details("account_activity", ac_nb, "Recent activity")
                    display_accnb_details("related_alerts", ac_nb, "Related alerts")
                    ac_nb.get_additional_data(silent=True)
                    display_accnb_details("host_logon_summary", ac_nb, "Host Logons")
                    display_accnb_details("azure_activity_summary", ac_nb, "Azure Activity")
                    display_accnb_details("azure_timeline_by_provider", ac_nb)
                    display_accnb_details("ip_summary", ac_nb, "IP summary")
                except Exception:
                    print(f"Error processing {user}")
else:
    md("No Account entities present", "bold")

### Host Entity Analysis
Below is an analysis of all Host entities attached to the incident.
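
Host entities may or may not carry a DNS domain, so the name passed to the notebooklet is either a fully qualified name or a bare host name. The sketch below illustrates that choice with a hypothetical helper, `build_host_name`, using the `hostName` and `dnsDomain` property names from the cell below.

```python
def build_host_name(props):
    """Return the FQDN when a DNS domain is known, else the bare host name."""
    host = props["hostName"]
    domain = props.get("dnsDomain")
    return f"{host}.{domain}" if domain else host


print(build_host_name({"hostName": "workstation01", "dnsDomain": "contoso.com"}))
print(build_host_name({"hostName": "workstation01"}))
```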

In [None]:
# Enrich Host entities using the HostSummary notebooklet
timespan = TimeSpan(
    start=pd.to_datetime(incident_details.iloc[0]["properties.firstActivityTimeUtc"]) - timedelta(days=1),
    end=pd.to_datetime(incident_details.iloc[0]["properties.lastActivityTimeUtc"]) + timedelta(days=1)
)
host_nb = nb.nblts.azsent.host.HostSummary()

if check_ent(incident_details["Entities"][0], "host"):
    md("Host entity enrichment", "bold")
    for ent in incident_details["Entities"][0]:
        if ent[0] == "Host":
            if "dnsDomain" in ent[1]:
                host_name = ent[1]["hostName"] + "." + ent[1]["dnsDomain"]
            else:
                host_name = ent[1]["hostName"]
            md(f"Host summary for {host_name}", "bold")
            try:
                display(HTML(f"<h1>Summary of Activity Related to {host_name}:</h1>"))
                host_sum_out = host_nb.run(value=host_name, timespan=timespan)
            except Exception:
                print(f"Error processing {host_name}")
else:
    md("No Host entities present", "bold")

### Other Entity Analysis
If there are other entity types not analyzed above, a timeline of their appearance in security alerts appears below.
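
The timeline is driven by a KQL query that searches the `Entities` column of `SecurityAlert` for the entity's value. A hedged sketch of building such a query is shown below. `build_entity_alert_query` is a hypothetical helper, the start and end times are assumed to be in UTC, and the quote escaping is an added precaution that the notebook's own helper does not perform.

```python
from datetime import datetime, timezone


def build_entity_alert_query(entity_value, start, end):
    """Build a KQL query for alerts whose Entities column mentions a value."""
    # Assumption: escape backslashes and single quotes so the value cannot
    # terminate the KQL string literal early.
    safe_value = entity_value.replace("\\", "\\\\").replace("'", "\\'")
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumes start and end are UTC times
    return (
        "SecurityAlert"
        f" | where TimeGenerated between (datetime({start.strftime(fmt)})"
        f" .. datetime({end.strftime(fmt)}))"
        f" | where Entities contains '{safe_value}'"
    )


query = build_entity_alert_query(
    "o'brien",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
print(query)
```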

In [None]:
ent_map = {
    "FileHash": "hashValue",
    "Malware": "malwareName",
    "File": "fileName",
    "CloudApplication": "appId",
    "AzureResource": "ResourceId",
    "RegistryValue": "registryName",
    "SecurityGroup": "SID",
    "IoTDevice": "deviceId",
    "Mailbox": "mailboxPrimaryAddress",
    "MailMessage": "networkMessageId",
    "SubmissionMail": "submissionId",
}
for ent in incident_details["Entities"][0]:
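    # Skip entities that lack the expected property rather than raising a KeyError
    if ent[0] in ent_map and ent_map[ent[0]] in ent[1]:
        ent_alerts(ent[1][ent_map[ent[0]]])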