{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Inversions\n",
    "\n",
    "In this notebook, we use a synthetic example to explore aspects of inversion including:\n",
    "- assigning uncertainties to the the data\n",
    "- adjusting the regularization parameters on the smallness and smoothness terms\n",
    "- adjusting the value of the trade-off parameter beta"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 0: Imports and load survey info\n",
    "\n",
    "These steps are the same as in the previous notebook. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# core python \n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from matplotlib.colors import LogNorm, SymLogNorm, Normalize\n",
    "import matplotlib.gridspec as gridspec\n",
    "import ipywidgets\n",
    "\n",
    "# tools in the simPEG Ecosystem \n",
    "import discretize  # for creating computational meshes\n",
    "\n",
    "# linear solvers\n",
    "try: \n",
    "    from pymatsolver import Pardiso as Solver  # this is a fast linear solver \n",
    "except ImportError:\n",
    "    from SimPEG import SolverLU as Solver  # this will be slower\n",
    "\n",
    "# SimPEG inversion machinery\n",
    "from SimPEG import (\n",
    "    Data, maps,\n",
    "    data_misfit, regularization, optimization, inverse_problem, \n",
    "    inversion, directives\n",
    ") \n",
    "\n",
    "# DC resistivity and IP modules\n",
    "from SimPEG.electromagnetics import resistivity as dc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set the font size in the plots\n",
    "from matplotlib import rcParams\n",
    "rcParams[\"font.size\"] = 14"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "try: \n",
    "    import warnings\n",
    "    warnings.filterwarnings('ignore')\n",
    "except:\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1: Load DC Survey\n",
    "\n",
    "This is the same as in the previous notebook, but included so we can re-visit any of the steps as necessary. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "line = \"46800E\"\n",
    "\n",
    "dc_data_file = f\"./century/{line}/{line[:-1]}POT.OBS\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def read_dcip_data(filename, verbose=True):\n",
    "    \"\"\"\n",
    "    Read in a .OBS file from the Century data set into a python dictionary. \n",
    "    The format is the old UBC-GIF DCIP format.\n",
    "    \n",
    "    Parameters\n",
    "    ----------\n",
    "    filename : str\n",
    "        Path to the file to be parsed\n",
    "    \n",
    "    verbose: bool\n",
    "        Print some things? \n",
    "    \n",
    "    \n",
    "    Returns\n",
    "    -------\n",
    "    dict\n",
    "        A dictionary with the locations of\n",
    "        - a_locations: the positive source electrode locations (numpy array) \n",
    "        - b_locations: the negative source electrode locations (numpy array) \n",
    "        - m_locations: the receiver locations (list of numpy arrays)\n",
    "        - n_locations: the receiver locations (list of numpy arrays)\n",
    "        - n_locations: the receiver locations (list of numpy arrays)\n",
    "        - observed_data: observed data (list of numpy arrays)\n",
    "        - standard_deviations: assigned standard deviations (list of numpy arrays)\n",
    "        - n_sources: number of sources (int)\n",
    "    \n",
    "    \"\"\"\n",
    "    \n",
    "    # read in the text file as a numpy array of strings (each row is an entry)\n",
    "    contents = np.genfromtxt(filename, delimiter=' \\n', dtype=np.str)\n",
    "    \n",
    "    # the second line has the number of sources, current, and data type (voltages if 1)\n",
    "    n_sources = int(contents[1].split()[0])\n",
    "    \n",
    "    if verbose is True: \n",
    "        print(f\"number of sources: {n_sources}\")\n",
    "    \n",
    "    # initialize storage for the electrode locations and data\n",
    "    a_locations = np.zeros(n_sources)\n",
    "    b_locations = np.zeros(n_sources)\n",
    "    m_locations = []\n",
    "    n_locations = []\n",
    "    observed_data = []\n",
    "    standard_deviations = []\n",
    "    \n",
    "    # index to track where we have read in content \n",
    "    content_index = 1 \n",
    "    \n",
    "    # loop over sources \n",
    "    for i in range(n_sources):\n",
    "        # start by reading in the source info \n",
    "        content_index = content_index + 1  # read the next line\n",
    "        a_location, b_location, nrx = contents[content_index].split()  # this is a string\n",
    "\n",
    "        # convert the strings to a float for locations and an int for the number of receivers\n",
    "        a_locations[i] = float(a_location)\n",
    "        b_locations[i] = float(b_location)\n",
    "        nrx = int(nrx)\n",
    "\n",
    "        if verbose is True: \n",
    "            print(f\"Source {i}: A-loc: {a_location}, B-loc: {b_location}, N receivers: {nrx}\")\n",
    "\n",
    "        # initialize space for receiver locations, observed data associated with this source\n",
    "        m_locations_i, n_locations_i = np.zeros(nrx), np.zeros(nrx)\n",
    "        observed_data_i, standard_deviations_i = np.zeros(nrx), np.zeros(nrx)\n",
    "\n",
    "        # read in the receiver info \n",
    "        for j in range(nrx):\n",
    "            content_index = content_index + 1  # read the next line\n",
    "            m_location, n_location, datum, std = contents[content_index].split()\n",
    "\n",
    "            # convert the locations and data to floats, and store them\n",
    "            m_locations_i[j] = float(m_location)\n",
    "            n_locations_i[j] = float(n_location)\n",
    "            observed_data_i[j] = float(datum)\n",
    "            standard_deviations_i[j] = float(std)\n",
    "\n",
    "        # append the receiver info to the lists\n",
    "        m_locations.append(m_locations_i)\n",
    "        n_locations.append(n_locations_i)\n",
    "        observed_data.append(observed_data_i)\n",
    "        standard_deviations.append(standard_deviations_i)\n",
    "    \n",
    "    return {\n",
    "        \"a_locations\": a_locations,\n",
    "        \"b_locations\": b_locations, \n",
    "        \"m_locations\": m_locations,\n",
    "        \"n_locations\": n_locations,\n",
    "        \"observed_data\": observed_data, \n",
    "        \"standard_deviations\": standard_deviations,\n",
    "        \"n_sources\": n_sources, \n",
    "    }"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dc_data_dict = read_dcip_data(dc_data_file, verbose=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# initialize an empty list for each \n",
    "source_list = []\n",
    "\n",
    "# center the survey and work in local coordinates\n",
    "x_local = 0.5*(np.min(dc_data_dict[\"a_locations\"]) + np.max(np.hstack(dc_data_dict[\"n_locations\"])))\n",
    "\n",
    "for i in range(dc_data_dict[\"n_sources\"]):\n",
    "    \n",
    "    # receiver electrode locations in 2D \n",
    "    m_locs = np.vstack([\n",
    "        dc_data_dict[\"m_locations\"][i] - x_local, \n",
    "        np.zeros_like(dc_data_dict[\"m_locations\"][i])\n",
    "    ]).T\n",
    "    n_locs = np.vstack([\n",
    "        dc_data_dict[\"n_locations\"][i] - x_local,\n",
    "        np.zeros_like(dc_data_dict[\"n_locations\"][i])\n",
    "    ]).T\n",
    "    \n",
    "    # construct the receiver object \n",
    "    receivers = dc.receivers.Dipole(locations_m=m_locs, locations_n=n_locs, storeProjections=False)\n",
    "    \n",
    "    # construct the source \n",
    "    source = dc.sources.Dipole(\n",
    "        location_a=np.r_[dc_data_dict[\"a_locations\"][i] - x_local, 0.],\n",
    "        location_b=np.r_[dc_data_dict[\"b_locations\"][i] - x_local, 0.],\n",
    "        receiver_list=[receivers]\n",
    "    )\n",
    "    \n",
    "    # append the new source to the source list\n",
    "    source_list.append(source)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "survey = dc.Survey(source_list=source_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2: Build a Mesh\n",
    "\n",
    "Similar to the previous notebook, we use a simple function to design our mesh. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def build_mesh(\n",
    "    survey=survey, \n",
    "    n_cells_per_spacing_x=4,\n",
    "    n_cells_per_spacing_z=4,\n",
    "    n_core_extra_x=4,\n",
    "    n_core_extra_z=4,\n",
    "    core_domain_z_ratio=1/3.,\n",
    "    padding_factor=1.3,\n",
    "    n_pad_x=10,\n",
    "    n_pad_z=10,\n",
    "):\n",
    "    \"\"\"\n",
    "    A function for designing a Tensor Mesh based on DC survey parameters\n",
    "    \n",
    "    Parameters\n",
    "    ----------\n",
    "    \n",
    "    survey: dc.Survey \n",
    "        A DC (or IP) survey object\n",
    "    \n",
    "    n_cells_per_spacing_[x, z]: int\n",
    "        Number of [x, z]-cells per the minimum electrode spacing\n",
    "        \n",
    "    n_core_extra_[x, z]: int\n",
    "        Number of extra cells with the same size as the core domain beyond the survey extent\n",
    "    \n",
    "    core_domain_z_ratio: float\n",
    "        Factor that multiplies the maximum AB, MN separation to define the core mesh extent\n",
    "    \n",
    "    padding_factor: float\n",
    "        Factor by which we expand the mesh cells in the padding region\n",
    "    \n",
    "    n_pad_[x, z]: int\n",
    "        Number of padding cells in the x, z directions\n",
    "    \"\"\"\n",
    "    min_electrode_spacing = np.min(np.abs(survey.locations_a[:, 0] - survey.locations_b[:, 0]))\n",
    "\n",
    "    dx = min_electrode_spacing / n_cells_per_spacing_x\n",
    "    dz = min_electrode_spacing / n_cells_per_spacing_z\n",
    "    \n",
    "    # define the x core domain\n",
    "    core_domain_x = np.r_[\n",
    "        survey.electrode_locations[:, 0].min(),\n",
    "        survey.electrode_locations[:, 0].max()\n",
    "    ]\n",
    "    \n",
    "    # find the y core domain\n",
    "    # find the maximum spacing between source, receiver midpoints\n",
    "    mid_ab = (survey.locations_a + survey.locations_b)/2\n",
    "    mid_mn = (survey.locations_m + survey.locations_n)/2\n",
    "    separation_ab_mn = np.abs(mid_ab - mid_mn)\n",
    "    max_separation = separation_ab_mn.max()\n",
    "    core_domain_z = np.r_[-core_domain_z_ratio * max_separation, 0.]\n",
    "    \n",
    "    # add extra cells beyond the core domain\n",
    "    n_core_x = np.ceil(np.diff(core_domain_x)/dx) + n_core_extra_x*2  # on each side\n",
    "    n_core_z = np.ceil(np.diff(core_domain_z)/dz) + n_core_extra_z  # just below\n",
    "    \n",
    "    # define the tensors in each dimension\n",
    "    hx = [(dx, n_pad_x, -padding_factor), (dx, n_core_x), (dx, n_pad_x, padding_factor)]\n",
    "    hz = [(dz, n_pad_z, -padding_factor), (dz, n_core_z)]\n",
    "\n",
    "    mesh = discretize.TensorMesh([hx, hz], x0=\"CN\")\n",
    "\n",
    "    return mesh, core_domain_x, core_domain_z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mesh, core_domain_x, core_domain_z = build_mesh(survey)\n",
    "mesh.plotGrid()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3: Design our True Model \n",
    "\n",
    "Here, we will focus in on a synthetic example so we know the true solution. Again, we will use a model containing 2 blocks in a halfspace. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.1 Define the model geometry and physical properties\n",
    "\n",
    "Here, we define a model with one conductive block and one resistive block in a halfspace. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the resistivities\n",
    "rho_background = 100\n",
    "rho_resistive_block = 1000\n",
    "rho_conductive_block = 10\n",
    "\n",
    "# define the geometry of each block\n",
    "xlim_resistive_block = np.r_[-500, -250]\n",
    "zlim_resistive_block = np.r_[-150, -75]\n",
    "\n",
    "xlim_conductive_block = np.r_[250, 500]\n",
    "zlim_conductive_block = np.r_[-150, -75]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rho = rho_background * np.ones(mesh.nC)\n",
    "\n",
    "# resistive block\n",
    "inds_resistive_block = (\n",
    "    (mesh.gridCC[:, 0] >= xlim_resistive_block.min()) & (mesh.gridCC[:, 0] <= xlim_resistive_block.max()) &\n",
    "    (mesh.gridCC[:, 1] >= zlim_resistive_block.min()) & (mesh.gridCC[:, 1] <= zlim_resistive_block.max())\n",
    ")\n",
    "\n",
    "rho[inds_resistive_block] = rho_resistive_block\n",
    "\n",
    "# conductive block\n",
    "inds_conductive_block = (\n",
    "    (mesh.gridCC[:, 0] >= xlim_conductive_block.min()) & (mesh.gridCC[:, 0] <= xlim_conductive_block.max()) &\n",
    "    (mesh.gridCC[:, 1] >= zlim_conductive_block.min()) & (mesh.gridCC[:, 1] <= zlim_conductive_block.max())\n",
    ")\n",
    "\n",
    "rho[inds_conductive_block] = rho_conductive_block"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(1, 1, figsize=(12, 4))\n",
    "out = mesh.plotImage(np.log10(rho), ax=ax, pcolorOpts={\"cmap\":\"Spectral\"})\n",
    "plt.colorbar(out[0], ax=ax, label=\"log$_{10} \\\\rho$\", orientation=\"horizontal\", fraction=0.05, pad=0.25)\n",
    "ax.set_xlim(core_domain_x)\n",
    "ax.set_ylim(core_domain_z + np.r_[-100, 0])\n",
    "ax.set_aspect(1.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 Define the true model\n",
    "\n",
    "Since we will be working with log-resistivities in the inversion, our true model is defined as the log of the resistivity. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_true = np.log(rho)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4: Set up and run a forward simulation\n",
    "\n",
    "These will be the \"observed data\" in the inversion. We add noise to the result (a default of 5%). If you would like to adjust the noise level, you can change the `relative_error`, you can pass `add_noise` to the `make_synthetic_data` function. \n",
    "\n",
    "As in the first notebook, we will invert for log-resistivity, so we use an `ExpMap` in the forward simulation. Again, we use the `storeJ` option to store the  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mapping = maps.ExpMap(mesh)\n",
    "\n",
    "# Generate 2.5D DC problem\n",
    "simulation_dc = dc.Simulation2DNodal(\n",
    "    mesh, rhoMap=mapping, solver=Solver, survey=survey, storeJ=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time \n",
    "\n",
    "# compute our \"observed data\"\n",
    "synthetic_data = simulation_dc.make_synthetic_data(model_true, add_noise=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot pseudosections \n",
    "fig, ax = plt.subplots(1, 1, figsize=(12, 4))\n",
    "    \n",
    "# plot a psuedosection of the data\n",
    "dc.utils.plot_pseudosection(\n",
    "    synthetic_data, data_type=\"apparent resistivity\", \n",
    "    plot_type=\"contourf\", data_location=True, ax=ax, \n",
    "    cbar_opts={\"pad\":0.25}\n",
    ")\n",
    "ax.set_xlim(core_domain_x)\n",
    "ax.set_aspect(1.5)  # some vertical exxageration\n",
    "ax.set_xlabel(\"Northing (m)\")\n",
    "\n",
    "plt.tight_layout()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "# Recall: inversion as optimization \n",
    "\n",
    "We formulate the inverse problem as an optimization problem consisting of a data misfit and a regularization \n",
    "\n",
    "$$\n",
    "\\min_{\\mathbf{m}} \\Phi(\\mathbf{m}) = \\Phi_d(\\mathbf{m}) + \\beta\\Phi_m(\\mathbf{m}) \\\\ s.t. ~ \\Phi_d \\leq \\Phi_d^* \\quad \\mathbf{m}_i^{\\rm L} \\leq \\mathbf{m}_ \\leq \\mathbf{m}_i^{\\rm U}\n",
    "$$\n",
    "\n",
    "where:\n",
    "- $\\mathbf{m}$ is our inversion model - a vector containing the set of parameters that we invert for\n",
    "\n",
    "\n",
    "- $\\Phi_d$ is the data misfit\n",
    "  $$\n",
    "  \\Phi_d(\\mathbf{m}) = \\frac{1}{2}\\|\\mathbf{W_d} (\\mathcal{F}(\\mathbf{m}) -    \\mathbf{d}^{\\text{obs}})\\|^2\n",
    "  $$\n",
    "  \n",
    "\n",
    "- $\\Phi_m$ is the regularization\n",
    "  $$\n",
    "  \\Phi_m(\\mathbf{m}) = \\frac{1}{2}\\big(\\alpha_s\\|\\mathbf{W_s} (\\mathbf{m} - \\mathbf{m}_{\\text{ref}})\\|^2 + \\alpha_x\\|\\mathbf{W_x} (\\mathbf{m})\\|^2 + \\alpha_z\\|\\mathbf{W_z} (\\mathbf{m})\\|^2 \\big)\n",
    "  $$\n",
    "  \n",
    "\n",
    "- $\\beta$ is a trade-off parameter that weights the relative importance of the data misfit and regularization terms\n",
    "\n",
    "\n",
    "- $\\Phi_d^*$ is our target misfit, which is typically set to $N/2$ where $N$ is the number of data (Parker, 1994) (or also see [Oldenburg & Li (2005)](https://www.researchgate.net/profile/Douglas_Oldenburg/publication/238708196_5_Inversion_for_Applied_Geophysics_A_Tutorial/links/004635282572529927000000.pdf))\n",
    "---\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5: assign standard deviations to the data\n",
    "\n",
    "The standard deviations are an estimate of the level of noise on your data. These are used to construct the weights in the $\\mathbf{W_d}$ matrix of the data misfit.\n",
    "\n",
    "It is common to define the standard deviation in terms of a `relative_error` and a `noise_floor`. \n",
    "\n",
    "$$ \\text{standard_deviation} =  \\text{relative_error}\\times|d^{obs}| + \\text{noise_floor}$$\n",
    "\n",
    "For DC resistivity, it is common to choose a `relative_error` between 0.02-0.1 (2% - 10% error). The `noise_floor` value defines threshold for data below which we consider those values to be close to zero. It is important to set a non-zero `noise_floor` when we can have zero-crossings in our data (e.g. both positive and negative values - which we do in DC!). The `noise_floor` ensures that we don't try to fit near-zero values to very high accuracy.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(1, 1)\n",
    "ax.hist(np.log10(np.abs(synthetic_data.dobs)), 30)\n",
    "ax.set_xlabel(\"$log_{10}(|d^{obs}|)$\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "relative_error = 0.05  # 5% error\n",
    "noise_floor = 1e-4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(1, 1)\n",
    "ax.hist(np.log10(np.abs(synthetic_data.dobs)), 30)\n",
    "ax.set_xlabel(\"$log_{10}(|d^{obs}|)$\")\n",
    "ax.axvline(np.log10(noise_floor), linestyle=\"dashed\", color=\"C1\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5.1 Assign uncertainties to our data object\n",
    "\n",
    "In SimPEG, the `data` object is responsible for keeping track of the survey geometry, observed data and uncertainty values. The `standard_deviation` property is what is used to construct the $\\mathbf{W_d}$ matrix in the data misfit\n",
    "\n",
    "```\n",
    "W_d = diag(1 / data.standard_deviation) \n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "synthetic_data.relative_error = relative_error\n",
    "synthetic_data.noise_floor = noise_floor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert(np.allclose(\n",
    "    relative_error * np.abs(synthetic_data.dobs) + noise_floor, \n",
    "    synthetic_data.standard_deviation\n",
    "))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the data sorted by amplitude and associated uncertainties."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inds_sort = np.argsort(np.abs(synthetic_data.dobs))\n",
    "sorted_data = np.abs(synthetic_data.dobs[inds_sort])\n",
    "sorted_std = synthetic_data.standard_deviation[inds_sort]\n",
    "\n",
    "fig, ax = plt.subplots(2, 1, figsize=(12, 8))\n",
    "\n",
    "x = np.arange(0, len(sorted_data))\n",
    "\n",
    "ax[0].plot(x, sorted_data, '.k')\n",
    "ax[1].semilogy(x, sorted_data, '.k')\n",
    "\n",
    "ax[0].set_title(\"sorted data and associated uncertainties\")\n",
    "\n",
    "for a in ax:\n",
    "    a.fill_between(x, sorted_data - sorted_std, sorted_data + sorted_std, alpha=0.25)\n",
    "    a.grid(alpha=0.5)\n",
    "    a.set_ylabel(\"observed data (V)\")\n",
    "\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 6: Assembling the inversion \n",
    "\n",
    "Here, we are going to set up functions for constructing and running the inversion so that we can easily adjust the parameters used in the inversion.\n",
    "\n",
    "We start be defining a function that will construct our `inverse_problem` object which requires the `data_misfit`, `regularization`, and `optimization` be constructed. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_inverse_problem(\n",
    "    relative_error=relative_error, noise_floor=noise_floor,\n",
    "    alpha_s=1e-3,  alpha_x=1, alpha_z=1, mref=np.log(rho_background), \n",
    "    maxIter=20, maxIterCG=20,\n",
    "):\n",
    "    \"\"\"\n",
    "    A function that contructs a data misfit, regularization, and optimization and \n",
    "    assembles them into an inverse_problem object. \n",
    "    \n",
    "    Parameters\n",
    "    ----------\n",
    "    \n",
    "    relative_error: float\n",
    "        relative error we assign to the data (used to construct the standard_deviation)\n",
    "    \n",
    "    noise_floor: float\n",
    "        noise floor that we use to construct the standard deviations of the data\n",
    "    \n",
    "    alpha_s: float\n",
    "        weight for the smallness component of the regularization\n",
    "    \n",
    "    alpha_[x, z]: float\n",
    "        weight for the [x, z]-smoothness term in the regularization\n",
    "    \n",
    "    mref: float\n",
    "        reference model value used in the smallness term of the regularization\n",
    "    \n",
    "    maxIter: int\n",
    "        maximum number of iterations in the inversion\n",
    "    \n",
    "    maxIterCG: int\n",
    "        maximum number of Conjugate Gradient iterations used to compute a step \n",
    "        in the Inexact Gauss Newton optimization\n",
    "        \n",
    "    \"\"\"\n",
    "    \n",
    "    # set the uncertainties and define the data misfit\n",
    "    synthetic_data.relative_error = relative_error\n",
    "    synthetic_data.noise_floor = noise_floor\n",
    "    dmisfit = data_misfit.L2DataMisfit(data=synthetic_data, simulation=simulation_dc)\n",
    "    \n",
    "    # regularization\n",
    "    reg = regularization.Tikhonov(\n",
    "        mesh, alpha_s=alpha_s, alpha_x=alpha_x, alpha_y=alpha_z, mref=mref*np.ones(mesh.nC)\n",
    "    )\n",
    "    \n",
    "    # optimization \n",
    "    opt = optimization.InexactGaussNewton(maxIter=maxIter, maxIterCG=maxIterCG)\n",
    "    opt.remember(\"xc\")\n",
    "    \n",
    "    # return the inverse problem \n",
    "    return inverse_problem.BaseInvProblem(dmisfit, reg, opt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we define a simple function for assembling the inversion. There are very simiar functions in SimPEG already that write these outputs to a file (see for example `directives.SaveOutputEveryIteration`). In part, I show this here to give you a sense of how directives work in the inversion so if you are interested in writing your own, then you can use this as a template. \n",
    "\n",
    "The `initialize` method is called at the very beginning of the inversion. In the `BetaEstimate_ByEig`, which we used in notebook 1, this is where we initialze an estimate of beta.\n",
    "\n",
    "The `endIter` method is called at the end of every iteration. For example in the `BetaSchedule` directive, this is where we update the value of beta. \n",
    "\n",
    "In this directive, we will create a simple dictionary that saves some outputs of interest during the inversion. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SaveInversionProgress(directives.InversionDirective):\n",
    "    \"\"\"\n",
    "    A custom directive to save items of interest during the course of an inversion\n",
    "    \"\"\"\n",
    "    \n",
    "    def initialize(self):\n",
    "        \"\"\"\n",
    "        This is called when we first start running an inversion\n",
    "        \"\"\"\n",
    "        # initialize an empty dictionary for storing results \n",
    "        self.inversion_results = {\n",
    "            \"iteration\":[],\n",
    "            \"beta\":[],\n",
    "            \"phi_d\":[],\n",
    "            \"phi_m\":[],\n",
    "            \"phi_m_small\":[],\n",
    "            \"phi_m_smooth_x\":[],\n",
    "            \"phi_m_smooth_z\":[],\n",
    "            \"dpred\":[],\n",
    "            \"model\":[]\n",
    "        }\n",
    "\n",
    "    def endIter(self):\n",
    "        \"\"\"\n",
    "        This is run at the end of every iteration. So here, we just append \n",
    "        the new values to our dictionary\n",
    "        \"\"\"\n",
    "        \n",
    "        # Save the data\n",
    "        self.inversion_results[\"iteration\"].append(self.opt.iter)\n",
    "        self.inversion_results[\"beta\"].append(self.invProb.beta)\n",
    "        self.inversion_results[\"phi_d\"].append(self.invProb.phi_d)\n",
    "        self.inversion_results[\"phi_m\"].append(self.invProb.phi_m)\n",
    "        self.inversion_results[\"dpred\"].append(self.invProb.dpred)\n",
    "        self.inversion_results[\"model\"].append(self.invProb.model)\n",
    "        \n",
    "        # grab the components of the regularization and evaluate them here\n",
    "        # the regularization has a list of objective functions  \n",
    "        # objfcts = [smallness, smoothness_x, smoothness_z]\n",
    "        # and the multipliers contain the alpha values\n",
    "        # multipliers = [alpha_s, alpha_x, alpha_z]\n",
    "        reg = self.reg.objfcts[0] \n",
    "        phi_s = reg.objfcts[0](self.invProb.model) * reg.multipliers[0]\n",
    "        phi_x = reg.objfcts[1](self.invProb.model) * reg.multipliers[1]\n",
    "        phi_z = reg.objfcts[2](self.invProb.model) * reg.multipliers[2]\n",
    "        \n",
    "        self.inversion_results[\"phi_m_small\"].append(phi_s)\n",
    "        self.inversion_results[\"phi_m_smooth_x\"].append(phi_x)\n",
    "        self.inversion_results[\"phi_m_smooth_z\"].append(phi_z)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There are three directives that we use in this inversion \n",
    "- `BetaEstimate_ByEig`: sets an initial value for $\\beta$ by estimating the largest eigenvalue of  $\\Phi_d$ and of $\\Phi_m$ and then taking their ratio. The value is then scaled by the `beta0_ratio` value, so for example if you wanted the regularization to be ~10 times more important than the data misfit, then we would set `beta0_ratio=10`\n",
    "\n",
    "\n",
    "- `BetaSchedule`: this reduces the value of beta during the course of the inversion. Particularly for non-linear problems, it can be advantageous to start off by seeking a smooth model which fits the regularization and gradually decreasing the infulence of the regularization to increase the influence of the data. \n",
    "\n",
    "\n",
    "- `TargetMisfit`: this directive checks at the end of each iteration if we have reached the target misfit. If we have, the inversion terminates, it not, it continues. Optionally, a `chifact` can be set to stop the inversion when `phi_d <= chifact * phi_d_star`. By default (`chifact=1`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_inversion(\n",
    "    inv_prob, beta0_ratio=1e2, cool_beta=True,\n",
    "    beta_cooling_factor=2, beta_cooling_rate=1,\n",
    "    use_target=True, chi_factor=1, \n",
    "):\n",
    "    \"\"\"\n",
    "    A method for creating a SimPEG inversion with directives for estimating and \n",
    "    updating beta as well as choosing a stopping criteria\n",
    "    \n",
    "    Parameters\n",
    "    ----------\n",
    "    \n",
    "    inv_prob: inverse_problem\n",
    "        An inverse problem object that describes the data misfit, regularization and optimization\n",
    "    \n",
    "    beta0_ratio: float\n",
    "        a scalar that multiplies ratio of the estimated largest eigenvalue of phi_d and phi_m\n",
    "    \n",
    "    cool_beta: bool\n",
    "        True: use a beta cooling schedule. False: use a fixed beta\n",
    "    \n",
    "    beta_cooling_factor: float\n",
    "        reduce beta by a factor of 1/beta_cooling factor \n",
    "    \n",
    "    beta_cooling_rate: int\n",
    "        number of iterations we keep beta fixed for \n",
    "        (we cool beta by the cooling factor every `beta_cooling_rate` iterations) \n",
    "    \n",
    "    chi_factor: float\n",
    "        Stop the inversion when phi_d <= chifact * phi_d_star. By default (`chifact=1`)\n",
    "        \n",
    "    \"\"\"\n",
    "    \n",
    "    # set up our directives\n",
    "    beta_est = directives.BetaEstimate_ByEig(beta0_ratio=beta0_ratio)\n",
    "    target = directives.TargetMisfit(chifact=chi_factor)\n",
    "    save = SaveInversionProgress()\n",
    "    \n",
    "    directives_list = [beta_est, save]\n",
    "    \n",
    "    if use_target is True:\n",
    "        directives_list.append(target)\n",
    "    \n",
    "    if cool_beta is True:\n",
    "        beta_schedule = directives.BetaSchedule(coolingFactor=beta_cooling_factor, coolingRate=beta_cooling_rate)\n",
    "        directives_list.append(beta_schedule)\n",
    "\n",
    "    return inversion.BaseInversion(inv_prob, directiveList=directives_list), target, save"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.1 Create and run the inversion"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rho0 = rho_background"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inv_prob = create_inverse_problem(\n",
    "    relative_error=0.05, noise_floor=1e-4,\n",
    "    alpha_s=1e-3,  alpha_x=1, alpha_z=1, mref=np.log(rho0), \n",
    "    maxIter=20, maxIterCG=20,\n",
    ")\n",
    "\n",
    "inv, target_misfit, inversion_log = create_inversion(\n",
    "    inv_prob, beta0_ratio=1e2, cool_beta=True,\n",
    "    beta_cooling_factor=2, beta_cooling_rate=1,\n",
    "    use_target=False, chi_factor=1 \n",
    ")\n",
    "\n",
    "phi_d_star = survey.nD / 2\n",
    "target = target_misfit.chifact * phi_d_star"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "m0 = np.log(rho0) * np.ones(mesh.nC)\n",
    "model_recovered = inv.run(m0)\n",
    "inversion_results = inversion_log.inversion_results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# inversion_results_app = ipywidgets.interact(\n",
    "#     plot_results_at_iteration,\n",
    "#     iteration = ipywidgets.IntSlider(min=0, max=inversion_results[\"iteration\"][-1]-1, value=0)\n",
    "# )\n",
    "# inversion_results_app"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.2 analyze the results - data misfit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_normalized_misfit(iteration=None, observed_data=synthetic_data, ax=None):\n",
    "    \"\"\"\n",
    "    Plot the normalized data misfit\n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "        \n",
    "    if iteration is None:\n",
    "        dpred = inversion_results[\"dpred\"][-1]\n",
    "    else:\n",
    "        dpred = inversion_results[\"dpred\"][iteration]\n",
    "        \n",
    "    normalized_misfit = (dpred - observed_data.dobs)/observed_data.standard_deviation\n",
    "    out = dc.utils.plot_pseudosection(\n",
    "        observed_data, dobs=normalized_misfit, data_type=\"misfit\",\n",
    "        plot_type=\"contourf\", data_location=True, ax=ax, \n",
    "        cbar_opts={\"pad\":0.25}\n",
    "    )\n",
    "\n",
    "    ax.set_title(\"normalized misfit\")\n",
    "    ax.set_yticklabels([])\n",
    "    ax.set_aspect(1.5)  # some vertical exxageration\n",
    "    ax.set_xlabel(\"Northing (m)\")\n",
    "    ax.set_ylabel(\"n-spacing\")\n",
    "\n",
    "    cb_axes = plt.gcf().get_axes()[-1]\n",
    "    cb_axes.set_xlabel('normalized misfit')\n",
    "    return ax"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# plot misfit\n",
    "fig, ax = plt.subplots(1, 1, figsize=(12, 4))\n",
    "plot_normalized_misfit(ax=ax)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.3 analyze the results - how did we get here? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_beta(inversion_results, x=\"iteration\", ax=None):\n",
    "    \"\"\"\n",
    "    Plot the beta-value as a function of iteration \n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "    if x not in [\"iteration\", \"beta\"]:\n",
    "        raise Exception(\"beta should be plotted as a function of iteration or beta\")\n",
    "    ax.plot(inversion_results[x], inversion_results[\"beta\"], \"-oC0\", label=\"$\\\\beta$\")\n",
    "    ax.set_ylabel(\"$\\\\beta$\")\n",
    "    ax.legend(loc=2)\n",
    "    return ax\n",
    "\n",
    "def plot_misfit_and_regularization(inversion_results, x=\"iteration\", ax=None):\n",
    "    \"\"\"\n",
    "    Plot the data misfit and regularization as a function of iteration\n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "    \n",
    "    if x not in [\"iteration\", \"beta\"]:\n",
    "        raise Exception(\"misfit and regularization should be plotted as a function of iteration or beta\")\n",
    "        \n",
    "    ax.plot(inversion_results[x], inversion_results[\"phi_d\"], \"-oC1\", label=\"$\\Phi_d$\")\n",
    "    ax.axhline(phi_d_star, linestyle=\"--\", color=\"k\", label=\"$\\Phi_d^*$\")\n",
    "    ax.axhline(phi_d_star, linestyle=\"-.\", color=\"k\", label=\"$\\chi\\Phi_d^*$\")\n",
    "    ax.set_ylabel(\"$\\Phi_d$\")\n",
    "\n",
    "    ax_phim = ax.twinx()\n",
    "    ax_phim.plot(inversion_results[x], inversion_results[\"phi_m\"], \"-oC2\", label=\"$\\Phi_m$\")\n",
    "    ax_phim.set_ylabel(\"$\\Phi_m$\")\n",
    "    ax.set_xlabel(x)\n",
    "    \n",
    "    ax.legend(loc=2)\n",
    "    ax_phim.legend(loc=1)\n",
    "    return ax\n",
    "\n",
    "def plot_regularization_components(inversion_results, x=\"iteration\", ax=None):\n",
    "    \"\"\"\n",
    "    Plot the smallness and smoothness terms in the regularization as a function of iteration\n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "    \n",
    "    if x not in [\"iteration\", \"beta\"]:\n",
    "        raise Exception(\"regularization components should be plotted as a function of iteration or beta\")\n",
    "    \n",
    "    ax.plot(inversion_results[x], inversion_results[\"phi_m_small\"], \"-oC3\", label=\"$\\\\alpha_s\\Phi_s$\")\n",
    "    ax.plot(inversion_results[x], inversion_results[\"phi_m_smooth_x\"], \"-oC4\", label=\"$\\\\alpha_x\\Phi_x$\")\n",
    "    ax.plot(inversion_results[x], inversion_results[\"phi_m_smooth_z\"], \"-oC5\", label=\"$\\\\alpha_z\\Phi_z$\")\n",
    "    ax.set_ylabel(\"$\\Phi_m$\")\n",
    "    ax.set_xlabel(x)\n",
    "    ax.legend(loc=2)\n",
    "    return ax\n",
    "\n",
    "def plot_tikhonov_curve(inversion_results, ax=None):\n",
    "    \"\"\"\n",
    "    Plot the Tikhonov curve: Phi_d as a function of Phi_m\n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "    \n",
    "    ax.plot(inversion_results[\"phi_m\"], inversion_results[\"phi_d\"], \"-oC6\", label=\"$\\Phi_d$\")\n",
    "    ax.axhline(phi_d_star, linestyle=\"--\", color=\"k\", label=\"$\\Phi_d^*$\")\n",
    "    ax.axhline(phi_d_star, linestyle=\"-.\", color=\"k\", label=\"$\\chi\\Phi_d^*$\")\n",
    "    ax.set_ylabel(\"$\\Phi_d$\")\n",
    "    ax.set_xlabel(x)\n",
    "    \n",
    "    ax.set_xlabel(\"$\\Phi_m$\")\n",
    "    ax.set_ylabel(\"$\\Phi_d$\")\n",
    "    \n",
    "    ax.legend(loc=1)\n",
    "    return ax"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(3, 1, figsize=(10, 8), sharex=True)\n",
    "\n",
    "plot_beta(inversion_results, ax=ax[0])\n",
    "plot_misfit_and_regularization(inversion_results, ax=ax[1])\n",
    "plot_regularization_components(inversion_results, ax=ax[2])\n",
    "ax[2].set_xlabel(\"iteration\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_tikhonov_curve(inversion_results)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.3  recovered model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_model(\n",
    "    model=model_recovered, ax=None, clim=np.exp(np.r_[np.min(model_true), np.max(model_true)]), \n",
    "    colorbar=True, zlim=np.r_[-400, 0]\n",
    "):\n",
    "    \"\"\"\n",
    "    A plotting function for our recovered model \n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        fig, ax = plt.subplots(1, 1)\n",
    "    \n",
    "    out = mesh.plotImage(\n",
    "        mapping * model, pcolorOpts={'norm':LogNorm(), 'cmap':'Spectral'}, ax=ax,\n",
    "        clim=clim\n",
    "    )\n",
    "\n",
    "    ax.set_xlim(core_domain_x)\n",
    "    ax.set_ylim(zlim)\n",
    "    ax.set_ylabel('Elevation (m)')\n",
    "    ax.set_xlabel('Easting (m)')\n",
    "    ax.set_aspect(1.5)  # some vertical exxageration\n",
    "    \n",
    "    if colorbar is True:\n",
    "        cb = plt.colorbar(out[0], fraction=0.05, orientation='horizontal', ax=ax, pad=0.25)\n",
    "        cb.set_label(\"Resistivity ($\\Omega$m)\")\n",
    "    \n",
    "    return ax"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.4 Explore the results as a function of iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_results_at_iteration(iteration=0):\n",
    "    \"\"\"\n",
    "    A function to plot inversion results as a function of iteration \n",
    "    \"\"\"\n",
    "    fig = plt.figure(figsize=(12, 8))\n",
    "    spec = gridspec.GridSpec(ncols=6, nrows=2, figure=fig)\n",
    "\n",
    "    ax_tikhonov = fig.add_subplot(spec[:, :2])\n",
    "    ax_misfit = fig.add_subplot(spec[0, 2:])\n",
    "    ax_model = fig.add_subplot(spec[1, 2:])\n",
    "\n",
    "    plot_tikhonov_curve(inversion_results, ax=ax_tikhonov)\n",
    "    ax_tikhonov.plot(\n",
    "        inversion_results[\"phi_m\"][iteration], inversion_results[\"phi_d\"][iteration], \n",
    "        'ks', ms=10\n",
    "    )\n",
    "    ax_tikhonov.set_title(f\"iteration {iteration}\")\n",
    "    \n",
    "    plot_normalized_misfit(iteration=iteration, ax=ax_misfit)\n",
    "\n",
    "    plot_model(inversion_results[\"model\"][iteration], ax=ax_model, )\n",
    "    # make the tikhonov plot square\n",
    "    ax_tikhonov.set_aspect(1./ax_tikhonov.get_data_ratio())\n",
    "\n",
    "    plt.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inversion_results_app = ipywidgets.interact(\n",
    "    plot_results_at_iteration,\n",
    "    iteration = ipywidgets.IntSlider(min=0, max=inversion_results[\"iteration\"][-1]-1, value=0)\n",
    ")\n",
    "inversion_results_app"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 6.5 Compare with the True model "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(2, 1, figsize=(10, 8))\n",
    "clim = np.exp(np.r_[model_true.min(), model_true.max()])\n",
    "plot_model(model_recovered, clim=clim,  ax=ax[0])\n",
    "plot_model(model_true, clim=clim,  ax=ax[1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Homework ✏️\n",
    "\n",
    "What happens if you introduce a layer above the blocks? \n",
    "- Define a layer from z=-50 to z=-100. \n",
    "- Start with a resistive layer. \n",
    "    - How does this change our ability to resolve the blocks? \n",
    "- Now try with a conductive layer.\n",
    "- Get creative! Build a model that interests you\n",
    "\n",
    "Send us your images and discussion on slack!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}