{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Searches=Nautilus\n",
        "=======================\n",
        "\n",
        "This example illustrates how to use the nested sampling algorithm Nautilus.\n",
        "\n",
        "Information about Nautilus can be found at the following links:\n",
        "\n",
        " - https://nautilus-sampler.readthedocs.io/en/stable/index.html\n",
        " - https://github.com/johannesulf/nautilus"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "\n",
        "%matplotlib inline\n",
        "from pyprojroot import here\n",
        "workspace_path = str(here())\n",
        "%cd $workspace_path\n",
        "print(f\"Working Directory has been set to `{workspace_path}`\")\n",
        "\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "from os import path\n",
        "\n",
        "import autofit as af"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "__Data__\n",
        "\n",
        "This example fits a single 1D Gaussian, we therefore load and plot data containing one Gaussian."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "dataset_path = path.join(\"dataset\", \"example_1d\", \"gaussian_x1\")\n",
        "data = af.util.numpy_array_from_json(file_path=path.join(dataset_path, \"data.json\"))\n",
        "noise_map = af.util.numpy_array_from_json(\n",
        "    file_path=path.join(dataset_path, \"noise_map.json\")\n",
        ")\n",
        "\n",
        "plt.errorbar(\n",
        "    x=range(data.shape[0]),\n",
        "    y=data,\n",
        "    yerr=noise_map,\n",
        "    linestyle=\"\",\n",
        "    color=\"k\",\n",
        "    ecolor=\"k\",\n",
        "    elinewidth=1,\n",
        "    capsize=2,\n",
        ")\n",
        "plt.show()\n",
        "plt.close()"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "__Model + Analysis__\n",
        "\n",
        "We create the model and analysis, which in this example is a single `Gaussian` and therefore has dimensionality N=3."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "model = af.Model(af.ex.Gaussian)\n",
        "\n",
        "model.centre = af.UniformPrior(lower_limit=0.0, upper_limit=100.0)\n",
        "model.normalization = af.LogUniformPrior(lower_limit=1e-2, upper_limit=1e2)\n",
        "model.sigma = af.UniformPrior(lower_limit=0.0, upper_limit=30.0)\n",
        "\n",
        "analysis = af.ex.Analysis(data=data, noise_map=noise_map)"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "__Search__\n",
        "\n",
        "We now create and run the `Nautilus` object which acts as our non-linear search. \n",
        "\n",
        "We manually specify all of the Nautilus settings, descriptions of which are provided at the following webpage:\n",
        "\n",
        "https://github.com/johannesulf/nautilus"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "search = af.Nautilus(\n",
        "    path_prefix=path.join(\"searches\"),\n",
        "    name=\"Nautilus\",\n",
        "    number_of_cores=4,\n",
        "    n_live=100,  # Number of so-called live points. New bounds are constructed so that they encompass the live points.\n",
        "    n_update=None,  # The maximum number of additions to the live set before a new bound is created\n",
        "    enlarge_per_dim=1.1,  # Along each dimension, outer ellipsoidal bounds are enlarged by this factor.\n",
        "    n_points_min=None,  # The minimum number of points each ellipsoid should have. Effectively, ellipsoids with less than twice that number will not be split further.\n",
        "    split_threshold=100,  # Threshold used for splitting the multi-ellipsoidal bound used for sampling.\n",
        "    n_networks=4,  # Number of networks used in the estimator.\n",
        "    n_batch=100,  # Number of likelihood evaluations that are performed at each step. If likelihood evaluations are parallelized, should be multiple of the number of parallel processes.\n",
        "    n_like_new_bound=None,  # The maximum number of likelihood calls before a new bounds is created. If None, use 10 times n_live.\n",
        "    vectorized=False,  # If True, the likelihood function can receive multiple input sets at once.\n",
        "    seed=None,  # Seed for random number generation used for reproducible results accross different runs.\n",
        "    f_live=0.01,  # Maximum fraction of the evidence contained in the live set before building the initial shells terminates.\n",
        "    n_shell=1,  # Minimum number of points in each shell. The algorithm will sample from the shells until this is reached. Default is 1.\n",
        "    n_eff=500,  # Minimum effective sample size. The algorithm will sample from the shells until this is reached. Default is 10000.\n",
        "    discard_exploration=False,  # Whether to discard points drawn in the exploration phase. This is required for a fully unbiased posterior and evidence estimate.\n",
        "    verbose=True,  # Whether to print information about the run.\n",
        "    n_like_max=np.inf,  # Maximum number of likelihood evaluations. Regardless of progress, the sampler will stop if this value is reached. Default is infinity.\n",
        ")\n",
        "\n",
        "result = search.fit(model=model, analysis=analysis)"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "__Result__\n",
        "\n",
        "The result object returned by the fit provides information on the results of the non-linear search. Lets use it to\n",
        "compare the maximum log likelihood `Gaussian` to the data."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "model_data = result.max_log_likelihood_instance.model_data_from(\n",
        "    xvalues=np.arange(data.shape[0])\n",
        ")\n",
        "\n",
        "plt.errorbar(\n",
        "    x=range(data.shape[0]),\n",
        "    y=data,\n",
        "    yerr=noise_map,\n",
        "    linestyle=\"\",\n",
        "    color=\"k\",\n",
        "    ecolor=\"k\",\n",
        "    elinewidth=1,\n",
        "    capsize=2,\n",
        ")\n",
        "plt.plot(range(data.shape[0]), model_data, color=\"r\")\n",
        "plt.title(\"Nautilus model fit to 1D Gaussian dataset.\")\n",
        "plt.xlabel(\"x values of profile\")\n",
        "plt.ylabel(\"Profile normalization\")\n",
        "plt.show()\n",
        "plt.close()"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "__Search Internal__\n",
        "\n",
        "The result also contains the internal representation of the non-linear search.\n",
        "\n",
        "The internal representation of the non-linear search ensures that all sampling info is available in its native form.\n",
        "This can be passed to functions which take it as input, for example if the sampling package has bespoke visualization \n",
        "functions.\n",
        "\n",
        "For `DynestyStatic`, this is an instance of the `Sampler` object (`from nautilus import Sampler`)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "search_internal = result.search_internal\n",
        "\n",
        "print(search_internal)"
      ],
      "outputs": [],
      "execution_count": null
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The internal search is by default not saved to hard-disk, because it can often take up quite a lot of hard-disk space\n",
        "(significantly more than standard output files).\n",
        "\n",
        "This means that the search internal will only be available the first time you run the search. If you rerun the code \n",
        "and the search is bypassed because the results already exist on hard-disk, the search internal will not be available.\n",
        "\n",
        "If you are frequently using the search internal you can have it saved to hard-disk by changing the `search_internal`\n",
        "setting in `output.yaml` to `True`. The result will then have the search internal available as an attribute, \n",
        "irrespective of whether the search is re-run or not."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [],
      "outputs": [],
      "execution_count": null
    }
  ],
  "metadata": {
    "anaconda-cloud": {},
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.1"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 4
}