{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Tutorial: Alternative Searches\n", "==============================\n", "\n", "Up to now, we've always used the non-linear search Nautilus and not considered the input parameters that control its\n", "sampling. In this tutorial, we'll consider how we can change these setting to balance finding the global maxima\n", "solution with fast run time.\n", "\n", "We will also discuss other types of non-linear searches, such as MCMC and optimizers, which we can use to perform lens\n", "modeling. So far, we have no found any of these alternatives to give anywhere near as robust and efficient results as\n", "Nautilus, and we recommend users use Nautilus unless they are particularly interested in investigating different\n", "model-fitting techniques." ] }, { "cell_type": "code", "metadata": {}, "source": [ "%matplotlib inline\n", "from pyprojroot import here\n", "workspace_path = str(here())\n", "%cd $workspace_path\n", "print(f\"Working Directory has been set to `{workspace_path}`\")\n", "\n", "from os import path\n", "import autolens as al\n", "import autolens.plot as aplt\n", "import autofit as af" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": {}, "source": [ "we'll use new strong lensing data, where:\n", "\n", " - The lens galaxy's light is an `Sersic`.\n", " - The lens galaxy's total mass distribution is an `Isothermal` and `ExternalShear`.\n", " - The source galaxy's light is an `Sersic`." ] }, { "cell_type": "code", "metadata": {}, "source": [ "dataset_name = \"lens_sersic\"\n", "dataset_path = path.join(\"dataset\", \"imaging\", dataset_name)\n", "\n", "dataset = al.Imaging.from_fits(\n", " data_path=path.join(dataset_path, \"data.fits\"),\n", " noise_map_path=path.join(dataset_path, \"noise_map.fits\"),\n", " psf_path=path.join(dataset_path, \"psf.fits\"),\n", " pixel_scales=0.1,\n", ")" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": {}, "source": [ "we'll create and use a smaller 2.0\" `Mask2D` again." ] }, { "cell_type": "code", "metadata": {}, "source": [ "mask = al.Mask2D.circular(\n", " shape_native=dataset.shape_native, pixel_scales=dataset.pixel_scales, radius=2.6\n", ")\n", "\n", "dataset = dataset.apply_mask(mask=mask)\n", "\n", "dataset_plotter = aplt.ImagingPlotter(\n", " dataset=dataset, visuals_2d=aplt.Visuals2D(mask=mask)\n", ")\n", "dataset_plotter.subplot_dataset()" ], "outputs": [], "execution_count": null }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Nested Sampling__\n", "\n", "Lets first perform the model-fit using Nautilus, but look at different parameters that control how long it takes to run. \n", "We'll therefore discuss in a bit more detail how Nautilus works, but still keep this description conceptually simple \n", "and avoid technical terms and jargon. For a complete description of Nautilus you should check out the Nautilus \n", "publication `https://arxiv.org/abs/2306.16923`.\n", "\n", "nlive:\n", "\n", "Nautilus is a `nested sampling` algorithm. As we described in chapter 2, it throws down a set of `live points` in \n", "parameter space, where each live point corresponds to a lens model with a given set of parameters. These points are\n", "initially distributed according to our priors, hence why tuning our priors allows us to sample parameter space faster.\n", " \n", "The number of live points is set by the parameter `n_live`. 
"More points provide a more thorough sampling of \n", "parameter space, increasing the probability that we locate the global maximum solution. Therefore, if you think your \n", "model-fit has gone to a local maximum, you should try increasing `n_live`. The downside of this is Nautilus will \n", "take longer to sample parameter space and converge on a solution. Ideally, we will use as few live points as possible \n", "to locate the global maximum as quickly as possible.\n", "\n", "f_live:\n", "\n", "A nested sampling algorithm estimates the *Bayesian Evidence* of the model-fit, which is a quantity the non-linear \n", "search algorithms we introduce later do not estimate. The Bayesian evidence quantifies how well the lens model as a whole fits\n", "the data, following a principle called Occam's Razor (`https://simple.wikipedia.org/wiki/Occam%27s_razor`). This \n", "penalizes models for being more complex (e.g. more parameters) and requires that their additional complexity improve \n", "their overall fit to the data compared to a simpler model. By comparing the Bayesian evidence of \n", "different models, one can objectively choose the lens model that best fits the data.\n", "\n", "A nested sampling algorithm stops sampling when it estimates that continuing sampling will not increase the Bayesian \n", "evidence (called the `log_evidence`) by more than `f_live`. As Nautilus progresses and converges on the\n", "solution, the rate of increase of the estimated Bayesian evidence slows down. Therefore, a higher `f_live` \n", "means Nautilus terminates sooner.\n", " \n", "A high `f_live` will make the errors estimated on every parameter unreliable and its value must be kept \n", "below 0.8 for reliable error estimates. However, when chaining searches, we typically *do not care* about the errors \n", "in the first search, therefore setting a high `f_live` can be an effective means to make Nautilus converge\n", "faster (we'll estimate reliable errors in the second search, when the `f_live` is 0.8 or less). \n", "\n", "Let's perform two fits, where:\n", "\n", " - One has many live points, causing the non-linear search to take a longer time to run.\n", " \n", " - One has few live points, causing the non-linear search to converge and end quicker." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "model = af.Collection(\n", " galaxies=af.Collection(\n", " lens=af.Model(\n", " al.Galaxy, redshift=0.5, bulge=al.lp.Sersic, mass=al.mp.Isothermal\n", " ),\n", " source=af.Model(al.Galaxy, redshift=1.0, bulge=al.lp.Sersic),\n", " ),\n", ")\n", "\n", "search = af.Nautilus(\n", " path_prefix=path.join(\"howtolens\", \"chapter_optional\"),\n", " name=\"tutorial_searches_slow\",\n", " unique_tag=dataset_name,\n", " n_live=400,\n", ")\n", "\n", "analysis = al.AnalysisImaging(dataset=dataset)\n", "\n", "print(\n", " \"The non-linear search has begun running - check out the workspace/output\"\n", " \" folder for live output of the results, images and lens model.\"\n", " \" This Jupyter notebook cell will progress once the search has completed - this could take some time!\"\n", ")\n", "\n", "result_slow = search.fit(model=model, analysis=analysis)" ], "outputs": [], "execution_count": null },
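{ "cell_type": "markdown", "metadata": {}, "source": [ "The Bayesian evidence discussed above is stored in the result's samples once the search has finished. The cell below is a\n", "minimal sketch of inspecting it; it assumes the nested sampling samples expose a `log_evidence` attribute (check the\n", "PyAutoFit `Samples` API if your version names this differently)." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "# The log evidence is the quantity Nautilus uses for its stopping criterion and the quantity\n", "# used to compare different lens models via Occam's Razor.\n", "# NOTE: the `log_evidence` attribute is an assumption about the samples API.\n", "print(\"Log Evidence of the slow Nautilus fit:\")\n", "print(result_slow.samples.log_evidence)" ], "outputs": [], "execution_count": null },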
{ "cell_type": "markdown", "metadata": {}, "source": [ "Let's check that we get a good model and fit to the data." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "fit_plotter = aplt.FitImagingPlotter(fit=result_slow.max_log_likelihood_fit)\n", "fit_plotter.subplot_fit()" ], "outputs": [], "execution_count": null },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We can use the result to tell us how many iterations Nautilus took to converge on the solution." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "print(\"Total Nautilus Iterations (If you skip running the search, this is ~ 500000):\")\n", "print(result_slow.samples.total_samples)" ], "outputs": [], "execution_count": null },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now let's run the search with fast settings, so we can compare the total number of iterations required." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "search = af.Nautilus(\n", " path_prefix=path.join(\"howtolens\", \"chapter_optional\"),\n", " name=\"tutorial_searches_fast\",\n", " unique_tag=dataset_name,\n", " n_live=75,\n", ")\n", "\n", "print(\n", " \"The non-linear search has begun running - check out the workspace/output\"\n", " \" folder for live output of the results, images and lens model.\"\n", " \" This Jupyter notebook cell will progress once the search has completed - this could take some time!\"\n", ")\n", "\n", "result_fast = search.fit(model=model, analysis=analysis)\n", "\n", "print(\"The search has finished running - you may now continue the notebook.\")" ], "outputs": [], "execution_count": null },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Let's check that this search, despite its faster sampling settings, still gives us the global maximum solution." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "fit_plotter = aplt.FitImagingPlotter(fit=result_fast.max_log_likelihood_fit)\n", "fit_plotter.subplot_fit()" ], "outputs": [], "execution_count": null },
{ "cell_type": "markdown", "metadata": {}, "source": [ "And now let's confirm it uses significantly fewer iterations." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "# The slow search performs ~500000 iterations if run in full.\n", "print(\"Total Nautilus Iterations:\")\n", "print(\"Slow settings: \", result_slow.samples.total_samples)\n", "print(\"Fast settings: \", result_fast.samples.total_samples)" ], "outputs": [], "execution_count": null },
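{ "cell_type": "markdown", "metadata": {}, "source": [ "When we chain searches, the main quantity we carry forward from a fit is its maximum log likelihood model. The cell\n", "below is a minimal sketch of extracting it from the fast result; it assumes the result exposes a\n", "`max_log_likelihood_instance` attribute whose structure mirrors the model composed above (adjust the attribute paths\n", "if your model composition differs)." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "# Extract the maximum log likelihood model instance from the fast search's result.\n", "# NOTE: `max_log_likelihood_instance` and the `galaxies.lens.mass` path are assumptions that\n", "# mirror the model composition used above.\n", "instance = result_fast.max_log_likelihood_instance\n", "\n", "print(\"Maximum Log Likelihood Einstein Radius:\")\n", "print(instance.galaxies.lens.mass.einstein_radius)" ], "outputs": [], "execution_count": null },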
{ "cell_type": "markdown", "metadata": {}, "source": [ "__Optimizers__\n", "\n", "Nested sampling algorithms like Nautilus provide the errors on all of the model parameters, by fully mapping out all \n", "of the high likelihood regions of parameter space. This provides knowledge on the complete *range* of models that do \n", "and do not provide high likelihood fits to the data, but takes many extra iterations to perform. If we require precise \n", "error estimates (perhaps this is our final lens model fit before we publish the results in a paper), these extra\n", "iterations are acceptable. \n", "\n", "However, we often don't care about the errors. For example, in the previous tutorial when chaining searches, the only \n", "result we used from the fit performed in the first search was the maximum log likelihood model, omitting the errors\n", "entirely! It seems wasteful to use a nested sampling algorithm like Nautilus to map out the entirety of parameter\n", "space when we don't use this information! \n", "\n", "There is a class of non-linear searches called `optimizers`, which seek to optimize just one thing, the log \n", "likelihood. They want to find the model that maximizes the log likelihood, with no regard for the errors, thus not \n", "wasting time mapping out in intricate detail every facet of parameter space. Let's see how much faster we can find a \n", "good fit to the lens data using an optimizer.\n", "\n", "We'll use the `Particle Swarm Optimizer` PySwarms. Conceptually, this works quite similarly to Nautilus: it has a set of \n", "points in parameter space (called `particles`) and it uses their likelihoods to determine where it thinks the higher\n", "likelihood regions of parameter space are. \n", "\n", "Unlike Nautilus, this algorithm requires us to specify how many iterations it should perform to find the global \n", "maximum solution. Here, an iteration is the number of samples performed by every particle, so the total number of\n", "iterations is n_particles * iters. Let's try a total of 50000 iterations, a factor 10 less than our Nautilus runs above. \n", "\n", "In our experience, PySwarms is ineffective at initializing a lens model and therefore needs the initial swarm of\n", "particles to surround the highest likelihood lens models. We set this starting point up below by manually inputting \n", "`GaussianPriors` on every parameter, where the centre of these priors is near the true values of the simulated lens data.\n", "\n", "Given this need for a robust starting point, PySwarms is only suited to model-fits where we have this information. It may\n", "therefore be useful when performing lens modeling search chaining (see HowToLens chapter 3). However, even in such\n", "circumstances, we have found that it is often unreliable and often infers a local maximum." ] },
{ "cell_type": "code", "metadata": {}, "source": [ "lens_bulge = af.Model(al.lp.Sersic)\n", "lens_bulge.centre.centre_0 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "lens_bulge.centre.centre_1 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "lens_bulge.ell_comps.ell_comps_0 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "lens_bulge.ell_comps.ell_comps_1 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "lens_bulge.intensity = af.GaussianPrior(mean=1.0, sigma=0.3)\n", "lens_bulge.effective_radius = af.GaussianPrior(mean=0.8, sigma=0.2)\n", "lens_bulge.sersic_index = af.GaussianPrior(mean=4.0, sigma=1.0)\n", "\n", "mass = af.Model(al.mp.Isothermal)\n", "mass.centre.centre_0 = af.GaussianPrior(mean=0.0, sigma=0.1)\n", "mass.centre.centre_1 = af.GaussianPrior(mean=0.0, sigma=0.1)\n", "mass.ell_comps.ell_comps_0 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "mass.ell_comps.ell_comps_1 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "mass.einstein_radius = af.GaussianPrior(mean=1.4, sigma=0.4)\n", "\n", "shear = af.Model(al.mp.ExternalShear)\n", "shear.gamma_1 = af.GaussianPrior(mean=0.0, sigma=0.1)\n", "shear.gamma_2 = af.GaussianPrior(mean=0.0, sigma=0.1)\n", "\n", "bulge = af.Model(al.lp.Sersic)\n", "bulge.centre.centre_0 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "bulge.centre.centre_1 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "bulge.ell_comps.ell_comps_0 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "bulge.ell_comps.ell_comps_1 = af.GaussianPrior(mean=0.0, sigma=0.3)\n", "bulge.intensity = af.GaussianPrior(mean=0.3, sigma=0.3)\n", "bulge.effective_radius = af.GaussianPrior(mean=0.2, sigma=0.2)\n", "bulge.sersic_index = af.GaussianPrior(mean=1.0, sigma=1.0)\n", "\n", "# Include the lens light (`lens_bulge`) in the lens galaxy so the `GaussianPrior`s set on it above are used.\n", "lens = af.Model(al.Galaxy, redshift=0.5, bulge=lens_bulge, mass=mass, shear=shear)\n", "source = af.Model(al.Galaxy, redshift=1.0, bulge=bulge)\n", "\n", "model = af.Collection(galaxies=af.Collection(lens=lens, source=source))\n",
"\n", "# The total number of iterations is n_particles * iters = 50 * 1000 = 50000.\n", "search = af.PySwarmsLocal(\n", " path_prefix=path.join(\"howtolens\", \"chapter_optional\"),\n", " name=\"tutorial_searches_pso\",\n", " unique_tag=dataset_name,\n", " n_particles=50,\n", " iters=1000,\n", ")\n", "\n", "print(\n", " \"The non-linear search has begun running - check out the workspace/output\"\n", " \" folder for live output of the results, images and lens model.\"\n", " \" This Jupyter notebook cell will progress once the search has completed - this could take some time!\"\n", ")\n", "\n", "result_pso = search.fit(model=model, analysis=analysis)\n", "\n", "print(\"PySwarms has finished running - you may now continue the notebook.\")\n", "\n", "fit_plotter = aplt.FitImagingPlotter(fit=result_pso.max_log_likelihood_fit)\n", "fit_plotter.subplot_fit()" ], "outputs": [], "execution_count": null },
{ "cell_type": "markdown", "metadata": {}, "source": [ "In our experience, the parameter spaces fitted by lens models are too complex for `PySwarms` to be used without a lot\n", "of user attention and careful setting up of the initialization priors, as shown above.\n", "\n", "__MCMC__\n", "\n", "For users familiar with Markov Chain Monte Carlo (MCMC) non-linear samplers, PyAutoFit supports the non-linear\n", "searches `Emcee` and `Zeus`. Like PySwarms, these also need a good starting point, and are generally less effective at \n", "lens modeling than Nautilus. \n", "\n", "I've included example runs of Emcee and Zeus below, where the model is set up using `UniformPriors` to give\n", "the starting point of the MCMC walkers. " ] },
{ "cell_type": "code", "metadata": {}, "source": [ "lens_bulge = af.Model(al.lp.Sersic)\n", "lens_bulge.centre.centre_0 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "lens_bulge.centre.centre_1 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "lens_bulge.ell_comps.ell_comps_0 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "lens_bulge.ell_comps.ell_comps_1 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "lens_bulge.intensity = af.UniformPrior(lower_limit=0.5, upper_limit=1.5)\n", "lens_bulge.effective_radius = af.UniformPrior(lower_limit=0.2, upper_limit=1.6)\n", "lens_bulge.sersic_index = af.UniformPrior(lower_limit=3.0, upper_limit=5.0)\n", "\n", "\n", "mass = af.Model(al.mp.Isothermal)\n", "mass.centre.centre_0 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "mass.centre.centre_1 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "mass.ell_comps.ell_comps_0 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "mass.ell_comps.ell_comps_1 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "mass.einstein_radius = af.UniformPrior(lower_limit=1.0, upper_limit=2.0)\n", "\n", "shear = af.Model(al.mp.ExternalShear)\n", "shear.gamma_1 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "shear.gamma_2 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "\n", "bulge = af.Model(al.lp.Sersic)\n", "bulge.centre.centre_0 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "bulge.centre.centre_1 = af.UniformPrior(lower_limit=-0.1, upper_limit=0.1)\n", "bulge.ell_comps.ell_comps_0 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "bulge.ell_comps.ell_comps_1 = af.UniformPrior(lower_limit=-0.3, upper_limit=0.3)\n", "bulge.intensity = af.UniformPrior(lower_limit=0.1, upper_limit=0.5)\n", "bulge.effective_radius = af.UniformPrior(lower_limit=0.0, upper_limit=0.4)\n", "bulge.sersic_index = af.UniformPrior(lower_limit=0.5, upper_limit=2.0)\n", "\n", "# Include the lens light (`lens_bulge`) in the lens galaxy so the `UniformPrior`s set on it above are used.\n", "lens = af.Model(al.Galaxy, redshift=0.5, bulge=lens_bulge, mass=mass, shear=shear)\n",
"source = af.Model(al.Galaxy, redshift=1.0, bulge=bulge)\n", "\n", "model = af.Collection(galaxies=af.Collection(lens=lens, source=source))\n", "\n", "search = af.Zeus(\n", " path_prefix=path.join(\"howtolens\", \"chapter_optional\"),\n", " name=\"tutorial_searches_zeus\",\n", " unique_tag=dataset_name,\n", " nwalkers=50,\n", " nsteps=1000,\n", ")\n", "\n", "print(\n", " \"Zeus has begun running - check out the workspace/output\"\n", " \" folder for live output of the results, images and lens model.\"\n", " \" This Jupyter notebook cell will progress once the search has completed - this could take some time!\"\n", ")\n", "\n", "result_zeus = search.fit(model=model, analysis=analysis)\n", "\n", "print(\"Zeus has finished running - you may now continue the notebook.\")\n", "\n", "fit_plotter = aplt.FitImagingPlotter(fit=result_zeus.max_log_likelihood_fit)\n", "fit_plotter.subplot_fit()\n", "\n", "\n", "search = af.Emcee(\n", " path_prefix=path.join(\"howtolens\", \"chapter_optional\"),\n", " name=\"tutorial_searches_emcee\",\n", " unique_tag=dataset_name,\n", " nwalkers=50,\n", " nsteps=1000,\n", ")\n", "\n", "print(\n", " \"The non-linear search has begun running - check out the workspace/output\"\n", " \" folder for live output of the results, images and lens model.\"\n", " \" This Jupyter notebook cell will progress once the search has completed - this could take some time!\"\n", ")\n", "\n", "result_emcee = search.fit(model=model, analysis=analysis)\n", "\n", "print(\"The search has finished running - you may now continue the notebook.\")\n", "\n", "fit_plotter = aplt.FitImagingPlotter(fit=result_emcee.max_log_likelihood_fit)\n", "fit_plotter.subplot_fit()\n" ], "outputs": [], "execution_count": null } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 4 }