{ "cells": [ { "cell_type": "markdown", "id": "08669d2a-55a6-46d8-b082-a945c9032e91", "metadata": {}, "source": [ "# Detailed look at `Optimizer`" ] }, { "cell_type": "markdown", "id": "ab6ac81e-8112-4e1c-bc4e-3a1d96b4abfd", "metadata": {}, "source": [ "In the [Gradient-based optimization][1] tutorial, Mitsuba's `Optimizer` class was used to build an optimization loop.\n", "In this tutorial, we will study the API of optimizers in more detail.\n", "It is designed to be convenient to use when directly optimizing parameters in a Mitsuba scene, but also flexible enough for more flexible use-cases (e.g. chaining together a neural network and differentiable rendering step in the same computation graph).\n", "\n", "Let's start by setting an AD-compatible variant.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/inverse_rendering/gradient_based_opt.html" ] }, { "cell_type": "code", "execution_count": 2, "id": "a4d17e00-6712-4252-950c-23d5baaae109", "metadata": {}, "outputs": [], "source": [ "import mitsuba as mi\n", "import drjit as dr \n", "\n", "mi.set_variant('llvm_ad_rgb')" ] }, { "cell_type": "markdown", "id": "e49b7142-7e1b-4401-a9b4-d9ddf533f355", "metadata": {}, "source": [ "## The basics\n", "\n", "To perform [gradient-based optimization][1], Mitsuba ships with standard optimizers including *Stochastic Gradient Descent* ([SGD][3]) with and without momentum, as well as [Adam][4]. Those both inherit from the [Optimizer][2] base class and can be found in the `mitsuba.ad` submodule.\n", "\n", "The `Optimizer` class behaves like a Python `dict` with some extra logic and methods. This how-to guide will take you through its API and highlight the best-practices and pittfalls related to this class.\n", "\n", "Let's first construct a simple `SGD` optimizer with a learning rate of `0.25`. \n", "\n", "[1]: https://en.wikipedia.org/wiki/Gradient_descent\n", "[2]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer\n", "[3]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.SGD\n", "[4]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Adam" ] }, { "cell_type": "code", "execution_count": 3, "id": "7663cff5-2ff4-4025-8fc0-3df3371fcd1f", "metadata": {}, "outputs": [], "source": [ "opt = mi.ad.SGD(lr=0.25)" ] }, { "cell_type": "markdown", "id": "7a456654-3249-4947-8daa-701b29bff62d", "metadata": {}, "source": [ "We can now specify a variable to be optimized. The `Optimizer` will automatically enable gradient computation on the stored variable as it is necessary for further computation to produce any gradients." 
] }, { "cell_type": "code", "execution_count": 4, "id": "8cf564b5-7073-4f48-90e6-a822b0f3e3a1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SGD[\n", " variables = ['x'],\n", " lr = {'default': 0.25},\n", " momentum = 0\n", "]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "opt['x'] = mi.Float(1.0)\n", "opt" ] }, { "cell_type": "markdown", "id": "5b03a97c", "metadata": {}, "source": [ "It is also possible to directly pass a set of variables to be optimized in the constructor of the `Optimizer`" ] }, { "cell_type": "code", "execution_count": 5, "id": "dc36d2a7", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SGD[\n", " variables = ['x', 'y'],\n", " lr = {'default': 0.25},\n", " momentum = 0\n", "]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "opt = mi.ad.SGD(lr=0.25, params={'x': mi.Float(1.0), 'y': mi.Float(2.0)})\n", "opt" ] }, { "cell_type": "markdown", "id": "b817c5e2-c47d-489d-ba9a-e7744ca2f691", "metadata": {}, "source": [ "It also provides a similar API to perform basic dictionary manipulations." ] }, { "cell_type": "code", "execution_count": 6, "id": "e3ed2d78-2433-4823-b903-0d40c351fe96", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x: [1.0]\n", "y: [2.0]\n" ] } ], "source": [ "for k, v in opt.items():\n", " print(f\"{k}: {v}\")" ] }, { "cell_type": "markdown", "id": "1c489cf0-19d1-4eb2-b93a-f5551b3184ca", "metadata": {}, "source": [ "
\n", " \n", "⚠ It is important to note that a copy of the variable is made when assigned to the Optimizer via the `__setitem__` method. For instance in the following code, the original variable won't be attached to the AD graph, and its value will remain unchanged.\n", " \n", "
" ] }, { "cell_type": "code", "execution_count": 7, "id": "307a2534-f7a6-4231-a374-bb45df11a103", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Original: [2.0], grad_enabled=False.\n", "Optimizer: [3.0], grad_enabled=True.\n" ] } ], "source": [ "y = mi.Float(2.0)\n", "opt['y'] = y # A copy is made here\n", "opt['y'] += 1.0\n", "\n", "print(f\"Original: {y}, grad_enabled={dr.grad_enabled(y)}.\")\n", "print(f\"Optimizer: {opt['y']}, grad_enabled={dr.grad_enabled(opt['y'])}.\")" ] }, { "cell_type": "markdown", "id": "fd87d9fb-fc3e-4002-9b8c-b9c6503548b0", "metadata": {}, "source": [ "It is therefore crucial to use the variable held by the optimizer to perform the differentiable computation in order to produce the proper gradients. Here is a simple example where `x` and `y` are used in some calculation for which we then request the gradients to be backpropagated. We can validate that the gradients are adequately propagated to the optimizer variables." ] }, { "cell_type": "code", "execution_count": 8, "id": "05bb71c6-4f3f-410c-af68-5d4a71df40e4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x grad=[1.0]\n", "y grad=[2.0]\n" ] } ], "source": [ "z = opt['x'] + 2.0 * opt['y']\n", "dr.backward(z)\n", "\n", "print(f\"x grad={dr.grad(opt['x'])}\")\n", "print(f\"y grad={dr.grad(opt['y'])}\")" ] }, { "cell_type": "markdown", "id": "d72b8b1c-0ae8-447c-8f89-63ed1fa33cc5", "metadata": {}, "source": [ "During the optimization, the role of the optimizer will be to take a gradient step according to its update rule. In the case of a simple SGD optimizer with no momentum, the update rule is:\n", "$$\n", "x_{i+1} = x_i - \\texttt{grad}(x_i) \\times \\texttt{lr} \n", "$$\n", "\n", "The `Optimizer` method [step()][1] will apply its update rule to all the variables.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer.step" ] }, { "cell_type": "code", "execution_count": 9, "id": "ed3a06b4-87a7-4a5e-a4a5-73835f27e9c6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Before the gradient step: x=[1.0], y=[3.0]\n", "After the gradient step: x=[0.75], y=[2.5]\n" ] } ], "source": [ "print(f\"Before the gradient step: x={opt['x']}, y={opt['y']}\")\n", "opt.step()\n", "print(f\"After the gradient step: x={opt['x']}, y={opt['y']}\")" ] }, { "cell_type": "markdown", "id": "18b09b3c-6fd0-4780-8b5e-92e50562d111", "metadata": {}, "source": [ "After performing the update rule, the `Optimizer` also resets the gradient values of its variables to `0.0` and ensures gradient computations are still enabled on all its variables. This ensures that everything is ready for the next iteration of the optimization loop." ] }, { "cell_type": "markdown", "id": "feee8867-1f54-41ec-bdbd-bc956a4a9fea", "metadata": { "tags": [] }, "source": [ "## Optimizing scene parameters\n", "\n", "In the context of differentiable rendering, we are often interested in optimizing parameters of a Mitsuba scene, which are exposed via the [`traverse()`][1] mechanism. 
This function returns a [`SceneParameters`][2] object, which exposes a dictionary-like interface.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.traverse\n", "[2]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.SceneParameters" ] }, { "cell_type": "code", "execution_count": 26, "id": "7c47e418-55aa-4de7-9067-a7952e8fed20", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SceneParameters[\n",
"  --------------------------------------------------------------------------------------\n",
"  Name                                Flags   Type         Parent\n",
"  --------------------------------------------------------------------------------------\n",
"  sensor.near_clip                            float        PerspectiveCamera\n",
"  sensor.far_clip                             float        PerspectiveCamera\n",
"  sensor.shutter_open                         float        PerspectiveCamera\n",
"  sensor.shutter_open_time                    float        PerspectiveCamera\n",
"  sensor.x_fov                                float        PerspectiveCamera\n",
"  sensor.to_world                             Transform4f  PerspectiveCamera\n",
"  gray.reflectance.value              ∂       Color3f      SRGBReflectanceSpectrum\n",
"  white.reflectance.value             ∂       Color3f      SRGBReflectanceSpectrum\n",
"  green.reflectance.value             ∂       Color3f      SRGBReflectanceSpectrum\n",
"  red.reflectance.value               ∂       Color3f      SRGBReflectanceSpectrum\n",
"  glass.eta                                   float        SmoothDielectric\n",
"  mirror.eta.value                    ∂, D    Float        UniformSpectrum\n",
"  mirror.k.value                      ∂, D    Float        UniformSpectrum\n",
"  mirror.specular_reflectance.value   ∂       Float        UniformSpectrum\n",
"  light.emitter.radiance.value        ∂       Color3f      SRGBEmitterSpectrum\n",
"  light.vertex_count                          int          OBJMesh\n",
"  light.face_count                            int          OBJMesh\n",
"  light.faces                                 UInt         OBJMesh\n",
"  light.vertex_positions              ∂, D    Float        OBJMesh\n",
"  light.vertex_normals                ∂, D    Float        OBJMesh\n",
"  light.vertex_texcoords              ∂       Float        OBJMesh\n",
"  floor.vertex_count                          int          OBJMesh\n",
"  floor.face_count                            int          OBJMesh\n",
"  floor.faces                                 UInt         OBJMesh\n",
"  floor.vertex_positions              ∂, D    Float        OBJMesh\n",
"  floor.vertex_normals                ∂, D    Float        OBJMesh\n",
"  floor.vertex_texcoords              ∂       Float        OBJMesh\n",
"  ceiling.vertex_count                        int          OBJMesh\n",
"  ceiling.face_count                          int          OBJMesh\n",
"  ceiling.faces                               UInt         OBJMesh\n",
"  ceiling.vertex_positions            ∂, D    Float        OBJMesh\n",
"  ceiling.vertex_normals              ∂, D    Float        OBJMesh\n",
"  ceiling.vertex_texcoords            ∂       Float        OBJMesh\n",
"  back.vertex_count                           int          OBJMesh\n",
"  back.face_count                             int          OBJMesh\n",
"  back.faces                                  UInt         OBJMesh\n",
"  back.vertex_positions               ∂, D    Float        OBJMesh\n",
"  back.vertex_normals                 ∂, D    Float        OBJMesh\n",
"  back.vertex_texcoords               ∂       Float        OBJMesh\n",
"  greenwall.vertex_count                      int          OBJMesh\n",
"  greenwall.face_count                        int          OBJMesh\n",
"  greenwall.faces                             UInt         OBJMesh\n",
"  greenwall.vertex_positions          ∂, D    Float        OBJMesh\n",
"  greenwall.vertex_normals            ∂, D    Float        OBJMesh\n",
"  greenwall.vertex_texcoords          ∂       Float        OBJMesh\n",
"  redwall.vertex_count                        int          OBJMesh\n",
"  redwall.face_count                          int          OBJMesh\n",
"  redwall.faces                               UInt         OBJMesh\n",
"  redwall.vertex_positions            ∂, D    Float        OBJMesh\n",
"  redwall.vertex_normals              ∂, D    Float        OBJMesh\n",
"  redwall.vertex_texcoords            ∂       Float        OBJMesh\n",
"  mirrorsphere.to_world                       Transform4f  Sphere\n",
"  glasssphere.to_world                        Transform4f  Sphere\n",
"]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scene = mi.load_file('../scenes/cbox.xml')\n", "params = mi.traverse(scene)\n", "params" ] }, { "cell_type": "markdown", "id": "dd2aa704-7b87-4794-92ae-1ac33afa5ae9", "metadata": {}, "source": [ "This is very convenient, as it can be passed directly to the `Optimizer` constructor to set all the scene parameters to be optimized. In this case, the `Optimizer` will ignore all variables that are not marked as differentiable (see the ∂ flag in the printout of the `params` object above)." ] }
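, { "cell_type": "markdown", "id": "9f3e21aa", "metadata": {}, "source": [ "As a quick sketch of that behavior (the `keys()` inspection below assumes the dict-style API shown earlier), constructing an optimizer from the full `params` object should only retain the ∂-flagged entries:" ] }, { "cell_type": "code", "execution_count": null, "id": "9f3e21ab", "metadata": {}, "outputs": [], "source": [ "# Sketch: pass the full parameter set to the optimizer; only the entries\n", "# marked as differentiable (∂) are expected to be picked up.\n", "opt = mi.ad.SGD(lr=0.25, params=params)\n", "print(list(opt.keys()))" ] }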
, { "cell_type": "markdown", "id": "a9c2e0d1", "metadata": {}, "source": [ "However, it is sometimes preferable to only optimize a subset of the parameters. This can easily be achieved by filtering out items in the `params` object using the `keep()` method, which accepts either a list of keys or a regular expression." ] }, { "cell_type": "code", "execution_count": 27, "id": "2fa809dc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SceneParameters[\n",
"  ------------------------------------------------------------------------------\n",
"  Name                       Flags   Type     Parent\n",
"  ------------------------------------------------------------------------------\n",
"  gray.reflectance.value     ∂       Color3f  SRGBReflectanceSpectrum\n",
"  white.reflectance.value    ∂       Color3f  SRGBReflectanceSpectrum\n",
"  green.reflectance.value    ∂       Color3f  SRGBReflectanceSpectrum\n",
"  red.reflectance.value      ∂       Color3f  SRGBReflectanceSpectrum\n",
"]" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "params.keep(r'.*\\.reflectance\\.value')\n", "params" ] }, { "cell_type": "markdown", "id": "b3fbccf1", "metadata": {}, "source": [ "We can now construct an `Optimizer` that will optimize those reflectance values:" ] }, { "cell_type": "code", "execution_count": 30, "id": "cd866e0b-f2ce-4a0f-b603-7d015d96dd56", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SGD[\n", "  variables = ['gray.reflectance.value', 'white.reflectance.value', 'green.reflectance.value', 'red.reflectance.value'],\n", "  lr = {'default': 0.25},\n", "  momentum = 0\n", "]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "opt = mi.ad.SGD(lr=0.25, params=params)\n", "opt" ] }, { "cell_type": "markdown", "id": "2364924a-1278-40cd-b9ef-23f987ba7aa3", "metadata": {}, "source": [ "Of course, as done before, it is also possible to start from an empty optimizer and set variables one by one from the `params` object." ] }, { "cell_type": "code", "execution_count": 31, "id": "ef5e310e-4e41-4df4-994b-6070da70a0ca", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SGD[\n", "  variables = ['red.reflectance.value'],\n", "  lr = {'default': 0.25},\n", "  momentum = 0\n", "]" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "opt = mi.ad.SGD(lr=0.25)\n", "opt['red.reflectance.value'] = params['red.reflectance.value']\n", "opt" ] }, { "cell_type": "markdown", "id": "0724513b-74be-4bb2-b6e5-9929c26d9934", "metadata": {}, "source": [ "As explained above, the loaded parameters are copied internally, so any attempt to change their value in the optimizer will not directly be reflected in `params`." ] }, { "cell_type": "code", "execution_count": 32, "id": "d59361d9-a3b0-47b3-a206-8e3d6cb2676f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "params:    [[0.5700680017471313, 0.043013498187065125, 0.04437059909105301]]\n", "optimizer: [[0.2850340008735657, 0.021506749093532562, 0.022185299545526505]]\n" ] } ], "source": [ "opt['red.reflectance.value'] *= 0.5\n", "\n", "print(f\"params:    {params['red.reflectance.value']}\")\n", "print(f\"optimizer: {opt['red.reflectance.value']}\")" ] }, { "cell_type": "markdown", "id": "af6d5335-2da7-4f7b-9ea1-288248103c00", "metadata": {}, "source": [ "In order to propagate those changes to `params` (and to the `Scene` itself), we use the [update()][1] method of `SceneParameters`. 
Internally, this method looks for optimizer variable keys that match entries of `params` and overwrites the corresponding values. It then updates the internal scene state (e.g. it re-builds the BVH when geometry data was modified).\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.SceneParameters.update" ] }, { "cell_type": "code", "execution_count": 33, "id": "9731b3ca-6958-452e-9582-dacfdcae3209", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "params:    [[0.2850340008735657, 0.021506749093532562, 0.022185299545526505]]\n", "optimizer: [[0.2850340008735657, 0.021506749093532562, 0.022185299545526505]]\n" ] } ], "source": [ "params.update(opt);\n", "\n", "print(f\"params:    {params['red.reflectance.value']}\")\n", "print(f\"optimizer: {opt['red.reflectance.value']}\")" ] }, { "cell_type": "markdown", "id": "beeaa2ea-1d14-4d68-8550-459cd96a4ab8", "metadata": {}, "source": [ "## Optimizing latent variables\n", "\n", "In more complex optimization scenarios, scene parameters might be described **as a function** of some other parameters. In such a scenario, we would be interested in optimizing those other parameters instead of the scene parameters directly.\n", "\n", "For example, this would be needed when generating the vertex positions of a mesh using a neural network: we'd want to optimize the weights of the neural network, not the vertex positions themselves. Another example could be a procedurally generated texture, e.g. from a physically-based model that can be tweaked with a few parameters.\n", "\n", "We refer to this type of \"external\" parameter as a latent variable. Many of the design decisions of the [Optimizer][1] class were made to support latent variables.\n", "\n", "For a simpler example, let's consider the case where we aim to optimize the translation vector of a 3D mesh object in our scene. Even from a convexity standpoint, optimizing those three translation values will be much easier than optimizing all the vertex positions simultaneously and hoping for the best.\n", "\n", "Let's fetch the scene parameters again and initialize our optimizer one more time.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer" ] }, { "cell_type": "code", "execution_count": 34, "id": "844dad8b-4e52-4801-8109-2ef10ffa2d9a", "metadata": {}, "outputs": [], "source": [ "params = mi.traverse(scene)\n", "opt = mi.ad.SGD(lr=0.25)" ] }, { "cell_type": "markdown", "id": "f5eba6fc-1c32-4dd4-9d43-b9a0857db6bb", "metadata": {}, "source": [ "We can then append a latent variable to the optimizer, similar to what we did in the first section of this how-to guide." ] }, { "cell_type": "code", "execution_count": 35, "id": "4029d646-5ad9-4799-a8d9-715eca6632d0", "metadata": {}, "outputs": [], "source": [ "opt['translation'] = mi.Vector3f(0, 0, 0)" ] }, { "cell_type": "markdown", "id": "314d7db1-0de7-4aca-b8bd-c67edc2c27f4", "metadata": {}, "source": [ "In this scenario, it is the user's responsibility to propagate the changes of the latent variable to the scene parameters _every time the latent variable is updated_.\n", "To this end, it's helpful to define a dedicated update function.\n", "\n", "Note that the vertex positions of a mesh are stored in a linearized fashion in Mitsuba (`xyzxyzxyzxyz...`).\n", "In order to apply the translation, we first reorder the initial coordinates into 3D points. Once the translation has been applied, we write the new values back into the linearized array before assigning it to the scene parameters. For those operations, we use Dr.Jit's `dr.unravel` and `dr.ravel` helper functions." ] }
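, { "cell_type": "markdown", "id": "c3f1a2b7", "metadata": {}, "source": [ "As a quick illustration, here is a toy round trip on a hand-made buffer (not part of the actual optimization): `dr.unravel` regroups a flat `xyzxyz...` buffer into 3D points, and `dr.ravel` flattens it back." ] }, { "cell_type": "code", "execution_count": null, "id": "c3f1a2b8", "metadata": {}, "outputs": [], "source": [ "# Toy example of the ravel/unravel round trip on a linearized buffer\n", "flat = mi.Float(0, 1, 2, 3, 4, 5)    # two points in xyzxyz layout\n", "pts = dr.unravel(mi.Point3f, flat)   # -> [[0, 1, 2], [3, 4, 5]]\n", "print(dr.ravel(pts))                 # -> back to [0, 1, 2, 3, 4, 5]" ] }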
\n", "Once the translation has been applied, we need to write the new values back into the linearized array before the assignment to the scene parameters. For those operations, we use the `dr.unravel` and `dr.ravel` helper functions of DrJIT." ] }, { "cell_type": "code", "execution_count": 36, "id": "9ad2ba24-9fe4-4e3b-91bb-1479cd0169f1", "metadata": {}, "outputs": [], "source": [ "# Copy or our vertex positions (and convert them to 3D points)\n", "initial_vertex_pos = dr.unravel(mi.Point3f, params['redwall.vertex_positions'])\n", "\n", "# Now we define the update rule\n", "def update_vertex_pos():\n", " # Create the translation transformation\n", " T = mi.Transform4f.translate(opt['translation'])\n", " \n", " # Apply the transformation to the vertex positions\n", " new_vertex_pos = T @ initial_vertex_pos\n", "\n", " # Flatten the vertex position array before assigning it to `params`\n", " params['redwall.vertex_positions'] = dr.ravel(new_vertex_pos)\\\n", " \n", " # Propagate changes through the scene (e.g. rebuild BVH)\n", " params.update()" ] }, { "cell_type": "markdown", "id": "60dbe51e-5c8f-409c-86d8-517356199c19", "metadata": {}, "source": [ "With this function implemented, all we need to do is to make sure to call it after every optimizer step to update the vertex positions accordingly." ] }, { "cell_type": "markdown", "id": "20d0ad96-3f16-4730-b8a0-fa7915058c01", "metadata": {}, "source": [ "
\n", "💭 On top of propagating the changes of values, calling `update_vertex_pos()` also builds the computational graph necessary to the automatic differentiation layer to later compute the gradients of the scene parameters with respect to the latent variable.\n", "
" ] }, { "cell_type": "markdown", "id": "88e835a5-5648-4ae9-8dcb-8ee4457c60f9", "metadata": {}, "source": [ "## Optimizer state\n", "\n", "On top of carrying variables to optimize, the [Optimizer][1] also holds an internal state used in its update rule. For instance, the momentum-based [SGD][2] optimizer tracks the velocity of the previous iteration to apply the momentum in its [step()][3] method. In most cases this state is stored on a per-parameter basis.\n", "\n", "In some cases, it is useful to reset the state of an `Optimizer` (e.g. optimization scheduling that resizes the optimized volume grid). For this, we can use the [reset()][4] method which will zero-initialize the internal state associated with a specific parameter.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer\n", "[2]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.SGD\n", "[3]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer.step\n", "[4]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.ad.Optimizer.reset" ] }, { "cell_type": "code", "execution_count": 37, "id": "9abc58ff-d80b-40de-8b4d-2eac88408853", "metadata": {}, "outputs": [], "source": [ "opt.reset('translation')" ] }, { "cell_type": "markdown", "id": "afec19f1-d7a9-4744-98d4-580c2a31761d", "metadata": {}, "source": [ "Another useful feature of the Mitsuba `Optimizer` is its ability to mask the update of its internal state based on the presence or not of gradients w.r.t. each variables. \n", "\n", "This can be useful in Monte Carlo simulations such as differentiable path tracing, where not all optimized parameters necessarily receive gradients at each iteration. We found that in this situation, updating the state of the optimizer for parameters that did not receive gradients may degrade convergence. The update should rather be discarded instead. When constructing the `Optimizer`, it is possible to specific whether to discard such updates when the gradients are zero using the `mask_updates` argument (the default is `False`)." ] }, { "cell_type": "code", "execution_count": 38, "id": "a571c69b-aedf-4e1b-b1a7-d45526614e1c", "metadata": {}, "outputs": [], "source": [ "opt = mi.ad.SGD(lr=0.25, mask_updates=True)" ] }, { "cell_type": "markdown", "id": "ab611f59-df32-408a-8cce-b56145728053", "metadata": {}, "source": [ "Finally, the Mitsuba implementation of the `SGD` optimizers also supports per-parameter learning rates. This is useful to control the magnitude of the gradient step taken on individual parameters. This can be achieved using the `set_learning_rate()` method, which takes a optional `key` argument to specify for which parameter to set the learning rate, or alternatively a `dict` with all keys for which to override the learning rate." ] }, { "cell_type": "code", "execution_count": 39, "id": "76c36025-aea9-43d2-8e83-3ac18c6e1299", "metadata": {}, "outputs": [], "source": [ "opt['x'] = mi.Float(1.0)\n", "opt['y'] = mi.Float(1.0)\n", "\n", "opt.set_learning_rate({'x': 0.125, 'y': 0.25})" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" } }, "nbformat": 4, "nbformat_minor": 5 }