{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Adaptive PDE discretizations on Cartesian grids\n",
    "## Volume : Algorithmic tools\n",
    "## Part : Automatic differentiation\n",
    "## Chapter : Sparse automatic differentiation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook illustrates automatic differentiation, with an application to an simple one-dimensional denoising problem. More precisely, we use the following flavors of automatic differentiation:\n",
    "* *First* and *Second* order. The variables used encode gradient and (sometimes) hessian information.\n",
    "* *Sparse*. The Taylor expansion information - gradient and hessian - is stored in sparse form. This is practical for functions with low complexity, but an input of arbitrary dimension.\n",
    "* *Forward*. The Taylor expansion information is propagated forward along with the computations."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[**Summary**](Summary.ipynb) of volume Algorithmic tools, this series of notebooks.\n",
    "\n",
    "[**Main summary**](../Summary.ipynb) of the Adaptive Grid Discretizations \n",
    "\tbook of notebooks, including the other volumes.\n",
    "\n",
    "# Table of contents\n",
    "  * [1. A word on functional optimization](#1.-A-word-on-functional-optimization)\n",
    "  * [2. Sparse automatic differentiation](#2.-Sparse-automatic-differentiation)\n",
    "    * [2.1 Elementary symbolic perturbation](#2.1-Elementary-symbolic-perturbation)\n",
    "    * [2.2 Evaluating the objective function](#2.2-Evaluating-the-objective-function)\n",
    "    * [2.3 Evaluating the PDE residue](#2.3-Evaluating-the-PDE-residue)\n",
    "    * [2.4 Other operations and mathematical functions](#2.4-Other-operations-and-mathematical-functions)\n",
    "  * [3. Composition of Dense and Sparse AD](#3.-Composition-of-Dense-and-Sparse-AD)\n",
    "    * [3.1 The problem : growth of the Sparse AD information](#3.1-The-problem-:-growth-of-the-Sparse-AD-information)\n",
    "    * [3.2 A first solution: regular simplifications](#3.2-A-first-solution:-regular-simplifications)\n",
    "    * [3.3 Another possible solution: the combination with dense automatic differentiation](#3.3-Another-possible-solution:-the-combination-with-dense-automatic-differentiation)\n",
    "  * [4. Differentiating extrema using the envelope theorem](#4.-Differentiating-extrema-using-the-envelope-theorem)\n",
    "\n",
    "\n",
    "\n",
    "**Acknowledgement.** Some of the experiments presented in these notebooks are part of \n",
    "ongoing research with Ludovic Métivier and Da Chen.\n",
    "\n",
    "Copyright Jean-Marie Mirebeau, Centre Borelli, ENS Paris-Saclay, CNRS, University Paris-Saclay"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Known bugs and incompatibilities.** Our implementation of automatic differentiation is based on subclassing the numpy array class. While simple and powerful, this approach suffers from a few pitfalls, described in notebook [ADBugs](ADBugs.ipynb)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0. Importing the required libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.682733Z",
     "iopub.status.busy": "2024-04-30T08:47:01.682444Z",
     "iopub.status.idle": "2024-04-30T08:47:01.692504Z",
     "shell.execute_reply": "2024-04-30T08:47:01.692041Z"
    }
   },
   "outputs": [],
   "source": [
    "import sys; sys.path.insert(0,\"..\") # Allow import of agd from parent directory (useless if conda package installed)\n",
    "#from Miscellaneous import TocTools; print(TocTools.displayTOC('Sparse','Algo'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.695207Z",
     "iopub.status.busy": "2024-04-30T08:47:01.695012Z",
     "iopub.status.idle": "2024-04-30T08:47:01.745388Z",
     "shell.execute_reply": "2024-04-30T08:47:01.744997Z"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.747174Z",
     "iopub.status.busy": "2024-04-30T08:47:01.747018Z",
     "iopub.status.idle": "2024-04-30T08:47:01.752226Z",
     "shell.execute_reply": "2024-04-30T08:47:01.751937Z"
    }
   },
   "outputs": [],
   "source": [
    "import agd.AutomaticDifferentiation as ad\n",
    "import agd.FiniteDifferences as fd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.753702Z",
     "iopub.status.busy": "2024-04-30T08:47:01.753612Z",
     "iopub.status.idle": "2024-04-30T08:47:01.755820Z",
     "shell.execute_reply": "2024-04-30T08:47:01.755560Z"
    }
   },
   "outputs": [],
   "source": [
    "def LInfNorm(a): return np.max(np.abs(a))\n",
    "def LInfNorm_AD(a): return np.max((LInfNorm(a.value),LInfNorm(a.coef)))\n",
    "def LInfNorm_AD2(a): return np.max((LInfNorm(a.value),LInfNorm(a.coef1),LInfNorm(a.coef2)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.757224Z",
     "iopub.status.busy": "2024-04-30T08:47:01.757136Z",
     "iopub.status.idle": "2024-04-30T08:47:01.759162Z",
     "shell.execute_reply": "2024-04-30T08:47:01.758911Z"
    }
   },
   "outputs": [],
   "source": [
    "def reload_packages():\n",
    "    from Miscellaneous.rreload import rreload\n",
    "    global ad,fd\n",
    "    [ad,fd] = rreload([ad,fd],rootdir='..')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. A word on functional optimization\n",
    "\n",
    "We illustrate sparse automatic differentiation with an optimization problem of the form\n",
    "$$\n",
    "    \\min_u \\int_\\Omega F(x,\\nabla u(x)) + G(x,u(x)) \\, dx,\n",
    "$$\n",
    "where the unknown $u$ is a function defined over a given domain $\\Omega$, with arbitrary boundary conditions.\n",
    "\n",
    "The cost functions $F$ and $G$ are given as well. Classicaly, $F$ is a regularization term, whereas $G$ is a data fidelity term.\n",
    "The simplest case would be \n",
    "$$\n",
    "    F_0(x,v) := \\frac 1 2 \\|v\\|^2, \\qquad G_0(x,y) := \\frac 1 2 \\|y-g_0(x)\\|^2,\n",
    "$$\n",
    "where $g_0(x)$ is some given data.\n",
    "However, one may also think of numerous variants, involving for instance and without loss of generality:\n",
    "* Inhomogeneous and Anisotropic norms, such as $F(x,v) = \\frac 1 2 <v,D(x) v>$ where $D(x)$ is a positive definite matrix. This allows, in dimension $d\\geq 2$, to enforce regularity more or less strongly depending on the position and the orientation.\n",
    "* $L^1$ penalization, such as $F(x,v) = \\|v\\|$ or $G(x,y)=|y-g_0(x)|$. These convex yet weaker cost functions promote 'sparsity' in certain contexts. \n",
    "* Entropic penalization, for positive functions : $G(x,y) = y\\ln(y/u_0(x))$.\n",
    "\n",
    "**Costs involving a kernel.** Applications such as deblurring often involve an additional cost of the form \n",
    "$$\n",
    "    H(x, (\\rho\\star u)(x)),\n",
    "$$\n",
    "where $\\rho$ is a convolution kernel. Because convolution is a non-local operator with a large support, sparse automatic differentiation is not a good fit. Backward automatic differentiation would be appropriate here, but it is not supported at this point. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Problem parameters.**\n",
    "In order to emphasize automatic differentiation tools, rather than a particular application, we consider as a start one of the most basic problem instantiations. See the other notebooks for more advanced examples.\n",
    "\n",
    "The domain $\\Omega$ is defined as the (one dimensional) segment $[0,1]$ equipped with periodic boundary conditions.\n",
    "We let $F_0$ and $G_0$ be the simple instances above defined, with a discontinuous $u_0$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.760660Z",
     "iopub.status.busy": "2024-04-30T08:47:01.760580Z",
     "iopub.status.idle": "2024-04-30T08:47:01.763097Z",
     "shell.execute_reply": "2024-04-30T08:47:01.762850Z"
    }
   },
   "outputs": [],
   "source": [
    "def F0(x,v):\n",
    "    return 0.5*v**2\n",
    "\n",
    "def g0(x):\n",
    "    return (np.abs(x-0.5)<0.3) + np.sin(2.*np.pi*x)\n",
    "def G0(x,y):\n",
    "    return 0.5*(y-g0(x))**2\n",
    "\n",
    "def Objective(u,x,dx,F,G):\n",
    "#    h=x[1]-x[0] # The grid scale\n",
    "    du = fd.DiffUpwind(u,(1,),dx,padding=None) # padding=None is for periodic bc\n",
    "    return (F(x+0.5*dx,du)+G(x,u)).sum(axis=0) * dx # Integrate over the domain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.764463Z",
     "iopub.status.busy": "2024-04-30T08:47:01.764382Z",
     "iopub.status.idle": "2024-04-30T08:47:01.766210Z",
     "shell.execute_reply": "2024-04-30T08:47:01.765976Z"
    }
   },
   "outputs": [],
   "source": [
    "x,dx=np.linspace(0,1,10,endpoint=False,retstep=True)\n",
    "u=np.zeros(x.shape)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.767477Z",
     "iopub.status.busy": "2024-04-30T08:47:01.767404Z",
     "iopub.status.idle": "2024-04-30T08:47:01.770643Z",
     "shell.execute_reply": "2024-04-30T08:47:01.770404Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Objective(u,x,dx,F0,G0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The optimality condition of the variational problem, for above specific choices of $F_0$ and $G_0$, is \n",
    "$$\n",
    "    -\\Delta u + u-g_0 = 0.\n",
    "$$\n",
    "We can therefore, alternatively, solve directly this PDE instead going through the minimization problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.772054Z",
     "iopub.status.busy": "2024-04-30T08:47:01.771970Z",
     "iopub.status.idle": "2024-04-30T08:47:01.773958Z",
     "shell.execute_reply": "2024-04-30T08:47:01.773722Z"
    }
   },
   "outputs": [],
   "source": [
    "def Scheme0(u,x):\n",
    "    h=float(x[1]-x[0])\n",
    "    d2u = fd.Diff2(u,(1,),h,padding=None)\n",
    "    return -d2u +u -g0(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.775295Z",
     "iopub.status.busy": "2024-04-30T08:47:01.775205Z",
     "iopub.status.idle": "2024-04-30T08:47:01.777398Z",
     "shell.execute_reply": "2024-04-30T08:47:01.777181Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.        , -0.58778525, -0.95105652, -1.95105652, -1.58778525,\n",
       "       -1.        , -0.41221475, -0.04894348,  0.95105652,  0.58778525])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Scheme0(u,x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Sparse automatic differentiation\n",
    "\n",
    "The sparse automatic differentiation classes, first and second order, allow to compute gradients, jacobian matrices, and hessians, represented in a sparse manner."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.1 Elementary symbolic perturbation\n",
    "\n",
    "Assume that $u^0 = (u^0_i)_{0 \\leq i < n}$ is some array. \n",
    "Using the `ad.Sparse.identity` constructor, one can create an augmented variable $u^1$ with components\n",
    "$$\n",
    "    u^1_i = u^0_i + \\delta_i + o(\\delta_i)\n",
    "$$\n",
    "where $(\\delta_i)_{0 \\leq i < n}$ are independent first order symbolic perturbations. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.778814Z",
     "iopub.status.busy": "2024-04-30T08:47:01.778726Z",
     "iopub.status.idle": "2024-04-30T08:47:01.780473Z",
     "shell.execute_reply": "2024-04-30T08:47:01.780247Z"
    }
   },
   "outputs": [],
   "source": [
    "x = np.linspace(0,1,5,endpoint=False)\n",
    "u0 = np.cos(2.*np.pi*x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.781823Z",
     "iopub.status.busy": "2024-04-30T08:47:01.781747Z",
     "iopub.status.idle": "2024-04-30T08:47:01.783372Z",
     "shell.execute_reply": "2024-04-30T08:47:01.783150Z"
    }
   },
   "outputs": [],
   "source": [
    "u1 = ad.Sparse.identity(constant=u0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Elementary properties of the `np.ndarray` class are inherited"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.784682Z",
     "iopub.status.busy": "2024-04-30T08:47:01.784606Z",
     "iopub.status.idle": "2024-04-30T08:47:01.786525Z",
     "shell.execute_reply": "2024-04-30T08:47:01.786267Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ndim : 1 , shape : (5,) , size : 5 , len : 5\n"
     ]
    }
   ],
   "source": [
    "print(\"ndim :\",u1.ndim,\", shape :\",u1.shape,\", size :\",u1.size,\", len :\",len(u1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "However, internal data includes coefficient and index arrays, with one additional dimension. For now, this additional dimension is a singleton, because the perturbations of each component $u_i^1$ is the elementary $\\delta_i$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.787839Z",
     "iopub.status.busy": "2024-04-30T08:47:01.787768Z",
     "iopub.status.idle": "2024-04-30T08:47:01.789655Z",
     "shell.execute_reply": "2024-04-30T08:47:01.789426Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "u1.size_ad"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.791020Z",
     "iopub.status.busy": "2024-04-30T08:47:01.790947Z",
     "iopub.status.idle": "2024-04-30T08:47:01.792763Z",
     "shell.execute_reply": "2024-04-30T08:47:01.792521Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Constant term : [ 1.          0.30901699 -0.80901699 -0.80901699  0.30901699]\n",
      "Coefficients of the symbolic perturbation : [[1.]\n",
      " [1.]\n",
      " [1.]\n",
      " [1.]\n",
      " [1.]]\n",
      "Indices of the symbolic perturbation : [[0]\n",
      " [1]\n",
      " [2]\n",
      " [3]\n",
      " [4]]\n"
     ]
    }
   ],
   "source": [
    "print(\"Constant term :\", u1.value)\n",
    "print(\"Coefficients of the symbolic perturbation :\", u1.coef)\n",
    "print(\"Indices of the symbolic perturbation :\", u1.index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.794016Z",
     "iopub.status.busy": "2024-04-30T08:47:01.793943Z",
     "iopub.status.idle": "2024-04-30T08:47:01.796069Z",
     "shell.execute_reply": "2024-04-30T08:47:01.795844Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([ 1.        ,  0.30901699, -0.80901699, -0.80901699,  0.30901699]), array([[1.],\n",
       "       [1.],\n",
       "       [1.],\n",
       "       [1.],\n",
       "       [1.]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "u1 # Scalar values, gradient coefficients, gradient indices."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Second order sparse automatic differentiation is based on the same principles. \n",
    "Using the `ad.Sparse2.identity` constructor, one generates an augmented variable $u^2$ with components\n",
    "$$\n",
    "    u^2_i = u^0_i + \\delta_i + o(\\delta_i^2)\n",
    "$$\n",
    "where $(\\delta_i)_{0 \\leq i < n}$ are independent first order symbolic perturbations. The only difference with the first order case here lies in the remainder $o(\\delta_i^2)$: the second order coefficients are known and explicitly set to zero."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.797425Z",
     "iopub.status.busy": "2024-04-30T08:47:01.797346Z",
     "iopub.status.idle": "2024-04-30T08:47:01.798993Z",
     "shell.execute_reply": "2024-04-30T08:47:01.798760Z"
    }
   },
   "outputs": [],
   "source": [
    "u2 = ad.Sparse2.identity(constant=u0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.800212Z",
     "iopub.status.busy": "2024-04-30T08:47:01.800133Z",
     "iopub.status.idle": "2024-04-30T08:47:01.802156Z",
     "shell.execute_reply": "2024-04-30T08:47:01.801937Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Constant term : [ 1.          0.30901699 -0.80901699 -0.80901699  0.30901699]\n",
      "First order coefficients  : [[1.]\n",
      " [1.]\n",
      " [1.]\n",
      " [1.]\n",
      " [1.]]\n",
      "First order indices : [[0]\n",
      " [1]\n",
      " [2]\n",
      " [3]\n",
      " [4]]\n",
      "\n",
      "Second order coefficients : []\n",
      "Second order row indices : []\n",
      "Second order column indices : []\n"
     ]
    }
   ],
   "source": [
    "print(\"Constant term :\", u2.value)\n",
    "print(\"First order coefficients  :\", u2.coef1)\n",
    "print(\"First order indices :\", u2.index)\n",
    "print()\n",
    "\n",
    "print(\"Second order coefficients :\",u2.coef2)\n",
    "print(\"Second order row indices :\",u2.index_row)\n",
    "print(\"Second order column indices :\",u2.index_col)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.803400Z",
     "iopub.status.busy": "2024-04-30T08:47:01.803324Z",
     "iopub.status.idle": "2024-04-30T08:47:01.805494Z",
     "shell.execute_reply": "2024-04-30T08:47:01.805272Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD2(array([ 1.        ,  0.30901699, -0.80901699, -0.80901699,  0.30901699]), array([[1.],\n",
       "       [1.],\n",
       "       [1.],\n",
       "       [1.],\n",
       "       [1.]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]), array([], shape=(5, 0), dtype=float64), array([], shape=(5, 0), dtype=int64), array([], shape=(5, 0), dtype=int64))"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "u2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.2 Evaluating the objective function \n",
    "Arithmetic operations, such as left and right multiplication, addition, substraction, division, work as usual.\n",
    "In particular, we can evaluate our objective function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.806844Z",
     "iopub.status.busy": "2024-04-30T08:47:01.806771Z",
     "iopub.status.idle": "2024-04-30T08:47:01.808615Z",
     "shell.execute_reply": "2024-04-30T08:47:01.808394Z"
    }
   },
   "outputs": [],
   "source": [
    "obj1 = Objective(u1,x,dx,F0,F0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The result is an `array scalar`, i.e. an array whose shape is the empty tuple, but with substantial AD information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.809942Z",
     "iopub.status.busy": "2024-04-30T08:47:01.809868Z",
     "iopub.status.idle": "2024-04-30T08:47:01.811958Z",
     "shell.execute_reply": "2024-04-30T08:47:01.811743Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array(17.39957514), array([ -6.90983006, -11.18033989,   0.        ,  11.18033989,\n",
       "         6.90983006,   6.90983006,  11.18033989,  -0.        ,\n",
       "       -11.18033989,  -6.90983006,   0.1       ,   0.0309017 ,\n",
       "        -0.0809017 ,  -0.0809017 ,   0.0309017 ]), array([1, 2, 3, 4, 0, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]))"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obj1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dense representation of the AD information would be more adequate for this particular variable. A conversion is obtained as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.813246Z",
     "iopub.status.busy": "2024-04-30T08:47:01.813166Z",
     "iopub.status.idle": "2024-04-30T08:47:01.815232Z",
     "shell.execute_reply": "2024-04-30T08:47:01.814982Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "denseAD(array(17.39957514),\n",
       "array([ 13.91966011,   4.30141153, -11.26124159, -11.26124159,\n",
       "         4.30141153]))"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obj1.to_dense()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note also that the variable $b^1=$ `obj1` has the form\n",
    "$$\n",
    "    b^1 = b^0 + \\sum_{0 \\leq i <I} \\alpha_i \\delta_{k_i},\n",
    "$$\n",
    "where the $k_i$ are sorted and not pairwise distinct. A canonical form is obtained with the `simplify_ad` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.816632Z",
     "iopub.status.busy": "2024-04-30T08:47:01.816556Z",
     "iopub.status.idle": "2024-04-30T08:47:01.819008Z",
     "shell.execute_reply": "2024-04-30T08:47:01.818784Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array(17.39957514), array([ 13.91966011,   4.30141153, -11.26124159, -11.26124159,\n",
       "         4.30141153]), array([0, 1, 2, 3, 4]))"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obj1.simplify_ad() # In place simplification\n",
    "obj1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.820482Z",
     "iopub.status.busy": "2024-04-30T08:47:01.820401Z",
     "iopub.status.idle": "2024-04-30T08:47:01.822305Z",
     "shell.execute_reply": "2024-04-30T08:47:01.822071Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "denseAD(array(17.39957514),\n",
       "array([ 13.91966011,   4.30141153, -11.26124159, -11.26124159,\n",
       "         4.30141153]))"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obj1.to_dense()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First order automatic differentiation provides us with the gradient of the objective function, which enables gradient descent methods for minimization. However, Newton methods are more appropriate in certain contexts. For that purpose, second order automatic differentiation can extract the objective function hessian. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.823627Z",
     "iopub.status.busy": "2024-04-30T08:47:01.823551Z",
     "iopub.status.idle": "2024-04-30T08:47:01.825483Z",
     "shell.execute_reply": "2024-04-30T08:47:01.825248Z"
    }
   },
   "outputs": [],
   "source": [
    "obj2 = Objective(u2,x,dx,F0,G0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the dense form, one recognizes the matrix of a Laplacian."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.826758Z",
     "iopub.status.busy": "2024-04-30T08:47:01.826683Z",
     "iopub.status.idle": "2024-04-30T08:47:01.829274Z",
     "shell.execute_reply": "2024-04-30T08:47:01.829047Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "denseAD2(array(17.78637854),\n",
       "array([ 13.91966011,   4.20630588, -11.42002011, -11.30246306,\n",
       "         4.39651718]),\n",
       "array([[ 20.1, -10. ,   0. ,   0. , -10. ],\n",
       "       [-10. ,  20.1, -10. ,   0. ,   0. ],\n",
       "       [  0. , -10. ,  20.1, -10. ,   0. ],\n",
       "       [  0. ,   0. , -10. ,  20.1, -10. ],\n",
       "       [-10. ,   0. ,   0. , -10. ,  20.1]]))"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obj2.to_dense()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our particular objective function, with those specific choices of $F_0$ and $G_0$, is quadratic. Therefore `obj2` is in fact an exact representation, and a Newton method based on these reconstructed gradient and hessian would converge in a single step. See the notebook [Elliptic](../Notebooks_Div/Elliptic.ipynb) for more on this approach."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.3 Evaluating the PDE residue\n",
    "\n",
    "The numerical scheme, returning the PDE residue, can also be evaluated with AD variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.830652Z",
     "iopub.status.busy": "2024-04-30T08:47:01.830575Z",
     "iopub.status.idle": "2024-04-30T08:47:01.832644Z",
     "shell.execute_reply": "2024-04-30T08:47:01.832425Z"
    }
   },
   "outputs": [],
   "source": [
    "scheme0 = Scheme0(u0,x)\n",
    "scheme1 = Scheme0(u1,x)\n",
    "scheme2 = Scheme0(u2,x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The constant part is simply the PDE residue"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.833920Z",
     "iopub.status.busy": "2024-04-30T08:47:01.833845Z",
     "iopub.status.idle": "2024-04-30T08:47:01.835876Z",
     "shell.execute_reply": "2024-04-30T08:47:01.835633Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 35.54915028,  10.03423506, -30.34765197, -29.17208146,\n",
       "        11.93634809])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "scheme0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first order part contains the numerical scheme jacobian matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.837244Z",
     "iopub.status.busy": "2024-04-30T08:47:01.837165Z",
     "iopub.status.idle": "2024-04-30T08:47:01.839339Z",
     "shell.execute_reply": "2024-04-30T08:47:01.839129Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([ 35.54915028,  10.03423506, -30.34765197, -29.17208146,\n",
       "        11.93634809]), array([[-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.]]), array([[1, 0, 4, 0],\n",
       "       [2, 1, 0, 1],\n",
       "       [3, 2, 1, 2],\n",
       "       [4, 3, 2, 3],\n",
       "       [0, 4, 3, 4]]))"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "scheme1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the dense form, we recognize the matrix of a laplacian."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.840720Z",
     "iopub.status.busy": "2024-04-30T08:47:01.840639Z",
     "iopub.status.idle": "2024-04-30T08:47:01.842780Z",
     "shell.execute_reply": "2024-04-30T08:47:01.842552Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "denseAD(array([ 35.54915028,  10.03423506, -30.34765197, -29.17208146,\n",
       "        11.93634809]),\n",
       "array([[ 51., -25.,   0.,   0., -25.],\n",
       "       [-25.,  51., -25.,   0.,   0.],\n",
       "       [  0., -25.,  51., -25.,   0.],\n",
       "       [  0.,   0., -25.,  51., -25.],\n",
       "       [-25.,   0.,   0., -25.,  51.]]))"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "scheme1.to_dense(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since this specific numerical scheme is linear, the variable `scheme1` is a faithful representation of it, and a Newton method based on the extracted Jacobian will converge in a single step. See the notebook [LinearMonotoneSchemes2D](../Notebooks_NonDiv/LinearMonotoneSchemes2D.ipynb) for similar but more advanced examples."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, since this numerical scheme is linear, second order terms are zero."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.844163Z",
     "iopub.status.busy": "2024-04-30T08:47:01.844086Z",
     "iopub.status.idle": "2024-04-30T08:47:01.846234Z",
     "shell.execute_reply": "2024-04-30T08:47:01.846020Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD2(array([ 35.54915028,  10.03423506, -30.34765197, -29.17208146,\n",
       "        11.93634809]), array([[-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.],\n",
       "       [-25.,  50., -25.,   1.]]), array([[1, 0, 4, 0],\n",
       "       [2, 1, 0, 1],\n",
       "       [3, 2, 1, 2],\n",
       "       [4, 3, 2, 3],\n",
       "       [0, 4, 3, 4]]), array([], shape=(5, 0), dtype=float64), array([], shape=(5, 0), dtype=int64), array([], shape=(5, 0), dtype=int64))"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "scheme2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2.4 Other operations and mathematical functions\n",
    "A number of elementary mathematical functions are implemented."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.847591Z",
     "iopub.status.busy": "2024-04-30T08:47:01.847515Z",
     "iopub.status.idle": "2024-04-30T08:47:01.849685Z",
     "shell.execute_reply": "2024-04-30T08:47:01.849459Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([1.41421356, 1.14412281, 0.43701602, 0.43701602, 1.14412281]), array([[0.35355339],\n",
       "       [0.43701602],\n",
       "       [1.14412281],\n",
       "       [1.14412281],\n",
       "       [0.43701602]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sqrt(1+u1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.850955Z",
     "iopub.status.busy": "2024-04-30T08:47:01.850881Z",
     "iopub.status.idle": "2024-04-30T08:47:01.853023Z",
     "shell.execute_reply": "2024-04-30T08:47:01.852793Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([1.        , 0.30901699, 0.80901699, 0.80901699, 0.30901699]), array([[ 1.],\n",
       "       [ 1.],\n",
       "       [-1.],\n",
       "       [-1.],\n",
       "       [ 1.]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.abs(u1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Comparison operators return a basic `np.ndarray`, as well as integer valued functions. Indeed, since these functions take their values in a discrete set, AD information would be pointless."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.854335Z",
     "iopub.status.busy": "2024-04-30T08:47:01.854254Z",
     "iopub.status.idle": "2024-04-30T08:47:01.856257Z",
     "shell.execute_reply": "2024-04-30T08:47:01.856031Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([False, False,  True,  True, False])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "u1 <= (-1+x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.857564Z",
     "iopub.status.busy": "2024-04-30T08:47:01.857489Z",
     "iopub.status.idle": "2024-04-30T08:47:01.859588Z",
     "shell.execute_reply": "2024-04-30T08:47:01.859377Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  0., -1., -1.,  0.])"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.floor(u1+0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Maximum and minimum work as well"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.860901Z",
     "iopub.status.busy": "2024-04-30T08:47:01.860822Z",
     "iopub.status.idle": "2024-04-30T08:47:01.863052Z",
     "shell.execute_reply": "2024-04-30T08:47:01.862816Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([1.        , 0.30901699, 0.4       , 0.6       , 0.8       ]), array([[1.],\n",
       "       [1.],\n",
       "       [0.],\n",
       "       [0.],\n",
       "       [0.]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.maximum(x,u1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Composition of Dense and Sparse AD\n",
    "\n",
    "Let us recall the recommended use cases of Sparse and Dense automatic differentiation.\n",
    "* *Sparse* AD is best suited for *simple* functions of *many* variables.\n",
    "* *Dense* AD is best suited for *complex* functions with *few* variables.\n",
    "\n",
    "In particular, using Sparse AD with a complex function, or Dense AD with many variables, will likely be inefficient in terms of memory usage and computation time.\n",
    "However, when numerically solving a PDE with a complex expression, one is often confronted with expressions such as:\n",
    "$$\n",
    "    f(\\frac{u(x+h e_0)-u(x)} h,\\cdots,\\frac{u(x+h e_n)-u(x)} h) \n",
    "$$\n",
    "We show, in that case, how to combine:\n",
    "* Sparse automatic differentiation, for the construction of the finite differences, which can be in any direction $e_i\\in Z^d$ on the grid.\n",
    "* Dense automatic differentiation, for evaluating the function $f$ which may have a complex expression, possibly featuring inner loops."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.1 The problem : growth of the Sparse AD information\n",
    "\n",
    "Sparse automatic differentiation, as implemented, is mostly intended for very simple functions. Indeed, the size of the ad information grows quickly and can become overwhelming. For instance, consider the following function with a loop.\n",
    "<!---, which is a purposedly complicated way of computing \n",
    "$$\n",
    "    f(y) = 8 y.\n",
    "$$--->"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.864402Z",
     "iopub.status.busy": "2024-04-30T08:47:01.864323Z",
     "iopub.status.idle": "2024-04-30T08:47:01.866156Z",
     "shell.execute_reply": "2024-04-30T08:47:01.865918Z"
    }
   },
   "outputs": [],
   "source": [
    "def fun_with_loop(y):\n",
    "    r=y.copy()\n",
    "    for i in range(3):\n",
    "        r=r+r**2\n",
    "    return r"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.867412Z",
     "iopub.status.busy": "2024-04-30T08:47:01.867337Z",
     "iopub.status.idle": "2024-04-30T08:47:01.869707Z",
     "shell.execute_reply": "2024-04-30T08:47:01.869478Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]), array([[ 1.00000000e+00,  2.00000000e+00,  4.00000000e+00,\n",
       "         8.00000000e+00,  1.20000000e+01,  2.40000000e+01,\n",
       "         4.80000000e+01,  9.60000000e+01],\n",
       "       [ 1.00000000e+00,  6.18033989e-01,  8.09016994e-01,\n",
       "         5.00000000e-01,  1.13627124e+00,  7.02254249e-01,\n",
       "         9.19262746e-01,  5.68135621e-01],\n",
       "       [ 1.00000000e+00, -1.61803399e+00, -3.09016994e-01,\n",
       "         5.00000000e-01, -2.61271243e-01,  4.22745751e-01,\n",
       "         8.07372542e-02, -1.30635621e-01],\n",
       "       [ 1.00000000e+00, -1.61803399e+00, -3.09016994e-01,\n",
       "         5.00000000e-01, -2.61271243e-01,  4.22745751e-01,\n",
       "         8.07372542e-02, -1.30635621e-01],\n",
       "       [ 1.00000000e+00,  6.18033989e-01,  8.09016994e-01,\n",
       "         5.00000000e-01,  1.13627124e+00,  7.02254249e-01,\n",
       "         9.19262746e-01,  5.68135621e-01]]), array([[0, 0, 0, 0, 0, 0, 0, 0],\n",
       "       [1, 1, 1, 1, 1, 1, 1, 1],\n",
       "       [2, 2, 2, 2, 2, 2, 2, 2],\n",
       "       [3, 3, 3, 3, 3, 3, 3, 3],\n",
       "       [4, 4, 4, 4, 4, 4, 4, 4]]))"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f1 = fun_with_loop(u1)\n",
    "f1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After each binary operation, the AD information is as big as the concatenation of the input AD informations. In a loop, it therefore grows exponentially and becomes extremely redundant. Here, the result can be much simplified.\n",
    "\n",
    "AD information is however simplified only at the request of the user, because that is a costly operation, involving some sorting and array resizing. And simplifying the final result only partially solves the problem, due to the cost of computing the intermediate results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.870994Z",
     "iopub.status.busy": "2024-04-30T08:47:01.870918Z",
     "iopub.status.idle": "2024-04-30T08:47:01.873351Z",
     "shell.execute_reply": "2024-04-30T08:47:01.873129Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]), array([[195.        ],\n",
       "       [  6.25297484],\n",
       "       [ -0.31547484],\n",
       "       [ -0.31547484],\n",
       "       [  6.25297484]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f1.simplify_ad()\n",
    "f1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.2 A first solution: regular simplifications\n",
    "\n",
    "When a loop is present, the AD information grows exponentially and its size is quickly our of control, as illustrated above. A partial solution is to simplify the AD information at each iteration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.874683Z",
     "iopub.status.busy": "2024-04-30T08:47:01.874607Z",
     "iopub.status.idle": "2024-04-30T08:47:01.876428Z",
     "shell.execute_reply": "2024-04-30T08:47:01.876195Z"
    }
   },
   "outputs": [],
   "source": [
    "def fun_with_loop_simpl(y):\n",
    "    r=y.copy()\n",
    "    for i in range(3):\n",
    "        r=r+r**2\n",
    "        ad.simplify_ad(r) # Simplify AD information\n",
    "    return r"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Complexity, is now linear w.r.t. the number of iterations, instead of exponential."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.877748Z",
     "iopub.status.busy": "2024-04-30T08:47:01.877674Z",
     "iopub.status.idle": "2024-04-30T08:47:01.880175Z",
     "shell.execute_reply": "2024-04-30T08:47:01.879943Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]), array([[195.        ],\n",
       "       [  6.25297484],\n",
       "       [ -0.31547484],\n",
       "       [ -0.31547484],\n",
       "       [  6.25297484]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fun_with_loop_simpl(u1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function had to be slightly modified, but it still applies transparently to non AD variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.881450Z",
     "iopub.status.busy": "2024-04-30T08:47:01.881374Z",
     "iopub.status.idle": "2024-04-30T08:47:01.883394Z",
     "shell.execute_reply": "2024-04-30T08:47:01.883138Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371])"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fun_with_loop_simpl(u0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3.3 Another possible solution: the combination with dense automatic differentiation\n",
    "\n",
    "Dense automatic differentiation is  not subject to the same limitations: it is efficient for evaluating complex functions depending only on a few independent variables. \n",
    "\n",
    "When such a function needs to be executed on a sparse AD variable, the recommended approach is to compose dense and sparse automatic differentiation. This is achieved using the `ad.apply` command and indicating that the last dimensions are bound together."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.884746Z",
     "iopub.status.busy": "2024-04-30T08:47:01.884669Z",
     "iopub.status.idle": "2024-04-30T08:47:01.886300Z",
     "shell.execute_reply": "2024-04-30T08:47:01.886086Z"
    }
   },
   "outputs": [],
   "source": [
    "u1 = ad.Sparse.identity(constant=u0)\n",
    "u2 = ad.Sparse2.identity(constant=u0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.887546Z",
     "iopub.status.busy": "2024-04-30T08:47:01.887471Z",
     "iopub.status.idle": "2024-04-30T08:47:01.889875Z",
     "shell.execute_reply": "2024-04-30T08:47:01.889649Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]), array([[195.        ],\n",
       "       [  6.25297484],\n",
       "       [ -0.31547484],\n",
       "       [ -0.31547484],\n",
       "       [  6.25297484]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ad.apply(fun_with_loop,u1,shape_bound=u1.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following command achieves the same result \"by hand\", in a less automated way."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.891216Z",
     "iopub.status.busy": "2024-04-30T08:47:01.891139Z",
     "iopub.status.idle": "2024-04-30T08:47:01.893792Z",
     "shell.execute_reply": "2024-04-30T08:47:01.893583Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]), array([[195.        ],\n",
       "       [  6.25297484],\n",
       "       [ -0.31547484],\n",
       "       [ -0.31547484],\n",
       "       [  6.25297484]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# These dimensions can be \"bound together\", a.k.a. not regarded as independent.\n",
    "shape_bound = u1.shape\n",
    "# Generate an adequately dimensioned denseAD variable\n",
    "u1d = ad.Dense.identity(constant=u1.value,shape_bound=shape_bound)\n",
    "# Evaluate the complex function \n",
    "fu1d = fun_with_loop(u1d)\n",
    "# Compose the dense en sparse AD\n",
    "ad.compose(fu1d,np.expand_dims(u1,axis=0),shape_bound=shape_bound)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Composition also works between *dense* AD types. It may speed up computations in the case of the evaluation of a complex function on a dense AD type with many components."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.895220Z",
     "iopub.status.busy": "2024-04-30T08:47:01.895142Z",
     "iopub.status.idle": "2024-04-30T08:47:01.897512Z",
     "shell.execute_reply": "2024-04-30T08:47:01.897274Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "denseAD(array([42.        ,  0.89091371, -0.11356996, -0.11356996,  0.89091371]),\n",
       "array([[195.        ,   0.        ,   0.        ,   0.        ,\n",
       "          0.        ],\n",
       "       [  0.        ,   6.25297484,   0.        ,   0.        ,\n",
       "          0.        ],\n",
       "       [  0.        ,   0.        ,  -0.31547484,   0.        ,\n",
       "          0.        ],\n",
       "       [  0.        ,   0.        ,   0.        ,  -0.31547484,\n",
       "          0.        ],\n",
       "       [  0.        ,   0.        ,   0.        ,   0.        ,\n",
       "          6.25297484]]))"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ad.apply(fun_with_loop,u1.to_dense(),shape_bound=u1.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Composition also works for second order automatic differentiation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.898833Z",
     "iopub.status.busy": "2024-04-30T08:47:01.898754Z",
     "iopub.status.idle": "2024-04-30T08:47:01.900542Z",
     "shell.execute_reply": "2024-04-30T08:47:01.900311Z"
    }
   },
   "outputs": [],
   "source": [
    "result = fun_with_loop(u2.to_dense())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.901868Z",
     "iopub.status.busy": "2024-04-30T08:47:01.901789Z",
     "iopub.status.idle": "2024-04-30T08:47:01.906335Z",
     "shell.execute_reply": "2024-04-30T08:47:01.906104Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1.4210854715202004e-14, 7.105427357601002e-15, 0.0)"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tuple(LInfNorm_AD2(result - a.to_dense()) for a in (\n",
    "    fun_with_loop(u2),\n",
    "    fun_with_loop_simpl(u2),\n",
    "    ad.apply(fun_with_loop,u2,shape_bound=u2.shape),\n",
    "))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.907686Z",
     "iopub.status.busy": "2024-04-30T08:47:01.907606Z",
     "iopub.status.idle": "2024-04-30T08:47:01.910073Z",
     "shell.execute_reply": "2024-04-30T08:47:01.909845Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(0.0,)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tuple(LInfNorm_AD2(result - a) for a in (\n",
    "    ad.apply(fun_with_loop,u2.to_dense(),shape_bound=u2.shape),\n",
    "))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Differentiating extrema using the envelope theorem\n",
    "\n",
    "*The envelope theorem.* Consider a function $f:X \\to R$ defined as the minimum of a bivariate function $F:X \\times A \\to R$ over the second variable:\n",
    "$$\n",
    "    f(x):=\\min_{\\alpha\\in A} F(x,\\alpha),\n",
    "$$\n",
    "for each element $x$ of a domain $X \\subset R^d$, and where $A$ is an arbitrary set. The *envelope theorem* states that, under minor technical assumptions, one has \n",
    "$$\n",
    "    \\nabla f(x) = \\frac{\\partial F}{\\partial x}(x,\\alpha(x)),\n",
    "$$\n",
    "where $\\alpha(x) \\in A$ is the minimizer to the above optimization problem, that is $f(x) = F(x,\\alpha(x))$.\n",
    "\n",
    "**Generalizations.**\n",
    "The function $f$ may be defined by a maximum instead of a minimum, or a combination e.g. $f(x) =\\min_{\\alpha\\in A}\\max_{\\beta \\in B} F(x,\\alpha,\\beta)$. It may also be defined in a piecewise manner, etc.\n",
    "\n",
    "**Relevance to automatic differentiation.**\n",
    "In a typical pratical situation:\n",
    "* Computing the optimal $\\alpha(x)$ is a costly procedure, involving an exhausitve serach or an iterative optimization method. The AD information is useless in this context, and has an significant performance impact. It is preferable *not to* use AD types part.\n",
    "* Evaluating $f(x) = F(x,\\alpha(x))$ is rather straightforward once $\\alpha(x)$ is known. Furthermore AD types are needed in this final step in order to effectively differentiate $f$.\n",
    "\n",
    "**First order derivatives only !** From a mathematical standpoint, the envelope theorem *does not apply to second order derivatives* in general. Please recall this. Our AD library will not stop you or issue a warning if you attempt to do so. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Practical implementation.**\n",
    "Implement the function $f$ with the following enhancements:\n",
    "* *Input*. In addition to the standard parameters (a.k.a. the variable $x$ above), the function input should feature an  `oracle` *optional* parameter (corresponding to $\\alpha(x)$). If the oracle is provided, then the optimization step is bypassed.\n",
    "* *Output*. In addition to the relevant value $f(x)$ the function output should include the optimized parameter $\\alpha(x)$, which may be used as an `oracle` in subsequent calls.\n",
    "\n",
    "Then call the function $f$ twice, first with raw `np.ndarray` data types, and then with the desired AD type. We provide a helper function in the AD library to automate this process."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.911460Z",
     "iopub.status.busy": "2024-04-30T08:47:01.911385Z",
     "iopub.status.idle": "2024-04-30T08:47:01.913049Z",
     "shell.execute_reply": "2024-04-30T08:47:01.912818Z"
    }
   },
   "outputs": [],
   "source": [
    "u1 = ad.Sparse.identity(constant=u0)\n",
    "u2 = ad.Sparse2.identity(constant=u0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a function whose evaluation involves a minimization by exhaustive search in a large table, here with $100$ entries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.914368Z",
     "iopub.status.busy": "2024-04-30T08:47:01.914293Z",
     "iopub.status.idle": "2024-04-30T08:47:01.916039Z",
     "shell.execute_reply": "2024-04-30T08:47:01.915799Z"
    }
   },
   "outputs": [],
   "source": [
    "def f(x,A,B):\n",
    "    return np.min(A*x+B,axis=0) # Exhaustive search"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.917307Z",
     "iopub.status.busy": "2024-04-30T08:47:01.917232Z",
     "iopub.status.idle": "2024-04-30T08:47:01.919123Z",
     "shell.execute_reply": "2024-04-30T08:47:01.918898Z"
    }
   },
   "outputs": [],
   "source": [
    "A=np.linspace(0,1,100)**0.5; B=np.linspace(1,0,100)**2\n",
    "A,B = (np.expand_dims(e,axis=-1) for e in (A,B))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A little bit of code rewrite is needed to make this function compatible with the oracle interface.\n",
    "For convenience, we provide a `min_argmin` function, and likewise a `max_argmax` function. (Note that the minimizer index given in the output of `f_oracle` will be erroneous if an oracle is given, but this shall not be an issue.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.920413Z",
     "iopub.status.busy": "2024-04-30T08:47:01.920341Z",
     "iopub.status.idle": "2024-04-30T08:47:01.922265Z",
     "shell.execute_reply": "2024-04-30T08:47:01.922065Z"
    }
   },
   "outputs": [],
   "source": [
    "def f_oracle(x,A,B,oracle=None):\n",
    "    if not (oracle is None): \n",
    "        A,B = (np.take_along_axis(e,np.expand_dims(oracle,axis=0),axis=0) for e in (A,B)) \n",
    "    return ad.min_argmin(A*x+B,axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is no need to use the AD types to compute the minimizer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.923569Z",
     "iopub.status.busy": "2024-04-30T08:47:01.923491Z",
     "iopub.status.idle": "2024-04-30T08:47:01.925242Z",
     "shell.execute_reply": "2024-04-30T08:47:01.925012Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Value (without AD) :  [ 0.92667447  0.30279844 -0.80901699 -0.80901699  0.30279844] \n",
      "and minimizer  [69 91 99 99 91]\n"
     ]
    }
   ],
   "source": [
    "value,minimizer = f_oracle(u0,A,B)\n",
    "print(\"Value (without AD) : \", value, \"\\nand minimizer \", minimizer)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given the minimizer, one can  efficiently evaluate $f$ on the AD type, bypassing most computations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.926577Z",
     "iopub.status.busy": "2024-04-30T08:47:01.926502Z",
     "iopub.status.idle": "2024-04-30T08:47:01.928759Z",
     "shell.execute_reply": "2024-04-30T08:47:01.928528Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "spAD(array([ 0.92667447,  0.30279844, -0.80901699, -0.80901699,  0.30279844]), array([[0.83484711],\n",
       "       [0.95874497],\n",
       "       [1.        ],\n",
       "       [1.        ],\n",
       "       [0.95874497]]), array([[0],\n",
       "       [1],\n",
       "       [2],\n",
       "       [3],\n",
       "       [4]]))"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f_oracle(u1,A,B,oracle=minimizer)[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use `ad.apply`, with the keyword argument `envelope=True`, to automatically apply the above procedure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2024-04-30T08:47:01.930079Z",
     "iopub.status.busy": "2024-04-30T08:47:01.930004Z",
     "iopub.status.idle": "2024-04-30T08:47:01.932349Z",
     "shell.execute_reply": "2024-04-30T08:47:01.932105Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(spAD(array([ 0.92667447,  0.30279844, -0.80901699, -0.80901699,  0.30279844]), array([[0.83484711],\n",
       "        [0.95874497],\n",
       "        [1.        ],\n",
       "        [1.        ],\n",
       "        [0.95874497]]), array([[0],\n",
       "        [1],\n",
       "        [2],\n",
       "        [3],\n",
       "        [4]])),\n",
       " array([69, 91, 99, 99, 91]))"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ad.apply(f_oracle,u1,A,B,envelope=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}