{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Tutorial on JIT versus Scipy execution within Parcels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This tutorial is meant to highlight the potentially very big difference between the computational time required to run Parcels in **JIT** (Just-In-Time compilation) versus in **Scipy** mode. It also discusses how to more efficiently sample in Scipy mode."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Short summary: JIT is faster than scipy\n",
    "\n",
    "In the code snippet below, we use `AdvectionRK4` to advect 100 particles in the peninsula `FieldSet`. We first do it in JIT mode (by setting `ptype=JITParticle` in the declaration of `pset`) and then we also do it in Scipy mode (by setting `ptype=ScipyParticle` in the declaration of `pset`).\n",
    "\n",
    "In both cases, we advect the particles for 1 hour, with a timestep of 30 seconds.\n",
    "\n",
    "To measure the computational time, we use the `timer` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO: Compiled JITParticleAdvectionRK4 ==> /var/folders/r2/8593q8z93kd7t4j9kbb_f7p00000gr/T/parcels-504/fa40f24fcc10fceffaa228a76b22051d_0.so\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(100%)  Timer root                       : 6.823e+00 s\n",
      "(  1%)    (  1%) Timer fieldset creation : 8.873e-02 s\n",
      "( 87%)    ( 87%) Timer scipy             : 5.910e+00 s\n",
      "( 12%)    ( 12%) Timer jit               : 8.232e-01 s\n"
     ]
    }
   ],
   "source": [
    "from parcels import FieldSet, ParticleSet, JITParticle, ScipyParticle\n",
    "from parcels import AdvectionRK4\n",
    "from parcels import timer\n",
    "from datetime import timedelta as delta\n",
    "\n",
    "timer.root = timer.Timer('root')\n",
    "\n",
    "timer.fieldset = timer.Timer('fieldset creation', parent=timer.root)\n",
    "fieldset = FieldSet.from_parcels('Peninsula_data/peninsula', allow_time_extrapolation=True)\n",
    "timer.fieldset.stop()\n",
    "\n",
    "ptype = {'scipy': ScipyParticle, 'jit': JITParticle}\n",
    "ptimer = {'scipy': timer.Timer('scipy', parent=timer.root, start=False),\n",
    "          'jit': timer.Timer('jit', parent=timer.root, start=False)}\n",
    "\n",
    "for p in ['scipy', 'jit']:\n",
    "    pset = ParticleSet.from_line(fieldset=fieldset, pclass=ptype[p], size=100, start=(3e3, 3e3), finish=(3e3, 45e3))\n",
    "\n",
    "    ptimer[p].start()\n",
    "    pset.execute(AdvectionRK4, runtime=delta(hours=1), dt=delta(seconds=30))\n",
    "    ptimer[p].stop()\n",
    "\n",
    "timer.root.stop()\n",
    "timer.root.print_tree()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see above, even in this very small example **Scipy mode took almost 8 times as long** (6.0 seconds versus 0.7 seconds) as the JIT mode. For larger examples, this can grow to hundreds of times slower.\n",
    "\n",
    "This is just an illustrative example, depending on the number of calls to `AdvectionRK4`, the size of the `FieldSet`, the size of the `pset`, the ratio between `dt` and `outputdt` in the `.execute` etc, the difference between JIT and Scipy can vary significantly. However, JIT will almost always be faster!\n",
    "\n",
    "So why does Parcels support both JIT and Scipy mode then? Because Scipy is easier to debug when writing custom kernels, so can provide faster development of new features."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*As an aside, you may wonder why we use the `time.time` module, and not the timeit module, to time the runs above. That's because it affects the AST of the kernels, causing errors in JIT mode.*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Further digging into Scipy mode: adding `particle` keyword to `Field`-sampling\n",
    "\n",
    "Sometimes, you'd want to run Parcels in Scipy mode anyways. In that case, there are ways to make Parcels a bit faster.\n",
    "\n",
    "As background, one of the most computationally expensive operations in Parcels is the [Field Sampling](http://nbviewer.jupyter.org/github/OceanParcels/parcels/blob/master/parcels/examples/tutorial_sampling.ipynb). In the default sampling in Scipy mode, we don't keep track of _where_ in the grid a particle is; which means that for every sampling call, we need to again search for which grid cell a particle is in.\n",
    "\n",
    "Let's see how this works in the simple Peninsula FieldSet used above. We use a simple Euler-Forward Advection now to make the point. In particular, we use two types of Advection Kernels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def AdvectionEE_depth_lat_lon_time(particle, fieldset, time):\n",
    "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon]\n",
    "    particle.lon += u1 * particle.dt\n",
    "    particle.lat += v1 * particle.dt\n",
    "\n",
    "def AdvectionEE_depth_lat_lon_time_particle(particle, fieldset, time):\n",
    "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon, particle]  # note the extra particle argument here\n",
    "    particle.lon += u1 * particle.dt\n",
    "    particle.lat += v1 * particle.dt\n",
    "\n",
    "kernels = {'dllt': AdvectionEE_depth_lat_lon_time, 'dllt_p': AdvectionEE_depth_lat_lon_time_particle}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(100%)  Timer root                       : 5.533e+00 s\n",
      "( 50%)    ( 50%) Timer dllt              : 2.750e+00 s\n",
      "( 50%)    ( 50%) Timer dllt_p            : 2.781e+00 s\n"
     ]
    }
   ],
   "source": [
    "timer.root = timer.Timer('root')\n",
    "ptimer = {'dllt': timer.Timer('dllt', parent=timer.root, start=False),\n",
    "          'dllt_p': timer.Timer('dllt_p', parent=timer.root, start=False)}\n",
    "\n",
    "for k in ['dllt', 'dllt_p']:\n",
    "    pset = ParticleSet.from_line(fieldset=fieldset, pclass=ScipyParticle, size=100, start=(3e3, 3e3), finish=(3e3, 45e3))\n",
    "\n",
    "    ptimer[k].start()\n",
    "    pset.execute(kernels[k], runtime=delta(hours=1), dt=delta(seconds=30))\n",
    "    ptimer[k].stop()\n",
    "\n",
    "timer.root.stop()\n",
    "timer.root.print_tree()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You will see that the two kernels don't really differ in speed. That is because the Peninsula FieldSet is a simple _Rectilinear_ Grid, where indexing a particle location to the grid is very fast.\n",
    "\n",
    "However, the difference is much more pronounced if we use a _Curvilinear_ Grid like in the [NEMO dataset](https://nbviewer.jupyter.org/github/OceanParcels/parcels/blob/master/parcels/examples/tutorial_nemo_curvilinear.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING: Casting lon data to np.float32\n",
      "WARNING: Casting lat data to np.float32\n",
      "WARNING: Casting depth data to np.float32\n",
      "WARNING: Trying to initialize a shared grid with different chunking sizes - action prohibited. Replacing requested field_chunksize with grid's master chunksize.\n"
     ]
    }
   ],
   "source": [
    "data_path = 'NemoCurvilinear_data/'\n",
    "filenames = {'U': {'lon': data_path + 'mesh_mask.nc4',\n",
    "                   'lat': data_path + 'mesh_mask.nc4',\n",
    "                   'data': data_path + 'U_purely_zonal-ORCA025_grid_U.nc4'},\n",
    "             'V': {'lon': data_path + 'mesh_mask.nc4',\n",
    "                   'lat': data_path + 'mesh_mask.nc4',\n",
    "                   'data': data_path + 'V_purely_zonal-ORCA025_grid_V.nc4'}}\n",
    "variables = {'U': 'U', 'V': 'V'}\n",
    "dimensions = {'lon': 'glamf', 'lat': 'gphif', 'time': 'time_counter'}\n",
    "fieldset = FieldSet.from_nemo(filenames, variables, dimensions, allow_time_extrapolation=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(100%)  Timer root                       : 9.748e+00 s\n",
      "( 96%)    ( 96%) Timer dllt              : 9.341e+00 s\n",
      "(  4%)    (  4%) Timer dllt_p            : 4.055e-01 s\n"
     ]
    }
   ],
   "source": [
    "timer.root = timer.Timer('root')\n",
    "ptimer = {'dllt': timer.Timer('dllt', parent=timer.root, start=False),\n",
    "          'dllt_p': timer.Timer('dllt_p', parent=timer.root, start=False)}\n",
    "\n",
    "for k in ['dllt', 'dllt_p']:\n",
    "    pset = ParticleSet.from_line(fieldset=fieldset, pclass=ScipyParticle, size=10, start=(45, 40), finish=(60, 40))\n",
    "\n",
    "    ptimer[k].start()\n",
    "    pset.execute(kernels[k], runtime=delta(days=10), dt=delta(hours=6))\n",
    "    ptimer[k].stop()\n",
    "\n",
    "timer.root.stop()\n",
    "timer.root.print_tree()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, the difference is massive, with the `AdvectionEE_depth_lat_lon_time_particle` kernel more than 20 times faster than the kernel without the `particle` argument at the end of the Field sampling operation.\n",
    "\n",
    "So, if you want to run in Scipy mode, make sure to add `particle` at the end of your Field sampling!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}