{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Faster Feature Location Through Parallel Computation\n", "\n", "Feature-finding can easily be parallelized: each frame an independent task, and the tasks can be divided among the multiple CPU cores in most modern computers. Instead of running in a single process as usual, your code is spread across multiple \"worker\" processes, each running on its own CPU core.\n", "\n", "First, let's set up the movie to track:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pims\n", "import trackpy as tp\n", "\n", "@pims.pipeline\n", "def gray(image):\n", " return image[:, :, 1]\n", "\n", "frames = gray(pims.ImageSequence('../sample_data/bulk_water/*.png'))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "tp.quiet() # Disabling progress reports makes this a fairer comparison" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Using trackpy.batch\n", "\n", "Beginning with trackpy v0.4.2, use the \"processes\" argument to have `trackpy.batch` run on multiple CPU cores at once (using Python's built-in multiprocessing module). Give the number of cores you want to use, or specify `'auto'` to let trackpy detect how many cores your computer has.\n", "\n", "Let's compare the time required to process the first 100 frames:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.33 s ± 55.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%%timeit\n", "features = tp.batch(frames[:100], 13, invert=True, processes='auto')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For comparison, here's the same thing running in a single process. This was run on a laptop with only 2 cores, so we should expect `batch` to take roughly twice as long as the parallel version:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4.93 s ± 110 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%%timeit\n", "features = tp.batch(frames[:100], 13, invert=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Using IPython Parallel\n", "\n", "Using [IPython parallel](https://github.com/ipython/ipyparallel) is a little more involved, but it gives you a lot of flexibility if you need to go beyond `batch`, for example by having the parallel workers run your own custom image processing. It also works with all versions of trackpy.\n", "\n", "## Install ipyparallel and start a cluster\n", "\n", "As of IPython 6.2 (November 2017), IPython parallel is a separate package. If you are not using a comprehensive distribution like Anaconda, you may need to install this package at the command prompt using `pip install ipyparallel` or `conda install ipyparallel`.\n", "\n", "It is simplest to start a cluster on the CPUs of your local machine. In order to start a cluster, you will need to go to a Terminal and type:\n", "```\n", "ipcluster start\n", "```\n", "\n", "This automatically uses all available CPU cores, but you can also use the `-n` option to specify how many workers to start. Now you are running a cluster — it's that easy! More information on IPython parallel is available in [the IPython parallel documentation](http://ipyparallel.readthedocs.io/en/latest/intro.html)." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from ipyparallel import Client\n", "client = Client()\n", "view = client.load_balanced_view()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that there are four cores available." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<DirectView [0, 1, 2, 3]>" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "client[:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use a little magic, ``%%px``, to import trackpy on all cores." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "%%px\n", "import trackpy as tp\n", "tp.quiet()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use the workers to locate features\n", "\n", "Define a function from ``locate`` with all the parameters specified, so the function's only argument is the image to be analyzed. We can map this function directly onto our collection of images. (This is called \"currying\" a function, hence the choice of name.)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "curried_locate = lambda image: tp.locate(image, 13, invert=True)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<AsyncMapResult: <lambda>>" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "view.map(curried_locate, frames[:4]) # Optionally, prime each engine: make it set up numba." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the time it takes to locate features in the first 100 images with and without parallelization." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 100/100 tasks finished after 2 s\n", "done\n", "2.9 s ± 195 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%%timeit\n", "amr = view.map_async(curried_locate, frames[:100])\n", "amr.wait_interactive()\n", "results = amr.get()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.9 s ± 58.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%%timeit\n", "serial_result = list(map(curried_locate, frames[:100]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, if we want to get output similar to `batch`, we collect the results into a single DataFrame:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 100/100 tasks finished after 2 s\n", "done\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
yxmasssizeeccsignalraw_massepframe
05.728435295.067222297.0738392.4996730.23013616.87718714760.00.0811970
15.918431339.195418254.5716032.9799750.30029613.07761114693.00.0897780
26.782609309.578502219.4917953.5514960.1371544.50647414508.00.1267680
37.380101431.548351474.2401232.8524360.35881916.87718715011.00.0597890
48.20230636.250343321.9036272.8825960.17336210.60346815401.00.0424140
\n", "
" ], "text/plain": [ " y x mass size ecc signal raw_mass \\\n", "0 5.728435 295.067222 297.073839 2.499673 0.230136 16.877187 14760.0 \n", "1 5.918431 339.195418 254.571603 2.979975 0.300296 13.077611 14693.0 \n", "2 6.782609 309.578502 219.491795 3.551496 0.137154 4.506474 14508.0 \n", "3 7.380101 431.548351 474.240123 2.852436 0.358819 16.877187 15011.0 \n", "4 8.202306 36.250343 321.903627 2.882596 0.173362 10.603468 15401.0 \n", "\n", " ep frame \n", "0 0.081197 0 \n", "1 0.089778 0 \n", "2 0.126768 0 \n", "3 0.059789 0 \n", "4 0.042414 0 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "amr = view.map_async(curried_locate, frames[:100])\n", "amr.wait_interactive()\n", "results = amr.get()\n", "\n", "features_ipy = pd.concat(results, ignore_index=True)\n", "features_ipy.head()" ] } ], "metadata": { "kernelspec": { "display_name": "trackpy-examples", "language": "python", "name": "trackpy-examples" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }