{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How to use PyHEADTAIL on GPU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook will show you how to use the GPU functionality of PyHEADTAIL. Created on 19. Feb 2016, Stefan Hegglin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Installation notes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to use the GPU module, you will need the following:\n", " - A Nvidia GPU: Tested on Tesla C2075 and Kepler K20\n", " - CUDA version >6.5\n", " - PyCUDA version 2015.1.3. Earlier versions possible but are not tested.\n", " - scikit-cuda 0.5.1\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulation Setup" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "The usual imports: numpy, matplotlib" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from __future__ import division, print_function\n", "\n", "import numpy as np\n", "\n", "from scipy.constants import c, e, m_p" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to use the GPU, initialise it via the following statement. If it fails, this means the GPU or pycuda was not setup correctly. [This could also be performed automatically inside the context module, however it is less safe since we do not know what happens if the user creates another context]\n", "\n", "Note: it is *important* to initialise the GPU before PyHEADTAIL is imported for the first time." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pycuda.autoinit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add PyHEADTAIL to the path:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# sets the PyHEADTAIL directory etc.\n", "try:\n", " from settings import *\n", "except:\n", " pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the GPU and CPU contextmanagers and the PyHEADTAIL `Synchrotron`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyHEADTAIL v1.12.4.7\n", "\n", "\n" ] } ], "source": [ "from PyHEADTAIL.machines.synchrotron import Synchrotron\n", "from PyHEADTAIL.general.contextmanager import GPU\n", "from PyHEADTAIL.general.contextmanager import CPU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define machine parameters and create a machine object and a corresponding matched bunch:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "*** Maximum RMS bunch length 0.117895151015m.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/oeftiger/anaconda/lib/python2.7/site-packages/scipy/integrate/quadpack.py:356: IntegrationWarning: The integral is probably divergent, or slowly convergent.\n", " warnings.warn(msg, IntegrationWarning)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "... distance to target bunch length: -5.0000e-02\n", "... distance to target bunch length: 6.4638e-02\n", "... distance to target bunch length: 4.8815e-02\n", "... distance to target bunch length: 5.6104e-03\n", "... distance to target bunch length: -1.3673e-03\n", "... 
distance to target bunch length: -2.1248e-05\n", "... distance to target bunch length: 8.4927e-09\n", "... distance to target bunch length: -2.6939e-07\n", "--> Bunch length: 0.0500000084927\n", "--> Emittance: 0.163402703633\n" ] } ], "source": [ "# machine parameters\n", "circumference = 26658.8832\n", "n_segments = 10\n", "charge = e\n", "mass = m_p\n", "beta_x = 92.7\n", "D_x = 0\n", "beta_y = 93.2\n", "D_y = 0\n", "\n", "Q_x = 64.28\n", "Q_y = 59.31\n", "\n", "Qp_x = 10.\n", "Qp_y = 15.\n", "\n", "app_x = 0.0000e-9\n", "app_y = 0.0000e-9\n", "app_xy = 0\n", "\n", "alpha = 3.225e-04\n", "\n", "h1, h2 = 35640, 35640*2\n", "V1, V2 = 6e6, 0.\n", "dphi1, dphi2 = 0, np.pi\n", "\n", "longitudinal_mode = 'non-linear'\n", "p0 = 450e9 * e / c\n", "p_increment = 0\n", "\n", "machine = Synchrotron(\n", "    optics_mode='smooth', circumference=circumference,\n", "    n_segments=n_segments,\n", "    beta_x=beta_x, D_x=D_x, beta_y=beta_y, D_y=D_y,\n", "    accQ_x=Q_x, accQ_y=Q_y, Qp_x=Qp_x, Qp_y=Qp_y,\n", "    app_x=app_x, app_y=app_y, app_xy=app_xy,\n", "    alpha_mom_compaction=alpha, longitudinal_mode=longitudinal_mode,\n", "    h_RF=[h1, h2], V_RF=[V1, V2], dphi_RF=[dphi1, dphi2],\n", "    p0=p0, p_increment=p_increment, charge=charge, mass=mass,\n", "    use_cython=False\n", ")\n", "\n", "# bunch parameters\n", "macroparticlenumber = 100000\n", "intensity = 1e11\n", "epsn_x = 2.5e-6\n", "epsn_y = 3.5e-6\n", "sigma_z = 0.05\n", "bunch = machine.generate_6D_Gaussian_bunch_matched(\n", "    macroparticlenumber, intensity, epsn_x, epsn_y, sigma_z=sigma_z\n", ")\n", "\n", "# simulation parameters\n", "n_turns = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Main tracking loop\n", "Up to this point everything has run on the CPU, and the setup is identical for CPU and GPU scripts (apart from the `use_cython=False` argument and the `import pycuda.autoinit` statement). Next, we create a GPU context that encloses the main tracking loop." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "with GPU(bunch) as context:\n", "    for n in range(n_turns):\n", "        machine.track(bunch)" ] }, 
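{ "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check you can time the same loop once in the GPU and once in the CPU context. The cell below is only a rough sketch using the standard-library `time` module and the objects defined above; whether and by how much the GPU is faster depends strongly on the number of macroparticles and on your hardware." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import time\n", "\n", "# rough wall-clock comparison of the identical tracking loop in both contexts\n", "# (the GPU timing includes the host<->device transfers performed by the context)\n", "for label, Context in [('GPU', GPU), ('CPU', CPU)]:\n", "    start = time.time()\n", "    with Context(bunch) as context:\n", "        for n in range(n_turns):\n", "            machine.track(bunch)\n", "    print('{0}: {1:.3f} s for {2} turns'.format(label, time.time() - start, n_turns))" ] }, 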
{ "cell_type": "markdown", "metadata": {}, "source": [ "Ok, this seems to work. How do we know it's actually running on the GPU? We can check the type of the bunch phase-space arrays inside the with statement:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x before entering the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x inside of the with-statement is <class 'pycuda.gpuarray.GPUArray'>\n", "The type of bunch.x after the with-region is <type 'numpy.ndarray'>\n" ] } ], "source": [ "print ('The type of bunch.x before entering the with-statement is', type(bunch.x))\n", "with GPU(bunch) as context:\n", "    machine.track(bunch)\n", "    print ('The type of bunch.x inside of the with-statement is', type(bunch.x))\n", "print ('The type of bunch.x after the with-region is', type(bunch.x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use the CPU context manager, so that GPU and CPU scripts look almost identical:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x before entering the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x inside of the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x after the with-region is <type 'numpy.ndarray'>\n" ] } ], "source": [ "print ('The type of bunch.x before entering the with-statement is ', type(bunch.x))\n", "with CPU(bunch) as context:\n", "    machine.track(bunch)\n", "    print ('The type of bunch.x inside of the with-statement is ', type(bunch.x))\n", "print ('The type of bunch.x after the with-region is ', type(bunch.x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it! If you need access to the bunch phase-space arrays during the simulation, you can move a copy back to the CPU with `bunch.x.get()`. Printing `GPUArray`s works out of the box if you need it for debugging:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x inside of the with-statement is <class 'pycuda.gpuarray.GPUArray'>\n", "\n", "The first three entries of bunch.x are [ 0.00019505 -0.00044431 -0.00050942] (note: the array sits in GPU memory!)\n", "\n", "A CPU copy of bunch.x inside the with-statement has type <type 'numpy.ndarray'>\n" ] } ], "source": [ "with GPU(bunch) as context:\n", "    print ('The type of bunch.x inside of the with-statement is ', type(bunch.x))\n", "    print ('\\nThe first three entries of bunch.x are ', bunch.x[0:3], ' (note: the array sits in GPU memory!)\\n')\n", "    cpu_bunch_x = bunch.x.get()\n", "    print ('A CPU copy of bunch.x inside the with-statement has type ', type(cpu_bunch_x))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }