{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How to use PyHEADTAIL on GPU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook will show you how to use the GPU functionality of PyHEADTAIL. Created on 19. Feb 2016, Stefan Hegglin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Installation notes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to use the GPU module, you will need the following:\n", " - A Nvidia GPU: Tested on Tesla C2075 and Kepler K20\n", " - CUDA version >6.5\n", " - PyCUDA version 2015.1.3. Earlier versions possible but are not tested.\n", " - scikit-cuda 0.5.1\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulation Setup" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "The usual imports: numpy, matplotlib" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from __future__ import division, print_function\n", "\n", "import numpy as np\n", "\n", "from scipy.constants import c, e, m_p" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to use the GPU, initialise it via the following statement. If it fails, this means the GPU or pycuda was not setup correctly. [This could also be performed automatically inside the context module, however it is less safe since we do not know what happens if the user creates another context]\n", "\n", "Note: it is *important* to initialise the GPU before PyHEADTAIL is imported for the first time." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pycuda.autoinit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add PyHEADTAIL to the path:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# sets the PyHEADTAIL directory etc.\n", "try:\n", " from settings import *\n", "except:\n", " pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the GPU and CPU contextmanagers and the PyHEADTAIL `Synchrotron`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "PyHEADTAIL v1.12.4.7\n", "\n", "\n" ] } ], "source": [ "from PyHEADTAIL.machines.synchrotron import Synchrotron\n", "from PyHEADTAIL.general.contextmanager import GPU\n", "from PyHEADTAIL.general.contextmanager import CPU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define machine parameters and create a machine object and a corresponding matched bunch:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "*** Maximum RMS bunch length 0.117895151015m.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/home/oeftiger/anaconda/lib/python2.7/site-packages/scipy/integrate/quadpack.py:356: IntegrationWarning: The integral is probably divergent, or slowly convergent.\n", " warnings.warn(msg, IntegrationWarning)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "... distance to target bunch length: -5.0000e-02\n", "... distance to target bunch length: 6.4638e-02\n", "... distance to target bunch length: 4.8815e-02\n", "... distance to target bunch length: 5.6104e-03\n", "... distance to target bunch length: -1.3673e-03\n", "... 
distance to target bunch length: -2.1248e-05\n", "... distance to target bunch length: 8.4927e-09\n", "... distance to target bunch length: -2.6939e-07\n", "--> Bunch length: 0.0500000084927\n", "--> Emittance: 0.163402703633\n" ] } ], "source": [ "# machine parameters\n", "circumference = 26658.8832\n", "n_segments = 10\n", "charge = e\n", "mass = m_p\n", "beta_x = 92.7\n", "D_x = 0\n", "beta_y = 93.2\n", "D_y = 0\n", "\n", "Q_x = 64.28\n", "Q_y = 59.31\n", "\n", "Qp_x = 10.\n", "Qp_y = 15.\n", "\n", "app_x = 0.0000e-9\n", "app_y = 0.0000e-9\n", "app_xy = 0\n", "\n", "alpha = 3.225e-04\n", "\n", "h1, h2 = 35640, 35640*2\n", "V1, V2 = 6e6, 0.\n", "dphi1, dphi2 = 0, np.pi\n", "\n", "longitudinal_mode = 'non-linear'\n", "p0 = 450e9 * e / c\n", "p_increment = 0\n", "\n", "machine = Synchrotron(\n", "    optics_mode='smooth', circumference=circumference,\n", "    n_segments=n_segments,\n", "    beta_x=beta_x, D_x=D_x, beta_y=beta_y, D_y=D_y,\n", "    accQ_x=Q_x, accQ_y=Q_y, Qp_x=Qp_x, Qp_y=Qp_y,\n", "    app_x=app_x, app_y=app_y, app_xy=app_xy,\n", "    alpha_mom_compaction=alpha, longitudinal_mode=longitudinal_mode,\n", "    h_RF=[h1, h2], V_RF=[V1, V2], dphi_RF=[dphi1, dphi2],\n", "    p0=p0, p_increment=p_increment, charge=charge, mass=mass,\n", "    use_cython=False\n", ")\n", "\n", "# bunch parameters\n", "macroparticlenumber = 100000\n", "intensity = 1e11\n", "epsn_x = 2.5e-6\n", "epsn_y = 3.5e-6\n", "sigma_z = 0.05\n", "bunch = machine.generate_6D_Gaussian_bunch_matched(\n", "    macroparticlenumber, intensity, epsn_x, epsn_y, sigma_z=sigma_z\n", ")\n", "\n", "# simulation parameters\n", "n_turns = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Main tracking loop\n", "Up to this point everything has run on the CPU, and the setup is identical for CPU and GPU scripts (apart from the `use_cython=False` argument and the `import pycuda.autoinit` statement). Next, we create a GPU context that encloses the main tracking loop." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "with GPU(bunch) as context:\n", "    for n in range(n_turns):\n", "        machine.track(bunch)" ] }, 
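{ "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check you can time the same loop once in the GPU and once in the CPU context. The cell below is only a rough sketch using the standard-library `time` module and the objects defined above; whether and by how much the GPU is faster depends strongly on the number of macroparticles and on your hardware." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import time\n", "\n", "# rough wall-clock comparison of the identical tracking loop in both contexts\n", "# (the GPU timing includes the host<->device transfers performed by the context)\n", "for label, Context in [('GPU', GPU), ('CPU', CPU)]:\n", "    start = time.time()\n", "    with Context(bunch) as context:\n", "        for n in range(n_turns):\n", "            machine.track(bunch)\n", "    print('{0}: {1:.3f} s for {2} turns'.format(label, time.time() - start, n_turns))" ] }, 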
{ "cell_type": "markdown", "metadata": {}, "source": [ "Ok, this seems to work. How do we know it's actually running on the GPU? We can check the type of the bunch phase-space arrays inside the with statement:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x before entering the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x inside of the with-statement is <class 'pycuda.gpuarray.GPUArray'>\n", "The type of bunch.x after the with-region is <type 'numpy.ndarray'>\n" ] } ], "source": [ "print ('The type of bunch.x before entering the with-statement is', type(bunch.x))\n", "with GPU(bunch) as context:\n", "    machine.track(bunch)\n", "    print ('The type of bunch.x inside of the with-statement is', type(bunch.x))\n", "print ('The type of bunch.x after the with-region is', type(bunch.x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use the CPU context manager, so that GPU and CPU scripts look almost identical:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x before entering the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x inside of the with-statement is <type 'numpy.ndarray'>\n", "The type of bunch.x after the with-region is <type 'numpy.ndarray'>\n" ] } ], "source": [ "print ('The type of bunch.x before entering the with-statement is ', type(bunch.x))\n", "with CPU(bunch) as context:\n", "    machine.track(bunch)\n", "    print ('The type of bunch.x inside of the with-statement is ', type(bunch.x))\n", "print ('The type of bunch.x after the with-region is ', type(bunch.x))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it! If you need access to the bunch phase-space arrays during the simulation, you can move a copy back to the CPU with `bunch.x.get()`. Printing `GPUArray`s works out of the box if you need it for debugging:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The type of bunch.x inside of the with-statement is <class 'pycuda.gpuarray.GPUArray'>\n", "\n", "The first three entries of bunch.x are [ 0.00019505 -0.00044431 -0.00050942] (note: the array sits in GPU memory!)\n", "\n", "A CPU copy of bunch.x inside the with-statement has type <type 'numpy.ndarray'>\n" ] } ], "source": [ "with GPU(bunch) as context:\n", "    print ('The type of bunch.x inside of the with-statement is ', type(bunch.x))\n", "    print ('\\nThe first three entries of bunch.x are ', bunch.x[0:3], ' (note: the array sits in GPU memory!)\\n')\n", "    cpu_bunch_x = bunch.x.get()\n", "    print ('A CPU copy of bunch.x inside the with-statement has type ', type(cpu_bunch_x))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }