{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "<center>\n",
    "<h1>Julia</h1>\n",
    "<br>\n",
    "<br>\n",
    "<h2>Kyle Barbary</h2>\n",
    "<br>\n",
    "<h4>UC Berkeley</h4>\n",
    "<h4>Physics Project Scientist</h4>\n",
    "<h4>Berkeley Institute for Data Science Fellow</h4>\n",
    "<br>\n",
    "GitHub: @kbarbary\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# What's the big deal?\n",
    "\n",
    "\n",
    "![microbenchmarks](julia-microbenchmarks.png)\n",
    "\n",
    "Timing relative to C (lower is better). http://julialang.org"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# What does it look like?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "pi_sum (generic function with 1 method)"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\"\"\"approximate π^2/6 with the first n terms in an infinite series.\"\"\"\n",
    "function pi_sum(n)\n",
    "    sum = 0.0\n",
    "    for k in 1:n\n",
    "        sum += 1.0 / (k*k)\n",
    "    end\n",
    "    return sum\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3.1415925580959025"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sqrt(6 * pi_sum(10000000.7))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "## How does it achieve speed?\n",
    "\n",
    "- Just-in-time (JIT) compilation\n",
    "- Careful language design"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Julia solves the \"two language problem\"\n",
    "\n",
    "- Traditional dynamic scientific programming languages offer high-productivity but require writing C/C++/Fortran for speed\n",
    "- Julia is both dynamic (high-productivity) and high-performance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# ... but wait, there's more!\n",
    "\n",
    "- **Multiple dispatch** *(similar to function overloading in C++)*\n",
    "- **Parametric functions and types**\n",
    "  *(think templates in C++)*\n",
    "- **Generic programming**\n",
    "- **metaprogramming** *(think `#define` in C but more powerful)*\n",
    "- **Call C / Fortran functions directly** with no overhead and no wrappers!\n",
    "- **Multithreading** *(experimental in v0.5; target for v1.0)*\n",
    "- **Solid package manager** and vibrant package ecosystem\n",
    "- **MIT licensed**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Julia for Pythonistas\n",
    "\n",
    "One of the best things about coming to Julia from Python is that the languages are quite similar in *semantics*.\n",
    "Specifically, the way variables are assigned and passed to functions is identical. While you have to remember the surface syntax differences, you don't have to re-learn how to *think* about your code."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Assignment of names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Float64,1}:\n",
       " 1.0\n",
       " 2.0\n",
       " 3.0\n",
       " 4.0"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = [1.0, 2.0, 3.0, 4.0] # some array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Float64,1}:\n",
       " 5.0\n",
       " 2.0\n",
       " 3.0\n",
       " 4.0"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = a  # assign the name \"b\" to the same array that 'a' is pointing to.\n",
    "b[1] = 5.0  # modify the first element in that array\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Float64,1}:\n",
       " 5.0\n",
       " 2.0\n",
       " 3.0\n",
       " 4.0"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a  # change is reflected in a"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Function calls: pass by sharing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "double! (generic function with 1 method)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# define a function that modifies an array\n",
    "function double!(x)\n",
    "    for i=1:length(x)\n",
    "        x[i] *= 2.0\n",
    "    end\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Float64,1}:\n",
       " 1.0\n",
       " 2.0\n",
       " 3.0\n",
       " 4.0"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a = [1.0, 2.0, 3.0, 4.0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2.0,4.0,6.0,8.0]\n"
     ]
    }
   ],
   "source": [
    "double!(a)\n",
    "println(a)  # modification is reflected to caller,\n",
    "            # because there was only ever one array!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "**Summary: Your hard work learning Python can transfer well to Julia.**\n",
    "\n",
    "For more on how both languages treat names and values, http://nedbatchelder.com/text/names1.html is a great reference."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Julia: fixing Python annoyances\n",
    "\n",
    "- Python lists & ndarrays\n",
    "- Defining efficient small classes\n",
    "- optimizing bottlenecks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Julia unifies Python \"lists\" and ndarrays\n",
    "\n",
    "In Python, most of us are heavy users of numpy, which provides a `ndarray` class for homogenous arrays. On the other hand, we also have Python's built-in `list` type, which are heterogeneous 1-d arrays. It can sometimes be awkward dealing with two types that have such overlapping functionality.\n",
    "\n",
    "```python\n",
    "x = [1, 'two', 3.0]  # heterogeneous\n",
    "y = np.array([1.0, 2.0, 3.0])  # homogeneous\n",
    "```\n",
    "\n",
    "(I end up starting a lot of functions with `x = np.asarray(x)`.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Julia arrays: homogeneous and heterogeneous"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Any,1}:\n",
       "  1.0     \n",
       "  2       \n",
       "   \"three\"\n",
       " 4+0im    "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# equivalent of Python list or ndarray with dtype='object'\n",
    "a = [1.0, 2, \"three\", 4+0im]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Array{Any,1}"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "typeof(a)  # a is an array of heterogenous objects"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4-element Array{Any,1}:\n",
       " Float64       \n",
       " Int64         \n",
       " ASCIIString   \n",
       " Complex{Int64}"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "map(typeof, a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Array{Float64,1}"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# equivalent of Python ndarray with dtype=float64\n",
    "b = [1.0, 2.0, 3.0, 4.0]\n",
    "typeof(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "32"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# array only takes up 4 * 8 bytes\n",
    "sizeof(b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Arrays easily extensible to new types\n",
    "\n",
    "You can't do this (efficiently) in NumPy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "immutable Point\n",
    "    x::Float64\n",
    "    y::Float64\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Point,1}:\n",
       " Point(1.0,2.0)\n",
       " Point(3.0,4.0)\n",
       " Point(5.0,6.0)"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = [Point(1., 2.), Point(3., 4.), Point(5., 6.)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "48"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sizeof(x)  # points are stored efficiently in-line"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### This often means that you can design the code much more naturally than in Python.\n",
    "\n",
    "For performance in Python, you'd have to do something like\n",
    "\n",
    "```python\n",
    "class Points(object):\n",
    "    \"\"\"A container for two arrays giving x and y coordinates.\"\"\"\n",
    "\n",
    "    def __init__(self, x, y):\n",
    "        self.x = x\n",
    "        self.y = y\n",
    "\n",
    "    def add_offset(self, x_offset, y_offset):\n",
    "        self.x += x_offset\n",
    "        self.y += y_offset\n",
    "        \n",
    "    # ... other methods that operate element-wise\n",
    "```\n",
    "\n",
    "What you really want is a `Point` object, but if you write classes that way in Python, performance will suffer."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Writing performance-sensitive code\n",
    "\n",
    "## (the big win)\n",
    "\n",
    "Suppose you're doing some array operations, and it turns out to be a bottleneck:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# two 200 x 200 matricies\n",
    "n = 200\n",
    "A = rand(n, n)\n",
    "B = rand(n, n);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "f (generic function with 1 method)"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "f(A, B) = 2A + 3B + 4A.*A  # function we want to optimize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "using TimeIt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1000 loops, best of 3: 315.26 µs per loop\n"
     ]
    }
   ],
   "source": [
    "@timeit f(A, B);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Python version\n",
    "\n",
    "We get similar performance in Python:\n",
    "\n",
    "```python\n",
    "In [5]: n = 200\n",
    "\n",
    "In [6]: from numpy.random import rand\n",
    "\n",
    "In [7]: A = rand(n, n);\n",
    "\n",
    "In [8]: B = rand(n, n);\n",
    "\n",
    "In [9]: %timeit 2*A + 3*B + 4*A*A\n",
    "1000 loops, best of 3: 354 µs per loop\n",
    "```\n",
    "\n",
    "But if needed to optimize this further, we'd have to reach for a specialized tool such as cython, numba, ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Optimize in Julia\n",
    "\n",
    "### Using loops"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "f2 (generic function with 1 method)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "function f2(A, B)\n",
    "    length(A) == length(B) || error(\"array length mismatch\")\n",
    "    C = similar(A, promote_type(eltype(A),eltype(B)))\n",
    "    @inbounds for i=1:length(C)\n",
    "        C[i] = 2A[i] + 3B[i] + 4A[i]*A[i]\n",
    "    end\n",
    "    return C\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10000 loops, best of 3: 65.61 µs per loop\n"
     ]
    }
   ],
   "source": [
    "@timeit f2(A, B);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Using loops and pre-allocated memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "f3! (generic function with 1 method)"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "function f3!(A, B, C)\n",
    "    length(A) == length(B) == length(C) || error(\"array length mismatch\")\n",
    "    @inbounds @simd for i=1:length(C)\n",
    "        C[i] = 2A[i] + 3B[i] + 4A[i]*A[i]\n",
    "    end\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "10000 loops, best of 3: 48.96 µs per loop\n"
     ]
    }
   ],
   "source": [
    "C = similar(A, promote_type(eltype(A),eltype(B)))\n",
    "@timeit f3!(A, B, C);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Calling compiled code with no overhead\n",
    "\n",
    "As an example, we have the sine function built-in:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8414709848078965"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sin(1.0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But what if we want to call a different implementation of the sine function?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8414709848078965"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ccall((:sin, \"libopenlibm\"), Float64, (Float64,), 1.0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "In general, if you've compiled code to a shared library, `libname.so`, you can call it with:\n",
    "```julia\n",
    "ccall((:function_name, \"libname\"),\n",
    "      ReturnType,\n",
    "      (Arg1Type, Arg2Type,...),\n",
    "      arg1, arg2, ...)\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Sane built-in package manager\n",
    "\n",
    "Packages and managing dependencies are super important. Julia's `Pkg` is declarative (like conda).\n",
    "\n",
    "```julia\n",
    "Pkg.add(\"Cosmology\")\n",
    "```\n",
    "would add \"Cosmology\" to the requirements file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "IJulia\n",
      "ForwardDiff\n",
      "Requests\n",
      "HTTPClient\n",
      "DocOpt\n",
      "Example\n",
      "Gadfly\n",
      "Winston\n",
      "Logging\n",
      "SIUnits\n",
      "PyCall\n",
      "GeneralizedSampling\n",
      "JLD\n"
     ]
    }
   ],
   "source": [
    ";cat ~/.julia/v0.4/REQUIRE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Julia figures out dependencies and installs the optimal version of every package to satisfy dependencies minimally."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Try it out:\n",
    "\n",
    "- In the browser: http://juliabox.com\n",
    "\n",
    "- At NERSC/Cori (experimental support):\n",
    "  ```\n",
    "  module load julia\n",
    "  ```\n",
    "\n",
    "- http://julialang.org\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Thanks for listening!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Julia downsides\n",
    "\n",
    "- Less mature package ecosystem (But rapidly expanding. Plus, `PyCall` is pretty good.)\n",
    "- Slower module loading (but improving with \"precompilation\" and will probably improve more in the future).\n",
    "- Dynamically dispatched code (when Julia can't infer the types) can be slow.\n",
    "- Language is still changing. Currently at v0.5; be ready to update code for v1.0 (1-2 years away)."
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Julia 0.4.5",
   "language": "julia",
   "name": "julia-0.4"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "0.4.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}