{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Is Julia fast?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Julia isn't fast *per se*.\n",
"\n",
"One can write terribly slow code in any language, including Julia.\n",
"\n",
"So let's ask a different question."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# *Can* Julia be fast?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" ### Microbenchmarks\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### More realistic case: Vandermonde matrix\n",
"(modified from [Steve's Julia intro](https://web.mit.edu/18.06/www/Fall17/1806/julia/Julia-intro.pdf))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[Vandermonde matrix:](https://en.wikipedia.org/wiki/Vandermonde_matrix)\n",
"\\begin{align}V=\\begin{bmatrix}1&\\alpha _{1}&\\alpha _{1}^{2}&\\dots &\\alpha _{1}^{n-1}\\\\1&\\alpha _{2}&\\alpha _{2}^{2}&\\dots &\\alpha _{2}^{n-1}\\\\1&\\alpha _{3}&\\alpha _{3}^{2}&\\dots &\\alpha _{3}^{n-1}\\\\\\vdots &\\vdots &\\vdots &\\ddots &\\vdots \\\\1&\\alpha _{m}&\\alpha _{m}^{2}&\\dots &\\alpha _{m}^{n-1}\\end{bmatrix}\\end{align}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using PyCall"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np = pyimport(\"numpy\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np.vander(1:5, increasing=true)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The source code for this function is [here](https://github.com/numpy/numpy/blob/v1.16.1/numpy/lib/twodim_base.py#L475-L563). It calls `np.multiply.accumulate` which is implemented in C [here](https://github.com/numpy/numpy/blob/deea4983aedfa96905bbaee64e3d1de84144303f/numpy/core/src/umath/ufunc_object.c#L3678). However, this code doesn't actually perform the computation, it basically only checks types and stuff. The actual kernel that gets called is [here](https://github.com/numpy/numpy/blob/deea4983aedfa96905bbaee64e3d1de84144303f/numpy/core/src/umath/loops.c.src#L1742). This isn't even C code but a template for C code which is used to generate type specific kernels.\n",
"\n",
"Overall, this setup only supports a limited set of types, like `Float64`, `Float32`, and so forth."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here is a simple Julia implementation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"function vander(x::AbstractVector{T}, n=length(x)) where T\n",
" m = length(x)\n",
" V = Matrix{T}(undef, m, n)\n",
" for j = 1:m\n",
" V[j,1] = one(x[j])\n",
" end\n",
" for i= 2:n\n",
" for j = 1:m\n",
" V[j,i] = x[j] * V[j,i-1]\n",
" end\n",
" end\n",
" return V\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"vander(1:5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A quick speed comparison"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
Show Code
\n",
"
\n",
" \n",
"```julia\n",
"using BenchmarkTools, Plots\n",
"ns = exp10.(range(1, 4, length=30));\n",
"\n",
"tnp = Float64[]\n",
"tjl = Float64[]\n",
"for n in ns\n",
" x = 1:n |> collect\n",
" push!(tnp, @belapsed np.vander(\\$x) samples=3 evals=1)\n",
" push!(tjl, @belapsed vander(\\$x) samples=3 evals=1)\n",
"end\n",
"plot(ns, tnp./tjl, m=:circle, xscale=:log10, xlab=\"matrix size\", ylab=\"NumPy time / Julia time\", legend=:false)\n",
"```\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the clean and concise Julia implementation is **beating numpy's C implementation for small matrices** and is **on-par for large matrix sizes**.\n",
"\n",
"At the same time, the Julia code is *generic* and works for arbitrary types!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"vander(Int32[4, 8, 16, 32])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It even works for non-numerical types. The only requirement is that the type has a *one* (identity element) and a multiplication operation defined."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"vander([\"this\", \"is\", \"a\", \"test\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here, `one(String) == \"\"` since the empty string is the identity under multiplication (string concatenation)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How can Julia be fast?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
