{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Jupyter and multiple Languages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook is [based on an old notebook of mine](https://matthiasbussonnier.com/posts/23-Cross-Language-Integration.html), and is the supporting material behind a [Jupyter blog post](https://blog.jupyter.org/i-python-you-r-we-julia-baf064ca1fb6) about cross-language integration.\n", "This will be quite short on narrative, and will dive a bit more into technical details than the blog post does. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "An often requested feature for the [Jupyter Notebook](https://jupyter.org) is the ability to have multiple kernels, often in many languages, for a single notebook. \n", "\n", "While the request is, in spirit, a perfectly valid one, it often stems from a misunderstanding of what having a single kernel means. In particular, using multiple languages is often easier if you have a single process which handles the dispatching of instructions to potentially multiple underlying languages. It is possible to do that in a _Single Kernel_ which orchestrates dispatching instructions and moving data around.\n", "\n", "Whether the multiple languages that get orchestrated together are remote processes, simple library calls, or more complex mechanisms becomes an implementation detail. \n", "\n", "[Python](https://python.org) is known to be a good \"glue\" language, and over the years the [IPython kernel](https://ipython.org) has seen a growing number of extensions showing that dynamic cross-language integration can be seamless from the point of view of the user.\n", "\n", "In the following we only scratch the surface of what is possible across a variety of languages. The approach shown here is one among many. 
The [Calysto](https://github.com/Calysto) organisation, for example, has several projects taking different approaches to the problem.\n", "\n", "In the following I will give a quick overview of how you can, in a single notebook, interact with many languages, via a common foreign function interface (C, Rust, Fortran, ...), or even crazier approaches (Julia).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# IPython and cross language integration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The rest of this is mostly a demo of how cross-language integration works in a Jupyter notebook by using the features of the _Reference IPython Kernel implementation_. These features are handled entirely in the kernel, so they need to be reimplemented on a per-kernel basis. They do, however, **also** work in pure terminal IPython, nbconvert, or any other programmatic use of IPython.\n", "\n", "Most of what you will see here are _just_ thin wrappers around already existing libraries. These libraries (and their respective authors) do all the heavy lifting. I just show how seamless a cross-language environment can be from the user's point of view. Installing these libraries might not be easy, and getting all these languages to play together can be a complex task. It is becoming easier and easier, though.\n", "\n", "The term _just_ does not imply that the wrappers are simple, or easy to write. It indicates that the wrappers are far from being complete. What is shown here is completely doable using standard Python syntax and a bit of manual work, so what you'll see here is mostly _convenience_. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The good old example of Fibonacci" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Understanding the multiple languages themselves is not necessary; most of the code here should be self-explanatory and straightforward. 
We'll define many functions that compute the nth `Fibonacci` number more or less efficiently. We'll define them either using the classic recursive implementation, or sometimes using an unrolled, optimized version. As a reminder, the Fibonacci sequence is defined as follows:\n", "\n", "$$ F_n = \\begin{cases} 1 &\\mbox{if } n \\leq 2 \\\\ \n", "F_{n-1}+F_{n-2} & \\mbox{otherwise }\\end{cases}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The fact that we calculate the Fibonacci sequence has little importance, except that the value of $F_n$ grows _really_ fast: exponentially in $n$. The naive recursive implementation will have a hard time getting anywhere near $n=100$, as its number of calls grows exponentially as well. Be especially careful if you calculate $F_{F_n}$ or deeper compositions. Remembering that $n=5$ is a fixed point of $F$ ($F_5 = 5$) might be useful. \n", "\n", "Here are the first terms of the Fibonacci sequence:\n", "\n", " 1. 1\n", " 1. 1\n", " 1. 1+1 = 2\n", " 1. 2+1 = 3\n", " 1. 3+2 = 5\n", " 1. 5+3 = 8\n", " 1. 8+5 = 13\n", " ...\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic Python cross-language integration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define the `fib` function in Python:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def fib(n):\n", "    \"\"\"\n", "    A simple definition of fibonacci manually unrolled\n", "    \"\"\"\n", "    if n<2:\n", "        return 1\n", "    x,y = 1,1\n", "    for i in range(n-2):\n", "        x,y = y,x+y\n", "    return y" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 1, 2, 3, 5, 8, 13, 21, 34]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[fib(i) for i in range(1,10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Store the values of `fib(n)` for `n` from 1 to 29 in `Y`, and graph them." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5,1,'The Fibonacci sequence grows fast !')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import numpy as np\n", "X = np.arange(1,30)\n", "Y = np.array([fib(x) for x in X])\n", "import matplotlib.pyplot as plt\n", "fig, ax = plt.subplots()\n", "ax.scatter(X, Y)\n", "ax.set_xlabel('n')\n", "ax.set_ylabel('fib(n)')\n", "ax.set_title('The Fibonacci sequence grows fast !')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It may not surprise you, but this looks like an exponential, so if we were to plot $\\log(fib(n))$ against $n$ it would look approximately like a straight line. We can try to do a linear regression using this model. R is a language many people use to do statistics. So, let's use R. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's enable integration between Python and R using the [__`RPy2`__](https://rpy2.readthedocs.io/en/version_2.8.x/) Python package developed by Laurent Gautier and the rest of the rpy2 team." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "import rpy2.rinterface\n", "\n", "%load_ext rpy2.ipython" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following will \"send\" the `X` and `Y` arrays to R." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "%Rpush Y X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now let's try to fit a linear model ($\\ln(Y) = A \\cdot X + B$) using R. I'm not an R user myself, so don't take this as idiomatic R." 
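,
"\n",
"\n",
"As a rough cross-check before handing things to R, the same least-squares fit can be sketched with NumPy alone. This is not part of the original demo: `fib_sketch`, `X_demo` and `Y_demo` are hypothetical names that rebuild the data above so the snippet stands on its own, and it relies on `np.polyfit` returning the slope first and the intercept second for a degree-1 fit.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def fib_sketch(n):\n",
"    # hypothetical helper: same unrolled loop as `fib` above\n",
"    x, y = 1, 1\n",
"    for _ in range(n - 2):\n",
"        x, y = y, x + y\n",
"    return y\n",
"\n",
"X_demo = np.arange(1, 30)  # n = 1..29, as in the plot above\n",
"Y_demo = np.array([fib_sketch(int(n)) for n in X_demo])\n",
"\n",
"# Least-squares fit of ln(Y) = A*X + B; polyfit returns [A, B] for degree 1\n",
"A, B = np.polyfit(X_demo, np.log(Y_demo), 1)\n",
"print(A, B)\n",
"```\n",
"\n",
"The slope `A` should land near 0.48, which is what an exponential growth rate close to $\\ln\\frac{1+\\sqrt 5}{2}$ would give; the R cell below performs the same fit and also reports statistics on it."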
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n" }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R\n", "my_summary = summary(lm(log(Y)~X))\n", "val <- my_summary$coefficients\n", "\n", "plot(X, log(Y))\n", "abline(my_summary)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "\n", "Call:\n", "lm(formula = log(Y) ~ X)\n", "\n", "Residuals:\n", " Min 1Q Median 3Q Max \n", "-0.183663 -0.013497 -0.004137 0.006046 0.296094 \n", "\n", "Coefficients:\n", " Estimate Std. Error t value Pr(>|t|) \n", "(Intercept) -0.775851 0.026173 -29.64 <2e-16 ***\n", "X 0.479757 0.001524 314.84 <2e-16 ***\n", "---\n", "Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1\n", "\n", "Residual standard error: 0.06866 on 27 degrees of freedom\n", "Multiple R-squared: 0.9997,\tAdjusted R-squared: 0.9997 \n", "F-statistic: 9.912e+04 on 1 and 27 DF, p-value: < 2.2e-16\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R\n", "my_summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Good, we now have some statistics on the fit, which also looks good. __And__ we were able not only to send variables to R, but also to plot directly from R!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are happy, as $F_n = \\left[\\frac{\\phi^n}{\\sqrt 5}\\right]$, where `[]` is the closest integer and $\\phi = \\frac{1+\\sqrt 5}{2}$. Taking the logarithm gives $\\ln F_n \\approx n \\ln\\phi - \\ln\\sqrt 5$, so we expect a slope of $\\ln\\phi \\approx 0.481$ and an intercept of $-\\ln\\sqrt 5 \\approx -0.805$, close to the fitted values." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also look at the variables more carefully:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " Estimate Std. 
Error t value Pr(>|t|)\n", "(Intercept) -0.7758510 0.026172673 -29.64355 3.910319e-22\n", "X 0.4797571 0.001523832 314.83597 1.137181e-49\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%R\n", "val" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or even the following, which _looks_ more like Python:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-7.75850975e-01, 2.61726725e-02, -2.96435519e+01,\n", " 3.91031947e-22],\n", " [ 4.79757090e-01, 1.52383191e-03, 3.14835966e+02,\n", " 1.13718145e-49]])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%R val" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can even get the variables back from R as Python objects:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([-7.75850975e-01, 2.61726725e-02, -2.96435519e+01, 3.91031947e-22]),\n", " array([4.79757090e-01, 1.52383191e-03, 3.14835966e+02, 1.13718145e-49]))" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coefs = %Rget val\n", "y0,k = coefs[0:2]\n", "y0,k" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's all for the R part. I hope this shows you some of the power of IPython, both in the notebook and at the command line. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# CFFI" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! We were able to send data back and forth! It does not work for all objects, but at least it does for the basic ones, and it requires quite some work from the authors of the underlying library to allow you to do that. We are still limited to data, though: we can't (yet) send functions over, which limits the utility." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Mix and Match : C" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the critical points of any code may, at some point, be performance. Python is known not to be the most performant language, though it is convenient and quick to write and has a large ecosystem. Most of the functions you require are probably available in a package, battle-tested and optimized. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You might still need, here and there, the raw power of a ubiquitous language which is known for its speed when you know how to wield it well: C. \n", "\n", "One of the disadvantages of C, though, is the (relatively) slow iteration process due to the compile/run part of the cycle. Let's see if we can improve that by leveraging the excellent [CFFI project](https://cffi.readthedocs.io/), using my own small [cffi_magic](https://pypi.python.org/pypi/cffi_magic) wrapper. 
" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import cffi_magic" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "rm -rf *.o *.c *.so Cargo.* src target" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ls: *.c: No such file or directory\r\n", "ls: *.h: No such file or directory\r\n", "ls: *.o: No such file or directory\r\n" ] } ], "source": [ "ls *.c *.h *.o" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the `%%cffi` magic we can define a C function in the middle of our Python code:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "%%cffi int cfib(int);\n", "\n", "int cfib(int n)\n", "{\n", "    int res=0;\n", "    if (n <= 1){\n", "        res = 1;\n", "    } else {\n", "        res = cfib(n-1)+cfib(n-2);\n", "    }\n", "    return res;\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first line takes the \"header\"/\"signature\" of the function we declare, and the rest of the cell takes the definition of this function. The `cfib` function will automatically be made available to you in the main Python namespace." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cfib(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oops, there is a mistake, as we should have `cfib(5) == 5`. Luckily we can redefine the function on the fly. 
I could edit the above cell, but as this will be rendered statically, for the sake of the demo I'm going to make a second cell:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "%%cffi int cfib(int);\n", "\n", "int cfib(int n)\n", "{\n", "    int res=0;\n", "    if (n <= 2){ /*mistake was here*/\n", "        res = 1;\n", "    } else {\n", "        res = cfib(n-1)+cfib(n-2);\n", "    }\n", "    return res;\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(you may need to run the above cell twice... I don't know why)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cfib(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! Let's compare the timings." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "287 ns ± 10.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n" ] } ], "source": [ "%timeit cfib(10)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "902 ns ± 61 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)\n" ] } ], "source": [ "%timeit fib(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not so bad considering the C implementation is recursive, and the Python version is manually unrolled. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementation detail" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So how do we do that magically under the hood? The knowledgeable reader is aware that CPython extensions cannot be reloaded. Yet here we redefine the function... how come? 
\n", "\n", "Using the user-provided code we compile a shared object with a random name, import it as a module, and alias it under a user-friendly name in the `__main__` namespace. If the user re-executes the cell we just get a new name, and change the alias mapping. \n", "\n", "If one wants to optimize, one can use a hash of the code cell string to avoid recompiling if the user hasn't changed the code. " ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_cffi_5c004d54c0a603cba22b0cb2067ddbe4.c\r\n", "_cffi_5c004d54c0a603cba22b0cb2067ddbe4.o\r\n", "_cffi_dcac8fc76ec118571b0ae4bda71fb975.c\r\n", "_cffi_dcac8fc76ec118571b0ae4bda71fb975.o\r\n" ] } ], "source": [ "ls *.o *.c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this in mind you can guess the same can be done for any language which can be compiled to a shared object, or a dynamically loadable library. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Mix and Match : Rust" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `cffi_magic` module also allows you to do the same with [Rust](https://www.rust-lang.org), a relatively new language originally developed at Mozilla, which provides the same C-like level of control while incorporating a more recent understanding of programming and providing better memory safety. Let's see how we would do the same with Rust:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to put the Rust toolchain (`rustc`, `cargo`) on `$PATH`:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "import cffi_magic\n", "from os import environ as E\n", "if 'cargo' not in E['PATH']:\n", "    E['PATH'] = E['HOME']+'/.cargo/bin:'+E['PATH']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fetching the crates registry should take only 30 seconds or so... (if you know how to tell cargo not to hit the network, please tell me). 
If it gets stuck, restart the notebook and \"Run all above\"." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "injecting `rfib` in user ns\n" ] } ], "source": [ "%%rust int rfib(int);\n", "\n", "#[no_mangle]\n", "pub extern fn rfib(n: i32) -> i32 {\n", "    match n {\n", "        0 => 1,\n", "        1 => 1,\n", "        2 => 1,\n", "        _ => rfib(n-1)+rfib(n-2)\n", "    }\n", "}" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 1, 2, 3, 5, 8, 13, 21, 34]" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[rfib(x) for x in range(1,10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I'm not a Rustacean, but the above seems pretty straightforward to me. Again, this might not be idiomatic Rust, but you should be able to decipher what's above. The same remarks as for C apply. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Still in development" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both the C and Rust examples shown above use `cffi_magic`, on which I spent roughly 4 hours in total, so the functionality can be really crude and the documentation is minimal at best. Feel free to send PRs if you are interested. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fortran" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [fortran magic](https://pypi.python.org/pypi/fortran-magic) does the same as above, but has been developed by [mgaitan](https://github.com/mgaitan/fortran_magic) and is slightly older. Again no surprise, except that you are supposed to mark the Fortran variables that are used to return the values. 
" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "scrolled": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/bussonniermatthias/anaconda/lib/python3.6/site-packages/fortranmagic.py:147: UserWarning: get_ipython_cache_dir has moved to the IPython.paths module since IPython 4.0.\n", " self._lib_dir = os.path.join(get_ipython_cache_dir(), 'fortran')\n" ] }, { "data": { "application/javascript": [ "$.getScript(\"https://raw.github.com/marijnh/CodeMirror/master/mode/fortran/fortran.js\", function () {\n", "IPython.config.cell_magic_highlight['magic_fortran'] = {'reg':[/^%%fortran/]};});\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%load_ext fortranmagic" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "%%fortran\n", "RECURSIVE SUBROUTINE ffib(n, fibo) \n", "    IMPLICIT NONE\n", "    INTEGER, INTENT(IN) :: n\n", "    INTEGER, INTENT(OUT) :: fibo\n", "    INTEGER :: tmp\n", "    IF (n <= 2) THEN \n", "        fibo = 1\n", "    ELSE\n", "        CALL ffib(n-1,fibo)\n", "        CALL ffib(n-2,tmp)\n", "        fibo = fibo + tmp\n", "    END IF\n", "END SUBROUTINE ffib" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 1, 2, 3, 5, 8, 13, 21, 34]" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[ffib(x) for x in range(1,10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "No surprise here; by now you are well aware of what we are doing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cython" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "IPython used to ship with the Cython magic, which is now part of [Cython](http://cython.org/) itself.\n", "Cython is a superset of Python that compiles to C and is importable from Python. 
You should be able to take your Python code as is, add type annotations, and get C-like speed.\n", "The same principle applies:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "import cython" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "%load_ext cython" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "%%cython\n", "\n", "def cyfib(int n): # note the `int` here\n", "    \"\"\"\n", "    A simple definition of fibonacci manually unrolled\n", "    \"\"\"\n", "    cdef int x,y # and the `cdef int x,y` here\n", "    if n < 2:\n", "        return 1\n", "    x,y = 1,1\n", "    for i in range(n-2):\n", "        x,y = y,x+y\n", "    return y" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 1, 2, 3, 5, 8, 13, 21, 34]" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[cyfib(x) for x in range(1,10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Benchmark" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "651 ns ± 25 ns per loop (mean ± std. dev. of 3 runs, 100 loops each)\n" ] } ], "source": [ "%timeit -n100 -r3 fib(5)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The slowest run took 15.29 times longer than the fastest. This could mean that an intermediate result is being cached.\n", "1.82 µs ± 2.07 µs per loop (mean ± std. dev. of 3 runs, 100 loops each)\n" ] } ], "source": [ "%timeit -n100 -r3 cfib(5)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "170 ns ± 8.86 ns per loop (mean ± std. dev. 
of 3 runs, 100 loops each)\n" ] } ], "source": [ "%timeit -n100 -r3 ffib(5)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "206 ns ± 204 ns per loop (mean ± std. dev. of 3 runs, 100 loops each)\n" ] } ], "source": [ "%timeit -n100 -r3 cyfib(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The benchmark results can be surprising, but keep in mind that the Python and Cython versions use a manually unrolled loop, while the C and Fortran versions are recursive. The main point is that we reached our goal and used Fortran, Cython, C (and Rust) in the middle of our Python program." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " > Let's skip the Rust fib version: it tends to segfault, and it would be sad to segfault now :-) If you know why, I would be happy to include a fix!" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "# %timeit rfib(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Cake is not a lie!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So can we do a layer cake? Can we call Rust from Python from Fortran from Cython? Or Cython from C from Fortran? Or Fortron from Cytran from Rust?" 
] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Pray the demo-gods it wont segfault even without rust...\n" ] } ], "source": [ "import itertools\n", "lookup = {'c':cfib,\n", "          # 'rust': rfib, # as before Rust may segfault, but I don't know why ...\n", "          'python': fib,\n", "          'fortran': ffib,\n", "          'cython': cyfib\n", "          }\n", "\n", "print(\"Pray the demo-gods it wont segfault even without rust...\")" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "for function in lookup.values():\n", "    assert function(5) == 5, \"Make sure all is correct or will use 100% CPU for a looong time.\"" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "c -> python -> fortran -> cython : 5\n", "c -> python -> cython -> fortran : 5\n", "c -> fortran -> python -> cython : 5\n", "c -> fortran -> cython -> python : 5\n", "c -> cython -> python -> fortran : 5\n", "c -> cython -> fortran -> python : 5\n", "python -> c -> fortran -> cython : 5\n", "python -> c -> cython -> fortran : 5\n", "python -> fortran -> c -> cython : 5\n", "python -> fortran -> cython -> c : 5\n", "python -> cython -> c -> fortran : 5\n", "python -> cython -> fortran -> c : 5\n", "fortran -> c -> python -> cython : 5\n", "fortran -> c -> cython -> python : 5\n", "fortran -> python -> c -> cython : 5\n", "fortran -> python -> cython -> c : 5\n", "fortran -> cython -> c -> python : 5\n", "fortran -> cython -> python -> c : 5\n", "cython -> c -> python -> fortran : 5\n", "cython -> c -> fortran -> python : 5\n", "cython -> python -> c -> fortran : 5\n", "cython -> python -> fortran -> c : 5\n", "cython -> fortran -> c -> python : 5\n", "cython -> fortran -> python -> c : 5\n" ] } ], "source": [ "for order in itertools.permutations(lookup):\n", "    t = 5\n", "    for f in order:\n", "        t = lookup[f](t)\n", "    \n", "    
print(' -> '.join(order), ':', t)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "It worked ! I can run all the permutations !\n" ] } ], "source": [ "print('It worked ! I can run all the permutations !')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Cherry on the Layer Cake, with Julia" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you have some idea of how the above layer cake works, you'll understand that there is (still) a non-negligible overhead, as between each language switch we need to go back to Python-land. And the scope in which we can access functions is still quite limited. The following is some really _Dark Magic_ concocted by Fernando Perez and Steven Johnson using the [Julia](http://julialang.org/) programming language. I can't even pretend to understand how this is possible, but it's really impressive to see. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try to handwave what's happening. I would be happy to get corrections." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The crux is that the Python and Julia runtimes can be loaded into the same process, where each has access to the other's memory. Thus the Julia and Python interpreters can share live objects. You then \"just\" need to teach Julia about the structure of Python objects, and it can manipulate them as desired, either directly (if the memory layout allows it) or using proxy objects that \"delegate\" the functionality to the Python interpreter.\n", "\n", "The result is that Julia can import and use Python modules (using the Julia `PyCall` package), and Julia functions are available from within Python using the `julia` module from the pyjulia project. \n", "\n", "Let's see what this looks like." 
] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that if you're using Julia version `0.7` or greater (any recent Julia), we need to work around this issue:\n", "https://pyjulia.readthedocs.io/en/latest/troubleshooting.html#your-python-interpreter-is-statically-linked-to-libpython\n", "\n", "The easiest way to do that is to initialize Julia with `compiled_modules=False`, as we do in this next cell:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "from julia.api import Julia\n", "jl = Julia(compiled_modules=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And now we can load and start using Julia:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initializing Julia runtime. This may take some time...\n" ] } ], "source": [ "%load_ext julia.magic" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "julia_version = %julia VERSION\n", "julia_version # you can see this is a wrapper" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we tell the _julia_ runtime to import the _python_ matplotlib module, as well as numpy." 
] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "%julia @pyimport matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "%julia @pyimport numpy as np" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "%%julia\n", " # Note how we mix numpy and julia:\n", "t = range(0, stop=2*pi, length=1000); # use the julia `range` and `pi`\n", "s = sin.(3*t + 4*np.cos(2*t)); # use the numpy cosine and julia sine\n", "fig = plt.gcf() # **** WATCH THIS VARIABLE ****\n", "plt.plot(t, s, color=\"red\", linewidth=2.0, linestyle=\"--\", label=\"sin(3t+4.cos(2t))\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The whole block of code above is Julia, where `range`, `pi` and `sin` are Julia builtins. `np.*` and `plt.*` reference Python functions and methods.\n", "\n", "We see that `t` is a Julia \"Array\" (technically a `1000-element StepRangeLen{Float64}`), which can be multiplied by a Julia int, sent to `numpy.cos`, and so on, and ends up being plotted via matplotlib (Python) and displayed inline.\n", "\n", "Let's finish our graph in Python:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "fig = %julia fig\n", "fig.axes[0].plot(X[:6], np.log(Y[:6]), '--', label='fib')\n", "fig.axes[0].set_title('A weird Julia function and Fib')\n", "fig.axes[0].legend()\n", "\n", "fig" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above we get a reference to the figure previously defined in Julia, and plot the log of our `fib` function on it.\n", "The key point here is that we get the _same_ object from within Python and Julia. But let's push even further." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above we had _explicit_ transitions between the Julia code and the Python code. Can we be more sneaky?\n", "\n", "One toy example is to define the Fibonacci function using the recursive form and _explicitly_ pass the function with which we recurse.\n", "\n", "We'll define such a function on both the Julia and the Python side, ask the Julia function to recurse by calling the Python one, and the Python one to recurse using the Julia one.\n", "\n", "Let's print `(P` when we enter the Python Kingdom, `(J` when we enter the Julia Realm, and close the parentheses accordingly:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "from __future__ import print_function\n", "\n", "\n", "# julia fib function\n", "jlfib = %julia _fib(n, pyfib) = n <= 2 ? 
1 : pyfib(n-1, _fib) + pyfib(n-2, _fib)\n", "\n", "\n", "def pyfib(n, _fib):\n", "    \"\"\"\n", "    Python fib function\n", "    \"\"\"\n", "    print('(P', end='')\n", "    if n <= 2:\n", "        r = 1\n", "    else:\n", "        print('(J', end='')\n", "        # here we tell julia (_fib) to recurse using Python\n", "        r = _fib(n-1, pyfib) + _fib(n-2, pyfib)\n", "        print(')',end='')\n", "    print(')',end='')\n", "    return r" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(P(J(P(J(P(J(P(J(P)(P)))(P(J))(P(J))(P)))(P(J(P(J))(P)(P)(P)))(P(J(P(J))(P)(P)(P)))(P(J(P)(P)))))(P(J(P(J(P(J))(P)(P)(P)))(P(J(P)(P)))(P(J(P)(P)))(P(J))))(P(J(P(J(P(J))(P)(P)(P)))(P(J(P)(P)))(P(J(P)(P)))(P(J))))(P(J(P(J(P)(P)))(P(J))(P(J))(P)))))" ] }, { "data": { "text/plain": [ "55" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fibonacci = lambda x: pyfib(x, jlfib)\n", "\n", "fibonacci(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Cross language is Easy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I hope you enjoyed that. I find it quite interesting, and useful when you need to leverage the tools available across multiple domains. I'm sure there are plenty of other tools that allow this kind of thing, and a host of other languages that can interact with each other in this way.\n", "\n", "Off the top of my head I know of a few magics (SQL, Redis...) that provide such integration. Every language has its strong and weak points, and knowing what to use is often hard. 
I hope I convinced you that mixing languages is not such a daunting task.\n", "\n", "The other case where this is useful is when you are learning a new language: you can temporarily leverage your current expertise and get something that works before learning the idiomatic way and the available libraries.\n", "\n", "You can head back to the [Jupyter blog post](https://blog.jupyter.org/i-python-you-r-we-julia-baf064ca1fb6) to read the end! \n", "\n", "Happy coding.\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }