{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How I use git cherry-pick" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have a very nice series of informal computing centric talks/seminars at the University Observatory Munich, labelled \"Code Coffee\". I enjoy these a lot, although I prefer tea. Even for topics on which I already know a lot, I usually learn something new and interesting. The most recent installment was an introduction to `git`. Although by no means an expert, I was probably one of the more advanced users in the room. One question, whether and how I use `git cherry-pick` led to a -- I believe -- somewhat convoluted answer and got me thinking that this should probably be the \"comeback\" blog post after a longer absence. Basic familiarity with `git add`, `git commit`, `git merge`, and how to identify a commit by its hash is assumed throughout.\n", "\n", "To see how I use `git cherry-pick` to quickly backport important changes like critical bug fixed from development branches, we will create a new, initially buggy, project. As a toy project, let's create a Python module, which computes the $n$-th Fibonacci number. We will start by creating a new directory for our project and initializing git." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/joerg/IPyNB/Blog/fibonacci\n", "Initialized empty Git repository in /home/joerg/IPyNB/Blog/fibonacci/.git/\n" ] } ], "source": [ "%mkdir fibonacci\n", "%cd fibonacci\n", "!git init" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are now in the new project directory and can start adding code." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing fibonacci.py\n" ] } ], "source": [ "%%writefile -a fibonacci.py\n", "\n", "import sys\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " if n in [0, 1]:\n", " return 1\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " print(fibonacci(int(sys.argv[1])))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from fibonacci import fibonacci\n", "\n", "[fibonacci(n) for n in range(3, 15)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This seems to work. Each number in this list ist the sum of its two predecessors. So let's commit our first version." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[master (root-commit) b90d77e] First version of recursive Fibonacci sequence\n", " 1 file changed, 11 insertions(+)\n", " create mode 100644 fibonacci.py\n" ] } ], "source": [ "%%bash\n", "git add fibonacci.py\n", "git commit -m \"First version of recursive Fibonacci sequence\" fibonacci.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A little further testing, however, shows that this becomes terribly slow for only slightly smaller number than we tested above." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Execution time fibonacci(31): CPU times: user 512 ms, sys: 3.79 ms, total: 516 ms\n", "Wall time: 515 ms\n", "Execution time fibonacci(32): CPU times: user 831 ms, sys: 158 µs, total: 831 ms\n", "Wall time: 831 ms\n", "Execution time fibonacci(33): CPU times: user 1.63 s, sys: 4.11 ms, total: 1.64 s\n", "Wall time: 1.64 s\n", "Execution time fibonacci(34): CPU times: user 2.17 s, sys: 160 µs, total: 2.17 s\n", "Wall time: 2.17 s\n", "Execution time fibonacci(35): CPU times: user 3.48 s, sys: 20 µs, total: 3.48 s\n", "Wall time: 3.48 s\n" ] } ], "source": [ "for n in range(31, 36):\n", " print(f'Execution time fibonacci({n})', end=': ')\n", " %time fibonacci(n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And what's worse: The execution time of `fibonacci(n)` is the sum of the execution times of `fibonacci(n-1)` and `fibonacci(n-2)`. At this rate, `fibonacci(100)` will finish in roughly 4 million years! Clearly, something must be done about this. So, we create a new development branch to deal with this problem." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Switched to a new branch 'make_faster'\n" ] } ], "source": [ "%%bash\n", "git checkout -b make_faster" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, our developer, apparently unaware of the goodnes that is jupyter notebooks and the ipython %time and %timeit magics, creates their own time measurement, to see how they are doing." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting fibonacci.py\n" ] } ], "source": [ "%%writefile fibonacci.py\n", "\n", "import sys\n", "import time\n", "\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " if n in [0, 1]:\n", " return 1\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " t0 = time.time()\n", " print(fibonacci(int(sys.argv[1])))\n", " duration = time.time() - t0\n", " print(f'Executed in {duration:.2} seconds')" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "14930352\n", "Executed in 3.7 seconds\n" ] } ], "source": [ "!python ./fibonacci.py 35" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's apparently working. So lets commit this." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[make_faster 7a418cd] add debugging timer\n", " 1 file changed, 6 insertions(+), 1 deletion(-)\n" ] } ], "source": [ "%%bash\n", "git commit -m 'add debugging timer' fibonacci.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Meanwhile somebody notices that nothing in `fibonacci` ensures that `n` actually is an integer and makes a quick commit directly to master (not usually encouraged!) to rectify this." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Switched to branch 'master'\n" ] } ], "source": [ "!git checkout master" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting fibonacci.py\n" ] } ], "source": [ "%%writefile fibonacci.py\n", "\n", "import sys\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " n = int(n)\n", " if n in [0, 1]:\n", " return 1\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " print(fibonacci(int(sys.argv[1])))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[master cfd121c] ensure that n is an integer or cast to int\n", " 1 file changed, 1 insertion(+)\n", "* commit cfd121c94a0d96a73aad4fae30b646fb176e7398\n", "| Author: Jörg Dietrich \n", "| Date: Sat Mar 17 00:05:34 2018 +0100\n", "| \n", "| ensure that n is an integer or cast to int\n", "| \n", "| * commit 7a418cdc3dc2effa01829ab3c45dab3f0777a35b\n", "|/ Author: Jörg Dietrich \n", "| Date: Sat Mar 17 00:05:19 2018 +0100\n", "| \n", "| add debugging timer\n", "| \n", "* commit b90d77e2d9fd1a0e5e1f8383751ca6a208264e6c\n", " Author: Jörg Dietrich \n", " Date: Sat Mar 17 00:04:49 2018 +0100\n", " \n", " First version of recursive Fibonacci sequence\n" ] } ], "source": [ "%%bash\n", "git commit -m \"ensure that n is an integer or cast to int\" fibonacci.py\n", "git log --all --graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, we have two branches (`master` and `make_faster`), which both have commits not present in the other one. \n", "\n", "Time to get back to development and trying to make our program faster where our developer notices a serious problem:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1\n", "Executed in 4.4e-05 seconds\n", "1 1\n", "Executed in 2.3e-05 seconds\n", "2 2\n", "Executed in 2e-05 seconds\n", "3 3\n", "Executed in 1.7e-05 seconds\n", "4 5\n", "Executed in 1.9e-05 seconds\n", "5 8\n", "Executed in 2.3e-05 seconds\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Switched to branch 'make_faster'\n" ] } ], "source": [ "%%bash\n", "git checkout make_faster\n", "for i in `seq 0 5`; do echo -n \"$i \"; python ./fibonacci.py $i; done" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our Fibonacci series is off by one! `fibonacci(0)` should be 0, and `fibonacci(4)` should be 3, while `fibonacci(5)` should 5. Let's fix this first." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting fibonacci.py\n" ] } ], "source": [ "%%writefile fibonacci.py\n", "\n", "import sys\n", "import time\n", "\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " if n in [0, 1]:\n", " return n # Return n, not 1!\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " t0 = time.time()\n", " print(fibonacci(int(sys.argv[1])))\n", " duration = time.time() - t0\n", " print(f'Executed in {duration:.2} seconds')" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# Let's make some simple test. These should really be automated unit tests …\n", "assert fibonacci(5) == 5\n", "assert fibonacci(1) == 1\n", "assert fibonacci(10) == 55" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[make_faster 33e0161] fix obiwan bug in fibonacci\n", " 1 file changed, 1 insertion(+), 1 deletion(-)\n" ] } ], "source": [ "!git commit -m \"fix obiwan bug in fibonacci\" fibonacci.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Phew, this fix is important enough that it should go directly into our production code in master. This time, however, we'll follow the good (best?!) practice of not committing to master directly but create a bug fix branch. This could then be reviewed before being merged into master. We don't want to merge the `make_faster` branch into `master`, because our developer is experimenting and has littered it with debugging code, and god know's what he's been trying there.\n", "\n", "We make a note of the bug fix (short) commit hash (`33e0161`) now, but could of course retrieve this later from `git log` as well. This is the one we will cherry-pick into our bug fix branch." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[fix_obiwan dbff8eb] fix obiwan bug in fibonacci\n", " Date: Sat Mar 17 00:06:02 2018 +0100\n", " 1 file changed, 1 insertion(+), 1 deletion(-)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Switched to branch 'master'\n", "Switched to a new branch 'fix_obiwan'\n" ] } ], "source": [ "%%bash\n", "# We want to fix master, so we'll branch from there\n", "git checkout master\n", "git checkout -b fix_obiwan\n", "git cherry-pick 33e0161" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then, after code review, etc. this bug fix branch is merged into master." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Updating cfd121c..dbff8eb\n", "Fast-forward\n", " fibonacci.py | 2 +-\n", " 1 file changed, 1 insertion(+), 1 deletion(-)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Switched to branch 'master'\n" ] } ], "source": [ "%%bash\n", "git checkout master\n", "git merge fix_obiwan" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Meanwhile our developer has an epiphany. Every time the recursion calls `fibonacci(n - 2)` it computes the same values that the previous call to `fibonacci(n - 1)` already computed. If only there was a way to cache the results of previous function calls … Well, this being Python, of course there is." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Switched to branch 'make_faster'\n" ] } ], "source": [ "!git checkout make_faster" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting fibonacci.py\n" ] } ], "source": [ "%%writefile fibonacci.py\n", "\n", "from functools import lru_cache\n", "import sys\n", "import time\n", "\n", "\n", "@lru_cache()\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " if n in [0, 1]:\n", " return n # Return n, not 1!\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " t0 = time.time()\n", " print(fibonacci(int(sys.argv[1])))\n", " duration = time.time() - t0\n", " print(f'Executed in {duration:.2} seconds')" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "354224848179261915075\n", "Executed in 0.00012 seconds\n" ] } ], "source": [ "!python ./fibonacci.py 100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Whoah, that's a lot faster than 4 Myr! Let's commit this and then clean up." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[make_faster abf1332] speed up by using least-recently used cache decorator\n", " 1 file changed, 3 insertions(+)\n" ] } ], "source": [ "!git commit -m \"speed up by using least-recently used cache decorator\" fibonacci.py" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting fibonacci.py\n" ] } ], "source": [ "%%writefile fibonacci.py\n", "\n", "from functools import lru_cache\n", "import sys\n", "\n", "@lru_cache()\n", "def fibonacci(n):\n", " \"\"\"Computes the n-th Fibonacci by recursion\"\"\"\n", " if n in [0, 1]:\n", " return n # Return n, not 1!\n", " else:\n", " return fibonacci(n - 1) + fibonacci(n - 2)\n", " \n", "if __name__ == \"__main__\":\n", " print(fibonacci(int(sys.argv[1])))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[make_faster 6d8d63f] remove debugging statements\n", " 1 file changed, 1 insertion(+), 6 deletions(-)\n", "Auto-merging fibonacci.py\n", "Merge made by the 'recursive' strategy.\n", " fibonacci.py | 3 +++\n", " 1 file changed, 3 insertions(+)\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Switched to branch 'master'\n" ] } ], "source": [ "%%bash\n", "git commit -m \"remove debugging statements\" fibonacci.py\n", "git checkout master\n", "git merge make_faster" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "* \u001b[33mcommit 06f69272a3289715bae44ba8bc00701dc13c84f6\u001b[m\n", "\u001b[31m|\u001b[m\u001b[32m\\\u001b[m Merge: dbff8eb 6d8d63f\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:06:43 2018 +0100\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Merge branch 'make_faster'\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m * \u001b[33mcommit 6d8d63f612ae1b087f1449b24d7b50de3665b280\u001b[m\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:06:43 2018 +0100\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m remove debugging statements\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m * \u001b[33mcommit abf13323a213d94dd4fe0c095e9d4a50335dd253\u001b[m\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:06:37 2018 +0100\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m speed up by using least-recently used cache decorator\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m * \u001b[33mcommit 33e016177dd7c3805767decd4f6d45b58bded942\u001b[m\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:06:02 2018 +0100\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m fix obiwan bug in fibonacci\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m * \u001b[33mcommit 7a418cdc3dc2effa01829ab3c45dab3f0777a35b\u001b[m\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:05:19 2018 +0100\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m add debugging timer\n", "\u001b[31m|\u001b[m \u001b[32m|\u001b[m \n", "* \u001b[32m|\u001b[m \u001b[33mcommit dbff8eba5b33067c6a9dbe468c46f71b742fce6a\u001b[m\n", "\u001b[32m|\u001b[m \u001b[32m|\u001b[m Author: Jörg Dietrich \n", "\u001b[32m|\u001b[m \u001b[32m|\u001b[m Date: Sat Mar 17 00:06:02 2018 +0100\n", "\u001b[32m|\u001b[m \u001b[32m|\u001b[m \n", "\u001b[32m|\u001b[m \u001b[32m|\u001b[m fix obiwan bug in fibonacci\n", "\u001b[32m|\u001b[m \u001b[32m|\u001b[m \n", "* \u001b[32m|\u001b[m \u001b[33mcommit cfd121c94a0d96a73aad4fae30b646fb176e7398\u001b[m\n", "\u001b[32m|\u001b[m\u001b[32m/\u001b[m Author: Jörg Dietrich \n", "\u001b[32m|\u001b[m Date: Sat Mar 17 00:05:34 2018 +0100\n", "\u001b[32m|\u001b[m \n", "\u001b[32m|\u001b[m ensure that n is an integer or cast to int\n", "\u001b[32m|\u001b[m \n", "* \u001b[33mcommit b90d77e2d9fd1a0e5e1f8383751ca6a208264e6c\u001b[m\n", " Author: Jörg Dietrich \n", " Date: Sat Mar 17 00:04:49 2018 +0100\n", " \n", " First version of recursive Fibonacci sequence\n" ] } ], "source": [ "!git log --all --graph" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great, we merged everything back into one branch! There cherry-picked bug fix, which was modification in both branches did not cause a merge conflict. \n", "\n", "In summary, `git cherry-pick` is a convenient way to quickly backport important bug fixes from development branches to more slowly moving branches, such as stable releases.\n", "\n", "And finally, there's of course no need to compute the Fibonacci sequence with recursion. One could as well do this iteratively." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "This blog \n", "post is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This post\n", " was written as a Jupyter Notebook in Python. You \n", "can download the original notebook." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }