{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "

Raphael Das Gupta

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# GIS\n", "Geoinformation systems\n", "![some colorful map](images/colorful_map.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "[![Binder](images/binder_badge.svg)](https://mybinder.org/v2/gh/das-g/comprehensions-talk/master)\n", "Follow along on\n", "\n", "https://mybinder.org/v2/gh/das-g/comprehensions-talk\n", "\n", "# Comprehensions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "inspired by (and partially ripped off of)\n", "\n", "### From List Comprehensions to Generator Expressions\n", "by Guido van Rossum [on \"The History of Python\"](http://python-history.blogspot.ch/2010/06/from-list-comprehensions-to-generator.html)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## List Comprehensions ([PEP 202](https://www.python.org/dev/peps/pep-0202/))\n", "\n", "* since **Python 2.0** (2000-10-16)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Date in parentheses is the release date.\n", "\n", "Source:\n", "* [Python 2.0](https://www.python.org/download/releases/2.0/) Release" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "> \"[...] a Pythonic interpretation of a well-known notation for sets used by mathematicians.\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\left\\{ x \\,\\middle|\\, x > 10 \\right\\}\n", "}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> \"set of all $x$ such that $x > 10$\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Set notations in math\n", "\n", "
\n", " What is this? (click to reveal)\n", "

Set of square numbers

\n", "
\n", "\n", "$${\\large\n", " \\left\\{ 1, 4, 9, 16, 25, 36, \\ldots \\right\\}\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\left\\{ 1^2, 2^2, 3^2, 4^2, 5^2, 6^2, \\ldots \\right\\}\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\left\\{ x^2 \\,\\middle|\\, x \\in \\mathbb{N}\\right\\}\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Set-builder notation\n", "\n", "$${\\large\n", " \\left\\{ x^2 \\,\\middle|\\, x \\in \\mathbb{N}\\right\\}\n", "}$$\n", "\n", "Read \"$|$\" as\n", "\n", "* \"such that\"
\n", " or\n", "* \"for which\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Thus, for this example:\n", "\n", "> Set of (all) $x^2$, such that $x \\in \\mathbb{N}$\n", "\n", "or\n", "\n", "> Set of (all) $x^2$ for which $x \\in \\mathbb{N}$\n", "\n", "More on this notation: https://en.wikipedia.org/wiki/Set-builder_notation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "$${\\large\n", " \\big\\{ \\overbrace{x^2}^\\text{some expression on $x$} \\,\\big|\\, \\overbrace{x \\in \\mathbb{N}}^\\text{predicate for $x$} \\big\\}\n", "}$$\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "In math, the curly braces (\"$\\{$\" and \"$\\}$\") denote that this defines a set. Often, the predicate affirms a membership in another set.\n", "\n", "But Python 2.0 **didn't have sets**. (The built-in `set` type and `set()` function were added with [PEP 218](https://www.python.org/dev/peps/pep-0218/) in [Python 2.4](https://docs.python.org/dev/whatsnew/2.4.html#pep-218-built-in-set-objects), [released 2004-11-30](https://www.python.org/download/releases/2.4/). Set literals of the form `{1, 2, 3}` were added yet later.)\n", "\n", "But with list comprehensions we can use a similar concept to create lists:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\big[ \\underbrace{\\texttt{x * x}}_\\text{some expression on $x$}\\ \\texttt{ for }\\ \\underbrace{\\texttt{x}}_\\text{variable}\\ \\texttt{ in }\\ \\underbrace{\\texttt{range(1, 7)}}_\\text{an iterable} \\quad\\big]\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "An **itearable** is anything over that you can iterate over (e.g., use it in a `for` loop)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "[1, 4, 9, 16, 25, 36]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[1 * 1, 2 * 2, 3 * 3, 4 * 4, 5 * 5, 6 * 6]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$${\\large\n", " \\big[ \\underbrace{\\texttt{x * x}}_\\text{some expression on $x$}\\ \\texttt{ for }\\ \\underbrace{\\texttt{x}}_\\text{variable}\\ \\texttt{ in }\\ \\underbrace{\\texttt{range(1, 7)}}_\\text{an iterable} \\quad\\big]\n", "}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[x * x for x in range(1,7)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The math example from the beginning:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "$${\\large\n", " \\big\\{ \\overbrace{x}^\\text{some expression on $x$} \\,\\big|\\, \\overbrace{x > 10}^\\text{predicate for $x$} \\big\\}\n", "}$$\n", "(some) universal set implied (often by context)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\big\\{ \\overbrace{x}^\\text{some expression on $x$} \\,\\big|\\, \\overbrace{x > 10, x \\in \\mathbb{Z}}^\\text{predicate for $x$} \\big\\}\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$${\\large\n", " \\big[ \\underbrace{\\texttt{x}}_\\text{some expression on $x$}\\ \\texttt{ for }\\ \\underbrace{\\texttt{x}}_\\text{variable}\\ \\texttt{ in }\\ \\underbrace{\\texttt{s}}_\\text{an iterable} \\texttt{if} \\underbrace{\\texttt{x > 10}}_\\text{condition for $x$} \\quad\\big]\n", "}$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "$${\\large\n", " \\big[ \\underbrace{\\texttt{x}}_\\text{some expression on $x$}\\ \\texttt{ for }\\ \\underbrace{\\texttt{x}}_\\text{variable}\\ \\texttt{ in }\\ \\underbrace{\\texttt{s}}_\\text{an iterable} \\texttt{if} \\underbrace{\\texttt{x > 10}}_\\text{condition for $x$} \\quad\\big]\n", "}$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "from random import randint\n", "\n", "s = [randint(-30, +30) for _ in range(25)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "[x for x in s if x > 10]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Squares of natural numbers whose cube is between 10 and 100\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[n**2 for n in range(5) if 10 <= n**3 <= 100]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Not just for numbers" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "sentence = \"I'm walking to the market where I'll be buying fruit\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sentence.split()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "[f\"I like {word}.\" for word in sentence.split() if word.endswith(\"ing\")]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### List comprehensions as alternative to `map()` and `filter()`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In Python 2.x:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", " map(f, S) == [f(x) for x in S ]\n", " filter(P, S) == [ x for x in S if P(x)]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In general\n", "```python\n", " [f(x) for x in S if P(x)]\n", "```\n", "is equivalent to\n", "```python\n", " map(f, filter(P, S)) # Python 2\n", " list( map(f, filter(P, S)) ) # Python 3\n", "``` " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Let's find the ASCII codes of the capital letters in the following" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "sentence = \"Guido and I are walking to the supermarket where we'll buy Spam.\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "map(ord, sentence)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "list(_)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "with\n", "```python\n", " map(f, S) == [f(x) for x in S]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "[ord(c) for c in sentence]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "sentence" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "filter(str.isupper, sentence)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(_)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "with\n", "```python\n", " filter(P, S) == [x for x in S if P(x)]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "[c for c in sentence if str.isupper(c)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "[c for c in sentence if c.isupper()]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "sentence" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "map(ord, filter(str.isupper, sentence))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(_)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "with\n", "```python\n", " map(f, filter(P, S)) == [f(x) for x in S if P(x)]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "[ord(c) for c in sentence if str.isupper(c)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "[ord(c) for c in sentence if c.isupper()]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "# convert them back\n", "''.join(map(chr, _))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Interlude\n", "\n", "## Beyond lists" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Dictionary Comprehensions ([PEP 274](https://www.python.org/dev/peps/pep-0274/))\n", "\n", "* since **Python 3.0** (2008-12-03)\n", "* since **Python 2.7** (2010-07-03)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Sources\n", "* [Python 3.0 Release](https://www.python.org/download/releases/3.0/)\n", " * [What’s New In Python 3.0](https://docs.python.org/3.0/whatsnew/3.0.html) → [New Syntax](https://docs.python.org/3.0/whatsnew/3.0.html#new-syntax)\n", "\n", "* [Python 2.7.4 Release](https://www.python.org/download/releases/2.7/)\n", " * [What’s New in Python 2.7](https://docs.python.org/dev/whatsnew/2.7.html) → [Python 3.1 Features](https://docs.python.org/dev/whatsnew/2.7.html#python-3-1-features)\n", " * [What’s New in Python 2.7](https://docs.python.org/dev/whatsnew/2.7.html) → [Other Language Changes](https://docs.python.org/dev/whatsnew/2.7.html#other-language-changes)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from string import printable\n", "\n", "ascii_table = {ord(c): c for c in printable}\n", "\n", "ascii_table" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "use unpacking and `.items()` to access keys and values when the iterable is a dict:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "inverse_ascii_table = {b: a for a, b in ascii_table.items()}\n", "\n", "inverse_ascii_table" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Set Comprehensions\n", "\n", "* since **Python 3.0** (2008-12-03)\n", "* since **Python 2.7** (2010-07-03)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "together with set literal syntax" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "{1, 2, 3}" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "type( {} )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "set()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "{i / 2 for i in [4, 6, 4, 2, 2]}" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### List comprehension vs. `map`/`filter`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# PEP 20" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "# i.e.\n", "import this" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### When expression and predicate are already defined as unary functions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " map(ord, sentence)\n", "```\n", "or\n", "```python\n", " [ord(c) for c in sentence]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "slight preference for `map()`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " filter(str.isupper, sentence)\n", "```\n", "or\n", "```python\n", " [c for c in sentence if c.isupper()]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "slight preference for `filter()`" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " map(ord, filter(str.isupper, sentence))\n", "```\n", "or\n", "```python\n", " [ord(c) for c in sentence if c.isupper()]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "> Flat is better than nested.\n", "\n", "→ slight preference for list comprehension" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unless ..." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "... there might be a better way to un-nest it\n", "```python\n", " capital_letters = filter(str.isupper, sentence)\n", " asciii_codes = map(ord, capital_letters)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### When expression and predicate _aren't_ already defined as unary functions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " def is_vowel(chr):\n", " return chr in 'aeiou'\n", "\n", " filter(is_vowel, sentence)\n", "```\n", "or\n", "```python \n", " filter(lambda c: c in 'aeiou', sentence)\n", "```\n", "or\n", "```python\n", " [c for c in sentence if c in 'aeiou']\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "slight preference for list comprehension" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### When expression and predicate _aren't_ already defined as unary functions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "... and when we have **problems naming the functions properly** when defining them" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " def booify(chr):\n", " return f'B{chr}{chr}'\n", "\n", " map(booify, sentence)\n", "```\n", "or\n", "```python\n", " map(lambda c: f'B{c}{c}', sentence)\n", "```\n", "or\n", "```python\n", " [f'B{c}{c}' for c in sentence]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "clear preference for list comprehension" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "```python\n", " def is_vowel(chr):\n", " return chr in 'aeiou'\n", "\n", " def booify(chr):\n", " return f'B{chr}{chr}'\n", "\n", " map(booify, filter(is_vowel, sentence))\n", "```\n", "or\n", "```python\n", " map(lambda c: f'B{c}{c}', filter(lambda c: c in 'aeiou', sentence))\n", "```\n", "or\n", "```python\n", " [f'B{c}{c}' for c in sentence if c in 'aeiou']\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**very clear** preference for list comprehension" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def is_vowel(chr):\n", " return chr in 'aeiou'\n", "\n", "def booify(chr):\n", " return f'B{chr}{chr}'\n", "\n", "map(booify, filter(is_vowel, sentence))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "So, what does this do, anyway?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(_)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "(It takes the vovels of the sentence and creates kinda baby-speach from it.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "[f'B{c}{c}' for c in sentence if c in 'aeiou']" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Off course, the list comprehension does the same." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
\n", " github.com/das-g\n", "
\n", "\n", "
\n", " gitlab.com/das-g\n", "
\n", "\n", "
\n", " keybase.io/das_g\n", "
" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }