{ "cells": [ { "cell_type": "markdown", "id": "c4180b6b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Comprehensions\n" ] }, { "cell_type": "markdown", "id": "6411dda2", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Filtering and mapping\n", "\n", "In the previous chapter there was a section on `filter()` and `map()`. These functions represent processes that are present in a majority of data processing activities.\n", "\n", "For convenience, the concepts are repeated here: \n", "\n", "" ] }, { "cell_type": "markdown", "id": "f62baa9e", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What are comprehensions?\n" ] }, { "cell_type": "markdown", "id": "75904908", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ ":::{admonition} Comprehensions\n", ":class: note\n", "\n", "\n", "Comprehensions are shorthand syntactic structures that can be used to generate, filter and map different types of collections\n", ":::" ] }, { "cell_type": "markdown", "id": "3b5cf497", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "There are several types of comprehensions, depending on the type of collection they generate: \n", "\n", "- **list comprehensions** generate lists using `[]`\n", "- **tuple comprehensions** generate tuples using the `tuple()` constructor\n", "- **set comprehensions** generate sets using `{}` with single values\n", "- **dict comprehensions** generate dictionaries using `{}` with key-value combinations\n", "- **generator comprehensions** are used to generate values _on demand_ using `()`\n" ] }, { "cell_type": "markdown", "id": "626ca1e4", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Are these essential to Python programming?\n", "\n", "Comprehensions have superseded the use of the `map()` and `filter()` functions, and are very popular constructs of the language. \n", "\n", "However, some beginning programmers feel awkward using them. So remember, if you feel ill at ease with these, you can always implement any comprehension using regular flow control elements such as `for` and `if` structures, or fall back to the use of `map()` and `filter()`" ] }, { "cell_type": "markdown", "id": "4abf9a26", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## List comprehensions\n", "The most-used type of comprehension is list comprehension, or listcomp in short.\n", "\n", "Let's implement both examples from the picture using these." ] }, { "cell_type": "code", "execution_count": 10, "id": "2ae1b153", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[81, 1, 25, 16, 64, 9, 4, 1, 16]\n", "[1, 4, 3, 2, 1, 4]\n" ] } ], "source": [ "numbers = [9, 1, 5, 4, 8, 3, 2, 1, 4]\n", "\n", "#map\n", "print([x**2 for x in numbers])\n", "\n", "#filter\n", "print([x for x in numbers if x < 5])" ] }, { "cell_type": "code", "execution_count": 9, "id": "abe44ccf", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[81, 1, 25, 16, 64, 9, 4, 1, 16]\n", "[1, 4, 3, 2, 1, 4]\n" ] } ], "source": [ "# using plain Python\n", "\n", "#map\n", "result = list()\n", "for x in numbers:\n", " result.append(x**2)\n", "print(result)\n", "\n", "#filter\n", "result = list()\n", "for x in numbers:\n", " if x < 5:\n", " result.append(x)\n", "print(result)\n" ] }, { "cell_type": "markdown", "id": "8ab01d6d", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "" ] }, { "cell_type": "markdown", "id": "1cadac85", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The above example shows a list comprehension. However, this general structure applies to all comprehension types. All comprehension have an iteration at their core that generate data elements. These data elements can be filtered using a predicate, and optionally be mapped -to multiple values if so required- to the resulting values that end up in the final collection." ] }, { "cell_type": "markdown", "id": "9a2b03be", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### if else shorthand in the mapping\n", "\n", "The mapping operation itself can also contain flow control using if else shorthand, as in the example below" ] }, { "cell_type": "code", "execution_count": 2, "id": "d6f8594e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[8, 4, 'aa', 'cc', 12, 2]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = [4, 2, 'a', 'c', 6, True]\n", "\n", "[x**2 if type(x) == 'int' else x * 2 for x in data]\n" ] }, { "cell_type": "markdown", "id": "cee4dda1", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Nested comprehensions\n", "\n", "Comprehensions can be nested to create complex combinations and datastructures. \n", "Note that when you think the code gets unclear it is time to simplify things and switch back to regular flow control." ] }, { "cell_type": "code", "execution_count": 25, "id": "97452b8d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['3c', '3e', '2c', '2e', '5c', '5e']\n", "[[3, 'c'], [3, 'e'], [2, 'c'], [2, 'e'], [5, 'c'], [5, 'e']]\n" ] } ], "source": [ "A = [3, 2, 5]\n", "B = ['c', 'e']\n", "\n", "print([str(a) + b for a in A for b in B])\n", "print([[a, b] for a in A for b in B])" ] }, { "cell_type": "markdown", "id": "3697674c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Besides generating nested structures, they can also be used to filter and process nested stuctures. For instance, given this list of x-y-z coordinates:" ] }, { "cell_type": "code", "execution_count": 19, "id": "e513c271", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-3, -5, -7, -5]\n", "['9/2/-3', '2/-5/4', '5/-7/-5']\n" ] } ], "source": [ "xyz_coordinates = [(4, 8, 1), (9, 2, -3), (2, -5, 4), (5, -7, -5)]\n", "\n", "#filter for coordinates that are negative\n", "print([coord for xyz in xyz_coordinates for coord in xyz if coord < 0])\n", "\n", "#filter for xyz combinations where any coordinate is negative\n", "print([f'{x}/{y}/{z}' for (x, y, z) in xyz_coordinates if x < 0 or y < 0 or z < 0])" ] }, { "cell_type": "markdown", "id": "ecfb6b84", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Some more examples\n", "Below are some additional examples to give you a general idea of the possibilities. " ] }, { "cell_type": "code", "execution_count": 5, "id": "51683162", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[(3.0, 81),\n", " (1.0, 1),\n", " (2.24, 25),\n", " (2.0, 16),\n", " (2.83, 64),\n", " (1.73, 9),\n", " (1.41, 4),\n", " (1.0, 1),\n", " (2.0, 16)]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import math\n", "numbers = [9, 1, 5, 4, 8, 3, 2, 1, 4]\n", "\n", "# a list of tuples: the square root (rounded) and square of each number\n", "[(round(math.sqrt(n), 2), n**2) for n in numbers]\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "fb49dfcd", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[(3.0, 81),\n", " (nan, 1),\n", " (2.23606797749979, 25),\n", " (2.0, 16),\n", " (nan, 64),\n", " (1.7320508075688772, 9),\n", " (1.4142135623730951, 4),\n", " (1.0, 1),\n", " (2.0, 16)]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers = [9, -1, 5, 4, -8, 3, 2, 1, 4]\n", "NaN = float(\"NaN\") # create a NaN variable\n", "\n", "# a list of tuples: the square root (rounded) and square of each number if number is positive else a NaN\n", "[(math.sqrt(n), n**2) if n > 0 else (NaN, n**2) for n in numbers]" ] }, { "cell_type": "code", "execution_count": 11, "id": "f16eef44", "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "[3.0, 2.23606797749979, 2.0, 1.7320508075688772, 1.4142135623730951, 1.0, 2.0]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers = [9, -1, 5, 4, -8, 3, 2, 1, 4]\n", "# only when n is positive\n", "[n**0.5 for n in numbers if n >= 0]" ] }, { "cell_type": "markdown", "id": "03005a93", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Dict comprehensions\n", "\n", "Dict comprehensions are different because they need to produce key-value pairs in the form of `key: value`." ] }, { "cell_type": "code", "execution_count": 29, "id": "ba3201fd", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "{110: 'n',\n", " 111: 'o',\n", " 112: 'p',\n", " 113: 'q',\n", " 114: 'r',\n", " 115: 's',\n", " 116: 't',\n", " 117: 'u',\n", " 118: 'v',\n", " 119: 'w'}" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "{x: chr(x) for x in range(110, 120)}" ] }, { "cell_type": "markdown", "id": "9cb4baf8", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Other comprehensions\n", "\n", "Set comprehensions look a lot like dict comprehensions but they have no key: value pairs. \n", "Tuples are created using the tuple constructor. \n", "Generator expressions generate values that can be used to feed other collections, or iterate the values. \n", "Other that the embracing symbols they work the same as lists." ] }, { "cell_type": "code", "execution_count": 40, "id": "031b4285", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{0, 1}\n", "('H', 'A', 'P', 'P', 'Y')\n", "<generator object <genexpr> at 0x110ba33c0>\n", "frozenset({0, 1})\n" ] } ], "source": [ "#set with {}\n", "print({x % 2 for x in range(10)})\n", "\n", "#tuple with tuple()\n", "print(tuple(x.upper() for x in 'happy'))\n", "\n", "#generator with ()\n", "gen = (x % 2 for x in range(10))\n", "print(gen)\n", "print(frozenset(gen))" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 5 }