{ "metadata": { "name": "", "signature": "sha256:d2bb0608f6744094f138c589e145d6cede5e0fc41dcc775603603353d1a25e81" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Hands-on: Python Fundamentals -- Sets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Objectives:**\n", "\n", "Upon completion of this lesson, you should be able to:\n", "\n", "* Describe the characteristics of the built-in `set` container in Python\n", "\n", "* Perform basic operations on `set`s, including creation, membership \"querying\", updates, and basic set operations\n", "\n", "* Recognize the situations in which `set`s are useful and should be used" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "The set data structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* In Python, a `set` is an unordered collection of unique items that supports efficient \"membership\" checking\n", "\n", "* **A `set` is like a `dict` with only keys and no values**" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "The set data structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Items in a `set` are similar to \"keys\" in a `dict` and can be of any Python data type **BUT**\n", "\n", "* they **must be hashable**, which in practice means immutable (e.g. strings, numbers, tuples of hashable items)" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Creating a set" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* There are a number of ways to create and fill a set. E.g. 
you can create an empty one and keep adding new values" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Create an empty set\n", "eng = set()\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Add items one at a time\n", "eng.add('one')\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "eng.add('two')\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Creating a set the \"hardcoded\" way" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The literal syntax is very similar to that of a `dict`, but without values. As with `dict`, the order of items in a set is unpredictable" ] }, { "cell_type": "code", "collapsed": false, "input": [ "eng = {'one', 'two', 'three'}\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Creating a set from a list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create a set from an iterable (e.g. 
list):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "eng = set(['one', 'two', 'three'])\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Set comprehension" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Very similar to `dict` comprehensions:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Keep only the words that contain the letter 'e'\n", "{e for e in ['one', 'two', 'three'] if 'e' in e}" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "The in operator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Works the same way as it does for lists and tuples:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "6 in {4, 5, 6, 7}  # evaluates to True" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Deleting items" ] }, { "cell_type": "code", "collapsed": false, "input": [ "eng.add('five')\n", "print eng\n", "eng.remove('five')\n", "print eng" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Why the heck \"set\"s?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Why do we need sets if we can check for membership in lists and tuples?**\n" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "1. 
Because lookup in a `set` is much faster (constant time on average, versus a linear scan for lists and tuples)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def lookups(container):\n", "    # Test membership of 100 values; the results are discarded\n", "    for i in range(100, 200):\n", "        i in container" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "import random\n", "# Build a list, a tuple and a set holding the same 1000 shuffled integers\n", "l = list(range(1000))\n", "random.shuffle(l)\n", "t = tuple(l)\n", "s = set(l)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit lookups(l)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit lookups(t)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit lookups(s)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "2. Because they provide set operations" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print dir(set)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'}.union({2, 3, 10})" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'} | {2, 3, 10}" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'}.intersection({2, 3, 10})" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'} & {2, 3, 10}" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'}.difference({2, 3, 10})" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "{1, 2, 3, 'mom', 'dad'} - {2, 3, 10}" ], "language": "python", "metadata": {}, 
"outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "More on sets could be found in the documentation: https://docs.python.org/2/library/stdtypes.html#set" ] } ], "metadata": {} } ] }