{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Python Tidbits for NLP\n", "## Anoop Sarkar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is an extremely concise introduction to Python for programmers already proficient in at least one other programming language.\n", "\n", "A slower and more thorough tutorial is the [Python Tutorial](https://docs.python.org/3/tutorial/) by Guido van Rossum. Read it at least up to Chapter 10.\n", "\n", "These code fragments are for Python version 3.x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Be agnostic of the operating system" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python code, especially for file system interaction, can be written so that it runs on many different operating systems. This makes your code more portable and easier to maintain as well." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/usr/share/dict/words\n" ] } ], "source": [ "import os\n", "import sys\n", "if sys.platform == 'win32':\n", "    ROOT = os.path.splitdrive(os.path.abspath('.'))[0]\n", "elif sys.platform.startswith('linux') or sys.platform == 'darwin':\n", "    ROOT = os.sep\n", "else:\n", "    raise ValueError(\"unknown operating system\")\n", "dictfile = os.path.join(ROOT, 'usr','share','dict','words')\n", "print(dictfile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## For loops" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the built-in `range` function to create ranges."
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "3\n", "5\n", "7\n", "9\n" ] } ], "source": [ "for i in range(1,10,2):\n", "    print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Opening and closing file handles" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Always open a file using the `with` statement, because it closes the file when the block ends (even if an exception is raised while working with the file). A for loop can be used to iterate through lines using the file handle." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "antidisestablishmentarianism\n", "formaldehydesulphoxylate\n", "pathologicopsychological\n", "scientificophilosophical\n", "tetraiodophenolphthalein\n", "thyroparathyroidectomize\n" ] } ], "source": [ "with open(dictfile, 'r') as fhandle:\n", "    for line in fhandle:\n", "        line = line.strip()\n", "        if len(line) > 23:\n", "            print(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List comprehensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List comprehensions are a concise replacement for many for-loops. The example below finds the unique elements of a list in one line of Python using the built-in `set` data structure. Note that a set has no guaranteed order." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['a', 'd', 'b', 'c']\n" ] } ], "source": [ "x = ['a', 'b', 'c', 'd', 'a', 'b', 'c']\n", "print([ i for i in set(x) ])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use an `if` clause in a list comprehension."
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['d', 'b', 'c']\n" ] } ], "source": [ "print([ i for i in set(x) if i != 'a'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using list comprehensions, the following Python code prints out the lowercased tokens of length greater than 15 from Sense and Sensibility (note that one of them occurs twice)." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['incomprehensible', 'incomprehensible', 'disinterestedness', 'companionableness', 'disqualifications']\n" ] } ], "source": [ "import nltk\n", "longwords = [ word.lower() for word in nltk.corpus.gutenberg.words('austen-sense.txt') if len(word) > 15]\n", "print(longwords)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Enumerate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "enumerate is very useful when you want a counter variable for each element in a list." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 a\n", "1 c\n", "2 b\n", "3 d\n" ] } ], "source": [ "x = ['a', 'c', 'b', 'd']\n", "for (index,element) in enumerate(x): print(index, element)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionary comprehensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dictionary comprehensions are just like list comprehensions except they let you build a dictionary instead of a list. Say we want to build a dictionary where the dictionary keys are lowercase ASCII characters and the values are the probabilities for each character. In the following we just assign a random probability to each lowercase character." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.040804314032715325\n", "0.04517300968758016\n" ] } ], "source": [ "import string\n", "import numpy\n", "# set up a random probability distribution over lowercase ASCII characters\n", "counts = [ numpy.random.random() for c in string.ascii_lowercase ]\n", "total = sum(counts)\n", "# the following is a dictionary comprehension\n", "prob = { c: (counts[i] / total) for (i,c) in enumerate(string.ascii_lowercase) }\n", "print(prob['e'])\n", "print(prob['z'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## argmax" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Often we wish to compute the argmax using a probability distribution. The argmax function returns the element that has the highest probability. $$\\hat{x} = \\arg\\max_x P(x)$$" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "w 0.07334002043038697\n" ] } ], "source": [ "def P(c):\n", " return prob[c]\n", "# the character with the highest probability is given by argmax_c P(c)\n", "argmax_char = max(string.ascii_lowercase, key=P)\n", "print(argmax_char, P(argmax_char))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Formatted Strings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Formatted strings, where you want to insert a value into a string, where %s is a string value, %d is a decimal integer, %f is a floating point number." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "x = 10 and y = 0.000300\n" ] } ], "source": [ "print(\"%s = %d and %s = %f\" % (\"x\", 10, \"y\", 0.0003))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The answer is 42.\n" ] } ], "source": [ "print(\"The %(foo)s is %(bar)i.\" % {'foo': 'answer', 'bar':42})" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The answer is 42\n" ] } ], "source": [ "print(\"The {foo} is {bar}\".format(foo='answer', bar=42))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The built-in function `tuple` can be used to create n-grams from a list of words." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "print unigrams aka 1-grams: ['a', 'good', 'book', 'is', 'all', 'you', 'need', '.']\n", "print bigrams aka 2-grams: [('a', 'good'), ('good', 'book'), ('book', 'is'), ('is', 'all'), ('all', 'you'), ('you', 'need'), ('need', '.')]\n", "print trigrams aka 3-grams: [('a', 'good', 'book'), ('good', 'book', 'is'), ('book', 'is', 'all'), ('is', 'all', 'you'), ('all', 'you', 'need'), ('you', 'need', '.')]\n" ] } ], "source": [ "words = ['a', 'good', 'book', 'is', 'all', 'you', 'need', '.']\n", "print(\"print unigrams aka 1-grams: \", end='')\n", "print(words)\n", "\n", "print(\"print bigrams aka 2-grams: \", end='')\n", "print([ tuple(words[i:i+2]) for i in range(len(words)-1) ])\n", "\n", "print(\"print trigrams aka 3-grams: \", end='')\n", "print([ tuple(words[i:i+3]) for i in range(len(words)-2) ])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting" ] }, { "cell_type": "markdown", "metadata":
{}, "source": [ "The function `itemgetter` from the `operator` module provides a concise way to sort a list of tuples on a chosen tuple element. Note that `itemgetter(1)` returns a function that extracts the 2nd component of a tuple; it is used here as the key to sort the tuples." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('the', 1223), ('a', 2413), ('Mr.', 450), ('Elton', 10)]\n", "[('a', 2413), ('the', 1223), ('Mr.', 450), ('Elton', 10)]\n" ] } ], "source": [ "word_freq = [ ('the', 1223), ('a', 2413), ('Mr.', 450), ('Elton', 10) ]\n", "print(word_freq)\n", "from operator import itemgetter\n", "word_freq.sort(key=itemgetter(1), reverse=True)\n", "print(word_freq)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use the built-in `map` function to extract the values from the sorted list." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2413, 1223, 450, 10]\n" ] } ], "source": [ "print(list(map(itemgetter(1), word_freq)))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['a', 'the', 'Mr.', 'Elton']\n" ] } ], "source": [ "print(list(map(itemgetter(0), word_freq)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Classes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A class works pretty much like what you would expect from other languages such as C++ or Java. Methods of a class are determined by indentation. Each method that is part of the class must take at least one argument. The first argument of each method in a class is a reference to the instance on which the method is called. By convention this first argument is called `self`; it is analogous to, but not exactly the same as, the C++ `this` pointer."
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a\n" ] } ], "source": [ "class C:\n", "    def foo(self):\n", "        return self.a\n", "    def bar(self, a):\n", "        self.a = a\n", "x = C()\n", "x.bar('a')\n", "print(x.foo())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Constructor and Destructor methods in a class" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The magic method `__init__` is the constructor method for the class and `__del__` is the destructor method which is called by the garbage collector (Python is similar to Java -- it does not require explicit memory management)." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A\n", "a\n", "aa\n", "aal\n", "aalii\n" ] } ], "source": [ "from itertools import islice\n", "\n", "class FileObject:\n", "    '''Wrapper for file objects to make sure the file gets closed on deletion.'''\n", "\n", "    def __init__(self, filename):\n", "        self.file = open(filename, 'r')\n", "\n", "    def __del__(self):\n", "        self.file.close()\n", "        del self.file\n", "\n", "f = FileObject(dictfile) # dictfile is defined in an earlier cell\n", "for line in islice(f.file, 5):\n", "    print(line,end='')\n", "del f # get rid of f -- this is typically not explicitly done in Python. trust the garbage collector to do it for you." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A class is an iterator if it defines the `__iter__` and `__next__` methods,\n", "as shown in this example."
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "2\n", "3\n", "1\n", "2\n" ] } ], "source": [ "# circular queue\n", "class cq:\n", "    q = [] # needs to be initialized with a list\n", "    def __init__(self,q): # the argument q is a list\n", "        self.q = q\n", "    def __iter__(self):\n", "        return self\n", "    def __next__(self):\n", "        r = self.q[0]\n", "        self.q = self.q[1:] + [r] # rotate the list\n", "        return r\n", "\n", "x = cq([1,2,3])\n", "print(x.__next__())\n", "print(x.__next__())\n", "print(x.__next__())\n", "print(x.__next__())\n", "print(x.__next__())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Magic!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Methods like `__iter__` in the above code for `cq` are called magic methods. Here is a [guide to all Python magic methods](http://anoopsarkar.github.io/nlp-class/cached/magicmethods.pdf)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iteration tools" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `islice` allows you to take a slice of an iterator." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "2\n", "3\n", "1\n", "2\n", "1\n", "2\n", "3\n", "[4, 5, 1, 2, 3, 4, 5, 1, 2, 3]\n" ] } ], "source": [ "from itertools import islice\n", "x = cq([1,2,3])\n", "for i in islice(x, 5):\n", "    print(i)\n", "\n", "y = cq([1,2,3,4,5])\n", "for i in islice(y,3): print(i)\n", "z = [i for i in islice(y,10)]\n", "print(z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convenient Dictionaries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The class `defaultdict` allows convenient insertion into a dictionary. You do not need to check whether a key exists before updating its value when using `defaultdict`."
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "defaultdict(<class 'int'>, {'a': 2}) defaultdict(<class 'list'>, {'b': [1, 2]})\n" ] } ], "source": [ "from collections import defaultdict\n", "foo = defaultdict(int)\n", "bar = defaultdict(list)\n", "foo['a'] += 1\n", "foo['a'] += 1\n", "bar['b'].append(1)\n", "bar['b'].append(2)\n", "print(foo, bar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Prefer generators to lists when you only need to iterate over the values once. A generator produces its values lazily, one at a time, like a stream, whereas a list holds all of its elements in memory at once." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n", "5\n", "14\n", "30\n", "55\n", "91\n", "140\n", "204\n", "285\n", "385\n" ] } ], "source": [ "def sum_of_squares(n):\n", "    v = 0\n", "    for i in range(1,n+1):\n", "        v += i*i\n", "        yield v\n", "for i in sum_of_squares(10): print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generator expressions" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 4, 6, 8]\n", "<generator object <genexpr> at 0x17b552730>\n" ] } ], "source": [ "a = [1,2,3,4] # this is a list\n", "b = [2*x for x in a] # this is a list comprehension\n", "c = (2*x for x in a) # this is a generator expression, not a list.
it creates an iterator object\n", "print(b)\n", "print(c)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 4)\n", "(0, 5)\n", "(1, 4)\n", "(1, 5)\n" ] } ], "source": [ "n = ((a,b) for a in range(0,2) for b in range(4,6))\n", "for i in n:\n", " print(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More on Generators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read [Generator Tricks for Systems Programmers](http://anoopsarkar.github.io/nlp-class/cached/generators.pdf) by David Beazley." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use built-in functions" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "9.43 µs ± 32 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n", "8.34 µs ± 9.61 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n" ] } ], "source": [ "# from Part 2 of Peter Norvig's excellent essay on xkcd 1313 \n", "# http://nbviewer.ipython.org/url/norvig.com/ipython/xkcd1313-part2.ipynb\n", "import re\n", "searcher = re.compile('^a.o').search\n", "data = frozenset('''all particularly just less indeed over soon course still yet before \n", " certainly how actually better to finally pretty then around very early nearly now \n", " always either where right often hard back home best out even away enough probably \n", " ever recently never however here quite alone both about ok ahead of usually already \n", " suddenly down simply long directly little fast there only least quickly much forward \n", " today more on exactly else up sometimes eventually almost thus tonight as in close \n", " clearly again no perhaps that when also instead really most why ago off \n", " especially maybe later well together rather so far once'''.split())\n", "%timeit { s for s in data if searcher(s) }\n", "%timeit set(filter(searcher, 
data))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this run, the built-in function `filter` takes about 12% less time (8.34 µs versus 9.43 µs per loop) than the set comprehension with an `if` clause." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unpacking tuples and dictionaries" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AB\n", "AB\n" ] } ], "source": [ "def concat(x, y):\n", "    return x + y\n", "\n", "foo = ('A', 'B')\n", "bar = {'y': 'B', 'x': 'A'}\n", "\n", "print(concat(*foo))\n", "print(concat(**bar))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exceptions" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n" ] } ], "source": [ "def doit(x,y):\n", "    if x < 0:\n", "        raise ValueError(\"x should be >= 0\")\n", "    return y\n", "\n", "print(doit(0,10))" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "# This will raise an exception, if you uncomment the following line:\n", "# print(doit(-1,10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Advanced Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Easter Eggs" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The Zen of Python, by Tim Peters\n", "\n", "Beautiful is better than ugly.\n", "Explicit is better than implicit.\n", "Simple is better than complex.\n", "Complex is better than complicated.\n", "Flat is better than nested.\n", "Sparse is better than dense.\n", "Readability counts.\n", "Special cases aren't special enough to break the rules.\n", "Although practicality beats purity.\n", "Errors should never pass silently.\n", "Unless explicitly silenced.\n", "In the face of ambiguity, refuse the temptation to guess.\n", "There should be
one-- and preferably only one --obvious way to do it.\n", "Although that way may not be obvious at first unless you're Dutch.\n", "Now is better than never.\n", "Although never is often better than *right* now.\n", "If the implementation is hard to explain, it's a bad idea.\n", "If the implementation is easy to explain, it may be a good idea.\n", "Namespaces are one honking great idea -- let's do more of those!\n", "Gur Mra bs Clguba, ol Gvz Crgref\n", "\n", "Ornhgvshy vf orggre guna htyl.\n", "Rkcyvpvg vf orggre guna vzcyvpvg.\n", "Fvzcyr vf orggre guna pbzcyrk.\n", "Pbzcyrk vf orggre guna pbzcyvpngrq.\n", "Syng vf orggre guna arfgrq.\n", "Fcnefr vf orggre guna qrafr.\n", "Ernqnovyvgl pbhagf.\n", "Fcrpvny pnfrf nera'g fcrpvny rabhtu gb oernx gur ehyrf.\n", "Nygubhtu cenpgvpnyvgl orngf chevgl.\n", "Reebef fubhyq arire cnff fvyragyl.\n", "Hayrff rkcyvpvgyl fvyraprq.\n", "Va gur snpr bs nzovthvgl, ershfr gur grzcgngvba gb thrff.\n", "Gurer fubhyq or bar-- naq cersrenoyl bayl bar --boivbhf jnl gb qb vg.\n", "Nygubhtu gung jnl znl abg or boivbhf ng svefg hayrff lbh'er Qhgpu.\n", "Abj vf orggre guna arire.\n", "Nygubhtu arire vf bsgra orggre guna *evtug* abj.\n", "Vs gur vzcyrzragngvba vf uneq gb rkcynva, vg'f n onq vqrn.\n", "Vs gur vzcyrzragngvba vf rnfl gb rkcynva, vg znl or n tbbq vqrn.\n", "Anzrfcnprf ner bar ubaxvat terng vqrn -- yrg'f qb zber bs gubfr!\n", "The Zen of Python, by Tim Peters\n", "\n", "Beautiful is better than ugly.\n", "Explicit is better than implicit.\n", "Simple is better than complex.\n", "Complex is better than complicated.\n", "Flat is better than nested.\n", "Sparse is better than dense.\n", "Readability counts.\n", "Special cases aren't special enough to break the rules.\n", "Although practicality beats purity.\n", "Errors should never pass silently.\n", "Unless explicitly silenced.\n", "In the face of ambiguity, refuse the temptation to guess.\n", "There should be one-- and preferably only one --obvious way to do it.\n", "Although that 
way may not be obvious at first unless you're Dutch.\n", "Now is better than never.\n", "Although never is often better than *right* now.\n", "If the implementation is hard to explain, it's a bad idea.\n", "If the implementation is easy to explain, it may be a good idea.\n", "Namespaces are one honking great idea -- let's do more of those!\n" ] } ], "source": [ "import this, codecs\n", "print(this.s)\n", "print(codecs.encode(this.s, \"rot-13\")) # -> uryyb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Uncomment the following easter eggs to see what happens." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# from __future__ import braces" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "# import __phello__" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "import antigravity" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Scoping and Namespaces in Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This section is strictly for programming language wonks. Scoping in Python can sometimes be tricky." 
] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1: a\n", "2: b\n", "3: b\n" ] } ], "source": [ "x = 'a'\n", "class wat:\n", "    x = 'b'\n", "    def __init__(self):\n", "        print(\"1:\", x)\n", "        print(\"2:\", self.x)\n", "f = wat()\n", "print(\"3:\", f.x)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1: [1, 2] a\n", "2: [1, 2] a\n" ] } ], "source": [ "x = 'a'\n", "print(\"1:\", list(x for x in (1,2)), x)\n", "print(\"2:\", [x for x in (1,2)], x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more visit http://programmingwats.tumblr.com/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Function Decorators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python has syntactic support for function composition: writing `@foo` above the definition of `bar` is shorthand for `bar = foo(bar)`." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Never in the wrong time or wrong place\n", "Desecration is the smile on my face\n", "\n", "My face my face, hey\n", "Desecration is the smile on my face\n", "\n" ] } ], "source": [ "## function composition of foo with bar: foo(bar)(args) using a decorator\n", "\n", "def foo(f):\n", "    def decorator_func(*args, **keyword_args):\n", "        f(*args, **keyword_args)\n", "        print(\"Desecration is the smile on my face\\n\")\n", "    return decorator_func\n", "\n", "@foo\n", "def bar(n):\n", "    print(n)\n", "bar(\"Never in the wrong time or wrong place\")\n", "\n", "## function composition directly by calling foo(bar_bar)(args)\n", "\n", "def bar_bar(n):\n", "    print(n)\n", "\n", "# notice how I give a function as an argument to foo which returns a function\n", "# and I then provide an argument to that function returned by foo\n", "foo(bar_bar)(\"My face my face, hey\")" ] }, { "cell_type": "markdown", "metadata": {}, "source":
[ "## End" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.core.display import HTML\n", "\n", "\n", "def css_styling():\n", " styles = open(\"../css/notebook.css\", \"r\").read()\n", " return HTML(styles)\n", "css_styling()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }