{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Effective Python - 59 Specific Ways to Write Better Python. \n", "# *Chapter 2 - Functions*\n", "Book by Brett Slatkin. \n", "Summary notes by Tyler Banks." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 14: Prefer Exceptions to Returning `None`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Instead of returning `None` as a return type, `raise` exceptions from helper functions. `except` them from the calling function\n", "* Functions returning `None` as a special meaning are error prone because `None` and other values like `0` or the empty string evaluate to False as well" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [], "source": [ "#Bad\n", "def divide(a, b):\n", " try:\n", " return a / b\n", " except ZeroDivisionError:\n", " return None\n", " \n", "x, y = 0, 5 \n", "result = divide(x, y)\n", "if result is None:\n", " print('Invalid inputs') \n" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The result is 2.5\n" ] } ], "source": [ "#Better\n", "def divide(a, b):\n", " try:\n", " return a / b\n", " except ZeroDivisionError as e:\n", " raise ValueError('Invalid inputs') from e\n", " \n", "x, y = 5, 2\n", "try:\n", " result = divide(x, y)\n", "except ValueError:\n", " print('Invalid inputs')\n", "else:\n", " print('The result is %.1f' % result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 15: Know How Closures Interact with Variable Scope" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Python suports closures: functions that refer to variables from the scope in which they were defined. The helper functions are able to access the group argument, shown below.\n", "* Functions are first-class objects. You can refer to them directly, assign them to variables, pass them as arguments to functions, compare them with if statemens. Sort method below is accepting `helper` function." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "def sort_priority(values, group):\n", " def helper(x):\n", " if x in group:\n", " return (0, x)\n", " return (1, x)\n", " values.sort(key=helper)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When referencing a variable, the interpreter will traverse the scope in the following order\n", "1. Current function scope\n", "2. Any enclosing scopes (other containing functions) (above)\n", "3. Scope of the module that contains the code (global scope)\n", "4. The built-in scope (that contains functions like `len` and `str`)\n", "\n", "If no places have the defined variable `NameError` exception is raised\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Assignment works differently, if the variable doesn't exist in the current scope Python treats the assignment as a variable definition\n", "* Example:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "def sort_priority2(numbers, group):\n", " found = False # Scope: 'sort_priority'’\n", " def helper(x):\n", " if x in group:\n", " found = True # Scope: 'helper' this is bad!\n", " return (0, x)\n", " return (1, x)\n", " numbers.sort(key=helper)\n", " return found" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To fix this, use `nonlocal`\n", "* NOTE: `nonlocal` does not resolve up to the module level" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [], "source": [ "def sort_priority3(numbers, group):\n", " found = False\n", " def helper(x):\n", " nonlocal found\n", " if x in group:\n", " found = True\n", " return (0, x)\n", " return (1, x)\n", " numbers.sort(key=helper)\n", " return found" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " * The author cautions against using `nonlocal` in most cases outside simple functions like this" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 16: Consider Generators Instead of Returning Lists" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Considering the following code that returns a list of indexes for the start of each word" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0, 5, 8, 13, 18, 21]\n" ] } ], "source": [ "def index_words(text):\n", " result = []\n", " if text:\n", " result.append(0)\n", " for index, letter in enumerate(text):\n", " if letter == ' ':\n", " result.append(index + 1)\n", " return result\n", "\n", "words = 'This is some text to test'\n", "result = index_words(words)\n", "print(result[:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Problems:\n", "* Noisy and dense\n", "* With each new result the append method is called\n", "\n", "\n", "Resolution:\n", "* Use a *generator* \n", "* Generators use `yeild` expressions\n", "* A generator returns an iterator instead of the result\n", "* With each call to the built-in `next` funtion the iterator will advance the generator to its next yield statement\n", "\n", "\n", "Ex:" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0, 5, 8, 13, 18, 21]\n" ] } ], "source": [ "def index_words_iter(text):\n", " if text:\n", " yield 0\n", " for index, letter in enumerate(text):\n", " if letter == ' ':\n", " yield index + 1\n", "\n", "result = list(index_words_iter(words))\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 17: Be Defensive When Iterating Over Arguments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Beware of functions that iterate over inputs multiple times. If the arguments are iterators you may see strange behavior\n", "* Consider the following:" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "def normalize(numbers):\n", " total = sum(numbers)\n", " result = []\n", " for value in numbers:\n", " percent = 100 * value / total\n", " result.append(percent)\n", " return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* This works when using lists but if the list is too big we'll need to suply a generator.\n", "* However, generators can only be used once and `numbers` is iterated over multiple times. Note: iterators raise the `StopIteration` when exhausted.\n", "* You could just make a copy of the list at the beginning (`numbers = list(numbers)`) and use that but this would again, only work for small inputs\n", "* A better solution would be to suppy an iterator instead\n" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "def read_visits(data_path):\n", " with open(data_path) as f:\n", " for line in f:\n", " yield int(line)\n", "\n", "def normalize_func(get_iter):\n", " total = sum(get_iter()) # New iterator\n", " result = []\n", " for value in get_iter(): # New iterator\n", " percent = 100 * value / total\n", " result.append(percent)\n", " return result\n", "\n", "path = \"data/none.txt\"\n", "#percentages = normalize_func(lambda: read_visits(path))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* However, this looks terrible\n", "* The *best* solution is to write a class that implements the `__iter__` function\n", "* This is known as a container" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "class ReadVisits(object):\n", " def __init__(self, data_path):\n", " self.data_path = data_path\n", " def __iter__(self):\n", " with open(self.data_path) as f:\n", " for line in f:\n", " yield int(line)\n", " \n", "def normalize_defensive(numbers):\n", " if iter(numbers) is iter(numbers): # An iterator - bad!\n", " raise TypeError('You have to supply a container!')\n", " total = sum(numbers)\n", " result = []\n", " for value in numbers:\n", " percent = 100 * value / total\n", " result.append(percent)\n", " return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* You can detect that a value is an iterator (instead of a container) if calling iter on the iterator produces the same result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 18: Reduce Visual Noise with Variable Positional Arguments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Use `*args` for optional parameters to clear up noise\n", "* Compare the following:" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "nums are: 1, 2\n", "yo\n" ] } ], "source": [ "def log(message, values):\n", " if not values:\n", " print(message)\n", " else:\n", " values_str = ', '.join(str(x) for x in values)\n", " print('%s: %s' % (message, values_str))\n", " \n", "log('nums are', [1, 2])\n", "log('yo', [])" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "nums are: [1, 2]\n", "yo\n" ] } ], "source": [ "def log(message, *values):\n", " if not values:\n", " print(message)\n", " else:\n", " values_str = ', '.join(str(x) for x in values)\n", " print('%s: %s' % (message, values_str))\n", " \n", "log('nums are', [1, 2])\n", "log('yo') #Nicer" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* You can also pass lists as variable arguments using `*`" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "faves: 7, 33, 99\n" ] } ], "source": [ "favorites = [7, 33, 99]\n", "log('faves', *favorites)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Problems with accepting variable number of position arguments:\n", "* Variable arguments are always turned into tuples before passed, this will mean that a generator will exhaust its iterator and possibly consume mass amounts of memory. Limit the use of `*args` to small inputs.\n", "* You can't add new positional arguments to the functions in the future without migrating the caller functions. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 19: Provide Optional Behavior with Keyword Arguments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* All positional arguments to functions can be passed by keyword. Ex:" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6\n", "6\n", "6\n", "6\n" ] } ], "source": [ "def remainder(number, divisor):\n", " return number % divisor\n", "\n", "print(remainder(20, 7))\n", "print(remainder(20, divisor=7))\n", "print(remainder(number=20, divisor=7))\n", "print(remainder(divisor=7, number=20))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Positional arguments MUST be specified before positional arguments\n", "* Each argument can only be specified once" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "ename": "SyntaxError", "evalue": "positional argument follows keyword argument (, line 1)", "output_type": "error", "traceback": [ "\u001b[1;36m File \u001b[1;32m\"\"\u001b[1;36m, line \u001b[1;32m1\u001b[0m\n\u001b[1;33m remainder(number=20, 7)\u001b[0m\n\u001b[1;37m ^\u001b[0m\n\u001b[1;31mSyntaxError\u001b[0m\u001b[1;31m:\u001b[0m positional argument follows keyword argument\n" ] } ], "source": [ "remainder(number=20, 7)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "remainder(20, number=7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Keyword args provide several benefits:\n", "1. Readability (`remainder(20,7)` is not as clear as `remainder(number=20,divisor=7)`)\n", "2. Default values. In the method header you can specify default functionality by `def remainder(number, divisor=1):`\n", "3. Provide a way to remain backwards compatible with existing callers, allowing extended function without migration." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 20: Use None and Docstrings to Specify Dynamic Default Arguments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Important if you want to specify dynamic values when method is called with keyword arguments\n", "* Ex:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "from time import sleep\n", "def log(message, when=datetime.now()):\n", " print('%s: %s' % (when, message))\n", "log('hi')\n", "sleep(0.1)\n", "log('hi')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Result is the same because python evaluates the method definition on module load.\n", "* Use none to specify a different behavior, dynamically" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def log(message, when=None):\n", " \"\"\"Log a message with a timestamp.\n", " Args:\n", " message: Message to print.\n", " when: datetime of when the message occurred.\n", " Defaults to the present time.\n", " \"\"\"\n", " when = datetime.now() if when is None else when\n", " print('%s: %s' % (when, message))\n", "\n", "log('hi')\n", "sleep(0.1)\n", "log('hi')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The same is true for when you return a blank dictionary. You'll end up returning the same object reference.\n", "* Ex, with fix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "#Bad\n", "def decode(data, default={}):\n", " try:\n", " return json.loads(data)\n", " except ValueError:\n", " return default\n", " \n", "foo = decode('bad')\n", "foo['a'] = 5\n", "bar = decode('another bad')\n", "bar['b'] = 1\n", "print('Bad results')\n", "print('a: ', foo)\n", "print('b: ', bar)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Fixed \n", "def decode(data, default=None):\n", " \"\"\"Load JSON data from a string.\n", " Args:\n", " data: JSON data to decode.\n", " default: Value to return if decoding fails.\n", " Defaults to an empty dictionary.\n", " \"\"\"\n", " if default is None:\n", " default = {}\n", " try:\n", " return json.loads(data)\n", " except ValueError:\n", " return default\n", "\n", "foo = decode('bad')\n", "foo['a'] = 5\n", "bar = decode('another bad')\n", "bar['b'] = 1\n", "print('Good results')\n", "print('a: ', foo)\n", "print('b: ', bar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Item 21: Enforce Clarity with Keyword-Only Arguments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Keyword arguments make intention clear\n", "* Use keyword only args to force callers to supply keyword args for confusing functions\n", "* Use the `*` symbol to indicate the end of positional arguments\n", "* See the difference:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def safe_division_b(number, divisor,\n", " ignore_overflow=False,\n", " ignore_zero_division=False):\n", " return 0\n", "\n", "def safe_division_c(number, divisor, *, #Note the star\n", " ignore_overflow=False,\n", " ignore_zero_division=False):\n", " return 0" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n" ] } ], "source": [ "print(safe_division_b(1,2,True,False))" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "ename": "NameError", "evalue": "name 'safe_division_c' is not defined", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0msafe_division_c\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;32mTrue\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;32mFalse\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mNameError\u001b[0m: name 'safe_division_c' is not defined" ] } ], "source": [ "print(safe_division_c(1,2,True,False))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }