{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 2, "metadata": { "cell_tags": [] }, "source": [ "Formatting strings in python using the `format()` method" ] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "objectives" ] }, "source": [ "#### Objectives\n", "\n", "* Explain how to format strings using the `format()` method\n", "* Give examples of different formatting options for numbers\n", "* Give examples of different formatting options for text\n", "* Show how to find more options to customize the output of any object, including numbers and text" ] }, { "cell_type": "heading", "level": 3, "metadata": { "cell_tags": [] }, "source": [ "Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most basic way to print something in python is simply to use the `print` command, followed by the string to print:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"King Arthur\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `print` command also accepts variable names:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "castle = \"Camelot\"\n", "print castle" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings and variables can be mixed by using a comma to separate them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"King Arthur lived in\", castle" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> #### Note\n", ">\n", "> A space is automatically added between the different arguments." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the `print` command is easy and works very well for simple cases, but it has no formatting options, and sometimes you may need more flexibility. 
\n", "There are numerous situations when it is useful to print an object in a specific output format." ] }, { "cell_type": "heading", "level": 3, "metadata": { "cell_tags": [] }, "source": [ "Formatting numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When dealing with floating-point numbers, it is common to want to limit the output shown to a few decimals. \n", "As an example, let's use $\\pi$ and $e$, two constants defined in the `math` module:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "print math.pi\n", "print math.e" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our goal is to print \"pi is\" followed by the actual value of $\\pi$ rounded to 2 decimals. Let's start by printing the string we need:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is ...\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In python, every string has a `format()` method. Let's apply that method to our string:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is ...\".format()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When `format()` is applied to a string but given no arguments, the output string is not modified. \n", "However, we now have the structure needed to print any object anywhere in the string. 
\n", "To print the value of $\\pi$ in the string, we must do 2 things:\n", "\n", "* specify in the string where we want to put our variable, which we do by adding curly braces with a zero inside\n", "\n", "* specify in the `format()` method which object should be printed, which we do by adding `math.pi` as argument to `format()`" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is {0}\".format(math.pi)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the string, the curly braces with a 0 inside (`{0}`) are replaced by the first argument in the `format()` method. (Recall that in python, the index starts at 0.) \n", "If we add a second argument in the `format()` method, we can refer to both of those in the string:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is {0} and e is {1}\".format(math.pi, math.e)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The curly braces with 0 (`{0}`) refer to the first argument, the curly braces with 1 (`{1}`) refer to the second argument, etc. \n", "Instead of using the index of the arguments, we can name them to make things even more clear:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is {pi} and e is {e}\".format(pi=math.pi, e=math.e)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The curly braces with `pi` (`{pi}`) refer to the argument named `pi` in the `format()` method, and the curly braces with `e` (`{e}`) refer to the argument named `e`. \n", "The default output of the `format()` method is exactly the same as the output of the `print` command. However, the `format()` method allows us to decide precisely how the objects should be printed. 
To add a specification, the identifier must be followed by a colon and the requested specification:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is {pi:.2f} and e is {e:.3f}\".format(pi=math.pi, e=math.e)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above:\n", "\n", "* $\\pi$ is shown as a floating-point number (indicated by the letter `f`), with 2 decimals (indicated by `.2`)\n", "* $e$ is also shown as a floating-point number (`f`), with 3 decimals (`.3`)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> #### Note\n", ">\n", "> The specification depends on the type of objects to be printed: for instance, limiting the number of decimals would have no meaning for a string. \n", "> However, the specification always follows the same model: `{identifier:specification}`" ] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "challenges" ] }, "source": [ "#### Challenges\n", "\n", "1. How many decimals of $\\pi$ do you know by heart? \n", " Print the value of $\\pi$ with 30 decimals." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is another example involving a large number:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"large is {number}\".format(number=math.pi*1E6)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The default output shows all the digits, but it is possible to use the exponential notation instead:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"large is {number:.2e}\".format(number=math.pi*1E6)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above, the number is shown in exponential notation (indicated by the `e`) and with 2 decimals (indicated by the `.2`). 
\n", "A useful specifier is `g`, which does automatic formatting of numbers:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"pi is {pi:g}\".format(pi=math.pi)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another useful specifier is `%`, which shows percentages:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "students = 24\n", "success = 20\n", "print \"The success rate is {rate:.1%}\".format(rate=float(success)/students)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above, the number is shown as a percentage (indicated by the `%`) and with 1 decimal (indicated by the `.1`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> #### Note\n", ">\n", "> Formatting output is often used in two different situations:\n", ">\n", "> * to show the output to a human; or\n", "> * to save the output in a file.\n", ">\n", "> Those two situations require slightly different formatting specifications:\n", ">\n", "> * To format output for a human, it is best to be totally explicit about the desired layout (number of decimals, field width, etc.). \n", "> * To format output so that it is machine-readable, it is best to use a different separator (typically, a tab), although the visual alignment may not be perfect." ] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "challenges" ] }, "source": [ "#### Challenges\n", "\n", "1. The [*Hubble constant*](http://en.wikipedia.org/wiki/Hubble%27s_law) gives an estimated value of the rate of expansion of the Universe. \n", " Below are a few recent measurements of the Hubble constant (restricted to those that have symmetric errors). \n", " The goal is to output the different values into a machine-readable table. \n", " Therefore, each field should be separated by a tab (use \"\\t\" to insert a tab). 
\n", " The columns should be in the following order: mission, measurement, error. \n", " The numbers should all be shown with one decimal. \n", " Because of the tabs, the visual alignment of the columns will not be perfect; another challenge below will correct that. \n", " \n", " ~~~\n", " missions = [\"Planck\", \"WMAP (9 years)\", \"WMAP (7 years)\", \"Hubble\"]\n", " measurements = [67.80, 69.32, 71.0, 71]\n", " errors = [0.77, 0.80, 2.5, 8]\n", " ~~~\n", " Source: [Wikipedia](http://en.wikipedia.org/wiki/Hubble%27s_law)\n", "\n", "2. If you use $\\LaTeX$, change the output so that the measurement and the error are combined together and ready to be included in a $\\LaTeX$ document (`$measurement \\pm error$`), and leave all available digits." ] }, { "cell_type": "heading", "level": 3, "metadata": { "cell_tags": [] }, "source": [ "Formatting curly braces" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The curly braces are used to indicate a replacement field, thus they do not appear in the output. If you do want to show curly braces, simply double them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"LaTeX makes heavy use of the {{ and }} symbols\".format()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "challenges" ] }, "source": [ "#### Challenges\n", "\n", "1. The errors on measurements can also be asymmetric. \n", " Similarly to the previous challenges, below are a few other recent measurements of the Hubble constant, but this time restricted to those that have asymmetric errors. 
\n", " If you use $\\LaTeX$, create an output suitable for inclusion into a $\\LaTeX$ document, by printing the name of the mission, followed by a tab, and then by the result of the experiment as `$measurement^{+error_up}_{-error_down}$`\n", "\n", " ~~~\n", " missions = [\"WMAP7\", \"WMAP5\", \"Chandra\"]\n", " measurements = [70.4, 71.9, 77.6]\n", " up_errors = [1.3, 2.6, 14.9]\n", " down_errors = [1.4, 2.7, 12.5]\n", " ~~~\n", " Source: [Wikipedia](http://en.wikipedia.org/wiki/Hubble%27s_law)" ] }, { "cell_type": "heading", "level": 3, "metadata": { "cell_tags": [] }, "source": [ "Formatting text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at how to format text using a poem as an example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "author = \"Robert Herrick\"\n", "\n", "life = \"1591-1674\"\n", "\n", "title = \"To the Virgins, to Make Much of Time\"\n", "\n", "poem = '''Gather ye rosebuds while ye may,\n", "Old time is still a-flying;\n", "And this same flower that smiles today\n", "Tomorrow will be dying.\n", "\n", "The glorious lamp of heaven the sun,\n", "The higher he's a-getting,\n", "The sooner will his race be run,\n", "And nearer he's to setting.\n", "\n", "That age is best which is the first,\n", "When youth and blood are warmer;\n", "But being spent, the worse, and worst\n", "Times still succeed the former.\n", "\n", "Then be not coy, but use your time,\n", "And, while ye may, go marry;\n", "For, having lost but once your prime,\n", "You may forever tarry.'''" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Source: [Wikipedia](http://en.wikipedia.org/wiki/To_the_Virgins,_to_Make_Much_of_Time)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by printing a little bit of text, for instance the author name, surrounded by two asterisks:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print 
\"*{text}*\".format(text=author)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have added asterisks on both sides so that we can see the width of the field. By default, the field width is as large as the content (but not larger).\n", "\n", "We can specify the width of the field explicitly:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"*{text:50}*\".format(text=author)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above, the field width is set to be (at least) 50 characters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> #### Note\n", ">\n", "> If the specified field width is smaller than the content, the field width will simply be ignored and the complete content will be shown. \n", "> For instance, `print \"*{text:3}*\".format(text=author)` prints the complete author name, even though the output has more than 3 characters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The default justification for strings is to the left, but it can be changed:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"*{text:<50}*\".format(text=author)\n", "print \"*{text:^50}*\".format(text=author)\n", "print \"*{text:>50}*\".format(text=author)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the examples above, the string is justified to the left (indicated by `<`), center (`^`), and right (`>`). 
\n", "When the field width is larger than the content, we can specify the character used to fill the empty spaces:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"*{text:-^50}*\".format(text=author)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the example above, the string is justified to the center (indicated by `^`), and the fill character is a dash (indicated by `-`). \n", "Let's print each line of the poem, and the length of each line:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for line in poem.split('\\n'):\n", " print \"{length} {line}\".format(length=len(line), line=line)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This shows that the longest line has 38 characters. To output the poem, we can thus specify any field width of 38 or more. \n", "Therefore, if we use a field width of 78 characters surrounded by an asterisk on both sides, each line will have 80 characters in total." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> #### Note\n", ">\n", "> Historically, shells were often restricted to 80 characters in width. \n", "> Although that technical limitation does not apply to modern shells anymore, it is still customary to limit lines to be 80 characters at most." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a final example, let's format the entire poem:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"{title}\".format(title=title)\n", "print \"*\"*80\n", "for line in poem.split('\\n'):\n", " # print line\n", " print \"*{line:^78}*\".format(line=line)\n", "print \"*\"*80\n", "print \"{author:>80}\".format(author=author)\n", "print \"{life:>80}\".format(life=life)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "challenges" ] }, "source": [ "#### Challenges\n", "\n", "1. In a previous challenge, you output a machine-readable table with measurements of the Hubble constant with symmetric errors. \n", " In this challenge, the goal is to output the different values into a table that looks nice to a human. \n", " Therefore, the tabs between the columns should be removed, and the mission column should be set to be 20 characters wide. \n", " As before, the numbers should all be shown with one decimal, and the columns should be in the following order: mission, measurement, error. \n", " \n", " ~~~\n", " missions = [\"Planck\", \"WMAP (9 years)\", \"WMAP (7 years)\", \"Hubble\"]\n", " measurements = [67.80, 69.32, 71.0, 71]\n", " errors = [0.77, 0.80, 2.5, 8]\n", " ~~~\n", " Source: [Wikipedia](http://en.wikipedia.org/wiki/Hubble%27s_law)" ] }, { "cell_type": "heading", "level": 3, "metadata": { "cell_tags": [] }, "source": [ "Links for more information" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The [format specification mini-language](https://docs.python.org/2/library/string.html#formatspec) is part of the [python documentation for strings](https://docs.python.org/2/library/string.html); it describes all the available options (and there are many more than those described here!) 
and gives plenty of examples.\n", "\n", "* [The python documentation for strings](https://docs.python.org/2/library/string.html) also contains an introduction to the use of the `format()` method as well as numerous examples.\n", "\n", "* More details about the `format()` method are available in [the Python Enhancement Proposal (PEP) 3101](http://legacy.python.org/dev/peps/pep-3101/), entitled \"Advanced String Formatting\"." ] }, { "cell_type": "markdown", "metadata": { "cell_tags": [ "keypoints" ] }, "source": [ "#### Key Points\n", "\n", "* Every string has a `format()` method\n", "* Every pair of curly braces in the string will be replaced by the arguments given to the `format()` method\n", "* The arguments can be referred to in two different ways, either by index or by name:\n", " ~~~\n", " # Argument referred to by index:\n", " print \"King Arthur lived in {0}\".format(\"Camelot\")\n", " ~~~\n", " ~~~\n", " # Argument referred to by name (preferred, as it leads to clearer code):\n", " print \"King Arthur lived in {castle}\".format(castle=\"Camelot\")\n", " ~~~\n", "* The custom specification (if desired) is added inside the curly braces, following a colon\n", "* A typical example of custom formatting for a number would be:\n", " ~~~\n", " print \"pi is {pi:.2f}\".format(pi=math.pi)\n", " ~~~\n", "* A typical example of custom formatting for text would be:\n", " ~~~\n", " print \"*{king:^50}*\".format(king=\"King Arthur\")\n", " ~~~\n", "* The [format specification mini-language](https://docs.python.org/2/library/string.html#formatspec) gives all the custom specification options" ] } ], "metadata": {} } ] }