{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Strings\n", "\n", "Number objects are useful for storing values which are, well, numbers. But what \n", "if we want to store a sentence? Enter _strings_!" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "a = \"What's orange and sounds like a parrot?\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Strings can be joined with `+`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "b = 'A carrot'\n", "a + ' ' + b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "And, amazingly, they can be multiplied by whole numbers, which repeats them." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "c = 'omg'\n", "10 * c" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We’ve specified strings _literally_, _in_ the source code, by wrapping the text \n", "with single quotes or double quotes. There’s no difference; most people choose \n", "one and stick with it.\n", "\n", "Switching to the other kind is useful if your text contains the quote character. If it \n", "contains both, you can _escape_ the quote mark by preceding it with a \n", "backslash. This tells Python that the quote is part of the string you want, and \n", "not the ending quote." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "fact = \"Gary's favourite word is \\\"python\\\".\"\n", "fact" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Python displays this string by surrounding it with _single_ quotes, so it escapes \n", "the single quote inside it. This is useful because we can copy-paste the \n", "string into some Python code to use it somewhere else, without having to worry \n", "about escaping things.\n", "\n", "We can create multi-line strings by using three quotation marks on each side. \n", "By convention, double quotes are used for these." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "long_fact = \"\"\"This is a long string.\n", "Quite long indeed.\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(long_fact)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Creating strings like this is useful when you want to include line breaks in \n", "your string. You can also use `\\n` in strings to insert line breaks." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "'This is a long string\\n\\nQuite long indeed.\\n'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can convert things to strings by using the built-in `str` function, which also \n", "creates an _empty_ string when called with no arguments." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "str()  # an empty string, the same as ''" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "'A number: ' + str(999 - 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Strings are objects, and have lots of useful methods attached to them. If you \n", "want to know how many characters are in a string, you use the built-in `len` \n", "function." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "b_uppercase = b.upper()\n", "print(b_uppercase)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_lowercase = b_uppercase.lower()\n", "print(b_lowercase)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "b.upper().lower()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "b.replace('carrot', 'parrot').replace(' ', '_')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "len(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Notice that none of these operations _modify_ the value of the `b` variable. \n", "Operations on strings _always_ return _new_ strings. 
Strings are said to be \n", "_immutable_ for this reason: you can never change a string, just make new ones." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "str1 = \"aoeueoa\"\n", "str2 = str1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "str1.upper()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Formatting\n", "\n", "One of the most common things you’ll find yourself doing with strings is \n", "interleaving values into them. For example, you’ve finished an amazing \n", "analysis, and want to print the results." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "result1 = 42.0\n", "result2 = 123.21\n", "print('My results are: ' + str(result1) + ', ' + str(result2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "\n", "This is already quite ugly, and will only get worse with more results. We can \n", "instead use an _f-string_ (a \"formatted\" string): place an `f` in front of the \n", "string and use the special `{}` placeholders to say where the values should go." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "output = f'My results are: {result1}, {result2}'\n", "print(output)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Alternatively, we can create a string without the `f` in front and insert the values later, using the `format` method that’s available on strings. This is the older approach, but it also lets us define a template once and fill it in afterwards." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "template = 'My results are: {}, {}'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(template.format(result1, result2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Much better! We define the whole string at once, and then place the missing \n", "values in later.\n", "\n", "We can add numbers inside the placeholders, `{0}` and `{1}`, which correspond to the indices \n", "of the arguments passed to the `format` method, where `0` is the first \n", "argument, `1` is the second, and so on. By referencing positions like this, we \n", "can easily repeat placeholders in the string, but only pass the values once to \n", "`format`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "template = 'My results are: {1}, {0}. The best is {0}, obviously.' # no need to start with 0 here" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(template.format(result1, result2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "You can also use _named_ placeholders, then passing the values to `format` \n", "using the same name." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "template3 = 'My results are: {best}, {worst}. 
But the best is {best}, obviously.'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(template3.format(best=result1, worst=result2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "But remember f-strings: if we don't need to do anything fancy, they are a lot more convenient (and cover the 99% use case)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "f'My results are: {result1}, {result2}. But the best is {result1}, obviously.'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(template3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Named placeholders are nice because they give more meaning to what each placeholder is for.\n", "\n", "There’s [a lot you can do inside the placeholders](https://docs.python.org/3/tutorial/inputoutput.html#the-string-format-method), such as specifying that you want to format a number with a certain number of decimal places." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(f'This number is great: {result1 * 0.000001:.5f}')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "If you want to print a literal curly brace using `format` or an f-string, you will need to\n", "escape it by doubling it, so that `{{` will become `{` and `}}` will become `}`.\n", "Here's an example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "print(f'This number will be surrounded by curly braces: {{{result1}}}')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "The innermost `{result1}` is replaced with the number, and `{{...}}` becomes `{...}`." ] } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython" }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python" } }, "nbformat": 4, "nbformat_minor": 4 }