{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[Python for Developers](http://ricardoduarte.github.io/python-for-developers/#content)\n", "===================================\n", "First Edition\n", "-----------------------------------\n", "\n", "Chapter 11: Standard Library\n", "=============================\n", "_____________________________\n", "It's often said that Python comes with \"batteries included\", in reference to the vast library of modules and packages that are distributed with the interpreter.\n", "\n", "Some important modules of the standard library:\n", "\n", "+ Math: math, cmath, and random decimal.\n", "+ System: os, glob, subprocess and shutils.\n", "+ Threads: threading.\n", "+ Persistence: pickle and cPickle.\n", "+ XML: xml.dom, xml.sax and ElementTree (since version 2.5).\n", "+ Configuration: ConfigParser and optparse.\n", "+ Time: time and datetime.\n", "+ Other: sys, logging, traceback, types and timeit.\n", "\n", "Maths\n", "----------\n", "In addition to the *builtin* numeric types interpreter in the Python standard library there are several modules devoted to implement other types and mathematical operations.\n", "\n", "The *math* module defines logarithmic, exponentiation, trigonometric, hyperbolic functions and angular conversions, among others. The *cmath* module, implements similar functions, but made to handle complex numbers.\n", "\n", "example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "\n", "import cmath\n", "\n", "# Complex\n", "for cpx in [3j, 1.5 + 1j, -2 - 2j]:\n", "\n", " # Polar coordinates convertion\n", " plr = cmath.polar(cpx)\n", " print 'Complex:', cpx\n", " print 'Polar:', plr, '(in radians)'\n", " print 'Amplitude:', abs(cpx)\n", " print 'Angle:', math.degrees(plr[1]), '(grades)'" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Complex: 3j\n", "Polar: (3.0, 1.5707963267948966) (in radians)\n", "Amplitude: 3.0\n", "Angle: 90.0 (grades)\n", "Complex: (1.5+1j)\n", "Polar: (1.8027756377319946, 0.5880026035475675) (in radians)\n", "Amplitude: 1.80277563773\n", "Angle: 33.690067526 (grades)\n", "Complex: (-2-2j)\n", "Polar: (2.8284271247461903, -2.356194490192345) (in radians)\n", "Amplitude: 2.82842712475\n", "Angle: -135.0 (grades)\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *random* module brings functions for random number generation.\n", "\n", "Examples:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import random\n", "import string\n", "\n", "# Choose a letter\n", "print random.choice(string.ascii_uppercase)\n", "\n", "# Choose a number from 1 to 10\n", "print random.randrange(1, 11)\n", "\n", "# Escolha um float no intervalo de 0 a 1\n", "print random.random()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "B\n", "2\n", "0.117017323204\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the standard library there is the decimal module that defines operations with real numbers with fixed precision.\n", "\n", "example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from decimal import Decimal\n", "\n", "t = 5.\n", "for i in range(50):\n", " t = t - 0.1\n", "\n", "print 'Float:', t\n", "\n", "t = Decimal('5.')\n", "for i in range(50):\n", " t = t - Decimal('0.1')\n", "\n", "print 'Decimal:', t" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Float: 1.02695629778e-15\n", "Decimal: 0.0\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this module, it is possible to reduce the introduction of rounding errors arising from floating point arithmetic.\n", "\n", "In version 2.6, the module *fractions*, which deals with rational numbers, is also available.\n", "\n", "example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from fractions import Fraction\n", "\n", "# Three fractions\n", "f1 = Fraction('-2/3')\n", "f2 = Fraction(3, 4)\n", "f3 = Fraction('.25')\n", "print \"Fraction('-2/3') =\", f1\n", "print \"Fraction('3, 4') =\", f2\n", "print \"Fraction('.25') =\", f3\n", "\n", "# Sum\n", "print f1, '+', f2, '=', f1 + f2\n", "print f2, '+', f3, '=', f2 + f3" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Fraction('-2/3') = -2/3\n", "Fraction('3, 4') = 3/4\n", "Fraction('.25') = 1/4\n", "-2/3 + 3/4 = 1/12\n", "3/4 + 1/4 = 1\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fractions can be initialized in several ways: as a *string*, as a pair of integer or as a real number. The module also has a function called `gcd ()` which calculates the greatest common divisor (gcd) of two integers.\n", "\n", "Files and I/O\n", "--------------\n", "Files in Python are represented by objects of type *file*, which offer various methods for file operations. Files can be opened for reading ('r', which is the default), write ('w') or addition ('a'), in text or binary ('b') mode.\n", "\n", "In Python:\n", "\n", "+ *sys.stdin* is the standard input.\n", "+ *sys.stdout* is the standard output.\n", "+ *sys.stderr* is the standard error output.\n", "\n", "The standard input, output and error are handled by Python as open files. The input in read mode and the other in the recording mode.\n", "\n", "Sample of writing:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sys\n", "\n", "# Create an object of type file\n", "temp = open('temp.txt', 'w')\n", "\n", "# Writing output\n", "for i in range(20):\n", " temp.write('%03d\\n' % i)\n", "\n", "temp.close()\n", "\n", "temp = open('temp.txt')\n", "\n", "# Write in terminal\n", "for x in temp:\n", " # writing in sys.stdout sends\n", " # text to standard output\n", " sys.stdout.write(x)\n", "\n", "temp.close()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "000\n", "001\n", "002\n", "003\n", "004\n", "005\n", "006\n", "007\n", "008\n", "009\n", "010\n", "011\n", "012\n", "013\n", "014\n", "015\n", "016\n", "017\n", "018\n", "019\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "At each iteration in the second loop, the object returns a line from the file each time.\n", "\n", "Reading example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sys\n", "import os.path\n", "\n", "fn = 'test.txt'\n", "\n", "if not os.path.exists(fn):\n", " print 'Try again...'\n", " sys.exit()\n", "\n", "# Numbering lines\n", "for i, s in enumerate(open(fn)):\n", " print i + 1, s," ], "language": "python", "metadata": {}, "outputs": [ { "ename": "SystemExit", "evalue": "", "output_type": "pyerr", "traceback": [ "An exception has occurred, use %tb to see the full traceback.\n", "\u001b[1;31mSystemExit\u001b[0m\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Try again...\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "To exit: use 'exit', 'quit', or Ctrl-D.\n" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is possible to read all the lines with the method `readlines()`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Prints a list with all the lines from a file\n", "print open('temp.txt').readlines()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['000\\n', '001\\n', '002\\n', '003\\n', '004\\n', '005\\n', '006\\n', '007\\n', '008\\n', '009\\n', '010\\n', '011\\n', '012\\n', '013\\n', '014\\n', '015\\n', '016\\n', '017\\n', '018\\n', '019\\n']\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The objects of type file also have the method `seek()`, which allow going to any position in the file.\n", "\n", "In version 2.6, it is available the module *io*, which implements separately the file operations and the text manipulation routines.\n", "\n", "File Systems\n", "-------------------\n", "Modern operating systems store files in hierarchical structures called *file systems*.\n", "\n", "Several features related to file systems are implemented in the module *os.path*, such as: \n", "\n", "+ `os.path.basename()`: returns the final component of a path.\n", "+ `os.path.dirname()`: returns a path without the final component.\n", "+ `os.path.exists()`: returns *True* if the path exists or *False* otherwise.\n", "+ `os.path.getsize()`: returns the size of the file in *bytes*.\n", "\n", "*glob* is another module related with the file system:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os.path\n", "import glob\n", "\n", "# Shows a list of file names\n", "# and their respective sizes \n", "for arq in sorted(glob.glob('*.py')):\n", " print arq, os.path.getsize(arq)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *glob.glob()* function returns a list of filenames that meet the criteria passed as a parameter in a similar way to `ls` command available on UNIX systems.\n", "\n", "Temporary files\n", "--------------------\n", "The module *os* implements some functions to facilitate the creation of temporary files, freeing the developer from some concerns, such as:\n", "\n", "+ Avoiding collisions with names of files that are in use.\n", "+ Identifying the appropriate area of the file system for temporary files (which varies by operating system).\n", "+ Not exposing the implementation risks (temporary area is used by other processes).\n", "\n", "Example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "\n", "text = 'Test'\n", "# creates a temporary file\n", "temp = os.tmpfile()\n", "\n", "# writes in the temp file\n", "temp.write(text)\n", "\n", "# Go back to the beginning the the file\n", "temp.seek(0)\n", "\n", "# Shows file content\n", "print temp.read()\n", "\n", "# Closes file\n", "temp.close()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Test\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Existe tamb\u00e9m a fun\u00e7\u00e3o `tempnam()`, que retorna um nome v\u00e1lido para arquivo tempor\u00e1rio, incluindo um caminho que respeite as conven\u00e7\u00f5es do sistema operacional. Por\u00e9m, fica por conta do desenvolvedor garantir que a rotina seja usada de forma a n\u00e3o comprometer a seguran\u00e7a da aplica\u00e7\u00e3o.\n", "\n", "Arquivos compactados\n", "--------------------\n", "O Python possui m\u00f3dulos para trabalhar com v\u00e1rios formatos de arquivos compactados.\n", "\n", "Exemplo de grava\u00e7\u00e3o de um arquivo \u201c.zip\u201d:\n", "\n", "There is also the `tempnam()` function, which returns a valid name for temporary file, including a path that respects the conventions of the operating system. However, it is up to the developer to ensure that the routine is used so as not to compromise the security of the application.\n", "\n", "Compressed files\n", "--------------------\n", "Python has modules to work with multiple formats of compressed files.\n", "\n", "Example of writing a \".zip\" file:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\"\n", "Writing text in a compressed file\n", "\"\"\"\n", "\n", "import zipfile\n", "\n", "text = \"\"\"\n", "**************************************\n", "This text will be compressed and ...\n", "... stored inside a zip file.\n", "***************************************\n", "\"\"\"\n", "\n", "# Creates a new zip\n", "zip = zipfile.ZipFile('arq.zip', 'w',\n", " zipfile.ZIP_DEFLATED)\n", "\n", "# Writes a string in zip as if it were a file\n", "zip.writestr('text.txt', text)\n", "\n", "# closes the zip\n", "zip.close()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reading example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\"\n", "Reading a compressed file\n", "\"\"\"\n", "\n", "import zipfile\n", "\n", "# Open the zip file for reading \n", "zip = zipfile.ZipFile('arq.zip')\n", "\n", "# Gets a list of compressed files\n", "arqs = zip.namelist()\n", "\n", "for arq in arqs:\n", " # Shows the file name\n", " print 'File:', arq\n", " # get file info\n", " zipinfo = zip.getinfo(arq)\n", " print 'Original size:', zipinfo.file_size\n", " print 'Compressed size:', zipinfo.compress_size\n", "\n", " # Shows file content\n", " print zip.read(arq)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "File: text.txt\n", "Original size: 147\n", "Compressed size: 75\n", "\n", "**************************************\n", "This text will be compressed and ...\n", "... stored inside a zip file.\n", "***************************************\n", "\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python also provides modules for gzip, bzip2 and tar formats that are widely used in UNIX environments.\n", "\n", "Data file\n", "----------------\n", "In the standard library, Python also provides a module to simplify the processing of files in CSV (*Comma Separated Values*) format.\n", "\n", "In CSV format, the data is stored in text form, separated by commas, one record per line.\n", "\n", "Writing example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import csv\n", "\n", "# Data\n", "dt = (('temperatura', 15.0, 'C', '10:40', '2006-12-31'),\n", " ('peso', 42.5, 'kg', '10:45', '2006-12-31'))\n", "\n", "# A writing routine which receives one object of type file\n", "out = csv.writer(file('dt.csv', 'w'))\n", "\n", "# Writing the tuples in file\n", "out.writerows(dt)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Reading example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import csv\n", "\n", "# The reading routine receives a file object\n", "dt = csv.reader(file('dt.csv'))\n", "\n", "# For each record in file, prints\n", "for reg in dt:\n", " print reg" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['temperature', '15.0', 'C', '10:40', '2006-12-31']\n", "['weight', '42.5', 'kg', '10:45', '2006-12-31']\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "O formato CSV \u00e9 aceito pela maioria das planilhas e sistemas de banco de dados para importa\u00e7\u00e3o e exporta\u00e7\u00e3o de informa\u00e7\u00f5es.\n", "\n", "Sistema operacional\n", "-------------------\n", "Al\u00e9m do sistema de arquivos, os m\u00f3dulos da biblioteca padr\u00e3o tamb\u00e9m fornecem acesso a outros servi\u00e7os providos pelo sistema operacional.\n", "\n", "Exemplo:\n", "\n", "\n", "The CSV format is supported by most spreadsheet and databases for data import and export.\n", "\n", "Operating System\n", "-------------------\n", "Apart from the file system, the modules of the standard library also provides access to other services provided by the operating system.\n", "\n", "example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "import sys\n", "import platform\n", "\n", "def uid():\n", " \"\"\"\n", " uid() -> returns the current user identification\n", " or None if not possible to identify\n", " \"\"\"\n", "\n", " # Ambient variables for each operating system\n", " us = {'Windows': 'USERNAME',\n", " 'Linux': 'USER'}\n", "\n", " u = us.get(platform.system())\n", " return os.environ.get(u)\n", "\n", "print 'User:', uid()\n", "print 'plataform:', platform.platform()\n", "print 'Current dir:', os.path.abspath(os.curdir)\n", "exep, exef = os.path.split(sys.executable)\n", "print 'Executable:', exef\n", "print 'Executable dir:', exep" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "User: csig\n", "plataform: Linux-3.11.0-15-generic-x86_64-with-Ubuntu-13.10-saucy\n", "Current dir: /home/csig/teste/python-for-developers/Chapter11\n", "Executable: python\n", "Executable dir: /home/csig/env/teste/bin\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Process execution example:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sys\n", "from subprocess import Popen, PIPE\n", "\n", "# ping\n", "cmd = 'ping -c 1 '\n", "# No Windows\n", "if sys.platform == 'win32':\n", " cmd = 'ping -n 1 '\n", "\n", "# Local just for testing\n", "host = '127.0.0.1'\n", "\n", "# Comunicates with another process\n", "# a pipe with the command stdout\n", "py = Popen(cmd + host, stdout=PIPE)\n", "\n", "# Shows command output\n", "print py.stdout.read()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *subprocess* module provides a generic way of running processes with Popen() function which allows communication with the process through operating system pipes.\n", "\n", "Time\n", "-----\n", "Python has two modules to handle time:\n", "\n", "+ *Time*: implements functions that allow using the time generated by the system.\n", "+ *Datetime*: implements high-level types to perform date and time operations.\n", "\n", "Example with time:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import time\n", "\n", "# localtime() Returns a date and local time in the form \n", "# of a structure called struct_time, which is a \n", "# collection with the items: year, month, day, hour, minute,\n", "# secund, day of the week, day of the year and e daylight saving time\n", "print time.localtime()\n", "\n", "# asctime() returns a date and hour with string, according to\n", "# operating system configuration\n", "print time.asctime()\n", "\n", "# time() returns system time in seconds\n", "ts1 = time.time()\n", "\n", "# gmtime() converts seconds to struct_time\n", "tt1 = time.gmtime(ts1)\n", "print ts1, '->', tt1\n", "\n", "# Adding an hour\n", "tt2 = time.gmtime(ts1 + 3600.)\n", "\n", "# mktime() converts struct_time to seconds\n", "ts2 = time.mktime(tt2)\n", "print ts2, '->', tt2\n", "\n", "# clock() returs time since the program started, in seconds\n", "print 'The program took', time.clock(), \\\n", " 'seconds up to now...'\n", "\n", "# Counting seconds...\n", "for i in xrange(5):\n", "\n", " # sleep() waits the number of seconds specified as parameter\n", " time.sleep(1)\n", " print i + 1, 'second(s)'" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "time.struct_time(tm_year=2014, tm_mon=2, tm_mday=4, tm_hour=13, tm_min=51, tm_sec=28, tm_wday=1, tm_yday=35, tm_isdst=1)\n", "Tue Feb 4 13:51:28 2014\n", "1391529088.79 -> time.struct_time(tm_year=2014, tm_mon=2, tm_mday=4, tm_hour=15, tm_min=51, tm_sec=28, tm_wday=1, tm_yday=35, tm_isdst=0)\n", "1391543488.0 -> time.struct_time(tm_year=2014, tm_mon=2, tm_mday=4, tm_hour=16, tm_min=51, tm_sec=28, tm_wday=1, tm_yday=35, tm_isdst=0)\n", "The program took 1.59 seconds up to now...\n", "1" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " second(s)\n", "2" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " second(s)\n", "3" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " second(s)\n", "4" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " second(s)\n", "5" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " second(s)\n" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "In *datetime*, four types are defined for representing time:\n", "\n", "+ *datetime*: date and time.\n", "+ *date*: just date.\n", "+ *time*: just time.\n", "+ *timedelta*: time diference.\n", "\n", "Exemple:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import datetime\n", "\n", "# datetime() receives as parameter:\n", "# year, month, day, hour, minute, second and \n", "# returns an object of type datetime\n", "dt = datetime.datetime(2020, 12, 31, 23, 59, 59)\n", "\n", "# Objects date and time can be created from\n", "# a datetime object\n", "date = dt.date()\n", "hour = dt.time()\n", "\n", "# How many time to 12/31/2020\n", "dd = dt - dt.today()\n", "\n", "print 'Date:', date\n", "print 'Hour:', hour\n", "print 'How many time to 12/31/2020:', dd\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Date: 2020-12-31\n", "Hour: 23:59:59\n", "How many time to 12/31/2020: 2522 days, 10:02:22.939467\n" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Objects of types *date* and *datetime* return dates in ISO format.\n", "\n", "Regulares expressions\n", "--------------------\n", "Regular expression is a form of identifying patterns in character strings. In Python, the *re* module provides a syntactic parser that allows the use of such expressions. The patterns are defined by characters that have special meaning to the parser.\n", "\n", "Main characteres:\n", "\n", "+ Point (`.`): In standard mode means any character except the newline.\n", "+ Circunflex (`^`): In standard mode, means beginning of the string.\n", "+ Dollar (`$`): In standard mode, means end of the string.\n", "+ Backslash (`\\`): Escape character, allows using special chars as normal chars.\n", "+ Brackets (`[]`): Any character of the listed inside the brackets.\n", "+ Asterisk (`*`): Zero or more ocurrrences of previous expression.\n", "+ Plus sign (`+`): One or more ocurrences of previous expression.\n", "+ Question mark (`?`): Zero or one ocurrence of previous expression.\n", "+ Braces (`{n}`): n ocurrences of previous expression.\n", "+ Vertical bar (`|`): logical \u201cor\u201d.\n", "+ Parenthesis (`()`): Delimit a group of expressions.\n", "+ `\\d`: Digit. Same as `[0-9]`.\n", "+ `\\D`: Non digit. Same as `[^0-9]`.\n", "+ `\\s`: Any spacing character (`[ \\t\\n\\r\\f\\v]`).\n", "+ `\\S`: Any nonspacing character (`[^ \\t\\n\\r\\f\\v]`).\n", "+ `\\w`: Alphanumeric character or underline (`[a-zA-Z0-9_]`).\n", "+ `\\W`: Not an Alphanumeric character or underline (`[^a-zA-Z0-9_]`).\n", "\n", "Exemplos:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import re\n", "\n", "# Compile the regular expression using compile()\n", "# the compiled regular expression is stored and \n", "# can be reused\n", "rex = re.compile('\\w+')\n", "\n", "# Finds the occurrences according to the expression\n", "bands = 'Yes, Genesis & Camel'\n", "print bands, '->', rex.findall(bands)\n", "\n", "# Identify occurrences of Bj\u00f6rk (and their variations)\n", "bjork = re.compile('[Bb]j[\u00f6o]rk')\n", "for m in ('Bj\u00f6rk', 'bj\u00f6rk', 'Biork', 'Bjork', 'bjork'):\n", "\n", " # match() finds occurrences at the beginning of the string\n", " # to find at any part of the string, use search()\n", " print m, '->', bool(bjork.match(m))\n", "\n", "# Replacing text\n", "text = 'The next track is Stairway to Heaven'\n", "print text, '->', re. sub('[Ss]tairway [Tt]o [Hh]eaven',\n", " 'The Rover', text)\n", "\n", "# Splitting text\n", "bands = 'Tool, Porcupine Tree and NIN'\n", "print bands, '->', re.split(',?\\s+and?\\s+', bands)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Yes, Genesis & Camel -> ['Yes', 'Genesis', 'Camel']\n", "Bj\u00f6rk -> False\n", "bj\u00f6rk -> False\n", "Biork -> False\n", "Bjork -> True\n", "bjork -> True\n", "The next track is Stairway to Heaven -> The next track is The Rover\n", "Tool, Porcupine Tree and NIN -> ['Tool, Porcupine Tree', 'NIN']\n" ] } ], "prompt_number": 22 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The behaviour of the functions of this module can be changed by options, to tread strings as *unicode*, for instance." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", "" ], "output_type": "pyout", "prompt_number": 1, "text": [ "" ] } ], "prompt_number": 1 } ], "metadata": {} } ] }