{ "metadata": { "name": "python-databases" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Using Databases Inside Python\n", "=============================\n", "We will use SQLite to discuss how to use Python with databases,\n", "but most other major systems work in the same way, just with a different library.\n", "\n", "Example database\n", "----------------\n", "The database we'll be using as an example has a table named Experiment.\n", "This table is a log of all of the work done on experiments in a research lab,\n", "broken down by project and scientist. The table has columns for:\n", "\n", "* the login id of the scientist\n", "* the name of their project\n", "* a numeric id for their experiment\n", "* how many hours they spent on it\n", "* and when it took place\n", "\n", "Importing the library\n", "---------------------\n", "We need to tell Python that we're going to be working with SQLite, so we start\n", "by importing the sqlite3 library. By importing using a generic name (e.g., dbapi)\n", "if we ever decide to change the database system that we're using we can simply\n", "change the library name in this single line of code (in concept; in practice\n", "subtle differences between database systems may break things):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sqlite3 as dbapi" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Connecting to the database\n", "--------------------------\n", "Next, we need to connect to the database using the database library's ``connect`` method. With SQLite this is as simple as first making the connect:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "con = dbapi.connect('examp_database.sqlite')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "and then getting a cursor which will let us execute SQL commands:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur = con.cursor()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the database does not currently exist then creating the connection will create it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Querying the database\n", "---------------------\n", "\n", "We can now execute any SQL commands that we want in the database using the cursor's\n", "``execute`` method. Querying the database simply involves writing the appropriate SQL\n", "and placing it inside a string in the execute method call:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"SELECT * FROM Experiment\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 4, "text": [ "" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get the results into Python we then use either the ``fetchone`` method to fetch one\n", "record at a time (it returns ``None`` when there are no more records to fetch so that\n", "you know when to stop)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "record = cur.fetchone()\n", "while record:\n", " print record\n", " record = cur.fetchone()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(u'best', u'Insulin', 1, 3.2, u'1922-03-23')\n", "(u'ivan', u'Time Travel', 2, -1.5, u'1910-07-05')\n", "(u'dian', u'Great Apes', 1, 15.0, u'1967-04-12')\n", "(u'fban', u'Insulin', 1, 5.5, u'1922-03-23')\n", "(u'mlom', u'Antigravity', 1, 1.5, u'1889-11-10')\n", "(u'fban', u'Insulin', 2, 1.2, u'1922-03-24')\n", "(u'taube', u'Redox Reaction', 2, 2.0, u'1942-12-04')\n", "(u'ivan', u'Teleportation', 3, 7.0, u'1891-02-17')\n", "(u'skol', u'Teleportation', 2, 14.0, u'1891-01-23')\n", "(u'dmitri', u'Teleportation', 1, 7.0, u'1891-01-22')\n", "(u'jane', u'Great Apes', 1, 7.0, u'1967-04-13')\n", "(u'skol', u'Time Travel', 1, -17.5, u'1900-07-05')\n", "(u'herschel', u'Infrared', 1, 7.0, u'1798-08-12')\n", "(u'taube', u'Redox Reaction', 1, 5.0, u'1942-11-17')\n", "(u'mlom', u'Teleportation', 2, 7.2, u'1891-03-23')\n", "(u'mlom', u'Antigravity', 2, 14.3, u'1889-11-01')\n", "(u'best', u'Insulin', 2, 10.0, u'1922-03-24')\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "or the ``fetchall`` method to return all of the records into a list.\n", "Each record is returned as a tuple of values, one value per field." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"SELECT * FROM Experiment WHERE Project = 'Teleportation'\")\n", "myqueryrecords = cur.fetchall()\n", "myqueryrecords" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 6, "text": [ "[(u'ivan', u'Teleportation', 3, 7.0, u'1891-02-17'),\n", " (u'skol', u'Teleportation', 2, 14.0, u'1891-01-23'),\n", " (u'dmitri', u'Teleportation', 1, 7.0, u'1891-01-22'),\n", " (u'mlom', u'Teleportation', 2, 7.2, u'1891-03-23')]" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The u's that appear in front of each wring indicate that they are stored in unicode.\n", "If you want the database to return regular strings you can do this using the following command." ] }, { "cell_type": "code", "collapsed": false, "input": [ "con.text_factory = str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using Python variables inside SQL\n", "---------------------------------\n", "Often when working with databases in Python we will want to include variables from\n", "our Python code as part of the query. To do this we place a ? where each value should\n", "be inserted into the SQL statement and then include a tuple with the values to be\n", "inserted as a second argument in the execute statement." ] }, { "cell_type": "code", "collapsed": false, "input": [ "current_experiment = 'Teleportation'\n", "cur.execute(\"SELECT * FROM Experiment WHERE Project = ?\", (current_experiment,))\n", "myqueryrecords = cur.fetchall()\n", "myqueryrecords" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 8, "text": [ "[('ivan', 'Teleportation', 3, 7.0, '1891-02-17'),\n", " ('skol', 'Teleportation', 2, 14.0, '1891-01-23'),\n", " ('dmitri', 'Teleportation', 1, 7.0, '1891-01-22'),\n", " ('mlom', 'Teleportation', 2, 7.2, '1891-03-23')]" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Modifying a database\n", "--------------------\n", "Modifications can also be made to the database using the execute function. For example," ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"INSERT INTO Experiment VALUES ('gimli', 'TrollKilling', '1', 112, '1954-07-21')\")\n", "con.commit()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order for the changes to actually be made to the database we need to commit them\n", "by calling ``con.commit()`` when we are ready to make the changes. This protects us\n", "if something goes wrong with our program because the changes won't be finalized until\n", "the point where we commit them.\n", "\n", "We can check to see if the changes have been made by running a quick query." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"SELECT * FROM Experiment WHERE Project = 'TrollKilling'\")\n", "myqueryrecords = cur.fetchall()\n", "myqueryrecords" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 10, "text": [ "[('gimli', 'TrollKilling', 1, 112.0, '1954-07-21')]" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then get rid of these changes using a ``DELETE`` statement." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"DELETE FROM Experiment WHERE Project='TrollKilling'\")\n", "con.commit()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "cur.execute(\"SELECT * FROM Experiment WHERE Project = 'TrollKilling'\")\n", "myqueryrecords = cur.fetchall()\n", "myqueryrecords" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 12, "text": [ "[]" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Execute commands across many values\n", "-----------------------------------\n", "Often it will be useful to execute the same SQL command repeatedly with different inputs. For example, if we want to append a new set of data to an existing table, we will probably want to append multiple rows. We could do this using a for loop, but Python provides the ``executemany()`` method to make this easier." ] }, { "cell_type": "code", "collapsed": false, "input": [ "new_data_records = [('mschilling', 'genetics', 1, 112.0, '2005-07-21'),\n", " ('jkoch', 'bees', 1, 7.2, '2012-01-15'),\n", " ('akleinhesselink', 'commecol', 1, 2.5, '2010-05-05')]\n", "\n", "cur.executemany(\"INSERT INTO Experiment VALUES (?, ?, ?, ?, ?)\", new_data_records)\n", "cur.execute(\"SELECT * FROM Experiment\")\n", "cur.fetchall()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 13, "text": [ "[('best', 'Insulin', 1, 3.2, '1922-03-23'),\n", " ('ivan', 'Time Travel', 2, -1.5, '1910-07-05'),\n", " ('dian', 'Great Apes', 1, 15.0, '1967-04-12'),\n", " ('fban', 'Insulin', 1, 5.5, '1922-03-23'),\n", " ('mlom', 'Antigravity', 1, 1.5, '1889-11-10'),\n", " ('fban', 'Insulin', 2, 1.2, '1922-03-24'),\n", " ('taube', 'Redox Reaction', 2, 2.0, '1942-12-04'),\n", " ('ivan', 'Teleportation', 3, 7.0, '1891-02-17'),\n", " ('skol', 'Teleportation', 2, 14.0, '1891-01-23'),\n", " ('dmitri', 'Teleportation', 1, 7.0, '1891-01-22'),\n", " ('jane', 'Great Apes', 1, 7.0, '1967-04-13'),\n", " ('skol', 'Time Travel', 1, -17.5, '1900-07-05'),\n", " ('herschel', 'Infrared', 1, 7.0, '1798-08-12'),\n", " ('taube', 'Redox Reaction', 1, 5.0, '1942-11-17'),\n", " ('mlom', 'Teleportation', 2, 7.2, '1891-03-23'),\n", " ('mlom', 'Antigravity', 2, 14.3, '1889-11-01'),\n", " ('best', 'Insulin', 2, 10.0, '1922-03-24'),\n", " ('mschilling', 'genetics', 1, 112.0, '2005-07-21'),\n", " ('jkoch', 'bees', 1, 7.2, '2012-01-15'),\n", " ('akleinhesselink', 'commecol', 1, 2.5, '2010-05-05')]" ] } ], "prompt_number": 13 } ], "metadata": {} } ] }