{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![](http://www.sqlitetutorial.net/wp-content/uploads/2016/01/SQLite-Python.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# SQLite with python\n", "\n", "References:\n", "\n", " * [SQLite tutorial](http://www.sqlitetutorial.net/)\n", " * [SQLite and python](http://www.sqlitetutorial.net/sqlite-python/)\n", " * [SQLite and python](https://www.tutorialspoint.com/sqlite/sqlite_python.htm)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is SQLite?\n", "\n", "from [What is SQLite?](http://www.sqlitetutorial.net/what-is-sqlite/)\n", "\n", "SQLite is a software library that provides a relational database management system. The lite in SQLite means light weight in terms of setup, database administration, and required resource.\n", "\n", "SQLite has the following noticeable features: self-contained, serverless, zero-configuration, transactional.\n", "Serverless\n", "\n", "Normally, an RDBMS such as MySQL, PostgreSQL, etc., requires a separate server process to operate. The applications that want to access the database server use TCP/IP protocol to send and receive requests. This is called client/server architecture.\n", "\n", "SQLite does NOT work this way. SQLite does NOT require a server to run. SQLite database is integrated with the application that accesses the database. The applications interact with the SQLite database read and write directly from the database files stored on disk.\n", "\n", "The following diagram illustrates the SQLite server-less architecture:\n", "\n", "![](http://www.sqlitetutorial.net/wp-content/uploads/2015/12/What-is-SQLite.jpg)\n", "\n", "SQLite has three important features:\n", "\n", " * Self-Contained\n", " * Zero-configuration\n", " * Transactional\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Full example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect To Database\n", "\n", "Following Python code shows how to connect to an existing database. If the database does not exist, then it will be created and finally a database object will be returned.\n", "\n", "Te database of the examples has been dowloaded from [SQLite sample database](http://www.sqlitetutorial.net/sqlite-sample-database/)\n", "\n", "The following database diagram illustrates the chinook database tables and their relationships:\n", "\n", "![](http://www.sqlitetutorial.net/wp-content/uploads/2015/11/sqlite-sample-database-color.jpg)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Opened database successfully\n" ] } ], "source": [ "#!/usr/bin/python\n", "\n", "import sqlite3\n", "\n", "sqlite_conn = sqlite3.connect('./db/chinook.db')\n", "\n", "print(\"Opened database successfully\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We show the tables in database\n", "\n", "[Show tables in sqlite database in python](https://stackoverflow.com/questions/31986520/show-tables-in-sqlite-database-in-python)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('albums',)\n", "('artists',)\n", "('customers',)\n", "('employees',)\n", "('genres',)\n", "('invoice_items',)\n", "('invoices',)\n", "('media_types',)\n", "('playlist_track',)\n", "('playlists',)\n", "('sqlite_sequence',)\n", "('sqlite_stat1',)\n", "('tracks',)\n" ] } ], "source": [ "cur = sqlite_conn.cursor() \n", "cur.execute(\"SELECT name FROM sqlite_master WHERE type='table' ORDER BY name\")\n", "available_table = (cur.fetchall())\n", "for table in available_table:\n", " print(table)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We show the column names in table\n", "\n", "[See the column names for table](https://www.daniweb.com/programming/software-development/threads/124403/sqlite3-how-to-see-column-names-for-table)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 'TrackId', 'INTEGER', 1, None, 1)\n", "(1, 'Name', 'NVARCHAR(200)', 1, None, 0)\n", "(2, 'AlbumId', 'INTEGER', 0, None, 0)\n", "(3, 'MediaTypeId', 'INTEGER', 1, None, 0)\n", "(4, 'GenreId', 'INTEGER', 0, None, 0)\n", "(5, 'Composer', 'NVARCHAR(220)', 0, None, 0)\n", "(6, 'Milliseconds', 'INTEGER', 1, None, 0)\n", "(7, 'Bytes', 'INTEGER', 0, None, 0)\n", "(8, 'UnitPrice', 'NUMERIC(10,2)', 1, None, 0)\n" ] } ], "source": [ "cur.execute(\"PRAGMA table_info(tracks)\")\n", "rows = cur.fetchall()\n", "for row in rows:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[PRAGMA Statements](www.sqlite.org/pragma.html)\n", "\n", "Others examples with PRAGMA statements are:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[(864,)]\n", "[]\n", "[(0, 'main', 'C:\\\\Users\\\\Ibon.DOMINIO\\\\Documents\\\\github\\\\miscellaneous\\\\SQLite&python\\\\db\\\\chinook.db')]\n" ] } ], "source": [ "cur.execute(\"PRAGMA page_count\")\n", "print(cur.fetchall())\n", "cur.execute(\"PRAGMA function_list\")\n", "print(cur.fetchall())\n", "cur.execute(\"PRAGMA database_list\")\n", "print(cur.fetchall())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### SELECT sentence\n", "\n", "The SELECT statement is used to select data from a database.\n", "\n", "The data returned is stored in a result table, called the result-set." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)\n", "(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)\n", "(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)\n", "(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)\n", "(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"SELECT * FROM tracks LIMIT 5;\")\n", "\n", "rows = cur.fetchall()\n", "for row in rows:\n", " print(row)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)\n", "(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)\n", "(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)\n", "(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)\n", "(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"SELECT * FROM tracks LIMIT 5;\")\n", "\n", "row = cur.fetchone()\n", "print(row)\n", "row = cur.fetchone()\n", "print(row)\n", "row = cur.fetchone()\n", "print(row)\n", "row = cur.fetchone()\n", "print(row)\n", "row = cur.fetchone()\n", "print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fetch\n", "\n", "There are three `fetch` functions in python with SQLite:\n", "\n", " * **fetchone()**: This method fetches the next row of a query result set, returning a single sequence, or None when no more data is available. \n", "\n", "\n", " * **fetchmany([size = cursor.arraysize])**: This routine fetches the next set of rows of a query result, returning a list. An empty list is returned when no more rows are available. The method tries to fetch as many rows as indicated by the size parameter. \n", " \n", " \n", " * **fetchall()**: This routine fetches all (remaining) rows of a query result, returning a list. An empty list is returned when no rows are available.\n", " \n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fetchmany 2 ----------------\n", "(1, 'For Those About To Rock (We Salute You)', 1, 1, 1, 'Angus Young, Malcolm Young, Brian Johnson', 343719, 11170334, 0.99)\n", "(2, 'Balls to the Wall', 2, 2, 1, None, 342562, 5510424, 0.99)\n", "fetchone ----------------\n", "(3, 'Fast As a Shark', 3, 2, 1, 'F. Baltes, S. Kaufman, U. Dirkscneider & W. Hoffman', 230619, 3990994, 0.99)\n", "fetchall ----------------\n", "(4, 'Restless and Wild', 3, 2, 1, 'F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. Dirkscneider & W. Hoffman', 252051, 4331779, 0.99)\n", "(5, 'Princess of the Dawn', 3, 2, 1, 'Deaffy & R.A. Smith-Diesel', 375418, 6290521, 0.99)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"SELECT * FROM tracks LIMIT 5;\")\n", "\n", "rows = cur.fetchmany(2)\n", "print(\"fetchmany 2 ----------------\")\n", "for row in rows:\n", " print(row)\n", "\n", "row = cur.fetchone() \n", "print(\"fetchone ----------------\")\n", "print(row)\n", "\n", "print(\"fetchall ----------------\")\n", "rows = cur.fetchall() \n", "for row in rows:\n", " print(row) " ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('Alabama Song', 'Weill-Brecht', 200097)\n", "('Purgatory', 'Steve Harris', 200150)\n", "('F**k Me Pumps', 'Salaam Remi', 200253)\n", "('Why Go', 'Jeff Ament', 200254)\n", "('Ice Cream Man', 'John Brim', 200306)\n", "('Summertime', 'Miles Davis', 200437)\n", "('Coroné Antonio Bento', None, 200437)\n", "('Último Pau-De-Arara', 'Corumbá/José Gumarães/Venancio', 200437)\n", "('The Wind Cries Mary', 'Jimi Hendrix', 200463)\n", "('Over Again', 'Tim Maia', 200489)\n", "('Rock Bottom', 'Paul Stanley, Ace Frehley', 200594)\n", "('All My Love', 'E. Schrody/L. Dimant', 200620)\n", "('Question!', 'Tankian, Serj', 200698)\n", "('Posso Até Me Apaixonar', 'Dudu Nobre', 200698)\n", "('Cold Day In The Sun', 'Dave Grohl, Taylor Hawkins, Nate Mendel, Chris Shiflett/FOO FIGHTERS', 200724)\n", "('Too Fast For Love', 'Nikki Sixx', 200829)\n", "('O Cidadão Do Mundo', 'Chico Science', 200933)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"\"\"\n", "--====\n", "SELECT Name, Composer, Milliseconds \n", " FROM tracks \n", " WHERE Milliseconds >= ? \n", " AND \n", " Milliseconds <= ? \n", " ORDER BY Milliseconds;\n", "--====\"\"\", (200000, 201000))\n", "rows = cur.fetchall()\n", "for row in rows:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### SQLite and pandas\n", "\n", "References:\n", " \n", " * [pandas with databases](https://www.dataquest.io/blog/python-pandas-databases/)\n", " \n", "yotube\n", "\n", " * [SQL y Data Science #1: Introduccion a SQL](https://www.youtube.com/watch?v=7NSr6C8IhCo)\n", " * [SQL for Data Science #2: SQLite+pandas to analyze 10 million NYC citibike records](https://www.youtube.com/watch?v=fL53-abGSuM)\n", " * [SQL for Data Science #3a: SQLite+pandas to analyze 10 million NYC citibike records](https://www.youtube.com/watch?v=dUtBqDqmyQg)\n", " * [SQL for Data Science #3b: SQLite+pandas to analyze 10 million NYC citibike records](https://www.youtube.com/watch?v=rB5di9XjC94)\n", " \n", " \n", " * [Python Pandas connect directly to SQLite, Oracle, IBM Db2, MS SQL Server, PostgreSQL, MySQL](https://www.youtube.com/watch?v=gC-0CaRzR48)\n", "\n", " " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TrackIdNameAlbumIdMediaTypeIdGenreIdComposerMillisecondsBytesUnitPrice
01For Those About To Rock (We Salute You)111Angus Young, Malcolm Young, Brian Johnson343719111703340.99
12Balls to the Wall221None34256255104240.99
23Fast As a Shark321F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho...23061939909940.99
34Restless and Wild321F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D...25205143317790.99
45Princess of the Dawn321Deaffy & R.A. Smith-Diesel37541862905210.99
\n", "
" ], "text/plain": [ " TrackId Name AlbumId MediaTypeId \\\n", "0 1 For Those About To Rock (We Salute You) 1 1 \n", "1 2 Balls to the Wall 2 2 \n", "2 3 Fast As a Shark 3 2 \n", "3 4 Restless and Wild 3 2 \n", "4 5 Princess of the Dawn 3 2 \n", "\n", " GenreId Composer Milliseconds \\\n", "0 1 Angus Young, Malcolm Young, Brian Johnson 343719 \n", "1 1 None 342562 \n", "2 1 F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho... 230619 \n", "3 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 \n", "4 1 Deaffy & R.A. Smith-Diesel 375418 \n", "\n", " Bytes UnitPrice \n", "0 11170334 0.99 \n", "1 5510424 0.99 \n", "2 3990994 0.99 \n", "3 4331779 0.99 \n", "4 6290521 0.99 " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = \"\"\"\n", "--====\n", "SELECT * \n", " FROM tracks \n", "LIMIT 5;\n", "--====\"\"\", con = sqlite_conn)\n", "df" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TrackIdNameAlbumIdMediaTypeIdGenreIdComposerMillisecondsBytesUnitPrice
01For Those About To Rock (We Salute You)111Angus Young, Malcolm Young, Brian Johnson343719111703340.99
12Balls to the Wall221None34256255104240.99
23Fast As a Shark321F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho...23061939909940.99
34Restless and Wild321F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D...25205143317790.99
45Princess of the Dawn321Deaffy & R.A. Smith-Diesel37541862905210.99
56Put The Finger On You111Angus Young, Malcolm Young, Brian Johnson20566267134510.99
67Let's Get It Up111Angus Young, Malcolm Young, Brian Johnson23392676365610.99
78Inject The Venom111Angus Young, Malcolm Young, Brian Johnson21083468528600.99
89Snowballed111Angus Young, Malcolm Young, Brian Johnson20310265994240.99
910Evil Walks111Angus Young, Malcolm Young, Brian Johnson26349786112450.99
\n", "
" ], "text/plain": [ " TrackId Name AlbumId MediaTypeId \\\n", "0 1 For Those About To Rock (We Salute You) 1 1 \n", "1 2 Balls to the Wall 2 2 \n", "2 3 Fast As a Shark 3 2 \n", "3 4 Restless and Wild 3 2 \n", "4 5 Princess of the Dawn 3 2 \n", "5 6 Put The Finger On You 1 1 \n", "6 7 Let's Get It Up 1 1 \n", "7 8 Inject The Venom 1 1 \n", "8 9 Snowballed 1 1 \n", "9 10 Evil Walks 1 1 \n", "\n", " GenreId Composer Milliseconds \\\n", "0 1 Angus Young, Malcolm Young, Brian Johnson 343719 \n", "1 1 None 342562 \n", "2 1 F. Baltes, S. Kaufman, U. Dirkscneider & W. Ho... 230619 \n", "3 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 \n", "4 1 Deaffy & R.A. Smith-Diesel 375418 \n", "5 1 Angus Young, Malcolm Young, Brian Johnson 205662 \n", "6 1 Angus Young, Malcolm Young, Brian Johnson 233926 \n", "7 1 Angus Young, Malcolm Young, Brian Johnson 210834 \n", "8 1 Angus Young, Malcolm Young, Brian Johnson 203102 \n", "9 1 Angus Young, Malcolm Young, Brian Johnson 263497 \n", "\n", " Bytes UnitPrice \n", "0 11170334 0.99 \n", "1 5510424 0.99 \n", "2 3990994 0.99 \n", "3 4331779 0.99 \n", "4 6290521 0.99 \n", "5 6713451 0.99 \n", "6 7636561 0.99 \n", "7 6852860 0.99 \n", "8 6599424 0.99 \n", "9 8611245 0.99 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = \"\"\"\n", "--====\n", "SELECT * \n", " FROM tracks \n", "LIMIT %d;\n", "--====\"\"\" % (10), con = sqlite_conn)\n", "df" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NameComposerMilliseconds
0Alabama SongWeill-Brecht200097
1PurgatorySteve Harris200150
2F**k Me PumpsSalaam Remi200253
3Why GoJeff Ament200254
4Ice Cream ManJohn Brim200306
5SummertimeMiles Davis200437
6Coroné Antonio BentoNone200437
7Último Pau-De-AraraCorumbá/José Gumarães/Venancio200437
8The Wind Cries MaryJimi Hendrix200463
9Over AgainTim Maia200489
10Rock BottomPaul Stanley, Ace Frehley200594
11All My LoveE. Schrody/L. Dimant200620
12Question!Tankian, Serj200698
13Posso Até Me ApaixonarDudu Nobre200698
14Cold Day In The SunDave Grohl, Taylor Hawkins, Nate Mendel, Chris...200724
15Too Fast For LoveNikki Sixx200829
16O Cidadão Do MundoChico Science200933
\n", "
" ], "text/plain": [ " Name Composer \\\n", "0 Alabama Song Weill-Brecht \n", "1 Purgatory Steve Harris \n", "2 F**k Me Pumps Salaam Remi \n", "3 Why Go Jeff Ament \n", "4 Ice Cream Man John Brim \n", "5 Summertime Miles Davis \n", "6 Coroné Antonio Bento None \n", "7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio \n", "8 The Wind Cries Mary Jimi Hendrix \n", "9 Over Again Tim Maia \n", "10 Rock Bottom Paul Stanley, Ace Frehley \n", "11 All My Love E. Schrody/L. Dimant \n", "12 Question! Tankian, Serj \n", "13 Posso Até Me Apaixonar Dudu Nobre \n", "14 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... \n", "15 Too Fast For Love Nikki Sixx \n", "16 O Cidadão Do Mundo Chico Science \n", "\n", " Milliseconds \n", "0 200097 \n", "1 200150 \n", "2 200253 \n", "3 200254 \n", "4 200306 \n", "5 200437 \n", "6 200437 \n", "7 200437 \n", "8 200463 \n", "9 200489 \n", "10 200594 \n", "11 200620 \n", "12 200698 \n", "13 200698 \n", "14 200724 \n", "15 200829 \n", "16 200933 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = \"\"\"\n", "SELECT Name, Composer, Milliseconds \n", " FROM tracks \n", " WHERE Milliseconds >= %d \n", " AND \n", " Milliseconds <= %d\n", " ORDER BY Milliseconds;\n", "--====\"\"\" % (200000, 201000), con = sqlite_conn)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Join\n", "\n", "A JOIN clause is used to combine rows from two or more tables, based on a related column between them.\n", "\n", "![](https://i.stack.imgur.com/1UKp7.png)\n", "\n", "Different Types of SQL JOINs\n", "\n", "Here are the different types of the JOINs in SQL:\n", "\n", " * **(INNER) JOIN**: Returns records that have matching values in both tables\n", " \n", " ![](https://www.w3schools.com/sql/img_innerjoin.gif)\n", " \n", " * **LEFT (OUTER) JOIN**: Return all records from the left table, and the matched records from the right table\n", " \n", " ![](https://www.w3schools.com/sql/img_leftjoin.gif)\n", " \n", " * **RIGHT (OUTER) JOIN**: Return all records from the right table, and the matched records from the left table\n", " \n", " ![](https://www.w3schools.com/sql/img_rightjoin.gif)\n", " \n", " * **FULL (OUTER) JOIN**: Return all records when there is a match in either left or right table\n", "\n", " ![](https://www.w3schools.com/sql/img_fulljoin.gif)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following sentence we will join three tables: tracks, albums and artists. The result will be a table with the name of the artis/group, the title of the album and the name of the track. We will sort teh results by artis and album." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('AC/DC', 'For Those About To Rock We Salute You', 'For Those About To Rock (We Salute You)')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Put The Finger On You')\n", "('AC/DC', 'For Those About To Rock We Salute You', \"Let's Get It Up\")\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Inject The Venom')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Snowballed')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Evil Walks')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'C.O.D.')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Breaking The Rules')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Night Of The Long Knives')\n", "('AC/DC', 'For Those About To Rock We Salute You', 'Spellbound')\n", "('AC/DC', 'Let There Be Rock', 'Go Down')\n", "('AC/DC', 'Let There Be Rock', 'Dog Eat Dog')\n", "('AC/DC', 'Let There Be Rock', 'Let There Be Rock')\n", "('AC/DC', 'Let There Be Rock', 'Bad Boy Boogie')\n", "('AC/DC', 'Let There Be Rock', 'Problem Child')\n", "('AC/DC', 'Let There Be Rock', 'Overdose')\n", "('AC/DC', 'Let There Be Rock', \"Hell Ain't A Bad Place To Be\")\n", "('AC/DC', 'Let There Be Rock', 'Whole Lotta Rosie')\n", "('Aaron Copland & London Symphony Orchestra', 'A Copland Celebration, Vol. I', 'Fanfare for the Common Man')\n", "('Aaron Goldberg', 'Worlds', \"OAM's Blues\")\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"\"\"\n", "--==== ====================================================\n", "--==== INICIO DE LA QUERY\n", "SELECT artists.Name, albums.Title, tracks.Name\n", " FROM tracks\n", " LEFT JOIN albums ON albums.Albumid = tracks.Albumid\n", " LEFT JOIN artists ON albums.Artistid = artists.Artistid\n", " ORDER BY artists.Name, albums.Title ASC;\n", "--==== FIN DE LA QUERY\n", "--==== ==================================================== \n", "\"\"\")\n", "rows = cur.fetchmany(20) # Only the 20 first results\n", "for row in rows:\n", " print(row)\n", "rows = cur.fetchall() " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A example with analysis. We will calculate the time average by album in milliseconds." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('AC/DC', 'For Those About To Rock We Salute You', 'Spellbound', 240042.0)\n", "('AC/DC', 'Let There Be Rock', 'Whole Lotta Rosie', 306657.0)\n", "('Aaron Copland & London Symphony Orchestra', 'A Copland Celebration, Vol. I', 'Fanfare for the Common Man', 198064.0)\n", "('Aaron Goldberg', 'Worlds', \"OAM's Blues\", 266936.0)\n", "('Academy of St. Martin in the Fields & Sir Neville Marriner', 'The World of Classical Favourites', 'Fantasia On Greensleeves', 232601.0)\n", "('Academy of St. Martin in the Fields Chamber Ensemble & Sir Neville Marriner', 'Sir Neville Marriner: A Celebration', '\"Eine Kleine Nachtmusik\" Serenade In G, K. 525: I. Allegro', 348971.0)\n", "('Academy of St. Martin in the Fields, John Birch, Sir Neville Marriner & Sylvia McNair', 'Fauré: Requiem, Ravel: Pavane & Others', 'Requiem, Op.48: 4. Pie Jesu', 258924.0)\n", "('Academy of St. Martin in the Fields, Sir Neville Marriner & Thurston Dart', 'Bach: Orchestral Suites Nos. 1 - 4', 'Suite No. 3 in D, BWV 1068: III. Gavotte I & II', 225933.0)\n", "('Accept', 'Balls to the Wall', 'Balls to the Wall', 342562.0)\n", "('Accept', 'Restless and Wild', 'Princess of the Dawn', 286029.0)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"\"\"\n", "--==== ====================================================\n", "--==== INICIO DE LA QUERY\n", "SELECT artists.Name, albums.Title, tracks.Name, ROUND(AVG(tracks.Milliseconds),0)\n", " FROM tracks -- TABLA PRINCIPAL\n", " LEFT JOIN albums ON albums.Albumid = tracks.Albumid\n", " LEFT JOIN artists ON albums.Artistid = artists.Artistid\n", " GROUP BY artists.Name, albums.Title\n", " ORDER BY artists.Name, albums.Title ASC;\n", "--==== FIN DE LA QUERY\n", "--==== ==================================================== \n", "\"\"\")\n", "rows = cur.fetchmany(10)\n", "for row in rows:\n", " print(row)\n", "rows = cur.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will calculate the number of tracks in each album." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('Lenny Kravitz', 'Greatest Hits', 57)\n", "('Chico Buarque', 'Minha Historia', 34)\n", "('Eric Clapton', 'Unplugged', 30)\n", "('Lost', 'Lost, Season 3', 26)\n", "('Lost', 'Lost, Season 1', 25)\n", "('The Office', 'The Office, Season 3', 25)\n", "('Battlestar Galactica (Classic)', 'Battlestar Galactica (Classic), Season 1', 24)\n", "('Frank Sinatra', 'My Way: The Best Of Frank Sinatra [Disc 1]', 24)\n", "('Lost', 'Lost, Season 2', 24)\n", "('Chico Science & Nação Zumbi', 'Afrociberdelia', 23)\n" ] } ], "source": [ "cur = sqlite_conn.execute(\"\"\"\n", "--==== ====================================================\n", "--==== INICIO DE LA QUERY\n", "SELECT artists.Name, albums.Title, COUNT(tracks.Name) AS N\n", " FROM tracks\n", " LEFT JOIN albums ON albums.Albumid = tracks.Albumid\n", " LEFT JOIN artists ON albums.Artistid = artists.Artistid\n", " GROUP BY artists.Name, albums.Title\n", " ORDER BY N DESC;\n", "--==== FIN DE LA QUERY\n", "--==== ==================================================== \n", "\"\"\")\n", "rows = cur.fetchmany(10)\n", "for row in rows:\n", " print(row)\n", "rows = cur.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### SQL statement from file\n", "\n", "In the previous examples we have seen SQL codes very large, with comments and several analysis. If is a code that will be used in several examples or analysis, then we will be more efficient if we will save the SQL code in a file.\n", "\n", "#### Example 1\n", "\n", "If the SQL statement is very large then is a better way to save the SQL in a file \\*.sql and load it in the `read_sql_query`. In the file _example.sql_ is written the following SQL code.\n", "\n", "```\n", "--====\n", "SELECT Name, Composer, Milliseconds \n", " FROM tracks \n", " WHERE Milliseconds >= 200000\n", " AND \n", " Milliseconds <= 201000\n", " ORDER BY Milliseconds;\n", "--====\n", "```" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "f = open(\"example.sql\")\n", "sql_statement = f.read()\n", "f.close()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--====\n", "SELECT Name, Composer, Milliseconds \n", " FROM tracks \n", " WHERE Milliseconds >= 200000 \n", " AND \n", " Milliseconds <= 201000\n", " ORDER BY Milliseconds;\n", "--====\n", "\n" ] } ], "source": [ "print(sql_statement)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NameComposerMilliseconds
0Alabama SongWeill-Brecht200097
1PurgatorySteve Harris200150
2F**k Me PumpsSalaam Remi200253
3Why GoJeff Ament200254
4Ice Cream ManJohn Brim200306
5SummertimeMiles Davis200437
6Coroné Antonio BentoNone200437
7Último Pau-De-AraraCorumbá/José Gumarães/Venancio200437
8The Wind Cries MaryJimi Hendrix200463
9Over AgainTim Maia200489
10Rock BottomPaul Stanley, Ace Frehley200594
11All My LoveE. Schrody/L. Dimant200620
12Question!Tankian, Serj200698
13Posso Até Me ApaixonarDudu Nobre200698
14Cold Day In The SunDave Grohl, Taylor Hawkins, Nate Mendel, Chris...200724
15Too Fast For LoveNikki Sixx200829
16O Cidadão Do MundoChico Science200933
\n", "
" ], "text/plain": [ " Name Composer \\\n", "0 Alabama Song Weill-Brecht \n", "1 Purgatory Steve Harris \n", "2 F**k Me Pumps Salaam Remi \n", "3 Why Go Jeff Ament \n", "4 Ice Cream Man John Brim \n", "5 Summertime Miles Davis \n", "6 Coroné Antonio Bento None \n", "7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio \n", "8 The Wind Cries Mary Jimi Hendrix \n", "9 Over Again Tim Maia \n", "10 Rock Bottom Paul Stanley, Ace Frehley \n", "11 All My Love E. Schrody/L. Dimant \n", "12 Question! Tankian, Serj \n", "13 Posso Até Me Apaixonar Dudu Nobre \n", "14 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... \n", "15 Too Fast For Love Nikki Sixx \n", "16 O Cidadão Do Mundo Chico Science \n", "\n", " Milliseconds \n", "0 200097 \n", "1 200150 \n", "2 200253 \n", "3 200254 \n", "4 200306 \n", "5 200437 \n", "6 200437 \n", "7 200437 \n", "8 200463 \n", "9 200489 \n", "10 200594 \n", "11 200620 \n", "12 200698 \n", "13 200698 \n", "14 200724 \n", "15 200829 \n", "16 200933 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = sql_statement, \n", " con = sqlite_conn)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example 2\n", "\n", "If the SQL statement is very large then is a better way to save the SQL in a file \\*.sql and load it in the `read_sql_query`. In the file _example2.sql_ is written the following SQL code.\n", "\n", "```\n", "--====\n", "SELECT Name, Composer, Milliseconds \n", " FROM %s \n", " WHERE Milliseconds >= %d\n", " AND \n", " Milliseconds <= %d\n", " ORDER BY Milliseconds;\n", "--====\n", "```\n", "\n", "We can see the %s (string) and %d (integer) insertions. These parameters allow the reutilitation of the SQL sentence." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "f = open(\"example2.sql\")\n", "sql_statement = f.read()\n", "f.close()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NameComposerMilliseconds
0Alabama SongWeill-Brecht200097
1PurgatorySteve Harris200150
2F**k Me PumpsSalaam Remi200253
3Why GoJeff Ament200254
4Ice Cream ManJohn Brim200306
5SummertimeMiles Davis200437
6Coroné Antonio BentoNone200437
7Último Pau-De-AraraCorumbá/José Gumarães/Venancio200437
8The Wind Cries MaryJimi Hendrix200463
9Over AgainTim Maia200489
\n", "
" ], "text/plain": [ " Name Composer Milliseconds\n", "0 Alabama Song Weill-Brecht 200097\n", "1 Purgatory Steve Harris 200150\n", "2 F**k Me Pumps Salaam Remi 200253\n", "3 Why Go Jeff Ament 200254\n", "4 Ice Cream Man John Brim 200306\n", "5 Summertime Miles Davis 200437\n", "6 Coroné Antonio Bento None 200437\n", "7 Último Pau-De-Arara Corumbá/José Gumarães/Venancio 200437\n", "8 The Wind Cries Mary Jimi Hendrix 200463\n", "9 Over Again Tim Maia 200489" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = sql_statement % ('tracks', 200000, 200500), \n", " con = sqlite_conn)\n", "df" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NameComposerMilliseconds
0Rock BottomPaul Stanley, Ace Frehley200594
1All My LoveE. Schrody/L. Dimant200620
2Question!Tankian, Serj200698
3Posso Até Me ApaixonarDudu Nobre200698
4Cold Day In The SunDave Grohl, Taylor Hawkins, Nate Mendel, Chris...200724
5Too Fast For LoveNikki Sixx200829
6O Cidadão Do MundoChico Science200933
\n", "
" ], "text/plain": [ " Name Composer \\\n", "0 Rock Bottom Paul Stanley, Ace Frehley \n", "1 All My Love E. Schrody/L. Dimant \n", "2 Question! Tankian, Serj \n", "3 Posso Até Me Apaixonar Dudu Nobre \n", "4 Cold Day In The Sun Dave Grohl, Taylor Hawkins, Nate Mendel, Chris... \n", "5 Too Fast For Love Nikki Sixx \n", "6 O Cidadão Do Mundo Chico Science \n", "\n", " Milliseconds \n", "0 200594 \n", "1 200620 \n", "2 200698 \n", "3 200698 \n", "4 200724 \n", "5 200829 \n", "6 200933 " ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_sql_query(sql = sql_statement % ('tracks', 200500, 201000), \n", " con = sqlite_conn)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Close de connect\n", "\n", "We close the conexion with database with the following command:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "sqlite_conn.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "337px" }, "toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }