{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Connecting to a Microsoft SQL Server with _pyodbc_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a **QUICK START** guide for the **pyobc** library for those who use it to connect to a **Microsoft SQL SERVER** database. However, this will not go into any detail, so you can and should read the full documentation [HERE](https://github.com/mkleehammer/pyodbc/wiki)\n", "\n", "Besides the standard query stuff you would expect... there are A LOT of other things that you can do with pyodbc, with some creativity. For example, you can use it to stage an ETL process...yes, you can execute a python script with a scheduled task on your computer... pretty cool. I often run queries in a FOR loop when I want to process data over multiple periods of time, but it is more efficient to load it in one month or one year increments. The sky is the limit! Enjoy!" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pyodbc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The connection string\n", "Use the _pyodbc.drivers()_ method to find available drivers; the connection strings for these are essentially the same and will look like this: \n", "\n", "`conx = pyodbc.connect('DRIVER={SQL Server}; SERVER=TestServer; Database=TestDatabase; UID=UserID; PWD=Password;')`\n", "\n", "If you use **Windows Authentication** to connect to the server, your string will look like this: \n", "\n", "`conx = pyodbc.connect('DRIVER={SQL Server}; SERVER=TestServer; Database=TestDatabase; TRUSTED_CONNECTION=yes')`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup the connection string & query variables\n", "**Find** the DRIVERS you have available by using the `pyodbc.drivers()` method" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['SQL Server',\n", " 'SQL Server Native Client 11.0',\n", " 'ODBC Driver 13 for SQL Server',\n", " 'SQL Server Native Client RDA 11.0',\n", " 'ODBC Driver 17 for SQL Server']" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pyodbc.drivers()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create** a VARIABLE to store the connection string" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "conx_string = \"driver={SQL SERVER}; server=GTLPF1MZF5M\\IZZY_SQL_001; database=ADVENTUREWORKS2017; trusted_connection=YES;\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create** a VARIABLE for the sql query" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "query = \"SELECT Name, CreditRating FROM Purchasing.Vendor WHERE CreditRating < 3\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Connect and extract data | Individual steps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create** a CONNECTION using the connection string and `pyodbc.connect()` " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "conx = pyodbc.connect(conx_string);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create** a CURSOR that we can use to work in the database" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "cursor = conx.cursor();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Run** the QUERY using `cursor.execute()`" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "cursor.execute(query);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Store** the RESULTS in a variable" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "data = cursor.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Display** the RESULTS to check the data (first 5 rows)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('Australia Bike Retailer', 1), ('Allenson Cycles', 2), ('Advanced Bicycles', 1), ('Trikes, Inc.', 2), ('Morgan Bike Accessories', 1)]\n" ] } ], "source": [ "print(data[:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Close** the CONNECTION using the `close()` method" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "conx.close()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Connect and extract data | Consolidated with statement\n", "**Create** a CONNECTION and **import** the DATA. There is no need to close the connection manually here as it will close when exiting the with statement." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "with pyodbc.connect(conx_string) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute(query)\n", " data = cursor.fetchall() " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Display** the RESULTS to check the data (first 5 rows)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('Australia Bike Retailer', 1), ('Allenson Cycles', 2), ('Advanced Bicycles', 1), ('Trikes, Inc.', 2), ('Morgan Bike Accessories', 1)]\n" ] } ], "source": [ "print(data[:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Other useful methods\n", "**Access** the DATA as a NAMED TUPLE, which can prove to be very handy" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "VENDOR NAME: Australia Bike Retailer\n", "CREDIT RATING: 1\n", "VENDOR NAME: Allenson Cycles\n", "CREDIT RATING: 2\n", "VENDOR NAME: Advanced Bicycles\n", "CREDIT RATING: 1\n", "VENDOR NAME: Trikes, Inc.\n", "CREDIT RATING: 2\n", "VENDOR NAME: Morgan Bike Accessories\n", "CREDIT RATING: 1\n" ] } ], "source": [ "for row in data[:5]:\n", " print(f\"VENDOR NAME: {row.Name}\\nCREDIT RATING: {row.CreditRating}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Retrieve** the COLUMN NAMES for a table" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['BusinessEntityID', 'AccountNumber', 'Name', 'CreditRating', 'PreferredVendorStatus', 'ActiveFlag', 'PurchasingWebServiceURL', 'ModifiedDate']\n" ] } ], "source": [ "columns = [row.column_name for row in cursor.columns(table='Vendor')]\n", "print(columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Retrieve** a LIST OF TABLES in the database (first 5 records)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['AWBuildVersion', 'DatabaseLog', 'ErrorLog', 'Department', 'Employee']\n" ] } ], "source": [ "tables = [row.table_name for row in cursor.tables()]\n", "print(tables[:5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select queries with parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**SINGLE** parameter query (Female employees)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "with pyodbc.connect(conx_string) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute('SELECT NationalIDNumber, JobTitle FROM HumanResources.Employee WHERE Gender = ?', 'F')\n", " f_emp_data = cursor.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Retrieve** a LIST of TABLES in the database (first 5 records)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('245797967', 'Vice President of Engineering')\n", "('695256908', 'Design Engineer')\n", "('811994146', 'Research and Development Engineer')\n", "('658797903', 'Research and Development Engineer')\n", "('486228782', 'Tool Designer')\n" ] } ], "source": [ "for row in f_emp_data[:5]:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**MULTI** parameters query (Male employees, Vacation hours < 40)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "with pyodbc.connect(conx_string) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute('SELECT NationalIDNumber, JobTitle, VacationHours FROM HumanResources.Employee WHERE Gender = ? AND VacationHours < ?' , ('M',40))\n", " m_vac_data = cursor.fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Retrieve** a LIST of TABLES in the database (first 5 records)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('509647174', 'Engineering Manager', 2)\n", "('998320692', 'Design Engineer', 6)\n", "('879342154', 'Research and Development Manager', 16)\n", "('974026903', 'Senior Tool Designer', 7)\n", "('480168528', 'Tool Designer', 9)\n" ] } ], "source": [ "for row in m_vac_data[:5]:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Insert queries\n", "**Create** a CONNECTION STRING. This is a different database than the other I used." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "conx_string_b = \"driver={SQL SERVER}; server=GTLPF1MZF5M\\IZZY_SQL_001; database=SANDBOX; trusted_connection=YES;\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Insert** a SINGLE record" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute(\"INSERT INTO VacationExceptions (NationalIDNumber, JobTitle, VacationHours) VALUES(125467985467854, 'Head Hancho', 500)\") " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Verify** the DATA with a SELECT query" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[('125467985467854', 'Head Hancho', 500)]\n" ] } ], "source": [ "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute(\"SELECT * FROM VacationExceptions\")\n", " data = cursor.fetchall()\n", "\n", "print(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Insert** MULTIPLE records from VALUES" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "exceptions = [('615389812','Sales Representative',150),('982310417','European Sales Manager',75)]\n", "\n", "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.executemany(\"INSERT INTO VacationExceptions (NationalIDNumber, JobTitle, VacationHours) VALUES(?,?,?)\", exceptions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Verify** the DATA with a SELECT query (1 record from single insert, and 2 records for multi insert... 3 total)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('125467985467854', 'Head Hancho', 500)\n", "('615389812', 'Sales Representative', 150)\n", "('982310417', 'European Sales Manager', 75)\n" ] } ], "source": [ "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute(\"SELECT * FROM VacationExceptions\")\n", " data = cursor.fetchall()\n", "\n", "for row in data:\n", " print(row)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Insert** MULTIPLE records from ANOTHER query \n", "Load the MALE vacation exceptions dataset into the VacationExceptions table" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.executemany(\"INSERT INTO VacationExceptions (NationalIDNumber, JobTitle, VacationHours) VALUES(?,?,?)\", m_vac_data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Verify** the DATA with a SELECT query (first 10 records)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('125467985467854', 'Head Hancho', 500)\n", "('615389812', 'Sales Representative', 150)\n", "('982310417', 'European Sales Manager', 75)\n", "('509647174', 'Engineering Manager', 2)\n", "('998320692', 'Design Engineer', 6)\n", "('879342154', 'Research and Development Manager', 16)\n", "('974026903', 'Senior Tool Designer', 7)\n", "('480168528', 'Tool Designer', 9)\n", "('42487730', 'Senior Design Engineer', 3)\n", "('14417807', 'Production Technician - WC60', 21)\n" ] } ], "source": [ "with pyodbc.connect(conx_string_b) as conx:\n", " cursor = conx.cursor()\n", " cursor.execute(\"SELECT * FROM VacationExceptions\")\n", " data = cursor.fetchall()\n", "\n", "for row in data[:10]:\n", " print(row)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }