{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Python\n", "\n", "Python is a general-purpose programming language that is designed to be easy to read and easy to learn. It allows you to express ideas in relatively few lines of code. The various core Python packages, along with those written by the community, extend Python's capabilities immensely.\n", "\n", "This is a [Jupyter Notebook](https://jupyter.org), an open-source web application that allows you to run Python code directly inside your web browser. It combines live code with narrative text. The Notebook is divided into \"cells\". These can contain code or plain text formatted in [Markdown](https://en.wikipedia.org/wiki/Markdown)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Numbers\n", "\n", "Python can function just like a calculator. You can use it to do some simple arithmetic.\n", "\n", "Execute code cells by pressing Shift-Enter on your keyboard with the code cell selected. The output is shown immediately underneath." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Addition**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "2 + 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Division**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "10 / 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exponentiation**\n", "\n", "Double asterisk (\\*\\*) means exponentiation, i.e. 
2\\*\\*3 = $2^3$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "2**3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Subtraction and Multiplication**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "10 - 4 * 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Brackets**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(10 - 4) * 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variables\n", "\n", "Create variables and use them to store numbers or other kinds of data. Print their value using the `print()` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = 2\n", "b = 3\n", "c = a + b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(c)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(b - a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Strings\n", "\n", "Text data is stored as a 'string' and can be saved to a variable just like numbers." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "'Hello World'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "greeting = 'Hello World'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(greeting)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# How long is the string, including letters and spaces?\n", "len(greeting)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make all lowercase\n", "greeting.lower()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make all uppercase\n", "greeting.upper()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sentence = 'The date today is May 21, 2019.'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Split the sentence on every comma\n", "sentence.split(',')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Insert a variable into a string\n", "name = 'Emily'\n", "\n", "print('My name is {0}'.format(name))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Insert a variable in multiple places of a string and perform some arithmetic in the process\n", "age = 30\n", "\n", "print('I am {0} years old. Tomorrow is my birthday. I will be {1}.'.format(age, age+1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Combine strings\n", "first = 'Hello'\n", "last = 'World'\n", "\n", "print(first + last)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A space is also a string. 
Combine three strings together\n", "print(first + ' ' + last)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lists\n", "\n", "Lists are another data structure in Python and are identified with square brackets. List items are separated by commas: `[a, b, c, ...]`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mylist = [1, 2, 3, 4, 5]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# How long is my list?\n", "len(mylist)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Python indexes starting at zero. Retrieve the first item in my list:\n", "mylist[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the second item in my list\n", "mylist[1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the last item in my list\n", "mylist[-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get items indexed 0 to 2 (the end index, 3, is excluded)\n", "mylist[0:3]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get items indexed 2 and 3 (the end index, 4, is excluded)\n", "mylist[2:4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get the first 4 items\n", "mylist[:4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get items indexed from 3 onwards\n", "mylist[3:]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get every second item in my list\n", "mylist[::2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Get all items in my list in reverse order\n", "mylist[::-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 
Append the number 6 to the end of my list\n", "mylist.append(6)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(mylist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dictionaries\n", "\n", "A Python dictionary is a way of storing named items together. The items in a dictionary are called key-value pairs. The 'key' is the name of the item. The 'value' can be any data type or object and can be mixed within the same dictionary." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person = {'name': 'Sam', 'age': 22, 'weight': 170.2}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(person)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person['name']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person['age']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person.keys()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person.values()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "person.items()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tuples" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = (1, 2, 3) # is a tuple\n", "b = [1, 2, 3] # is a list" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Main difference: items in a tuple can't be modified**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[1] = 7" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": 
[ "b[1] = 7\n", "\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**When to use tuples:** Use tuples for groupings of items that go together and shouldn't be changed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "account = ('admin', 'john.smith@company.com', '12345')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "username, email, password = account" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(email)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loops\n", "\n", "Loops allow you to perform repeated steps." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in [1, 2, 3, 4, 5]:\n", " print(i)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(5):\n", " print(i)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "value = 1\n", "limit = 50\n", "\n", "while value < limit:\n", " value = value*2\n", " print(value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conditionals\n", "\n", "You can control the flow of code using `if` statements. In Python there is no need to write `then`. You end your `if` statement with a colon (`:`), and then start your next line with an indent. The indented code beneath an `if` statement is how Python knows what code to execute if the `if` statement evaluates as true.\n", "\n", "The other conditional structures, like `else` and `elif` (else-if) work the same way." 
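] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch of the indentation rule described above (the variable `temperature` and the threshold of 25 are made-up values, chosen only for illustration): the indented line belongs to the `if` statement, while the unindented `print` after it runs whether or not the condition is true." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# `temperature` and the threshold 25 are hypothetical values for illustration\n", "temperature = 30\n", "\n", "if temperature > 25:\n", "    # Indented: runs only when the condition is true\n", "    print('It is warm')\n", "\n", "# Not indented: runs regardless of the condition\n", "print('This line runs either way')"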
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "greeting = 'Hello'\n", "\n", "if greeting == 'Aloha':\n", "    print('Aloha')\n", "else:\n", "    print('Hi')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "today = 'Tuesday'\n", "\n", "# Compare values with `==`; `is` tests object identity, not equality\n", "if today == 'Tuesday':\n", "    print('It is Tuesday')\n", "elif today == 'Wednesday':\n", "    print('Today is Wednesday')\n", "else:\n", "    print('Today is {}'.format(today))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`True` and `False` are special keywords in Python and must be capitalized exactly as shown." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sleepy = False\n", "\n", "if sleepy:\n", "    print('Drink coffee')\n", "    " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = True\n", "b = True\n", "\n", "a and b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = True\n", "b = False\n", "\n", "a and b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a or b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = False\n", "b = False\n", "\n", "a and b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "not a and not b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fancy Trick: List Comprehension\n", "\n", "If you're creating a list or similar object that has a repeating pattern, you can use a trick called list comprehension. 
It is a more compact way of expressing an idea that would otherwise take a for-loop to express.\n", "\n", "The examples below come from: https://hackernoon.com/list-comprehension-in-python-8895a785550b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = [2*i for i in range(5)]\n", "\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above could also be expressed as a for loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = []\n", "for i in range(5):\n", "    a.append(2*i)\n", "    \n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the numbers common to two lists using a for loop:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list_a = [1, 2, 3, 4]\n", "list_b = [2, 3, 4, 5]\n", "\n", "common_num = []\n", "\n", "for a in list_a:\n", "    for b in list_b:\n", "        if a == b:\n", "            common_num.append(a)\n", "\n", "print(common_num)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the numbers common to two lists using a list comprehension:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list_a = [1, 2, 3, 4]\n", "list_b = [2, 3, 4, 5]\n", "\n", "common_num = [a for a in list_a for b in list_b if a == b]\n", "\n", "print(common_num)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Return every pair of numbers from the two lists that are not equal, as tuples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list_a = [1, 2, 3]\n", "list_b = [2, 7]\n", "\n", "different_num = [(a, b) for a in list_a for b in list_b if a != b]\n", "\n", "print(different_num)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Functions\n", "\n", "Python lets you define functions that encapsulate a set of instructions, taking inputs and returning outputs." 
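] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a small sketch of inputs and outputs (the function `rectangle_area` and its parameters are hypothetical names used only for illustration), a function can take more than one input, and the value it returns can be stored in a variable:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# `rectangle_area`, `width`, and `height` are hypothetical names for illustration\n", "def rectangle_area(width, height):\n", "    # Return the product of the two inputs\n", "    return width * height\n", "\n", "area = rectangle_area(3, 4)\n", "print(area)"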
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def square(x):\n", " return x**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "square(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def greet_me(name):\n", " greeting = 'Hello ' + name + '!'\n", " return greeting" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "greet_me('Mikhail')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading and Writing Files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reading" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filename = 'Data/sonnet33.txt'\n", "\n", "file = open(filename, 'r')\n", "sonnet = file.readlines()\n", "file.close()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(sonnet)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(len(sonnet)):\n", " print(sonnet[i])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing a File" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The famous first paragraph from Herman Melville's classic, Moby Dick; or, The Whale.\n", "mobydick = \"Call me Ishmael. Some years ago- never mind how long precisely- \\\n", " having little or no money in my purse, and nothing particular to \\\n", " interest me on shore, I thought I would sail about a little and \\\n", " see the watery part of the world. It is a way I have of driving \\\n", " off the spleen and regulating the circulation. 
Whenever I find \\\n", " myself growing grim about the mouth; whenever it is a damp, \\\n", " drizzly November in my soul; whenever I find myself involuntarily \\\n", " pausing before coffin warehouses, and bringing up the rear of every \\\n", " funeral I meet; and especially whenever my hypos get such an upper \\\n", " hand of me, that it requires a strong moral principle to prevent me \\\n", " from deliberately stepping into the street, and methodically \\\n", " knocking people's hats off- then, I account it high time to get to \\\n", " sea as soon as I can. This is my substitute for pistol and ball. \\\n", " With a philosophical flourish Cato throws himself upon his sword; \\\n", " I quietly take to the ship. There is nothing surprising in this. If \\\n", " they but knew it, almost all men in their degree, some time or other, \\\n", " cherish very nearly the same feelings towards the ocean with me.\".split(' ')\n", "\n", "# Above, we split the text block on the long chunk of white space I used at the beginning\n", "# of most lines in order to align the text" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use Python's line continuation character, `\\`, to split a very long string across multiple lines inside the Notebook cell. However, because we aligned the text, we introduced a lot of white space at the beginning of each line. The `.split()` method breaks the long string into a list of shorter strings, splitting on the long runs of whitespace." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(mobydick)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's write the text line by line to a file called `mobydick.txt`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('Data/mobydick.txt', 'w') as file:\n", "    for line in mobydick:\n", "        # write() does not add a line break, so append one to each line\n", "        file.write(line + '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check to make sure..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open('Data/mobydick.txt', 'r') as file:\n", "    lines = file.readlines()\n", "    \n", "print(lines)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Accessing the Internet" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, we will use the `yahoo_fin` Python package, which scrapes data from Yahoo Finance. It can do many impressive things, but we're only going to use it to fetch the current stock price of a few companies. We need to look these up by their ticker symbols.\n", "\n", "* MSFT = Microsoft\n", "* AAPL = Apple\n", "* NFLX = Netflix" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# stock_info is a submodule of the Yahoo Finance package.\n", "# We import it by itself and assign it to\n", "# the name `si`\n", "from yahoo_fin import stock_info as si\n", "\n", "stocks = ['MSFT', 'AAPL', 'NFLX']\n", "\n", "for stock in stocks:\n", "    price = si.get_live_price(stock)\n", "    print('{}: ${:.2f}'.format(stock, price))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The syntax `{:.2f}` might seem a bit confusing, but we're just including some formatting instructions: present the stock price to two decimal places." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python Packages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the last example we imported a package called `yahoo_fin` that accessed Yahoo Finance. Python has *many* packages you can install that extend Python's capabilities tremendously. 
Anybody can write their own packages and submit them to the community." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with Numerical Data using `numpy`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def circle_area(radius):\n", "    # The area of a circle is pi times the radius squared.\n", "    return np.pi * radius**2\n", "\n", "def circumference(radius):\n", "    # The circumference of a circle is twice the radius times pi.\n", "    return 2 * np.pi * radius" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "circle_area(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "circumference(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making Pretty Charts with `matplotlib`\n", "\n", "Use the most popular plotting library to make some charts in Python." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1, 2, 3, 4, 5]\n", "y = [1, 9, 5, 7, 8]\n", "\n", "plt.plot(x, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(0, 5, 20)\n", "y = circle_area(x)\n", "\n", "print(x)\n", "\n", "print(y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(x, y)\n", "\n", "axes = plt.gca()\n", "axes.set_title('Circle Area as a Function of Radius')\n", "axes.set_xlabel('Radius')\n", "axes.set_ylabel('Area')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "microsoft = si.get_data('msft' , start_date = '01/01/1999')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = microsoft.index.values\n", "y = microsoft['close'].values\n", "\n", "plt.plot(x, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.6", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }