{ "metadata": { "name": "", "signature": "sha256:f867a045157c051efd96af92f3c1ff0ff0dd7468df91659d3e77106659d933a7" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Introduction to Python (A crash course)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For those of you who know Python, this aims to refresh your memory. For those of you who don't know Python -- but do know programming -- this class aims to give you an idea of how Python is similar to, and different from, your favorite programming language. " ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Printing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the interactive Python environment:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print \"Hello World\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From a file:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#!/usr/bin/env python\n", "\n", "print \"Hello World!\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Standard I/O" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Writing to standard output:\n", "\n", "print \"Python is awesome!\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Reading from standard input and writing to standard output\n", "\n", "name = raw_input(\"What is your name? \")\n", "print name" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Data types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Basic data types:\n", "\n", "1. Strings\n", "2. Integers\n", "3. Floats\n", "4. Booleans\n", "\n", "These are all objects in Python. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#String\n", "a = \"apple\"\n", "type(a)\n", "#print type(a)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "#Integer \n", "b = 3\n", "type(b)\n", "#print type(b)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "#Float \n", "c = 3.2\n", "type(c)\n", "#print type(c)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "#Boolean\n", "d = True\n", "type(d)\n", "#print type(d)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python **doesn't require explicitly declared variable types** like C and other languages. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pay special attention to assigning floating point values to variables or you may get values you do not expect in your programs." ] }, { "cell_type": "code", "collapsed": false, "input": [ "14/b" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "14/c" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you divide an integer by an integer, it will return an answer rounded to the nearest integer. If you want a floating point answer, one of the numbers must be a float. 
Simply appending a decimal point to one of the numbers will do the trick:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "14./b" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Strings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "String manipulation will be very important for many of the tasks we will do. Therefore, let us play around a bit with strings." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#Concatenating strings\n", "\n", "a = \"Hello\" # String\n", "b = \" World\" # Another string\n", "print a + b # Concatenation" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Slicing strings\n", "\n", "a = \"World\"\n", "\n", "print a[0] # First character\n", "print a[-1] # Last character\n", "print \"World\"[0:4] # Characters at indices 0 through 3\n", "print a[::-1] # The string reversed" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Popular string functions\n", "a = \"Hello World\"\n", "print \"-\".join(a) # Put \"-\" between every character\n", "print a.startswith(\"Wo\")\n", "print a.endswith(\"rld\")\n", "print a.replace(\"o\",\"0\").replace(\"d\",\"[)\").replace(\"l\",\"1\")\n", "print a.split() # Split on whitespace\n", "print a.split('o') # Split on the letter 'o'" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Strings are an example of an **immutable** data type. Once you create a string, you cannot change any of its characters. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "string = \"string\"\n", "string[-1] = \"y\" # Attempting to change the last character to \"y\" raises a TypeError" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Whitespace in Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python uses indents and whitespace to group statements together. 
To write a short loop in C, you might use:\n", "\n", " ```c\n", " for (int i = 0; i < 5; i++) {\n", "     printf(\"Hi! \\n\");\n", " }\n", " ```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python does not use curly braces like C, so the same program as above is written in Python as follows:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for i in range(5):\n", "    print \"Hi! \"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you have nested for-loops, there is a further indent for the inner loop." ] }, { "cell_type": "code", "collapsed": false, "input": [ "for i in range(3):\n", "    for j in range(3):\n", "        print i, j\n", "\n", "    print \"This statement is within the i-loop, but not the j-loop\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "File I/O" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Writing to a file\n", "with open(\"example.txt\", \"w\") as f:\n", " f.write(\"Hello World! \\n\")\n", " f.write(\"How are you? 
\\n\")\n", " f.write(\"I'm fine.\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Reading from a file\n", "with open(\"example.txt\", \"r\") as f:\n", "    data = f.readlines()\n", "    for line in data:\n", "        words = line.split()\n", "        print words" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Count lines and words in a file\n", "lines = 0\n", "words = 0\n", "the_file = \"example.txt\"\n", "\n", "with open(the_file, 'r') as f:\n", "    for line in f:\n", "        lines += 1\n", "        words += len(line.split())\n", "print \"There are %i lines and %i words in the %s file.\" % (lines, words, the_file)\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Lists, Tuples, Sets and Dictionaries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numbers and strings alone are not enough! We need data types that can hold multiple values." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Lists: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lists are **mutable**, meaning they can be altered after they are created. A list is a collection of data, and that data can be of differing types." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "groceries = []\n", "\n", "# Add to list\n", "groceries.append(\"oranges\") \n", "groceries.append(\"meat\")\n", "groceries.append(\"asparagus\")\n", "\n", "# Access by index\n", "print groceries[2]\n", "print groceries[0]\n", "\n", "# Find number of things in list\n", "print len(groceries)\n", "\n", "# Sort the items in the list\n", "groceries.sort()\n", "print groceries\n", "\n", "# List Comprehension (compare values with !=, not identity with 'is not')\n", "veggie = [x for x in groceries if x != \"meat\"]\n", "print veggie\n", "\n", "# Remove from list\n", "groceries.remove(\"asparagus\")\n", "print groceries\n", "\n", "#The list is mutable\n", "groceries[0] = 2\n", "print groceries" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Tuples: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tuples are an **immutable** type. Like strings, once you create them, you cannot change them. It is their immutability that allows you to use them as keys in dictionaries. However, they are similar to lists in that they are a collection of data, and that data can be of differing types. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Tuple grocery list\n", "\n", "groceries = ('orange', 'meat', 'asparagus', 2.5, True)\n", "\n", "print groceries\n", "\n", "#print groceries[2]\n", "\n", "#groceries[2] = 'milk' # Would raise a TypeError because tuples are immutable" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Sets: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A set is an unordered collection of items that cannot contain duplicates. Sets support the same operations as sets in mathematics, such as union, intersection and difference." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "numbers = range(10)\n", "evens = [2, 4, 6, 8]\n", "\n", "evens = set(evens)\n", "numbers = set(numbers)\n", "\n", "# Use difference to find the odds\n", "odds = numbers - evens\n", "\n", "print odds\n", "\n", "# Note: Set also allows for use of union (|), and intersection (&)\n", "\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Dictionaries: " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A dictionary is a map of keys to values. **Keys must be unique**." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# A simple dictionary\n", "\n", "simple_dic = {'cs591': 'data-mining tools'}\n", "\n", "# Access by key\n", "print simple_dic['cs591']" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "\n", "# A longer dictionary\n", "classes = {\n", " 'cs591': 'data-mining tools',\n", " 'cs565': 'data-mining algorithms'\n", "}\n", "\n", "# Check if item is in dictionary\n", "print 'cs530' in classes\n", "\n", "# Add new item\n", "classes['cs530'] = 'algorithms'\n", "print classes['cs530']\n", "\n", "# Print just the keys\n", "print classes.keys()\n", "\n", "# Print just the values\n", "print classes.values()\n", "\n", "# Print the items in the dictionary\n", "print classes.items()\n", "\n", "# Print dictionary pairs another way\n", "for key, value in classes.items():\n", " print key, value" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Complex Data structures\n", "# Dictionaries inside a dictionary!\n", "\n", "professors = {\n", " \"prof1\": {\n", " \"name\": \"Evimaria Terzi\",\n", " \"department\": \"Computer Science\",\n", " \"research interests\": [\"algorithms\", \"data mining\", \"machine learning\",]\n", " },\n", " \"prof2\": {\n", " \"name\": \"Chris Dellarocas\",\n", " 
\"department\": \"Management\",\n", " \"research interests\": [\"market analysis\", \"data mining\", \"computational education\",],\n", " }\n", "}\n", "\n", "for prof in professors:\n", " print professors[prof][\"name\"]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Functions" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Define a function that takes two arguments\n", "def displayperson(name, age):\n", "    # str() lets age be passed as a number or as a string\n", "    print \"My name is \" + name + \" and I am \" + str(age) + \" years old.\"\n", "\n", "displayperson(\"Bob\", 40)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Libraries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python is a high-level open-source language, and the _Python world_ is inhabited by many packages or libraries that provide useful things like array operations, plotting functions, and much more. We can (and we should) import libraries of functions to expand the capabilities of Python in our programs. 
\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import random\n", "myList = [2, 109, False, 10, \"data\", 482, \"mining\"]\n", "random.choice(myList)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "from random import shuffle\n", "x = [[i] for i in range(10)]\n", "shuffle(x)\n", "print x" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "APIs" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Getting data from an API\n", "\n", "import requests\n", "\n", "width = '200'\n", "height = '300'\n", "response = requests.get('http://placekitten.com/g/' + width + '/' + height)\n", "\n", "print response\n", "\n", "with open('kitten.jpg', 'wb') as f:\n", " f.write(response.content)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.display import Image\n", "Image(filename=\"kitten.jpg\")" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# Code for setting the style of the notebook\n", "from IPython.core.display import HTML\n", "def css_styling():\n", " styles = open(\"../theme/custom.css\", \"r\").read()\n", " return HTML(styles)\n", "css_styling()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }