{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Files\n", "\n", "Python uses file objects to interact with external files on your computer. These can be any sort of file you have on your computer: an audio file, a text file, emails, Excel documents, etc. Note: You will probably need to install certain libraries or modules to interact with those various file types, but they are easily available. (We will cover downloading modules later on in the course.)\n", "\n", "Python has a built-in open function that allows us to open and play with basic file types. First we will need a file though. We're going to use some IPython magic to create a text file!\n", "\n", "## IPython Writing a File" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting test.txt\n" ] } ], "source": [ "%%writefile test.txt\n", "Hello, this is a quick test file" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python Opening a File\n", "\n", "We can open a file with the open() function. The open function also takes in arguments (also called parameters). 
Let's see how this is used:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Open the test.txt we made earlier\n", "my_file = open('test.txt')" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'Hello, this is a quick test file'" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# We can now read the file\n", "my_file.read()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# But what happens if we try to read it again?\n", "my_file.read()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This happens because the reading \"cursor\" is at the end of the file after the first read, so there is nothing left to read. We can reset the \"cursor\" like this:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Seek to the start of file (index 0)\n", "my_file.seek(0)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'Hello, this is a quick test file'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Now read again\n", "my_file.read()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get the contents as a list of lines rather than one string, we can use the readlines method. Like read(), it reads from the current cursor position to the end of the file, so seek back to the start first if the file has already been read. Use caution with large files, since everything will be held in memory. We will learn how to iterate over large files later in the course." 
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['Hello, this is a quick test file']" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Seek back to the start; readlines then returns a list of the lines in the file.\n", "my_file.seek(0)\n", "my_file.readlines()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing to a File\n", "\n", "By default, the open() function only allows us to read the file. To write to it, we need to pass a mode as the second argument: 'w' opens the file for writing and erases its existing contents, and 'w+' does the same but also allows reading. For example:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Add a second argument to the function: 'w+' opens the file for writing and reading\n", "my_file = open('test.txt','w+')" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Write to the file\n", "my_file.write('This is a new line')" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'This is a new line'" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Writing leaves the cursor at the end, so seek back to the start before reading\n", "my_file.seek(0)\n", "my_file.read()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterating through a File\n", "\n", "Let's get a quick preview of a for loop by iterating over a text file. 
First let's make a new text file with some IPython magic:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting test.txt\n" ] } ], "source": [ "%%writefile test.txt\n", "First Line\n", "Second Line" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use a little bit of flow control to tell the program to loop through every line of the file and do something:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First Line\n", "\n", "Second Line\n" ] } ], "source": [ "for line in open('test.txt'):\n", "    print(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Don't worry about fully understanding this yet; for loops are coming up soon. But we'll break down what we did above. We said that for every line in this text file, go ahead and print that line. It's important to note a few things here:\n", "\n", " 1.) We could have called the 'line' variable anything (see example below).\n", " 2.) Because we did not call .read() on the file, the whole text file was never held in memory at once.\n", " 3.) Notice the indent on the second line for print. This whitespace is required in Python.\n", "\n", "We'll learn a lot more about this later, but up next: Sets and Booleans!" 
] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First Line\n", "\n", "Second Line\n" ] } ], "source": [ "# Pertaining to the first point above\n", "for asdf in open('test.txt'):\n", "    print(asdf)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }