{ "metadata": { "name": "9932_02_01" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Chapter 2, example 1\n", "====================\n", "\n", "In this example, we show how to use IPython to perform common system tasks, such as downloading a Zip file and extracting it into a new folder. Specifically, we download social data about anonymous volunteer Facebook users.\n", "\n", "The data is freely available from [Stanford's SNAP project](http://snap.stanford.edu/data/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We begin by importing the standard library modules used to download and extract compressed files." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib2, zipfile" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "url = 'http://ipython.rossant.net/'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "filename = 'facebook.zip'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We request the file with `urllib2.urlopen`, which returns a file-like object that we will read from later." ] }, { "cell_type": "code", "collapsed": false, "input": [ "downloaded = urllib2.urlopen(url + filename)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we create a new folder named `data` and save the Zip file in it." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "folder = 'data'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "mkdir $folder" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "cd $folder" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "chapter2\\data\n" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "# Write the downloaded bytes to disk in binary mode.\n", "with open(filename, 'wb') as f:\n", "    f.write(downloaded.read())" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the `zipfile` module to extract the Zip file into the current directory, which is now the `data` folder." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Extract every member of the archive into the current directory.\n", "with zipfile.ZipFile(filename) as zf:\n", "    zf.extractall('.')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Common shell commands such as `ls` work directly in IPython, even on Windows!" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "ls" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "facebook.zip\n", "" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "cd facebook" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "chapter2\\data\\facebook\n" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "ls" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.circles\n", "0.edges\n", "107.circles\n", "107.edges\n", "1684.circles\n", "1684.edges\n", "1912.circles\n", "1912.edges\n", "3437.circles\n", "3437.edges\n", "348.circles\n", "348.edges\n", "3980.circles\n", "3980.edges\n", "414.circles\n", "414.edges\n", "686.circles\n", "686.edges\n", "698.circles\n", "698.edges\n", "" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We bookmark the current `data/facebook/` directory under the name `fbdata` for later use. From now on, we can enter this directory with just `cd fbdata`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%bookmark fbdata" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 } ], "metadata": {} } ] }