{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Automating Microsoft Office with Python\n", "\n", "Windows applications, for many years, have provided a [COM](http://en.wikipedia.org/wiki/Component_Object_Model) API for automation. This includes Microsoft Office as well.\n", "\n", "[pywin32](http://sourceforge.net/projects/pywin32/) is a library that lets you do many interesting things in Windows, including access these COM APIs.\n", "\n", "For example, to open PowerPoint and draw a circle, this is what it takes:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import win32com.client\n", "\n", "# Open PowerPoint\n", "Application = win32com.client.Dispatch(\"PowerPoint.Application\")\n", "\n", "# Add a presentation\n", "Presentation = Application.Presentations.Add()\n", "\n", "# Add a slide with a blank layout (12 stands for blank layout)\n", "Base = Presentation.Slides.Add(1, 12)\n", "\n", "# Add an oval. Shape 9 is an oval.\n", "oval = Base.Shapes.AddShape(9, 100, 100, 100, 100)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You'll have to try this out to see the result, but just FYI, this will open a new PowerPoint window and add a slide with a circle in it.\n", "\n", "This opens up a lot of opportunities for slideware. Similarly, we can open an Excel application, add a circle, and change a few cells." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Open Excel\n", "Application = win32com.client.Dispatch(\"Excel.Application\")\n", "\n", "# Show Excel. Unlike PPT, Word & Excel open up \"hidden\"\n", "Application.Visible = 1\n", "\n", "# Add a workbook\n", "Workbook = Application.Workbooks.Add()\n", "\n", "# Take the active sheet\n", "Base = Workbook.ActiveSheet\n", "\n", "# Add an oval. Shape 9 is an oval.\n", "oval = Base.Shapes.AddShape(9, 100, 100, 100, 100)\n", "\n", "# In the first row, add Values: 0.0, 0.5, 1.0\n", "Base.Cells(1, 1).Value = 'Values'\n", "Base.Cells(1, 2).Value = 0.0\n", "Base.Cells(1, 3).Value = 0.5\n", "Base.Cells(1, 4).Value = 1.0" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This means one can go about creating Excel models directly with Python." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Picturing the IMDb Top 250\n", "\n", "Let's begin by creating a slide that shows all of the [Top 250 movies](http://www.imdb.com/chart/top) on the IMDb.\n", "\n", "First, let's load all the movies." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from lxml.html import parse\n", "\n", "tree = parse('http://www.imdb.com/chart/top')\n", "movies = tree.findall('.//table[@class=\"chart\"]//td[@class=\"titleColumn\"]//a')\n", "movies[0].text_content()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "'The Shawshank Redemption'" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can show these movies. But before that, we can't remember these numbers for the shapes (like 9 is for circle). Let's pre-define those in line with how Office uses them and import them." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from MSO import *" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's draw each movie poster as a little box on a 25x10 grid. We don't have the images yet, but first, let's just draw the rectangles." ] }, { "cell_type": "code", "collapsed": false, "input": [ "Base = Presentation.Slides.Add(1, ppLayoutBlank)\n", "\n", "width, height = 28, 41\n", "for i, movie in enumerate(movies):\n", " x = 10 + width * (i % 25)\n", " y = 100 + height * (i // 25)\n", " r = Base.Shapes.AddShape(\n", " msoShapeRectangle,\n", " x, y,\n", " width, height)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "It would be nice to get posters into those, so let's scrape the posters." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import os\n", "from urlparse import urljoin\n", "from urllib import urlretrieve\n", "from hashlib import md5\n", "\n", "# We'll keep the files under an img/ folder\n", "if not os.path.exists('img'):\n", " os.makedirs('img')\n", " \n", "def filename(movie):\n", " '''Filename = MD5 hash of its title in UTF8'''\n", " name = md5(movie.text_content().encode('utf8')).hexdigest()\n", " return os.path.join('img', name + '.jpg')\n", " \n", "for movie in movies:\n", " if os.path.exists(filename(movie)):\n", " continue\n", " \n", " url = urljoin('http://www.imdb.com/', movie.get('href'))\n", " tree = parse(url)\n", " img = tree.find('.//td[@id=\"img_primary\"]//img')\n", " urlretrieve(img.get('src'), filename(movie))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, instead of just rectangles, we'll use the posters." ] }, { "cell_type": "code", "collapsed": false, "input": [ "Base = Presentation.Slides.Add(1, ppLayoutBlank)\n", "\n", "width, height = 28, 41\n", "for i, movie in enumerate(movies):\n", " x = 10 + width * (i % 25)\n", " y = 100 + height * (i // 25)\n", " image = Base.Shapes.AddPicture(\n", " os.path.abspath(filename(movie)),\n", " LinkToFile=True,\n", " SaveWithDocument=False,\n", " Left=x, Top=y,\n", " Width=width, Height=height)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wouldn't it be nice to have these hyperlinked to the movies?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "Base = Presentation.Slides.Add(1, ppLayoutBlank)\n", "\n", "width, height = 28, 41\n", "for i, movie in enumerate(movies):\n", " x = 10 + width * (i % 25)\n", " y = 100 + height * (i // 25)\n", " image = Base.Shapes.AddPicture(\n", " os.path.abspath(filename(movie)),\n", " LinkToFile=True,\n", " SaveWithDocument=False,\n", " Left=x, Top=y,\n", " Width=width, Height=height)\n", " url = urljoin('http://www.imdb.com/', movie.get('href'))\n", " link = image.ActionSettings(ppMouseClick).Hyperlink\n", " link.Address = url\n", " link.ScreenTip = movie.text_content().encode('cp1252')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is ordered by rank, which is useful, but this makes it hard to locate a specific movie. What if we could sort this alphabetically?\n", "\n", "But then, we don't want to lose the ordering by rank either. Could we, perhaps, get these movies to move on the click of a button to alphabetical or rank order?\n", "\n", "Let's start by adding two buttons -- one to sort alphabetically and the ohter to sort by rank." ] }, { "cell_type": "code", "collapsed": true, "input": [ "Base = Presentation.Slides.Add(1, ppLayoutBlank)\n", "\n", "# Add two buttons: alphabetical and by rating\n", "button_alpha = Base.Shapes.AddShape(msoShapeRectangle, 400, 10, 150, 40)\n", "button_alpha.TextFrame.TextRange.Text = 'Alphabetical'\n", "\n", "button_rating = Base.Shapes.AddShape(msoShapeRectangle, 560, 10, 150, 40)\n", "button_rating.TextFrame.TextRange.Text = 'By rating'\n", "\n", "# Get the index position when sorted alphabetically\n", "movies_alpha = sorted(movies, key=lambda v: v.text_content())\n", "index_alpha = dict((movie.text_content(), i) for i, movie in enumerate(movies_alpha))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll create a function that moves an image along a path when a trigger is clicked. This will be applied to each of the images." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def animate(seq, image, trigger, path, duration=1.5):\n", " '''Move image along path when trigger is clicked'''\n", " effect = seq.AddEffect(\n", " Shape=image,\n", " effectId=msoAnimEffectPathDown,\n", " trigger=msoAnimTriggerOnShapeClick,\n", " )\n", " ani = effect.Behaviors.Add(msoAnimTypeMotion)\n", " ani.MotionEffect.Path = path\n", " effect.Timing.TriggerType = msoAnimTriggerWithPrevious\n", " effect.Timing.TriggerShape = trigger\n", " effect.Timing.Duration = duration" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we draw all the images. After drawing them, we specify one animation for alphabetical ordering, and another for ordering by rating." ] }, { "cell_type": "code", "collapsed": false, "input": [ "seq_alpha = Base.TimeLine.InteractiveSequences.Add()\n", "seq_rating = Base.TimeLine.InteractiveSequences.Add()\n", "\n", "width, height = 28, 41\n", "for i, movie in enumerate(movies):\n", " x = 10 + width * (i % 25)\n", " y = 100 + height * (i // 25)\n", " image = Base.Shapes.AddPicture(\n", " os.path.abspath(filename(movie)),\n", " LinkToFile=True,\n", " SaveWithDocument=False,\n", " Left=x, Top=y,\n", " Width=width, Height=height)\n", " url = urljoin('http://www.imdb.com/', movie.get('href'))\n", " link = image.ActionSettings(ppMouseClick).Hyperlink\n", " link.Address = url\n", " link.ScreenTip = movie.text_content().encode('cp1252')\n", " \n", " # Alphabetical \n", " index = index_alpha[movie.text_content()]\n", " animate(seq_alpha, image, trigger=button_alpha, path='M0,0 L{:.3f},{:.3f}'.format(\n", " (10 + width * (index % 25) - x) / 720.,\n", " (100 + height * (index // 25) - y) / 540.,\n", " ))\n", " \n", " # By rating\n", " animate(seq_rating, image, trigger=button_rating, path='M{:.3f},{:.3f} L0,0'.format(\n", " (10 + width * (index % 25) - x) / 720.,\n", " (100 + height * (index // 25) - y) / 540.,\n", " ))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Real-life examples\n", "\n", "You can see how these were put to use at [Gramener](http://gramener.com/):\n", "\n", "- [Vijay Karnataka's election coverage](http://gramener.com/vijaykarnataka/) was produced entirely in PowerPoint using Python. (And in Kannada, no less!)\n", "- [Appstore downloads](http://gramener.com/appstore/appstore.pptx) -- a mini interactive application that shows the number of downloads from an app store, broken up by country, device, customer segment and time.\n", "- [Revenue breakup](http://gramener.com/fmcg/fmcg-revenue-breakup.pptx) of an FMCG company as a clickable application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Live tweeting\n", "\n", "Just for kicks, let's use PowerPoint as a dashboard to show live tweets.\n", "\n", "I picked [TwitterAPI](https://github.com/geduldig/TwitterAPI) to get streaming results, but [twython](https://github.com/ryanmcgrath/twython) and [Python Twitter Tools](https://github.com/sixohsix/twitter) look fine too." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from TwitterAPI import TwitterAPI\n", "\n", "# I'm keeping my keys and secrets in a secret file.\n", "from secret_twitter import consumer_key, consumer_secret, access_token_key, access_token_secret\n", "api = TwitterAPI(consumer_key, consumer_secret, access_token_key, access_token_secret)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This function will draw a tweet in a reasonably nice way on a slide. There's a block each for the profile picture, the text of the tweet, and the name of the user." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def draw_tweet(Base, item, pos):\n", " y = 40 + (pos % 4) * 120\n", " \n", " image = Base.Shapes.AddPicture(\n", " # To get the larger resolution image, just remove _normal from the URL\n", " item['user']['profile_image_url'].replace('_normal', ''),\n", " LinkToFile=True,\n", " SaveWithDocument=False,\n", " Left=20, Top=y,\n", " Width=100, Height=100)\n", " \n", " try:\n", " status = item['text'].encode('cp1252')\n", " except UnicodeEncodeError:\n", " status = item['text']\n", " text = Base.Shapes.AddShape(msoShapeRectangle, 130, y, 460, 100)\n", " text.Fill.ForeColor.ObjectThemeColor = msoThemeColorText1\n", " text.Fill.ForeColor.Brightness = +0.95\n", " text.Line.Visible = msoFalse\n", " text.TextFrame.TextRange.Text = status\n", " text.TextFrame.TextRange.Font.Color.ObjectThemeColor = msoThemeColorText1\n", " text.TextFrame.TextRange.ParagraphFormat.Alignment = ppAlignLeft\n", " \n", " user = Base.Shapes.AddShape(msoShapeRectangle, 600, y, 100, 100)\n", " user.Fill.ForeColor.ObjectThemeColor = msoThemeColorAccent6\n", " user.Line.Visible = False\n", " user.TextFrame.TextRange.Text = '@' + item['user']['screen_name']" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's track requests for specific words, and see what we get." ] }, { "cell_type": "code", "collapsed": false, "input": [ "Base = Presentation.Slides.Add(1, ppLayoutBlank)\n", "api.request('statuses/filter', {'track': 'beer'})\n", "\n", "for pos, item in enumerate(api.get_iterator()):\n", " draw_tweet(Base, item, pos)\n", " if pos > 10:\n", " break" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 } ], "metadata": {} } ] }