{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "First, compute the number of pushes to *github* repositories aggregated per month, as described [here](http://www.githubarchive.org/). Perform the query first considering all repositories, the restricting the search to \\*.github.io repositories. The query restricted to \\*.github.io repositories is the following:\n", "```\n", "SELECT LEFT(created_at, 7) as month, COUNT(*) as pushes\n", "FROM [githubarchive:github.timeline]\n", "WHERE\n", " type='CreateEvent' AND\n", " RIGHT(repository_name, 10) = '.github.io'\n", "GROUP BY month\n", "ORDER BY month DESC\n", "```\n", "\n", "Then process the created CSV files with the following script, to make the data readable by [NVD3](http://nvd3.org/):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import csv\n", "import datetime\n", "\n", "data = {}\n", "for name in ['All', 'IO']:\n", " with open('../data/push_monthly_%s_results-20140115.csv' % name, 'rb') as f:\n", " data[name] = [x for x in csv.reader(f)][1:]\n", " # Convert 'YYYY-MM' to unix epoch in milliseconds\n", " times = [datetime.date(int(x[0].split('-')[0]), int(x[0].split('-')[1]), 1)- datetime.timedelta(days=1) for x in data[name]]\n", " data[name] = zip(times, [x[1] for x in data[name]][1:]) \n", "\n", "# Compute the ratio between pushes to *.github.io repos vs total pushes, for each month\n", "ratios = [(v[0][0].strftime('%s000'), float(v[0][1])/float(v[1][1])*100.0) for v in zip(data['IO'], data['All'])] \n", "import json\n", "output = []\n", "output.append({'key': '% push to *.github.io vs. total pushes', 'values' : ratios})\n", "with open('../data/githubIO.json', 'w') as f:\n", " f.write(json.dumps(output))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 65 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, place your HTML/JS code to display the data. In this case the code is placed in the `_includes` directory so it can be loaded with the **Jekyll** directive:\n", "\n", "`{% include bladh/github_io_infographic.html %}` \n", "\n", "from any blog post. The HTML code to generate the graph is adapted from [this](http://nvd3.org/ghpages/cumulativeLine.html) example." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "cat ../_includes/bladh/github_io_infographic.html" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "\n", "\n", "
\n", " \n", " See\n", " how to make this\n", "\n", "
\n", "\n", "\n" ] } ], "prompt_number": 66 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }