{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "First, compute the number of pushes to *github* repositories aggregated per month, as described [here](http://www.githubarchive.org/). Perform the query first considering all repositories, the restricting the search to \\*.github.io repositories. The query restricted to \\*.github.io repositories is the following:\n",
      "```\n",
      "SELECT LEFT(created_at, 7) as month, COUNT(*) as pushes\n",
      "FROM [githubarchive:github.timeline]\n",
      "WHERE\n",
      "    type='CreateEvent' AND\n",
      "    RIGHT(repository_name, 10) = '.github.io'\n",
      "GROUP BY month\n",
      "ORDER BY month DESC\n",
      "```\n",
      "\n",
      "Then process the created CSV files with the following script, to make the data readable by [NVD3](http://nvd3.org/):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import csv\n",
      "import datetime\n",
      "\n",
      "data = {}\n",
      "for name in ['All', 'IO']:\n",
      "    with open('../data/push_monthly_%s_results-20140115.csv' % name, 'rb') as f:\n",
      "        data[name] = [x for x in csv.reader(f)][1:]\n",
      "        # Convert 'YYYY-MM' to unix epoch in milliseconds\n",
      "        times = [datetime.date(int(x[0].split('-')[0]), int(x[0].split('-')[1]), 1)- datetime.timedelta(days=1) for x in data[name]]\n",
      "        data[name] = zip(times, [x[1] for x in data[name]][1:]) \n",
      "\n",
      "# Compute the ratio between pushes to *.github.io repos vs total pushes, for each month\n",
      "ratios = [(v[0][0].strftime('%s000'), float(v[0][1])/float(v[1][1])*100.0) for v in zip(data['IO'], data['All'])] \n",
      "import json\n",
      "output = []\n",
      "output.append({'key': '% push to *.github.io vs. total pushes', 'values' : ratios})\n",
      "with open('../data/githubIO.json', 'w') as f:\n",
      "    f.write(json.dumps(output))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 65
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Finally, place your HTML/JS code to display the data. In this case the code is placed in the `_includes` directory so it can be loaded with the **Jekyll** directive:\n",
      "\n",
      "`{% include bladh/github_io_infographic.html %}` \n",
      "\n",
      "from any blog post. The HTML code to generate the graph is adapted from [this](http://nvd3.org/ghpages/cumulativeLine.html) example."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%bash\n",
      "cat ../_includes/bladh/github_io_infographic.html"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "<style>\n",
        "\n",
        "#chart1 svg {\n",
        "  height: 400px;\n",
        "}\n",
        "\n",
        "</style>\n",
        "\n",
        "\n",
        "<div id=\"chart1\">\n",
        "  <svg></svg>\n",
        "      <a style = \"float:right; margin-top:-1.5em;padding-bottom:.5em; font-size:small\" href =\n",
        "  'http://nbviewer.ipython.org/github/LucaFoschini/lucafoschini.github.io/blob/master/notebooks/GithubIO.ipynb'>See\n",
        "  how to make this</a>\n",
        "\n",
        "</div>\n",
        "\n",
        "<script type=\"text/javascript\">\n",
        " d3.json('/data/githubIO.json', function(data) {\n",
        "\n",
        "nv.addGraph(function() {\n",
        "  var chart = nv.models.stackedAreaChart()\n",
        "                .x(function(d) { return d[0] })\n",
        "                .y(function(d) { return d[1] })\n",
        "                .tooltips(true)\n",
        "                .tooltipContent(function(key, y, e, graph) { return e})\n",
        "                .showControls(false);\n",
        "\n",
        "  chart.xAxis\n",
        "      .showMaxMin(false)\n",
        "      .tickFormat(function(d) { return d3.time.format('%x')(new Date(d)) });\n",
        "\n",
        "  chart.yAxis\n",
        "      .tickFormat(d3.format(',.2f'));\n",
        "\n",
        "  d3.select('#chart1 svg')\n",
        "    .datum(data)\n",
        "      .transition().duration(500).call(chart);\n",
        "\n",
        "  nv.utils.windowResize(chart.update);\n",
        "\n",
        "  return chart;\n",
        "});\n",
        "\n",
        " })\n",
        "</script>\n"
       ]
      }
     ],
     "prompt_number": 66
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}