{
 "metadata": {
  "name": "",
  "signature": "sha256:cff615b1a66b493f4dd109218fc6a4928586350fd1e2c6da0206997a4a1c1b1f"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Calculating how many emails pass through the FSL mailing list"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "I searched the FSL mailing list archives, found the summary page for [April 2014][1] and copied and pasted the table of contents into a text file. That file is saved on GitHub and can be accessed with the [urllib2][2] python library. \n",
      "\n",
      "[1]: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind1404&L=fsl&X=1908DF04DF624BAC5B&Y\n",
      "\n",
      "[2]: https://docs.python.org/2/library/urllib2.html"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import urllib2\n",
      "archive_file = urllib2.urlopen('https://raw.githubusercontent.com/HappyPenguin/OpenScience/master/FSLmailinglist_archive_April2014.txt')\n",
      "archive_lines = archive_file.readlines()\n",
      "archive_lines = [ line.rstrip() for line in archive_lines]\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The first few lines of this file are:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "for line in archive_lines[:10]:\n",
      "    print line"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\"radiating\" normalised functional images (5 messages)\n",
        ".vmr and .vmp BESA output files (1 message)\n",
        "1 PostDoc + 2 PhD positions in Computational Cognitive NeuroImaging; University of Birmingham, UK (1 message)\n",
        "2x2 ANCOVA results visualization (6 messages)\n",
        "3D viewer gif? (6 messages)\n",
        "a problem with read_avw (2 messages)\n",
        "About probtrackx2 rseed flag (2 messages)\n",
        "Advanced Clinical Neuroimaging Course, Brussels, May 30-31, 2014 (1 message)\n",
        "Analysis steps for resting state fMRI (12 messages)\n",
        "ASL Perfusion Values (2 messages)\n"
       ]
      }
     ],
     "prompt_number": 2
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "So the first thing we can find out are how many distinct email subjects were used in the month of April:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "len(archive_lines)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 3,
       "text": [
        "286"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "But, to be honest, we don't really care about the subject lines; we want to know how many emails were sent in total. And that requires a little parsing of this text file. Specifically we'll split each line at the *last* '(' character, discard everything to the left of that, and strip out the constant string ' messages)' from the part that's left."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "archive_messages_n = [ x.rsplit('(',1)[1].rstrip(' messages)') for x in archive_lines ]\n",
      "print archive_messages_n[:10]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "['5', '1', '1', '6', '6', '2', '2', '1', '12', '2']\n"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We can convert those values to integers and save them in a numpy array data frame so we can get a total value!"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "array = np.array(archive_messages_n, dtype=np.int)\n",
      "print array.sum()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "871\n"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**871 emails in one month! Crazy days!**\n",
      "\n",
      "Thank you FMRIB ;)"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 43
    }
   ],
   "metadata": {}
  }
 ]
}