{ "metadata": { "name": "", "signature": "sha256:cff615b1a66b493f4dd109218fc6a4928586350fd1e2c6da0206997a4a1c1b1f" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Calculating how many emails pass through the FSL mailing list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I searched the FSL mailing list archives, found the summary page for [April 2014][1] and copied and pasted the table of contents into a text file. That file is saved on GitHub and can be accessed with the [urllib2][2] python library. \n", "\n", "[1]: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A1=ind1404&L=fsl&X=1908DF04DF624BAC5B&Y\n", "\n", "[2]: https://docs.python.org/2/library/urllib2.html" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib2\n", "archive_file = urllib2.urlopen('https://raw.githubusercontent.com/HappyPenguin/OpenScience/master/FSLmailinglist_archive_April2014.txt')\n", "archive_lines = archive_file.readlines()\n", "archive_lines = [ line.rstrip() for line in archive_lines]\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first few lines of this file are:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for line in archive_lines[:10]:\n", " print line" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\"radiating\" normalised functional images (5 messages)\n", ".vmr and .vmp BESA output files (1 message)\n", "1 PostDoc + 2 PhD positions in Computational Cognitive NeuroImaging; University of Birmingham, UK (1 message)\n", "2x2 ANCOVA results visualization (6 messages)\n", "3D viewer gif? (6 messages)\n", "a problem with read_avw (2 messages)\n", "About probtrackx2 rseed flag (2 messages)\n", "Advanced Clinical Neuroimaging Course, Brussels, May 30-31, 2014 (1 message)\n", "Analysis steps for resting state fMRI (12 messages)\n", "ASL Perfusion Values (2 messages)\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "So the first thing we can find out are how many distinct email subjects were used in the month of April:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "len(archive_lines)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "286" ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But, to be honest, we don't really care about the subject lines; we want to know how many emails were sent in total. And that requires a little parsing of this text file. Specifically we'll split each line at the *last* '(' character, discard everything to the left of that, and strip out the constant string ' messages)' from the part that's left." ] }, { "cell_type": "code", "collapsed": false, "input": [ "archive_messages_n = [ x.rsplit('(',1)[1].rstrip(' messages)') for x in archive_lines ]\n", "print archive_messages_n[:10]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['5', '1', '1', '6', '6', '2', '2', '1', '12', '2']\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can convert those values to integers and save them in a numpy array data frame so we can get a total value!" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "array = np.array(archive_messages_n, dtype=np.int)\n", "print array.sum()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "871\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**871 emails in one month! Crazy days!**\n", "\n", "Thank you FMRIB ;)" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 43 } ], "metadata": {} } ] }