{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Analyse in Fiji and plot using R"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook demonstrates how to analyze a batch of images associated to the paper ['Subdiffraction imaging of centrosomes reveals higher-order organizational features of pericentriolar material.'](http://dx.doi.org/10.1038/ncb2591) using the scripting facility available in [Fiji](https://fiji.sc/).",
    "\n",
    "In the paper, Images were analyzed using Matlab scripts. In this notebook, we use the *Auto Threshold* and *Analyze Particles* Fiji plugins. The results are saved in a CVS file. We then use R to plot the results and reproduce graphics from the paper."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analyse data in Fiji using the scripting facility"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set up\n",
    "\n",
    "1. Install Fiji\n",
    "2. Install the OMERO.insight-ij plugin. See http://help.openmicroscopy.org/imagej.html#plugins\n",
    "3. Installing the plugin adds the dependencies required to connect to IDR using the *Script Editor* of Fiji."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Analysis\n",
    "Analysing the full set of images will take roughly 30mins depending on the network speed.\n",
    "To analyse the data, follow the steps below:\n",
    "\n",
    "1. Select the Project https://idr.openmicroscopy.org/webclient/?show=project-51\n",
    "2. Note its ID displayed on the right-hand panel\n",
    "3. Launch Fiji\n",
    "4. Go to *File > New > Script...*\n",
    "5. A dialog pops up. In the *Language* menu select *Python*\n",
    "6. Copy the content of the script [idr0021.jy](includes/idr0021.jy) into the text editor\n",
    "7. Edit USERNAME and PASSWORD\n",
    "8. Click *Run*\n",
    "9. The script will process all Images in the specified Project. The following steps will be applied to each image:\n",
    "    - Convert floating-point pixel-type to 8-bit using *8-bit* \n",
    "    - Apply *Auto Threshold* with the *method* set to *MaxEntropy* and *stack* option selected\n",
    "    - Apply *Analyze Particles* with the *Size* set to *10-Infinity* and the *Pixel units* checked.\n",
    "    - Parse the results to only keep a summary of the statistics of the channel which name is matching the name of the dataset are kept. To identify the channel, we read the map annotation associated to the image.The map annotation contains pairs (emission wavelength, protein).\n",
    "    - Save the summary as a CSV file. One file per Image\n",
    "10. All the CSV files are then aggregated into a single CSV file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot with R"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A simple example which reads the CSV file, performs some basic statistics on it and creates a plot which should be similar to [Figure 1](https://www.nature.com/articles/ncb2591/figures/1)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the data from the CSV file as R dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Read the CSV file\n",
    "home <- path.expand('~')\n",
    "path <- paste0(home, \"/notebooks/includes/idr0021_merged_results.csv\")\n",
    "df <- read.csv(path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# CEP120 is represented by two datasets'CEP120/20111106' and 'CEP120/20111209',\n",
    "# just combine them to 'CEP120'\n",
    "df$Dataset <- as.character(df$Dataset)\n",
    "df[ df == 'CEP120/20111106' ] <- 'CEP120'\n",
    "df[ df == 'CEP120/20111209' ] <- 'CEP120'\n",
    "df$Dataset <- as.factor(df$Dataset)\n",
    "\n",
    "# Remove the extreme (very likely wrong) values of the NEDD1-C1 dataset\n",
    "df <- df[ which(df$Bounding_Box < 100), ]\n",
    "# Use the bounding box of the biggest shapes for the specified channel\n",
    "df$diameter <- sqrt(df$Bounding_Box)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Order the data from lowest to highest mean diameter (to match the order of figure 1)\n",
    "ag <-aggregate(df$diameter ~ df$Dataset, df, mean)\n",
    "orderedDatasets <- factor(df$Dataset, levels=ag[order(ag$`df$diameter`), 'df$Dataset'])\n",
    "\n",
    "plot(df$diameter ~ orderedDatasets, ylab='Bounding Box', xlab=\"Protein\", cex.axis=0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Is there a significant difference between the proteins?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "options(warn=-1)\n",
    "# One-way analysis of variance:\n",
    "fit <- aov(df$diameter ~ df$Dataset)\n",
    "summary(fit)\n",
    "\n",
    "# Two-sample Wilcoxon test ('Mann-Whitney') of all pairwise combinations:\n",
    "combins <- combn(levels(df$Dataset), 2)\n",
    "params_list <- split(as.vector(combins), rep(1:ncol(combins), each = nrow(combins)))\n",
    "testResults <- data.frame()\n",
    "for (param in params_list) {\n",
    "  testdf <- subset(df, df$Dataset %in% param)\n",
    "  pval <- wilcox.test(formula = diameter ~ Dataset, data = testdf)$p.value\n",
    "  testResults<-rbind(testResults, data.frame(Protein_1=param[1], Protein_2=param[2], p_value=pval))\n",
    "}\n",
    "testResults"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "### License\n",
    "\n",
    "Copyright (C) 2018 University of Dundee. All Rights Reserved.\n",
    "\n",
    "This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.\n",
    "\n",
    "This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.4.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}