{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Systems Immunogenetics Project\n",
    "\n",
    "## WNV Cleaning Steps\n",
    "\n",
    "### McWeeney Lab, Oregon Health & Science University\n",
    "\n",
    "**Authors: Gabrielle Choonoo (choonoo@ohsu.edu) and Michael Mooney (mooneymi@ohsu.edu)**\n",
    "\n",
    "## Introduction\n",
    "\n",
    "This notebook walks through the cleaning of the West Nile virus (WNV) qPCR data, step by step, for both the ByLine and ByMouse files.\n",
    "\n",
    "Required Files:\n",
    "* qPCR ByLine and ByMouse Data Files\n",
    "* This notebook (`SIG_WNV_qPCR_Data_Cleaning.ipynb`): [[Download here]](https://raw.githubusercontent.com/biodev/SIG/master/SIG_WNV_qPCR_Data_Cleaning.ipynb)\n",
    "* The R script (`qpcr_data_cleaning_functions.r`): [[Download here]](https://raw.githubusercontent.com/biodev/SIG/master/scripts/qpcr_data_cleaning_functions.r)\n",
    "* The data dictionary containing all qPCR variables (`WNV_Data_Dictionary.xlsx`): [[Download here]](https://raw.githubusercontent.com/biodev/SIG/master/data/WNV_Data_Dictionary.xlsx)\n",
    "\n",
    "**Note:** this notebook can also be downloaded as an R script (only the code blocks seen below will be included): [[Download R script here]](https://raw.githubusercontent.com/biodev/SIG/master/SIG_WNV_qPCR_Data_Cleaning.r)\n",
    "\n",
    "Required R packages:\n",
    "* `gdata` - [https://cran.r-project.org/web/packages/gdata/index.html](https://cran.r-project.org/web/packages/gdata/index.html)\n",
    "\n",
    "**All code is available on GitHub: [https://github.com/biodev/SIG](https://github.com/biodev/SIG)**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 1. Load Necessary R Packages and Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.\n",
      "\n",
      "gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.\n",
      "\n",
      "Attaching package: ‘gdata’\n",
      "\n",
      "The following object is masked from ‘package:stats’:\n",
      "\n",
      "    nobs\n",
      "\n",
      "The following object is masked from ‘package:utils’:\n",
      "\n",
      "    object.size\n",
      "\n"
     ]
    }
   ],
   "source": [
    "source('./scripts/qpcr_data_cleaning_functions.r')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 2. Read ByLine Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>UW_Line</th><th scope=col>Mating</th><th scope=col>Timepoint</th><th scope=col>Condition</th><th scope=col>Tissue</th><th scope=col>Experiment</th><th scope=col>N</th><th scope=col>dCt.mean</th><th scope=col>dCt.sd</th><th scope=col>baseline.dCt</th><th scope=col>ddCt.mean</th><th scope=col>ddCt.sd</th><th scope=col>fc.mean</th><th scope=col>fc.sd</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>1</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFIT1</td><td>3</td><td>7.060444</td><td>4.44931</td><td>12.28855</td><td>-5.228111</td><td>4.44931</td><td>37.4816</td><td>4.44931</td></tr>\n",
       "\t<tr><th scope=row>2</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFITM1</td><td>3</td><td>7.164444</td><td>1.29619</td><td>8.593667</td><td>-1.429223</td><td>1.29619</td><td>2.693016</td><td>1.29619</td></tr>\n",
       "\t<tr><th scope=row>3</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFNb1</td><td>3</td><td>14.47589</td><td>5.691944</td><td>18.79678</td><td>-4.32089</td><td>5.691944</td><td>19.98561</td><td>5.691944</td></tr>\n",
       "\t<tr><th scope=row>4</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IL12b</td><td>3</td><td>13.98678</td><td>5.186199</td><td>18.61556</td><td>-4.628778</td><td>5.186199</td><td>24.74008</td><td>5.186199</td></tr>\n",
       "\t<tr><th scope=row>5</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>WNV</td><td>3</td><td>12.963</td><td>7.274413</td><td>19.51256</td><td>-6.549556</td><td>7.274413</td><td>93.67266</td><td>7.274413</td></tr>\n",
       "\t<tr><th scope=row>6</th><td>4</td><td>16188x3252</td><td>12M</td><td>B_d12M</td><td>Brain</td><td>IFIT1</td><td>3</td><td>12.28855</td><td>0.7611464</td><td>12.28855</td><td>0</td><td>0.7611464</td><td>1</td><td>0.7611464</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|llllllllllllll}\n",
       "  & UW_Line & Mating & Timepoint & Condition & Tissue & Experiment & N & dCt.mean & dCt.sd & baseline.dCt & ddCt.mean & ddCt.sd & fc.mean & fc.sd\\\\\n",
       "\\hline\n",
       "\t1 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFIT1 & 3 & 7.060444 & 4.44931 & 12.28855 & -5.228111 & 4.44931 & 37.4816 & 4.44931\\\\\n",
       "\t2 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFITM1 & 3 & 7.164444 & 1.29619 & 8.593667 & -1.429223 & 1.29619 & 2.693016 & 1.29619\\\\\n",
       "\t3 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFNb1 & 3 & 14.47589 & 5.691944 & 18.79678 & -4.32089 & 5.691944 & 19.98561 & 5.691944\\\\\n",
       "\t4 & 4 & 16188x3252 & 12 & B_d12 & Brain & IL12b & 3 & 13.98678 & 5.186199 & 18.61556 & -4.628778 & 5.186199 & 24.74008 & 5.186199\\\\\n",
       "\t5 & 4 & 16188x3252 & 12 & B_d12 & Brain & WNV & 3 & 12.963 & 7.274413 & 19.51256 & -6.549556 & 7.274413 & 93.67266 & 7.274413\\\\\n",
       "\t6 & 4 & 16188x3252 & 12M & B_d12M & Brain & IFIT1 & 3 & 12.28855 & 0.7611464 & 12.28855 & 0 & 0.7611464 & 1 & 0.7611464\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "  UW_Line     Mating Timepoint Condition Tissue Experiment N  dCt.mean\n",
       "1       4 16188x3252        12     B_d12  Brain      IFIT1 3  7.060444\n",
       "2       4 16188x3252        12     B_d12  Brain     IFITM1 3  7.164444\n",
       "3       4 16188x3252        12     B_d12  Brain      IFNb1 3 14.475888\n",
       "4       4 16188x3252        12     B_d12  Brain      IL12b 3 13.986777\n",
       "5       4 16188x3252        12     B_d12  Brain        WNV 3 12.963000\n",
       "6       4 16188x3252       12M    B_d12M  Brain      IFIT1 3 12.288555\n",
       "     dCt.sd baseline.dCt ddCt.mean   ddCt.sd   fc.mean     fc.sd\n",
       "1 4.4493099    12.288555 -5.228111 4.4493099 37.481600 4.4493099\n",
       "2 1.2961898     8.593667 -1.429223 1.2961898  2.693016 1.2961898\n",
       "3 5.6919441    18.796778 -4.320890 5.6919441 19.985608 5.6919441\n",
       "4 5.1861991    18.615555 -4.628778 5.1861991 24.740076 5.1861991\n",
       "5 7.2744126    19.512556 -6.549556 7.2744126 93.672658 7.2744126\n",
       "6 0.7611464    12.288555  0.000000 0.7611464  1.000000 0.7611464"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>1430</li>\n",
       "\t<li>14</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 1430\n",
       "\\item 14\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 1430\n",
       "2. 14\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 1430   14"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Set data directory\n",
    "data_dir = \"/Users/mooneymi/Documents/SIG/WNV/qPCR\"\n",
    "\n",
    "## Read in data (byLine)\n",
    "qpcr_data = read.xls(file.path(data_dir, \"16-May-2016/Gale_qPCR_byLine_5-16-16 %282%29.xlsx\"), sheet=1)\n",
    "\n",
    "head(qpcr_data)\n",
    "dim(qpcr_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>1430</li>\n",
       "\t<li>14</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 1430\n",
       "\\item 14\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 1430\n",
       "2. 14\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 1430   14"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Replace line 82 with fixed data (special case for May 16, 2016 data)\n",
    "qpcr_data = qpcr_data[qpcr_data$UW_Line != 82, ]\n",
    "\n",
    "line_82 = read.xls(file.path(data_dir, \"18-May-2016/Gale_qPCR_byLine_5-18-16.xlsx\"), sheet=1)\n",
    "qpcr_data = rbind(qpcr_data, line_82)\n",
    "\n",
    "dim(qpcr_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>1560</li>\n",
       "\t<li>14</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 1560\n",
       "\\item 14\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 1560\n",
       "2. 14\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 1560   14"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Replace lines 54 and 58 with fixed data (special case for May 16, 2016 data)\n",
    "qpcr_data = qpcr_data[!qpcr_data$UW_Line %in% c(54, 58), ]\n",
    "\n",
    "lines_54_58 = read.xls(file.path(data_dir, \"23-May-2016/Gale_qPCR_byLine_5-23-16.xlsx\"), sheet=1)\n",
    "qpcr_data = rbind(qpcr_data, lines_54_58)\n",
    "\n",
    "dim(qpcr_data)"
   ]
  },
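  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two replacement cells above follow the same pattern: drop the rows for the affected lines, then append the corrected rows. As a minimal sketch (not part of the original pipeline), that pattern could be wrapped in a helper. The name `replace_lines` and the toy data frames below are hypothetical and only illustrate the idea.\n",
    "\n",
    "```r\n",
    "## Hypothetical helper: swap out all rows for the given UW lines\n",
    "replace_lines = function(data, fixed, lines) {\n",
    "    ## rbind() on data frames needs matching column names\n",
    "    stopifnot(identical(names(data), names(fixed)))\n",
    "    kept = data[!data$UW_Line %in% lines, ]\n",
    "    rbind(kept, fixed)\n",
    "}\n",
    "\n",
    "## Toy example (made-up values, not the real qPCR data)\n",
    "toy = data.frame(UW_Line = c(4, 82, 82), dCt.mean = c(7.1, 0, 0))\n",
    "toy_fixed = data.frame(UW_Line = c(82, 82), dCt.mean = c(9.3, 11.8))\n",
    "toy_clean = replace_lines(toy, toy_fixed, 82)\n",
    "dim(toy_clean)\n",
    "```\n",
    "\n",
    "Checking the column names before `rbind()` is a design choice of this sketch: it stops early if a corrected file arrives with a different layout."
   ]
  },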
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check for duplicates\n",
    "sum(duplicated(qpcr_data))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<dl class=dl-horizontal>\n",
       "\t<dt>4</dt>\n",
       "\t\t<dd>16188x3252</dd>\n",
       "\t<dt>42</dt>\n",
       "\t\t<dd>8008x8016</dd>\n",
       "\t<dt>45</dt>\n",
       "\t\t<dd>16441x8024</dd>\n",
       "\t<dt>46</dt>\n",
       "\t\t<dd>8048x15155</dd>\n",
       "\t<dt>48</dt>\n",
       "\t\t<dd>13140x16680</dd>\n",
       "\t<dt>61</dt>\n",
       "\t\t<dd>8056x8033</dd>\n",
       "\t<dt>62</dt>\n",
       "\t\t<dd>8054x8036</dd>\n",
       "\t<dt>70</dt>\n",
       "\t\t<dd>8045x4410</dd>\n",
       "\t<dt>71</dt>\n",
       "\t\t<dd>3564x8027</dd>\n",
       "\t<dt>72</dt>\n",
       "\t\t<dd>5035x16785</dd>\n",
       "\t<dt>73</dt>\n",
       "\t\t<dd>5358x8046</dd>\n",
       "\t<dt>74</dt>\n",
       "\t\t<dd>8046x8004</dd>\n",
       "\t<dt>75</dt>\n",
       "\t\t<dd>8016x8004</dd>\n",
       "\t<dt>76</dt>\n",
       "\t\t<dd>8024x8048</dd>\n",
       "\t<dt>77</dt>\n",
       "\t\t<dd>8034x8043</dd>\n",
       "\t<dt>78</dt>\n",
       "\t\t<dd>13421x16034</dd>\n",
       "\t<dt>79</dt>\n",
       "\t\t<dd>16034x13067</dd>\n",
       "\t<dt>80</dt>\n",
       "\t\t<dd>16521x3260</dd>\n",
       "\t<dt>81</dt>\n",
       "\t\t<dd>16072x5346</dd>\n",
       "\t<dt>113</dt>\n",
       "\t\t<dd>8027x477</dd>\n",
       "\t<dt>82</dt>\n",
       "\t\t<dd>16557x3154</dd>\n",
       "\t<dt>54</dt>\n",
       "\t\t<dd>8036x18018</dd>\n",
       "\t<dt>58</dt>\n",
       "\t\t<dd>5346x16768</dd>\n",
       "</dl>\n"
      ],
      "text/latex": [
       "\\begin{description*}\n",
       "\\item[4] 16188x3252\n",
       "\\item[42] 8008x8016\n",
       "\\item[45] 16441x8024\n",
       "\\item[46] 8048x15155\n",
       "\\item[48] 13140x16680\n",
       "\\item[61] 8056x8033\n",
       "\\item[62] 8054x8036\n",
       "\\item[70] 8045x4410\n",
       "\\item[71] 3564x8027\n",
       "\\item[72] 5035x16785\n",
       "\\item[73] 5358x8046\n",
       "\\item[74] 8046x8004\n",
       "\\item[75] 8016x8004\n",
       "\\item[76] 8024x8048\n",
       "\\item[77] 8034x8043\n",
       "\\item[78] 13421x16034\n",
       "\\item[79] 16034x13067\n",
       "\\item[80] 16521x3260\n",
       "\\item[81] 16072x5346\n",
       "\\item[113] 8027x477\n",
       "\\item[82] 16557x3154\n",
       "\\item[54] 8036x18018\n",
       "\\item[58] 5346x16768\n",
       "\\end{description*}\n"
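      ],
      "text/markdown": [
       "4\n",
       ":   16188x3252\n",
       "\n",
       "42\n",
       ":   8008x8016\n",
       "\n",
       "45\n",
       ":   16441x8024\n",
       "\n",
       "46\n",
       ":   8048x15155\n",
       "\n",
       "48\n",
       ":   13140x16680\n",
       "\n",
       "61\n",
       ":   8056x8033\n",
       "\n",
       "62\n",
       ":   8054x8036\n",
       "\n",
       "70\n",
       ":   8045x4410\n",
       "\n",
       "71\n",
       ":   3564x8027\n",
       "\n",
       "72\n",
       ":   5035x16785\n",
       "\n",
       "73\n",
       ":   5358x8046\n",
       "\n",
       "74\n",
       ":   8046x8004\n",
       "\n",
       "75\n",
       ":   8016x8004\n",
       "\n",
       "76\n",
       ":   8024x8048\n",
       "\n",
       "77\n",
       ":   8034x8043\n",
       "\n",
       "78\n",
       ":   13421x16034\n",
       "\n",
       "79\n",
       ":   16034x13067\n",
       "\n",
       "80\n",
       ":   16521x3260\n",
       "\n",
       "81\n",
       ":   16072x5346\n",
       "\n",
       "113\n",
       ":   8027x477\n",
       "\n",
       "82\n",
       ":   16557x3154\n",
       "\n",
       "54\n",
       ":   8036x18018\n",
       "\n",
       "58\n",
       ":   5346x16768\n",
       "\n"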
      ],
      "text/plain": [
       "          4          42          45          46          48          61 \n",
       " 16188x3252   8008x8016  16441x8024  8048x15155 13140x16680   8056x8033 \n",
       "         62          70          71          72          73          74 \n",
       "  8054x8036   8045x4410   3564x8027  5035x16785   5358x8046   8046x8004 \n",
       "         75          76          77          78          79          80 \n",
       "  8016x8004   8024x8048   8034x8043 13421x16034 16034x13067  16521x3260 \n",
       "         81         113          82          54          58 \n",
       " 16072x5346    8027x477  16557x3154  8036x18018  5346x16768 \n",
       "23 Levels: 13140x16680 13421x16034 16034x13067 16072x5346 ... 8056x8033"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check that each line has a single mating\n",
    "line_matings = sapply(unique(qpcr_data$UW_Line), function(x){unique(qpcr_data$Mating[qpcr_data$UW_Line==x])})\n",
    "names(line_matings) = unique(qpcr_data$UW_Line)\n",
    "line_matings"
   ]
  },
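  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `line_matings` vector printed above has to be inspected by eye. As a sketch under stated assumptions, the same check can be made to fail loudly by counting the distinct matings per line. The toy data frame and the name `matings_per_line` are hypothetical; with the real data, `qpcr_data` would take the place of `toy`.\n",
    "\n",
    "```r\n",
    "## Toy stand-in for qpcr_data (only the two columns the check needs)\n",
    "toy = data.frame(UW_Line = c(4, 4, 42, 42),\n",
    "                 Mating = c(\"16188x3252\", \"16188x3252\", \"8008x8016\", \"8008x8016\"),\n",
    "                 stringsAsFactors = FALSE)\n",
    "\n",
    "## Count the distinct matings recorded for each line\n",
    "matings_per_line = tapply(toy$Mating, toy$UW_Line, function(x) length(unique(x)))\n",
    "\n",
    "## Stop with an error if any line maps to more than one mating\n",
    "stopifnot(all(matings_per_line == 1))\n",
    "```\n",
    "\n",
    "Using `tapply()` keeps the result as a plain named vector of counts, so a line with conflicting matings shows up as a count greater than 1."
   ]
  },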
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 3. Format and Clean ByLine Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Data_Altered and Notes columns\n",
    "qpcr_data$Data_Altered = NA\n",
    "qpcr_data$Notes = NA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Annotate Virus ('M' in Timepoint marks mock-infected samples)\n",
    "## grepl() is used because negative indexing with grep() selects no rows when nothing matches\n",
    "is_mock = grepl(\"M\", qpcr_data[,\"Timepoint\"])\n",
    "qpcr_data$Virus = ifelse(is_mock, \"Mock\", \"WNV\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "## Remove 'M' from time points and convert to numeric (gsub() already returns character)\n",
    "qpcr_data[,\"Timepoint\"] <- as.numeric(gsub(\"M\",\"\",qpcr_data[,\"Timepoint\"]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<dl class=dl-horizontal>\n",
       "\t<dt>IFIT1</dt>\n",
       "\t\t<dd>312</dd>\n",
       "\t<dt>IFITM1</dt>\n",
       "\t\t<dd>312</dd>\n",
       "\t<dt>IFNb1</dt>\n",
       "\t\t<dd>312</dd>\n",
       "\t<dt>IL12b</dt>\n",
       "\t\t<dd>312</dd>\n",
       "\t<dt>WNV</dt>\n",
       "\t\t<dd>312</dd>\n",
       "</dl>\n"
      ],
      "text/latex": [
       "\\begin{description*}\n",
       "\\item[IFIT1] 312\n",
       "\\item[IFITM1] 312\n",
       "\\item[IFNb1] 312\n",
       "\\item[IL12b] 312\n",
       "\\item[WNV] 312\n",
       "\\end{description*}\n"
      ],
      "text/markdown": [
       "IFIT1\n",
       ":   312\n",
       "\n",
       "IFITM1\n",
       ":   312\n",
       "\n",
       "IFNb1\n",
       ":   312\n",
       "\n",
       "IL12b\n",
       ":   312\n",
       "\n",
       "WNV\n",
       ":   312\n",
       "\n"
      ],
      "text/plain": [
       " IFIT1 IFITM1  IFNb1  IL12b    WNV \n",
       "   312    312    312    312    312 "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check experiment names (identical() also catches a missing or extra experiment)\n",
    "identical(levels(factor(qpcr_data[,\"Experiment\"])), c(\"IFIT1\",\"IFITM1\", \"IFNb1\", \"IL12b\", \"WNV\"))\n",
    "summary(qpcr_data[,\"Experiment\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Group column: UW Line, Timepoint, Virus, Tissue, Experiment separated by \"_\"\n",
    "qpcr_data$Group = paste(qpcr_data$UW_Line, qpcr_data$Timepoint, qpcr_data$Virus, qpcr_data$Tissue, qpcr_data$Experiment, sep=\"_\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Lab column\n",
    "qpcr_data$Lab = \"Gale\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>UW_Line</th><th scope=col>Mating</th><th scope=col>Timepoint</th><th scope=col>Condition</th><th scope=col>Tissue</th><th scope=col>Experiment</th><th scope=col>N</th><th scope=col>dCt.mean</th><th scope=col>dCt.sd</th><th scope=col>baseline.dCt</th><th scope=col>ddCt.mean</th><th scope=col>ddCt.sd</th><th scope=col>fc.mean</th><th scope=col>fc.sd</th><th scope=col>Data_Altered</th><th scope=col>Notes</th><th scope=col>Virus</th><th scope=col>Group</th><th scope=col>Lab</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>1</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFIT1</td><td>3</td><td>7.060444</td><td>4.44931</td><td>12.28855</td><td>-5.228111</td><td>4.44931</td><td>37.4816</td><td>4.44931</td><td>NA</td><td>NA</td><td>WNV</td><td>4_12_WNV_Brain_IFIT1</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>2</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFITM1</td><td>3</td><td>7.164444</td><td>1.29619</td><td>8.593667</td><td>-1.429223</td><td>1.29619</td><td>2.693016</td><td>1.29619</td><td>NA</td><td>NA</td><td>WNV</td><td>4_12_WNV_Brain_IFITM1</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>3</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IFNb1</td><td>3</td><td>14.47589</td><td>5.691944</td><td>18.79678</td><td>-4.32089</td><td>5.691944</td><td>19.98561</td><td>5.691944</td><td>NA</td><td>NA</td><td>WNV</td><td>4_12_WNV_Brain_IFNb1</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>4</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>IL12b</td><td>3</td><td>13.98678</td><td>5.186199</td><td>18.61556</td><td>-4.628778</td><td>5.186199</td><td>24.74008</td><td>5.186199</td><td>NA</td><td>NA</td><td>WNV</td><td>4_12_WNV_Brain_IL12b</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>5</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12</td><td>Brain</td><td>WNV</td><td>3</td><td>12.963</td><td>7.274413</td><td>19.51256</td><td>-6.549556</td><td>7.274413</td><td>93.67266</td><td>7.274413</td><td>NA</td><td>NA</td><td>WNV</td><td>4_12_WNV_Brain_WNV</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>6</th><td>4</td><td>16188x3252</td><td>12</td><td>B_d12M</td><td>Brain</td><td>IFIT1</td><td>3</td><td>12.28855</td><td>0.7611464</td><td>12.28855</td><td>0</td><td>0.7611464</td><td>1</td><td>0.7611464</td><td>NA</td><td>NA</td><td>Mock</td><td>4_12_Mock_Brain_IFIT1</td><td>Gale</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|lllllllllllllllllll}\n",
       "  & UW_Line & Mating & Timepoint & Condition & Tissue & Experiment & N & dCt.mean & dCt.sd & baseline.dCt & ddCt.mean & ddCt.sd & fc.mean & fc.sd & Data_Altered & Notes & Virus & Group & Lab\\\\\n",
       "\\hline\n",
       "\t1 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFIT1 & 3 & 7.060444 & 4.44931 & 12.28855 & -5.228111 & 4.44931 & 37.4816 & 4.44931 & NA & NA & WNV & 4_12_WNV_Brain_IFIT1 & Gale\\\\\n",
       "\t2 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFITM1 & 3 & 7.164444 & 1.29619 & 8.593667 & -1.429223 & 1.29619 & 2.693016 & 1.29619 & NA & NA & WNV & 4_12_WNV_Brain_IFITM1 & Gale\\\\\n",
       "\t3 & 4 & 16188x3252 & 12 & B_d12 & Brain & IFNb1 & 3 & 14.47589 & 5.691944 & 18.79678 & -4.32089 & 5.691944 & 19.98561 & 5.691944 & NA & NA & WNV & 4_12_WNV_Brain_IFNb1 & Gale\\\\\n",
       "\t4 & 4 & 16188x3252 & 12 & B_d12 & Brain & IL12b & 3 & 13.98678 & 5.186199 & 18.61556 & -4.628778 & 5.186199 & 24.74008 & 5.186199 & NA & NA & WNV & 4_12_WNV_Brain_IL12b & Gale\\\\\n",
       "\t5 & 4 & 16188x3252 & 12 & B_d12 & Brain & WNV & 3 & 12.963 & 7.274413 & 19.51256 & -6.549556 & 7.274413 & 93.67266 & 7.274413 & NA & NA & WNV & 4_12_WNV_Brain_WNV & Gale\\\\\n",
       "\t6 & 4 & 16188x3252 & 12 & B_d12M & Brain & IFIT1 & 3 & 12.28855 & 0.7611464 & 12.28855 & 0 & 0.7611464 & 1 & 0.7611464 & NA & NA & Mock & 4_12_Mock_Brain_IFIT1 & Gale\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "  UW_Line     Mating Timepoint Condition Tissue Experiment N  dCt.mean\n",
       "1       4 16188x3252        12     B_d12  Brain      IFIT1 3  7.060444\n",
       "2       4 16188x3252        12     B_d12  Brain     IFITM1 3  7.164444\n",
       "3       4 16188x3252        12     B_d12  Brain      IFNb1 3 14.475888\n",
       "4       4 16188x3252        12     B_d12  Brain      IL12b 3 13.986777\n",
       "5       4 16188x3252        12     B_d12  Brain        WNV 3 12.963000\n",
       "6       4 16188x3252        12    B_d12M  Brain      IFIT1 3 12.288555\n",
       "     dCt.sd baseline.dCt ddCt.mean   ddCt.sd   fc.mean     fc.sd Data_Altered\n",
       "1 4.4493099    12.288555 -5.228111 4.4493099 37.481600 4.4493099           NA\n",
       "2 1.2961898     8.593667 -1.429223 1.2961898  2.693016 1.2961898           NA\n",
       "3 5.6919441    18.796778 -4.320890 5.6919441 19.985608 5.6919441           NA\n",
       "4 5.1861991    18.615555 -4.628778 5.1861991 24.740076 5.1861991           NA\n",
       "5 7.2744126    19.512556 -6.549556 7.2744126 93.672658 7.2744126           NA\n",
       "6 0.7611464    12.288555  0.000000 0.7611464  1.000000 0.7611464           NA\n",
       "  Notes Virus                 Group  Lab\n",
       "1    NA   WNV  4_12_WNV_Brain_IFIT1 Gale\n",
       "2    NA   WNV 4_12_WNV_Brain_IFITM1 Gale\n",
       "3    NA   WNV  4_12_WNV_Brain_IFNb1 Gale\n",
       "4    NA   WNV  4_12_WNV_Brain_IL12b Gale\n",
       "5    NA   WNV    4_12_WNV_Brain_WNV Gale\n",
       "6    NA  Mock 4_12_Mock_Brain_IFIT1 Gale"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "head(qpcr_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 4. Read ByMouse Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>ID</th><th scope=col>Mating</th><th scope=col>RIX_ID</th><th scope=col>UW_Line</th><th scope=col>UWID</th><th scope=col>Sample_Name</th><th scope=col>Experiment</th><th scope=col>Tissue</th><th scope=col>Condition</th><th scope=col>Timepoint</th><th scope=col>Ct</th><th scope=col>Ct.sd</th><th scope=col>ref.Ct</th><th scope=col>ref.sd</th><th scope=col>dCt</th><th scope=col>dCt.linear</th><th scope=col>dCt.sd</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>1</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>S 1.12</td><td>IFIT1</td><td>Spleen</td><td>WNV</td><td>12</td><td>30.31133</td><td>0.1537316</td><td>20.94267</td><td>0.08401435</td><td>9.368667</td><td>0.001512691</td><td>0.1751908</td></tr>\n",
       "\t<tr><th scope=row>2</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFITM1</td><td>Brain</td><td>WNV</td><td>12</td><td>26.49567</td><td>0.1571378</td><td>20.519</td><td>0.076237</td><td>5.976665</td><td>0.01587978</td><td>0.174655</td></tr>\n",
       "\t<tr><th scope=row>3</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>K 1.12</td><td>IFIT1</td><td>Kidney</td><td>WNV</td><td>12</td><td>29.47433</td><td>0.09113332</td><td>19.27533</td><td>0.04600315</td><td>10.199</td><td>0.0008507361</td><td>0.1020861</td></tr>\n",
       "\t<tr><th scope=row>4</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>K 1.12</td><td>IL12b</td><td>Kidney</td><td>WNV</td><td>12</td><td>40</td><td>0</td><td>19.27533</td><td>0.04600315</td><td>20.72467</td><td>5.77e-07</td><td>0.04600315</td></tr>\n",
       "\t<tr><th scope=row>5</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFNb1</td><td>Brain</td><td>WNV</td><td>12</td><td>29.12167</td><td>0.1936756</td><td>20.519</td><td>0.076237</td><td>8.602666</td><td>0.002572405</td><td>0.2081402</td></tr>\n",
       "\t<tr><th scope=row>6</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFIT1</td><td>Brain</td><td>WNV</td><td>12</td><td>23.525</td><td>0.02586539</td><td>20.519</td><td>0.076237</td><td>3.006</td><td>0.1244812</td><td>0.08050527</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|lllllllllllllllll}\n",
       "  & ID & Mating & RIX_ID & UW_Line & UWID & Sample_Name & Experiment & Tissue & Condition & Timepoint & Ct & Ct.sd & ref.Ct & ref.sd & dCt & dCt.linear & dCt.sd\\\\\n",
       "\\hline\n",
       "\t1 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & S 1.12 & IFIT1 & Spleen & WNV & 12 & 30.31133 & 0.1537316 & 20.94267 & 0.08401435 & 9.368667 & 0.001512691 & 0.1751908\\\\\n",
       "\t2 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFITM1 & Brain & WNV & 12 & 26.49567 & 0.1571378 & 20.519 & 0.076237 & 5.976665 & 0.01587978 & 0.174655\\\\\n",
       "\t3 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & K 1.12 & IFIT1 & Kidney & WNV & 12 & 29.47433 & 0.09113332 & 19.27533 & 0.04600315 & 10.199 & 0.0008507361 & 0.1020861\\\\\n",
       "\t4 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & K 1.12 & IL12b & Kidney & WNV & 12 & 40 & 0 & 19.27533 & 0.04600315 & 20.72467 & 5.77e-07 & 0.04600315\\\\\n",
       "\t5 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFNb1 & Brain & WNV & 12 & 29.12167 & 0.1936756 & 20.519 & 0.076237 & 8.602666 & 0.002572405 & 0.2081402\\\\\n",
       "\t6 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFIT1 & Brain & WNV & 12 & 23.525 & 0.02586539 & 20.519 & 0.076237 & 3.006 & 0.1244812 & 0.08050527\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "              ID     Mating RIX_ID UW_Line UWID Sample_Name Experiment Tissue\n",
       "1 16188x3252_248 16188x3252    248       4 1.12      S 1.12      IFIT1 Spleen\n",
       "2 16188x3252_248 16188x3252    248       4 1.12      B 1.12     IFITM1  Brain\n",
       "3 16188x3252_248 16188x3252    248       4 1.12      K 1.12      IFIT1 Kidney\n",
       "4 16188x3252_248 16188x3252    248       4 1.12      K 1.12      IL12b Kidney\n",
       "5 16188x3252_248 16188x3252    248       4 1.12      B 1.12      IFNb1  Brain\n",
       "6 16188x3252_248 16188x3252    248       4 1.12      B 1.12      IFIT1  Brain\n",
       "  Condition Timepoint       Ct      Ct.sd   ref.Ct     ref.sd       dCt\n",
       "1       WNV        12 30.31133 0.15373160 20.94267 0.08401435  9.368667\n",
       "2       WNV        12 26.49567 0.15713780 20.51900 0.07623700  5.976665\n",
       "3       WNV        12 29.47433 0.09113332 19.27533 0.04600315 10.199001\n",
       "4       WNV        12 40.00000 0.00000000 19.27533 0.04600315 20.724667\n",
       "5       WNV        12 29.12167 0.19367563 20.51900 0.07623700  8.602666\n",
       "6       WNV        12 23.52500 0.02586539 20.51900 0.07623700  3.006000\n",
       "    dCt.linear     dCt.sd\n",
       "1 0.0015126910 0.17519080\n",
       "2 0.0158797774 0.17465500\n",
       "3 0.0008507361 0.10208610\n",
       "4 0.0000005770 0.04600315\n",
       "5 0.0025724055 0.20814017\n",
       "6 0.1244812292 0.08050527"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>4210</li>\n",
       "\t<li>17</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 4210\n",
       "\\item 17\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 4210\n",
       "2. 17\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 4210   17"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Read in data (byMouse)\n",
    "qpcr_data_mouse = read.xls(file.path(data_dir, \"16-May-2016/Gale_qPCR_byMouse_5-16-16 %281%29.xlsx\"), sheet=1)\n",
    "\n",
    "head(qpcr_data_mouse)\n",
    "dim(qpcr_data_mouse)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>4215</li>\n",
       "\t<li>17</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 4215\n",
       "\\item 17\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 4215\n",
       "2. 17\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 4215   17"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Replace line 82 with fixed data (special case for May 16, 2016 data)\n",
    "qpcr_data_mouse = qpcr_data_mouse[qpcr_data_mouse$UW_Line != 82, ]\n",
    "\n",
    "line_82_mouse = read.xls(file.path(data_dir, \"18-May-2016/Gale_qPCR_byMouse_5-18-16.xlsx\"), sheet=1)\n",
    "qpcr_data_mouse = rbind(qpcr_data_mouse, line_82_mouse)\n",
    "\n",
    "dim(qpcr_data_mouse)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>4600</li>\n",
       "\t<li>17</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 4600\n",
       "\\item 17\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 4600\n",
       "2. 17\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 4600   17"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Replace lines 54 and 58 with fixed data (special case for May 16, 2016 data)\n",
    "qpcr_data_mouse = qpcr_data_mouse[!qpcr_data_mouse$UW_Line %in% c(54, 58), ]\n",
    "\n",
    "lines_54_58_mouse = read.xls(file.path(data_dir, \"23-May-2016/Gale_qPCR_byMouse_5-23-16.xlsx\"), sheet=1)\n",
    "qpcr_data_mouse = rbind(qpcr_data_mouse, lines_54_58_mouse)\n",
    "\n",
    "dim(qpcr_data_mouse)"
   ]
  },
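  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two special-case cells above follow the same pattern: drop every row for the affected line(s), then append the corrected rows with `rbind()`. A minimal sketch of that pattern on toy data (the data frames `toy` and `fix_82` and all values in them are hypothetical, not taken from the Gale files):\n",
    "\n",
    "```r\n",
    "## Toy data standing in for the byMouse table (hypothetical values)\n",
    "toy = data.frame(UW_Line=c(4, 82, 82, 54), dCt=c(9.4, 1.0, 1.2, 6.0))\n",
    "\n",
    "## Corrected rows for line 82 (hypothetical values)\n",
    "fix_82 = data.frame(UW_Line=c(82, 82, 82), dCt=c(5.1, 5.3, 5.2))\n",
    "\n",
    "## Drop the old rows for the affected line, then append the corrected rows\n",
    "toy = toy[!toy$UW_Line %in% unique(fix_82$UW_Line), ]\n",
    "toy = rbind(toy, fix_82)\n",
    "\n",
    "## The replacement table should contain only the line it is meant to fix\n",
    "stopifnot(all(fix_82$UW_Line == 82))\n",
    "dim(toy)\n",
    "```\n",
    "\n",
    "Because `rbind()` requires matching column names, it also acts as a rough check that the corrected file has the same layout as the original."
   ]
  },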
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 5. Format and Clean ByMouse Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check if any CT < 15\n",
    "length(which(qpcr_data_mouse[,\"Ct\"] < 15))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check if any reference CT < 15\n",
    "length(which(qpcr_data_mouse[,\"ref.Ct\"] < 15))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check if an reference CT == 40\n",
    "length(which(qpcr_data_mouse[,\"ref.Ct\"] == 40))"
   ]
  },
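  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three Ct checks above can also be collected into one named vector, which may be easier to scan when the counts are not all zero. This is a sketch on a toy data frame (`toy` and its values are hypothetical); the thresholds 15 and 40 are the ones used in the checks above.\n",
    "\n",
    "```r\n",
    "## Toy data with one suspiciously low Ct and one reference Ct of 40 (hypothetical values)\n",
    "toy = data.frame(Ct=c(30.3, 40, 23.5, 14.2), ref.Ct=c(20.9, 19.3, 20.5, 40))\n",
    "\n",
    "## Count the rows that trip each check; na.rm=TRUE keeps missing values from hiding a count\n",
    "ct_flags = c(\n",
    "  low_ct     = sum(toy$Ct < 15, na.rm=TRUE),\n",
    "  low_ref_ct = sum(toy$ref.Ct < 15, na.rm=TRUE),\n",
    "  ref_ct_40  = sum(toy$ref.Ct == 40, na.rm=TRUE)\n",
    ")\n",
    "ct_flags\n",
    "```"
   ]
  },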
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<dl class=dl-horizontal>\n",
       "\t<dt>4</dt>\n",
       "\t\t<dd>16188x3252</dd>\n",
       "\t<dt>42</dt>\n",
       "\t\t<dd>8008x8016</dd>\n",
       "\t<dt>45</dt>\n",
       "\t\t<dd>16441x8024</dd>\n",
       "\t<dt>46</dt>\n",
       "\t\t<dd>8048x15155</dd>\n",
       "\t<dt>48</dt>\n",
       "\t\t<dd>13140x16680</dd>\n",
       "\t<dt>61</dt>\n",
       "\t\t<dd>8056x8033</dd>\n",
       "\t<dt>62</dt>\n",
       "\t\t<dd>8054x8036</dd>\n",
       "\t<dt>70</dt>\n",
       "\t\t<dd>8045x4410</dd>\n",
       "\t<dt>71</dt>\n",
       "\t\t<dd>3564x8027</dd>\n",
       "\t<dt>72</dt>\n",
       "\t\t<dd>5035x16785</dd>\n",
       "\t<dt>73</dt>\n",
       "\t\t<dd>5358x8046</dd>\n",
       "\t<dt>74</dt>\n",
       "\t\t<dd>8046x8004</dd>\n",
       "\t<dt>75</dt>\n",
       "\t\t<dd>8016x8004</dd>\n",
       "\t<dt>76</dt>\n",
       "\t\t<dd>8024x8048</dd>\n",
       "\t<dt>77</dt>\n",
       "\t\t<dd>8034x8043</dd>\n",
       "\t<dt>78</dt>\n",
       "\t\t<dd>13421x16034</dd>\n",
       "\t<dt>79</dt>\n",
       "\t\t<dd>16034x13067</dd>\n",
       "\t<dt>80</dt>\n",
       "\t\t<dd>16521x3260</dd>\n",
       "\t<dt>81</dt>\n",
       "\t\t<dd>16072x5346</dd>\n",
       "\t<dt>113</dt>\n",
       "\t\t<dd>8027x477</dd>\n",
       "\t<dt>82</dt>\n",
       "\t\t<dd>16557x3154</dd>\n",
       "\t<dt>54</dt>\n",
       "\t\t<dd>8036x18018</dd>\n",
       "\t<dt>58</dt>\n",
       "\t\t<dd>5346x16768</dd>\n",
       "</dl>\n"
      ],
      "text/latex": [
       "\\begin{description*}\n",
       "\\item[4] 16188x3252\n",
       "\\item[42] 8008x8016\n",
       "\\item[45] 16441x8024\n",
       "\\item[46] 8048x15155\n",
       "\\item[48] 13140x16680\n",
       "\\item[61] 8056x8033\n",
       "\\item[62] 8054x8036\n",
       "\\item[70] 8045x4410\n",
       "\\item[71] 3564x8027\n",
       "\\item[72] 5035x16785\n",
       "\\item[73] 5358x8046\n",
       "\\item[74] 8046x8004\n",
       "\\item[75] 8016x8004\n",
       "\\item[76] 8024x8048\n",
       "\\item[77] 8034x8043\n",
       "\\item[78] 13421x16034\n",
       "\\item[79] 16034x13067\n",
       "\\item[80] 16521x3260\n",
       "\\item[81] 16072x5346\n",
       "\\item[113] 8027x477\n",
       "\\item[82] 16557x3154\n",
       "\\item[54] 8036x18018\n",
       "\\item[58] 5346x16768\n",
       "\\end{description*}\n"
      ],
      "text/markdown": [
       "4\n",
       ":   16188x325242\n",
       ":   8008x801645\n",
       ":   16441x802446\n",
       ":   8048x1515548\n",
       ":   13140x1668061\n",
       ":   8056x803362\n",
       ":   8054x803670\n",
       ":   8045x441071\n",
       ":   3564x802772\n",
       ":   5035x1678573\n",
       ":   5358x804674\n",
       ":   8046x800475\n",
       ":   8016x800476\n",
       ":   8024x804877\n",
       ":   8034x804378\n",
       ":   13421x1603479\n",
       ":   16034x1306780\n",
       ":   16521x326081\n",
       ":   16072x5346113\n",
       ":   8027x47782\n",
       ":   16557x315454\n",
       ":   8036x1801858\n",
       ":   5346x16768\n",
       "\n"
      ],
      "text/plain": [
       "          4          42          45          46          48          61 \n",
       " 16188x3252   8008x8016  16441x8024  8048x15155 13140x16680   8056x8033 \n",
       "         62          70          71          72          73          74 \n",
       "  8054x8036   8045x4410   3564x8027  5035x16785   5358x8046   8046x8004 \n",
       "         75          76          77          78          79          80 \n",
       "  8016x8004   8024x8048   8034x8043 13421x16034 16034x13067  16521x3260 \n",
       "         81         113          82          54          58 \n",
       " 16072x5346    8027x477  16557x3154  8036x18018  5346x16768 \n",
       "23 Levels: 13140x16680 13421x16034 16034x13067 16072x5346 ... 8056x8033"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check that each line has a single mating\n",
    "mouse_line_matings = sapply(unique(qpcr_data_mouse$UW_Line), function(x){unique(qpcr_data_mouse$Mating[qpcr_data_mouse$UW_Line==x])})\n",
    "names(mouse_line_matings) = unique(qpcr_data_mouse$UW_Line)\n",
    "mouse_line_matings"
   ]
  },
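  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above lists the mating for every line, which has to be read by eye. As a possible complement, the sketch below counts distinct matings per line and shows only the lines with more than one. The toy data frame and the mismatched mating `8008x8017` are hypothetical, made up to trigger the check.\n",
    "\n",
    "```r\n",
    "## Toy data: line 42 deliberately has two different matings (hypothetical values)\n",
    "toy = data.frame(UW_Line=c(4, 4, 42, 42),\n",
    "                 Mating=c(\"16188x3252\", \"16188x3252\", \"8008x8016\", \"8008x8017\"),\n",
    "                 stringsAsFactors=FALSE)\n",
    "\n",
    "## Number of distinct matings for each line\n",
    "n_matings = tapply(toy$Mating, toy$UW_Line, function(m) length(unique(m)))\n",
    "\n",
    "## Lines that need attention (empty if every line has a single mating)\n",
    "n_matings[n_matings > 1]\n",
    "```"
   ]
  },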
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Remove 'M' from time points and convert to numeric\n",
    "qpcr_data_mouse[,\"Timepoint\"] <- as.numeric(as.character(gsub(\"M\",\"\",qpcr_data_mouse[,\"Timepoint\"])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "## Change Condition column name to Virus\n",
    "names(qpcr_data_mouse)[which(names(qpcr_data_mouse) == \"Condition\")] <- \"Virus\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<dl class=dl-horizontal>\n",
       "\t<dt>IFIT1</dt>\n",
       "\t\t<dd>920</dd>\n",
       "\t<dt>IFITM1</dt>\n",
       "\t\t<dd>920</dd>\n",
       "\t<dt>IFNb1</dt>\n",
       "\t\t<dd>920</dd>\n",
       "\t<dt>IL12b</dt>\n",
       "\t\t<dd>920</dd>\n",
       "\t<dt>WNV</dt>\n",
       "\t\t<dd>920</dd>\n",
       "</dl>\n"
      ],
      "text/latex": [
       "\\begin{description*}\n",
       "\\item[IFIT1] 920\n",
       "\\item[IFITM1] 920\n",
       "\\item[IFNb1] 920\n",
       "\\item[IL12b] 920\n",
       "\\item[WNV] 920\n",
       "\\end{description*}\n"
      ],
      "text/markdown": [
       "IFIT1\n",
       ":   920IFITM1\n",
       ":   920IFNb1\n",
       ":   920IL12b\n",
       ":   920WNV\n",
       ":   920\n",
       "\n"
      ],
      "text/plain": [
       " IFIT1 IFITM1  IFNb1  IL12b    WNV \n",
       "   920    920    920    920    920 "
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check experiment names\n",
    "sum(names(summary(qpcr_data_mouse[,\"Experiment\"])) == c(\"IFIT1\",\"IFITM1\", \"IFNb1\", \"IL12b\", \"WNV\")) == 5\n",
    "summary(qpcr_data_mouse[,\"Experiment\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Group column: UW Line, Timepoint, Virus, Tissue, Experiment separated by \"_\"\n",
    "qpcr_data_mouse$Group <- paste(qpcr_data_mouse$UW_Line, qpcr_data_mouse$Timepoint, qpcr_data_mouse$Virus, \n",
    "                               qpcr_data_mouse$Tissue, qpcr_data_mouse$Experiment, sep=\"_\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "0"
      ],
      "text/latex": [
       "0"
      ],
      "text/markdown": [
       "0"
      ],
      "text/plain": [
       "[1] 0"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check that ByLine file contains data from only those animals in ByMouse file\n",
    "length(setdiff(qpcr_data$Group, qpcr_data_mouse$Group))\n",
    "length(setdiff(qpcr_data_mouse$Group, qpcr_data$Group))"
   ]
  },
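  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `Group` key is what ties the two files together, so both tables must build it from identically formatted columns. The sketch below shows the idea on toy tables; the helper `make_group`, the table contents, and the `12M` style of timepoint label are assumptions made for illustration.\n",
    "\n",
    "```r\n",
    "## Toy tables standing in for the ByLine and ByMouse data (hypothetical values)\n",
    "byline = data.frame(UW_Line=c(4, 4), Timepoint=c(12, 12), Virus=c(\"WNV\", \"Mock\"),\n",
    "                    Tissue=\"Brain\", Experiment=\"IFIT1\", stringsAsFactors=FALSE)\n",
    "bymouse = data.frame(UW_Line=c(4, 4, 4), Timepoint=c(\"12M\", \"12M\", \"7M\"),\n",
    "                     Virus=c(\"WNV\", \"Mock\", \"WNV\"),\n",
    "                     Tissue=\"Brain\", Experiment=\"IFIT1\", stringsAsFactors=FALSE)\n",
    "\n",
    "## Timepoints must be numeric in both tables before the key is built\n",
    "bymouse$Timepoint = as.numeric(gsub(\"M\", \"\", bymouse$Timepoint))\n",
    "\n",
    "## Hypothetical helper: build the key the same way for both tables\n",
    "make_group = function(d) paste(d$UW_Line, d$Timepoint, d$Virus, d$Tissue, d$Experiment, sep=\"_\")\n",
    "byline$Group = make_group(byline)\n",
    "bymouse$Group = make_group(bymouse)\n",
    "\n",
    "## Groups present in only one of the two tables\n",
    "setdiff(byline$Group, bymouse$Group)\n",
    "setdiff(bymouse$Group, byline$Group)\n",
    "```\n",
    "\n",
    "Printing the differing keys, rather than only their count, shows directly which line, timepoint, or tissue is out of step."
   ]
  },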
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Data_Altered and Notes columns\n",
    "qpcr_data_mouse$Data_Altered = NA\n",
    "qpcr_data_mouse$Notes = NA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add Lab column\n",
    "qpcr_data_mouse$Lab = \"Gale\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead><tr><th></th><th scope=col>ID</th><th scope=col>Mating</th><th scope=col>RIX_ID</th><th scope=col>UW_Line</th><th scope=col>UWID</th><th scope=col>Sample_Name</th><th scope=col>Experiment</th><th scope=col>Tissue</th><th scope=col>Virus</th><th scope=col>Timepoint</th><th scope=col>Ct</th><th scope=col>Ct.sd</th><th scope=col>ref.Ct</th><th scope=col>ref.sd</th><th scope=col>dCt</th><th scope=col>dCt.linear</th><th scope=col>dCt.sd</th><th scope=col>Group</th><th scope=col>Data_Altered</th><th scope=col>Notes</th><th scope=col>Lab</th></tr></thead>\n",
       "<tbody>\n",
       "\t<tr><th scope=row>1</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>S 1.12</td><td>IFIT1</td><td>Spleen</td><td>WNV</td><td>12</td><td>30.31133</td><td>0.1537316</td><td>20.94267</td><td>0.08401435</td><td>9.368667</td><td>0.001512691</td><td>0.1751908</td><td>4_12_WNV_Spleen_IFIT1</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>2</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFITM1</td><td>Brain</td><td>WNV</td><td>12</td><td>26.49567</td><td>0.1571378</td><td>20.519</td><td>0.076237</td><td>5.976665</td><td>0.01587978</td><td>0.174655</td><td>4_12_WNV_Brain_IFITM1</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>3</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>K 1.12</td><td>IFIT1</td><td>Kidney</td><td>WNV</td><td>12</td><td>29.47433</td><td>0.09113332</td><td>19.27533</td><td>0.04600315</td><td>10.199</td><td>0.0008507361</td><td>0.1020861</td><td>4_12_WNV_Kidney_IFIT1</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>4</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>K 1.12</td><td>IL12b</td><td>Kidney</td><td>WNV</td><td>12</td><td>40</td><td>0</td><td>19.27533</td><td>0.04600315</td><td>20.72467</td><td>5.77e-07</td><td>0.04600315</td><td>4_12_WNV_Kidney_IL12b</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>5</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFNb1</td><td>Brain</td><td>WNV</td><td>12</td><td>29.12167</td><td>0.1936756</td><td>20.519</td><td>0.076237</td><td>8.602666</td><td>0.002572405</td><td>0.2081402</td><td>4_12_WNV_Brain_IFNb1</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "\t<tr><th scope=row>6</th><td>16188x3252_248</td><td>16188x3252</td><td>248</td><td>4</td><td>1.12</td><td>B 1.12</td><td>IFIT1</td><td>Brain</td><td>WNV</td><td>12</td><td>23.525</td><td>0.02586539</td><td>20.519</td><td>0.076237</td><td>3.006</td><td>0.1244812</td><td>0.08050527</td><td>4_12_WNV_Brain_IFIT1</td><td>NA</td><td>NA</td><td>Gale</td></tr>\n",
       "</tbody>\n",
       "</table>\n"
      ],
      "text/latex": [
       "\\begin{tabular}{r|lllllllllllllllllllll}\n",
       "  & ID & Mating & RIX_ID & UW_Line & UWID & Sample_Name & Experiment & Tissue & Virus & Timepoint & Ct & Ct.sd & ref.Ct & ref.sd & dCt & dCt.linear & dCt.sd & Group & Data_Altered & Notes & Lab\\\\\n",
       "\\hline\n",
       "\t1 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & S 1.12 & IFIT1 & Spleen & WNV & 12 & 30.31133 & 0.1537316 & 20.94267 & 0.08401435 & 9.368667 & 0.001512691 & 0.1751908 & 4_12_WNV_Spleen_IFIT1 & NA & NA & Gale\\\\\n",
       "\t2 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFITM1 & Brain & WNV & 12 & 26.49567 & 0.1571378 & 20.519 & 0.076237 & 5.976665 & 0.01587978 & 0.174655 & 4_12_WNV_Brain_IFITM1 & NA & NA & Gale\\\\\n",
       "\t3 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & K 1.12 & IFIT1 & Kidney & WNV & 12 & 29.47433 & 0.09113332 & 19.27533 & 0.04600315 & 10.199 & 0.0008507361 & 0.1020861 & 4_12_WNV_Kidney_IFIT1 & NA & NA & Gale\\\\\n",
       "\t4 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & K 1.12 & IL12b & Kidney & WNV & 12 & 40 & 0 & 19.27533 & 0.04600315 & 20.72467 & 5.77e-07 & 0.04600315 & 4_12_WNV_Kidney_IL12b & NA & NA & Gale\\\\\n",
       "\t5 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFNb1 & Brain & WNV & 12 & 29.12167 & 0.1936756 & 20.519 & 0.076237 & 8.602666 & 0.002572405 & 0.2081402 & 4_12_WNV_Brain_IFNb1 & NA & NA & Gale\\\\\n",
       "\t6 & 16188x3252_248 & 16188x3252 & 248 & 4 & 1.12 & B 1.12 & IFIT1 & Brain & WNV & 12 & 23.525 & 0.02586539 & 20.519 & 0.076237 & 3.006 & 0.1244812 & 0.08050527 & 4_12_WNV_Brain_IFIT1 & NA & NA & Gale\\\\\n",
       "\\end{tabular}\n"
      ],
      "text/plain": [
       "              ID     Mating RIX_ID UW_Line UWID Sample_Name Experiment Tissue\n",
       "1 16188x3252_248 16188x3252    248       4 1.12      S 1.12      IFIT1 Spleen\n",
       "2 16188x3252_248 16188x3252    248       4 1.12      B 1.12     IFITM1  Brain\n",
       "3 16188x3252_248 16188x3252    248       4 1.12      K 1.12      IFIT1 Kidney\n",
       "4 16188x3252_248 16188x3252    248       4 1.12      K 1.12      IL12b Kidney\n",
       "5 16188x3252_248 16188x3252    248       4 1.12      B 1.12      IFNb1  Brain\n",
       "6 16188x3252_248 16188x3252    248       4 1.12      B 1.12      IFIT1  Brain\n",
       "  Virus Timepoint       Ct      Ct.sd   ref.Ct     ref.sd       dCt\n",
       "1   WNV        12 30.31133 0.15373160 20.94267 0.08401435  9.368667\n",
       "2   WNV        12 26.49567 0.15713780 20.51900 0.07623700  5.976665\n",
       "3   WNV        12 29.47433 0.09113332 19.27533 0.04600315 10.199001\n",
       "4   WNV        12 40.00000 0.00000000 19.27533 0.04600315 20.724667\n",
       "5   WNV        12 29.12167 0.19367563 20.51900 0.07623700  8.602666\n",
       "6   WNV        12 23.52500 0.02586539 20.51900 0.07623700  3.006000\n",
       "    dCt.linear     dCt.sd                 Group Data_Altered Notes  Lab\n",
       "1 0.0015126910 0.17519080 4_12_WNV_Spleen_IFIT1           NA    NA Gale\n",
       "2 0.0158797774 0.17465500 4_12_WNV_Brain_IFITM1           NA    NA Gale\n",
       "3 0.0008507361 0.10208610 4_12_WNV_Kidney_IFIT1           NA    NA Gale\n",
       "4 0.0000005770 0.04600315 4_12_WNV_Kidney_IL12b           NA    NA Gale\n",
       "5 0.0025724055 0.20814017  4_12_WNV_Brain_IFNb1           NA    NA Gale\n",
       "6 0.1244812292 0.08050527  4_12_WNV_Brain_IFIT1           NA    NA Gale"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "head(qpcr_data_mouse)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 6. Calculate and Check Summary Measures"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6a. Calculate dCt mean"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All dCt mean correct\"\n"
     ]
    }
   ],
   "source": [
    "## Calculate dCt mean from byMouse data\n",
    "dct_mean = aggregate(formula=qpcr_data_mouse[,\"dCt\"]~qpcr_data_mouse[,\"Group\"], data=qpcr_data_mouse, FUN=mean)\n",
    "names(dct_mean) <- c(\"Group\",\"dCt.mean.V2\")\n",
    "dct_mean[order(dct_mean[,1]),] -> dct_mean_order\n",
    "\n",
    "## Get dCt mean from byLine data\n",
    "byline_dct_mean = unique(qpcr_data[,c(\"Group\",\"dCt.mean\")])\n",
    "byline_dct_mean[order(byline_dct_mean[,1]),] -> byline_dct_mean_order\n",
    "\n",
    "## Check that the dCt mean calculations are the same\n",
    "check_dct_mean_v2 = sapply(1:dim(dct_mean_order)[1], \n",
    "                           function(x){isTRUE(all.equal(dct_mean_order[x,2], byline_dct_mean_order[x,2]))})\n",
    "\n",
    "## Print discrepancies, if they exist\n",
    "if(sum(check_dct_mean_v2) == nrow(dct_mean_order)){\n",
    "  print(\"All dCt mean correct\")\n",
    "} else {\n",
    "  print(\"Need to clean dCt mean\")\n",
    "  dct_mean_errs = cbind(byline_dct_mean_order[which(check_dct_mean_v2==F),], dct_mean_order[which(check_dct_mean_v2==F),])\n",
    "  dct_mean_errs\n",
    "}"
   ]
  },
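  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The comparison in Step 6a pairs rows by position after sorting, which relies on both tables containing exactly the same groups. A possible alternative, sketched here on toy data, is to join the two tables on `Group` so that a missing or extra group shows up as an `NA` instead of shifting the comparison. The toy values and the `1e-8` tolerance are assumptions, not values from the notebook.\n",
    "\n",
    "```r\n",
    "## Toy per-mouse values and the per-line means reported for them (hypothetical values)\n",
    "bymouse = data.frame(Group=c(\"a\", \"a\", \"b\", \"b\"), dCt=c(1.0, 3.0, 5.0, 6.0),\n",
    "                     stringsAsFactors=FALSE)\n",
    "byline = data.frame(Group=c(\"b\", \"a\"), dCt.mean=c(5.5, 2.5), stringsAsFactors=FALSE)\n",
    "\n",
    "## Recompute the group means from the per-mouse values\n",
    "recomputed = aggregate(dCt ~ Group, data=bymouse, FUN=mean)\n",
    "names(recomputed)[2] = \"dCt.mean.V2\"\n",
    "\n",
    "## Join on the key; all=TRUE keeps groups that appear in only one table\n",
    "both = merge(byline, recomputed, by=\"Group\", all=TRUE)\n",
    "\n",
    "## A row is a discrepancy if either value is missing or the two values differ\n",
    "mismatch = is.na(both$dCt.mean) | is.na(both$dCt.mean.V2) |\n",
    "  abs(both$dCt.mean - both$dCt.mean.V2) > 1e-8\n",
    "both[mismatch, ]\n",
    "```"
   ]
  },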
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6b. Update dCt mean if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "if (dim(dct_mean_errs)[1] > 0) {\n",
    "  for (i in 1:dim(dct_mean_errs)[1]) {\n",
    "    qpcr_data$dCt.mean[qpcr_data$Group==dct_mean_errs[i,3]] = dct_mean_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6c. Calculate N for each group"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All N correct\"\n"
     ]
    }
   ],
   "source": [
    "## Calculate the N for each group from the byMouse data\n",
    "data.frame(summary(as.factor(qpcr_data_mouse[,\"Group\"]),maxsum=8000)) -> bymouse_n\n",
    "names(bymouse_n) <- c(\"N.V2\")\n",
    "bymouse_n[,2] <- row.names(bymouse_n)\n",
    "names(bymouse_n)[2] <- \"Group\"\n",
    "bymouse_n = bymouse_n[,c(2,1)]\n",
    "bymouse_n[order(bymouse_n[,\"Group\"]),] -> bymouse_n_order\n",
    "\n",
    "## Get the N for each group from the byLine data \n",
    "qpcr_data[,c(\"Group\",\"N\")] -> byline_n\n",
    "byline_n[order(byline_n[,\"Group\"]),] -> byline_n_order\n",
    "\n",
    "## Print discrepancies, if they exist\n",
    "if(sum(bymouse_n_order[,2] == byline_n_order[,2]) == nrow(bymouse_n_order)){\n",
    "  print(\"All N correct\")\n",
    "} else {\n",
    "  print(\"Need to clean N\")\n",
    "  n_errs = cbind(byline_n_order[bymouse_n_order[,2] != byline_n_order[,2],], \n",
    "                 bymouse_n_order[bymouse_n_order[,2] != byline_n_order[,2],])\n",
    "  n_errs\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6d. Update N if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "if (dim(n_errs)[1] > 0) {\n",
    "  for (i in 1:dim(n_errs)[1]) {\n",
    "    qpcr_data$N[qpcr_data$Group==n_errs[i,3]] = n_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6e. Calculate dCt SD"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "collapsed": false,
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All dCt SD correct\"\n"
     ]
    }
   ],
   "source": [
    "## Calculate dCt SD from the byMouse data \n",
    "dct_sd = aggregate(formula=qpcr_data_mouse[,\"dCt\"]~qpcr_data_mouse[,\"Group\"], data=qpcr_data_mouse, FUN=sd)\n",
    "names(dct_sd) <- c(\"Group\",\"dCt.sd.V2\")\n",
    "dct_sd[order(dct_sd[,1]),] -> dct_sd_order\n",
    "\n",
    "## Get dCt SD from the byLine data\n",
    "byline_dct_sd = qpcr_data[,c(\"Group\",\"dCt.sd\")]\n",
    "byline_dct_sd[order(byline_dct_sd[,1]),] -> byline_dct_sd_order\n",
    "\n",
    "check_dct_sd = sapply(1:dim(dct_sd_order)[1],function(x){isTRUE(all.equal(dct_sd_order[x,2],byline_dct_sd_order[x,2]))})\n",
    "\n",
    "## Print discrepancies, if they exist\n",
    "if(sum(check_dct_sd) == nrow(dct_sd_order)) {\n",
    "  print(\"All dCt SD correct\")\n",
    "} else {\n",
    "  print(\"Need to clean dCt SD\")\n",
    "  dct_sd_errs = cbind(byline_dct_sd_order[which(check_dct_sd==F),], dct_sd_order[which(check_dct_sd==F),])\n",
    "  dct_sd_errs\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6f. Update dCt SD if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "if (dim(dct_sd_errs)[1] > 0) {\n",
    "  for (i in 1:dim(dct_sd_errs)[1]) {\n",
    "    qpcr_data$dCt.sd[qpcr_data$Group==dct_sd_errs[i,3]] = dct_sd_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6g. Calculate baseline dCt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All baseline dCt correct\"\n"
     ]
    }
   ],
   "source": [
    "## Check baseline.dct\n",
    "## Add new group column to annotate baseline\n",
    "qpcr_data$Group_g <- paste(qpcr_data[,\"UW_Line\"],qpcr_data[,\"Tissue\"], qpcr_data[,\"Experiment\"],sep=\"_\")\n",
    "\n",
    "## Calculate baseline, baseline is 12 for this data\n",
    "qpcr_data$baseline.dCt.V2 = NA\n",
    "baseline = 12\n",
    "\n",
    "for(i in unique(qpcr_data[,\"Group_g\"])){\n",
    "  qpcr_data[which(qpcr_data[,\"Group_g\"] == i),\"baseline.dCt.V2\"] <- \n",
    "    qpcr_data[which(qpcr_data[,\"Group_g\"] == i & qpcr_data[,\"Virus\"] == \"Mock\" & qpcr_data[,\"Timepoint\"] == baseline),\"dCt.mean\"]\n",
    "}\n",
    "\n",
    "# Print discrepancies, if they exist\n",
    "if(sum(qpcr_data[,\"baseline.dCt\"] == qpcr_data[,\"baseline.dCt.V2\"]) == nrow(qpcr_data)){\n",
    "  print(\"All baseline dCt correct\")\n",
    "} else {\n",
    "  print(\"Need to clean baseline dCt\")\n",
    "  baseline_dct_errs = qpcr_data[qpcr_data[,\"baseline.dCt\"] != qpcr_data[,\"baseline.dCt.V2\"], \n",
    "                                c(\"Group\", \"Group_g\", \"baseline.dCt\", \"baseline.dCt.V2\")]\n",
    "  baseline_dct_errs\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6h. Update baseline dCt if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "if (dim(baseline_dct_errs)[1] > 0) {\n",
    "  for (i in 1:dim(baseline_dct_errs)[1]) {\n",
    "    qpcr_data$baseline.dCt[qpcr_data$Group==baseline_dct_errs[i,1]] = baseline_dct_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6i. Calculate ddCt mean"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All ddCt mean correct\"\n"
     ]
    }
   ],
   "source": [
    "## Calculate ddCt mean \n",
    "qpcr_data$ddCt.mean.V2 <- as.numeric(as.character(qpcr_data[,\"dCt.mean\"])) - as.numeric(as.character(qpcr_data[,\"baseline.dCt\"]))\n",
    "\n",
    "check_ddct_mean = sapply(1:dim(qpcr_data)[1],function(x)isTRUE(all.equal(qpcr_data[x,\"ddCt.mean\"], qpcr_data[x,\"ddCt.mean.V2\"], tolerance=5.5e-8)))\n",
    "\n",
    "## Print discrepancies, if they exist\n",
    "if(sum(check_ddct_mean) == dim(qpcr_data)[1]){\n",
    "  print(\"All ddCt mean correct\")\n",
    "} else {\n",
    "  print(\"Need to clean ddCt mean\")\n",
    "  ddct_errs = qpcr_data[which(check_ddct_mean==F), c(\"Group\", \"Group_g\", \"ddCt.mean\", \"ddCt.mean.V2\")]\n",
    "  ddct_errs\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6j. Update ddCt mean if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "if (dim(ddct_errs)[1] > 0) {\n",
    "  for (i in 1:dim(ddct_errs)[1]) {\n",
    "    qpcr_data$ddCt.mean[qpcr_data$Group==ddct_errs[i,1]] = ddct_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6k. Calculate fold change mean"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1] \"All FC mean correct\"\n"
     ]
    }
   ],
   "source": [
    "# check fc mean correct\n",
    "qpcr_data$fc.mean.V2 <- 2^-qpcr_data[,\"ddCt.mean\"]\n",
    "\n",
    "check_fc_mean = sapply(1:dim(qpcr_data)[1],function(x)isTRUE(all.equal(qpcr_data[x,\"fc.mean\"], qpcr_data[x,\"fc.mean.V2\"])))\n",
    "\n",
    "if(sum(check_fc_mean) == dim(qpcr_data)[1]){\n",
    "  print(\"All FC mean correct\")\n",
    "} else {\n",
    "  print(\"Need to clean FC mean\")\n",
    "  fc_errs = qpcr_data[which(check_fc_mean==F), c(\"Group\", \"Group_g\", \"fc.mean\", \"fc.mean.V2\")]\n",
    "  fc_errs\n",
    "}"
   ]
  },
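  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Steps 6g through 6k chain three quantities: the baseline dCt (the Mock mean at the baseline timepoint), ddCt (the group dCt mean minus the baseline), and the fold change `2^(-ddCt)`. A small worked example with made-up numbers (the names `mock_12` and `wnv_12` and both values are hypothetical):\n",
    "\n",
    "```r\n",
    "## Group dCt means for one line, tissue, and experiment (hypothetical values)\n",
    "dct_mean = c(mock_12=9.0, wnv_12=6.0)\n",
    "\n",
    "## Baseline dCt is the Mock group at the baseline timepoint\n",
    "baseline_dct = unname(dct_mean[\"mock_12\"])\n",
    "\n",
    "## ddCt: group dCt mean minus the baseline dCt\n",
    "ddct = dct_mean - baseline_dct\n",
    "\n",
    "## Fold change relative to the baseline\n",
    "fc = 2^(-ddct)\n",
    "fc\n",
    "```\n",
    "\n",
    "Here the baseline group has a ddCt of 0 and a fold change of 1 by construction, and a dCt that is 3 cycles lower than baseline corresponds to a fold change of 8."
   ]
  },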
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 6l. Update fold change mean if necessary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "if (dim(fc_errs)[1] > 0) {\n",
    "  for (i in 1:dim(fc_errs)[1]) {\n",
    "    qpcr_data$fc.mean[qpcr_data$Group==fc_errs[i,1]] = fc_errs[i,4]\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 7. Remove Unused Columns and Add Baseline dCt.sd and ddCt.se"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Remove extra (unused) columns\n",
    "remove_cols = c(\"ddCt.sd\", \"fc.sd\", \"baseline.dCt.V2\", \"ddCt.mean.V2\", \"fc.mean.V2\")\n",
    "qpcr_data_v2 = qpcr_data[,-as.vector(unlist(sapply(remove_cols,function(x)which(x==names(qpcr_data)))))]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Add baseline.dCt.sd column\n",
    "qpcr_data_v2$baseline.dCt.sd = NA\n",
    "\n",
    "# compute baseline sd, use baseline = 12 saved above\n",
    "for (i in unique(qpcr_data_v2[,\"Group_g\"])){\n",
    "  qpcr_data_v2[which(qpcr_data_v2[,\"Group_g\"] == i),\"baseline.dCt.sd\"] <- \n",
    "    qpcr_data_v2[which(qpcr_data_v2[,\"Group_g\"] == i & \n",
    "                       qpcr_data_v2[,\"Virus\"] == \"Mock\" & \n",
    "                       qpcr_data_v2[,\"Timepoint\"] == baseline),\"dCt.sd\"]\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "## Add ddCt.se\n",
    "ddCt.se = sapply(1:dim(qpcr_data_v2)[1],function(x){\n",
    "  sqrt((qpcr_data_v2[x,\"dCt.sd\"]^2/qpcr_data_v2[x,\"N\"]) + \n",
    "      (qpcr_data_v2[x,\"baseline.dCt.sd\"]^2/qpcr_data_v2[which(qpcr_data_v2[,\"Group_g\"] == qpcr_data_v2[x,\"Group_g\"] & \n",
    "                                                              qpcr_data_v2[,\"Virus\"] == \"Mock\" & \n",
    "                                                              qpcr_data_v2[,\"Timepoint\"] == baseline),\"N\"])\n",
    "  )\n",
    "})\n",
    "qpcr_data_v2$ddCt.se <- ddCt.se"
   ]
  },
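  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `ddCt.se` calculation above can be written as a formula. This is only a restatement of the code; the symbols are labels introduced here, not names from the data files. $s_g$ and $N_g$ are `dCt.sd` and `N` for a group, and $s_b$ and $N_b$ are `baseline.dCt.sd` and `N` for the matching Mock group at the baseline timepoint.\n",
    "\n",
    "$$\\mathrm{SE}_{\\Delta\\Delta Ct} = \\sqrt{\\frac{s_g^2}{N_g} + \\frac{s_b^2}{N_b}}$$\n",
    "\n",
    "For the Mock baseline group itself the two terms are equal, so its value reduces to $\\sqrt{2}\\, s_b / \\sqrt{N_b}$."
   ]
  },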
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 8. Finalize Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "TRUE"
      ],
      "text/latex": [
       "TRUE"
      ],
      "text/markdown": [
       "TRUE"
      ],
      "text/plain": [
       "[1] TRUE"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Order by group\n",
    "qpcr_data_v3_final = qpcr_data_v2[order(qpcr_data_v2[,\"Group\"]),]\n",
    "\n",
    "## Double-check all calculations are correct\n",
    "## dCt mean\n",
    "isTRUE(all.equal(dct_mean_order[,2], qpcr_data_v3_final[,\"dCt.mean\"]))\n",
    "\n",
    "## N\n",
    "isTRUE(all.equal(bymouse_n_order[,2], qpcr_data_v3_final[,\"N\"]))\n",
    "\n",
    "## dCt SD\n",
    "isTRUE(all.equal(dct_sd_order[,2], qpcr_data_v3_final[,\"dCt.sd\"]))\n",
    "\n",
    "## baseline dCt\n",
    "isTRUE(all.equal(qpcr_data[order(qpcr_data[,\"Group\"]),\"baseline.dCt.V2\"],qpcr_data_v3_final[,\"baseline.dCt\"]))\n",
    "\n",
    "## ddCt mean\n",
    "isTRUE(all.equal(qpcr_data[order(qpcr_data[,\"Group\"]),\"ddCt.mean.V2\"],qpcr_data_v3_final[,\"ddCt.mean\"]))\n",
    "\n",
    "## FC mean\n",
    "isTRUE(all.equal(qpcr_data[order(qpcr_data[,\"Group\"]),\"fc.mean.V2\"],qpcr_data_v3_final[,\"fc.mean\"]))\n",
    "\n",
    "## Not checked here: baseline SD, computed using the corrected dCt SD at timepoint 12\n",
    "\n",
    "## Not checked here: ddCt SE, computed using the corrected dCt SD and baseline SD\n",
    "## Remove Group_g column\n",
    "qpcr_data_final_format <- qpcr_data_v3_final[,!(names(qpcr_data_v3_final) %in% \"Group_g\")]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "## Order columns according to data dictionary (byLine)\n",
    "## Note: you may have to change the path to the data dictionary\n",
    "data_dict <- read.xls(xls=\"./data/WNV_Data_Dictionary.xlsx\", sheet=\"qPCR Data - By Line\", as.is=T)\n",
    "qpcr_data_final_format_order = qpcr_data_final_format[, data_dict[,1]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "## Order columns according to data dictionary (byMouse)\n",
    "data_dict <- read.xls(xls=\"./data/WNV_Data_Dictionary.xlsx\", sheet=\"qPCR Data - By Mouse\", as.is=T)\n",
    "qpcr_data_mouse_order = qpcr_data_mouse[, data_dict[,1]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "\t<li>TRUE</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\item TRUE\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. TRUE\n",
       "2. TRUE\n",
       "3. TRUE\n",
       "4. TRUE\n",
       "5. TRUE\n",
       "6. TRUE\n",
       "7. TRUE\n",
       "8. TRUE\n",
       "9. TRUE\n",
       "10. TRUE\n",
       "11. TRUE\n",
       "12. TRUE\n",
       "13. TRUE\n",
       "14. TRUE\n",
       "15. TRUE\n",
       "16. TRUE\n",
       "17. TRUE\n",
       "18. TRUE\n",
       "19. TRUE\n",
       "20. TRUE\n",
       "21. TRUE\n",
       "22. TRUE\n",
       "23. TRUE\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       " [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE\n",
       "[16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check both byLine and byMouse have the same UW lines\n",
    "names(summary(as.factor(qpcr_data_mouse_order[,\"UW_Line\"]))) == \n",
    "names(summary(as.factor(qpcr_data_final_format_order[,\"UW_Line\"])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>1560</li>\n",
       "\t<li>19</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 1560\n",
       "\\item 19\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 1560\n",
       "2. 19\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 1560   19"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>4600</li>\n",
       "\t<li>21</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 4600\n",
       "\\item 21\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 4600\n",
       "2. 21\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 4600   21"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dim(qpcr_data_final_format_order)\n",
    "dim(qpcr_data_mouse_order)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 9. Combine with Previously Cleaned Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>3990</li>\n",
       "\t<li>19</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 3990\n",
       "\\item 19\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 3990\n",
       "2. 19\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 3990   19"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Read in previous version of byLine data\n",
    "cleaned_data_dir = \"~/Documents/SIG/WNV/Cleaned_Data_Releases/23-Mar-2016/\"\n",
    "prev_qpcr_byline = read.xls(xls=file.path(cleaned_data_dir, \"Gale_qPCR_byLine_23-Mar-2016_final.xlsx\"), sheet=1, as.is=T)\n",
    "dim(prev_qpcr_byline)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "170"
      ],
      "text/latex": [
       "170"
      ],
      "text/markdown": [
       "170"
      ],
      "text/plain": [
       "[1] 170"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check for duplicates (new data will overwrite old)\n",
    "dup_groups = intersect(qpcr_data_final_format_order$Group, prev_qpcr_byline$Group)\n",
    "length(dup_groups)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>3820</li>\n",
       "\t<li>19</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 3820\n",
       "\\item 19\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 3820\n",
       "2. 19\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 3820   19"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Remove duplicated groups from previous data\n",
    "prev_qpcr_byline = prev_qpcr_byline[!prev_qpcr_byline$Group %in% dup_groups, ]\n",
    "dim(prev_qpcr_byline)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>5380</li>\n",
       "\t<li>19</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 5380\n",
       "\\item 19\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 5380\n",
       "2. 19\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 5380   19"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>5380</li>\n",
       "\t<li>19</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 5380\n",
       "\\item 19\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 5380\n",
       "2. 19\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 5380   19"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Combine old and new data\n",
    "qpcr_byline_updated = rbind(prev_qpcr_byline, qpcr_data_final_format_order)\n",
    "dim(qpcr_byline_updated)\n",
    "\n",
    "## Set blanks to NA\n",
    "qpcr_byline_updated_cleaned = clean_na(qpcr_byline_updated)\n",
    "\n",
    "## Remove duplicates\n",
    "if(sum(duplicated(qpcr_byline_updated_cleaned)) != 0){\n",
    "  qpcr_byline_updated_cleaned = qpcr_byline_updated_cleaned[!duplicated(qpcr_byline_updated_cleaned),]\n",
    "}\n",
    "dim(qpcr_byline_updated_cleaned)"
   ]
  },
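  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The overwrite pattern used in the cells above (drop previous rows whose `Group` also appears in the new data, then append the new rows) can be sketched on toy data. The data frames `old` and `new` and their values are hypothetical and exist only for this illustration; the sketch assumes `Group` is the key that identifies a row.\n",
    "\n",
    "```r\n",
    "## Toy data (hypothetical values, not from the qPCR files)\n",
    "old <- data.frame(Group = c(\"a\", \"b\", \"c\"), value = c(1, 2, 3), stringsAsFactors = FALSE)\n",
    "new <- data.frame(Group = c(\"b\", \"c\", \"d\"), value = c(20, 30, 40), stringsAsFactors = FALSE)\n",
    "\n",
    "## Groups present in both versions; the new rows should win\n",
    "dup <- intersect(new$Group, old$Group)\n",
    "\n",
    "## Drop the overlapping groups from the old data, then append the new data\n",
    "old_kept <- old[!old$Group %in% dup, ]\n",
    "combined <- rbind(old_kept, new)\n",
    "\n",
    "## Sanity check: rows kept from old plus all new rows\n",
    "stopifnot(nrow(combined) == nrow(old) - sum(old$Group %in% dup) + nrow(new))\n",
    "```\n",
    "\n",
    "The same row-count check applies to the real data: the dimensions printed above should equal the previous row count, minus the removed duplicates, plus the new rows."
   ]
  },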
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>11564</li>\n",
       "\t<li>21</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 11564\n",
       "\\item 21\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 11564\n",
       "2. 21\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 11564    21"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Read in previous version of byMouse data\n",
    "prev_qpcr_bymouse = read.xls(xls=file.path(cleaned_data_dir, \"Gale_qPCR_byMouse_23-Mar-2016_final.xlsx\"), sheet=1)\n",
    "dim(prev_qpcr_bymouse)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "495"
      ],
      "text/latex": [
       "495"
      ],
      "text/markdown": [
       "495"
      ],
      "text/plain": [
       "[1] 495"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Check for duplicates (new data will overwrite old)\n",
    "prev_ids = paste(prev_qpcr_bymouse$ID, prev_qpcr_bymouse$Tissue, prev_qpcr_bymouse$Experiment, sep='_')\n",
    "new_ids = paste(qpcr_data_mouse_order$ID, qpcr_data_mouse_order$Tissue, qpcr_data_mouse_order$Experiment, sep='_')\n",
    "dup_ids = intersect(prev_ids, new_ids)\n",
    "length(dup_ids)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>11069</li>\n",
       "\t<li>21</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 11069\n",
       "\\item 21\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 11069\n",
       "2. 21\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 11069    21"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Remove duplicated IDs from previous data\n",
    "prev_qpcr_bymouse = prev_qpcr_bymouse[which(!prev_ids %in% dup_ids), ]\n",
    "dim(prev_qpcr_bymouse)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>15669</li>\n",
       "\t<li>21</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 15669\n",
       "\\item 21\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 15669\n",
       "2. 21\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 15669    21"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "text/html": [
       "<ol class=list-inline>\n",
       "\t<li>15669</li>\n",
       "\t<li>21</li>\n",
       "</ol>\n"
      ],
      "text/latex": [
       "\\begin{enumerate*}\n",
       "\\item 15669\n",
       "\\item 21\n",
       "\\end{enumerate*}\n"
      ],
      "text/markdown": [
       "1. 15669\n",
       "2. 21\n",
       "\n",
       "\n"
      ],
      "text/plain": [
       "[1] 15669    21"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "## Combine old and new data\n",
    "qpcr_bymouse_updated = rbind(prev_qpcr_bymouse, qpcr_data_mouse_order)\n",
    "dim(qpcr_bymouse_updated)\n",
    "\n",
    "## Set blanks to NA\n",
    "qpcr_bymouse_updated_cleaned = clean_na(qpcr_bymouse_updated)\n",
    "\n",
    "## Remove duplicates\n",
    "if(sum(duplicated(qpcr_bymouse_updated_cleaned)) != 0){\n",
    "  qpcr_bymouse_updated_cleaned = qpcr_bymouse_updated_cleaned[!duplicated(qpcr_bymouse_updated_cleaned),]\n",
    "}\n",
    "dim(qpcr_bymouse_updated_cleaned)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 10. Make Any Manual Corrections, If Necessary (Record These in README)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Step 11. Save Cleaned Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "## Save ByLine Data\n",
    "write.table(qpcr_byline_updated_cleaned, file=file.path(data_dir, \"23-May-2016/Gale_qPCR_byLine_5-23-16_MM_updated.txt\"), \n",
    "            col.names=T, row.names=F, sep='\\t', quote=F, na=\"\")\n",
    "## Save ByMouse Data\n",
    "write.table(qpcr_bymouse_updated_cleaned, file=file.path(data_dir, \"23-May-2016/Gale_qPCR_byMouse_5-23-16_MM_updated.txt\"), \n",
    "            col.names=T, row.names=F, sep='\\t', quote=F, na=\"\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "#### Last Updated: 26-May-2016"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.2.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}