{
 "metadata": {
  "name": "TJGR_Mgo_Expression",
  "signature": "sha256:00a7a1c68f84fd5aa7501e06ee8f565a62a5f626f2b96bf0783748a4e4262d9d"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#Developing Oyster Male Gonad Gene Expression Tracks\n",
      "\n",
      "Attempting to develop IGV genome browser tracks from the Zhang et al. (2012) data."
     ]
    },
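    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Overview\n",
      "\n",
      "1. Track based on transcript specific expression\n",
      "2. Track based on gene specific expression\n",
      "3. Expression data as SAM file\n"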
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "\n",
      "\n",
      "##1. Track based on transcript specific expression\n",
      "\n",
      "Table is available in SQLShare \n",
      "<https://sqlshare.escience.washington.edu/sqlshare#s=query/sr320%40washington.edu/Mgo_RNAseq_transcript>\n",
      "\n",
      "_Description:_\n",
      "```\n",
      "Paired end Male Gonad RNA-Seq data from Zhang et al 2012 Exported file from CLCL v6\n",
      "Data provided at exon level\n",
      "Derived using Dataset:\n",
      "Zhang, G; Fang, X; Guo, X; Li, L; Luo, R; Xu, F; Yang, P; Zhang, L; Wang, X; Qi, H; Zhu, Y; Yang, L; Huang, Z (2012): Genomic data from the Pacific oyster (Crassostrea gigas). GigaScience. http://dx.doi.org/10.5524/100030          \n",
      "          Maximum paired distance = 250\n",
      "          Unspecific match limit = 10\n",
      "          Minimum exon coverage fraction = 0.2\n",
      "          Count paired reads as two = No\n",
      "          Additional downstream bases = 0\n",
      "          Minimum paired distance = 180\n",
      "          Minimum similarity fraction = 0.8\n",
      "          Additional upstream bases = 0\n",
      "          Strand = Forward\n",
      "          Use annotations for gene and transcript identification = Yes\n",
      "          Organism type = EUKARYOTE\n",
      "          Minimum length fraction (long reads) = 0.9\n",
      "          Minimum read count fusion gene table = 5\n",
      "          Exon discovery = Yes\n",
      "          Expression level = Transcripts\n",
      "          Create report = Yes\n",
      "          Use colorspace encoding = No\n",
      "          Expression value = RPKM\n",
      "          Create list of unmapped reads = No\n",
      "          Create fusion gene table = No\n",
      "          Minimum number of reads = 10\n",
      "          Maximum number of mismatches allowed (applies to short reads) = 2\n",
      "          References = oyster.v9_90-7\n",
      "          Use strand specific assembly = No\n",
      "          Minimum length of putative exons = 25\n",
      "          Expression value = Read Per Kilobase of exon Model value\n",
      "Found: 25123 genes.\n",
      "Total number of reads : 54739722 ( single reads: 0, paired reads: 54739722 )\n",
      "Total number of mapped reads : 25685813 ( single reads: 0, paired reads: 25685813 )\n",
      "Total number of unmapped reads : 29053909 ( single reads: 0, paired reads: 29053909 )\n",
      "```\n",
      "\n",
      "_Screenshot_\n",
      "\n",
      "<img src=\"https://www.evernote.com/shard/s10/sh/7cf328c4-581f-4038-bd35-f0d62ccc81b4/a0bd8da1cd6c088cc2f5a6d751a111f7/deep/0/Screenshot%205/29/13%202:28%20PM.jpg\" alt=\"Screenshot%205/29/13%202:28%20PM\" width=\"50%\" />\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Query\n"
     ]
    },
    {
     "cell_type": "raw",
     "metadata": {},
     "source": [
      "SELECT\n",
      "  Chromosome,\n",
      "  \"Chromosome region start\" - 1 as start,\n",
      "  \"Chromosome region end\",\n",
      "  'exon' as Feature,\n",
      "  \"Expression Values\"\n",
      "  \n",
      "  FROM [sr320@washington.edu].[Mgo_RNAseq_transcript]\n",
      "  \n",
      "ORDER BY \"Expression Values\" DESC"
     ]
    },
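    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The `- 1` in the query shifts each start coordinate down by one. Below is a minimal Python sketch of the same per-row conversion, assuming the exported table uses 1-based inclusive starts and that the IGV track expects 0-based starts (neither is confirmed in this notebook). `to_igv_row` and the scaffold name are hypothetical, for illustration only.\n",
      "\n",
      "```python\n",
      "def to_igv_row(chrom, region_start, region_end, value, feature='exon'):\n",
      "    # Assumption: region_start is 1-based; subtracting 1 mirrors the query.\n",
      "    return (chrom, region_start - 1, region_end, feature, value)\n",
      "\n",
      "\n",
      "# Hypothetical row: scaffold name, start, end, RPKM value.\n",
      "print(to_igv_row('scaffold1', 101, 250, 12.5))\n",
      "```\n"
     ]
    },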
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "_Screenshot_\n",
      "\n",
      "<img src=\"https://www.evernote.com/shard/s10/sh/ee8b2fd1-4d5b-46b7-8f13-a21ea5adbbe7/348393ae625775c02eea7dfa2741c3af/deep/0/Screenshot%205/29/13%202:39%20PM.jpg\" alt=\"Screenshot%205/29/13%202:39%20PM\" width=\"50%\" />"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "_Downloading via python client_"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "\n",
      "\n",
      "`python fetchdata.py -d \"[sr320@washington.edu].[Mgo_RNAseq_transcript_IGV]\" -f tsv -o /Volumes/web/cnidarian/Mgo_RNAseq_transcript.igv`"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<http://eagle.fish.washington.edu/cnidarian/Mgo_RNAseq_transcript.igv>\n",
      "\n",
      "Sorted within IGV\n",
      "\n",
      "<img src=\"https://www.evernote.com/shard/s10/sh/fe8e45fd-5b72-4733-b106-2c0723d28ab7/c9d96c825a464c6bb2e1637706721c8e/deep/0/Screenshot%205/29/13%202:53%20PM.jpg\" alt=\"Screenshot%205/29/13%202:53%20PM\" width=\"40%\" />\n",
      "\n",
      "Sorting creates a new file:\n",
      "<http://eagle.fish.washington.edu/cnidarian/Mgo_RNAseq_transcript.sorted.igv>"
     ]
    },
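    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "For reference, a rough Python sketch of sorting track rows by chromosome and then by numeric start position. This is an assumption about what the sort step does, not IGV's actual implementation; `sort_igv_rows` and the example rows are hypothetical.\n",
      "\n",
      "```python\n",
      "def sort_igv_rows(rows):\n",
      "    # Order rows by chromosome name, then by numeric start position.\n",
      "    return sorted(rows, key=lambda row: (row[0], int(row[1])))\n",
      "\n",
      "\n",
      "# Hypothetical rows: (chromosome, start, end, feature, value).\n",
      "rows = [\n",
      "    ('scaffold2', 50, 90, 'exon', 3.0),\n",
      "    ('scaffold1', 200, 260, 'exon', 1.5),\n",
      "    ('scaffold1', 10, 80, 'exon', 7.2),\n",
      "]\n",
      "for row in sort_igv_rows(rows):\n",
      "    print(row)\n",
      "```\n"
     ]
    },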
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This seems to have worked, but I am not certain the coordinates are correct.\n",
      "\n",
      "<img src=\"https://www.evernote.com/shard/s10/sh/80e58f04-a5c0-4dfb-aade-360841761da5/500a1d94ec9f22ad5481f650dada43fa/deep/0/Screenshot%205/29/13%203:00%20PM.jpg\" alt=\"Screenshot%205/29/13%203:00%20PM\" width=\"40%\" />"
     ]
    },
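    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One way to sanity-check the coordinates is to confirm that every row has a non-negative start that is smaller than its end. Below is a minimal sketch, assuming rows of (chromosome, start, end, feature, value); `find_bad_rows` and the example rows are hypothetical. It does not show whether the 0-based versus 1-based choice is right, only that no interval is malformed.\n",
      "\n",
      "```python\n",
      "def find_bad_rows(rows):\n",
      "    # Collect rows with a negative start or with start >= end.\n",
      "    bad = []\n",
      "    for row in rows:\n",
      "        start, end = int(row[1]), int(row[2])\n",
      "        if start < 0 or start >= end:\n",
      "            bad.append(row)\n",
      "    return bad\n",
      "\n",
      "\n",
      "# Hypothetical rows for illustration.\n",
      "rows = [\n",
      "    ('scaffold1', 100, 250, 'exon', 12.5),\n",
      "    ('scaffold1', -1, 40, 'exon', 0.8),    # negative start\n",
      "    ('scaffold2', 300, 300, 'exon', 2.1),  # empty interval\n",
      "]\n",
      "print(find_bad_rows(rows))\n",
      "```\n"
     ]
    },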
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "##2. Track based on gene specific expression\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "<hr>\n",
      "##3. Expression data as SAM file\n",
      "<http://eagle.fish.washington.edu/cnidarian/Mgo_1%20(paired)%20RNA-Seq.sam>"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [],
     "language": "python",
     "metadata": {},
     "outputs": []
    }
   ],
   "metadata": {}
  }
 ]
}