{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Running in Docker container on Ostrich\n", "\n", "#### Started Docker container with the following command:\n", "\n", "```docker run -p 8888:8888 -v /Users/sam/data/:/data -v /Users/sam/owl_home/:/owl_home -v /Users/sam/owl_web/:/owl_web -v /Users/sam/gitrepos:/gitrepos -it f99537d7e06a```\n", "\n", "The command allows access to Jupyter Notebook over port 8888 and makes my Jupyter Notebook GitHub repo and my data files on Owl/home and Owl/web accessible to the Docker container.\n", "\n", "Once the container was running, I started Jupyter Notebook with the following command inside the Docker container:\n", "\n", "```jupyter notebook```\n", "\n", "This is configured in the Docker container to launch a Jupyter Notebook without a browser on port 8888.\n", "\n", "The Docker container is running on an image created from this [Dockerfile (Git commit 443bc42)](https://github.com/sr320/LabDocs/blob/443bc425cd36d23a07cf12625f38b7e3a397b9be/code/dockerfiles/Dockerfile.bio)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Thu Dec 29 22:59:51 UTC 2016\n" ] } ], "source": [ "%%bash\n", "date" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Check computer specs" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0f2bca9c664b\n" ] } ], "source": [ "%%bash\n", "hostname" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Architecture: x86_64\n", "CPU op-mode(s): 32-bit, 64-bit\n", "Byte Order: Little Endian\n", "CPU(s): 8\n", "On-line CPU(s) list: 0-7\n", "Thread(s) per core: 1\n", "Core(s) per socket: 8\n", "Socket(s): 1\n", "Vendor ID: GenuineIntel\n", "CPU family: 6\n", "Model: 
26\n", "Model name: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz\n", "Stepping: 5\n", "CPU MHz: 2260.998\n", "BogoMIPS: 4521.99\n", "Hypervisor vendor: KVM\n", "Virtualization type: full\n", "L1d cache: 32K\n", "L1i cache: 32K\n", "L2 cache: 256K\n", "L3 cache: 8192K\n" ] } ], "source": [ "%%bash\n", "lscpu" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download all Panopea generosa RRBS files from Genewiz using ```wget```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### This uses the ```-m``` flag to recursively download all files and mirror the directory structure of the target. Additionally, it uses the ```-P``` flag to specify the output location of the downloaded files. The ```--quiet``` flag suppresses status output, which keeps the notebook from getting too large given the number of output lines produced when downloading these large files." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 110G\n", "-rw-rw-rw- 1 srlab staff 2.8K Nov 2 19:29 readme.md\n", "-rw-rw-rw- 1 srlab staff 2.1K Mar 24 2016 checksums.md5\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.4G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.3G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 27 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 
160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.1G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.2G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 6.4G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 8.1G Jan 25 2016 151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.8G Jan 25 2016 151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 9.9G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_1.fq.gz\n" ] } ], "source": [ "%%bash\n", "ls -lhr /owl_web/nightingales/P_generosa/" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Process is interrupted.\n" ] } ], "source": [ "%%bash\n", "time wget --quiet -m ftp://Rick.Goetz:Q1p487Hj@ftp2.genewiz.com/DW1604132_R3 \\\n", "- P /owl_web/nightingales/P_generosa/" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 110G\n", "-rw-rw-rw- 1 srlab 
staff 2.8K Nov 2 19:29 readme.md\n", "-rw-rw-rw- 1 srlab staff 2.1K Mar 24 2016 checksums.md5\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.4G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.3G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 27 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.1G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.2G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 6.4G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 8.1G Jan 25 2016 
151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.8G Jan 25 2016 151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 9.9G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_1.fq.gz\n" ] } ], "source": [ "%%bash\n", "ls -lhr /owl_web/nightingales/P_generosa/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Interesting, the above command doesn't work. Looked at an old [notebook post where I retrieved data from Genewiz](http://onsnetwork.org/kubu4/2016/01/08/data-received-bisulfite-treated-illumina-sequencing-from-genewiz/). Changing command..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The ```wget``` command below uses the following arguments:\n", "\n", "```--quiet``` - Suppresses output printed to screen.\n", "\n", "```-r``` - Download recursively (i.e. go through the entire set of directories)\n", "\n", "```-np``` - Stands for \"no parent.\" Means do not ascend above the initial directory.\n", "\n", "```-nc``` - Stands for \"no clobber.\" Means do not overwrite files with same names.\n", "\n", "```-A``` - Accept list. Needs to be followed by the file name suffixes or patterns that wget is allowed to download.\n", "\n", "```-P``` - Specifies output location." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Process is interrupted.\n" ] } ], "source": [ "%%bash\n", "time wget --quiet -r -np -nc -A \"*.gz\" ftp://Rick.Goetz:Q1p487Hj@ftp2.genewiz.com/DW1604132_R3 \\\n", "- P /owl_web/nightingales/P_generosa/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### That also wasn't working! Try w/o specifying output directory and change directly into desired download location..." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/owl_web/nightingales/P_generosa\n" ] } ], "source": [ "cd /owl_web/nightingales/P_generosa/" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "real\t69m4.293s\n", "user\t0m3.960s\n", "sys\t16m20.760s\n" ] } ], "source": [ "%%bash\n", "time wget --quiet -r -np -nc -A \"*.gz\" ftp://Rick.Goetz:Q1p487Hj@ftp2.genewiz.com/DW1604132_R3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Check the downloaded files" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 110G\n", "-rw-rw-rw- 1 srlab staff 2.8K Nov 2 19:29 readme.md\n", "drwxrwxrwx 1 srlab staff 264 Dec 29 23:52 ftp2.genewiz.com\n", "-rw-rw-rw- 1 srlab staff 2.1K Mar 24 2016 checksums.md5\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jun 30 2015 Geo_Pool_M_CTTGTA_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.4G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 9.3G Jun 30 2015 Geo_Pool_F_GGCTAC_L006_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 27 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L6_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Jan 25 2016 
160103_I137_FCH3V5YBBXX_L5_WHPANwalDDACDTAAPEI-102_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.3G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.1G Jan 25 2016 160103_I137_FCH3V5YBBXX_L5_WHPANwalDDABDLAAPEI-100_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 2.0G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L4_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.9G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 1.8G Jan 25 2016 160103_I137_FCH3V5YBBXX_L3_WHPANwalDDAADWAAPEI-101_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.2G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 6.4G Jan 25 2016 151122_I136_FCH3L2FBBXX_L7_wHAXPI023990-97_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 8.1G Jan 25 2016 151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 7.8G Jan 25 2016 151114_I191_FCH3Y35BCXX_L2_wHAMPI023988-81_1.fq.gz\n", "-rw-rw-rw- 1 srlab staff 11G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_2.fq.gz\n", "-rw-rw-rw- 1 srlab staff 9.9G Jan 25 2016 151114_I191_FCH3Y35BCXX_L1_wHAIPI023989-79_1.fq.gz\n" ] } ], "source": [ "%%bash\n", "ls -lhr" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 35G\n", "-rw-rw-rw- 1 srlab staff 1.5G Dec 28 18:33 EPI-145_S38_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.5G Dec 28 18:33 EPI-145_S38_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Dec 28 18:33 EPI-143_S37_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Dec 28 18:33 EPI-143_S37_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.6G Dec 28 18:33 EPI-136_S36_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.6G Dec 28 18:33 EPI-136_S36_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 
srlab staff 1.7G Dec 28 18:33 EPI-135_S35_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.6G Dec 28 18:33 EPI-135_S35_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.4G Dec 28 18:33 EPI-128_S34_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.4G Dec 28 18:33 EPI-128_S34_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Dec 28 18:33 EPI-127_S33_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.3G Dec 28 18:33 EPI-127_S33_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.4G Dec 28 18:33 EPI-120_S32_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.4G Dec 28 18:33 EPI-120_S32_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.7G Dec 28 18:33 EPI-119_S31_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.7G Dec 28 18:33 EPI-119_S31_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.5G Dec 28 18:33 EPI-113_S30_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.5G Dec 28 18:33 EPI-113_S30_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.5G Dec 28 18:33 EPI-111_S29_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.6G Dec 28 18:33 EPI-111_S29_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.7G Dec 28 18:33 EPI-104_S28_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.7G Dec 28 18:33 EPI-104_S28_L005_R1_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Dec 28 18:33 EPI-103_S27_L005_R2_001.fastq.gz\n", "-rw-rw-rw- 1 srlab staff 1.2G Dec 28 18:33 EPI-103_S27_L005_R1_001.fastq.gz\n" ] } ], "source": [ "%%bash\n", "ls -lhr ftp2.genewiz.com/DW1604132_R3/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Generate MD5 checksums and append to existing checksums file" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "real\t33m38.336s\n", "user\t0m3.810s\n", "sys\t3m54.180s\n" ] } ], "source": [ "%%bash\n", "time for i in ftp2.genewiz.com/DW1604132_R3/*\n", " do\n", " md5sum \"$i\" >> checksums.md5\n", " 
done" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "17dbac3c0a80c19b7cf9c65840188dc0 ftp2.genewiz.com/DW1604132_R3/EPI-103_S27_L005_R1_001.fastq.gz\n", "4ec237a381460fd7635b77b84bc530f1 ftp2.genewiz.com/DW1604132_R3/EPI-103_S27_L005_R2_001.fastq.gz\n", "1d4fb44a60cb4b50367f35751d71b040 ftp2.genewiz.com/DW1604132_R3/EPI-104_S28_L005_R1_001.fastq.gz\n", "f534dea1569d9c3a7345fe0e6230f10d ftp2.genewiz.com/DW1604132_R3/EPI-104_S28_L005_R2_001.fastq.gz\n", "5211ee55bda2d9fc104cd3fd862b4e5a ftp2.genewiz.com/DW1604132_R3/EPI-111_S29_L005_R1_001.fastq.gz\n", "6bf50011fe4dacad4379828897d3b99b ftp2.genewiz.com/DW1604132_R3/EPI-111_S29_L005_R2_001.fastq.gz\n", "7ca64d5b500f0768b3ce697c21614948 ftp2.genewiz.com/DW1604132_R3/EPI-113_S30_L005_R1_001.fastq.gz\n", "ddc2988966835ea69cfd0f4a94d65cba ftp2.genewiz.com/DW1604132_R3/EPI-113_S30_L005_R2_001.fastq.gz\n", "1cae2a72398e94e68947eca56f875d97 ftp2.genewiz.com/DW1604132_R3/EPI-119_S31_L005_R1_001.fastq.gz\n", "8f3ab566a1e10c978a4ed6a90ecfcda2 ftp2.genewiz.com/DW1604132_R3/EPI-119_S31_L005_R2_001.fastq.gz\n", "6d8058be169234534288abbd36066dee ftp2.genewiz.com/DW1604132_R3/EPI-120_S32_L005_R1_001.fastq.gz\n", "3b2c02d050a9fc60a5b3894dcd8e5fd6 ftp2.genewiz.com/DW1604132_R3/EPI-120_S32_L005_R2_001.fastq.gz\n", "1849c59c8f788a87ce038b684b7f50b0 ftp2.genewiz.com/DW1604132_R3/EPI-127_S33_L005_R1_001.fastq.gz\n", "85b48a03757fbd3ae6738dce05cc62b8 ftp2.genewiz.com/DW1604132_R3/EPI-127_S33_L005_R2_001.fastq.gz\n", "e84ecd41bb4254fd6d0307b39c662c79 ftp2.genewiz.com/DW1604132_R3/EPI-128_S34_L005_R1_001.fastq.gz\n", "2256a06956601d00e8126f78c1ebd671 ftp2.genewiz.com/DW1604132_R3/EPI-128_S34_L005_R2_001.fastq.gz\n", "b8772ae3cfef50928fdc8551bba3da39 ftp2.genewiz.com/DW1604132_R3/EPI-135_S35_L005_R1_001.fastq.gz\n", "3a3da121e8871e57d7a09180831d2ffd ftp2.genewiz.com/DW1604132_R3/EPI-135_S35_L005_R2_001.fastq.gz\n", 
"a59b275111575bcf5907561dabf3872a ftp2.genewiz.com/DW1604132_R3/EPI-136_S36_L005_R1_001.fastq.gz\n", "b73945c56c67f79fb92c1e9283d90bdf ftp2.genewiz.com/DW1604132_R3/EPI-136_S36_L005_R2_001.fastq.gz\n", "c89f049090efd2a6825cedee82f0c643 ftp2.genewiz.com/DW1604132_R3/EPI-143_S37_L005_R1_001.fastq.gz\n", "01442d00a6aa4fb2bb83d1822124ee0d ftp2.genewiz.com/DW1604132_R3/EPI-143_S37_L005_R2_001.fastq.gz\n", "74461e7cfc369eef5f3350e033de8544 ftp2.genewiz.com/DW1604132_R3/EPI-145_S38_L005_R1_001.fastq.gz\n", "d7ec7b3c0766ea6bd16a992f275595cb ftp2.genewiz.com/DW1604132_R3/EPI-145_S38_L005_R2_001.fastq.gz\n" ] } ], "source": [ "%%bash\n", "grep EPI checksums.md5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Count the number of files downloaded" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "24\n" ] } ], "source": [ "%%bash\n", "ls -1 ftp2.genewiz.com/DW1604132_R3/ | wc -l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Count the number of checksums generated from files to verify that all the files were processed" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "24\n" ] } ], "source": [ "%%bash\n", "grep EPI checksums.md5 | wc -l" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 2 }