{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Fungal ITS QIIME analysis tutorial\n", "==================================\n", "\n", "In this tutorial we illustrate steps for analyzing fungal ITS amplicon data using the QIIME/UNITE reference OTUs (alpha version 12_11) to compare the composition of 9 soil communities using open-reference OTU picking. More recent ITS reference databases based on UNITE are available on the [QIIME resources page](http://qiime.org/home_static/dataFiles.html). The steps in this tutorial can be generalized to work with other marker genes, such as 18S.\n", "\n", "We recommend working through the [Illumina Overview Tutorial](http://qiime.org/tutorials/illumina_overview_tutorial.html) before working through this tutorial, as it provides more detailed annotation of the steps in a QIIME analysis. This tutorial is intended to highlight the differences that are necessary to work with a database other than QIIME's default reference database. For ITS, we won't build a phylogenetic tree and therefore use nonphylogenetic diversity metrics. Instructions are included for how to build a phylogenetic tree if you're sequencing a non-16S, phylogenetically-informative marker gene (e.g., 18S).\n", "\n", "First, we obtain the tutorial data and reference database:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!(wget ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz || curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its-soils-tutorial.tgz)\n", "!(wget ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz || curl -O ftp://ftp.microbio.me/qiime/tutorial_files/its_12_11_otus.tgz)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now unzip these files." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!tar -xzf its-soils-tutorial.tgz\n", "!tar -xzf its_12_11_otus.tgz\n", "!gunzip ./its_12_11_otus/rep_set/97_otus.fasta.gz\n", "!gunzip ./its_12_11_otus/taxonomy/97_otu_taxonomy.txt.gz" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can then view the files in each of these directories by passing the directory name to the ``FileLinks`` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from IPython.display import FileLink, FileLinks\n", "FileLinks('its-soils-tutorial')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ``params.txt`` file modifies some of the default parameters of this analysis. You can review those by clicking the link or by printing the file with ``cat``." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!cat its-soils-tutorial/params.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The parameters that differentiate ITS analysis from analysis of other amplicons are the two ``assign_taxonomy`` parameters, which point to the reference collection that we just downloaded." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're now ready to run the ``pick_open_reference_otus.py`` workflow. Discussion of these methods can be found in [Rideout et al. (2014)](https://peerj.com/articles/545/).\n", "\n", "Note that we pass `-r` to specify a non-default reference database. We're also passing `--suppress_align_and_tree` because trees generated from ITS sequences are generally not phylogenetically informative." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!pick_open_reference_otus.py -i its-soils-tutorial/seqs.fna -r its_12_11_otus/rep_set/97_otus.fasta -o otus/ -p its-soils-tutorial/params.txt --suppress_align_and_tree" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** If you would like to build a phylogenetic tree (e.g., if you're using a phylogenetically informative marker gene such as 18S instead of ITS), you should remove the ``--suppress_align_and_tree`` parameter from the above command and add the following lines to the parameters file:\n", "\n", "```\n", "align_seqs:template_fp \n", "filter_alignment:suppress_lane_mask_filter True\n", "filter_alignment:entropy_threshold 0.10\n", "```\n", "\n", "After that completes (it will take a few minutes) we'll have the OTU table with taxonomy. You can review all of the files that are created by passing the path to the `index.html` file in the output directory to the `FileLink` function." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "FileLink('otus/index.html')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can then pass the OTU table to ``biom summarize-table`` to view a summary of the information in the OTU table." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!biom summarize-table -i otus/otu_table_mc2_w_tax.biom" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we run several core diversity analyses, including alpha/beta diversity and taxonomic summarization. We will use an even sampling depth of 353 based on the results of `biom summarize-table` above. 
Since we did not build a phylogenetic tree, we'll pass the `--nonphylogenetic_diversity` flag, which tells the workflow to compute Bray-Curtis distances instead of UniFrac distances, and to use only nonphylogenetic alpha diversity metrics." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "!core_diversity_analyses.py -i otus/otu_table_mc2_w_tax.biom -o cdout/ -m its-soils-tutorial/map.txt -e 353 --nonphylogenetic_diversity" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may see a warning issued above; this is safe to ignore.\n", "\n", "**Note:** If you built a phylogenetic tree, you should pass the path to that tree via `-t` and not pass ``--nonphylogenetic_diversity``.\n", "\n", "You can view the output of `core_diversity_analyses.py` using `FileLink`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "FileLink('cdout/index.html')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Precomputed results\n", "\n", "In case you're having trouble running the steps above, for example because of a broken QIIME installation, all of the output generated above has been precomputed. You can access this by running the cell below." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "FileLinks(\"its-soils-tutorial/precomputed-output/\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.5" } }, "nbformat": 4, "nbformat_minor": 0 }