{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Summarizing annotations to a term and descendants\n", "\n", "This notebook demonstrates summarizing annotation counts for a term and its descendants.\n", "\n", "An example use is a GO annotator exploring how to refactor a subtree of GO.\n", "\n", "Of course, if this were a regular task we would make a command-line or even a web interface,\n", "but keeping it as a notebook gives us some flexibility in the logic, and in any case it is intended largely\n", "as a demonstration." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### boilerplate\n", "\n", " * importing relevant ontobio libraries\n", " * setting up key objects" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "\n", "## Create an ontology factory in order to fetch GO\n", "from ontobio.ontol_factory import OntologyFactory\n", "ofactory = OntologyFactory()\n", "\n", "## GOLR queries\n", "from ontobio.golr.golr_query import GolrAssociationQuery\n", "\n", "## rendering ontologies\n", "from ontobio import GraphRenderer" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "## Load GO.
Note the first time this runs Jupyter will show '*' - be patient\n", "ont = ofactory.create(\"go\") " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Finding descendants\n", "\n", "Here we use the in-memory ontology object; no external service calls are made.\n", "\n", "Change the value of `term_id` to any term you like." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "term_id = \"GO:0009070\" ## serine family amino acid biosynthetic process\n", "## follow is-a (subClassOf) and part-of (BFO:0000050) relations; reflexive=True includes term_id itself\n", "descendants = ont.descendants(term_id, reflexive=True, relations=['subClassOf', 'BFO:0000050'])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['GO:0016260',\n", " 'GO:0004124',\n", " 'GO:0019343',\n", " 'GO:0019265',\n", " 'GO:0019345',\n", " 'GO:0006564',\n", " 'GO:0006545',\n", " 'GO:0071269',\n", " 'GO:0006535',\n", " 'GO:0009090',\n", " 'GO:0070179',\n", " 'GO:0019264',\n", " 'GO:0009070',\n", " 'GO:0019344']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "descendants" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### rendering subtrees\n", "\n", "We use the good old-fashioned tree renderer.\n", "\n", "(This doesn't scale well for lattice-like subontologies.)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "renderer = GraphRenderer.create('tree')" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ". GO:0009070 ! serine family amino acid biosynthetic process\n", " % GO:0006545 ! glycine biosynthetic process\n", " % GO:0019264 ! glycine biosynthetic process from serine\n", " % GO:0019265 ! glycine biosynthetic process, by transamination of glyoxylate\n", " % GO:0006564 ! L-serine biosynthetic process\n", " % GO:0019344 ! cysteine biosynthetic process\n", " % GO:0006535 ! 
cysteine biosynthetic process from serine\n", " % GO:0019343 ! cysteine biosynthetic process via cystathionine\n", " % GO:0019345 ! cysteine biosynthetic process via S-sulfo-L-cysteine\n", " < GO:0004124 ! cysteine synthase activity\n", " % GO:0009090 ! homoserine biosynthetic process\n", " % GO:0016260 ! selenocysteine biosynthetic process\n", " % GO:0070179 ! D-serine biosynthetic process\n", " % GO:0071269 ! L-homocysteine biosynthetic process\n", "\n", "\n" ] } ], "source": [ "print(renderer.render_subgraph(ont, nodes=descendants))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### summarizing annotations\n", "\n", "We write a short procedure that wraps a call to Golr and returns a summary dict.\n", "\n", "The dict is keyed by facet value (by default taxon, evidence, and assigned-by labels). We also include an entry for `ALL`.\n" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [], "source": [ "DEFAULT_FACET_FIELDS = ['taxon_subset_closure_label', 'evidence_label', 'assigned_by']\n", "def summarize(t: str,\n", "              evidence_closure='ECO:0000269',  ## currently unused; the fq below restricts to experimental evidence by label\n", "              facet_fields=None) -> dict:\n", "    \"\"\"\n", "    Summarize annotations to a term as a dict of counts keyed by facet value\n", "    \"\"\"\n", "    if facet_fields is None:\n", "        facet_fields = DEFAULT_FACET_FIELDS\n", "    q = GolrAssociationQuery(object=t, rows=0, object_category='function',\n", "                             fq={'evidence_closure_label': 'experimental evidence'},\n", "                             facet_fields=facet_fields)\n", "    #params = q.solr_params()\n", "    #print(params)\n", "    result = q.exec()\n", "    fc = result['facet_counts']\n", "    item = {'ALL': result['numFound']}  ## make sure this is the first entry\n", "    for ff in facet_fields:\n", "        if ff in fc:\n", "            item.update(fc[ff])\n", "    return item" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'ALL': 144, 'Eukaryota': 92, 'Bacteria': 52, 'Metazoa': 33, 'Fungi': 32, 'Escherichia coli K-12': 27, 'Viridiplantae': 23, 'Mammalia': 22, 
'Vertebrata <vertebrates>': 22, 'Arabidopsis thaliana': 21, 'Saccharomyces cerevisiae S288C': 17, 'Mycobacterium tuberculosis H37Rv': 11, 'Schizosaccharomyces pombe': 11, 'Homo sapiens': 8, 'Caenorhabditis elegans': 7, 'Mus musculus': 7, 'Rattus norvegicus': 7, 'Bacillus subtilis subsp. subtilis str. 168': 6, 'Pseudomonas aeruginosa PAO1': 4, 'Aspergillus nidulans FGSC A4': 3, 'Apis mellifera': 2, 'Leishmania major strain Friedlin': 2, 'Bombyx mori': 1, 'Candida albicans SC5314': 1, 'Dictyostelium discoideum': 1, 'Drosophila melanogaster': 1, 'direct assay evidence used in manual assertion': 81, 'mutant phenotype evidence used in manual assertion': 53, 'genetic interaction evidence used in manual assertion': 10, 'EcoCyc': 20, 'TAIR': 19, 'SGD': 17, 'UniProt': 16, 'PomBase': 11, 'MTBBASE': 10, 'EcoliWiki': 7, 'RGD': 7, 'WB': 7, 'MGI': 6, 'CAFA': 5, 'PseudoCAP': 5, 'BHF-UCL': 4, 'AspGD': 3, 'GeneDB': 3, 'CGD': 1, 'FlyBase': 1, 'GOC': 1, 'dictyBase': 1}\n" ] } ], "source": [ "print(summarize(term_id))" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "def summarize_set(ids, facet_fields=None) -> pd.DataFrame:\n", "    \"\"\"\n", "    Summarize annotations for a set of term ids, return a dataframe\n", "    \"\"\"\n", "    items = []\n", "    for id in ids:\n", "        item = {'id': id, 'name:': ont.label(id)}\n", "        for k,v in summarize(id, facet_fields=facet_fields).items():\n", "            item[k] = v\n", "        items.append(item)\n", "    df = pd.DataFrame(items).fillna(0)\n", "    # sort by total annotation count, largest first\n", "    df.sort_values('ALL', axis=0, ascending=False, inplace=True)\n", "    return df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summarize GO term and descendants\n", "\n", "More advanced visualizations are easy with plotly etc. 
We leave as an exercise to the reader...\n", "\n", "As an example, for the first query we bundle all facets (species, evidence, assigned by) together" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>name:</th>\n", " <th>ALL</th>\n", " <th>Bacteria</th>\n", " <th>Escherichia coli K-12</th>\n", " <th>Eukaryota</th>\n", " <th>Mammalia</th>\n", " <th>Metazoa</th>\n", " <th>Mus musculus</th>\n", " <th>Trypanosoma brucei brucei TREU927</th>\n", " <th>Vertebrata <vertebrates></th>\n", " <th>mutant phenotype evidence used in manual assertion</th>\n", " <th>direct assay evidence used in manual assertion</th>\n", " <th>EcoCyc</th>\n", " <th>GeneDB</th>\n", " <th>MGI</th>\n", " <th>Viridiplantae</th>\n", " <th>Arabidopsis thaliana</th>\n", " <th>Fungi</th>\n", " <th>Caenorhabditis elegans</th>\n", " <th>Schizosaccharomyces pombe</th>\n", " <th>Bacillus subtilis subsp. subtilis str. 
168</th>\n", " <th>Mycobacterium tuberculosis H37Rv</th>\n", " <th>Saccharomyces cerevisiae S288C</th>\n", " <th>Solanum tuberosum</th>\n", " <th>Spinacia oleracea</th>\n", " <th>Streptomyces lavendulae</th>\n", " <th>genetic interaction evidence used in manual assertion</th>\n", " <th>TAIR</th>\n", " <th>PomBase</th>\n", " <th>UniProt</th>\n", " <th>WB</th>\n", " <th>CAFA</th>\n", " <th>EcoliWiki</th>\n", " <th>MTBBASE</th>\n", " <th>SGD</th>\n", " <th>Aspergillus nidulans FGSC A4</th>\n", " <th>Apis mellifera</th>\n", " <th>Homo sapiens</th>\n", " <th>Leishmania major strain Friedlin</th>\n", " <th>AspGD</th>\n", " <th>BHF-UCL</th>\n", " <th>Rattus norvegicus</th>\n", " <th>RGD</th>\n", " <th>Pseudomonas aeruginosa PAO1</th>\n", " <th>Thermus thermophilus HB27</th>\n", " <th>PseudoCAP</th>\n", " <th>Bombyx mori</th>\n", " <th>Drosophila melanogaster</th>\n", " <th>FlyBase</th>\n", " <th>Lactobacillus casei</th>\n", " <th>Candida albicans SC5314</th>\n", " <th>CGD</th>\n", " <th>Dictyostelium discoideum</th>\n", " <th>dictyBase</th>\n", " <th>GOC</th>\n", " <th>Pseudomonas aeruginosa</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>12</th>\n", " <td>GO:0009070</td>\n", " <td>serine family amino acid biosynthetic process</td>\n", " <td>144</td>\n", " <td>52.0</td>\n", " <td>27.0</td>\n", " <td>92.0</td>\n", " <td>22.0</td>\n", " <td>33.0</td>\n", " <td>7.0</td>\n", " <td>0.0</td>\n", " <td>22.0</td>\n", " <td>53.0</td>\n", " <td>81.0</td>\n", " <td>20.0</td>\n", " <td>3.0</td>\n", " <td>6.0</td>\n", " <td>23.0</td>\n", " <td>21.0</td>\n", " <td>32.0</td>\n", " <td>7.0</td>\n", " <td>11.0</td>\n", " <td>6.0</td>\n", " <td>11.0</td>\n", " <td>17.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>10.0</td>\n", " <td>19.0</td>\n", " <td>11.0</td>\n", " <td>16.0</td>\n", " <td>7.0</td>\n", " <td>5.0</td>\n", " <td>7.0</td>\n", " <td>10.0</td>\n", " <td>17.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>8.0</td>\n", " 
<td>2.0</td>\n", " <td>3.0</td>\n", " <td>4.0</td>\n", " <td>7.0</td>\n", " <td>7.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>13</th>\n", " <td>GO:0019344</td>\n", " <td>cysteine biosynthetic process</td>\n", " <td>82</td>\n", " <td>25.0</td>\n", " <td>8.0</td>\n", " <td>57.0</td>\n", " <td>6.0</td>\n", " <td>15.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>30.0</td>\n", " <td>49.0</td>\n", " <td>6.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>21.0</td>\n", " <td>19.0</td>\n", " <td>19.0</td>\n", " <td>7.0</td>\n", " <td>10.0</td>\n", " <td>5.0</td>\n", " <td>8.0</td>\n", " <td>6.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>17.0</td>\n", " <td>10.0</td>\n", " <td>10.0</td>\n", " <td>7.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>7.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>GO:0004124</td>\n", " <td>cysteine synthase activity</td>\n", " <td>29</td>\n", " <td>7.0</td>\n", " <td>3.0</td>\n", " <td>22.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>21.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>13.0</td>\n", " <td>11.0</td>\n", " <td>5.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", 
" <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>10.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>GO:0006564</td>\n", " <td>L-serine biosynthetic process</td>\n", " <td>21</td>\n", " <td>16.0</td>\n", " <td>8.0</td>\n", " <td>5.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>9.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>GO:0006535</td>\n", " <td>cysteine biosynthetic process from serine</td>\n", " <td>20</td>\n", " <td>9.0</td>\n", " 
<td>4.0</td>\n", " <td>11.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>13.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>GO:0006545</td>\n", " <td>glycine biosynthetic process</td>\n", " <td>14</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>11.0</td>\n", " <td>7.0</td>\n", " <td>9.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>7.0</td>\n", " <td>2.0</td>\n", " <td>11.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>5.0</td>\n", " <td>5.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " 
<td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>GO:0019343</td>\n", " <td>cysteine biosynthetic process via cystathionine</td>\n", " <td>9</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>8.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>GO:0009090</td>\n", " <td>homoserine biosynthetic process</td>\n", " <td>8</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>10</th>\n", " <td>GO:0070179</td>\n", " <td>D-serine biosynthetic process</td>\n", " <td>8</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>8.0</td>\n", " <td>6.0</td>\n", " <td>6.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>0.0</td>\n", " <td>8.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>0</th>\n", " <td>GO:0016260</td>\n", " <td>selenocysteine biosynthetic process</td>\n", " <td>6</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " 
<td>4.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>11</th>\n", " <td>GO:0019264</td>\n", " <td>glycine biosynthetic process from serine</td>\n", " <td>6</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", 
" </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>GO:0019265</td>\n", " <td>glycine biosynthetic process, by transaminatio...</td>\n", " <td>4</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>GO:0071269</td>\n", " <td>L-homocysteine biosynthetic process</td>\n", " <td>1</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>GO:0019345</td>\n", " <td>cysteine biosynthetic process via S-sulfo-L-cy...</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " id name: ALL \\\n", "12 GO:0009070 serine family amino acid biosynthetic process 144 \n", "13 GO:0019344 cysteine biosynthetic process 82 \n", "1 GO:0004124 cysteine synthase activity 29 \n", "5 GO:0006564 L-serine biosynthetic process 21 \n", "8 GO:0006535 cysteine biosynthetic process from serine 20 \n", "6 GO:0006545 glycine biosynthetic process 14 \n", "2 GO:0019343 cysteine biosynthetic process via 
cystathionine 9 \n", "9 GO:0009090 homoserine biosynthetic process 8 \n", "10 GO:0070179 D-serine biosynthetic process 8 \n", "0 GO:0016260 selenocysteine biosynthetic process 6 \n", "11 GO:0019264 glycine biosynthetic process from serine 6 \n", "3 GO:0019265 glycine biosynthetic process, by transaminatio... 4 \n", "7 GO:0071269 L-homocysteine biosynthetic process 1 \n", "4 GO:0019345 cysteine biosynthetic process via S-sulfo-L-cy... 0 \n", "\n", " Bacteria Escherichia coli K-12 Eukaryota Mammalia Metazoa \\\n", "12 52.0 27.0 92.0 22.0 33.0 \n", "13 25.0 8.0 57.0 6.0 15.0 \n", "1 7.0 3.0 22.0 0.0 4.0 \n", "5 16.0 8.0 5.0 1.0 1.0 \n", "8 9.0 4.0 11.0 0.0 4.0 \n", "6 3.0 3.0 11.0 7.0 9.0 \n", "2 1.0 0.0 8.0 1.0 2.0 \n", "9 4.0 4.0 4.0 0.0 0.0 \n", "10 0.0 0.0 8.0 6.0 6.0 \n", "0 4.0 4.0 2.0 1.0 1.0 \n", "11 2.0 2.0 4.0 2.0 4.0 \n", "3 0.0 0.0 4.0 3.0 3.0 \n", "7 0.0 0.0 1.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Mus musculus Trypanosoma brucei brucei TREU927 Vertebrata <vertebrates> \\\n", "12 7.0 0.0 22.0 \n", "13 2.0 0.0 6.0 \n", "1 0.0 0.0 0.0 \n", "5 0.0 0.0 1.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 7.0 \n", "2 0.0 0.0 1.0 \n", "9 0.0 0.0 0.0 \n", "10 4.0 0.0 6.0 \n", "0 1.0 1.0 1.0 \n", "11 0.0 0.0 2.0 \n", "3 0.0 0.0 3.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " mutant phenotype evidence used in manual assertion \\\n", "12 53.0 \n", "13 30.0 \n", "1 6.0 \n", "5 9.0 \n", "8 6.0 \n", "6 2.0 \n", "2 6.0 \n", "9 4.0 \n", "10 0.0 \n", "0 4.0 \n", "11 2.0 \n", "3 0.0 \n", "7 0.0 \n", "4 0.0 \n", "\n", " direct assay evidence used in manual assertion EcoCyc GeneDB MGI \\\n", "12 81.0 20.0 3.0 6.0 \n", "13 49.0 6.0 2.0 1.0 \n", "1 21.0 2.0 0.0 0.0 \n", "5 6.0 3.0 0.0 0.0 \n", "8 13.0 3.0 1.0 0.0 \n", "6 11.0 3.0 0.0 0.0 \n", "2 3.0 0.0 1.0 0.0 \n", "9 4.0 4.0 0.0 0.0 \n", "10 8.0 0.0 0.0 4.0 \n", "0 2.0 4.0 1.0 1.0 \n", "11 4.0 2.0 0.0 0.0 \n", "3 4.0 0.0 0.0 0.0 \n", "7 1.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Viridiplantae Arabidopsis 
thaliana Fungi Caenorhabditis elegans \\\n", "12 23.0 21.0 32.0 7.0 \n", "13 21.0 19.0 19.0 7.0 \n", "1 13.0 11.0 5.0 4.0 \n", "5 1.0 1.0 3.0 0.0 \n", "8 0.0 0.0 6.0 3.0 \n", "6 0.0 0.0 2.0 0.0 \n", "2 0.0 0.0 5.0 0.0 \n", "9 0.0 0.0 4.0 0.0 \n", "10 1.0 1.0 0.0 0.0 \n", "0 0.0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 1.0 0.0 \n", "7 0.0 0.0 1.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Schizosaccharomyces pombe Bacillus subtilis subsp. subtilis str. 168 \\\n", "12 11.0 6.0 \n", "13 10.0 5.0 \n", "1 4.0 2.0 \n", "5 0.0 1.0 \n", "8 5.0 2.0 \n", "6 0.0 0.0 \n", "2 0.0 0.0 \n", "9 0.0 0.0 \n", "10 0.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 0.0 \n", "3 0.0 0.0 \n", "7 1.0 0.0 \n", "4 0.0 0.0 \n", "\n", " Mycobacterium tuberculosis H37Rv Saccharomyces cerevisiae S288C \\\n", "12 11.0 17.0 \n", "13 8.0 6.0 \n", "1 1.0 1.0 \n", "5 3.0 3.0 \n", "8 2.0 1.0 \n", "6 0.0 2.0 \n", "2 1.0 3.0 \n", "9 0.0 3.0 \n", "10 0.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 0.0 \n", "3 0.0 1.0 \n", "7 0.0 0.0 \n", "4 0.0 0.0 \n", "\n", " Solanum tuberosum Spinacia oleracea Streptomyces lavendulae \\\n", "12 0.0 0.0 0.0 \n", "13 1.0 1.0 0.0 \n", "1 1.0 1.0 1.0 \n", "5 0.0 0.0 0.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " genetic interaction evidence used in manual assertion TAIR PomBase \\\n", "12 10.0 19.0 11.0 \n", "13 3.0 17.0 10.0 \n", "1 2.0 10.0 4.0 \n", "5 6.0 1.0 0.0 \n", "8 1.0 0.0 5.0 \n", "6 1.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 1.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 1.0 \n", "4 0.0 0.0 0.0 \n", "\n", " UniProt WB CAFA EcoliWiki MTBBASE SGD \\\n", "12 16.0 7.0 5.0 7.0 10.0 17.0 \n", "13 10.0 7.0 5.0 2.0 7.0 6.0 \n", "1 4.0 4.0 2.0 1.0 1.0 1.0 \n", "5 2.0 0.0 0.0 5.0 3.0 3.0 \n", "8 2.0 3.0 2.0 1.0 2.0 1.0 \n", "6 2.0 0.0 0.0 0.0 0.0 2.0 \n", 
"2 1.0 0.0 0.0 0.0 1.0 3.0 \n", "9 0.0 0.0 0.0 0.0 0.0 3.0 \n", "10 2.0 0.0 0.0 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "11 1.0 0.0 0.0 0.0 0.0 0.0 \n", "3 1.0 0.0 0.0 0.0 0.0 1.0 \n", "7 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Aspergillus nidulans FGSC A4 Apis mellifera Homo sapiens \\\n", "12 3.0 2.0 8.0 \n", "13 3.0 2.0 3.0 \n", "1 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 \n", "8 0.0 1.0 0.0 \n", "6 0.0 0.0 2.0 \n", "2 2.0 1.0 1.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 2.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 2.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Leishmania major strain Friedlin AspGD BHF-UCL Rattus norvegicus RGD \\\n", "12 2.0 3.0 4.0 7.0 7.0 \n", "13 2.0 3.0 2.0 1.0 1.0 \n", "1 0.0 0.0 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 1.0 1.0 \n", "8 1.0 0.0 0.0 0.0 0.0 \n", "6 0.0 0.0 1.0 5.0 5.0 \n", "2 1.0 2.0 1.0 0.0 0.0 \n", "9 0.0 0.0 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 2.0 2.0 \n", "3 0.0 0.0 1.0 1.0 1.0 \n", "7 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Pseudomonas aeruginosa PAO1 Thermus thermophilus HB27 PseudoCAP \\\n", "12 4.0 0.0 5.0 \n", "13 1.0 0.0 2.0 \n", "1 0.0 0.0 0.0 \n", "5 3.0 1.0 3.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Bombyx mori Drosophila melanogaster FlyBase Lactobacillus casei \\\n", "12 1.0 1.0 1.0 0.0 \n", "13 0.0 0.0 0.0 1.0 \n", "1 0.0 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 0.0 \n", "8 0.0 0.0 0.0 1.0 \n", "6 1.0 1.0 1.0 0.0 \n", "2 0.0 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 0.0 \n", "11 1.0 1.0 1.0 0.0 \n", "3 0.0 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Candida albicans SC5314 CGD Dictyostelium discoideum dictyBase GOC \\\n", "12 1.0 1.0 1.0 1.0 1.0 \n", "13 0.0 0.0 
0.0 0.0 1.0 \n", "1 0.0 0.0 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 0.0 0.0 \n", "8 0.0 0.0 0.0 0.0 0.0 \n", "6 0.0 0.0 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 0.0 0.0 \n", "9 1.0 1.0 0.0 0.0 0.0 \n", "10 0.0 0.0 1.0 1.0 0.0 \n", "0 0.0 0.0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Pseudomonas aeruginosa \n", "12 0.0 \n", "13 1.0 \n", "1 0.0 \n", "5 0.0 \n", "8 0.0 \n", "6 0.0 \n", "2 0.0 \n", "9 0.0 \n", "10 0.0 \n", "0 0.0 \n", "11 0.0 \n", "3 0.0 \n", "7 0.0 \n", "4 0.0 " ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.options.display.max_columns = None\n", "df = summarize_set(descendants)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Summarize by assigned by\n", "\n", "use `assigned_by` facet" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>name:</th>\n", " <th>ALL</th>\n", " <th>EcoCyc</th>\n", " <th>GeneDB</th>\n", " <th>MGI</th>\n", " <th>TAIR</th>\n", " <th>PomBase</th>\n", " <th>UniProt</th>\n", " <th>WB</th>\n", " <th>CAFA</th>\n", " <th>EcoliWiki</th>\n", " <th>MTBBASE</th>\n", " <th>SGD</th>\n", " <th>AspGD</th>\n", " <th>BHF-UCL</th>\n", " <th>RGD</th>\n", " <th>PseudoCAP</th>\n", " <th>FlyBase</th>\n", " <th>CGD</th>\n", " <th>dictyBase</th>\n", " <th>GOC</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>12</th>\n", " <td>GO:0009070</td>\n", " <td>serine family amino acid biosynthetic process</td>\n", " <td>144</td>\n", " <td>20.0</td>\n", " 
<td>3.0</td>\n", " <td>6.0</td>\n", " <td>19.0</td>\n", " <td>11.0</td>\n", " <td>16.0</td>\n", " <td>7.0</td>\n", " <td>5.0</td>\n", " <td>7.0</td>\n", " <td>10.0</td>\n", " <td>17.0</td>\n", " <td>3.0</td>\n", " <td>4.0</td>\n", " <td>7.0</td>\n", " <td>5.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>13</th>\n", " <td>GO:0019344</td>\n", " <td>cysteine biosynthetic process</td>\n", " <td>82</td>\n", " <td>6.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>17.0</td>\n", " <td>10.0</td>\n", " <td>10.0</td>\n", " <td>7.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>7.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>GO:0004124</td>\n", " <td>cysteine synthase activity</td>\n", " <td>29</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>10.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>GO:0006564</td>\n", " <td>L-serine biosynthetic process</td>\n", " <td>21</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>GO:0006535</td>\n", " <td>cysteine biosynthetic process from serine</td>\n", " <td>20</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>GO:0006545</td>\n", " <td>glycine biosynthetic process</td>\n", " <td>14</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>5.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>GO:0019343</td>\n", " <td>cysteine biosynthetic process via cystathionine</td>\n", " <td>9</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>GO:0009090</td>\n", " <td>homoserine biosynthetic process</td>\n", " <td>8</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>10</th>\n", " <td>GO:0070179</td>\n", " <td>D-serine biosynthetic process</td>\n", " <td>8</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " 
<td>1.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>0</th>\n", " <td>GO:0016260</td>\n", " <td>selenocysteine biosynthetic process</td>\n", " <td>6</td>\n", " <td>4.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>11</th>\n", " <td>GO:0019264</td>\n", " <td>glycine biosynthetic process from serine</td>\n", " <td>6</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>GO:0019265</td>\n", " <td>glycine biosynthetic process, by transaminatio...</td>\n", " <td>4</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>GO:0071269</td>\n", " <td>L-homocysteine biosynthetic process</td>\n", " <td>1</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>GO:0019345</td>\n", " <td>cysteine biosynthetic process via S-sulfo-L-cy...</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " id name: ALL \\\n", "12 GO:0009070 serine family amino acid biosynthetic process 144 \n", "13 GO:0019344 cysteine biosynthetic process 82 \n", "1 GO:0004124 cysteine synthase activity 29 \n", "5 GO:0006564 L-serine biosynthetic process 21 \n", "8 GO:0006535 cysteine biosynthetic process from serine 20 \n", "6 GO:0006545 glycine biosynthetic process 14 \n", "2 GO:0019343 cysteine biosynthetic process via cystathionine 9 \n", "9 GO:0009090 homoserine biosynthetic process 8 \n", "10 GO:0070179 D-serine biosynthetic process 8 \n", "0 GO:0016260 selenocysteine biosynthetic process 6 \n", "11 GO:0019264 glycine biosynthetic process from serine 6 \n", "3 GO:0019265 glycine biosynthetic process, by transaminatio... 4 \n", "7 GO:0071269 L-homocysteine biosynthetic process 1 \n", "4 GO:0019345 cysteine biosynthetic process via S-sulfo-L-cy... 
0 \n", "\n", " EcoCyc GeneDB MGI TAIR PomBase UniProt WB CAFA EcoliWiki \\\n", "12 20.0 3.0 6.0 19.0 11.0 16.0 7.0 5.0 7.0 \n", "13 6.0 2.0 1.0 17.0 10.0 10.0 7.0 5.0 2.0 \n", "1 2.0 0.0 0.0 10.0 4.0 4.0 4.0 2.0 1.0 \n", "5 3.0 0.0 0.0 1.0 0.0 2.0 0.0 0.0 5.0 \n", "8 3.0 1.0 0.0 0.0 5.0 2.0 3.0 2.0 1.0 \n", "6 3.0 0.0 0.0 0.0 0.0 2.0 0.0 0.0 0.0 \n", "2 0.0 1.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 \n", "9 4.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "10 0.0 0.0 4.0 1.0 0.0 2.0 0.0 0.0 0.0 \n", "0 4.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "11 2.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " MTBBASE SGD AspGD BHF-UCL RGD PseudoCAP FlyBase CGD dictyBase \\\n", "12 10.0 17.0 3.0 4.0 7.0 5.0 1.0 1.0 1.0 \n", "13 7.0 6.0 3.0 2.0 1.0 2.0 0.0 0.0 0.0 \n", "1 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "5 3.0 3.0 0.0 0.0 1.0 3.0 0.0 0.0 0.0 \n", "8 2.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "6 0.0 2.0 0.0 1.0 5.0 0.0 1.0 0.0 0.0 \n", "2 1.0 3.0 2.0 1.0 0.0 0.0 0.0 0.0 0.0 \n", "9 0.0 3.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 \n", "10 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n", "0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 0.0 2.0 0.0 1.0 0.0 0.0 \n", "3 0.0 1.0 0.0 1.0 1.0 0.0 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", " GOC \n", "12 1.0 \n", "13 1.0 \n", "1 0.0 \n", "5 0.0 \n", "8 0.0 \n", "6 0.0 \n", "2 0.0 \n", "9 0.0 \n", "10 0.0 \n", "0 0.0 \n", "11 0.0 \n", "3 0.0 \n", "7 0.0 \n", "4 0.0 " ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "summarize_set(descendants, facet_fields=['assigned_by'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Summarize by species\n", "\n", "use `taxon_subset_closure_label` facet" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", 
"<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>id</th>\n", " <th>name:</th>\n", " <th>ALL</th>\n", " <th>Bacteria</th>\n", " <th>Escherichia coli K-12</th>\n", " <th>Eukaryota</th>\n", " <th>Mammalia</th>\n", " <th>Metazoa</th>\n", " <th>Mus musculus</th>\n", " <th>Trypanosoma brucei brucei TREU927</th>\n", " <th>Vertebrata <vertebrates></th>\n", " <th>Viridiplantae</th>\n", " <th>Arabidopsis thaliana</th>\n", " <th>Fungi</th>\n", " <th>Caenorhabditis elegans</th>\n", " <th>Schizosaccharomyces pombe</th>\n", " <th>Bacillus subtilis subsp. subtilis str. 168</th>\n", " <th>Mycobacterium tuberculosis H37Rv</th>\n", " <th>Saccharomyces cerevisiae S288C</th>\n", " <th>Solanum tuberosum</th>\n", " <th>Spinacia oleracea</th>\n", " <th>Streptomyces lavendulae</th>\n", " <th>Aspergillus nidulans FGSC A4</th>\n", " <th>Apis mellifera</th>\n", " <th>Homo sapiens</th>\n", " <th>Leishmania major strain Friedlin</th>\n", " <th>Rattus norvegicus</th>\n", " <th>Pseudomonas aeruginosa PAO1</th>\n", " <th>Thermus thermophilus HB27</th>\n", " <th>Bombyx mori</th>\n", " <th>Drosophila melanogaster</th>\n", " <th>Lactobacillus casei</th>\n", " <th>Candida albicans SC5314</th>\n", " <th>Dictyostelium discoideum</th>\n", " <th>Pseudomonas aeruginosa</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>12</th>\n", " <td>GO:0009070</td>\n", " <td>serine family amino acid biosynthetic process</td>\n", " <td>144</td>\n", " <td>52.0</td>\n", " <td>27.0</td>\n", " <td>92.0</td>\n", " <td>22.0</td>\n", " <td>33.0</td>\n", " <td>7.0</td>\n", " <td>0.0</td>\n", " <td>22.0</td>\n", " <td>23.0</td>\n", " <td>21.0</td>\n", " <td>32.0</td>\n", " <td>7.0</td>\n", 
" <td>11.0</td>\n", " <td>6.0</td>\n", " <td>11.0</td>\n", " <td>17.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>8.0</td>\n", " <td>2.0</td>\n", " <td>7.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>13</th>\n", " <td>GO:0019344</td>\n", " <td>cysteine biosynthetic process</td>\n", " <td>82</td>\n", " <td>25.0</td>\n", " <td>8.0</td>\n", " <td>57.0</td>\n", " <td>6.0</td>\n", " <td>15.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>21.0</td>\n", " <td>19.0</td>\n", " <td>19.0</td>\n", " <td>7.0</td>\n", " <td>10.0</td>\n", " <td>5.0</td>\n", " <td>8.0</td>\n", " <td>6.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>3.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>GO:0004124</td>\n", " <td>cysteine synthase activity</td>\n", " <td>29</td>\n", " <td>7.0</td>\n", " <td>3.0</td>\n", " <td>22.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>13.0</td>\n", " <td>11.0</td>\n", " <td>5.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>GO:0006564</td>\n", " <td>L-serine biosynthetic process</td>\n", " <td>21</td>\n", " 
<td>16.0</td>\n", " <td>8.0</td>\n", " <td>5.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>8</th>\n", " <td>GO:0006535</td>\n", " <td>cysteine biosynthetic process from serine</td>\n", " <td>20</td>\n", " <td>9.0</td>\n", " <td>4.0</td>\n", " <td>11.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>3.0</td>\n", " <td>5.0</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>GO:0006545</td>\n", " <td>glycine biosynthetic process</td>\n", " <td>14</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>11.0</td>\n", " <td>7.0</td>\n", " <td>9.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>7.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", 
" <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>GO:0019343</td>\n", " <td>cysteine biosynthetic process via cystathionine</td>\n", " <td>9</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>8.0</td>\n", " <td>1.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>5.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>9</th>\n", " <td>GO:0009090</td>\n", " <td>homoserine biosynthetic process</td>\n", " <td>8</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>10</th>\n", " <td>GO:0070179</td>\n", " <td>D-serine biosynthetic process</td>\n", " <td>8</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>8.0</td>\n", " <td>6.0</td>\n", " <td>6.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>6.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>0</th>\n", " <td>GO:0016260</td>\n", " <td>selenocysteine biosynthetic process</td>\n", " <td>6</td>\n", " <td>4.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>11</th>\n", " <td>GO:0019264</td>\n", " <td>glycine biosynthetic process from serine</td>\n", " <td>6</td>\n", " <td>2.0</td>\n", " <td>2.0</td>\n", " <td>4.0</td>\n", " <td>2.0</td>\n", " <td>4.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>GO:0019265</td>\n", " <td>glycine biosynthetic process, by transaminatio...</td>\n", " <td>4</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>4.0</td>\n", " <td>3.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>3.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>2.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>GO:0071269</td>\n", " <td>L-homocysteine biosynthetic process</td>\n", " <td>1</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>GO:0019345</td>\n", " <td>cysteine biosynthetic process via S-sulfo-L-cy...</td>\n", " <td>0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " id name: ALL \\\n", "12 GO:0009070 serine family amino acid biosynthetic process 144 \n", "13 GO:0019344 cysteine biosynthetic process 82 \n", "1 GO:0004124 cysteine synthase activity 29 \n", "5 GO:0006564 L-serine biosynthetic process 21 \n", "8 GO:0006535 cysteine biosynthetic process from serine 20 \n", "6 GO:0006545 glycine biosynthetic process 14 \n", "2 GO:0019343 cysteine biosynthetic process via cystathionine 9 \n", "9 GO:0009090 homoserine biosynthetic process 8 \n", "10 GO:0070179 D-serine biosynthetic process 8 \n", "0 GO:0016260 selenocysteine biosynthetic process 6 \n", "11 GO:0019264 glycine biosynthetic process from serine 6 \n", "3 GO:0019265 glycine biosynthetic process, by transaminatio... 4 \n", "7 GO:0071269 L-homocysteine biosynthetic process 1 \n", "4 GO:0019345 cysteine biosynthetic process via S-sulfo-L-cy... 
0 \n", "\n", " Bacteria Escherichia coli K-12 Eukaryota Mammalia Metazoa \\\n", "12 52.0 27.0 92.0 22.0 33.0 \n", "13 25.0 8.0 57.0 6.0 15.0 \n", "1 7.0 3.0 22.0 0.0 4.0 \n", "5 16.0 8.0 5.0 1.0 1.0 \n", "8 9.0 4.0 11.0 0.0 4.0 \n", "6 3.0 3.0 11.0 7.0 9.0 \n", "2 1.0 0.0 8.0 1.0 2.0 \n", "9 4.0 4.0 4.0 0.0 0.0 \n", "10 0.0 0.0 8.0 6.0 6.0 \n", "0 4.0 4.0 2.0 1.0 1.0 \n", "11 2.0 2.0 4.0 2.0 4.0 \n", "3 0.0 0.0 4.0 3.0 3.0 \n", "7 0.0 0.0 1.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " Mus musculus Trypanosoma brucei brucei TREU927 Vertebrata <vertebrates> \\\n", "12 7.0 0.0 22.0 \n", "13 2.0 0.0 6.0 \n", "1 0.0 0.0 0.0 \n", "5 0.0 0.0 1.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 7.0 \n", "2 0.0 0.0 1.0 \n", "9 0.0 0.0 0.0 \n", "10 4.0 0.0 6.0 \n", "0 1.0 1.0 1.0 \n", "11 0.0 0.0 2.0 \n", "3 0.0 0.0 3.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Viridiplantae Arabidopsis thaliana Fungi Caenorhabditis elegans \\\n", "12 23.0 21.0 32.0 7.0 \n", "13 21.0 19.0 19.0 7.0 \n", "1 13.0 11.0 5.0 4.0 \n", "5 1.0 1.0 3.0 0.0 \n", "8 0.0 0.0 6.0 3.0 \n", "6 0.0 0.0 2.0 0.0 \n", "2 0.0 0.0 5.0 0.0 \n", "9 0.0 0.0 4.0 0.0 \n", "10 1.0 1.0 0.0 0.0 \n", "0 0.0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 0.0 \n", "3 0.0 0.0 1.0 0.0 \n", "7 0.0 0.0 1.0 0.0 \n", "4 0.0 0.0 0.0 0.0 \n", "\n", " Schizosaccharomyces pombe Bacillus subtilis subsp. subtilis str. 
168 \\\n", "12 11.0 6.0 \n", "13 10.0 5.0 \n", "1 4.0 2.0 \n", "5 0.0 1.0 \n", "8 5.0 2.0 \n", "6 0.0 0.0 \n", "2 0.0 0.0 \n", "9 0.0 0.0 \n", "10 0.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 0.0 \n", "3 0.0 0.0 \n", "7 1.0 0.0 \n", "4 0.0 0.0 \n", "\n", " Mycobacterium tuberculosis H37Rv Saccharomyces cerevisiae S288C \\\n", "12 11.0 17.0 \n", "13 8.0 6.0 \n", "1 1.0 1.0 \n", "5 3.0 3.0 \n", "8 2.0 1.0 \n", "6 0.0 2.0 \n", "2 1.0 3.0 \n", "9 0.0 3.0 \n", "10 0.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 0.0 \n", "3 0.0 1.0 \n", "7 0.0 0.0 \n", "4 0.0 0.0 \n", "\n", " Solanum tuberosum Spinacia oleracea Streptomyces lavendulae \\\n", "12 0.0 0.0 0.0 \n", "13 1.0 1.0 0.0 \n", "1 1.0 1.0 1.0 \n", "5 0.0 0.0 0.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Aspergillus nidulans FGSC A4 Apis mellifera Homo sapiens \\\n", "12 3.0 2.0 8.0 \n", "13 3.0 2.0 3.0 \n", "1 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 \n", "8 0.0 1.0 0.0 \n", "6 0.0 0.0 2.0 \n", "2 2.0 1.0 1.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 2.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 0.0 \n", "3 0.0 0.0 2.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Leishmania major strain Friedlin Rattus norvegicus \\\n", "12 2.0 7.0 \n", "13 2.0 1.0 \n", "1 0.0 0.0 \n", "5 0.0 1.0 \n", "8 1.0 0.0 \n", "6 0.0 5.0 \n", "2 1.0 0.0 \n", "9 0.0 0.0 \n", "10 0.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 2.0 \n", "3 0.0 1.0 \n", "7 0.0 0.0 \n", "4 0.0 0.0 \n", "\n", " Pseudomonas aeruginosa PAO1 Thermus thermophilus HB27 Bombyx mori \\\n", "12 4.0 0.0 1.0 \n", "13 1.0 0.0 0.0 \n", "1 0.0 0.0 0.0 \n", "5 3.0 1.0 0.0 \n", "8 0.0 0.0 0.0 \n", "6 0.0 0.0 1.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 0.0 \n", "10 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 0.0 0.0 1.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Drosophila melanogaster Lactobacillus casei Candida albicans SC5314 
\\\n", "12 1.0 0.0 1.0 \n", "13 0.0 1.0 0.0 \n", "1 0.0 0.0 0.0 \n", "5 0.0 0.0 0.0 \n", "8 0.0 1.0 0.0 \n", "6 1.0 0.0 0.0 \n", "2 0.0 0.0 0.0 \n", "9 0.0 0.0 1.0 \n", "10 0.0 0.0 0.0 \n", "0 0.0 0.0 0.0 \n", "11 1.0 0.0 0.0 \n", "3 0.0 0.0 0.0 \n", "7 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 \n", "\n", " Dictyostelium discoideum Pseudomonas aeruginosa \n", "12 1.0 0.0 \n", "13 0.0 1.0 \n", "1 0.0 0.0 \n", "5 0.0 0.0 \n", "8 0.0 0.0 \n", "6 0.0 0.0 \n", "2 0.0 0.0 \n", "9 0.0 0.0 \n", "10 1.0 0.0 \n", "0 0.0 0.0 \n", "11 0.0 0.0 \n", "3 0.0 0.0 \n", "7 0.0 0.0 \n", "4 0.0 0.0 " ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "summarize_set(descendants, facet_fields=['taxon_subset_closure_label'])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }