{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Load ontology\n", "\n", "First load an ontology\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading a local ontology\n", "\n", "Assume we have an ontology on disk in obographs-json format, e.g. nucleus.json.\n", "\n", "First create an ontology *factory*, and create an ontology by passing the filename as the handle" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from ontobio.ontol_factory import OntologyFactory\n", "\n", "ofactory = OntologyFactory()\n", "ont = ofactory.create(\"nucleus.json\") ## interpreted as a json local handle" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can search the ontology by label;\n", "regexes can be used, but here we specify an exact match, so we expect one result:\n", " " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "[nucleus] = ont.search('nucleus')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Ancestor Queries" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'GO:0005575',\n", " 'GO:0005622',\n", " 'GO:0005623',\n", " 'GO:0043226',\n", " 'GO:0043227',\n", " 'GO:0043229',\n", " 'GO:0043231',\n", " 'GO:0044424',\n", " 'GO:0044464'}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ont.ancestors(nucleus)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[\"GO:0005575 'cellular_component'\",\n", " \"GO:0044464 'cell part'\",\n", " \"GO:0043229 'intracellular organelle'\",\n", " \"GO:0005622 'intracellular'\",\n", " \"GO:0043227 'membrane-bounded organelle'\",\n", " \"GO:0044424 'intracellular part'\",\n", " \"GO:0043231 'intracellular membrane-bounded organelle'\",\n", " \"GO:0043226 'organelle'\",\n", " \"GO:0005623 'cell'\"]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Extract labels for results\n", "[\"{} '{}'\".format(x, ont.label(x)) for x in ont.ancestors(nucleus)]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'GO:0005575',\n", " 'GO:0043226',\n", " 'GO:0043227',\n", " 'GO:0043229',\n", " 'GO:0043231',\n", " 'GO:0044424',\n", " 'GO:0044464'}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Filter ancestors based on relationship type\n", "ont.ancestors(nucleus, relations='subClassOf')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Node objects" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'http://purl.obolibrary.org/obo/GO_0005634',\n", " 'label': 'nucleus',\n", " 'lbl': 'nucleus',\n", " 'meta': {'basicPropertyValues': [{'pred': 'http://www.geneontology.org/formats/oboInOwl#hasOBONamespace',\n", " 'val': 'cellular_component'}],\n", " 'definition': {'val': \"A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell's chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent.\",\n", " 'xrefs': ['GOC:go_curators']},\n", " 'subsets': ['http://purl.obolibrary.org/obo/go-test#goslim_aspergillus',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_pir',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_candida',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_metagenomics',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_yeast',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_generic',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_plant',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_chembl'],\n", " 'synonyms': [{'pred': 'hasExactSynonym',\n", " 'val': 'cell nucleus',\n", " 'xrefs': []}],\n", " 'xrefs': [{'val': 'Wikipedia:Cell_nucleus'},\n", " {'val': 'NIF_Subcellular:sao1702920020'}]},\n", " 'type': 'CLASS'}" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## Fetch the nucleus node\n", "\n", "ont.node(nucleus)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the ontol object provides special convenience methods for accessing\n", "information in a node (see Synonyms section below). However, it is always possible to access\n", "the dictionary object for a node and to probe this directly, for example:\n" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[{'pred': 'hasExactSynonym', 'val': 'cell nucleus', 'xrefs': []}]" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ont.node(nucleus)['meta']['synonyms']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Synonyms Objects\n", "\n", "A synonym object includes the scope, type and any provenance for the synonym.\n", "See the notebook on lexical mapping for more details on ontology lexical elements" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[GO:0005575 \"cellular component\" hasExactSynonym None [],\n", " GO:0005575 \"subcellular entity\" hasRelatedSynonym None ['NIF_Subcellular:nlx_subcell_100315'],\n", " GO:0005575 \"cell or subcellular entity\" hasExactSynonym None []]" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ont.synonyms('GO:0005575')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Networkx Graphs\n", "\n", "The underlying representation of the ontology is a networkx graph. You can access a graph directly:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": true }, "outputs": [], "source": [ "graph = ont.get_graph()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "28" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## number of nodes in this test ontology\n", "len(graph)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'http://purl.obolibrary.org/obo/GO_0005634',\n", " 'label': 'nucleus',\n", " 'lbl': 'nucleus',\n", " 'meta': {'basicPropertyValues': [{'pred': 'http://www.geneontology.org/formats/oboInOwl#hasOBONamespace',\n", " 'val': 'cellular_component'}],\n", " 'definition': {'val': \"A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell's chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent.\",\n", " 'xrefs': ['GOC:go_curators']},\n", " 'subsets': ['http://purl.obolibrary.org/obo/go-test#goslim_aspergillus',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_pir',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_candida',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_metagenomics',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_yeast',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_generic',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_plant',\n", " 'http://purl.obolibrary.org/obo/go-test#goslim_chembl'],\n", " 'synonyms': [{'pred': 'hasExactSynonym',\n", " 'val': 'cell nucleus',\n", " 'xrefs': []}],\n", " 'xrefs': [{'val': 'Wikipedia:Cell_nucleus'},\n", " {'val': 'NIF_Subcellular:sao1702920020'}]},\n", " 'type': 'CLASS'}" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "graph.node[nucleus]" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import networkx\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/cjm/repos/ontobio/venv/lib/python3.5/site-packages/networkx/drawing/nx_pylab.py:126: MatplotlibDeprecationWarning: pyplot.hold is deprecated.\n", " Future behavior will be consistent with the long-time default:\n", " plot commands add elements without first clearing the\n", " Axes and/or Figure.\n", " b = plt.ishold()\n", "/Users/cjm/repos/ontobio/venv/lib/python3.5/site-packages/networkx/drawing/nx_pylab.py:138: MatplotlibDeprecationWarning: pyplot.hold is deprecated.\n", " Future behavior will be consistent with the long-time default:\n", " plot commands add elements without first clearing the\n", " Axes and/or Figure.\n", " plt.hold(b)\n", "/Users/cjm/repos/ontobio/venv/lib/python3.5/site-packages/matplotlib/__init__.py:917: UserWarning: axes.hold is deprecated. Please remove it from your matplotlibrc and/or style files.\n", " warnings.warn(self.msg_depr_set % key)\n", "/Users/cjm/repos/ontobio/venv/lib/python3.5/site-packages/matplotlib/rcsetup.py:152: UserWarning: axes.hold is deprecated, will be removed in 3.0\n", " warnings.warn(\"axes.hold is deprecated, will be removed in 3.0\")\n" ] } ], "source": [ "networkx.draw(graph)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "plt.savefig('foo.png')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }