{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Searching and Downloading Data from the Blue Brain Knowledge Graph using the Knowledge Graph Forge" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialize and configure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get an authentication token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For now, the [Nexus web application](https://bbp.epfl.ch/nexus/web) can be used to get a token. We are looking for other simpler alternatives.\n", "\n", "- Step 1: From the opened web page, click on the login button on the right corner and follow the instructions.\n", "\n", "![login-ui](./login-ui.png)\n", "\n", "- Step 2: At the end you’ll see a token button on the right corner. Click on it to copy the token.\n", "\n", "![login-ui](./copy-token.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once a token is obtained then proceed to paste it below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import getpass" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "TOKEN = getpass.getpass()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure a client (forge) to access the knowledge graph " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from kgforge.core import KnowledgeGraphForge" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let target the sscx dissemination project in Nexus\n", "ORG = \"public\"\n", "PROJECT = \"sscx\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "forge = KnowledgeGraphForge(\"prod-forge-nexus.yml\",bucket=f\"{ORG}/{PROJECT}\",token=TOKEN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Search and Download" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "forge.types()" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For ontologies" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "from kgforge.core.commons.strategies import ResolvingStrategy\n", "text = \"somatosensory\"\n", "limit=10" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# other Search strategy can be ResolvingStrategy.BEST_MATCH, ResolvingStrategy.EXACT_MATCH\n", "brain_region = forge.resolve(text, scope=\"ontology\", target=\"terms\", strategy=ResolvingStrategy.ALL_MATCHES, limit=limit)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "forge.as_dataframe(brain_region).head(100)" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For Morphologies " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "_type = \"ReconstructedCell\"\n", "classification_type=\"nsg:MType\"\n", "mType=\"L4_NBC\"\n", "brainRegion = \"primary somatosensory cortex\"\n", "layer = \"layer 4\"\n", "encodingFormat=\"application/swc\"\n", "limit=2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "forge.template(\"Dataset\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run Query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = forge.paths(\"Dataset\") # to have autocompletion on the properties\n", "data = 
forge.search(path.type.id == _type,\n", "                    path.annotation.hasBody.type == classification_type,\n", "                    path.annotation.hasBody.label == mType,\n", "                    path.brainLocation.brainRegion.label == brainRegion,\n", "                    path.brainLocation.layer.label == layer,\n", "                    path.distribution.encodingFormat == encodingFormat,\n", "                    limit=limit)\n", "\n", "print(f\"{len(data)} dataset(s) of type '{_type}' found.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display the results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DISPLAY_LIMIT = 10\n", "reshaped_data = forge.reshape(data, keep=[\"id\", \"name\", \"subject\", \"brainLocation.brainRegion.id\", \"brainLocation.brainRegion.label\", \"brainLocation.layer.id\", \"brainLocation.layer.label\", \"contribution\", \"distribution.name\", \"distribution.contentUrl\", \"distribution.encodingFormat\"])\n", "\n", "forge.as_dataframe(reshaped_data[:DISPLAY_LIMIT])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Download" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirpath = \"./downloaded/\"\n", "forge.download(data, \"distribution.contentUrl\", dirpath)" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For Trace" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "_type = \"Trace\"\n", "classification_type = \"nsg:EType\"\n", "eType = \"cADpyr\"\n", "brainRegion = \"primary somatosensory cortex\"\n", "layer = \"layer 5\"\n", "encodingFormat = \"application/nwb\"\n", "limit = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run Query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "path = forge.paths(\"Dataset\") # to have autocompletion on the properties\n", "data = forge.search(path.type.id == _type,\n", " path.annotation.hasBody.type ==classification_type,\n", " path.annotation.hasBody.label ==eType,\n", " path.brainLocation.brainRegion.label == brainRegion,\n", " path.brainLocation.layer.label == layer,\n", " path.distribution.encodingFormat == encodingFormat,\n", " limit=limit)\n", "\n", "print(str(len(data))+\" data of type '\"+_type+\"' found.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display the results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DISPLAY_LIMIT = 10\n", "reshaped_data = forge.reshape(data, keep=[\"id\",\"name\",\"subject\",\"brainLocation.brainRegion.id\",\"brainLocation.brainRegion.label\",\"brainLocation.layer.id\",\"brainLocation.layer.label\", \"contribution\",\"brainLocation.layer.id\",\"brainLocation.layer.label\",\"distribution.name\",\"distribution.contentUrl\",\"distribution.encodingFormat\"])\n", "\n", "forge.as_dataframe(reshaped_data[:DISPLAY_LIMIT])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Dowload" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirpath = \"./downloaded/\"\n", "forge.download(data, \"distribution.contentUrl\", dirpath)" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For LayerThickness " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "_type = \"LayerThickness\"\n", "brainRegion = \"primary somatosensory cortex\"\n", "layer = \"layer 2\"\n", "encodingFormat=\"application/xlsx\"\n", "limit=10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run query" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = forge.paths(\"Dataset\")  # to have autocompletion on the properties\n", "data = forge.search(path.type.id == _type,\n", "                    path.brainLocation.layer.label == layer,\n", "                    path.brainLocation.brainRegion.label == brainRegion,\n", "                    path.distribution.encodingFormat == encodingFormat,\n", "                    limit=limit)\n", "\n", "print(f\"{len(data)} dataset(s) of type '{_type}' found.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DISPLAY_LIMIT = 10\n", "reshaped_data = forge.reshape(data, keep=[\"id\", \"name\", \"subject\", \"brainLocation.brainRegion.id\", \"brainLocation.brainRegion.label\", \"brainLocation.layer.id\", \"brainLocation.layer.label\", \"contribution\", \"distribution.name\", \"distribution.contentUrl\", \"distribution.encodingFormat\"])\n", "\n", "forge.as_dataframe(reshaped_data[:DISPLAY_LIMIT])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Download" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirpath = \"./downloaded/\"\n", "forge.download(data, \"distribution.contentUrl\", dirpath)" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For Neuron Density" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "_type = \"NeuronDensity\"\n", "brainRegion = \"primary somatosensory cortex\"\n", "layer = \"layer 2\"\n", "encodingFormat = \"application/xlsx\"\n", "limit = 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run query" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = forge.paths(\"Dataset\") # to have autocompletion on the properties\n", "data = forge.search(path.type.id == _type,\n", " path.brainLocation.layer.label == layer,\n", " path.brainLocation.brainRegion.label == brainRegion,\n", " path.distribution.encodingFormat == encodingFormat,\n", " limit=limit)\n", "\n", "print(str(len(data))+\" data of type '\"+_type+\"' found.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DISPLAY_LIMIT = 10\n", "reshaped_data = forge.reshape(data, keep=[\"id\",\"name\",\"subject\",\"brainLocation.brainRegion.id\",\"brainLocation.brainRegion.label\",\"brainLocation.layer.id\",\"brainLocation.layer.label\", \"contribution\",\"brainLocation.layer.id\",\"brainLocation.layer.label\",\"distribution.name\",\"distribution.contentUrl\",\"distribution.encodingFormat\"])\n", "\n", "forge.as_dataframe(reshaped_data[:DISPLAY_LIMIT])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Dowload" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirpath = \"./downloaded/\"\n", "forge.download(data, \"distribution.contentUrl\", dirpath)" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "### For Atlas Release" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let target the bbp/atlas project in Nexus\n", "\n", "forge_atlas = KnowledgeGraphForge(\"prod-forge-nexus.yml\", bucket=\"bbp/atlas\", token=TOKEN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Atlas related types:\n", " AtlasRelease\n", " CellPositions\n", " BrainParcellationDataLayer\n", " CellDensityDataLayer\n", " GeneExpressionVolumetricDataLayer\n", " GliaCellDensity\n", " NISSLImageDataLayer" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "#### Set filters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Supported filters for the time being are:\n", "_type = \"CellPositions\"\n", "limit=10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Run query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = forge_atlas.paths(\"Dataset\") # to have autocompletion on the properties\n", "data = forge_atlas.search(path.type.id == _type,\n", " limit=limit)\n", "\n", "print(str(len(data))+\" data of type '\"+_type+\"' found.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Display Results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DISPLAY_LIMIT = 10\n", "reshaped_data = forge_atlas.reshape(data, keep=[\"id\",\"name\",\"brainLocation.brainRegion.id\",\"brainLocation.brainRegion.label\", \"contribution\",\"distribution.name\",\"distribution.contentUrl\",\"distribution.encodingFormat\"])\n", "\n", "forge_atlas.as_dataframe(reshaped_data[:DISPLAY_LIMIT])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirpath = \"./downloaded/\"\n", "forge_atlas.download(data, \"distribution.contentUrl\", dirpath)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.10" } }, "nbformat": 4, "nbformat_minor": 4 }