{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Installation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: If you are on Binder, you don't need to execute the following command." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "!pip install nexusforge[linking_sklearn]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os \n", "\n", "def full_path_relative_to_root(path): \n", " return os.path.join(os.path.abspath('../../..'), path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Configuration\n", "\n", "This notebook presents a set of configuation options to set up when creating a knowledge graph forge session. Refer to the [Nexus Forge docs](https://nexus-forge.readthedocs.io/en/latest/interaction.html#forge) to learn more about all the possible configuration options." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config = dict()" ] }, { "cell_type": "markdown", "metadata": { "toc-hr-collapsed": true, "toc-nb-collapsed": true }, "source": [ "## Configure for Demo\n", "This configuration is for testing Nexus Forge features without using or deploying a persistent store. Not all features are accessible with the demo configuration. The demo configuration is therefore not recommendeded for production use. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config['Model'] = {\n", " \"name\": \"DemoModel\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-model/\"),\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Store" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Store\"] = {\n", " \"name\": \"DemoStore\",\n", " \"model\": {\"name\": \"DemoModel\"},\n", " \"versioned_id_template\": \"{x.id}?_version={x._store_metadata.version}\"\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Resolvers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### sourced from a directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Resolvers\"] = {\n", " \"terms\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"sexontology\",\n", " \"bucket\": \"sex.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\"examples/configurations/demo-resolver/term-to-resource-mapping.hjson\")\n", " }\n", " ],\n", " \"entities\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"agents\",\n", " \"bucket\": \"agents.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\"examples/configurations/demo-resolver/entity-to-resource-mapping.hjson\")\n", " }\n", " ],\n", " \"schemaorg\": [\n", " {\n", " \"resolver\": \"EntityLinkerSkLearn from kgentitylinkingsklearn\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"examples/data/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"terms\",\n", " \"bucket\": \"tfidfvectorizer_model_schemaorg_linking\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\"examples/configurations/entitylinking-resolver/entitylinking-mapper.hjson\")\n", " }\n", " ]\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configure Nexus Forge to use with [Blue Brain Nexus Delta](https://bluebrainnexus.io/docs/delta/api/current/index.html) as a Store" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create a project to work with in the BlueBrainNexus sandbox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Get a token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [Nexus sandbox application](https://sandbox.bluebrainnexus.io/web) can be used to login and get a token.\n", "\n", "- Step 1: From the opened web page, click on the login button on the right corner and follow the instructions.\n", "\n", "![login-ui](https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/login-ui.png)\n", "\n", "- Step 2: At the end you’ll see a token button on the right corner. Click on it to copy the token.\n", "\n", "![login-ui](https://raw.githubusercontent.com/BlueBrain/nexus-forge/master/examples/notebooks/use-cases/copy-token.png)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import getpass\n", "\n", "token = getpass.getpass(prompt=\"Token\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Set a project name" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "endpoint = \"https://sandbox.bluebrainnexus.io/v1\"\n", "org =\"github-users\"\n", "\n", "project = getpass.getpass(prompt=\"Github username\") # Provide here the automatically created project name corresponding to your Github login when you logged in the Nexus sandbox instance." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### RDFModel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This model supports the W3C SHACL schema language. Let use examples of SHACL schemas from https://github.com/INCF/neuroshapes. SHACL schemas can be loaded either from a directory or from a store. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### sourced from BlueBrainNexus store" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "shacl_schema_bucket = \"neurosciencegraph/datamodels\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config['Model'] = {\n", " \"name\": \"RdfModel\",\n", " \"origin\": \"store\",\n", " \"source\": \"BlueBrainNexus\",\n", " \"context\": {\n", " \"iri\": \"https://bbp.neuroshapes.org\",\n", " \"bucket\": shacl_schema_bucket\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the following tutorials please keep the following Model configuration:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### sourced from a directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "neuroshapes_path = full_path_relative_to_root(\"examples/models/neuroshapes\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! rm -Rf $neuroshapes_path" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! git clone https://github.com/INCF/neuroshapes.git $neuroshapes_path" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/dataset $neuroshapes_path/shapes/neurosciencegraph/commons/\n", "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/activity $neuroshapes_path/shapes/neurosciencegraph/commons/\n", "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/entity $neuroshapes_path/shapes/neurosciencegraph/commons/\n", "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/ontology $neuroshapes_path/shapes/neurosciencegraph/commons/\n", "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/person $neuroshapes_path/shapes/neurosciencegraph/commons/\n", "! cp -R $neuroshapes_path/shapes/neurosciencegraph/datashapes/core/contribution $neuroshapes_path/shapes/neurosciencegraph/commons/" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config['Model'] = {\n", " \"name\": \"RdfModel\",\n", " \"origin\": \"directory\",\n", " \"source\": f\"{neuroshapes_path}/shapes/neurosciencegraph/commons/\",\n", " \"context\": {\n", " \"iri\": full_path_relative_to_root(\"examples/models/neuroshapes_context.json\")\n", " },\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Store" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Store\"] = {\n", " \"name\": \"BlueBrainNexus\",\n", " \"endpoint\": endpoint,\n", " \"model\": {\"name\": \"RdfModel\"}, \n", " \"searchendpoints\":{\n", " \"sparql\":{\n", " \"endpoint\":\"https://bluebrain.github.io/nexus/vocabulary/defaultSparqlIndex\"\n", " },\n", " \"elastic\":{\n", " \"endpoint\":\"https://bluebrain.github.io/nexus/vocabulary/defaultElasticSearchIndex\"\n", " }\n", " },\n", " \"bucket\": f\"{org}/{project}\",\n", " \"token\": token,\n", " \"vocabulary\":{\n", " \"metadata\":{\n", " \"iri\": \"https://bluebrain.github.io/nexus/contexts/metadata.json\",\n", " \"local_iri\": \"https://bluebrainnexus.io/contexts/metadata.json\"\n", " }, \n", " \"namespace\": \"https://bluebrain.github.io/nexus/vocabulary/\",\n", " \"deprecated_property\": \"https://bluebrain.github.io/nexus/vocabulary/deprecated\",\n", " \"project_property\": \"https://bluebrain.github.io/nexus/vocabulary/project\"\n", " },\n", " \"max_connection\": 50,\n", " \"versioned_id_template\": \"{x.id}?rev={x._store_metadata._rev}\",\n", " \"file_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/nexus-store/file-to-resource-mapping.hjson\"\n", " )\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Resolvers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### sourced from a store" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ontology_bucket = \"neurosciencegraph/datamodels\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Resolvers\"] = {\n", " \"terms\": [\n", " {\n", " \"resolver\": \"OntologyResolver\",\n", " \"origin\": \"store\",\n", " \"source\": \"BlueBrainNexus\",\n", " \"targets\": [\n", " {\n", " \"identifier\": \"sexontology\",\n", " \"bucket\": ontology_bucket\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/nexus-resolver/term-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"ontology\": [\n", " {\n", " \"resolver\": \"OntologyResolver\",\n", " \"origin\": \"store\",\n", " \"source\": \"BlueBrainNexus\",\n", " \"targets\": [\n", " {\n", " \"identifier\": \"cells\",\n", " \"bucket\": ontology_bucket\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/demo-resolver/term-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"entities\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"agents\",\n", " \"bucket\": \"agents.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/demo-resolver/entity-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"schemaorg\": [\n", " {\n", " \"resolver\": \"EntityLinkerSkLearn from kgentitylinkingsklearn\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"examples/data/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"terms\",\n", " \"bucket\": \"tfidfvectorizer_model_schemaorg_linking\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/entitylinking-resolver/entitylinking-mapper.hjson\"\n", " )\n", " }\n", " ]\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### sourced from a directory" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Resolvers\"] = {\n", " \"terms\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"sexontology\",\n", " \"bucket\": \"sex.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/demo-resolver/term-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"ontology\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"cells\",\n", " \"bucket\": \"cell_types.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/demo-resolver/term-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"entities\": [\n", " {\n", " \"resolver\": \"DemoResolver\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"tests/data/demo-resolver/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"agents\",\n", " \"bucket\": \"agents.json\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/demo-resolver/entity-to-resource-mapping.hjson\"\n", " )\n", " }\n", " ],\n", " \"schemaorg\": [\n", " {\n", " \"resolver\": \"EntityLinkerSkLearn from kgentitylinkingsklearn\",\n", " \"origin\": \"directory\",\n", " \"source\": full_path_relative_to_root(\"examples/data/\"),\n", " \"targets\": [\n", " {\n", " \"identifier\": \"terms\",\n", " \"bucket\": \"tfidfvectorizer_model_schemaorg_linking\"\n", " }\n", " ],\n", " \"result_resource_mapping\": full_path_relative_to_root(\n", " \"examples/configurations/entitylinking-resolver/entitylinking-mapper.hjson\"\n", " )\n", " }\n", " ]\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Configure formatters" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "config[\"Formatters\"] = {\n", " \"identifier\": \"https://kg.example.ch/{}/{}\",\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save configuration" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import yaml" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open(full_path_relative_to_root(\"examples/configurations/forge.yml\"), \"w\") as f:\n", " yaml.dump(config, f)" ] } ], "metadata": { "kernelspec": { "display_name": "my_env_notebook", "language": "python", "name": "my_env_notebook" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.17" }, "vscode": { "interpreter": { "hash": "9ac393a5ddd595f2c78ea58b15bf8d269850a4413729cbea5c5fae9013762763" } } }, "nbformat": 4, "nbformat_minor": 4 }