{ "cells": [ { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:19.036357Z", "start_time": "2019-09-23T18:50:19.031896Z" } }, "source": [ "# Querying\n", "\n", "This notebook demonstrates Nexus Forge data [querying features](https://nexus-forge.readthedocs.io/en/latest/interaction.html#querying)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.068658Z", "start_time": "2019-09-23T18:50:19.054054Z" } }, "outputs": [], "source": [ "from kgforge.core import KnowledgeGraphForge" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A configuration file is needed in order to create a KnowledgeGraphForge session. A configuration can be generated using the notebook [00-Initialization.ipynb](00%20-%20Initialization.ipynb)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "forge = KnowledgeGraphForge(\"../../configurations/forge.yml\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from kgforge.core import Resource\n", "from kgforge.specializations.resources import Dataset\n", "from kgforge.core.wrappings.paths import Filter, FilterOperator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Retrieval" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### latest version" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\", award=[\"Nobel\"])" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(jane)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "resource = forge.retrieve(jane.id)" ] }, { 
"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource == jane" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### specific version" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\", award=[\"Nobel\"])" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(jane)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _tag_one\n", " True\n" ] } ], "source": [ "forge.tag(jane, \"v1\")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "jane.email = [\"jane.doe@epfl.ch\", \"jane.doe@example.org\"]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _update_one\n", " True\n" ] } ], "source": [ "forge.update(jane)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:21.317601Z", "start_time": "2019-09-23T18:50:21.310418Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3\n" ] } ], "source": [ "try:\n", " # DemoStore\n", " print(jane._store_metadata.version)\n", "except:\n", " # BlueBrainNexus\n", " print(jane._store_metadata._rev)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:21.332678Z", "start_time": "2019-09-23T18:50:21.322025Z" } }, "outputs": [], "source": [ "jane_v1 = forge.retrieve(jane.id, version=1)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": 
"2019-09-23T18:50:21.370051Z", "start_time": "2019-09-23T18:50:21.363782Z" } }, "outputs": [], "source": [ "jane_v1_tag = forge.retrieve(jane.id, version=\"v1\")" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "jane_v1_rev = forge.retrieve(jane.id+\"?rev=1\")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:21.379911Z", "start_time": "2019-09-23T18:50:21.373539Z" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane_v1 == jane_v1_tag" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane_v1 == jane_v1_rev" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane_v1 != jane" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n" ] } ], "source": [ "try:\n", " # DemoStore\n", " print(jane_v1._store_metadata.version)\n", "except:\n", " # BlueBrainNexus\n", " print(jane_v1._store_metadata._rev)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### cross-bucket retrieval\n", "It is possible to retrieve resources stored in buckets other than the configured one, provided the configured store supports it." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "resource = forge.retrieve(jane.id, cross_bucket=True) # cross_bucket defaults to False" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99',\n", " '_constrainedBy': 'https://bluebrain.github.io/nexus/schemas/unconstrained.json',\n", " '_createdAt': '2022-04-12T21:29:14.410Z',\n", " '_createdBy': 'https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy',\n", " '_deprecated': False,\n", " '_incoming': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99/incoming',\n", " '_outgoing': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99/outgoing',\n", " '_project': 'https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge',\n", " '_rev': 3,\n", " '_schemaProject': 'https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge',\n", " '_self': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99',\n", " '_updatedAt': '2022-04-12T21:29:21.465Z',\n", " '_updatedBy': 'https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy'}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._store_metadata" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Action(error=None, message=None, operation='retrieve', succeeded=True)" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._last_action" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._synchronized" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Original source 
retrieval\n", "When using BlueBrainNexusStore, a resource's payload can be retrieved exactly as it was registered (`retrieve_source=True`), without store-added metadata or the changes introduced by JSON-LD framing. The example below passes `retrieve_source=False` for comparison." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "resource = forge.retrieve(jane.id, retrieve_source=False) # retrieve_source defaults to True" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99',\n", " 'type': 'Person',\n", " 'award': 'Nobel',\n", " 'email': ['jane.doe@epfl.ch', 'jane.doe@example.org'],\n", " 'name': 'Jane Doe'}" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_json(resource)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99',\n", " '_constrainedBy': 'https://bluebrain.github.io/nexus/schemas/unconstrained.json',\n", " '_createdAt': '2022-04-12T21:29:14.410Z',\n", " '_createdBy': 'https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy',\n", " '_deprecated': False,\n", " '_incoming': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99/incoming',\n", " '_outgoing': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99/outgoing',\n", " '_project': 'https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge',\n", " '_rev': 3,\n", " '_schemaProject': 'https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge',\n", " '_self': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/0ee3e0d7-84f5-424d-937a-76dc7a5d7a99',\n", " '_updatedAt': '2022-04-12T21:29:21.465Z',\n", " '_updatedBy': 'https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy'}" ] }, "execution_count": 27, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "resource._store_metadata" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Action(error=None, message=None, operation='retrieve', succeeded=True)" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._last_action" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._synchronized" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### error handling" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " retrieve\n", " RetrievalError: 404 Client Error: Not Found for url: https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/%3A%2F%2F123/source\n", "\n" ] } ], "source": [ "resource = forge.retrieve(\"123\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource is None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Searching" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoModel and RdfModel schemas have not been synchronized yet. This section is to be run with RdfModel. Commented lines are for DemoModel." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\")\n", "contribution_jane = Resource(type=\"Contribution\", agent=jane)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "john = Resource(type=\"Person\", name=\"John Smith\")\n", "contribution_john = Resource(type=\"Contribution\", agent=john)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "dataset = Dataset(forge, type=\"Dataset\", contribution=[contribution_jane, contribution_john])\n", "dataset.add_distribution(\"../../data/associations.tsv\")" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(dataset)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': 'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/69ff5c8c-098b-4440-8387-367bf9cfa9cc',\n", " 'type': 'Dataset',\n", " 'contribution': [{'type': 'Contribution',\n", " 'agent': {'type': 'Person', 'name': 'Jane Doe'}},\n", " {'type': 'Contribution', 'agent': {'type': 'Person', 'name': 'John Smith'}}],\n", " 'distribution': {'type': 'DataDownload',\n", " 'atLocation': {'type': 'Location',\n", " 'store': {'id': 'https://bluebrain.github.io/nexus/vocabulary/diskStorageDefault',\n", " 'type': 'DiskStorage',\n", " '_rev': 1}},\n", " 'contentSize': {'unitCode': 'bytes', 'value': 477},\n", " 'contentUrl': 'https://bbp.epfl.ch/nexus/v1/files/dke/kgforge/af70dd9d-5161-49a4-a6ae-ef247f233694',\n", " 'digest': {'algorithm': 'SHA-256',\n", " 'value': '789aa07948683fe036ac29811814a826b703b562f7d168eb70dee1fabde26859'},\n", " 'encodingFormat': 'text/tab-separated-values',\n", " 'name': 'associations.tsv'}}" ] }, "execution_count": 54, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "forge.as_json(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using resource paths as filters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `paths` method loads the template or property paths (i.e. the expected properties) for a given type.\n", "\n", "Please refer to the [Modeling.ipynb](11%20-%20Modeling.ipynb) notebook to learn about templates and types." ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "p = forge.paths(\"Dataset\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Autocompletion is enabled on `p` and this can be used to create search filters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: There is a known issue for RdfModel which requires using `p.type.id` instead of `p.type`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All [Python comparison operators](https://www.w3schools.com/python/gloss_python_comparison_operators.asp) are supported." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "resources = forge.search(p.type.id==\"Person\", limit=3)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>type</th><th>email</th><th>name</th><th>address.type</th><th>address.country</th><th>address.locality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>john.smith@epfl.ch</td><td>John Smith</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>(missing)</td><td>Jane Doe</td><td>PostalAddress</td><td>Switzerland</td><td>Geneva</td></tr>\n",
"    <tr><th>2</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>john.smith@epfl.ch</td><td>John Smith</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " id type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "\n", " email name address.type address.country \\\n", "0 john.smith@epfl.ch John Smith NaN NaN \n", "1 (missing) Jane Doe PostalAddress Switzerland \n", "2 john.smith@epfl.ch John Smith NaN NaN \n", "\n", " address.locality \n", "0 NaN \n", "1 Geneva \n", "2 NaN " ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>type</th><th>email</th><th>name</th><th>_constrainedBy</th><th>_createdAt</th><th>_createdBy</th><th>_deprecated</th><th>_incoming</th><th>_outgoing</th><th>_project</th><th>_rev</th><th>_schemaProject</th><th>_self</th><th>_updatedAt</th><th>_updatedBy</th><th>address.type</th><th>address.country</th><th>address.locality</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>john.smith@epfl.ch</td><td>John Smith</td><td>https://bluebrain.github.io/nexus/schemas/unco...</td><td>2021-05-07T07:46:04.511Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>False</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>1</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>2021-05-07T07:46:04.511Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>1</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>(missing)</td><td>Jane Doe</td><td>https://bluebrain.github.io/nexus/schemas/unco...</td><td>2021-05-07T07:46:04.513Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>False</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>1</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>2021-05-07T07:46:04.513Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>PostalAddress</td><td>Switzerland</td><td>Geneva</td></tr>\n",
"    <tr><th>2</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Person</td><td>john.smith@epfl.ch</td><td>John Smith</td><td>https://bluebrain.github.io/nexus/schemas/unco...</td><td>2021-05-07T07:47:26.453Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>False</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>1</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>2021-05-07T07:47:26.453Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " id type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person \n", "\n", " email name \\\n", "0 john.smith@epfl.ch John Smith \n", "1 (missing) Jane Doe \n", "2 john.smith@epfl.ch John Smith \n", "\n", " _constrainedBy \\\n", "0 https://bluebrain.github.io/nexus/schemas/unco... \n", "1 https://bluebrain.github.io/nexus/schemas/unco... \n", "2 https://bluebrain.github.io/nexus/schemas/unco... \n", "\n", " _createdAt _createdBy \\\n", "0 2021-05-07T07:46:04.511Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "1 2021-05-07T07:46:04.513Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "2 2021-05-07T07:47:26.453Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", " _deprecated _incoming \\\n", "0 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _outgoing \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _project _rev \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "\n", " _schemaProject \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "\n", " _self \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... 
\n", "\n", " _updatedAt _updatedBy \\\n", "0 2021-05-07T07:46:04.511Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "1 2021-05-07T07:46:04.513Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "2 2021-05-07T07:47:26.453Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", " address.type address.country address.locality \n", "0 NaN NaN NaN \n", "1 PostalAddress Switzerland Geneva \n", "2 NaN NaN NaN " ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources, store_metadata=True)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Search results are not synchronized\n", "resources[0]._synchronized" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Using nested resource property" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Property autocompletion is available on a path `p` even for nested properties like `p.contribution`." ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "# Search for resources of type Dataset and with text/tab-separated-values as distribution.encodingFormat\n", "resources = forge.search(p.type.id == \"Dataset\", p.distribution.encodingFormat == \"text/tab-separated-values\", limit=3)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>type</th><th>contribution</th><th>distribution.type</th><th>distribution.atLocation.type</th><th>distribution.atLocation.store.id</th><th>distribution.contentSize.unitCode</th><th>distribution.contentSize.value</th><th>distribution.contentUrl</th><th>distribution.digest.algorithm</th><th>distribution.digest.value</th><th>distribution.encodingFormat</th><th>distribution.name</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Dataset</td><td>[{'type': 'Contribution', 'agent': {'type': 'P...</td><td>DataDownload</td><td>Location</td><td>https://bluebrain.github.io/nexus/vocabulary/d...</td><td>bytes</td><td>506</td><td>https://bbp.epfl.ch/nexus/v1/files/dke/kgforge...</td><td>SHA-256</td><td>9639abc864e91c645779f510ae5c06a1618941d569eb1a...</td><td>text/tab-separated-values</td><td>associations.tsv</td></tr>\n",
"    <tr><th>1</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Dataset</td><td>[{'type': 'Contribution', 'agent': {'type': 'P...</td><td>DataDownload</td><td>Location</td><td>https://bluebrain.github.io/nexus/vocabulary/d...</td><td>bytes</td><td>506</td><td>https://bbp.epfl.ch/nexus/v1/files/dke/kgforge...</td><td>SHA-256</td><td>9639abc864e91c645779f510ae5c06a1618941d569eb1a...</td><td>text/tab-separated-values</td><td>associations.tsv</td></tr>\n",
"    <tr><th>2</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Dataset</td><td>NaN</td><td>DataDownload</td><td>Location</td><td>https://bluebrain.github.io/nexus/vocabulary/d...</td><td>bytes</td><td>506</td><td>https://bbp.epfl.ch/nexus/v1/files/dke/kgforge...</td><td>SHA-256</td><td>9639abc864e91c645779f510ae5c06a1618941d569eb1a...</td><td>text/tab-separated-values</td><td>associations.tsv</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " id type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "\n", " contribution distribution.type \\\n", "0 [{'type': 'Contribution', 'agent': {'type': 'P... DataDownload \n", "1 [{'type': 'Contribution', 'agent': {'type': 'P... DataDownload \n", "2 NaN DataDownload \n", "\n", " distribution.atLocation.type \\\n", "0 Location \n", "1 Location \n", "2 Location \n", "\n", " distribution.atLocation.store.id \\\n", "0 https://bluebrain.github.io/nexus/vocabulary/d... \n", "1 https://bluebrain.github.io/nexus/vocabulary/d... \n", "2 https://bluebrain.github.io/nexus/vocabulary/d... \n", "\n", " distribution.contentSize.unitCode distribution.contentSize.value \\\n", "0 bytes 506 \n", "1 bytes 506 \n", "2 bytes 506 \n", "\n", " distribution.contentUrl \\\n", "0 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "1 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "2 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "\n", " distribution.digest.algorithm \\\n", "0 SHA-256 \n", "1 SHA-256 \n", "2 SHA-256 \n", "\n", " distribution.digest.value \\\n", "0 9639abc864e91c645779f510ae5c06a1618941d569eb1a... \n", "1 9639abc864e91c645779f510ae5c06a1618941d569eb1a... \n", "2 9639abc864e91c645779f510ae5c06a1618941d569eb1a... 
\n", "\n", " distribution.encodingFormat distribution.name \n", "0 text/tab-separated-values associations.tsv \n", "1 text/tab-separated-values associations.tsv \n", "2 text/tab-separated-values associations.tsv " ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using dictionaries as filters\n", "A dictionary can be provided for filters:\n", "* `{'type': {'id': 'Dataset'}}` is equivalent to `p.type.id == \"Dataset\"`\n", "* only the `==` operator is supported\n", "* nested dictionaries are supported\n", "* the provided properties and values do not have to be defined in the forge model: results are returned whenever the store holds matching data.\n", "\n", "This feature is not supported when using the DemoStore.\n" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [], "source": [ "# Search for resources of type Dataset with text/tab-separated-values as distribution.encodingFormat\n", "# and created at a given dateTime (by default, dateTime values should be signaled by the suffix \"^^xsd:dateTime\")\n", "filters = {\n", "    \"type\": \"Dataset\",\n", "    \"distribution\": {\"encodingFormat\": \"text/tab-separated-values\"},\n", "    \"_createdAt\": dataset._store_metadata._createdAt + \"^^xsd:dateTime\"\n", "}\n", "resources = forge.search(filters, limit=3)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>id</th><th>type</th><th>contribution</th><th>distribution.type</th><th>distribution.atLocation.type</th><th>distribution.atLocation.store.id</th><th>distribution.atLocation.store.type</th><th>distribution.atLocation.store._rev</th><th>distribution.contentSize.unitCode</th><th>distribution.contentSize.value</th><th>...</th><th>_createdBy</th><th>_deprecated</th><th>_incoming</th><th>_outgoing</th><th>_project</th><th>_rev</th><th>_schemaProject</th><th>_self</th><th>_updatedAt</th><th>_updatedBy</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>Dataset</td><td>[{'type': 'Contribution', 'agent': {'type': 'P...</td><td>DataDownload</td><td>Location</td><td>https://bluebrain.github.io/nexus/vocabulary/d...</td><td>DiskStorage</td><td>1</td><td>bytes</td><td>477</td><td>...</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td><td>False</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>1</td><td>https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge</td><td>https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...</td><td>2022-04-12T21:45:50.461Z</td><td>https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>1 rows × 27 columns</p>\n",
"</div>
" ], "text/plain": [ " id type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "\n", " contribution distribution.type \\\n", "0 [{'type': 'Contribution', 'agent': {'type': 'P... DataDownload \n", "\n", " distribution.atLocation.type \\\n", "0 Location \n", "\n", " distribution.atLocation.store.id \\\n", "0 https://bluebrain.github.io/nexus/vocabulary/d... \n", "\n", " distribution.atLocation.store.type distribution.atLocation.store._rev \\\n", "0 DiskStorage 1 \n", "\n", " distribution.contentSize.unitCode distribution.contentSize.value ... \\\n", "0 bytes 477 ... \n", "\n", " _createdBy _deprecated \\\n", "0 https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy False \n", "\n", " _incoming \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _outgoing \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _project _rev \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "\n", " _schemaProject \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "\n", " _self \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... 
\n", "\n", " _updatedAt _updatedBy \n", "0 2022-04-12T21:45:50.461Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", "[1 rows x 27 columns]" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources, store_metadata=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using built-in Filter objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Supported filter operators" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['__eq__ (EQUAL)',\n", " '__ne__ (NOT_EQUAL)',\n", " '__lt__ (LOWER_THAN)',\n", " '__le__ (LOWER_OR_Equal_Than)',\n", " '__gt__ (GREATER_Than)',\n", " '__ge__ (GREATER_OR_Equal_Than)']" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[f\"{op.value} ({op.name})\" for op in FilterOperator] # These are equivalent to the Python comparison operators" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "# Search for resources of type Dataset with text/tab-separated-values as distribution.encodingFormat\n", "# and created at or before a given dateTime (dateTime values should be signaled by the suffix \"^^xsd:dateTime\")\n", "filter_1 = Filter(operator=FilterOperator.EQUAL, path=[\"type\"], value=\"Dataset\")\n", "filter_2 = Filter(operator=FilterOperator.EQUAL, path=[\"distribution\",\"encodingFormat\"], value=\"text/tab-separated-values\")\n", "filter_3 = Filter(operator=FilterOperator.LOWER_OR_Equal_Than, path=[\"_createdAt\"], value=dataset._store_metadata._createdAt+\"^^xsd:dateTime\")\n", "\n", "resources = forge.search(filter_1, filter_2, filter_3, limit=3)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": 
"code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Dataset \n", "\n", " contribution distribution.type \\\n", "0 [{'type': 'Contribution', 'agent': {'type': 'P... DataDownload \n", "1 [{'type': 'Contribution', 'agent': {'type': 'P... DataDownload \n", "2 NaN DataDownload \n", "\n", " distribution.atLocation.type \\\n", "0 Location \n", "1 Location \n", "2 Location \n", "\n", " distribution.atLocation.store.id \\\n", "0 https://bluebrain.github.io/nexus/vocabulary/d... \n", "1 https://bluebrain.github.io/nexus/vocabulary/d... \n", "2 https://bluebrain.github.io/nexus/vocabulary/d... \n", "\n", " distribution.contentSize.unitCode distribution.contentSize.value \\\n", "0 bytes 506 \n", "1 bytes 506 \n", "2 bytes 506 \n", "\n", " distribution.contentUrl \\\n", "0 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "1 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "2 https://bbp.epfl.ch/nexus/v1/files/dke/kgforge... \n", "\n", " distribution.digest.algorithm ... \\\n", "0 SHA-256 ... \n", "1 SHA-256 ... \n", "2 SHA-256 ... \n", "\n", " _createdBy _deprecated \\\n", "0 https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy False \n", "1 https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy False \n", "2 https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy False \n", "\n", " _incoming \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _outgoing \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... 
\n", "\n", " _project _rev \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "\n", " _schemaProject \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "\n", " _self \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _updatedAt _updatedBy \n", "0 2021-03-17T22:07:09.443Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "1 2021-03-17T22:14:28.904Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "2 2021-03-17T22:31:38.741Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", "[3 rows x 25 columns]" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources, store_metadata=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using search endpoints\n", "\n", "Two types of search endpoints are supported: 'sparql' (default) for graph queries and 'elastic' for document-oriented queries. The available search endpoints can be configured (see [00-Initialization.ipynb](00%20-%20Initialization.ipynb) for an example search endpoints configuration) or set when creating a KnowledgeGraphForge session using the 'searchendpoints' argument.\n", "\n", "forge.search(...) uses the 'sparql' endpoint by default; a different endpoint can be selected with the 'search_endpoint' argument."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### SPARQL Search Endpoint" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [], "source": [ "# Search for resources of type Person\n", "filters = {\"type\": \"Person\"}\n", "resources = forge.search(filters, limit=3, search_endpoint='sparql')" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id type birthDate \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person 12.12.1990 \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person 12.12.1990 \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Person 12.12.1990 \n", "\n", " employer.type employer.name name \\\n", "0 Organization epfl Peter Kind \n", "1 NaN NaN Peter Kind \n", "2 NaN NaN Peter K. \n", "\n", " _constrainedBy \\\n", "0 https://bluebrain.github.io/nexus/schemas/unco... \n", "1 https://bluebrain.github.io/nexus/schemas/unco... \n", "2 https://bluebrain.github.io/nexus/schemas/unco... \n", "\n", " _createdAt _createdBy \\\n", "0 2020-03-08T20:00:09.092Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "1 2020-03-22T21:51:06.830Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "2 2020-03-22T21:56:16.084Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", " _deprecated _incoming \\\n", "0 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 False https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _outgoing \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " _project _rev \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 1 \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 3 \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge 2 \n", "\n", " _schemaProject \\\n", "0 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "1 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "2 https://bbp.epfl.ch/nexus/v1/projects/dke/kgforge \n", "\n", " _self \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... 
\n", "\n", " _updatedAt _updatedBy \\\n", "0 2020-03-08T20:00:09.092Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "1 2020-03-22T21:54:37.507Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "2 2020-03-22T21:58:41.450Z https://bbp.epfl.ch/nexus/v1/realms/bbp/users/sy \n", "\n", " description \n", "0 NaN \n", "1 NaN \n", "2 Resource without user provided context " ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources, store_metadata=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### ElasticSearch Endpoint" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [], "source": [ "# Search for resources of type Person and retrieve their ids and types.\n", "filters = {\"@type\": \"http://schema.org/Person\"}\n", "resources = forge.search(filters, limit=3, \n", " search_endpoint='elastic', \n", " includes=[\"@id\", \"@type\"]) # fields can also be excluded with 'excludes'" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id type\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... http://schema.org/Person\n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... http://schema.org/Person\n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... http://schema.org/Person" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources, store_metadata=True)" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Search results are not synchronized\n", "resources[0]._synchronized" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/e353bead-906e-4cd6-b7f4-948fc05c1ef9'" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resources[0].id" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'http://schema.org/Person'" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resources[0].type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Crossbucket search\n", "Resources stored in buckets other than the configured one can also be searched, provided the configured store supports cross-bucket search."
] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [], "source": [ "resources = forge.search(p.type.id == \"Association\", limit=3, cross_bucket=True) # cross_bucket defaults to False" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " id type agent.type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "\n", " agent.name \n", "0 Jane Doe \n", "1 Jane Doe \n", "2 Jane Doe " ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Furthermore it is possible to filter by bucket when cross_bucket is set to True. Setting a bucket value when cross_bucket is False will trigger a not_supported exception.\n", "resources = forge.search(p.type.id == \"Person\", limit=3, cross_bucket=True, bucket=) # add a bucket" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id type agent.type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "\n", " agent.name \n", "0 Jane Doe \n", "1 Jane Doe \n", "2 Jane Doe " ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Searching original source\n", "When using BlueBrainNexusStore, resource payloads can be retrieved exactly as they were registered (retrieve_source=True), without store-added metadata or JSON-LD framing." ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [], "source": [ "resources = forge.search(p.type.id == \"Association\", limit=3, retrieve_source=False) # retrieve_source defaults to True" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 112, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id type agent.type \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Association Person \n", "\n", " agent.name \n", "0 Jane Doe \n", "1 Jane Doe \n", "2 Jane Doe " ] }, "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Graph traversing\n", "\n", "SPARQL is used as the query language for graph traversal.\n", "\n", "Nexus Forge implements a SPARQL query rewriting strategy leveraging a configured RDFModel that lets users write SPARQL queries without adding prefix declarations, prefix names or long IRIs. With this strategy, only type and property names need to be provided.\n", "\n", "Please refer to the [Modeling.ipynb](11%20-%20Modeling.ipynb) notebook to learn about templates." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoStore doesn't implement SPARQL operations yet. Please use another store for this section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoModel and RdfModel schemas have not been synchronized yet. This section is to be run with RdfModel."
] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\")\n", "contribution_jane = Resource(type=\"Contribution\", agent=jane)" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [], "source": [ "john = Resource(type=\"Person\", name=\"John Smith\")\n", "contribution_john = Resource(type=\"Contribution\", agent=john)" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [], "source": [ "association = Resource(type=\"Dataset\", contribution=[contribution_jane, contribution_john])" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(association)" ] }, { "cell_type": "code", "execution_count": 117, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " id: \"\"\n", " type:\n", " {\n", " id: \"\"\n", " }\n", " annotation:\n", " {\n", " id: \"\"\n", " type: Annotation\n", " citation:\n", " {\n", " id: \"\"\n", " }\n", " contribution:\n", " {\n", " id: \"\"\n", " type: Contribution\n", " }\n", " dateCreated: 9999-12-31T00:00:00\n", " dateModified: \"\"\n", " derivation:\n", " {\n", " id: \"\"\n", " type: Derivation\n", " }\n", " description: \"\"\n", " distribution:\n", " {\n", " id: \"\"\n", " type: DataDownload\n", " contentSize:\n", " {\n", " unitCode: \"\"\n", " value:\n", " [\n", " 0.0\n", " 0\n", " ]\n", " }\n", " digest:\n", " {\n", " algorithm: \"\"\n", " value: \"\"\n", " }\n", " encodingFormat: \"\"\n", " license: \"\"\n", " name: \"\"\n", " }\n", " generation:\n", " {\n", " id: \"\"\n", " type: Generation\n", " }\n", " hasBody:\n", " {\n", " id: \"\"\n", " type:\n", " {\n", " id: \"\"\n", " }\n", " label: \"\"\n", " note: \"\"\n", " }\n", " hasTarget:\n", " {\n", " id: \"\"\n", " type: AnnotationTarget\n", " }\n", " 
identifier:\n", " {\n", " identifier: \"\"\n", " }\n", " image:\n", " {\n", " id: \"\"\n", " }\n", " invalidation:\n", " {\n", " id: \"\"\n", " type: Invalidation\n", " }\n", " keywords: \"\"\n", " language:\n", " {\n", " alternateName: \"\"\n", " name: \"\"\n", " }\n", " license:\n", " {\n", " id: \"\"\n", " type: License\n", " }\n", " motivatedBy:\n", " {\n", " id: \"\"\n", " type: Motivation\n", " }\n", " name: \"\"\n", " note: \"\"\n", " sameAs:\n", " {\n", " id: \"\"\n", " }\n", " url:\n", " {\n", " id: \"\"\n", " }\n", " wasAttributedTo:\n", " {\n", " id: \"\"\n", " }\n", " wasDerivedFrom:\n", " {\n", " id: \"\"\n", " }\n", " wasRevisionOf:\n", " {\n", " id: \"\"\n", " }\n", " }\n", " atlasRelease:\n", " {\n", " id: \"\"\n", " }\n", " brainLocation:\n", " {\n", " id: \"\"\n", " type: BrainLocation\n", " atlasSpatialReferenceSystem:\n", " {\n", " id: \"\"\n", " type: AtlasSpatialReferenceSystem\n", " }\n", " brainRegion:\n", " {\n", " id: \"\"\n", " label: \"\"\n", " }\n", " coordinatesInBrainAtlas:\n", " {\n", " id: \"\"\n", " valueX: 0.0\n", " valueY: 0.0\n", " valueZ: 0.0\n", " }\n", " coordinatesInSlice:\n", " {\n", " spatialReferenceSystem:\n", " {\n", " id: \"\"\n", " type: SpatialReferenceSystem\n", " }\n", " valueX: 0.0\n", " valueY: 0.0\n", " valueZ: 0.0\n", " }\n", " distanceToBoundary:\n", " {\n", " boundary:\n", " {\n", " id: \"\"\n", " label: \"\"\n", " }\n", " distance:\n", " {\n", " unitCode: \"\"\n", " value:\n", " [\n", " 0.0\n", " 0\n", " ]\n", " }\n", " }\n", " layer:\n", " {\n", " id: \"\"\n", " label: \"\"\n", " }\n", " longitudinalAxis:\n", " [\n", " Dorsal\n", " Ventral\n", " ]\n", " positionInLayer:\n", " [\n", " Deep\n", " Superficial\n", " ]\n", " }\n", " citation:\n", " {\n", " id: \"\"\n", " }\n", " contribution:\n", " {\n", " id: \"\"\n", " type: Contribution\n", " }\n", " dateCreated: 9999-12-31T00:00:00\n", " dateModified: \"\"\n", " derivation:\n", " {\n", " id: \"\"\n", " type: Derivation\n", " }\n", " description: \"\"\n", " 
distribution:\n", " {\n", " id: \"\"\n", " type: DataDownload\n", " contentSize:\n", " {\n", " unitCode: \"\"\n", " value:\n", " [\n", " 0.0\n", " 0\n", " ]\n", " }\n", " digest:\n", " {\n", " algorithm: \"\"\n", " value: \"\"\n", " }\n", " encodingFormat: \"\"\n", " license: \"\"\n", " name: \"\"\n", " }\n", " generation:\n", " {\n", " id: \"\"\n", " type: Generation\n", " }\n", " identifier:\n", " {\n", " identifier: \"\"\n", " }\n", " image:\n", " {\n", " id: \"\"\n", " }\n", " invalidation:\n", " {\n", " id: \"\"\n", " type: Invalidation\n", " }\n", " isRegisteredIn:\n", " {\n", " id: \"\"\n", " type: BrainAtlasSpatialReferenceSystem\n", " }\n", " keywords: \"\"\n", " language:\n", " {\n", " alternateName: \"\"\n", " name: \"\"\n", " }\n", " license:\n", " {\n", " id: \"\"\n", " type: License\n", " }\n", " name: \"\"\n", " objectOfStudy:\n", " {\n", " id: \"\"\n", " type: ObjectOfStudy\n", " }\n", " releaseDate: \"\"\n", " sameAs:\n", " {\n", " id: \"\"\n", " }\n", " subject:\n", " {\n", " id: \"\"\n", " type: Subject\n", " }\n", " url:\n", " {\n", " id: \"\"\n", " }\n", " wasAttributedTo:\n", " {\n", " id: \"\"\n", " }\n", " wasDerivedFrom:\n", " {\n", " id: \"\"\n", " }\n", " wasRevisionOf:\n", " {\n", " id: \"\"\n", " }\n", "}\n" ] } ], "source": [ "forge.template(\"Dataset\") # Templates help identify which properties to use when writing a query to search for a given type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prefix and namespace free SPARQL query\n", "\n", "When a forge RDFModel is configured, there is no need to provide prefixes and namespaces when writing a SPARQL query. Prefixes and namespaces are inferred automatically from the provided schemas and/or JSON-LD context, and the query is rewritten accordingly."
] }, { "cell_type": "code", "execution_count": 158, "metadata": {}, "outputs": [], "source": [ "query = \"\"\"\n", " SELECT ?id ?name ?contributor\n", " WHERE {\n", " ?id a Dataset ;\n", " contribution/agent ?contributor.\n", " ?contributor name ?name.\n", " }\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 159, "metadata": {}, "outputs": [], "source": [ "resources = forge.sparql(query, limit=3)" ] }, { "cell_type": "code", "execution_count": 160, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 161, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 161, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 162, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " id: https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/847380a2-4703-42d2-aa8e-64ed52fc594b\n", " contributor: https://bbp.epfl.ch/nexus/v1/resources/dke/kgforge/_/3f9f7b60-6463-41c8-be68-f512fc2a58fe\n", " name: John Smith\n", "}\n" ] } ], "source": [ "print(resources[0])" ] }, { "cell_type": "code", "execution_count": 163, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " id \\\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... \n", "\n", " contributor name \n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... John Smith \n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... John Smith \n", "2 t464205 Jane Doe " ] }, "execution_count": 163, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### display rewritten SPARQL query " ] }, { "cell_type": "code", "execution_count": 165, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Submitted query:\n", " PREFIX bmc: \n", " PREFIX bmo: \n", " PREFIX commonshapes: \n", " PREFIX datashapes: \n", " PREFIX dc: \n", " PREFIX dcat: \n", " PREFIX dcterms: \n", " PREFIX mba: \n", " PREFIX nsg: \n", " PREFIX nxv: \n", " PREFIX oa: \n", " PREFIX obo: \n", " PREFIX owl: \n", " PREFIX prov: \n", " PREFIX rdf: \n", " PREFIX rdfs: \n", " PREFIX schema: \n", " PREFIX sh: \n", " PREFIX shsh: \n", " PREFIX skos: \n", " PREFIX vann: \n", " PREFIX void: \n", " PREFIX xml: \n", " PREFIX xsd: \n", " PREFIX : \n", " \n", " SELECT ?id ?name ?contributor\n", " WHERE {\n", " ?id a schema:Dataset ;\n", " nsg:contribution/prov:agent ?contributor.\n", " ?contributor schema:name ?name.\n", " }\n", " LIMIT 3\n", "\n" ] } ], "source": [ "resources = forge.sparql(query, limit=3, debug=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Full SPARQL query\n", "\n", "A regular SPARQL query can also be provided. When provided, the limit and offset arguments supersede any LIMIT or OFFSET values in the query."
] }, { "cell_type": "code", "execution_count": 166, "metadata": {}, "outputs": [], "source": [ "query = \"\"\"\n", "PREFIX dc: \n", " PREFIX dcat: \n", " PREFIX dcterms: \n", " PREFIX mba: \n", " PREFIX nsg: \n", " PREFIX owl: \n", " PREFIX prov: \n", " PREFIX rdf: \n", " PREFIX rdfs: \n", " PREFIX schema: \n", " PREFIX sh: \n", " PREFIX shsh: \n", " PREFIX skos: \n", " PREFIX vann: \n", " PREFIX void: \n", " PREFIX xsd: \n", " PREFIX : \n", " SELECT ?id ?name\n", " WHERE {\n", " ?id a schema:Dataset ;\n", " nsg:contribution/prov:agent ?contributor.\n", " ?contributor schema:name ?name.\n", " }\n", " ORDER BY ?id\n", " LIMIT 1\n", " OFFSET 0\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 172, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Submitted query:\n", " \n", " PREFIX dc: \n", " PREFIX dcat: \n", " PREFIX dcterms: \n", " PREFIX mba: \n", " PREFIX nsg: \n", " PREFIX owl: \n", " PREFIX prov: \n", " PREFIX rdf: \n", " PREFIX rdfs: \n", " PREFIX schema: \n", " PREFIX sh: \n", " PREFIX shsh: \n", " PREFIX skos: \n", " PREFIX vann: \n", " PREFIX void: \n", " PREFIX xsd: \n", " PREFIX : \n", " SELECT ?id ?name\n", " WHERE {\n", " ?id a schema:Dataset ;\n", " nsg:contribution/prov:agent ?contributor.\n", " ?contributor schema:name ?name.\n", " }\n", " ORDER BY ?id\n", " LIMIT 3\n", " OFFSET 1\n", "\n" ] } ], "source": [ "# it is recommended to set 'rewrite' to 'False' to prevent the sparql query rewriting when a syntactically correct SPARQL query is provided.\n", "resources = forge.sparql(query, rewrite=False, limit=3, offset=1, debug=True) " ] }, { "cell_type": "code", "execution_count": 173, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 173, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 174, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 174, 
"metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 175, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "kgforge.core.resource.Resource" ] }, "execution_count": 175, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources[0])" ] }, { "cell_type": "code", "execution_count": 176, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idname
0t463285A person
1t463453A person
2https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...Jane Doe
\n", "
" ], "text/plain": [ " id name\n", "0 t463285 A person\n", "1 t463453 A person\n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Jane Doe" ] }, "execution_count": 176, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ElasticSearch DSL Query\n", "\n", "ElasticSearch DSL can be used as a query language search for resources provided that the configured store supports it. The 'BlueBrainNexusStore' supports ElasticSearch." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoStore doesn't implement ElasaticSearch DSL operations." ] }, { "cell_type": "code", "execution_count": 177, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\")\n", "contribution_jane = Resource(type=\"Contribution\", agent=jane)" ] }, { "cell_type": "code", "execution_count": 178, "metadata": {}, "outputs": [], "source": [ "john = Resource(type=\"Person\", name=\"John Smith\")\n", "contribution_john = Resource(type=\"Contribution\", agent=john)" ] }, { "cell_type": "code", "execution_count": 179, "metadata": {}, "outputs": [], "source": [ "association = Resource(type=\"Dataset\", contribution=[contribution_jane, contribution_john])" ] }, { "cell_type": "code", "execution_count": 180, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(association)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plain ElasticSearch DSL" ] }, { "cell_type": "code", "execution_count": 181, "metadata": {}, "outputs": [], "source": [ "query = \"\"\"\n", " {\n", " \"_source\": {\n", " \"includes\": [\n", " \"@id\",\n", " \"name\"\n", " ]\n", " },\n", " \"query\": {\n", " \"term\": {\n", " \"@type\": \"http://schema.org/Dataset\"\n", " }\n", " }\n", " }\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 182, "metadata": {}, 
"outputs": [], "source": [ "# limit and offset (when provided in this method call) superseed 'size' and 'from' values provided in the query\n", "resources = forge.elastic(query, limit=3)" ] }, { "cell_type": "code", "execution_count": 183, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 183, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources)" ] }, { "cell_type": "code", "execution_count": 184, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(resources)" ] }, { "cell_type": "code", "execution_count": 185, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "kgforge.core.resource.Resource" ] }, "execution_count": 185, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(resources[0])" ] }, { "cell_type": "code", "execution_count": 186, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
idname
0https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...Interesting Persons
1https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...Interesting Persons
2https://bbp.epfl.ch/nexus/v1/resources/dke/kgf...TTest name
\n", "
" ], "text/plain": [ " id name\n", "0 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Interesting Persons\n", "1 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... Interesting Persons\n", "2 https://bbp.epfl.ch/nexus/v1/resources/dke/kgf... TTest name" ] }, "execution_count": 186, "metadata": {}, "output_type": "execute_result" } ], "source": [ "forge.as_dataframe(resources)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Downloading" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoStore doesn't implement file operations yet. Please use another store for this section." ] }, { "cell_type": "code", "execution_count": 187, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\")" ] }, { "cell_type": "code", "execution_count": 188, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "associations.tsv\n", "my_data.xwz\n", "my_data_derived.txt\n", "persons-with-id.csv\n", "persons.csv\n", "tfidfvectorizer_model_schemaorg_linking\n" ] } ], "source": [ "! 
ls -p ../../data | egrep -v /$" ] }, { "cell_type": "code", "execution_count": 189, "metadata": {}, "outputs": [], "source": [ "distribution = forge.attach(\"../../data\")" ] }, { "cell_type": "code", "execution_count": 190, "metadata": {}, "outputs": [], "source": [ "association = Resource(type=\"Association\", agent=jane, distribution=distribution)" ] }, { "cell_type": "code", "execution_count": 191, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _register_one\n", " True\n" ] } ], "source": [ "forge.register(association)" ] }, { "cell_type": "code", "execution_count": 195, "metadata": {}, "outputs": [], "source": [ "# The argument overwrite: bool can be provided to decide whether to overwrite (True) existing files with the same name or\n", "# to create new ones (False) with their names suffixed with a timestamp.\n", "# A cross_bucket argument can be provided to download data from the configured bucket (cross_bucket=False - the default value) \n", "# or from a bucket different than the configured one (cross_bucket=True). The configured store should support crossing buckets for this to work.\n", "forge.download(association, \"distribution.contentUrl\", \"./downloaded/\")" ] }, { "cell_type": "code", "execution_count": 196, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 448\n", "-rw-r--r-- 1 mfsy staff 477 Apr 13 00:01 associations.tsv\n", "-rw-r--r-- 1 mfsy staff 16 Apr 13 00:01 my_data.xwz\n", "-rw-r--r-- 1 mfsy staff 24 Apr 13 00:01 my_data_derived.txt\n", "-rw-r--r-- 1 mfsy staff 126 Apr 13 00:01 persons-with-id.csv\n", "-rw-r--r-- 1 mfsy staff 52 Apr 13 00:01 persons.csv\n", "-rw-r--r-- 1 mfsy staff 204848 Apr 13 00:01 tfidfvectorizer_model_schemaorg_linking\n" ] } ], "source": [ "! ls -l ./downloaded/" ] }, { "cell_type": "code", "execution_count": 194, "metadata": {}, "outputs": [], "source": [ "#! 
rm -R ./downloaded/" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.7 (nexusforgelatest)", "language": "python", "name": "nexusforgelatest" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }