{ "cells": [ { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:19.036357Z", "start_time": "2019-09-23T18:50:19.031896Z" } }, "source": [ "# Modeling\n", "\n", "[Modeling](https://nexus-forge.readthedocs.io/en/latest/interaction.html#modeling) enables the definition and access of data types and properties defined in schemas (e.g., JSON templates, [W3C SHACL](https://www.w3.org/TR/shacl)).\n", "Models are configured in the `Model` section of the configuration file." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.068658Z", "start_time": "2019-09-23T18:50:19.054054Z" } }, "outputs": [], "source": [ "from kgforge.core import KnowledgeGraphForge" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A configuration file is needed in order to create a KnowledgeGraphForge session. A configuration can be generated using the notebook [00-Initialization.ipynb](00%20-%20Initialization.ipynb)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "forge = KnowledgeGraphForge(\"../../configurations/forge.yml\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from kgforge.core import Resource" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prefixes\n", "Prefixes are namespaces that are used to put Resource properties within a context." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.455574Z", "start_time": "2019-09-23T18:50:20.443717Z" } }, "outputs": [], "source": [ "forge.prefixes()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Types\n", "The `type` property of a Resource can be associated with one of the types available in the Model. These types have a pre-defined set of properties." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.469871Z", "start_time": "2019-09-23T18:50:20.460504Z" } }, "outputs": [], "source": [ "forge.types()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Templates\n", "A template provides the set of properties recommended for a given type when creating Resources." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### showing the properties of a type + getting the template of a Mapping for a type" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.482501Z", "start_time": "2019-09-23T18:50:20.473769Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " id: \"\"\n", " additionalName: \"\"\n", " affiliation:\n", " {\n", " id: \"\"\n", " type: Organization\n", " address: \"\"\n", " email: \"\"\n", " identifier: \"\"\n", " name: \"\"\n", " parentOrganization: \"\"\n", " }\n", " email: \"\"\n", " familyName: \"\"\n", " givenName: \"\"\n", " identifier:\n", " {\n", " id: \"\"\n", " propertyID: \"\"\n", " value:\n", " {\n", " id: \"\"\n", " }\n", " }\n", "}\n" ] } ], "source": [ "forge.template(\"Person\")" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.704970Z", "start_time": "2019-09-23T18:50:20.494904Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " id: \"\"\n", "}\n" ] } ], "source": [ "forge.template(\"Person\", only_required=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### creating (a) Resource instance(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### manually (JSON)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " \"id\": \"\"\n", "}\n" ] } ], "source": [ 
"forge.template(\"Person\", output=\"json\", only_required=True)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "data = {\n", " \"type\": \"Person\",\n", " \"name\": \"Jane\"\n", "}" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "resource_json = forge.from_json(data)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " type: Person\n", " name: Jane\n", "}\n" ] } ], "source": [ "print(resource_json)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### programmatically (Dict)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "template = forge.template(\"Person\", output=\"dict\", only_required=True)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "template[\"name\"] = \"Jane\"" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "resource_dict = forge.from_json(template)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " id: \"\"\n", " name: Jane\n", "}\n" ] } ], "source": [ "print(resource_dict)" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2019-09-23T14:43:14.465381Z", "start_time": "2019-09-23T14:43:14.455846Z" } }, "source": [ "## Validation\n", "It is possible to verify that a Resource is compliant with the suggested type schema available in the Model." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\")" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "john = Resource(type=\"Person\", name=\"John Smith\")" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "persons = [jane, john]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.716542Z", "start_time": "2019-09-23T18:50:20.708109Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 2\n", " _validate_many\n", " True\n" ] } ], "source": [ "forge.validate(persons)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.734693Z", "start_time": "2019-09-23T18:50:20.718924Z" } }, "outputs": [ { "data": { "text/plain": [ "Action(error=None, message=None, operation='_validate_many', succeeded=True)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane._last_action" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.749126Z", "start_time": "2019-09-23T18:50:20.739075Z" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane._validated" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### automatic status update" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.787796Z", "start_time": "2019-09-23T18:50:20.766286Z" } }, "outputs": [], "source": [ "jane.email = \"jane.doe@epfl.ch\"" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.832020Z", "start_time": "2019-09-23T18:50:20.811928Z" } }, "outputs": [ { 
"data": { "text/plain": [ "False" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "jane._validated" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### lazy actions handling" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "distribution = forge.attach(\"../../data/persons.csv\")" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "jane = Resource(type=\"Person\", name=\"Jane Doe\", distribution=distribution)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _validate_one\n", " False\n", " ValidationError: resource has lazy actions which need to be executed before\n" ] } ], "source": [ "forge.validate(jane)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoStore doesn't implement file operations yet. Please use another store for the following cell." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _validate_one\n", " True\n" ] } ], "source": [ "forge.validate(jane, execute_actions_before=True)" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "#### Specifying schema in validation\n", "\n", "By default the resource(s) will be validated against the schema of the `type` attribute of the resource.\n", "\n", "However, it is possible to specify the schema to validate against by passing the `type_` argument in `.validate()`" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false }, "pycharm": { "name": "#%%\n" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _validate_one\n", " True\n" ] } ], "source": [ "forge.validate(jane, execute_actions_before=True, type_=\"Person\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### error handling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: DemoModel and RdfModel schemas have not been synchronized yet. This section is to be run with RdfModel. Commented lines are for DemoModel." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.917767Z", "start_time": "2019-09-23T18:50:20.902559Z" } }, "outputs": [], "source": [ "mistake = Resource(type=\"Person\")" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "# resource = Resource(type=\"Association\", agent=mistake)\n", "resource = Resource(type=\"Dataset\", contribution=mistake)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " _validate_one\n", " False\n", " ValidationError: \n", "Validation Report\n", "Conforms: False\n", "Results (1):\n", "Constraint Violation in NodeConstraintComponent (http://www.w3.org/ns/shacl#NodeConstraintComponent):\n", "\tSeverity: sh:Violation\n", "\tSource Shape: this5:DatasetShape\n", "\tFocus Node: [ [ rdf:type ] ; rdf:type ]\n", "\tValue Node: [ [ rdf:type ] ; rdf:type ]\n", "\tMessage: Value does not conform to Shape this6:MINDSShape\n", "\n" ] } ], "source": [ "forge.validate(resource)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2019-09-23T18:50:20.986359Z", "start_time": "2019-09-23T18:50:20.948705Z" } }, "outputs": [ { "data": { "text/plain": [ "Action(error='ValidationError', message='\\nValidation Report\\nConforms: False\\nResults (1):\\nConstraint Violation in NodeConstraintComponent (http://www.w3.org/ns/shacl#NodeConstraintComponent):\\n\\tSeverity: sh:Violation\\n\\tSource Shape: this5:DatasetShape\\n\\tFocus Node: [ [ rdf:type ] ; rdf:type ]\\n\\tValue Node: [ [ rdf:type ] ; rdf:type ]\\n\\tMessage: Value does not conform to Shape this6:MINDSShape\\n', operation='_validate_one', succeeded=False)" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._last_action" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": 
"2019-09-23T18:50:21.016505Z", "start_time": "2019-09-23T18:50:20.998401Z" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "resource._validated" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.7 (nexusforgelatest)", "language": "python", "name": "nexusforgelatest" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }