{
"cells": [
{
"cell_type": "markdown",
"id": "bacc54ed",
"metadata": {},
"source": [
"# NTV - Overview\n",
"\n",
"This overview presents different facets and covers all the features of this package.\n",
"\n",
"## Summary\n",
"*(active link on jupyter Notebook or Nbviewer)*\n",
"- [from native entity to JSON-text](#from-native-entity-to-JSON-text)\n",
"- [All JSON data is JSON-NTV data](#All-JSON-data-is-JSON-NTV-data)\n",
"- [NTV data is named](#NTV-data-is-named)\n",
"- [NTV data is typed](#NTV-data-is-typed)\n",
"- [Types can be predefined](#Types-can-be-predefined)\n",
"- [Types are nested](#Types-are-nested)\n",
"- [NTV data is nested](#NTV-data-is-nested)\n",
"- [NTV-json format is reversible for NTV entities](#NTV-json-format-is-reversible-for-NTV-entities)\n",
"- [NTV data can be validated according to their NTVtype](#NTV-data-can-be-validated-according-to-their-NTVtype)\n",
"- [Options are available for NTV-json](#Options-are-available-for-NTV-json)\n",
"- [Changes and comments are managed](#Changes-and-comments-are-managed)\n",
"- [JSON-array and JSON-object are equivalent for NtvList](#JSON-array-and-JSON-object-are-equivalent-for-NtvList)\n",
"- [NTV is a tree data structure](#NTV-is-a-tree-data-structure)\n",
"- [The json representation can be more or less deep](#The-json-representation-can-be-more-or-less-deep)\n",
"- [NTV entities are compatible with tabular data tools](#NTV-entities-are-compatible-with-tabular-data-tools)\n",
"- [CSV data is enriched with NTV structures](#CSV-data-is-enriched-with-NTV-structures)\n",
"- [NTV data can be created from entities of all types](#NTV-data-can-be-created-from-entities-of-all-types)\n",
"- [Template structure can be used](#Template-structure-can-be-used)\n",
"- [NTVtype can be extended](#NTVtype-can-be-extended)\n",
"- [Custom types are allowed](#Custom-types-are-allowed)\n",
"- [NTV can be used with custom entities](#NTV-can-be-used-with-custom-entities)\n",
"\n",
"## References\n",
"- [JSON-NTV specification](https://github.com/loco-philippe/NTV/blob/main/documentation/JSON-NTV-standard.pdf)\n",
"- [JSON-NTV classes and methods](https://loco-philippe.github.io/NTV/json_ntv.html)\n",
"\n",
"This Notebook can also be viewed at [nbviewer](http://nbviewer.org/github/loco-philippe/NTV/tree/main/example)\n",
"\n",
"-----"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b21f8cad",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"from json_ntv import NtvSingle, NtvList, Ntv, NtvConnector, Datatype, Namespace, to_csv, from_csv, NtvComment\n",
"from datetime import date, datetime\n",
"from shapely import Point"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "d6217d91",
"metadata": {},
"outputs": [],
"source": [
"ntv = \"\"\"\n",
"flowchart LR\n",
" text[\"#10240;#10240;JSON#10240;#10240;\\ntext\"]\n",
" val[\"#10240;JSON-NTV#10240;\\nvalue\"]\n",
" ntv[\"#10240;#10240;#10240;NTV#10240;#10240;#10240;\\nentity\"]\n",
" nat[\"#10240;native#10240;\\nentity\"]\n",
" text--->|JSON load|val\n",
" val--->|JSON dump|text\n",
" val--->|Ntv.from_obj|ntv\n",
" ntv--->|.to_obj|nat\n",
" ntv--->|.to_obj|val\n",
" nat--->|Ntv.from_obj|ntv\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"id": "e8445d34",
"metadata": {},
"source": [
"## from native entity to JSON-text\n",
"- The diagram below explains how to transform **any type of data** into a neutral exchange format"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ee24c89d",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from base64 import b64encode\n",
"from IPython.display import Image, display\n",
"display(Image(url=\"https://mermaid.ink/img/\" + b64encode(ntv.encode(\"ascii\")).decode(\"ascii\")))"
]
},
{
"cell_type": "markdown",
"id": "3079ba24",
"metadata": {},
"source": [
"- the conversion between native entity and JSON-text is reversible (round trip)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1bccd532",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loc_and_date = {'newyear': date(2023, 1, 2), 'Paris': Point(2.3, 48.9)}\n",
"json_loc_date = Ntv.obj(loc_and_date).to_obj(encoded=True)\n",
"\n",
"Ntv.obj(json_loc_date).to_obj(format='obj') == loc_and_date"
]
},
{
"cell_type": "markdown",
"id": "0fcbfcfa",
"metadata": {},
"source": [
"## All JSON data is JSON-NTV data\n",
"NTV entities : \n",
"- NtvSingle : primitive entity which is not composed of any other entity\n",
"- NtvList : ordered sequence of NTV entities"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "31758027",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"null NTV entity : \n",
"1 NTV entity : \n",
"[1, 2] NTV entity : \n",
"{\"key\": \"value\"} NTV entity : \n",
"{\"key1\": \"val1\", \"key2\": \"val2\"} NTV entity : \n",
"{\"example\": [21, [1, 2], {\"key1\": 3, \"key2\": 4}]} NTV entity : \n"
]
}
],
"source": [
"liste = [None, 1, [1,2], {'key': 'value'}, {'key1': 'val1', 'key2': 'val2'}, \n",
" {'example': [21, [1,2], {'key1': 3, 'key2': 4}]}]\n",
"for json in liste:\n",
" ntv = Ntv.obj(json)\n",
" print('{:<50} {} {}'.format(str(ntv), 'NTV entity : ', type(ntv)))"
]
},
{
"cell_type": "markdown",
"id": "36ecdec1",
"metadata": {},
"source": [
"## NTV data is named\n",
"- a name can be added or remove"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "48c97569",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"simple data : 3\n",
"simple data with name : {\"value\": 3}\n",
"simple data without name : 3\n"
]
}
],
"source": [
"simple = NtvSingle(3)\n",
"\n",
"print('simple data : ', simple)\n",
"simple.set_name('value')\n",
"print('simple data with name : ', simple)\n",
"simple.set_name('')\n",
"print('simple data without name : ', simple)"
]
},
{
"cell_type": "markdown",
"id": "663d63e8",
"metadata": {},
"source": [
"## NTV data is typed\n",
"- default type is 'json'\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7ef86113",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Are {':json' : 21} and 21 equivalent ? True\n"
]
}
],
"source": [
"# {':json' : 21} and 21 are equivalent\n",
"number = 21\n",
"typed_number = {':json' : 21}\n",
"comparison = Ntv.obj(typed_number) == Ntv.obj(number)\n",
"print(\"Are {':json' : 21} and 21 equivalent ? \", comparison)"
]
},
{
"cell_type": "markdown",
"id": "2b817705",
"metadata": {},
"source": [
"## Types can be predefined\n",
"- many standard types are included"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "74e2439f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'paris': }\n"
]
},
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# {'paris:point' : [4.1, 40.5] } indicates that the object named 'paris' has geographical coordinates [4.1, 40.5]\n",
"city = Ntv.obj({'paris:point' : [4.1, 40.5] })\n",
"print(city.to_obj(format='obj'))\n",
"\n",
"# another coordinates are available (e.g. line, polygon)\n",
"city = Ntv.obj({':polygon' : [[[0,1], [1,1], [1,0], [0,1]]] })\n",
"city.to_obj(format='obj')"
]
},
{
"cell_type": "markdown",
"id": "9a91d85a",
"metadata": {},
"source": [
"## Types are nested\n",
"- a Type is defined by a name and a Namespace. \n",
"- a Namespace is represented by a string followed by a point. Namespaces may be nested\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "bd0ded13",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\"picardie:fr.\": {\"oise:dep\": \"60\", \"aisne:dep\": \"02\", \"somme:dep\": \"80\", \"hauts de france:reg\": 32}} \n",
"\n",
"{\"oise:fr.dep\": \"60\"}\n"
]
}
],
"source": [
"picardie = NtvList.obj({'picardie:fr.':{'oise:dep': '60', 'aisne:dep': '02', 'somme:dep': '80', 'hauts de france:reg': 32}})\n",
"\n",
"# 'dep' is a type (french department) defined in the Namespace 'fr.' (french types)\n",
"# the default Namespace 'fr.' is aggregated with the 'dep' type\n",
"\n",
"print(picardie, '\\n')\n",
"print(picardie[0])\n",
"\n",
"# '60' is the 'fr.dep' code for the department of Oise\n"
]
},
{
"cell_type": "markdown",
"id": "06c14142",
"metadata": {},
"source": [
"## NTV data is nested\n",
"- A structure with ordered list of NTV entities is available\n",
"- The type can be shared between entities included in the same entity\n",
"- The type defined in a structure is a default type\n",
"- Two NTV structures are equal if the name and the value are equals (the default type can be different)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "58ffe77c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"equivalent structures ? True\n",
"coordinates and date in a typed structure : {\"cities::point\": [{\"paris\": [4.1, 40.5]}, [4.5, 41.2], {\":date\": \"2012-02-15\"}]}\n",
"coordinates and date in an untyped structure : {\"cities\": {\"paris:point\": [4.1, 40.5], \":point\": [4.5, 41.2], \":date\": \"2012-02-15\"}}\n"
]
}
],
"source": [
"# structure definied by JSON-Array\n",
"cities1 = Ntv.obj({'cities': [{'paris:point': [4.1, 40.5]}, {':point': [4.5, 41.2]}]})\n",
"cities2 = Ntv.obj({'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 41.2] ]})\n",
"\n",
"# structure definied by JSON-Object\n",
"cities3 = Ntv.obj({'cities::point': {'paris': [4.1, 40.5], 'lyon': [4.5, 41.2]}})\n",
"cities4 = Ntv.obj({'cities': {'paris:point': [4.1, 40.5], 'lyon:point': [4.5, 41.2]}})\n",
"\n",
"print('equivalent structures ? ', cities1 == cities2 and cities3 == cities4)\n",
"\n",
"# default type in a structure\n",
"cities5 = Ntv.obj({'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 41.2], {':date': '2012-02-15'}]})\n",
"print('coordinates and date in a typed structure : ', cities5)\n",
"cities5.set_type()\n",
"print('coordinates and date in an untyped structure : ', cities5)"
]
},
{
"cell_type": "markdown",
"id": "80207455",
"metadata": {},
"source": [
"## NTV-json format is reversible for NTV entities\n",
"- The entity build with the Json representation of another entity is identical to the original NTV entity"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "05a02ca4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Ntv.obj(simple. to_obj()) == simple and \\\n",
"Ntv.obj(city. to_obj()) == city and \\\n",
"Ntv.obj(cities2.to_obj()) == cities2 and \\\n",
"Ntv.obj(cities4.to_obj()) == cities4"
]
},
{
"cell_type": "markdown",
"id": "48b0c76e",
"metadata": {},
"source": [
"## NTV data can be validated according to their NTVtype\n",
"- It concerns all NTVsingle entities"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "e02c289f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"without error (boolean, list pointer of error): (True, [])\n",
"\n",
"with coordinate error (boolean, list pointer of error): (False, ['cities::point/1'])\n",
"\n",
"with date error (boolean, list pointer of error): (False, ['cities::point/2'])\n"
]
}
],
"source": [
"cities5 = Ntv.obj({'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 41.2], {':date': '2012-02-15'}]})\n",
"print('without error (boolean, list pointer of error): ', cities5.validate())\n",
"\n",
"cities5 = Ntv.obj({'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 241.2], {':date': '2012-02-15'}]})\n",
"print('\\nwith coordinate error (boolean, list pointer of error): ', cities5.validate())\n",
"\n",
"cities5 = Ntv.obj({'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 41.2], {':date': '2012-22-15'}]})\n",
"print('\\nwith date error (boolean, list pointer of error): ', cities5.validate())"
]
},
{
"cell_type": "markdown",
"id": "713ab7db",
"metadata": {},
"source": [
"## Options are available for NTV-json\n",
"- selected values\n",
"- encoded data\n",
"- data format"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "9d1828ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Json format : {'cities::point': [{'paris': [4.1, 40.5]}, [4.5, 41.2]]}\n",
"only values : [[4.1, 40.5], [4.5, 41.2]]\n",
"Json text : {\"cities::point\": [{\"paris\": [4.1, 40.5]}, [4.5, 41.2]]}\n",
"Json binary : b'\\xa1mcities::point\\x82\\xa1eparis\\x82\\xfb@\\x10ffffff\\xfb@D@\\x00\\x00\\x00\\x00\\x00\\x82\\xfb@\\x12\\x00\\x00\\x00\\x00\\x00\\x00\\xfb@D\\x99\\x99\\x99\\x99\\x99\\x9a'\n",
"tuple format : ('NtvList', 'cities', 'point', [('NtvSingle', 'paris', 'point', [4.1, 40.5]), ('NtvSingle', '', 'point', [4.5, 41.2])])\n",
"object format : {'cities::point': [{'paris': }, ]}\n",
"simple Json : {'l-cities-point': ['s-paris-point-[4.1, 40.5]', 's-point-[4.5, 41.2]']}\n",
"Json codes : {'lNT': ['sNT', 'sT']}\n"
]
}
],
"source": [
"print('Json format : ', cities2.to_obj())\n",
"print('only values : ', cities2.to_obj(simpleval=True))\n",
"print('Json text : ', cities2.to_obj(encoded=True))\n",
"print('Json binary : ', cities2.to_obj(format='cbor', encoded=True))\n",
"print('tuple format : ', cities2.to_tuple())\n",
"print('object format : ', cities2.to_obj(format='obj'))\n",
"print('simple Json : ', cities2.to_repr())\n",
"print('Json codes : ', cities2.to_repr(False, False, False)) \n",
"# Codification : first letter: \"s\" (NtvSingle), \"l\" (NtvList), additional letters: 'N' (named), 'T' (typed)"
]
},
{
"cell_type": "markdown",
"id": "05a08bd0",
"metadata": {},
"source": [
"## Changes and comments are managed\n",
"- change proposals and comments can be added to the NTV entities\n",
"- they can be rejected or accepted"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "64290e9e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"are rejected comments deleted ? True\n",
"is 'date1' updated (1965-01-01) ? True\n"
]
}
],
"source": [
"dates_json = {'dates::date': { 'date1':'1964-01-01', 'date2': '1985-02-05', 'date3': '2022-01-21'}}\n",
"dates = Ntv.obj(dates_json)\n",
"\n",
"dates_comment = NtvComment(dates)\n",
"\n",
"dates_comment.add({'list-op': [{'op': 'replace', 'path': 'dates/date1', 'entity': {'date1:date':'1965-01-01'}}], \n",
" 'comment': 'year is not correct'})\n",
"dates_comment.add('everything is correct')\n",
"dates_reject = dates_comment.reject(all_comment=True).ntv\n",
"print('\\nare rejected comments deleted ? ', dates == dates_reject)\n",
"\n",
"dates_accept = dates_comment.accept().ntv\n",
"print(\"is 'date1' updated (1965-01-01) ? \", dates_accept['date1'].val == '1965-01-01')"
]
},
{
"cell_type": "markdown",
"id": "0f5dd615",
"metadata": {},
"source": [
"## JSON-array and JSON-object are equivalent for NtvList\n",
"- if the constraint of the JSON-object (keys are present) is respected"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "483c84a7",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"are NTV entities from a JSON-array or a JSON-object the same ? True \n",
"\n",
"ntv - Array : {'array or object': {'key1': 'value1', 'key2': 'value2'}}\n",
"ntv - Object : {'array or object': {'key1': 'value1', 'key2': 'value2'}}\n",
"\n",
"An NTV entity has JSON-array and JSON-object representation\n"
]
}
],
"source": [
"ntv_lis = Ntv.obj({'array or object': [{'key1': 'value1'}, {'key2': 'value2'}]})\n",
"ntv_obj = Ntv.obj({'array or object': {'key1': 'value1', 'key2': 'value2'}})\n",
"print('are NTV entities from a JSON-array or a JSON-object the same ? ', ntv_lis == ntv_obj, '\\n')\n",
"\n",
"print('ntv - Array : ', ntv_lis.to_obj(ntv_list=True))\n",
"print('ntv - Object : ', ntv_lis.to_obj(ntv_list=False))\n",
"\n",
"print('\\nAn NTV entity has JSON-array and JSON-object representation' )"
]
},
{
"cell_type": "markdown",
"id": "9b94b1d6",
"metadata": {},
"source": [
"## NTV is a tree data structure"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "097b61c4",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"example = { 'example::': {\n",
" \"fruits\": [\n",
" {\"kiwis\": 3, \"mangues:int\": 4, \"pommes\": None },\n",
" {\"panier\": True } ],\n",
" \"legumes::json\": {\n",
" \"patates:string\": \"amandine\",\n",
" \"poireaux\": False },\n",
" \"viandes\": [\"poisson\",{\":string\": \"poulet\"},\"boeuf\"] }}\n",
"\n",
"# tree with value\n",
"Ntv.obj(example).to_mermaid('NTV flowchart', disp=True)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "c68d97a2",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# tree with leaves row\n",
"Ntv.obj(example).to_mermaid('NTV flowchart', disp=True, leaves=True)"
]
},
{
"cell_type": "markdown",
"id": "496f5fd2",
"metadata": {},
"source": [
"## The json representation can be more or less deep"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "110c7bd0",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Ntv.obj({'city': {'paris': [2.3, 48.9], 'lyon': [4.8, 45.8] } }).to_mermaid('full NTV', disp=True)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "6e0f11ad",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Ntv.obj({'city::point': {'paris': [2.3, 48.9], 'lyon': [4.8, 45.8] } }).to_mermaid('mixed JSON-NTV', disp=True)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "360c1b0f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"Ntv.obj({'city:': {'paris': [2.3, 48.9], 'lyon': [4.8, 45.8] } }).to_mermaid('full JSON', disp=True)"
]
},
{
"cell_type": "markdown",
"id": "6a5b7c0e",
"metadata": {},
"source": [
"## NTV entities are compatible with tabular data tools\n",
"- Example with Pandas"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "814da6fe",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1964-01-01\n",
"1 1985-02-05\n",
"2 2022-01-21\n",
"Name: dates, dtype: datetime64[ns] \n",
"\n",
"dates datetime64[ns]\n",
"value int64\n",
"value32 int32\n",
"coord::point object\n",
"names string[python]\n",
"dtype: object\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" dates | \n",
" value | \n",
" value32 | \n",
" coord::point | \n",
" names | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" 1964-01-01 | \n",
" 10 | \n",
" 10 | \n",
" POINT (1 2) | \n",
" john | \n",
"
\n",
" \n",
" | 2 | \n",
" 1985-02-05 | \n",
" 20 | \n",
" 20 | \n",
" POINT (3 4) | \n",
" eric | \n",
"
\n",
" \n",
" | 3 | \n",
" 2022-01-21 | \n",
" 30 | \n",
" 30 | \n",
" POINT (5 6) | \n",
" judith | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" dates value value32 coord::point names\n",
"1 1964-01-01 10 10 POINT (1 2) john\n",
"2 1985-02-05 20 20 POINT (3 4) eric\n",
"3 2022-01-21 30 30 POINT (5 6) judith"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import os\n",
"import sys\n",
"os.path\n",
"sys.path.insert(0, 'C:\\\\Users\\\\phili\\\\github\\\\ntv-pandas')\n",
"import ntv_pandas\n",
"\n",
"field_data = {'dates::datetime': ['1964-01-01', '1985-02-05', '2022-01-21']}\n",
"tab_data = {'index': [1, 2, 3],\n",
" 'dates::datetime': ['1964-01-01', '1985-02-05', '2022-01-21'], \n",
" 'value': [10, 20, 30],\n",
" 'value32::int32': [10, 20, 30],\n",
" 'coord::point': [[1,2], [3,4], [5,6]],\n",
" 'names::string': ['john', 'eric', 'judith']}\n",
"\n",
"field = Ntv.obj({':field': field_data})\n",
"tab = Ntv.obj({':tab' : tab_data})\n",
"\n",
"# the DataFrame Connector is associated with Datatype 'tab' in dicobj \n",
"sr = field.to_obj(format='obj', dicobj={'field': 'SeriesConnec'})\n",
"df = tab.to_obj (format='obj', dicobj={'tab': 'DataFrameConnec'})\n",
"\n",
"# pandas dtype conform to Ntv type\n",
"print(sr, '\\n')\n",
"print(df.dtypes)\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "14f66d41",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"df2 is identical to df ? True\n"
]
}
],
"source": [
"# the dataframe generated from NTV data from a first dataframe is identical to this first dataframe\n",
"df2 = Ntv.obj(df).to_obj(format='obj', dicobj={'tab': 'DataFrameConnec'})\n",
"print('df2 is identical to df ? ', df2.equals(df))\n"
]
},
{
"cell_type": "markdown",
"id": "da9bf49f",
"metadata": {},
"source": [
"## CSV data is enriched with NTV structures\n",
"- Json format is equivalent to CSV format for tabular data"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "bbba5cec",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Json format :\n",
" {\"index\": [1, 2, 3], \"dates::date\": [\"1964-01-01\", \"1985-02-05\", \"2022-01-02\"], \"value\": [10, 20, 30], \"value32::int32\": [10, 20, 30], \"coord::point\": [[1, 2], [3, 4], {\":datetime\": \"2022-01-02T10:00:00\"}], \"names::string\": [\"john\", \"eric\", \"judith\"]}\n",
"\n",
"CSV format :\n",
" index,dates::date,value,value32::int32,coord::point,names::string\n",
"1,\"\"\"1964-01-01\"\"\",10,10,\"[1, 2]\",\"\"\"john\"\"\"\n",
"2,\"\"\"1985-02-05\"\"\",20,20,\"[3, 4]\",\"\"\"eric\"\"\"\n",
"3,\"\"\"2022-01-02\"\"\",30,30,\"{\"\":datetime\"\": \"\"2022-01-02T10:00:00\"\"}\",\"\"\"judith\"\"\"\n",
"\n",
"is NTV entity from CSV identical to the initial NTV entity ? True\n"
]
}
],
"source": [
"import csv\n",
"tab = {'index': [1, 2, 3],\n",
" 'dates::date': ['1964-01-01', '1985-02-05', date(2022, 1,2)], \n",
" 'value': [10, 20, 30],\n",
" 'value32::int32': [10, 20, 30],\n",
" 'coord::point': [[1,2], [3,4], datetime(2022, 1,2, 10)],\n",
" 'names::string': ['john', 'eric', 'judith']}\n",
"ntv_tab = Ntv.obj(tab)\n",
"print('Json format :\\n', ntv_tab)\n",
"to_csv('test.csv', ntv_tab)\n",
"f = open('test.csv', 'r')\n",
"print('\\nCSV format :\\n', f.read())\n",
"ntv_tab2 = from_csv('test.csv', single_tab=False)\n",
"print('is NTV entity from CSV identical to the initial NTV entity ? ', ntv_tab == ntv_tab2)"
]
},
{
"cell_type": "markdown",
"id": "878677ad",
"metadata": {},
"source": [
"## NTV data can be created from entities of all types\n",
"- the conversions are made from the defined connectors"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "57b9d9d6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"simple representation : {\"dataframe::tab\": [{\"index\": [1, 2, 3], \"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"], \"value\": [10, 20, 30], \"value32::int32\": [10, 20, 30], \"coord::point\": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], \"names::string\": [\"john\", \"eric\", \"judith\"]}, {\"index\": [1, 2, 3], \"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"], \"value\": [10, 20, 30], \"value32::int32\": [10, 20, 30], \"coord::point\": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], \"names::string\": [\"john\", \"eric\", \"judith\"]}], \"series::field\": [[{\"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"]}]], \"coord::point\": [[1, 2], [3, 4], [5, 6]], \"name\": \"walter\"} \n",
"\n",
"json representation :\n",
" {\"dataframe::tab\": [{\"index\": [1, 2, 3], \"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"], \"value\": [10, 20, 30], \"value32::int32\": [10, 20, 30], \"coord::point\": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], \"names::string\": [\"john\", \"eric\", \"judith\"]}, {\"index\": [1, 2, 3], \"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"], \"value\": [10, 20, 30], \"value32::int32\": [10, 20, 30], \"coord::point\": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], \"names::string\": [\"john\", \"eric\", \"judith\"]}], \"series::field\": [[{\"dates::datetime\": [\"1964-01-01T00:00:00.000\", \"1985-02-05T00:00:00.000\", \"2022-01-21T00:00:00.000\"]}]], \"coord::point\": [[1, 2], [3, 4], [5, 6]], \"name\": \"walter\"} \n",
"\n",
"data without conversion eg \"name\" :\n",
" walter \n",
"\n",
"data with conversion eg \"dataframe\" :\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" dates | \n",
" value | \n",
" value32 | \n",
" coord::point | \n",
" names | \n",
"
\n",
" \n",
" \n",
" \n",
" | 1 | \n",
" 1964-01-01 | \n",
" 10 | \n",
" 10 | \n",
" POINT (1 2) | \n",
" john | \n",
"
\n",
" \n",
" | 2 | \n",
" 1985-02-05 | \n",
" 20 | \n",
" 20 | \n",
" POINT (3 4) | \n",
" eric | \n",
"
\n",
" \n",
" | 3 | \n",
" 2022-01-21 | \n",
" 30 | \n",
" 30 | \n",
" POINT (5 6) | \n",
" judith | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" dates value value32 coord::point names\n",
"1 1964-01-01 10 10 POINT (1 2) john\n",
"2 1985-02-05 20 20 POINT (3 4) eric\n",
"3 2022-01-21 30 30 POINT (5 6) judith"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ntv = Ntv.obj({'dataframe::tab': [df, df2], 'series::field': [sr], 'coord::point': [[1,2], [3,4], [5,6]], 'name': 'walter'}, True)\n",
"print('simple representation : ', repr(ntv), '\\n')\n",
"\n",
"# ntv is converted into json data\n",
"print('json representation :\\n', ntv, '\\n')\n",
"\n",
"# ntv is converted into objects according to the chosen connectors\n",
"data = ntv.to_obj(format='obj', dicobj={'tab': 'DataFrameConnec', 'field': 'SeriesConnec'})\n",
"print('data without conversion eg \"name\" :\\n', data['name'], '\\n')\n",
"print('data with conversion eg \"dataframe\" :')\n",
"data['dataframe::tab'][0] # 'tab' type is converted into DataFrame object "
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "68ff94da",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"dict_keys(['dataframe::tab', 'series::field', 'coord::point', 'name'])\n",
"walter\n"
]
}
],
"source": [
"print(type(data))\n",
"print(data.keys())\n",
"print(data['name'])"
]
},
{
"cell_type": "markdown",
"id": "78bf7679",
"metadata": {},
"source": [
"## Template structure can be used\n",
"- 'leaf' nodes (NtvSingle) are updated\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "d9d6bdcf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\"air quality measure\": {\"concentration\": {\"value min\": 3.51, \"value max\": 4.2}, \"unit\": \"mg/m3\", \"coordinates:point\": [4.1, 45.2], \"pollutant\": \"NO2\"}}\n"
]
}
],
"source": [
"air_quality = Ntv.obj({'air quality measure':{'concentration': {'value min': 0, 'value max': 0}, \n",
" 'unit': '', \n",
" 'coordinates:point': None,\n",
" 'pollutant': ''}})\n",
"\n",
"measure = [3.51, 4.2, 'mg/m3', [4.1, 45.2], 'NO2']\n",
"air_quality.set_value(measure)\n",
"\n",
"print(air_quality)"
]
},
{
"cell_type": "markdown",
"id": "2d040ab3-cbea-4e0a-822e-6a055f938a76",
"metadata": {},
"source": [
"## NTVtype can be extended\n",
"\n",
"- example of quantities with unit"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "f8dac363-9136-4c03-a30d-ac8f7f1dfe21",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"float kg\n"
]
}
],
"source": [
"mass = Ntv.obj({'mass:float[kg]': 25})\n",
"\n",
"print(mass.ntv_type.base_name, mass.ntv_type.extension)"
]
},
{
"cell_type": "markdown",
"id": "df51515d",
"metadata": {},
"source": [
"## Custom types are allowed\n",
"For example:\n",
"- object defined by a list of parameters\n",
"- object defined by a list of key/values."
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "2c313b51",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"measurement : {\":$sensor\": [3.51, 4.2, \"mg/m3\", [4.1, 45.2]]}\n",
"infos : 4.2 [4.1, 45.2] $sensor \n",
"\n",
"personage : {\"main Breaking Bad character:$character\": {\"surname\": \"white\", \"first name\": \"walter\", \"alias\": \"heisenberg\"}}\n",
"infos : heisenberg $character\n"
]
}
],
"source": [
"measurement = Ntv.obj({':$sensor': [3.51, 4.2, 'mg/m3', [4.1, 45.2]]})\n",
"\n",
"print('measurement : ', measurement)\n",
"print('infos : ', measurement.val[1], measurement.val[3], measurement.type_str, '\\n')\n",
"\n",
"person = Ntv.obj({'main Breaking Bad character:$character': {'surname': 'white', 'first name': 'walter', 'alias': 'heisenberg'}})\n",
"\n",
"print('personage : ', person)\n",
"print('infos : ', person.val['alias'], person.type_str)"
]
},
{
"cell_type": "markdown",
"id": "6e60672d",
"metadata": {},
"source": [
"## NTV can be used with custom entities\n",
"- custom data is integrated with two conversion methods : to_obj_ntv, to_json_ntv\n",
"\n",
"Example :\n",
"- custom class : Sensor\n",
"- custom type : '$sensor'\n",
"- custom conversion class : SensorConnec"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "028082b4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"representation class Sensor :\n",
" Sensor(name='sensor1', measurement=4.2, unit='mg/l', coord=[4, 42], date=datetime.datetime(2021, 2, 5, 0, 0)) \n",
"\n"
]
}
],
"source": [
"from dataclasses import dataclass\n",
"import datetime\n",
"\n",
"# custom classes\n",
"@dataclass\n",
"class Sensor:\n",
" name: str\n",
" measurement: float\n",
" unit: str\n",
" coord: list\n",
" date: datetime.datetime\n",
"\n",
"# custom data\n",
"val1 = Sensor('sensor1', 4.2, 'mg/l', [4,42], datetime.datetime(2021, 2, 5))\n",
"val2 = Sensor('sensor1', 5.1, 'mg/l', [4,42], datetime.datetime(2021, 2, 10))\n",
"val3 = [\"sensor2\", 4.2, \"mg/l\", [5, 42], '2021-04-06']\n",
"\n",
"# simple value\n",
"print('representation class Sensor :\\n', val1, '\\n')"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "690cde2f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"representation NTV (to_json_ntv) :\n",
" {\":$sensor\": [\"sensor1\", 4.2, \"mg/l\", [4, 42], \"2021-02-05T00:00:00\"]} \n",
"\n",
"reflexivity of conversion ? True \n",
"\n",
"'first test' representation NTV :\n",
" {\"campaign\": \"first test\", \"result::$sensor\": [[\"sensor1\", 4.2, \"mg/l\", [4, 42], \"2021-02-05T00:00:00\"], [\"sensor1\", 5.1, \"mg/l\", [4, 42], \"2021-02-10T00:00:00\"], [\"sensor2\", 4.2, \"mg/l\", [5, 42], \"2021-04-06\"]], \"date measure:datetime\": \"2012-01-10\"} \n",
"\n",
"'first test' representation with object :\n",
" Sensor(name='sensor1', measurement=4.2, unit='mg/l', coord=[4, 42], date=datetime.datetime(2021, 2, 5, 0, 0))\n"
]
}
],
"source": [
"class SensorConnec(NtvConnector):\n",
"\n",
" clas_obj = 'Sensor'\n",
" clas_typ = '$sensor'\n",
"\n",
" @staticmethod\n",
" def to_obj_ntv(ntv_value, **kwargs):\n",
" '''convert ntv_value into the return object'''\n",
" return Sensor(*ntv_value[0:4], datetime.datetime.fromisoformat(ntv_value[4]))\n",
" \n",
" @staticmethod\n",
" def to_json_ntv(self, name=None, typ=None):\n",
" ''' convert object into the NTV entity (value, name, type)'''\n",
" return ([self.name, self.measurement, self.unit, self.coord, self.date.isoformat()], None, '$sensor')\n",
"\n",
"ntv1 = Ntv.obj(val1)\n",
"print('representation NTV (to_json_ntv) :\\n',ntv1, '\\n')\n",
"val1_bis = ntv1.to_obj(format='obj')\n",
"print('reflexivity of conversion ? ', val1_bis == val1, '\\n')\n",
"\n",
"# assembly value\n",
"first_test = {'campaign': 'first test', 'result::$sensor': [val1, val2, val3], 'date measure:datetime': '2012-01-10'}\n",
"first_test_ntv = Ntv.obj(first_test)\n",
"print(\"'first test' representation NTV :\\n\",first_test_ntv, '\\n') \n",
"\n",
"first_test_obj = first_test_ntv.to_obj(format='obj', dicobj={'$sensor': 'SensorConnec'})\n",
"print(\"'first test' representation with object :\\n\", first_test_obj[\"result::$sensor\"][0])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}