# Mapping

Mappings are predefined configuration files that encode how to transform a specific data source into Resources that follow the template of a target _Type_.

This notebook demonstrates the `DictionaryMapping`, which is based on a JSON structure that represents the target structure and on Python code that applies the desired transformations to the data source.

In [2]:
from kgforge.core import KnowledgeGraphForge

In [3]:
forge = KnowledgeGraphForge("../../configurations/demo-forge.yml")

## Imports

In [4]:
from kgforge.core import Resource

In [5]:
from kgforge.specializations.mappings import DictionaryMapping

## Data

In [6]:
scientists = [
    {
        "id": 123,
        "name": "Marie Curie",
        "gender": "female",
        "middle_name": "Salomea",
    },
    {
        "id": 456,
        "name": "Albert Einstein",
        "gender": "male",
        "middle_name": "(missing)",
    },
]

## Mapping data to the Knowledge Graph Schema

### basics

In [7]:
forge.template("Association")

<info> DemoModel does not distinguish values and constraints in templates for now.
<info> DemoModel does not automatically include nested schemas for now.
{
    type: Association
    agent:
    {
        type: Person
        name: hasattr
    }
}


In [8]:
mapping_simple = DictionaryMapping("""
    type: Association
    agent:
    {
        type: Person
        name: x.name
    }
""")

In [9]:
resources_simple = forge.map(scientists, mapping_simple)

In [10]:
print(resources_simple[0])

{
    type: Association
    agent:
    {
        type: Person
        name: Marie Curie
    }
}
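
To make the idea behind rules such as `x.name` concrete, here is a minimal pure-Python sketch of evaluating a nested set of rules against one record. It is not kgforge's implementation: the `apply_rules` helper and the use of lambdas in place of rule strings are assumptions made for illustration.

```python
from types import SimpleNamespace


def apply_rules(record, rules):
    """Build a nested dict by applying each rule to one source record."""
    x = SimpleNamespace(**record)  # allows attribute access such as x.name
    result = {}
    for key, rule in rules.items():
        if isinstance(rule, dict):
            # Nested structure: recurse with the same source record.
            result[key] = apply_rules(record, rule)
        elif callable(rule):
            # Dynamic rule: computed from the record.
            result[key] = rule(x)
        else:
            # Constant rule: copied as-is (e.g. the type).
            result[key] = rule
    return result


# Hypothetical rule set mirroring the simple mapping.
rules = {
    "type": "Association",
    "agent": {"type": "Person", "name": lambda x: x.name},
}
mapped = apply_rules({"id": 123, "name": "Marie Curie"}, rules)
print(mapped)
```

Constants such as `type: Association` are copied unchanged, while expressions in `x` are evaluated once per record.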


### missing values

In [11]:
mapping_na = DictionaryMapping("""
    type: Association
    agent:
    {
        type: Person
        name: x.name
        additionalName: x.middle_name
    }
""")

In [12]:
print(forge.map(scientists[1], mapping_na))

{
    type: Association
    agent:
    {
        type: Person
        additionalName: (missing)
        name: Albert Einstein
    }
}


In [13]:
print(forge.map(scientists[1], mapping_na, na="(missing)"))

{
    type: Association
    agent:
    {
        type: Person
        name: Albert Einstein
    }
}
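
The effect of the `na` argument seen in the two outputs above can be sketched in plain Python as dropping every property whose value equals the sentinel. The `drop_missing` helper is hypothetical, and kgforge may handle missing values differently internally; only the observable result is taken from the notebook.

```python
def drop_missing(mapped, na):
    """Return a copy of a mapped dict without properties equal to `na`."""
    cleaned = {}
    for key, value in mapped.items():
        if isinstance(value, dict):
            # Apply the same filtering to nested structures.
            cleaned[key] = drop_missing(value, na)
        elif value != na:
            cleaned[key] = value
    return cleaned


einstein = {
    "type": "Association",
    "agent": {
        "type": "Person",
        "additionalName": "(missing)",
        "name": "Albert Einstein",
    },
}
cleaned = drop_missing(einstein, na="(missing)")
print(cleaned)
```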


### multiple mappings

In [14]:
mapping_person = DictionaryMapping("""
    id: forge.format("identifier", "persons", x.id)
    type: Person
    name: x.name
""")

In [15]:
mapping_association = DictionaryMapping("""
    type: Association
    agent: forge.format("identifier", "persons", x.id)
""")

In [16]:
resources_graph = forge.map(scientists, [mapping_person, mapping_association])

In [17]:
print(resources_graph[0])

{
    id: https://kg.example.ch/persons/123
    type: Person
    name: Marie Curie
}


In [18]:
print(resources_graph[1])

{
    type: Association
    agent: https://kg.example.ch/persons/123
}
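
The two printed results suggest that, when several mappings are given, the output is a flat list in which each record yields one resource per mapping, in the order the mappings were passed. The sketch below reproduces that ordering in plain Python. The function names, the `BASE` constant, and the identifier formatter are assumptions for illustration; the real formatter is defined in the forge configuration.

```python
BASE = "https://kg.example.ch"  # taken from the identifiers printed above


def person_id(record):
    # Hypothetical stand-in for forge.format("identifier", "persons", x.id).
    return f"{BASE}/persons/{record['id']}"


def to_person(record):
    return {"id": person_id(record), "type": "Person", "name": record["name"]}


def to_association(record):
    # References the person by identifier instead of nesting it.
    return {"type": "Association", "agent": person_id(record)}


def map_all(records, mappers):
    """Apply every mapper to every record; records vary slowest."""
    return [mapper(record) for record in records for mapper in mappers]


records = [
    {"id": 123, "name": "Marie Curie"},
    {"id": 456, "name": "Albert Einstein"},
]
graph = map_all(records, [to_person, to_association])
print(graph[0])
print(graph[1])
```

Because both mappings derive the identifier from the same `id` field, the Association points at the Person created from the same record.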


### managed mappings

In [19]:
forge.sources()

Data sources with managed mappings:
   - allen-cell-types-database
   - scientists-database


In [20]:
forge.mappings("scientists-database")

Managed mappings for the data source per entity type and mapping type:
   - Association:
        * DictionaryMapping


In [21]:
mapping = forge.mapping("Association", "scientists-database")

In [22]:
resources = forge.map(scientists, mapping, na="(missing)")

In [23]:
type(resources)

list

In [24]:
type(resources[0])

kgforge.core.resource.Resource

In [25]:
print(mapping)

{
    type: Association
    agent:
    {
        id: forge.format("identifier", "persons", x.id)
        type: Person
        additionalName: x.middle_name
        gender: forge.resolve(x.gender, scope="terms")
        name: x.name
    }
    distribution: forge.attach(f"../../data/scientists-database/{'_'.join(x.name.lower().split())}.txt")
}


In [26]:
print(resources[0])

{
    type: Association
    agent:
    {
        id: https://kg.example.ch/persons/123
        type: Person
        additionalName: Salomea
        gender:
        {
            id: http://purl.obolibrary.org/obo/PATO_0000383
            label: female
        }
        name: Marie Curie
    }
    distribution: LazyAction(operation=Store.upload, args=['../../data/scientists-database/marie_curie.txt'])
}
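
The `distribution` rule derives a file path from the scientist's name. The expression can be checked in isolation with plain Python; the `attachment_path` wrapper is a hypothetical helper, while the expression inside it is copied from the mapping printed above.

```python
def attachment_path(name):
    """Lowercase the name, split on whitespace, and join with underscores."""
    return f"../../data/scientists-database/{'_'.join(name.lower().split())}.txt"


print(attachment_path("Marie Curie"))
print(attachment_path("Albert Einstein"))
```

Since `str.split()` without arguments splits on any run of whitespace, extra or surrounding spaces in a name do not produce empty segments in the file name.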


In [27]:
# forge.register(resources)

## Managing mappings

In [28]:
filepath = "mappings/scientists-database/DictionaryMapping/Association.hjson"

### saving

In [29]:
mapping.save(filepath)

### tracking & sharing changes

In [30]:
# ! cd mappings/scientists-database/DictionaryMapping

In [31]:
# ! git add Association.hjson

In [32]:
# ! git commit -m "Add Association mapping"

In [33]:
# ! git push

### loading

In [34]:
loaded = DictionaryMapping.load(filepath)

In [35]:
# loaded == mapping
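
The round trip that the commented-out comparison hints at can be sketched without kgforge. The `TextMapping` class below is a hypothetical stand-in that stores its rules as text and compares by content; it is an assumption about how such a check might look, not a description of how `DictionaryMapping` implements saving, loading, or equality.

```python
import tempfile
from pathlib import Path


class TextMapping:
    """Hypothetical mapping whose rules are kept as plain text."""

    def __init__(self, rules):
        self.rules = rules

    def __eq__(self, other):
        # Two mappings are equal when their rule text is identical.
        return isinstance(other, TextMapping) and self.rules == other.rules

    def save(self, path):
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)  # create missing folders
        path.write_text(self.rules, encoding="utf-8")

    @classmethod
    def load(cls, path):
        return cls(Path(path).read_text(encoding="utf-8"))


original = TextMapping("type: Association\nagent: x.name\n")
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "mappings" / "Association.hjson"
    original.save(target)
    restored = TextMapping.load(target)

round_trip_ok = restored == original
print(round_trip_ok)
```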