# Converting AIF To Pandas
This notebook shows how to convert an AIDA TA1 AIF file into Pandas DataFrames so that it is easier to work with programmatically.

In [1]:
import numpy as np
import pandas as pd
import os
import io
from IPython.display import display, HTML, Image

### Before you start
All the examples in this notebook read from the `sample_data/aida` folder, so each cell can be run independently.

We create a `results` folder inside it to hold the output of each KGTK operation. This way, if a cell produces an error, you can still continue browsing the notebook.

In [2]:
!mkdir -p sample_data/aida/results

### Convert AIF triples to TSV KGTK format

In [3]:
!head sample_data/aida/HC00001DO.ttl.nt

<http://www.isi.edu/gaia/entities/e34874a6-a857-4f14-8aee-9947d3e9caaf> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#Entity> .
<http://www.isi.edu/gaia/entities/e34874a6-a857-4f14-8aee-9947d3e9caaf> <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#informativeJustification> _:b0 .
<http://www.isi.edu/gaia/entities/e34874a6-a857-4f14-8aee-9947d3e9caaf> <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#justifiedBy> _:b1 .
<http://www.isi.edu/gaia/entities/e34874a6-a857-4f14-8aee-9947d3e9caaf> <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#privateData> _:g0 .
_:g0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#PrivateData> .
_:g0 <https://tac.nist.gov/tracks/SM-KBP/2019/ontologies/InterchangeOntology#jsonContent> "{\"fileType\":\"en\"}"^^<http://www.w3.org/2001/XMLSc

**Define prefixes to compress the URIs**

In [4]:
pd.read_csv("sample_data/aida/aida-namespaces.tsv", delimiter='\t')

Unnamed: 0,node1,label,node2
0,entity,prefix_expansion,http://www.isi.edu/gaia/entities/
1,relation,prefix_expansion,http://www.isi.edu/gaia/relations/
2,event,prefix_expansion,http://www.isi.edu/gaia/events/
3,rdf,prefix_expansion,http://www.w3.org/1999/02/22-rdf-syntax-ns#
4,ont,prefix_expansion,https://tac.nist.gov/tracks/SM-KBP/2019/ontolo...
5,rpi,prefix_expansion,http://www.rpi.edu/
6,xml-schema-type,prefix_expansion,http://www.w3.org/2001/XMLSchema#
7,columbia,prefix_expansion,http://www.columbia.edu/
8,isi,prefix_expansion,http://www.isi.edu/
9,isi1,prefix_expansion,www.isi.edu/
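
To make the role of the namespace file concrete, here is a minimal sketch of prefix compression. The `compress_uri` helper and the `NAMESPACES` dictionary are hypothetical names used only for illustration; they are not part of KGTK, and KGTK's own matching rules may differ. The expansions are copied from the table above.

```python
# Hypothetical helper, not KGTK's implementation: illustrates prefix compression.
NAMESPACES = {
    "entity": "http://www.isi.edu/gaia/entities/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "isi": "http://www.isi.edu/",
}


def compress_uri(uri, namespaces=NAMESPACES):
    """Replace the longest matching namespace expansion with 'prefix:'."""
    best_prefix, best_expansion = None, ""
    for prefix, expansion in namespaces.items():
        if uri.startswith(expansion) and len(expansion) > len(best_expansion):
            best_prefix, best_expansion = prefix, expansion
    if best_prefix is None:
        return uri  # no known namespace: leave the URI untouched
    return f"{best_prefix}:{uri[len(best_expansion):]}"


compressed = compress_uri(
    "http://www.isi.edu/gaia/entities/e34874a6-a857-4f14-8aee-9947d3e9caaf"
)
print(compressed)
```

Choosing the longest matching expansion matters here because `isi` is a prefix of `entity`; a first-match rule could produce `isi:gaia/entities/...` instead.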


**Import the AIF triples**

In [5]:
!kgtk import-ntriples -i sample_data/aida/HC00001DO.ttl.nt \
  --namespace-file sample_data/aida/aida-namespaces.tsv \
  --namespace-id-use-uuid True \
  --local-namespace-use-uuid False \
  --local-namespace-prefix _ \
  --newnode-use-uuid True  \
  / sort \
  > sample_data/aida/results/HC00001DO.ttl.tsv

In [6]:
!tail sample_data/aida/results/HC00001DO.ttl.tsv

rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	ont:system	rpi:informativejustification
rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	rdf:type	ont:TextJustification
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:confidence	_:g8310
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:endOffsetInclusive	4539
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:sourceDocument	"HC00001DO"
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:source	"HC00002Z8"
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:startOf

**Reified information is cumbersome to work with**

In [7]:
ta1 = pd.read_csv("sample_data/aida/results/HC00001DO.ttl.tsv", delimiter='\t')
display(HTML(ta1.loc[ta1.node1 =='_:g10'].to_html()))

Unnamed: 0,node1,label,node2
16723,_:g10,ont:confidence,_:g11
16724,_:g10,ont:justifiedBy,_:g12
16725,_:g10,ont:system,rpi1:
16726,_:g10,rdf:object,entity:c72e94f4-e4d1-45de-966f-b52cf4d6de5e
16727,_:g10,rdf:predicate,ldc:Transaction.TransferOwnership_Artifact
16728,_:g10,rdf:subject,event:9100be93-931d-4ee0-89aa-50e7d06f773e
16729,_:g10,rdf:type,rdf:Statement


## Simplify the KG

**What we want: an easy-to-understand representation that is close to the diagrams people want to see**

<img src="https://raw.githubusercontent.com/usc-isi-i2/kgtk/dev/examples/images/aida-event-graph.png" width=700/>

**Undo the reification, and put the justifications as annotations on the semantic edges**

In [10]:
!kgtk unreify-rdf-statements -i sample_data/aida/results/HC00001DO.ttl.tsv \
  / sort --columns 1,2 \
  >  sample_data/aida/results/HC00001DO.ttl.unreified.tsv
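
As a rough illustration of what unreification does, the sketch below collapses the `_:g10` statement shown earlier into a single direct edge using plain Pandas. This only mirrors the idea; it is an assumption about the overall effect and not how `kgtk unreify-rdf-statements` is implemented.

```python
import pandas as pd

# Copy of the reified statement _:g10 from the output shown earlier.
reified = pd.DataFrame(
    [
        ("_:g10", "ont:confidence", "_:g11"),
        ("_:g10", "ont:justifiedBy", "_:g12"),
        ("_:g10", "ont:system", "rpi1:"),
        ("_:g10", "rdf:object", "entity:c72e94f4-e4d1-45de-966f-b52cf4d6de5e"),
        ("_:g10", "rdf:predicate", "ldc:Transaction.TransferOwnership_Artifact"),
        ("_:g10", "rdf:subject", "event:9100be93-931d-4ee0-89aa-50e7d06f773e"),
        ("_:g10", "rdf:type", "rdf:Statement"),
    ],
    columns=["node1", "label", "node2"],
)

# Turn each rdf:Statement node into one direct edge whose id is the statement node.
rows = []
for stmt_id, group in reified.groupby("node1"):
    props = dict(zip(group["label"], group["node2"]))
    if props.get("rdf:type") == "rdf:Statement":
        rows.append(
            {
                "node1": props["rdf:subject"],
                "label": props["rdf:predicate"],
                "node2": props["rdf:object"],
                "id": stmt_id,
            }
        )
direct = pd.DataFrame(rows, columns=["node1", "label", "node2", "id"])

# Everything else (confidence, justification, system) stays attached to the id.
structural = ["rdf:subject", "rdf:predicate", "rdf:object", "rdf:type"]
annotations = reified[~reified["label"].isin(structural)].reset_index(drop=True)

print(direct)
print(annotations)
```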

**Events now have direct edges to the role fillers (orange diamonds); the justifications are attached to the object named in the id column**

In [11]:
unreified = pd.read_csv("sample_data/aida/results/HC00001DO.ttl.unreified.tsv", delimiter='\t')
unreified.loc[unreified.node1 == 'event:fd2323ad-b9c6-4b57-9228-8579b52475c8']

Unnamed: 0,node1,label,node2,id
17054,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ldc:Life.Die_Place,entity:584ecaed-6832-489c-8e45-2e63a460ab90,_:g3162
17055,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ldc:Life.Die_Victim,entity:10147d53-19e3-4b20-b144-02077ba0f2ac,_:g2654
17056,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ldc:Life.Die_Victim,entity:fbf6e4a1-54e2-423c-92e2-75b2f2aab53b,_:g8555
17057,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:informativeJustification,_:b1233,
17058,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b1113,
17059,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b1366,
17060,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b1367,
17061,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b301,
17062,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b368,
17063,event:fd2323ad-b9c6-4b57-9228-8579b52475c8,ont:justifiedBy,_:b642,


**The relations are also objects with direct links to the entities (green diamonds)**

In [12]:
unreified.loc[unreified.node1 == 'relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d']

Unnamed: 0,node1,label,node2,id
54164,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,ldc:Physical.LocatedNear_EntityOrFiller,entity:5c64e1a6-d96a-41ef-b584-2c3c30757bf4,_:g6297
54165,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,ldc:Physical.LocatedNear_Place,entity:584ecaed-6832-489c-8e45-2e63a460ab90,_:g530
54166,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,ont:informativeJustification,_:b397,
54167,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,ont:justifiedBy,_:b1096,
54168,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,ont:system,rpi1:,
54169,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,rdf:type,ldc:Physical.LocatedNear,isi:gaia/assertions/4d0acbd2-f7b3-49f8-abc4-84...
54170,relation:4b8f6334-dbc1-4186-8d9e-a04d864d9a9d,rdf:type,ont:Relation,


## Create files to work in TA2

**We want Pandas-friendly files, with a single row for each entity, relation, and event.**

For initial analysis, let's remove justifications, etc.

In [13]:
!kgtk filter \
  --invert \
  -p ';ont:justifiedBy,ont:privateData,ont:system,ont:informativeJustification;' -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv \
  > sample_data/aida/results/HC00001DO.ttl.unreified.nojust.tsv
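
For readers more at home in Pandas, the same kind of inverted label filter can be sketched with `isin`. The edge table below is toy data with made-up ids; only the label names come from the command above.

```python
import pandas as pd

# Toy edge table; the ids are made up for illustration.
edges = pd.DataFrame(
    [
        ("event:v1", "ldc:Life.Die_Place", "entity:e1"),
        ("event:v1", "ont:justifiedBy", "_:b1"),
        ("event:v1", "ont:informativeJustification", "_:b2"),
        ("event:v1", "ont:system", "rpi1:"),
        ("event:v1", "rdf:type", "ont:Event"),
    ],
    columns=["node1", "label", "node2"],
)

# Labels to drop, matching the pattern passed to `kgtk filter --invert`.
drop_labels = [
    "ont:justifiedBy",
    "ont:privateData",
    "ont:system",
    "ont:informativeJustification",
]

# Keep every edge whose label is NOT in the drop list.
nojust = edges[~edges["label"].isin(drop_labels)].reset_index(drop=True)
print(nojust)
```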

In [14]:
!tail sample_data/aida/results/HC00001DO.ttl.unreified.nojust.tsv

rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	ont:system	rpi:informativejustification	
rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	rdf:type	ont:TextJustification	
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:confidence	_:g8310	
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:endOffsetInclusive	4539	
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:source	"HC00002Z8"	
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:sourceDocument	"HC00001DO"	
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:s

**Split into a separate file for each of entities, relations and events**

In [15]:
!kgtk filter -p ';rdf:type;ont:Entity' -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv > sample_data/aida/results/HC00001DO.entity_ids.tsv
!kgtk filter -p ';rdf:type;ont:Event' -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv > sample_data/aida/results/HC00001DO.event_ids.tsv
!kgtk filter -p ';rdf:type;ont:Relation' -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv > sample_data/aida/results/HC00001DO.relation_ids.tsv

In [17]:
# Get all entities from the unreified file
!kgtk ifexists \
    --input-keys node1 \
    --filter-keys node1 \
    --filter-on sample_data/aida/results/HC00001DO.entity_ids.tsv \
    -i sample_data/aida/results/HC00001DO.ttl.unreified.nojust.tsv \
  / sort --columns 1,2 \
  > sample_data/aida/results/HC00001DO.entities.tsv

# Get all events from the unreified file
!kgtk ifexists \
    --input-keys node1 \
    --filter-keys node1 \
    --filter-on sample_data/aida/results/HC00001DO.event_ids.tsv \
    -i sample_data/aida/results/HC00001DO.ttl.unreified.nojust.tsv \
  / sort --columns 1,2 \
  > sample_data/aida/results/HC00001DO.events.tsv

# Get all relations from the unreified file
!kgtk ifexists \
    --input-keys node1 \
    --filter-keys node1 \
    --filter-on sample_data/aida/results/HC00001DO.relation_ids.tsv \
    -i sample_data/aida/results/HC00001DO.ttl.unreified.nojust.tsv \
  / sort --columns 1,2 \
  > sample_data/aida/results/HC00001DO.relations.tsv
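
The two-step pattern above (first collect the ids of a given type, then keep every edge whose `node1` is one of those ids) is a semi-join. A minimal Pandas sketch of the same idea, on toy data with made-up ids, might look like this:

```python
import pandas as pd

# Toy edge table; ids and the name value are made up for illustration.
edges = pd.DataFrame(
    [
        ("entity:e1", "rdf:type", "ont:Entity"),
        ("entity:e1", "ont:hasName", '"Name A"'),
        ("event:v1", "rdf:type", "ont:Event"),
        ("event:v1", "ldc:Life.Die_Victim", "entity:e1"),
    ],
    columns=["node1", "label", "node2"],
)

# Step 1: ids of all nodes typed as ont:Entity (the role of the *_ids.tsv files).
is_entity = (edges["label"] == "rdf:type") & (edges["node2"] == "ont:Entity")
entity_ids = edges.loc[is_entity, "node1"]

# Step 2: keep edges whose node1 is one of those ids, sorted by node1 then label.
entities = (
    edges[edges["node1"].isin(entity_ids)]
    .sort_values(["node1", "label"])
    .reset_index(drop=True)
)
print(entities)
```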

**Little hack: replace ont:hasName and ont:textValue with label**

In [18]:
!sed 's/ont:hasName/label/' sample_data/aida/results/HC00001DO.entities.tsv \
  | sed 's/ont:textValue/label/' \
  > sample_data/aida/results/HC00001DO.entities.renamed.tsv 
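
Note that `sed` rewrites the first match anywhere on a line, not only in the label column. If that is a concern, a Pandas version can restrict the rename to the `label` column. The sketch below uses toy rows; `ont:note` is a hypothetical label included only to show a value that should stay untouched.

```python
import pandas as pd

# Toy rows; the third row has the text "ont:hasName" inside node2 on purpose.
entities = pd.DataFrame(
    [
        ("entity:e1", "ont:hasName", '"Name A"'),
        ("_:g1", "ont:textValue", '"some text"'),
        ("entity:e1", "ont:note", '"mentions ont:hasName"'),  # hypothetical label
    ],
    columns=["node1", "label", "node2"],
)

# Exact-match replacement applied to the label column only.
renamed = entities.assign(
    label=entities["label"].replace(
        {"ont:hasName": "label", "ont:textValue": "label"}
    )
)
print(renamed)
```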

**Remove the type edges as they do not provide useful info (e.g., we know, by construction, the entities file contains entities)**

In [20]:
!kgtk filter \
  --invert \
  -p ';;ont:Entity' -i sample_data/aida/results/HC00001DO.entities.renamed.tsv \
  > sample_data/aida/results/HC00001DO.entities.notype.tsv
!kgtk filter \
  --invert \
  -p ';;ont:Relation' -i sample_data/aida/results/HC00001DO.relations.tsv \
  > sample_data/aida/results/HC00001DO.relations.notype.tsv
!kgtk filter \
  --invert \
  -p ';;ont:Event' -i sample_data/aida/results/HC00001DO.events.tsv \
  > sample_data/aida/results/HC00001DO.events.notype.tsv

## Let's make a file that has one entity per row
**Start by lifting the labels into a column**

In [21]:
!kgtk lift --suppress-empty-columns True -i sample_data/aida/results/HC00001DO.entities.notype.tsv / sort > sample_data/aida/results/HC00001DO.entities.labels.tsv
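
Conceptually, lifting copies the value of each `label` edge into extra `node1;label` and `node2;label` columns on the edges that mention that node. The Pandas sketch below shows that idea on toy data with made-up ids and names; it is an approximation of the effect, not a reimplementation of `kgtk lift`.

```python
import pandas as pd

# Toy edges: one label edge and one event edge that points at the labeled entity.
edges = pd.DataFrame(
    [
        ("entity:e1", "label", '"Place A"'),
        ("event:v1", "ldc:Life.Die_Place", "entity:e1"),
    ],
    columns=["node1", "label", "node2"],
)

# Lookup from node id to its label value (first label wins if there are several).
is_label = edges["label"] == "label"
labels = edges[is_label].drop_duplicates("node1").set_index("node1")["node2"]

# Add the lifted columns; nodes without a label get NaN.
lifted = edges.assign(
    **{
        "node1;label": edges["node1"].map(labels),
        "node2;label": edges["node2"].map(labels),
    }
)
print(lifted)
```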

In [22]:
entities = pd.read_csv("sample_data/aida/results/HC00001DO.entities.labels.tsv", delimiter='\t')
entities

Unnamed: 0,node1,label,node2,id,node1;label,node2;label
0,_:b0,ont:confidence,_:g7653,,,
1,_:b0,ont:endOffsetInclusive,4680,,,
2,_:b0,ont:privateData,_:g7654,,,
3,_:b0,ont:sourceDocument,HC00001DO,,,
4,_:b0,ont:source,HC00002Z8,,,
...,...,...,...,...,...,...
56017,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:source,HC00002Z8,,,
56018,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:startOffset,4527,,,
56019,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:system,rpi:informativejustification,,,
56020,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,rdf:type,ont:TextJustification,,,


**Now lift the LDC link targets into a separate column. This is a bit complicated because of the extra level of reification.**

In [24]:
!kgtk lift \
    --suppress-empty-columns True \
    --label-value ont:linkTarget \
    --lift-suffix ';temp' \
    --label-file sample_data/aida/results/HC00001DO.ttl.unreified.tsv \
    -i sample_data/aida/results/HC00001DO.entities.labels.tsv \
  / lift \
    --suppress-empty-columns True \
    --label-value ont:link \
    --lift-suffix ';linkTarget' \
    --node2-name 'node2;temp' \
  / sort \
  / remove-columns  -c 'node2;temp' \
  > sample_data/aida/results/HC00001DO.entities.labels.linktargets.tsv

In [25]:
entities = pd.read_csv("sample_data/aida/results/HC00001DO.entities.labels.linktargets.tsv", delimiter='\t')
entities

Unnamed: 0,node1,label,node2,id,node1;label,node2;label,node1;temp,node1;linkTarget,node2;linkTarget
0,_:b0,ont:confidence,_:g7653,,,,,,
1,_:b0,ont:endOffsetInclusive,4680,,,,,,
2,_:b0,ont:privateData,_:g7654,,,,,,
3,_:b0,ont:sourceDocument,HC00001DO,,,,,,
4,_:b0,ont:source,HC00002Z8,,,,,,
...,...,...,...,...,...,...,...,...,...
55849,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:source,HC00002Z8,,,,,,
55850,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:startOffset,4527,,,,,,
55851,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,ont:system,rpi:informativejustification,,,,,,
55852,rpi:NominalInformativeMention/fbc758f0-d19f-4f...,rdf:type,ont:TextJustification,,,,,,
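
The table above still has one row per edge. As a sketch of how the one-entity-per-row goal could be reached inside Pandas, the example below groups edges by node and property and spreads the properties into columns. The data is a toy with made-up ids and names, and joining multiple values with `|` is an arbitrary choice made for this illustration.

```python
import pandas as pd

# Toy edges; entity:e2 has two label values to show how duplicates are merged.
edges = pd.DataFrame(
    [
        ("entity:e1", "label", '"Person A"'),
        ("entity:e1", "ont:link", "_:g1"),
        ("entity:e2", "label", '"Place B"'),
        ("entity:e2", "label", '"Place B alt"'),
    ],
    columns=["node1", "label", "node2"],
)

# One row per node1, one column per property; repeated values are joined with "|".
wide = (
    edges.groupby(["node1", "label"])["node2"]
    .agg("|".join)
    .unstack("label")
)
print(wide)
```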


**Fraction of rows with a non-null value in each column, including labels and link targets**

In [26]:
((entities.shape[0]-entities.isnull().sum())/entities.shape[0]).round(3)

node1               1.000
label               1.000
node2               1.000
id                  0.090
node1;label         0.015
node2;label         0.007
node1;temp          0.005
node1;linkTarget    0.006
node2;linkTarget    0.004
dtype: float64
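
The expression used above computes, for each column, the share of rows that are not null. An equivalent and shorter form is `notna().mean()`; the sketch below checks on a toy DataFrame that both give the same result.

```python
import pandas as pd

# Toy frame: node1 is always filled, node1;label is filled in 1 of 4 rows.
toy = pd.DataFrame(
    {
        "node1": ["a", "b", "c", "d"],
        "node1;label": ["x", None, None, None],
    }
)

# Form used in the notebook: (rows - nulls) / rows.
manual = ((toy.shape[0] - toy.isnull().sum()) / toy.shape[0]).round(3)

# Shorter equivalent: mean of the boolean "is not null" mask.
short = toy.notna().mean().round(3)

print(short)
```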

**Distribution of node2 values (types and other edge targets)**

In [27]:
entities['node2'].value_counts()

rpi1:                    6294
ont:Confidence           4011
ont:PrivateData          3391
rpi:fileType             2161
{\fileType\":\"en\"}"    2161
                         ... 
_:g6019                     1
_:g9082                     1
_:g787                      1
_:g1093                     1
_:g5691                     1
Name: node2, Length: 13836, dtype: int64
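
The oddly shaped `{\fileType\":\"en\"}"` entry in the counts above is probably a side effect of how `pd.read_csv` treats double quotes by default when it meets a quoted string with escaped inner quotes. A minimal sketch, assuming that is the cause, shows how `quoting=csv.QUOTE_NONE` keeps such a value exactly as written in the file:

```python
import csv
import io

import pandas as pd

# One KGTK-style line whose node2 is a quoted string with escaped inner quotes.
raw_value = '"{\\"fileType\\":\\"en\\"}"'
tsv = "node1\tlabel\tnode2\n_:g0\tont:jsonContent\t" + raw_value + "\n"

# Default parsing interprets the double quotes and alters the value.
default = pd.read_csv(io.StringIO(tsv), delimiter="\t")

# QUOTE_NONE treats quotes as ordinary characters, so the value is preserved.
unquoted = pd.read_csv(io.StringIO(tsv), delimiter="\t", quoting=csv.QUOTE_NONE)

print(default.loc[0, "node2"])
print(unquoted.loc[0, "node2"])
```

Whether to preserve the quotes depends on the downstream use; keeping them makes it possible to tell string literals apart from node identifiers.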

**Add the labels of the entities to the event file**

In [28]:
!kgtk filter \
  -p ';label;' -i sample_data/aida/results/HC00001DO.entities.renamed.tsv \
  > sample_data/aida/results/HC00001DO.entities.renamed.labels.tsv

In [29]:
!kgtk join --left-file sample_data/aida/results/HC00001DO.events.notype.tsv --right-file sample_data/aida/results/HC00001DO.entities.renamed.labels.tsv \
  --left-join \
  --left-file-join-columns node2 \
  --right-file-join-columns node1 \
  / lift --suppress-empty-columns \
  > sample_data/aida/results/HC00001DO.events.notype.entity-labels.tsv
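
The intent of the join and lift above is to attach, to each event edge, the label of the entity it points to. In Pandas the same intent can be sketched as a left merge. This mirrors the goal rather than KGTK's exact row-level behavior, and the ids and names are made up.

```python
import pandas as pd

# Toy event edges pointing at two entities.
events = pd.DataFrame(
    [
        ("event:v1", "ldc:Life.Die_Victim", "entity:e1"),
        ("event:v1", "ldc:Life.Die_Place", "entity:e2"),
    ],
    columns=["node1", "label", "node2"],
)

# Toy label edges; only entity:e1 has a label.
entity_labels = pd.DataFrame(
    [("entity:e1", "label", '"Person A"')],
    columns=["node1", "label", "node2"],
)

# Reshape the label edges into a lookup keyed on the events' node2 column.
lookup = entity_labels[["node1", "node2"]].rename(
    columns={"node1": "node2", "node2": "node2;label"}
)

# Left merge keeps every event edge, with NaN where no label is known.
joined = events.merge(lookup, on="node2", how="left")
print(joined)
```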

In [30]:
events = pd.read_csv("sample_data/aida/results/HC00001DO.events.notype.entity-labels.tsv", delimiter='\t')
display(HTML(events[:10].to_html()))

Unnamed: 0,node1,label,node2,id,node1;label,node2;label
0,_:b0,ont:confidence,_:g7653,,,
1,_:b0,ont:endOffsetInclusive,4680,,,
2,_:b0,ont:privateData,_:g7654,,,
3,_:b0,ont:source,HC00002Z8,,,
4,_:b0,ont:sourceDocument,HC00001DO,,,
5,_:b0,ont:startOffset,4679,,,
6,_:b0,ont:system,rpi1:,,,
7,_:b0,rdf:type,ont:TextJustification,,,
8,_:b1,ont:confidence,_:b1530,,,
9,_:b1,ont:endOffsetInclusive,4680,,,


In [31]:
events['node1'].value_counts()[:10]

entity:9911ecfc-e6b9-41a4-a488-30075e439aa8    74
entity:fcb78e77-4962-4fca-977b-aea84bfa3ddd    61
entity:8e97e2c0-5ed1-4ae3-81bc-f66cedd2d8e5    46
entity:c32bb2f7-eb58-4612-b101-dbfcee3e84ae    43
entity:79969b4c-cf9e-4eb7-8123-c7714e087454    42
event:519cf108-2005-4d3d-b82c-a4309db8992e     40
event:dab890d7-aa46-4e1f-9309-e7c2834a164d     38
entity:5d6629ee-be36-4445-8a35-3be47b8ee97a    36
entity:bb729095-2592-4e3d-aa40-cf1a48b01383    36
entity:d1dcefce-badf-4948-bfcf-5d33116fa12c    35
Name: node1, dtype: int64

### Work with clusters

In [32]:
!kgtk filter -p ';ont:clusterMember;' -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv > sample_data/aida/results/HC00001DO.ttl.clusters.tsv

In [34]:
!kgtk join --left-file sample_data/aida/results/HC00001DO.ttl.clusters.tsv --right-file sample_data/aida/results/HC00001DO.entities.notype.tsv \
  --left-file-join-columns node2 \
  --right-file-join-columns node1 \
  > sample_data/aida/results/HC00001DO.cluster.ids.entities.tsv 
!kgtk join --left-file sample_data/aida/results/HC00001DO.ttl.clusters.tsv --right-file sample_data/aida/results/HC00001DO.relations.notype.tsv \
  --left-file-join-columns node2 \
  --right-file-join-columns node1 \
  > sample_data/aida/results/HC00001DO.cluster.ids.relations.tsv 
!kgtk join --left-file sample_data/aida/results/HC00001DO.ttl.clusters.tsv --right-file sample_data/aida/results/HC00001DO.events.notype.tsv \
  --left-file-join-columns node2 \
  --right-file-join-columns node1 \
  > sample_data/aida/results/HC00001DO.cluster.ids.events.tsv 

In [36]:
!kgtk ifexists \
  --input-keys node1 \
  --filter-keys node1 \
  --filter-on sample_data/aida/results/HC00001DO.cluster.ids.entities.tsv \
  -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv \
  > sample_data/aida/results/HC00001DO.cluster.entities.tsv 
!kgtk ifexists \
  --input-keys node1 \
  --filter-keys node1 \
  --filter-on sample_data/aida/results/HC00001DO.cluster.ids.relations.tsv \
  -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv \
  > sample_data/aida/results/HC00001DO.cluster.relations.tsv 
!kgtk ifexists \
  --input-keys node1 \
  --filter-keys node1 \
  --filter-on sample_data/aida/results/HC00001DO.cluster.ids.events.tsv \
  -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv \
  > sample_data/aida/results/HC00001DO.cluster.events.tsv 

### Create an edge file with ids to load in Wikidata SPARQL and browse using SQID

In [37]:
!kgtk add-id -i sample_data/aida/results/HC00001DO.ttl.unreified.tsv  > sample_data/aida/results/HC00001DO.ttl.unreified.ids.tsv

In [38]:
!tail sample_data/aida/results/HC00001DO.ttl.unreified.ids.tsv

rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	ont:system	rpi:informativejustification	E51098
rpi:NominalInformativeMention/eec532c4-dc9f-4f42-8a3c-56cd14de8c9d/HC00002Z8/2541/2557/PER	rdf:type	ont:TextJustification	E51099
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:confidence	_:g8310	E51100
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:endOffsetInclusive	4539	E51101
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:source	"HC00002Z8"	E51102
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_CommercialOrganization_NewsAgency	ont:sourceDocument	"HC00001DO"	E51103
rpi:NominalInformativeMention/fbc758f0-d19f-4faa-bb06-d042f7884144/HC00002Z8/4527/4539/ORG_Com
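
The output above suggests that edges without an id receive one of the form `E<number>`. A minimal Pandas sketch of that idea follows; it assumes that existing ids are kept and that new ids come from a simple counter, and the exact numbering KGTK uses may differ.

```python
import pandas as pd

# Toy edges: the first already has an id from unreification, the others do not.
edges = pd.DataFrame(
    {
        "node1": ["event:v1", "event:v1", "entity:e1"],
        "label": ["ldc:Life.Die_Place", "rdf:type", "rdf:type"],
        "node2": ["entity:e1", "ont:Event", "ont:Entity"],
        "id": ["_:g3162", None, None],
    }
)

# Fill only the missing ids with sequential E-prefixed values.
missing = edges["id"].isna()
new_ids = [f"E{n}" for n in range(1, int(missing.sum()) + 1)]
edges.loc[missing, "id"] = new_ids
print(edges)
```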

In [39]:
# Show all results in results folder created in this tutorial
!ls  sample_data/aida/results/

HC00001DO.cluster.entities.tsv
HC00001DO.cluster.events.tsv
HC00001DO.cluster.ids.entities.tsv
HC00001DO.cluster.ids.events.tsv
HC00001DO.cluster.ids.relations.tsv
HC00001DO.cluster.relations.tsv
HC00001DO.entities.labels.linktargets.tsv
HC00001DO.entities.labels.tsv
HC00001DO.entities.notype.tsv
HC00001DO.entities.renamed.labels.tsv
HC00001DO.entities.renamed.tsv
HC00001DO.entities.tsv
HC00001DO.entity_ids.tsv
HC00001DO.event_ids.tsv
HC00001DO.events.notype.entity-labels.tsv
HC00001DO.events.notype.tsv
HC00001DO.events.tsv
HC00001DO.relation_ids.tsv
HC00001DO.relations.notype.tsv
HC00001DO.relations.tsv
HC00001DO.ttl.clusters.tsv
HC00001DO.ttl.tsv
HC00001DO.ttl.unreified.ids.tsv
HC00001DO.ttl.unreified.nojust.tsv
HC00001DO.ttl.unreified.tsv


In [None]:
# Read KGTK results into lines and directly into Pandas
# lines = !kgtk filter -p ';prefix_expansion;' -i ta1/HC00001DO/HC00001DO.ttl.tsv
# pd.read_csv(io.StringIO('\n'.join(lines)), delimiter='\t')
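
The commented-out cell above hints at reading KGTK output straight into Pandas without an intermediate file. Outside IPython the `lines = !kgtk ...` syntax is not available, so the sketch below uses a hard-coded list as a stand-in for the captured lines; the two data rows are taken from the namespace table shown earlier.

```python
import io

import pandas as pd

# Stand-in for the list of output lines an IPython `!kgtk ...` call would return.
lines = [
    "node1\tlabel\tnode2",
    "rdf\tprefix_expansion\thttp://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rpi\tprefix_expansion\thttp://www.rpi.edu/",
]

# Join the lines back into one TSV document and parse it in memory.
df = pd.read_csv(io.StringIO("\n".join(lines)), delimiter="\t")
print(df)
```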