Find the companies where X has worked, and their roles at those companies
"Which companies has X worked for, and in what roles?"
Reviewing this question, we can identify several entities, attributes and relationships. We have the concept of a company, a person (X), and a role. We also have a relationship: a person worked for a company.
Company and Person are both entities, which we'll model as vertices with appropriate labels. For now, we'll assume a direct relationship between a person and a company: a person WORKED_FOR a company. We'll make role an attribute of this relationship.
Adding in a few properties – firstName and lastName for a person, name for a company – we end up with the following data model:
Over the course of this exercise we'll see role move to different places in the model several times. At this stage it's a simple attribute of a relationship. In later steps we'll see it promoted to a vertex in its own right.
As far as our current use case is concerned, role appears to be a simple value type, much like colour, height or weight. If it were a complex value type with several fields – such as address – or if there were some explicit structural relations between values – as there are in a category hierarchy – we would consider making it a vertex from the outset.
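To make the shape of this model concrete, here is a minimal plain-Python sketch that is independent of Neptune and Gremlin. The class names `Person`, `Company` and `WorkedFor` are illustrative assumptions that mirror the labels above. The point is that `role` is held as a plain string on the relationship, as we would expect for a simple value type.

```python
from dataclasses import dataclass

# Hypothetical in-memory mirror of the data model; not part of the Neptune setup.

@dataclass(frozen=True)
class Person:
    id: str
    firstName: str
    lastName: str

@dataclass(frozen=True)
class Company:
    id: str
    name: str

@dataclass(frozen=True)
class WorkedFor:
    person: Person    # vertex the relationship starts from
    company: Company  # vertex the relationship points to
    role: str         # simple value, stored as an attribute of the relationship

li = Person('p-3', 'Li', 'Juan')
example_corp = Company('c-1', 'Example Corp')
edge = WorkedFor(li, example_corp, 'Analyst')
```

If role later needed several fields or explicit relations between its values, it would stop fitting in a single `str` field, which is the signal for promoting it to a vertex.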
We'll now create a sample dataset in line with our model. We'll include enough data to ensure that our queries have to exclude some portions of the graph in order to return a correct result.
%load_ext ipython_unittest
%run '../util/neptune.py'
neptune.clear()
g = neptune.graphTraversal()
# Note: `id` is assumed to be Gremlin's T.id token made available by neptune.py,
# not Python's built-in id() function.
(g.
    # People
    addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').
    addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').
    addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').
    addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').
    addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').
    # Companies
    addV('Company').property(id,'c-1').property('name','Example Corp').
    addV('Company').property(id,'c-2').property('name','AnyCompany').
    # WORKED_FOR edges, with role stored as an edge property
    V('p-1').addE('WORKED_FOR').to(V('c-1')).property('role','Principal Analyst').
    V('p-2').addE('WORKED_FOR').to(V('c-1')).property('role','Senior Analyst').
    V('p-3').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').
    V('p-4').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').
    V('p-5').addE('WORKED_FOR').to(V('c-2')).property('role','Manager').
    V('p-3').addE('WORKED_FOR').to(V('c-2')).property('role','Associate Analyst').
    toList())
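To answer this question, we'll have to perform the following steps: start at the vertex for person X; follow its outgoing WORKED_FOR edges, keeping a reference to each edge; move to the company vertex at the other end of each edge; and return the company's name together with the role stored on the edge.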
%%unittest
results = None # TODO
assert results == [{'company': 'Example Corp', 'role': 'Analyst'},
{'company': 'AnyCompany', 'role': 'Associate Analyst'}]
%%unittest
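# Start at person p-3 and label each outgoing WORKED_FOR edge so its role can be read later
results = (g.V('p-3').
    outE('WORKED_FOR').as_('e').
    otherV().                            # move to the company at the far end of the edge
    project('company', 'role').
        by('name').                      # company name, read from the current vertex
        by(select('e').values('role')).  # role, read from the labelled edge
    toList())

assert results == [{'company': 'Example Corp', 'role': 'Analyst'},
                   {'company': 'AnyCompany', 'role': 'Associate Analyst'}]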
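As a rough cross-check of the traversal above, the same logic can be sketched over plain Python data, with no connection to Neptune. The `companies` dictionary, the `edges` list and the `companies_and_roles` helper are hypothetical stand-ins for the sample dataset and the query. They only illustrate how starting from one person excludes everyone else's edges.

```python
# Hypothetical in-memory copy of the sample data: company names by id,
# and WORKED_FOR edges as (person id, company id, role) tuples.
companies = {'c-1': 'Example Corp', 'c-2': 'AnyCompany'}
edges = [
    ('p-1', 'c-1', 'Principal Analyst'),
    ('p-2', 'c-1', 'Senior Analyst'),
    ('p-3', 'c-1', 'Analyst'),
    ('p-4', 'c-1', 'Analyst'),
    ('p-5', 'c-2', 'Manager'),
    ('p-3', 'c-2', 'Associate Analyst'),
]

def companies_and_roles(person_id):
    """Return one {'company', 'role'} dict per edge leaving person_id."""
    return [
        {'company': companies[company_id], 'role': role}
        for (pid, company_id, role) in edges
        if pid == person_id
    ]

results = companies_and_roles('p-3')
```

Four of the six edges belong to other people, so a correct answer for `p-3` has to leave them out, which is why the sample dataset includes them.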