{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Interactive Query with Gremlin\n", "Gremlin is one of the most popular query languages for graph databases, much as SQL is for relational databases. Here, we give some examples to illustrate how Gremlin helps navigate the vertices and edges of a graph.\n", "\n", "## Dataset\n", "We use MODERN, a toy graph from [TinkerPop](https://tinkerpop.apache.org/docs/current/reference/) that consists of 6 vertices and 6 edges. We extend it by adding more complex relationships among the vertices. The extended graph has the following edges:\n", "\n", "\\[(1,3),(1,2),(1,4),(4,5),(4,3),(6,3),(2,4),(4,1)\\]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import graphscope\n", "from graphscope.framework.graph import Graph\n", "from graphscope.framework.loader import Loader\n", "import vineyard\n", "\n", "k8s_volumes = {\n", " \"data\": {\n", " \"type\": \"hostPath\",\n", " \"field\": {\n", " \"path\": \"/testingdata/modern_graph_2\", # Path in host\n", " \"type\": \"Directory\"\n", " },\n", " \"mounts\": {\n", " \"mountPath\": \"/home/jovyan/datasets/modern_graph_2\", # Path in pods\n", " \"readOnly\": True\n", " }\n", " }\n", "}\n", "graphscope.set_option(show_log=True) # enable logging\n", "session = graphscope.session(k8s_volumes=k8s_volumes) # create a session\n", "modern_graph = session.load_from(\n", " vertices={\n", " \"person\": (Loader(\"/home/jovyan/datasets/modern_graph_2/person.csv\", delimiter=\"|\", header_row=True), [\"name\", (\"age\", \"int\")], \"id\"),\n", " \"software\": (Loader(\"/home/jovyan/datasets/modern_graph_2/software.csv\", delimiter=\"|\", header_row=True), [\"name\", \"lang\"], \"id\"),\n", " },\n", " edges={\n", " \"knows\": [Loader(\"/home/jovyan/datasets/modern_graph_2/knows.csv\", delimiter=\"|\"), [], (0, \"person\"), (1, \"person\")],\n", " \"created\": [Loader(\"/home/jovyan/datasets/modern_graph_2/created.csv\", delimiter=\"|\"), [], 
(0, \"person\"), (1, \"software\")],\n", " }, generate_eid=False)\n", "interactive = session.gremlin(modern_graph)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Between Vertices\n", "Traversing between two particular vertices is a common task in graph databases. For example, to find out how v1 is related to v2 and v3, a Gremlin query can be written like this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q1 = interactive.execute(\"g.V().has(\\\"id\\\", 1).as(\\\"u\\\").out().has(\\\"id\\\", eq(2).or(eq(3))).as(\\\"v\\\").select(\\\"u\\\", \\\"v\\\").by(\\\"id\\\")\")\n", "for p in q1:\n", " print(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next example is common in social network scenarios: finding what two people have in common, here one named \"marko\" and the other named \"peter\"." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q2 = interactive.execute(\"g.V().has(\\\"name\\\", \\\"marko\\\").out().where(__.in().has(\\\"name\\\", \\\"peter\\\")).valueMap()\")\n", "for p in q2:\n", " print(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Degree Centrality\n", "Degree centrality measures the number of edges incident to each vertex, a statistic that is useful in large-scale data processing. 
The examples below compute the total degree and the in-degree of each vertex:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q3 = interactive.execute(\"g.V().group().by().by(bothE().count())\")\n", "for p in q3:\n", " print(p[0])\n", "q4 = interactive.execute(\"g.V().group().by().by(inE().count())\")\n", "for p in q4:\n", " print(p[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cycle Detection\n", "Cycle detection is another important application of graph queries, especially in commerce, where cycles are often treated as fraudulent patterns. The following example shows how Gremlin helps detect cycles in a graph: it counts the two-step simple paths whose end vertex has an outgoing edge back to the start vertex." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q5 = interactive.execute(\"g.V().as(\\\"u\\\").repeat(out().simplePath()).times(2).where(out().where(eq(\\\"u\\\"))).count()\")\n", "print(q5.one())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Close session" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "session.close()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.6" } }, "nbformat": 4, "nbformat_minor": 4 }