{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 基于Gremlin的交互式查询\n", "Gremlin 是图数据领域最流行的查询语言之一,就好比关系型数据库里的 SQL 一样。接下来,我们将通过一些例子来说明 Gremlin 是如何执行图查询的。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Install graphscope package if you are NOT in the Playground\n", "\n", "!pip3 install graphscope" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 数据集\n", "MODERN,[tinkerpop](https://tinkerpop.apache.org/docs/current/tutorials/getting-started/)提供的示例数据集,由6个顶点和6条边组成。这是图中包含的所有边:\n", "[(1,3),(1,2),(1,4),(4,5),(4,3),(6,3)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the graphscope module and load the modern graph\n", "\n", "import graphscope\n", "from graphscope.dataset import load_modern_graph\n", "\n", "graphscope.set_option(show_log=False) # enable logging\n", "\n", "modern_graph = load_modern_graph()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Greate GIE engine with Gremlin\n", "\n", "interactive = graphscope.gremlin(modern_graph)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 两点之间的遍历\n", "两点之间的遍历是图数据领域十分常见的查询场景。例如,为了搞清楚v1和v2/v3之间的关系,我们可以这样写一条 gremlin 语句:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q1 = interactive.execute(\n", " 'g.V().has(\"id\", 1).as(\"u\").out().has(\"id\", eq(2).or(eq(3))).as(\"v\").select(\"u\", \"v\").by(\"id\")'\n", ").all().result()\n", "\n", "for p in q1:\n", " print(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "接下来展示一条在社交网络场景中常用的查询,例如如何找到两个不同用户之间的共同特点,用户1昵称为\"marko\",另一个为\"peter\"。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q2 = interactive.execute(\n", " 'g.V().has(\"name\", \"marko\").out().where(__.in().has(\"name\", \"peter\")).valueMap()'\n", ").all().result()\n", "\n", "for p in q2:\n", " print(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 度的中心性\n", "度的中心性是衡量每个顶点邻接边数量的指标,在处理大数据时有重要意义,这里有一些示例:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q3 = interactive.execute(\"g.V().group().by().by(bothE().count())\").all()\n", "for p in q3:\n", " print(p)\n", "\n", "q4 = interactive.execute(\"g.V().group().by().by(inE().count())\").all()\n", "for p in q4:\n", " print(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 环的检测\n", "环的检测是图查询在商业领域中的另一重要应用,环通常被认为是欺诈行为的发生。这里有一个示例展示 gremlin 如何被用于发现图上的环。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "q5 = interactive.execute(\n", " 'g.V().as(\"u\").repeat(out().simplePath()).times(2).where(out().where(eq(\"u\"))).count()'\n", ")\n", "print(q5.one())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }