{ "cells": [ { "cell_type": "markdown", "id": "aed2fb5a-6baa-4d4e-8783-dac94c61cfcf", "metadata": {}, "source": [ "![Arango Logo](https://github.com/QubitPi/QubitPi/blob/master/img/arango-logo-with-text.png?raw=true \"Error loading QubitPi/img/arango-logo-with-text.png\")" ] }, { "cell_type": "markdown", "id": "687ce9a7-2f5c-4fcd-b718-212b6a491f99", "metadata": {}, "source": [ "Arango AQL Tutorial\n", "===================\n", "\n", "This is a tutorial on ArangoDB’s query language AQL, built around a small dataset of characters from the novel and fantasy drama television series Game of Thrones (as of season 1). It includes character traits in two languages, some family relations, and last but not least a small set of filming locations, which makes for an interesting mix of data to work with." ] }, { "cell_type": "markdown", "id": "652215c7-85f1-49de-a1b6-c5e1106a3175", "metadata": {}, "source": [ "Setup\n", "-----" ] }, { "cell_type": "code", "execution_count": 1, "id": "3d478254-aebe-443a-8cb0-d41630227aa4", "metadata": {}, "outputs": [], "source": [ "import json\n", "import requests\n", "import sys\n", "import oasis # this is a local module \"notebooks/arangodb/oasis.py\"\n", "import time\n", "\n", "from pyArango.connection import *\n", "from arango import ArangoClient\n", "\n", "from IPython.display import JSON" ] }, { "cell_type": "markdown", "id": "e1172472-687d-4d78-9841-bbb58812c493", "metadata": {}, "source": [ "Let's Create the temporary database:\n", "\n", "*Note: this notebook uses a temporary instance which will be autodeleted!*" ] }, { "cell_type": "code", "execution_count": 2, "id": "7a18b278-9c86-4882-9566-597acc063bc4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Reusing cached credentials.\n", "https://tutorials.arangodb.cloud:8529\n", "Username: TUT1cihi64f6xwx1p0jfghlo\n", "Password: TUT57a9pqx8wd466m36fm69hu\n", "Database: TUTq8tm1n269ba2h54k0fr26\n" ] } ], "source": [ "login = oasis.getTempCredentials(\n", " tutorialName=\"AQLCrudTutorial\",\n", " credentialProvider='https://tutorials.arangodb.cloud:8529/_db/_system/tutorialDB/tutorialDB'\n", ")\n", "\n", "database = oasis.connect_python_arango(login)\n", "aql = database.aql\n", "\n", "print(\"https://{}:{}\".format(login[\"hostname\"], login[\"port\"]))\n", "print(\"Username: \" + login[\"username\"])\n", "print(\"Password: \" + login[\"password\"])\n", "print(\"Database: \" + login[\"dbName\"])" ] }, { "cell_type": "markdown", "id": "a13cd968-942b-4242-9663-547631f60319", "metadata": {}, "source": [ "We should be able to see an URL, a Username, a Password, and a Database printed above. Feel free to click the URL which will take us to the ArangoDB UI. On the login page, use the Username and Password above as thelogin credential, then on the next page choose the printed Database as the selected database. The UI should look like the screenshot below:\n", "\n", "![Arango UI](https://github.com/QubitPi/QubitPi/blob/master/img/arango-ui.png?raw=true \"Error loading QubitPi/img/arango-ui.png\")" ] }, { "cell_type": "markdown", "id": "56b92ec8-9479-476d-a1fc-f283e861e73d", "metadata": {}, "source": [ "Creating Collections\n", "--------------------\n", "\n", "Before we can insert documents (data) with AQL into database, we need a place to put them in - a __Collection__. Collections can be managed via the web interface, arangosh or a driver. It is not possible to do so with AQL however." ] }, { "cell_type": "code", "execution_count": 3, "id": "4eaf7e8a-56b5-4a9b-99f5-b5b782bcf5f3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "We have 'Characters' collection now.\n" ] } ], "source": [ "if not database.has_collection(\"Characters\"):\n", " database.create_collection(\"Characters\")\n", "\n", "print(\"We have 'Characters' collection now.\")" ] }, { "cell_type": "markdown", "id": "89116091-e5b6-4f24-8d4d-54ff4a409004", "metadata": {}, "source": [ "Creating and Reading Documents\n", "------------------------------\n", "\n", "The syntax for creating a new [document](https://arango.qubitpi.org/stable/concepts/data-structure/#documents) is `INSERT document INTO collectionName`. The document is an object like we may know it from JavaScript or JSON, which is comprised of attribute key and value pairs. The quotes around the attribute keys are optional in AQL. Keys are always character sequences (strings), whereas attribute values can have different types:\n", "\n", "- null\n", "- boolean (true, false)\n", "- number (integer and floating point)\n", "- string\n", "- array\n", "- object\n", "\n", "Name and surname of the character document we will be inserting are both string values. The alive state uses a boolean. Age is a numeric value. The traits are an array of strings. The entire document is an object." ] }, { "cell_type": "code", "execution_count": 4, "id": "6c16a88e-10e6-48bc-bf03-c567147b3246", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "insert_query = \"\"\"\n", " INSERT {\n", " \"name\": \"Ned\",\n", " \"surname\": \"Stark\",\n", " \"alive\": true,\n", " \"age\": 41,\n", " \"traits\": [\"A\",\"H\",\"C\",\"N\",\"P\"]\n", " } INTO Characters\n", " \"\"\"\n", "\n", "aql.execute(insert_query)" ] }, { "cell_type": "markdown", "id": "3c984999-cab1-4d51-8bd4-d997c7859b41", "metadata": {}, "source": [ "We can also execute all the queries on the ArangoDB Web UI:\n", "\n", "![](./img/aql-on-ui.png)" ] }, { "cell_type": "markdown", "id": "a548a861-df40-483f-b428-061eab029cfd", "metadata": {}, "source": [ "Let's check whether the insert was sucessfull by querying the `Characters` collections. The syntax of the loop is `FOR variableName IN collectionName`." ] }, { "cell_type": "code", "execution_count": 5, "id": "351f7510-cbb5-43b0-8bbb-1970213a2b32", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': '266270044329', '_id': 'Characters/266270044329', '_rev': '_i4WVC5C--A', 'name': 'Ned', 'surname': 'Stark', 'alive': True, 'age': 41, 'traits': ['A', 'H', 'C', 'N', 'P']}\n", "\n" ] } ], "source": [ "all_characters = \"\"\"\n", " FOR character IN Characters\n", " RETURN character\n", " \"\"\"\n", "\n", "query_result = aql.execute(all_characters)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "b07bf6a6-e4b1-421b-bb62-b149098ae4bd", "metadata": {}, "source": [ "For each document in the collection, `character` is assigned a document, which is then returned as per the loop body.\n", "\n", "Each document features the 5 attributes we stored, plus 3 more added by the database system\n", "\n", "1. a unique `_key`, which identifies it within a collection\n", "2. an `_id` which is a computed property - a concatenation of the collection name, a forward slash `/` and the document key. It uniquely identies a document within a database\n", "3. `_rev`, the revision ID managed by the system. Older revisions of a document cannot be accessed.\n", "\n", "Document keys can be provided by the user upon document creation, or a unique value is assigned automatically. It can not be changed later. All 3 system attributes starting with an underscore `_` are read-only." ] }, { "cell_type": "markdown", "id": "654eb588-56c6-4660-b2e1-89994b440b93", "metadata": {}, "source": [ "Next, let us add some more characters. We use the `LET` keyword to define a named variable with an array of objects as value, so `LET variableName = valueExpression` and the expression being a literal array definition like `[ {...}, {...}, ... ]`.\n", "\n", "`FOR variableName IN expression` is used to iterate over each element of the data array. In each loop, one element is assigned to the variable `d`. This variable is then used in the `INSERT` statement instead of a literal object definition. What is does is basically:" ] }, { "cell_type": "code", "execution_count": 6, "id": "b7a79efc-087f-4b2c-91dc-3964196d155f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "insert_query = \"\"\"\n", " LET data = [\n", " { \"name\": \"Robert\", \"surname\": \"Baratheon\", \"alive\": false, \"traits\": [\"A\",\"H\",\"C\"] },\n", " { \"name\": \"Jaime\", \"surname\": \"Lannister\", \"alive\": true, \"age\": 36, \"traits\": [\"A\",\"F\",\"B\"] },\n", " { \"name\": \"Catelyn\", \"surname\": \"Stark\", \"alive\": false, \"age\": 40, \"traits\": [\"D\",\"H\",\"C\"] },\n", " { \"name\": \"Cersei\", \"surname\": \"Lannister\", \"alive\": true, \"age\": 36, \"traits\": [\"H\",\"E\",\"F\"] },\n", " { \"name\": \"Daenerys\", \"surname\": \"Targaryen\", \"alive\": true, \"age\": 16, \"traits\": [\"D\",\"H\",\"C\"] },\n", " { \"name\": \"Jorah\", \"surname\": \"Mormont\", \"alive\": false, \"traits\": [\"A\",\"B\",\"C\",\"F\"] },\n", " { \"name\": \"Petyr\", \"surname\": \"Baelish\", \"alive\": false, \"traits\": [\"E\",\"G\",\"F\"] },\n", " { \"name\": \"Viserys\", \"surname\": \"Targaryen\", \"alive\": false, \"traits\": [\"O\",\"L\",\"N\"] },\n", " { \"name\": \"Jon\", \"surname\": \"Snow\", \"alive\": true, \"age\": 16, \"traits\": [\"A\",\"B\",\"C\",\"F\"] },\n", " { \"name\": \"Sansa\", \"surname\": \"Stark\", \"alive\": true, \"age\": 13, \"traits\": [\"D\",\"I\",\"J\"] },\n", " { \"name\": \"Arya\", \"surname\": \"Stark\", \"alive\": true, \"age\": 11, \"traits\": [\"C\",\"K\",\"L\"] },\n", " { \"name\": \"Robb\", \"surname\": \"Stark\", \"alive\": false, \"traits\": [\"A\",\"B\",\"C\",\"K\"] },\n", " { \"name\": \"Theon\", \"surname\": \"Greyjoy\", \"alive\": true, \"age\": 16, \"traits\": [\"E\",\"R\",\"K\"] },\n", " { \"name\": \"Bran\", \"surname\": \"Stark\", \"alive\": true, \"age\": 10, \"traits\": [\"L\",\"J\"] },\n", " { \"name\": \"Joffrey\", \"surname\": \"Baratheon\", \"alive\": false, \"age\": 19, \"traits\": [\"I\",\"L\",\"O\"] },\n", " { \"name\": \"Sandor\", \"surname\": \"Clegane\", \"alive\": true, \"traits\": [\"A\",\"P\",\"K\",\"F\"] },\n", " { \"name\": \"Tyrion\", \"surname\": \"Lannister\", \"alive\": true, \"age\": 32, \"traits\": [\"F\",\"K\",\"M\",\"N\"] },\n", " { \"name\": \"Khal\", \"surname\": \"Drogo\", \"alive\": false, \"traits\": [\"A\",\"C\",\"O\",\"P\"] },\n", " { \"name\": \"Tywin\", \"surname\": \"Lannister\", \"alive\": false, \"traits\": [\"O\",\"M\",\"H\",\"F\"] },\n", " { \"name\": \"Davos\", \"surname\": \"Seaworth\", \"alive\": true, \"age\": 49, \"traits\": [\"C\",\"K\",\"P\",\"F\"] },\n", " { \"name\": \"Samwell\", \"surname\": \"Tarly\", \"alive\": true, \"age\": 17, \"traits\": [\"C\",\"L\",\"I\"] },\n", " { \"name\": \"Stannis\", \"surname\": \"Baratheon\", \"alive\": false, \"traits\": [\"H\",\"O\",\"P\",\"M\"] },\n", " { \"name\": \"Melisandre\", \"alive\": true, \"traits\": [\"G\",\"E\",\"H\"] },\n", " { \"name\": \"Margaery\", \"surname\": \"Tyrell\", \"alive\": false, \"traits\": [\"M\",\"D\",\"B\"] },\n", " { \"name\": \"Jeor\", \"surname\": \"Mormont\", \"alive\": false, \"traits\": [\"C\",\"H\",\"M\",\"P\"] },\n", " { \"name\": \"Bronn\", \"alive\": true, \"traits\": [\"K\",\"E\",\"C\"] },\n", " { \"name\": \"Varys\", \"alive\": true, \"traits\": [\"M\",\"F\",\"N\",\"E\"] },\n", " { \"name\": \"Shae\", \"alive\": false, \"traits\": [\"M\",\"D\",\"G\"] },\n", " { \"name\": \"Talisa\", \"surname\": \"Maegyr\", \"alive\": false, \"traits\": [\"D\",\"C\",\"B\"] },\n", " { \"name\": \"Gendry\", \"alive\": false, \"traits\": [\"K\",\"C\",\"A\"] },\n", " { \"name\": \"Ygritte\", \"alive\": false, \"traits\": [\"A\",\"P\",\"K\"] },\n", " { \"name\": \"Tormund\", \"surname\": \"Giantsbane\", \"alive\": true, \"traits\": [\"C\",\"P\",\"A\",\"I\"] },\n", " { \"name\": \"Gilly\", \"alive\": true, \"traits\": [\"L\",\"J\"] },\n", " { \"name\": \"Brienne\", \"surname\": \"Tarth\", \"alive\": true, \"age\": 32, \"traits\": [\"P\",\"C\",\"A\",\"K\"] },\n", " { \"name\": \"Ramsay\", \"surname\": \"Bolton\", \"alive\": true, \"traits\": [\"E\",\"O\",\"G\",\"A\"] },\n", " { \"name\": \"Ellaria\", \"surname\": \"Sand\", \"alive\": true, \"traits\": [\"P\",\"O\",\"A\",\"E\"] },\n", " { \"name\": \"Daario\", \"surname\": \"Naharis\", \"alive\": true, \"traits\": [\"K\",\"P\",\"A\"] },\n", " { \"name\": \"Missandei\", \"alive\": true, \"traits\": [\"D\",\"L\",\"C\",\"M\"] },\n", " { \"name\": \"Tommen\", \"surname\": \"Baratheon\", \"alive\": true, \"traits\": [\"I\",\"L\",\"B\"] },\n", " { \"name\": \"Jaqen\", \"surname\": \"H'ghar\", \"alive\": true, \"traits\": [\"H\",\"F\",\"K\"] },\n", " { \"name\": \"Roose\", \"surname\": \"Bolton\", \"alive\": true, \"traits\": [\"H\",\"E\",\"F\",\"A\"] },\n", " { \"name\": \"The High Sparrow\", \"alive\": true, \"traits\": [\"H\",\"M\",\"F\",\"O\"] }\n", " ]\n", " \n", " FOR d IN data\n", " INSERT d INTO Characters\n", " \"\"\"\n", "\n", "aql.execute(insert_query)" ] }, { "cell_type": "markdown", "id": "72da1545-a1f3-434a-b5fb-db1e3bb19c0e", "metadata": {}, "source": [ "As before let's check the `Characters` collection, but this time only return each characters name:" ] }, { "cell_type": "code", "execution_count": 7, "id": "5e824d88-a7f8-42eb-933f-23e55e7d8afc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ned\n", "\n", "Robert\n", "\n", "Jaime\n", "\n", "Catelyn\n", "\n", "Cersei\n", "\n", "Daenerys\n", "\n", "Jorah\n", "\n", "Petyr\n", "\n", "Viserys\n", "\n", "Jon\n", "\n", "Sansa\n", "\n", "Arya\n", "\n", "Robb\n", "\n", "Theon\n", "\n", "Bran\n", "\n", "Joffrey\n", "\n", "Sandor\n", "\n", "Tyrion\n", "\n", "Khal\n", "\n", "Tywin\n", "\n", "Davos\n", "\n", "Samwell\n", "\n", "Stannis\n", "\n", "Melisandre\n", "\n", "Margaery\n", "\n", "Jeor\n", "\n", "Bronn\n", "\n", "Varys\n", "\n", "Shae\n", "\n", "Talisa\n", "\n", "Gendry\n", "\n", "Ygritte\n", "\n", "Tormund\n", "\n", "Gilly\n", "\n", "Brienne\n", "\n", "Ramsay\n", "\n", "Ellaria\n", "\n", "Daario\n", "\n", "Missandei\n", "\n", "Tommen\n", "\n", "Jaqen\n", "\n", "Roose\n", "\n", "The High Sparrow\n", "\n" ] } ], "source": [ "all_characters_names = \"\"\"\n", " FOR character IN Characters\n", " RETURN character.name\n", " \"\"\"\n", "\n", "query_result = aql.execute(all_characters_names)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "b2262b7c-24ac-4ea0-bdbd-9397010dda6b", "metadata": {}, "source": [ "Updating Documents\n", "------------------\n", "\n", "Let's say we need to change the alive attribute of Ned. For this we first identify the `_key` attribute of Ned" ] }, { "cell_type": "code", "execution_count": 8, "id": "dceb820a-9337-4f6b-b970-69670a0bdae9", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_key: 266270044329\n", "\n" ] } ], "source": [ "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN character._key\n", " \"\"\"\n", "\n", "neds_document_key = None\n", "\n", "query_result = aql.execute(find_ned_query)\n", "\n", "for doc in query_result:\n", " print(\"_key: \" + str(doc))\n", " neds_document_key = doc\n", " print()" ] }, { "cell_type": "markdown", "id": "32f98f4f-e50c-4a39-8c66-1adcd89c598d", "metadata": {}, "source": [ "Using `key` we can update an existing document:" ] }, { "cell_type": "code", "execution_count": 9, "id": "b2f89bed-d920-4f5e-ae71-feec461b8ae2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': '266270044329', '_id': 'Characters/266270044329', '_rev': '_i4WVFIK--A', 'name': 'Ned', 'surname': 'Stark', 'alive': False, 'age': 41, 'traits': ['A', 'H', 'C', 'N', 'P']}\n", "\n" ] } ], "source": [ "kill_ned = \"\"\"\n", " UPDATE @key \n", " WITH { alive: false} \n", " IN Characters\n", "\"\"\"\n", "bindVars = {'key': neds_document_key}\n", "aql.execute(kill_ned, batch_size=1, bind_vars=bindVars)\n", "\n", "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(find_ned_query)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "4292df5e-4ebd-46c6-bc2d-d488bfdb224e", "metadata": {}, "source": [ "We could have also replaced the entire document content, using `REPLACE` instead of `UPDATE`:" ] }, { "cell_type": "code", "execution_count": 10, "id": "348a40b4-4c05-420c-bbdc-925a985de900", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': '266270044329', '_id': 'Characters/266270044329', '_rev': '_i4WVGAO--A', 'age': 41, 'alive': False, 'name': 'Ned', 'surname': 'Stark', 'traits': ['A', 'H', 'C', 'N', 'P']}\n", "\n" ] } ], "source": [ "kill_ned = \"\"\"\n", " REPLACE @key WITH {\n", " name: \"Ned\",\n", " surname: \"Stark\",\n", " alive: false,\n", " age: 41,\n", " traits: [\"A\",\"H\",\"C\",\"N\",\"P\"]\n", " } IN Characters\n", "\"\"\"\n", "bindVars = {'key': neds_document_key}\n", "aql.execute(kill_ned, batch_size=1, bind_vars=bindVars)\n", "\n", "find_ned_query = \"\"\"\n", "FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(find_ned_query)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "e0caaf2a-50fc-4226-832a-61f29a6f8789", "metadata": {}, "source": [ "We could again use the `FOR` loop construct from before to update all characters:" ] }, { "cell_type": "code", "execution_count": 11, "id": "9644fab1-a6fc-4ec8-8eed-7053b45b2fb6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'Name': 'Ned', 'Season': 1}\n", "\n", "{'Name': 'Robert', 'Season': 1}\n", "\n", "{'Name': 'Jaime', 'Season': 1}\n", "\n", "{'Name': 'Catelyn', 'Season': 1}\n", "\n", "{'Name': 'Cersei', 'Season': 1}\n", "\n", "{'Name': 'Daenerys', 'Season': 1}\n", "\n", "{'Name': 'Jorah', 'Season': 1}\n", "\n", "{'Name': 'Petyr', 'Season': 1}\n", "\n", "{'Name': 'Viserys', 'Season': 1}\n", "\n", "{'Name': 'Jon', 'Season': 1}\n", "\n", "{'Name': 'Sansa', 'Season': 1}\n", "\n", "{'Name': 'Arya', 'Season': 1}\n", "\n", "{'Name': 'Robb', 'Season': 1}\n", "\n", "{'Name': 'Theon', 'Season': 1}\n", "\n", "{'Name': 'Bran', 'Season': 1}\n", "\n", "{'Name': 'Joffrey', 'Season': 1}\n", "\n", "{'Name': 'Sandor', 'Season': 1}\n", "\n", "{'Name': 'Tyrion', 'Season': 1}\n", "\n", "{'Name': 'Khal', 'Season': 1}\n", "\n", "{'Name': 'Tywin', 'Season': 1}\n", "\n", "{'Name': 'Davos', 'Season': 1}\n", "\n", "{'Name': 'Samwell', 'Season': 1}\n", "\n", "{'Name': 'Stannis', 'Season': 1}\n", "\n", "{'Name': 'Melisandre', 'Season': 1}\n", "\n", "{'Name': 'Margaery', 'Season': 1}\n", "\n", "{'Name': 'Jeor', 'Season': 1}\n", "\n", "{'Name': 'Bronn', 'Season': 1}\n", "\n", "{'Name': 'Varys', 'Season': 1}\n", "\n", "{'Name': 'Shae', 'Season': 1}\n", "\n", "{'Name': 'Talisa', 'Season': 1}\n", "\n", "{'Name': 'Gendry', 'Season': 1}\n", "\n", "{'Name': 'Ygritte', 'Season': 1}\n", "\n", "{'Name': 'Tormund', 'Season': 1}\n", "\n", "{'Name': 'Gilly', 'Season': 1}\n", "\n", "{'Name': 'Brienne', 'Season': 1}\n", "\n", "{'Name': 'Ramsay', 'Season': 1}\n", "\n", "{'Name': 'Ellaria', 'Season': 1}\n", "\n", "{'Name': 'Daario', 'Season': 1}\n", "\n", "{'Name': 'Missandei', 'Season': 1}\n", "\n", "{'Name': 'Tommen', 'Season': 1}\n", "\n", "{'Name': 'Jaqen', 'Season': 1}\n", "\n", "{'Name': 'Roose', 'Season': 1}\n", "\n", "{'Name': 'The High Sparrow', 'Season': 1}\n", "\n" ] } ], "source": [ "season_query = \"\"\"\n", " FOR character IN Characters\n", " UPDATE character WITH { season: 1 } IN Characters\n", "\"\"\"\n", "aql.execute(season_query)\n", "\n", "all_characters_names_season = \"\"\"\n", " FOR character IN Characters\n", " RETURN {\"Name\" : character.name, \"Season\" : character.season}\n", "\"\"\"\n", "query_result = aql.execute(all_characters_names_season)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "0c8de68f-8d29-4cef-b507-782f3450e53c", "metadata": {}, "source": [ "Note, that here we customized the `RETURN` to return a json document consisting of name and season." ] }, { "cell_type": "markdown", "id": "8bb87d24-4f6b-4742-a37f-e97f49de6601", "metadata": {}, "source": [ "Filtering Documents\n", "-------------------\n", "\n", "To find documents that fulfill certain criteria more complex than key equality, there is the `FILTER` operation in AQL, which enables us to formulate arbitrary conditions for documents to match.\n", "\n", "We actually have used a filter condition before to find `Ned`:" ] }, { "cell_type": "code", "execution_count": 12, "id": "d5a25a5c-f31f-41ca-9b18-c522cf9331e5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': '266270044329', '_id': 'Characters/266270044329', '_rev': '_i4WVG26---', 'age': 41, 'alive': False, 'name': 'Ned', 'surname': 'Stark', 'traits': ['A', 'H', 'C', 'N', 'P'], 'season': 1}\n", "\n" ] } ], "source": [ "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(find_ned_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "d13df052-c809-4a44-bdea-79dd381d87ce", "metadata": {}, "source": [ "The filter condition reads like: “the attribute name of a character document must be equal to the string Ned”. If the condition applies, character document gets returned. This works with any attribute likewise:" ] }, { "cell_type": "code", "execution_count": 13, "id": "ec98fcc6-0d8f-4562-88b5-2d72175f3559", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': '266270044329', '_id': 'Characters/266270044329', '_rev': '_i4WVG26---', 'age': 41, 'alive': False, 'name': 'Ned', 'surname': 'Stark', 'traits': ['A', 'H', 'C', 'N', 'P'], 'season': 1}\n", "\n", "{'_key': '266270044335', '_id': 'Characters/266270044335', '_rev': '_i4WVG26--B', 'name': 'Catelyn', 'surname': 'Stark', 'alive': False, 'age': 40, 'traits': ['D', 'H', 'C'], 'season': 1}\n", "\n", "{'_key': '266270044342', '_id': 'Characters/266270044342', '_rev': '_i4WVG26--I', 'name': 'Sansa', 'surname': 'Stark', 'alive': True, 'age': 13, 'traits': ['D', 'I', 'J'], 'season': 1}\n", "\n", "{'_key': '266270044343', '_id': 'Characters/266270044343', '_rev': '_i4WVG26--J', 'name': 'Arya', 'surname': 'Stark', 'alive': True, 'age': 11, 'traits': ['C', 'K', 'L'], 'season': 1}\n", "\n", "{'_key': '266270044344', '_id': 'Characters/266270044344', '_rev': '_i4WVG26--K', 'name': 'Robb', 'surname': 'Stark', 'alive': False, 'traits': ['A', 'B', 'C', 'K'], 'season': 1}\n", "\n", "{'_key': '266270044346', '_id': 'Characters/266270044346', '_rev': '_i4WVG26--M', 'name': 'Bran', 'surname': 'Stark', 'alive': True, 'age': 10, 'traits': ['L', 'J'], 'season': 1}\n", "\n" ] } ], "source": [ "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.surname == \"Stark\"\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(find_ned_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "b4b88221-7279-4283-9b15-5fa88d08045b", "metadata": {}, "source": [ "### Range Conditions\n", "\n", "Strict equality is one possible condition we can state. There are plenty of other conditions we can formulate however. For example, we could ask for all young characters:" ] }, { "cell_type": "code", "execution_count": 14, "id": "f6ae4dfe-5d9f-490a-aced-4cf18d3e1c6b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ned\n", "\n", "Jaime\n", "\n", "Catelyn\n", "\n", "Cersei\n", "\n", "Daenerys\n", "\n", "Jon\n", "\n", "Sansa\n", "\n", "Theon\n", "\n", "Joffrey\n", "\n", "Tyrion\n", "\n", "Davos\n", "\n", "Samwell\n", "\n", "Brienne\n", "\n" ] } ], "source": [ "find_adults_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.age >= 13\n", " RETURN character.name\n", "\"\"\"\n", "query_result = aql.execute(find_adults_query)\n", "\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "98fd09e3-1903-4ec9-be4d-7d0f53f95b6a", "metadata": {}, "source": [ "The operator `>=` stands for greater-or-equal, so every character of age 13 or older is returned (only their name in the example). We can return names and age of all characters younger than 13 by changing the operator to less-than and using the object syntax to define a subset of attributes to return:" ] }, { "cell_type": "code", "execution_count": 15, "id": "5cda0b7b-6d1f-4ac2-9199-e8734a79a26d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Robert', 'age': None}\n", "\n", "{'name': 'Jorah', 'age': None}\n", "\n", "{'name': 'Petyr', 'age': None}\n", "\n", "{'name': 'Viserys', 'age': None}\n", "\n", "{'name': 'Arya', 'age': 11}\n", "\n", "{'name': 'Robb', 'age': None}\n", "\n", "{'name': 'Bran', 'age': 10}\n", "\n", "{'name': 'Sandor', 'age': None}\n", "\n", "{'name': 'Khal', 'age': None}\n", "\n", "{'name': 'Tywin', 'age': None}\n", "\n", "{'name': 'Stannis', 'age': None}\n", "\n", "{'name': 'Melisandre', 'age': None}\n", "\n", "{'name': 'Margaery', 'age': None}\n", "\n", "{'name': 'Jeor', 'age': None}\n", "\n", "{'name': 'Bronn', 'age': None}\n", "\n", "{'name': 'Varys', 'age': None}\n", "\n", "{'name': 'Shae', 'age': None}\n", "\n", "{'name': 'Talisa', 'age': None}\n", "\n", "{'name': 'Gendry', 'age': None}\n", "\n", "{'name': 'Ygritte', 'age': None}\n", "\n", "{'name': 'Tormund', 'age': None}\n", "\n", "{'name': 'Gilly', 'age': None}\n", "\n", "{'name': 'Ramsay', 'age': None}\n", "\n", "{'name': 'Ellaria', 'age': None}\n", "\n", "{'name': 'Daario', 'age': None}\n", "\n", "{'name': 'Missandei', 'age': None}\n", "\n", "{'name': 'Tommen', 'age': None}\n", "\n", "{'name': 'Jaqen', 'age': None}\n", "\n", "{'name': 'Roose', 'age': None}\n", "\n", "{'name': 'The High Sparrow', 'age': None}\n", "\n" ] } ], "source": [ "find_young_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.age < 13\n", " RETURN { name: character.name, age: character.age }\n", "\"\"\"\n", "query_result = aql.execute(find_young_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "e29185d1-082c-4fe0-bb74-2e638ee09dfa", "metadata": {}, "source": [ "We may notice that it returns name and age of 30 characters, most with an age of `null`. The reason is that `null` is the fallback value if an attribute is requested by the query, but no such attribute exists in the document, and the `null` is compares to numbers as lower (see [Type and value order](https://arango.qubitpi.org/stable/aql/fundamentals/type-and-value-order/)). Hence, it accidentally fulfills the age criterion `character.age < 13` (`null < 13`). To not let documents pass the filter without an age attribute, we can add a second criterion:" ] }, { "cell_type": "code", "execution_count": 16, "id": "f8638e2e-eacc-43ac-b38b-08041aea806e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Arya', 'age': 11}\n", "\n", "{'name': 'Bran', 'age': 10}\n", "\n" ] } ], "source": [ "find_young_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.age < 13\n", " FILTER character.age != null\n", " RETURN { name: character.name, age: character.age }\n", "\"\"\"\n", "query_result = aql.execute(find_young_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "d6880292-fbe7-4da7-bd99-634546eafd99", "metadata": {}, "source": [ "This could equally be written with a boolean `AND` operator as:" ] }, { "cell_type": "code", "execution_count": 17, "id": "e32e9a7b-eab2-4343-a7d1-3ae12a00d0a8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Arya', 'age': 11}\n", "\n", "{'name': 'Bran', 'age': 10}\n", "\n" ] } ], "source": [ "find_young_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.age < 13 AND character.age != null\n", " RETURN { name: character.name, age: character.age }\n", "\"\"\"\n", "query_result = aql.execute(find_young_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "833e4de8-f9f4-4b27-93ba-3a9c73ea47cf", "metadata": {}, "source": [ "If we want documents to fulfill one or another condition, possibly for different attributes as well, use `OR`:" ] }, { "cell_type": "code", "execution_count": 18, "id": "d2e04b10-a856-4fad-a92f-46b558b731f7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Jon', 'surname': 'Snow'}\n", "\n", "{'name': 'Joffrey', 'surname': 'Baratheon'}\n", "\n" ] } ], "source": [ "find_joffrey_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Jon\" OR character.name == \"Joffrey\"\n", " RETURN { name: character.name, surname: character.surname }\n", "\"\"\"\n", "query_result = aql.execute(find_joffrey_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "129a19dd-eb22-4a05-8cc7-3717bdaf79d0", "metadata": {}, "source": [ "To learn more about Filter Operation check the [documentation](https://arango.qubitpi.org/stable/aql/high-level-operations/filter/)." ] }, { "cell_type": "markdown", "id": "f91f21fb-bfdd-47b5-bb51-ad0fca202397", "metadata": {}, "source": [ "Sorting Documents\n", "-----------------" ] }, { "cell_type": "markdown", "id": "f7f9e25b-b1dd-427a-82c5-f88018c179da", "metadata": {}, "source": [ "To return characters in a defined order, we can add a `SORT()` operation." ] }, { "cell_type": "code", "execution_count": 19, "id": "3d0c34d9-b7ef-405e-be1f-11cb8bedf844", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Arya\n", "\n", "Bran\n", "\n", "Brienne\n", "\n", "Bronn\n", "\n", "Catelyn\n", "\n", "Cersei\n", "\n", "Daario\n", "\n", "Daenerys\n", "\n", "Davos\n", "\n", "Ellaria\n", "\n", "Gendry\n", "\n", "Gilly\n", "\n", "Jaime\n", "\n", "Jaqen\n", "\n", "Jeor\n", "\n", "Joffrey\n", "\n", "Jon\n", "\n", "Jorah\n", "\n", "Khal\n", "\n", "Margaery\n", "\n", "Melisandre\n", "\n", "Missandei\n", "\n", "Ned\n", "\n", "Petyr\n", "\n", "Ramsay\n", "\n", "Robb\n", "\n", "Robert\n", "\n", "Roose\n", "\n", "Samwell\n", "\n", "Sandor\n", "\n", "Sansa\n", "\n", "Shae\n", "\n", "Stannis\n", "\n", "Talisa\n", "\n", "The High Sparrow\n", "\n", "Theon\n", "\n", "Tommen\n", "\n", "Tormund\n", "\n", "Tyrion\n", "\n", "Tywin\n", "\n", "Varys\n", "\n", "Viserys\n", "\n", "Ygritte\n", "\n" ] } ], "source": [ "all_characters_names = \"\"\"\n", " FOR character IN Characters\n", " SORT character.name\n", " RETURN character.name\n", "\"\"\"\n", "query_result = aql.execute(all_characters_names)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "98d27614-5532-49f4-8687-fa9d29f81166", "metadata": {}, "source": [ "We can reverse the sort order with `DESC`:" ] }, { "cell_type": "code", "execution_count": 20, "id": "effcce61-cd28-41dc-8595-be81aaa014cd", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ygritte\n", "\n", "Viserys\n", "\n", "Varys\n", "\n", "Tywin\n", "\n", "Tyrion\n", "\n", "Tormund\n", "\n", "Tommen\n", "\n", "Theon\n", "\n", "The High Sparrow\n", "\n", "Talisa\n", "\n", "Stannis\n", "\n", "Shae\n", "\n", "Sansa\n", "\n", "Sandor\n", "\n", "Samwell\n", "\n", "Roose\n", "\n", "Robert\n", "\n", "Robb\n", "\n", "Ramsay\n", "\n", "Petyr\n", "\n", "Ned\n", "\n", "Missandei\n", "\n", "Melisandre\n", "\n", "Margaery\n", "\n", "Khal\n", "\n", "Jorah\n", "\n", "Jon\n", "\n", "Joffrey\n", "\n", "Jeor\n", "\n", "Jaqen\n", "\n", "Jaime\n", "\n", "Gilly\n", "\n", "Gendry\n", "\n", "Ellaria\n", "\n", "Davos\n", "\n", "Daenerys\n", "\n", "Daario\n", "\n", "Cersei\n", "\n", "Catelyn\n", "\n", "Bronn\n", "\n", "Brienne\n", "\n", "Bran\n", "\n", "Arya\n", "\n" ] } ], "source": [ "all_characters_names = \"\"\"\n", " FOR character IN Characters\n", " SORT character.name DESC\n", " RETURN character.name\n", "\"\"\"\n", "query_result = aql.execute(all_characters_names)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "e150d6db-a777-4c24-9aab-372c42ecc2b3", "metadata": {}, "source": [ "### Sorting by Multiple Attributes\n", "\n", "Assume we want to sort by surname. Many of the characters share a surname. The result order among characters with the same surname is then undefined. We can first sort by surname, then name to determine the order:" ] }, { "cell_type": "code", "execution_count": 21, "id": "276108b5-86c0-43d8-9561-9ecba32b7c01", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'surname': 'Baelish', 'name': 'Petyr'}\n", "\n", "{'surname': 'Baratheon', 'name': 'Joffrey'}\n", "\n", "{'surname': 'Baratheon', 'name': 'Robert'}\n", "\n", "{'surname': 'Baratheon', 'name': 'Stannis'}\n", "\n", "{'surname': 'Baratheon', 'name': 'Tommen'}\n", "\n", "{'surname': 'Bolton', 'name': 'Ramsay'}\n", "\n", "{'surname': 'Bolton', 'name': 'Roose'}\n", "\n", "{'surname': 'Clegane', 'name': 'Sandor'}\n", "\n", "{'surname': 'Drogo', 'name': 'Khal'}\n", "\n", "{'surname': 'Giantsbane', 'name': 'Tormund'}\n", "\n" ] } ], "source": [ "all_characters_names = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.surname\n", " SORT character.surname, character.name\n", " LIMIT 10\n", " RETURN {\n", " surname: character.surname,\n", " name: character.name\n", " }\n", "\"\"\"\n", "\n", "query_result = aql.execute(all_characters_names)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "efe9acb6-9d2a-49bf-a6e3-fc96a444ffc5", "metadata": {}, "source": [ "Overall, the documents are sorted by last name. If the surname is the same for two characters, the name values are compared and the result sorted." ] }, { "cell_type": "markdown", "id": "5de52601-b704-4874-9e43-8782e86d5642", "metadata": {}, "source": [ "More information about `SORT` can be found in its [documentation](https://arango.qubitpi.org/stable/aql/high-level-operations/sort/)" ] }, { "cell_type": "markdown", "id": "738078fa-0ce7-4ec9-ad29-5494a0836b56", "metadata": {}, "source": [ "Pagination\n", "----------\n", "\n", "It may not always be necessary to return all documents, that a `FOR` loop would normally return. In those cases, we can limit the amount of documents with a `LIMIT()` operation:" ] }, { "cell_type": "code", "execution_count": 22, "id": "1d536289-41d1-4c33-bf4a-5c84088cb4d3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Arya\n", "\n", "Bran\n", "\n", "Brienne\n", "\n", "Bronn\n", "\n", "Catelyn\n", "\n" ] } ], "source": [ "sample_chars_query = \"\"\"\n", " FOR character IN Characters\n", " SORT character.name\n", " LIMIT 5\n", " RETURN character.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(sample_chars_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "83bf9b4b-985b-42d0-af9d-4fb3365e3fd9", "metadata": {}, "source": [ "When `LIMIT` is followed 2 numbers, however, the first number specifies the number of documents that are to be skipped and return the next second-number documents. This effectively achieves the pagination:" ] }, { "cell_type": "code", "execution_count": 23, "id": "5cafa508-e66b-4437-b757-cc0e98dac984", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Brienne\n", "\n", "Bronn\n", "\n", "Catelyn\n", "\n", "Cersei\n", "\n", "Daario\n", "\n" ] } ], "source": [ "sample_chars_query = \"\"\"\n", "FOR character IN Characters\n", " SORT character.name\n", " LIMIT 2, 5\n", " RETURN character.name\n", "\"\"\"\n", "query_result = aql.execute(sample_chars_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "8f553d3f-d580-46de-9e47-d84deeb560ea", "metadata": {}, "source": [ "More information about `LIMIT` can be found in its [documentation](https://arango.qubitpi.org/stable/aql/high-level-operations/limit/)" ] }, { "cell_type": "markdown", "id": "ad6ee549-189d-475b-bd77-2386388cc80d", "metadata": {}, "source": [ "Putting Everying Together\n", "-------------------------\n", "\n", "Let's address a slightly complex business query - finds the 10 youngest characters:" ] }, { "cell_type": "code", "execution_count": 24, "id": "c11226ba-8d03-4e69-9309-53f4e255cfd5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'name': 'Bran', 'age': 10}\n", "\n", "{'name': 'Arya', 'age': 11}\n", "\n", "{'name': 'Sansa', 'age': 13}\n", "\n", "{'name': 'Daenerys', 'age': 16}\n", "\n", "{'name': 'Jon', 'age': 16}\n", "\n", "{'name': 'Theon', 'age': 16}\n", "\n", "{'name': 'Samwell', 'age': 17}\n", "\n", "{'name': 'Joffrey', 'age': 19}\n", "\n", "{'name': 'Tyrion', 'age': 32}\n", "\n", "{'name': 'Brienne', 'age': 32}\n", "\n" ] } ], "source": [ "sample_chars_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.age\n", " SORT character.age\n", " LIMIT 10\n", " RETURN {\n", " name: character.name,\n", " age: character.age\n", " }\n", "\"\"\"\n", "\n", "query_result = aql.execute(sample_chars_query)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "0c2191bd-c53b-46a6-8a52-8db89216fdb4", "metadata": {}, "source": [ "AQL Join\n", "--------\n", "\n", "The character data we imported has an attribute traits for each character, which is an array of strings. It does not store character features directly however:" ] }, { "cell_type": "code", "execution_count": 25, "id": "0aae5df3-1236-4a97-b85f-8cd0cef32379", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'Name': 'Ned', 'Traits': ['A', 'H', 'C', 'N', 'P']}\n", "\n" ] } ], "source": [ "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN {\"Name\": character.name, \"Traits\": character.traits}\n", "\"\"\"\n", "\n", "query_result = aql.execute(find_ned_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "a53fa4d7-6047-45c8-84c7-23f416f4a1e9", "metadata": {}, "source": [ "Traits in this dataset are rather a list of letters without an apparent meaning. The idea here is that traits is supposed to store documents keys of another collection, which we can use to resolve the letters to labels such as “strong”. The benefit of using another collection for the actual traits is, that we can easily query for all existing traits later on and store labels in multiple languages for instance in a central place. If we would embed traits directly like this\n", "\n", "```json\n", "{\n", " \"Name\": \"Ned\",\n", " \"Traits\": [\n", " {\n", " \"de\": \"stark\",\n", " \"en\": \"strong\"\n", " },\n", " {\n", " \"de\": \"einflussreich\",\n", " \"en\": \"powerful\"\n", " },\n", " {\n", " \"de\": \"loyal\",\n", " \"en\": \"loyal\"\n", " },\n", " {\n", " \"de\": \"rational\",\n", " \"en\": \"rational\"\n", " },\n", " {\n", " \"de\": \"mutig\",\n", " \"en\": \"brave\"\n", " }\n", " ]\n", "}\n", "```\n", "\n", "it becomes really hard to maintain traits. If we were to rename or translate one of them, we would need to find all other character documents with the same trait and perform the changes there too. If we only refer to a trait in another collection, it is as easy as updating a single document.\n", "\n", "![](./img/join.png)" ] }, { "cell_type": "markdown", "id": "2ec35b98-3443-425b-a765-777a54c7d0aa", "metadata": {}, "source": [ "### Creating Traits Collection\n", "\n", "Let's load our traits collection" ] }, { "cell_type": "code", "execution_count": 26, "id": "129697ad-fd4e-4225-b3de-690f7e883de6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_key': 'A', '_id': 'Traits/A', '_rev': '_i4U2KKO---', 'en': 'strong', 'de': 'stark'}\n", "\n", "{'_key': 'B', '_id': 'Traits/B', '_rev': '_i4U2KKO--_', 'en': 'polite', 'de': 'freundlich'}\n", "\n", "{'_key': 'C', '_id': 'Traits/C', '_rev': '_i4U2KKO--A', 'en': 'loyal', 'de': 'loyal'}\n", "\n", "{'_key': 'D', '_id': 'Traits/D', '_rev': '_i4U2KKO--B', 'en': 'beautiful', 'de': 'schön'}\n", "\n", "{'_key': 'E', '_id': 'Traits/E', '_rev': '_i4U2KKO--C', 'en': 'sneaky', 'de': 'hinterlistig'}\n", "\n", "{'_key': 'F', '_id': 'Traits/F', '_rev': '_i4U2KKO--E', 'en': 'experienced', 'de': 'erfahren'}\n", "\n", "{'_key': 'G', '_id': 'Traits/G', '_rev': '_i4U2KKO--F', 'en': 'corrupt', 'de': 'korrupt'}\n", "\n", "{'_key': 'H', '_id': 'Traits/H', '_rev': '_i4U2KKO--G', 'en': 'powerful', 'de': 'einflussreich'}\n", "\n", "{'_key': 'I', '_id': 'Traits/I', '_rev': '_i4U2KKO--H', 'en': 'naive', 'de': 'naiv'}\n", "\n", "{'_key': 'J', '_id': 'Traits/J', '_rev': '_i4U2KKO--I', 'en': 'unmarried', 'de': 'unverheiratet'}\n", "\n", "{'_key': 'K', '_id': 'Traits/K', '_rev': '_i4U2KKO--J', 'en': 'skillful', 'de': 'geschickt'}\n", "\n", "{'_key': 'L', '_id': 'Traits/L', '_rev': '_i4U2KKO--K', 'en': 'young', 'de': 'jung'}\n", "\n", "{'_key': 'M', '_id': 'Traits/M', '_rev': '_i4U2KKO--L', 'en': 'smart', 'de': 'klug'}\n", "\n", "{'_key': 'N', '_id': 'Traits/N', '_rev': '_i4U2KKO--M', 'en': 'rational', 'de': 'rational'}\n", "\n", "{'_key': 'O', '_id': 'Traits/O', '_rev': '_i4U2KKO--N', 'en': 'ruthless', 'de': 'skrupellos'}\n", "\n", "{'_key': 'P', '_id': 'Traits/P', '_rev': '_i4U2KKO--O', 'en': 'brave', 'de': 'mutig'}\n", "\n", "{'_key': 'Q', '_id': 'Traits/Q', '_rev': '_i4U2KKO--P', 'en': 'mighty', 'de': 'mächtig'}\n", "\n", "{'_key': 'R', '_id': 'Traits/R', '_rev': '_i4U2KKO--Q', 'en': 'weak', 'de': 'schwach'}\n", "\n" ] } ], "source": [ "if not database.has_collection(\"Traits\"):\n", " database.create_collection(\"Traits\")\n", "\n", " insert_query = \"\"\"\n", " LET data = [\n", " { \"_key\": \"A\", \"en\": \"strong\", \"de\": \"stark\" },\n", " { \"_key\": \"B\", \"en\": \"polite\", \"de\": \"freundlich\" },\n", " { \"_key\": \"C\", \"en\": \"loyal\", \"de\": \"loyal\" },\n", " { \"_key\": \"D\", \"en\": \"beautiful\", \"de\": \"schön\" },\n", " { \"_key\": \"E\", \"en\": \"sneaky\", \"de\": \"hinterlistig\" },\n", " { \"_key\": \"F\", \"en\": \"experienced\", \"de\": \"erfahren\" },\n", " { \"_key\": \"G\", \"en\": \"corrupt\", \"de\": \"korrupt\" },\n", " { \"_key\": \"H\", \"en\": \"powerful\", \"de\": \"einflussreich\" },\n", " { \"_key\": \"I\", \"en\": \"naive\", \"de\": \"naiv\" },\n", " { \"_key\": \"J\", \"en\": \"unmarried\", \"de\": \"unverheiratet\" },\n", " { \"_key\": \"K\", \"en\": \"skillful\", \"de\": \"geschickt\" },\n", " { \"_key\": \"L\", \"en\": \"young\", \"de\": \"jung\" },\n", " { \"_key\": \"M\", \"en\": \"smart\", \"de\": \"klug\" },\n", " { \"_key\": \"N\", \"en\": \"rational\", \"de\": \"rational\" },\n", " { \"_key\": \"O\", \"en\": \"ruthless\", \"de\": \"skrupellos\" },\n", " { \"_key\": \"P\", \"en\": \"brave\", \"de\": \"mutig\" },\n", " { \"_key\": \"Q\", \"en\": \"mighty\", \"de\": \"mächtig\" },\n", " { \"_key\": \"R\", \"en\": \"weak\", \"de\": \"schwach\" }\n", " ]\n", " \n", " \n", " FOR d IN data\n", " INSERT d INTO Traits\n", " \"\"\"\n", " \n", " aql.execute(insert_query)\n", "\n", "all_traits = \"\"\"\n", " FOR trait IN Traits\n", " RETURN trait\n", "\"\"\"\n", "query_result = aql.execute(all_traits)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "57784ac9-e64c-4310-984a-20b1fed5360a", "metadata": {}, "source": [ "### Joining Traits\n", "\n", "Now we can use the traits array together with the [DOCUMENT()](https://arango.qubitpi.org/stable/aql/functions/miscellaneous/#document) function to use the elements as document keys and look them up in the Traits collection. The `DOCUMENT()` function can be used to look up a single or multiple documents via document identifiers. In our example, we pass the collection name from which we want to fetch documents as the first argument (\"Traits\") and an array of document keys (_key attribute) as the second argument. In the [RETURN](https://arango.qubitpi.org/stable/aql/high-level-operations/return/) statement we get an array of the full trait documents for each character." ] }, { "cell_type": "code", "execution_count": 27, "id": "cdff8bbc-66b5-480d-b144-390a691dd3c0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['strong', 'powerful', 'loyal', 'rational', 'brave']\n", "\n", "['strong', 'powerful', 'loyal']\n", "\n", "['strong', 'experienced', 'polite']\n", "\n", "['beautiful', 'powerful', 'loyal']\n", "\n", "['powerful', 'sneaky', 'experienced']\n", "\n" ] } ], "source": [ "all_characters_traits = \"\"\"\n", " FOR character IN Characters\n", " LIMIT 5\n", " RETURN DOCUMENT(\"Traits\", character.traits)[*].en\n", "\"\"\"\n", "\n", "query_result = aql.execute(all_characters_traits)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "9da69b0a-4215-44f9-95c6-a19874b07e4b", "metadata": {}, "source": [ "Great, we resolved the letters to meaningful traits! But we also need to know to which character they belong. Thus, we need to merge both the character document and the data from the trait documents:" ] }, { "cell_type": "code", "execution_count": 28, "id": "38b460f5-77de-4d45-b5cb-16545d299ec6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_id': 'Characters/266270044329', '_key': '266270044329', '_rev': '_i4WVG26---', 'age': 41, 'alive': False, 'name': 'Ned', 'season': 1, 'surname': 'Stark', 'traits': ['strong', 'powerful', 'loyal', 'rational', 'brave']}\n", "\n", "{'_id': 'Characters/266270044333', '_key': '266270044333', '_rev': '_i4WVG26--_', 'alive': False, 'name': 'Robert', 'season': 1, 'surname': 'Baratheon', 'traits': ['strong', 'powerful', 'loyal']}\n", "\n", "{'_id': 'Characters/266270044334', '_key': '266270044334', '_rev': '_i4WVG26--A', 'age': 36, 'alive': True, 'name': 'Jaime', 'season': 1, 'surname': 'Lannister', 'traits': ['strong', 'experienced', 'polite']}\n", "\n", "{'_id': 'Characters/266270044335', '_key': '266270044335', '_rev': '_i4WVG26--B', 'age': 40, 'alive': False, 'name': 'Catelyn', 'season': 1, 'surname': 'Stark', 'traits': ['beautiful', 'powerful', 'loyal']}\n", "\n", "{'_id': 'Characters/266270044336', '_key': '266270044336', '_rev': '_i4WVG26--C', 'age': 36, 'alive': True, 'name': 'Cersei', 'season': 1, 'surname': 'Lannister', 'traits': ['powerful', 'sneaky', 'experienced']}\n", "\n" ] } ], "source": [ "all_characters_traits = \"\"\"\n", " FOR character IN Characters\n", " LIMIT 5\n", " RETURN MERGE(character, { traits: DOCUMENT(\"Traits\", character.traits)[*].en } )\n", "\"\"\"\n", "\n", "query_result = aql.execute(all_characters_traits)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "8ea0b9cf-2a3f-4b31-8946-a1fb90d82081", "metadata": {}, "source": [ "The [MERGE()](https://arango.qubitpi.org/stable/aql/functions/document-object/#merge) function merges objects together. Because we used an object { traits: ... } which has the same attribute name traits as the original character attribute, the latter (original) got overwritten by the merge operation." ] }, { "cell_type": "markdown", "id": "98219791-00c9-4827-9a03-5f22c5e31eaf", "metadata": {}, "source": [ "The `DOCUMENT()` function utilizes primary indices to look up documents quickly. It is limited to find documents via their identifiers however. For a use case like the one above it is sufficient to accomplish a simple join.\n", "\n", "There is another, more flexible syntax for joins: nested [FOR](https://arango.qubitpi.org/stable/aql/high-level-operations/for/) loops over multiple collections, with a [FILTER](https://arango.qubitpi.org/stable/aql/high-level-operations/filter/) condition to match up attributes. In case of the traits key array, there needs to be a third loop to iterate over the keys:" ] }, { "cell_type": "code", "execution_count": 29, "id": "6053635a-9a2e-4dc6-849a-8840855d99c5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'_id': 'Characters/266270044329', '_key': '266270044329', '_rev': '_i4WVG26---', 'age': 41, 'alive': False, 'name': 'Ned', 'season': 1, 'surname': 'Stark', 'traits': ['strong', 'powerful', 'loyal', 'rational', 'brave']}\n", "\n", "{'_id': 'Characters/266270044333', '_key': '266270044333', '_rev': '_i4WVG26--_', 'alive': False, 'name': 'Robert', 'season': 1, 'surname': 'Baratheon', 'traits': ['strong', 'powerful', 'loyal']}\n", "\n", "{'_id': 'Characters/266270044334', '_key': '266270044334', '_rev': '_i4WVG26--A', 'age': 36, 'alive': True, 'name': 'Jaime', 'season': 1, 'surname': 'Lannister', 'traits': ['strong', 'experienced', 'polite']}\n", "\n", "{'_id': 'Characters/266270044335', '_key': '266270044335', '_rev': '_i4WVG26--B', 'age': 40, 'alive': False, 'name': 'Catelyn', 'season': 1, 'surname': 'Stark', 'traits': ['beautiful', 'powerful', 'loyal']}\n", "\n", "{'_id': 'Characters/266270044336', '_key': '266270044336', '_rev': '_i4WVG26--C', 'age': 36, 'alive': True, 'name': 'Cersei', 'season': 1, 'surname': 'Lannister', 'traits': ['powerful', 'sneaky', 'experienced']}\n", "\n" ] } ], "source": [ "all_characters_traits = \"\"\"\n", " FOR character IN Characters\n", " LIMIT 5\n", " RETURN MERGE(character, {\n", " traits: (\n", " FOR key IN character.traits\n", " FOR trait IN Traits\n", " FILTER trait._key == key\n", " RETURN trait.en\n", " )\n", " })\n", "\"\"\"\n", "\n", "query_result = aql.execute(all_characters_traits)\n", "for doc in query_result:\n", " print(doc)\n", " print() " ] }, { "cell_type": "markdown", "id": "dd0aedbf-dbaa-4860-8001-5f0cb8e71147", "metadata": {}, "source": [ "For each character, it loops over its traits attribute (e.g. [\"D\",\"H\",\"C\"]) and for each document reference in this array, it loops over the Traits collections. There is a condition to match the document key with the key reference. The inner `FOR` loop and the `FILTER` get transformed to a primary index lookup in this case instead of building up a Cartesian product only to filter away everything but a single match: Document keys within a collection are unique, thus there can only be one match.\n", "\n", "Each written-out, English trait is returned and all the traits are then merged with the character document. The result is identical to the query using `DOCUMENT()`. However, this approach with a nested FOR loop and a `FILTER` is not limited to primary keys. _We can do this with any other attribute as well_. For an efficient lookup, make sure to add a hash index for this joined attribute. If its values are unique, then also set the index option to unique." ] }, { "cell_type": "markdown", "id": "ca364852-c860-4220-bfa3-6b3cc220cd47", "metadata": {}, "source": [ "Graph Traversals\n", "----------------\n", "\n", "This is why ArangoDB is a _multi-model_ database, because now er aremoving from a document data model to graphs\n", "\n", "Relations such as between parents and children can be modeled as graph. In ArangoDB, two documents (a parent and a child character document) can be linked by an edge document. Edge documents are stored in edge collections and have two additional attributes: `_from` and `_to`. They reference any two documents by their document IDs (`_id`).\n", "\n", "Our characters have the following relations between parents and children (first names only for a better overview):\n", "\n", "|Child | Parent |\n", "|---|---|\n", "| Robb | Ned |\n", "| Sansa | Ned |\n", "| Arya | Ned |\n", "| Bran | Ned |\n", "| Jon | Ned |\n", "| Robb | Catelyn |\n", "| Sansa | Catelyn |\n", "| Arya | Catelyn |\n", "| Bran | Catelyn |\n", "| Jaime | Tywin |\n", "| Cersei | Tywin |\n", "| Tyrion | Tywin |\n", "| Joffrey | Jaime |\n", "| Joffrey | Cersei |\n", "\n", "We can visualize the same information in a graph, which often is easier to comprehend.\n", "\n", "![](./img/parent-child-graph.png)" ] }, { "cell_type": "markdown", "id": "b8a79390-148d-4355-9ff6-4746abf44c30", "metadata": {}, "source": [ "### Creating Edges\n", "\n", "To create the required edge documents to store these relations in the database, we can run a query that combines joining and filtering to match up the right character documents, then use their `_id` attribute to insert an edge into an edge collection." ] }, { "cell_type": "code", "execution_count": 30, "id": "c428e75c-f71d-45fb-a979-3f62989f3e91", "metadata": {}, "outputs": [], "source": [ "if not database.has_collection(\"ChildOf\"):\n", " database.create_collection(\"ChildOf\", edge=True)" ] }, { "cell_type": "markdown", "id": "7c4a6962-e4ce-40b6-9358-a684e5a7c012", "metadata": {}, "source": [ "For creating the edges we face one challenge: The character documents don’t have user-defined keys. If they had, it would allow us to create the edges more easily like:\n", "\n", "```INSERT { _from: \"Characters/robb\", _to: \"Characters/ned\" } INTO ChildOf```\n", "\n", "However, creating the edges programmatically based on character names is a good exercise. Breakdown of the query:" ] }, { "cell_type": "code", "execution_count": 31, "id": "c26b6a20-0a7b-468a-8f84-c1521ea0b4fa", "metadata": {}, "outputs": [], "source": [ "create_edges_query = \"\"\"\n", " LET data = [\n", " {\n", " \"parent\": { \"name\": \"Ned\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Robb\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Ned\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Sansa\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Ned\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Arya\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Ned\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Bran\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Catelyn\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Robb\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Catelyn\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Sansa\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Catelyn\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Arya\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Catelyn\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Bran\", \"surname\": \"Stark\" }\n", " }, {\n", " \"parent\": { \"name\": \"Ned\", \"surname\": \"Stark\" },\n", " \"child\": { \"name\": \"Jon\", \"surname\": \"Snow\" }\n", " }, {\n", " \"parent\": { \"name\": \"Tywin\", \"surname\": \"Lannister\" },\n", " \"child\": { \"name\": \"Jaime\", \"surname\": \"Lannister\" }\n", " }, {\n", " \"parent\": { \"name\": \"Tywin\", \"surname\": \"Lannister\" },\n", " \"child\": { \"name\": \"Cersei\", \"surname\": \"Lannister\" }\n", " }, {\n", " \"parent\": { \"name\": \"Tywin\", \"surname\": \"Lannister\" },\n", " \"child\": { \"name\": \"Tyrion\", \"surname\": \"Lannister\" }\n", " }, {\n", " \"parent\": { \"name\": \"Cersei\", \"surname\": \"Lannister\" },\n", " \"child\": { \"name\": \"Joffrey\", \"surname\": \"Baratheon\" }\n", " }, {\n", " \"parent\": { \"name\": \"Jaime\", \"surname\": \"Lannister\" },\n", " \"child\": { \"name\": \"Joffrey\", \"surname\": \"Baratheon\" }\n", " }\n", " ]\n", " \n", " FOR rel in data\n", " LET parentId = FIRST(\n", " FOR c IN Characters\n", " FILTER c.name == rel.parent.name\n", " FILTER c.surname == rel.parent.surname\n", " LIMIT 1\n", " RETURN c._id\n", " )\n", " LET childId = FIRST(\n", " FOR c IN Characters\n", " FILTER c.name == rel.child.name\n", " FILTER c.surname == rel.child.surname\n", " LIMIT 1\n", " RETURN c._id\n", " )\n", " FILTER parentId != null AND childId != null\n", " INSERT { _from: childId, _to: parentId } INTO ChildOf\n", " RETURN NEW\n", "\"\"\"\n", "\n", "query_result = aql.execute(create_edges_query)" ] }, { "cell_type": "markdown", "id": "388f7577-0628-4339-9208-4526e63c98d2", "metadata": {}, "source": [ "### Graph Traversal\n", "\n", "Now that edges link character documents (vertices), we have a graph we can query to find out who the parents are of another character – or in graph terms, we want to start at a vertex and follow the edges to other vertices in an [AQL graph traversal](https://arango.qubitpi.org/stable/aql/graphs/traversals/):" ] }, { "cell_type": "code", "execution_count": 32, "id": "5c4d613a-1b60-4ea6-b086-2d8b229e90f2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Catelyn\n", "\n", "Ned\n", "\n" ] } ], "source": [ "sansa_parents_query = \"\"\"\n", " // First find the start node, i.e., sansa\n", " FOR character IN Characters\n", " FILTER character.name == \"Sansa\"\n", " // Then start a Graph traversal from that start node\n", " FOR parent IN 1..1 OUTBOUND character ChildOf\n", " RETURN parent.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(sansa_parents_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "684e4a18-11ee-4c67-b393-fc045ff1092e", "metadata": {}, "source": [ "We can traverse the Graph also in the reverse direction (i.e., `INBOUND`) to find someones children:" ] }, { "cell_type": "code", "execution_count": 33, "id": "e697f485-2bf9-4ea7-905c-cb6cd0d649f3", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Jon\n", "\n", "Bran\n", "\n", "Arya\n", "\n", "Sansa\n", "\n", "Robb\n", "\n" ] } ], "source": [ "ned_children_query = \"\"\"\n", "// First find the start node, i.e., ned\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " // Then start a Graph traversal from that start node\n", " FOR child IN 1..1 INBOUND character ChildOf\n", " RETURN child.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(ned_children_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "1fe29ea0-d4f6-417f-ba51-3b27b51cf3ed", "metadata": {}, "source": [ "#### Variable Length Traversals and Grandchildren\n", "\n", "One might have wondered about the `IN 1..1 ` specification. This part actually specifies how many hops should be considered for the Traversal. For the Lannister family, we have relations that span from parent to grandchild. Let’s change the traversal depth to return grandchildren, which means to go exactly two steps:" ] }, { "cell_type": "code", "execution_count": 34, "id": "bed89aad-0ffc-45a4-866d-3ad4e9209120", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Joffrey\n", "\n", "Joffrey\n", "\n" ] } ], "source": [ "tywin_grandchildren_query = \"\"\"\n", " // First find the start node, i.e., ned\n", " FOR character IN Characters\n", " FILTER character.name == \"Tywin\"\n", " // Then start a Graph traversal from that start node\n", " FOR grandchild IN 2..2 INBOUND character ChildOf\n", " RETURN grandchild.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(tywin_grandchildren_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "e912dd81-1bae-4d95-816e-addba6e1c86f", "metadata": {}, "source": [ "It might be a bit unexpected, that Joffrey is returned twice. However, if we look at the graph visualization, we can see that multiple paths lead from Joffrey (bottom right) to Tywin:\n", "\n", "![](./img/parent-child-graph.png)\n", "\n", "As a quick fix, change the last line of the query to `RETURN DISTINCT grandchild.name` to return each value only once. Keep in mind though, that there are [traversal options](https://arango.qubitpi.org/stable/aql/graphs/traversals/#syntax) to suppress duplicate vertices early on." ] }, { "cell_type": "code", "execution_count": 35, "id": "ca6621fd-f209-4178-8d1b-fbd925aeaf66", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Joffrey\n", "\n" ] } ], "source": [ "tywin_grandchildren_query = \"\"\"\n", " // First find the start node, i.e., ned\n", " FOR character IN Characters\n", " FILTER character.name == \"Tywin\"\n", " // Then start a Graph traversal from that start node\n", " FOR grandchild IN 2..2 INBOUND character ChildOf\n", " RETURN DISTINCT grandchild.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(tywin_grandchildren_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "bdca59c2-46eb-44a4-afa0-43ee6a7caebf", "metadata": {}, "source": [ "To return the parents and grandparents of Joffrey, we can walk edges in OUTBOUND direction and adjust the traversal depth to go at least 1 step, and 2 at most (i.e., `IN 1..2`):" ] }, { "cell_type": "code", "execution_count": 36, "id": "fb95faa5-ee26-4538-9d25-063af32286f5", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Jaime\n", "\n", "Tywin\n", "\n", "Cersei\n", "\n" ] } ], "source": [ "joffrey_ancestors_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Joffrey\"\n", " FOR ancestor IN 1..2 OUTBOUND character ChildOf\n", " RETURN DISTINCT ancestor.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(joffrey_ancestors_query)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "1910a910-ced8-43ce-8251-9993b7d06dca", "metadata": {}, "source": [ "With deeper family trees, it is only be a matter of changing the depth values to query for great-grandchildren and similar relations." ] }, { "cell_type": "markdown", "id": "0740c019-578b-42fd-b506-a6a6b625d233", "metadata": {}, "source": [ "Deleting Documents\n", "------------------\n", "\n", "To fully remove documents from a collection, there is the `REMOVE` operation. It works similar to the other modification operations, yet without a `WITH` clause:" ] }, { "cell_type": "code", "execution_count": 37, "id": "e0b8de38-e8e9-474a-9ba3-8930df55fccb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Ned not found.\n" ] } ], "source": [ "remove_ned = \"\"\"\n", " REMOVE @key IN Characters\n", "\"\"\"\n", "bindVars = {'key': neds_document_key}\n", "\n", "try:\n", " aql.execute(remove_ned, bind_vars=bindVars)\n", "except:\n", " print(\"Ned already removed.\")\n", "\n", "find_ned_query = \"\"\"\n", " FOR character IN Characters\n", " FILTER character.name == \"Ned\"\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(find_ned_query, count=True)\n", "\n", "if len(query_result) == 0 :\n", " print(\"Ned not found.\")" ] }, { "cell_type": "markdown", "id": "26941e37-8aff-4323-8ca7-b12facee8cd2", "metadata": {}, "source": [ "As we might have already guessed we can again use a `FOR` loop if we want to perform this operation for the entire collection:" ] }, { "cell_type": "code", "execution_count": 38, "id": "c99540e5-10ca-421e-a3df-3c233c67f1b1", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No characters left.\n" ] } ], "source": [ "remove_all = \"\"\"\n", " FOR character IN Characters\n", " REMOVE character IN Characters\n", "\"\"\"\n", "aql.execute(remove_all)\n", " \n", "all_characters_names = \"\"\"\n", " FOR character IN Characters\n", " RETURN character\n", "\"\"\"\n", "query_result = aql.execute(all_characters_names, count=True)\n", "\n", "if len(query_result) == 0 :\n", " print(\"No characters left.\")" ] }, { "cell_type": "markdown", "id": "1bc29be2-3bcd-4b1f-80c1-1860b82797bf", "metadata": {}, "source": [ "Geospatial Data\n", "---------------\n", "\n", "In this section we learn how AQL supports geospatial queries. Geospatial coordinates consisting of a latitude and longitude value can be stored either as two separate attributes, or as a single attribute in the form of an array with both numeric values. ArangoDB can index such coordinates for fast geospatial queries.\n", "\n", "Let's create a collection with some filming locations for Games of Thrones.\n", "\n", "![](./img/location.png)" ] }, { "cell_type": "code", "execution_count": 41, "id": "7cfdb44b-81d7-476e-872b-56dc287da402", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dragonstone\n", "\n", "King's Landing\n", "\n", "The Red Keep\n", "\n", "Yunkai\n", "\n", "Astapor\n", "\n", "Winterfell\n", "\n", "Vaes Dothrak\n", "\n", "Beyond the wall\n", "\n" ] } ], "source": [ "if not database.has_collection(\"Locations\"):\n", " database.create_collection(\"Locations\")\n", "\n", " insert_query = \"\"\"\n", " LET locations = [\n", " { \"name\": \"Dragonstone\", \"coordinate\": [ 55.167801, -6.815096 ] },\n", " { \"name\": \"King's Landing\", \"coordinate\": [ 42.639752, 18.110189 ] },\n", " { \"name\": \"The Red Keep\", \"coordinate\": [ 35.896447, 14.446442 ] },\n", " { \"name\": \"Yunkai\", \"coordinate\": [ 31.046642, -7.129532 ] },\n", " { \"name\": \"Astapor\", \"coordinate\": [ 31.50974, -9.774249 ] },\n", " { \"name\": \"Winterfell\", \"coordinate\": [ 54.368321, -5.581312 ] },\n", " { \"name\": \"Vaes Dothrak\", \"coordinate\": [ 54.16776, -6.096125 ] },\n", " { \"name\": \"Beyond the wall\", \"coordinate\": [ 64.265473, -21.094093 ] }\n", " ]\n", " \n", " \n", " FOR location IN locations\n", " INSERT location INTO Locations\n", " \"\"\"\n", " \n", " aql.execute(insert_query)\n", "\n", "all_locations_names = \"\"\"\n", " FOR location IN Locations\n", " RETURN location.name\n", "\"\"\"\n", "\n", "query_result = aql.execute(all_locations_names)\n", "for doc in query_result:\n", " print(doc)\n", " print()" ] }, { "cell_type": "markdown", "id": "56874a86-ef19-4cdc-b424-fc60a442a096", "metadata": {}, "source": [ "### Geospatial Index\n", "\n", "To query based on coordinates, a [geo index](https://arango.qubitpi.org/stable/index-and-search/indexing/working-with-indexes/geo-spatial-indexes/) is required. It determines which fields contain the latitude and longitude values." ] }, { "cell_type": "code", "execution_count": 44, "id": "acb52153-8838-4f5c-9471-2ba47e04bb6b", "metadata": {}, "outputs": [ { "ename": "SyntaxError", "evalue": "expression cannot contain assignment, perhaps you meant \"==\"? (3976001801.py, line 1)", "output_type": "error", "traceback": [ "\u001b[0;36m Cell \u001b[0;32mIn[44], line 1\u001b[0;36m\u001b[0m\n\u001b[0;31m database[\"Locations\"].ensureIndex(\"type\"=\"geo\", \"fields\"=[\"coordinate\"])\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m expression cannot contain assignment, perhaps you meant \"==\"?\n" ] } ], "source": [ "database[\"Locations\"].ensureIndex(\"type\"=\"geo\", \"fields\"=[\"coordinate\"])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.7" } }, "nbformat": 4, "nbformat_minor": 5 }