{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Understanding Expressions in Aerospike\n", "This notebook describes how expressions work in Aerospike: how they are formed, their syntax, benefits, and how they are used in filters and operations. \n", "\n", "This notebook requires Aerospike database running on localhost. Visit [Aerospike notebooks repo](https://github.com/aerospike-examples/interactive-notebooks) for additional details and the docker container." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "In this notebook, we will see how expressions work in Aerospike and benefits they provide. \n", "\n", "The expressions functionality has been enhanced in Aerospike Database 5.6. Expressions appear in two flavors in the client library: Filter Expressions and Operation Expressions. Filter Expressions provide a mechanism to select records for operations and replace Predicate Expressions, which have been deprecated since the 5.2 release. Operation Expressions enable new read and write capabilities as described later. Expressions are also used on server as XDR Filter Expressions to specify which records are shipped to remote destinations. \n", "\n", "We will describe at a high level how expressions are formed in Aerospike and the capabilities they enable. After highlighting key syntax patterns, we will show with specific code examples how expressions are used.\n", "\n", "The main topics in this notebook include: \n", "- scope of expressions \n", "- benefits \n", "- syntax\n", "- usage\n", "- coding examples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "This tutorial assumes familiarity with the following topics:\n", "- [Aerospike Notebooks - Readme and Tips](../readme_tips.ipynb)\n", "- [Hello World](hello_world.ipynb)\n", "- [Introduction to Data Modeling](java-intro_to_data_modeling.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Ensure Database is Running\n", "This notebook requires that Aerospike database is running. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2020-12-29T20:48:49.065421Z", "start_time": "2020-12-29T20:48:49.060897Z" } }, "outputs": [], "source": [ "import io.github.spencerpark.ijava.IJava;\n", "import io.github.spencerpark.jupyter.kernel.magic.common.Shell;\n", "IJava.getKernelInstance().getMagics().registerMagics(Shell.class);\n", "%sh asd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Download and Install Additional Components.\n", "Install the Aerospike Java client version 5.1.3 (or higher) that has the support for expressions." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2020-12-29T20:48:50.084636Z", "start_time": "2020-12-29T20:48:50.080629Z" } }, "outputs": [], "source": [ "%%loadFromPOM\n", "\n", " \n", " com.aerospike\n", " aerospike-client\n", " 5.1.3\n", " \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initialize Client\n", "Initialize the client. Also, define constants including the namespace `test` and set `expressions` and a convenient function `truncateTestData`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2020-12-29T20:48:50.771243Z", "start_time": "2020-12-29T20:48:50.767819Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initialized the client and connected to the cluster.\n" ] } ], "source": [ "import com.aerospike.client.AerospikeClient;\n", "\n", "AerospikeClient client = new AerospikeClient(\"localhost\", 3000);\n", "System.out.println(\"Initialized the client and connected to the cluster.\");\n", "\n", "final String Namespace = \"test\";\n", "final String Set = \"expressions\";\n", "\n", "// convenience functions to truncate test data\n", "void truncateTestData() {\n", " try {\n", " client.truncate(null, Namespace, Set, null);\n", " }\n", " catch (AerospikeException e) {\n", " // ignore\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Access Shell Commands\n", "You may execute shell commands including Aerospike tools like [aql](https://docs.aerospike.com/docs/tools/aql/index.html) and [asadm](https://docs.aerospike.com/docs/tools/asadm/index.html) in the terminal tab throughout this tutorial. Open a terminal tab by selecting File->Open from the notebook menu, and then New->Terminal." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Defining Expressions\n", "*An expression is a syntactic entity in a programming language that may be evaluated to determine its value.*[(Wikipedia)](https://en.wikipedia.org/wiki/Expression_(computer_science\\)) \n", "\n", "In other words, an expression evaluates to (or returns) a value. Some simple examples of an expression would be: \n", "
\n",
    "5\n",
    "7 + 3\n",
    "2 > 1\n",
    "
\n", "\n", "Expressions can have:\n", "- constants:
5, “horse”, [1, 2, 3]
\n", "- variables:
var x = pow(b, c) + d
\n", "- functions:
pow, mod, min
\n", "- and operators:
==, +, or
\n", "\n", "Expressions are composable. In other words, complex expressions can be formed from simpler expressions. For example: \n", "
\n",
    "1 + min(2, a + 2) <  sqrt(b)\n",
    "
\n", "\n", "An expression is not an assignment: An expression does not assign a value to a variable, but simply evaluates to a value which may be used in an assignment statement that assigns the value to a variable. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Expressions in Aerospike\n", "This section provides a higher level view of the capabilities and workings of expressions in Aerospike. The subsequent sections will drill down into the details.\n", "## Evaluation Context\n", "Expressions are evaluated on server for filtering conditions, reading and writing to bins, and configuring XDR replication. Therefore, an expression only works on server data entities such as the metadata and record data, and uses any constants that the client may provide. When used from the client library, expressions are created on the client and sent to the server in an API operation. Before sending, the client object format of an expression is converted to a wire format using the `build` operation.\n", "## Components and Scope\n", "*An expression is a combination of one or more constants, variables, functions, and operators that the programming language interprets ... and computes to produce another value.* [(Wikipedia)](https://en.wikipedia.org/wiki/Expression_(computer_science\\)) \n", "\n", "In Aerospike, expressions use bins and metadata as variables, metadata and API functions, and values that ae strongly typed as boolean, integer, float, string, list, map, blob, GeoJSON, or HyperLogLog. A host of arithmetic, logical, convenience, and API operations are available for these data. \n", "\n", "Please refer to the [documentation](https://docs.aerospike.com/docs/guide/expressions/) for the list of supported components.\n", "## Immutability of Components\n", "In Aerospike, an expression works on a transient copy, therefore evauating an expression does not change the metadata or bins that are used in the expression.\n", "## Use of Variables\n", "A variable can be defined to represent a sub-expression for syntactic clarity and efficiency. A variable is first defined and initialized by assigning it to an expression, and then used as a substitute for the expression. In the example below, a variable `myvar` is defined and used in an expression `myexpr`:\n", "
\n",
    "myvar = (a + b) / min(a, b)\n",
    "myexpr =  myvar + 1 / myvar\n",
    "
\n", "## Conditional Evaluation\n", "An expression can be condiitonally evaluated with an `if-then-else` like construct. For example: \n", "
\n",
    "if (cond_1) {\n",
    "    myexpr = expr_1;\n",
    "}\n",
    "else if (cond_2) {\n",
    "    myexpr = expr_2;\n",
    "}\n",
    "... \n",
    "else {\n",
    "    myexpr = default;\n",
    "}\n",
    "
\n", "## Uses and Types\n", "Expressions are used in: \n", "- selection conditions aka predicates (called Filter Expressions), \n", "- operations (called Operation Expressions), and \n", "- XDR's shipping configuration (called XDR Filter Expressions). \n", "\n", "The functionality of expressions is the same, although the context determines their use. For example, Filter and XDR Filter Expressions are boolean expressions, whereas Operation Expressions can evaluate to any supported type. \n", "\n", "Only Filter and Operation Expressions can be used in the client library and therefore will be the focus of this tutorial. Please refer to the [documentation](https://docs.aerospike.com/docs/guide/expressions/#xdr-filter-expressions) for the details of XDR Filter Expressions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Benefits of Expressions\n", "Here are some key benefits and capabilities that expressions enable:\n", "\n", "- Capabilities in expressions include: \n", " - variables for syntactic clarity and efficiency,\n", " - conditional evaluation, \n", " - access to metadata and bin data, and\n", " - access to powerful APIs and enhanced set of operators.\n", " \n", "\n", "- The enhanced filtering expressions allow records to be processed more efficiently by avoiding the need for potentially more expensive client or UDF based processing. \n", "\n", "\n", "- Reads and writes are now possible with Operation Expressions.\n", " - in reads, this can eliminate the need to bring large amounts of data to the client with more precise ability to specify the data to be fetched. \n", " - a bin can be updated with the results of an expression, which can eliminate having to read before update by allowing everything to happen on the server side in the same request including the read, processing for update, and update. This saves a round-trip and transfer of potentially large data. In a concurrent setting, this also avoids retries due to conflicts [see the R-M-W pattern](../python/transactions_rmw_pattern.ipynb). \n", " \n", "\n", "- Multi-step operations that can build on each other’s results are now possible through operation expressions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Syntax Details\n", "With a better understanding of their structure, it is easier to parse Aerospike expressions.\n", "\n", "## Notation\n", "Aerospike expressions use [Polish Notation](https://en.wikipedia.org/wiki/Polish_notation) (aka prefix notation) which is widely seen in most programming language functions: `fn(a, b)`. So the expression `5 + 3` in Aerospike Java client would be:\n", "
\n",
    "Exp.add(\n",
    "    Exp.val(5), \n",
    "    Exp.val(3))\n",
    "
\n", "Note, the overloaded `val` method converts all suported types to a `Value` object, which provides an abstraction for all supported value types." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Composition\n", "A complex expression can be composed using two or more sub-expressions. For example, with integer bins `a` and `b`, the expression `(a - b) / (a + b)` would be:\n", "
\n",
    "Exp.div(\n",
    "    Exp.sub(\n",
    "        Exp.intBin(\"a\"), \n",
    "        Exp.intBin(\"b\")),\n",
    "    Exp.add(\n",
    "        Exp.intBin(\"a\"), \n",
    "        Exp.intBin(\"b\")))\n",
    "
\n", "Note, there are corresponding access methods to access bin values for other supported types. Since a bin may hold any value type, an incorrect type access results in an error. A conditional type check may be used to prevent a run-time error." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variable Definition and Use \n", "The `let` construct defines the scope of variables and the expression that uses them. The `def` construct defines a variable and assigns it to an expression. Another expression in the scope can use the variable as a substitute for the expression it defines. For example, in the expression `5 < (a + b) < 10` using a variable `x` for the sum of integer bins `a` and `b`:\n", "
\n",
    "Exp.let(   // let defines the scope of variables for this expression\n",
    "   Exp.def(\"x\",    // def defines a variable\n",
    "       Exp.sum(    //    and also assigns it to an expression\n",
    "           Exp.intBin(\"a\"), \n",
    "           Exp.intBin(\"b\")),  \n",
    "   Exp.and(        // the expression in let scope can use the variable\n",
    "       Exp.lt(\n",
    "           Exp.val(5), \n",
    "           Exp.var(\"x\")),   // var to use the variable\n",
    "     Exp.lt(\n",
    "         Exp.var(\"x\"), \n",
    "         Exp.val(10))));\n",
    "
\n", "Note in the above example, the variable `x` avoids repetitive access to the bins `a` and `b`. Also, variables defined in `let` cannot be used beyond its scope. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conditional Evaluation\n", "The `cond` construct includes one or more pairs of `bool exp, value exp` followed by a default value:\n", "
\n",
    "bool exp1, value exp1, bool exp2, value exp2, ..., default-value\n",
    "
\n", "It evaluates like the if-then-else logic: the expression takes the value of the first `value exp` in the sequence whose corresponding `bool exp` evaluates to true. If all boolean conditions fail, then it evaluates to the last default-value. \n", "\n", "So an expression to evaluate a simple `risk` value \"high\" or \"normal\" based on int bin `age` and bool bin `comorbidities` would be:\n", "
\n",
    "// if (age > 65 && comorbidities) {risk = \"high\";} \n",
    "// else {risk = \"normal\";}\n",
    "Exp.cond(\n",
    "    Exp.and(\n",
    "        Exp.gt(\n",
    "            Exp.intBin(\"age\"), Exp.val(65)),\n",
    "        Exp.boolBin(comorbidities)),\n",
    "    Exp.val(\"high\"), \n",
    "    Exp.val(\"normal\"));\n",
    "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Useful Syntax Patterns\n", "Here is a table that summarizes some useful expression syntax patterns.\n", "\n", "Expression | Syntax Example\n", ":-------- | :---- \n", "3 | Exp.val(3))\n", "\"abc\" | Exp.val(\"abc\")\n", "\\|-3\\| | Exp.abs(Exp.val(-3))\n", "1 + 2 | Exp.add(Exp.val(1)), Exp.val(2))\n", "var a = 5 | Exp.def(\"a\", Exp.val(5))\n", "2 > 3 | Exp.gt(Exp.val(2), Exp.val(3)\n", "Function lastUpdateTime | Exp.lastUpdateTime()\n", "List API listSize | ListExp.listSize(list)\n", "Composition a + 2 * b | Exp.add(Exp.var(a), Exp.mul(Exp.val(2), Exp.var(b))\n", "Condtional eval if (a == 1) then 2; else 3 | Exp.cond(Exp.eq(Exp.var(\"a\"), Exp.val(1)), Exp.val(2), Exp.val(3))\n", "Integer \"bin\" value | Exp.intBin(\"bin\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Coding Patterns\n", "An expression object is constructed on the client to be sent to the server where it is evaluated and used.\n", "\n", "An expression's wire protocol representation is constructed with the `build()` function. A simple expression `fname == \"Frank\"` will be built thus:\n", "
\n",
    "Expression simpleExp = Exp.build(\n",
    "                            Exp.eq(\n",
    "                                Exp.stringBin(\"fname\"), \n",
    "                                Exp.val(\"Frank\")));\n",
    "
\n", "\n", "Note the wire protocol representation of expression is of type `Expression`, whereas a client object is of type `Exp`.\n", "\n", "An expression can be used as a filter expression or an operation expression, as described below. \n", "\n", "Both filter and operation expressions can be used independently of each other and also in the same API call.\n", "\n", "## Filter Expressions\n", "Filter expressions are so named because they are used as a condition to select or discard a record. They always evaluate to a boolean value to indicate whether the record is selected (true) or filtered out (false).\n", "A filter expression, as the deprecated Predicate Expression, is sent to the server through the API's policy object parameter. \n", "
\n",
    "Policy policy = new Policy();\n",
    "policy.filterExp = Exp.build(   // sent through filterExp attribute of policy\n",
    "                        Exp.eq(\n",
    "                            Exp.intBin(\"a\"), \n",
    "                            Exp.val(11)));   \n",
    "...\n",
    "client.query(policy, stmt)      // policy is specified as a parameter in API calls\n",
    "
\n", " \n", "## Operation Expressions\n", "Operation expressions as the name suggests are used in an operation - either to read from bins or write to a bin. Specifically they are used in `read` and `write` methods of `ExpOperation`. \n", "\n", "The basic computational model of `operate`, where operation expressions are used, remains the same: A series of read or write operations are performed in a given sequence on a single record. What is new is that a read operation can be an expression involving zero or more bins. Also, a write operation can get the value from an expression (enabling, for example, use of cross-bin data with conditional logic) instead of a simple constant to update a bin. \n", "\n", "A read with operation expression can also use an arbitrary name for the \"computed bin\" similar to the \"as\" keyword in the SQL statement `SELECT expr AS bin`.\n", "\n", "The pattern for coding an Operation Expression is:\n", "1. Define Expression to read or write the bins.\n", "2. Use `Expression` object in `ExpOperation.read` or `.write` method that returns an `Operation`.\n", "3. Use \"expression operations\" in any API call that takes an operation list. \n", "\n", "This is illustrated below.\n", "\n", "
\n",
    "// operate expression with write \n",
    "// 1. Define Expression to write the bin with.\n",
    "//    if (age > 65 && comorbidities) {risk = \"high\";} \n",
    "//    else {risk = \"normal\";}\n",
    "Expression writeExp = Exp.build(\n",
    "        Exp.cond(\n",
    "            Exp.and(\n",
    "                Exp.gt(\n",
    "                    Exp.intBin(\"age\"), Exp.val(65)),\n",
    "                Exp.boolBin(comorbidities)),\n",
    "            Exp.val(\"high\"), \n",
    "            Exp.val(\"normal\"));\n",
    "// 2. Use Expression object in ExpOperation.write method.\n",
    "Operation writeExpOp = ExpOperation.write(\"risk\",\n",
    "                             Expression writeExp,   // evaluates bin value to update \n",
    "                             ExpWriteFlags.DEFAULT);\n",
    "\n",
    "// operate expression with read\n",
    "// 1. Define Expression to read bins.\n",
    "//    read \"yes\" if (risk == \"high\" or worktype == \"frontline\") else \"no\" \n",
    "//    as a computed bin \"eligible\"\n",
    "Expression readExp = Exp.build(\n",
    "        Exp.cond(\n",
    "            Exp.or(\n",
    "                Exp.eq(\n",
    "                    Exp.stringBin(\"risk\"), Exp.val(\"high\")),\n",
    "                Exp.eq(\n",
    "                    Exp.stringBin(\"worktype\"), Exp.val(\"frontline\"))),\n",
    "            Exp.val(\"yes\"),\n",
    "            Exp.val(\"no\")));\n",
    "// 2. Use Expression object in ExpOperation.read method.\n",
    "Operation readExpOp = ExpOperation.read(\"eligible\",   // named \"computed bin\"\n",
    "                             Expression readExp,      // evaluates value to return \n",
    "                             ExpReadFlags.DEFAULT);\n",
    "                             \n",
    "// 3. Use \"expression operations\" in any API call that takes an operation list.                            \n",
    "Record record = Client.operate(WritePolicy policy, Key key, Operation writeExpOp, Operation readExpOp);\n",
    "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Code Examples\n", "Below are code examples that illustrate the expression features described above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Filter Expressions\n", "The following example illustrates the capabilities of filtering on metadata and use of List APIs (neither are possible with the deprecated predicate expressions).\n", "\n", "In this illustrative example the filter selects:\n", "- recently updated (sinceUpdate < 2) records \n", "- with list bin having a range of values (i.e., max value - min value) greater than 1000. \n", "\n", "1. Populate the test data with 20 records with an integer bin \"bin1\" values 1-20 and a list bin having 3 randomly selected numbers in the range 1 to 2000.\n", "2. Sleep for 2 seconds, \n", "3. Touch the even numbered records. \n", "4. Run the query with the filter.\n", "\n", "The results should only contain even valued bin1 and bin2 with value range > 1000." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test data populated.\n", "Touched even numbered records.\n", "Results of filter expression query (all even records with bin2 max-min > 1000):\n", "key=id-4 bins={bin1=4, bin2=[1748, 569, 473]}\n", "key=id-10 bins={bin1=10, bin2=[153, 1437, 1302]}\n", "key=id-18 bins={bin1=18, bin2=[333, 1676, 55]}\n", "key=id-16 bins={bin1=16, bin2=[592, 220, 1888]}\n" ] } ], "source": [ "import java.util.ArrayList;\n", "import java.util.Random; \n", "import com.aerospike.client.AerospikeException;\n", "import com.aerospike.client.Bin;\n", "import com.aerospike.client.Key;\n", "import com.aerospike.client.policy.WritePolicy;\n", "import com.aerospike.client.policy.QueryPolicy;\n", "import com.aerospike.client.exp.Exp;\n", "import com.aerospike.client.exp.ListExp;\n", "import com.aerospike.client.Operation;\n", "import com.aerospike.client.task.ExecuteTask;\n", "import com.aerospike.client.query.Statement;\n", "import com.aerospike.client.query.RecordSet;\n", "import com.aerospike.client.Record;\n", "import com.aerospike.client.cdt.ListReturnType;\n", "\n", "// start with a clean state\n", "truncateTestData();\n", "\n", "// 1. Populate the test data with 20 records with an integer bin \"bin1\" values 1-20 \n", "// and a list bin having 3 randomly selected numbers in the range 1 to 2000.\n", "\n", "Random rand = new Random(1); \n", "final int LIST_RANGE = 2000;\n", "WritePolicy wpolicy = new WritePolicy();\n", "wpolicy.sendKey = true;\n", "for (int i = 1; i <= 20; i++) {\n", " Key key = new Key(Namespace, Set, \"id-\"+i);\n", " Bin bin1 = new Bin(new String(\"bin1\"), i);\n", " ArrayList intList = new ArrayList();\n", " intList.add(rand.nextInt(LIST_RANGE));\n", " intList.add(rand.nextInt(LIST_RANGE));\n", " intList.add(rand.nextInt(LIST_RANGE));\n", " Bin bin2 = new Bin(new String(\"bin2\"), intList);\n", " client.put(wpolicy, key, bin1, bin2);\n", "}\n", "System.out.println(\"Test data populated.\");;\n", "\n", "// 2. Sleep for 2 seconds, \n", "Thread.sleep(2000);\n", "\n", "// 3. Touch the even numbered records.\n", "Statement stmt = new Statement();\n", "stmt.setNamespace(Namespace);\n", "stmt.setSetName(Set);\n", "\n", "WritePolicy policy = new WritePolicy();\n", "policy.filterExp = Exp.build(\n", " Exp.eq(\n", " Exp.mod(Exp.intBin(\"bin1\"), Exp.val(2)),\n", " Exp.val(0)));\n", "\n", "ExecuteTask task = client.execute(policy, stmt, Operation.touch()); \n", "task.waitTillComplete(500, 1000);\n", "System.out.println(\"Touched even numbered records.\");;\n", "\n", "// 4. Run the query with the filter.\n", "// records updated in last 2 seconds and whose list value range is more than 1000\n", "Statement stmt = new Statement();\n", "stmt.setNamespace(Namespace);\n", "stmt.setSetName(Set);\n", "\n", "// expression filter is specifed in the operation policy\n", "QueryPolicy policy = new QueryPolicy(client.queryPolicyDefault);\n", "policy.filterExp = Exp.build(\n", " Exp.and(\n", " Exp.lt(Exp.sinceUpdate(), Exp.val(2000)), // updated in last 2s\n", " Exp.gt( // range of values in bin2 greater than 1000\n", " Exp.sub(ListExp.getByRank(ListReturnType.VALUE, Exp.Type.INT, Exp.val(-1), Exp.listBin(\"bin2\")), // largest\n", " ListExp.getByRank(ListReturnType.VALUE, Exp.Type.INT, Exp.val(0), Exp.listBin(\"bin2\"))), // smallest\n", " Exp.val(1000))));\n", "\n", "RecordSet rs = client.query(policy, stmt);\n", "\n", "System.out.println(\"Results of filter expression query (all even records with bin2 max-min > 1000):\");\n", "while (rs.next()) {\n", " Key key = rs.getKey();\n", " Record record = rs.getRecord();\n", " System.out.format(\"key=%s bins=%s\\n\", key.userKey, record.bins);\n", "}\n", "rs.close();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may view the state of the database and ensure correctness of the output by running the following command in the terminal tab:\n", "\n", "`\n", "aql -c \"select * from test.expressions\"\n", "`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operation Expressions\n", "In the following example, these new capabilites that were not possible earlier are illustrated:\n", "- expressions involving zero or more bins to write a bin\n", "- named \"computed bins\" that return the value of a specified expression involving zero or more bins\n", "- conditional evaluation of expression\n", "- use of variables in an expression\n", "\n", "The code has the following steps:\n", "1. The test data is populated with three randomly generated test scores ranging from 50 to 100 for student ids 1-20.\n", "2. The data is updated by writing two additional bins: \"class\" which represents the teacher's input (0-10) based on class participation, and \"grade\" which is computed by adding \"classwork\" to average of test scores, and using this formula to compute the grade: 50-70 -> C, 65==70-85 -> B, 85+ -> A.\n", "3. A report is then generated for the id, grade, total score, and min/max/average of test scores." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test data populated.\n", "Updated class participation and computed grades.\n", "key=id-1 bins={id=1, grade=A, total=90, testMin=85, testMax=97, testAvg=90}\n", "key=id-2 bins={id=2, grade=C, total=57, testMin=54, testMax=63, testAvg=57}\n", "key=id-3 bins={id=3, grade=B, total=79, testMin=56, testMax=84, testAvg=72}\n", "key=id-4 bins={id=4, grade=A, total=86, testMin=69, testMax=98, testAvg=80}\n", "key=id-5 bins={id=5, grade=B, total=73, testMin=62, testMax=67, testAvg=64}\n", "key=id-6 bins={id=6, grade=A, total=88, testMin=62, testMax=92, testAvg=79}\n", "key=id-7 bins={id=7, grade=A, total=88, testMin=76, testMax=96, testAvg=87}\n", "key=id-8 bins={id=8, grade=B, total=82, testMin=60, testMax=99, testAvg=80}\n", "key=id-9 bins={id=9, grade=B, total=80, testMin=59, testMax=98, testAvg=77}\n", "key=id-10 bins={id=10, grade=C, total=69, testMin=52, testMax=87, testAvg=64}\n", "key=id-11 bins={id=11, grade=C, total=54, testMin=50, testMax=55, testAvg=53}\n", "key=id-12 bins={id=12, grade=C, total=61, testMin=55, testMax=63, testAvg=58}\n", "key=id-13 bins={id=13, grade=B, total=81, testMin=70, testMax=89, testAvg=78}\n", "key=id-14 bins={id=14, grade=B, total=70, testMin=60, testMax=84, testAvg=69}\n", "key=id-15 bins={id=15, grade=B, total=84, testMin=77, testMax=87, testAvg=80}\n", "key=id-16 bins={id=16, grade=A, total=90, testMin=70, testMax=92, testAvg=83}\n", "key=id-17 bins={id=17, grade=C, total=61, testMin=50, testMax=57, testAvg=53}\n", "key=id-18 bins={id=18, grade=B, total=77, testMin=55, testMax=83, testAvg=71}\n", "key=id-19 bins={id=19, grade=B, total=79, testMin=58, testMax=95, testAvg=79}\n", "key=id-20 bins={id=20, grade=B, total=81, testMin=54, testMax=84, testAvg=72}\n" ] } ], "source": [ "import com.aerospike.client.exp.Expression;\n", "import com.aerospike.client.exp.ExpOperation;\n", "import com.aerospike.client.exp.ExpReadFlags;\n", "import com.aerospike.client.exp.ExpWriteFlags;\n", " \n", "// start with a clean state\n", "truncateTestData();\n", "\n", "// 1. The test data is populated with three randomly genrated test scores ranging from 50 to 100 \n", "// for student ids 1-20.\n", "\n", "Random rand = new Random(1); \n", "final int SCORE_RANGE = 50;\n", "WritePolicy wpolicy = new WritePolicy();\n", "wpolicy.sendKey = true;\n", "for (int i = 1; i <= 20; i++) {\n", " Key key = new Key(Namespace, Set, \"id-\"+i);\n", " Bin id = new Bin(new String(\"id\"), i);\n", " ArrayList testScores = new ArrayList();\n", " testScores.add(50 + rand.nextInt(SCORE_RANGE));\n", " testScores.add(50 + rand.nextInt(SCORE_RANGE));\n", " testScores.add(50 + rand.nextInt(SCORE_RANGE));\n", " Bin tests = new Bin(new String(\"tests\"), testScores);\n", " client.put(wpolicy, key, id, tests);\n", "}\n", "System.out.println(\"Test data populated.\");;\n", "\n", "// 2. The data is updated by writing two additional bins: \"class\" which represents the teacher's input (0-10) \n", "// based on class participation, and \"grade\" which is computed by adding \"classwork\" to average of test scores, \n", "// and using this formula: 50-70 -> C, 65==70-85 -> B, 85+ -> A.\n", "\n", "// define the expressions for average test score, total score, grade, min test score, and max test score\n", "Exp avgScoreExp = Exp.let(\n", " Exp.def(\"testsBin\", Exp.listBin(\"tests\")),\n", " Exp.div(\n", " Exp.add(ListExp.getByIndex(ListReturnType.VALUE, Exp.Type.INT, Exp.val(0), Exp.var(\"testsBin\")),\n", " ListExp.getByIndex(ListReturnType.VALUE, Exp.Type.INT, Exp.val(1), Exp.var(\"testsBin\")),\n", " ListExp.getByIndex(ListReturnType.VALUE, Exp.Type.INT, Exp.val(2), Exp.var(\"testsBin\"))),\n", " Exp.val(3)));\n", "Expression avgScoreExpression = Exp.build(avgScoreExp);\n", "\n", "Exp totalScoreExp = Exp.let(\n", " Exp.def(\"classBin\", Exp.intBin(\"class\")),\n", " Exp.add(avgScoreExp, Exp.var(\"classBin\")));\n", "Expression totalScoreExpression = Exp.build(totalScoreExp);\n", "\n", "Exp gradeExp = Exp.let(\n", " Exp.def(\"total\", totalScoreExp),\n", " Exp.cond(\n", " Exp.lt(Exp.var(\"total\"), Exp.val(70)), Exp.val(\"C\"),\n", " Exp.le(Exp.var(\"total\"), Exp.val(85)), Exp.val(\"B\"),\n", " Exp.gt(Exp.var(\"total\"), Exp.val(85)), Exp.val(\"A\"),\n", " Exp.unknown()));\n", "Expression gradeExpression = Exp.build(gradeExp);\n", "\n", "Exp minExp = ListExp.getByRank(ListReturnType.VALUE, Exp.Type.INT, Exp.val(0), Exp.listBin(\"tests\")); // min\n", "Expression minExpression = Exp.build(minExp);\n", "\n", "Exp maxExp = ListExp.getByRank(ListReturnType.VALUE, Exp.Type.INT, Exp.val(-1), Exp.listBin(\"tests\")); // max\n", "Expression maxExpression = Exp.build(maxExp);\n", "\n", "// update class bin with a random 1-10, and the grade bin using gradeExpOp\n", "for (int i = 1; i <= 20; i++) {\n", " Key key = new Key(Namespace, Set, \"id-\"+i);\n", " int classwork = rand.nextInt(SCORE_RANGE)/5;\n", " Bin classBin = new Bin(new String(\"class\"), classwork);\n", " \n", " // write gradeExp to the bin \"grade\"\n", " Operation gradeExpOp = ExpOperation.write(\"grade\",\n", " gradeExpression, \n", " ExpWriteFlags.DEFAULT);\n", " \n", " client.operate(wpolicy, key, Operation.put(classBin), gradeExpOp);\n", "}\n", "System.out.println(\"Updated class participation and computed grades.\");;\n", "\n", "// run a report for all students with id, grade, total, min, max, avg\n", "// using total, min, max, and avg expressions defined above\n", "\n", "Operation ops[] = { Operation.get(\"id\"), \n", " Operation.get(\"grade\"),\n", " ExpOperation.read(\"total\", totalScoreExpression, ExpReadFlags.DEFAULT),\n", " ExpOperation.read(\"testMin\", minExpression, ExpReadFlags.DEFAULT),\n", " ExpOperation.read(\"testMax\", maxExpression, ExpReadFlags.DEFAULT),\n", " ExpOperation.read(\"testAvg\", avgScoreExpression, ExpReadFlags.DEFAULT)\n", "};\n", "\n", "for (int i = 1; i <= 20; i++) {\n", " Key key = new Key(Namespace, Set, \"id-\"+i);\n", " Record record = client.operate(null, key, ops);\n", " System.out.format(\"key=%s bins=%s\\n\", key.userKey, record.bins);\n", "}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may view the state of the database and ensure correctness of the output by running the following command in the terminal tab:\n", "\n", "`\n", "aql -c \"select * from test.expressions\"\n", "`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Using Expression Operations vs R-M-W or UDFs\n", "Aerospike developers have multiple ways to perform a record oriented read-write logic. \n", "1. Read record data to the client, modify, and write back (\"R-M-W\").\n", "2. Create a UDF for the logic and invoke it on the record.\n", "3. Use expression operations in a mulit-op request.\n", "\n", "For read-write transactions, fetching the data to the client and writing back is expensive and requires special care to ensure read-write isolation. Lua UDFs can be difficult to implement, less flexible to change, and can be slower. So it is generally beneficial to use expression operations when possible.\n", "\n", "Here is a suggested decision process:\n", "1. Use expression operations. However if expression operations cannot be used because the task, for example, requires unsupported features such as iterators and loops, then:\n", "2. Use client-side Read-Modify-Write (R-M-W) with version check if amount of data transfer as well as possibility of conflict due to concurrency is limited. Otherwise:\n", "3. Use UDFs if Lua server side programming model and performance meet the needs. Otherwise must use 2.\n", "\n", "Note, Aerospike provides many ways to implement a given data task on one or multiple records. To determine the optimal way for a given task, one should consider and evaluate the options available including the various execution modes (synchronous, asynchronous, background, etc)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Usage Notes\n", "- Policy currently allows both the deprecated predExp and new `filterExp`, but they are mutually exclusive. If both are specified, only `filterExp` will be used and `predExp` will be ignored.\n", "\n", "- Errors during evaluation:\n", " - Type match, bin existence, etc, can be checked using `cond` to avoid run time evaluation errors. \n", "
\n",
    "   Exp.cond(\n",
    "       Exp.eq(   // check if the bin is of type int\n",
    "           Exp.binType(\"a\"),\n",
    "           Exp.val(ParticleType.INTEGER)),\n",
    "       Exp.eq(   // perform int comparison\n",
    "           Exp.intBin(\"a\"),\n",
    "           Exp.val(1)),\n",
    "       Exp.val(false));   // default is false\n",
    "   
\n", " - Filter expressions treat the final unknown value as false, whereas in operation expressions it results in an error.\n", " - If appropriate, evaluation failure can be ignored while performing multiple `operate` operations by setting the flags argument in `ExpOperation.read` or `.write` to `ExpReadFlags.EVAL_NO_FAIL` or `ExpWriteFlags.EVAL_NO_FAIL` respectively.\n", "\n", "- Constructs like loops and iterators over record bins or CDT elements are not currently supported. General manipulation of data beyond what is available in the APIs also is not supported." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Takeaways and Conclusion\n", "The tutorial described expressions capabilities in Aerospike. It explained the scope and syntax, and described the key components and constructs. It provided code examples for how to work with expressions in two client uses: filter expressions and operation expressions. \n", "\n", "The enhanced capabilities in filtering expressions allow records to be processed more efficiently by avoiding the need for more expensive client or udf based processing. New capabilities indlude access to metadata, bin data, powerful APIs, as well as enhanced arithmetic and other operators. \n", "\n", "Operation expressions can eliminate the need to read before update by allowing read, processing for update, and update to happen on the server side in the same request. This saves a round-trip and transfer of potentially large data.\n", "\n", "Expressions provide powerful capabilities; evaluate and use them if they are suitable and provide better performance for your use case over UDFs and client-side processing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Cleaning Up\n", "Remove tutorial data and close connection." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2020-12-29T20:49:19.972650Z", "start_time": "2020-12-29T20:49:19.967344Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Removed tutorial data and closed server connection.\n" ] } ], "source": [ "truncateTestData();\n", "client.close();\n", "System.out.println(\"Removed tutorial data and closed server connection.\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Further Exploration and Resources\n", "Here are some links for further exploration.\n", "\n", "## Resources\n", "- Workshop video\n", " - [Unleashing the Power of Expressions Workshop (Digital Summit 2021)](https://www.youtube.com/watch?v=ebRLnXvpWaI&list=PLGo1-Ya-AEQCdHtFeRpMEg6-1CLO-GI3G&index=8) \n", "- Docs\n", " - [Aerospike Expressions Guide](https://docs.aerospike.com/docs/guide/expressions/)\n", " - [Java Expression Classes](https://docs.aerospike.com/apidocs/java/com/aerospike/client/exp/package-frame.html)\n", " - [Aerospike Documentation](https://docs.aerospike.com/docs/)\n", "- Related notebooks\n", " - [Read-Write Transactions with R-M-W Pattern (Python)](../python/transactions_rmw_pattern.ipynb)\n", " - [Implementing SQL Operations: SELECT](sql_select.ipynb), \n", " - [Implementing SQL Operations: CREATE, UPDATE, DELETE](sql_updates.ipynb)\n", " - [Working with Lists](java-working_with_lists.ipynb)\n", " - [Working with Maps](java-working_with_maps.ipynb)\n", "- Aerospike Developer Hub\n", " - [Java Developers Resources](https://developer.aerospike.com/java-developers)\n", "- Github repos\n", " - [Java code examples](https://github.com/aerospike/aerospike-client-java/tree/master/examples/src/com/aerospike/examples)\n", " - [Java Client](https://www.aerospike.com/docs/client/java/index.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Explore Other Notebooks\n", "\n", "Visit [Aerospike notebooks repo](https://github.com/aerospike-examples/interactive-notebooks) to run additional Aerospike notebooks. To run a different notebook, download the notebook from the repo to your local machine, and then click on File->Open in the notebook menu, and select Upload." ] } ], "metadata": { "kernelspec": { "display_name": "Java", "language": "java", "name": "java" }, "language_info": { "codemirror_mode": "java", "file_extension": ".jshell", "mimetype": "text/x-java-source", "name": "Java", "pygments_lexer": "java", "version": "11.0.8+10-LTS" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }