{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ExprRules.jl\n", "\n", "This is the base package to support the generation and optimization of Julia expressions from a grammar. The package contains many basic functions for declaring and working with grammars and expression trees.\n", "\n", "## Installation\n", "\n", " Pkg.add(\"ExprRules\")\n", "\n", "Once it's installed, start using the package by calling:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using ExprRules" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Usage\n", "\n", "### Define a grammar\n", "\n", "Grammars are specified by production rules that specify substitutions of non-terminal symbols. Each production rule is an equality with a non-terminal on the left side and a Julia expression on the right side. \n", "\n", "The _() syntax is a special function where the argument is evaluated at the time of derivation tree's construction and the value is held constant throughout the life of the tree. The pipe (|) syntax is a short-hand that allows the user to define multiple production rules on a single line (i.e., Backus-Naur Form). The |() syntax is another similar short-hand that takes a collection as argument and creates a production rule for each element in the collection." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1: Real = x\n", "2: Real = Real * Real\n", "3: Real = f(Real)\n", "4: Real = _(Base.rand(1.0:5.0))\n", "5: Real = A\n", "6: Real = B\n", "7: Real = cos(Real)\n", "8: Real = 1\n", "9: Real = 2\n", "10: Real = 3\n", "11: Real = 4\n", "12: Real = 5\n", "13: Real = 6\n", "14: Real = 7\n", "15: Real = 8\n", "16: Real = 9\n" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grammar = @grammar begin\n", " Real = x # symbol\n", " Real = Real * Real # julia expression\n", " Real = f(Real) # function call\n", " Real = _(Base.rand(1.0:5.0)) # special syntax, eval argument of _() at derivation time\n", " Real = A | B | cos(Real) # multiple rules on a single line\n", " Real = 1 | 2 | 3\n", " Real = |(4:6) # same as Real = 4 | 5 | 6\n", " Real = |([7,8,9]) # same as Real = 7 | 8 | 9 \n", "end" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "f (generic function with 1 method)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f(x) = 2x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Grammar helper functions\n", "\n", "List non-terminals of the grammar:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1-element Array{Symbol,1}:\n", " :Real" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nonterminals(grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Grammars are indexed by non-terminal symbols and return the corresponding production rule numbers with that nonterminal." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "16-element Array{Int64,1}:\n", " 1\n", " 2\n", " 3\n", " 4\n", " 5\n", " 6\n", " 7\n", " 8\n", " 9\n", " 10\n", " 11\n", " 12\n", " 13\n", " 14\n", " 15\n", " 16" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grammar[:Real]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Get the return type of the first production rule:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ ":Real" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "return_type(grammar, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the number of children of the second production rule:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nchildren(grammar, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the child types of the second production rule:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Array{Symbol,1}:\n", " :Real\n", " :Real" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "child_types(grammar, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the maximum number of children (arity) of the grammar:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "max_arity(grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Determine if the third production rule is terminal:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "false" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isterminal(grammar, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Determine if the fourth production rule is a special _() function:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iseval(grammar, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Expression trees\n", "\n", "An expression tree represents the derivation of an expression as a tree. The nodes in an expression tree contain an index to a production rule.\n", "\n", "Define an expression tree manually:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rulenode = RuleNode(3, [RuleNode(6)])\n", "display(rulenode, grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a random expression tree from the grammar:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "using Random" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Random.seed!(138)\n", "rulenode = rand(RuleNode, grammar, :Real, 10)\n", "display(rulenode, grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Evaluate the expression defined by the expression tree:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x=0.5\n", "Core.eval(rulenode, grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the executable Julia expression from an expression tree:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ex = get_executable(rulenode, grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rather than using Julia's built-in eval function, a much more performant way of evaluating an expression is to use ExprRule's interpreter by providing a custom symbol table. SymbolTable can try to automatically populate the symbol table by analyzing the grammar. Symbols corresponding to input variables should be provided at on-the-fly. Benchmarking suggests that using the custom interpreter can yield up to 20x performance improvement." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "S = SymbolTable(grammar)\n", "S[:x] = 5\n", "Core.eval(S, ex)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sample a random node in the tree:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Random.seed!(0)\n", "sample(rulenode)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sample a random node of type :Real in the expression tree:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Random.seed!(3)\n", "sample(rulenode, :Real, grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sample a random node in the tree and store the location in a NodeLoc object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Random.seed!(1)\n", "loc = sample(NodeLoc, rulenode)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Retrieve the node pointed to by the NodeLoc object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "get(rulenode, loc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Replace the subtree pointed to by loc with a randomly generated subtree:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Random.seed!(28)\n", "insert!(rulenode, loc, rand(RuleNode, grammar, :Real, 3))\n", "display(rulenode, grammar)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Minimum Depth Map\n", "\n", "Compute the minimum depth of all possible subtrees for each production rule:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "dmap = mindepth_map(grammar)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute the minimum depth of all possible subtrees starting from a given start symbol:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mindepth(grammar, :Real, dmap) #zero for terminals" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Expression Iterator\n", "\n", "Iterate over all possible expressions of a grammar up to depth 2:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "grammar = @grammar begin\n", " Real = Real + Real\n", " Real = 1 | 2\n", "end\n", "iter = ExpressionIterator(grammar, 2, :Real)\n", "collect(iter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Count the number of expressions of a grammar up to depth 2:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "count_expressions(grammar, 2, :Real)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## AbstractTrees.jl Interface" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "using AbstractTrees" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print RuleNode tree in textual format. Leaf nodes are the rule numbers in the grammar." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tree = RuleNode(1, [RuleNode(2), RuleNode(1, [RuleNode(2), RuleNode(3)])])\n", "print_tree(tree)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.4.0", "language": "julia", "name": "julia-1.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.0" } }, "nbformat": 4, "nbformat_minor": 2 }