{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "This is the first part of a tutorial describing the Iteration/Expression Tree (IET), an intermediate representation based on Abstract Syntax Trees (AST) used in the Devito compiler. In this part we will show how to access and navigate the IET rooted in an `Operator`.\n", "\n", "The reader of this tutorial is expected to be familiar with the basic Devito API (i.e., `Grid`, `Function`/`TimeFunction`, `Operator`, ...). Otherwise, the CFD tutorials in `examples/cfd` are a better place to start.\n", "\n", "# Part I - Top Down\n", "\n", "Let's start with a fairly trivial example." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Data([[[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]],\n", "\n", " [[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]]], dtype=float32)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from devito import Eq, Grid, TimeFunction, Operator\n", "\n", "grid = Grid(shape=(3, 3))\n", "u = TimeFunction(name='u', grid=grid)\n", "u.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we just defined a time-varying grid with 3x3 points in each of the space `Dimension`s _x_ and _y_. `u.data` gives us access to the actual data values. In particular, `u.data[0, :, :]` holds the data values at timestep `t=0`, whereas `u.data[1, :, :]` holds the data values at timestep `t=1`.\n", "\n", "We now create an `Operator` that increments by 1 all points in the computational domain.", "\n", "Note: User should not change to `configuration['openmp'] = 1` as the IET structure changes with OpenMP enabled." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from devito import configuration\n", "configuration['openmp'] = 0 # We don't care about OpenMP in this example\n", "\n", "eq = Eq(u.forward, u+1)\n", "op = Operator(eq)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An `Operator` is an IET node that can generate, JIT-compile, and run low-level code (e.g., C). Just like all other types of IET nodes, it's got a number of metadata attached. For example, we can query an `Operator` to retrieve the user-provided SymPy expressions." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Eq(u(t + dt, x, y), u(t, x, y) + 1)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op.args['expressions']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we print `op`, we can see how the generated code looks like." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#define _POSIX_C_SOURCE 200809L\n", "#include \"stdlib.h\"\n", "#include \"math.h\"\n", "#include \"sys/time.h\"\n", "#include \"xmmintrin.h\"\n", "#include \"pmmintrin.h\"\n", "\n", "struct profiler\n", "{\n", " double section0;\n", "} ;\n", "\n", "\n", "int Kernel(float *restrict u_vec, const int time_M, const int time_m, struct profiler* timers, const int x_M, const int x_m, const int x_size, const int y_M, const int y_m, const int y_size)\n", "{\n", " float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec;\n", " /* Flush denormal numbers to zero in hardware */\n", " _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);\n", " _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);\n", " for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", " {\n", " struct timeval start_section0, end_section0;\n", " gettimeofday(&start_section0, NULL);\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " #pragma omp simd\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1;\n", " }\n", " }\n", " gettimeofday(&end_section0, NULL);\n", " timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;\n", " }\n", " return 0;\n", "}\n", "\n" ] } ], "source": [ "print(op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As shown in other tutorials, we can JIT-compile and run `op` by executing:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Operator `Kernel` run in 0.00 s\n" ] }, { "data": { "text/plain": [ "Data([[[2., 2., 2.],\n", " [2., 2., 2.],\n", " [2., 2., 2.]],\n", "\n", " [[3., 3., 3.],\n", " [3., 3., 3.],\n", " [3., 3., 3.]]], dtype=float32)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op.apply(time=2)\n", "u.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the first time we are calling `apply` on `op`, so the `Operator` code gets written in a `.c` file and compiled into a library (a `.so` shared object if on Linux).\n", "\n", "`op` runs with the user-provided `time_M=2` and with the default arguments `u=u, x_m=0, x_M=2, y_m=0, y_M=2,` and `time_m=0`. The indices `*_m` and `*_M` represent the minimum and maximum iteration points along `Dimension` `*`.\n", "\n", "`op` can be used for a completely different run, for example providing a new `TimeFunction`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Data([[[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]],\n", "\n", " [[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]]], dtype=float32)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "u2 = TimeFunction(name='u', grid=grid)\n", "u2.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Any of the `Operator` default arguments may be replaced by passing suitable key-value parameters.\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Operator `Kernel` run in 0.00 s\n" ] }, { "data": { "text/plain": [ "Data([[[0., 0., 0.],\n", " [4., 4., 0.],\n", " [4., 4., 0.]],\n", "\n", " [[0., 0., 0.],\n", " [3., 3., 0.],\n", " [3., 3., 0.]]], dtype=float32)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op.apply(u=u2, x_m=1, x_M=2, y_m=0, y_M=1, time_M=3)\n", "u2.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note, however, that there is no need for recompilation. JIT compilation occurs only once, triggered by the first call to `op.apply()`.\n", "\n", "An `Operator` is the root of an IET that typically consists of several nested `Iteration`s and `Expression`s – two other fundamental IET node types. The user-provided SymPy equations are wrapped within `Expressions`. Loop nest embedding such expressions are constructed by suitably nesting `Iterations`.\n", "\n", "The Devito compiler constructs the IET from a collection of `Cluster`s, which represent a higher-level intermediate representation (not covered in this tutorial).\n", "\n", "The Devito compiler also analyses the IET to determine key computational properties, such as _sequential_, _parallel_, and _vectorizable_. These properties are attached directly to the relevant IET nodes.\n", "\n", "We can print the IET structure of an `Operator`, as well as the attached computational properties, using the utility function `pprint`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", " \n", "\n", " \n", " \n", "\n", " \n", "\n", " \n", " \n", " \n", "\n", " \n", "\n", " <[affine,sequential,wrappable] Iteration time::time::(time_m, time_M, 1)::(0, 0)>\n", " \n", " \n", " \n", "
\n", "\n", " <[affine,parallel] Iteration x::x::(x_m, x_M, 1)::(0, 0)>\n", " <[affine,parallel,vector-dim] Iteration y::y::(y_m, y_M, 1)::(0, 0)>\n", " \n", "\n", " \n", "\n", "\n", " \n", " section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;>\n", "\n", "\n", "\n" ] } ], "source": [ "from devito import pprint\n", "pprint(op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, `op` is represented as a ``. Attached to it are metadata, such as `_headers` and `_includes`, as well as a `body`, which includes the children IET nodes. Here, the body is a `List` object.\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['#define _POSIX_C_SOURCE 200809L']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op._headers" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['stdlib.h', 'math.h', 'sys/time.h', 'xmmintrin.h', 'pmmintrin.h']" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op._includes" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(,)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "op.body" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can inspect the `List` to observe that its immediate children are an `` and another ``. We can then proceed with the IET traversal until we locate the user-provided `SymPy` equations." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "float (*restrict u)[x_size + 1 + 1][y_size + 1 + 1] __attribute__((aligned(64))) = (float (*)[x_size + 1 + 1][y_size + 1 + 1]) u_vec;\n" ] } ], "source": [ "print(op.body[0].body[0]) # Printing the ArrayCast" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/* Flush denormal numbers to zero in hardware */\n", "_MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);\n", "_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);\n", "for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", "{\n", " struct timeval start_section0, end_section0;\n", " gettimeofday(&start_section0, NULL);\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " #pragma omp simd\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1;\n", " }\n", " }\n", " gettimeofday(&end_section0, NULL);\n", " timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;\n", "}\n" ] } ], "source": [ "print(op.body[0].body[1]) # Printing the List" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we access the `Iteration` representing the time loop." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t_iter = op.body[0].body[1].body[1].body[0]\n", "t_iter" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", "{\n", " struct timeval start_section0, end_section0;\n", " gettimeofday(&start_section0, NULL);\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " #pragma omp simd\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 1;\n", " }\n", " }\n", " gettimeofday(&end_section0, NULL);\n", " timers->section0 += (double)(end_section0.tv_sec-start_section0.tv_sec)+(double)(end_section0.tv_usec-start_section0.tv_usec)/1000000;\n", "}\n" ] } ], "source": [ "print(t_iter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can inspect the `Iteration` to discover what its iteration bounds are." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(time_m, time_M, 1)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t_iter.limits" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And as we keep going down through the IET, we can eventually reach the `Expression` wrapping the user-provided SymPy equation." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "expr = t_iter.nodes[0].body[0].body[0].nodes[0].nodes[0].body[0]\n", "expr.view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course, there are mechanisms in place to, for example, find all `Expression`s in a given IET. The Devito compiler has a number of IET visitors, among which `FindNodes`, usable to retrieve all nodes of a particular type. So we easily \n", "can get all `Expression`s within `op` as follows" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "''" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from devito.ir.iet import Expression, FindNodes\n", "exprs = FindNodes(Expression).visit(op)\n", "exprs[0].view" ] } ], "metadata": { "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }