{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Domain, Halo and Padding regions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial we will learn about data regions and how these impact the `Operator` construction. We will use a simple time marching example." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from devito import Eq, Grid, TimeFunction, Operator\n", "\n", "grid = Grid(shape=(3, 3))\n", "u = TimeFunction(name='u', grid=grid)\n", "u.data[:] = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At this point, we have a time-varying 3x3 grid filled with _1's_. Below, we can see the _domain_ data values:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]\n", "\n", " [[1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]]\n" ] } ], "source": [ "print(u.data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now create a time-marching `Operator` that, at each timestep, increments by `2` all points in the computational domain." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from devito import configuration\n", "configuration['language'] = 'C'\n", "\n", "eq = Eq(u.forward, u+2)\n", "op = Operator(eq, opt='noop')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can print `op` to get the generated code." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#define _POSIX_C_SOURCE 200809L\n", "#define START(S) struct timeval start_ ## S , end_ ## S ; gettimeofday(&start_ ## S , NULL);\n", "#define STOP(S,T) gettimeofday(&end_ ## S, NULL); T->S += (double)(end_ ## S .tv_sec-start_ ## S.tv_sec)+(double)(end_ ## S .tv_usec-start_ ## S .tv_usec)/1000000;\n", "\n", "#include \"stdlib.h\"\n", "#include \"math.h\"\n", "#include \"sys/time.h\"\n", "\n", "struct dataobj\n", "{\n", " void *restrict data;\n", " unsigned long * size;\n", " unsigned long * npsize;\n", " unsigned long * dsize;\n", " int * hsize;\n", " int * hofs;\n", " int * oofs;\n", " void * dmap;\n", "} ;\n", "\n", "struct profiler\n", "{\n", " double section0;\n", "} ;\n", "\n", "\n", "int Kernel(struct dataobj *restrict u_vec, const int time_M, const int time_m, const int x_M, const int x_m, const int y_M, const int y_m, struct profiler * timers)\n", "{\n", " float (*restrict u)[u_vec->size[1]][u_vec->size[2]] __attribute__ ((aligned (64))) = (float (*)[u_vec->size[1]][u_vec->size[2]]) u_vec->data;\n", "\n", " for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", " {\n", " START(section0)\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 2;\n", " }\n", " }\n", " STOP(section0,timers)\n", " }\n", "\n", " return 0;\n", "}\n", "\n" ] } ], "source": [ "print(op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we take a look at the constructed expression, `u[t1][x + 1][y + 1] = u[t0][x + 1][y + 1] + 2`, we see several `+1` were added to the `u`'s spatial indices." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is because the domain region is actually surrounded by 'ghost' points, which can be accessed via a stencil when iterating in proximity of the domain boundary. The ghost points define the _halo region._ The halo region can be accessed through the `data_with_halo` data accessor. As we see below, the halo points correspond to the zeros surrounding the domain region." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[0. 0. 0. 0. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 0. 0. 0. 0.]]\n", "\n", " [[0. 0. 0. 0. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 1. 1. 1. 0.]\n", " [0. 0. 0. 0. 0.]]]\n" ] } ], "source": [ "print(u.data_with_halo)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By adding the `+1` offsets, the Devito compiler ensures the array accesses are logically aligned to the equation’s physical domain. For instance, the `TimeFunction` `u(t, x, y)` used in the example above has one point on each side of the `x` and `y` halo regions; if the user writes an expression including `u(t, x, y)` and `u(t, x + 2, y + 2)`, the compiler will ultimately generate `u[t, x + 1, y + 1]` and `u[t, x + 3, y + 3]`. When `x = y = 0`, therefore, the values `u[t, 1, 1]` and `u[t, 3, 3]` are fetched, representing the first and third points in the physical domain. \n", "\n", "By default, the halo region has `space_order` points on each side of the space dimensions. Sometimes, these points may be unnecessary, or, depending on the partial differential equation being approximated, extra points may be needed." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]\n", "\n", " [[1. 1. 1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]]\n" ] } ], "source": [ "u0 = TimeFunction(name='u0', grid=grid, space_order=0)\n", "u0.data[:] = 1\n", "print(u0.data_with_halo)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]\n", "\n", " [[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 1. 1. 1. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]]\n" ] } ], "source": [ "u2 = TimeFunction(name='u2', grid=grid, space_order=2)\n", "u2.data[:] = 1\n", "print(u2.data_with_halo)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One can also pass a 3-tuple `(o, lp, rp)` instead of a single integer representing the discretization order. Here, `o` is the discretization order, while `lp` and `rp` indicate how many points are expected on left and right sides of a point of interest, respectivelly." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "u_new = TimeFunction(name='u_new', grid=grid, space_order=(4, 3, 1))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]\n", "\n", " [[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 1. 1. 1. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]]\n" ] } ], "source": [ "u_new.data[:] = 1\n", "print(u_new.data_with_halo)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's have a look at the generated code when using `u_new`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#define _POSIX_C_SOURCE 200809L\n", "#define START(S) struct timeval start_ ## S , end_ ## S ; gettimeofday(&start_ ## S , NULL);\n", "#define STOP(S,T) gettimeofday(&end_ ## S, NULL); T->S += (double)(end_ ## S .tv_sec-start_ ## S.tv_sec)+(double)(end_ ## S .tv_usec-start_ ## S .tv_usec)/1000000;\n", "\n", "#include \"stdlib.h\"\n", "#include \"math.h\"\n", "#include \"sys/time.h\"\n", "\n", "struct dataobj\n", "{\n", " void *restrict data;\n", " unsigned long * size;\n", " unsigned long * npsize;\n", " unsigned long * dsize;\n", " int * hsize;\n", " int * hofs;\n", " int * oofs;\n", " void * dmap;\n", "} ;\n", "\n", "struct profiler\n", "{\n", " double section0;\n", "} ;\n", "\n", "\n", "int Kernel(struct dataobj *restrict u_new_vec, const int time_M, const int time_m, const int x_M, const int x_m, const int y_M, const int y_m, struct profiler * timers)\n", "{\n", " float (*restrict u_new)[u_new_vec->size[1]][u_new_vec->size[2]] __attribute__ ((aligned (64))) = (float (*)[u_new_vec->size[1]][u_new_vec->size[2]]) u_new_vec->data;\n", "\n", " for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", " {\n", " START(section0)\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u_new[t1][x + 3][y + 3] = u_new[t0][x + 3][y + 3] + 2;\n", " }\n", " }\n", " STOP(section0,timers)\n", " }\n", "\n", " return 0;\n", "}\n", "\n" ] } ], "source": [ "equation = Eq(u_new.forward, u_new + 2)\n", "op = Operator(equation, opt='noop')\n", "print(op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And finally, let's run it, to convince ourselves that only the domain region values will be incremented at each timestep." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Operator `Kernel` ran in 0.01 s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[[[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 5. 5. 5. 0.]\n", " [0. 0. 0. 5. 5. 5. 0.]\n", " [0. 0. 0. 5. 5. 5. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]\n", "\n", " [[0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 7. 7. 7. 0.]\n", " [0. 0. 0. 7. 7. 7. 0.]\n", " [0. 0. 0. 7. 7. 7. 0.]\n", " [0. 0. 0. 0. 0. 0. 0.]]]\n" ] } ], "source": [ "#NBVAL_IGNORE_OUTPUT\n", "op.apply(time_M=2)\n", "print(u_new.data_with_halo)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The halo region, in turn, is surrounded by the _padding region_, which can be used for data alignment. By default, there is no padding. This can be changed by passing a suitable value to `padding`, as shown below:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "#define _POSIX_C_SOURCE 200809L\n", "#define START(S) struct timeval start_ ## S , end_ ## S ; gettimeofday(&start_ ## S , NULL);\n", "#define STOP(S,T) gettimeofday(&end_ ## S, NULL); T->S += (double)(end_ ## S .tv_sec-start_ ## S.tv_sec)+(double)(end_ ## S .tv_usec-start_ ## S .tv_usec)/1000000;\n", "\n", "#include \"stdlib.h\"\n", "#include \"math.h\"\n", "#include \"sys/time.h\"\n", "\n", "struct dataobj\n", "{\n", " void *restrict data;\n", " unsigned long * size;\n", " unsigned long * npsize;\n", " unsigned long * dsize;\n", " int * hsize;\n", " int * hofs;\n", " int * oofs;\n", " void * dmap;\n", "} ;\n", "\n", "struct profiler\n", "{\n", " double section0;\n", "} ;\n", "\n", "\n", "int Kernel(struct dataobj *restrict u_pad_vec, const int time_M, const int time_m, const int x_M, const int x_m, const int y_M, const int y_m, struct profiler * timers)\n", "{\n", " float (*restrict u_pad)[u_pad_vec->size[1]][u_pad_vec->size[2]] __attribute__ ((aligned (64))) = (float (*)[u_pad_vec->size[1]][u_pad_vec->size[2]]) u_pad_vec->data;\n", "\n", " for (int time = time_m, t0 = (time)%(2), t1 = (time + 1)%(2); time <= time_M; time += 1, t0 = (time)%(2), t1 = (time + 1)%(2))\n", " {\n", " START(section0)\n", " for (int x = x_m; x <= x_M; x += 1)\n", " {\n", " for (int y = y_m; y <= y_M; y += 1)\n", " {\n", " u_pad[t1][x + 2][y + 2] = u_pad[t0][x + 2][y + 2] + 2;\n", " }\n", " }\n", " STOP(section0,timers)\n", " }\n", "\n", " return 0;\n", "}\n", "\n" ] } ], "source": [ "u_pad = TimeFunction(name='u_pad', grid=grid, space_order=2, padding=(0,2,2))\n", "u_pad._data_allocated[:] = 0\n", "u_pad.data_with_halo[:] = 1\n", "u_pad.data[:] = 2\n", "equation = Eq(u_pad.forward, u_pad + 2)\n", "op = Operator(equation, opt='noop')\n", "print(op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although in practice not very useful, with the (private) `_data_allocated` accessor one can see the entire _domain + halo + padding region_." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0. 0. 0.]]\n", "\n", " [[1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 2. 2. 2. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [1. 1. 1. 1. 1. 1. 1. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0. 0. 0. 0. 0.]]]\n" ] } ], "source": [ "print(u_pad._data_allocated)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above, the __domain__ is filled with __2__, the __halo__ is filled with __1__, and the __padding__ is filled with __0__." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }