{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "# NRPy+ 10-Minute Overview\n", "\n", "## Author: Zach Etienne" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Table of Contents\n", "$$\\label{toc}$$ \n", "\n", "This notebook is organized as follows\n", "\n", "1. [Part 1](#why): Why NRPy+?\n", "1. [Part 2](#christoffel_symbols): Constructing 3-Christoffels $\\Gamma^i_{jk}$ as symbolic expressions in terms of 3-metric $\\gamma_{ij}$ and derivatives\n", "1. [Part 3](#outputc): `outputC()` example: Output $\\Gamma^i_{jk}$ expressions as optimized C code, assuming derivatives already specified\n", "1. [Part 4](#fd_outputc): `FD_outputC()` example: Specify numerical derivatives within $\\Gamma^i_{jk}$ expressions as finite differences\n", "1. [Part 5](#what_next): What next? Navigating the NRPy+ tutorial\n", "1. [Part 6](#latex_pdf_output): Output this notebook to $\\LaTeX$-formatted PDF " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 1: Why NRPy+? \\[Back to [top](#toc)\\]\n", "$$\\label{why}$$\n", "\n", "A core problem faced by numerical relativity is not that the techniques we use are particularly complex, but that *the equations numerical relativists solve are very complex*.\n", "\n", "[Einstein notation](https://en.wikipedia.org/wiki/Einstein_notation) certainly makes the complexity more manageable, and is the most common way to perform tensorial mathematics in numerical relativity.\n", "\n", "NRPy+ is built upon the idea that equations written in Einstein notation *are a form of code*, and when this code is expressed in NRPy+'s native (Python) language\n", "\n", "* rank-0 tensors (scalars) are *variables*\n", "* rank-1 tensors are *lists*\n", "* rank-2 tensors are *lists of lists*\n", "* ... and so forth.\n", "\n", "Further, implied tensor summations are simply *loops*.\n", "\n", "NRPy+ combines the above ideas with the incredibly powerful [SymPy](https://www.sympy.org/) computer algebra package (think: Mathematica for Python, but fully free & open source), with a custom-built code generation infrastructure for converting complex SymPy expressions directly into highly-optimized C/C++ code.\n", "\n", "Importantly, NRPy+ builds on the idea of *learning by example*. The core NRPy+ repository contains more than 100 pedagogical and well-formatted Jupyter notebook tutorials, covering topics of core relevance to the field of numerical relativity. About 25 of these tutorials generate *complete C codes* capable of e.g., evolving the scalar wave equation, Maxwell's equations, and Einstein's equations of general relativity -- all in a variety of coordinate systems. All ~100 Jupyter notebooks are linked to from [the main NRPy+ tutorial table of contents.](NRPyPlus_Tutorial.ipynb)\n", "\n", "This 10-minute overview however is designed to introduce only the very basic features of NRPy+, with a core focus on the idea that *NRPy+ can be used to benefit any numerical relativity code*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 2: Constructing 3-Christoffels $\\Gamma^i_{jk}$ as symbolic expressions in terms of 3-metric $\\gamma_{ij}$ and derivatives \\[Back to [top](#toc)\\]\n", "$$\\label{christoffel_symbols}$$\n", "\n", "**Problem statement**: Given a three-metric $\\gamma_{ij}$, construct all 18 independent Christoffel symbols $\\Gamma^i_{jk}$, which involves first derivatives of the metric. Assume that $\\gamma_{ij}$ *and its derivatives* are given numerically, requiring the derivatives be defined as symbols.\n", "\n", "In NRPy+ we adopt a rigid syntax for tensors and indexed expressions involving Python lists, so that for example\n", "\n", "* $\\gamma_{ij}=$ `gammaDD[i][j]`\n", "* $\\gamma_{ij,k}=$ `gammaDD_dD[i][j][k]`\n", "\n", "Christoffel symbols (of the first kind) are defined as ([source](https://en.wikipedia.org/wiki/Christoffel_symbols#Christoffel_symbols_of_the_first_kind)):\n", "\n", "\\begin{align}\n", "\\Gamma_{ij}^k &= \\frac{1}{2} \\gamma^{kl}\\left(\\gamma_{jl,i} + \\gamma_{il,j} - \\gamma_{ij,l}\\right)\\\\\n", "\\end{align}\n", "\n", "So first we'll define $\\gamma_{ij}$ and its inverse using NRPy+ functions" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import sympy as sp # SymPy: The Python computer algebra package upon which NRPy+ depends\n", "import indexedexp as ixp # NRPy+: Symbolic indexed expression (e.g., tensors, vectors, etc.) support\n", "\n", "# gammaDD is a rank-2 indexed expression symmetric in indexes 0 and 1\n", "gammaDD = ixp.declarerank2(\"gammaDD\", symmetry=\"sym01\", DIM=3)\n", "\n", "# gammaUU is the inverse of gammaDD, and\n", "# gammaDET (unused) is the determinant of gammaDD\n", "gammaUU, gammaDET = ixp.symm_matrix_inverter3x3(gammaDD) # inverts a 3x3 symmetric matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we interact a bit with these generated expressions, confirming they \n", "\n", "*Exercise to students*: Verify that $\\gamma_{ij} \\gamma^{ij} = 3$ using the provided data structures & Python loops. (*Hint 1*: Scroll down a couple cells to see how the Christoffel symbols are implemented. *Hint 2*: You will need to use SymPy's `simplify()` function to simplify the expression obtained.)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Check that gammaDD[0][1] = gammaDD[1][0]:\n", "gammaDD[0][1] = gammaDD01\n", "gammaDD[1][0] = gammaDD01\n", "\n", "Output gammaUU[1][0] = (-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*gammaDD12**2 - gammaDD01**2*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - gammaDD02**2*gammaDD11)\n", "\n", "Check that gammaUU[1][0] - gammaUU[0][1] = 0:\n", "gammaUU[1][0] - gammaUU[0][1] = 0\n" ] } ], "source": [ "print(\"Check that gammaDD[0][1] = gammaDD[1][0]:\")\n", "print(\"gammaDD[0][1] = \", gammaDD[0][1])\n", "print(\"gammaDD[1][0] = \", gammaDD[1][0])\n", "\n", "print(\"\\nOutput gammaUU[1][0] = \", gammaUU[1][0])\n", "print(\"\\nCheck that gammaUU[1][0] - gammaUU[0][1] = 0:\")\n", "print(\"gammaUU[1][0] - gammaUU[0][1] = \", gammaUU[1][0] - gammaUU[0][1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define Christoffel symbols in terms of the inverse metric and metric first derivatives:\n", "$$\n", "\\Gamma_{ij}^k = \\frac{1}{2} \\gamma^{kl}\\left(\\gamma_{jl,i} + \\gamma_{il,j} - \\gamma_{ij,l}\\right)\n", "$$" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# First define symbolic expressions for metric derivatives\n", "gammaDD_dD = ixp.declarerank3(\"gammaDD_dD\", symmetry=\"sym01\", DIM=3)\n", "\n", "# Initialize GammaUDD (3-Christoffel) to zero\n", "GammaUDD = ixp.zerorank3(DIM=3)\n", "for i in range(3):\n", " for j in range(3):\n", " for k in range(3):\n", " for l in range(3):\n", " GammaUDD[k][i][j] += sp.Rational(1, 2) * gammaUU[k][l] * (\n", " gammaDD_dD[j][l][i] + gammaDD_dD[i][l][j] - gammaDD_dD[i][j][l])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's confirm that $\\Gamma^{k}_{ij} = \\Gamma^k_{ji}$:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n", "0\n" ] } ], "source": [ "for i in range(3):\n", " for j in range(3):\n", " for k in range(3):\n", " print(GammaUDD[i][j][k] - GammaUDD[i][k][j])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 3: `outputC()` example: Output $\\Gamma^i_{jk}$ expressions as optimized C code, assuming derivatives already specified \\[Back to [top](#toc)\\]\n", "$$\\label{outputc}$$\n", "\n", "At the core of NRPy+ is the ability to convert SymPy expressions to highly optimized C code.\n", "\n", "**Problem statement**: Output all 18 unique Christoffel symbols with 3 different levels of optimization supported by NRPy+'s core C output routine `outputC()`.\n", "\n", "First we store all 18 unique Christoffel symbols, as well as their desired variable names in C, to Python lists." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "symbols_list = []\n", "varname_list = []\n", "for i in range(3):\n", " for j in range(3):\n", " for k in range(j, 3):\n", " symbols_list += [GammaUDD[i][j][k]]\n", " varname_list += [\"ChristoffelUDD\" + str(i) + str(j) + str(k)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we input these lists into NRPy+'s C/C++ code generation function `outputC()`, at three different levels of optimization.\n", "\n", "**Optimization Level 0 (don't ever do it this way)**: Compute each Christoffel symbol independently." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " ChristoffelUDD000 = (1.0/2.0)*gammaDD_dD000*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD001 + 2*gammaDD_dD010)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD002 + 2*gammaDD_dD020)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD001 = (1.0/2.0)*gammaDD_dD001*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD110*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)*(-gammaDD_dD012 + gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD002 = (1.0/2.0)*gammaDD_dD002*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD220*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)*(gammaDD_dD012 - gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD011 = (1.0/2.0)*gammaDD_dD111*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD011 - gammaDD_dD110)*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD112 + 2*gammaDD_dD121)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD012 = (1.0/2.0)*gammaDD_dD112*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD221*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))*(gammaDD_dD012 + gammaDD_dD021 - gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD022 = (1.0/2.0)*gammaDD_dD222*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD022 - gammaDD_dD220)*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD122 - gammaDD_dD221)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD100 = (1.0/2.0)*gammaDD_dD000*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD001 + 2*gammaDD_dD010)*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD002 + 2*gammaDD_dD020)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD101 = (1.0/2.0)*gammaDD_dD001*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD110*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)*(-gammaDD_dD012 + gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD102 = (1.0/2.0)*gammaDD_dD002*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD220*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))*(gammaDD_dD012 - gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD111 = (1.0/2.0)*gammaDD_dD111*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD011 - gammaDD_dD110)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD112 + 2*gammaDD_dD121)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD112 = (1.0/2.0)*gammaDD_dD112*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD221*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)*(gammaDD_dD012 + gammaDD_dD021 - gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD122 = (1.0/2.0)*gammaDD_dD222*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD022 - gammaDD_dD220)*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD122 - gammaDD_dD221)*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD200 = (1.0/2.0)*gammaDD_dD000*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD001 + 2*gammaDD_dD010)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD002 + 2*gammaDD_dD020)*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD201 = (1.0/2.0)*gammaDD_dD001*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD110*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))*(-gammaDD_dD012 + gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD202 = (1.0/2.0)*gammaDD_dD002*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD220*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)*(gammaDD_dD012 - gammaDD_dD021 + gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD211 = (1.0/2.0)*gammaDD_dD111*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD011 - gammaDD_dD110)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(-gammaDD_dD112 + 2*gammaDD_dD121)*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD212 = (1.0/2.0)*gammaDD_dD112*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*gammaDD_dD221*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)*(gammaDD_dD012 + gammaDD_dD021 - gammaDD_dD120)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " ChristoffelUDD222 = (1.0/2.0)*gammaDD_dD222*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)))/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD022 - gammaDD_dD220)*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11) + (1.0/2.0)*(2*gammaDD_dD122 - gammaDD_dD221)*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", "}\n", "\n" ] } ], "source": [ "import outputC as outC # NRPy+: Core C code output module\n", "outC.outputC(symbols_list, varname_list, filename=\"stdout\", params=\"CSE_enable=False,outCverbose=False\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice in the above code that many expressions are recomputed time and time again. This is *incredibly inefficient*, and generally compilers won't optimize this properly. So in optimization level 1, we use common subexpression elimination.\n", "\n", "**Optimization Level 1**: Use [common subexpression elimination (CSE)](https://en.wikipedia.org/wiki/Common_subexpression_elimination) to group common subexpressions" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " const double tmp_5 = (1.0/2.0)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " const double tmp_6 = tmp_5*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)));\n", " const double tmp_7 = -gammaDD_dD001 + 2*gammaDD_dD010;\n", " const double tmp_8 = tmp_5*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12);\n", " const double tmp_9 = -gammaDD_dD002 + 2*gammaDD_dD020;\n", " const double tmp_10 = tmp_5*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11);\n", " const double tmp_11 = -gammaDD_dD012 + gammaDD_dD021 + gammaDD_dD120;\n", " const double tmp_12 = gammaDD_dD012 - gammaDD_dD021 + gammaDD_dD120;\n", " const double tmp_13 = -gammaDD_dD112 + 2*gammaDD_dD121;\n", " const double tmp_14 = 2*gammaDD_dD011 - gammaDD_dD110;\n", " const double tmp_15 = gammaDD_dD012 + gammaDD_dD021 - gammaDD_dD120;\n", " const double tmp_16 = 2*gammaDD_dD122 - gammaDD_dD221;\n", " const double tmp_17 = 2*gammaDD_dD022 - gammaDD_dD220;\n", " const double tmp_18 = tmp_5*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02);\n", " const double tmp_19 = tmp_5*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)));\n", " const double tmp_20 = tmp_5*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)));\n", " ChristoffelUDD000 = gammaDD_dD000*tmp_6 + tmp_10*tmp_9 + tmp_7*tmp_8;\n", " ChristoffelUDD001 = gammaDD_dD001*tmp_6 + gammaDD_dD110*tmp_8 + tmp_10*tmp_11;\n", " ChristoffelUDD002 = gammaDD_dD002*tmp_6 + gammaDD_dD220*tmp_10 + tmp_12*tmp_8;\n", " ChristoffelUDD011 = gammaDD_dD111*tmp_8 + tmp_10*tmp_13 + tmp_14*tmp_6;\n", " ChristoffelUDD012 = gammaDD_dD112*tmp_8 + gammaDD_dD221*tmp_10 + tmp_15*tmp_6;\n", " ChristoffelUDD022 = gammaDD_dD222*tmp_10 + tmp_16*tmp_8 + tmp_17*tmp_6;\n", " ChristoffelUDD100 = gammaDD_dD000*tmp_8 + tmp_18*tmp_9 + tmp_19*tmp_7;\n", " ChristoffelUDD101 = gammaDD_dD001*tmp_8 + gammaDD_dD110*tmp_19 + tmp_11*tmp_18;\n", " ChristoffelUDD102 = gammaDD_dD002*tmp_8 + gammaDD_dD220*tmp_18 + tmp_12*tmp_19;\n", " ChristoffelUDD111 = gammaDD_dD111*tmp_19 + tmp_13*tmp_18 + tmp_14*tmp_8;\n", " ChristoffelUDD112 = gammaDD_dD112*tmp_19 + gammaDD_dD221*tmp_18 + tmp_15*tmp_8;\n", " ChristoffelUDD122 = gammaDD_dD222*tmp_18 + tmp_16*tmp_19 + tmp_17*tmp_8;\n", " ChristoffelUDD200 = gammaDD_dD000*tmp_10 + tmp_18*tmp_7 + tmp_20*tmp_9;\n", " ChristoffelUDD201 = gammaDD_dD001*tmp_10 + gammaDD_dD110*tmp_18 + tmp_11*tmp_20;\n", " ChristoffelUDD202 = gammaDD_dD002*tmp_10 + gammaDD_dD220*tmp_20 + tmp_12*tmp_18;\n", " ChristoffelUDD211 = gammaDD_dD111*tmp_18 + tmp_10*tmp_14 + tmp_13*tmp_20;\n", " ChristoffelUDD212 = gammaDD_dD112*tmp_18 + gammaDD_dD221*tmp_20 + tmp_10*tmp_15;\n", " ChristoffelUDD222 = gammaDD_dD222*tmp_20 + tmp_10*tmp_17 + tmp_16*tmp_18;\n", "}\n", "\n" ] } ], "source": [ "outC.outputC(symbols_list, varname_list, filename=\"stdout\", params=\"CSE_enable=True,outCverbose=False\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Optimization Level 2**: Use CSE and take advantage of [single instruction, multiple data (SIMD) macros](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data). NRPy+ translates these macros into assembler-level-optimized code for (x86_64) CPUs. Use case: looping over data on a numerical grid; can evaluate expressions at up to 8 gridpoints simultaneously *per CPU core*. Note that `FusedMulSubSIMD(a,b,c)` and `FusedMulAddSIMD(a,b,c)` perform *two* floating-point operations per cycle (on modern CPUs), where for other arithmetic operations at most one FP operation can be performed per cycle." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " const double tmp_Integer_1 = 1.0;\n", " const REAL_SIMD_ARRAY _Integer_1 = ConstSIMD(tmp_Integer_1);\n", "\n", " const double tmp_Integer_2 = 2.0;\n", " const REAL_SIMD_ARRAY _Integer_2 = ConstSIMD(tmp_Integer_2);\n", "\n", " const double tmp_NegativeOne_ = -1.0;\n", " const REAL_SIMD_ARRAY _NegativeOne_ = ConstSIMD(tmp_NegativeOne_);\n", "\n", " const double tmp_Rational_1_2 = 1.0/2.0;\n", " const REAL_SIMD_ARRAY _Rational_1_2 = ConstSIMD(tmp_Rational_1_2);\n", "\n", " const REAL_SIMD_ARRAY tmp_1 = MulSIMD(_NegativeOne_, MulSIMD(gammaDD12, gammaDD12));\n", " const REAL_SIMD_ARRAY tmp_3 = MulSIMD(_NegativeOne_, MulSIMD(gammaDD01, gammaDD01));\n", " const REAL_SIMD_ARRAY tmp_4 = MulSIMD(_NegativeOne_, MulSIMD(gammaDD02, gammaDD02));\n", " const REAL_SIMD_ARRAY tmp_6 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulAddSIMD(gammaDD11, gammaDD22, tmp_1), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " const REAL_SIMD_ARRAY tmp_7 = FusedMulSubSIMD(_Integer_2, gammaDD_dD010, gammaDD_dD001);\n", " const REAL_SIMD_ARRAY tmp_8 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulSubSIMD(gammaDD02, gammaDD12, MulSIMD(gammaDD01, gammaDD22)), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " const REAL_SIMD_ARRAY tmp_9 = FusedMulSubSIMD(_Integer_2, gammaDD_dD020, gammaDD_dD002);\n", " const REAL_SIMD_ARRAY tmp_10 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulSubSIMD(gammaDD01, gammaDD12, MulSIMD(gammaDD02, gammaDD11)), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " const REAL_SIMD_ARRAY tmp_11 = AddSIMD(gammaDD_dD120, SubSIMD(gammaDD_dD021, gammaDD_dD012));\n", " const REAL_SIMD_ARRAY tmp_12 = AddSIMD(gammaDD_dD120, SubSIMD(gammaDD_dD012, gammaDD_dD021));\n", " const REAL_SIMD_ARRAY tmp_13 = FusedMulSubSIMD(_Integer_2, gammaDD_dD121, gammaDD_dD112);\n", " const REAL_SIMD_ARRAY tmp_14 = FusedMulSubSIMD(_Integer_2, gammaDD_dD011, gammaDD_dD110);\n", " const REAL_SIMD_ARRAY tmp_15 = AddSIMD(gammaDD_dD021, SubSIMD(gammaDD_dD012, gammaDD_dD120));\n", " const REAL_SIMD_ARRAY tmp_16 = FusedMulSubSIMD(_Integer_2, gammaDD_dD122, gammaDD_dD221);\n", " const REAL_SIMD_ARRAY tmp_17 = FusedMulSubSIMD(_Integer_2, gammaDD_dD022, gammaDD_dD220);\n", " const REAL_SIMD_ARRAY tmp_18 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulSubSIMD(gammaDD01, gammaDD02, MulSIMD(gammaDD00, gammaDD12)), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " const REAL_SIMD_ARRAY tmp_19 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulAddSIMD(gammaDD00, gammaDD22, tmp_4), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " const REAL_SIMD_ARRAY tmp_20 = MulSIMD(_Rational_1_2, DivSIMD(FusedMulAddSIMD(gammaDD00, gammaDD11, tmp_3), FusedMulAddSIMD(gammaDD22, tmp_3, FusedMulAddSIMD(gammaDD00, MulSIMD(gammaDD11, gammaDD22), FusedMulAddSIMD(MulSIMD(_Integer_2, gammaDD01), MulSIMD(gammaDD02, gammaDD12), FusedMulAddSIMD(gammaDD00, tmp_1, MulSIMD(gammaDD11, tmp_4)))))));\n", " ChristoffelUDD000 = FusedMulAddSIMD(tmp_10, tmp_9, FusedMulAddSIMD(tmp_7, tmp_8, MulSIMD(gammaDD_dD000, tmp_6)));\n", " ChristoffelUDD001 = FusedMulAddSIMD(gammaDD_dD110, tmp_8, FusedMulAddSIMD(tmp_10, tmp_11, MulSIMD(gammaDD_dD001, tmp_6)));\n", " ChristoffelUDD002 = FusedMulAddSIMD(gammaDD_dD220, tmp_10, FusedMulAddSIMD(tmp_12, tmp_8, MulSIMD(gammaDD_dD002, tmp_6)));\n", " ChristoffelUDD011 = FusedMulAddSIMD(tmp_10, tmp_13, FusedMulAddSIMD(tmp_14, tmp_6, MulSIMD(gammaDD_dD111, tmp_8)));\n", " ChristoffelUDD012 = FusedMulAddSIMD(gammaDD_dD221, tmp_10, FusedMulAddSIMD(tmp_15, tmp_6, MulSIMD(gammaDD_dD112, tmp_8)));\n", " ChristoffelUDD022 = FusedMulAddSIMD(tmp_16, tmp_8, FusedMulAddSIMD(tmp_17, tmp_6, MulSIMD(gammaDD_dD222, tmp_10)));\n", " ChristoffelUDD100 = FusedMulAddSIMD(tmp_18, tmp_9, FusedMulAddSIMD(tmp_19, tmp_7, MulSIMD(gammaDD_dD000, tmp_8)));\n", " ChristoffelUDD101 = FusedMulAddSIMD(gammaDD_dD110, tmp_19, FusedMulAddSIMD(tmp_11, tmp_18, MulSIMD(gammaDD_dD001, tmp_8)));\n", " ChristoffelUDD102 = FusedMulAddSIMD(gammaDD_dD220, tmp_18, FusedMulAddSIMD(tmp_12, tmp_19, MulSIMD(gammaDD_dD002, tmp_8)));\n", " ChristoffelUDD111 = FusedMulAddSIMD(tmp_13, tmp_18, FusedMulAddSIMD(tmp_14, tmp_8, MulSIMD(gammaDD_dD111, tmp_19)));\n", " ChristoffelUDD112 = FusedMulAddSIMD(gammaDD_dD221, tmp_18, FusedMulAddSIMD(tmp_15, tmp_8, MulSIMD(gammaDD_dD112, tmp_19)));\n", " ChristoffelUDD122 = FusedMulAddSIMD(tmp_16, tmp_19, FusedMulAddSIMD(tmp_17, tmp_8, MulSIMD(gammaDD_dD222, tmp_18)));\n", " ChristoffelUDD200 = FusedMulAddSIMD(tmp_18, tmp_7, FusedMulAddSIMD(tmp_20, tmp_9, MulSIMD(gammaDD_dD000, tmp_10)));\n", " ChristoffelUDD201 = FusedMulAddSIMD(gammaDD_dD110, tmp_18, FusedMulAddSIMD(tmp_11, tmp_20, MulSIMD(gammaDD_dD001, tmp_10)));\n", " ChristoffelUDD202 = FusedMulAddSIMD(gammaDD_dD220, tmp_20, FusedMulAddSIMD(tmp_12, tmp_18, MulSIMD(gammaDD_dD002, tmp_10)));\n", " ChristoffelUDD211 = FusedMulAddSIMD(tmp_10, tmp_14, FusedMulAddSIMD(tmp_13, tmp_20, MulSIMD(gammaDD_dD111, tmp_18)));\n", " ChristoffelUDD212 = FusedMulAddSIMD(gammaDD_dD221, tmp_20, FusedMulAddSIMD(tmp_10, tmp_15, MulSIMD(gammaDD_dD112, tmp_18)));\n", " ChristoffelUDD222 = FusedMulAddSIMD(tmp_10, tmp_17, FusedMulAddSIMD(tmp_16, tmp_18, MulSIMD(gammaDD_dD222, tmp_20)));\n", "}\n", "\n" ] } ], "source": [ "outC.outputC(symbols_list, varname_list, filename=\"stdout\", params=\"CSE_enable=True,enable_SIMD=True,outCverbose=False\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 4: `FD_outputC()` example: Specify numerical derivatives within $\\Gamma^i_{jk}$ expressions as finite differences \\[Back to [top](#toc)\\]\n", "$$\\label{fd_outputc}$$\n", "\n", "To emphasize the infrastructure-agnostic nature of NRPy+, in the above C codes we assumed that the derivatives were already computed (e.g., using finite-difference derivatives, discontinuous Galerkin methods, pseudospectral methods, finite-element methods, etc.)\n", "\n", "In this part, we demonstrate NRPy+'s `FD_outputC()` code, which basically prepends the above C/C++ codes with the code needed for computing arbitrary-order finite-difference derivatives.\n", "\n", "To do this, NRPy+ makes the standard assumption that the underlying grid is *uniform*, and that derivatives are taken with respect to functions stored at each point on the numerical grid. Appropriately, we call such functions \"gridfunctions\".\n", "\n", "In the following, we assume that the input, $\\gamma_{ij}$, is stored at each gridpoint. Also we wish to store each independent component of $\\Gamma^i_{jk}$ at each gridpoint as output. The below code cell also introduces the NRPy+ parameter interface (accessed via `NRPy_param_funcs.py`) to set the finite-differencing order to 6.\n", "\n", "We'll use Optimization Level 1, to make the code easier to read; change `enable_SIMD=False` in the below `FD_outputC()` function call to `enable_SIMD=True` to see the Optimization Level 2 version." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\n", " /*\n", " * NRPy+ Finite Difference Code Generation, Step 1 of 2: Read from main memory and compute finite difference stencils:\n", " */\n", " const double gammaDD00_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2-3)];\n", " const double gammaDD00_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2-2)];\n", " const double gammaDD00_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2-1)];\n", " const double gammaDD00_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1-3,i2)];\n", " const double gammaDD00_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1-2,i2)];\n", " const double gammaDD00_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1-1,i2)];\n", " const double gammaDD00_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0-3,i1,i2)];\n", " const double gammaDD00_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0-2,i1,i2)];\n", " const double gammaDD00_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0-1,i1,i2)];\n", " const double gammaDD00 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2)];\n", " const double gammaDD00_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0+1,i1,i2)];\n", " const double gammaDD00_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0+2,i1,i2)];\n", " const double gammaDD00_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0+3,i1,i2)];\n", " const double gammaDD00_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1+1,i2)];\n", " const double gammaDD00_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1+2,i2)];\n", " const double gammaDD00_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1+3,i2)];\n", " const double gammaDD00_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2+1)];\n", " const double gammaDD00_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2+2)];\n", " const double gammaDD00_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD00GF, i0,i1,i2+3)];\n", " const double gammaDD01_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2-3)];\n", " const double gammaDD01_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2-2)];\n", " const double gammaDD01_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2-1)];\n", " const double gammaDD01_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1-3,i2)];\n", " const double gammaDD01_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1-2,i2)];\n", " const double gammaDD01_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1-1,i2)];\n", " const double gammaDD01_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0-3,i1,i2)];\n", " const double gammaDD01_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0-2,i1,i2)];\n", " const double gammaDD01_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0-1,i1,i2)];\n", " const double gammaDD01 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2)];\n", " const double gammaDD01_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0+1,i1,i2)];\n", " const double gammaDD01_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0+2,i1,i2)];\n", " const double gammaDD01_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0+3,i1,i2)];\n", " const double gammaDD01_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1+1,i2)];\n", " const double gammaDD01_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1+2,i2)];\n", " const double gammaDD01_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1+3,i2)];\n", " const double gammaDD01_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2+1)];\n", " const double gammaDD01_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2+2)];\n", " const double gammaDD01_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD01GF, i0,i1,i2+3)];\n", " const double gammaDD02_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2-3)];\n", " const double gammaDD02_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2-2)];\n", " const double gammaDD02_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2-1)];\n", " const double gammaDD02_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1-3,i2)];\n", " const double gammaDD02_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1-2,i2)];\n", " const double gammaDD02_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1-1,i2)];\n", " const double gammaDD02_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0-3,i1,i2)];\n", " const double gammaDD02_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0-2,i1,i2)];\n", " const double gammaDD02_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0-1,i1,i2)];\n", " const double gammaDD02 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2)];\n", " const double gammaDD02_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0+1,i1,i2)];\n", " const double gammaDD02_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0+2,i1,i2)];\n", " const double gammaDD02_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0+3,i1,i2)];\n", " const double gammaDD02_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1+1,i2)];\n", " const double gammaDD02_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1+2,i2)];\n", " const double gammaDD02_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1+3,i2)];\n", " const double gammaDD02_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2+1)];\n", " const double gammaDD02_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2+2)];\n", " const double gammaDD02_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD02GF, i0,i1,i2+3)];\n", " const double gammaDD11_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2-3)];\n", " const double gammaDD11_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2-2)];\n", " const double gammaDD11_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2-1)];\n", " const double gammaDD11_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1-3,i2)];\n", " const double gammaDD11_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1-2,i2)];\n", " const double gammaDD11_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1-1,i2)];\n", " const double gammaDD11_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0-3,i1,i2)];\n", " const double gammaDD11_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0-2,i1,i2)];\n", " const double gammaDD11_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0-1,i1,i2)];\n", " const double gammaDD11 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2)];\n", " const double gammaDD11_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0+1,i1,i2)];\n", " const double gammaDD11_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0+2,i1,i2)];\n", " const double gammaDD11_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0+3,i1,i2)];\n", " const double gammaDD11_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1+1,i2)];\n", " const double gammaDD11_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1+2,i2)];\n", " const double gammaDD11_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1+3,i2)];\n", " const double gammaDD11_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2+1)];\n", " const double gammaDD11_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2+2)];\n", " const double gammaDD11_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD11GF, i0,i1,i2+3)];\n", " const double gammaDD12_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2-3)];\n", " const double gammaDD12_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2-2)];\n", " const double gammaDD12_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2-1)];\n", " const double gammaDD12_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1-3,i2)];\n", " const double gammaDD12_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1-2,i2)];\n", " const double gammaDD12_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1-1,i2)];\n", " const double gammaDD12_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0-3,i1,i2)];\n", " const double gammaDD12_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0-2,i1,i2)];\n", " const double gammaDD12_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0-1,i1,i2)];\n", " const double gammaDD12 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2)];\n", " const double gammaDD12_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0+1,i1,i2)];\n", " const double gammaDD12_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0+2,i1,i2)];\n", " const double gammaDD12_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0+3,i1,i2)];\n", " const double gammaDD12_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1+1,i2)];\n", " const double gammaDD12_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1+2,i2)];\n", " const double gammaDD12_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1+3,i2)];\n", " const double gammaDD12_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2+1)];\n", " const double gammaDD12_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2+2)];\n", " const double gammaDD12_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD12GF, i0,i1,i2+3)];\n", " const double gammaDD22_i0_i1_i2m3 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2-3)];\n", " const double gammaDD22_i0_i1_i2m2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2-2)];\n", " const double gammaDD22_i0_i1_i2m1 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2-1)];\n", " const double gammaDD22_i0_i1m3_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1-3,i2)];\n", " const double gammaDD22_i0_i1m2_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1-2,i2)];\n", " const double gammaDD22_i0_i1m1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1-1,i2)];\n", " const double gammaDD22_i0m3_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0-3,i1,i2)];\n", " const double gammaDD22_i0m2_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0-2,i1,i2)];\n", " const double gammaDD22_i0m1_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0-1,i1,i2)];\n", " const double gammaDD22 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2)];\n", " const double gammaDD22_i0p1_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0+1,i1,i2)];\n", " const double gammaDD22_i0p2_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0+2,i1,i2)];\n", " const double gammaDD22_i0p3_i1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0+3,i1,i2)];\n", " const double gammaDD22_i0_i1p1_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1+1,i2)];\n", " const double gammaDD22_i0_i1p2_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1+2,i2)];\n", " const double gammaDD22_i0_i1p3_i2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1+3,i2)];\n", " const double gammaDD22_i0_i1_i2p1 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2+1)];\n", " const double gammaDD22_i0_i1_i2p2 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2+2)];\n", " const double gammaDD22_i0_i1_i2p3 = in_gfs[IDX4S(GAMMADD22GF, i0,i1,i2+3)];\n", " const double FDPart1_Rational_3_4 = 3.0/4.0;\n", " const double FDPart1_Rational_3_20 = 3.0/20.0;\n", " const double FDPart1_Rational_1_60 = 1.0/60.0;\n", " const double gammaDD_dD000 = invdx0*(FDPart1_Rational_1_60*(-gammaDD00_i0m3_i1_i2 + gammaDD00_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD00_i0m2_i1_i2 - gammaDD00_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD00_i0m1_i1_i2 + gammaDD00_i0p1_i1_i2));\n", " const double gammaDD_dD001 = invdx1*(FDPart1_Rational_1_60*(-gammaDD00_i0_i1m3_i2 + gammaDD00_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD00_i0_i1m2_i2 - gammaDD00_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD00_i0_i1m1_i2 + gammaDD00_i0_i1p1_i2));\n", " const double gammaDD_dD002 = invdx2*(FDPart1_Rational_1_60*(-gammaDD00_i0_i1_i2m3 + gammaDD00_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD00_i0_i1_i2m2 - gammaDD00_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD00_i0_i1_i2m1 + gammaDD00_i0_i1_i2p1));\n", " const double gammaDD_dD010 = invdx0*(FDPart1_Rational_1_60*(-gammaDD01_i0m3_i1_i2 + gammaDD01_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD01_i0m2_i1_i2 - gammaDD01_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD01_i0m1_i1_i2 + gammaDD01_i0p1_i1_i2));\n", " const double gammaDD_dD011 = invdx1*(FDPart1_Rational_1_60*(-gammaDD01_i0_i1m3_i2 + gammaDD01_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD01_i0_i1m2_i2 - gammaDD01_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD01_i0_i1m1_i2 + gammaDD01_i0_i1p1_i2));\n", " const double gammaDD_dD012 = invdx2*(FDPart1_Rational_1_60*(-gammaDD01_i0_i1_i2m3 + gammaDD01_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD01_i0_i1_i2m2 - gammaDD01_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD01_i0_i1_i2m1 + gammaDD01_i0_i1_i2p1));\n", " const double gammaDD_dD020 = invdx0*(FDPart1_Rational_1_60*(-gammaDD02_i0m3_i1_i2 + gammaDD02_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD02_i0m2_i1_i2 - gammaDD02_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD02_i0m1_i1_i2 + gammaDD02_i0p1_i1_i2));\n", " const double gammaDD_dD021 = invdx1*(FDPart1_Rational_1_60*(-gammaDD02_i0_i1m3_i2 + gammaDD02_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD02_i0_i1m2_i2 - gammaDD02_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD02_i0_i1m1_i2 + gammaDD02_i0_i1p1_i2));\n", " const double gammaDD_dD022 = invdx2*(FDPart1_Rational_1_60*(-gammaDD02_i0_i1_i2m3 + gammaDD02_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD02_i0_i1_i2m2 - gammaDD02_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD02_i0_i1_i2m1 + gammaDD02_i0_i1_i2p1));\n", " const double gammaDD_dD110 = invdx0*(FDPart1_Rational_1_60*(-gammaDD11_i0m3_i1_i2 + gammaDD11_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD11_i0m2_i1_i2 - gammaDD11_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD11_i0m1_i1_i2 + gammaDD11_i0p1_i1_i2));\n", " const double gammaDD_dD111 = invdx1*(FDPart1_Rational_1_60*(-gammaDD11_i0_i1m3_i2 + gammaDD11_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD11_i0_i1m2_i2 - gammaDD11_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD11_i0_i1m1_i2 + gammaDD11_i0_i1p1_i2));\n", " const double gammaDD_dD112 = invdx2*(FDPart1_Rational_1_60*(-gammaDD11_i0_i1_i2m3 + gammaDD11_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD11_i0_i1_i2m2 - gammaDD11_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD11_i0_i1_i2m1 + gammaDD11_i0_i1_i2p1));\n", " const double gammaDD_dD120 = invdx0*(FDPart1_Rational_1_60*(-gammaDD12_i0m3_i1_i2 + gammaDD12_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD12_i0m2_i1_i2 - gammaDD12_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD12_i0m1_i1_i2 + gammaDD12_i0p1_i1_i2));\n", " const double gammaDD_dD121 = invdx1*(FDPart1_Rational_1_60*(-gammaDD12_i0_i1m3_i2 + gammaDD12_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD12_i0_i1m2_i2 - gammaDD12_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD12_i0_i1m1_i2 + gammaDD12_i0_i1p1_i2));\n", " const double gammaDD_dD122 = invdx2*(FDPart1_Rational_1_60*(-gammaDD12_i0_i1_i2m3 + gammaDD12_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD12_i0_i1_i2m2 - gammaDD12_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD12_i0_i1_i2m1 + gammaDD12_i0_i1_i2p1));\n", " const double gammaDD_dD220 = invdx0*(FDPart1_Rational_1_60*(-gammaDD22_i0m3_i1_i2 + gammaDD22_i0p3_i1_i2) + FDPart1_Rational_3_20*(gammaDD22_i0m2_i1_i2 - gammaDD22_i0p2_i1_i2) + FDPart1_Rational_3_4*(-gammaDD22_i0m1_i1_i2 + gammaDD22_i0p1_i1_i2));\n", " const double gammaDD_dD221 = invdx1*(FDPart1_Rational_1_60*(-gammaDD22_i0_i1m3_i2 + gammaDD22_i0_i1p3_i2) + FDPart1_Rational_3_20*(gammaDD22_i0_i1m2_i2 - gammaDD22_i0_i1p2_i2) + FDPart1_Rational_3_4*(-gammaDD22_i0_i1m1_i2 + gammaDD22_i0_i1p1_i2));\n", " const double gammaDD_dD222 = invdx2*(FDPart1_Rational_1_60*(-gammaDD22_i0_i1_i2m3 + gammaDD22_i0_i1_i2p3) + FDPart1_Rational_3_20*(gammaDD22_i0_i1_i2m2 - gammaDD22_i0_i1_i2p2) + FDPart1_Rational_3_4*(-gammaDD22_i0_i1_i2m1 + gammaDD22_i0_i1_i2p1));\n", " /*\n", " * NRPy+ Finite Difference Code Generation, Step 2 of 2: Evaluate SymPy expressions and write to main memory:\n", " */\n", " const double FDPart3_5 = (1.0/2.0)/(gammaDD00*gammaDD11*gammaDD22 - gammaDD00*((gammaDD12)*(gammaDD12)) - ((gammaDD01)*(gammaDD01))*gammaDD22 + 2*gammaDD01*gammaDD02*gammaDD12 - ((gammaDD02)*(gammaDD02))*gammaDD11);\n", " const double FDPart3_6 = FDPart3_5*(gammaDD11*gammaDD22 - ((gammaDD12)*(gammaDD12)));\n", " const double FDPart3_7 = -gammaDD_dD001 + 2*gammaDD_dD010;\n", " const double FDPart3_8 = FDPart3_5*(-gammaDD01*gammaDD22 + gammaDD02*gammaDD12);\n", " const double FDPart3_9 = -gammaDD_dD002 + 2*gammaDD_dD020;\n", " const double FDPart3_10 = FDPart3_5*(gammaDD01*gammaDD12 - gammaDD02*gammaDD11);\n", " const double FDPart3_11 = -gammaDD_dD012 + gammaDD_dD021 + gammaDD_dD120;\n", " const double FDPart3_12 = gammaDD_dD012 - gammaDD_dD021 + gammaDD_dD120;\n", " const double FDPart3_13 = -gammaDD_dD112 + 2*gammaDD_dD121;\n", " const double FDPart3_14 = 2*gammaDD_dD011 - gammaDD_dD110;\n", " const double FDPart3_15 = gammaDD_dD012 + gammaDD_dD021 - gammaDD_dD120;\n", " const double FDPart3_16 = 2*gammaDD_dD122 - gammaDD_dD221;\n", " const double FDPart3_17 = 2*gammaDD_dD022 - gammaDD_dD220;\n", " const double FDPart3_18 = FDPart3_5*(-gammaDD00*gammaDD12 + gammaDD01*gammaDD02);\n", " const double FDPart3_19 = FDPart3_5*(gammaDD00*gammaDD22 - ((gammaDD02)*(gammaDD02)));\n", " const double FDPart3_20 = FDPart3_5*(gammaDD00*gammaDD11 - ((gammaDD01)*(gammaDD01)));\n", " aux_gfs[IDX4S(CHRISTOFFELUDD000GF, i0, i1, i2)] = FDPart3_10*FDPart3_9 + FDPart3_6*gammaDD_dD000 + FDPart3_7*FDPart3_8;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD001GF, i0, i1, i2)] = FDPart3_10*FDPart3_11 + FDPart3_6*gammaDD_dD001 + FDPart3_8*gammaDD_dD110;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD002GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD220 + FDPart3_12*FDPart3_8 + FDPart3_6*gammaDD_dD002;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD011GF, i0, i1, i2)] = FDPart3_10*FDPart3_13 + FDPart3_14*FDPart3_6 + FDPart3_8*gammaDD_dD111;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD012GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD221 + FDPart3_15*FDPart3_6 + FDPart3_8*gammaDD_dD112;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD022GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD222 + FDPart3_16*FDPart3_8 + FDPart3_17*FDPart3_6;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD100GF, i0, i1, i2)] = FDPart3_18*FDPart3_9 + FDPart3_19*FDPart3_7 + FDPart3_8*gammaDD_dD000;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD101GF, i0, i1, i2)] = FDPart3_11*FDPart3_18 + FDPart3_19*gammaDD_dD110 + FDPart3_8*gammaDD_dD001;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD102GF, i0, i1, i2)] = FDPart3_12*FDPart3_19 + FDPart3_18*gammaDD_dD220 + FDPart3_8*gammaDD_dD002;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD111GF, i0, i1, i2)] = FDPart3_13*FDPart3_18 + FDPart3_14*FDPart3_8 + FDPart3_19*gammaDD_dD111;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD112GF, i0, i1, i2)] = FDPart3_15*FDPart3_8 + FDPart3_18*gammaDD_dD221 + FDPart3_19*gammaDD_dD112;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD122GF, i0, i1, i2)] = FDPart3_16*FDPart3_19 + FDPart3_17*FDPart3_8 + FDPart3_18*gammaDD_dD222;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD200GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD000 + FDPart3_18*FDPart3_7 + FDPart3_20*FDPart3_9;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD201GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD001 + FDPart3_11*FDPart3_20 + FDPart3_18*gammaDD_dD110;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD202GF, i0, i1, i2)] = FDPart3_10*gammaDD_dD002 + FDPart3_12*FDPart3_18 + FDPart3_20*gammaDD_dD220;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD211GF, i0, i1, i2)] = FDPart3_10*FDPart3_14 + FDPart3_13*FDPart3_20 + FDPart3_18*gammaDD_dD111;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD212GF, i0, i1, i2)] = FDPart3_10*FDPart3_15 + FDPart3_18*gammaDD_dD112 + FDPart3_20*gammaDD_dD221;\n", " aux_gfs[IDX4S(CHRISTOFFELUDD222GF, i0, i1, i2)] = FDPart3_10*FDPart3_17 + FDPart3_16*FDPart3_18 + FDPart3_20*gammaDD_dD222;\n", "}\n" ] } ], "source": [ "import grid as gri # NRPy+: Functions having to do with numerical grids\n", "import finite_difference as fin # NRPy+: Finite difference C code generation module\n", "import NRPy_param_funcs as par # NRPy+: parameter interface\n", "\n", "# First specify the finite-differencing order to be 6th order (all even orders > 0 supported!)\n", "par.set_parval_from_str(\"finite_difference::FD_CENTDERIVS_ORDER\", 6)\n", "\n", "# Next register gridfunctions for Christoffel symbols\n", "gri.register_gridfunctions(\"AUX\", varname_list, rank=3, is_indexed=True, DIM=3)\n", "\n", "# Then register gamma_{ij} as gridfunctions\n", "ixp.register_gridfunctions_for_single_rank2(\"EVOL\", \"gammaDD\", \"sym01\", DIM=3)\n", "\n", "# Finally output the C code using FD_outputC()\n", "# Step 1: FD_outputC() requires that left-hand side and right-hand side of each\n", "# expression be specified in a named-tuple \"lhrh\" defined in outputC.py\n", "outputC_lhrh = []\n", "for i, varname in enumerate(varname_list):\n", " outputC_lhrh += [outC.lhrh(lhs=gri.gfaccess(\"out_gf\", varname), rhs=symbols_list[i])]\n", "# Step 2: call FD_outputC.\n", "fin.FD_outputC(\"stdout\", outputC_lhrh, params=\"CSE_enable=True,enable_SIMD=False,outCverbose=False\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 5: What next? Navigating the NRPy+ tutorial \\[Back to [top](#toc)\\]\n", "$$\\label{what_next}$$\n", "\n", "As mentioned previously, NRPy+ is meant to be \"learn by example\". To that end, there are more than 100 fully documented Jupyter notebooks covering a wide variety of topics relevant to numerical relativity research and optimized algorithms for solving PDEs numerically.\n", "\n", "So the answer to the question \"*What next?*\" is, naturally, \"*Whatever you like!*\" To continue the journey, check out [**the main NRPy+ tutorial table of contents**.](NRPyPlus_Tutorial.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "# Part 6: Output this notebook to $\\LaTeX$-formatted PDF file \\[Back to [top](#toc)\\]\n", "$$\\label{latex_pdf_output}$$\n", "\n", "The following code cell converts this Jupyter notebook into a proper, clickable $\\LaTeX$-formatted PDF file. After the cell is successfully run, the generated PDF may be found in the root NRPy+ tutorial directory, with filename\n", "[Tutorial-NRPyPlus_10_Minute_Overview.pdf](Tutorial-NRPyPlus_10_Minute_Overview.pdf). (Note that clicking on this link may not work; you may need to open the PDF file through another means.)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "execution": { "iopub.execute_input": "2021-03-07T17:10:58.642904Z", "iopub.status.busy": "2021-03-07T17:10:58.641857Z", "iopub.status.idle": "2021-03-07T17:11:01.514758Z", "shell.execute_reply": "2021-03-07T17:11:01.513981Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created Tutorial-NRPyPlus_10_Minute_Overview.tex, and compiled LaTeX file\n", " to PDF file Tutorial-NRPyPlus_10_Minute_Overview.pdf\n" ] } ], "source": [ "import cmdline_helper as cmd # NRPy+: Multi-platform Python command-line interface\n", "cmd.output_Jupyter_notebook_to_LaTeXed_PDF(\"Tutorial-NRPyPlus_10_Minute_Overview\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 2 }