{"cells":[{"cell_type":"markdown","metadata":{"id":"cwFWxL93ZEkX"},"source":["# Using physics informed neural networks (PINNs) to solve parabolic PDEs"]},{"cell_type":"markdown","metadata":{"id":"1PlqQM9aZEkd"},"source":["In this notebook, we illustrate physics informed neural networks (PINNs) to solve partial differential equations (PDEs) as proposed in \n","\n","- Maziar Raissi, Paris Perdikaris, George Em Karniadakis. *Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations*. [arXiv 1711.10561](https://arxiv.org/abs/1711.10561) \n","- Maziar Raissi, Paris Perdikaris, George Em Karniadakis. *Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations*. [arXiv 1711.10566](https://arxiv.org/abs/1711.10566) \n","- Maziar Raissi, Paris Perdikaris, George Em Karniadakis. *Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations*. J. Comp. Phys. 378 pp. 686-707 [DOI: 10.1016/j.jcp.2018.10.045](https://www.sciencedirect.com/science/article/pii/S0021999118307125) \n","\n","This notebook is partially based on another implementation of the PINN approach published on [GitHub by pierremtb](https://github.com/pierremtb/PINNs-TF2.0) as well as the original code, see [Maziar Raissi on GitHub](https://github.com/maziarraissi/PINNs).\n","\n","<a href=\"https://colab.research.google.com/github/janblechschmidt/PDEsByNNs/blob/main/PINN_Solver.ipynb\" target=\"_parent\">\n","    <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n","</a>\n","\n","## Introduction\n","\n","We describe the PINN approach for approximating the solution $u:[0,T] \\times \\mathcal{D} \\to \\mathbb{R}$ of an evolution equation\n","\n","$$\n","\\begin{align}\n","    \\partial_t u (t,x) + \\mathcal{N}[u](t,x) &= 0, && (t,x) \\in (0,T] \\times \\mathcal{D},\\\\\n","    u(0,x) &= u_0(x) \\quad && x \\in \\mathcal{D},\n","\\end{align}\n","$$\n","\n","where $\\mathcal{N}$ is a nonlinear differential operator acting on $u$, \n","$\\mathcal{D} \\subset \\mathbb{R}^d$ a bounded domain,\n","$T$ denotes the final time and\n","$u_0: \\mathcal{D} \\to \\mathbb{R}$ the prescribed initial data.\n","Although the methodology allows for different types of boundary conditions, we restrict our discussion to the inhomogeneous Dirichlet case and prescribe\n","\n","$$\n","\\begin{align}\n","        \\hspace{7em} u(t,x) &= u_b(t,x)  && \\quad (t,x) \\in (0,T] \\times \\partial \\mathcal{D},\n","\\end{align}\n","$$\n","\n","where $\\partial \\mathcal{D}$ denotes the boundary of the domain $\\mathcal{D}$ and $u_b: (0,T] \\times \\partial \\mathcal{D} \\to \\mathbb{R}$ the given boundary data."]},{"cell_type":"markdown","metadata":{"id":"OdvtEjQXZEkf"},"source":["## Methodology\n","\n","The method constructs a neural network approximation\n","\n","$$\n","u_\\theta(t,x) \\approx u(t,x)\n","$$\n","\n","of the solution of nonlinear PDE, where $u_\\theta :[0,T] \\times \\mathcal{D} \\to \\mathbb{R}$ denotes a function realized by a neural network with parameters $\\theta$.\n","\n","The continuous time approach for the parabolic PDE as described in ([Raissi et al., 2017 (Part I)](https://arxiv.org/abs/1711.10561)) is based on the (strong) residual of a given neural network approximation $u_\\theta \\colon [0,T] \\times \\mathcal{D} \\to \\mathbb{R} $ of the solution $u$, i.e.,\n","\n","$$\n","\\begin{align}\n","    r_\\theta (t,x) := \\partial_t u_\\theta (t,x) + \\mathcal{N}[u_\\theta] (t,x).\n","\\end{align}\n","$$\n","\n","To incorporate this PDE residual $r_\\theta$ into a loss function to be minimized, PINNs require a further differentiation to evaluate the differential operators $\\partial_t u_\\theta$ and $\\mathcal{N}[u_\\theta]$.\n","Thus the PINN term $r_\\theta$ shares the same parameters as the original network $u_\\theta(t,x)$, but respects the underlying \"physics\" of the nonlinear PDE.\n","Both types of derivatives can be easily determined through automatic differentiation with current state-of-the-art machine learning libraries, e.g., TensorFlow or PyTorch.\n","\n","The PINN approach for the solution of the initial and boundary value problem now proceeds by minimization of the loss functional\n","\n","$$\n","\\begin{align}\n","    \\phi_\\theta(X) := \\phi_\\theta^r(X^r) + \\phi_\\theta^0(X^0) + \\phi_\\theta^b(X^b),\n","\\end{align}\n","$$\n","\n","where $X$ denotes the collection of training data and the loss function $\\phi_\\theta$ contains the following terms:\n","\n","  - the mean squared residual\n","$$\n","  \\begin{align*}\n","   \\phi_\\theta^r(X^r) := \\frac{1}{N_r}\\sum_{i=1}^{N_r} \\left|r_\\theta\\left(t_i^r, x_i^r\\right)\\right|^2\n","\\end{align*}\n","$$\n","in a number of collocation points $X^r:=\\{(t_i^r, x_i^r)\\}_{i=1}^{N_r} \\subset (0,T] \\times \\mathcal{D}$, where $r_\\theta$ is the physics-informed neural network,\n","  - the mean squared misfit with respect to the initial and boundary conditions\n","$$\n","    \\begin{align*}\n","   \\phi_\\theta^0(X^0) \n","   := \n","   \\frac{1}{N_0}\n","   \\sum_{i=1}^{N_0} \\left|u_\\theta\\left(t_i^0, x_i^0\\right) - u_0\\left(x_i^0\\right)\\right|^2\n","   \\quad \\text{ and } \\quad\n","   \\phi_\\theta^b(X^b) \n","   := \n","   \\frac{1}{N_b}\n","   \\sum_{i=1}^{N_b} \\left|u_\\theta\\left(t_i^b, x_i^b\\right) - u_b\\left(t_i^b, x_i^b\\right)\\right|^2\n","    \\end{align*}\n","$$\n","in a number of points $X^0:=\\{(t^0_i,x^0_i)\\}_{i=1}^{N_0} \\subset \\{0\\} \\times \\mathcal{D}$ and $X^b:=\\{(t^b_i,x^b_i)\\}_{i=1}^{N_b} \\subset (0,T] \\times \\partial \\mathcal{D}$, where $u_\\theta$ is the neural network approximation of the solution $u\\colon[0,T] \\times \\mathcal{D} \\to \\mathbb{R}$.\n","\n","Note that the training data $X$ consists entirely of time-space coordinates.\n"]},{"cell_type":"markdown","metadata":{"id":"GkvR7tPNZEkh"},"source":["## Example: Burgers equation\n","\n","To illustrate the PINN approach we consider the one-dimensional Burgers equation on the spatial domain $\\mathcal{D} = [-1,1]$\n","\n","$$\n","\\begin{aligned}\n","    \\partial_t u + u \\, \\partial_x u - (0.01/\\pi) \\, \\partial_{xx} u &= 0, \\quad &&\\quad (t,x) \\in (0,1] \\times (-1,1),\\\\\n","   u(0,x) &= - \\sin(\\pi \\, x),                 \\quad &&\\quad x \\in [-1,1],\\\\\n","   u(t,-1) = u(t,1) &= 0,                      \\quad &&\\quad t \\in (0,1].\n","\\end{aligned}\n","$$\n","\n","This PDE arises in various disciplines such as traffic flow, fluid mechanics and gas dynamics, and can be derived from the Navier-Stokes equations, see \n","([Basdevant et al., 1986](https://www.researchgate.net/publication/222935980_Spectral_and_finite_difference_solutions_of_Burgers_equation))."]},{"cell_type":"markdown","metadata":{"id":"Rb1T-jMoZEkh"},"source":["### 1. Import necessary packages and set problem specific data\n","This code runs with TensorFlow version `2.3.0`.\n","The implementation relies mainly on the scientific computing library [NumPy](https://numpy.org/doc/stable/user/whatisnumpy.html) and the machine learning library [TensorFlow](https://www.tensorflow.org/).\n","\n","All computations were performed on an Intel i7 CPU (8th Gen) with 16 GByte DDR3 RAM (2133 MHz) within a couple of minutes."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"tXkjWO_yZEkj"},"outputs":[],"source":["%config InlineBackend.figure_format = 'svg'\n","# Import TensorFlow and NumPy\n","import tensorflow as tf\n","import numpy as np\n","\n","# Set data type\n","DTYPE='float32'\n","tf.keras.backend.set_floatx(DTYPE)\n","\n","# Set constants\n","pi = tf.constant(np.pi, dtype=DTYPE)\n","viscosity = .01/pi\n","\n","# Define initial condition\n","def fun_u_0(x):\n","    return -tf.sin(pi * x)\n","\n","# Define boundary condition\n","def fun_u_b(t, x):\n","    n = x.shape[0]\n","    return tf.zeros((n,1), dtype=DTYPE)\n","\n","# Define residual of the PDE\n","def fun_r(t, x, u, u_t, u_x, u_xx):\n","    return u_t + u * u_x - viscosity * u_xx"]},{"cell_type":"markdown","metadata":{"id":"pQhmdtsqZEkj"},"source":["### 2. Generate a set of collocation points\n","\n","We assume that the collocation points $X_r$ as well as the points for the initial time and boundary data $X_0$ and $X_b$ are generated by random sampling from a uniform distribution.\n","Although uniformly distributed data are sufficient in our experiments, the authors of\n","([Raissi et al., 2017 (Part I)](https://arxiv.org/abs/1711.10561))\n","employed a space-filling Latin hypercube sampling strategy ([Stein, 1987](https://www.tandfonline.com/doi/abs/10.1080/00401706.1987.10488205)).\n","Our numerical experiments indicate that this strategy slightly improves the observed convergence rate, but for simplicity the code examples accompanying this paper employ uniform sampling throughout.\n","We choose training data of size $N_0 = N_b =50$ and $N_f=10000$."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"F5DP6GVRZEkk"},"outputs":[],"source":["# Set number of data points\n","N_0 = 50\n","N_b = 50\n","N_r = 10000\n","\n","# Set boundary\n","tmin = 0.\n","tmax = 1.\n","xmin = -1.\n","xmax = 1.\n","\n","# Lower bounds\n","lb = tf.constant([tmin, xmin], dtype=DTYPE)\n","# Upper bounds\n","ub = tf.constant([tmax, xmax], dtype=DTYPE)\n","\n","# Set random seed for reproducible results\n","tf.random.set_seed(0)\n","\n","# Draw uniform sample points for initial boundary data\n","t_0 = tf.ones((N_0,1), dtype=DTYPE)*lb[0]\n","x_0 = tf.random.uniform((N_0,1), lb[1], ub[1], dtype=DTYPE)\n","X_0 = tf.concat([t_0, x_0], axis=1)\n","\n","# Evaluate intitial condition at x_0\n","u_0 = fun_u_0(x_0)\n","\n","# Boundary data\n","t_b = tf.random.uniform((N_b,1), lb[0], ub[0], dtype=DTYPE)\n","x_b = lb[1] + (ub[1] - lb[1]) * tf.keras.backend.random_bernoulli((N_b,1), 0.5, dtype=DTYPE)\n","X_b = tf.concat([t_b, x_b], axis=1)\n","\n","# Evaluate boundary condition at (t_b,x_b)\n","u_b = fun_u_b(t_b, x_b)\n","\n","# Draw uniformly sampled collocation points\n","t_r = tf.random.uniform((N_r,1), lb[0], ub[0], dtype=DTYPE)\n","x_r = tf.random.uniform((N_r,1), lb[1], ub[1], dtype=DTYPE)\n","X_r = tf.concat([t_r, x_r], axis=1)\n","\n","# Collect boundary and inital data in lists\n","X_data = [X_0, X_b]\n","u_data = [u_0, u_b]"]},{"cell_type":"markdown","metadata":{"id":"Hl4GHQQcZEkl"},"source":["Next, we illustrate the collocation points (red circles) and the positions where the boundary and initial conditions will be enforced (cross marks, color indicates value)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"41dx-V98ZEkl"},"outputs":[],"source":["import matplotlib.pyplot as plt\n","\n","fig = plt.figure(figsize=(9,6))\n","plt.scatter(t_0, x_0, c=u_0, marker='X', vmin=-1, vmax=1)\n","plt.scatter(t_b, x_b, c=u_b, marker='X', vmin=-1, vmax=1)\n","plt.scatter(t_r, x_r, c='r', marker='.', alpha=0.1)\n","plt.xlabel('$t$')\n","plt.ylabel('$x$')\n","\n","plt.title('Positions of collocation points and boundary data');\n","#plt.savefig('Xdata_Burgers.pdf', bbox_inches='tight', dpi=300)"]},{"cell_type":"markdown","metadata":{"id":"5G7wtusVZEkm"},"source":["### 3. Set up network architecture\n","\n","In this example, adopted from \n","([Raissi et al., 2017 (Part I)](https://arxiv.org/abs/1711.10561)), we assume a feedforward neural network of the following structure:\n","- the input is scaled elementwise to lie in the interval $[-1,1]$,\n","- followed by 8 fully connected layers each containing 20 neurons and each followed by a hyperbolic tangent activation function,\n","- one fully connected output layer.\n","\n","This setting results in a network with $3021$ trainable parameters (first hidden layer: $2 \\cdot 20 + 20 = 60$; seven intermediate layers: each $20 \\cdot 20 + 20 = 420$; output layer: $20 \\cdot 1 + 1 = 21$)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"FcBMTSJLZEkm"},"outputs":[],"source":["def init_model(num_hidden_layers=8, num_neurons_per_layer=20):\n","    # Initialize a feedforward neural network\n","    model = tf.keras.Sequential()\n","\n","    # Input is two-dimensional (time + one spatial dimension)\n","    model.add(tf.keras.Input(2))\n","\n","    # Introduce a scaling layer to map input to [lb, ub]\n","    scaling_layer = tf.keras.layers.Lambda(\n","                lambda x: 2.0*(x - lb)/(ub - lb) - 1.0)\n","    model.add(scaling_layer)\n","\n","    # Append hidden layers\n","    for _ in range(num_hidden_layers):\n","        model.add(tf.keras.layers.Dense(num_neurons_per_layer,\n","            activation=tf.keras.activations.get('tanh'),\n","            kernel_initializer='glorot_normal'))\n","\n","    # Output is one-dimensional\n","    model.add(tf.keras.layers.Dense(1))\n","    \n","    return model"]},{"cell_type":"markdown","metadata":{"id":"TmGvzbh-ZEkn"},"source":["### 4. Define routines to determine loss and gradient\n","\n","In the following code cell, we define a function which evaluates the residual\n","\n","$$\n","\\begin{align}\n","    r_\\theta (t,x) := \\partial_t u_\\theta (t,x) + \\mathcal{N}[u_\\theta] (t,x).\n","\\end{align}\n","$$\n","\n","of the nonlinear PDE in the points $X_r = \\{(t^r_i,x^r_i)\\}_{i=1}^{N_r}$.\n","To compute the necessary partial derivatives we use the automatic differentiation capabilities of TensorFlow.\n","\n","For the Burgers equation, this entails computing $\\partial_t u_\\theta$, $\\partial_x u_\\theta$ and $\\partial_{xx} u_\\theta$.\n","In TensorFlow, this is done via a `GradientTape`, see also the [documentation](https://www.tensorflow.org/api_docs/python/tf/GradientTape), which keeps track of the `watched` variables, in our case `t` and `x`, in order to compute the derivatives."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"azOBDHMoZEkn"},"outputs":[],"source":["def get_r(model, X_r):\n","    \n","    # A tf.GradientTape is used to compute derivatives in TensorFlow\n","    with tf.GradientTape(persistent=True) as tape:\n","        # Split t and x to compute partial derivatives\n","        t, x = X_r[:, 0:1], X_r[:,1:2]\n","\n","        # Variables t and x are watched during tape\n","        # to compute derivatives u_t and u_x\n","        tape.watch(t)\n","        tape.watch(x)\n","\n","        # Determine residual \n","        u = model(tf.stack([t[:,0], x[:,0]], axis=1))\n","\n","        # Compute gradient u_x within the GradientTape\n","        # since we need second derivatives\n","        u_x = tape.gradient(u, x)\n","            \n","    u_t = tape.gradient(u, t)\n","    u_xx = tape.gradient(u_x, x)\n","\n","    del tape\n","\n","    return fun_r(t, x, u, u_t, u_x, u_xx)"]},{"cell_type":"markdown","metadata":{"id":"DRk8r7bkZEkn"},"source":["The next function computes the loss for our model\n","\n","$$\n","\\begin{align}\n","    \\phi_\\theta(X) := \\phi_\\theta^r(X^r) + \\phi_\\theta^0(X^0) + \\phi_\\theta^b(X^b),\n","\\end{align}\n","$$\n","\n","as a function of our the training data.\n","The collocation points are given by `X_r`, the initial and boundary data is contained in `X_data = [X_0, X_b]` and `u_data = [u_0, u_b]`."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"2JfHWgDqZEko"},"outputs":[],"source":["def compute_loss(model, X_r, X_data, u_data):\n","    \n","    # Compute phi^r\n","    r = get_r(model, X_r)\n","    phi_r = tf.reduce_mean(tf.square(r))\n","    \n","    # Initialize loss\n","    loss = phi_r\n","    \n","    # Add phi^0 and phi^b to the loss\n","    for i in range(len(X_data)):\n","        u_pred = model(X_data[i])\n","        loss += tf.reduce_mean(tf.square(u_data[i] - u_pred))\n","    \n","    return loss"]},{"cell_type":"markdown","metadata":{"id":"4HaL62skZEko"},"source":["The next function computes the gradient of the loss function $\\phi_\\theta$ with respect to the unknown variables in the model, also called `trainable_variables` in TensorFlow, i.e. $\\nabla_\\theta \\phi_\\theta$.\n","This is also done via a `GradientTape`, but now it keeps track of the parameters $\\theta$ in our model, which can be accessed by `model.trainable_variables`."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"uVM2FNj3ZEko"},"outputs":[],"source":["def get_grad(model, X_r, X_data, u_data):\n","    \n","    with tf.GradientTape(persistent=True) as tape:\n","        # This tape is for derivatives with\n","        # respect to trainable variables\n","        tape.watch(model.trainable_variables)\n","        loss = compute_loss(model, X_r, X_data, u_data)\n","\n","    g = tape.gradient(loss, model.trainable_variables)\n","    del tape\n","\n","    return loss, g"]},{"cell_type":"markdown","metadata":{"id":"km6-MigNZEkp"},"source":["### 5. Set up optimizer and train model"]},{"cell_type":"markdown","metadata":{"id":"wWOCbS4mZEkp"},"source":["Next we initialize the model, set the learning rate to the step function\n","\n","$$\n","\\delta(n) = 0.01 \\, \\textbf{1}_{\\{n < 1000\\}} + 0.001 \\, \\textbf{1}_{\\{1000 \\le n < 3000\\}} + 0.0005 \\, \\textbf{1}_{\\{3000 \\le n\\}}\n","$$\n","\n","which decays in a piecewise constant fashion, and set up a `tf.keras.optimizer` to train the model."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"XFmQx3yRZEkp"},"outputs":[],"source":["# Initialize model aka u_\\theta\n","model = init_model()\n","model.summary()\n","\n","# We choose a piecewise decay of the learning rate, i.e., the\n","# step size in the gradient descent type algorithm\n","# the first 1000 steps use a learning rate of 0.01\n","# from 1000 - 3000: learning rate = 0.001\n","# from 3000 onwards: learning rate = 0.0005\n","\n","lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay([1000,3000],[1e-2,1e-3,5e-4])\n","\n","# Choose the optimizer\n","optim = tf.keras.optimizers.Adam(learning_rate=lr)"]},{"cell_type":"markdown","metadata":{"id":"V8lVzpJLZEkp"},"source":["Train the model for $N=5000$ epochs (takes approximately 3 minutes).\n","Here, we set up a function `train_step()` which performs one training step.\n","\n","*Note*: The `@tf.function` is a so-called `Decorator` within Python. This particular decorator redefines the function that follows, in our case `train_step`, as a TensorFlow graph which may speed up the training significantly."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"Txvp_tlxZEkq"},"outputs":[],"source":["from time import time\n","\n","# Define one training step as a TensorFlow function to increase speed of training\n","@tf.function\n","def train_step():\n","    # Compute current loss and gradient w.r.t. parameters\n","    loss, grad_theta = get_grad(model, X_r, X_data, u_data)\n","    \n","    # Perform gradient descent step\n","    optim.apply_gradients(zip(grad_theta, model.trainable_variables))\n","    \n","    return loss\n","\n","# Number of training epochs\n","N = 5000\n","hist = []\n","\n","# Start timer\n","t0 = time()\n","\n","for i in range(N+1):\n","    \n","    loss = train_step()\n","    \n","    # Append current loss to hist\n","    hist.append(loss.numpy())\n","    \n","    # Output current loss after 50 iterates\n","    if i%500 == 0:\n","        print('It {:05d}: loss = {:10.8e}'.format(i,loss))\n","        \n","# Print computation time\n","print('\\nComputation time: {} seconds'.format(time()-t0))"]},{"cell_type":"markdown","metadata":{"id":"9t16YkAOZEkq"},"source":["### Plot solution"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"IHlpz-ZtZEkq"},"outputs":[],"source":["from mpl_toolkits.mplot3d import Axes3D\n","\n","# Set up meshgrid\n","N = 600\n","tspace = np.linspace(lb[0], ub[0], N + 1)\n","xspace = np.linspace(lb[1], ub[1], N + 1)\n","T, X = np.meshgrid(tspace, xspace)\n","Xgrid = np.vstack([T.flatten(),X.flatten()]).T\n","\n","# Determine predictions of u(t, x)\n","upred = model(tf.cast(Xgrid,DTYPE))\n","\n","# Reshape upred\n","U = upred.numpy().reshape(N+1,N+1)\n","\n","# Surface plot of solution u(t,x)\n","fig = plt.figure(figsize=(9,6))\n","ax = fig.add_subplot(111, projection='3d')\n","ax.plot_surface(T, X, U, cmap='viridis');\n","ax.view_init(35,35)\n","ax.set_xlabel('$t$')\n","ax.set_ylabel('$x$')\n","ax.set_zlabel('$u_\\\\theta(t,x)$')\n","ax.set_title('Solution of Burgers equation');\n","#plt.savefig('Burgers_Solution.pdf', bbox_inches='tight', dpi=300);"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"LzZYJJek3X0G"},"outputs":[],"source":["plt.plot(X[:,0], U[:,0], label='Initial condition')\n","plt.plot(X[:,0], U[:,-1], label='Final solution')\n","plt.xlabel('x'); plt.ylabel('u')\n","plt.legend();"]},{"cell_type":"markdown","metadata":{"id":"zFt09HjnZEkq"},"source":["### Plot the evolution of loss"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"jTuOcgq1ZEkr"},"outputs":[],"source":["fig = plt.figure(figsize=(9,6))\n","ax = fig.add_subplot(111)\n","ax.semilogy(range(len(hist)), hist,'k-')\n","ax.set_xlabel('$n_{epoch}$')\n","ax.set_ylabel('$\\\\phi_{n_{epoch}}$');"]},{"cell_type":"markdown","metadata":{"id":"jK_IOlTjZEkr"},"source":["## Class implementation of PINNs"]},{"cell_type":"markdown","metadata":{"id":"7_HS1CbvZEkr"},"source":["In this section, we implement PINNs as a class which can be used for further testing. Here, we derive the class `PINN_NeuralNet` from `tf.keras.Model`.\n","\n","Required arguments are the lower bound `lb` and upper bound `ub`."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"tYBQzHVwZEkr"},"outputs":[],"source":["# Define model architecture\n","class PINN_NeuralNet(tf.keras.Model):\n","    \"\"\" Set basic architecture of the PINN model.\"\"\"\n","\n","    def __init__(self, lb, ub, \n","            output_dim=1,\n","            num_hidden_layers=8, \n","            num_neurons_per_layer=20,\n","            activation='tanh',\n","            kernel_initializer='glorot_normal',\n","            **kwargs):\n","        super().__init__(**kwargs)\n","\n","        self.num_hidden_layers = num_hidden_layers\n","        self.output_dim = output_dim\n","        self.lb = lb\n","        self.ub = ub\n","        \n","        # Define NN architecture\n","        self.scale = tf.keras.layers.Lambda(\n","            lambda x: 2.0*(x - lb)/(ub - lb) - 1.0)\n","        self.hidden = [tf.keras.layers.Dense(num_neurons_per_layer,\n","                             activation=tf.keras.activations.get(activation),\n","                             kernel_initializer=kernel_initializer)\n","                           for _ in range(self.num_hidden_layers)]\n","        self.out = tf.keras.layers.Dense(output_dim)\n","        \n","    def call(self, X):\n","        \"\"\"Forward-pass through neural network.\"\"\"\n","        Z = self.scale(X)\n","        for i in range(self.num_hidden_layers):\n","            Z = self.hidden[i](Z)\n","        return self.out(Z)\n","    "]},{"cell_type":"markdown","metadata":{"id":"wcOkamgfZEks"},"source":["Next, we derive a class `PINNSolver` which can be used as a base class.\n","It possesses two methods to solve the PDE:\n","  1. the method `solve_with_TFoptimizer` uses a `TensorFlow` optimizer object as input, e.g., the `AdamOptimizer` above;\n","  2. the method `solve_with_LBFGS` resembles the LBFGS method proposed in the original paper using an LBFGS method provided by [`SciPy`](https://www.scipy.org/)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"S8PNryTZZEks"},"outputs":[],"source":["import scipy.optimize\n","\n","class PINNSolver():\n","    def __init__(self, model, X_r):\n","        self.model = model\n","        \n","        # Store collocation points\n","        self.t = X_r[:,0:1]\n","        self.x = X_r[:,1:2]\n","        \n","        # Initialize history of losses and global iteration counter\n","        self.hist = []\n","        self.iter = 0\n","    \n","    def get_r(self):\n","        \n","        with tf.GradientTape(persistent=True) as tape:\n","            # Watch variables representing t and x during this GradientTape\n","            tape.watch(self.t)\n","            tape.watch(self.x)\n","            \n","            # Compute current values u(t,x)\n","            u = self.model(tf.stack([self.t[:,0], self.x[:,0]], axis=1))\n","            \n","            u_x = tape.gradient(u, self.x)\n","            \n","        u_t = tape.gradient(u, self.t)\n","        u_xx = tape.gradient(u_x, self.x)\n","        \n","        del tape\n","        \n","        return self.fun_r(self.t, self.x, u, u_t, u_x, u_xx)\n","    \n","    def loss_fn(self, X, u):\n","        \n","        # Compute phi_r\n","        r = self.get_r()\n","        phi_r = tf.reduce_mean(tf.square(r))\n","        \n","        # Initialize loss\n","        loss = phi_r\n","\n","        # Add phi_0 and phi_b to the loss\n","        for i in range(len(X)):\n","            u_pred = self.model(X[i])\n","            loss += tf.reduce_mean(tf.square(u[i] - u_pred))\n","        \n","        return loss\n","    \n","    def get_grad(self, X, u):\n","        with tf.GradientTape(persistent=True) as tape:\n","            # This tape is for derivatives with\n","            # respect to trainable variables\n","            tape.watch(self.model.trainable_variables)\n","            loss = self.loss_fn(X, u)\n","            \n","        g = tape.gradient(loss, self.model.trainable_variables)\n","        del tape\n","        \n","        return loss, g\n","    \n","    def fun_r(self, t, x, u, u_t, u_x, u_xx):\n","        \"\"\"Residual of the PDE\"\"\"\n","        return u_t + u * u_x - viscosity * u_xx\n","    \n","    def solve_with_TFoptimizer(self, optimizer, X, u, N=1001):\n","        \"\"\"This method performs a gradient descent type optimization.\"\"\"\n","        \n","        @tf.function\n","        def train_step():\n","            loss, grad_theta = self.get_grad(X, u)\n","            \n","            # Perform gradient descent step\n","            optimizer.apply_gradients(zip(grad_theta, self.model.trainable_variables))\n","            return loss\n","        \n","        for i in range(N):\n","            \n","            loss = train_step()\n","            \n","            self.current_loss = loss.numpy()\n","            self.callback()\n","\n","    def solve_with_ScipyOptimizer(self, X, u, method='L-BFGS-B', **kwargs):\n","        \"\"\"This method provides an interface to solve the learning problem\n","        using a routine from scipy.optimize.minimize.\n","        (Tensorflow 1.xx had an interface implemented, which is not longer\n","        supported in Tensorflow 2.xx.)\n","        Type conversion is necessary since scipy-routines are written in Fortran\n","        which requires 64-bit floats instead of 32-bit floats.\"\"\"\n","        \n","        def get_weight_tensor():\n","            \"\"\"Function to return current variables of the model\n","            as 1d tensor as well as corresponding shapes as lists.\"\"\"\n","            \n","            weight_list = []\n","            shape_list = []\n","            \n","            # Loop over all variables, i.e. weight matrices, bias vectors and unknown parameters\n","            for v in self.model.variables:\n","                shape_list.append(v.shape)\n","                weight_list.extend(v.numpy().flatten())\n","                \n","            weight_list = tf.convert_to_tensor(weight_list)\n","            return weight_list, shape_list\n","\n","        x0, shape_list = get_weight_tensor()\n","        \n","        def set_weight_tensor(weight_list):\n","            \"\"\"Function which sets list of weights\n","            to variables in the model.\"\"\"\n","            idx = 0\n","            for v in self.model.variables:\n","                vs = v.shape\n","                \n","                # Weight matrices\n","                if len(vs) == 2:  \n","                    sw = vs[0]*vs[1]\n","                    new_val = tf.reshape(weight_list[idx:idx+sw],(vs[0],vs[1]))\n","                    idx += sw\n","                \n","                # Bias vectors\n","                elif len(vs) == 1:\n","                    new_val = weight_list[idx:idx+vs[0]]\n","                    idx += vs[0]\n","                    \n","                # Variables (in case of parameter identification setting)\n","                elif len(vs) == 0:\n","                    new_val = weight_list[idx]\n","                    idx += 1\n","                    \n","                # Assign variables (Casting necessary since scipy requires float64 type)\n","                v.assign(tf.cast(new_val, DTYPE))\n","        \n","        def get_loss_and_grad(w):\n","            \"\"\"Function that provides current loss and gradient\n","            w.r.t the trainable variables as vector. This is mandatory\n","            for the LBFGS minimizer from scipy.\"\"\"\n","            \n","            # Update weights in model\n","            set_weight_tensor(w)\n","            # Determine value of \\phi and gradient w.r.t. \\theta at w\n","            loss, grad = self.get_grad(X, u)\n","            \n","            # Store current loss for callback function            \n","            loss = loss.numpy().astype(np.float64)\n","            self.current_loss = loss            \n","            \n","            # Flatten gradient\n","            grad_flat = []\n","            for g in grad:\n","                grad_flat.extend(g.numpy().flatten())\n","            \n","            # Gradient list to array\n","            grad_flat = np.array(grad_flat,dtype=np.float64)\n","            \n","            # Return value and gradient of \\phi as tuple\n","            return loss, grad_flat\n","        \n","        \n","        return scipy.optimize.minimize(fun=get_loss_and_grad,\n","                                       x0=x0,\n","                                       jac=True,\n","                                       method=method,\n","                                       callback=self.callback,\n","                                       **kwargs)\n","        \n","    def callback(self, xr=None):\n","        if self.iter % 50 == 0:\n","            print('It {:05d}: loss = {:10.8e}'.format(self.iter,self.current_loss))\n","        self.hist.append(self.current_loss)\n","        self.iter+=1\n","        \n","    \n","    def plot_solution(self, **kwargs):\n","        N = 600\n","        tspace = np.linspace(self.model.lb[0], self.model.ub[0], N+1)\n","        xspace = np.linspace(self.model.lb[1], self.model.ub[1], N+1)\n","        T, X = np.meshgrid(tspace, xspace)\n","        Xgrid = np.vstack([T.flatten(),X.flatten()]).T\n","        upred = self.model(tf.cast(Xgrid,DTYPE))\n","        U = upred.numpy().reshape(N+1,N+1)\n","        fig = plt.figure(figsize=(9,6))\n","        ax = fig.add_subplot(111, projection='3d')\n","        ax.plot_surface(T, X, U, cmap='viridis', **kwargs)\n","        ax.set_xlabel('$t$')\n","        ax.set_ylabel('$x$')\n","        ax.set_zlabel('$u_\\\\theta(t,x)$')\n","        ax.view_init(35,35)\n","        return ax\n","        \n","    def plot_loss_history(self, ax=None):\n","        if not ax:\n","            fig = plt.figure(figsize=(7,5))\n","            ax = fig.add_subplot(111)\n","        ax.semilogy(range(len(self.hist)), self.hist,'k-')\n","        ax.set_xlabel('$n_{epoch}$')\n","        ax.set_ylabel('$\\\\phi^{n_{epoch}}$')\n","        return ax"]},{"cell_type":"markdown","metadata":{"id":"4fU7BHO8ZEkt"},"source":["### Burgers equation with L-BFGS\n","\n","The following code cell shows how the new classes `PINN_NeuralNet` and `PINNSolver` can be used to solve the Burgers equation, this time using the `SciPy` implementation of L-BFGS (takes around 3 minutes)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"qMkLIbsKZEkt"},"outputs":[],"source":["# Initialize model\n","model = PINN_NeuralNet(lb, ub)\n","model.build(input_shape=(None,2))\n","\n","# Initilize PINN solver\n","solver = PINNSolver(model, X_r)\n","\n","# Decide which optimizer should be used\n","#mode = 'TFoptimizer'\n","mode = 'ScipyOptimizer'\n","\n","# Start timer\n","t0 = time()\n","\n","if mode == 'TFoptimizer':\n","    # Choose optimizer\n","    lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay([1000,3000],[1e-2,1e-3,5e-4])\n","    optim = tf.keras.optimizers.Adam(learning_rate=lr)\n","    solver.solve_with_TFoptimizer(optim, X_data, u_data, N=4001)\n","    \n","elif mode == 'ScipyOptimizer':\n","    solver.solve_with_ScipyOptimizer(X_data, u_data,\n","                            method='L-BFGS-B',\n","                            options={'maxiter': 50000,\n","                                     'maxfun': 50000,\n","                                     'maxcor': 50,\n","                                     'maxls': 50,\n","                                     'ftol': 1.0*np.finfo(float).eps})\n","\n","# Print computation time\n","print('\\nComputation time: {} seconds'.format(time()-t0))"]},{"cell_type":"markdown","metadata":{"id":"AoojvmwJZEkt"},"source":["Plot solution and loss history."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"TpLq9hE3ZEkt"},"outputs":[],"source":["solver.plot_solution();\n","solver.plot_loss_history();"]},{"cell_type":"markdown","metadata":{"id":"md3w3gocZEku"},"source":["### Solution of a time-dependent Eikonal equation"]},{"cell_type":"markdown","metadata":{"id":"XFpU_MCzZEku"},"source":["As a second example we consider the one-dimensional Eikonal equation backward in time on the domain $\\mathcal{D}=[-1,1]$\n","\n","$$\n","\\begin{align}\n","   -\\partial_t u(t,x) + |\\nabla u|(t,x) &= 1, \n","   \t\t\\quad & &(t,x) \\in [0,T) \\times [-1,1],\\\\\n","   u(T,x) &= 0, \\quad & &x \\in [-1,1],\\\\\n","   u(t,-1) = u(t, 1) &= 0, \\quad & & t \\in [0,T).\n","\\end{align}\n","$$\n","\n","\n","Note that the partial differential equation can be equally written as a Hamilton-Jacobi-Bellman equation, viz\n","\n","$$\n","    -\\partial_t u(t,x) + \\sup_{|c| \\le 1} \\{c \\, \\nabla u(t,x)\\} = 1 \\quad (t,x) \\in [0,T) \\times [-1,1],\n","$$\n","\n","which characterizes the solution of an optimal control problem seeking to minimize the distance from a point $(t,x)$ to the boundary $[0,T] \\times \\partial \\mathcal{D} \\cup \\{T\\} \\times \\mathcal{D}$.\n","As is easily verified, the solution is given by $u(t,x) = \\min\\{ 1 - t, 1 - |x| \\}$.\n","The fact that the Eikonal equation runs backward in time is in accordance with its interpretation as the optimality condition of a control problem.\n","Note that this is equation can be transformed into a forward evolution problem by the change of variables $\\hat t = T - t$."]},{"cell_type":"markdown","metadata":{"id":"Hpgz97mSZEku"},"source":["#### Problem specific definitions"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"NeBGf0MRZEku"},"outputs":[],"source":["N_0 = 50\n","N_b = 50\n","N_r = 10000\n","\n","# Specify boundaries\n","lb = tf.constant([0., -1.], dtype=DTYPE)\n","ub = tf.constant([1., 1.], dtype=DTYPE)\n","\n","def Eikonal_u_0(x):\n","    n = x.shape[0]\n","    return tf.zeros((n,1), dtype=DTYPE)\n","\n","def Eikonal_u_b(t, x):\n","    n = x.shape[0]\n","    return tf.zeros((n,1), dtype=DTYPE)"]},{"cell_type":"markdown","metadata":{"id":"vcyQghvvZEku"},"source":["#### Generate data\n","This code snippet is almost identical to the code from above.\n","We choose $N_b = 50$ and $N_0 = 50$ uniformly distributed initial value and boundary points and sample $N_r = 10000$ collocation points uniformly within the domain boundaries.\n","We derive a new solver with `PINNSolver` as base class."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"TVK9118VZEkv"},"outputs":[],"source":["tf.random.set_seed(0)\n","\n","# Final time data\n","t_0 = tf.ones((N_0,1), dtype=DTYPE) * ub[0]\n","x_0 = tf.random.uniform((N_0,1), lb[1], ub[1], dtype=DTYPE)\n","X_0 = tf.concat([t_0, x_0], axis=1)\n","u_0 = Eikonal_u_0(x_0)\n","\n","# Boundary data\n","t_b = tf.random.uniform((N_b,1), lb[0], ub[0], dtype=DTYPE)\n","x_b = lb[1] + (ub[1] - lb[1]) * tf.keras.backend.random_bernoulli((N_b,1), 0.5, dtype=DTYPE)\n","X_b = tf.concat([t_b, x_b], axis=1)\n","u_b = Eikonal_u_b(t_b, x_b)\n","\n","# Collocation points\n","t_r = tf.random.uniform((N_r,1), lb[0], ub[0], dtype=DTYPE)\n","x_r = tf.random.uniform((N_r,1), lb[1], ub[1], dtype=DTYPE)\n","X_r = tf.concat([t_r, x_r], axis=1)\n","\n","# Collect boundary and inital data in lists\n","X_data = [X_0,X_b]\n","u_data = [u_0,u_b]"]},{"cell_type":"markdown","metadata":{"id":"xuxKrs2JZEkv"},"source":["#### Derive Eikonal solver class\n","Now, we derive a solver for the Eikonal equation from the `PINNSolver` class. Since the Eikonal equation does not depend on second-order derivatives, we implement a new method `get_r` which avoids the computation of second derivatives."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"H5_H69mkZEkv"},"outputs":[],"source":["class EikonalPINNSolver(PINNSolver):\n","    def __init__(self, *args, **kwargs):\n","        super().__init__(*args, **kwargs)\n","    \n","    def fun_r(self, t, x, u, u_t, u_x, u_xx):\n","        \"\"\"Residual of the PDE\"\"\"\n","        return -u_t + tf.abs(u_x) - 1.\n","    \n","    def get_r(self):\n","        \"\"\"We update get_r since the Eikonal equation is a first-order equation.\n","        Therefore, it is not necessary to compute second derivatives.\"\"\"\n","        \n","        with tf.GradientTape(persistent=True) as tape:\n","            # Watch variables representing t and x during this GradientTape\n","            tape.watch(self.t)\n","            tape.watch(self.x)\n","            \n","            # Compute current values u(t,x)\n","            u = self.model(tf.stack([self.t[:,0], self.x[:,0]], axis=1))\n","            \n","        u_x = tape.gradient(u, self.x)\n","            \n","        u_t = tape.gradient(u, self.t)\n","        \n","        del tape\n","        \n","        return self.fun_r(self.t, self.x, u, u_t, u_x, None)"]},{"cell_type":"markdown","metadata":{"id":"zYbXErmCZEkv"},"source":["#### Setting up the neural network architecture\n","\n","The neural network model chosen for this particular problem can be simpler.\n","We decided to use only two hidden layers with 20 neurons in each, resulting in $501$ unknown parameters (first hidden layer: $2 \\cdot 20 + 20 = 60$; one intermediate layer: $20 \\cdot 20 + 20 = 420$; output layer: $20 \\cdot 1 + 1 = 21$).\n","To account for the lack of smoothness of the solution, we choose a non-differentiable activation function, although the hyperbolic tangent function seems to be able to approximate the kinks in the solution sufficiently well.\n","Here, we decided to use the \\emph{leaky rectified linear unit (leaky ReLU)} activation function \n","\n","$$\n","\\begin{align*}\n","    \\sigma(z) = \\begin{cases}\n","        z &\\text{ if } z \\ge 0,\\\\\n","        0.1 \\, z &\\text{ otherwise},\n","    \\end{cases}\n","\\end{align*}\n","$$\n","\n","which displays a non-vanishing gradient when the unit is not active, i.e., when $z < 0$."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"xe7IxT0ZZEkw"},"outputs":[],"source":["# Initialize model\n","model = PINN_NeuralNet(lb, ub, num_hidden_layers=2,\n","                       activation=tf.keras.layers.LeakyReLU(alpha=0.1),\n","                                       kernel_initializer='he_normal')\n","model.build(input_shape=(None,2))\n","\n","# Initilize PINN solver\n","eikonalSolver = EikonalPINNSolver(model, X_r)"]},{"cell_type":"markdown","metadata":{"id":"91m7w0giZEkw"},"source":["Start training (take approximately 40 seconds)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"BTNyeVbGZEkw"},"outputs":[],"source":["# Choose step sizes aka learning rate\n","lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay([3000,7000],[1e-1,1e-2,1e-3])\n","\n","# Solve with Adam optimizer\n","optim = tf.keras.optimizers.Adam(learning_rate=lr)\n","\n","# Start timer\n","t0 = time()\n","eikonalSolver.solve_with_TFoptimizer(optim, X_data, u_data, N=10001)\n","\n","# Print computation time\n","print('\\nComputation time: {} seconds'.format(time()-t0))"]},{"cell_type":"markdown","metadata":{"id":"QpSrNvE6ZEkw"},"source":["Plot the results."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"I_0rTzbgZEkw"},"outputs":[],"source":["eikonalSolver.plot_solution();\n","#plt.savefig('Eikonal_Solution.pdf', bbox_inches='tight', dpi=300)\n","eikonalSolver.plot_loss_history();"]},{"cell_type":"markdown","metadata":{"id":"MjA_GhvvZEkx"},"source":["## Parameter identification setting"]},{"cell_type":"markdown","metadata":{"id":"fYq9RRHYZEkx"},"source":["In this section, we want to demonstrate how the PINN approach could be used to solve partial differential equations with unknown parameters $\\lambda$.\n","To be more precise, we consider the parabolic Eikonal equation\n","\n","$$\n","\\begin{aligned}\n","- \\partial_t u + \\sup_{|c|\\le1} c \\cdot \\nabla u = -\\partial_t u + |\\nabla u| &= \\lambda^{-1}\\\\\n","u(T,x) &= 0\\\\\n","u(t,-1) = u(t,1) &= 0\n","\\end{aligned}\n","$$\n","\n","with unknown parameter $\\lambda$.\n","The explicit solution is $u^*(t,x) = \\lambda^{-1} \\min\\{1-t, 1-|x|\\}$.\n"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ffMqtY-PZEkx"},"outputs":[],"source":["lambd_star = 3.\n","\n","def u_expl(t, x, lambd_star):\n","    \"\"\"Explicit solution of the parametric Eikonal equation.\"\"\"\n","    y = 1./lambd_star\n","    return y * tf.math.minimum(1-t, 1-tf.abs(x))"]},{"cell_type":"markdown","metadata":{"id":"7ALc7XLXZEkx"},"source":["Next, we draw $N_d = 500$ uniformly distributed measurements of the exact solution."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ToWTRpqSZEky"},"outputs":[],"source":["N_d = 500\n","noise = 0.0\n","\n","# Draw points with measurements randomly\n","t_d = tf.random.uniform((N_d,1), lb[0], ub[0], dtype=DTYPE)\n","x_d = tf.random.uniform((N_d,1), lb[1], ub[1], dtype=DTYPE)\n","X_d = tf.concat([t_d, x_d], axis=1)\n","u_d = u_expl(t_d, x_d, lambd_star)\n","u_d += noise * tf.random.normal(u_d.shape, dtype=DTYPE)\n","\n","# Copy original data\n","X_param = X_data\n","u_param = u_data"]},{"cell_type":"markdown","metadata":{"id":"bK7Rmg9QZEky"},"source":["Since both the boundary and initial time data are of Dirichlet type, we may handle the measured data exactly like $X_0$ and $X_b$.\n","Thus, we can simply append $X_d$ and $u_d$ to `Xdata` and `udata`.\n","\n","Note that the approach illustrated here is slightly different from the one introduced in ([Raissi et al., 2017 (Part II)](https://arxiv.org/abs/1711.10566)) which takes only measurement data into account."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"_TPaNhASZEky"},"outputs":[],"source":["X_param.append(X_d)\n","u_param.append(u_d)"]},{"cell_type":"markdown","metadata":{"id":"3vvGCNB6ZEky"},"source":["Next, we derive a new network class which takes the additional parameter $\\lambda$ into account.\n","Note that this parameter has to be part of the model in order to be learnt during training."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"r7Yy4fPhZEky"},"outputs":[],"source":["class PINNIdentificationNet(PINN_NeuralNet):\n","    def __init__(self, *args, **kwargs):\n","        \n","        # Call init of base class\n","        super().__init__(*args,**kwargs)\n","        \n","        # Initialize variable for lambda\n","        self.lambd = tf.Variable(1.0, trainable=True, dtype=DTYPE)\n","        self.lambd_list = []"]},{"cell_type":"markdown","metadata":{"id":"3gJAe9YLZEkz"},"source":["Now, we derive a new solver class which only updates the evaluation of the residual `fun_r` which now incorporates the $\\lambda$-dependency.\n","In addition, we modify the `callback` function to store the iterates of $\\lambda$ in a list `lambd_list` as well."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"oE-HHf1mZEkz"},"outputs":[],"source":["class EikonalPINNIdentification(EikonalPINNSolver):\n","    \n","    def fun_r(self, t, x, u, u_t, u_x, u_xx):\n","        \"\"\"Residual of the PDE\"\"\"\n","        return -u_t + tf.abs(u_x) - 1./self.model.lambd\n","    \n","    def callback(self, xr=None):\n","        lambd = self.model.lambd.numpy()\n","        self.model.lambd_list.append(lambd)\n","        \n","        if self.iter % 50 == 0:\n","            print('It {:05d}: loss = {:10.8e} lambda = {:10.8e}'.format(self.iter, self.current_loss, lambd))\n","        \n","        self.hist.append(self.current_loss)\n","        self.iter += 1\n","        \n","    def plot_loss_and_param(self, axs=None):\n","        if axs:\n","            ax1, ax2 = axs\n","            self.plot_loss_history(ax1)\n","        else:\n","            ax1 = self.plot_loss_history()\n","            ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis\n","\n","        color = 'tab:blue'\n","        ax2.tick_params(axis='y', labelcolor=color)\n","        ax2.plot(range(len(self.hist)), self.model.lambd_list,'-',color=color)\n","        ax2.set_ylabel('$\\\\lambda^{n_{epoch}}$', color=color)\n","        return (ax1,ax2)"]},{"cell_type":"markdown","metadata":{"id":"CGCpsiAJZEkz"},"source":["Finally, we set up the model consisting of only two hidden layers employing the Leaky ReLU function with slope parameter $\\alpha = 0.1$, i.e., \n","\n","$$\n","\\sigma(z) = \\begin{cases} x & \\text{ if } x \\ge 0\\\\ \\alpha \\, x & \\text{ otherwise.} \\end{cases}\n","$$\n","\n","The training for $n_{epochs} = 10000$ epochs takes around 45 seconds."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"Pjtp1jnDZEkz"},"outputs":[],"source":["# Initialize model\n","model = PINNIdentificationNet(lb, ub, num_hidden_layers=2,\n","                                        activation=tf.keras.layers.LeakyReLU(alpha=0.1),\n","                                        kernel_initializer='he_normal')\n","\n","model.build(input_shape=(None,2))\n","\n","# Initilize solver\n","eikonalIdentification = EikonalPINNIdentification(model, X_r)\n","\n","# Choose step sizes aka learning rate\n","lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay([3000,7000],[1e-1,1e-2,1e-3])\n","\n","# Solve with Adam optimizer\n","optim = tf.keras.optimizers.Adam(learning_rate=lr)\n","\n","# Start timer\n","t0 = time()\n","eikonalIdentification.solve_with_TFoptimizer(optim, X_param, u_param, N=10001)\n","\n","# Print computation time\n","print('\\nComputation time: {} seconds'.format(time()-t0))"]},{"cell_type":"markdown","metadata":{"id":"aZKSMIK1ZEkz"},"source":["Plot solution $u_\\theta(t,x)$ and evolution of loss values $\\phi^{n_\\text{epoch}}$ and estimated parameters $\\lambda^{n_\\text{epoch}}$.."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"B2x46GGJZEk0"},"outputs":[],"source":["ax = eikonalIdentification.plot_solution()\n","#plt.savefig('Eikonal_PI_Solution.pdf', bbox_inches='tight', dpi=300)\n","axs = eikonalIdentification.plot_loss_and_param()\n","#plt.savefig('Eikonal_PI_LossEvolution.pdf', bbox_inches='tight', dpi=300)"]},{"cell_type":"markdown","metadata":{"id":"6r75WOi6ZEk0"},"source":["Finally, we compute the relative error of the identified parameter $\\lambda$."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"dRfZEVmbZEk0"},"outputs":[],"source":["lambd_rel_error = np.abs((eikonalIdentification.model.lambd.numpy()-lambd_star)/lambd_star)\n","print('Relative error of lambda ', lambd_rel_error)"]},{"cell_type":"markdown","metadata":{"id":"LxPbY_jsZEk0"},"source":["The next code cell performs the previous training 5 times in order to give a more reliable picture of the convergence since the weight matrices are initialized randomly at each run (takes about 4 minutes)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"Opt0SKZCZEk0"},"outputs":[],"source":["lambd_hist = []\n","loss_hist = []\n","for i in range(5):\n","    print('{:s}\\nStart of iteration {:d}\\n{:s}'.format(50*'-',i,50*'-'))\n","    # Initialize model\n","    model = PINNIdentificationNet(lb, ub, num_hidden_layers=2,\n","                                        activation=tf.keras.layers.LeakyReLU(alpha=0.1),\n","                                        kernel_initializer='he_normal')\n","\n","    model.build(input_shape=(None,2))\n","\n","    # Initilize solver\n","    eikonalIdentification = EikonalPINNIdentification(model, X_r)\n","\n","    # Choose step sizes aka learning rate\n","    lr = tf.keras.optimizers.schedules.PiecewiseConstantDecay([3000,7000],[1e-1,1e-2,1e-3])\n","    N=10001\n","    \n","    # Solve with Adam optimizer\n","    optim = tf.keras.optimizers.Adam(learning_rate=lr)\n","    \n","    eikonalIdentification.solve_with_TFoptimizer(optim, X_param, u_param, N=N)\n","    \n","    # Store evolution of lambdas\n","    lambd_hist.append(model.lambd_list)\n","    \n","    # Store evolution of losses\n","    loss_hist.append(eikonalIdentification.hist)"]},{"cell_type":"markdown","metadata":{"id":"cBgcGX_wZEk1"},"source":["Next, we generate a table printing the mean and standard deviations of the identified parameter $\\lambda$ obtained for the previous runs."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"YjJ3sbsCZEk1"},"outputs":[],"source":["print('  i     Mean of lambda   Std. of lambda')\n","for i in [1000,2000,3000,4000,5000,6000,7000,8000,9000,10000]:\n","    xi = np.array([ x[i] for x in lambd_hist])\n","    print('{:05d}     {:6.4e}       {:6.4e}'.format(i, xi.mean(), xi.std()))"]},{"cell_type":"markdown","metadata":{"id":"sS2_BynhZEk1"},"source":["Next, we plot the five evolutions of $\\lambda$ (dark gray), its mean (solid blue) and one standard deviation (shaded blue) together with the true value of $\\lambda$ (dashed blue)."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"IJA_rXgpZEk1"},"outputs":[],"source":["fig = plt.figure()\n","ax = fig.add_subplot(111)\n","color = 'tab:blue'\n","\n","Lambd = np.stack(lambd_hist)\n","lmean = Lambd.mean(axis=0)\n","lstd = Lambd.std(axis=0)\n","Lambd_RelError = np.abs((Lambd-lambd_star)/lambd_star)\n","lrange=range(len(lmean))\n","\n","for i in range(len(lambd_hist)):\n","    ax.plot(lrange, lambd_hist[i],'-',color='black', alpha=0.5)\n","ax.plot(lrange, lmean,'-',color=color)\n","ax.plot(lrange, lambd_star*np.ones_like(lmean),'--',color=color)\n","ax.fill_between(lrange,lmean-lstd,lmean+lstd, alpha=0.2)\n","ax.set_ylabel('$\\\\lambda^{n_{epoch}}$')\n","ax.set_xlabel('$n_{epoch}$')\n","ax.set_ylim([2.8,3.2])\n","#plt.savefig('Eikonal_PI_Evolution.pdf', bbox_inches='tight', dpi=300)"]},{"cell_type":"markdown","metadata":{"id":"-WtDnWkfZEk1"},"source":["Finally, we plot the mean relative error of $\\lambda$."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"PIaO0PAtZEk2"},"outputs":[],"source":["fig = plt.figure()\n","ax = fig.add_subplot(111)\n","ax.semilogy(lrange,Lambd_RelError.mean(axis=0))\n","ax.fill_between(lrange,Lambd_RelError.mean(axis=0)-Lambd_RelError.std(axis=0),\n","                Lambd_RelError.mean(axis=0)+Lambd_RelError.std(axis=0), alpha=0.2)\n","ax.set_xlabel('$n_{epoch}$')\n","ax.set_ylabel('$e_{\\\\lambda}^{rel}$')\n","ax.set_title('Mean relative error of $\\\\lambda$');"]}],"metadata":{"colab":{"collapsed_sections":["4fU7BHO8ZEkt","md3w3gocZEku","Hpgz97mSZEku","vcyQghvvZEku","xuxKrs2JZEkv","zYbXErmCZEkv"],"provenance":[{"file_id":"https://github.com/janblechschmidt/PDEsByNNs/blob/main/PINN_Solver.ipynb","timestamp":1666864649680}]},"kernelspec":{"display_name":"Python 3.10.6 ('base')","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.6"},"vscode":{"interpreter":{"hash":"c6e4e9f98eb68ad3b7c296f83d20e6de614cb42e90992a65aa266555a3137d0d"}}},"nbformat":4,"nbformat_minor":0}