{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sys.version= 3.6.3 |Anaconda, Inc.| (default, Oct 6 2017, 12:04:38) \n", "[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]\n", "np.__version__= 1.13.3\n" ] } ], "source": [ "# NOT FOR MANUSCRIPT\n", "import sys\n", "\n", "import numpy as np\n", "import scipy as sp\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "\n", "print (\"sys.version=\", sys.version)\n", "print (\"np.__version__=\", np.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Geophysical Tutorial – coordinated by Matt Hall\n", "\n", "# The conjugate gradient method\n", "\n", "Karl Schleicher, k_schleicher@hotmail.com" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The conjugate gradient method can be used to solve many large linear geophysical problems. For example, least squares parabolic and hyperbolic radon transform, travel time tomography, least squares migration, and full waveform inversion (e.g. Witte et al., 2018). This tutorial revisits the _Linear inversion_ tutorial (Hall, 2016) that estimated reflectivity by deconvolving a known wavelet from a seismic trace using least squares. This tutorial solves the same problem using the conjugate gradient method. This problem is easy to understand and the concepts apply to other applications. The conjugate gradient method is often used to solve large problems because the least squares algorithm is much more expensive — that is, even a large computer may not be able to find a useful solution in a reasonable amount of time. \n", "\n", "\n", "## Introduction\n", "\n", "The conjugate gradient method was originally proposed by Hestenes (1952) and extended to handle rectangular matrices by Paige & Saunders (1982). Claerbout (2012) demonstrated its application to geophysical problems. It is an iterative method. 
Each iteration applies the linear operator and its adjoint. The initial guess is often the zero vector, and only a few iterations may be needed to reach a useful solution. \n", "\n", "The adjoint of the operator $\\mathbf{A}$, denoted as $\\mathbf{A}^\\mathrm{H}$, is defined as the operator that satisfies $\\langle \\mathbf{A} \\mathbf{x}, \\mathbf{y} \\rangle$ = $\\langle \\mathbf{x}, \\mathbf{A}^\\mathrm{H} \\mathbf{y} \\rangle$ for all vectors $\\mathbf{x}$ and $\\mathbf{y}$ (where $\\langle \\mathbf{u},\\mathbf{v} \\rangle$ represents the inner product between vectors $\\mathbf{u}$ and $\\mathbf{v}$). For a given matrix, the adjoint is simply the complex conjugate of the transpose of the matrix; this is also sometimes known as the Hermitian transpose, and sometimes written as $\\mathbf{A}^\\ast$ or $\\mathbf{A}^\\dagger$. Just to muddy the notational waters even further, the complex conjugate transpose is denoted by `A.H` in NumPy (on `matrix` objects) and `A'` in MATLAB or Octave. However, we will implement the adjoint operator without forming any matrices." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "matrix\n", "[[ 2.+0.j 3.+0.j]\n", " [ 6.+3.j -7.+0.j]]\n", "\n", "transpose\n", "[[ 2.+0.j 6.+3.j]\n", " [ 3.+0.j -7.+0.j]]\n", "\n", "complex conjugate transpose, aka Hermitian transpose\n", "[[ 2.-0.j 6.-3.j]\n", " [ 3.-0.j -7.-0.j]]\n" ] } ], "source": [ "# NOT FOR MANUSCRIPT\n", "A = np.matrix([[2,3], [(6+3j),-7]])\n", "\n", "print(\"matrix\")\n", "print(A)\n", "print(\"\\ntranspose\")\n", "print(A.T)\n", "print(\"\\ncomplex conjugate transpose, aka Hermitian transpose\")\n", "print(A.H)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many linear operators can be programmed as functions that are more intuitive and efficient than matrix multiplication. 
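For instance, convolution with a known wavelet is a linear operator, and its adjoint is correlation with the same wavelet. The following minimal sketch (an illustration using NumPy, not code from the accompanying notebook; the names `forward` and `adjoint` are ours) implements such an operator pair and checks it with the dot-product test described by Claerbout (2012):

```python
# Illustration (not from the tutorial): a linear operator as a function,
# its adjoint as another function, and a dot-product test.
import numpy as np

def forward(w, x):
    # y = A x: convolve reflectivity x with wavelet w, keep len(x) samples
    return np.convolve(w, x)[:len(x)]

def adjoint(w, y):
    # x = A^H y: correlate y with wavelet w (adjoint of truncated convolution)
    return np.correlate(y, w, mode='full')[len(w) - 1:len(w) - 1 + len(y)]

# Dot-product test: <A x, y> must equal <x, A^H y> for random vectors
rng = np.random.RandomState(0)
w, x, y = rng.randn(5), rng.randn(20), rng.randn(20)
print(np.dot(forward(w, x), y))   # the two printed numbers should agree
print(np.dot(x, adjoint(w, y)))   # to machine precision
```

If the two inner products agree to machine precision, the adjoint is coded correctly, and no matrix is ever formed. 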
The matrices for operators like migration and FWI would be huge, but we avoid forming them: once you have a program that applies the linear operator, you can write a program that applies its adjoint without ever building a matrix. Implementing the conjugate gradient algorithm using functions to apply linear operators and their adjoints is practical and efficient. It is wonderful to see programs that implement linear algorithms without matrices; this programming technique is a key theme in Claerbout (2012).\n", "\n", "This tutorial provides a quick start to the conjugate gradient method based on the pseudocode of Guo (2002). Those interested in more depth can read Claerbout (2012) and Shewchuk (1994). A Jupyter Notebook with Python code to reproduce the figures in this tutorial is at: https://github.com/seg/tutorials." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 