{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Homework 5\n", "\n", "## Matrix Diagonalization and PCA\n", "
\n", "This notebook is arranged in cells. Texts are usually written in the markdown cells, and here you can use html tags (make it bold, italic, colored, etc). You can double click on this cell to see the formatting.
\n", "
\n", "The ellipsis (...) are provided where you are expected to write your solution but feel free to change the template (not over much) in case this style is not to your taste.
\n", "
\n", "Hit \"Shift-Enter\" on a code cell to evaluate it. Double click a Markdown cell to edit.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "### Link Okpy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from client.api.notebook import Notebook\n", "ok = Notebook('hw5.ok')\n", "_ = ok.auth(inline = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "from scipy.integrate import quad\n", "#For plotting\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Problem 1 - Asymmetric Quantum Well\n", "\n", "Quantum mechanics can be formulated as a matrix problem and solved on a computer using linear algebra methods. Suppose, for example, we have a particle of mass M in a one-dimensional quantum well of width $L$, but not a square well like the examples you’ve probably seen before. Suppose instead that the potential $V(x)$ varies somehow inside the well:\n", "

\n", "We cannot solve such problems analytically in general, but we can solve them on the computer. In a pure state of energy $E$, the spatial part of the wavefunction obeys the time-independent\n", "Schrodinger equation $\\hat{H} \\psi(x) = E \\psi(x)$, where the Hamiltonian operator $\\hat{H}$ is given by\n", "

\n", "$$ \\hat{H} = -\\frac{\\hbar^2}{2M}\\frac{d^2}{dx^2} + V(x) $$\n", "

\n", "For simplicity, let’s assume that the walls of the well are infinitely high, so that the wavefunction is zero outside the well, which means it must go to zero at $x = 0$ and $x = L$. In that case, the wavefunction can be expressed as a Fourier sine series thus:\n", "

\n", "$$ \\psi(x) = \\sum_{n=1}^{\\infty}\\psi_n \\mathrm{sin}(\\frac{\\pi nx}{L}) $$\n", "where $\\psi_1, \\psi_2, ...$ are the Fourier coefficients.\n", "

\n", "Using the orthogonality relationships of the sine functions, we find that $\\hat{H} \\psi(x) = E \\psi(x)$ implies that\n", "

\n", "$$ \\sum_{n=1}^{\\infty} \\int_0^L \\mathrm{sin}(\\frac{\\pi mx}{L}) \\hat{H} \\mathrm{sin}(\\frac{\\pi nx}{L}) dx = \\frac{L}{2}E \\psi_m. $$\n", "

\n", "Hence, defining a Hamiltonian matrix $\\bf H$ with elements\n", "

\n", "$$ H_{mn} = \\frac{2}{L} \\int_0^L \\mathrm{sin}(\\frac{\\pi mx}{L}) \\hat{H} \\mathrm{sin}(\\frac{\\pi nx}{L}) dx. $$\n", "

\n", "Then, the Schrodinger's equation can be written in matrix form as $\\bf H \\boldsymbol \\psi$ = $E$ $\\boldsymbol \\psi$, where $\\boldsymbol \\psi$ is an eigenvector of the Hamiltonian matrix with eigenvalue $E$. If we can calculate the eigenvalues of this matrix, then we know the allowed energies of the particle in the well.\n", "

\n", "Let $V(x) = ax/L$, and then we can evaluate the integral in $H_{mn}$ analytically. Here is a general expression for the matrix element $H_{mn}$:\n", "

\n", "\\begin{equation}\n", "H_{mn} = \n", "\\begin{cases}\n", "\\frac{1}{2M}(\\frac{\\hbar \\pi n}{L})^2 + \\frac{a}{2}\\ \\ \\ \\ \\ \\ m = n \\\\\n", "-\\frac{8a}{\\pi^2}\\frac{mn}{(m^2-n^2)^2}\\ \\ \\ \\ \\ \\ m \\neq n\\ \\mathrm{and\\ one\\ is\\ even,\\ one\\ is\\ odd}\n", "\\end{cases}\n", "\\end{equation}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 1. Is the matrix $\\bf H$ real and symmetric?
" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ " Answer:
\n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 2. Write a Python program to evaluate your expression for $H_{mn}$ for arbitrary $m$ and $n$ when the particle in the well is an electron, the well has width 5 Angstrom, and $a$ = 10 eV. (The mass and charge of an electron are $9.1094 \\times 10^{−31}$ kg and $1.6022 \\times 10^{−19}$ C respectively.) Evaluate $H_{22}$, $H_{23}$, and $H_{35}$. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "L = 5e-10\n", "hbar = 1.0546e-34\n", "M = 9.1094e-31\n", "a = 10*1.6022e-19\n", "\n", "\n", "def H_element(m,n):\n", " \n", " ...\n", " \n", " return ..." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "print('H22 =', H_element(2,2), ', H23 =', H_element(2,3), ', H35 =', H_element(3,5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 3. The matrix $\\bf H$ is in theory infinitely large, so we cannot calculate all its eigenvalues. But we can get a pretty accurate solution for the first few of them by cutting off the matrix after the first few elements. Use the program you wrote for part 2 to create a 10 × 10 array of the elements of $\\bf H$ up to $m, n$ = 10." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "...\n", "H = ...\n", "..." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Show the matrix H\n", "print(H)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 4. Calculate the eigenvalues of this matrix using the appropriate function from numpy.linalg and hence print out, in units of electron volts, the first ten energy levels of the quantum well, within this approximation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Suggestion - See https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.linalg.eigh.html\n", "from numpy.linalg import eigh\n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 5. What is the ground-state energy of the system? (in eV). " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 6. Use a 100 × 100 array instead and again calculate the first ten energy eigenvalues." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 7. Comparing with the values you calculated in part 4, what do you conclude about the accuracy of the calculation?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " Answer:
\n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 8. Now modify your program once more to calculate the wavefunction $\\psi(x)$ for the ground state and the first two excited states of the well. Use your results to make a graph with three curves showing the probability density $|\\psi(x)|^2$ as a function of x in each of these three states." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 9. In the setup, the eigenvector $\\boldsymbol \\psi$ of the Hamiltonian is normalized. Then, is $\\int_0^L|\\psi(x)|^2 = 1$? (Hint: $\\int_0^L \\mathrm{sin}(\\frac{\\pi nx}{L})\\mathrm{sin}(\\frac{\\pi mx}{L}) = \\frac{L}{2}$ if $m = n$ and 0 otherwise). Using \"quad,\" integrate the wavefunction for the ground state from $x = 0$ to $L$.
Answer:

\n", "We have $\\int_0^L|\\psi(x)|^2 = (\\sum_n |\\psi_n|^2) \\int_0^L \\mathrm{sin}(\\frac{\\pi nx}{L})^2 = 1 \\cdot \\frac{L}{2} \\approx 2.5 \\times 10^{-10}$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from scipy.integrate import quad\n", "..." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Problem 2 - Applying the PCA Method on Quasar Spectra\n", "\n", "The following analysis is based on https://arxiv.org/pdf/1208.4122.pdf.\n", "

\n", "\"Principal Component Analysis (PCA) is a powerful\n", "and widely used technique to analyze data\n", "by forming a custom set of “principal component”\n", "eigenvectors that are optimized to describe the\n", "most data variance with the fewest number of\n", "components. With the full set of eigenvectors the data\n", "may be reproduced exactly, i.e., PCA is a transformation\n", "which can lend insight by identifying\n", "which variations in a complex dataset are most\n", "significant and how they are correlated. Alternately,\n", "since the eigenvectors are optimized and\n", "sorted by their ability to describe variance in the\n", "data, PCA may be used to simplify a complex\n", "dataset into a few eigenvectors plus coefficients,\n", "under the approximation that higher-order eigenvectors\n", "are predominantly describing fine tuned\n", "noise or otherwise less important features of the\n", "data.\" (S. Bailey, arxiv: 1208.4122)\n", "

\n", "In this problem, we take the quasar (QSO) spectra from the Sloan Digital Sky Survey (SDSS) and apply PCA to them. Filtering for high $S/N$ in order to apply the standard PCA, we select 18 high-$S/N$ spectra of QSOs with redshift 2.0 < z < 2.1, trimmed to $1340 < \\lambda < 1620\\ \\mathring{A}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Load data\n", "wavelength = np.loadtxt(\"HW5_Problem2_wavelength.txt\")\n", "flux = np.loadtxt(\"HW5_Problem2_QSOspectra.txt\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Data dimension\n", "print( np.shape(wavelength) )\n", "print( np.shape(flux) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the above cell, we load the following data: wavelength in Angstroms (\"wavelength\") and 2D array of spectra x fluxes (\"flux\").\n", "

\n", "We have 824 wavelength bins, so \"flux\" is 18 $\\times$ 824 matrix, each row containing fluxes of different QSO spectra.\n", "

\n", " 1. Plot any three QSO spectra flux as a function of wavelength. (In order to better see the features of QSO spectra, you may plot them with some offsets.)
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"Flux\" is the data matrix of order 18 $\\times$ 824. Call this matrix $\\bf X$. \n", "

\n", "We can construct the covariance matrix $\\bf C$ using the mean-centered data matrix. First, calculate the mean of each column and subtracts this from the column. Let $\\bf X_c$ denote the mean-centered data matrix.

\n", "$\\bf X_c$ $ =\n", " \\begin{bmatrix}\n", " x_{(1,1)} - \\overline{x}_1 & x_{(1,2)} - \\overline{x}_2 & \\dots & x_{(1,824)} - \\overline{x}_{824} \\\\\n", " x_{(2,1)} - \\overline{x}_1 & x_{(2,2)} - \\overline{x}_2 & \\dots & x_{(2,824)} - \\overline{x}_{824} \\\\\n", " \\vdots & \\vdots & \\vdots & \\vdots \\\\\n", " x_{(18,1)} - \\overline{x}_1 & x_{(18,2)} - \\overline{x}_2 & \\dots & x_{(18,824)} - \\overline{x}_{824}\n", " \\end{bmatrix}$\n", "

\n", "where $x_{m,n}$ denote the flux of $m$th QSO in $n$th wavelength bin, and $\\overline{x}_k$ is the mean flux in $k$th wavelength bin.\n", "

\n", "Then, the covariance matrix is:\n", "$\\bf C$ $ = \\frac{1}{N-1}$ $\\bf X_c^T X_c.$ ($N$ is the number of QSOs.)\n", "

\n", " 2. Find the covariance matrix C using the data matrix flux.
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "C = \n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 3. Using numpy.linalg, find eigenvalues and eigenvectors of the covariance matrix. Order the eigenvalues from largest to smallest and then plot them as a function of the number of eigenvalues. (Remember that the eigenvector with the highest eigenvalue is the principle component of the data set.)\n", "In this case, we find that our covariance matrix is rank-17 matrix, so we only select the first 17 highest eigenvalues and corresponding eigenvectors (other eigenvalues are close to zero).
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "np.linalg.matrix_rank(C)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from numpy.linalg import eig\n", "..." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Make plot\n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 4. Plot the first three eigenvectors. These eigenvectors\n", "represent the principal variations of the spectra with respect to that mean spectrum.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The eigenvectors indicate the direction of the principal components, so we can re-orient the data onto the new zes by multiplying the original mean-centered data by the eigenvectors. We call the re-oriented data \"PC scores.\" (Call the PC score matrix $\\bf Z$) Suppose that we have $k$ eigenvectors. Construct the matrix of eigenvectors $\\bf V = [v_1 v_2 ... v_k]$, with $\\bf v_i$ the $i$th highest eigenvector. Then, we can get 18 $\\times\\ k$ PC score matrix by multiplying the 18 $\\times$ 824 data matrix with the 824 $\\times\\ k$ eigenvector matrix:\n", "

\n", "$$ \\bf Z = X_c V $$\n", "

\n", "Then, we can reconstruct the data by mapping it back to 824 dimensions with $\\bf V^T$:\n", "

\n", "$$ \\bf \\hat{X} = \\boldsymbol \\mu + Z V^T $$\n", "where $\\boldsymbol \\mu$ is the vector of mean QSO flux.\n", "

\n", "Now, comparing the original data with the reconstructed data, we can calculate the residuals. Let $\\bf X_{(i)}, \\hat{X}_{(i)}$ denote the rows of $\\bf X, \\hat{X}$ respectively. Remember that the data matrix has the dimension 18 $\\times$ 824, so each row $\\bf X_{(i)}$ corresponding the spectra of one particular QSO. (For example, if you wish to see the QSO spectra in row 7, you can plot $\\bf X_{(7)}$ as a function of wavelength.). Then, we can simply calculate the residual as $\\frac{1}{N} \\sum_{i=1}^N \\bf |\\hat{X}_{(i)} - X_{(i)}|^2$ where $N$ is the total number of QSOs (NOTE: $\\bf |\\hat{X}_{(i)} - X_{(i)}|$ is the magnitude of the difference between two vectors $\\bf \\hat{X}_{(i)}$ and $\\bf X_{(i)}$.)\n", "

\n", " 5. First, start with only mean flux value $\\boldsymbol \\mu$ (in this case $\\bf \\hat{X} = \\boldsymbol \\mu, V = 0$) and calculate the residual. Then, do the reconstruction using the first two principal eigenvectors $\\bf V = [v_1 v_2]$ and calculate the residual. Finally, let $\\bf V = [v_1 v_2 ... v_6]$ (the first six principal eigenvectors) and compute the residual.
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 6. For any two QSO spectra, plot the original and reconstructed spectra using the first six principal eigenvectors.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "scrolled": false }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " 7. Plot the residual as a function of the number of included eigenvectors.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this problem, we only have 18 QSO spectra, so the idea of using PCA may seem silly. We can also use SVD to find eigenvalues and eigenvectors. With SVD, we get $\\bf X_c = USV^T$. Then, the covariance matrix is $\\bf C$ $ = \\frac{1}{N-1}$ $\\bf X_c^T X_c$ $ = \\frac{1}{N-1}$ $\\bf VS^2V^T.$ Then, the eigenvalues are the squared singular values scaled by the factor $\\frac{1}{N-1}$ and the eigenvectors are the columns of $\\bf V$.\n", "

\n", " 8. Find the eigenvalues applying SVD to the mean-centered data matrix $\\bf X_c$.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from scipy.linalg import svd\n", "\n", "...\n", "\n", "# Print Eigenvalues\n", "..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## To Submit\n", "Execute the following cell to submit.\n", "If you make changes, execute the cell again to resubmit the final copy of the notebook, they do not get updated automatically.
\n", "__We recommend that all the above cells should be executed (their output visible) in the notebook at the time of submission.__
\n", "Only the final submission before the deadline will be graded. \n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "_ = ok.submit()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 2 }