{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 9. PageRank with Eigen Decompositions" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "### Two Handy Tricks" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Here are two tools that we'll be using today, which are useful in general." ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "1\\. [Psutil](https://github.com/giampaolo/psutil) is a great way to check on your memory usage. This will be useful here since we are using a larger data set." ] }, { "cell_type": "code", "execution_count": 462, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "import psutil" ] }, { "cell_type": "code", "execution_count": 447, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "process = psutil.Process(os.getpid())\n", "t = process.memory_info()" ] }, { "cell_type": "code", "execution_count": 449, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "(19475513344, 17856520192)" ] }, "execution_count": 449, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.vms, t.rss" ] }, { "cell_type": "code", "execution_count": 450, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "def mem_usage():\n", " process = psutil.Process(os.getpid())\n", " return process.memory_info().rss / psutil.virtual_memory().total" ] }, { "cell_type": "code", "execution_count": 451, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "0.13217061955758594" ] }, "execution_count": 451, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mem_usage()" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "2\\. [TQDM](https://github.com/tqdm/tqdm) gives you progress bars." ] }, { "cell_type": "code", "execution_count": 364, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "from time import sleep" ] }, { "cell_type": "code", "execution_count": 463, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "45\n" ] } ], "source": [ "# Without TQDM\n", "s = 0\n", "for i in range(10):\n", " s += i\n", " sleep(0.2)\n", "print(s)" ] }, { "cell_type": "code", "execution_count": 465, "metadata": { "hidden": true }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 10/10 [00:02<00:00, 4.96it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "45\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# With TQDM\n", "from tqdm import tqdm\n", "\n", "s = 0\n", "for i in tqdm(range(10)):\n", " s += i\n", " sleep(0.2)\n", "print(s)" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Motivation" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "**Review**\n", "- What is SVD?\n", "- What are some applications of SVD?" 
] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "#### Additional SVD Application" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "An interesting use of SVD that I recently came across was as a step in the de-biasing of Word2Vec word embeddings, from [Quantifying and Reducing Stereotypes in Word Embeddings](https://arxiv.org/pdf/1606.06121.pdf)(Bolukbasi, et al).\n", "\n", "Word2Vec is a useful library released by Google that represents words as vectors. The similarity of vectors captures semantic meaning, and analogies can be found, such as *Paris:France :: Tokyo: Japan*. \n", "\n", "\"\"\n", "(source: [Vector Representations of Words](https://www.tensorflow.org/versions/r0.10/tutorials/word2vec/))\n", "\n", "However, these embeddings can implicitly encode bias, such as *father:doctor :: mother:nurse* and *man:computer programmer :: woman:homemaker*.\n", "\n", "One approach for de-biasing the space involves using SVD to reduce the dimensionality ([Bolukbasi paper](https://arxiv.org/pdf/1606.06121.pdf)).\n", "\n", "You can read more about bias in word embeddings:\n", "- [How Vector Space Mathematics Reveals the Hidden Sexism in Language](https://www.technologyreview.com/s/602025/how-vector-space-mathematics-reveals-the-hidden-sexism-in-language/)(MIT Tech Review)\n", "- [ConceptNet: better, less-stereotyped word vectors](https://blog.conceptnet.io/2017/04/24/conceptnet-numberbatch-17-04-better-less-stereotyped-word-vectors/)\n", "- [Semantics derived automatically from language corpora necessarily contain human biases](https://www.princeton.edu/~aylinc/papers/caliskan-islam_semantics.pdf) (excellent and very interesting paper!)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "#### Ways to think about SVD" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "- Data compression\n", "- SVD trades a large number of features for a smaller set of better features\n", "- All matrices are diagonal (if you use change of bases on the domain and range)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "### Perspectives on SVD" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "We usually talk about SVD in terms of matrices, $$A = U \\Sigma V^T$$ but we can also think about it in terms of vectors. SVD gives us sets of orthonormal vectors ${v_j}$ and ${u_j}$ such that $$ A v_j = \\sigma_j u_j $$\n", "\n", "$\\sigma_j$ are scalars, called singular values\n", "\n", "**Q**: Does this remind you of anything?" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "#### Answer" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "**Relationship between SVD and Eigen Decomposition**: the left-singular vectors of A are the eigenvectors of $AA^T$. The right-singular vectors of A are the eigenvectors of $A^T A$. The non-zero singular values of A are the square roots of the eigenvalues of $A^T A$ (and $A A^T$).\n", "\n", "SVD is a generalization of eigen decomposition. Not all matrices have eigen values, but ALL matrices have singular values.\n", "\n", "Let's forget SVD for a bit and talk about how to find the eigenvalues of a symmetric positive definite matrix..." 
] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "### Further resources on SVD" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "- [SVD: image compression and least squares](http://andrew.gibiansky.com/blog/mathematics/cool-linear-algebra-singular-value-decomposition/)\n", "\n", "- [Image Compression with SVD](http://nbviewer.jupyter.org/gist/frankcleary/4d2bd178708503b556b0)\n", "\n", "- [The Extraordinary SVD](https://sites.math.washington.edu/~morrow/498_13/svd_applied.pdf)\n", "\n", "- [A Singularly Valuable Decomposition: The SVD of a Matrix](https://sites.math.washington.edu/~morrow/498_13/svd.pdf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Today: Eigen Decomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The best classical methods for computing the SVD are variants on methods for computing eigenvalues. In addition to their links to SVD, Eigen decompositions are useful on their own as well. Here are a few practical applications of eigen decomposition:\n", "- [rapid matrix powers](http://www.onmyphd.com/?p=eigen.decomposition#h2_why)\n", "- [nth Fibonacci number](http://mathproofs.blogspot.com/2005/04/nth-term-of-fibonacci-sequence.html)\n", "- Behavior of ODEs\n", "- Markov Chains (health care economics, Page Rank)\n", "- [Linear Discriminant Analysis on Iris dataset](http://sebastianraschka.com/Articles/2014_python_lda.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check out the 3 Blue 1 Brown videos on [Change of basis](https://www.youtube.com/watch?v=P2LTAUO1TdA) and [Eigenvalues and eigenvectors](https://www.youtube.com/watch?v=PFDu9oVAE-g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"Eigenvalues are a way to see into the heart of a matrix... All the difficulties of matrices are swept away\" -Strang " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Vocab**: A **Hermitian** matrix is one that is equal to it's own conjugate transpose. In the case of real-valued matrices (which is all we are considering in this course), **Hermitian** means the same as **Symmetric**.\n", "\n", "**Relevant Theorems:**\n", "- If A is symmetric, then eigenvalues of A are real and $A = Q \\Lambda Q^T$\n", "- If A is triangular, then its eigenvalues are equal to its diagonal entries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DBpedia Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start with the **Power Method**, which finds one eigenvector. *What good is just one eigenvector?* you may be wondering. This is actually the basis for PageRank (read [The $25,000,000,000 Eigenvector: the Linear Algebra Behind Google](http://www.rose-hulman.edu/~bryan/googleFinalVersionFixed.pdf) for more info)\n", "\n", "Instead of trying to rank the importance of all websites on the internet, we are going to use a dataset of Wikipedia links from [DBpedia](http://wiki.dbpedia.org/). DBpedia provides structured Wikipedia data available in 125 languages. 
\n", "\n", "\"*The full DBpedia data set features 38 million labels and abstracts in 125 different languages, 25.2 million links to images and 29.8 million links to external web pages; 80.9 million links to Wikipedia categories, and 41.2 million links to [YAGO](http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/) categories*\" --[about DBpedia](http://wiki.dbpedia.org/about)\n", "\n", "Today's lesson is inspired by this [SciKit Learn Example](http://scikit-learn.org/stable/auto_examples/applications/wikipedia_principal_eigenvector.html#sphx-glr-auto-examples-applications-wikipedia-principal-eigenvector-py)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports" ] }, { "cell_type": "code", "execution_count": 361, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import os, numpy as np, pickle\n", "from bz2 import BZ2File\n", "from datetime import datetime\n", "from pprint import pprint\n", "from time import time\n", "from tqdm import tqdm_notebook\n", "from scipy import sparse\n", "\n", "from sklearn.decomposition import randomized_svd\n", "from sklearn.externals.joblib import Memory\n", "from urllib.request import urlopen" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Downloading the data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data we have are:\n", "- **redirects**: URLs that redirect to other URLs\n", "- **links**: which pages link to which other pages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: this takes a while" ] }, { "cell_type": "code", "execution_count": 367, "metadata": { "collapsed": true }, "outputs": [], "source": [ "PATH = 'data/dbpedia/'\n", "URL_BASE = 'http://downloads.dbpedia.org/3.5.1/en/'\n", "filenames = [\"redirects_en.nt.bz2\", \"page_links_en.nt.bz2\"]\n", "\n", "for filename in filenames:\n", " if not os.path.exists(PATH+filename):\n", " print(\"Downloading '%s', please wait...\" % filename)\n", " open(PATH+filename, 'wb').write(urlopen(URL_BASE+filename).read())" ] }, { "cell_type": "code", "execution_count": 368, "metadata": { "collapsed": true }, "outputs": [], "source": [ "redirects_filename = PATH+filenames[0]\n", "page_links_filename = PATH+filenames[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Graph Adjacency Matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will construct a graph **adjacency matrix**, of which pages point to which.\n", "\n", "\"\"\n", "(source: [PageRank and HyperLink Induced Topic Search](https://www.slideshare.net/priyabrata232/page-rank-and-hyperlink))\n", "\n", "\"\"\n", "(source: [PageRank and HyperLink Induced Topic Search](https://www.slideshare.net/priyabrata232/page-rank-and-hyperlink))\n", "\n", "The power $A^2$ will give you how many ways there are to get from one page to another in 2 steps. You can see a more detailed example, as applied to airline travel, [worked out in these notes](http://www.utdallas.edu/~jwz120030/Teaching/PastCoursesUMBC/M221HS06/ProjectFiles/Adjacency.pdf)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to keep track of which pages point to which pages. We will store this in a square matrix, with a $1$ in position $(r, c)$ indicating that the topic in row $r$ points to the topic in column $c$\n", "\n", "You can read [more about graphs here](http://www.geeksforgeeks.org/graph-and-its-representations/)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One line of the file looks like:\n", "- ` .`\n", "\n", "In the below slice, the plus 1, -1 are to remove the <>" ] }, { "cell_type": "code", "execution_count": 369, "metadata": { "collapsed": true }, "outputs": [], "source": [ "DBPEDIA_RESOURCE_PREFIX_LEN = len(\"http://dbpedia.org/resource/\")\n", "SLICE = slice(DBPEDIA_RESOURCE_PREFIX_LEN + 1, -1)" ] }, { "cell_type": "code", "execution_count": 370, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_lines(filename): return (line.split() for line in BZ2File(filename))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Loop through redirections and create dictionary of source to final destination" ] }, { "cell_type": "code", "execution_count": 371, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_redirect(targ, redirects):\n", " seen = set()\n", " while True:\n", " transitive_targ = targ\n", " targ = redirects.get(targ)\n", " if targ is None or targ in seen: break\n", " seen.add(targ)\n", " return transitive_targ" ] }, { "cell_type": "code", "execution_count": 372, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_redirects(redirects_filename):\n", " redirects={}\n", " lines = get_lines(redirects_filename)\n", " return {src[SLICE]:get_redirect(targ[SLICE], redirects) \n", " for src,_,targ,_ in tqdm_notebook(lines, leave=False)}" ] }, { "cell_type": "code", "execution_count": 373, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4e6fbe6a61b84503ba29df3be06a9c77" } }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\r" ] } ], "source": [ "redirects = get_redirects(redirects_filename)" ] }, { "cell_type": "code", "execution_count": 374, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "13.766303744" ] }, "execution_count": 374, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mem_usage()" ] }, { "cell_type": "code", "execution_count": 375, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def add_item(lst, redirects, index_map, item):\n", " k = item[SLICE]\n", " lst.append(index_map.setdefault(redirects.get(k, k), len(index_map)))" ] }, { "cell_type": "code", "execution_count": 376, "metadata": { "collapsed": true }, "outputs": [], "source": [ "limit=119077682 #5000000" ] }, { "cell_type": "code", "execution_count": 377, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "912599a04ead440ba016131b73398e1b" } }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# Computing the integer index map\n", "index_map = dict() # links->IDs\n", "lines = get_lines(page_links_filename)\n", "source, destination, data = [],[],[]\n", "for l, split in tqdm_notebook(enumerate(lines), total=limit):\n", " if l >= limit: break\n", " add_item(source, redirects, index_map, split[0])\n", " add_item(destination, redirects, index_map, split[2])\n", " data.append(1)" ] }, { "cell_type": "code", "execution_count": 379, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "119077682" ] }, "execution_count": 379, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n=len(data); n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Looking at our data" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "The below steps are just to illustrate what info is in our data and how it is structured. They are not efficient." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see what type of items are in index_map:" ] }, { "cell_type": "code", "execution_count": 425, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(b'1940_Cincinnati_Reds_Team_Issue', 9991173)" ] }, "execution_count": 425, "metadata": {}, "output_type": "execute_result" } ], "source": [ "index_map.popitem()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at one item in our index map:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1940_Cincinnati_Reds_Team_Issue has index $9991173$. This only shows up once in the destination list:" ] }, { "cell_type": "code", "execution_count": 427, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[119077649]" ] }, "execution_count": 427, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[i for i,x in enumerate(source) if x == 9991173]" ] }, { "cell_type": "code", "execution_count": 428, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(9991173, 9991050)" ] }, "execution_count": 428, "metadata": {}, "output_type": "execute_result" } ], "source": [ "source[119077649], destination[119077649]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we want to check which page is the source (has index $9991050$). Note: usually you should not access a dictionary by searching for its values. This is inefficient and not how dictionaries are intended to be used." ] }, { "cell_type": "code", "execution_count": 429, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "b'W711-2'\n" ] } ], "source": [ "for page_name, index in index_map.items():\n", " if index == 9991050:\n", " print(page_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see on Wikipedia that the Cincinati Red Teams Issue has [redirected to W711-2](https://en.wikipedia.org/wiki/W711-2):\n", "\n", "\"\"" ] }, { "cell_type": "code", "execution_count": 431, "metadata": { "collapsed": true }, "outputs": [], "source": [ "test_inds = [i for i,x in enumerate(source) if x == 9991050]" ] }, { "cell_type": "code", "execution_count": 466, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "47" ] }, "execution_count": 466, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(test_inds)" ] }, { "cell_type": "code", "execution_count": 433, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[119076756, 119076757, 119076758, 119076759, 119076760]" ] }, "execution_count": 433, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_inds[:5]" ] }, { "cell_type": "code", "execution_count": 434, "metadata": { "collapsed": true }, "outputs": [], "source": [ "test_dests = [destination[i] for i in test_inds]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we want to check which page is the source (has index 9991174):" ] }, { "cell_type": "code", "execution_count": 435, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "b'Baseball'\n", "b'Ohio'\n", "b'Cincinnati'\n", "b'Flash_Thompson'\n", "b'1940'\n", "b'1938'\n", "b'Lonny_Frey'\n", "b'Cincinnati_Reds'\n", "b'Ernie_Lombardi'\n", "b'Baseball_card'\n", "b'James_Wilson'\n", "b'Trading_card'\n", "b'Detroit_Tigers'\n", "b'Baseball_culture'\n", "b'Frank_McCormick'\n", "b'Bucky_Walters'\n", 
"b'1940_World_Series'\n", "b'Billy_Werber'\n", "b'Ival_Goodman'\n", "b'Harry_Craft'\n", "b'Paul_Derringer'\n", "b'Johnny_Vander_Meer'\n", "b'Cigarette_card'\n", "b'Eddie_Joost'\n", "b'Myron_McCormick'\n", "b'Beckett_Media'\n", "b'Icarus_affair'\n", "b'Ephemera'\n", "b'Sports_card'\n", "b'James_Turner'\n", "b'Jimmy_Ripple'\n", "b'Lewis_Riggs'\n", "b'The_American_Card_Catalog'\n", "b'Rookie_card'\n", "b'Willard_Hershberger'\n", "b'Elmer_Riddle'\n", "b'Joseph_Beggs'\n", "b'Witt_Guise'\n", "b'Milburn_Shoffner'\n" ] } ], "source": [ "for page_name, index in index_map.items():\n", " if index in test_dests:\n", " print(page_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the items in the list appear in the wikipedia page:\n", "\n", "\"\"\n", "(Source: [Wikipedia](https://en.wikipedia.org/wiki/W711-2))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create Matrix" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we create a sparse matrix using Scipy's COO format, and that convert it to CSR.\n", "\n", "**Questions**: What are COO and CSR? Why would we create it with COO and then convert it right away?" ] }, { "cell_type": "code", "execution_count": 341, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X = sparse.coo_matrix((data, (destination,source)), shape=(n,n), dtype=np.float32)\n", "X = X.tocsr()" ] }, { "cell_type": "code", "execution_count": 347, "metadata": { "collapsed": true }, "outputs": [], "source": [ "del(data,destination, source)" ] }, { "cell_type": "code", "execution_count": 342, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<119077682x119077682 sparse matrix of type ''\n", "\twith 93985520 stored elements in Compressed Sparse Row format>" ] }, "execution_count": 342, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "code", "execution_count": 343, "metadata": { "collapsed": true }, "outputs": [], "source": [ "names = {i: name for name, i in index_map.items()}" ] }, { "cell_type": "code", "execution_count": 349, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12.903882752" ] }, "execution_count": 349, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mem_usage()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Save matrix so we don't have to recompute" ] }, { "cell_type": "code", "execution_count": 350, "metadata": { "collapsed": true }, "outputs": [], "source": [ "pickle.dump(X, open(PATH+'X.pkl', 'wb'))\n", "pickle.dump(index_map, open(PATH+'index_map.pkl', 'wb'))" ] }, { "cell_type": "code", "execution_count": 316, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X = pickle.load(open(PATH+'X.pkl', 'rb'))\n", "index_map = pickle.load(open(PATH+'index_map.pkl', 'rb'))" ] }, { "cell_type": "code", "execution_count": 317, "metadata": { "collapsed": true }, "outputs": [], "source": [ "names = {i: name for name, i in index_map.items()}" ] }, { "cell_type": "code", "execution_count": 351, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<119077682x119077682 sparse matrix of type ''\n", "\twith 93985520 stored elements in Compressed Sparse Row format>" ] }, "execution_count": 351, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Power method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Motivation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An $n \\times n$ matrix 
"An $n \times n$ matrix $A$ is **diagonalizable** if it has $n$ linearly independent eigenvectors $v_1, \ldots, v_n$.\n", "\n", "Then any $w$ can be expressed $w = \sum_{j=1}^n c_j v_j $, for some scalars $c_j$.\n", "\n", "**Exercise:** Show that $$ A^k w = \sum_{j=1}^n c_j \lambda_j^k v_j$$\n", "\n", "**Question**: How will this behave for large $k$?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is inspiration for the **power method**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Code" ] }, { "cell_type": "code", "execution_count": 281, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def show_ex(v):\n", " print(', '.join(names[i].decode() for i in np.abs(v.squeeze()).argsort()[-1:-10:-1]))" ] }, { "cell_type": "code", "execution_count": 453, "metadata": { "collapsed": true }, "outputs": [], "source": [ "?np.squeeze" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How to normalize a sparse matrix:" ] }, { "cell_type": "code", "execution_count": 336, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4, 6], dtype=int64)" ] }, "execution_count": 336, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S = sparse.csr_matrix(np.array([[1,2],[3,4]]))\n", "Sr = S.sum(axis=0).A1; Sr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[numpy.matrix.A1](https://docs.scipy.org/doc/numpy/reference/generated/numpy.matrix.A1.html#numpy.matrix.A1)" ] }, { "cell_type": "code", "execution_count": 337, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 0, 1], dtype=int32)" ] }, "execution_count": 337, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.indices" ] }, { "cell_type": "code", "execution_count": 338, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4], dtype=int64)" ] }, "execution_count": 338, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.data" ] }, { "cell_type": "code", "execution_count": 339, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0.25 , 0.33333, 0.75 , 0.66667])" ] }, "execution_count": 339, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.data / np.take(Sr, S.indices)" ] }, { "cell_type": "code", "execution_count": 328, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def power_method(A, max_iter=100):\n", " n = A.shape[1]\n", " A = A.copy() # scipy sparse matrices have their own copy method; don't modify the caller's matrix\n", " A.data /= np.take(A.sum(axis=0).A1, A.indices) # normalize each column to sum to 1\n", "\n", " scores = np.ones(n, dtype=np.float32) * np.sqrt(A.sum()/(n*n)) # initial guess\n", " for i in range(max_iter):\n", " scores = A @ scores\n", " nrm = np.linalg.norm(scores)\n", " scores /= nrm\n", " print(nrm)\n", "\n", " return scores" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Question**: Why normalize the scores on each iteration?"
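] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To build intuition (and hint at the answer), here is the bare iteration on a small symmetric matrix; a sketch, separate from the `power_method` above:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Bare power iteration on a small symmetric matrix\n", "B = np.random.rand(4,4); B = B @ B.T # symmetric, so its eigenvalues are real\n", "v = np.random.rand(4)\n", "for i in range(50):\n", " v = B @ v\n", " v /= np.linalg.norm(v) # without this, the entries would blow up\n", "# the Rayleigh quotient now estimates the largest eigenvalue\n", "v @ B @ v, np.linalg.eigvalsh(B)[-1]"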
] }, { "cell_type": "code", "execution_count": 345, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.621209\n", "0.856139\n", "1.02793\n", "1.02029\n", "1.02645\n", "1.02504\n", "1.02364\n", "1.02126\n", "1.019\n", "1.01679\n" ] } ], "source": [ "scores = power_method(X, max_iter=10)" ] }, { "cell_type": "code", "execution_count": 346, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Living_people, Year_of_birth_missing_%28living_people%29, United_States, United_Kingdom, Race_and_ethnicity_in_the_United_States_Census, France, Year_of_birth_missing, World_War_II, Germany\n" ] } ], "source": [ "show_ex(scores)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 327, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "11.692331008" ] }, "execution_count": 327, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mem_usage()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many advanced eigenvalue algorithms that are used in practice are variations on the power method.\n", "\n", "In Lesson 3: Background Removal, we used Facebook's [fast randomized pca/svd library, fbpca](https://github.com/facebook/fbpca). Check out [the source code](https://github.com/facebook/fbpca/blob/master/fbpca.py#L1549) for the pca method we used. It uses the power method!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Further Study**\n", "\n", "- Check out [Google Page Rank, Power Iteration and the Second EigenValue of the Google Matrix](http://rstudio-pubs-static.s3.amazonaws.com/239261_8a607707294341c4b7e26acf728c28bd.html) for animations of the distribution as it converges.\n", "\n", "- The convergence rate of the power method is the ratio of the largest eigenvalue to the 2nd largest eigenvalue. It can be speeded up by adding *shifts*. To find eigenvalues other than the largest, a method called *deflation* can be used. See Chapter 12.1 of Greenbaum & Chartier for more details.\n", "\n", "- *Krylov Subspaces*: In the Power Iteration, notice that we multiply by our matrix A each time, effectively calculating $$Ab,\\,A^2b,\\,A^3b,\\,A^4b\\, \\ldots$$\n", " \n", " The matrix with those vectors as columns is called the *Krylov matrix*, and the space spanned by those vectors is the *Krylov subspace*. Keep this in mind for later." 
] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "### Compare to SVD" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 8min 40s, sys: 1min 20s, total: 10min 1s\n", "Wall time: 5min 56s\n" ] } ], "source": [ "%time U, s, V = randomized_svd(X, 3, n_iter=3)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "28.353073152" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mem_usage()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "List_of_World_War_II_air_aces, List_of_animated_feature-length_films, List_of_animated_feature_films:2000s, International_Swimming_Hall_of_Fame, List_of_museum_ships, List_of_linguists, List_of_television_programs_by_episode_count, List_of_game_show_hosts, List_of_astronomers\n" ] } ], "source": [ "# Top wikipedia pages according to principal singular vectors\n", "show_ex(U.T[0])" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "List_of_former_United_States_senators, List_of_United_States_Representatives_from_New_York, List_of_United_States_Representatives_from_Pennsylvania, Members_of_the_110th_United_States_Congress, List_of_Justices_of_the_Ohio_Supreme_Court, Congressional_endorsements_for_the_2008_United_States_presidential_election, List_of_United_States_Representatives_in_the_110th_Congress_by_seniority, List_of_Members_of_the_United_States_House_of_Representatives_in_the_103rd_Congress_by_seniority, List_of_United_States_Representatives_in_the_111th_Congress_by_seniority\n" ] } ], "source": [ "show_ex(U.T[1])" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "hidden": true, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "United_States, Japan, United_Kingdom, Germany, Race_and_ethnicity_in_the_United_States_Census, France, United_States_Army_Air_Forces, Australia, Canada\n" ] } ], "source": [ "show_ex(V[0])" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "hidden": true, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Democratic_Party_%28United_States%29, Republican_Party_%28United_States%29, Democratic-Republican_Party, United_States, Whig_Party_%28United_States%29, Federalist_Party, National_Republican_Party, Catholic_Church, Jacksonian_democracy\n" ] } ], "source": [ "show_ex(V[1])" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "**Exercise:** Normalize the data in various ways. Don't overwrite the adjacency matrix, but instead create a new one. See how your results differ." 
] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "**Eigen Decomposition vs SVD:**\n", "- SVD involves 2 bases, eigen decomposition involves 1 basis\n", "- SVD bases are orthonormal, eigen basis generally not orthogonal\n", "- All matrices have an SVD, not all matrices (not even all square) have an eigen decomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## QR Algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We used the power method to find the eigenvector corresponding to the largest eigenvalue of our matrix of Wikipedia links. This eigenvector gave us the relative importance of each Wikipedia page (like a simplified PageRank).\n", "\n", "Next, let's look at a method for finding all eigenvalues of a symmetric, positive definite matrix. This method includes 2 fundamental algorithms in numerical linear algebra, and is a basis for many more complex methods.\n", "\n", "[The Second Eigenvalue of the Google Matrix](https://nlp.stanford.edu/pubs/secondeigenvalue.pdf): has \"implications for the convergence rate of the standard PageRank algorithm as the web scales, for the stability of PageRank to perturbations to the link structure of the web, for the detection of Google spammers, and for the design of algorithms to speed up PageRank\".\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Avoiding Confusion: QR Algorithm vs QR Decomposition" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The **QR algorithm** uses something called the **QR decomposition**. Both are important, so don't get them confused. The **QR decomposition** decomposes a matrix $A = QR$ into a set of orthonormal columns $Q$ and a triangular matrix $R$. We will look at several ways to calculate the QR decomposition in a future lesson. For now, just know that it is giving us an orthogonal matrix and a triangular matrix." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linear Algebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two matrices $A$ and $B$ are **similar** if there exists a non-singular matrix $X$ such that $$B = X^{-1}AX$$\n", "\n", "Watch this: [Change of Basis](https://www.youtube.com/watch?v=P2LTAUO1TdA&index=13&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)\n", "\n", "**Theorem**: If $X$ is non-singular, then $A$ and $X^{-1}AX$ have the same eigenvalues." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### More Linear Algebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A **Schur factorization** of a matrix $A$ is a factorization:\n", "$$ A = Q T Q^*$$\n", "where $Q$ is unitary and $T$ is upper-triangular.\n", "\n", "**Question**: What can you say about the eigenvalues of $A$?\n", "\n", "**Theorem:** Every square matrix has a Schur factorization." 
] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "#### Other resources" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Review: [Linear combinations, span, and basis vectors](https://www.youtube.com/watch?v=k7RM-ot2NWY&index=3&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab)\n", "\n", "See Lecture 24 for proofs of above theorems (and more!)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most basic version of the QR algorithm:\n", "\n", " for k=1,2,...\n", " Q, R = A\n", " A = R @ Q\n", " \n", "Under suitable assumptions, this algorithm converges to the Schur form of A!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Why it works" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Written again, only with subscripts:\n", "\n", "$A_0 = A$\n", "\n", "for $k=1,2,\\ldots$\n", "\n", " $\\quad Q_k$, $R_k$ = $A_{k-1}$\n", " \n", " $\\quad A_k$ = $R_k Q_k$\n", " \n", "We can think of this as constructing sequences of $A_k$, $Q_k$, and $R_k$.\n", "\n", "$$ A_k = Q_k \\, R_k $$\n", "\n", "$$ Q_k^{-1} \\, A_k = R_k$$\n", "\n", "Thus, \n", "\n", "$$ R_k Q_k = Q_k^{-1} \\, A_k \\, Q_k $$\n", "\n", "$$A_k = Q_k^{-1} \\ldots Q_2^{-1} Q_1^{-1} A Q_1 Q_2 \\dots Q_k$$\n", "\n", "Trefethen proves the following on page 216-217:\n", "\n", "$$A^k = Q_1 Q_2 \\dots Q_k R_k R_{k-1}\\dots R_1$$\n", "\n", "**Key**: The QR algorithm constructs orthonormal bases for successive powers $A^k$. And remember the close relationship between powers of A and the eigen decomposition.\n", "\n", "To learn more, read up on *Rayleigh quotients*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Pure QR" ] }, { "cell_type": "code", "execution_count": 467, "metadata": { "collapsed": true }, "outputs": [], "source": [ "n = 6\n", "A = np.random.rand(n,n)\n", "AT = A @ A.T" ] }, { "cell_type": "code", "execution_count": 477, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def pure_qr(A, max_iter=1000):\n", " Ak = np.copy(A)\n", " n = A.shape[0]\n", " QQ = np.eye(n)\n", " for k in range(max_iter):\n", " Q, R = np.linalg.qr(Ak)\n", " Ak = R @ Q\n", " QQ = QQ @ Q\n", " if k % 100 == 0:\n", " print(Ak)\n", " print(\"\\n\")\n", " return Ak, QQ" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Pure QR" ] }, { "cell_type": "code", "execution_count": 479, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.62878, 0.23258, 0.63909, 0.90223, 0.94772, 0.80247],\n", " [ 0.64361, 0.52469, 0.92231, 0.32869, 0.58532, 0.75104],\n", " [ 0.44363, 0.00427, 0.62418, 0.47093, 0.6762 , 0.28078],\n", " [ 0.14629, 0.76324, 0.23316, 0.55208, 0.21712, 0.20255],\n", " [ 0.56122, 0.08282, 0.12788, 0.10419, 0.40358, 0.69272],\n", " [ 0.41172, 0.06411, 0.92162, 0.53139, 0.27901, 0.61592]])" ] }, "execution_count": 479, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A" ] }, { "cell_type": "code", "execution_count": 478, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 2.65646 0.21386 0.16765 0.75256 0.61251 0.93518]\n", " [ 0.52744 0.47579 0.17052 -0.41086 -0.21182 -0.01876]\n", " [ 0.29923 0.06964 0.11173 0.1879 -0.29101 0.60032]\n", " [ 0.2274 0.46162 -0.26654 0.08899 0.24631 0.26447]\n", " [-0.06093 0.02892 0.34162 0.07533 0.02393 -0.05456]\n", " [-0.06025 0.02694 -0.11675 -0.00927 -0.11939 -0.00767]]\n", "\n", "\n", "[[ 2.78023 0.52642 0.0395 
-0.11135 0.1569 1.15184]\n", " [ 0. 0.18624 -0.297 -0.07256 -0.04537 0.27907]\n", " [ 0. 0.69328 0.34105 -0.12198 0.11029 0.0992 ]\n", " [-0. -0.0494 -0.02057 0.09461 0.59466 0.09115]\n", " [-0. 0.00008 -0.02659 -0.40372 0.06542 0.38612]\n", " [-0. 0. 0. 0. -0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 -0.12185 -0.51401 0.17625 -0.07467 1.15184]\n", " [ 0. 0.2117 -0.70351 0.09974 -0.02986 0.00172]\n", " [ 0. 0.28284 0.32635 -0.0847 -0.08488 -0.29104]\n", " [-0. -0.00068 -0.00088 -0.01282 0.54836 0.13447]\n", " [-0. 0. -0.00102 -0.45718 0.16208 -0.37726]\n", " [-0. 0. 0. 0. -0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 -0.33997 0.4049 0.17949 0.06291 1.15184]\n", " [ 0. 0.48719 -0.48788 -0.05831 -0.12286 -0.23486]\n", " [ 0. 0.49874 0.05104 -0.07191 0.03638 0.17261]\n", " [ 0. 0.00002 0. 0.02128 0.41958 0.3531 ]\n", " [ 0. -0. 0.00002 -0.58571 0.1278 -0.18838]\n", " [ 0. 0. 0. -0. 0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 0.35761 0.38941 0.07462 0.17493 1.15184]\n", " [ 0. 0.0597 -0.55441 -0.0681 -0.04456 0.14084]\n", " [ 0. 0.43221 0.47853 -0.06068 0.12117 0.25519]\n", " [-0. -0. -0. 0.16206 0.45708 0.37724]\n", " [-0. 0. -0. -0.54821 -0.01298 0.1336 ]\n", " [ 0. 0. 0. 0. -0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 0.06853 -0.52424 0.05224 -0.18287 1.15184]\n", " [ 0. 0.36572 -0.6889 0.07864 -0.09263 0.105 ]\n", " [ 0. 0.29772 0.17251 -0.09836 -0.02347 -0.27191]\n", " [ 0. -0. -0. 0.13719 0.57888 -0.20884]\n", " [ 0. 0. -0. -0.42642 0.01189 -0.34139]\n", " [ 0. 0. 0. 0. -0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 -0.52782 0.03045 -0.14389 -0.12436 1.15184]\n", " [ 0. 0.25091 -0.27593 0.08994 -0.06581 -0.28672]\n", " [ 0. 0.7107 0.28732 0.10154 0.04751 -0.05245]\n", " [ 0. -0. -0. 0.0297 -0.59054 -0.01234]\n", " [ 0. -0. 0. 0.41475 0.11938 -0.40001]\n", " [ 0. 0. 0. 0. 0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 0.1533 0.50599 0.18983 0.01158 1.15184]\n", " [ 0. 0.18627 -0.69511 -0.0991 -0.00189 0.01621]\n", " [ 0. 0.29151 0.35196 0.05638 -0.10949 0.29102]\n", " [ 0. -0. -0. -0.02207 -0.48261 0.25246]\n", " [ 0. -0. 0. 0.52268 0.17116 0.31053]\n", " [ 0. 0. 0. 0. 0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 0.29683 -0.43751 -0.13852 0.13032 1.15184]\n", " [ 0. 0.48375 -0.53231 -0.01164 0.13482 0.216 ]\n", " [ 0. 0.45431 0.05448 -0.07972 -0.01795 -0.19571]\n", " [ 0. -0. 0. 0.10042 -0.40743 -0.39915]\n", " [ 0. -0. 0. 0.59786 0.04867 -0.02893]\n", " [ 0. 0. 0. 0. 0. -0.11832]]\n", "\n", "\n", "[[ 2.78023 -0.39373 -0.35284 0.00838 -0.19 1.15184]\n", " [ 0. 0.05184 -0.51278 0.05752 -0.0564 -0.16496]\n", " [ 0. 0.47384 0.48639 0.09426 0.09806 -0.24031]\n", " [ 0. -0. -0. 0.17043 -0.52593 0.30622]\n", " [ 0. -0. 0. 0.47936 -0.02134 -0.25766]\n", " [ 0. 0. 0. 0. 0. 
-0.11832]]\n", "\n", "\n" ] } ], "source": [ "Ak, Q = pure_qr(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's compare to the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 469, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 2.78023+0.j , -0.11832+0.j , 0.26911+0.44246j,\n", " 0.26911-0.44246j, 0.07454+0.49287j, 0.07454-0.49287j])" ] }, "execution_count": 469, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.eigvals(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that Q is orthogonal:" ] }, { "cell_type": "code", "execution_count": 476, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(True, True)" ] }, "execution_count": 476, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.allclose(np.eye(n), Q @ Q.T), np.allclose(np.eye(n), Q.T @ Q)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is really really slow." ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "#### Practical QR (QR with shifts)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "**Idea**: Instead of factoring $A_k$ as $Q_k R_k$, \n", "\n", "1. Get the QR factorization $$A_k - s_k I = Q_k R_k$$\n", "2. Set $$A_{k+1} = R_k Q_k + s_k I$$\n", "\n", "Choose $s_k$ to approximate an eigenvalue of $A$. We'll use $s_k = A_k(m,m)$. \n", "\n", "The idea of adding shifts to speed up convergence shows up in many algorithms in numerical linear algebra (including the power method, inverse iteration, and Rayleigh quotient iteration). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Homework: Add shifts to the QR algorithm" ] }, { "cell_type": "code", "execution_count": 460, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#Exercise: Add shifts to the QR algorithm\n", "#Exercise: def practical_qr(A, iters=10):\n", "#Exercise: return Ak, Q\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Practical QR" ] }, { "cell_type": "code", "execution_count": 204, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 5.16264 2.53603 0.31923 0.35315 0.97569 0.43615]\n", "[ 7.99381 0.05922 0.34478 0.29482 0.79026 0.29999]\n", "[ 8.00491 0.04358 0.89735 0.26386 0.26182 0.31135]\n", "[ 8.00493 0.13648 0.91881 0.14839 0.24313 0.33115]\n", "[ 8.00493 0.43377 0.62809 0.13429 0.24592 0.33589]\n", "[ 8.00493 0.81058 0.25128 0.13297 0.24722 0.3359 ]\n", "[ 8.00493 0.98945 0.07221 0.13292 0.24747 0.3359 ]\n", "[ 8.00493 1.0366 0.02497 0.13296 0.24751 0.3359 ]\n", "[ 8.00493 1.04688 0.01465 0.13299 0.24752 0.3359 ]\n", "[ 8.00493 1.04902 0.0125 0.13301 0.24753 0.3359 ]\n" ] } ], "source": [ "Ak, Q = practical_qr(A, 10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that Q is orthogonal:" ] }, { "cell_type": "code", "execution_count": 201, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(True, True)" ] }, "execution_count": 201, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.allclose(np.eye(n), Q @ Q.T), np.allclose(np.eye(n), Q.T @ Q)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's compare to the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 196, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 2.68500+0.j , 0.19274+0.41647j, 0.19274-0.41647j,\n", " -0.35988+0.43753j, -0.35988-0.43753j, -0.18346+0.j ])" ] }, "execution_count": 196, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "np.linalg.eigvals(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Problem**: This is better than the unshifted version (which wasn't even guaranteed to converge), but is still really slow! In fact, it is $\\mathcal{O}(n^4)$, which is awful." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the case of symmetric matrices, it's $\\mathcal{O}(n^3)$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, if you start with a **Hessenberg matrix** (zeros below the first subdiagonal), it's faster: $\\mathcal{O}(n^3)$, and $\\mathcal{O}(n^2)$ if symmetric." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Two-Phase Approach" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In practice, a two phase approach is used to find eigenvalues:\n", "\n", "1. Reduce the matrix to *Hessenberg* form (zeros below the first subdiagonal)\n", "2. Iterative process that causes Hessenberg to converge to a *triangular* matrix. The eigenvalues of a triangular matrix are the values on the diagonal, so we are finished!\n", "\n", "\"2\n", "(source: Trefethen, Lecture 25)\n", "\n", "In the case of a Hermitian matrix, this approach is even faster, since the intermediate step is also Hermitian (and a Hermitian Hessenberg is *tridiagonal*).\n", "\n", "\"2\n", "(source: Trefethen, Lecture 25)\n", "\n", "Phase 1 reaches an exact solution in a finite number of steps, whereas Phase 2 theoretically never reaches the exact solution.\n", "\n", "We've already done step 2: the QR algorithm. Remember that it would be possible to just use the QR algorithm, but ridiculously slow." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Arnoldi Iteration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the Arnoldi iteration for phase 1 (and the QR algorithm for phase 2)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Initializations" ] }, { "cell_type": "code", "execution_count": 459, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "n = 5\n", "A0 = np.random.rand(n,n) #.astype(np.float64)\n", "A = A0 @ A0.T\n", "\n", "np.set_printoptions(precision=5, suppress=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linear Algebra Review: Projections" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When vector $\\mathbf{b}$ is projected onto a line $\\mathbf{a}$, its projection $\\mathbf{p}$ is the part of $\\mathbf{b}$ along that line $\\mathbf{a}$.\n", "\n", "Let's look at interactive graphic (3.4) for [section 3.2.2: Projections](http://immersivemath.com/ila/ch03_dotproduct/ch03.html) of the [Immersive Linear Algebra online book](http://immersivemath.com/ila/index.html).\n", "\n", "\"projection\"\n", "(source: [Immersive Math](http://immersivemath.com/ila/ch03_dotproduct/ch03.html))\n", "\n", "And here is what it looks like to project a vector onto a plane:\n", "\n", "\"projection\"\n", "(source: [The Linear Algebra View of Least-Squares Regression](https://medium.com/@andrew.chamberlain/the-linear-algebra-view-of-least-squares-regression-f67044b7f39b))\n", "\n", "When vector $\\mathbf{b}$ is projected onto a line $\\mathbf{a}$, its projection $\\mathbf{p}$ is the part of $\\mathbf{b}$ along that line $\\mathbf{a}$. So $\\mathbf{p}$ is some multiple of $\\mathbf{a}$. Let $\\mathbf{p} = \\hat{x}\\mathbf{a}$ where $\\hat{x}$ is a scalar." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Orthogonality" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The key to projection is orthogonality:** The line *from* $\\mathbf{b}$ to $\\mathbf{p}$ (which can be written $\\mathbf{b} - \\hat{x}\\mathbf{a}$) is perpendicular to $\\mathbf{a}$.\n", "\n", "This means that $$ \\mathbf{a} \\cdot (\\mathbf{b} - \\hat{x}\\mathbf{a}) = 0 $$\n", "\n", "and so $$\\hat{x} = \\frac{\\mathbf{a} \\cdot \\mathbf{b}}{\\mathbf{a} \\cdot \\mathbf{a}} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Motivation**:\n", "\n", "We want orthonormal columns in $Q$ and a Hessenberg $H$ such that $A Q = Q H$.\n", "\n", "Thinking about it iteratively, $$ A Q_n = Q_{n+1} H_n $$ where $Q_{n+1}$ is $n\\times n+1$ and $H_n$ is $n+1 \\times n$. This creates a solvable recurrence relation.\n", "\n", "\"arnoldi\"\n", "(source: Trefethen, Lecture 33)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Pseudo-code for Arnoldi Algorithm**\n", "\n", " Start with an arbitrary vector (normalized to have norm 1) for first col of Q\n", " for n=1,2,3...\n", " v = A @ nth col of Q\n", " for j=1,...n\n", " project v onto q_j, and subtract the projection off of v\n", " want to capture part of v that isn't already spanned by prev columns of Q\n", " store coefficients in H\n", " normalize v, and then make it the (n+1)th column of Q" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we are multiplying A by the previous vector in Q and removing the components that are not orthogonal to the existing columns of Q.\n", "\n", "**Question:** Repeated multiplications of A? Does this remind you of anything?" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "#### Answer:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "#Exercise Answer\n", "The *Power Method* involved iterative multiplications by A as well! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### About how the Arnoldi Iteration works" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- With the Arnoldi Iteration, we are finding an orthonormal basis for the *Krylov subspace*. \n", "The Krylov matrix $$ K = \\left[b \\; Ab \\; A^2b \\; \\dots \\; A^{n-1}b \\right]$$\n", "has a QR factorization\n", "$$K = QR$$\n", "and that is the same $Q$ that is being found in the Arnoldi Iteration. Note that the Arnoldi Iteration does not explicity calculate $K$ or $R$.\n", "\n", "- Intuition: K contains good information about the largest eigenvalues of A, and the QR factorization reveals this information by peeling off one approximate eigenvector at a time.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Arnoldi Iteration is two things:\n", "1. the basis of many of the iterative algorithms of numerical linear algebra\n", "2. a technique for finding eigenvalues of nonhermitian matrices\n", "(Trefethen, page 257)\n", "\n", "**How Arnoldi Locates Eigenvalues**\n", "\n", "1. Carry out Arnoldi iteration\n", "2. Periodically calculate the eigenvalues (called *Arnoldi estimates* or *Ritz values*) of the Hessenberg H, using the QR algorithm\n", "3. Check at whether these values are converging. If they are, they're probably eigenvalues of A." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementation" ] }, { "cell_type": "code", "execution_count": 246, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Decompose square matrix A @ Q ~= Q @ H\n", "def arnoldi(A):\n", " m, n = A.shape\n", " assert(n <= m)\n", " \n", " # Hessenberg matrix\n", " H = np.zeros([n+1,n]) #, dtype=np.float64)\n", " # Orthonormal columns\n", " Q = np.zeros([m,n+1]) #, dtype=np.float64)\n", " # 1st col of Q is a random column with unit norm\n", " b = np.random.rand(m)\n", " Q[:,0] = b / np.linalg.norm(b)\n", " for j in range(n):\n", " v = A @ Q[:,j]\n", " for i in range(j+1):\n", " #This comes from the formula for projection of v onto q.\n", " #Since columns q are orthonormal, q dot q = 1\n", " H[i,j] = np.dot(Q[:,i], v)\n", " v = v - (H[i,j] * Q[:,i])\n", " H[j+1,j] = np.linalg.norm(v)\n", " Q[:,j+1] = v / H[j+1,j]\n", " \n", " # printing this to see convergence, would be slow to use in practice\n", " print(np.linalg.norm(A @ Q[:,:-1] - Q @ H))\n", " return Q[:,:-1], H[:-1,:]" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8.59112969133\n", "4.45398729097\n", "0.935693639985\n", "3.36613943339\n", "0.817740180293\n" ] } ], "source": [ "Q, H = arnoldi(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that H is tri-diagonal:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 5.62746, 4.05085, -0. , 0. , -0. ],\n", " [ 4.05085, 3.07109, 0.33036, 0. , -0. ],\n", " [ 0. , 0.33036, 0.98297, 0.11172, -0. ],\n", " [ 0. , 0. , 0.11172, 0.29777, 0.07923],\n", " [ 0. , 0. , 0. , 0.07923, 0.06034]])" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "H" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write code to confirm that:\n", "1. AQ = QH\n", "2. Q is orthonormal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Answer:" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Exercise:\n" ] }, { "cell_type": "code", "execution_count": 456, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 456, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Exercise:\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### General Case:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**General Matrix**: Now we can do this on our general matrix A (not symmetric). In this case, we are getting a Hessenberg instead of a Tri-diagonal" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.44287067354\n", "1.06234006889\n", "0.689291414367\n", "0.918098818651\n", "4.7124490411e-16\n" ] } ], "source": [ "Q0, H0 = arnoldi(A0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check that H is Hessenberg:" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1.67416, 0.83233, -0.39284, 0.10833, 0.63853],\n", " [ 1.64571, 1.16678, -0.54779, 0.50529, 0.28515],\n", " [ 0. , 0.16654, -0.22314, 0.08577, -0.02334],\n", " [ 0. , 0. 
, 0.79017, 0.11732, 0.58978],\n", " [ 0. , 0. , 0. , 0.43238, -0.07413]])" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "H0" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.allclose(A0 @ Q0, Q0 @ H0)" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(True, True)" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.allclose(np.eye(len(Q0)), Q0.T @ Q0), np.allclose(np.eye(len(Q0)), Q0 @ Q0.T)" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Putting it all together" ] }, { "cell_type": "code", "execution_count": 215, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "def eigen(A, max_iter=20):\n", " Q, H = arnoldi(A)\n", " Ak, QQ = practical_qr(H, max_iter)\n", " U = Q @ QQ\n", " D = np.diag(Ak)\n", " return U, D" ] }, { "cell_type": "code", "execution_count": 213, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "n = 10\n", "A0 = np.random.rand(n,n)\n", "A = A0 @ A0.T" ] }, { "cell_type": "code", "execution_count": 242, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "14.897422908\n", "1.57451192745\n", "1.4820012435\n", "0.668164424736\n", "0.438450319682\n", "0.674050723258\n", "1.19470880942\n", "0.217103444634\n", "0.105443975462\n", "3.8162597576e-15\n", "[ 27.34799 1.22613 1.29671 0.70253 0.49651 0.56779 0.60974\n", " 0.70123 0.19209 0.04905]\n", "[ 27.34981 1.85544 1.04793 0.49607 0.44505 0.7106 1.00724\n", " 0.07293 0.16058 0.04411]\n", "[ 27.34981 2.01074 0.96045 0.54926 0.61117 0.8972 0.53424\n", " 0.19564 0.03712 0.04414]\n", "[ 27.34981 2.04342 0.94444 0.61517 0.89717 0.80888 0.25402\n", " 0.19737 0.03535 0.04414]\n", "[ 27.34981 2.04998 0.94362 0.72142 1.04674 0.58643 0.21495\n", " 0.19735 0.03534 0.04414]\n", "[ 27.34981 2.05129 0.94496 0.90506 0.95536 0.49632 0.21015\n", " 0.19732 0.03534 0.04414]\n", "[ 27.34981 2.05156 0.94657 1.09452 0.79382 0.46723 0.20948\n", " 0.1973 0.03534 0.04414]\n", "[ 27.34981 2.05161 0.94863 1.1919 0.70539 0.45628 0.20939\n", " 0.19728 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.95178 1.22253 0.67616 0.45174 0.20939\n", " 0.19727 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.95697 1.22715 0.66828 0.44981 0.2094\n", " 0.19725 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.96563 1.22124 0.66635 0.44899 0.20941\n", " 0.19724 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.97969 1.20796 0.66592 0.44864 0.20942\n", " 0.19723 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.00135 1.18652 0.66585 0.44849 0.20943\n", " 0.19722 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.03207 1.15586 0.66584 0.44843 0.20943\n", " 0.19722 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.07082 1.11714 0.66584 0.4484 0.20944\n", " 0.19721 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.11307 1.07489 0.66585 0.44839 0.20944\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.15241 1.03556 0.66585 0.44839 0.20945\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.18401 1.00396 0.66585 0.44839 0.20945\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.20652 0.98145 0.66585 0.44839 0.20946\n", " 0.19719 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.22121 0.96676 0.66585 0.44839 0.20946\n", " 0.19719 0.03534 0.04414]\n", "[ 27.34981 
{ "cell_type": "code", "execution_count": 213, "metadata": { "collapsed": true, "hidden": true }, "outputs": [], "source": [ "n = 10\n", "A0 = np.random.rand(n,n)\n", "A = A0 @ A0.T" ] },
{ "cell_type": "code", "execution_count": 242, "metadata": { "hidden": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "14.897422908\n", "1.57451192745\n", "1.4820012435\n", "0.668164424736\n", "0.438450319682\n", "0.674050723258\n", "1.19470880942\n", "0.217103444634\n", "0.105443975462\n", "3.8162597576e-15\n", "[ 27.34799 1.22613 1.29671 0.70253 0.49651 0.56779 0.60974\n", " 0.70123 0.19209 0.04905]\n", "[ 27.34981 1.85544 1.04793 0.49607 0.44505 0.7106 1.00724\n", " 0.07293 0.16058 0.04411]\n", "[ 27.34981 2.01074 0.96045 0.54926 0.61117 0.8972 0.53424\n", " 0.19564 0.03712 0.04414]\n", "[ 27.34981 2.04342 0.94444 0.61517 0.89717 0.80888 0.25402\n", " 0.19737 0.03535 0.04414]\n", "[ 27.34981 2.04998 0.94362 0.72142 1.04674 0.58643 0.21495\n", " 0.19735 0.03534 0.04414]\n", "[ 27.34981 2.05129 0.94496 0.90506 0.95536 0.49632 0.21015\n", " 0.19732 0.03534 0.04414]\n", "[ 27.34981 2.05156 0.94657 1.09452 0.79382 0.46723 0.20948\n", " 0.1973 0.03534 0.04414]\n", "[ 27.34981 2.05161 0.94863 1.1919 0.70539 0.45628 0.20939\n", " 0.19728 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.95178 1.22253 0.67616 0.45174 0.20939\n", " 0.19727 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.95697 1.22715 0.66828 0.44981 0.2094\n", " 0.19725 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.96563 1.22124 0.66635 0.44899 0.20941\n", " 0.19724 0.03534 0.04414]\n", "[ 27.34981 2.05162 0.97969 1.20796 0.66592 0.44864 0.20942\n", " 0.19723 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.00135 1.18652 0.66585 0.44849 0.20943\n", " 0.19722 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.03207 1.15586 0.66584 0.44843 0.20943\n", " 0.19722 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.07082 1.11714 0.66584 0.4484 0.20944\n", " 0.19721 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.11307 1.07489 0.66585 0.44839 0.20944\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.15241 1.03556 0.66585 0.44839 0.20945\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.18401 1.00396 0.66585 0.44839 0.20945\n", " 0.1972 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.20652 0.98145 0.66585 0.44839 0.20946\n", " 0.19719 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.22121 0.96676 0.66585 0.44839 0.20946\n", " 0.19719 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.23026 0.95771 0.66585 0.44839 0.20946\n", " 0.19719 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.23563 0.95234 0.66585 0.44839 0.20946\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.23876 0.94921 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24056 0.94741 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24158 0.94639 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24216 0.94581 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24249 0.94548 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24268 0.94529 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24278 0.94519 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24284 0.94513 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24288 0.94509 0.66585 0.44839 0.20947\n", " 0.19718 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.2429 0.94507 0.66585 0.44839 0.20947\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24291 0.94506 0.66585 0.44839 0.20947\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24291 0.94506 0.66585 0.44839 0.20947\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20947\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20947\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20948\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20948\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20948\n", " 0.19717 0.03534 0.04414]\n", "[ 27.34981 2.05162 1.24292 0.94505 0.66585 0.44839 0.20948\n", " 0.19717 0.03534 0.04414]\n" ] } ], "source": [ "U, D = eigen(A, 40)" ] },
{ "cell_type": "code", "execution_count": 240, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "array([ 5.10503, 0.58805, 0.49071, -0.65174, -0.60231, -0.37664,\n", " -0.13165, 0.0778 , -0.10469, -0.29771])" ] }, "execution_count": 240, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D" ] },
{ "cell_type": "code", "execution_count": 241, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "array([ 27.34981, 2.05162, 1.24292, 0.94505, 0.66585, 0.44839,\n", " 0.20948, 0.19717, 0.04414, 0.03534])" ] }, "execution_count": 241, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.eigvals(A)" ] },
{ "cell_type": "code", "execution_count": 236, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "0.0008321887107978883" ] }, "execution_count": 236, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.norm(U @ np.diag(D) @ U.T - A)" ] },
{ "cell_type": "code", "execution_count": 237, "metadata": { "hidden": true }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 237, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.allclose(U @ np.diag(D) @ U.T, A, atol=1e-3)" ] },
{ "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "### Further Reading" ] },
{ "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Let's find some eigenvalues!\n", "\n", "From the [Nonsymmetric Eigenvalue Problems](https://sites.math.washington.edu/~morrow/498_13/eigenvalues.pdf) chapter:\n", "\n", "> Note that \"direct\" methods must still iterate, since finding eigenvalues is mathematically equivalent to finding zeros of polynomials, for which no noniterative methods can exist. We call a method direct if experience shows that it (nearly) never fails to converge in a fixed number of iterations.\n", "\n", "> Iterative methods typically provide approximations only to a subset of the eigenvalues and eigenvectors, and are usually run only long enough to get a few adequately accurate eigenvalues rather than a large number.\n", "\n", "Our ultimate algorithm is the shifted Hessenberg QR algorithm (a sketch of the shift idea follows below).\n", "\n", "More reading:\n", "- [The Symmetric Eigenproblem and SVD](https://sites.math.washington.edu/~morrow/498_13/eigenvalues2.pdf)\n", "- [Iterative Methods for Eigenvalue Problems](https://sites.math.washington.edu/~morrow/498_13/eigenvalues3.pdf): Rayleigh-Ritz Method, Lanczos algorithm" ] },
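{ "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Here is the shift idea as a rough sketch (`shifted_qr` is a hypothetical name, and this omits the Hessenberg reduction and deflation that make the real algorithm practical): subtracting an eigenvalue estimate `mu` before the QR factorization and adding it back afterwards is still a similarity transform, so the eigenvalues are unchanged, but convergence speeds up dramatically." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "# Sketch of shifted QR iteration (a hypothetical helper, not the full\n", "# shifted Hessenberg QR algorithm: no Hessenberg reduction, no deflation)\n", "def shifted_qr(A, max_iter=50):\n", "    Ak = np.copy(A)\n", "    n = A.shape[0]\n", "    QQ = np.eye(n)\n", "    for i in range(max_iter):\n", "        mu = Ak[-1, -1]  # crude eigenvalue estimate used as the shift\n", "        Qk, Rk = np.linalg.qr(Ak - mu * np.eye(n))\n", "        Ak = Rk @ Qk + mu * np.eye(n)  # similarity transform: eigenvalues preserved\n", "        QQ = QQ @ Qk\n", "    return Ak, QQ" ] },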
{ "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "### Coming Up\n", "\n", "We will be coding our own QR decomposition (two different ways!) in the future, but first we are going to see another way that the QR decomposition can be used: to calculate linear regression." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## End" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Miscellaneous Notes" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Symmetric matrices come up naturally:\n", "- distance matrices\n", "- relationship matrices (Facebook or LinkedIn)\n", "- ODEs\n", "\n", "We will look at symmetric positive definite matrices: symmetry guarantees that all the eigenvalues are real, and positive definiteness guarantees that they are also positive." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Note: in the confusing language of NLA, the QR algorithm is *direct*, because you are making progress on all columns at once. In other math/CS language, the QR algorithm is *iterative*, because it iteratively converges and never reaches an exact solution." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Arnoldi iteration is *structured orthogonalization*: a Gram-Schmidt-style iteration for transforming a matrix to Hessenberg form. In the language of NLA, Arnoldi iteration is considered an *iterative* algorithm, because you could stop partway and have a few columns completed." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" }, "widgets": { "state": { "2987aec2ec494d2b9b31bdf93ba87cb6": { "views": [ { "cell_index": 18 } ] }, "84bffde89e894158860b175e986f4d61": { "views": [ { "cell_index": 14 } ] } }, "version": "1.2.0" } }, "nbformat": 4, "nbformat_minor": 2 }