{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Sascha Spors,\n", "Professorship Signal Theory and Digital Signal Processing,\n", "Institute of Communications Engineering (INT),\n", "Faculty of Computer Science and Electrical Engineering (IEF),\n", "University of Rostock,\n", "Germany\n", "\n", "# Data Driven Audio Signal Processing - A Tutorial with Computational Examples\n", "\n", "Winter Semester 2023/24 (Master Course #24512)\n", "\n", "- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture\n", "- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise\n", "\n", "Feel free to contact lecturer frank.schultz@uni-rostock.de" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 4: SVD and Left Matrix Inverse\n", "\n", "## Objectives\n", "\n", "- Matrix A of dimension (M x N)\n", "- SVD for a **tall/thin** matrix, we assume a matrix with **full column rank** $r=N$\n", "- Four subspaces in SVD domain\n", "- Projection matrices\n", "- Least squares / normal equation(s)\n", "- Left inverse\n", "\n", "## Special Python Packages\n", "\n", "Some convenient functions are found in `scipy.linalg`, some in `numpy.linalg` " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Some Initial Python Stuff" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.linalg import svd, diagsvd, inv, pinv, null_space, norm\n", "from numpy.linalg import matrix_rank\n", "\n", "np.set_printoptions(precision=2, floatmode=\"fixed\", suppress=True)\n", "\n", "rng = np.random.default_rng(1234)\n", "mean, stdev = 0, 1\n", "\n", "# we might convince ourselves that all works for complex data as well\n", "# then the ^H operator (conj().T) needs to be used instead of just .T\n", "use_complex = False" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SVD of Tall/Thin, Full Column Rank Matrix A" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "M = 7 # number of rows\n", "N = 3 # number of cols\n", "\n", "rank = min(M, N) # set desired rank == full column rank == independent columns\n", "print(\"desired rank of A:\", rank)\n", "\n", "if use_complex:\n", " dtype = \"complex128\"\n", " A = np.zeros([M, N], dtype=dtype)\n", " for i in range(rank):\n", " col = rng.normal(mean, stdev, M) + 1j * rng.normal(mean, stdev, M)\n", " row = rng.normal(mean, stdev, N) + 1j * rng.normal(mean, stdev, N)\n", " A += np.outer(col, row) # superposition of rank-1 matrices\n", "else:\n", " dtype = \"float64\"\n", " A = np.zeros([M, N], dtype=dtype)\n", " for i in range(rank):\n", " col = rng.normal(mean, stdev, M)\n", " row = rng.normal(mean, stdev, N)\n", " A += np.outer(col, row) # superposition of rank-1 matrices\n", "# check if rng produced desired rank\n", "print(\" rank of A:\", matrix_rank(A))\n", "print(\"tall/thin matrix with full column rank\")\n", "print(\"-> matrix V contains only the row space\")\n", "print(\"-> null space is only the zero vector\")\n", "print(\"A =\\n\", A)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[U, s, Vh] = svd(A, full_matrices=True)\n", "S = diagsvd(s, M, N) # for full SVD the matrix S has same dim as A\n", "V = Vh.conj().T\n", "Uh = U.conj().T\n", "\n", "print(\"U =\\n\", U)\n", "# number of non-zero sing values along diag must match rank\n", "print(\"non-zero singular values: \", s[:rank])\n", "print(\"S =\\n\", S) # contains 0 Matrix below diagonal part\n", "print(\"V =\\n\", V)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Four Subspaces in SVD Domain\n", "\n", "The null space $N(\\mathbf{A})$ of the tall/thin, full column rank matrix $\\mathbf{A}$ is only $\\mathbf{0}$, i.e. $N(\\mathbf{A})=\\mathbf{0}$. Except for $\\mathbf{x}=\\mathbf{0}$, all other $\\mathbf{x}$ are mapped to the column space $C(\\mathbf{A})$. This, however, requires, that the $\\mathbf{V}$ matrix completely spans the row space and no $\\mathbf{v}$ vectors span a dedicated null space.\n", "\n", "The tall/thin, full column rank matrix $\\mathbf{A}$ spans a rather large left null space $N(\\mathbf{A}^\\mathrm{H})$ with dimension $M-\\mathrm{rank}(A)$.\n", "\n", "We, therefore, here deal with a linear set of equations with **more equations than unknowns** ($M>N$, more rows than columns) , i.e. the **over-determined** case. For this case, we can find a solution in the **least-squares error** sense by help of the **left inverse** as discussed below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# all stuff that is in matrix U\n", "print(\"U =\\n\", U)\n", "\n", "# column space C(A)\n", "print(\"\\ncolumn space (orthogonal to left null space):\")\n", "print(U[:, :rank])\n", "\n", "# left null space, if empty only 0 vector\n", "print(\"left null space (orthogonal to column space):\")\n", "print(U[:, rank:])\n", "\n", "print(\"###\")\n", "\n", "# all stuff that is in matrix V\n", "print(\"\\nV =\\n\", V)\n", "\n", "# row space\n", "print(\"\\nrow space (orthogonal to null space):\")\n", "print(V[:, :rank])\n", "\n", "# null space N(A), if empty only 0 vector\n", "print(\"null space (orthogonal to row space):\")\n", "print(V[:, rank:]) # for full column rank this is only the zero vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Left Inverse via SVD" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[U, s, Vh] = svd(A, full_matrices=True)\n", "V = Vh.conj().T\n", "Uh = U.conj().T\n", "\n", "Si = diagsvd(1 / s, N, M) # works if array s has only non-zero entries\n", "print(\"Inverse singular value matrix with right zero block\")\n", "print(\"Si =\\n\", Si)\n", "# left inverse using 'inverse' SVD:\n", "Ali = V @ Si @ Uh\n", "# left inverse using a dedicated pinv algorithm\n", "# proper choice is done by pinv() itself\n", "Ali_pinv = pinv(A)\n", "print(\"pinv == left inverse via SVD?\", np.allclose(Ali, Ali_pinv))\n", "print(\"Si @ S\\n\", Si @ S, \"\\nyields NxN identity matrix\")\n", "print(\"Ali @ A = \\n\", Ali @ A, \"\\nyields NxN identity matrix\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Projection Matrices for the Left Inverse Problem" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "u_col = U[:, 0, None]\n", "u_left_null = U[:, rank, None]\n", "u_tmp = u_col + u_left_null\n", "\n", "# projection onto row space == I_NxN\n", "# full rank and identity since we don't have a null space\n", "# so each vector of the row space is projected onto itself\n", "P_CAH = Ali @ A\n", "print(\n", " \"projection matrix P_CAH projects a row space vector onto itself:\\n\",\n", " \"P_CAH @ V[:,0] == V[:,0]:\",\n", " np.allclose(P_CAH @ V[:, 0], V[:, 0]),\n", ")\n", "\n", "# projection onto column space\n", "P_CA = A @ Ali\n", "print(\n", " \"projection matrix P_CA projects U-space stuff to column space:\\n\",\n", " \"P_CA @ (u_tmp) == u_col:\",\n", " np.allclose(P_CA @ (u_tmp), u_col),\n", ")\n", "\n", "# projection onto null space == null matrix\n", "P_NA = np.eye(N, N) - P_CAH\n", "print(\"projection matrix P_NA is a null matrix\\nP_NA=\\n\", P_NA)\n", "\n", "# projection onto left null space\n", "P_NAH = np.eye(M, M) - P_CA\n", "print(\n", " \"projection matrix P_NAH projects U-space stuff to left null space:\\n\",\n", " \"P_NAH @ (u_tmp)==u_lnull\",\n", " np.allclose(P_NAH @ (u_tmp), u_left_null),\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# design a vector:\n", "# one entry from column space + one entry from left null space\n", "# so we take some special left singular vectors:\n", "b = U[:, 0, None] + U[:, rank, None] # same as u_tmp above\n", "print(\"b==u_tmp\", np.allclose(b, u_tmp))\n", "\n", "# the vector b is a linear combination and lives in column+left null spaces\n", "# with above introduced projection matrices we can project b\n", "# (i) onto column space\n", "bhat = P_CA @ b # project b to column space C(A)\n", "print(\"bhat==U[:, 0, None]:\", np.allclose(bhat, U[:, 0, None]))\n", "# and\n", "# (ii) onto left null space\n", "e = P_NAH @ b # project b to left null space N(A^H)\n", "print(\"e==U[:, 0, None]:\", np.allclose(e, U[:, rank, None]))\n", "\n", "# to find x_hat, i.e. the LS solution of the inverse problem\n", "# we bring b to row space via left inverse:\n", "# only the column space part of b (i.e. bhat) is brought to the row space\n", "# we can never map back a 'zero' (i.e. e)\n", "print(\"x_hat = Ali @ b == Ali @ bhat:\", np.allclose(Ali @ b, Ali @ bhat))\n", "# we expect that this is the scaled first right singular vector\n", "print(\"x_hat=\\n\", V[:, 0, None] / S[0, 0])\n", "print(Ali @ bhat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Left Inverse from Normal Equation(s) / Least Squares Error Optimization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Derivation 1\n", "\n", "Vector addition from example above with an error term $\\mathbf{e}$\n", "\n", "$\\hat{\\mathbf{b}} + \\mathbf{e} = \\mathbf{b} \\rightarrow \\mathbf{e} = \\mathbf{b} - \\hat{\\mathbf{b}}$\n", "\n", "We know, that pure row space $\\hat{\\mathbf{x}}$ maps to pure column space $\\hat{\\mathbf{b}}$\n", "\n", "$\\hat{\\mathbf{b}} = \\mathbf{A} \\hat{\\mathbf{x}}$\n", "\n", "Inserting this\n", "\n", "$\\mathbf{e} = \\mathbf{b} - \\hat{\\mathbf{b}} = \\mathbf{b} - \\mathbf{A} \\hat{\\mathbf{x}}$\n", "\n", "The vector $\\mathbf{A} \\mathbf{x}$ is always living in the column space no matter what $\\mathbf{x}$ is constructed of.\n", "\n", "The vector $\\mathbf{e}$ is orthogonal to column space (since it lives in left null space).\n", "\n", "So, we know that the inner product must solve to zero\n", "\n", "$(\\mathbf{A} \\mathbf{x})^\\mathrm{H} \\mathbf{e} = 0 \\rightarrow \\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} (\\mathbf{b} - \\mathbf{A} \\hat{\\mathbf{x}}) = 0$\n", "\n", "Rearranging yields\n", "\n", "$\\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} \\mathbf{A} \\hat{\\mathbf{x}}$\n", "\n", "and by canceling $\\mathbf{x}^\\mathrm{H} $ the famous normal equation is obtained\n", "\n", "$\\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{A}^\\mathrm{H} \\mathbf{A} \\hat{\\mathbf{x}}$\n", "\n", "This can be solved using the inverse of $\\mathbf{A}^\\mathrm{H} \\mathbf{A}$ (this matrix is full rank and therefore invertible)\n", "\n", "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} (\\mathbf{A}^\\mathrm{H} \\mathbf{A}) \\hat{\\mathbf{x}}$\n", "\n", "Since $(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} (\\mathbf{A}^\\mathrm{H} \\mathbf{A}) = \\mathbf{I}$ holds, we get the solution in least-squares error sense for $\\mathbf{x}$ in the row space of $\\mathbf{A}$\n", "\n", "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\hat{\\mathbf{x}}$\n", "\n", "We find the **left inverse** of $\\mathbf{A}$ as\n", "\n", "$\\mathbf{A}^{+L} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H}$\n", "\n", "such that left multiplying\n", "\n", "$\\mathbf{A}^{+L} \\mathbf{A} = \\mathbf{I}$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Derivation 2\n", "\n", "Here, the origin of the naming *least squares error* is made more obvious.\n", "\n", "We set up an optimization problem defining **least** amount of **squared** length of the **error** vector\n", "\n", "$\\mathrm{min}_\\mathbf{x} ||\\mathbf{e}||^2_2 = \\mathrm{min}_\\mathbf{x} ||\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}||_2^2$\n", "\n", "We could solve this with help of calculus. But we have a nice tool, i.e. the properties of subspaces, that not requires pages of calculation:\n", "\n", "We must find the minimum distance from $\\mathbf{b}$ to the column space of $\\mathbf{A}$.\n", "\n", "This can be only achieved if the error vector $\\mathbf{e}=\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}$ is orthogonal to the column space of $\\mathbf{A}$. \n", "\n", "This in turn means that $\\mathbf{e}$ must live in the left null space of $\\mathbf{A}$, i.e. $\\mathbf{e} \\in N(\\mathbf{A}^\\mathrm{H})$. \n", "\n", "By definition of left nullspace we have $\\mathbf{A}^\\mathrm{H} \\mathbf{e} = \\mathbf{0}$. \n", "\n", "So, let us insert $\\mathbf{e}$ into $\\mathbf{A}^\\mathrm{H} \\mathbf{e} = \\mathbf{0}$:\n", "\n", "$\\mathbf{A}^\\mathrm{H} (\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}) = \\mathbf{0}$\n", "\n", "$\\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{A}^\\mathrm{H} \\mathbf{A} {\\mathbf{x}}$\n", "\n", "The optimum $\\mathbf{x}$ which solves the problem is just as above in derivation 1\n", "\n", "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\hat{\\mathbf{x}}$\n", "\n", "And again, we find the **left inverse** of $\\mathbf{A}$ as\n", "\n", "$\\mathbf{A}^{+L} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H}$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Ali_normaleq = inv(A.conj().T @ A) @ A.conj().T\n", "\n", "xhat = Ali_normaleq @ b # LS solution in row space\n", "print(\"xhat = \", xhat)\n", "bhat = A @ xhat # map to column space\n", "\n", "# thus this is the projection matrix that maps b (column + left nullspace stuff)\n", "# to the column space of A in the LS sense\n", "P_CA2 = A @ Ali_normaleq\n", "\n", "print(\"P_CA == P_CA2 ?\", np.allclose(P_CA, P_CA2)) # check with above result\n", "print(\n", " \"P_CA2 @ b == bhat:\", np.allclose(P_CA2 @ b, bhat)\n", ") # check that both outputs are equal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Copyright\n", "\n", "- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)\n", "- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)\n", "- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)\n", "- feel free to use the notebooks for your own purposes\n", "- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year." ] } ], "metadata": { "interpreter": { "hash": "1743232f157dd0954c61aae30535e75a2972519a625c7e796bafe0cd9a07bf7e" }, "kernelspec": { "display_name": "myddasp", "language": "python", "name": "myddasp" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }