{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sascha Spors,\n",
    "Professorship Signal Theory and Digital Signal Processing,\n",
    "Institute of Communications Engineering (INT),\n",
    "Faculty of Computer Science and Electrical Engineering (IEF),\n",
    "University of Rostock,\n",
    "Germany\n",
    "\n",
    "# Data Driven Audio Signal Processing - A Tutorial with Computational Examples\n",
    "\n",
    "Winter Semester 2023/24 (Master Course #24512)\n",
    "\n",
    "- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture\n",
    "- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise\n",
    "\n",
    "Feel free to contact lecturer frank.schultz@uni-rostock.de"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 4: SVD and Left Matrix Inverse\n",
    "\n",
    "## Objectives\n",
    "\n",
    "- Matrix A of dimension (M x N)\n",
    "- SVD for a **tall/thin** matrix, we assume a matrix with **full column rank** $r=N$\n",
    "- Four subspaces in SVD domain\n",
    "- Projection matrices\n",
    "- Least squares / normal equation(s)\n",
    "- Left inverse\n",
    "\n",
    "## Special Python Packages\n",
    "\n",
    "Some convenient functions are found in `scipy.linalg`, some in `numpy.linalg` "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Some Initial Python Stuff"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.linalg import svd, diagsvd, inv, pinv, null_space, norm\n",
    "from numpy.linalg import matrix_rank\n",
    "\n",
    "np.set_printoptions(precision=2, floatmode=\"fixed\", suppress=True)\n",
    "\n",
    "rng = np.random.default_rng(1234)\n",
    "mean, stdev = 0, 1\n",
    "\n",
    "# we might convince ourselves that all works for complex data as well\n",
    "# then the ^H operator (conj().T) needs to be used instead of just .T\n",
    "use_complex = False"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SVD of Tall/Thin, Full Column Rank Matrix A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "M = 7  # number of rows\n",
    "N = 3  # number of cols\n",
    "\n",
    "rank = min(M, N)  # set desired rank == full column rank == independent columns\n",
    "print(\"desired rank of A:\", rank)\n",
    "\n",
    "if use_complex:\n",
    "    dtype = \"complex128\"\n",
    "    A = np.zeros([M, N], dtype=dtype)\n",
    "    for i in range(rank):\n",
    "        col = rng.normal(mean, stdev, M) + 1j * rng.normal(mean, stdev, M)\n",
    "        row = rng.normal(mean, stdev, N) + 1j * rng.normal(mean, stdev, N)\n",
    "        A += np.outer(col, row)  # superposition of rank-1 matrices\n",
    "else:\n",
    "    dtype = \"float64\"\n",
    "    A = np.zeros([M, N], dtype=dtype)\n",
    "    for i in range(rank):\n",
    "        col = rng.normal(mean, stdev, M)\n",
    "        row = rng.normal(mean, stdev, N)\n",
    "        A += np.outer(col, row)  # superposition of rank-1 matrices\n",
    "# check if rng produced desired rank\n",
    "print(\"        rank of A:\", matrix_rank(A))\n",
    "print(\"tall/thin matrix with full column rank\")\n",
    "print(\"-> matrix V contains only the row space\")\n",
    "print(\"-> null space is only the zero vector\")\n",
    "print(\"A =\\n\", A)"
   ]
  },
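  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Full column rank implies that the $N \\times N$ matrix $\\mathbf{A}^\\mathrm{H} \\mathbf{A}$ is invertible, which the left inverse below relies on. A quick numerical sanity check (added here; `np.linalg.cond` is the only function not imported above):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sanity check: the N x N Gram matrix A^H A must have full rank N\n",
    "# for the left inverse to exist\n",
    "print(\"rank of A^H A:\", matrix_rank(A.conj().T @ A))\n",
    "# a moderate condition number indicates that inverting A^H A is numerically safe\n",
    "print(\"condition number of A:\", np.linalg.cond(A))"
   ]
  },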
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "[U, s, Vh] = svd(A, full_matrices=True)\n",
    "S = diagsvd(s, M, N)  # for full SVD the matrix S has same dim as A\n",
    "V = Vh.conj().T\n",
    "Uh = U.conj().T\n",
    "\n",
    "print(\"U =\\n\", U)\n",
    "# number of non-zero sing values along diag must match rank\n",
    "print(\"non-zero singular values: \", s[:rank])\n",
    "print(\"S =\\n\", S)  # contains 0 Matrix below diagonal part\n",
    "print(\"V =\\n\", V)"
   ]
  },
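  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We might convince ourselves numerically that the factors really form a full SVD, i.e. $\\mathbf{U}$ and $\\mathbf{V}$ are unitary and $\\mathbf{U} \\mathbf{S} \\mathbf{V}^\\mathrm{H}$ reconstructs $\\mathbf{A}$ (a small sanity check added to the original material):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sanity checks on the full SVD factors\n",
    "print(\"U^H @ U == I_MxM?\", np.allclose(Uh @ U, np.eye(M)))\n",
    "print(\"V^H @ V == I_NxN?\", np.allclose(Vh @ V, np.eye(N)))\n",
    "print(\"U @ S @ Vh == A?\", np.allclose(U @ S @ Vh, A))"
   ]
  },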
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Four Subspaces in SVD Domain\n",
    "\n",
    "The null space $N(\\mathbf{A})$ of the tall/thin, full column rank matrix $\\mathbf{A}$ is only $\\mathbf{0}$, i.e. $N(\\mathbf{A})=\\mathbf{0}$. Except for $\\mathbf{x}=\\mathbf{0}$, all other $\\mathbf{x}$ are mapped to the column space $C(\\mathbf{A})$. This, however, requires, that the $\\mathbf{V}$ matrix completely spans the row space and no $\\mathbf{v}$ vectors span a dedicated null space.\n",
    "\n",
    "The tall/thin, full column rank matrix $\\mathbf{A}$ spans a rather large left null space $N(\\mathbf{A}^\\mathrm{H})$ with dimension $M-\\mathrm{rank}(A)$.\n",
    "\n",
    "We, therefore, here deal with a linear set of equations with **more equations than unknowns** ($M>N$, more rows than columns) , i.e. the **over-determined** case. For this case, we can find a solution in the **least-squares error** sense by help of the **left inverse** as discussed below."
   ]
  },
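  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These subspace dimensions can be checked numerically with `null_space()` from `scipy.linalg` (already imported above); this small check is an addition to the original material:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# N(A) should contain only the zero vector -> no basis vectors at all\n",
    "print(\"dim N(A):  \", null_space(A).shape[1])\n",
    "# N(A^H) should have dimension M - rank(A)\n",
    "print(\"dim N(A^H):\", null_space(A.conj().T).shape[1], \"== M - rank =\", M - rank)"
   ]
  },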
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# all stuff that is in matrix U\n",
    "print(\"U =\\n\", U)\n",
    "\n",
    "# column space C(A)\n",
    "print(\"\\ncolumn space (orthogonal to left null space):\")\n",
    "print(U[:, :rank])\n",
    "\n",
    "# left null space, if empty only 0 vector\n",
    "print(\"left null space (orthogonal to column space):\")\n",
    "print(U[:, rank:])\n",
    "\n",
    "print(\"###\")\n",
    "\n",
    "# all stuff that is in matrix V\n",
    "print(\"\\nV =\\n\", V)\n",
    "\n",
    "# row space\n",
    "print(\"\\nrow space (orthogonal to null space):\")\n",
    "print(V[:, :rank])\n",
    "\n",
    "# null space N(A), if empty only 0 vector\n",
    "print(\"null space (orthogonal to row space):\")\n",
    "print(V[:, rank:])  # for full column rank this is only the zero vector"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Left Inverse via SVD"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "[U, s, Vh] = svd(A, full_matrices=True)\n",
    "V = Vh.conj().T\n",
    "Uh = U.conj().T\n",
    "\n",
    "Si = diagsvd(1 / s, N, M)  # works if array s has only non-zero entries\n",
    "print(\"Inverse singular value matrix with right zero block\")\n",
    "print(\"Si =\\n\", Si)\n",
    "# left inverse using 'inverse' SVD:\n",
    "Ali = V @ Si @ Uh\n",
    "# left inverse using a dedicated pinv algorithm\n",
    "# proper choice is done by pinv() itself\n",
    "Ali_pinv = pinv(A)\n",
    "print(\"pinv == left inverse via SVD?\", np.allclose(Ali, Ali_pinv))\n",
    "print(\"Si @ S\\n\", Si @ S, \"\\nyields NxN identity matrix\")\n",
    "print(\"Ali @ A = \\n\", Ali @ A, \"\\nyields NxN identity matrix\")"
   ]
  },
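  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that `Ali` inverts only from the left: for $M>N$ the product $\\mathbf{A} \\mathbf{A}^{+L}$ cannot be the $M \\times M$ identity, it is merely a projection onto the column space (anticipating the projection matrices below). A short check, added for illustration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# the left inverse is not a right inverse for M > N\n",
    "print(\"A @ Ali == I_MxM?\", np.allclose(A @ Ali, np.eye(M)))\n",
    "# instead, A @ Ali is idempotent, i.e. a projection matrix\n",
    "print(\n",
    "    \"(A @ Ali) @ (A @ Ali) == A @ Ali?\",\n",
    "    np.allclose((A @ Ali) @ (A @ Ali), A @ Ali),\n",
    ")"
   ]
  },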
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Projection Matrices for the Left Inverse Problem"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "u_col = U[:, 0, None]\n",
    "u_left_null = U[:, rank, None]\n",
    "u_tmp = u_col + u_left_null\n",
    "\n",
    "# projection onto row space == I_NxN\n",
    "# full rank and identity since we don't have a null space\n",
    "# so each vector of the row space is projected onto itself\n",
    "P_CAH = Ali @ A\n",
    "print(\n",
    "    \"projection matrix P_CAH projects a row space vector onto itself:\\n\",\n",
    "    \"P_CAH @ V[:,0] == V[:,0]:\",\n",
    "    np.allclose(P_CAH @ V[:, 0], V[:, 0]),\n",
    ")\n",
    "\n",
    "# projection onto column space\n",
    "P_CA = A @ Ali\n",
    "print(\n",
    "    \"projection matrix P_CA projects U-space stuff to column space:\\n\",\n",
    "    \"P_CA @ (u_tmp) == u_col:\",\n",
    "    np.allclose(P_CA @ (u_tmp), u_col),\n",
    ")\n",
    "\n",
    "# projection onto null space == null matrix\n",
    "P_NA = np.eye(N, N) - P_CAH\n",
    "print(\"projection matrix P_NA is a null matrix\\nP_NA=\\n\", P_NA)\n",
    "\n",
    "# projection onto left null space\n",
    "P_NAH = np.eye(M, M) - P_CA\n",
    "print(\n",
    "    \"projection matrix P_NAH projects U-space stuff to left null space:\\n\",\n",
    "    \"P_NAH @ (u_tmp)==u_lnull\",\n",
    "    np.allclose(P_NAH @ (u_tmp), u_left_null),\n",
    ")"
   ]
  },
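  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an added sanity check: orthogonal projection matrices are idempotent ($\\mathbf{P} \\mathbf{P} = \\mathbf{P}$) and Hermitian ($\\mathbf{P} = \\mathbf{P}^\\mathrm{H}$), which all four matrices should fulfill:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# verify idempotency and Hermitian symmetry of all four projection matrices\n",
    "for P, name in [(P_CAH, \"P_CAH\"), (P_CA, \"P_CA\"), (P_NA, \"P_NA\"), (P_NAH, \"P_NAH\")]:\n",
    "    print(\n",
    "        name,\n",
    "        \"idempotent:\", np.allclose(P @ P, P),\n",
    "        \"/ Hermitian:\", np.allclose(P, P.conj().T),\n",
    "    )"
   ]
  },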
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# design a vector:\n",
    "# one entry from column space + one entry from left null space\n",
    "# so we take some special left singular vectors:\n",
    "b = U[:, 0, None] + U[:, rank, None]  # same as u_tmp above\n",
    "print(\"b==u_tmp\", np.allclose(b, u_tmp))\n",
    "\n",
    "# the vector b is a linear combination and lives in column+left null spaces\n",
    "# with above introduced projection matrices we can project b\n",
    "# (i) onto column space\n",
    "bhat = P_CA @ b  # project b to column space C(A)\n",
    "print(\"bhat==U[:, 0, None]:\", np.allclose(bhat, U[:, 0, None]))\n",
    "# and\n",
    "# (ii) onto left null space\n",
    "e = P_NAH @ b  # project b to left null space N(A^H)\n",
    "print(\"e==U[:, 0, None]:\", np.allclose(e, U[:, rank, None]))\n",
    "\n",
    "# to find x_hat, i.e. the LS solution of the inverse problem\n",
    "# we bring b to row space via left inverse:\n",
    "# only the column space part of b (i.e. bhat) is brought to the row space\n",
    "# we can never map back a 'zero' (i.e. e)\n",
    "print(\"x_hat = Ali @ b == Ali @ bhat:\", np.allclose(Ali @ b, Ali @ bhat))\n",
    "# we expect that this is the scaled first right singular vector\n",
    "print(\"x_hat=\\n\", V[:, 0, None] / S[0, 0])\n",
    "print(Ali @ bhat)"
   ]
  },
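  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since $\\hat{\\mathbf{b}}$ lives in the column space and $\\mathbf{e}$ in the left null space, the two parts are orthogonal, so the Pythagorean theorem must hold for the split $\\mathbf{b} = \\hat{\\mathbf{b}} + \\mathbf{e}$. A quick check, added here using `norm` as imported above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# bhat and e are orthogonal, so their squared lengths must add up to ||b||^2\n",
    "print(\"bhat^H @ e == 0?\", np.allclose(bhat.conj().T @ e, 0))\n",
    "print(\"||b||^2              =\", norm(b) ** 2)\n",
    "print(\"||bhat||^2 + ||e||^2 =\", norm(bhat) ** 2 + norm(e) ** 2)"
   ]
  },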
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Left Inverse from Normal Equation(s) / Least Squares Error Optimization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Derivation 1\n",
    "\n",
    "Vector addition from example above with an error term $\\mathbf{e}$\n",
    "\n",
    "$\\hat{\\mathbf{b}} + \\mathbf{e} = \\mathbf{b} \\rightarrow \\mathbf{e} = \\mathbf{b} - \\hat{\\mathbf{b}}$\n",
    "\n",
    "We know, that pure row space $\\hat{\\mathbf{x}}$ maps to pure column space $\\hat{\\mathbf{b}}$\n",
    "\n",
    "$\\hat{\\mathbf{b}} = \\mathbf{A} \\hat{\\mathbf{x}}$\n",
    "\n",
    "Inserting this\n",
    "\n",
    "$\\mathbf{e} = \\mathbf{b} - \\hat{\\mathbf{b}} = \\mathbf{b} - \\mathbf{A} \\hat{\\mathbf{x}}$\n",
    "\n",
    "The vector $\\mathbf{A} \\mathbf{x}$ is always living in the column space no matter what $\\mathbf{x}$ is constructed of.\n",
    "\n",
    "The vector $\\mathbf{e}$ is orthogonal to column space (since it lives in left null space).\n",
    "\n",
    "So, we know that the inner product must solve to zero\n",
    "\n",
    "$(\\mathbf{A} \\mathbf{x})^\\mathrm{H} \\mathbf{e} = 0 \\rightarrow \\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} (\\mathbf{b} - \\mathbf{A} \\hat{\\mathbf{x}}) = 0$\n",
    "\n",
    "Rearranging yields\n",
    "\n",
    "$\\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{x}^\\mathrm{H} \\mathbf{A}^\\mathrm{H} \\mathbf{A} \\hat{\\mathbf{x}}$\n",
    "\n",
    "and by canceling $\\mathbf{x}^\\mathrm{H} $ the famous normal equation is obtained\n",
    "\n",
    "$\\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{A}^\\mathrm{H} \\mathbf{A} \\hat{\\mathbf{x}}$\n",
    "\n",
    "This can be solved using the inverse of $\\mathbf{A}^\\mathrm{H} \\mathbf{A}$ (this matrix is full rank and therefore invertible)\n",
    "\n",
    "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} (\\mathbf{A}^\\mathrm{H} \\mathbf{A}) \\hat{\\mathbf{x}}$\n",
    "\n",
    "Since $(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} (\\mathbf{A}^\\mathrm{H} \\mathbf{A}) = \\mathbf{I}$ holds, we get the solution in least-squares error sense for $\\mathbf{x}$ in the row space of $\\mathbf{A}$\n",
    "\n",
    "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\hat{\\mathbf{x}}$\n",
    "\n",
    "We find the **left inverse** of $\\mathbf{A}$ as\n",
    "\n",
    "$\\mathbf{A}^{+L} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H}$\n",
    "\n",
    "such that left multiplying\n",
    "\n",
    "$\\mathbf{A}^{+L} \\mathbf{A} = \\mathbf{I}$"
   ]
  },
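  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can verify the normal equation numerically with the $\\hat{\\mathbf{x}}$ obtained from the SVD-based left inverse above (the variable `xhat_check` is introduced only for this added check):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check the normal equation A^H b == A^H A xhat\n",
    "xhat_check = Ali @ b  # LS solution via the SVD-based left inverse\n",
    "print(\n",
    "    \"A^H b == A^H A xhat?\",\n",
    "    np.allclose(A.conj().T @ b, A.conj().T @ A @ xhat_check),\n",
    ")"
   ]
  },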
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Derivation 2\n",
    "\n",
    "Here, the origin of the naming *least squares error* is made more obvious.\n",
    "\n",
    "We set up an optimization problem defining **least** amount of **squared** length of the **error** vector\n",
    "\n",
    "$\\mathrm{min}_\\mathbf{x} ||\\mathbf{e}||^2_2 = \\mathrm{min}_\\mathbf{x} ||\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}||_2^2$\n",
    "\n",
    "We could solve this with help of calculus. But we have a nice tool, i.e. the properties of subspaces, that not requires pages of calculation:\n",
    "\n",
    "We must find the minimum distance from $\\mathbf{b}$ to the column space of $\\mathbf{A}$.\n",
    "\n",
    "This can be only achieved if the error vector $\\mathbf{e}=\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}$ is orthogonal to the column space of $\\mathbf{A}$. \n",
    "\n",
    "This in turn means that $\\mathbf{e}$ must live in the left null space of $\\mathbf{A}$, i.e. $\\mathbf{e} \\in N(\\mathbf{A}^\\mathrm{H})$. \n",
    "\n",
    "By definition of left nullspace we have $\\mathbf{A}^\\mathrm{H} \\mathbf{e} = \\mathbf{0}$. \n",
    "\n",
    "So, let us insert $\\mathbf{e}$ into $\\mathbf{A}^\\mathrm{H} \\mathbf{e} = \\mathbf{0}$:\n",
    "\n",
    "$\\mathbf{A}^\\mathrm{H} (\\mathbf{b} - \\mathbf{A} {\\mathbf{x}}) = \\mathbf{0}$\n",
    "\n",
    "$\\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\mathbf{A}^\\mathrm{H} \\mathbf{A} {\\mathbf{x}}$\n",
    "\n",
    "The optimum $\\mathbf{x}$ which solves the problem is just as above in derivation 1\n",
    "\n",
    "$(\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H} \\mathbf{b} = \\hat{\\mathbf{x}}$\n",
    "\n",
    "And again, we find the **left inverse** of $\\mathbf{A}$ as\n",
    "\n",
    "$\\mathbf{A}^{+L} = (\\mathbf{A}^\\mathrm{H} \\mathbf{A})^{-1} \\mathbf{A}^\\mathrm{H}$\n"
   ]
  },
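  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For cross-checking (an addition, not part of the derivation), a dedicated least-squares solver such as `lstsq()` from `scipy.linalg` should find the same $\\hat{\\mathbf{x}}$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from scipy.linalg import lstsq\n",
    "\n",
    "# lstsq() minimizes ||b - A x||_2^2 directly\n",
    "xhat_lstsq, residues, lstsq_rank, sv = lstsq(A, b)\n",
    "print(\"lstsq solution == Ali @ b?\", np.allclose(xhat_lstsq, Ali @ b))\n",
    "print(\"rank reported by lstsq:\", lstsq_rank)"
   ]
  },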
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Ali_normaleq = inv(A.conj().T @ A) @ A.conj().T\n",
    "\n",
    "xhat = Ali_normaleq @ b  # LS solution in row space\n",
    "print(\"xhat = \", xhat)\n",
    "bhat = A @ xhat  # map to column space\n",
    "\n",
    "# thus this is the projection matrix that maps b (column + left nullspace stuff)\n",
    "# to the column space of A in the LS sense\n",
    "P_CA2 = A @ Ali_normaleq\n",
    "\n",
    "print(\"P_CA == P_CA2 ?\", np.allclose(P_CA, P_CA2))  # check with above result\n",
    "print(\n",
    "    \"P_CA2 @ b == bhat:\", np.allclose(P_CA2 @ b, bhat)\n",
    ")  # check that both outputs are equal"
   ]
  },
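  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, the left inverse built from the normal equation must coincide with the SVD-based left inverse from above (added as a closing consistency check):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# both construction paths must yield the same left inverse\n",
    "print(\n",
    "    \"left inverse via normal equation == left inverse via SVD?\",\n",
    "    np.allclose(Ali_normaleq, Ali),\n",
    ")"
   ]
  },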
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Copyright\n",
    "\n",
    "- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)\n",
    "- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)\n",
    "- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)\n",
    "- feel free to use the notebooks for your own purposes\n",
    "- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year."
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "1743232f157dd0954c61aae30535e75a2972519a625c7e796bafe0cd9a07bf7e"
  },
  "kernelspec": {
   "display_name": "myddasp",
   "language": "python",
   "name": "myddasp"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}