{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sascha Spors,\n",
    "Professorship Signal Theory and Digital Signal Processing,\n",
    "Institute of Communications Engineering (INT),\n",
    "Faculty of Computer Science and Electrical Engineering (IEF),\n",
    "University of Rostock,\n",
    "Germany\n",
    "\n",
    "# Data Driven Audio Signal Processing - A Tutorial with Computational Examples\n",
    "\n",
    "Winter Semester 2024/25 (Master Course #24512)\n",
    "\n",
    "- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture\n",
    "- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise\n",
    "\n",
    "Feel free to contact lecturer frank.schultz@uni-rostock.de"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Exercise 5: Linear Regression Toy Example\n",
    "\n",
    "## Objectives\n",
    "\n",
    "When no assumptions are made about the underlying data-generating process, the model parameters are obtained with pure linear algebra. We therefore link the following concepts\n",
    "- linear regression model (simple line fit)\n",
    "- left inverse of a tall / thin feature matrix with full column rank\n",
    "- least squares (minimizing the sum of squared residuals)\n",
    "- projection matrices onto the four fundamental subspaces\n",
    "\n",
    "within one and the same simple toy example."
   ]
  },
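  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The objectives above can be condensed into a few equations. This is only a brief sketch: the symbols are shorthand for the variables used in the code cells ($\\mathbf{X}$ for `X`, $\\mathbf{y}$ for `y`, $\\hat{\\boldsymbol{\\theta}}$ for `theta_hat`, $\\mathbf{e}$ for `e`), and we assume a tall / thin $M \\times N$ matrix $\\mathbf{X}$ with full column rank $N$.\n",
    "\n",
    "The linear model predicts $\\hat{\\mathbf{y}} = \\mathbf{X} \\boldsymbol{\\theta}$, and least squares chooses the parameters that minimize the squared length of the residual $\\mathbf{e} = \\mathbf{y} - \\mathbf{X} \\boldsymbol{\\theta}$:\n",
    "\n",
    "$$\\hat{\\boldsymbol{\\theta}} = \\arg\\min_{\\boldsymbol{\\theta}} \\lVert \\mathbf{y} - \\mathbf{X} \\boldsymbol{\\theta} \\rVert_2^2$$\n",
    "\n",
    "Setting the gradient with respect to $\\boldsymbol{\\theta}$ to zero yields the normal equations\n",
    "\n",
    "$$\\mathbf{X}^T \\mathbf{X} \\hat{\\boldsymbol{\\theta}} = \\mathbf{X}^T \\mathbf{y}$$\n",
    "\n",
    "Because $\\mathbf{X}$ has full column rank, $\\mathbf{X}^T \\mathbf{X}$ is invertible and\n",
    "\n",
    "$$\\hat{\\boldsymbol{\\theta}} = (\\mathbf{X}^T \\mathbf{X})^{-1} \\mathbf{X}^T \\mathbf{y} = \\mathbf{X}^\\dagger \\mathbf{y}$$\n",
    "\n",
    "The matrix $\\mathbf{X}^\\dagger$ is a left inverse of $\\mathbf{X}$, since $\\mathbf{X}^\\dagger \\mathbf{X} = \\mathbf{I}_N$."
   ]
  },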
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from scipy.linalg import svd, diagsvd, inv, pinv, norm\n",
    "from numpy.linalg import matrix_rank\n",
    "\n",
    "np.set_printoptions(precision=3,\n",
    "                    floatmode='maxprec',\n",
    "                    suppress=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = np.array([[1, 1],\n",
    "              [1, 2],\n",
    "              [1, 3],\n",
    "              [1, 4]])\n",
    "print(X, X.shape, matrix_rank(X))\n",
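    "# y_col = X @ [-1, 2]^T lies in the column space of X\n",
    "y_col = np.array([[1],\n",
    "                  [3],\n",
    "                  [5],\n",
    "                  [7]])\n",
    "print(y_col, y_col.shape)\n",
    "[U, s, Vh] = svd(X)\n",
    "V = Vh.T\n",
    "# X has rank 2, hence U[:,2] and U[:,3] span the left null space of X\n",
    "y_left_null = (-U[:,2]+U[:,3])[:, None]  # [:, None] makes it a (4,1) array\n",
    "print(y_left_null, y_left_null.shape)\n",
    "# data vector = column space part + left null space part\n",
    "y = y_col + y_left_null\n",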
    "print(y, y.shape)\n",
    "M, N = X.shape\n",
    "print(M, N)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_col.T @ y_left_null  # (numerically) zero, as column space is orthogonal to left null space"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# magnitudes of vectors\n",
    "np.sqrt(y_col.T @ y_col), np.sqrt(y_left_null.T @ y_left_null)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X.T @ X  # this is full rank -> invertible"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "inv(X.T @ X)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# left inverse for tall/thin, full column rank X\n",
    "Xli = inv(X.T @ X) @ X.T\n",
    "Xli"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# left inverse via SVD option 1 -> invert singular values & reverse space mapping: U -> V\n",
    "S = diagsvd(s, M, N)\n",
    "Sli = inv(S.T @ S) @ S.T\n",
    "Xli_svd_1 = V @ Sli @ U.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# left inverse via SVD option 2 -> invert singular values & reverse space mapping: U -> V\n",
    "# written as s / s**2 = 1 / s, the inversion of the singular values is easier to see\n",
    "Xli_svd_2 = V @ diagsvd(s / s**2, N, M) @ U.T\n",
    "\n",
    "# all four ways of computing the left inverse must agree\n",
    "np.allclose(Xli_svd_2, Xli_svd_1), np.allclose(Xli, Xli_svd_1), np.allclose(Xli, pinv(X))"
   ]
  },
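  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two SVD-based computations above follow from inserting $\\mathbf{X} = \\mathbf{U} \\mathbf{S} \\mathbf{V}^T$ into the left inverse. A short derivation sketch, assuming the full SVD with orthogonal $\\mathbf{U}$ ($M \\times M$), orthogonal $\\mathbf{V}$ ($N \\times N$) and the $M \\times N$ matrix $\\mathbf{S}$ that carries the singular values $\\sigma_1, \\dots, \\sigma_N > 0$ on its diagonal (as built by `diagsvd(s, M, N)`):\n",
    "\n",
    "$$\\mathbf{X}^T \\mathbf{X} = \\mathbf{V} \\mathbf{S}^T \\mathbf{U}^T \\mathbf{U} \\mathbf{S} \\mathbf{V}^T = \\mathbf{V} (\\mathbf{S}^T \\mathbf{S}) \\mathbf{V}^T$$\n",
    "\n",
    "$$(\\mathbf{X}^T \\mathbf{X})^{-1} \\mathbf{X}^T = \\mathbf{V} (\\mathbf{S}^T \\mathbf{S})^{-1} \\mathbf{V}^T \\mathbf{V} \\mathbf{S}^T \\mathbf{U}^T = \\mathbf{V} (\\mathbf{S}^T \\mathbf{S})^{-1} \\mathbf{S}^T \\mathbf{U}^T$$\n",
    "\n",
    "The $N \\times M$ matrix $(\\mathbf{S}^T \\mathbf{S})^{-1} \\mathbf{S}^T$ has the entries $\\sigma_i / \\sigma_i^2 = 1 / \\sigma_i$ on its diagonal, which is what `diagsvd(s / s**2, N, M)` constructs."
   ]
  },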
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# rarely phrased this way in linear algebra, but this operation actually trains the model\n",
    "theta_hat = Xli @ y\n",
    "theta_hat  # fitted / trained model parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Xli @ y_col  # we get the same theta_hat when using only the column space part of y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Xli @ y_left_null  # this must yield zero, as X.T (and thus Xli) maps the left null space to the zero vector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_hat = X @ theta_hat\n",
    "y_hat  # == y_col"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "e = y - y_hat  # e == y_left_null\n",
    "e, e.T @ e"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# recap: y_hat = y_col, e = y_left_null\n",
    "# y = y_col + y_left_null = y_hat + e\n",
    "# hence\n",
    "y_hat.T @ e  # column space is orthogonal to left null space"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# projection matrix onto the column space: P_col @ y yields y_col\n",
    "P_col = X @ Xli\n",
    "P_col, P_col @ y, y_col"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check P_col projection in terms of SVD\n",
    "S @ Sli, np.allclose(U @ (S @ Sli) @ U.T, P_col)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "P_left_null = np.eye(M) - P_col\n",
    "P_left_null, P_left_null @ y, e"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check P_left_null projection in terms of SVD\n",
    "np.eye(M) - S @ Sli, np.allclose(U @ (np.eye(M) - S @ Sli) @ U.T, P_left_null)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "P_row = Xli @ X  # always the identity matrix for full column rank X\n",
    "P_row, P_row @ theta_hat"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check P_row projection in terms of SVD\n",
    "Sli @ S, np.allclose(V @ (Sli @ S) @ V.T, P_row)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "P_null = np.eye(N) - P_row  # always the zero matrix for full column rank X\n",
    "P_null  # null space contains only the zero vector"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# check P_null projection in terms of SVD\n",
    "np.allclose(V @ (np.eye(N) - Sli @ S) @ V.T, P_null)"
   ]
  },
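  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The four projection matrices checked above can be summarized as follows. This is a compact sketch, written with the left inverse $\\mathbf{X}^\\dagger = (\\mathbf{X}^T \\mathbf{X})^{-1} \\mathbf{X}^T$ (`Xli` in the code) and assuming full column rank of the $M \\times N$ matrix $\\mathbf{X}$; the subscript names simply mirror the variable names in the code:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "\\mathbf{P}_{\\mathrm{col}} &= \\mathbf{X} \\mathbf{X}^\\dagger \\\\\\\\\n",
    "\\mathbf{P}_{\\mathrm{left\\,null}} &= \\mathbf{I}_M - \\mathbf{X} \\mathbf{X}^\\dagger \\\\\\\\\n",
    "\\mathbf{P}_{\\mathrm{row}} &= \\mathbf{X}^\\dagger \\mathbf{X} = \\mathbf{I}_N \\\\\\\\\n",
    "\\mathbf{P}_{\\mathrm{null}} &= \\mathbf{I}_N - \\mathbf{X}^\\dagger \\mathbf{X} = \\mathbf{0}\n",
    "\\end{aligned}$$\n",
    "\n",
    "Hence every data vector splits into two orthogonal parts, $\\mathbf{y} = \\mathbf{P}_{\\mathrm{col}} \\mathbf{y} + \\mathbf{P}_{\\mathrm{left\\,null}} \\mathbf{y} = \\hat{\\mathbf{y}} + \\mathbf{e}$. Least squares keeps the first part as the model output, and the second part is the residual."
   ]
  },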
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure(figsize=(8,4))\n",
    "\n",
    "# residuals\n",
    "for m in range(M):\n",
    "    plt.plot([X[m, 1], X[m, 1]],\n",
    "             [y[m, 0], y_col[m, 0]], lw=3, label='error '+str(m+1))\n",
    "# data\n",
    "plt.plot(X[:,1], y, 'C4x',\n",
    "         ms=10, mew=3,\n",
    "         label='data')\n",
    "# fitted line\n",
    "plt.plot(X[:,1], theta_hat[0] * X[:,0] + theta_hat[1] * X[:,1], 'k', label='least squares fit (interpolation)')\n",
    "x = np.linspace(0, 1, 10)\n",
    "plt.plot(x, theta_hat[0] + theta_hat[1] * x, 'C7:', label='least squares fit (extrapolation)')\n",
    "x = np.linspace(4, 5, 10)\n",
    "plt.plot(x, theta_hat[0] + theta_hat[1] * x, 'C7:')\n",
    "\n",
    "plt.xticks(np.arange(6))\n",
    "plt.yticks(np.arange(11)-1)\n",
    "plt.xlim(0, 5)\n",
    "plt.ylim(-1, 9)\n",
    "plt.xlabel('feature x1')\n",
    "plt.ylabel('y')\n",
    "plt.title(r'minimizing the sum of squared errors yields $\\hat{\\theta}=[-1,2]^T$ -> intercept: -1, slope: +2')\n",
    "plt.legend()\n",
    "plt.grid(True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Copyright\n",
    "\n",
    "- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)\n",
    "- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)\n",
    "- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)\n",
    "- feel free to use the notebooks for your own purposes\n",
    "- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "myddasp",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}