{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "abae8153",
   "metadata": {},
   "source": [
    "Sascha Spors,\n",
    "Professorship Signal Theory and Digital Signal Processing,\n",
    "Institute of Communications Engineering (INT),\n",
    "Faculty of Computer Science and Electrical Engineering (IEF),\n",
    "University of Rostock,\n",
    "Germany\n",
    "\n",
    "# Data Driven Audio Signal Processing - A Tutorial with Computational Examples\n",
    "\n",
    "Master Course #24512\n",
    "\n",
    "- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture\n",
    "- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise\n",
    "\n",
    "Feel free to contact lecturer frank.schultz@uni-rostock.de"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11c6393a",
   "metadata": {},
   "source": [
    "# Principal Component Analysis (PCA) of Simple Audio Features\n",
    "\n",
    "- real world data from [exercise12_MusicGenreClassification.ipynb](exercise12_MusicGenreClassification.ipynb)\n",
    "- we calculate a PCA of an audio feature matrix\n",
    "- Part I: to make it still easy to visualize, we choose only three original features to indicate the PC directions in a graphical approach \n",
    "- Part II: the full data matrix is put into a **correlation matrix PCA** to inspect and interprete the PC loading matrix as correlations between original features and PC features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2fe4b1ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy.linalg import svd, diagsvd\n",
    "# from statsmodels.multivariate.pca import PCA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d27286a2",
   "metadata": {},
   "outputs": [],
   "source": [
    "matplotlib_widget_flag = True"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1adbe598",
   "metadata": {},
   "outputs": [],
   "source": [
    "if matplotlib_widget_flag:\n",
    "    %matplotlib widget"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c9d077aa",
   "metadata": {},
   "source": [
    "## Part I"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0cea135",
   "metadata": {},
   "outputs": [],
   "source": [
    "audiofolder = \"./audio_ex12/\"\n",
    "with np.load(audiofolder + \"/_raw_data_large.npz\") as data:\n",
    "    X = data[\"Xdata\"]\n",
    "    Y = data[\"Ydata\"]\n",
    "# 0...true_peak_lin\n",
    "# 1...true_peak_lin2\n",
    "# 2...true_peak_db\n",
    "# 3...rms_lin2\n",
    "# 4...rms_lin\n",
    "# 5...rms_db\n",
    "# 6...lufs_lin\n",
    "# 7...lufs_lin2\n",
    "# 8...lufs_db\n",
    "# 9...crest_lin\n",
    "# 10...crest_db\n",
    "# 11...low_high_ratio\n",
    "X = np.squeeze([X[:,0], X[:,5], X[:,8]]).T\n",
    "# feel free to play around with other feature matrices\n",
    "# a PCA on the whole X matrix yields very high variance explanation using\n",
    "# just PC 1 score, so the chosen, simple features 0...11 seem to be redundant\n",
    "# somehow?!, see the interpretation of the loadings at the end of the notebook\n",
    "\n",
    "# the PCA assumes mean-free columns, so we explicitly design X this way\n",
    "X = X - np.mean(X, axis=0)\n",
    "# we also normalize by standard deviation,\n",
    "# then each column of X has unit variance and total variance of X is 3: \n",
    "if True:\n",
    "    X = X / np.std(X, axis=0, ddof=1)\n",
    "    print(np.var(X, axis=0, ddof=1), np.sum(np.var(X, axis=0, ddof=1)) )\n",
    "\n",
    "N, F = X.shape[0], X.shape[1]\n",
    "N, F"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e35460a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "[U, s, Vh] = svd(X, full_matrices=False)\n",
    "S = diagsvd(s, F, F)\n",
    "V = Vh.conj().T\n",
    "\n",
    "pcf = U @ S  # known as PC features, PC scores \n",
    "pcl = V  # known as PC loadings\n",
    "# the normalisation scheme here uses V.T@V=I for the PC loadings\n",
    "# cf. the discussion for eq. (2.8) in\n",
    "# https://doi.org/10.1098/rsta.2015.0202\n",
    "# note that the wording in PCA world is rather inconsistent and misleading\n",
    "# so we make sure that our math and our understanding is consistent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06d2173f",
   "metadata": {},
   "outputs": [],
   "source": [
    "var_pcf = np.var(pcf, axis=0, ddof=1)\n",
    "std_pcf = np.std(pcf, axis=0, ddof=1)\n",
    "print(var_pcf / np.sum(var_pcf) * 100)  # PCF 1 explains already 95% of data's variance\n",
    "print(np.cumsum(var_pcf) / np.sum(var_pcf) * 100)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04eb7f9e",
   "metadata": {},
   "source": [
    "## Plot Directions of the PCs and Original Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3d0d6363",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure()\n",
    "fig.set_size_inches(9,3)\n",
    "\n",
    "ax = fig.add_subplot(131)\n",
    "ax.plot(X[:, 0], X[:, 1], \".\", color=\"gray\", ms=1)\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"original feature 1: true_peak_lin\")\n",
    "ax.set_ylabel(\"original feature 2: rms_db\")\n",
    "ax.grid(True)\n",
    "\n",
    "ax = fig.add_subplot(132)\n",
    "ax.plot(X[:, 0], X[:, 2], \".\", color=\"gray\", ms=1)\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"original feature 1: true_peak_lin\")\n",
    "ax.set_ylabel(\"original feature 3: lufs_db\")\n",
    "ax.grid(True)\n",
    "\n",
    "ax = fig.add_subplot(133)\n",
    "ax.plot(X[:, 1], X[:, 2], \".\", color=\"gray\", ms=1)\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"original feature 2: rms_db\")\n",
    "ax.set_ylabel(\"original feature 3: lufs_db\")\n",
    "ax.grid(True)\n",
    "\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5dd0957",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111, projection=\"3d\")\n",
    "ax.plot(X[:, 0], X[:, 1], X[:, 2], \".\", color=\"gray\", ms=1)\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"original feature 1: true_peak_lin\")\n",
    "ax.set_ylabel(\"original feature 2: rms_db\")\n",
    "ax.set_zlabel(\"original feature 3: lufs_db\")\n",
    "ax.set_title(\"data cloud in original coordinate system\")\n",
    "ax.grid(True)\n",
    "ax.azim = -44\n",
    "ax.elev = 28"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48000c5b",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111, projection=\"3d\")\n",
    "\n",
    "ax.plot(X[:, 0], X[:, 1], X[:, 2], \".\", color=\"gray\", ms=1)\n",
    "\n",
    "# draw directions of PC axis\n",
    "# length is chosen to be the 3*sqrt(variance)=3*std of the PC features\n",
    "# note that V not necessarily spans a right-hand system\n",
    "col = ['C3', 'C2', 'C0']\n",
    "for i in range(F):\n",
    "    ax.plot(\n",
    "        [0, 3*std_pcf[i] * V[0, i]],\n",
    "        [0, 3*std_pcf[i] * V[1, i]],\n",
    "        [0, 3*std_pcf[i] * V[2, i]],\n",
    "        col[i], lw=3,\n",
    "        label=f\"{'3-std-weighted   PC '}{i+1}\" f\"{'=col '}{i+1} in V\")\n",
    "\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"original feature 1: true_peak_lin\")\n",
    "ax.set_ylabel(\"original feature 2: rms_db\")\n",
    "ax.set_zlabel(\"original feature 3: lufs_db\")\n",
    "ax.set_title(\"data cloud in original coordinate system\")\n",
    "ax.legend()\n",
    "ax.grid(True)\n",
    "ax.azim = -44\n",
    "ax.elev = 28"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a392e7c1",
   "metadata": {},
   "source": [
    "## Plot Data in PC Coordinate System"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1037277",
   "metadata": {},
   "outputs": [],
   "source": [
    "# rather than rotating/flipping of original data into a new coordinate system,\n",
    "# we can do a dedicated SVD on pcs\n",
    "# because V_pcs is a diagonal matrix with +-1,\n",
    "# i.e. indicating +-x, +-y and +-z axis for a 3D plot\n",
    "[U_pcf, s_pcf, Vh_pcf] = svd(pcf, full_matrices=False)\n",
    "S_pcf = diagsvd(s_pcf, F, F)\n",
    "V_pcf = Vh_pcf.T\n",
    "\n",
    "# make sure that correct U/V pair for pcf holds by\n",
    "# introducing corresponding reflections\n",
    "for i in range(F):\n",
    "    if not np.allclose((U_pcf @ S_pcf)[:, i], pcf[:, i]):\n",
    "        U_pcf[:,i] *= -1\n",
    "        V_pcf[:,i] *= -1\n",
    "# then V_pcs indicates the correct directions in the PC coordinate system\n",
    "print(np.allclose(U_pcf @ S_pcf, pcf))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ab9569b",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111, projection=\"3d\")\n",
    "\n",
    "ax.plot(pcf[:, 0], pcf[:, 1], pcf[:, 2], \"x\", color=\"gray\", ms=1)\n",
    "\n",
    "# draw directions of PC axis\n",
    "# length is chosen to be the 3*sqrt(variance)=3*std of the PC features\n",
    "# note that V not necessarily spans a right-hand system\n",
    "col = ['C3', 'C2', 'C0']\n",
    "for i in range(F):\n",
    "    ax.plot(\n",
    "        [0, 3*std_pcf[i] * V_pcf[0, i]],\n",
    "        [0, 3*std_pcf[i] * V_pcf[1, i]],\n",
    "        [0, 3*std_pcf[i] * V_pcf[2, i]],\n",
    "        col[i], lw=3,\n",
    "        label=f\"{'3-std-weighted PC '}{i+1}\")\n",
    "\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlabel(\"PC 1 direction\")\n",
    "ax.set_ylabel(\"PC 2 direction\")\n",
    "ax.set_zlabel(\"PC 3 direction\")\n",
    "ax.set_title(\"data in PC coordinate system\")\n",
    "ax.legend()\n",
    "ax.grid(True)\n",
    "ax.azim = -37\n",
    "ax.elev = 28"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "175d6f16",
   "metadata": {},
   "source": [
    "## Truncated SVD / Low-Rank Approximation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a2b076a",
   "metadata": {},
   "outputs": [],
   "source": [
    "r_des = 2  # 1 or 2 \n",
    "\n",
    "# SVD mindset\n",
    "# sum of outer products, i.e. sum of rank-1 matrices\n",
    "X_rank_red = np.zeros((N, F))\n",
    "for i in range(r_des):\n",
    "    X_rank_red += s[i] * np.outer(U[:, i], V[:, i])\n",
    "\n",
    "# PCA mindset:\n",
    "# we might also use the PC features and\n",
    "# set the intended PC loadings to zero\n",
    "X_rank_red2 = np.zeros((N, F))\n",
    "pcl_rank_red = np.copy(pcl)\n",
    "pcl_rank_red[:, r_des:] = 0\n",
    "X_rank_red2 = pcf @ pcl_rank_red.T\n",
    "print(np.allclose(X_rank_red, X_rank_red2))\n",
    "pcl_rank_red"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed213620",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111, projection=\"3d\")\n",
    "\n",
    "ax.plot(X_rank_red2[:, 0],\n",
    "        X_rank_red2[:, 1],\n",
    "        X_rank_red2[:, 2], \"x\", color=\"gray\", ms=1)\n",
    "\n",
    "# draw directions of PC axis\n",
    "# length is chosen to be the 3*sqrt(variance)=3*std of the PC features\n",
    "# note that V not necessarily spans a right-hand system\n",
    "col = ['C3', 'C2', 'C0']\n",
    "for i in range(F):\n",
    "    ax.plot(\n",
    "        [0, 3*std_pcf[i] * V[0, i]],\n",
    "        [0, 3*std_pcf[i] * V[1, i]],\n",
    "        [0, 3*std_pcf[i] * V[2, i]],\n",
    "        col[i], lw=1,\n",
    "        label=f\"{'3-std-weighted    PC '}{i+1}\" f\"{' score=col '}{i+1} in V\")\n",
    "\n",
    "ax.axis(\"square\")\n",
    "ax.set_xlim(-6, 6)\n",
    "ax.set_ylim(-6, 6)\n",
    "ax.set_zlim(-6, 6)\n",
    "ax.set_xlabel(\"new feature 1 after rank reduction\")\n",
    "ax.set_ylabel(\"new feature 2 after...\")\n",
    "ax.set_zlabel(\"new feature 3 after...\")\n",
    "if r_des==1:\n",
    "    ax.set_title(\"rank-{0:d} approximation of data (line in 3D)\".format(r_des))      \n",
    "elif r_des==2:\n",
    "    ax.set_title(\"rank-{0:d} approximation of data (plane in 3D)\".format(r_des))\n",
    "ax.legend()\n",
    "ax.grid(True)\n",
    "ax.azim = -44\n",
    "ax.elev = 28"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "245a01b2",
   "metadata": {},
   "source": [
    "## Dimensionality Reduction after PCA"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "120d1942",
   "metadata": {},
   "outputs": [],
   "source": [
    "dim_des = 2 # 1 or 2\n",
    "# PCA mindset\n",
    "X_dim_red = np.zeros((N, dim_des))\n",
    "X_dim_red = pcf[:, :dim_des]\n",
    "print('original matrix shape     ', X.shape)\n",
    "print('matrix shape after dim red', X_dim_red.shape)\n",
    "\n",
    "X_dim_red_plot = np.zeros((N, F))\n",
    "X_dim_red_plot[:, :dim_des] = pcf[:, :dim_des]\n",
    "\n",
    "# check with SVD mindset\n",
    "print(np.allclose((U @ S)[:, :dim_des], X_dim_red))\n",
    "print(np.allclose(X @ V[:, :dim_des], X_dim_red))\n",
    "V[:, :dim_des]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bb5de466",
   "metadata": {},
   "outputs": [],
   "source": [
    "# the dimensionality reduction actually yields a matrix with smaller\n",
    "# dimension, cf. shape of X_dim_red\n",
    "# for convenience we plot data here in 3D plot\n",
    "fig = plt.figure()\n",
    "ax = fig.add_subplot(111, projection=\"3d\")\n",
    "\n",
    "ax.plot(\n",
    "    X_dim_red_plot[:, 0],\n",
    "    X_dim_red_plot[:, 1],\n",
    "    X_dim_red_plot[:, 2],\n",
    "    \"x\", color=\"gray\", ms=1)\n",
    "\n",
    "# draw directions of PC axis\n",
    "# length is chosen to be the 3*sqrt(variance)=3*std of the PC features\n",
    "# note that V not necessarily spans a right-hand system\n",
    "col = ['C3', 'C2', 'C0']\n",
    "for i in range(F):\n",
    "    ax.plot(\n",
    "        [0, 3*std_pcf[i] * V_pcf[0, i]],\n",
    "        [0, 3*std_pcf[i] * V_pcf[1, i]],\n",
    "        [0, 3*std_pcf[i] * V_pcf[2, i]],\n",
    "        col[i], lw=1,\n",
    "        label=f\"{'3-std-weighted PC '}{i+1}\")\n",
    "\n",
    "ax.axis(\"square\")\n",
    "#ax.set_xlim(-6, 6)\n",
    "#ax.set_ylim(-6, 6)\n",
    "#ax.set_zlim(-6, 6)\n",
    "ax.set_xlabel(\"PC 1 direction\")\n",
    "ax.set_ylabel(\"PC 2 direction\")\n",
    "ax.set_zlabel(\"PC 3 direction\")\n",
    "if dim_des==1:\n",
    "    plt.title(\"PCA data dimensionality reduction to {0:d}D (line in 1D)\".format(dim_des))\n",
    "elif dim_des==2:\n",
    "    plt.title(\"PCA data dimensionality reduction to {0:d}D (plane in 2D)\".format(dim_des))\n",
    "ax.legend()\n",
    "ax.grid(True)\n",
    "ax.azim = -37\n",
    "ax.elev = 28\n",
    "\n",
    "print(\"reduced to dimension\", X_dim_red.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ce117847",
   "metadata": {},
   "source": [
    "## Part II: Check / Interpret PCA Loadings for Full Data Matrix\n",
    "\n",
    "We use a correlation matrix PCA here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a1b34988",
   "metadata": {},
   "outputs": [],
   "source": [
    "with np.load(audiofolder + \"/_raw_data_large.npz\") as data:\n",
    "    X = data[\"Xdata\"]\n",
    "    Y = data[\"Ydata\"]\n",
    "N, F = X.shape[0], X.shape[1]\n",
    "N, F"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "768e26b4",
   "metadata": {},
   "outputs": [],
   "source": [
    "X = X - np.mean(X, axis=0)\n",
    "X = X / np.std(X, axis=0, ddof=1)\n",
    "np.sum(np.mean(X, axis=0)),  np.std(X, axis=0, ddof=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "795b30f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# get PCA\n",
    "# the normalisation scheme for part II is known as 'correlation matrix PCA'\n",
    "# cf. the discussion for eq. (2.8) in\n",
    "# https://doi.org/10.1098/rsta.2015.0202\n",
    "# using this normalisation scheme, the loading matrix is a correlation\n",
    "# matrix to show correlation between the original features and the PC features\n",
    "# this is a nice way to interprete the meaning of the linear combinations\n",
    "# and potential meanings of the PC features\n",
    "Z = X / np.sqrt(N-1)\n",
    "R = Z.T @ Z\n",
    "[U, s, Vh] = svd(Z, full_matrices=False)\n",
    "V, S = Vh.T, diagsvd(s, F, F)\n",
    "# adapt polarity for simpler interpretation\n",
    "V[:,0] *= -1\n",
    "U[:,0] *= -1\n",
    "V[:,1] *= -1\n",
    "U[:,1] *= -1\n",
    "\n",
    "PC_Features = U  # principal component features (PCF), aka principal component scores (PCS) \n",
    "PC_Loadings = V @ S  # principal component loadings\n",
    "# cf. due to Z = U S V.T -> (S @ V.T).T\n",
    "np.allclose(Z, PC_Features @ PC_Loadings.T)  # check correct SVD matrix factorisation\n",
    "# note that other normalisation schemes exist for the PCA\n",
    "# e.g. in data science / machine learning we typically see this:\n",
    "# PC_Features = U @ S\n",
    "# PC_Loadings = V\n",
    "# then other post-processing / interpretation is needed\n",
    "# this is precisely the scheme that was used in Part I"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2cd3878d",
   "metadata": {},
   "outputs": [],
   "source": [
    "variance = np.diag(S.T@S)\n",
    "total_variance = np.linalg.trace(S.T@S)\n",
    "total_variance, np.sum(variance)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1caab578",
   "metadata": {},
   "outputs": [],
   "source": [
    "# variance / cumsum variance as percentage\n",
    "variance/total_variance*100, np.cumsum(variance)/total_variance*100\n",
    "# the first four PC features explain 96.96% of data's variance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16a015c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.figure()\n",
    "plt.subplot(2,1,1)\n",
    "plt.plot(np.arange(F)+1, variance, 'ok--')\n",
    "plt.ylabel('eigenwert of R = variance')\n",
    "plt.title('Scree Plot')\n",
    "plt.subplot(2,1,2)\n",
    "plt.plot(np.arange(F)+1, np.cumsum(variance), 'ok--')\n",
    "plt.xlabel('eigenwert index')\n",
    "plt.ylabel('cumsum variance')\n",
    "plt.grid(True)\n",
    "variance, np.cumsum(variance)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd01a8da",
   "metadata": {},
   "outputs": [],
   "source": [
    "# original features\n",
    "# 0...true_peak_lin\n",
    "# 1...true_peak_lin2\n",
    "# 2...true_peak_db\n",
    "# 3...rms_lin2\n",
    "# 4...rms_lin\n",
    "# 5...rms_db\n",
    "# 6...lufs_lin\n",
    "# 7...lufs_lin2\n",
    "# 8...lufs_db\n",
    "# 9...crest_lin\n",
    "# 10...crest_db\n",
    "# 11...low_high_ratio\n",
    "PC_Loadings[:,0:4]\n",
    "# original features 0...8 load very high and almost equally onto the first\n",
    "# PC feature (which explains ca. 70% variance), so this PC feature is about\n",
    "# peak/rms power either linear or dB\n",
    "#\n",
    "# the crest factor related features 9 and 10 load very high onto the second\n",
    "# PC feature (which explains ca. 13% variance), so this PC feature is crest\n",
    "# factor specific\n",
    "#\n",
    "# the original feature 11...low_high_ratio almost exclusively loads onto the\n",
    "# third PC feature (which explains ca. 8.7% variance)\n",
    "#\n",
    "# the fourth PC feature (which explains ca. 4.6% variance) is about separating\n",
    "# linear (-) and dB (+) related features due to polarity changes in the correlation,\n",
    "# this holds especially for the peak/rms power related original features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b619e92b",
   "metadata": {},
   "outputs": [],
   "source": [
    "pcf_label = ['PCF1', 'PCF2', 'PCF3', 'PCF4', 'PCF5', 'PCF6', 'PCF7', 'PCF8', 'PCF9', 'PCF10', 'PCF11', 'PCF12']\n",
    "task_label = ['true peak lin', 'true peak lin2', 'true peak db',\n",
    "              'rms lin2', 'rms lin', 'rms db', 'lufs lin',\n",
    "              'lufs lin2', 'lufs db', 'crest lin', 'crest db',\n",
    "              'low high ratio']\n",
    "\n",
    "cmap = plt.get_cmap('Spectral_r', 8)\n",
    "fig = plt.figure(figsize=(9,6))\n",
    "ax = fig.add_subplot(111)\n",
    "cax = ax.matshow(PC_Loadings, cmap=cmap, vmin=-1, vmax=+1)\n",
    "fig.colorbar(cax)\n",
    "ax.set_xticks(np.arange(len(pcf_label)))\n",
    "ax.set_yticks(np.arange(len(task_label)))\n",
    "ax.set_xticklabels(pcf_label)\n",
    "ax.set_yticklabels(task_label)\n",
    "ax.set_title('Loading Matrix = Audio Feature x loads to PC-Feature y')\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "28fb69ab",
   "metadata": {},
   "source": [
    "## Copyright\n",
    "\n",
    "- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)\n",
    "- feel free to use the notebooks for your own purposes\n",
    "- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)\n",
    "- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)\n",
    "- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "data-driven-audio-signal-processing-exercise",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}