{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem Set 1 (110 points) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Important information\n", "We provide signatures of the functions that you have to implement. Make sure you follow the signatures defined, otherwise your coding solutions will not be graded.\n", "\n", "Read [homework rules](https://nbviewer.jupyter.org/github/oseledets/nla2018/blob/master/hw.pdf) carefully. If you do not follow it you will likely be penalized." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 1 (Python demo) 40 pts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data preparation (10 pts)\n", "\n", "* First of all download $\\verb|.wav|$ file with starcraft sound from [here](https://github.com/oseledets/nla2018/tree/master/psets). Load it in python and play using the following functions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.linalg import toeplitz\n", "import numpy as np\n", "import math\n", "import scipy.io.wavfile as wav\n", "import matplotlib.pyplot as plt\n", "from IPython.display import Audio\n", "%matplotlib notebook" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# reading\n", "rate, audio = wav.read(\"TMaRdy00.wav\")\n", "\n", "# plotting\n", "plt.plot(audio)\n", "plt.ylabel(\"Amplitude\")\n", "plt.xlabel(\"Time\")\n", "plt.title(\"You wanna piece of me, boy?\")\n", "plt.show()\n", "\n", "# playing\n", "Audio(audio, rate=rate)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our next goal is to process this signal by multiplying it by a special type of matrix (convolution operation) that will smooth the signal. \n", "\n", "* (5 pts) Before processing this file let us estimate what size of matrix we can afford. Let $N$ be the size of the signal. Estimate analytically memory in megabytes required to store dense square matrix of size $N\\times N$ to fit in your operation memory and print this number. Cut the signal so that you will not have swap (overflow of the operation memory). **Note:** Cut the signal by taking every p-th number in array: ```signal[::p]```. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# N = ...\n", "print(N)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* (5 pts) Write a function \n", "```python\n", "def gen_toeplitz(N, alpha): \n", " return T\n", "```\n", "that outputs matrix $T$: $$T_{ij} = \\sqrt{\\frac{\\alpha}{\\pi}}e^{-\\alpha (i-j)^2}, \\quad i,j=1,\\dots,N$$ as numpy array. Avoid using loops or lists! The function [np.meshgrid](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.meshgrid.html) will be helpful for this task.\n", "**Note:** matrices that depend only on difference of indices: $T_{ij} \\equiv T_{i-j}$ are called **Toeplitz**. Toeplitz matrix-by-vector multiplication is **convolution** since it can be written as $$y_i = \\sum_{j=1}^N T_{i-j} x_j.$$ Convolutions can be computed faster than $\\mathcal{O}(N^2)$ complexity using Fast Fourier transform (will be covered later in our course, no need to implement it here)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# INPUT: N - integer (positive), alpha - float (positive)\n", "# OUTPUT: T - np.array (shape: NxN)\n", "\n", "def gen_toeplitz(N, alpha):\n", " # Your code is here\n", " return T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convolution (10 pts)\n", "\n", "* (5 pts) Write a function ```convolution``` (see below)\n", "that takes the signal you want to convolve and multiply it by Toeplitz matrix T (for matvec operations use @ symbol). " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# INPUT: signal - np.array (shape: Nx1), N - int (positive), alpha - float (positive)\n", "# OUTPUT: convolved_signal - np.array (shape: Nx1)\n", "\n", "def convolution(signal, N, alpha):\n", " # Your code is here\n", " return convolved_signal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* (3 pts) Plot the first $100$ points of the result and the first $100$ points of your signal on the same figure. Do the same plots for $\\alpha = \\frac{1}{5}$, $\\alpha = \\frac{1}{100}$ using ```plt.subplots``` in matplotlib. Each subplot should contain first $100$ points of initial and convolved signals for some $\\alpha$. Make sure that you got results that look like smoothed initial signal.\n", "\n", "* (2 pts) Play the resulting signal. In order to do so you should also scale the frequency (rate), which is one of the inputs in `Audio`. \n", "Note that you cannot play a signal which is too small." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deconvolution (20 pts)\n", "\n", "Given a convolved signal $y$ and an initial signal $x$ our goal now is to recover $x$ by solving the system\n", "$$\n", " y = Tx.\n", "$$\n", "To do so we will run iterative process\n", "$$\n", " x_{k+1} = x_{k} - \\tau_k (Tx_k - y), \\quad k=1,2,\\dots\n", "$$\n", "starting from zero vector $x_0$. There are different ways how to define parameters $\\tau_k$.\n", "Different choices lead to different methods (e.g. Richardson iteration, Chebyshev iteration, etc.).\n", "This topic will be covered in details later in our course.\n", "\n", "To get some intuition why this process converges to the solution of $Tx=y$, we can consider the following. Let us note that if $x_k$ converges to some limit $x$, then so does $x_{k+1}$. Taking $k\\to \\infty$ we arrive at $x = x - \\tau (Tx - y)$ and hence $x$ is the solution of $Tx = y$. \n", "\n", "Another important point is that iterative process requires only matrix-vector porducts $Tx_k$ on each iteration instead of the whole matrix. In this problem we, however, work with the full matrix, but keep in mind, that convolution can be done efficiently without storing the whole matrix.\n", "\n", "* (5 pts) For each $k$ choose paremeter $\\tau_k$ such that the residual $r_{k+1}=Tx_{k+1} - y$ is minimal possible (*line search* with search direction $r_k$):\n", "$$\n", " \\|Tx_{k+1} - y\\|_2 \\to \\min_{\\tau_k}\n", "$$\n", "found analytically. The answer to this bullet is a derivation of $\\tau_k$. The parameter $\\tau_k$ should be expressed in terms of residuals $r_k = T x_k - y$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your solution is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* (10 pts) Write a function ```iterative```\n", "that outputs accuracy –– a numpy array of relative errors $\\big\\{\\frac{\\|x_{k+1} - x\\|_2}{\\|x\\|_2}\\big\\}$ after ```num_iter``` iterations using $\\tau_k$ from the previous task. **Note:** The only loop you are allowed to use here is a loop for $k$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# INPUT: N - int (positive), alpha - float (positive), num_iter - integer (positive), \n", "# y - np.array (shape: Nx1, convolved signal), s - np.array (shape: Nx1, original signal)\n", "# OUTPUT: rel_error - np.array size (num_iter x 1)\n", "\n", "def iterative(N, num_iter, y, s, alpha):\n", " # Your code is here\n", " return rel_error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* (2 pts) Set ```num_iter=1000```, ```x=s[::20]``` and do a convergence plot for $\\alpha = \\frac{1}{2}$ and $\\alpha = \\frac{1}{5}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your plots are here " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* (3 pts) Set ```x=s[::20]```, ```num_iter=1000``` and $\\alpha=\\frac{1}{5}$. Explain what happens with the convergence if you add small random noise of amplitude $10^{-3}\\max(x)$ to $y$. The answer to this question should be an explanation supported by plots and/or tables." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 2 (Theoretical tasks) 45 pts\n", "\n", "_1._\n", "- (5 pts) Prove that $\\| U A \\|_F = \\| A U \\|_F = \\| A \\|_F$ for any unitary matrix $U$.\n", "- (5 pts) Prove that $\\| Ux \\|_2 = \\| x \\|_2$ for any $x$ iff $U$ is unitary.\n", "- (5 pts) Prove that $\\| U A \\|_2 = \\| A U \\|_2 = \\| A \\|_2$ for any unitary $U$.\n", " \n", "_2._\n", "- (5 pts) Using the results from the previous subproblem, prove that $\\| A \\|_F \\le \\sqrt{\\mathrm{rank}(A)} \\| A \\|_2$. _Hint:_ SVD will help you.\n", "- (5 pts) Show that for any $m, n$ and $k \\le \\min(m, n)$ there exists $A \\in \\mathbb{R}^{m \\times n}: \\mathrm{rank}(A) = k$, such that $\\| A \\|_F = \\sqrt{\\mathrm{rank}(A)} \\| A \\|_2$. In other words, show that the previous inequality is not strict.\n", "- (5 pts) Prove that if $\\mathrm{rank}(A) = 1$, then $\\| A \\|_F = \\| A \\|_2$.\n", "- (5 pts) Prove that $\\| A B \\|_F \\le \\| A \\|_2 \\| B \\|_F$.\n", "\n", "_3._ \n", "* (3 pts) Differentiate with respect to $A$ the function\n", "$$\n", "f(A) = \\mathrm{sin}(x^\\top A B C D x),\n", "$$\n", "where $x$ is a vector and $A, B, C, D$ are square matrices.\n", "\n", "* (7 pts) Differentiate with respect to $y, A, X$ the function\n", "$$f(y, A, X) = \\mathrm{tr}(\\mathrm{diag}(y) A X),$$\n", "where $y \\in \\mathbb{R}^n$ and $A, X \\in \\mathbb{R}^{n \\times n}$. Here \n", "\n", "$$\n", "\\mathrm{diag}(y)_{i, j} = \n", " \\begin{cases}\n", " y_i, & \\text{if}\\ i = j \\\\\n", " 0, & \\text{otherwise}\n", " \\end{cases}\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your solution is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 3 (Strassen algorithm) 15 pts\n", "\n", "_1._ (3 pts) Implement the naive algorithm for squared matrix multiplication with explicit β€œfor” cycles." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def naive_multiplication(A, B):\n", " \"\"\"\n", " Implement naive matrix multiplication with explicit for cycles\n", " \n", " Parameters: Matrices A, B\n", " \n", " Returns: Matrix C = AB\n", " \"\"\"\n", "# Your code is here \n", " return C" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_2._ (7 pts) Implement the Strassen algorithm." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def strassen(A, B):\n", " \"\"\"\n", " Implement Strassen algorithm for matrix multiplication \n", " \n", " Parameters: Matrices A, B\n", " \n", " Returns: Matrix C = AB\n", " \"\"\"\n", "# Your code is here\n", " return C" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_3._ (5 pts) Compare three approaches: naive multiplication, Strassen algorithm and standard NumPy function. \n", "Provide a plot in log-scale of dependence between the matrix size and the runtime of multiplication. You will have three lines, do not forget to add legend, axis labels and other attributes (see our [requirements](https://nbviewer.jupyter.org/github/oseledets/nla2018/blob/master/hw.pdf)) \n", "Consider the matrix size in the range of 100 to 700 with step 100, e.g. $n=100, 200,\\ldots, 700$. \n", "Justify the results theoretically (e.g., use the known formulas for total multiplication complexity of naive and Strassen algorithms)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 4 (SVD) 10 pts\n", "In this assignment you are supposed to study how SVD could be used in image compression." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_1._ (2 pts) Compute the singular values of some predownloaded image (via the code provided below) and plot them. Do not forget to use logarithmic scale." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "from PIL import Image, ImageDraw\n", "import requests\n", "import numpy as np\n", "\n", "\n", "url = 'https://pbs.twimg.com/profile_images/1658625695/my_photo_400x400.jpg' # Ivan\n", "url = 'https://i.chzbgr.com/full/5536320768/h88BAB406/' # Insight\n", "# url = '' # your favorite picture, please!\n", "\n", "face_raw = Image.open(requests.get(url, stream=True).raw)\n", "face = np.array(face_raw).astype(np.uint8)\n", "\n", "plt.imshow(face_raw)\n", "plt.xticks(())\n", "plt.yticks(())\n", "plt.title('Original Picture')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_2._ (3 pts) Complete a function ```compress```, that performs SVD and truncates it (using $k$ singular values/vectors). See the prototype below. \n", "\n", "Note, that in colourful case you have to split your image to channels and work with matrices corresponding to different channels separately.\n", "\n", "Plot approximate reconstructed image $M_\\varepsilon$ of your favorite image such that $rank(M_\\varepsilon) = 5, 20, 50$ using ```plt.subplots```." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def compress(image, k):\n", " \"\"\"\n", " Perform svd decomposition and truncate it (using k singular values/vectors)\n", " \n", " Parameters: \n", " image (np.array): input image (probably, colourful)\n", " \n", " k (int): approximation rank\n", " \n", " --------\n", " Returns:\n", " reconst_matrix (np.array): reconstructed matrix (tensor in colourful case)\n", " \n", " s (np.array): array of singular values \n", " \"\"\"\n", "# Your code is here\n", " return reconst_matrix, s" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_3._ (3 pts) Plot the following two figures for your favorite picture\n", "* How relative error of approximation depends on the rank of approximation?\n", "* How compression rate in terms of storing information ((singular vectors + singular numbers) / total size of image) depends on the rank of approximation?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_4._ (2 pts) Consider the following two pictures. Compute their approximations (with the same rank, or relative error). What do you see? Explain results." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "url1 = 'http://sk.ru/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-60-11/skoltech1.jpg'\n", "url2 = 'http://www.simpsoncrazy.com/content/characters/poster/bottom-right.jpg'\n", "image_raw1 = Image.open(requests.get(url1, stream=True).raw)\n", "image_raw2 = Image.open(requests.get(url2, stream=True).raw)\n", "\n", "image1 = np.array(image_raw1).astype(np.uint8)\n", "image2 = np.array(image_raw2).astype(np.uint8)\n", "\n", "plt.figure(figsize=(18, 6))\n", "plt.subplot(1,2,1)\n", "plt.imshow(image_raw1)\n", "plt.title('One Picture')\n", "plt.xticks(())\n", "plt.yticks(())\n", "\n", "plt.subplot(1,2,2)\n", "plt.imshow(image_raw2)\n", "plt.title('Another Picture')\n", "plt.xticks(())\n", "plt.yticks(())\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Your code is here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Problem 5 (Bonus)\n", "\n", "1. The norm is called absolute if $\\|x\\|=\\| \\lvert x \\lvert \\|$ holds for any vector $x$, where $x=(x_1,\\dots,x_n)^T$ and $\\lvert x \\lvert = (\\lvert x_1 \\lvert,\\dots, \\lvert x_n \\lvert)^T$. Give an example of a norm which is not absolute.\n", "\n", "2. Write a function ```ranks_HOSVD(A, eps)```\n", "that calculates Tucker ranks of a d-dimensional tensor $A$ using High-Order SVD (HOSVD) algorithm, where ```eps``` is the relative accuracy in the Frobenius norm between the approximated and the initial tensors. Details can be found [here](http://ca.sandia.gov/~tgkolda/pubs/pubfiles/TensorReview.pdf) on Figure 4.3.\n", "```python\n", "def ranks_HOSVD(A, eps):\n", " return r #r should be a tuple of ranks r = (r1, r2, ..., rd)\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }