{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Lecture 17:  Tensor decompositions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Recap of the previous lecture\n",
    "* Wavelets\n",
    "* L1-norm minimization\n",
    "* Compressed sensing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Today lecture \n",
    "\n",
    "* Tensor decompositions and applications\n",
    "* Practice"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Tensors\n",
    "\n",
    "By **tensor** we imply nothing, but a **multidimensional array**:\n",
    "$$\n",
    "A(i_1, \\dots, i_d), \\quad 1\\leq i_k\\leq n_k,\n",
    "$$\n",
    "where $d$ is called dimension, $n_k$ - mode size.\n",
    "This is standard definition in applied mathematics community. For details see [[1]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.2059&rep=rep1&type=pdf), [[2]](http://arxiv.org/pdf/1302.7121.pdf), [[3]](http://epubs.siam.org/doi/abs/10.1137/090752286).\n",
    "\n",
    "* $d=2$ (matrices) $\\Rightarrow$ classic theory (SVD, LU, QR, $\\dots$)\n",
    "\n",
    "* $d\\geq 3$ (tensors) $\\Rightarrow$ under development. Generalization of standard matrix results is not always straightforward"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Curse of dimensionality\n",
    "\n",
    "The problem with multidimensional data is that number of parameters grows <font color='red'> exponentially </font> with $d$:\n",
    "\n",
    "\n",
    "$$\n",
    "    \\text{storage} = n^d.\n",
    "$$\n",
    "For instance, for $n=2$ and $d=500$\n",
    "$$\n",
    "    n^d = 2^{500} \\gg 10^{83}  -  \\text{ number of atoms in the Universe}\n",
    "$$\n",
    "\n",
    "Why do we care? It seems that we are living in the 3D World :)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Applications\n",
    "\n",
    "#### Quantum chemistry\n",
    "\n",
    "Stationary Schroedinger equation for system with $N_{el}$ electrons\n",
    "$$\n",
    "    \\hat H \\Psi = E \\Psi,\n",
    "$$\n",
    "where\n",
    "$$\n",
    "\\Psi = \\Psi(\\{{\\bf r_1},\\sigma_1\\},\\dots, \\{{\\bf r_{N_{el}}},\\sigma_{N_{el}}\\})\n",
    "$$\n",
    "3$N_{el}$ spatial variables and $N_{el}$ spin variable. \n",
    "<img src=\"http://blog.covance.com/wp-content/uploads/2015/02/101679_Large-Molecule_1575263331.jpg\" width=400>\n",
    "\n",
    "* Drug and material design\n",
    "* Predicting physical experiments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Uncertainty quantification\n",
    "\n",
    "Example: oil reservoir modeling. Model may depend on parameters $p_1,\\dots,p_d$ (like measured experimentally procity or temperature) with uncertainty\n",
    "\n",
    "$$\n",
    "u = u(t,{\\bf r},\\,{p_1,\\dots,p_d})\n",
    "$$\n",
    "<img src='http://www.ife.no/en/ife/departments/tracer/tracertechnology/image' width=350>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### And many more\n",
    "\n",
    "* Signal processing\n",
    "* Recommender systems\n",
    "* Neural networks\n",
    "* Language models\n",
    "* Financial mathematics\n",
    "* ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Working with many dimensions\n",
    "\n",
    "* **Monte-Carlo**: class of methods based on random sampling. Convergence issues\n",
    "* **Sparse grids**: special types of grids with small number of grid points. Strong regularity conditions\n",
    "* Young and promising approach based on <font color='red'>tensor decompositions </font>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Tensor decompositions\n",
    "\n",
    "## 2D\n",
    "\n",
    "Skeleton decomposition:\n",
    "$$\n",
    "A = UV^T\n",
    "$$\n",
    "or elementwise:\n",
    "$$\n",
    "a_{ij} = \\sum_{\\alpha=1}^r u_{i\\alpha} v_{j\\alpha}\n",
    "$$\n",
    "leads us to the idea of **separation of variables.**\n",
    "\n",
    "**Properties:**\n",
    "* Not unique: $A = U V^T = UBB^{-1}V^T = \\tilde U \\tilde V^T$\n",
    "* Can be calculated in a stable way by **SVD**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 1. Canonical decomposition\n",
    "\n",
    "The most straightforward way to generize separation of variables to many dimensions is the **canonical decomposition**: (CP/CANDECOMP/PARAFAC)\n",
    "$$\n",
    "a_{ijk} = \\sum_{\\alpha=1}^r u_{i\\alpha} v_{j\\alpha} w_{k\\alpha}\n",
    "$$\n",
    "minimal possible $r$ is called the **canonical rank**. Matrices $U$, $V$ and $W$ are called **canonical factors**. This decomposition was proposed in 1927 by Hitchcock."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "**Properties**:\n",
    "\n",
    "* For a $d$ dimensional tensor memory is $nrd$\n",
    "* Unique under mild conditions\n",
    "* Set of tensors with rank$\\leq r$ is not closed (by contrast to matrices): <br>\n",
    "  $a_{ijk} = i+j+k$, $\\text{rank}(A) = 3$, but\n",
    "  $$\n",
    "      a^\\epsilon_{ijk} = \\frac{(1+\\epsilon i)(1+\\epsilon j)(1+\\epsilon k) - 1}{\\epsilon}\\to i+j+k=a_{ijk},\\quad \\epsilon\\to 0\n",
    "  $$\n",
    "  and $\\text{rank}(A^{\\epsilon}) = 2$\n",
    "* No stable algorithms to compute best rank-$r$ approximation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "**Alternating Least Squares algorithm**\n",
    "\n",
    "0. Intialize random $U,V,W$\n",
    "1. fix $V,W$, solve least squares for $U$\n",
    "2. fix $U,W$, solve least squares for $V$\n",
    "3. fix $U,V$, solve least squares for $W$\n",
    "4. go to 2."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 2. Tucker decomposition\n",
    "\n",
    "Next attempt is the decomposition proposed by Tucker in 1963 in chemometrics journal:\n",
    "$$\n",
    "a_{ijk} = \\sum_{\\alpha_1,\\alpha_2,\\alpha_3=1}^{r_1,r_2,r_3}g_{\\alpha_1\\alpha_2\\alpha_3} u_{i\\alpha_1} v_{j\\alpha_2} w_{k\\alpha_3}.\n",
    "$$\n",
    "Here we have several different ranks. Minimal possible $r_1,r_2,r_3$ are called **Tucker ranks**.\n",
    "\n",
    "**Properties**:\n",
    "\n",
    "* For a $d$ dimensional tensor memory is <font color='red'>$r^d$</font> $+ nrd$. Still **curse of dimensionality**\n",
    "* Stable SVD-based algorithm:\n",
    "    1. $U =$ principal components of the unfolding `A.reshape(n1, n2*n3)`\n",
    "    2. $V =$ principal components of the unfolding `A.transpose([1,0,2]).reshape(n2, n1*n3)`\n",
    "    3. $W =$ principal components of the unfolding `A.transpose([2,0,1]).reshape(n3, n1*n2)`\n",
    "    4. $g_{\\alpha_1\\alpha_2\\alpha_3} = \\sum_{i,j,k=1}^{n_1,n_2,n_3} a_{ijk} u_{i\\alpha_1} v_{j\\alpha_2} w_{k\\alpha_3}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 3. Tensor Train decomposition\n",
    "\n",
    "* Calculation of the canonical decomposition is unstable\n",
    "* Tucker decomposition suffers from the curse of dimensionality\n",
    "\n",
    "Tensor Train (**TT**) decomposition (Oseledets, Tyrtyshnikov 2009) is both stable and contains linear in $d$ number of parameters:\n",
    "$$\n",
    "a_{i_1 i_2 \\dots i_d} = \\sum_{\\alpha_1,\\dots,\\alpha_{d-1}} \n",
    "g_{i_1\\alpha_1} g_{\\alpha_1 i_2\\alpha_2}\\dots g_{\\alpha_{d-2} i_{d-1}\\alpha_{d-1}} g_{\\alpha_{d-1} i_{d}}\n",
    "$$\n",
    "or in the matrix form\n",
    "$$\n",
    "    a_{i_1 i_2 \\dots i_d} = G_1 (i_1)G_2 (i_2)\\dots G_d(i_d)\n",
    "$$\n",
    "* The storage is $\\mathcal{O}(dnr^2)$ \n",
    "* Stable TT-SVD algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "**Example**\n",
    "$$a_{i_1\\dots i_d} = i_1 + \\dots + i_d$$\n",
    "Canonical rank is $d$. At the same time TT-ranks are $2$:\n",
    "$$\n",
    "i_1 + \\dots + i_d = \\begin{pmatrix} i_1 & 1 \\end{pmatrix} \n",
    "\\begin{pmatrix} 1 & 0 \\\\ i_2 & 1 \\end{pmatrix}\n",
    "\\dots\n",
    "\\begin{pmatrix} 1 & 0 \\\\ i_{d-1} & 1 \\end{pmatrix}\n",
    "\\begin{pmatrix} 1  \\\\  i_d \\end{pmatrix}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## 4. Quantized Tensor Train\n",
    "\n",
    "Consider a 1D array $a_k = f(x_k)$, $k=1,\\dots,2^d$ where $f$ is some 1D function calculated on grid points $x_k$.\n",
    "\n",
    "Let $$k = {2^{d-1} i_1 + 2^{d-2} i_2 + \\dots + 2^0 i_{d}}\\quad i_1,\\dots,i_d = 0,1 $$ \n",
    "be binary representation of $k$, then\n",
    "$$\n",
    "    a_k = a_{2^{d-1} i_1 + 2^{d-2} i_2 + \\dots + 2^0 i_{d}} \\equiv \\tilde a_{i_1,\\dots,i_d},\n",
    "$$\n",
    "where $\\tilde a$ is nothing, but a reshaped tensor $a$. TT decomposition of $\\tilde a$ is called **Quantized Tensor Train (QTT)** decomposition. Interesting fact is that the QTT decomposition has relation to wavelets.\n",
    "\n",
    "Contains <font color='red'>$\\mathcal{O}(\\log n r^2)$</font> elements!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Cross approximation method\n",
    "\n",
    "If decomposition of a tensor is given, then there is no problem to do basic operations fast. \n",
    "\n",
    "However, the question is if it is possible to find decomposition taking into account that typically tensors even can not be stored. \n",
    "\n",
    "**Cross approximation** method allows to find the decomposition using only few of its elements."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Summary \n",
    "\n",
    "* Tensor decompositions - useful tool to work with multidimensional data\n",
    "* Canonical, Tucker, TT, QTT decompositions\n",
    "* Cross approximation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Next time\n",
    "\n",
    "Ping-Pong test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Now lets have some practice"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<link href='http://fonts.googleapis.com/css?family=Fenix' rel='stylesheet' type='text/css'>\n",
       "<link href='http://fonts.googleapis.com/css?family=Alegreya+Sans:100,300,400,500,700,800,900,100italic,300italic,400italic,500italic,700italic,800italic,900italic' rel='stylesheet' type='text/css'>\n",
       "<link href='http://fonts.googleapis.com/css?family=Source+Code+Pro:300,400' rel='stylesheet' type='text/css'>\n",
       "<style>\n",
       "    @font-face {\n",
       "        font-family: \"Computer Modern\";\n",
       "        src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
       "    }\n",
       "    div.cell{\n",
       "        /*width:80%;*/\n",
       "        /*margin-left:auto !important;\n",
       "        margin-right:auto;*/\n",
       "    }\n",
       "    h1 {\n",
       "        font-family: 'Alegreya Sans', sans-serif;\n",
       "    }\n",
       "    h2 {\n",
       "        font-family: 'Fenix', serif;\n",
       "    }\n",
       "    h3{\n",
       "\t\tfont-family: 'Fenix', serif;\n",
       "        margin-top:12px;\n",
       "        margin-bottom: 3px;\n",
       "       }\n",
       "\th4{\n",
       "\t\tfont-family: 'Fenix', serif;\n",
       "       }\n",
       "    h5 {\n",
       "        font-family: 'Alegreya Sans', sans-serif;\n",
       "    }\t   \n",
       "    div.text_cell_render{\n",
       "        font-family: 'Alegreya Sans',Computer Modern, \"Helvetica Neue\", Arial, Helvetica, Geneva, sans-serif;\n",
       "        line-height: 1.2;\n",
       "        font-size: 120%;\n",
       "        /*width:70%;*/\n",
       "        /*margin-left:auto;*/\n",
       "        margin-right:auto;\n",
       "    }\n",
       "    .CodeMirror{\n",
       "            font-family: \"Source Code Pro\";\n",
       "\t\t\tfont-size: 90%;\n",
       "    }\n",
       "/*    .prompt{\n",
       "        display: None;\n",
       "    }*/\n",
       "    .text_cell_render h1 {\n",
       "        font-weight: 200;\n",
       "        font-size: 50pt;\n",
       "\t\tline-height: 110%;\n",
       "        color:#CD2305;\n",
       "        margin-bottom: 0.5em;\n",
       "        margin-top: 0.5em;\n",
       "        display: block;\n",
       "    }\t\n",
       "    .text_cell_render h5 {\n",
       "        font-weight: 300;\n",
       "        font-size: 16pt;\n",
       "        color: #CD2305;\n",
       "        font-style: italic;\n",
       "        margin-bottom: .5em;\n",
       "        margin-top: 0.5em;\n",
       "        display: block;\n",
       "    }\n",
       "    \n",
       "    li {\n",
       "        line-height: 110%;\n",
       "    }\n",
       "    .warning{\n",
       "        color: rgb( 240, 20, 20 )\n",
       "        }  \n",
       "\n",
       "</style>\n",
       "\n",
       "<script>\n",
       "    MathJax.Hub.Config({\n",
       "                        TeX: {\n",
       "                           extensions: [\"AMSmath.js\"]\n",
       "                           },\n",
       "                tex2jax: {\n",
       "                    inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
       "                    displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
       "                },\n",
       "                displayAlign: 'center', // Change this to 'center' to center equations.\n",
       "                \"HTML-CSS\": {\n",
       "                    styles: {'.MathJax_Display': {\"margin\": 4}}\n",
       "                }\n",
       "        });\n",
       "</script>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython.core.display import HTML\n",
    "def css_styling():\n",
    "    styles = open(\"./styles/custom.css\", \"r\").read()\n",
    "    return HTML(styles)\n",
    "css_styling()"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}