{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# About This Notebook\n", "\n", "This notebook shows how to implement **Low-Rank Tensor Completion with Truncated Nuclear Norm minimization (LRTC-TNN)** on some real-world data sets. For an in-depth discussion of LRTC-TNN, please see our article [1].\n", "\n", "
\n", "\n", "[1] Xinyu Chen, Jinming Yang, Lijun Sun (2020). A Nonconvex Low-Rank Tensor Completion Model for Spatiotemporal Traffic Data Imputation. arXiv.2003.10271. [PDF] \n", "\n", "
\n", "\n", "\n", "## Quick Run\n", "\n", "This notebook is publicly available for any usage at our data imputation project. Please check out [**transdim - GitHub**](https://github.com/xinychen/transdim).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Low-Rank Tensor Completion" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start by importing the necessary dependencies." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from numpy.linalg import inv as inv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tensor Unfolding (`ten2mat`) and Matrix Folding (`mat2ten`)\n", "\n", "Using numpy reshape to perform 3rd rank tensor unfold operation. [[**link**](https://stackoverflow.com/questions/49970141/using-numpy-reshape-to-perform-3rd-rank-tensor-unfold-operation)]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def ten2mat(tensor, mode):\n", " return np.reshape(np.moveaxis(tensor, mode, 0), (tensor.shape[mode], -1), order = 'F')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor size:\n", "(3, 2, 4)\n", "original tensor:\n", "[[[ 1 2 3 4]\n", " [ 3 4 5 6]]\n", "\n", " [[ 5 6 7 8]\n", " [ 7 8 9 10]]\n", "\n", " [[ 9 10 11 12]\n", " [11 12 13 14]]]\n", "\n", "(1) mode-1 tensor unfolding:\n", "[[ 1 3 2 4 3 5 4 6]\n", " [ 5 7 6 8 7 9 8 10]\n", " [ 9 11 10 12 11 13 12 14]]\n", "\n", "(2) mode-2 tensor unfolding:\n", "[[ 1 5 9 2 6 10 3 7 11 4 8 12]\n", " [ 3 7 11 4 8 12 5 9 13 6 10 14]]\n", "\n", "(3) mode-3 tensor unfolding:\n", "[[ 1 5 9 3 7 11]\n", " [ 2 6 10 4 8 12]\n", " [ 3 7 11 5 9 13]\n", " [ 4 8 12 6 10 14]]\n" ] } ], "source": [ "X = np.array([[[1, 2, 3, 4], [3, 4, 5, 6]], \n", " [[5, 6, 7, 8], [7, 8, 9, 10]], \n", " [[9, 10, 11, 12], [11, 12, 13, 14]]])\n", "print('tensor size:')\n", "print(X.shape)\n", "print('original tensor:')\n", "print(X)\n", "print()\n", "print('(1) mode-1 tensor unfolding:')\n", "print(ten2mat(X, 0))\n", "print()\n", "print('(2) mode-2 tensor unfolding:')\n", "print(ten2mat(X, 1))\n", "print()\n", "print('(3) mode-3 tensor unfolding:')\n", "print(ten2mat(X, 2))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def mat2ten(mat, tensor_size, mode):\n", " index = list()\n", " index.append(mode)\n", " for i in range(tensor_size.shape[0]):\n", " if i != mode:\n", " index.append(i)\n", " return np.moveaxis(np.reshape(mat, list(tensor_size[index]), order = 'F'), 0, mode)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Singular Value Thresholding (SVT) for TNN" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def svt_tnn(mat, alpha, rho, theta):\n", " \"\"\"This is a Numpy dependent singular value thresholding (SVT) process.\"\"\"\n", " u, s, v = np.linalg.svd(mat, full_matrices = 0)\n", " vec = s.copy()\n", " vec[theta :] = s[theta :] - alpha / rho\n", " vec[vec < 0] = 0\n", " return u @ np.diag(vec) @ v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Understanding these codes**:\n", "\n", "- **`line 1`**: Necessary inputs including any input matrix $\\boldsymbol{X}$, weight of Truncated Nuclear Norm (TNN) regularization $\\alpha$, learning rate $\\rho$, and positive integer number $\\theta$ for nuclear norm truncation.\n", "\n", "- **`line 2`**: Compute the Singular Value Decomposition (SVD) for 
any matrix $\\boldsymbol{X}$ with `numpy.linalg.svd` (i.e., the SVD function in NumPy's linear algebra module).\n", "\n", "- **`lines 3-5`**: Shrink the singular values beyond the first $\\theta$, i.e., $\\sigma_{\\theta+1},\\sigma_{\\theta+2},\\ldots$, with the following rule:\n", "\n", "\\begin{equation}\n", "\\sigma_{i}=\\left[\\sigma_{i}(\\boldsymbol{X})-\\frac{\\alpha}{\\rho}\\right]_{+},\\quad i>\\theta.\n", "\\end{equation}\n", "\n", "- **`line 6`**: Return the resulting matrix." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**A more efficient alternative**:\n", "\n", "The following implementation computes the SVT-TNN operator more efficiently when the input matrix is very fat or very tall: it obtains the left singular vectors from the much smaller Gram matrix $\\boldsymbol{X}\\boldsymbol{X}^\\top$ (or applies itself to the transpose), and it discards all components whose singular values do not exceed $\\alpha/\\rho$." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def svt_tnn(mat, alpha, rho, theta):\n", "    tau = alpha / rho\n", "    [m, n] = mat.shape\n", "    if 2 * m < n:\n", "        # Fat matrix: obtain left singular vectors from the small m-by-m Gram matrix.\n", "        u, s, v = np.linalg.svd(mat @ mat.T, full_matrices = 0)\n", "        s = np.sqrt(s)\n", "        idx = np.sum(s > tau)\n", "        mid = np.zeros(idx)\n", "        mid[:theta] = 1\n", "        mid[theta:idx] = (s[theta:idx] - tau) / s[theta:idx]\n", "        return (u[:, :idx] @ np.diag(mid)) @ (u[:, :idx].T @ mat)\n", "    elif m > 2 * n:\n", "        # Tall matrix: work on the transpose instead.\n", "        return svt_tnn(mat.T, alpha, rho, theta).T\n", "    # Roughly square matrix: plain SVD, keeping only singular values above tau.\n", "    u, s, v = np.linalg.svd(mat, full_matrices = 0)\n", "    idx = np.sum(s > tau)\n", "    vec = s[:idx].copy()\n", "    vec[theta:idx] = s[theta:idx] - tau\n", "    return u[:, :idx] @ np.diag(vec) @ v[:idx, :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
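\n", "**Sanity check (optional)**: a minimal sketch, not executed in this notebook, comparing the fast `svt_tnn` above with the direct SVD-based version from the earlier cell (rewritten below under the hypothetical name `svt_tnn_naive`). The two coincide whenever at least $\\theta$ singular values exceed $\\alpha/\\rho$, which is the regime in the experiments below.\n", "\n", "```python\n", "import numpy as np\n", "\n", "def svt_tnn_naive(mat, alpha, rho, theta):\n", "    # Direct version: full SVD, then shrink every singular value beyond the first theta.\n", "    u, s, v = np.linalg.svd(mat, full_matrices = 0)\n", "    vec = s.copy()\n", "    vec[theta :] = np.maximum(s[theta :] - alpha / rho, 0)\n", "    return u @ np.diag(vec) @ v\n", "\n", "np.random.seed(0)\n", "mat = np.random.rand(30, 200)  # fat matrix, exercises the Gram-matrix branch\n", "alpha, rho, theta = 1, 10, 5   # tau = alpha / rho = 0.1, below every singular value here\n", "diff = svt_tnn(mat, alpha, rho, theta) - svt_tnn_naive(mat, alpha, rho, theta)\n", "print(np.max(np.abs(diff)))    # should be at machine-precision level\n", "```\n", "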
\n", "\n", "
\n", "\n", "> Note that $$\\mathrm{MAPE}=\\frac{1}{n} \\sum_{i=1}^{n} \\frac{\\left|y_{i}-\\hat{y}_{i}\\right|}{y_{i}} \\times 100, \\quad\\mathrm{RMSE}=\\sqrt{\\frac{1}{n} \\sum_{i=1}^{n}\\left(y_{i}-\\hat{y}_{i}\\right)^{2}},$$ where $n$ is the total number of estimated values, and $y_i$ and $\\hat{y}_i$ are the actual value and its estimation, respectively." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def compute_rmse(var, var_hat):\n", " return np.sqrt(np.sum((var - var_hat) ** 2) / var.shape[0])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def compute_mape(var, var_hat):\n", " return np.sum(np.abs(var - var_hat) / var) / var.shape[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define LRTC-TNN Function with `Numpy`" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter):\n", " \"\"\"Low-Rank Tenor Completion with Truncated Nuclear Norm, LRTC-TNN.\"\"\"\n", " \n", " dim = np.array(sparse_tensor.shape)\n", " pos_missing = np.where(sparse_tensor == 0)\n", " pos_test = np.where((dense_tensor != 0) & (sparse_tensor == 0))\n", " \n", " X = np.zeros(np.insert(dim, 0, len(dim))) # \\boldsymbol{\\mathcal{X}}\n", " T = np.zeros(np.insert(dim, 0, len(dim))) # \\boldsymbol{\\mathcal{T}}\n", " Z = sparse_tensor.copy()\n", " last_tensor = sparse_tensor.copy()\n", " snorm = np.sqrt(np.sum(sparse_tensor ** 2))\n", " it = 0\n", " while True:\n", " rho = min(rho * 1.05, 1e5)\n", " for k in range(len(dim)):\n", " X[k] = mat2ten(svt_tnn(ten2mat(Z - T[k] / rho, k), alpha[k], rho, int(np.ceil(theta * dim[k]))), dim, k)\n", " Z[pos_missing] = np.mean(X + T / rho, axis = 0)[pos_missing]\n", " T = T + rho * (X - np.broadcast_to(Z, np.insert(dim, 0, len(dim))))\n", " tensor_hat = np.einsum('k, kmnt -> mnt', alpha, X)\n", " tol = np.sqrt(np.sum((tensor_hat - last_tensor) ** 2)) / snorm\n", " last_tensor = tensor_hat.copy()\n", " it += 1\n", " if (it + 1) % 50 == 0:\n", " print('Iter: {}'.format(it + 1))\n", " print('RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))\n", " print()\n", " if (tol < epsilon) or (it >= maxiter):\n", " break\n", "\n", " print('Imputation MAPE: {:.6}'.format(compute_mape(dense_tensor[pos_test], tensor_hat[pos_test])))\n", " print('Imputation RMSE: {:.6}'.format(compute_rmse(dense_tensor[pos_test], tensor_hat[pos_test])))\n", " print()\n", " \n", " return tensor_hat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Understanding these codes**:\n", "\n", "- **`line 18-19`**: Update $\\boldsymbol{\\mathcal{Z}}_{k}^{l+1},k=1,2,3$.\n", "\n", "- **`line 20-22`**: Update $\\boldsymbol{\\mathcal{X}}_{k}^{l+1}$ by\n", "\n", "\\begin{equation}\n", "\\boldsymbol{\\mathcal{X}}_{k}^{l+1}=\\mathcal{P}_{\\Omega}(\\boldsymbol{\\mathcal{Y}})+\\mathcal{P}_{\\Omega}^{\\perp}\\left(\\boldsymbol{\\mathcal{Z}}_{k}^{l+1}-\\frac{1}{\\rho}\\boldsymbol{\\mathcal{T}}_{k}^{l}\\right),k=1,2,3.\n", "\\end{equation}\n", "\n", "- **`line 23`**: Update $\\boldsymbol{\\mathcal{T}}_{k}^{l+1}$ by\n", "\n", "\\begin{equation}\n", "\\boldsymbol{\\mathcal{T}}_{k}^{l+1}=\\boldsymbol{\\mathcal{T}}_{k}^{l}+\\rho_k\\left(\\boldsymbol{\\mathcal{X}}_{k}^{l+1}-\\boldsymbol{\\mathcal{Z}}_{k}^{l+1}\\right).\n", "\\end{equation}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Guangzhou urban traffic speed data set" ] }, { "cell_type": 
"code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 5.51797\n", "\n", "Iter: 100\n", "RMSE: 3.00147\n", "\n", "Imputation MAPE: 0.070093\n", "Imputation RMSE: 2.99615\n", "\n", "Running time: 115 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 5.32424\n", "\n", "Iter: 100\n", "RMSE: 3.58887\n", "\n", "Imputation MAPE: 0.0836949\n", "Imputation RMSE: 3.57911\n", "\n", "Running time: 151 seconds\n", "\n", "Missing rate = 0.9\n", "Iter: 50\n", "RMSE: 4.05595\n", "\n", "Iter: 100\n", "RMSE: 4.05\n", "\n", "Imputation MAPE: 0.0950771\n", "Imputation RMSE: 4.05026\n", "\n", "Running time: 167 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import time\n", "import scipy.io\n", "\n", "## 30% RM\n", "r = 0.3\n", "print('Missing rate = {}'.format(r))\n", "missing_rate = r\n", "\n", "## Random Missing (RM)\n", "dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "np.random.seed(1000)\n", "sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.25\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()\n", "\n", "## 70% RM\n", "r = 0.7\n", "print('Missing rate = {}'.format(r))\n", "missing_rate = r\n", "\n", "## Random Missing (RM)\n", "dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "np.random.seed(1000)\n", "sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.2\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()\n", "\n", "## 90% RM\n", "r = 0.9\n", "print('Missing rate = {}'.format(r))\n", "missing_rate = r\n", "\n", "## Random Missing (RM)\n", "dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "np.random.seed(1000)\n", "sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-4\n", "theta = 0.1\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 5.06925\n", "\n", "Iter: 100\n", "RMSE: 4.08494\n", "\n", "Imputation MAPE: 0.0965249\n", "Imputation RMSE: 4.08516\n", "\n", "Running time: 152 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 5.14929\n", "\n", "Iter: 100\n", "RMSE: 4.30086\n", "\n", "Imputation MAPE: 0.101477\n", "Imputation RMSE: 4.30102\n", "\n", "Running time: 123 seconds\n", "\n" ] } ], "source": [ "import numpy 
as np\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " ## Non-random Missing (NM)\n", " dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", " dim1, dim2, dim3 = dense_tensor.shape\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim3) + 0.5 - missing_rate)[:, None, :]\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.05\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iter: 50\n", "RMSE: 5.27782\n", "\n", "Iter: 100\n", "RMSE: 3.97509\n", "\n", "Imputation MAPE: 0.094045\n", "Imputation RMSE: 3.96451\n", "\n", "Running time: 147 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import time\n", "import scipy.io\n", "np.random.seed(1000)\n", "\n", "missing_rate = 0.3\n", "\n", "## Block-out Missing (BM)\n", "dense_tensor = scipy.io.loadmat('../datasets/Guangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "\n", "dim_time = dim2 * dim3\n", "block_window = 6\n", "vec = np.random.rand(int(dim_time / block_window))\n", "temp = np.array([vec] * block_window)\n", "vec = temp.reshape([dim2 * dim3], order = 'F')\n", "\n", "sparse_tensor = mat2ten(ten2mat(dense_tensor, 0) * np.round(vec + 0.5 - missing_rate)[None, :], np.array([dim1, dim2, dim3]), 0)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.10\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Hangzhou metro passenger flow data set" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 25.2411\n", "\n", "Iter: 100\n", "RMSE: 24.944\n", "\n", "Imputation MAPE: 0.186277\n", "Imputation RMSE: 24.9491\n", "\n", "Running time: 15 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 29.1678\n", "\n", "Iter: 100\n", "RMSE: 29.5341\n", "\n", "Imputation MAPE: 0.201632\n", "Imputation RMSE: 29.5459\n", "\n", "Running time: 19 seconds\n", "\n", "Missing rate = 0.9\n", "Iter: 50\n", "RMSE: 37.7603\n", "\n", "Iter: 100\n", "RMSE: 38.0407\n", "\n", "Imputation MAPE: 0.229517\n", "Imputation RMSE: 38.0515\n", "\n", "Running time: 23 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7, 0.9]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " ## Random Missing (RM)\n", " dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", " dim1, dim2, dim3 = dense_tensor.shape\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.10\n", " epsilon = 
1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 49.406\n", "\n", "Iter: 100\n", "RMSE: 47.6061\n", "\n", "Imputation MAPE: 0.193862\n", "Imputation RMSE: 47.5992\n", "\n", "Running time: 21 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 42.8749\n", "\n", "Iter: 100\n", "RMSE: 41.8305\n", "\n", "Imputation MAPE: 0.226381\n", "Imputation RMSE: 41.8327\n", "\n", "Running time: 14 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " ## Non-random Missing (NM)\n", " dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", " dim1, dim2, dim3 = dense_tensor.shape\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim3) + 0.5 - missing_rate)[:, None, :]\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.10\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iter: 50\n", "RMSE: 28.8777\n", "\n", "Iter: 100\n", "RMSE: 29.2744\n", "\n", "Imputation MAPE: 0.21374\n", "Imputation RMSE: 29.2814\n", "\n", "Running time: 20 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import time\n", "import scipy.io\n", "np.random.seed(1000)\n", "\n", "missing_rate = 0.3\n", "\n", "## Block-out Missing (BM)\n", "dense_tensor = scipy.io.loadmat('../datasets/Hangzhou-data-set/tensor.mat')['tensor'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "\n", "dim_time = dim2 * dim3\n", "block_window = 6\n", "vec = np.random.rand(int(dim_time / block_window))\n", "temp = np.array([vec] * block_window)\n", "vec = temp.reshape([dim2 * dim3], order = 'F')\n", "\n", "sparse_tensor = mat2ten(ten2mat(dense_tensor, 0) * np.round(vec + 0.5 - missing_rate)[None, :], np.array([dim1, dim2, dim3]), 0)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.10\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Seattle freeway traffic speed data set" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 5.86304\n", "\n", "Iter: 100\n", "RMSE: 3.1126\n", "\n", "Imputation MAPE: 0.0480528\n", "Imputation RMSE: 3.10965\n", "\n", "Running time: 263 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 5.89289\n", "\n", "Iter: 100\n", "RMSE: 3.79913\n", "\n", "Imputation MAPE: 0.0613629\n", "Imputation RMSE: 3.7982\n", "\n", "Running time: 228 seconds\n", "\n", "Missing rate = 0.9\n", "Iter: 50\n", "RMSE: 4.86097\n", "\n", 
"Iter: 100\n", "RMSE: 4.81283\n", "\n", "Imputation MAPE: 0.081929\n", "Imputation RMSE: 4.81347\n", "\n", "Running time: 260 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7, 0.9]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " ## Random missing (RM)\n", " dense_tensor = np.load('../datasets/Seattle-data-set/tensor.npz')['arr_0'].transpose(0, 2, 1)\n", " dim1, dim2, dim3 = dense_tensor.shape\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim2, dim3) + 0.5 - missing_rate)\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.30\n", " if r > 0.8:\n", " rho = 5e-5\n", " theta = 0.10\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 5.10438\n", "\n", "Iter: 100\n", "RMSE: 4.43494\n", "\n", "Imputation MAPE: 0.074171\n", "Imputation RMSE: 4.43533\n", "\n", "Running time: 249 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 6.14372\n", "\n", "Iter: 100\n", "RMSE: 5.4049\n", "\n", "Imputation MAPE: 0.0928467\n", "Imputation RMSE: 5.40446\n", "\n", "Running time: 239 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " ## Non-random Missing (NM)\n", " dense_tensor = np.load('../datasets/Seattle-data-set/tensor.npz')['arr_0'].transpose(0, 2, 1)\n", " dim1, dim2, dim3 = dense_tensor.shape\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim3) + 0.5 - missing_rate)[:, None, :]\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.05\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iter: 50\n", "RMSE: 6.60377\n", "\n", "Iter: 100\n", "RMSE: 5.69047\n", "\n", "Imputation MAPE: 0.0981021\n", "Imputation RMSE: 5.69791\n", "\n", "Running time: 220 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import scipy.io\n", "np.random.seed(1000)\n", "\n", "missing_rate = 0.3\n", "\n", "## Block-out Missing (BM)\n", "dense_tensor = np.load('../datasets/Seattle-data-set/tensor.npz')['arr_0'].transpose(0, 2, 1)\n", "dim1, dim2, dim3 = dense_tensor.shape\n", "block_window = 12\n", "vec = np.random.rand(int(dim2 * dim3 / block_window))\n", "temp = np.array([vec] * block_window)\n", "vec = temp.reshape([dim2 * dim3], order = 'F')\n", "sparse_tensor = mat2ten(dense_mat * np.round(vec + 0.5 - missing_rate)[None, :], np.array([dim1, dim2, dim3]), 0)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.30\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running 
time: %d seconds'%(end - start))\n", "print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Portland highway traffic volume data set" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 16.6038\n", "\n", "Iter: 100\n", "RMSE: 15.6579\n", "\n", "Imputation MAPE: 0.172064\n", "Imputation RMSE: 15.6594\n", "\n", "Running time: 693 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 19.5941\n", "\n", "Iter: 100\n", "RMSE: 19.2446\n", "\n", "Imputation MAPE: 0.204501\n", "Imputation RMSE: 19.2494\n", "\n", "Running time: 698 seconds\n", "\n", "Missing rate = 0.9\n", "Iter: 50\n", "RMSE: 23.7422\n", "\n", "Iter: 100\n", "RMSE: 24.1852\n", "\n", "Imputation MAPE: 0.244044\n", "Imputation RMSE: 24.1911\n", "\n", "Running time: 646 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7, 0.9]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " # Random Missing (RM)\n", " dense_mat = np.load('../datasets/Portland-data-set/volume.npy')\n", " dim1, dim2 = dense_mat.shape\n", " dim = np.array([dim1, 96, 31])\n", " dense_tensor = mat2ten(dense_mat, dim, 0)\n", " np.random.seed(1000)\n", " sparse_tensor = mat2ten(dense_mat * np.round(np.random.rand(dim1, dim2) + 0.5 - missing_rate), dim, 0)\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.10\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Missing rate = 0.3\n", "Iter: 50\n", "RMSE: 19.2963\n", "\n", "Iter: 100\n", "RMSE: 18.5759\n", "\n", "Imputation MAPE: 0.193184\n", "Imputation RMSE: 18.5804\n", "\n", "Running time: 722 seconds\n", "\n", "Missing rate = 0.7\n", "Iter: 50\n", "RMSE: 38.5421\n", "\n", "Iter: 100\n", "RMSE: 38.2218\n", "\n", "Imputation MAPE: 0.269787\n", "Imputation RMSE: 38.2252\n", "\n", "Running time: 656 seconds\n", "\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import time\n", "import scipy.io\n", "\n", "for r in [0.3, 0.7]:\n", " print('Missing rate = {}'.format(r))\n", " missing_rate = r\n", "\n", " # Non-random Missing (NM)\n", " dense_mat = np.load('../datasets/Portland-data-set/volume.npy')\n", " dim1, dim2 = dense_mat.shape\n", " dim = np.array([dim1, 96, 31])\n", " dense_tensor = mat2ten(dense_mat, dim, 0)\n", " np.random.seed(1000)\n", " sparse_tensor = dense_tensor * np.round(np.random.rand(dim1, dim[2]) + 0.5 - missing_rate)[:, None, :]\n", "\n", " start = time.time()\n", " alpha = np.ones(3) / 3\n", " rho = 1e-5\n", " theta = 0.10\n", " epsilon = 1e-4\n", " maxiter = 100\n", " LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", " end = time.time()\n", " print('Running time: %d seconds'%(end - start))\n", " print()" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iter: 50\n", "RMSE: 23.3329\n", "\n", "Iter: 100\n", "RMSE: 23.0543\n", "\n", "Imputation MAPE: 0.230816\n", "Imputation RMSE: 23.0534\n", "\n", "Running time: 697 seconds\n", "\n" ] } ], "source": 
[ "import numpy as np\n", "import scipy.io\n", "np.random.seed(1000)\n", "\n", "missing_rate = 0.3\n", "\n", "## Block-out Missing (BM)\n", "dense_mat = np.load('../datasets/Portland-data-set/volume.npy')\n", "dim1, dim2 = dense_mat.shape\n", "dim = np.array([dim1, 96, 31])\n", "dense_tensor = mat2ten(dense_mat, dim, 0)\n", "block_window = 4\n", "vec = np.random.rand(int(dim2 / block_window))\n", "temp = np.array([vec] * block_window)\n", "vec = temp.reshape([dim2], order = 'F')\n", "sparse_tensor = mat2ten(dense_mat * np.round(vec + 0.5 - missing_rate)[None, :], dim, 0)\n", "\n", "start = time.time()\n", "alpha = np.ones(3) / 3\n", "rho = 1e-5\n", "theta = 0.05\n", "epsilon = 1e-4\n", "maxiter = 100\n", "LRTC(dense_tensor, sparse_tensor, alpha, rho, theta, epsilon, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))\n", "print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### License\n", "\n", "
\n", "This work is released under the MIT license.\n", "
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "nbTranslate": { "displayLangs": [ "*" ], "hotkey": "alt-t", "langInMainMenu": true, "sourceLang": "en", "targetLang": "fr", "useGoogleTranslate": true } }, "nbformat": 4, "nbformat_minor": 2 }