{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# About this Notebook\n", "\n", "Bayesian Probabilistic Matrix Factorization - Autoregressive (BPMF-AR) model for spatiotemporal short-term prediction." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from numpy.random import multivariate_normal\n", "from scipy.stats import wishart\n", "\n", "def Normal_Wishart(mu_0, lamb, W, nu, seed = None):\n", " \"\"\"Sample a Gaussian-Wishart random variable (mu, Lambda).\"\"\"\n", " Lambda = wishart(df = nu, scale = W, seed = seed).rvs()\n", " cov = np.linalg.inv(lamb * Lambda)\n", " mu = multivariate_normal(mu_0, cov)\n", " return mu, Lambda" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Matrix Computation Concepts\n", "\n", "## Kronecker product\n", "\n", "- **Definition**:\n", "\n", "Given two matrices $A\\in\\mathbb{R}^{m_1\\times n_1}$ and $B\\in\\mathbb{R}^{m_2\\times n_2}$, the **Kronecker product** between these two matrices is defined as\n", "\n", "$$A\\otimes B=\\left[ \\begin{array}{cccc} a_{11}B & a_{12}B & \\cdots & a_{1n_1}B \\\\ a_{21}B & a_{22}B & \\cdots & a_{2n_1}B \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ a_{m_11}B & a_{m_12}B & \\cdots & a_{m_1n_1}B \\\\ \\end{array} \\right]$$\n", "where the symbol $\\otimes$ denotes the Kronecker product, and the size of the resulting $A\\otimes B$ is $(m_1m_2)\\times (n_1n_2)$ (i.e., $m_1m_2$ rows and $n_1n_2$ columns).\n", "\n", "- **Example**:\n", "\n", "If $A=\\left[ \\begin{array}{cc} 1 & 2 \\\\ 3 & 4 \\\\ \\end{array} \\right]$ and $B=\\left[ \\begin{array}{ccc} 5 & 6 & 7\\\\ 8 & 9 & 10 \\\\ \\end{array} \\right]$, then we have\n", "\n", "$$A\\otimes B=\\left[ \\begin{array}{cc} 1\\times \\left[ \\begin{array}{ccc} 5 & 6 & 7\\\\ 8 & 9 & 10\\\\ \\end{array} \\right] & 2\\times \\left[ \\begin{array}{ccc} 5 & 6 & 7\\\\ 8 & 9 & 10\\\\ \\end{array} \\right] \\\\ 
3\\times \\left[ \\begin{array}{ccc} 5 & 6 & 7\\\\ 8 & 9 & 10\\\\ \\end{array} \\right] & 4\\times \\left[ \\begin{array}{ccc} 5 & 6 & 7\\\\ 8 & 9 & 10\\\\ \\end{array} \\right] \\\\ \\end{array} \\right]$$\n", "\n", "$$=\\left[ \\begin{array}{cccccc} 5 & 6 & 7 & 10 & 12 & 14 \\\\ 8 & 9 & 10 & 16 & 18 & 20 \\\\ 15 & 18 & 21 & 20 & 24 & 28 \\\\ 24 & 27 & 30 & 32 & 36 & 40 \\\\ \\end{array} \\right]\\in\\mathbb{R}^{4\\times 6}.$$\n", "\n", "## Khatri-Rao product (`kr_prod`)\n", "\n", "- **Definition**:\n", "\n", "Given two matrices $A=\\left( \\boldsymbol{a}_1,\\boldsymbol{a}_2,...,\\boldsymbol{a}_r \\right)\\in\\mathbb{R}^{m\\times r}$ and $B=\\left( \\boldsymbol{b}_1,\\boldsymbol{b}_2,...,\\boldsymbol{b}_r \\right)\\in\\mathbb{R}^{n\\times r}$ with the same number of columns, the **Khatri-Rao product** (or **column-wise Kronecker product**) between $A$ and $B$ is defined as\n", "\n", "$$A\\odot B=\\left( \\boldsymbol{a}_1\\otimes \\boldsymbol{b}_1,\\boldsymbol{a}_2\\otimes \\boldsymbol{b}_2,...,\\boldsymbol{a}_r\\otimes \\boldsymbol{b}_r \\right)\\in\\mathbb{R}^{(mn)\\times r}$$\n", "where the symbol $\\odot$ denotes the Khatri-Rao product and $\\otimes$ denotes the Kronecker product.\n", "\n", "- **Example**:\n", "\n", "If $A=\\left[ \\begin{array}{cc} 1 & 2 \\\\ 3 & 4 \\\\ \\end{array} \\right]=\\left( \\boldsymbol{a}_1,\\boldsymbol{a}_2 \\right) $ and $B=\\left[ \\begin{array}{cc} 5 & 6 \\\\ 7 & 8 \\\\ 9 & 10 \\\\ \\end{array} \\right]=\\left( \\boldsymbol{b}_1,\\boldsymbol{b}_2 \\right) $, then we have\n", "\n", "$$A\\odot B=\\left( \\boldsymbol{a}_1\\otimes \\boldsymbol{b}_1,\\boldsymbol{a}_2\\otimes \\boldsymbol{b}_2 \\right) $$\n", "\n", "$$=\\left[ \\begin{array}{cc} \\left[ \\begin{array}{c} 1 \\\\ 3 \\\\ \\end{array} \\right]\\otimes \\left[ \\begin{array}{c} 5 \\\\ 7 \\\\ 9 \\\\ \\end{array} \\right] & \\left[ \\begin{array}{c} 2 \\\\ 4 \\\\ \\end{array} \\right]\\otimes \\left[ \\begin{array}{c} 6 \\\\ 8 \\\\ 10 \\\\ \\end{array} \\right] \\\\ 
\\end{array} \\right]$$\n", "\n", "$$=\\left[ \\begin{array}{cc} 5 & 12 \\\\ 7 & 16 \\\\ 9 & 20 \\\\ 15 & 24 \\\\ 21 & 32 \\\\ 27 & 40 \\\\ \\end{array} \\right]\\in\\mathbb{R}^{6\\times 2}.$$" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def kr_prod(a, b):\n", " return np.einsum('ir, jr -> ijr', a, b).reshape(a.shape[0] * b.shape[0], -1)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 5 12]\n", " [ 7 16]\n", " [ 9 20]\n", " [15 24]\n", " [21 32]\n", " [27 40]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "B = np.array([[5, 6], [7, 8], [9, 10]])\n", "print(kr_prod(A, B))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def BPMF(dense_mat, sparse_mat, binary_mat, W, X, maxiter1, maxiter2):\n", " dim1 = sparse_mat.shape[0]\n", " dim2 = sparse_mat.shape[1]\n", " rank = W.shape[1]\n", " pos = np.where((dense_mat>0) & (sparse_mat==0))\n", " position = np.where(sparse_mat > 0)\n", " \n", " beta0 = 1\n", " nu0 = rank\n", " mu0 = np.zeros((rank))\n", " tau = 1\n", " a0 = 1e-6\n", " b0 = 1e-6\n", " W0 = np.eye(rank)\n", " \n", " for iter in range(maxiter1):\n", " W_bar = np.mean(W, axis = 0)\n", " var_mu0 = (dim1 * W_bar + beta0 * mu0)/(dim1 + beta0)\n", " var_nu = dim1 + nu0\n", " var_W = np.linalg.inv(np.linalg.inv(W0) \n", " + dim1 * np.cov(W.T) + dim1 * beta0/(dim1 + beta0)\n", " * np.outer(W_bar - mu0, W_bar - mu0))\n", " var_W = (var_W + var_W.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim1 + beta0, var_W, var_nu, seed = None)\n", " \n", " var1 = X.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat.T).reshape([rank, rank, dim1]) + np.dstack([var_Lambda0] * dim1)\n", " var4 = tau * np.matmul(var1, sparse_mat.T) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim1)[0, :, :]\n", " for i in range(dim1):\n", " var_Lambda1 = var3[ :, 
:, i]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, i])\n", " W[i, :] = np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " \n", " X_bar = np.mean(X, axis = 0)\n", " var_mu0 = (dim2 * X_bar + beta0 * mu0)/(dim2 + beta0)\n", " var_nu = dim2 + nu0\n", " var_X = np.linalg.inv(np.linalg.inv(W0) \n", " + dim2 * np.cov(X.T) + dim2 * beta0/(dim2 + beta0)\n", " * np.outer(X_bar - mu0, X_bar - mu0))\n", " var_X = (var_X + var_X.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim2 + beta0, var_X, var_nu, seed = None)\n", " \n", " var1 = W.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat).reshape([rank, rank, dim2]) + np.dstack([var_Lambda0] * dim2)\n", " var4 = tau * np.matmul(var1, sparse_mat) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim2)[0, :, :]\n", " for t in range(dim2):\n", " var_Lambda1 = var3[ :, :, t]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, t])\n", " X[t, :] = np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " \n", " mat_hat = np.matmul(W, X.T)\n", " rmse = np.sqrt(np.sum((dense_mat[pos] - mat_hat[pos])**2)/dense_mat[pos].shape[0])\n", " \n", " var_a = a0 + 0.5 * sparse_mat[position].shape[0]\n", " error = sparse_mat - mat_hat\n", " var_b = b0 + 0.5 * np.sum(error[position]**2)\n", " tau = np.random.gamma(var_a, 1/var_b)\n", " \n", " if (iter + 1) % 100 == 0:\n", " print('Iter: {}'.format(iter + 1))\n", " print('RMSE: {:.6}'.format(rmse))\n", " print()\n", "\n", " W_plus = np.zeros((dim1, rank))\n", " X_plus = np.zeros((dim2, rank))\n", " mat_hat_plus = np.zeros((dim1, dim2))\n", " for iters in range(maxiter2):\n", " W_bar = np.mean(W, axis = 0)\n", " var_mu0 = (dim1 * W_bar + beta0 * mu0)/(dim1 + beta0)\n", " var_nu = dim1 + nu0\n", " var_W = np.linalg.inv(np.linalg.inv(W0) \n", " + dim1 * np.cov(W.T) + dim1 * beta0/(dim1 + beta0)\n", " * 
np.outer(W_bar - mu0, W_bar - mu0))\n", " var_W = (var_W + var_W.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim1 + beta0, var_W, var_nu, seed = None)\n", " \n", " var1 = X.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat.T).reshape([rank, rank, dim1]) + np.dstack([var_Lambda0] * dim1)\n", " var4 = tau * np.matmul(var1, sparse_mat.T) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim1)[0, :, :]\n", " for i in range(dim1):\n", " var_Lambda1 = var3[ :, :, i]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, i])\n", " W[i, :] = np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " W_plus += W\n", " \n", " X_bar = np.mean(X, axis = 0)\n", " var_mu0 = (dim2 * X_bar + beta0 * mu0)/(dim2 + beta0)\n", " var_nu = dim2 + nu0\n", " var_X = np.linalg.inv(np.linalg.inv(W0) \n", " + dim2 * np.cov(X.T) + dim2 * beta0/(dim2 + beta0)\n", " * np.outer(X_bar - mu0, X_bar - mu0))\n", " var_X = (var_X + var_X.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim2 + beta0, var_X, var_nu, seed = None)\n", " \n", " var1 = W.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat).reshape([rank, rank, dim2]) + np.dstack([var_Lambda0] * dim2)\n", " var4 = tau * np.matmul(var1, sparse_mat) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim2)[0, :, :]\n", " for t in range(dim2):\n", " var_Lambda1 = var3[ :, :, t]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, t])\n", " X[t, :] = np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " X_plus += X\n", " \n", " mat_hat = np.matmul(W, X.T)\n", " mat_hat_plus += mat_hat\n", " \n", " var_a = a0 + 0.5 * sparse_mat[position].shape[0]\n", " error = sparse_mat - mat_hat\n", " var_b = b0 + 0.5 * np.sum(error[position]**2)\n", " tau = np.random.gamma(var_a, 1/var_b)\n", " \n", " W = W_plus/maxiter2\n", " X = 
X_plus/maxiter2\n", " mat_hat = mat_hat_plus/maxiter2\n", " final_mape = np.sum(np.abs(dense_mat[pos] - \n", " mat_hat[pos])/dense_mat[pos])/dense_mat[pos].shape[0]\n", " final_rmse = np.sqrt(np.sum((dense_mat[pos] - \n", " mat_hat[pos])**2)/dense_mat[pos].shape[0])\n", " print('Final MAPE: {:.6}'.format(final_mape))\n", " print('Final RMSE: {:.6}'.format(final_rmse))\n", " print()\n", " return mat_hat, W, X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Organization\n", "\n", "### Part 1: Matrix Structure\n", "\n", "We consider a dataset of $m$ discrete time series $\\boldsymbol{y}_{i}\\in\\mathbb{R}^{f},i\\in\\left\\{1,2,...,m\\right\\}$. The time series may have missing elements. We express the spatiotemporal dataset as a matrix $Y\\in\\mathbb{R}^{m\\times f}$ with $m$ rows (e.g., locations) and $f$ columns (e.g., discrete time intervals),\n", "\n", "$$Y=\\left[ \\begin{array}{cccc} y_{11} & y_{12} & \\cdots & y_{1f} \\\\ y_{21} & y_{22} & \\cdots & y_{2f} \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ y_{m1} & y_{m2} & \\cdots & y_{mf} \\\\ \\end{array} \\right]\\in\\mathbb{R}^{m\\times f}.$$\n", "\n", "### Part 2: Tensor Structure\n", "\n", "We consider a dataset of $m$ discrete time series $\\boldsymbol{y}_{i}\\in\\mathbb{R}^{nf},i\\in\\left\\{1,2,...,m\\right\\}$. The time series may have missing elements. We partition each time series into intervals of predefined length $f$. We express each partitioned time series as a matrix $Y_{i}$ with $n$ rows (e.g., days) and $f$ columns (e.g., discrete time intervals per day),\n", "\n", "$$Y_{i}=\\left[ \\begin{array}{cccc} y_{11} & y_{12} & \\cdots & y_{1f} \\\\ y_{21} & y_{22} & \\cdots & y_{2f} \\\\ \\vdots & \\vdots & \\ddots & \\vdots \\\\ y_{n1} & y_{n2} & \\cdots & y_{nf} \\\\ \\end{array} \\right]\\in\\mathbb{R}^{n\\times f},i=1,2,...,m,$$\n", "\n", "therefore, the resulting structure is a tensor $\\mathcal{Y}\\in\\mathbb{R}^{m\\times n\\times f}$."
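, "\n", "\n", "As a quick illustration of how the matrix and tensor layouts relate (a minimal sketch with made-up sizes, independent of the dataset used below):\n", "\n", "```python\n", "import numpy as np\n", "\n", "m, n, f = 3, 4, 5  # hypothetical numbers of locations, days, and intervals per day\n", "Y_tensor = np.arange(m * n * f).reshape([m, n, f])  # tensor in R^{m x n x f}\n", "# Unfolding along the first mode gives the matrix Y in R^{m x (nf)}:\n", "Y = Y_tensor.reshape([m, n * f])\n", "# Row i of Y concatenates the n day-partitions of time series i.\n", "print(Y.shape)  # (3, 20)\n", "```"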
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**How to transform a data set into something we can use for time series prediction?**\n", "\n", "We put the data into an easy-to-use form by parsing [**Urban Traffic Speed Dataset of Guangzhou, China**](http://doi.org/10.5281/zenodo.1205229)." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "import scipy.io\n", "\n", "tensor = scipy.io.loadmat('Guangzhou-data-set/tensor.mat')\n", "tensor = tensor['tensor']\n", "random_matrix = scipy.io.loadmat('Guangzhou-data-set/random_matrix.mat')\n", "random_matrix = random_matrix['random_matrix']\n", "random_tensor = scipy.io.loadmat('Guangzhou-data-set/random_tensor.mat')\n", "random_tensor = random_tensor['random_tensor']\n", "\n", "dense_mat = tensor.reshape([tensor.shape[0], tensor.shape[1] * tensor.shape[2]])\n", "missing_rate = 0.2\n", "\n", "# =============================================================================\n", "### Random missing (RM) scenario:\n", "### ------------------------------\n", "### missing rate | 0.2 | 0.4 |\n", "### rank | 80 | 80 |\n", "### ------------------------------\n", "### Set the RM scenario by:\n", "binary_mat = np.round(random_tensor + 0.5 - missing_rate).reshape([random_tensor.shape[0], \n", " random_tensor.shape[1] \n", " * random_tensor.shape[2]])\n", "# =============================================================================\n", "\n", "# =============================================================================\n", "### Non-random missing (NM) scenario:\n", "### ------------------------------\n", "### missing rate | 0.2 | 0.4 |\n", "### rank | 10 | 10 |\n", "### ------------------------------\n", "### Set the NM scenario by:\n", "# binary_tensor = np.zeros(tensor.shape)\n", "# for i1 in range(tensor.shape[0]):\n", "# for i2 in range(tensor.shape[1]):\n", "# binary_tensor[i1,i2,:] = np.round(random_matrix[i1,i2] + 0.5 - missing_rate)\n", "# binary_mat = 
binary_tensor.reshape([binary_tensor.shape[0], binary_tensor.shape[1] \n", "# * binary_tensor.shape[2]])\n", "# =============================================================================\n", "\n", "sparse_mat = np.multiply(dense_mat, binary_mat)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Rolling Spatiotemporal Prediction\n", "\n", "**Problem**: Given a partially observed matrix $Y\\in\\mathbb{R}^{m\\times T}$, how do we carry out single-step rolling prediction starting at time interval $f+1$ and ending at time interval $T$?\n", "\n", "The mechanism is:\n", "\n", "1. First learn spatial factors $W\\in\\mathbb{R}^{m\\times r}$, temporal factors $X\\in\\mathbb{R}^{f\\times r}$, and AR coefficients $\\boldsymbol{\\theta}_{s}\\in\\mathbb{R}^{d},s=1,2,...,r$ from the partially observed matrix $Y\\in\\mathbb{R}^{m\\times f}$.\n", "\n", "2. Predict $\\boldsymbol{x}_{f+1}$ by\n", "$$\\hat{\\boldsymbol{x}}_{f+1}=\\sum_{k=1}^{d}\\boldsymbol{\\theta}_{k}\\circledast\\boldsymbol{x}_{f+1-h_k}.$$\n", "\n", "3. Load the partially observed matrix $Y_{f}\\in\\mathbb{R}^{m\\times b}$ ($b$ is the number of back steps), fix the spatial factors $W\\in\\mathbb{R}^{m\\times r}$ and the AR coefficients $\\boldsymbol{\\theta}_{s}\\in\\mathbb{R}^{d},s=1,2,...,r$, then learn temporal factors $X\\in\\mathbb{R}^{b\\times r}$.\n", "\n", "4. Re-estimate the AR coefficients $\\boldsymbol{\\theta}_{s}\\in\\mathbb{R}^{d},s=1,2,...,r$ from the updated temporal factors (the estimator is given in the next section).\n", "\n", "5. Predict $\\boldsymbol{x}_{f+2}$ by\n", "$$\\hat{\\boldsymbol{x}}_{f+2}=\\sum_{k=1}^{d}\\boldsymbol{\\theta}_{k}\\circledast\\boldsymbol{x}_{b+1-h_k}.$$\n", "\n", "6. Repeat steps 3-5 until time step $T$ is reached."
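, "\n", "\n", "The forecasting step (steps 2 and 5) is an elementwise, lag-weighted combination of past temporal factors, as implemented with `np.einsum` later in this notebook. A minimal standalone sketch with synthetic factors (all sizes here are hypothetical):\n", "\n", "```python\n", "import numpy as np\n", "\n", "rank, d, f = 4, 3, 200\n", "time_lags = np.array([1, 2, 144])  # lag set h_1, ..., h_d\n", "X = np.random.rand(f, rank)        # temporal factors x_1, ..., x_f\n", "theta = np.random.rand(d, rank)    # AR coefficients theta_1, ..., theta_d\n", "# x_hat[s] = sum_k theta_k[s] * x_{f+1-h_k}[s], elementwise over the r factors\n", "x_hat = np.einsum('ij, ij -> j', theta, X[f - time_lags, :])\n", "print(x_hat.shape)  # (4,)\n", "```"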
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to estimate AR coefficients?\n", "\n", "$$\\hat{\\boldsymbol{\\theta}}=\\left(Q^\\top\\Sigma_{\\eta}^{-1}Q+\\Sigma_{\\theta}^{-1}\\right)^{-1}Q^\\top\\Sigma_{\\eta}^{-1}P$$\n", "where\n", "$$Q=[\\tilde{\\boldsymbol{x}}_{h_d+1},\\cdots,\\tilde{\\boldsymbol{x}}_{T}]^{\\top}\\in\\mathbb{R}^{T'\\times d},\\quad\\tilde{\\boldsymbol{x}}_{t}=\\left(x_{t-h_1},\\cdots,x_{t-h_d}\\right)^{\\top},$$\n", "and\n", "$$P=[x_{h_d+1},\\cdots,x_{T}]^{\\top}\\in\\mathbb{R}^{T'},$$\n", "with $T'=T-h_d$." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "def OfflineBPMF(sparse_mat, init, time_lags, maxiter1, maxiter2):\n", " \"\"\"Offline Bayesian Temporal Matrix Factorization\"\"\"\n", " W = init[\"W\"]\n", " X = init[\"X\"]\n", "\n", " d = time_lags.shape[0]\n", " dim1 = sparse_mat.shape[0]\n", " dim2 = sparse_mat.shape[1]\n", " rank = W.shape[1]\n", " position = np.where(sparse_mat > 0)\n", " binary_mat = np.zeros((dim1, dim2))\n", " binary_mat[position] = 1\n", "\n", " tau = 1\n", " alpha = 1e-6\n", " beta = 1e-6\n", " beta0 = 1\n", " nu0 = rank\n", " mu0 = np.zeros((rank))\n", " W0 = np.eye(rank)\n", "\n", " for iter in range(maxiter1):\n", " X_bar = np.mean(X, axis = 0)\n", " var_mu0 = (dim2 * X_bar + beta0 * mu0)/(dim2 + beta0)\n", " var_nu = dim2 + nu0\n", " var_X = np.linalg.inv(np.linalg.inv(W0) \n", " + dim2 * np.cov(X.T) + dim2 * beta0/(dim2 + beta0)\n", " * np.outer(X_bar - mu0, X_bar - mu0))\n", " var_X = (var_X + var_X.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim2 + beta0, var_X, var_nu, seed = None)\n", " \n", " var1 = W.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat).reshape([rank, rank, dim2]) + np.dstack([var_Lambda0] * dim2)\n", " var4 = tau * np.matmul(var1, sparse_mat) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim2)[0, :, :]\n", " for t in range(dim2):\n", " var_Lambda1 = var3[ :, :, t]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, t])\n", " X[t, :] = 
np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " \n", " mat_hat = np.matmul(W, X.T)\n", " \n", " var_alpha = alpha + 0.5 * sparse_mat[position].shape[0]\n", " error = sparse_mat - mat_hat\n", " var_beta = beta + 0.5 * np.sum(error[position] ** 2)\n", " tau = np.random.gamma(var_alpha, 1/var_beta)\n", "\n", " X_plus = np.zeros((dim2, rank))\n", " for iter in range(maxiter2): \n", " X_bar = np.mean(X, axis = 0)\n", " var_mu0 = (dim2 * X_bar + beta0 * mu0)/(dim2 + beta0)\n", " var_nu = dim2 + nu0\n", " var_X = np.linalg.inv(np.linalg.inv(W0) \n", " + dim2 * np.cov(X.T) + dim2 * beta0/(dim2 + beta0)\n", " * np.outer(X_bar - mu0, X_bar - mu0))\n", " var_X = (var_X + var_X.T)/2\n", " var_mu0, var_Lambda0 = Normal_Wishart(var_mu0, dim2 + beta0, var_X, var_nu, seed = None)\n", " \n", " var1 = W.T\n", " var2 = kr_prod(var1, var1)\n", " var3 = tau * np.matmul(var2, binary_mat).reshape([rank, rank, dim2]) + np.dstack([var_Lambda0] * dim2)\n", " var4 = tau * np.matmul(var1, sparse_mat) + np.dstack([np.matmul(var_Lambda0, var_mu0)] * dim2)[0, :, :]\n", " for t in range(dim2):\n", " var_Lambda1 = var3[ :, :, t]\n", " inv_var_Lambda1 = np.linalg.inv((var_Lambda1 + var_Lambda1.T)/2)\n", " var_mu = np.matmul(inv_var_Lambda1, var4[:, t])\n", " X[t, :] = np.random.multivariate_normal(var_mu, inv_var_Lambda1)\n", " X_plus += X\n", " \n", " mat_hat = np.matmul(W, X.T)\n", " \n", " var_alpha = alpha + 0.5 * sparse_mat[position].shape[0]\n", " error = sparse_mat - mat_hat\n", " var_beta = beta + 0.5 * np.sum(error[position] ** 2)\n", " tau = np.random.gamma(var_alpha, 1/var_beta)\n", "\n", " X = X_plus/maxiter2\n", " Sigma_eta = np.eye(dim2 - np.max(time_lags))\n", " Sigma_theta = np.eye(time_lags.shape[0])\n", " theta = np.zeros((time_lags.shape[0], rank))\n", " for s in range(rank):\n", " P = X[np.max(time_lags) : dim2, s]\n", " Q = np.zeros((dim2 - np.max(time_lags), time_lags.shape[0]))\n", " for t in range(np.max(time_lags), dim2):\n", " Q[t - np.max(time_lags), :] = X[t 
- time_lags, s]\n", " theta[:, s] = np.matmul(np.matmul(np.matmul(np.linalg.inv(np.matmul(np.matmul(Q.T, np.linalg.inv(Sigma_eta)), Q) \n", " + np.linalg.inv(Sigma_theta)), \n", " Q.T), np.linalg.inv(Sigma_eta)), P)\n", "\n", " return X, theta" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "def st_prediction(dense_mat, sparse_mat, pred_time_steps, back_steps, rank, time_lags, maxiter):\n", " start_time = dense_mat.shape[1] - pred_time_steps\n", " dense_mat0 = dense_mat[:, 0 : start_time]\n", " sparse_mat0 = sparse_mat[:, 0 : start_time]\n", " binary_mat0 = np.zeros((sparse_mat0.shape[0], sparse_mat0.shape[1]))\n", " binary_mat0[np.where(sparse_mat0 > 0)] = 1\n", " dim1 = sparse_mat0.shape[0]\n", " dim2 = sparse_mat0.shape[1]\n", " mat_hat = np.zeros((dim1, pred_time_steps))\n", " \n", " init = {\"W\": np.random.rand(dim1, rank), \n", " \"X\": np.random.rand(dim2, rank)}\n", " # Keep only the learned factors; mat_hat stores the rolling predictions.\n", " _, W, X = BPMF(dense_mat0, sparse_mat0, binary_mat0, init[\"W\"], init[\"X\"], maxiter[0], maxiter[1])\n", " \n", " init[\"W\"] = W.copy()\n", " Sigma_eta = np.eye(dim2 - np.max(time_lags))\n", " Sigma_theta = np.eye(time_lags.shape[0])\n", " theta = np.zeros((time_lags.shape[0], rank))\n", " for s in range(rank):\n", " P = X[np.max(time_lags) : dim2, s]\n", " Q = np.zeros((dim2 - np.max(time_lags), time_lags.shape[0]))\n", " for t in range(np.max(time_lags), dim2):\n", " Q[t - np.max(time_lags), :] = X[t - time_lags, s]\n", " theta[:, s] = np.matmul(np.matmul(np.matmul(np.linalg.inv(np.matmul(np.matmul(Q.T, np.linalg.inv(Sigma_eta)), Q) \n", " + np.linalg.inv(Sigma_theta)), \n", " Q.T), np.linalg.inv(Sigma_eta)), P)\n", " X0 = np.zeros((dim2 + 1, rank))\n", " X0[0 : dim2, :] = X.copy()\n", " X0[dim2, :] = np.einsum('ij, ij -> j', theta, X0[dim2 - time_lags, :])\n", " init[\"X\"] = X0[X0.shape[0] - back_steps : X0.shape[0], :]\n", " mat_hat[:, 0] = np.matmul(W, X0[dim2, :])\n", " \n", " for t in range(1, pred_time_steps):\n", " sparse_mat1 = sparse_mat[:, start_time - back_steps + t : start_time + t]\n", " X, theta = OfflineBPMF(sparse_mat1, init, time_lags, maxiter[2], maxiter[3])\n", " X0 = np.zeros((back_steps + 1, rank))\n", " X0[0 : back_steps, :] = X.copy()\n", " X0[back_steps, :] = np.einsum('ij, ij -> j', theta, X0[back_steps - time_lags, :])\n", " init[\"X\"] = X0[1: back_steps + 1, :]\n", " mat_hat[:, t] = np.matmul(W, X0[back_steps, :])\n", " if (t + 1) % 40 == 0:\n", " print('Time step: {}'.format(t + 1))\n", "\n", " small_dense_mat = dense_mat[:, start_time : dense_mat.shape[1]]\n", " pos = np.where(small_dense_mat > 0)\n", " final_mape = np.sum(np.abs(small_dense_mat[pos] - \n", " mat_hat[pos])/small_dense_mat[pos])/small_dense_mat[pos].shape[0]\n", " final_rmse = np.sqrt(np.sum((small_dense_mat[pos] - \n", " mat_hat[pos]) ** 2)/small_dense_mat[pos].shape[0])\n", " print('Final MAPE: {:.6}'.format(final_mape))\n", " print('Final RMSE: {:.6}'.format(final_rmse))\n", " print()\n", " return mat_hat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main factors influencing such predictions are:\n", "\n", "- The number of back steps $b$ (`back_steps`).\n", "\n", "- `rank`.\n", "\n", "- `maxiter`.\n", "\n", "- `time_lags`."
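, "\n", "\n", "In particular, `time_lags` fixes the lag set $h_1,\\ldots,h_d$ used when estimating the AR coefficients. That regularized least-squares estimate can be sketched in isolation on a synthetic factor series (sizes here are hypothetical; identity covariances, as in the code above):\n", "\n", "```python\n", "import numpy as np\n", "\n", "T, time_lags = 300, np.array([1, 2, 10])\n", "d, h_max = time_lags.shape[0], np.max(time_lags)\n", "x = np.random.rand(T)  # one temporal factor series x_1, ..., x_T\n", "P = x[h_max : T]\n", "Q = np.zeros((T - h_max, d))\n", "for t in range(h_max, T):\n", "    Q[t - h_max, :] = x[t - time_lags]\n", "# With Sigma_eta = I and Sigma_theta = I, the estimate reduces to ridge regression:\n", "theta_hat = np.linalg.solve(Q.T @ Q + np.eye(d), Q.T @ P)\n", "print(theta_hat.shape)  # (3,)\n", "```"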
] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iter: 100\n", "RMSE: 4.49363\n", "\n", "Final MAPE: 0.0937335\n", "Final RMSE: 4.01711\n", "\n", "Time step: 40\n", "Time step: 80\n", "Time step: 120\n", "Time step: 160\n", "Time step: 200\n", "Time step: 240\n", "Time step: 280\n", "Time step: 320\n", "Time step: 360\n", "Time step: 400\n", "Time step: 440\n", "Time step: 480\n", "Time step: 520\n", "Time step: 560\n", "Time step: 600\n", "Time step: 640\n", "Time step: 680\n", "Time step: 720\n", "Final MAPE: 0.113364\n", "Final RMSE: 4.42602\n", "\n", "Running time: 19584 seconds\n" ] } ], "source": [ "import time\n", "start = time.time()\n", "pred_time_steps = 144 * 5\n", "back_steps = 144 * 4 * 1\n", "rank = 50\n", "time_lags = np.array([1, 2, 144])\n", "maxiter = np.array([100, 100, 10, 20])\n", "small_dense_mat = dense_mat[:, dense_mat.shape[1] - pred_time_steps : dense_mat.shape[1]]\n", "mat_hat = st_prediction(dense_mat, sparse_mat, pred_time_steps, back_steps, rank, time_lags, maxiter)\n", "end = time.time()\n", "print('Running time: %d seconds'%(end - start))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Experiment results** of short-term traffic prediction with missing values using Bayesian temporal matrix factorization (BTMF):\n", "\n", "| scenario |`back_steps`|`rank`|`time_lags`| `maxiter` | mape | rmse |\n", "|:----------|-----:|-----:|---------:|---------:|-----------:|----------:|\n", "|**40%, NM**| $144\\times 2$ | 30 | (1,2,144) | (100,100,10,20) | 0.1203 | **4.7566**|\n", "|**Original data**| $144\\times 4$ | 80 | (1,2,144) | (100,100,10,20) | 0.1045 | **4.1626**|\n", "|**20%, RM**| $144\\times 4$ | 50 | (1,2,144) | (100,100,10,20) | 0.1134 | **4.4260**|\n", "|**40%, NM**| $144\\times 4$ | 50 | (1,2,144) | (100,100,10,20) | 0.1190 | **4.6293**|\n", "|**20%, NM**| $144\\times 4$ | 30 | (1,2,144) | (100,100,10,20) | 0.1132 | **4.4421**|\n", 
"|**40%, NM**| $144\\times 4$ | 30 | (1,2,144) | (100,100,10,20) | 0.1180 | **4.6520**|\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }