{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "$$\n", "\\newcommand{\\mat}[1]{\\boldsymbol {#1}}\n", "\\newcommand{\\mattr}[1]{\\boldsymbol {#1}^\\top}\n", "\\newcommand{\\matinv}[1]{\\boldsymbol {#1}^{-1}}\n", "\\newcommand{\\vec}[1]{\\boldsymbol {#1}}\n", "\\newcommand{\\vectr}[1]{\\boldsymbol {#1}^\\top}\n", "\\newcommand{\\rvar}[1]{\\mathrm {#1}}\n", "\\newcommand{\\rvec}[1]{\\boldsymbol{\\mathrm{#1}}}\n", "\\newcommand{\\diag}{\\mathop{\\mathrm {diag}}}\n", "\\newcommand{\\set}[1]{\\mathbb {#1}}\n", "\\newcommand{\\norm}[1]{\\left\\lVert#1\\right\\rVert}\n", "\\newcommand{\\pderiv}[2]{\\frac{\\partial #1}{\\partial #2}}\n", "\\newcommand{\\bb}[1]{\\boldsymbol{#1}}\n", "\\newcommand{\\Tr}[0]{^\\top}\n", "\\newcommand{\\softmax}[1]{\\mathrm{softmax}\\left({#1}\\right)}\n", "$$\n", "\n", "# CS236781: Deep Learning\n", "# Tutorial 11: Variational AutoEncoders\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Introduction\n", "\n", "In this tutorial, we will cover:\n", "\n", "- Discriminative Vs. Generative\n", "- start simple- KDE and GMM\n", "\n", "- VAE\n", " - KL divergence\n", " - representation trick\n", " - VAE loss\n", " - code example\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "execution": { "iopub.execute_input": "2022-03-24T07:25:15.707910Z", "iopub.status.busy": "2022-03-24T07:25:15.707373Z", "iopub.status.idle": "2022-03-24T07:25:16.993157Z", "shell.execute_reply": "2022-03-24T07:25:16.992767Z" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# Setup\n", "%matplotlib inline\n", "import numpy as np\n", "import pandas as pd\n", "import os\n", "import sys\n", "import time\n", "import matplotlib.pyplot as plt\n", "from mpl_toolkits.mplot3d import Axes3D\n", "import seaborn as sns; sns.set()\n", "\n", "\n", "\n", "# pytorch\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "import torchvision\n", "from torchvision import datasets, transforms\n", "from torch.utils.data import DataLoader,Dataset \n", "\n", "\n", "if torch.cuda.is_available():\n", " torch.backends.cudnn.deterministic = True\n", "\n", "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", " \n", "# sklearn imports\n", "from sklearn import mixture\n", "from sklearn.manifold import TSNE" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "execution": { "iopub.execute_input": "2022-03-24T07:25:16.995346Z", "iopub.status.busy": "2022-03-24T07:25:16.995233Z", "iopub.status.idle": "2022-03-24T07:25:17.008877Z", "shell.execute_reply": "2022-03-24T07:25:17.008540Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "plt.rcParams['font.size'] = 20\n", "data_dir = os.path.expanduser('~/.pytorch-datasets')\n", "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Discriminative Vs. Generative" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "recall for our probabalistic notation:\n", "\n", "- Domain: $\\vec{x}^i \\in \\set{R}^D$\n", "- Target: $y^i \\in \\set{Y}$ - typically for classification, a set of classes.\n", "\n", "* When did we solve a regresion problem in the course?\n", "* what was the Target space there?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Discriminative models**, are most of the models we saw in the course so far: we're trying to learn $P(Y|X)$ \n", "for that type of models, we have to use labels, so we use supervised learning setup.\n", "\n", "**Generative models**, are models that learn $P(X)$ rather explicitly (Today), or implicitly (Next week). can also learn $P(X|Y)$ if we know the target space.\n", "\n", "and serve two perposes:\n", "\n", "1. we can use bayes rule: $(Y|X) = \\frac{P(X|Y) P(X)}{P(Y)}$ and since we only like to maximize, we can only look at $P(X|Y) P(X)$, and thus we can classify the example!\n", "\n", "2. We can sample the learned distribution $P(X)$ or $P(X|Y)$ and generate new instances !\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "
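 { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "To make these two purposes concrete, here is a minimal sketch on toy 2D data: we model each class-conditional $P(X|Y=c)$ with a `GaussianMixture`, classify a point via $\\arg\\max_y \\left[ \\log P(x|y) + \\log P(y) \\right]$, and sample new instances from the learned model. The toy data and all variable names here are illustrative only.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# A minimal sketch: one GaussianMixture per class as a stand-in for P(X|Y=c).\n", "# Toy data: two 2D Gaussian blobs, one per class (illustrative only).\n", "rng = np.random.RandomState(0)\n", "X0 = rng.randn(200, 2) + np.array([2.0, 0.0])   # samples of class 0\n", "X1 = rng.randn(200, 2) + np.array([-2.0, 0.0])  # samples of class 1\n", "\n", "gmm_per_class = [\n", "    mixture.GaussianMixture(n_components=1, random_state=0).fit(X)\n", "    for X in (X0, X1)\n", "]\n", "log_prior = np.log(np.array([0.5, 0.5]))  # P(Y): balanced classes here\n", "\n", "# Purpose 1 -- classification via Bayes' rule: argmax_y log P(x|y) + log P(y)\n", "x_test = np.array([[1.5, 0.3]])\n", "log_joint = np.stack(\n", "    [g.score_samples(x_test) for g in gmm_per_class], axis=1\n", ") + log_prior\n", "print('predicted class:', log_joint.argmax(axis=1))\n", "\n", "# Purpose 2 -- generation: sample new instances from the learned P(X|Y=0)\n", "new_samples, _ = gmm_per_class[0].sample(5)\n", "print(new_samples)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "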