{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Univariate Distribution\n", "\n", "> In this post, we show the basic usage of Tensorflow Probability (tfp) and how to create univariate distributions. This is a summary of the lecture \"Probabilistic Deep Learning with Tensorflow 2\" from Imperial College London.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Coursera, Tensorflow_probability, ICL]\n", "- image: images/bernoulli.png" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Tensorflow Probability](https://www.tensorflow.org/probability) (tfp for short) is a library for probabilistic reasoning and statistical analysis in Tensorflow. It is part of the wider Tensorflow ecosystem, so it can easily be combined with Tensorflow core." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Packages" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "import tensorflow_probability as tfp\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "tfd = tfp.distributions\n", "plt.rcParams['figure.figsize'] = (10, 6)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tensorflow Version: 2.4.0\n", "Tensorflow Probability Version: 0.11.1\n" ] } ], "source": [ "print(\"Tensorflow Version: \", tf.__version__)\n", "print(\"Tensorflow Probability Version: \", tfp.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Univariate Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From Wikipedia,\n", "\n", "> In statistics, a univariate distribution is a probability distribution of only one random variable. 
This is in contrast to a multivariate distribution, the probability distribution of a random vector (consisting of multiple random variables)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Normal Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the simplest univariate distributions is the normal distribution (also known as the Gaussian distribution). We can create it with Tensorflow Probability." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create a normal distribution from tensorflow distributions\n", "normal = tfd.Normal(loc=0, scale=1)\n", "normal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `loc` is the mean ($\\mu$) of the distribution and `scale` is its standard deviation ($\\sigma$). This gives us a normal distribution object. To generate data from it, we call its `sample` method." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "normal.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It can also generate multiple samples at once." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "normal.sample(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we generate 10000 samples and plot their histogram, we get the familiar bell shape." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.hist(samples.numpy(), bins=50, density=True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you're familiar with statistics, the probability of each sample can be expressed." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "normal.prob(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or you can use log probability." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "normal.log_prob(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exponential distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another example of univariate distribution is exponential distribution. 
This distribution has a single rate parameter, $\\lambda$, and its density is\n", "\n", "$$ f(x; \\lambda) = \\begin{cases} \\lambda e^{-\\lambda x} & x \\ge 0, \\\\ 0 & x < 0 \\end{cases} $$" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "exponential = tfd.Exponential(rate=1)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "exponential.sample(5)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlMAAAFlCAYAAADPim3FAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAQHElEQVR4nO3dXYymd1nH8d/lro0CIsSuRratW5Py0hgquhQUXxBUWtbYmHjQohCIpGlCEY2JrCZ6wkkJaiChsGlqJUZiD6DRSlfqgW8HBNIWEGhryabUdimG1hdUPKgLlwczTYZhln2618w+z0w/n2TTve/n35mredLNd//PPfdd3R0AAM7Oty17AACA3UxMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATE
FADAgpgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATEFADAgpgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAwP5lD7DbHDp6x0LrHrrhyA5PAgCsAjtTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAwEIxVVVXVNUDVXWiqo6eZs0rq+rTVXVvVf3D9o4JALCaznjTzqral+TGJD+X5GSSu6rq9u6+b8Oa5yR5X5IruvvhqvreHZoXAGClLLIzdXmSE939YHc/keTWJFdtWvO6JLd198NJ0t1f3t4xAQBW0yIxdTDJIxuOT66f2+j5SZ5bVX9fVfdU1Ru2+kJVdW1V3V1Vdz/22GNnNzEAwApZJKZqi3O96Xh/kh9NciTJa5L8XlU9/5v+pe6buvtwdx8+cODAUx4WAGDVLPKg45NJLtxwfEGSR7dY83h3fzXJV6vqH5NcluTz2zIlAMCKWmRn6q4kl1TVxVV1XpKrk9y+ac1fJvnJqtpfVc9I8rIk92/vqAAAq+eMO1Pdfaqqrk9yZ5J9SW7p7nur6rr114919/1V9dEkn0ny9SQ3d/fndnJwAIBVsMjHfOnu40mObzp3bNPxu5K8a/tGAwBYfe6ADgAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAAP7lz3AXnXo6B0LrXvohiM7PAkAsJPsTAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATEFADAgpgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATEFADAgpgAABsQUAMCAmAIAGNi/7AGe7g4dvWOhdQ/dcGSHJwEAzoadKQCAATEFADAgpgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQAMLBRTVXVFVT1QVSeq6ui3WPfSqvpaVf3y9o0IALC6zhhTVbUvyY1JrkxyaZJrqurS06x7Z5I7t3tIAIBVtcjO1OVJTnT3g939RJJbk1y1xbq3Jvlwki9v43wAACttkWfzHUzyyIbjk0letnFBVR1M8ktJXpXkpaf7QlV1bZJrk+Siiy56qrM+rXmGHwCspkV2pmqLc73p+N1J3t7dX/tWX6i7b+ruw919+MCBAwuOCACwuhbZmTqZ5MINxxckeXTTmsNJbq2qJDk/yWur6lR3/8V2DAkAsKoWiam7klxSVRcn+WKSq5O8buOC7r74yd9X1QeSfERIAQBPB2eMqe4+VVXXZ+2n9PYluaW7762q69ZfP7bDMwIArKxFdqbS3ceTHN90bsuI6u43zscCANgd3AEdAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATEFADAgpgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAwP5lD8D2OnT0joXXPnTDkR2cBACeHuxMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADIgpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATEFADAgpgAABvYvewCW59DROxZa99ANR3Z4EgDYvexMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGBATAEADLgDOmfkTukAcHp2pgAABsQUAMCAmAIAGBBTAAADYgoAYEBMAQA
MiCkAgIGFYqqqrqiqB6rqRFUd3eL1X6mqz6z/+lhVXbb9owIArJ4zxlRV7UtyY5Irk1ya5JqqunTTsi8k+enufnGSdyS5absHBQBYRYvsTF2e5ER3P9jdTyS5NclVGxd098e6+z/WDz+e5ILtHRMAYDUtElMHkzyy4fjk+rnT+bUkfz0ZCgBgt1jk2Xy1xbnecmHVz2Qtpn7iNK9fm+TaJLnooosWHJHdwjP8AHg6WmRn6mSSCzccX5Dk0c2LqurFSW5OclV3/9tWX6i7b+ruw919+MCBA2czLwDASlkkpu5KcklVXVxV5yW5OsntGxdU1UVJbkvy+u7+/PaPCQCwms74MV93n6qq65PcmWRfklu6+96qum799WNJfj/J9yR5X1UlyanuPrxzYwMArIZFrplKdx9PcnzTuWMbfv/mJG/e3tEAAFafO6ADAAyIKQCAATEFADAgpgAABsQUAMCAmAIAGBBTAAADC91nCraTZ/gBsJfYmQIAGBBTAAADYgoAYEBMAQAMiCkAgAExBQAwIKYAAAbEFADAgJgCABgQUwAAA2IKAGDAs/lYWZ7hB8BuYGcKAGBATAEADIgpAIABMQUAMCCmAAAG/DQfu56f+gNgmexMAQAMiCkAgAExBQAw4JopnjYWvbYqcX0VAIuzMwUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBgQEwBAAyIKQCAATfthC14eDIAi7IzBQAwYGcKBuxgAWBnCgBgQEwBAAz4mA/OAR8HAuxddqYAAAbEFADAgJgCABgQUwAAAy5AhxXiQnWA3cfOFADAgJgCABgQUwAAA66Zgl1o0WurFuUaLICzZ2cKAGBATAEADPiYD3hKHxv6SBDgG9mZAgAYsDMFPCUufgf4RnamAAAG7EwBS+UROsBuJ6aAXUF0AatqoZiqqiuSvCfJviQ3d/cNm16v9ddfm+R/k7yxuz+5zbMCnJHoAs61M8ZUVe1LcmOSn0tyMsldVXV7d9+3YdmVSS5Z//WyJO9f/yfAStruC+mfikVDThjC7rDIztTlSU5094NJUlW3JrkqycaYuirJn3Z3J/l4VT2nqr6/u7+07RMD7HLLDLntJvhgsZg6mOSRDccn8827TlutOZhETAHssN0QZ7thxkXshijcS7cv2S2xvkhM1Rbn+izWpKquTXLt+uH/VNUDC3z/qfOTPH4Ovg/nhvdzb/F+7i17/v2sdy57gnPq/CSP74b/5nM04w+c7oVFYupkkgs3HF+Q5NGzWJPuvinJTQt8z21TVXd39+Fz+T3ZOd7PvcX7ubd4P/cW7+fiFrlp511JLqmqi6vqvCRXJ7l905rbk7yh1rw8yVdcLwUAPB2ccWequ09V1fVJ7szarRFu6e57q+q69dePJTmetdsinMjarRHetHMjAwCsjoXuM9Xdx7MWTBvPHdvw+07ylu0dbduc048V2XHez73F+7m3eD/3Fu/ngmqtgwAAOBsedAwAMLBnY6qqrqiqB6rqRFUdXfY8nL2qurCq/q6q7q+qe6vqbcueibmq2ldVn6qqjyx7FmbWb9T8oar65/X/T39s2TNx9qrqN9f/rP1cVf15VX3HsmdadXsypjY8AufKJJcmuaaqLl3uVAycSvJb3f2iJC9P8hbv557wtiT3L3sItsV7kny0u1+Y5LJ4X3etqjqY5NeTHO7uH8raD55dvdypVt+ejKlseAROdz+R5MlH4LALdfeXnnxwdnf/d9b+oD643KmYqKoLkhxJcvOyZ2Gmqp6d5KeS/HGSdPcT3f2fSx2Kqf1JvrOq9id5Rra4byTfaK/G1Okeb8MuV1WHkrwkySeWPAoz707y20m+vuQ5mPvBJI8l+ZP1j21vrqpnLnsozk53fzHJHyR5OGuPhPtKd//NcqdafXs1phZ6vA27S1U9K8mHk/xGd//Xsufh7FTVLyT5cnffs+xZ2Bb7k/xIkvd390uSfDWJ61R3qap6btY+ybk4yfOSPLOqfnW5U62+vRp
TCz3eht2jqr49ayH1we6+bdnzMPKKJL9YVQ9l7SP4V1XVny13JAZOJjnZ3U/uFn8oa3HF7vSzSb7Q3Y919/8luS3Jjy95ppW3V2NqkUfgsEtUVWXteoz7u/uPlj0PM939O919QXcfytr/m3/b3f7mu0t1978meaSqXrB+6tVJ7lviSMw8nOTlVfWM9T97Xx0/UHBGC90Bfbc53SNwljwWZ+8VSV6f5LNV9en1c7+7fmd+YPnemuSD6395fTAeKbZrdfcnqupDST6ZtZ+k/lTcCf2M3AEdAGBgr37MBwBwTogpAIABMQUAMCCmAAAGxBQAwICYAgAYEFMAAANiCgBg4P8Bg1IkTIOXVt8AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.hist(exponential.sample(10000).numpy(), bins=50, density=True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bernoulli Distribution\n", "\n", "Bernoulli Distribution is also a family of univariate distribution. All we need to describe this distribution is the probabiltiy that 1 is occurred. Otherwise, 0 will be occurred.\n", "\n", "$$ f(x; p) = \\begin{cases} p & \\text{if } k=1, \\\\ q = 1 - p & \\text{if } k = 0 \\end{cases} $$" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "bernoulli = tfd.Bernoulli(probs=0.8)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bernoulli.sample(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This distribution generates only two data, 0 and 1." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Probability result 0.20000000298023224 for k = 0\n", "Probability result 0.4000000059604645 for k = 0.5\n", "Probability result 0.800000011920929 for k = 1\n", "Probability result 0.05000000074505806 for k = -1\n" ] } ], "source": [ "for k in [0, 0.5, 1, -1]:\n", " print('Probability result {} for k = {}'.format(bernoulli.prob(k), k))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We already define the probability of 1 to 0.8, so the probability of 0 will be 0.2. You can see that the probability of unexpected data will be strange probability." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Batch Distributions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An advantage of Tensorflow Probability distributions is that a single object can represent a batch of distributions from the same family, each with its own parameters." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "bernoulli_batch = tfd.Bernoulli(probs=[0.1, 0.25, 0.5, 0.75, 0.9])" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bernoulli_batch" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bernoulli_batch.sample(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can create a batch with a higher-rank shape by passing a nested list as `probs`; here the batch shape is (1, 3, 2)." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "probs = [[[0.5, 0.5], \n", " [0.8, 0.3], \n", " [0.25, 0.75]]]\n", "bernoulli_batch_2D = tfd.Bernoulli(probs=probs)\n", "bernoulli_batch_2D" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bernoulli_batch_2D.sample(5)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bernoulli_batch_2D.prob([[[1, 0], \n", " [0, 0], \n", " [1, 1]]])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": 
"ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" } }, "nbformat": 4, "nbformat_minor": 4 }