{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[MIT License](https://github.com/cdslaborg/paramonte#license) \n", "[ParaMonte: plain powerful parallel Monte Carlo library](https://github.com/cdslaborg/paramonte). \n", "Copyright (C) 2012-present, [The Computational Data Science Lab](https://www.cdslab.org/#about) \n", "https://github.com/cdslaborg/paramonte \n", "[References](https://www.cdslab.org/paramonte/notes/overview/preface/#how-to-acknowledge-the-use-of-the-paramonte-library-in-your-work) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why do we need to work with the logarithm of the mathematical objective functions? \n", " \n", " \n", "**Short Answer**:\n", "\n", "> In optimization and Monte Carlo sampling problems, since the mathematical objective functions (e.g., probability density functions) can take extremely small or large values, we often work with their natural logarithms instead. **This is the reason behind the naming convention used in the ParaMonte library for** the user's objective functions: **getLogFunc**, implying that **the user must provide a function that returns the natural logarithm of the target objective function**. \n", "\n", "**Long Answer**:\n", "\n", "Why do we often prefer to work with the logarithm of functions in optimization and Monte Carlo sampling problems? \n", "\n", "To find out, consider the following simple mathematical objective function `getLogFunc_bad()` that implements a 4-dimensional Standard MultiVariate Normal (**MVN**) distribution, whose generic Probability Density Function is given by the following equation (see also this [Wikipedia article](https://en.wikipedia.org/wiki/Multivariate_normal_distribution)), " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\large\n", "\\pi (x, \\mu, \\Sigma, k = 4) = (2\\pi)^{-\\frac{k}{2}} ~ |\\Sigma|^{-\\frac{1}{2}} ~ \n", "\\exp\n", "\\bigg( \n", "-\\frac{1}{2} (x-\\mu)^T ~ \\Sigma^{-1} ~ (x-\\mu)\n", "\\bigg)\n", "$$" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy.stats import multivariate_normal\n", "\n", "NDIM = 4 # the number of dimensions of the domain of the objective function: MVN\n", "\n", "mvn = multivariate_normal ( mean = np.zeros(4) # This is the mean of the MVN distribution.\n", " , cov = np.eye(4) # This is the covariance matrix of the MVN distribution.\n", " )\n", "\n", "def getLogFunc_bad(point): return np.log(mvn.pdf(point))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose we want to sample this MVN Probability Density Function (PDF) via the [ParaMonte Monte Carlo simulation library](https://github.com/cdslaborg/paramonte/). To do so, the ParaMonte library sampler routines will make repeated calls to this objective function that we have implmented in the above. \n", "\n", "However, notice in the above implementation that we have suffixed the objective function with `_bad`. There are several good **important** reasons for such naming:\n", "\n", "+ The evaluation of this function involves a `log(exp())` term in its definition. If the origin of the `exp()` term is not clear to you, take a look at the definition of the MVN distribution in the equation provided in the above. **This is computationally very expensive** and in general, is considered a bad implementation. \n", " \n", " \n", "+ The evaluation of the function as implemented in the above requires an inverse-covariance matrix computation on each call made to `getLogFunc_bad()`. **This is completely redundant** as the value of the covariance matrix -- and therefore, its inverse -- does not change throughout the simulation. Consequently, **a lot of computational resources are wasted for nothing**. \n", " \n", " \n", "+ The above implementation of the MVN distribution is quite prone to numerical **overflow** and **underflow**, which could **cause the simulations to crash at runtime**. For example, when the input value for `point` happens to be too far from the `mean` of the MVN distribution, it is likely for the resulting MVN PDF value to be so small that the computer rounds it to zero. Then `np.log(0.0)` in `getLogFunc_bad()` becomes undefined and the simulation crashes. That would only be the best-case scenario in which the crash will alert you about the problem. But more frequently, your code may not even complain that it is working with infinities and not doing any useful work.\n", " \n", "It is, therefore, important to implement a **numerically efficient**, **fault-tolerant** MVN PDF calculator that resolves all of the above concerns. This is possible if we take a second look at the equation for the MVN PDF in the above and try directly implement its logarithm as our computational objective function," ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "NDIM = 4 # the number of dimensions of the domain of the objective function: MVN\n", "MEAN = np.zeros(NDIM) # This is the mean of the MVN distribution.\n", "COVMAT = np.eye(NDIM) # This is the covariance matrix of the MVN distribution.\n", "INVCOV = np.linalg.inv(COVMAT) # This is the inverse of the covariance matrix of MVN.\n", "\n", "# This is the log of the coefficient used in the definition of the MVN.\n", "\n", "MVN_COEF = NDIM * np.log( 1. / np.sqrt(2.*np.pi) ) + np.log( np.sqrt(np.linalg.det(INVCOV)) )\n", "\n", "def getLogFunc(point): \n", " \"\"\"\n", " Return the logarithm of the MVN PDF.\n", " \"\"\"\n", " normedPoint = MEAN - point\n", " return MVN_COEF - 0.5 * ( np.dot( normedPoint, np.matmul(INVCOV,normedPoint) ) )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Performance benchmark\n", "Now, let's compare the performance of the two implementations `getLogFunc()` and `getLogFunc_bad()` in the above," ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.71 µs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%timeit getLogFunc(np.double([0,0,0,0]))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "24.3 µs ± 483 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)\n" ] } ], "source": [ "%timeit getLogFunc_bad(np.double([0,0,0,0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The good implementation is on average **three to five times more efficient** than the naive implementation of the objective function!\n", "\n", "### Fault tolerance\n", "\n", "Now, let's compare the fault-tolerance of the two implementations by assigning large values to the elements of the input `point` array," ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-200000003.67575413" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getLogFunc(point = NDIM*[10000])" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\shahmoradia\\Anaconda3\\lib\\site-packages\\ipykernel_launcher.py:10: RuntimeWarning: divide by zero encountered in log\n", " # Remove the CWD from sys.path while we load stuff.\n" ] }, { "data": { "text/plain": [ "-inf" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "getLogFunc_bad(point = NDIM*[10000])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, this may seem like being too meticulous, but, **a good fault-tolerant implementation of the objective function is absolutely essential in Monte Carlo simulations**, and this example objective function here is no exception. The `-inf` value that `getLogFunc_bad()` yields, will certainly lead to a severe catastrophic crash of a Monte Carlo simulation (You can [try it with the ParaMonte library's ParaDRAM sampler](https://nbviewer.jupyter.org/github/cdslaborg/paramontex/blob/main/Python/Jupyter/sampling_multivariate_distribution_function_via_paradram/sampling_multivariate_distribution_function_via_paradram.ipynb) at your own risk!). " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }