{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# B\n", "\n", "## Python / NumPy / SciPy\n", " \n", "This Jupyter notebook is the Python equivalent of the R code in appendix B, R, pp. 561 - 565, [Introduction to Probability, 2nd Edition](https://www.crcpress.com/Introduction-to-Probability-Second-Edition/Blitzstein-Hwang/p/book/9781138369917), Blitzstein & Hwang.\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.1 Vectors\n", "\n", "Rather than using the usual Python list (array), for probability and statistics it is more convenient to use a [NumPy](https://docs.scipy.org/doc/numpy/user/basics.creation.html) array, as there are many [advantages to using a NumPy array over a regular Python lists](https://stackoverflow.com/questions/993984/what-are-the-advantages-of-numpy-over-regular-python-lists). The rules for operating on NumPy arrays of different shape are quite different from vectors in R (please see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)).\n", "\n", "Building on NumPy, [SciPy](https://docs.scipy.org/doc/scipy-1.2.0/reference/tutorial/basic.html) is provides functions for mathematics, science, and engineering. The [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html) module contains a large number of probability distributions as well as a growing library of statistical functions.\n", "\n", "\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [numpy.array([1, 1, 0, 2.7, 3.1])](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.array.html) | creates the array [1, 1, 0, 2.7, 3.1] |\n", "| [numpy.arange(1, 101)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.arange.html) | creates the array [1, 2, ..., 100] |\n", "| [numpy.arange(1, 100)**3](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html#general-broadcasting-rules) | creates the array [13, 23, ..., 1003] |\n", "| [numpy.zeros(50)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.zeros.html) | creates the array [0, 0,..., 0] of length 50 |\n", "| numpy.arange(0, 100, 3) | creates the array [0, 3, 6, 9, ..., 99] |\n", "| [v[4]](https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.indexing.html#integer-array-indexing) | 5th entry of vector $\\mathtt{v}$ (index starts at 0) |\n", "| [v[[True, True, True, True, False, True, ... , True]]](https://docs.scipy.org/doc/numpy-1.15.1/reference/arrays.indexing.html#boolean-array-indexing) | all but the 5th entry of $\\mathtt{v}$ |\n", "| v[[2, 0, 3]] | 3rd, 1st, 4th entries of array $\\mathtt{v}$ |\n", "| v[v>2] | entries of $\\mathtt{v}$ that exceed 2 |\n", "| [numpy.where(v > 2)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.where.html) | indices of $\\mathtt{v}$ such that entry exceeds 2 |\n", "| numpy.where(v == 7) | indices of $\\mathtt{v}$ such that entry equals 7 |\n", "| [v.min()](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.min.html) | minimum of $\\mathtt{v}$ |\n", "| [v.max()](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.max.html) | maximum of $\\mathtt{v}$ |\n", "| numpy.where(v==v.max()) | indices where $\\mathtt{v.max()}$ is achieved |\n", "| [v.sum()](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.sum.html) | sum of the entries in $\\mathtt{v}$ |\n", "| [v.cumsum()](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.cumsum.html) | cumulative sums of the entries in $\\mathtt{v}$ |\n", "| [v.prod()](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.prod.html) | product of the entries in $\\mathtt{v}$ |\n", "| [scipy.stats.rankdata(v)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rankdata.html) | ranks of the entries in $\\mathtt{v}$ |\n", "| [v.size](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.size.html) | number of elements in array $\\mathtt{v}$ |\n", "| [numpy.sort(v)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.sort.html) | return a sorted copy of array $\\mathtt{v}$ (in increasing order) |\n", "| [numpy.unique(v)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.unique.html) | lists each element of $\\mathtt{v}$ once, without duplicates |\n", "| numpy.unique(v, return_counts=True) | tallies how many times each element of $\\mathtt{v}$ occurs |\n", "| dict(zip(*numpy.unique(v, return_counts=True))) | return mapping of array $\\mathtt{v}$ element keys and frequency values |\n", "| [numpy.hstack([v, w])](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.hstack.html) | concatenates vectors $\\mathtt{v}$ and $\\mathtt{w}$ |\n", "| [numpy.union1d(v, w)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.union1d.html) | union of $\\mathtt{v}$ and $\\mathtt{w}$ as sets |\n", "| [numpy.intersect1d(v, w)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.intersect1d.html) | intersection of $\\mathtt{v}$ and $\\mathtt{w}$ as sets |\n", "| v + w | adds $\\mathtt{v}$ and $\\mathtt{w}$ entrywise, if w is a scalar (see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)) |\n", "| v * w | multiplies $\\mathtt{v}$ and $\\mathtt{w}$ entrywise, if w is a scalar (see [Broadcasting](https://docs.scipy.org/doc/numpy-1.15.0/user/basics.broadcasting.html)) |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.2 Matrices\n", "\n", "\n", "From [Linear Algebra (scipy.linalg)](https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html) page at SciPy.org:\n", "* The use of the [numpy.matrix](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.matrix.html) class is discouraged, since it adds nothing that cannot be accomplished with 2D [numpy.ndarray](https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.ndarray.html) objects.\n", "* [scipy.linalg](https://docs.scipy.org/doc/scipy/reference/linalg.html) contains all the functions in [numpy.linalg](https://docs.scipy.org/doc/numpy-1.15.1/reference/routines.linalg.html), plus some other more advanced ones... Another advantage of using scipy.linalg over numpy.linalg is that it is always compiled with BLAS/LAPACK support... Therefore, unless you don't want to add scipy as a dependency to your numpy program, _use scipy.linalg instead of numpy.linalg_\n", "\n", "Therefore, these notebooks will use numpy.ndarray for matrices, and use scipy.linalg for its linear algebra functionality.\n", "\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| numpy.array([[1,5], [3,7]]) | creates the matrix $\\begin{pmatrix} 1 & 5 \\\\ 3 & 7 \\end{pmatrix}$ |\n", "| [A.shape](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.shape.html) | gives the dimensions of matrix A |\n", "| [numpy.diag(A)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.diag.html) | extracts the diagonal of matrix A |\n", "| numpy.diag([1, 7]) | creates the diagonal matrix $\\begin{pmatrix} 1 & 0 \\\\ 0 & 7 \\end{pmatrix}$ |\n", "| [numpy.stack([u, v, w])](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.stack.html) | binds arrays u, v, w into a matrix, as rows |\n", "| [numpy.column_stack([u, v, w])](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.column_stack.html) | binds arrays u, v, w into a matrix, as columns |\n", "| [A.T](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ndarray.T.html#numpy.ndarray.T) | transpose of matrix A |\n", "| [A[1,2]](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#indexing-arrays) | row 2, column 3 entry of matrix A |\n", "| A[1, :] | row 2 of matrix A (as a vector) |\n", "| A[:, 2] | column 3 of matrix A (as a vector) |\n", "| A[[0,2],:][:,[1,3]] | submatrix of A, keeping rows 1, 3 and columns 2, 4
see [Array object, Combining advanced and basic indexing](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.indexing.html#combining-advanced-and-basic-indexing) |\n", "| A.sum(axis=1) | row sums of matrix A |\n", "| A.mean(axis=1) | row averages of matrix A |\n", "| A.sum(axis=0) | column sums of matrix A |\n", "| A.mean(axis=0) | column averages of matrix A |\n", "| [scipy.linalg.eig(A)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eig.html#scipy.linalg.eig) | eigenvalues and eigenvectors of matrix A |\n", "| [scipy.linalg.inv(A)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html) | matrix A−1 |\n", "| [scipy.linalg.solve(A, b)](https://docs.scipy.org/doc/scipy-1.2.0/reference/generated/scipy.linalg.solve.html) | solves Ax = b for x (where b is a column vector) |\n", "| [A @ B](https://www.python.org/dev/peps/pep-0465/) | matrix multiplication AB |\n", "| [scipy.linalg.fractional_matrix_power(A, k)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.fractional_matrix_power.html#scipy.linalg.fractional_matrix_power) | matrix power Ak |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.3 Math\n", "\n", "* NumPy has many convenient [mathematical functions](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html), which are geared for operations on scalars, vectors and matrices.\n", "* SciPy also offers functions similar to NumPy, but the SciPy submodules are more extensive, and, depending upon your Python distribution, may also already be optimized to take advantage of the Intel Math Kernel Library, LAPACK, and BLAS.\n", "* Thus, unless you have clear reasons for not using SciPy, we suggest you use the scipy submodules whenever possible.\n", "* Note that while many of these functions are provided in the [math](https://docs.python.org/3/library/math.html) standard Python module, the NumPy and SciPy functions below work out of the box on scalar values as well as vectors. \n", "\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [numpy.abs(x)](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.absolute.html) | $\\lvert x \\rvert$ |\n", "| [numpy.exp(x)](https://docs.scipy.org/doc/numpy/reference/generated/numpy.exp.html) | $e^x$ |\n", "| [numpy.log(x)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.log.html) | $log(x)$ |\n", "| numpy.log(x) / numpy.log(b) | $log_{b}(x)$ |\n", "| [numpy.sqrt(x)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.sqrt.html) | $\\sqrt{x}$ |\n", "| [numpy.floor(x)](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.floor.html) | $\\lfloor x \\rfloor$ |\n", "| [numpy.ceil(x)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.ceil.html) | $\\lceil x \\rceil$ |\n", "| [scipy.special.factorial(n)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.factorial.html) | $n!$ |\n", "| [scipy.special.gammaln(n + 1)](https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.special.gammaln.html) | $log(n!)$ (helps prevent overflow) |\n", "| [scipy.special.gamma(a)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.gamma.html) | $\\Gamma(a)$ |\n", "| scipy.special.gammaln(a) | $log(\\Gamma(a))$ (helps prevent overflow) |\n", "| [scipy.special.comb(n, k)](https://docs.scipy.org/doc/scipy-0.15.0/reference/generated/scipy.special.comb.html#scipy.special.comb) | binomial coefficient $\\binom{n}{k}$ |\n", "| pbirthday(k) | (see Birthday problem calculation and simulation, Ch. 1 notebook) |\n", "| [x**2 if x > 0 else \\x**3](https://docs.python.org/3/reference/expressions.html#conditional-expressions) | $x^2$ if $x > 0$, $x^3$ otherwise (piecewise) |\n", "| def f(x): return exp(-x) | defines the function $f$ by $f(x) = e^{-x}$ |\n", "| [scipy.integrate.quad(f, a, b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html) | finds $\\int_{a}^{b} f(x) \\, dx$ numerically |\n", "| [scipy.optimize.fminbound(f, a, b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fminbound.html) | _minimizes_ $f$ numerically on [a, b] |\n", "| [scipy.optimize.fminbound(lambda x: -f(x), a, b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fminbound.html) | _maximizes_ $f$ numerically on [a, b] |\n", "| [scipy.optimize.root(f, x0=numpy.mean([a,b]))](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.root.html) | searches numerically for a zero of $f$ in $[a, b]$
initially guessing at $\\frac{a + b}{2}$|\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.4 Sampling and simulation\n", "\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [numpy.random.permutation(7)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.permutation.html#numpy.random.permutation) | random permutation of 0, 1, ..., 6 |\n", "| numpy.random.permutation(range(1,8)) | random permutation of 1, 2, ..., 7 |\n", "| [numpy.random.choice(range(1,53), size=5, replace=False)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.choice.html#numpy.random.choice) | picks 5 times from 1, 2, ..., 52 (don't replace) |\n", "| numpy.random.choice(list(string.ascii_lowercase),
size=5, replace=False) | picks 5 random letters of the alphabet (don't replace)
see [Common string operations](https://docs.python.org/3/library/string.html) |\n", "| numpy.random.choice(range(1,4), size=5, p=p) | picks 5 times from 1, 2, 3 with probabilities $\\mathtt{p}$ (replace) |\n", "| [[experiment() for _ in range(10**4)]](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) | gathers 104 executions of $\\mathtt{experiment}$ function into array |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.5 Plotting\n", "\n", "* The Python-based Jupyter notebooks here in this repository all make use of the [matplotlib.pyplot](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.html) state-based interface to matplotlib. \n", "* The [%matplotlib inline magic command](https://ipython.readthedocs.io/en/stable/interactive/plotting.html#id1) provides a MATLAB-like way of plotting in the notebook.\n", "* matplotlib.pyplot is mainly intended for interactive plots and simple cases of programmatic plot generation\n", "* If the [matplotlib colormaps](https://matplotlib.org/2.0.2/api/colors_api.html) are not enough, you can find all sorts of color suggestions at [colorbrewer2.org](http://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3).\n", "* The single underscore variable name _ is often used to denote a \"throwaway\" variable whose value is not used.\n", " * For example, in a for loop, if the loop counter is never really used, you can write a for loop like this:\n", " \n", " for _ in range(10):\n", " print(\"I don't care what iteration I'm on!\")\n", "\n", " * _ is also used to assign but ignore return values from functions. In the matplotlib.pyplot.subplots example below, we ignore the fig returned by assigning that to _.\n", "* matplotlib being state-based, you can continue to add to a plot until [matplotlib.pyplot.show](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.show.html#matplotlib.pyplot.show) is called; or in a Jupyter notebook, until the end of a code cell.\n", "\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [x = numpy.linspace(a, b, num=n)](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html) | create a vector of n equally spaced points from a to b |\n", "| [matplotlib.pyplot.plot(x, f(x))](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.plot.html) | graphs the given vector x and a function f on x |\n", "| [matplotlib.pyplot.scatter(x, f(x))](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.scatter.html#matplotlib.pyplot.scatter) | creates scatter plot of the points $(x_i, \\, y_i)$ |\n", "| matplotlib.pyplot.plot(x, y, linestyle=\"--\") | creates line plot of the points $(x_i, \\, y_i)$, with the given [linestyle](https://matplotlib.org/gallery/lines_bars_and_markers/line_styles_reference.html) |\n", "| matplotlib.pyplot.scatter(x,y) | adds the points $(x_i, \\, y_i)$ to the plot |\n", "| lines(x,y) | adds line segments through the $(x_i, \\, y_i)$ to the plot |\n", "| matplotlib.pyplot.plot(x, b*x + a) | adds the line with intercept a, slope b to the plot, given x |\n", "| [matplotlib.pyplot.hist(x, bins=b, col=\"blue\")](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html) | blue histogram of the values in x, with b bins |\n", "| [_, (ax1, ax2) = matplotlib.pyplot.subplots(1, 2)](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.subplots.html) | tells matplotlib we want 2 side-by-side plots (a 1 $\\times$ 2 array of plots)
where the axes assigned are ax1 and ax2 |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.6 Programming\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [x = numpy.pi](https://docs.scipy.org/doc/numpy-1.15.0/reference/constants.html) | sets $\\mathtt{x}$ equal to $\\pi$ |\n", "| x>3 and x<5 | Is $\\mathtt{x > 3}$ and $\\mathtt{x < 5}$? ($\\mathtt{True/False}$) |\n", "| x>3 or x<5 | Is $\\mathtt{x > 3}$ or $\\mathtt{x < 5}$? ($\\mathtt{True/False}$) |\n", "| if n>3: x = x + 1 | adds 1 to $\\mathtt{x}$ if $\\mathtt{n > 3}$ |\n", "| x = x+1 if n==0 else x + 2 | adds 1 to $\\mathtt{x}$ if $\\mathtt{n = 0}$, else adds 2 |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.7 Summary statistics\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [numpy.mean(v)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.mean.html#numpy.mean) | sample mean of array v |\n", "| [numpy.var(v, ddof=1)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.var.html#numpy.var) | sample variance of vector v |\n", "| [numpy.std(v, ddof=1)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.std.html) | sample standard deviation of vector v |\n", "| [numpy.median(v)](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.median.html) | sample median of vector v |\n", "| [numpy.percentile(v, [0, 25, 50, 75, 100])](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.percentile.html) | min, 1st quartile, median, 3rd quartile, max of v |\n", "| numpy.percentile(v, p) | pth sample quantile of vector v |\n", "| [numpy.cov(v, w, ddof=1)[0,1]](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.cov.html) | sample covariance of vectors v and w |\n", "| [numpy.corrcoef(v, w)[0,1]](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.corrcoef.html) | sample correlation of vectors v and w |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## B.8 Distributions\n", "\n", "* [Discrete distributions in scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions)\n", "* [Continuous distributions in scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions)\n", "* [Multivariate distributions in scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html#multivariate-distributions)\n", "* Please see the [Statistical functions page for scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html) for a full list of the probability distributions and statistical functions provided.\n", "\n", "| Command | What it does |\n", "|---------|--------------|\n", "| [help(_distribution_)](https://docs.python.org/3/library/functions.html#help) | shows documentation on the _distribution_ named |\n", "| [scipy.stats.binom.pmf(k, n, p)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binom.html#scipy.stats.binom) | PMF $P(X = k)$ for $X \\sim Bin(n, p)$ |\n", "| scipy.stats.binom.cdf(x, n, p) | CDF $P(X \\leq x)$ for $X \\sim Bin(n, p)$ |\n", "| scipy.stats.binom.ppf(q, n, p) | quantile $P(X \\leq x) \\geq q$ for $X \\sim Bin(n, p)$ |\n", "| scipy.stats.binom.rvs(n, p, size=r) | vector of r i.i.d. $Bin(n, p)$ r.v.s |\n", "| [scipy.stats.geom.pmf(k, p)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.geom.html#scipy.stats.geom) | PMF $P(X=k) = (1-p)^{k-1}\\,p$ |\n", "| [scipy.stats.hypergeom.pmf(k, M, n, N)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.hypergeom.html#scipy.stats.hypergeom) | PMF $P(X=k) = \\frac{\\binom{n}{k}\\binom{M-n}{N-k}}{\\binom{M}{N}}$ |\n", "| [scipy.stats.nbinom.pmf(k, n, p)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.nbinom.html#scipy.stats.nbinom) | PMF $P(X=k) = \\binom{k+n-1}{n-1} \\, p^n \\, (1-p)^k$ |\n", "| [scipy.stats.poisson.pmf(k, mu)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html#scipy.stats.poisson) | PMF $P(X=k) = e^{-\\mu}\\,\\frac{\\mu^k}{k!}$ |\n", "| [scipy.stats.beta.pdf(x,a,b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html#scipy.stats.beta) | PDF $f(x) = \\frac{\\Gamma(a+b)\\,x^{a-1}\\,(1-x)^{b-1}}{\\Gamma(a)\\,\\Gamma(b)}$ |\n", "| [scipy.stats.cauchy.pdf(x)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cauchy.html#scipy.stats.cauchy) | PDF $f(x) \\frac{1}{\\pi \\, (1 + x^2)}$ |\n", "| [scipy.stats.chi2.pdf(x, df)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2.html#scipy.stats.chi2) | PDF $f(x)$ for $X \\sim \\chi^2_{df}$ |\n", "| [scipy.stats.expon.pdf(x, scale=1/b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.expon.html#scipy.stats.expon) | PDF $f(x)$ for $X \\sim Expo(b)$ |\n", "| [scipy.stats.gamma.pdf(x, a, scale=1/b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html#scipy.stats.gamma) | PDF $f(x)$ for $X \\sim Gamma(a, b)$ |\n", "| [scipy.stats.lognorm.pdf(x, loc=m, scale=s)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm) | PDF $f(x)$ for $X \\sim \\mathcal{LN}(m, s^2)$ |\n", "| [scipy.stats.norm.pdf(x, loc=m, scale=s)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html) | PDF $f(x)$ for $X \\sim \\mathcal{N}(m, s^2)$ |\n", "| [scipy.stats.t.pdf(x, df)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html) | PDF $f(x)$ for $X \\sim t_{df}$ |\n", "| [scipy.stats.uniform.pdf(x, a, a+b)](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.uniform.html) | PDF $f(x)$ for $X \\sim Unif(a, b)$ |\n", "\n", "----\n", "\n", "The application programming interface (API for the [scipy.stats](https://docs.scipy.org/doc/scipy/reference/stats.html) module is fairly consistent for the distributions supported. [Discrete distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#discrete-distributions) have pmf, cdf, and rvs functions for probability mass, cumulative distribution, and random value generation. Similarly for [continuous distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#continuous-distributions), the functions for pdf, cdf, and rvs correspond to probability density, cumulative distribution, and random value generation. Note that for Normal distributions, the mean and **standard deviation** need to be provided as the parameters, rather than mean and _variance_.\n", "\n", "Among the [multivariate distributions](https://docs.scipy.org/doc/scipy/reference/stats.html#multivariate-distributions) supported, [scipy.stats.multinomial](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multinomial.html#scipy.stats.multinomial) has pmf and rvs for probability mass function and random value generation. [scipy.stats.multivariate_normal](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multivariate_normal.html#scipy.stats.multivariate_normal) has pdf, cdf, and rvs for probabilty density, cumulative distribution, and random value generation functions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "Joseph K. Blitzstein and Jessica Hwang, Harvard University and Stanford University, © 2019 by Taylor and Francis Group, LLC" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" }, "notebook_info": { "appendix": "B", "author": "Joseph K. Blitzstein, Jessica Hwang", "creator": "buruzaemon", "title": "Introduction to Probability, 1st Edition" } }, "nbformat": 4, "nbformat_minor": 2 }