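{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### 1. The form of exponential family distributions\n", "The probability distributions introduced in the previous sections can in fact be written in a single unified form:\n", "\n", "$$\n", "p(x\\mid\\eta)=h(x)g(\\eta)\\exp[\\eta^T\\mu(x)]\n", "$$\n", "\n", "This is the exponential family of distributions, where $\\eta$ is the natural parameter and $g(\\eta)$ can be viewed as the normalization coefficient. Below we look at how several of the distributions introduced earlier appear once they are rewritten in this form.\n", "\n", "#### Bernoulli distribution\n", "\n", "$$\n", "\\begin{aligned}\n", "p(x\\mid\\mu)&=\\mu^x(1-\\mu)^{1-x}\\\\\n", "&=\\exp[x\\ln\\mu+(1-x)\\ln(1-\\mu)]\\\\\n", "&=(1-\\mu)\\exp\\left[\\ln\\left(\\frac{\\mu}{1-\\mu}\\right)x\\right]\n", "\\end{aligned}\n", "$$\n", "\n", "So $\\eta=\\ln\\left(\\frac{\\mu}{1-\\mu}\\right)$, and solving for $\\mu$ gives:\n", "\n", "$$\n", "\\mu=\\sigma(\\eta)=\\frac{1}{1+\\exp(-\\eta)}\n", "$$\n", "\n", "The corresponding exponential family functions are therefore (here $\\mu(x)$ denotes the function of $x$ in the general form, not the parameter $\\mu$):\n", "\n", "$$\n", "h(x)=1,\\quad g(\\eta)=1-\\mu=1-\\sigma(\\eta)=\\sigma(-\\eta),\\quad \\mu(x)=x\n", "$$\n", "\n", "#### Multinomial distribution for a single observation\n", "\n", "$$\n", "p(x\\mid\\mu)=\\prod_{k=1}^M\\mu_k^{x_k}=\\exp\\left[\\sum_{k=1}^Mx_k\\ln\\mu_k\\right]\n", "$$\n", "\n", "So:\n", "\n", "$$\n", "\\begin{aligned}\n", "h(x)&=1\\\\\n", "g(\\eta)&=1\\\\\n", "\\mu(x)&=(x_1,...,x_M)^T=x\\\\\n", "\\eta&=(\\ln\\mu_1,...,\\ln\\mu_M)^T\n", "\\end{aligned}\n", "$$\n", "\n", "Note: the $\\eta_k$ are not independent of one another, because of the constraint $\\sum_{k=1}^M\\mu_k=1$.\n", "\n", "\n", "#### Univariate Gaussian distribution\n", "$$\n", "\\begin{aligned}\n", "p(x\\mid\\mu,\\sigma^2)&=\\frac{1}{(2\\pi\\sigma^2)^{\\frac{1}{2}}}\\exp\\left[-\\frac{1}{2\\sigma^2}(x-\\mu)^2\\right]\\\\\n", "&=\\frac{1}{(2\\pi\\sigma^2)^{\\frac{1}{2}}}\\exp\\left[-\\frac{1}{2\\sigma^2}x^2+\\frac{\\mu}{\\sigma^2}x-\\frac{1}{2\\sigma^2}\\mu^2\\right]\\\\\n", "&=\\frac{1}{(2\\pi\\sigma^2)^{\\frac{1}{2}}}\\exp\\left[-\\frac{1}{2\\sigma^2}\\mu^2\\right]\\exp\\left[-\\frac{1}{2\\sigma^2}x^2+\\frac{\\mu}{\\sigma^2}x\\right]\n", "\\end{aligned}\n", "$$\n", "\n", "We can set:\n", "\n", "$$\n", "\\begin{aligned}\n", "\\eta&=\\left(\\frac{\\mu}{\\sigma^2},\\frac{-1}{2\\sigma^2}\\right)^T\\\\\n", "\\mu(x)&=(x,x^2)^T\n", "\\end{aligned}\n", "$$\n", "\n", "which finally gives:\n", "\n", "$$\n", "\\begin{aligned}\n", "h(x)&=(2\\pi)^{-\\frac{1}{2}}\\\\\n", "g(\\eta)&=(-2\\eta_2)^{\\frac{1}{2}}\\exp\\left(\\frac{\\eta_1^2}{4\\eta_2}\\right)\n", "\\end{aligned}\n", "$$\n", "\n", "The remaining distributions, such as the multivariate Gaussian, Gamma, Beta, Dirichlet, multinomial, and binomial, can all be converted to exponential family form in a similar way. So what do we gain by writing these distributions in exponential family form? It makes the computations more convenient, especially for maximum likelihood estimation and for finding conjugate priors. The following sections cover each in turn." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2. Maximum likelihood estimation\n", 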
"在做极大似然估计前我们先看一个一般的结论,由于指数族分布必然是一个概率分布,所以有: \n", "\n", "$$\n", "g(\\eta)\\int h(x)exp[\\eta^T\\mu(x)]dx=1\n", "$$ \n", "\n", "两边对$\\eta$求梯度,有: \n", "\n", "$$\n", "\\nabla g(\\eta)\\int h(x)exp[\\eta^T\\mu(x)]dx+g(\\eta)\\int h(x)exp[\\eta^T\\mu(x)]u(x)dx=0\\\\\n", "\\Leftrightarrow -\\nabla g(\\eta)\\frac{1}{g(\\eta)}=g(\\eta)\\int h(x)exp[\\eta^T\\mu(x)]u(x)dx=E[\\mu(x)]\\\\\n", "\\Leftrightarrow -\\nabla ln[g(\\eta)]=E[\\mu(x)]\n", "$$ \n", "\n", "注意,上面的等式是恒成立的哦,我们自然就会猜想,如果是求极大似然估计,它的形式应该也会和上面的等式差不多才对,下面省略求解过程,直接写出极大似然估计的结果: \n", "\n", "$$\n", "-\\nabla ln[g(\\eta_{ML})]=\\frac{1}{N}\\sum_{n=1}^N\\mu(x_n)\n", "$$ \n", "\n", "显然,当$N\\rightarrow\\infty$时,有$\\frac{1}{N}\\sum_{n=1}^N\\mu(x_n)=E[\\mu(x)]$,以及$\\eta_{ML}=\\eta$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 三.共轭先验\n", "\n", "对于指数分布家族的任何成员,都存在一个共轭先验,可以写作如下的形式: \n", "\n", "$$\n", "p(\\eta\\mid \\chi,\\nu)=f(\\chi,\\nu)g(\\eta)^\\nu exp[\\nu\\eta^T\\chi]\n", "$$ \n", "\n", "其中,$f(\\chi,\\nu)$是归一化系数,为了验证该分布是共轭先验,让它与如下的似然函数相乘: \n", "\n", "$$\n", "p(X\\mid\\eta)=(\\prod_{n=1}^Nh(x_n))g(\\eta)^Nexp[\\eta^T\\sum_{n=1}^N\\mu(x_n)]\n", "$$ \n", "\n", "可推得: \n", "\n", "$$\n", "p(\\eta\\mid x,\\chi,\\nu)\\propto g(\\eta)^{\\nu+N}exp[\\eta^T(\\sum_{n=1}^N\\mu(x_n)+\\nu\\chi)]\n", "$$ \n", "\n", "这与先验分布具有相同的形式" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 四.小结一下\n", "\n", "用下图对概率分布这几节的内容做个简单梳理: \n", "![avatar](./source/12_概率分布之间的关系.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }