{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Biostat M280 Homework 6\n", "\n", "**Due ~~June 7 @ 11:59PM~~ June 9 @ 11:59PM**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Consider again the MLE of the Dirichlet-multinomial model. In [HW5](http://hua-zhou.github.io/teaching/biostatm280-2019spring/hw/hw5/hw05.html), we worked out a Newton's method. In this homework, we explore the MM and EM approach." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q1\n", "\n", "Show that, given iid observations $\\mathbf{x}_1,\\ldots,\\mathbf{x}_n$, the log-likelihood can be written as\n", "$$\n", "L(\\alpha) = \\sum_{i=1}^n \\ln \\binom{|\\mathbf{x}_i|}{\\mathbf{x}_i}\n", "+\\sum_{i=1}^n \\sum_{j=1}^d \\sum_{k=0}^{x_{ij}-1} \\ln(\\alpha_j+k) - \\sum_{i=1}^n \\sum_{k=0}^{|\\mathbf{x}_i|-1} \\ln(|\\alpha|+k).\n", "$$\n", "Hint: $\\Gamma(a + k) / \\Gamma(a) = a (a + 1) \\cdots (a+k-1)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q2\n", "\n", "Suppose $(P_1,\\ldots,P_d) \\in \\Delta_d = \\{\\mathbf{p}: p_i \\ge 0, \\sum_i p_i = 1\\}$ follows a Dirichlet distribution with parameter $\\alpha = (\\alpha_1,\\ldots,\\alpha_d)$. Show that\n", "$$\n", "\t\\mathbf{E}(\\ln P_j) = \\Psi(\\alpha_j) - \\Psi(|\\alpha|),\n", "$$\n", "where $\\Psi(z) = \\Gamma'(z)/\\Gamma(z)$ is the digamma function and $|\\alpha| = \\sum_{j=1}^d \\alpha_j$. Hint: Differentiate the identity \n", "$$\n", "1 = \\int_{\\Delta_d} \\frac{\\Gamma(|\\alpha|)}{\\prod_{j=1}^d \\Gamma(\\alpha_j)} \\prod_{j=1}^d p_j^{\\alpha_j-1} \\, d\\mathbf{p}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q3\n", "\n", "The admixture representation of the Dirichlet-multinomial distribution suggests that we can treat the unobserved multinomial parameters $\\mathbf{p}_1,\\ldots,\\mathbf{p}_n$ as missing data and derive an EM algorithm. Show that the Q function is\n", "$$\n", " Q(\\alpha|\\alpha^{(t)}) = \n", "\\sum_{j=1}^d \\sum_{i=1}^n \\alpha_j \\left[\\Psi(x_{ij}+\\alpha_j^{(t)}) - \\Psi(|\\mathbf{x}_i|+|\\alpha^{(t)}|)\\right] - n \\sum_{j=1}^d \\ln \\Gamma(\\alpha_j) + n \\ln \\Gamma(|\\alpha|) + c^{(t)},\n", "$$\n", "where $c^{(t)}$ is a constant irrelevant to optimization. Explain why it is not easy to maximize the Q function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q4\n", "\n", "We derive an MM algorithm for maximing $L$. Consider the formulation of the log-likelihood that contains terms $\\ln (\\alpha_j + k)$ and $- \\ln (|\\alpha|+k)$. Applying Jensen's inequality to the concave term $\\ln (\\alpha_j + k)$ and supporting hyperplane inequality to the convex term $- \\ln (|\\alpha|+k)$, show that a minorizing function to $L(\\alpha)$ is\n", "$$\n", "\tg(\\alpha|\\alpha^{(t)}) = - \\sum_{k=0}^{\\max_i |\\mathbf{x}_i|-1} \\frac{r_k}{|\\alpha^{(t)}|+k} |\\alpha| + \\sum_{j=1}^d \\sum_{k=0}^{\\max_i x_{ij}-1} \\frac{s_{jk} \\alpha_j^{(t)}}{\\alpha_j^{(t)}+k} \\ln \\alpha_j + c^{(t)},\n", "$$\n", "where $s_{jk} = \\sum_{i=1}^n 1_{\\{x_{ij} > k\\}}$, $r_k = \\sum_{i=1}^n 1_{\\{|\\mathbf{x}_i| > k\\}}$, and $c^{(t)}$ is a constant irrelevant to optimization. Maximizing the surrogate function $g(\\alpha|\\alpha^{(t)})$ is trivial since $\\alpha_j$ are separated. Show that the MM updates are\n", "$$\n", "\t\\alpha_j^{(t+1)} = \\frac{\\sum_{k=0}^{\\max_i x_{ij}-1} \\frac{s_{jk}}{\\alpha_j^{(t)}+k}}{\\sum_{k=0}^{\\max_i |\\mathbf{x}_i|-1} \\frac{r_k}{|\\alpha^{(t)}|+k}} \\alpha_j^{(t)}, \\quad j=1,\\ldots,d.\n", "$$\n", "The quantities $s_{jk}$, $r_k$, $\\max_i x_{ij}$ and $\\max_i |\\mathbf{x}_i|$ only depend on data and can be pre-computed. Comment on whether the MM updates respect the parameter constraint $\\alpha_j>0$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q5\n", "\n", "Write a function for finding MLE of Dirichlet-multinomial distribution given iid observations $\\mathbf{x}_1,\\ldots,\\mathbf{x}_n$, using MM algorithm." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dirmult_mm" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"\"\"\n", " dirmult_mm(X)\n", "\n", "Find the MLE of Dirichlet-multinomial distribution using MM algorithm.\n", "\n", "# Argument\n", "* `X`: an `d`-by-`n` matrix of counts; each column is one data point.\n", "\n", "# Optional argument \n", "* `alpha0`: starting point. \n", "* `maxiters`: the maximum allowable Newton iterations (default 100). \n", "* `tolfun`: the tolerance for relative change in objective values (default 1e-6). \n", "\n", "# Output\n", "# Output\n", "* `logl`: the log-likelihood at MLE. \n", "* `niter`: the number of iterations performed.\n", "# `α`: the MLE.\n", "* `∇`: the gradient at MLE. \n", "* `obsinfo`: the observed information matrix at MLE. \n", "\"\"\"\n", "function dirmult_mm(\n", " X::AbstractMatrix; \n", " α0::Vector = dirmult_mom(X), \n", " maxiters::Int = 100, \n", " tolfun = 1e-6\n", " )\n", " \n", " # your code here\n", " \n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q6\n", "\n", "Re-do [HW5 Q9](http://hua-zhou.github.io/teaching/biostatm280-2019spring/hw/hw5/hw05.html#Q9) using your new `dirmult_mm` function. Compare the number of iterations and run time by MM algorithm to those by Newton's method. Comment on the efficiency of Newton's algorithm vs MM algorithm for this problem." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Q7\n", "\n", "Finally let us re-consider the EM algorithm. The difficulty with the M step in EM algorithm can be remedied. Discuss how we can further minorize the $\\ln \\Gamma(|\\alpha|)$ term in the $Q$ function to produce a minorizing function with all $\\alpha_j$ separated. For this homework, you do **not** need to implement this EM-MM hybrid algorithm. Hint: $z \\mapsto \\ln \\Gamma(z)$ is a convex function." ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "65px", "width": "252px" }, "navigate_menu": true, "number_sections": false, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }