{
"cells": [
{
"cell_type": "markdown",
"id": "2ad87afe",
"metadata": {},
"source": [
"# Discrete Data and the Multinomial Distribution\n",
"\n",
"- **[1]** (##) We consider IID data $D = \\{x_1,x_2,\\ldots,x_N\\}$ obtained from tossing a $K$-sided die. We use a *binary selection variable*\n",
"$$x_{nk} \\equiv \\begin{cases} 1 & \\text{if $x_n$ lands on $k$th face}\\\\\n",
" 0 & \\text{otherwise}\n",
"\\end{cases}\n",
"$$\n",
"with probabilities $p(x_{nk} = 1)=\\theta_k$. \n",
" (a) Write down the probability for the $n$th observation $p(x_n|\\theta)$ and derive the log-likelihood $\\log p(D|\\theta)$. \n",
" (b) Derive the maximum likelihood estimate for $\\theta$.\n",
"\n",
"\n",
"- **[2]** (#) In the notebook, Laplace's generalized rule of succession (the probability that we throw the $k$th face at the next toss) was derived as \n",
"$$\\begin{align*}\n",
"p(x_{\\bullet,k}=1|D) = \\frac{m_k + \\alpha_k }{ N+ \\sum_k \\alpha_k}\n",
"\\end{align*}$$\n",
"Provide an interpretation of the variables $m_k,N,\\alpha_k,\\sum_k\\alpha_k$.\n",
"\n",
"\n",
"- **[3]** (##) Show that Laplace's generalized rule of succession can be worked out to a prediction that is composed of a prior prediction and data-based correction term. \n",
"\n",
"\n",
"- **[4]** (#) Verify that \n",
" (a) the categorial distribution is a special case of the multinomial for $N=1$. \n",
" (b) the Bernoulli is a special case of the categorial distribution for $K=2$. \n",
" (c) the binomial is a special case of the multinomial for $K=2$.\n",
"\n",
"\n",
"- **[5]** (###) Determine the mean, variance and mode of a Beta distribution.\n",
"\n",
"\n",
"- **[6]** (###) Consider a data set of binary variables $D=\\{x_1,x_2,\\ldots,x_N\\}$ with a Bernoulli distribution $\\mathrm{Ber}(x_k|\\mu)$ as data generating distribution and a Beta prior for $\\mu$. Assume that you make $n$ observations with $x=1$ and $N-n$ observations with $x=0$. Now consider a new draw $x_\\bullet$. We are interested in computing $p(x_\\bullet|D)$. Show that the mean value for $p(x_\\bullet|D)$ lies in between the prior mean and Maximum Likelihood estimate.\n",
"\n",
"\n",
"- **[7]** Consider a data set $D = \\{(x_1,y_1), (x_2,y_2),\\dots,(x_N,y_N)\\}$ with 1-of-$K$ notation for the discrete classes, i.e.,\n",
"$$\\begin{equation*} y_{nk} = \\begin{cases} 1 & \\text{if $y_n$ in $k$th class} \\\\\n",
" 0 & \\text{otherwise} \n",
" \\end{cases}\n",
"\\end{equation*}$$\n",
"together with class-conditional distribution $p(x_n| y_{nk}=1,\\theta) = \\mathcal{N}(x_n|\\mu_k,\\Sigma)$ and multinomial prior $p(y_{nk}=1) = \\pi_k$. \n",
" (a) Proof that the joint log-likelihood is given by\n",
"$$\\begin{equation*}\n",
"\\log p(D|\\theta) = \\sum_{n,k} y_{nk} \\log \\mathcal{N}(x_n|\\mu_k,\\Sigma) + \\sum_{n,k} y_{nk} \\log \\pi_k\n",
"\\end{equation*}$$ \n",
" (b) Show now that the MLE of the *class-conditional* mean is given by\n",
"$$\\begin{equation*}\n",
" \\hat \\mu_k = \\frac{\\sum_n y_{nk} x_n}{\\sum_n y_{nk}} \n",
"\\end{equation*}\n",
"$$\n",
"\n",
"\n",
"\n",
"\n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fc095f2c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.6.3",
"language": "julia",
"name": "julia-1.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}