{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "---\n",
    "title: Biostat/Biomath M257 Homework 5\n",
    "subtitle: 'Due May 26 @ 11:59PM'\n",
    "author: Student Name and UID\n",
    "date: today\n",
    "format:\n",
    "  html:\n",
    "    theme: cosmo\n",
    "    embed-resources: true\n",
    "    number-sections: true\n",
    "    toc: true\n",
    "    toc-depth: 4\n",
    "    toc-location: left\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "System information (for reproducibility):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Julia Version 1.9.0\n",
      "Commit 8e630552924 (2023-05-07 11:25 UTC)\n",
      "Platform Info:\n",
      "  OS: macOS (arm64-apple-darwin22.4.0)\n",
      "  CPU: 12 × Apple M2 Max\n",
      "  WORD_SIZE: 64\n",
      "  LIBM: libopenlibm\n",
      "  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)\n",
      "  Threads: 9 on 8 virtual cores\n",
      "Environment:\n",
      "  JULIA_NUM_THREADS = 8\n",
      "  JULIA_EDITOR = code\n"
     ]
    }
   ],
   "source": [
    "versioninfo()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[32m\u001b[1m  Activating\u001b[22m\u001b[39m project at `~/Documents/github.com/ucla-biostat-257/2023spring/hw/hw5`\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[32m\u001b[1mStatus\u001b[22m\u001b[39m `~/Documents/github.com/ucla-biostat-257/2023spring/hw/hw5/Project.toml`\n",
      "  \u001b[90m[f65535da] \u001b[39mConvex v0.15.3\n",
      "  \u001b[90m[a93c6f00] \u001b[39mDataFrames v1.5.0\n",
      "  \u001b[90m[87dc4568] \u001b[39mHiGHS v1.5.1\n",
      "  \u001b[90m[b99e6be6] \u001b[39mHypatia v0.7.2\n",
      "  \u001b[90m[4076af6c] \u001b[39mJuMP v1.11.0\n",
      "  \u001b[90m[2f354839] \u001b[39mPajarito v0.8.2\n",
      "  \u001b[90m[08abe8d2] \u001b[39mPrettyTables v2.2.4\n",
      "  \u001b[90m[3eaba693] \u001b[39mStatsModels v0.7.2\n"
     ]
    }
   ],
   "source": [
    "using Pkg\n",
    "\n",
    "Pkg.activate(pwd())\n",
    "Pkg.instantiate()\n",
    "Pkg.status()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this exercise, we practice using disciplined convex programming (SDP in particular) to solve optimal design problems.\n",
    "\n",
    "## Introduction to optimal design\n",
    "\n",
    "Consider a linear model\n",
    "\\begin{eqnarray*}\n",
    "\ty_i = \\mathbf{x}_i^T \\boldsymbol{\\beta} + \\epsilon_i, \\quad i = 1,\\ldots, n,\n",
    "\\end{eqnarray*}\n",
    "where $\\epsilon_i$ are independent Gaussian noises with common variance $\\sigma^2$. It is well known that the least squares estimate $\\hat{\\boldsymbol{\\beta}}$ is unbiased and has covariance $\\sigma^2 (\\sum_{i=1}^n \\mathbf{x}_i \\mathbf{x}_i^T)^{-1}$. \n",
    "\n",
    "In **exact optimal design**, given total number of $n$ allowable experiments, we want to choose among a list of $m$ candidate design points $\\{\\mathbf{x}_1, \\ldots, \\mathbf{x}_m\\}$ such that the covariance matrix is minimized in some sense. In mathematical terms, we want to find an integer vector $\\mathbf{n} = (n_1, \\ldots, n_m)$ such that $n_i \\ge 0$, $\\sum_{i=1}^m n_i = n$, and the matrix $\\mathbf{V} = \\left( \\sum_{i=1}^m n_i \\mathbf{x}_i \\mathbf{x}_i^T \\right)^{-1}$ is \"small\".\n",
    "\n",
    "In **approximate optimal design**,  we want to find a probability vector $\\mathbf{p} = (p_1, \\ldots, p_m)$ such that $p_i \\ge 0$, $\\sum_{i=1}^m p_i = 1$, and the matrix $\\mathbf{V} = \\left( \\sum_{i=1}^m p_i \\mathbf{x}_i \\mathbf{x}_i^T \\right)^{-1}$ is \"small\".\n",
    "\n",
    "Commonly used optimal design criteria include:\n",
    "\n",
    "- In **$D$-optimal design**, we minimize the determinant of $\\mathbf{V}$\n",
    "\\begin{eqnarray*}\n",
    "\t&\\text{minimize}& \\det \\left( \\sum_{i=1}^m p_i \\mathbf{x}_i \\mathbf{x}_i^T \\right)^{-1} \\\\\n",
    "\t&\\text{subject to}& p_i \\ge 0, \\sum_{i=1}^m p_i = 1.\n",
    "\\end{eqnarray*}\n",
    "\n",
    "- In **$E$-optimal design**, we minimize the spectral norm, i.e., the maximum eigenvalue of $\\mathbf{V}$\n",
    "\\begin{eqnarray*}\n",
    "\t&\\text{minimize}& \\lambda_{\\text{max}} \\left( \\sum_{i=1}^m p_i \\mathbf{x}_i \\mathbf{x}_i^T \\right)^{-1} \\\\\n",
    "\t&\\text{subject to}& p_i \\ge 0, \\sum_{i=1}^m p_i = 1.\t\n",
    "\\end{eqnarray*}\n",
    "Statistically we are minimizing the maximum variance of $\\sum_{j=1}^p a_j \\text{var}(\\hat \\beta_j)$ over all vectors $\\mathbf{a}$ with unit norm.\n",
    "\n",
    "- In **$A$-optimal design**, we minimize the trace of $\\mathbf{V}$\n",
    "\\begin{eqnarray*}\n",
    "\t&\\text{minimize}& \\text{tr} \\left( \\sum_{i=1}^m p_i \\mathbf{x}_i \\mathbf{x}_i^T \\right)^{-1} \\\\\n",
    "\t&\\text{subject to}& p_i \\ge 0, \\sum_{i=1}^m p_i = 1.\n",
    "\\end{eqnarray*}\n",
    "Statistically we are minimizing the total variance $\\sum_{j=1}^p \\text{var}(\\hat \\beta_j)$.\n",
    "\n",
    "## Q1 (10 pts) 3x4 factorial design\n",
    "\n",
    "A drug company ask you to help design a two factor clinical trial, in which treatment A has three levels (A1, A2, and A3) and treatment B has four levels (B1, B2, B3, and B4). Drug company also tells you that the treatment combination A3:B4 has undesirable side effects so we ignore this design point. \n",
    "\n",
    "Using dummy coding with A1 and B1 as the baseline levels, find the matrix $C$ with each row a unique design point.\n",
    "\n",
    "## Q2 (30 pts) Find approximate optimal designs\n",
    "\n",
    "Using semidefinite programming (SDP) software to find the approximate D-, E-, and A-optimal designs for this clinical trial.\n",
    "\n",
    "Hint: This is what I got, which may or may not be correct.\n",
    "\n",
    "```\n",
    "Approximate Optimal Design\n",
    "┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐\n",
    "│ design_pt │   D_opt │   E_opt │   A_opt │ D_opt_n │ E_opt_n │ A_opt_n │\n",
    "│    String │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │ Float64 │\n",
    "├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤\n",
    "│      A1B1 │   0.082 │   0.272 │   0.200 │       8 │      27 │      20 │\n",
    "│      A2B1 │   0.082 │   0.152 │   0.101 │       8 │      15 │      10 │\n",
    "│      A3B1 │   0.097 │   0.114 │   0.104 │      10 │      11 │      10 │\n",
    "│      A1B2 │   0.082 │   0.057 │   0.086 │       8 │       6 │       9 │\n",
    "│      A2B2 │   0.082 │   0.039 │   0.051 │       8 │       4 │       5 │\n",
    "│      A3B2 │   0.097 │   0.057 │   0.068 │      10 │       6 │       7 │\n",
    "│      A1B3 │   0.082 │   0.057 │   0.086 │       8 │       6 │       9 │\n",
    "│      A2B3 │   0.082 │   0.039 │   0.051 │       8 │       4 │       5 │\n",
    "│      A3B3 │   0.097 │   0.057 │   0.068 │      10 │       6 │       7 │\n",
    "│      A1B4 │   0.109 │   0.081 │   0.106 │      11 │       8 │      11 │\n",
    "│      A2B4 │   0.109 │   0.073 │   0.080 │      11 │       7 │       8 │\n",
    "└───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘\n",
    "```\n",
    "\n",
    "## Q3 (30 pts) Find exact optimal designs\n",
    "\n",
    "Using mixed-integer semidefinite programming (SDP) software to find the exact D-, E-, and A-optimal designs for this clinical trial **with $n=100$**.\n",
    "\n",
    "Hint: This is what I got. Apparently I haven't got the E-optimal design right yet.\n",
    "\n",
    "```\n",
    "Exact Optimal Design\n",
    "┌───────────┬───────┬───────┬───────┐\n",
    "│ design_pt │ D_opt │ E_opt │ A_opt │\n",
    "│    String │ Int64 │ Int64 │ Int64 │\n",
    "├───────────┼───────┼───────┼───────┤\n",
    "│      A1B1 │     8 │    90 │    20 │\n",
    "│      A2B1 │     8 │     1 │    10 │\n",
    "│      A3B1 │    10 │     1 │    10 │\n",
    "│      A1B2 │     8 │     1 │     9 │\n",
    "│      A2B2 │     8 │     1 │     5 │\n",
    "│      A3B2 │    10 │     1 │     7 │\n",
    "│      A1B3 │     8 │     1 │     9 │\n",
    "│      A2B3 │     8 │     1 │     5 │\n",
    "│      A3B3 │    10 │     1 │     7 │\n",
    "│      A1B4 │    11 │     1 │    10 │\n",
    "│      A2B4 │    11 │     1 │     8 │\n",
    "└───────────┴───────┴───────┴───────┘\n",
    "```\n",
    "\n",
    "## Q4 (30 bonus points) Optimal design with nuisance parameters\n",
    "\n",
    "Suppose the regression coefficients of linear model $\\boldsymbol{\\beta}$ is partitioned as $\\boldsymbol{\\beta} = (\\boldsymbol{\\beta}_0^T, \\boldsymbol{\\beta}_1^T)^T$, where $\\boldsymbol{\\beta}_0$ are nuisance parameters and $\\boldsymbol{\\beta}_1$ are parameters of primary interest. Given an approximate design $\\mathbf{p} = (p_1, \\ldots, p_m)$, let the information matrix be partitioned accordingly\n",
    "$$\n",
    "\\mathbf{I}(\\mathbf{p}) = \\sum_{i=1}^m p_i \\mathbf{x}_i \\mathbf{x}_i^T =  \\begin{pmatrix}\n",
    "\\mathbf{I}_{00} & \\mathbf{I}_{01} \\\\\n",
    "\\mathbf{I}_{10} & \\mathbf{I}_{11}\n",
    "\\end{pmatrix}.\n",
    "$$\n",
    "Then the information matrix for $\\boldsymbol{\\beta}_1$ adjusted for nuisance parameter $\\boldsymbol{\\beta}_0$ is\n",
    "$$\n",
    "\\mathbf{I}_{1 \\mid 0}(\\mathbf{p}) = \\mathbf{I}_{11} - \\mathbf{I}_{10} \\mathbf{I}_{00}^{-1} \\mathbf{I}_{01}.\n",
    "$$\n",
    "\n",
    "Revisiting the 3x4 factorial design problem in Q1, suppose the drug company only cares about the estimation of A treatment effects. Find the approximate D-, E-, and A-optimal designs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "jupytext": {
   "formats": "ipynb,qmd"
  },
  "kernelspec": {
   "display_name": "Julia 1.9.0",
   "language": "julia",
   "name": "julia-1.9"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.9.0"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "87px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": true,
   "threshold": 4,
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": true,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}