{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "29-197002 経済学研究科D1\n",
    "\n",
    "奥村恭平"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 1\n",
    "\n",
    "Below, we derive\n",
    "$$\n",
    "    \\Pr(d_i = j) = \\frac{\\exp(\\delta_j)}{\\sum_{k \\in \\mathcal{J}} \\exp(\\delta_k)}\n",
    "$$\n",
    "\n",
    "Let $f$ be the pdf of Extreme Value Type I random variable: $f(x) := F'(x) = \\mathrm e^{-x} \\cdot \\exp(-\\exp(-x))$.\n",
    "\n",
    "\\begin{align*}\n",
    "\t\\Pr(d_i = j)\n",
    "\t\t&=\n",
    "\t\t\t\\Pr(\\forall k \\neq j; \\ u_{ij} \\geq u_{ik}) \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\Pr(\\forall k \\neq j; \\ \\epsilon_{ij} + \\delta_j - \\delta_k \\geq \\epsilon_{ik}) \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\prod_{k \\neq j} \\Pr(\\epsilon_{ij} + \\delta_j - \\delta_k \\geq \\epsilon_{ik}) \\quad (\\because \\ \\text{indep.}) \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\int_{-\\infty}^\\infty F(\\epsilon_{ij} + \\delta_j - \\delta_k) f(\\epsilon_{ij}) \\mathrm d \\epsilon_{ij} \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n",
    "\t\t\t\\sum_{k \\neq j} - \\exp(-\\epsilon_{ij} - \\delta_j + \\delta_k)\n",
    "\t\t\t\\right)\n",
    "\t\t\t\\mathrm e^{-\\epsilon_{ij}} \\cdot \\exp(-\\exp(-\\epsilon_{ij}))\n",
    "\t\t\t\\mathrm d \\epsilon_{ij} \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n",
    "\t\t\t\t\\sum_{k} - \\exp(-x - \\delta_j + \\delta_k)\n",
    "\t\t\t\t\\right)\n",
    "\t\t\t\t\\mathrm e^{-x}\n",
    "\t\t\t\t\\mathrm d x \\quad (x := \\epsilon_{ij})\\\\\n",
    "\t\t&=\n",
    "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n",
    "\t\t\t- \\mathrm e^{-x}\n",
    "\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n",
    "\t\t\t\t\\right)\n",
    "\t\t\t\t\\mathrm e^{-x}\n",
    "\t\t\t\t\\mathrm d x \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\int_{0}^\\infty \\exp\\left(\n",
    "\t\t\t-t\n",
    "\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n",
    "\t\t\t\t\\right)\n",
    "\t\t\t\t\\mathrm d t \\quad (t := \\mathrm e^{-x}) \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\left[-\n",
    "\t\t\t\t\\left(\\sum_{k} \\exp(\\delta_k - \\delta_j)\\right)^{-1}\n",
    "\t\t\t\t\\exp\\left(\n",
    "\t\t\t\t-t\n",
    "\t\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n",
    "\t\t\t\t\t\\right)\n",
    "\t\t\t\\right]_{0}^{\\infty} \\\\\n",
    "\t\t&=\n",
    "\t\t\t\\left(\\sum_{k} \\exp(\\delta_k - \\delta_j)\\right)^{-1} \\\\\n",
    "\t\t&=\n",
    "\t\t\\frac{\\exp(\\delta_j)}{\\sum_k \\exp (\\delta_k)}\n",
    "\\end{align*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The log-likelihood is:\n",
    "$$\n",
    "    \\sum_i \\sum_j y_{ij} \\cdot \\log \\left(\n",
    "        \\frac{\\exp(X_{ij} \\beta)}{1 + \\sum_k \\exp(X_{ik} \\beta)}\n",
    "    \\right)\n",
    "$$\n",
    "\n",
    "In order to use `scipy.optimize.minimize`, we define a loss function that is the likelihood multiplied by $-1$.\n",
    "\n",
    "The estimated value is $\\beta^* \\approx (-1.88673942, 0.09717683, -1.02683967)$.\n",
    "\n",
    "Below is the code I used:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-04-19T09:26:49.447234Z",
     "start_time": "2019-04-19T09:26:48.332822Z"
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The estimated beta is [-1.88673942  0.09717683 -1.02683967]\n"
     ]
    }
   ],
   "source": [
    "# import\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "from scipy.optimize import minimize\n",
    "\n",
    "\n",
    "# read csv\n",
    "df = pd.read_csv('../data/DataPS201901.csv', header=None)\n",
    "data = df.values\n",
    "\n",
    "# pre-processing\n",
    "x = [{'hp': 1.0, 'fe': 5.0}, {'hp': 1.2, 'fe': 3.5}, {'hp': 1.4, 'fe': 2.0}]\n",
    "y = data[:,0]\n",
    "\n",
    "age = data[:, 1]\n",
    "gender = data[:, 2]\n",
    "N = data.shape[0] # the number of agents\n",
    "J = data.shape[1] # the number of goods\n",
    "\n",
    "age = age[:, np.newaxis]/100 # rescaling\n",
    "gender = gender[:, np.newaxis]\n",
    "\n",
    "X = []\n",
    "for j in range(data.shape[1]):\n",
    "    temp = np.hstack([np.ones((N,1)), age * x[j]['hp'], gender * x[j]['fe']])\n",
    "    X.append(temp)\n",
    "\n",
    "    \n",
    "# loss function\n",
    "## use numpy to speed up\n",
    "def loss(beta):\n",
    "    # exp_Xij_beta(i,j) = exp(X_ij @ beta)\n",
    "    for j in range(J):\n",
    "        if j == 0:\n",
    "            exp_Xij_beta = np.exp(X[j] @ beta)[:, np.newaxis]\n",
    "        else:\n",
    "            exp_Xij_beta = np.hstack([exp_Xij_beta, np.exp(X[j] @ beta)[:, np.newaxis]])\n",
    "\n",
    "    exp_sum = np.sum(exp_Xij_beta, axis=1) + 1\n",
    "\n",
    "    # z_ij は,(i,j)に対応するlogの中身\n",
    "    z0 = np.ones((N,1)) / exp_sum[:, np.newaxis] # outside option\n",
    "    z1 = exp_Xij_beta / exp_sum[:, np.newaxis]\n",
    "    z = np.hstack([z0, z1])\n",
    "    w = np.log(z)\n",
    "    \n",
    "    # Y_ij = 1 iff agent i chooses option j\n",
    "    Y = np.zeros((N, J+1))\n",
    "    for i in range(N):\n",
    "        Y[i][y[i]] = 1\n",
    "    \n",
    "    return - (Y * w).sum()\n",
    "\n",
    "res = minimize(loss, x0=[1, 1, 1])\n",
    "beta = res.x\n",
    "beta[1] = beta[1] / 100 # rescaling\n",
    "\n",
    "print('The estimated beta is {}'.format(beta))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}