{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "29-197002 経済学研究科D1\n", "\n", "奥村恭平" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 1\n", "\n", "Below, we derive\n", "$$\n", " \\Pr(d_i = j) = \\frac{\\exp(\\delta_j)}{\\sum_{k \\in \\mathcal{J}} \\exp(\\delta_k)}\n", "$$\n", "\n", "Let $f$ be the pdf of Extreme Value Type I random variable: $f(x) := F'(x) = \\mathrm e^{-x} \\cdot \\exp(-\\exp(-x))$.\n", "\n", "\\begin{align*}\n", "\t\\Pr(d_i = j)\n", "\t\t&=\n", "\t\t\t\\Pr(\\forall k \\neq j; \\ u_{ij} \\geq u_{ik}) \\\\\n", "\t\t&=\n", "\t\t\t\\Pr(\\forall k \\neq j; \\ \\epsilon_{ij} + \\delta_j - \\delta_k \\geq \\epsilon_{ik}) \\\\\n", "\t\t&=\n", "\t\t\t\\prod_{k \\neq j} \\Pr(\\epsilon_{ij} + \\delta_j - \\delta_k \\geq \\epsilon_{ik}) \\quad (\\because \\ \\text{indep.}) \\\\\n", "\t\t&=\n", "\t\t\t\\int_{-\\infty}^\\infty F(\\epsilon_{ij} + \\delta_j - \\delta_k) f(\\epsilon_{ij}) \\mathrm d \\epsilon_{ij} \\\\\n", "\t\t&=\n", "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n", "\t\t\t\\sum_{k \\neq j} - \\exp(-\\epsilon_{ij} - \\delta_j + \\delta_k)\n", "\t\t\t\\right)\n", "\t\t\t\\mathrm e^{-\\epsilon_{ij}} \\cdot \\exp(-\\exp(-\\epsilon_{ij}))\n", "\t\t\t\\mathrm d \\epsilon_{ij} \\\\\n", "\t\t&=\n", "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n", "\t\t\t\t\\sum_{k} - \\exp(-x - \\delta_j + \\delta_k)\n", "\t\t\t\t\\right)\n", "\t\t\t\t\\mathrm e^{-x}\n", "\t\t\t\t\\mathrm d x \\quad (x := \\epsilon_{ij})\\\\\n", "\t\t&=\n", "\t\t\t\\int_{-\\infty}^\\infty \\exp\\left(\n", "\t\t\t- \\mathrm e^{-x}\n", "\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n", "\t\t\t\t\\right)\n", "\t\t\t\t\\mathrm e^{-x}\n", "\t\t\t\t\\mathrm d x \\\\\n", "\t\t&=\n", "\t\t\t\\int_{0}^\\infty \\exp\\left(\n", "\t\t\t-t\n", "\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n", "\t\t\t\t\\right)\n", "\t\t\t\t\\mathrm d t \\quad (t := \\mathrm e^{-x}) \\\\\n", "\t\t&=\n", "\t\t\t\\left[-\n", "\t\t\t\t\\left(\\sum_{k} \\exp(\\delta_k - \\delta_j)\\right)^{-1}\n", "\t\t\t\t\\exp\\left(\n", "\t\t\t\t-t\n", "\t\t\t\t\t\\sum_{k} \\exp(\\delta_k - \\delta_j)\n", "\t\t\t\t\t\\right)\n", "\t\t\t\\right]_{0}^{\\infty} \\\\\n", "\t\t&=\n", "\t\t\t\\left(\\sum_{k} \\exp(\\delta_k - \\delta_j)\\right)^{-1} \\\\\n", "\t\t&=\n", "\t\t\\frac{\\exp(\\delta_j)}{\\sum_k \\exp (\\delta_k)}\n", "\\end{align*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The log-likelihood is:\n", "$$\n", " \\sum_i \\sum_j y_{ij} \\cdot \\log \\left(\n", " \\frac{\\exp(X_{ij} \\beta)}{1 + \\sum_k \\exp(X_{ik} \\beta)}\n", " \\right)\n", "$$\n", "\n", "In order to use `scipy.optimize.minimize`, we define a loss function that is the likelihood multiplied by $-1$.\n", "\n", "The estimated value is $\\beta^* \\approx (-1.88673942, 0.09717683, -1.02683967)$.\n", "\n", "Below is the code I used:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-04-19T09:26:49.447234Z", "start_time": "2019-04-19T09:26:48.332822Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The estimated beta is [-1.88673942 0.09717683 -1.02683967]\n" ] } ], "source": [ "# import\n", "import pandas as pd\n", "import numpy as np\n", "from scipy.optimize import minimize\n", "\n", "\n", "# read csv\n", "df = pd.read_csv('../data/DataPS201901.csv', header=None)\n", "data = df.values\n", "\n", "# pre-processing\n", "x = [{'hp': 1.0, 'fe': 5.0}, {'hp': 1.2, 'fe': 3.5}, {'hp': 1.4, 'fe': 2.0}]\n", "y = data[:,0]\n", "\n", "age = data[:, 1]\n", "gender = data[:, 2]\n", "N = data.shape[0] # the number of agents\n", "J = data.shape[1] # the number of goods\n", "\n", "age = age[:, np.newaxis]/100 # rescaling\n", "gender = gender[:, np.newaxis]\n", "\n", "X = []\n", "for j in range(data.shape[1]):\n", " temp = np.hstack([np.ones((N,1)), age * x[j]['hp'], gender * x[j]['fe']])\n", " X.append(temp)\n", "\n", " \n", "# loss function\n", "## use numpy to speed up\n", "def loss(beta):\n", " # exp_Xij_beta(i,j) = exp(X_ij @ beta)\n", " for j in range(J):\n", " if j == 0:\n", " exp_Xij_beta = np.exp(X[j] @ beta)[:, np.newaxis]\n", " else:\n", " exp_Xij_beta = np.hstack([exp_Xij_beta, np.exp(X[j] @ beta)[:, np.newaxis]])\n", "\n", " exp_sum = np.sum(exp_Xij_beta, axis=1) + 1\n", "\n", " # z_ij は,(i,j)に対応するlogの中身\n", " z0 = np.ones((N,1)) / exp_sum[:, np.newaxis] # outside option\n", " z1 = exp_Xij_beta / exp_sum[:, np.newaxis]\n", " z = np.hstack([z0, z1])\n", " w = np.log(z)\n", " \n", " # Y_ij = 1 iff agent i chooses option j\n", " Y = np.zeros((N, J+1))\n", " for i in range(N):\n", " Y[i][y[i]] = 1\n", " \n", " return - (Y * w).sum()\n", "\n", "res = minimize(loss, x0=[1, 1, 1])\n", "beta = res.x\n", "beta[1] = beta[1] / 100 # rescaling\n", "\n", "print('The estimated beta is {}'.format(beta))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }