{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# <center>Block 10: the gravity equation</center>\n",
    "### <center>Alfred Galichon (NYU & Sciences Po)</center>\n",
    "## <center>'math+econ+code' masterclass on optimal transport and economic applications</center>\n",
    "<center>© 2018-2021 by Alfred Galichon. Past and present support from NSF grant DMS-1716489, ERC grant CoG-866274, and contributions by Jules Baudet, Pauline Corblet, Gregory Dannay, and James Nesbit are acknowledged.</center>\n",
    "\n",
    "#### <center>With python code examples</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Learning objectives\n",
    "\n",
    "* Regularized optimal transport\n",
    "\n",
    "* The gravity equation\n",
    "\n",
    "* Generalized linear models\n",
    "\n",
    "* Pseudo-Poisson maximum likelihood estimation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## References\n",
    "\n",
    "* Anderson and van Wincoop (2003). \"Gravity with Gravitas: A Solution to the Border Puzzle\". *American Economic Review*.\n",
    "\n",
    "* Head and Mayer (2014). \"Gravity Equations: Workhorse, Toolkit and Cookbook\". *Handbook of International Economics*.\n",
    "\n",
    "* Choo and Siow (2005). \"Who marries whom and why\". *Journal of Political Economy*.\n",
    "\n",
    "* Gourieroux, Trognon, Monfort (1984). \"Pseudo Maximum Likelihood Methods: Theory\". *Econometrica*.\n",
    "\n",
    "* McCullagh and Nelder (1989). *Generalized Linear Models*. Chapman and Hall/CRC.\n",
    "\n",
    "* Santos Silva and Tenreyro (2006). \"The Log of Gravity\". *Review of Economics and Statistics*.\n",
    "\n",
    "* Yotov et al. (2011). *An advanced guide to trade policy analysis*. WTO.\n",
    "\n",
    "* Guimares and Portugal (2012). \"Real Wages and the Business Cycle: Accounting for Worker, Firm, and Job Title Heterogeneity\". *AEJ: Macro*.\n",
    "\n",
    "* Dupuy and G (2014), \"Personality traits and the marriage market\". *Journal of Political Economy*.\n",
    "\n",
    "* Dupuy, G and Sun (2019), \"Estimating matching affinity matrix under low-rank constraints\". *Information and Inference*.\n",
    "\n",
    "* Carlier, Dupuy, Galichon and Sun \"SISTA: learning optimal transport costs under sparsity constraints.\" *Communications on Pure and Applied Mathematics* (forthcoming)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Motivation\n",
    "\n",
    "The gravity equation is a very useful tool for explaining trade flows by various measures of proximity between countries.\n",
    "\n",
    "A number of regressors have been proposed. They include: geographic distance, common official languague, common colonial past, share of common religions, etc.\n",
    "\n",
    "The dependent variable is the volume of exports from country $i$ to country $n$, for each pair of country $\\left(  i,n\\right)$.\n",
    "\n",
    "Today, we shall see a close connection between gravity models of international trade and separable matching models.\n",
    "\n",
    "---\n",
    "\n",
    "To start with, let's load some of the libraries we shall need."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import os\n",
    "import pandas as pd\n",
    "import string as str\n",
    "import math\n",
    "import sys\n",
    "import time\n",
    "\n",
    "from scipy import optimize, special\n",
    "# !python -m pip install -i https://pypi.gurobi.com gurobipy ## only if Gurobi not here\n",
    "import gurobipy as grb"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And let's load our data, which comes from the book *An Advanced Guide to Trade Policy Analysis: The Structural Gravity Mode*, by Yotov et al. We will estimate the gravity model using optimal transport as well as using Poisson regression."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>exporter</th>\n",
       "      <th>importer</th>\n",
       "      <th>year</th>\n",
       "      <th>trade</th>\n",
       "      <th>DIST</th>\n",
       "      <th>ln_DIST</th>\n",
       "      <th>CNTG</th>\n",
       "      <th>LANG</th>\n",
       "      <th>CLNY</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ARG</td>\n",
       "      <td>ARG</td>\n",
       "      <td>1986</td>\n",
       "      <td>61288.590263</td>\n",
       "      <td>533.908240</td>\n",
       "      <td>6.280224</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ARG</td>\n",
       "      <td>AUS</td>\n",
       "      <td>1986</td>\n",
       "      <td>27.764874</td>\n",
       "      <td>12044.574134</td>\n",
       "      <td>9.396370</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ARG</td>\n",
       "      <td>AUT</td>\n",
       "      <td>1986</td>\n",
       "      <td>3.559843</td>\n",
       "      <td>11751.146521</td>\n",
       "      <td>9.371706</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ARG</td>\n",
       "      <td>BEL</td>\n",
       "      <td>1986</td>\n",
       "      <td>96.102567</td>\n",
       "      <td>11305.285764</td>\n",
       "      <td>9.333026</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ARG</td>\n",
       "      <td>BGR</td>\n",
       "      <td>1986</td>\n",
       "      <td>3.129231</td>\n",
       "      <td>12115.572046</td>\n",
       "      <td>9.402246</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  exporter importer  year         trade          DIST   ln_DIST  CNTG  LANG  \\\n",
       "0      ARG      ARG  1986  61288.590263    533.908240  6.280224     0     0   \n",
       "1      ARG      AUS  1986     27.764874  12044.574134  9.396370     0     0   \n",
       "2      ARG      AUT  1986      3.559843  11751.146521  9.371706     0     0   \n",
       "3      ARG      BEL  1986     96.102567  11305.285764  9.333026     0     0   \n",
       "4      ARG      BGR  1986      3.129231  12115.572046  9.402246     0     0   \n",
       "\n",
       "   CLNY  \n",
       "0     0  \n",
       "1     0  \n",
       "2     0  \n",
       "3     0  \n",
       "4     0  "
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#thepath = os.path.join(os.getcwd(),'data_mec_optim/gravity_wtodata/')\n",
    "thepath = 'https://raw.githubusercontent.com/math-econ-code/mec_optim_2021-01/master/data_mec_optim/gravity_wtodata/'\n",
    "\n",
    "\n",
    "tradedata = pd.read_csv(os.path.join(thepath ,'1_TraditionalGravity_from_WTO_book.csv'), sep=',')\n",
    "tradedata = tradedata[['exporter', 'importer','year', 'trade', 'DIST','ln_DIST', 'CNTG', 'LANG', 'CLNY']]\n",
    "\n",
    "tradedata.sort_values(['year','exporter','importer'], inplace = True)\n",
    "tradedata.reset_index(inplace = True, drop = True)\n",
    "\n",
    "nbt = len(tradedata['year'].unique())\n",
    "nbi = len(tradedata['importer'].unique())\n",
    "nbk = 4\n",
    "\n",
    "tradedata.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Consistent the common practice, we only look at the flows of trade between pairs of distinct countries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "tradedata.loc[np.where(tradedata['importer']==tradedata['exporter'],True, False),['DIST', 'ln_DIST', 'CNTG', 'LANG', 'CLNY']]=0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's prepare the data so that we can use it. We want to construct \n",
    "* $D_{ni,t}^k$ which is the $k$th pairwise discrepancy measure between importer $n$ and exporter $i$ at time $t$\n",
    "\n",
    "* $X_{n,t}$ total value of expenditure of importer $n$ at time $t$\n",
    " \n",
    "* $Y_{i,t}$ total value of production of exporter $i$ at time $t$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "Xhatnit = []\n",
    "Dnikt = []\n",
    "\n",
    "years = tradedata['year'].unique()\n",
    "for t, year in enumerate(years):\n",
    "    \n",
    "    tradedata_year = tradedata[tradedata['year']==year]\n",
    "    \n",
    "    Xhatnit.append(tradedata_year.pivot(index = 'exporter', columns = 'importer', values ='trade').values)\n",
    "    np.fill_diagonal(Xhatnit[t],0)\n",
    "    \n",
    "    Dnikt.append(tradedata_year[[ 'ln_DIST', 'CNTG', 'LANG', 'CLNY']].values)\n",
    "    \n",
    "Xnt = np.zeros((nbi,nbt))\n",
    "Yit = np.zeros((nbi,nbt))\n",
    "\n",
    "for t in range(nbt):\n",
    "    Xnt[:,t] = Xhatnit[t].sum(axis = 1)\n",
    "    Yit[:,t] = Xhatnit[t].sum(axis = 0)\n",
    "    \n",
    "totalmass_t = sum(Xhatnit).sum(axis=(0,1))/nbt\n",
    "pihat_nit = Xhatnit/totalmass_t"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's standardize the data and construct some useful objects:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "meanD_k = np.asmatrix([mat.mean(axis = 0) for mat in Dnikt]).mean(axis = 0)\n",
    "sdD_k = np.asmatrix([mat.std(axis = 0,ddof = 1) for mat in Dnikt]).mean(axis = 0)\n",
    "\n",
    "Dnikt = [(mat - meanD_k)/sdD_k for mat in Dnikt]\n",
    "\n",
    "p_nt = Xnt/totalmass_t\n",
    "q_nt = Yit/totalmass_t\n",
    "IX = np.repeat(1, nbi).reshape(nbi,1)\n",
    "tIY = np.repeat(1, nbi).reshape(1,nbi)\n",
    "\n",
    "f_nit = []\n",
    "g_nit = []\n",
    "\n",
    "for t in range(nbt):\n",
    "    f_nit.append(p_nt[:,t].reshape(nbi,1).dot(tIY))\n",
    "    g_nit.append(IX.dot(q_nt[:,t].reshape(1,nbi)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Regularized optimal transport\n",
    "\n",
    "Consider the optimal transport duality\n",
    "\n",
    "\\begin{align*}\n",
    "\\max_{\\pi\\in\\mathcal{M}\\left(  P,Q\\right)  }\\sum_{xy}\\pi_{xy}\\Phi_{xy}=\\min_{u_{x}+v_{y}\\geq\\Phi_{xy}}\\sum_{x\\in\\mathcal{X}}p_{x}u_{x}+\\sum_{y\\in\\mathcal{Y}}q_{y}v_{y}\n",
    "\\end{align*}\n",
    "\n",
    "Now let's assume that we are adding an entropy to the primal objective function. For any $\\sigma>0$, we get\n",
    "\n",
    "\\begin{align*}\n",
    "&  \\max_{\\pi\\in\\mathcal{M}\\left(  P,Q\\right)  }\\sum_{xy}\\pi_{xy}\\Phi_{xy}-\\sigma\\sum_{xy}\\pi_{xy}\\ln\\pi_{xy}\\\\\n",
    "&  =\\min_{u,v}\\sum_{x\\in\\mathcal{X}}p_{x}u_{x}+\\sum_{y\\in\\mathcal{Y}}q_{y}v_{y}+\\sigma\\sum_{xy}\\exp\\left(  \\frac{\\Phi_{xy}-u_{x}-v_{y}-\\sigma}{\\sigma}\\right)\n",
    "\\end{align*}\n",
    "\n",
    "The latter problem is an unconstrained convex optimization problem. But the most efficient numerical computation technique is often coordinate descent, i.e. alternate between minimization in $u$ and minimization in $v$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Iterated fitting\n",
    "\n",
    "Maximize wrt to $u$ yields\n",
    "\n",
    "\\begin{align*}\n",
    "e^{-u_{x}/\\sigma}=\\frac{p_{x}}{\\sum_{y}\\exp\\left(  \\frac{\\Phi_{xy}-v_{y}-\\sigma}{\\sigma}\\right)  }\n",
    "\\end{align*}\n",
    "\n",
    "and wrt $v$ yields\n",
    "\n",
    "\\begin{align*}\n",
    "e^{-v_{y}/\\sigma}=\\frac{q_{y}}{\\sum_{x}\\exp\\left(  \\frac{\\Phi_{xy}-v_{y}-\\sigma}{\\sigma}\\right)  }\n",
    "\\end{align*}\n",
    "\n",
    "It is called the \"iterated projection fitting procedure\" (ipfp), aka \"matrix scaling\", \"RAS algorithm\", \"Sinkhorn-Knopp algorithm\", \"Kruithof's method\", \"Furness procedure\", \"biproportional fitting procedure\", \"Bregman's procedure\". See survey in Idel (2016).\n",
    "\n",
    "Maybe the most often reinvented algorithm in applied mathematics. Recently rediscovered in a machine learning context."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Econometrics of matching\n",
    "\n",
    "The goal is to estimate the matching surplus $\\Phi_{xy}$. For this, take a linear parameterization\n",
    "\n",
    "\\begin{align*}\n",
    "\\Phi_{xy}^{\\beta}=\\sum_{k=1}^{K}\\beta_{k}\\phi_{xy}^{k}.\n",
    "\\end{align*}\n",
    "\n",
    "Following Choo and Siow (2006), Galichon and Salanie (2011) introduce logit heterogeneity in individual preferences and show that the equilibrium now maximizes the *regularized Monge-Kantorovich problem*\n",
    "\n",
    "\\begin{align*}\n",
    "W\\left(  \\beta\\right)  =\\max_{\\pi\\in\\mathcal{M}\\left(  P,Q\\right)  }\\sum_{xy}\\pi_{xy}\\Phi_{xy}^{\\beta}-\\sigma\\sum_{xy}\\pi_{xy}\\ln\\pi_{xy}\n",
    "\\end{align*}\n",
    "\n",
    "By duality, $W\\left(  \\beta\\right)  $ can be expressed\n",
    "\n",
    "\\begin{align*}\n",
    "W\\left(  \\beta\\right)  =\\min_{u,v}\\sum_{x}p_{x}u_{x}+\\sum_{y}q_{y}v_{y}+\\sigma\\sum_{xy}\\exp\\left(  \\frac{\\Phi_{xy}^{\\beta}-u_{x}-v_{y}-\\sigma}{\\sigma}\\right)\n",
    "\\end{align*}\n",
    "\n",
    "and w.l.o.g. can set $\\sigma=1$ and drop the additive constant $-\\sigma$ in the $\\exp$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Estimation\n",
    "\n",
    "We observe the actual matching $\\hat{\\pi}_{xy}$. Note that $\\partial W/ \\partial\\beta^{k}=\\sum_{xy}\\pi_{xy}\\phi_{xy}^{k},$\\ hence $\\beta$ is estimated by running\n",
    "\n",
    "<a name='objFun'></a>\n",
    "\\begin{align*}\n",
    "\\min_{u,v,\\beta}\\sum_{x}p_{x}u_{x}+\\sum_{y}q_{y}v_{y}+\\sum_{xy}\\exp\\left(\\Phi_{xy}^{\\beta}-u_{x}-v_{y}\\right)  -\\sum_{xy,k}\\hat{\\pi}_{xy}\\beta_{k}\\phi_{xy}^{k}\n",
    "\\end{align*}\n",
    "\n",
    "which is still a convex optimization problem.\n",
    "\n",
    "This is actually the objective function of the log-likelihood in a Poisson regression with $x$ and $y$ fixed effects, where we assume\n",
    "\n",
    "\\begin{align*}\n",
    "\\pi_{xy}|xy\\sim Poisson\\left(  \\exp\\left(  \\sum_{k=1}^{K}\\beta_{k}\\phi\n",
    "_{xy}^{k}-u_{x}-v_{y}\\right)  \\right)  .\n",
    "\\end{align*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Poisson regression with fixed effects\n",
    "\n",
    "Let $\\theta=\\left(  \\beta,u,v\\right)  $ and $Z=\\left(  \\phi,D^{x},D^{y}\\right)  $ where $D_{x^{\\prime}y^{\\prime}}^{x}=1\\left\\{  x=x^{\\prime}\\right\\}  $ and $D_{x^{\\prime}y^{\\prime}}^{y}=1\\left\\{  y=y^{\\prime}\\right\\}$ are $x$-and $y$-dummies. Let $m_{xy}\\left(  Z;\\theta\\right)  =\\exp\\left(\\theta^{\\intercal}Z_{xy}\\right)  $ be the parameter of the Poisson distribution.\n",
    "\n",
    "The conditional likelihood of $\\hat{\\pi}_{xy}$ given $Z_{xy}$ is\n",
    "\n",
    "\\begin{align*}\n",
    "l_{xy}\\left(  \\hat{\\pi}_{xy};\\theta\\right)   &  =\\hat{\\pi}_{xy}\\log m_{xy}\\left(  Z;\\theta\\right)  -m_{xy}\\left(  Z;\\theta\\right) \\\\\n",
    "&  =\\hat{\\pi}_{xy}\\left(  \\theta^{\\intercal}Z_{xy}\\right)  -\\exp\\left(\\theta^{\\intercal}Z_{xy}\\right) \\\\\n",
    "&  =\\hat{\\pi}_{xy}\\left(  \\sum_{k=1}^{K}\\beta_{k}\\phi_{xy}^{k}-u_{x}-v_{y}\\right)  -\\exp\\left(  \\sum_{k=1}^{K}\\beta_{k}\\phi_{xy}^{k}-u_{x}-v_{y}\\right)\n",
    "\\end{align*}\n",
    "\n",
    "Summing over $x$ and $y$, the sample log-likelihood is\n",
    "\n",
    "\\begin{align*}\n",
    "\\sum_{xy}\\hat{\\pi}_{xy}\\sum_{k=1}^{K}\\beta_{k}\\phi_{xy}^{k}-\\sum_{x}p_{x}u_{x}-\\sum_{y}q_{y}v_{y}-\\sum_{xy}\\exp\\left(  \\sum_{k=1}^{K}\\beta_{k}\\phi_{xy}^{k}-u_{x}-v_{y}\\right)\n",
    "\\end{align*}\n",
    "\n",
    "hence we recover the [objective function](#objFun)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### From Poisson to pseudo-Poisson\n",
    "\n",
    "If $\\pi_{xy}|xy$ is Poisson, then $\\mathbb{E}\\left[\\pi_{xy}\\right]=m_{xy}\\left(  Z_{xy};\\theta\\right)  =\\mathbb{V}ar\\left(  \\pi_{xy}\\right)  $. While it makes sense to assume the former equality, the latter is a rather strong assumption.\n",
    "\n",
    "For estimation purposes, $\\hat{\\theta}$ is obtained by\n",
    "\n",
    "\\begin{align*}\n",
    "\\max_{\\theta}\\sum_{xy}l\\left(  \\hat{\\pi}_{xy};\\theta\\right)  =\\sum_{xy}\\left(\\hat{\\pi}_{xy}\\left(  \\theta^{\\intercal}Z_{xy}\\right)  -\\exp\\left(\\theta^{\\intercal}Z_{xy}\\right)  \\right)\n",
    "\\end{align*}\n",
    "\n",
    "however, for inference purposes, one shall not assume the Poisson distribution. Instead\n",
    "\n",
    "\\begin{align*}\n",
    "\\sqrt{N}\\left(  \\hat{\\theta}-\\theta\\right)  \\Longrightarrow\\left(A_{0}\\right)  ^{-1}B_{0}\\left(  A_{0}\\right)  ^{-1}\n",
    "\\end{align*}\n",
    "\n",
    "where $N=\\left\\vert \\mathcal{X}\\right\\vert \\times\\left\\vert \\mathcal{Y}\\right\\vert $ and $A_{0}$ and $B_{0}$ are estimated by\n",
    "\n",
    "\\begin{align*}\n",
    "\\hat{A}_{0}  &  =N^{-1}\\sum_{xy}D_{\\theta\\theta}^{2}l\\left(  \\hat{\\pi}_{xy};\\hat{\\theta}\\right)  =N^{-1}\\sum_{xy}\\exp\\left(  \\hat{\\theta}^{\\intercal}Z_{xy}\\right)  Z_{xy}Z_{xy}^{\\intercal}\\\\\n",
    "\\hat{B}_{0}  &  =N^{-1}\\sum_{xy}\\left(  \\hat{\\pi}_{xy}-\\exp\\left(  \\hat{\\theta}^{\\intercal}Z_{xy}\\right)  \\right)  ^{2}Z_{xy}Z_{xy}^{\\intercal}.\n",
    "\\end{align*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# DG XXX ESTIMATION HERE, USING EXP FAMILIES"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The gravity equation\n",
    "\n",
    "\"Structural gravity equation\" (Anderson and van Wincoop, 2003) as exposited in Head and Mayer (2014)\n",
    "handbook chapter:\n",
    "\n",
    "\\begin{align*}\n",
    "X_{ni}=\\underset{S_{i}}{\\underbrace{\\frac{Y_{i}}{\\Omega_{i}}}}\\underset{M_{n}}{\\underbrace{\\frac{X_{n}}{\\Psi_{n}}}}\\Phi_{ni}%\n",
    "\\end{align*}\n",
    "\n",
    "where $n$=importer, $i$=exporter, $X_{ni}$=trade flow from $i$ to $n$, $Y_{i}=\\sum_{n}X_{ni}$ is value of production, $X_{n}=\\sum_{i}X_{ni}$ is importers' expenditures, and $\\phi_{ni}$=bilateral accessibility of $n$ to $i$.\n",
    "\n",
    "$\\Omega_{i}$ and $\\Psi_{n}$ are \\textquotedblleft multilateral resistances\\textquotedblright, satisfying the set of implicit equations\n",
    "\n",
    "\\begin{align*}\n",
    "\\Psi_{n}=\\sum_{i}\\frac{\\Phi_{ni}Y_{i}}{\\Omega_{i}}\\text{ and }\\Omega_{i}%\n",
    "=\\sum_{n}\\frac{\\Phi_{ni}X_{n}}{\\Psi_{n}}%\n",
    "\\end{align*}\n",
    "\n",
    "These are exactly the same equations as those of the regularized OT."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Explaining trade\n",
    "\n",
    "Parameterize $\\Phi_{ni}=\\exp\\left(  \\sum_{k=1}^{K}\\beta_{k}D_{ni}^{k}\\right)  $, where the $D_{ni}^{k}$ are $K$ pairwise measures of distance between $n$ and $i$. We have\n",
    "\n",
    "\\begin{align*}\n",
    "X_{ni}=\\exp\\left(  \\sum_{k=1}^{K}\\beta_{k}D_{ni}^{k}-s_{i}-m_{n}\\right)\n",
    "\\end{align*}\n",
    "\n",
    "where fixed effects $s_{i}=-\\ln S_{i}$ and $m_{n}=-\\ln M_{n}$ are adjusted by\n",
    "\n",
    "\\begin{align*}\n",
    "\\sum_{i}X_{ni}=Y_{i}\\text{ and }\\sum_{n}X_{ni}=X_{n}.\n",
    "\\end{align*}\n",
    "\n",
    "Standard choices of $D_{ni}^{k}$'s:\n",
    "\n",
    "* Logarithm of bilateral distance between $n$ and $i$\n",
    "\n",
    "* Indicator of contiguous borders; of common official language; of\n",
    "colonial ties\n",
    "\n",
    "* Trade policy variables: presence of a regional trade agreement; tariffs\n",
    "\n",
    "* Could include many other measures of proximity, e.g. measure of genetic/cultural distance, intensity of communications, etc."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will solve this model by fixing a $\\beta$ and solving the matching problem using IPFP. Then in an outer loop we will solve for the $\\beta$ which minimizes the distance between model and empirical moments."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Time elapsed =  8.417711734771729 s.\n"
     ]
    }
   ],
   "source": [
    "sigma = 1\n",
    "maxiterIpfp = 1000\n",
    "maxiter = 500\n",
    "tolIpfp = 1e-12\n",
    "tolDescent = 1e-12\n",
    "t_s = 0.03\n",
    "iterCount = 0\n",
    "contIter = True\n",
    "\n",
    "v_it = np.zeros((nbi, nbt))\n",
    "beta_k = np.repeat(0, nbk)\n",
    "\n",
    "thegrad = np.repeat(0, nbk)\n",
    "pi_nit = []\n",
    "\n",
    "theval_old = -math.inf\n",
    "\n",
    "ptm = time.time()\n",
    "while(contIter):\n",
    "    \n",
    "    #print(\"Iteration\", iterCount)\n",
    "    \n",
    "    for t in range(nbt):\n",
    "        \n",
    "        #print(\"Year\", t)\n",
    "\n",
    "        D_ij_k = Dnikt[t]\n",
    "\n",
    "        Phi = D_ij_k.dot(beta_k.reshape(nbk,1)).reshape(nbi,nbi)\n",
    "\n",
    "        contIpfp = True\n",
    "        iterIpfp = 0\n",
    "\n",
    "        v = v_it[:, t].reshape(1,nbi)\n",
    "        f = f_nit[t]\n",
    "        g = g_nit[t]\n",
    "\n",
    "        K = np.exp(Phi/sigma)\n",
    "        np.fill_diagonal(K,0)\n",
    "\n",
    "        fK = np.multiply(f,K)\n",
    "        gK = np.multiply(g,K)\n",
    "\n",
    "        while(contIpfp):\n",
    "\n",
    "            iterIpfp = iterIpfp + 1\n",
    "\n",
    "            u = sigma * np.log(np.sum(np.multiply(gK,np.exp((-IX.dot(v))/sigma)), axis = 1)).flatten()\n",
    "            vnext = sigma * np.log(np.sum(np.multiply(fK,np.exp((-u.T.dot(tIY))/sigma)), axis = 0))\n",
    "            error = np.max(np.abs(np.sum(np.multiply(gK,np.exp((-IX.dot(vnext) - u.T.dot(tIY))/sigma)), axis = 1) - 1))\n",
    "\n",
    "            if (error < tolIpfp or iterIpfp >= maxiterIpfp):\n",
    "                contIpfp = False\n",
    "            v = vnext\n",
    "\n",
    "        v_it[:,t] = np.asarray(v)[0]\n",
    "\n",
    "        fgK = np.multiply(f,gK)\n",
    "        pi_nit.append(np.multiply(fgK,np.exp((-IX.dot(v) - u.T.dot(tIY))/sigma)))\n",
    "\n",
    "        thegrad = thegrad + (pi_nit[t]-pihat_nit[t]).flatten(order = 'F').dot(D_ij_k)\n",
    "\n",
    "    beta_k = beta_k - t_s * thegrad\n",
    "\n",
    "    nonzero_pi_nit = np.concatenate(pi_nit).ravel()[np.where(np.concatenate(pi_nit).ravel()>0, True, False)]\n",
    "    theval = float(np.sum(np.multiply(thegrad,beta_k), axis = 1)) - sigma * float(np.sum(np.multiply(nonzero_pi_nit, np.log(nonzero_pi_nit)),axis=(0,1)))\n",
    "\n",
    "    iterCount = iterCount + 1\n",
    "\n",
    "    if (iterCount > maxiter or np.abs(theval - theval_old) < tolDescent):\n",
    "        contIter = False\n",
    "\n",
    "    theval_old = theval\n",
    "    thegrad = np.repeat(0, nbk)\n",
    "    pi_nit = []\n",
    "    \n",
    "diff = time.time() - ptm\n",
    "print('Time elapsed = ', diff, 's.')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.84092368  0.43744866  0.2474767  -0.22249036]]\n"
     ]
    }
   ],
   "source": [
    "beta_k = beta_k/sdD_k\n",
    "print(beta_k)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We recover the PPML estimates on Table 1 p. 42 of [Yotov et al.'s book](https://www.wto.org/english/res_e/booksp_e/advancedwtounctad2016_e.pdf)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Comparison XXXXXXXXXXXXXXXXXXXX\n",
    "We can compare the results and speed of our computation to that of Poisson regression packages. As a warning, these give the same results, but at the cost of a much longer run time, so use at your own risk. We can solve instead using the Poisson regression from the glm package. XXXXXXXXXXXX"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}