{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bake-off: Stanford Sentiment Treebank" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "__author__ = \"Christopher Potts\"\n", "__version__ = \"CS224u, Stanford, Spring 2018 term\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "0. [Overview](#Overview)\n", "0. [Bake-off submission](#Bake-off-submission)\n", "0. [Methodological note](#Methodological-note)\n", "0. [Set-up](#Set-up)\n", "0. [Baseline](#Baseline)\n", "0. [TfRNNClassifier wrapper](#TfRNNClassifier-wrapper)\n", "0. [TreeNN wrapper](#TreeNN-wrapper)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "The goal of this in-class bake-off is to __achieve the highest average F1 score__ on the SST development set, with the binary class function.\n", "\n", "The only restriction: __you cannot make any use of the subtree labels__." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bake-off submission\n", "\n", "1. A description of the model you created.\n", "1. The value of `f1-score` in the `avg / total` row of the classification report.\n", "\n", "Submission URL: https://docs.google.com/forms/d/1R41Zxxils7lOPzuThMdv2p1TKmFEy8c0DyUg-YkzTa0/edit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Methodological note\n", "\n", "You don't have to use the experimental framework defined below (based on `sst`). However, if you don't use `sst.experiment` as below, then make sure you're training only on `train`, evaluating on `dev`, and that you report with \n", "\n", "```\n", "from sklearn.metrics import classification_report\n", "classification_report(y_dev, predictions)\n", "```\n", "where `y_dev = [y for tree, y in sst.dev_reader(class_func=sst.binary_class_func)]`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set-up\n", "\n", "See [the first notebook in this unit](sst_01_overview.ipynb#Set-up) for set-up instructions." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Applications/anaconda/envs/nlu/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n", " from ._conv import register_converters as _register_converters\n" ] } ], "source": [ "from collections import Counter\n", "from rnn_classifier import RNNClassifier\n", "from sklearn.linear_model import LogisticRegression\n", "import sst\n", "import tensorflow as tf\n", "from tf_rnn_classifier import TfRNNClassifier\n", "from tree_nn import TreeNN" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Baseline" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def unigrams_phi(tree):\n", " \"\"\"The basis for a unigrams feature function.\n", " \n", " Parameters\n", " ----------\n", " tree : nltk.tree\n", " The tree to represent.\n", " \n", " Returns\n", " ------- \n", " defaultdict\n", " A map from strings to their counts in `tree`. 
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def fit_maxent_classifier(X, y):\n", "    # Maxent baseline: logistic regression on the feature counts.\n", "    mod = LogisticRegression(fit_intercept=True)\n", "    mod.fit(X, y)\n", "    return mod" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.772\n", "             precision    recall  f1-score   support\n", "\n", "   negative      0.783     0.741     0.761       428\n", "   positive      0.762     0.802     0.782       444\n", "\n", "avg / total      0.772     0.772     0.772       872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", "    unigrams_phi,                      # Free to write your own!\n", "    fit_maxent_classifier,             # Free to write your own!\n", "    train_reader=sst.train_reader,     # Fixed by the competition.\n", "    assess_reader=sst.dev_reader,      # Fixed.\n", "    class_func=sst.binary_class_func)  # Fixed." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "By the way, with some informal hyperparameter search on a GPU machine, I found that this model\n", "\n", "```\n", "tf_rnn_glove = TfRNNClassifier(\n", "    sst_glove_vocab,\n", "    embedding=glove_embedding, ## 100d version\n", "    hidden_dim=300,\n", "    max_length=52,\n", "    hidden_activation=tf.nn.relu,\n", "    cell_class=tf.nn.rnn_cell.LSTMCell,\n", "    train_embedding=True,\n", "    max_iter=5000,\n", "    batch_size=1028,\n", "    eta=0.001)\n", "```\n", "\n", "finished with almost identical performance to the baseline above:\n", "\n", "```\n", "             precision    recall  f1-score   support\n", "\n", "   negative       0.78      0.75      0.76       428\n", "   positive       0.77      0.80      0.78       444\n", "\n", "avg / total       0.77      0.77      0.77       872\n", "```" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## TfRNNClassifier wrapper" ] },
{ "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def rnn_phi(tree):\n", "    return tree.leaves()" ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def fit_tf_rnn_classifier(X, y):\n", "    # Restrict the vocabulary to the 3000 most frequent train words.\n", "    vocab = sst.get_vocab(X, n_words=3000)\n", "    mod = TfRNNClassifier(\n", "        vocab,\n", "        eta=0.05,\n", "        batch_size=2048,\n", "        embed_dim=50,\n", "        hidden_dim=50,\n", "        max_length=52,\n", "        max_iter=500,\n", "        cell_class=tf.nn.rnn_cell.LSTMCell,\n", "        hidden_activation=tf.nn.tanh,\n", "        train_embedding=True)\n", "    mod.fit(X, y)\n", "    return mod" ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Iteration 500: loss: 2.5404394865036012" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.615\n", "             precision    recall  f1-score   support\n", "\n", "   negative      0.571     0.869     0.689       428\n", "   positive      0.745     0.369     0.494       444\n", "\n", "avg / total      0.660     0.615     0.590       872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", "    rnn_phi,\n", "    fit_tf_rnn_classifier,\n", "    vectorize=False,  # For deep learning, use `vectorize=False`.\n", "    assess_reader=sst.dev_reader)" ] },
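{ "cell_type": "markdown", "metadata": {}, "source": [ "The `tf_rnn_glove` configuration quoted earlier presupposes a GloVe vocabulary (`sst_glove_vocab`) and embedding matrix (`glove_embedding`) that this notebook never constructs. Here is a minimal sketch of one way to build them, assuming `utils.glove2dict` and `utils.randvec` from the course utilities and a local copy of `glove.6B.100d.txt`; the `glove_home` path and the random `$UNK` vector are illustrative choices." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import utils\n", "\n", "# Illustrative path; point this at your local GloVe directory.\n", "glove_home = os.path.join('vsmdata', 'glove.6B')\n", "\n", "# Map from word to its 100d GloVe vector.\n", "glove_lookup = utils.glove2dict(\n", "    os.path.join(glove_home, 'glove.6B.100d.txt'))\n", "\n", "# Vocabulary: train-set words that have GloVe vectors, plus `$UNK`.\n", "train_words = {w for tree, label in\n", "               sst.train_reader(class_func=sst.binary_class_func)\n", "               for w in tree.leaves()}\n", "sst_glove_vocab = sorted(train_words & set(glove_lookup)) + ['$UNK']\n", "\n", "# Embedding matrix in vocab order, with a random vector for `$UNK`.\n", "glove_embedding = np.array(\n", "    [glove_lookup[w] for w in sst_glove_vocab[:-1]])\n", "glove_embedding = np.vstack(\n", "    [glove_embedding, utils.randvec(glove_embedding.shape[1])])" ] },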
"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Finished epoch 100 of 100; error is 0.8351342778738807" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.510\n", " precision recall f1-score support\n", "\n", " negative 0.501 0.498 0.499 428\n", " positive 0.519 0.523 0.521 444\n", "\n", "avg / total 0.510 0.510 0.510 872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", " rnn_phi,\n", " fit_tree_nn_classifier, \n", " vectorize=False, # For deep learning, use `vectorize=False`.\n", " assess_reader=sst.dev_reader)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "widgets": { "state": {}, "version": "1.1.2" } }, "nbformat": 4, "nbformat_minor": 2 }