{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Bake-off: Stanford Sentiment Treebank" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "__author__ = \"Christopher Potts\"\n", "__version__ = \"CS224u, Stanford, Spring 2018 term\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "0. [Overview](#Overview)\n", "0. [Bake-off submission](#Bake-off-submission)\n", "0. [Methodological note](#Methodological-note)\n", "0. [Set-up](#Set-up)\n", "0. [Baseline](#Baseline)\n", "0. [TfRNNClassifier wrapper](#TfRNNClassifier-wrapper)\n", "0. [TreeNN wrapper](#TreeNN-wrapper)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "The goal of this in-class bake-off is to __achieve the highest average F1 score__ on the SST development set, with the binary class function.\n", "\n", "The only restriction: __you cannot make any use of the subtree labels__." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bake-off submission\n", "\n", "1. A description of the model you created.\n", "1. The value of `f1-score` in the `avg / total` row of the classification report.\n", "\n", "Submission URL: https://docs.google.com/forms/d/1R41Zxxils7lOPzuThMdv2p1TKmFEy8c0DyUg-YkzTa0/edit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Methodological note\n", "\n", "You don't have to use the experimental framework defined below (based on `sst`). However, if you don't use `sst.experiment` as below, then make sure you're training only on `train`, evaluating on `dev`, and that you report with \n", "\n", "```\n", "from sklearn.metrics import classification_report\n", "classification_report(y_dev, predictions)\n", "```\n", "where `y_dev = [y for tree, y in sst.dev_reader(class_func=sst.binary_class_func)]`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set-up\n", "\n", "See [the first notebook in this unit](sst_01_overview.ipynb#Set-up) for set-up instructions." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Applications/anaconda/envs/nlu/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n", " from ._conv import register_converters as _register_converters\n" ] } ], "source": [ "from collections import Counter\n", "from rnn_classifier import RNNClassifier\n", "from sklearn.linear_model import LogisticRegression\n", "import sst\n", "import tensorflow as tf\n", "from tf_rnn_classifier import TfRNNClassifier\n", "from tree_nn import TreeNN" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Baseline" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def unigrams_phi(tree):\n", " \"\"\"The basis for a unigrams feature function.\n", " \n", " Parameters\n", " ----------\n", " tree : nltk.tree\n", " The tree to represent.\n", " \n", " Returns\n", " ------- \n", " defaultdict\n", " A map from strings to their counts in `tree`. 
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def fit_maxent_classifier(X, y):\n", "    # Maxent baseline: logistic regression on the feature counts.\n", "    mod = LogisticRegression(fit_intercept=True)\n", "    mod.fit(X, y)\n", "    return mod" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.772\n", "             precision    recall  f1-score   support\n", "\n", "   negative      0.783     0.741     0.761       428\n", "   positive      0.762     0.802     0.782       444\n", "\n", "avg / total      0.772     0.772     0.772       872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", "    unigrams_phi,                      # Free to write your own!\n", "    fit_maxent_classifier,             # Free to write your own!\n", "    train_reader=sst.train_reader,     # Fixed by the competition.\n", "    assess_reader=sst.dev_reader,      # Fixed.\n", "    class_func=sst.binary_class_func)  # Fixed." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "By the way, with some informal hyperparameter search on a GPU machine, I found that this model\n", "\n", "```\n", "tf_rnn_glove = TfRNNClassifier(\n", "    sst_glove_vocab,\n", "    embedding=glove_embedding, ## 100d version\n", "    hidden_dim=300,\n", "    max_length=52,\n", "    hidden_activation=tf.nn.relu,\n", "    cell_class=tf.nn.rnn_cell.LSTMCell,\n", "    train_embedding=True,\n", "    max_iter=5000,\n", "    batch_size=1028,\n", "    eta=0.001)\n", "```\n", "\n", "finished with almost identical performance to the baseline above:\n", "\n", "```\n", "             precision    recall  f1-score   support\n", "\n", "   negative       0.78      0.75      0.76       428\n", "   positive       0.77      0.80      0.78       444\n", "\n", "avg / total       0.77      0.77      0.77       872\n", "```" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## TfRNNClassifier wrapper" ] },
{ "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def rnn_phi(tree):\n", "    return tree.leaves()" ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def fit_tf_rnn_classifier(X, y):\n", "    # Restrict the vocabulary to the 3000 most frequent train words.\n", "    vocab = sst.get_vocab(X, n_words=3000)\n", "    mod = TfRNNClassifier(\n", "        vocab,\n", "        eta=0.05,\n", "        batch_size=2048,\n", "        embed_dim=50,\n", "        hidden_dim=50,\n", "        max_length=52,\n", "        max_iter=500,\n", "        cell_class=tf.nn.rnn_cell.LSTMCell,\n", "        hidden_activation=tf.nn.tanh,\n", "        train_embedding=True)\n", "    mod.fit(X, y)\n", "    return mod" ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Iteration 500: loss: 2.5404394865036012" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.615\n", "             precision    recall  f1-score   support\n", "\n", "   negative      0.571     0.869     0.689       428\n", "   positive      0.745     0.369     0.494       444\n", "\n", "avg / total      0.660     0.615     0.590       872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", "    rnn_phi,\n", "    fit_tf_rnn_classifier,\n", "    vectorize=False,  # For deep learning, use `vectorize=False`.\n", "    assess_reader=sst.dev_reader)" ] },
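{ "cell_type": "markdown", "metadata": {}, "source": [ "The `tf_rnn_glove` configuration quoted earlier presupposes a GloVe vocabulary (`sst_glove_vocab`) and embedding matrix (`glove_embedding`) that this notebook never constructs. Here is a minimal sketch of one way to build them, assuming `utils.glove2dict` and `utils.randvec` from the course utilities and a local copy of `glove.6B.100d.txt`; the `glove_home` path and the random `$UNK` vector are illustrative choices." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import numpy as np\n", "import utils\n", "\n", "# Illustrative path; point this at your local GloVe directory.\n", "glove_home = os.path.join('vsmdata', 'glove.6B')\n", "\n", "# Map from word to its 100d GloVe vector.\n", "glove_lookup = utils.glove2dict(\n", "    os.path.join(glove_home, 'glove.6B.100d.txt'))\n", "\n", "# Vocabulary: train-set words that have GloVe vectors, plus `$UNK`.\n", "train_words = {w for tree, label in\n", "               sst.train_reader(class_func=sst.binary_class_func)\n", "               for w in tree.leaves()}\n", "sst_glove_vocab = sorted(train_words & set(glove_lookup)) + ['$UNK']\n", "\n", "# Embedding matrix in vocab order, with a random vector for `$UNK`.\n", "glove_embedding = np.array(\n", "    [glove_lookup[w] for w in sst_glove_vocab[:-1]])\n", "glove_embedding = np.vstack(\n", "    [glove_embedding, utils.randvec(glove_embedding.shape[1])])" ] },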
"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Finished epoch 100 of 100; error is 0.8351342778738807" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.510\n", " precision recall f1-score support\n", "\n", " negative 0.501 0.498 0.499 428\n", " positive 0.519 0.523 0.521 444\n", "\n", "avg / total 0.510 0.510 0.510 872\n", "\n" ] } ], "source": [ "_ = sst.experiment(\n", " rnn_phi,\n", " fit_tree_nn_classifier, \n", " vectorize=False, # For deep learning, use `vectorize=False`.\n", " assess_reader=sst.dev_reader)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "widgets": { "state": {}, "version": "1.1.2" } }, "nbformat": 4, "nbformat_minor": 2 }