{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "# Spelling Bee"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "This notebook starts our deep dive (no pun intended) into NLP by introducing sequence-to-sequence learning on Spelling Bee."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "heading_collapsed": true
   },
   "source": [
    "## Data Stuff\n",
    "\n",
    "We take our data set from [The CMU pronouncing dictionary](https://en.wikipedia.org/wiki/CMU_Pronouncing_Dictionary)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n",
      "/home/jhoward/anaconda3/lib/python3.6/site-packages/sklearn/cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.\n",
      "  \"This module will be removed in 0.20.\", DeprecationWarning)\n"
     ]
    }
   ],
   "source": [
    "%matplotlib inline\n",
    "import importlib\n",
    "import utils2; importlib.reload(utils2)\n",
    "from utils2 import *\n",
    "np.set_printoptions(4)\n",
    "PATH = 'data/spellbee/'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "limit_mem()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "The CMU pronouncing dictionary consists of sounds/words and their corresponding phonetic description (American pronunciation).\n",
    "\n",
    "The phonetic descriptions are a sequence of phonemes. Note that the vowels end with integers; these indicate where the stress is.\n",
    "\n",
    "Our goal is to learn how to spell these words given the sequence of phonemes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "The preparation of this data set follows the same pattern we've seen before for NLP tasks.\n",
    "\n",
    "Here we iterate through each line of the file and grab each word/phoneme pair that starts with an uppercase letter. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(('A', ['AH0']), ('ZYWICKI', ['Z', 'IH0', 'W', 'IH1', 'K', 'IY0']))"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lines = [l.strip().split(\"  \") for l in open(PATH+\"cmudict-0.7b\", encoding='latin1') \n",
    "         if re.match('^[A-Z]', l)]\n",
    "lines = [(w, ps.split()) for w, ps in lines]\n",
    "lines[0], lines[-1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Next we're going to get a list of the unique phonemes in our vocabulary, as well as add a null \"_\" for zero-padding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['_', 'AA0', 'AA1', 'AA2', 'AE0']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "phonemes = [\"_\"] + sorted(set(p for w, ps in lines for p in ps))\n",
    "phonemes[:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "70"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(phonemes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Then we create mappings of phonemes and letters to respective indices.\n",
    "\n",
    "Our letters include the padding element \"_\", but also \"*\" which we'll explain later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "p2i = dict((v, k) for k,v in enumerate(phonemes))\n",
    "letters = \"_abcdefghijklmnopqrstuvwxyz*\"\n",
    "l2i = dict((v, k) for k,v in enumerate(letters))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Let's create a dictionary mapping words to the sequence of indices corresponding to it's phonemes, and let's do it only for words between 5 and 15 characters long."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "108006"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "maxlen=15\n",
    "pronounce_dict = {w.lower(): [p2i[p] for p in ps] for w, ps in lines\n",
    "                 if (5<=len(w)<=maxlen) and re.match(\"^[A-Z]+$\", w)}\n",
    "len(pronounce_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Aside on various approaches to python's list comprehension:\n",
    "* the first list is a typical example of a list comprehension subject to a conditional\n",
    "* the second is a list comprehension inside a list comprehension, which returns a list of list\n",
    "* the third is similar to the second, but is read and behaves like a nested loop\n",
    "    * Since there is no inner bracket, there are no lists wrapping the inner loop"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(['XYZ'], [['x', 'y', 'z'], ['a', 'b', 'c']], ['x', 'y', 'z', 'a', 'b', 'c'])"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a=['xyz','abc']\n",
    "[o.upper() for o in a if o[0]=='x'], [[p for p in o] for o in a], [p for o in a for p in o]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Split lines into words, phonemes, convert to indexes (with padding), split into training, validation, test sets. Note we also find the max phoneme sequence length for padding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "maxlen_p = max([len(v) for k,v in pronounce_dict.items()])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "pairs = np.random.permutation(list(pronounce_dict.keys()))\n",
    "n = len(pairs)\n",
    "input_ = np.zeros((n, maxlen_p), np.int32)\n",
    "labels_ = np.zeros((n, maxlen), np.int32)\n",
    "\n",
    "for i, k in enumerate(pairs):\n",
    "    for j, p in enumerate(pronounce_dict[k]): input_[i][j] = p\n",
    "    for j, letter in enumerate(k): labels_[i][j] = l2i[letter]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "go_token = l2i[\"*\"]\n",
    "dec_input_ = np.concatenate([np.ones((n,1)) * go_token, labels_[:,:-1]], axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Sklearn's <tt>train_test_split</tt> is an easy way to split data into training and testing sets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "(input_train, input_test, labels_train, labels_test, dec_input_train, dec_input_test\n",
    "    ) = train_test_split(input_, labels_, dec_input_, test_size=0.1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(97205, 16)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "input_train.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(97205, 15)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "labels_train.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(70, 28)"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "input_vocab_size, output_vocab_size = len(phonemes), len(letters)\n",
    "input_vocab_size, output_vocab_size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Next we proceed to build our model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "## Keras code"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "parms = {'verbose': 0, 'callbacks': [TQDMNotebookCallback(leave_inner=True)]}\n",
    "lstm_params = {}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "dim = 240"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "### Without attention"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "def get_rnn(return_sequences= True): \n",
    "    return LSTM(dim, dropout_U= 0.1, dropout_W= 0.1, \n",
    "               consume_less= 'gpu', return_sequences=return_sequences)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "The model has three parts:\n",
    "* We first pass list of phonemes through an embedding function to get a list of phoneme embeddings. Our goal is to turn this sequence of embeddings into a single distributed representation that captures what our phonemes say.\n",
    "* Turning a sequence into a representation can be done using an RNN. This approach is useful because RNN's are able to keep track of state and memory, which is obviously important in forming a complete understanding of a pronunciation.\n",
    "    * <tt>BiDirectional</tt> passes the original sequence through an RNN, and the reversed sequence through a different RNN and concatenates the results. This allows us to look forward and backwards.\n",
    "    * We do this because in language things that happen later often influence what came before (i.e. in Spanish, \"el chico, la chica\" means the boy, the girl; the word for \"the\" is determined by the gender of the subject, which comes after).\n",
    "* Finally, we arrive at a vector representation of the sequence which captures everything we need to spell it. We feed this vector into more RNN's, which are trying to generate the labels. After this, we make a classification for what each letter is in the output sequence.\n",
    "    * We use <tt>RepeatVector</tt> to help our RNN remember at each point what the original word is that it's trying to translate.\n",
    "    \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "inp = Input((maxlen_p,))\n",
    "x = Embedding(input_vocab_size, 120)(inp)\n",
    "\n",
    "x = Bidirectional(get_rnn())(x)\n",
    "x = get_rnn(False)(x)\n",
    "\n",
    "x = RepeatVector(maxlen)(x)\n",
    "x = get_rnn()(x)\n",
    "x = get_rnn()(x)\n",
    "x = TimeDistributed(Dense(output_vocab_size, activation='softmax'))(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "We can refer to the parts of the model before and after <tt>get_rnn(False)</tt> returns a vector as the encoder and decoder. The encoder has taken a sequence of embeddings and encoded it into a numerical vector that completely describes it's input, while the decoder transforms that vector into a new sequence.\n",
    "\n",
    "Now we can fit our model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "model = Model(inp, x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "model.compile(Adam(), 'sparse_categorical_crossentropy', metrics=['acc'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "ename": "KeyboardInterrupt",
     "evalue": "",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-23-68a848dd942e>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m      1\u001b[0m hist=model.fit(input_train, np.expand_dims(labels_train,-1), \n\u001b[1;32m      2\u001b[0m           \u001b[0mvalidation_data\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0minput_test\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mexpand_dims\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlabels_test\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m           batch_size=64, **parms, nb_epoch=3)\n\u001b[0m",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/keras/engine/training.py\u001b[0m in \u001b[0;36mfit\u001b[0;34m(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch)\u001b[0m\n\u001b[1;32m   1194\u001b[0m                               \u001b[0mval_f\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mval_f\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mval_ins\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mval_ins\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mshuffle\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mshuffle\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1195\u001b[0m                               \u001b[0mcallback_metrics\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mcallback_metrics\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1196\u001b[0;31m                               initial_epoch=initial_epoch)\n\u001b[0m\u001b[1;32m   1197\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1198\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0mevaluate\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch_size\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m32\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mverbose\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0msample_weight\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/keras/engine/training.py\u001b[0m in \u001b[0;36m_fit_loop\u001b[0;34m(self, f, ins, out_labels, batch_size, nb_epoch, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch)\u001b[0m\n\u001b[1;32m    889\u001b[0m                 \u001b[0mbatch_logs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'size'\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbatch_ids\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    890\u001b[0m                 \u001b[0mcallbacks\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mon_batch_begin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbatch_index\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mbatch_logs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 891\u001b[0;31m                 \u001b[0mouts\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mf\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mins_batch\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    892\u001b[0m                 \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mouts\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlist\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    893\u001b[0m                     \u001b[0mouts\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0mouts\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, inputs)\u001b[0m\n\u001b[1;32m   1941\u001b[0m         \u001b[0msession\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_session\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1942\u001b[0m         updated = session.run(self.outputs + [self.updates_op],\n\u001b[0;32m-> 1943\u001b[0;31m                               feed_dict=feed_dict)\n\u001b[0m\u001b[1;32m   1944\u001b[0m         \u001b[0;32mreturn\u001b[0m \u001b[0mupdated\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0moutputs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1945\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36mrun\u001b[0;34m(self, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m    765\u001b[0m     \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    766\u001b[0m       result = self._run(None, fetches, feed_dict, options_ptr,\n\u001b[0;32m--> 767\u001b[0;31m                          run_metadata_ptr)\n\u001b[0m\u001b[1;32m    768\u001b[0m       \u001b[0;32mif\u001b[0m \u001b[0mrun_metadata\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    769\u001b[0m         \u001b[0mproto_data\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtf_session\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mTF_GetBuffer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrun_metadata_ptr\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_run\u001b[0;34m(self, handle, fetches, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m    963\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0mfinal_fetches\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0mfinal_targets\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    964\u001b[0m       results = self._do_run(handle, final_targets, final_fetches,\n\u001b[0;32m--> 965\u001b[0;31m                              feed_dict_string, options, run_metadata)\n\u001b[0m\u001b[1;32m    966\u001b[0m     \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    967\u001b[0m       \u001b[0mresults\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m[\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_do_run\u001b[0;34m(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)\u001b[0m\n\u001b[1;32m   1013\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0mhandle\u001b[0m \u001b[0;32mis\u001b[0m \u001b[0;32mNone\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1014\u001b[0m       return self._do_call(_run_fn, self._session, feed_dict, fetch_list,\n\u001b[0;32m-> 1015\u001b[0;31m                            target_list, options, run_metadata)\n\u001b[0m\u001b[1;32m   1016\u001b[0m     \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1017\u001b[0m       return self._do_call(_prun_fn, self._session, handle, feed_dict,\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_do_call\u001b[0;34m(self, fn, *args)\u001b[0m\n\u001b[1;32m   1020\u001b[0m   \u001b[0;32mdef\u001b[0m \u001b[0m_do_call\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1021\u001b[0m     \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1022\u001b[0;31m       \u001b[0;32mreturn\u001b[0m \u001b[0mfn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m   1023\u001b[0m     \u001b[0;32mexcept\u001b[0m \u001b[0merrors\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mOpError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0me\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1024\u001b[0m       \u001b[0mmessage\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcompat\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mas_text\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0me\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmessage\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/home/jhoward/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py\u001b[0m in \u001b[0;36m_run_fn\u001b[0;34m(session, feed_dict, fetch_list, target_list, options, run_metadata)\u001b[0m\n\u001b[1;32m   1002\u001b[0m         return tf_session.TF_Run(session, options,\n\u001b[1;32m   1003\u001b[0m                                  \u001b[0mfeed_dict\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfetch_list\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtarget_list\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1004\u001b[0;31m                                  status, run_metadata)\n\u001b[0m\u001b[1;32m   1005\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   1006\u001b[0m     \u001b[0;32mdef\u001b[0m \u001b[0m_prun_fn\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0msession\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mhandle\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfeed_dict\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfetch_list\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mKeyboardInterrupt\u001b[0m: "
     ]
    }
   ],
   "source": [
    "hist=model.fit(input_train, np.expand_dims(labels_train,-1), \n",
    "          validation_data=[input_test, np.expand_dims(labels_test,-1)], \n",
    "          batch_size=64, **parms, nb_epoch=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "hist.history['val_loss']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "To evaluate, we don't want to know what percentage of letters are correct but what percentage of words are."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "def eval_keras(input):\n",
    "    preds = model.predict(input, batch_size=128)\n",
    "    predict = np.argmax(preds, axis = 2)\n",
    "    return (np.mean([all(real==p) for real, p in zip(labels_test, predict)]), predict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "The accuracy isn't great."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "acc, preds = eval_keras(input_test); acc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [],
   "source": [
    "def print_examples(preds):\n",
    "    print(\"pronunciation\".ljust(40), \"real spelling\".ljust(17), \n",
    "          \"model spelling\".ljust(17), \"is correct\")\n",
    "\n",
    "    for index in range(20):\n",
    "        ps = \"-\".join([phonemes[p] for p in input_test[index]]) \n",
    "        real = [letters[l] for l in labels_test[index]] \n",
    "        predict = [letters[l] for l in preds[index]]\n",
    "        print (ps.split(\"-_\")[0].ljust(40), \"\".join(real).split(\"_\")[0].ljust(17),\n",
    "            \"\".join(predict).split(\"_\")[0].ljust(17), str(real == predict))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true
   },
   "source": [
    "We can see that sometimes the mistakes are completely reasonable, occasionally they're totally off. This tends to happen with the longer words that have large phoneme sequences.\n",
    "\n",
    "That's understandable; we'd expect larger sequences to lose more information in an encoding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pronunciation                            real spelling     model spelling    is correct\n",
      "OW1-SH-AA0-F                             oshaf             ossaf             False\n",
      "R-IH1-NG-K-AH0-L-D                       wrinkled          rinkkld           False\n",
      "D-IH0-T-EH1-K-SH-AH0-N                   detection         detection         True\n",
      "Y-AE1-S-IH0-N                            yassin            yasin             False\n",
      "AA1-P-HH-AY2-M                           opheim            ophime            False\n",
      "JH-AH1-S-K-OW0                           jusco             jusko             False\n",
      "B-L-UW1-P-EH2-N-S-AH0-L-IH0-NG           bluepencilling    bluopensiinn      False\n",
      "L-AE1-R-OW0                              laroe             larro             False\n",
      "L-AO1-D-AH0-T-AO2-R-IY0                  laudatory         laudttrr          False\n",
      "P-ER2-T-ER0-B-EY1-SH-AH0-N-Z             perturbations     perterbations     False\n",
      "EH1-V-AH0-D-AH0-N-S-AH0-Z                evidences         evedences         False\n",
      "M-AH0-K-AA1-N-V-IH2-L                    mcconville        mcconvilll        False\n",
      "F-AE1-S-T-S                              fasts             fasts             True\n",
      "M-IH0-S-D-IH0-R-EH1-K-T                  misdirect         misderect         False\n",
      "M-IH0-Z-UW1-L-AH0                        missoula          misula            False\n",
      "K-L-IH1-R-Z                              clears            cliars            False\n",
      "P-R-AA1-S-T-AH0-T-UW2-T-S                prostitutes       prostitutes       True\n",
      "R-EY1-Z-AH0-N                            rayson            raion             False\n",
      "S-T-R-OW1-L-D                            strolled          strolldd          False\n",
      "S-T-EY1-OW2-V-ER0-Z                      stayovers         stayevers         False\n"
     ]
    }
   ],
   "source": [
    "print_examples(preds)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "heading_collapsed": true
   },
   "source": [
    "### Attention model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "This graph demonstrates the accuracy decay for a nueral translation task. With an encoding/decoding technique, larger input sequences result in less accuracy.\n",
    "\n",
    "<img src=\"https://smerity.com/media/images/articles/2016/bahdanau_attn.png\" width=\"600\">\n",
    "\n",
    "This can be mitigated using an attentional model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "import attention_wrapper; importlib.reload(attention_wrapper)\n",
    "from attention_wrapper import Attention"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "The attentional model doesn't encode into a single vector, but rather a sequence of vectors. The decoder then at every point is passing through this sequence. For example, after the bi-directional RNN we have 16 vectors corresponding to each phoneme's output state. Each output state describes how each phoneme relates between the other phonemes before and after it. After going through more RNN's, our goal is to transform this sequence into a vector of length 15 so we can classify into characters. \n",
    "\n",
    "A smart way to take a weighted average of the 16 vectors for each of the 15 outputs, where each set of weights is unique to the output. For example, if character 1 only needs information from the first phoneme vector, that weight might be 1 and the others 0; if it needed information from the 1st and 2nd equally, those two might be 0.5 each.\n",
    "\n",
    "The weights for combining all the input states to produce specific outputs can be learned using an attentional model; we update the weights using SGD, and train it jointly with the encoder/decoder. Once we have the outputs, we can classify the character using softmax as usual."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Notice below we do not have an RNN that returns a flat vector as we did before; we have a sequence of vectors as desired. We can then pass a sequence of encoded states into the our custom <tt>Attention</tt> model.\n",
    "\n",
    "This attention model also uses a technique called teacher forcing; in addition to passing the encoded hidden state, we also pass the correct answer for the previous time period. We give this information to the model because it makes it easier to train. In the beginning of training, the model will get most things wrong, and if your earlier character predictions are wrong then your later ones will likely be as well. Teacher forcing allows the model to still learn how to predict later characters, even if the earlier characters were all wrong."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true,
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "inp = Input((maxlen_p,))\n",
    "inp_dec = Input((maxlen,))\n",
    "emb_dec = Embedding(output_vocab_size, 120)(inp_dec)\n",
    "emb_dec = Dense(dim)(emb_dec)\n",
    "\n",
    "x = Embedding(input_vocab_size, 120)(inp)\n",
    "x = Bidirectional(get_rnn())(x)\n",
    "x = get_rnn()(x)\n",
    "x = get_rnn()(x)\n",
    "x = Attention(get_rnn, 3)([x, emb_dec])\n",
    "x = TimeDistributed(Dense(output_vocab_size, activation='softmax'))(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "We can now train, passing in the decoder inputs as well for teacher forcing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "model = Model([inp, inp_dec], x)\n",
    "model.compile(Adam(), 'sparse_categorical_crossentropy', metrics=['acc'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "hist=model.fit([input_train, dec_input_train], np.expand_dims(labels_train,-1), \n",
    "          validation_data=[[input_test, dec_input_test], np.expand_dims(labels_test,-1)], \n",
    "          batch_size=64, **parms, nb_epoch=3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true,
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[1.1220142268966762,\n",
       " 0.8048337527904541,\n",
       " 0.5265462693931019,\n",
       " 0.31194811385601234,\n",
       " 0.23548489567190725,\n",
       " 0.22613589595439379,\n",
       " 0.19347934096944011,\n",
       " 0.19403484622484754,\n",
       " 0.18044625614216103,\n",
       " 0.17990882804400168]"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hist.history['val_loss']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 998,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "K.set_value(model.optimizer.lr, 1e-4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 999,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true,
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "hist=model.fit([input_train, dec_input_train], np.expand_dims(labels_train,-1), \n",
    "          validation_data=[[input_test, dec_input_test], np.expand_dims(labels_test,-1)], \n",
    "          batch_size=64, **parms, nb_epoch=5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1000,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.1591,  0.1563,  0.1532,  0.1517,  0.1499])"
      ]
     },
     "execution_count": 1000,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.array(hist.history['val_loss'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1001,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "def eval_keras():\n",
    "    preds = model.predict([input_test, dec_input_test], batch_size=128)\n",
    "    predict = np.argmax(preds, axis = 2)\n",
    "    return (np.mean([all(real==p) for real, p in zip(labels_test, predict)]), predict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "Better accuracy!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 895,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.51134154244977315"
      ]
     },
     "execution_count": 895,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "acc, preds = eval_keras(); acc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "source": [
    "This model is certainly performing better with longer words. The mistakes it's making are reasonable, and it even succesfully formed the word \"partisanship\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 896,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pronunciation                            real spelling     model spelling    is correct\n",
      "D-AY0-AE1-F-AH0-N-IH0-S                  diaphanous        diaphoneus        False\n",
      "T-IH1-R-IY0                              teary             tiary             False\n",
      "M-IH1-T-AH0-N                            mittan            mitton            False\n",
      "P-AA1-R-N-AH0-S                          parness           parnass           False\n",
      "SH-UW1-ER0-M-AH0-N                       schuermann        schuerman         False\n",
      "P-AA1-R-T-AH0-Z-AH0-N-SH-IH2-P           partisanship      partisanship      True\n",
      "AE1-SH-IH0-Z                             ashes             ashes             True\n",
      "S-P-IY1-K                                speak             speek             False\n",
      "K-AO1-R-B-IH0-T                          corbitt           corbit            False\n",
      "B-IH1-L-D-ER0-Z                          builders          biilders          False\n",
      "EH0-S-P-R-IY1                            esprit            espree            False\n",
      "EH0-NG-G-EH0-L-B-EH1-R-T-AH0             engelberta        engelberta        True\n",
      "AE0-N-T-AY2-P-ER0-S-AH0-N-EH1-L          antipersonnel     antipersoneal     False\n",
      "W-IH1-M-Z                                whims             whimm             False\n",
      "AO2-L-T-AH0-G-EH1-DH-ER0                 altogether        altagether        False\n",
      "AE1-K-S-AH0-L-Z                          axles             axees             False\n",
      "D-UW1-S                                  doose             doose             True\n",
      "B-R-AH0-Z-IH1-L-Y-AH0-N                  brazilian         brazilian         True\n",
      "G-R-EH1-N-Z                              grenz             grenz             True\n",
      "B-AH1-K-HH-AE2-N-T-S                     buckhantz         buckhants         False\n"
     ]
    }
   ],
   "source": [
    "print(\"pronunciation\".ljust(40), \"real spelling\".ljust(17), \n",
    "      \"model spelling\".ljust(17), \"is correct\")\n",
    "\n",
    "for index in range(20):\n",
    "    ps = \"-\".join([phonemes[p] for p in input_test[index]]) \n",
    "    real = [letters[l] for l in labels_test[index]] \n",
    "    predict = [letters[l] for l in preds[index]]\n",
    "    print (ps.split(\"-_\")[0].ljust(40), \"\".join(real).split(\"_\")[0].ljust(17),\n",
    "        \"\".join(predict).split(\"_\")[0].ljust(17), str(real == predict))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "heading_collapsed": true
   },
   "source": [
    "## Test code for the attention layer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 301,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "nb_samples, nb_time, input_dim, output_dim = (64, 4, 32, 48)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 302,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "x = tf.placeholder(np.float32, (nb_samples, nb_time, input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 303,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "xr = K.reshape(x,(-1,nb_time,1,input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 304,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(32), Dimension(32)])"
      ]
     },
     "execution_count": 304,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "W1 = tf.placeholder(np.float32, (input_dim, input_dim)); W1.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 305,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "W1r = K.reshape(W1, (1, input_dim, input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 306,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "W1r2 = K.reshape(W1, (1, 1, input_dim, input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 307,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(64), Dimension(4), Dimension(32)])"
      ]
     },
     "execution_count": 307,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xW1 = K.conv1d(x,W1r,border_mode='same'); xW1.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 308,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(64), Dimension(4), Dimension(1), Dimension(32)])"
      ]
     },
     "execution_count": 308,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "xW12 = K.conv2d(xr,W1r2,border_mode='same'); xW12.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 251,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "xW2 = K.dot(x, W1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 245,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "x1 = np.random.normal(size=(nb_samples, nb_time, input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 246,
   "metadata": {
    "collapsed": true,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "w1 = np.random.normal(size=(input_dim, input_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 248,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "res = sess.run(xW1, {x:x1, W1:w1})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 252,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "res2 = sess.run(xW2, {x:x1, W1:w1})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 253,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 253,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.allclose(res, res2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 283,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(48), Dimension(32)])"
      ]
     },
     "execution_count": 283,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "W2 = tf.placeholder(np.float32, (output_dim, input_dim)); W2.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 295,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [],
   "source": [
    "h = tf.placeholder(np.float32, (nb_samples, output_dim))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 296,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(64), Dimension(32)])"
      ]
     },
     "execution_count": 296,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hW2 = K.dot(h,W2); hW2.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 297,
   "metadata": {
    "collapsed": false,
    "deletable": true,
    "editable": true,
    "hidden": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TensorShape([Dimension(64), Dimension(1), Dimension(1), Dimension(32)])"
      ]
     },
     "execution_count": 297,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hW2 = K.reshape(hW2,(-1,1,1,input_dim)); hW2.shape"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}