{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# TensorFlow による RNN 実習
ハンズオン資料" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

2016/09/10 機械学習 名古屋 第6回勉強会

" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## はじめに" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ この資料は、TensorFlow を用いて、簡単な RNN の構築・学習を実施することを目的とするものです。 \n", "+ この資料に掲載のコードは、TensorFlow 公式のチュートリアル [Recurrent Neural Networks](https://www.tensorflow.org/versions/master/tutorials/recurrent/) を元に構築しております。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 目標" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ TensorFlow と PTB データセットを用いて、基本的な RNN のモデル構築・学習・文章推測ができる。\n", "+ 他のデータセットや、他のモデルで同様に系列データの学習・推測をする(オプション、もしくは宿題)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## 環境等" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "以下の環境を前提とします。\n", "\n", "+ Python(必須)(2.7.x / 3.x どちらでもOK)\n", "+ TensorFlow(必須)(0.7 / 0.8 / 0.9 / 0.10[New!] どれでもOK)\n", "+ IPython(任意)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ TensorFlow 0.6(以前)は未対応。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## TensorFlow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ [公式サイト](https://www.tensorflow.org/)\n", "+ Google 製の「データフローグラフを用いた数値計算ライブラリ」(公式の説明を私訳)\n", " + DeepLearning 用の機能も豊富。\n", "+ 最新 [v0.10](https://github.com/tensorflow/tensorflow/releases/tag/v0.10.0rc0) がリリース準備中(2016/09/03 現在、v0.10-rc0 公開中)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "インストールの詳細省略。 \n", "インストールが成功していれば、Python のインタラクティブシェル(もしくは ipython, Jupyter 等)で↓以下のようにすれば利用開始。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import tensorflow as tf" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ 今回は TensorBoard は不使用。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## RNN" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### おさらい" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "+ RNN:\n", " + 系列データ(文章、音声、映像等)のコンテキストを学習・分類するモデルの総称\n", "+ コンテキスト(文脈):\n", " + 系列内の要素の並び・依存関係\n", " + 要素の例:\n", " + 「文章」中の「文字」、「単語」、「形態素」等\n", " + 「音声」中の「(パワースペクトル密度係数の)ベクトル」、「音素」等\n", " + 「映像(動画)」中の「(ある時刻における)画像(静止画)」等" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "+ LSTM:\n", " + RNN の拡張モデル\n", " + メモリセルと3つのゲートからなるLSTMブロックという構造を持つ\n", " + (従来の RNN と比較して)「長期にわたる依存性」を学習可能" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## PTB データセット" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "+ PTB データセット:\n", " + 単語の履歴を元に、文章中の次の単語を予測する「文章予測モデル」のための(サンプル)データセット。\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ RNN の最初のサンプル構築・実験に最適。" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### データ準備" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz をダウンロード・展開。 \n", "(その中の `data/ptb.train.txt`, `data/ptb.valid.txt`, `data/ptb.test.txt` のみ使用)" ] }, { "cell_type": "markdown", "metadata": { 
"slideshow": { "slide_type": "notes" } }, "source": [ "※ 必要なデータのみ抽出しアーカイブしたファイルを用意 → https://github.com/antimon2/MLN_201609/raw/tf_rnn_dev/simple-examples.tgz \n", "※ あと今回は自動ダウンロード・展開の機能は提供なし。各自手作業でお願いしますm(\\_ \\_)m" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def _read_words(filename):\n", " with tf.gfile.GFile(filename, \"r\") as f:\n", " return f.read().replace(\"\\n\", \"\").split()\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ `` は \"End of Sentence\" の意味。改行を文章の区切と見なしそのマークを入れています。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def _build_vocab(filename):\n", " import collections\n", "\n", " data = _read_words(filename)\n", "\n", " counter = collections.Counter(data)\n", " count_pairs = sorted(counter.items(), key=lambda x: (-x[1], x[0]))\n", "\n", " words, _ = list(zip(*count_pairs))\n", " word_to_id = dict(zip(words, range(len(words))))\n", "\n", " return word_to_id\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ 概要:訓練データに含まれる全ての単語を高頻度順に番号付けした辞書を生成。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def _file_to_word_ids(filename, word_to_id):\n", " data = _read_words(filename)\n", " return [word_to_id[word] for word in data]\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ 概要:テキストファイルを単語ごとに分割→単語IDのリストに変換。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptb_raw_data(data_path=None):\n", " \"\"\"Load PTB raw data from data directory \"data_path\".\"\"\"\n", " import os\n", "\n", " train_path = os.path.join(data_path, \"ptb.train.txt\")\n", " valid_path = os.path.join(data_path, \"ptb.valid.txt\")\n", " test_path = os.path.join(data_path, \"ptb.test.txt\")\n", "\n", " word_to_id = _build_vocab(train_path)\n", " train_data = _file_to_word_ids(train_path, word_to_id)\n", " valid_data = _file_to_word_ids(valid_path, word_to_id)\n", " test_data = _file_to_word_ids(test_path, word_to_id)\n", " vocabulary = len(word_to_id)\n", " return train_data, valid_data, test_data, vocabulary\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "※ 概要:訓練データ・検証データ・評価データをそれぞれ読込。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptb_iterator(raw_data, batch_size, num_steps):\n", " \"\"\"Iterate on the raw PTB data.\"\"\"\n", " import numpy as np\n", " \n", " raw_data = np.array(raw_data, dtype=np.int32)\n", "\n", " data_len = len(raw_data)\n", " batch_len = data_len // batch_size\n", " data = np.zeros([batch_size, batch_len], dtype=np.int32)\n", " for i in range(batch_size):\n", " data[i] = raw_data[batch_len * i:batch_len * (i + 1)]\n", "\n", " epoch_size = (batch_len - 1) // num_steps\n", "\n", " if epoch_size == 0:\n", " raise ValueError(\"epoch_size == 0, decrease batch_size or num_steps\")\n", "\n", " for i in range(epoch_size):\n", " x = data[:, i*num_steps:(i+1)*num_steps]\n", " y = data[:, 
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Model Construction" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Sample Model Configuration" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class SampleTinyConfig(object):\n", "    \"\"\"Tiny config.\"\"\"\n", "    init_scale = 0.1\n", "    learning_rate = 1.0\n", "    max_grad_norm = 5\n", "    num_layers = 2\n", "    num_steps = 10\n", "    hidden_size = 50\n", "    max_epoch = 1\n", "    max_max_epoch = 3\n", "    keep_prob = 1.0\n", "    lr_decay = 0.5\n", "    batch_size = 20\n", "    vocab_size = 10000\n" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Model Definition" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step1(config):\n", "    \"\"\"Prepare Placeholders.\"\"\"\n", "    _input_data = tf.placeholder(tf.int32, [config.batch_size, config.num_steps])\n", "    _targets = tf.placeholder(tf.int32, [config.batch_size, config.num_steps])\n", "    return (_input_data, _targets)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step2(config, is_training):\n", "    \"\"\"Construct LSTM cell(s).\"\"\"\n", "    batch_size = config.batch_size\n", "    size = config.hidden_size\n", "\n", "    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(size, forget_bias=0.0)\n", "    if is_training and config.keep_prob < 1:\n", "        lstm_cell = tf.nn.rnn_cell.DropoutWrapper(\n", "            lstm_cell, output_keep_prob=config.keep_prob)\n", "    cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers)\n", "    _initial_state = cell.zero_state(batch_size, tf.float32)\n", "\n", "    return (cell, _initial_state)" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: a stack of multiple LSTMs (`tf.nn.rnn_cell.MultiRNNCell`) is used.  \n", "  The output of the first layer becomes the input of the second layer.  \n", "Note: for simplicity, `forget_bias=0.0` is passed here (no bias added to the forget gate; the cell's default is 1.0)." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step3(config, _input_data, is_training):\n", "    \"\"\"Convert Input Data.\"\"\"\n", "    size = config.hidden_size\n", "    vocab_size = config.vocab_size\n", "\n", "    with tf.device(\"/cpu:0\"):\n", "        embedding = tf.get_variable(\"embedding\", [vocab_size, size])\n", "        inputs = tf.nn.embedding_lookup(embedding, _input_data)\n", "\n", "    if is_training and config.keep_prob < 1:\n", "        inputs = tf.nn.dropout(inputs, config.keep_prob)\n", "\n", "    return inputs" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: uses \"vector representations of words\" (word embeddings; see the previous session's material). A small illustration follows." ] },
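{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: (my addition) `tf.nn.embedding_lookup` simply selects rows of the embedding matrix by integer ID, turning word IDs into dense vectors. A toy illustration:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "slideshow": { "slide_type": "notes" } }, "outputs": [], "source": [ "# Row i of `emb` is the vector for word ID i.\n", "emb = tf.constant([[0., 1.], [2., 3.], [4., 5.]])  # vocab 3, dim 2\n", "ids = tf.constant([[2, 0]])                        # batch 1, 2 steps\n", "with tf.Session() as s:\n", "    print(s.run(tf.nn.embedding_lookup(emb, ids)))  # [[[4. 5.] [0. 1.]]]\n" ] },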
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step4(config, cell, inputs, _initial_state):\n", "    \"\"\"Calculate Logits and Final State.\"\"\"\n", "    size = config.hidden_size\n", "    vocab_size = config.vocab_size\n", "\n", "    outputs = []\n", "    state = _initial_state\n", "    with tf.variable_scope(\"RNN\"):\n", "        for time_step in range(config.num_steps):\n", "            if time_step > 0: tf.get_variable_scope().reuse_variables()\n", "            (cell_output, state) = cell(inputs[:, time_step, :], state)\n", "            outputs.append(cell_output)\n", "\n", "    output = tf.reshape(tf.concat(1, outputs), [-1, size])\n", "    softmax_w = tf.get_variable(\"softmax_w\", [size, vocab_size])\n", "    softmax_b = tf.get_variable(\"softmax_b\", [vocab_size])\n", "    logits = tf.matmul(output, softmax_w) + softmax_b\n", "    probs = tf.nn.softmax(logits)\n", "    return (logits, probs, state)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step5(config, logits, _targets):\n", "    \"\"\"Calculate Cost.\"\"\"\n", "    loss = tf.nn.seq2seq.sequence_loss_by_example(\n", "        [logits],\n", "        [tf.reshape(_targets, [-1])],\n", "        [tf.ones([config.batch_size * config.num_steps])])\n", "    cost = tf.reduce_sum(loss) / config.batch_size\n", "    return cost\n" ] },
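{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: (my addition) `sequence_loss_by_example` computes the cross-entropy of each target word, so `cost` is the total loss of one minibatch divided by `batch_size`. The perplexity reported during training below is the exponentiated mean per-word loss:\n", "\n", "$$\\mathrm{perplexity} = \\exp\\left(\\frac{1}{N}\\sum_{i=1}^{N} -\\ln p(w_i \\mid w_1, \\ldots, w_{i-1})\\right)$$\n", "\n", "Intuitively, a perplexity of 155 means the model is, on average, about as uncertain as a uniform choice among 155 words." ] },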
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def ptbmodel_init_step6(config, cost):\n", "    \"\"\"Define Training Operation.\"\"\"\n", "    _lr = tf.Variable(0.0, trainable=False)\n", "    tvars = tf.trainable_variables()\n", "    grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars),\n", "                                      config.max_grad_norm)\n", "    optimizer = tf.train.GradientDescentOptimizer(_lr)\n", "    _train_op = optimizer.apply_gradients(zip(grads, tvars))\n", "    return (_lr, _train_op)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "class PTBModel(object):\n", "    \"\"\"The PTB model.\"\"\"\n", "\n", "    def __init__(self, is_training, config):\n", "        self.batch_size = config.batch_size\n", "        self.num_steps = config.num_steps\n", "\n", "        _input_data, _targets = ptbmodel_init_step1(config)\n", "        self.input_data = _input_data\n", "        self.targets = _targets\n", "\n", "        cell, _initial_state = ptbmodel_init_step2(config, is_training)\n", "        self.cell = cell\n", "        self.initial_state = _initial_state\n", "\n", "        inputs = ptbmodel_init_step3(config, _input_data, is_training)\n", "\n", "        logits, probs, _final_state = ptbmodel_init_step4(\n", "            config, cell, inputs, _initial_state)\n", "        self.logits = logits\n", "        self.probs = probs\n", "        self.final_state = _final_state\n", "\n", "        cost = ptbmodel_init_step5(config, logits, _targets)\n", "        self.cost = cost\n", "\n", "        if is_training:\n", "            _lr, _train_op = ptbmodel_init_step6(config, cost)\n", "            self.lr = _lr\n", "            self.train_op = _train_op\n", "\n", "    def assign_lr(self, session, lr_value):\n", "        session.run(tf.assign(self.lr, lr_value))\n" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Training" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "from __future__ import absolute_import\n", "from __future__ import division\n", "from __future__ import print_function\n", "\n", "import os\n", "import time\n", "\n", "import numpy as np\n", "import tensorflow as tf\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def run_epoch(session, model, data, eval_op, verbose=False):\n", "    \"\"\"Runs the model on the given data.\"\"\"\n", "    epoch_size = ((len(data) // model.batch_size) - 1) // model.num_steps\n", "    start_time = time.time()\n", "    costs = 0.0\n", "    iters = 0\n", "    # state = model.initial_state.eval()\n", "    state = session.run(model.initial_state)\n", "    for step, (x, y) in enumerate(ptb_iterator(data, model.batch_size,\n", "                                               model.num_steps)):\n", "        cost, state, _ = session.run([model.cost, model.final_state, eval_op],\n", "                                     {model.input_data: x,\n", "                                      model.targets: y,\n", "                                      model.initial_state: state})\n", "        costs += cost\n", "        iters += model.num_steps\n", "\n", "        if verbose and step % (epoch_size // 10) == 10:\n", "            print(\"%.3f perplexity: %.3f speed: %.0f wps\" %\n", "                  (step * 1.0 / epoch_size, np.exp(costs / iters),\n", "                   iters * model.batch_size / (time.time() - start_time)))\n", "\n", "    return np.exp(costs / iters)\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "data_path = 'simple-examples/data/'\n", "train_data, valid_data, test_data, _ = ptb_raw_data(data_path)\n", "\n", "config = SampleTinyConfig()\n", "eval_config = SampleTinyConfig()\n", "eval_config.batch_size = 1\n", "eval_config.num_steps = 1\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# with tf.Graph().as_default():\n", "initializer = tf.random_uniform_initializer(-config.init_scale,\n", "                                            config.init_scale)\n", "with tf.variable_scope(\"model\", reuse=None, initializer=initializer):\n", "    m = PTBModel(is_training=True, config=config)\n", "with tf.variable_scope(\"model\", reuse=True, initializer=initializer):\n", "    mvalid = PTBModel(is_training=False, config=config)\n", "    mtest = PTBModel(is_training=False, config=eval_config)\n", "init = tf.initialize_all_variables()" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: with TensorFlow v0.9 or later, a WARNING is printed, but everything works correctly." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "sess = tf.Session()\n", "sess.run(init)\n", "\n", "for i in range(config.max_max_epoch):\n", "    lr_decay = config.lr_decay ** max(i - config.max_epoch, 0.0)\n", "    m.assign_lr(sess, config.learning_rate * lr_decay)\n", "\n", "    print(\"Epoch: %d Learning rate: %.3f\" % (i + 1, sess.run(m.lr)))\n", "    train_perplexity = run_epoch(sess, m, train_data, m.train_op,\n", "                                 verbose=True)\n", "    print(\"Epoch: %d Train Perplexity: %.3f\" % (i + 1, train_perplexity))\n", "    valid_perplexity = run_epoch(sess, mvalid, valid_data, tf.no_op())\n", "    print(\"Epoch: %d Valid Perplexity: %.3f\" % (i + 1, valid_perplexity))\n", "\n", "test_perplexity = run_epoch(sess, mtest, test_data, tf.no_op())\n", "print(\"Test Perplexity: %.3f\" % test_perplexity)" ] },
{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: this takes around 30 minutes.  \n", "Note: the final Test Perplexity should come out at around 155." ] },
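{ "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "Note: (my addition) the schedule in the loop above keeps the learning rate at `learning_rate` while `i <= max_epoch` and multiplies it by `lr_decay` for each epoch beyond that (`i` is the 0-based epoch index):\n", "\n", "$$lr_i = \\mathrm{learning\\_rate} \\times \\mathrm{lr\\_decay}^{\\max(i - \\mathrm{max\\_epoch},\\ 0)}$$\n", "\n", "With the tiny config (`learning_rate = 1.0`, `max_epoch = 1`, `lr_decay = 0.5`), the three epochs run at 1.0, 1.0, and 0.5." ] },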
"code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "def sample(model, sess, words, vocab, num=200, prime=['the'], sampling_type=1):\n", " state = sess.run(model.cell.zero_state(1, tf.float32))\n", " for word in prime[:-1]:\n", " x = np.array([[vocab[word]]])\n", " feed = {model.input_data: x, model.initial_state:state}\n", " state = sess.run(model.final_state, feed)\n", "\n", " def weighted_pick(weights):\n", " t = np.cumsum(weights)\n", " s = np.sum(weights)\n", " return(int(np.searchsorted(t, np.random.rand(1)*s)))\n", "\n", " ret = list(prime)\n", " word = prime[-1]\n", " for n in range(num):\n", " x = np.array([[vocab[word]]])\n", " feed = {model.input_data: x, model.initial_state:state}\n", " probs, state = sess.run([model.probs, model.final_state], feed)\n", " p = probs[0]\n", "\n", " if sampling_type == 0:\n", " sample = np.argmax(p)\n", " else:\n", " sample = weighted_pick(p)\n", "\n", " pred = words[sample]\n", " if pred == '':\n", " break\n", "\n", " ret.append(pred)\n", " word = pred\n", " return ret" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "word_to_id = _build_vocab(os.path.join(data_path, \"ptb.train.txt\"))\n", "words = sorted(word_to_id.keys(), key=lambda x: word_to_id[x])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "res = sample(mtest, sess, words, word_to_id, prime=['the', 'american'])\n", "' '.join(res)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "res = sample(mtest, sess, words, word_to_id, prime=np.random.choice(words, 3, replace=False))\n", "' '.join(res)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 発展" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### 精度向上" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "+ イテレーション回数を調整する(`num_step`, `max_epoch`, `max_max_epoch` 等)\n", "+ LSTM のサイズを調整する(`hidden_size`)\n", "+ その他各種パラメータを調整する(`init_scale`, `learning_rate`, `max_grad_norm` 等)\n", "+ Forget Gate を有効にする(`ptbmodel_init_step2()` 関数内、`tf.nn.rnn_cell.BasicLSTMCell()` への引数 `forget_bias=~` に適切な数値(>0.0)を設定する)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### その他(課題)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "+ 単語単位ではなく、文字単位で学習してみる(char-rnn)\n", "+ 英文(欧文)ではなく、日本語の文章で学習・推測してみる\n", "+ 文章(文字列)以外の系列データ(音声・動画等)に適用してみる" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 参考" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "+ [Recurrent Neural Networks](https://www.tensorflow.org/tutorials/recurrent/index.html)(TensorFlow 公式チュートリアル)\n", " + [参考日本語訳その1](http://qiita.com/KojiOhki/items/149f96bd98973bd219ac)\n", " + [参考日本語訳その2 + コード解説](http://tensorflow.classcat.com/2016/03/13/tensorflow-cc-recurrent-neural-networks/)" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "TensorFlow v0.9 (Python 3)", "language": "python", "name": "tensorflow09" }, "language_info": 
{ "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }