{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# RNN - Many-to-one\n", "> In this post, we will briefly cover the many-to-one type, which is one of the common types of Recurrent Neural Network, and its implementation in TensorFlow. \n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Deep_Learning, Tensorflow-Keras]\n", "- image: images/many-to-one_detail.png" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Tensorflow: 2.3.1\n" ] } ], "source": [ "import tensorflow as tf\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "\n", "print('Tensorflow: {}'.format(tf.__version__))\n", "\n", "plt.rcParams['figure.figsize'] = (16, 10)\n", "plt.rc('font', size=15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Various usage of RNN\n", "As we already discussed, an RNN is used to handle sequence data, and there are several types of RNN architecture.\n", "\n", "![various_rnn](image/various_rnn.png) {% fn 1 %}\n", "\n", "In the previous [post](https://goodboychan.github.io/chans_jupyter/python/deep_learning/tensorflow-keras/2020/10/26/02-RNN-Basic.html), we took a look at the **one-to-one** type, which is the basic RNN structure. The next one is the **one-to-many** type. For example, if the model gets a fixed-size input such as an image, it generates sequence data. Image captioning is a typical application of this type. Another type is **many-to-many**. It takes sequence data as input and also generates sequence data as output. A common application of the many-to-many type is machine translation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Many-to-one\n", "\n", "The **many-to-one** type, which is our topic in this post, takes sequence data as input and generates a single piece of information, such as a label.
So we can use it for classification. Suppose that someone labels the sentiment of each sentence, and we train a many-to-one model on those labels. Then, when the model gets an unseen sentence, it will predict the sentiment of that sentence, good or bad.\n", "\n", "![many-to-one example](image/many-to-one.png)\n", "\n", "The detailed explanation is like this.\n", "\n", "Suppose we have a sentence, \"This movie is good\", and we want to classify its sentiment. In order to do this, we need to tokenize it at the word level. If this sentence expresses a good sentiment, then its word tokens may contain positive words, like \"good\". So we can classify this sentence as having a good sentiment.\n", "\n", "So if we want to apply it in an RNN model, we need to consider the sentence as a word sequence (many), then classify its label (one). That is the process of a many-to-one type model.\n", "\n", "![many-to-one detail](image/many-to-one_detail.png)\n", "\n", "But as you may notice, a computational model cannot accept the word itself as an input. Instead, each word needs to be converted to a numerical vector. An **Embedding** layer can do this.\n", "\n", "So how can we implement this with TensorFlow?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example - Word sentiment classification\n", "\n", "First, we prepare some dummy data for simple classification. For simplicity, we define 1 as the label for a good word, and 0 otherwise." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "words = ['good', 'bad', 'worse', 'so good']\n", "y = [1, 0, 0, 1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on this, we can generate the token dictionary, which is the mapping table between each character and its index. But before we prepare this, we need to consider one issue: the words have different lengths. To train the network, the format (or shape) of the input data must be fixed. So we need to add the concept of **padding**.
TensorFlow provides a useful API for padding, `pad_sequences`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Index 0 is reserved for the empty padding token ''\n", "char_set = [''] + sorted(list(set(''.join(words))))\n", "idx2char = {idx: char for idx, char in enumerate(char_set)}\n", "char2idx = {char: idx for idx, char in enumerate(char_set)}" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['', ' ', 'a', 'b', 'd', 'e', 'g', 'o', 'r', 's', 'w']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "char_set" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{0: '',\n", " 1: ' ',\n", " 2: 'a',\n", " 3: 'b',\n", " 4: 'd',\n", " 5: 'e',\n", " 6: 'g',\n", " 7: 'o',\n", " 8: 'r',\n", " 9: 's',\n", " 10: 'w'}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idx2char" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'': 0,\n", " ' ': 1,\n", " 'a': 2,\n", " 'b': 3,\n", " 'd': 4,\n", " 'e': 5,\n", " 'g': 6,\n", " 'o': 7,\n", " 'r': 8,\n", " 's': 9,\n", " 'w': 10}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "char2idx" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So as we mentioned before, we need to vectorize each token: every character is replaced by its index in `char2idx`."
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Convert each word to a list of character indices\n", "X = [[char2idx[char] for char in word] for word in words]\n", "# Keep the original (unpadded) length of each sequence\n", "X_len = [len(word) for word in X]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[6, 7, 7, 4], [3, 2, 4], [10, 7, 8, 9, 5], [9, 7, 1, 6, 7, 7, 4]]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 3, 5, 7]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_len" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from tensorflow.keras.preprocessing.sequence import pad_sequences\n", "\n", "# Padding the sequence of indices\n", "max_sequence = 10\n", "\n", "X = pad_sequences(X, maxlen=max_sequence, padding='post', truncating='post')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 6, 7, 7, 4, 0, 0, 0, 0, 0, 0],\n", " [ 3, 2, 4, 0, 0, 0, 0, 0, 0, 0],\n", " [10, 7, 8, 9, 5, 0, 0, 0, 0, 0],\n", " [ 9, 7, 1, 6, 7, 7, 4, 0, 0, 0]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 0, 0, 1]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we used the `pad_sequences` API with several arguments. First, we set the maximum sequence length to 10, so each numerical vector has a length of exactly 10. That is, we consider at most 10 characters per sequence. Some words have fewer than 10 characters. In that case, the remaining positions are filled with 0s as padding.
Through the `padding` argument, we can choose the direction of padding, `pre` or `post`.\n", "\n", "And of course, it is more efficient to feed the data through a `tf.data.Dataset` pipeline." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# Generate data pipeline\n", "train_ds = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(buffer_size=4).batch(batch_size=2)\n", "print(train_ds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After that, we can build a many-to-one model with `SimpleRNN`." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "input_dim = len(char2idx)\n", "output_dim = len(char2idx)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_1\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "embedding_1 (Embedding) (None, 10, 11) 121 \n", "_________________________________________________________________\n", "simple_rnn_1 (SimpleRNN) (None, 10) 220 \n", "_________________________________________________________________\n", "dense_1 (Dense) (None, 2) 22 \n", "=================================================================\n", "Total params: 363\n", "Trainable params: 242\n", "Non-trainable params: 121\n", "_________________________________________________________________\n" ] } ], "source": [ "from tensorflow.keras.models import Sequential\n", "from tensorflow.keras.layers import Embedding, SimpleRNN, Dense\n", "\n", "model = Sequential([\n", "    Embedding(input_dim=input_dim, output_dim=output_dim,\n", "              mask_zero=True, input_length=max_sequence,\n", "              trainable=False, embeddings_initializer=tf.keras.initializers.random_normal()),\n", "    SimpleRNN(units=10),\n", "    Dense(2)\n", "])\n", "\n", 
"model.summary()" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "def loss_fn(model, X, y):\n", " return tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(y_true=y, \n", " y_pred=model(X), \n", " from_logits=True))\n", "\n", "optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "epoch: 5, tr_loss: 0.248861\n", "epoch: 10, tr_loss: 0.018041\n", "epoch: 15, tr_loss: 0.003201\n", "epoch: 20, tr_loss: 0.001664\n", "epoch: 25, tr_loss: 0.001231\n", "epoch: 30, tr_loss: 0.001025\n" ] } ], "source": [ "tr_loss_hist = []\n", "\n", "for e in range(30):\n", " avg_tr_loss = 0\n", " tr_step = 0\n", " \n", " for x_mb, y_mb in train_ds:\n", " with tf.GradientTape() as tape:\n", " tr_loss = loss_fn(model, x_mb, y_mb)\n", " \n", " grads = tape.gradient(tr_loss, sources=model.variables)\n", " optimizer.apply_gradients(grads_and_vars=zip(grads, model.variables))\n", " avg_tr_loss += tr_loss\n", " tr_step += 1\n", " \n", " avg_tr_loss /= tr_step\n", " tr_loss_hist.append(avg_tr_loss)\n", " \n", " if (e + 1) % 5 == 0:\n", " print('epoch: {:3}, tr_loss: {:3f}'.format(e + 1, avg_tr_loss))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After that, we can check the performance of this simple rnn network" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "y_pred = model.predict(X)\n", "y_pred = np.argmax(y_pred, axis=-1)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "acc: 100.00%\n" ] } ], "source": [ "print('acc: {:.2%}'.format(np.mean(y_pred == y)))" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure()\n", "plt.plot(tr_loss_hist)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, this simple network reaches 100% accuracy on the four training examples, and the training loss drops close to zero." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "In this post, we covered the basic concept of the **many-to-one** type of RNN model for sentiment classification, and implemented it with TensorFlow." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "{{ 'Reference from Stanford CS231n lecture note' | fndetail: 1 }}" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }