{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Emojify! \n", "\n", "Welcome to the second assignment of Week 2. You are going to use word vector representations to build an Emojifier. \n", "\n", "Have you ever wanted to make your text messages more expressive? Your emojifier app will help you do that. \n", "So rather than writing:\n", ">\"Congratulations on the promotion! Let's get coffee and talk. Love you!\" \n", "\n", "The emojifier can automatically turn this into:\n", ">\"Congratulations on the promotion! πŸ‘ Let's get coffee and talk. β˜•οΈ Love you! ❀️\"\n", "\n", "* You will implement a model which inputs a sentence (such as \"Let's go see the baseball game tonight!\") and finds the most appropriate emoji to be used with this sentence (⚾️).\n", "\n", "#### Using word vectors to improve emoji lookups\n", "* In many emoji interfaces, you need to remember that ❀️ is the \"heart\" symbol rather than the \"love\" symbol. \n", " * In other words, you'll have to remember to type \"heart\" to find the desired emoji, and typing \"love\" won't bring up that symbol.\n", "* We can make a more flexible emoji interface by using word vectors!\n", "* When using word vectors, you'll see that even if your training set explicitly relates only a few words to a particular emoji, your algorithm will be able to generalize and associate additional words in the test set to the same emoji.\n", " * This works even if those additional words don't even appear in the training set. \n", " * This allows you to build an accurate classifier mapping from sentences to emojis, even using a small training set. \n", "\n", "#### What you'll build\n", "1. In this exercise, you'll start with a baseline model (Emojifier-V1) using word embeddings.\n", "2. Then you will build a more sophisticated model (Emojifier-V2) that further incorporates an LSTM. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Updates\n", "\n", "#### If you were working on the notebook before this update...\n", "* The current notebook is version \"2a\".\n", "* You can find your original work saved in the notebook with the previous version name (\"v2\") \n", "* To view the file directory, go to the menu \"File->Open\", and this will open a new tab that shows the file directory.\n", "\n", "#### List of updates\n", "* sentence_to_avg\n", " * Updated instructions.\n", " * Use separate variables to store the total and the average (instead of just `avg`).\n", " * Additional hint about how to initialize the shape of `avg` vector.\n", "* sentences_to_indices\n", " * Updated preceding text and instructions, added additional hints.\n", "* pretrained_embedding_layer\n", " * Additional instructions to explain how to implement each step.\n", "* Emoify_V2\n", " * Modifies instructions to specify which parameters are needed for each Keras layer.\n", " * Remind users of Keras syntax.\n", " * Explanation of how to use the layer object that is returned by `pretrained_embedding_layer`.\n", " * Provides sample Keras code.\n", "* Spelling, grammar and wording corrections." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get started! Run the following cell to load the package you are going to use. 
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "from emo_utils import *\n", "import emoji\n", "import matplotlib.pyplot as plt\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1 - Baseline model: Emojifier-V1\n", "\n", "### 1.1 - Dataset EMOJISET\n", "\n", "Let's start by building a simple baseline classifier. \n", "\n", "You have a tiny dataset (X, Y) where:\n", "- X contains 127 sentences (strings).\n", "- Y contains an integer label between 0 and 4 corresponding to an emoji for each sentence.\n", "\n", "\n", "
**Figure 1**: EMOJISET - a classification problem with 5 classes. A few examples of sentences are given here.
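\n", "\n", "To make the (X, Y) format concrete, here is a minimal sketch of a few training pairs; the sentences and labels are copied from the training examples printed a little further down, and `X_sample` / `Y_sample` are illustrative names rather than variables used by the assignment:\n", "```Python\n", "import numpy as np\n", "\n", "# label meaning: 0 = heart, 1 = baseball, 2 = smile, 3 = disappointed, 4 = food\n", "X_sample = np.array(['never talk to me again', 'I am proud of your achievements', 'food is life'])\n", "Y_sample = np.array([3, 2, 4])\n", "```\n", "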
\n", "\n", "Let's load the dataset using the code below. We split the dataset between training (127 examples) and testing (56 examples)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X_train, Y_train = read_csv('data/train_emoji.csv')\n", "X_test, Y_test = read_csv('data/tesss.csv')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "maxLen = len(max(X_train, key=len).split())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the following cell to print sentences from X_train and corresponding labels from Y_train. \n", "* Change `idx` to see different examples. \n", "* Note that due to the font used by iPython notebook, the heart emoji may be colored black rather than red." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "never talk to me again 😞\n", "I am proud of your achievements πŸ˜„\n", "It is the worst day in my life 😞\n", "Miss you so much ❀️\n", "food is life 🍴\n", "I love you mum ❀️\n", "Stop saying bullshit 😞\n", "congratulations on your acceptance πŸ˜„\n", "The assignment is too long 😞\n", "I want to go play ⚾\n" ] } ], "source": [ "for idx in range(10):\n", " print(X_train[idx], label_to_emoji(Y_train[idx]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 - Overview of the Emojifier-V1\n", "\n", "In this part, you are going to implement a baseline model called \"Emojifier-v1\". \n", "\n", "
\n", "\n", "
**Figure 2**: Baseline model (Emojifier-V1).
\n", "
\n", "\n", "\n", "#### Inputs and outputs\n", "* The input of the model is a string corresponding to a sentence (e.g. \"I love you). \n", "* The output will be a probability vector of shape (1,5), (there are 5 emojis to choose from).\n", "* The (1,5) probability vector is passed to an argmax layer, which extracts the index of the emoji with the highest probability." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### One-hot encoding\n", "* To get our labels into a format suitable for training a softmax classifier, lets convert $Y$ from its current shape $(m, 1)$ into a \"one-hot representation\" $(m, 5)$, \n", " * Each row is a one-hot vector giving the label of one example.\n", " * Here, `Y_oh` stands for \"Y-one-hot\" in the variable names `Y_oh_train` and `Y_oh_test`: " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "Y_oh_train = convert_to_one_hot(Y_train, C = 5)\n", "Y_oh_test = convert_to_one_hot(Y_test, C = 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see what `convert_to_one_hot()` did. Feel free to change `index` to print out different values. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sentence 'I missed you' has label index 0, which is emoji ❀️\n", "Label index 0 in one-hot encoding format is [ 1. 0. 0. 0. 0.]\n" ] } ], "source": [ "idx = 50\n", "print(f\"Sentence '{X_train[50]}' has label index {Y_train[idx]}, which is emoji {label_to_emoji(Y_train[idx])}\", )\n", "print(f\"Label index {Y_train[idx]} in one-hot encoding format is {Y_oh_train[idx]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the data is now ready to be fed into the Emojify-V1 model. Let's implement the model!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3 - Implementing Emojifier-V1\n", "\n", "As shown in Figure 2 (above), the first step is to:\n", "* Convert each word in the input sentence into their word vector representations.\n", "* Then take an average of the word vectors. \n", "* Similar to the previous exercise, we will use pre-trained 50-dimensional GloVe embeddings. \n", "\n", "Run the following cell to load the `word_to_vec_map`, which contains all the vector representations." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('../../readonly/glove.6B.50d.txt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You've loaded:\n", "- `word_to_index`: dictionary mapping from words to their indices in the vocabulary \n", " - (400,001 words, with the valid indices ranging from 0 to 400,000)\n", "- `index_to_word`: dictionary mapping from indices to their corresponding words in the vocabulary\n", "- `word_to_vec_map`: dictionary mapping words to their GloVe vector representation.\n", "\n", "Run the following cell to check if it works." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "the index of cucumber in the vocabulary is 113317\n", "the 289846th word in the vocabulary is potatos\n" ] } ], "source": [ "word = \"cucumber\"\n", "idx = 289846\n", "print(\"the index of\", word, \"in the vocabulary is\", word_to_index[word])\n", "print(\"the\", str(idx) + \"th word in the vocabulary is\", index_to_word[idx])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise**: Implement `sentence_to_avg()`. You will need to carry out two steps:\n", "1. Convert every sentence to lower-case, then split the sentence into a list of words. \n", " * `X.lower()` and `X.split()` might be useful. \n", "2. For each word in the sentence, access its GloVe representation.\n", " * Then take the average of all of these word vectors.\n", " * You might use `numpy.zeros()`.\n", " \n", " \n", "#### Additional Hints\n", "* When creating the `avg` array of zeros, you'll want it to be a vector of the same shape as the other word vectors in the `word_to_vec_map`. \n", " * You can choose a word that exists in the `word_to_vec_map` and access its `.shape` field.\n", " * Be careful not to hard code the word that you access. In other words, don't assume that if you see the word 'the' in the `word_to_vec_map` within this notebook, that this word will be in the `word_to_vec_map` when the function is being called by the automatic grader.\n", " * Hint: you can use any one of the word vectors that you retrieved from the input `sentence` to find the shape of a word vector." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# GRADED FUNCTION: sentence_to_avg\n", "\n", "def sentence_to_avg(sentence, word_to_vec_map):\n", " \"\"\"\n", " Converts a sentence (string) into a list of words (strings). Extracts the GloVe representation of each word\n", " and averages its value into a single vector encoding the meaning of the sentence.\n", " \n", " Arguments:\n", " sentence -- string, one training example from X\n", " word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation\n", " \n", " Returns:\n", " avg -- average vector encoding information about the sentence, numpy-array of shape (50,)\n", " \"\"\"\n", " \n", " ### START CODE HERE ###\n", " # Step 1: Split sentence into list of lower case words (β‰ˆ 1 line)\n", " words = sentence.lower().split()\n", "\n", " # Initialize the average word vector, should have the same shape as your word vectors.\n", " avg = np.zeros((50,))\n", " \n", " # Step 2: average the word vectors. 
You can loop over the words in the list \"words\".\n", " for w in words:\n", " avg += word_to_vec_map[w]\n", " avg = avg / len(words)\n", " \n", " ### END CODE HERE ###\n", " \n", " return avg" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "avg = \n", " [-0.008005 0.56370833 -0.50427333 0.258865 0.55131103 0.03104983\n", " -0.21013718 0.16893933 -0.09590267 0.141784 -0.15708967 0.18525867\n", " 0.6495785 0.38371117 0.21102167 0.11301667 0.02613967 0.26037767\n", " 0.05820667 -0.01578167 -0.12078833 -0.02471267 0.4128455 0.5152061\n", " 0.38756167 -0.898661 -0.535145 0.33501167 0.68806933 -0.2156265\n", " 1.797155 0.10476933 -0.36775333 0.750785 0.10282583 0.348925\n", " -0.27262833 0.66768 -0.10706167 -0.283635 0.59580117 0.28747333\n", " -0.3366635 0.23393817 0.34349183 0.178405 0.1166155 -0.076433\n", " 0.1445417 0.09808667]\n" ] } ], "source": [ "avg = sentence_to_avg(\"Morrocan couscous is my favorite dish\", word_to_vec_map)\n", "print(\"avg = \\n\", avg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected Output**:\n", "\n", "```Python\n", "avg =\n", "[-0.008005 0.56370833 -0.50427333 0.258865 0.55131103 0.03104983\n", " -0.21013718 0.16893933 -0.09590267 0.141784 -0.15708967 0.18525867\n", " 0.6495785 0.38371117 0.21102167 0.11301667 0.02613967 0.26037767\n", " 0.05820667 -0.01578167 -0.12078833 -0.02471267 0.4128455 0.5152061\n", " 0.38756167 -0.898661 -0.535145 0.33501167 0.68806933 -0.2156265\n", " 1.797155 0.10476933 -0.36775333 0.750785 0.10282583 0.348925\n", " -0.27262833 0.66768 -0.10706167 -0.283635 0.59580117 0.28747333\n", " -0.3366635 0.23393817 0.34349183 0.178405 0.1166155 -0.076433\n", " 0.1445417 0.09808667]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Model\n", "\n", "You now have all the pieces to finish implementing the `model()` function. \n", "After using `sentence_to_avg()` you need to:\n", "* Pass the average through forward propagation\n", "* Compute the cost\n", "* Backpropagate to update the softmax parameters\n", "\n", "**Exercise**: Implement the `model()` function described in Figure (2). \n", "\n", "* The equations you need to implement in the forward pass and to compute the cross-entropy cost are below:\n", "* The variable $Y_{oh}$ (\"Y one hot\") is the one-hot encoding of the output labels. \n", "\n", "$$ z^{(i)} = W . avg^{(i)} + b$$\n", "\n", "$$ a^{(i)} = softmax(z^{(i)})$$\n", "\n", "$$ \\mathcal{L}^{(i)} = - \\sum_{k = 0}^{n_y - 1} Y_{oh,k}^{(i)} * log(a^{(i)}_k)$$\n", "\n", "**Note** It is possible to come up with a more efficient vectorized implementation. For now, let's use nested for loops to better understand the algorithm, and for easier debugging.\n", "\n", "We provided the function `softmax()`, which was imported earlier." 
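, "\n", "To make the equations concrete, here is a minimal sketch of the forward pass and cost for a single example `i` in numpy (it assumes `W`, `b`, `avg` and `Y_oh` are defined exactly as in the starter code below):\n", "```Python\n", "z = np.dot(W, avg) + b               # shape (n_y,) = (5,)\n", "a = softmax(z)                       # softmax() comes from emo_utils\n", "cost = -np.sum(Y_oh[i] * np.log(a))  # cross-entropy loss for example i\n", "```"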
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# GRADED FUNCTION: model\n", "\n", "def model(X, Y, word_to_vec_map, learning_rate = 0.01, num_iterations = 400):\n", " \"\"\"\n", " Model to train word vector representations in numpy.\n", " \n", " Arguments:\n", " X -- input data, numpy array of sentences as strings, of shape (m, 1)\n", " Y -- labels, numpy array of integers between 0 and 7, numpy-array of shape (m, 1)\n", " word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation\n", " learning_rate -- learning_rate for the stochastic gradient descent algorithm\n", " num_iterations -- number of iterations\n", " \n", " Returns:\n", " pred -- vector of predictions, numpy-array of shape (m, 1)\n", " W -- weight matrix of the softmax layer, of shape (n_y, n_h)\n", " b -- bias of the softmax layer, of shape (n_y,)\n", " \"\"\"\n", " \n", " np.random.seed(1)\n", "\n", " # Define number of training examples\n", " m = Y.shape[0] # number of training examples\n", " n_y = 5 # number of classes \n", " n_h = 50 # dimensions of the GloVe vectors \n", " \n", " # Initialize parameters using Xavier initialization\n", " W = np.random.randn(n_y, n_h) / np.sqrt(n_h)\n", " b = np.zeros((n_y,))\n", " \n", " # Convert Y to Y_onehot with n_y classes\n", " Y_oh = convert_to_one_hot(Y, C = n_y) \n", " \n", " # Optimization loop\n", " for t in range(num_iterations): # Loop over the number of iterations\n", " for i in range(m): # Loop over the training examples\n", " \n", " ### START CODE HERE ### (β‰ˆ 4 lines of code)\n", " # Average the word vectors of the words from the i'th training example\n", " avg = sentence_to_avg(X[i], word_to_vec_map)\n", "\n", " # Forward propagate the avg through the softmax layer\n", " z = np.dot(W, avg) + b\n", " a = softmax(z)\n", "\n", " # Compute cost using the i'th training label's one hot representation and \"A\" (the output of the softmax)\n", " cost = -np.sum(Y_oh[i] * np.log(a))\n", " ### END CODE HERE ###\n", " \n", " # Compute gradients \n", " dz = a - Y_oh[i]\n", " dW = np.dot(dz.reshape(n_y,1), avg.reshape(1, n_h))\n", " db = dz\n", "\n", " # Update parameters with Stochastic Gradient Descent\n", " W = W - learning_rate * dW\n", " b = b - learning_rate * db\n", " \n", " if t % 100 == 0:\n", " print(\"Epoch: \" + str(t) + \" --- cost = \" + str(cost))\n", " pred = predict(X, Y, W, b, word_to_vec_map) #predict is defined in emo_utils.py\n", "\n", " return pred, W, b" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(132,)\n", "(132,)\n", "(132, 5)\n", "never talk to me again\n", "\n", "(20,)\n", "(20,)\n", "(132, 5)\n", "\n" ] } ], "source": [ "print(X_train.shape)\n", "print(Y_train.shape)\n", "print(np.eye(5)[Y_train.reshape(-1)].shape)\n", "print(X_train[0])\n", "print(type(X_train))\n", "Y = np.asarray([5,0,0,5, 4, 4, 4, 6, 6, 4, 1, 1, 5, 6, 6, 3, 6, 3, 4, 4])\n", "print(Y.shape)\n", "\n", "X = np.asarray(['I am going to the bar tonight', 'I love you', 'miss you my dear',\n", " 'Lets go party and drinks','Congrats on the new job','Congratulations',\n", " 'I am so happy for you', 'Why are you feeling bad', 'What is wrong with you',\n", " 'You totally deserve this prize', 'Let us go play football',\n", " 'Are you down for football this afternoon', 'Work hard play harder',\n", " 'It is suprising how people can be dumb sometimes',\n", " 'I am very disappointed','It is the 
best day in my life',\n", " 'I think I will end up alone','My life is so boring','Good job',\n", " 'Great so awesome'])\n", "\n", "print(X.shape)\n", "print(np.eye(5)[Y_train.reshape(-1)].shape)\n", "print(type(X_train))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the next cell to train your model and learn the softmax parameters (W,b). " ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 0 --- cost = 1.95204988128\n", "Accuracy: 0.348484848485\n", "Epoch: 100 --- cost = 0.0797181872601\n", "Accuracy: 0.931818181818\n", "Epoch: 200 --- cost = 0.0445636924368\n", "Accuracy: 0.954545454545\n", "Epoch: 300 --- cost = 0.0343226737879\n", "Accuracy: 0.969696969697\n", "[[ 3.]\n", " [ 2.]\n", " [ 3.]\n", " [ 0.]\n", " [ 4.]\n", " [ 0.]\n", " [ 3.]\n", " [ 2.]\n", " [ 3.]\n", " [ 1.]\n", " [ 3.]\n", " [ 3.]\n", " [ 1.]\n", " [ 3.]\n", " [ 2.]\n", " [ 3.]\n", " [ 2.]\n", " [ 3.]\n", " [ 1.]\n", " [ 2.]\n", " [ 3.]\n", " [ 0.]\n", " [ 2.]\n", " [ 2.]\n", " [ 2.]\n", " [ 1.]\n", " [ 4.]\n", " [ 3.]\n", " [ 3.]\n", " [ 4.]\n", " [ 0.]\n", " [ 3.]\n", " [ 4.]\n", " [ 2.]\n", " [ 0.]\n", " [ 3.]\n", " [ 2.]\n", " [ 2.]\n", " [ 3.]\n", " [ 4.]\n", " [ 2.]\n", " [ 2.]\n", " [ 0.]\n", " [ 2.]\n", " [ 3.]\n", " [ 0.]\n", " [ 3.]\n", " [ 2.]\n", " [ 4.]\n", " [ 3.]\n", " [ 0.]\n", " [ 3.]\n", " [ 3.]\n", " [ 3.]\n", " [ 4.]\n", " [ 2.]\n", " [ 1.]\n", " [ 1.]\n", " [ 1.]\n", " [ 2.]\n", " [ 3.]\n", " [ 1.]\n", " [ 0.]\n", " [ 0.]\n", " [ 0.]\n", " [ 3.]\n", " [ 4.]\n", " [ 4.]\n", " [ 2.]\n", " [ 2.]\n", " [ 1.]\n", " [ 2.]\n", " [ 0.]\n", " [ 3.]\n", " [ 2.]\n", " [ 2.]\n", " [ 0.]\n", " [ 3.]\n", " [ 3.]\n", " [ 1.]\n", " [ 2.]\n", " [ 1.]\n", " [ 2.]\n", " [ 2.]\n", " [ 4.]\n", " [ 3.]\n", " [ 3.]\n", " [ 2.]\n", " [ 4.]\n", " [ 0.]\n", " [ 0.]\n", " [ 3.]\n", " [ 3.]\n", " [ 3.]\n", " [ 3.]\n", " [ 2.]\n", " [ 0.]\n", " [ 1.]\n", " [ 2.]\n", " [ 3.]\n", " [ 0.]\n", " [ 2.]\n", " [ 2.]\n", " [ 2.]\n", " [ 3.]\n", " [ 2.]\n", " [ 2.]\n", " [ 2.]\n", " [ 4.]\n", " [ 1.]\n", " [ 1.]\n", " [ 3.]\n", " [ 3.]\n", " [ 4.]\n", " [ 1.]\n", " [ 2.]\n", " [ 1.]\n", " [ 1.]\n", " [ 3.]\n", " [ 1.]\n", " [ 0.]\n", " [ 4.]\n", " [ 0.]\n", " [ 3.]\n", " [ 3.]\n", " [ 4.]\n", " [ 4.]\n", " [ 1.]\n", " [ 4.]\n", " [ 3.]\n", " [ 0.]\n", " [ 2.]]\n" ] } ], "source": [ "pred, W, b = model(X_train, Y_train, word_to_vec_map)\n", "print(pred)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected Output** (on a subset of iterations):\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", " **Epoch: 0**\n", " \n", " cost = 1.95204988128\n", " \n", " Accuracy: 0.348484848485\n", "
\n", " **Epoch: 100**\n", " \n", " cost = 0.0797181872601\n", " \n", " Accuracy: 0.931818181818\n", "
\n", " **Epoch: 200**\n", " \n", " cost = 0.0445636924368\n", " \n", " Accuracy: 0.954545454545\n", "
\n", " **Epoch: 300**\n", " \n", " cost = 0.0343226737879\n", " \n", " Accuracy: 0.969696969697\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! Your model has pretty high accuracy on the training set. Lets now see how it does on the test set. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.4 - Examining test set performance \n", "\n", "* Note that the `predict` function used here is defined in emo_util.spy." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training set:\n", "Accuracy: 0.977272727273\n", "Test set:\n", "Accuracy: 0.857142857143\n" ] } ], "source": [ "print(\"Training set:\")\n", "pred_train = predict(X_train, Y_train, W, b, word_to_vec_map)\n", "print('Test set:')\n", "pred_test = predict(X_test, Y_test, W, b, word_to_vec_map)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected Output**:\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", " **Train set accuracy**\n", " \n", " 97.7\n", "
\n", " **Test set accuracy**\n", " \n", " 85.7\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Random guessing would have had 20% accuracy given that there are 5 classes. (1/5 = 20%).\n", "* This is pretty good performance after training on only 127 examples. \n", "\n", "\n", "#### The model matches emojis to relevant words\n", "In the training set, the algorithm saw the sentence \n", ">\"*I love you*\" \n", "\n", "with the label ❀️. \n", "* You can check that the word \"adore\" does not appear in the training set. \n", "* Nonetheless, lets see what happens if you write \"*I adore you*.\"\n", "\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy: 0.833333333333\n", "\n", "i adore you ❀️\n", "i love you ❀️\n", "funny lol πŸ˜„\n", "lets play with a ball ⚾\n", "food is ready 🍴\n", "not feeling happy πŸ˜„\n" ] } ], "source": [ "X_my_sentences = np.array([\"i adore you\", \"i love you\", \"funny lol\", \"lets play with a ball\", \"food is ready\", \"not feeling happy\"])\n", "Y_my_labels = np.array([[0], [0], [2], [1], [4],[3]])\n", "\n", "pred = predict(X_my_sentences, Y_my_labels , W, b, word_to_vec_map)\n", "print_predictions(X_my_sentences, pred)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Amazing! \n", "* Because *adore* has a similar embedding as *love*, the algorithm has generalized correctly even to a word it has never seen before. \n", "* Words such as *heart*, *dear*, *beloved* or *adore* have embedding vectors similar to *love*. \n", " * Feel free to modify the inputs above and try out a variety of input sentences. \n", " * How well does it work?\n", "\n", "#### Word ordering isn't considered in this model\n", "* Note that the model doesn't get the following sentence correct:\n", ">\"not feeling happy\" \n", "\n", "* This algorithm ignores word ordering, so is not good at understanding phrases like \"not happy.\" \n", "\n", "#### Confusion matrix\n", "* Printing the confusion matrix can also help understand which classes are more difficult for your model. \n", "* A confusion matrix shows how often an example whose label is one class (\"actual\" class) is mislabeled by the algorithm with a different class (\"predicted\" class)." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(56,)\n", " ❀️ ⚾ πŸ˜„ 😞 🍴\n", "Predicted 0.0 1.0 2.0 3.0 4.0 All\n", "Actual \n", "0 6 0 0 1 0 7\n", "1 0 8 0 0 0 8\n", "2 2 0 16 0 0 18\n", "3 1 1 2 12 0 16\n", "4 0 0 1 0 6 7\n", "All 9 9 19 13 6 56\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQwAAAD3CAYAAADormr9AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGFhJREFUeJzt3X20XXV95/H35+YZE4SQGAMEw0gEM1RRaeoS7eJBKYiF\nDHQYcdFGZQbaGVaham1wpku7nA6oLXVY1dZYH4KISlUEHR6apjxanhJAAkRMFoYFmYSQICKUkAl8\n5o+9Lxyuuffsc+952Ofez2uts+7Z++yzv7997z3f89u/396/n2wTEVHFQK8LEBH9IwkjIipLwoiI\nypIwIqKyJIyIqCwJIyIqS8KIiMqSMCKisiSMiKhscq8L0EmSlgBTgN227+hRGQZsv9iFOD051okU\nV5I8wS+NHrc1DEm/A1wNnAR8S9K5kmZ2Ie5Jkv5C0oWS9utSsujVsU6ouMDUMn5XPjeS3MLjum6U\nCdvj6gEImAZ8HTi9XHcEsAr4GLBXB2P/FvBz4APA3wM/Bt4BTBlPxzrR4pZxFgHfBV5XLg90KlZD\nzMoJA1jT6fLYHn81DBeeB9YDb5I00/a9wPnAe4EPdTD84cA/2b7c9h8C3wM+DrwN2v/N1KtjnWhx\nS1uBR4ALJS2w/WI3ahqSKj26ZdwljAb3AfsBr5c02fYDwJ8CH5H05g7FvAuYIekwANsXA7cCfyNp\nH3fu9KQXxzoh4kr6DUlX2v4V8ClgE/DX3UoaSRgdpvK3Z/ta4Bngj4HDy2+jtcB1FFXbTtgK7Abe\nI2lOWY6/Au4HzulQzF4da9fjSprUg7ibKE4NvlMmjQuBjXQhaUhiYGCg0qNbVJ4r9TVJhwKzgTXA\ni7ZfaHjtM8As4HngUeCjwFG2N7Up9qQh8d4CfBq4HrjR9jpJy8tyfbYN8Q4B9gHut71zyGsdO1ZJ\n/x6YA6y3va2Lcd8JHGz7G+XyFNv/rwtxX2t7a/l8GvA1YJrt0yTNAi4AFgKfaNf/0lADAwOeMmVK\npW137dq11vaRnShHo75PGJJOBf4XsLl8rAG+bvvphm2OAd4EvAH4gu0H2xD3DbZ/Vj6fZPuFwW63\nMmmcQ/HBNrAEWGp73Rhjvo/iWHdQ1Gb+0vb9Qz5EnTjWE4HPAA9TdGWebXtzeTqwuxNxy2/tvYA7\nKGoNl9j++/K1aWVbRqeO9zDgQeB/UyTIFZJeBXwemGt7aZk0Pg3sTfH72D3WuEMNDAx46tSplbZ9\n/vnnkzCakTQFuIzin+nHkk4D3g7sAj5r+5dDtp/cjj9s+cG9AviB7Q+U6waTxkBZTZ0D7Av8JnCb\n7Z+PMeY7gK8AH7B9j6QvAtNtf7h8/RXXe7TxWI8GVgBn2r5T0pUUH8x/HhqznXEb9vdx4AXgzcA9\ntv9mmO3aFlfSgcC3gR8Cx1Ek5+8A64A/AQ4qaxp7U9Q6nmhH3KEGBgY8bdq0Stvu3LmzKwljPLRh\n7E3R5QVwJfAjim/BMwAkvV3SSeXrL/z621tTftOcS9Eyv0vSZQBlspjc8AHabXtD2WMypmTR4DO2\n7ymffxKYXVaXKZPUb5bJDNpwrKXHgXPKZPFaiq7jcyV9CfgDgDJu237HQ+wGFgArgSWSLpZ0YRn3\nnZ2Ia/sx4E7grRS9L9cC/wW4lCJpL5B0ie2nO5UsBqXRs43KavjFwKmS3lV+WG8F7gXeVX6YDgLu\nLrcfc3XK9rPAh4HLKfr+pzckjcHq+ZuBMyVNV/v+mncA3y/3P4nieoTXUSTMwW/FwyhOydpyrOV+\n1tu+oVw8C/ii7aXAbcB7JS0ADqaNv+MhrgK22l5NcWx/BLy6fO217Y7b8PdaTnE6OQfYQnHaswH4\nc4pGzy+2I16TstQuYfT1KQmApOnAf6b4g15m++Zy/Q0U34w/63D8/Siq7M/ZPlPSmyhqPLcMbRxs\nY8zJwHTgKtvHSToTeAvwqbIlvyskXQt8xPb6DsbYH/hL4F8prmn5BkWb0BW2L+1QTFHUUv8c+HcU\n19Est/0DSYuA7bZ/0YnYjSZNmuQZM2ZU2vbZZ5/tyilJ399LYnunpG9SfBtcUDZYPQ+8BvjliG9u\nT/wdks4BPifpIYpa2293KlmUMXcDz0h6tKyeHw98qJPJYrBBt2H5NIrfcUc/OLb/r6RHKT68/832\nD8uGzo0djGlePt28iaLN5gflaxs6FXdPutllWkXfJwwA27+Q9GWKlu1zgJ0UjXSPdyn+dkn3AScC\n77G9pZPxGr4B31X+PK7T/8iDyaI8zTsT+Ajwnwa7HjvsyxS1qbXl8k1DG1s7wfZDZZf4Qkl72f63\nTsccqpunG1WMi4QBYHsXcIOkm4vFzv9DDZK0L0Xj2PFj7TqtouEb8NPAXV3+1nuR4pz+VNsPdSOg\n7UeBRwdrOd382wK3A6d2Md5Lut0+UUXft2HUhaTpHnIhVRdiTvjbrbuhV7WLyZMne9asWZW2feqp\np9KG0U+6nSzKmEkWXdCLZDGobjWMJIyIGkvCiIjKkjAiohKVd6vWSb1K0wGSzp4IMRN3fMat25We\n4z5hAL34p+rJP3Lijr+47UwYkjZJWifpXklrynWzJa2StKH8ue9I+5gICSOib3WghnGM7SMaumCX\nA6ttLwJWl8vDl6cfeuZmz57tBQsWjOq9O3bsYL/99hvVe6sOXjLUE088wdy5c0f13rEYS9yx/B9s\n376dOXPmjOq9Y6lOj+V4d+3aNeq4o/2feuyxx3jyyScrH/DUqVNd9fe6ZcuWptdhSNoEHGl7e8O6\nh4CjbW+RNJ9i0KdDh9tHXzR6LliwgGuuuabrcQ844ICux+yV3bvbPv5LJZMn9+ZfcNOmTV2PefLJ\nJ7f8nja3Txj4Z0kvAF+yvQKY13Arw1Zg3kg76IuEETFRtZAw5gy2S5RWlAmh0TtdjJT2GmCVpJ82\nvmh7cMqCYSVhRNRYC92q25udktjeXP7cpmLktCXA45LmN5ySjHiXdRo9I2qqnQPoSHqVinFIB0eN\nO55iNPurgWXlZssoBiwaVmoYETXWxjaMecCV5f4mA5fbvk7SXcAVks6imKjp9JF2koQRUWPtShi2\nH6YYSHno+h0UAx1XkoQRUWO5lyQiKkvCiIhK6njzWRJGRI3VrYbRk/Ql6QRJD0na
WA6yGhF7MOHv\nVlUxCc8XKEbYXgycIWlxt8sR0Q8mfMKguLpso+2Hy5G+vw2c0oNyRNRaOy/capdeJIwDgEcblh8r\n10XEEHVLGLVt9CxHNTobJtZdoxGN0ugJmylm4x50YLnuFWyvsH2k7SNHO55FRL8bGBio9OhaeboW\n6WV3AYskHSxpKvB+ihtgIqJBHdswun5KYnu3pHOB64FJwFdtP9DtckT0g7qdkvSkDcP2NUD3h9CK\n6DNJGBFRWRJGRFSWhBERlXS7QbOKJIyIGsvdqhFRWWoYEVFZEkZEVJI2jIhoSRJGRFSWhDEKU6ZM\n6ckdqxs3bux6TIBDDjmk6zF7Ncdpr/RiLtnRTHidhBERlWQQ4IhoSWoYEVFZEkZEVJaEERGVJWFE\nRCW5cCsiWlK3hFGvPpuIeIV2DgIsaZKkeyT9qFyeLWmVpA3lz32blmeMxxMRHdTmQYDPA9Y3LC8H\nVtteBKwul0eUhBFRU+0cNVzSgcBJwD80rD4FWFk+XwksbbaftGFE1Fgb2zA+D3wcmNWwbp7tLeXz\nrcC8Zjvp1eztX5W0TdL9vYgf0S9aqGHMkbSm4XF2wz7eB2yzvXa4OC5udGl6s0uvahhfB/4WuLRH\n8SP6Qgs1jO22jxzmtaOAkyW9F5gO7C3pMuBxSfNtb5E0H9jWLEhPahi2bwae7EXsiH4xePPZWHtJ\nbF9g+0DbCylmGvwX22dSzDi4rNxsGXBVszKlDSOixjp8HcZFwBWSzgIeAU5v9obaJozG2dsPOuig\nHpcmojfanTBs3wjcWD7fARzXyvtr263aOHv73Llze12ciJ6Y8JMxR0R1uTQckPQt4DbgUEmPledQ\nEdGgnRdutUuvZm8/oxdxI/pN3WoYOSWJqLGM6RkRlWQ8jIhoSRJGRFSWhBERlSVhRERlSRgRUUka\nPSOiJelWjYjKUsMYhRdffJHnnnuu63F7MYs6wLXXXtv1mCeeeGLXY/bSfffd1/WYo/kfTsKIiErS\nhhERLUnCiIjKkjAiorIkjIioZHAQ4DpJwoiosdQwIqKyJIyIqCwJIyIqS8KIiErqeOFW15tgJS2Q\ndIOkByU9IOm8bpchol9k1HDYDXzU9t2SZgFrJa2y/WAPyhJRaxO+W9X2FmBL+fxXktYDBwBJGBFD\n1O2UpKdtGJIWAm8B7uhlOSLqqI5tGD1LGJJmAt8Dzrf99B5ef2ky5gULFnS5dBH1ULeE0aupEqdQ\nJItv2v7+nrZpnIx5zpw53S1gRE30TaOnpB8CHu512yePJqCKo/sKsN72xaPZR8REUbcaxkinJH/V\noZhHAb8PrJN0b7nuE7av6VC8iL7UrpvPJE0HbgamUXzmv2v7k5JmA98BFgKbgNNt/2KkfQ2bMGzf\nNOaS7nm/twL1SpsRNdWmGsbzwLG2nymbA26VdC1wKrDa9kWSlgPLgT8baUdN05ekRZK+W15o9fDg\nox1HEREja0cbhgvPlItTyoeBU4CV5fqVwNJm5alS3/ka8HcUF1wdA1wKXFbhfRExRu1q9JQ0qWwC\n2Aassn0HMK+8LgpgKzCv2X6qJIwZtlcDsv2I7U8BJ1V4X0SMUQsJY46kNQ2Psxv3Y/sF20cABwJL\nJB0+5HUzQifHoCrXYTwvaQDYIOlcYDMws+LxRsQotdhlut32kc02sv2UpBuAE4DHJc23vUXSfIra\nx4iq1DDOA/YC/hh4G0UPx7IK74uIMWrHKYmkuZL2KZ/PAN4D/BS4mpc/y8uAq5qVp2kNw/Zd5dNn\ngA812z4i2qdNN5/NB1ZKmkRRSbjC9o8k3QZcIeks4BHg9GY7apowyurLr53b2D625WJHREva0a1q\n+z6Ke7aGrt8BHNfKvqq0YXys4fl04DSKHpOI6KC+vPnM9tohq34s6c4OlSciGvRdwigvHx00QNHw\n+eqOlWjPZWDKlCndDAnA7t29qUgdffTRXY955529+Q5YsmRJT+LOmDGj6zFH8+Hvu4QBrKVowxDF\nqcjPgbM6WaiIKPRjwnij7Z2NKyRN61B5IqJB3RJGlT6bf93DutvaXZCIeKXBu1WrPLplpPEwXksx\n1uYMSW/h5TtM96a4kCsiOqxuNYyRTkl+B/ggxbXnf83LCeNp4BOdLVZEQB8lDNsrKa4OO83297pY\npogo1S1hVDn5edvgdegAkvaV9D87WKaIoPp9JN1MKlUSxom2nxpcKIfwem/nihQRg+qWMKp0q06S\nNM328/DS3W7pVo3ogrqdklRJGN8EVkv6GkXD5wd5eViviOigvpsq0fZnJP0EeDfFFZ/XA6/rdMEi\nJrq+vPms9DhFsviPFJeGj7rXZLghz0e7v4jxrG8ShqQ3AGeUj+0U8xfI9jFjjLnHIc9t3z7G/UaM\nO32TMCiG8LoFeJ/tjQCS/mSsAcvBRvc05HlEDFG3hDFSi8qpwBbgBklflnQcbZqAaJghzyNiiLp1\nqw6bMGz/wPb7gcOAG4DzgddI+jtJx48laLMhz6GYvX1wyPTt27ePJVxEX+rLC7dsP2v7ctu/S/EB\nv4cm06lVVV4QNjjk+dDXMnt7THh1u1u1pUi2f1F+kFsaOLTRCEOeR8QQdathVO1Wbac9Dnneg3JE\n1F7dGj27njCGG/I8Il6pny/ciogeSMKIiMqSMCKisr67+SwieiNtGBHRkiSMiKgsCSMiKkvCiIjK\n6pYw6tUEGxEvadfNZ5IWSLpB0oOSHpB0Xrl+tqRVkjaUP/dtVqa+qGFIYvLkvihq3+rVLOqbN2/u\nSdw3vvGNXY85mhnj29Stuhv4qO27Jc0C1kpaRTE+72rbF0laDiynyY2lqWFE1Fg7ahi2t9i+u3z+\nK2A9xTSop/DygN4rgaXNypOv7Yia6sR1GJIWUtzLdQcwz/aW8qWtwLxm70/CiKixFhLGHElrGpZX\n2F4xZF8zKQbwPt/20437tm1JTYfKTMKIqLEWEsZ220eOsJ8pFMnim7a/X65+XNJ821skzacYMnNE\nacOIqLE29ZII+Aqw3vbFDS9dDSwrny8DrmpWntQwImqsTW0YRwG/D6wrB98G+ARwEXCFpLOAR4DT\nm+0oCSOipiS1pVvV9q0MP+J/S8NtJmFE1FjdrvRMwoiosSSMiKgsCSMiKqnjADo961Ytp0u8R1Km\nGIgYRuYledl5FNe0793DMkTUWmoYgKQDgZOAf+hF/Ih+UbepEntVw/g88HFgVo/iR9Re2jAASe8D\nttle22S7l2Zvf+KJJ7pUuoh6qVsbRi9OSY4CTpa0Cfg2cKyky4Zu1Dh7+9y5c7tdxohamPAJw/YF\ntg+0vRB4P/Avts/sdjki+kHdEkauw4iosbq1YfQ0Ydi+Ebixl2WIqKs6NnqmhhFRY5lbNSIqSw0j\nIipLwoiIStKGEREtScKIiMqSMCKisvSSREQlacOIiJYkYYzCzp07Wb9+fa+L0TXr1q3resz999+/\n6zEBDj744AkVt1VJGBFRWRJGRFSWhBE
RlaTRMyJakm7ViKgsNYyIqCwJIyIqSRtGRLSkbgmjXi0q\nEfEK7RoEWNJXJW2TdH/DutmSVknaUP7ct9l+kjAiaqyNo4Z/HThhyLrlwGrbi4DV5fKIkjAiakpS\n26ZKtH0z8OSQ1acAK8vnK4GlzfbT0YQhaakkSzqsXF44WCWSdHRmbo8YWYfnJZlne0v5fCswr9kb\nOl3DOAO4tfwZES1qIWHMGZxatHyc3Uoc2wbcbLuO9ZJImgm8EzgG+CHwyU7FihivWqg9bLd9ZIu7\nf1zSfNtbJM0HtjV7QydrGKcA19n+GbBD0ts6GCtiXOrwKcnVwLLy+TLgqmZv6GTCOINismXKny2d\nlqhh9vYnnxzaVhMx/lVNFhW7Vb8F3AYcKukxSWcBFwHvkbQBeHe5PKKOnJJImg0cC/yGJAOTKM6P\nvlB1H7ZXACsADj/88KbnVhHjUbsu3LI93Bf2ca3sp1NtGL8HfMP2OYMrJN0ELOhQvIhxqW53q3aq\nNGcAVw5Z9z3ggg7FixiXOtyG0bKO1DBsH7OHdZcAlzQs30hmbo8YVm4+i4iWJGFERGVJGBFRWRJG\nRFSWhBERlQzerVonSRgRNZYaRkRUloQREZUlYUREJblwa5QeeOCB7YsXL35klG+fA2xvZ3lqGjNx\n6x/3da2+IQljFGzPHe17Ja0ZxcAiY9KLmIk7PuMmYUREZelWjYhK0obRGysmSMzEHYdx65Yw6lXf\n6YBy5K5xEVPSC5LulXS/pH+UtNdo4zZO8yDpZEnDTmIjaR9J/3W414eLK+lTkj5WtUyt6sXftttx\n6zYexrhPGOPMc7aPsH04sAv4w8YXVWj5b2r7atsjjee4DzBswojOScKIdrkFOETF5FAPSboUuB9Y\nIOl4SbdJurusicwEkHSCpJ9Kuhs4dXBHkj4o6W/L5/MkXSnpJ+XjHRSDw76+rN18rtzuTyXdJek+\nSX/RsK//Lulnkm4FDu3ab2OcqlvCmAhtGOOOpMnAicB15apFwDLbt0uaA/wP4N22n5X0Z8BHJH0W\n+DLF4Mwbge8Ms/tLgJts/wdJk4CZFHNuHm77iDL+8WXMJYCAqyX9NvAs8H7gCIr/rbuBte09+okj\nN5/FWM2QdG/5/BbgK8D+wCO2by/Xvx1YDPy4/OaZSjG8/GHAz21vAJB0GbCn2bGOBf4AwPYLwC/1\n67N6H18+7imXZ1IkkFnAlbb/rYxx9ZiONmrX6JmE0V+eG/yWH1T+Qz3buApYNXRYeUmveN8YCbjQ\n9peGxDi/jTGC+iWMetV3oh1uB46SdAiApFdJegPwU2ChpNeX2w03T8Vq4I/K906S9GrgVxS1h0HX\nAx9uaBs5QNJrgJuBpZJmSJoF/G6bj21Cqdp+kUbPGDXbTwAfBL4l6T7K0xHbOylOQf5P2eg53Dya\n5wHHSFpH0f6w2PYOilOc+yV9zvY/AZcDt5XbfReYZftuiraRnwDXAnd17EAniLolDBWTNkdE3bz1\nrW/1LbfcUmnbmTNnru3G/S1pw4iosbq1YSRhRNRUulUjoiWpYUREZUkYEVFZ3RJGvU6QIuIV2tWt\nWt5H9JCkjRrhzuRmkjAiaqpdF26V9wR9geL+o8XAGZIWj6ZMSRgRNdamGsYSYKPth23vAr4NnDKa\n8qQNI6LG2tStegDwaMPyY8BvjWZHSRgRNbV27drry+EKqpguaU3D8opOjAyWhBFRU7ZPaNOuNgML\nGpYPLNe1LG0YEePfXcAiSQdLmkoxyNGoxipJDSNinLO9W9K5FMMSTAK+avuB0ewrd6tGRGU5JYmI\nypIwIqKyJIyIqCwJIyIqS8KIiMqSMCKisiSMiKgsCSMiKvv/8Xv3V7kAUaMAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "print(Y_test.shape)\n", "print(' '+ label_to_emoji(0)+ ' ' + label_to_emoji(1) + ' ' + label_to_emoji(2)+ ' ' + label_to_emoji(3)+' ' + label_to_emoji(4))\n", "print(pd.crosstab(Y_test, pred_test.reshape(56,), rownames=['Actual'], colnames=['Predicted'], margins=True))\n", "plot_confusion_matrix(Y_test, pred_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## What you should remember from this section\n", "- Even with a 127 training examples, you can get a reasonably good model for Emojifying. \n", " - This is due to the generalization power word vectors gives you. \n", "- Emojify-V1 will perform poorly on sentences such as *\"This movie is not good and not enjoyable\"* \n", " - It doesn't understand combinations of words.\n", " - It just averages all the words' embedding vectors together, without considering the ordering of words. \n", " \n", "**You will build a better algorithm in the next section!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2 - Emojifier-V2: Using LSTMs in Keras: \n", "\n", "Let's build an LSTM model that takes word **sequences** as input!\n", "* This model will be able to account for the word ordering. \n", "* Emojifier-V2 will continue to use pre-trained word embeddings to represent words.\n", "* We will feed word embeddings into an LSTM.\n", "* The LSTM will learn to predict the most appropriate emoji. \n", "\n", "Run the following cell to load the Keras packages." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "import numpy as np\n", "np.random.seed(0)\n", "from keras.models import Model\n", "from keras.layers import Dense, Input, Dropout, LSTM, Activation\n", "from keras.layers.embeddings import Embedding\n", "from keras.preprocessing import sequence\n", "from keras.initializers import glorot_uniform\n", "np.random.seed(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 - Overview of the model\n", "\n", "Here is the Emojifier-v2 you will implement:\n", "\n", "
\n", "
**Figure 3**: Emojifier-V2. A 2-layer LSTM sequence classifier.
\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 Keras and mini-batching \n", "\n", "* In this exercise, we want to train Keras using mini-batches. \n", "* However, most deep learning frameworks require that all sequences in the same mini-batch have the **same length**. \n", " * This is what allows vectorization to work: If you had a 3-word sentence and a 4-word sentence, then the computations needed for them are different (one takes 3 steps of an LSTM, one takes 4 steps) so it's just not possible to do them both at the same time.\n", " \n", "#### Padding handles sequences of varying length\n", "* The common solution to handling sequences of **different length** is to use padding. Specifically:\n", " * Set a maximum sequence length\n", " * Pad all sequences to have the same length. \n", " \n", "##### Example of padding\n", "* Given a maximum sequence length of 20, we could pad every sentence with \"0\"s so that each input sentence is of length 20. \n", "* Thus, the sentence \"I love you\" would be represented as $(e_{I}, e_{love}, e_{you}, \\vec{0}, \\vec{0}, \\ldots, \\vec{0})$. \n", "* In this example, any sentences longer than 20 words would have to be truncated. \n", "* One way to choose the maximum sequence length is to just pick the length of the longest sentence in the training set. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.3 - The Embedding layer\n", "\n", "* In Keras, the embedding matrix is represented as a \"layer\".\n", "* The embedding matrix maps word indices to embedding vectors.\n", " * The word indices are positive integers.\n", " * The embedding vectors are dense vectors of fixed size.\n", " * When we say a vector is \"dense\", in this context, it means that most of the values are non-zero. As a counter-example, a one-hot encoded vector is not \"dense.\"\n", "* The embedding matrix can be derived in two ways:\n", " * Training a model to derive the embeddings from scratch. \n", " * Using a pretrained embedding\n", " \n", "#### Using and updating pre-trained embeddings\n", "* In this part, you will learn how to create an [Embedding()](https://keras.io/layers/embeddings/) layer in Keras\n", "* You will initialize the Embedding layer with the GloVe 50-dimensional vectors. \n", "* In the code below, we'll show you how Keras allows you to either train or leave fixed this layer. \n", "* Because our training set is quite small, we will leave the GloVe embeddings fixed instead of updating them.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Inputs and outputs to the embedding layer\n", "\n", "* The `Embedding()` layer's input is an integer matrix of size **(batch size, max input length)**. \n", " * This input corresponds to sentences converted into lists of indices (integers).\n", " * The largest integer (the highest word index) in the input should be no larger than the vocabulary size.\n", "* The embedding layer outputs an array of shape (batch size, max input length, dimension of word vectors).\n", "\n", "* The figure shows the propagation of two example sentences through the embedding layer. \n", " * Both examples have been zero-padded to a length of `max_len=5`.\n", " * The word embeddings are 50 units in length.\n", " * The final dimension of the representation is `(2,max_len,50)`. \n", "\n", "\n", "
**Figure 4**: Embedding layer
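\n", "\n", "To tie the padding and the shapes together, here is a minimal sketch; the index values are the ones produced for 'funny lol' and 'lets play baseball' in the `sentences_to_indices` example further below, and the exact numbers depend on the GloVe vocabulary:\n", "```Python\n", "# Two sentences zero-padded to max_len = 5  ->  input shape (2, 5)\n", "indices = np.array([[155345, 225122,     0, 0, 0],   # 'funny lol'\n", "                    [220930, 286375, 69714, 0, 0]])  # 'lets play baseball'\n", "\n", "# Passing these through the 50-dimensional embedding layer gives an output\n", "# of shape (2, 5, 50): one GloVe vector per (sentence, position) pair.\n", "```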
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Prepare the input sentences\n", "**Exercise**: \n", "* Implement `sentences_to_indices`, which processes an array of sentences (X) and returns inputs to the embedding layer:\n", " * Convert each training sentences into a list of indices (the indices correspond to each word in the sentence)\n", " * Zero-pad all these lists so that their length is the length of the longest sentence.\n", " \n", "##### Additional Hints\n", "* Note that you may have considered using the `enumerate()` function in the for loop, but for the purposes of passing the autograder, please follow the starter code by initializing and incrementing `j` explicitly." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 I\n", "1 like\n", "2 learning\n" ] } ], "source": [ "for idx, val in enumerate([\"I\", \"like\", \"learning\"]):\n", " print(idx,val)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# GRADED FUNCTION: sentences_to_indices\n", "\n", "def sentences_to_indices(X, word_to_index, max_len):\n", " \"\"\"\n", " Converts an array of sentences (strings) into an array of indices corresponding to words in the sentences.\n", " The output shape should be such that it can be given to `Embedding()` (described in Figure 4). \n", " \n", " Arguments:\n", " X -- array of sentences (strings), of shape (m, 1)\n", " word_to_index -- a dictionary containing the each word mapped to its index\n", " max_len -- maximum number of words in a sentence. You can assume every sentence in X is no longer than this. \n", " \n", " Returns:\n", " X_indices -- array of indices corresponding to words in the sentences from X, of shape (m, max_len)\n", " \"\"\"\n", " \n", " m = X.shape[0] # number of training examples\n", " \n", " ### START CODE HERE ###\n", " # Initialize X_indices as a numpy matrix of zeros and the correct shape (β‰ˆ 1 line)\n", " X_indices = np.zeros((m, max_len))\n", " \n", " for i in range(m): # loop over training examples\n", " \n", " # Convert the ith training sentence in lower case and split is into words. You should get a list of words.\n", " sentence_words =X[i].lower().split()\n", " \n", " # Initialize j to 0\n", " j = 0\n", " \n", " # Loop over the words of sentence_words\n", " for w in sentence_words:\n", " # Set the (i,j)th entry of X_indices to the index of the correct word.\n", " X_indices[i, j] = word_to_index[w]\n", " # Increment j to j + 1\n", " j = j + 1\n", " \n", " ### END CODE HERE ###\n", " \n", " return X_indices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the following cell to check what `sentences_to_indices()` does, and check your results." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "X1 = ['funny lol' 'lets play baseball' 'food is ready for you']\n", "X1_indices =\n", " [[ 155345. 225122. 0. 0. 0.]\n", " [ 220930. 286375. 69714. 0. 0.]\n", " [ 151204. 192973. 302254. 151349. 
394475.]]\n" ] } ], "source": [ "X1 = np.array([\"funny lol\", \"lets play baseball\", \"food is ready for you\"])\n", "X1_indices = sentences_to_indices(X1, word_to_index, max_len = 5)\n", "print(\"X1 =\", X1)\n", "print(\"X1_indices =\\n\", X1_indices)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected Output**:\n", "\n", "```Python\n", "X1 = ['funny lol' 'lets play baseball' 'food is ready for you']\n", "X1_indices =\n", " [[ 155345. 225122. 0. 0. 0.]\n", " [ 220930. 286375. 69714. 0. 0.]\n", " [ 151204. 192973. 302254. 151349. 394475.]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Build embedding layer\n", "\n", "* Let's build the `Embedding()` layer in Keras, using pre-trained word vectors. \n", "* The embedding layer takes as input a list of word indices.\n", " * `sentences_to_indices()` creates these word indices.\n", "* The embedding layer will return the word embeddings for a sentence. \n", "\n", "**Exercise**: Implement `pretrained_embedding_layer()` with these steps:\n", "1. Initialize the embedding matrix as a numpy array of zeros.\n", " * The embedding matrix has a row for each unique word in the vocabulary.\n", " * There is one additional row to handle \"unknown\" words.\n", " * So vocab_len is the number of unique words plus one.\n", " * Each row will store the vector representation of one word. \n", " * For example, one row may be 50 positions long if using GloVe word vectors.\n", " * In the code below, `emb_dim` represents the length of a word embedding.\n", "2. Fill in each row of the embedding matrix with the vector representation of a word.\n", " * Each word in `word_to_index` is a string.\n", " * `word_to_vec_map` is a dictionary where the keys are strings and the values are the word vectors.\n", "3. Define the Keras embedding layer. \n", " * Use [Embedding()](https://keras.io/layers/embeddings/). \n", " * The input dimension is equal to the vocabulary length (number of unique words plus one).\n", " * The output dimension is equal to the number of positions in a word embedding.\n", " * Make this layer's embeddings fixed.\n", " * If you were to set `trainable = True`, it would allow the optimization algorithm to modify the values of the word embeddings.\n", " * In this case, we don't want the model to modify the word embeddings.\n", "4. Set the embedding weights to be equal to the embedding matrix.\n", " * Note that this part of the code is already completed for you and does not need to be modified. 
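\n", "\n", "In case you get stuck, here is a minimal sketch of Steps 1-3; the variable names are the ones used in the starter code below:\n", "```Python\n", "emb_matrix = np.zeros((vocab_len, emb_dim))        # Step 1: one row per word (+1 row for unknown words)\n", "for word, index in word_to_index.items():          # Step 2: copy each GloVe vector into its row\n", "    emb_matrix[index, :] = word_to_vec_map[word]\n", "embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)   # Step 3: keep the embeddings fixed\n", "```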
" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# GRADED FUNCTION: pretrained_embedding_layer\n", "\n", "def pretrained_embedding_layer(word_to_vec_map, word_to_index):\n", " \"\"\"\n", " Creates a Keras Embedding() layer and loads in pre-trained GloVe 50-dimensional vectors.\n", " \n", " Arguments:\n", " word_to_vec_map -- dictionary mapping words to their GloVe vector representation.\n", " word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)\n", "\n", " Returns:\n", " embedding_layer -- pretrained layer Keras instance\n", " \"\"\"\n", " \n", " vocab_len = len(word_to_index) + 1 # adding 1 to fit Keras embedding (requirement)\n", " emb_dim = word_to_vec_map[\"cucumber\"].shape[0] # define dimensionality of your GloVe word vectors (= 50)\n", " \n", " ### START CODE HERE ###\n", " # Step 1\n", " # Initialize the embedding matrix as a numpy array of zeros.\n", " # See instructions above to choose the correct shape.\n", " emb_matrix = np.zeros((vocab_len, emb_dim))\n", " \n", " # Step 2\n", " # Set each row \"idx\" of the embedding matrix to be \n", " # the word vector representation of the idx'th word of the vocabulary\n", " for word, index in word_to_index.items():\n", " emb_matrix[index, :] = word_to_vec_map[word]\n", "\n", " # Step 3\n", " # Define Keras embedding layer with the correct input and output sizes\n", " # Make it non-trainable.\n", " embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)\n", " ### END CODE HERE ###\n", "\n", " # Step 4 (already done for you; please do not modify)\n", " # Build the embedding layer, it is required before setting the weights of the embedding layer. \n", " embedding_layer.build((None,)) # Do not modify the \"None\". This line of code is complete as-is.\n", " \n", " # Set the weights of the embedding layer to the embedding matrix. Your layer is now pretrained.\n", " embedding_layer.set_weights([emb_matrix])\n", " \n", " return embedding_layer" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "weights[0][1][3] = -0.3403\n" ] } ], "source": [ "embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)\n", "print(\"weights[0][1][3] =\", embedding_layer.get_weights()[0][1][3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected Output**:\n", "\n", "```Python\n", "weights[0][1][3] = -0.3403\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3 Building the Emojifier-V2\n", "\n", "Lets now build the Emojifier-V2 model. \n", "* You feed the embedding layer's output to an LSTM network. \n", "\n", "
\n", "
**Figure 3**: Emojifier-v2. A 2-layer LSTM sequence classifier.
\n", "\n", "\n", "**Exercise:** Implement `Emojify_V2()`, which builds a Keras graph of the architecture shown in Figure 3. \n", "* The model takes as input an array of sentences of shape (`m`, `max_len`, ) defined by `input_shape`. \n", "* The model outputs a softmax probability vector of shape (`m`, `C = 5`). \n", "\n", "* You may need to use the following Keras layers:\n", " * [Input()](https://keras.io/layers/core/#input)\n", " * Set the `shape` and `dtype` parameters.\n", " * The inputs are integers, so you can specify the data type as a string, 'int32'.\n", " * [LSTM()](https://keras.io/layers/recurrent/#lstm)\n", " * Set the `units` and `return_sequences` parameters.\n", " * [Dropout()](https://keras.io/layers/core/#dropout)\n", " * Set the `rate` parameter.\n", " * [Dense()](https://keras.io/layers/core/#dense)\n", " * Set the `units`, \n", " * Note that `Dense()` has an `activation` parameter. For the purposes of passing the autograder, please do not set the activation within `Dense()`. Use the separate `Activation` layer to do so.\n", " * [Activation()](https://keras.io/activations/).\n", " * You can pass in the activation of your choice as a lowercase string.\n", " * [Model](https://keras.io/models/model/)\n", " Set `inputs` and `outputs`.\n", "\n", "\n", "#### Additional Hints\n", "* Remember that these Keras layers return an object, and you will feed in the outputs of the previous layer as the input arguments to that object. The returned object can be created and called in the same line.\n", "\n", "```Python\n", "# How to use Keras layers in two lines of code\n", "dense_object = Dense(units = ...)\n", "X = dense_object(inputs)\n", "\n", "# How to use Keras layers in one line of code\n", "X = Dense(units = ...)(inputs)\n", "```\n", "\n", "* The `embedding_layer` that is returned by `pretrained_embedding_layer` is a layer object that can be called as a function, passing in a single argument (sentence indices).\n", "\n", "* Here is some sample code in case you're stuck\n", "```Python\n", "raw_inputs = Input(shape=(maxLen,), dtype='int32')\n", "preprocessed_inputs = ... 
# some pre-processing\n", "X = LSTM(units = ..., return_sequences= ...)(processed_inputs)\n", "X = Dropout(rate = ..., )(X)\n", "...\n", "X = Dense(units = ...)(X)\n", "X = Activation(...)(X)\n", "model = Model(inputs=..., outputs=...)\n", "...\n", "```\n", "\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# GRADED FUNCTION: Emojify_V2\n", "\n", "def Emojify_V2(input_shape, word_to_vec_map, word_to_index):\n", " \"\"\"\n", " Function creating the Emojify-v2 model's graph.\n", " \n", " Arguments:\n", " input_shape -- shape of the input, usually (max_len,)\n", " word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation\n", " word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)\n", "\n", " Returns:\n", " model -- a model instance in Keras\n", " \"\"\"\n", " \n", " ### START CODE HERE ###\n", " # Define sentence_indices as the input of the graph\n", " # It should be of shape input_shape and dtype 'int32' (as it contains indices).\n", " sentence_indices = Input(input_shape, dtype='int32')\n", " \n", " # Create the embedding layer pretrained with GloVe Vectors (β‰ˆ1 line)\n", " embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)\n", " \n", " # Propagate sentence_indices through your embedding layer, you get back the embeddings\n", " embeddings = embedding_layer(sentence_indices) \n", " \n", " # Propagate the embeddings through an LSTM layer with 128-dimensional hidden state\n", " # Be careful, the returned output should be a batch of sequences.\n", " X = LSTM(128, return_sequences=True)(embeddings)\n", " # Add dropout with a probability of 0.5\n", " X = Dropout(0.5)(X)\n", " # Propagate X trough another LSTM layer with 128-dimensional hidden state\n", " # Be careful, the returned output should be a single hidden state, not a batch of sequences.\n", " X = LSTM(128, return_sequences=False)(X)\n", " # Add dropout with a probability of 0.5\n", " X = Dropout(0.5)(X)\n", " # Propagate X through a Dense layer with softmax activation to get back a batch of 5-dimensional vectors.\n", " X = Dense(5)(X)\n", " # Add a softmax activation\n", " X = Activation('softmax')(X)\n", " \n", " # Create Model instance which converts sentence_indices into X.\n", " model = Model(inputs=sentence_indices, outputs=X)\n", " \n", " ### END CODE HERE ###\n", " \n", " return model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run the following cell to create your model and check its summary. Because all sentences in the dataset are less than 10 words, we chose `max_len = 10`. You should see your architecture, it uses \"20,223,927\" parameters, of which 20,000,050 (the word embeddings) are non-trainable, and the remaining 223,877 are. Because our vocabulary size has 400,001 words (with valid indices from 0 to 400,000) there are 400,001\\*50 = 20,000,050 non-trainable parameters. 
" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "input_1 (InputLayer) (None, 10) 0 \n", "_________________________________________________________________\n", "embedding_2 (Embedding) (None, 10, 50) 20000050 \n", "_________________________________________________________________\n", "lstm_1 (LSTM) (None, 10, 128) 91648 \n", "_________________________________________________________________\n", "dropout_1 (Dropout) (None, 10, 128) 0 \n", "_________________________________________________________________\n", "lstm_2 (LSTM) (None, 128) 131584 \n", "_________________________________________________________________\n", "dropout_2 (Dropout) (None, 128) 0 \n", "_________________________________________________________________\n", "dense_1 (Dense) (None, 5) 645 \n", "_________________________________________________________________\n", "activation_1 (Activation) (None, 5) 0 \n", "=================================================================\n", "Total params: 20,223,927\n", "Trainable params: 223,877\n", "Non-trainable params: 20,000,050\n", "_________________________________________________________________\n" ] } ], "source": [ "model = Emojify_V2((maxLen,), word_to_vec_map, word_to_index)\n", "model.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As usual, after creating your model in Keras, you need to compile it and define what loss, optimizer and metrics your are want to use. Compile your model using `categorical_crossentropy` loss, `adam` optimizer and `['accuracy']` metrics:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's time to train your model. Your Emojifier-V2 `model` takes as input an array of shape (`m`, `max_len`) and outputs probability vectors of shape (`m`, `number of classes`). We thus have to convert X_train (array of sentences as strings) to X_train_indices (array of sentences as list of word indices), and Y_train (labels as indices) to Y_train_oh (labels as one-hot vectors)." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)\n", "Y_train_oh = convert_to_one_hot(Y_train, C = 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit the Keras model on `X_train_indices` and `Y_train_oh`. We will use `epochs = 50` and `batch_size = 32`." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/50\n", "132/132 [==============================] - 0s - loss: 1.6105 - acc: 0.1667 \n", "Epoch 2/50\n", "132/132 [==============================] - 0s - loss: 1.5378 - acc: 0.3106 \n", "Epoch 3/50\n", "132/132 [==============================] - 0s - loss: 1.5061 - acc: 0.3030 \n", "Epoch 4/50\n", "132/132 [==============================] - 0s - loss: 1.4453 - acc: 0.3561 \n", "Epoch 5/50\n", "132/132 [==============================] - 0s - loss: 1.3588 - acc: 0.4167 \n", "Epoch 6/50\n", "132/132 [==============================] - 0s - loss: 1.2436 - acc: 0.5303 \n", "Epoch 7/50\n", "132/132 [==============================] - 0s - loss: 1.1801 - acc: 0.4394 \n", "Epoch 8/50\n", "132/132 [==============================] - 0s - loss: 1.0568 - acc: 0.5758 \n", "Epoch 9/50\n", "132/132 [==============================] - 0s - loss: 0.8762 - acc: 0.6970 \n", "Epoch 10/50\n", "132/132 [==============================] - 0s - loss: 0.8194 - acc: 0.7121 \n", "Epoch 11/50\n", "132/132 [==============================] - 0s - loss: 0.6982 - acc: 0.7500 \n", "Epoch 12/50\n", "132/132 [==============================] - 0s - loss: 0.5978 - acc: 0.8030 \n", "Epoch 13/50\n", "132/132 [==============================] - 0s - loss: 0.4896 - acc: 0.8258 \n", "Epoch 14/50\n", "132/132 [==============================] - 0s - loss: 0.5152 - acc: 0.8106 \n", "Epoch 15/50\n", "132/132 [==============================] - 0s - loss: 0.4853 - acc: 0.8106 \n", "Epoch 16/50\n", "132/132 [==============================] - 0s - loss: 0.3543 - acc: 0.8561 \n", "Epoch 17/50\n", "132/132 [==============================] - 0s - loss: 0.3960 - acc: 0.8636 \n", "Epoch 18/50\n", "132/132 [==============================] - 0s - loss: 0.6225 - acc: 0.8333 \n", "Epoch 19/50\n", "132/132 [==============================] - 0s - loss: 0.5164 - acc: 0.8106 \n", "Epoch 20/50\n", "132/132 [==============================] - 0s - loss: 0.3943 - acc: 0.8485 \n", "Epoch 21/50\n", "132/132 [==============================] - 0s - loss: 0.4644 - acc: 0.8106 \n", "Epoch 22/50\n", "132/132 [==============================] - 0s - loss: 0.3992 - acc: 0.8561 \n", "Epoch 23/50\n", "132/132 [==============================] - 0s - loss: 0.3731 - acc: 0.8409 \n", "Epoch 24/50\n", "132/132 [==============================] - 0s - loss: 0.3110 - acc: 0.8939 \n", "Epoch 25/50\n", "132/132 [==============================] - 0s - loss: 0.3427 - acc: 0.8864 \n", "Epoch 26/50\n", "132/132 [==============================] - 0s - loss: 0.2496 - acc: 0.9242 \n", "Epoch 27/50\n", "132/132 [==============================] - 0s - loss: 0.3091 - acc: 0.8788 \n", "Epoch 28/50\n", "132/132 [==============================] - 0s - loss: 0.2491 - acc: 0.9167 \n", "Epoch 29/50\n", "132/132 [==============================] - 0s - loss: 0.3718 - acc: 0.8636 \n", "Epoch 30/50\n", "132/132 [==============================] - 0s - loss: 0.2553 - acc: 0.9167 \n", "Epoch 31/50\n", "132/132 [==============================] - 0s - loss: 0.2879 - acc: 0.8864 \n", "Epoch 32/50\n", "132/132 [==============================] - 0s - loss: 0.1927 - acc: 0.9318 \n", "Epoch 33/50\n", "132/132 [==============================] - 0s - loss: 0.2038 - acc: 0.9545 \n", "Epoch 34/50\n", "132/132 [==============================] - 0s - loss: 0.1568 - acc: 0.9621 \n", "Epoch 35/50\n", "132/132 [==============================] - 0s - loss: 0.1670 
- acc: 0.9621 \n", "Epoch 36/50\n", "132/132 [==============================] - 0s - loss: 0.1926 - acc: 0.9394 \n", "Epoch 37/50\n", "132/132 [==============================] - 0s - loss: 0.2546 - acc: 0.9167 \n", "Epoch 38/50\n", "132/132 [==============================] - 0s - loss: 0.2600 - acc: 0.9167 \n", "Epoch 39/50\n", "132/132 [==============================] - 0s - loss: 0.1530 - acc: 0.9470 \n", "Epoch 40/50\n", "132/132 [==============================] - 0s - loss: 0.1932 - acc: 0.9318 \n", "Epoch 41/50\n", "132/132 [==============================] - 0s - loss: 0.0918 - acc: 0.9697 \n", "Epoch 42/50\n", "132/132 [==============================] - 0s - loss: 0.1222 - acc: 0.9545 \n", "Epoch 43/50\n", "132/132 [==============================] - 0s - loss: 0.0955 - acc: 0.9621 \n", "Epoch 44/50\n", "132/132 [==============================] - 0s - loss: 0.0590 - acc: 0.9924 \n", "Epoch 45/50\n", "132/132 [==============================] - 0s - loss: 0.0713 - acc: 0.9848 \n", "Epoch 46/50\n", "132/132 [==============================] - 0s - loss: 0.0665 - acc: 0.9924 \n", "Epoch 47/50\n", "132/132 [==============================] - 0s - loss: 0.0580 - acc: 0.9924 \n", "Epoch 48/50\n", "132/132 [==============================] - 0s - loss: 0.0751 - acc: 0.9697 \n", "Epoch 49/50\n", "132/132 [==============================] - 0s - loss: 0.0464 - acc: 0.9924 \n", "Epoch 50/50\n", "132/132 [==============================] - 0s - loss: 0.0418 - acc: 0.9848 \n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.fit(X_train_indices, Y_train_oh, epochs = 50, batch_size = 32, shuffle=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Your model should perform around **90% to 100% accuracy** on the training set. The exact accuracy you get may be a little different. Run the following cell to evaluate your model on the test set. " ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "32/56 [================>.............] - ETA: 0s\n", "Test accuracy = 0.928571428571\n" ] } ], "source": [ "X_test_indices = sentences_to_indices(X_test, word_to_index, max_len = maxLen)\n", "Y_test_oh = convert_to_one_hot(Y_test, C = 5)\n", "loss, acc = model.evaluate(X_test_indices, Y_test_oh)\n", "print()\n", "print(\"Test accuracy = \", acc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You should get a test accuracy between 80% and 95%. Run the cell below to see the mislabelled examples. 
" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Expected emoji:😞 prediction: work is hard\tπŸ˜„\n", "Expected emoji:😞 prediction: This girl is messing with me\t❀️\n", "Expected emoji:❀️ prediction: I love taking breaks\t😞\n", "Expected emoji:πŸ˜„ prediction: you brighten my day\t❀️\n" ] } ], "source": [ "# This code allows you to see the mislabelled examples\n", "C = 5\n", "y_test_oh = np.eye(C)[Y_test.reshape(-1)]\n", "X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)\n", "pred = model.predict(X_test_indices)\n", "for i in range(len(X_test)):\n", " x = X_test_indices\n", " num = np.argmax(pred[i])\n", " if(num != Y_test[i]):\n", " print('Expected emoji:'+ label_to_emoji(Y_test[i]) + ' prediction: '+ X_test[i] + label_to_emoji(num).strip())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you can try it on your own example. Write your own sentence below. " ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "not feeling happy 😞\n" ] } ], "source": [ "# Change the sentence below to see your prediction. Make sure all the words are in the Glove embeddings. \n", "x_test = np.array(['not feeling happy'])\n", "X_test_indices = sentences_to_indices(x_test, word_to_index, maxLen)\n", "print(x_test[0] +' '+ label_to_emoji(np.argmax(model.predict(X_test_indices))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## LSTM version accounts for word order\n", "* Previously, Emojify-V1 model did not correctly label \"not feeling happy,\" but our implementation of Emojiy-V2 got it right. \n", " * (Keras' outputs are slightly random each time, so you may not have obtained the same result.) \n", "* The current model still isn't very robust at understanding negation (such as \"not happy\")\n", " * This is because the training set is small and doesn't have a lot of examples of negation. \n", " * But if the training set were larger, the LSTM model would be much better than the Emojify-V1 model at understanding such complex sentences. \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Congratulations!\n", "\n", "You have completed this notebook! ❀️❀️❀️\n", "\n", "\n", "## What you should remember\n", "- If you have an NLP task where the training set is small, using word embeddings can help your algorithm significantly. \n", "- Word embeddings allow your model to work on words in the test set that may not even appear in the training set. \n", "- Training sequence models in Keras (and in most other deep learning frameworks) requires a few important details:\n", " - To use mini-batches, the sequences need to be **padded** so that all the examples in a mini-batch have the **same length**. \n", " - An `Embedding()` layer can be initialized with pretrained values. \n", " - These values can be either fixed or trained further on your dataset. \n", " - If however your labeled dataset is small, it's usually not worth trying to train a large pre-trained set of embeddings. \n", " - `LSTM()` has a flag called `return_sequences` to decide if you would like to return every hidden states or only the last one. \n", " - You can use `Dropout()` right after `LSTM()` to regularize your network. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "#### Input sentences:\n", "```Python\n", "\"Congratulations on finishing this assignment and building an Emojifier.\"\n", "\"We hope you're happy with what you've accomplished in this notebook!\"\n", "```\n", "#### Output emojis:\n", "# πŸ˜€πŸ˜€πŸ˜€πŸ˜€πŸ˜€πŸ˜€" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Acknowledgments\n", "\n", "Thanks to Alison Darcy and the Woebot team for their advice on the creation of this assignment. \n", "* Woebot is a chatbot friend that is ready to speak with you 24/7. \n", "* Part of Woebot's technology uses word embeddings to understand the emotions of what you say. \n", "* You can chat with Woebot by going to http://woebot.io\n", "\n", "" ] } ], "metadata": { "coursera": { "course_slug": "nlp-sequence-models", "graded_item_id": "RNnEs", "launcher_item_id": "acNYU" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }