{ "cells": [ { "cell_type": "markdown", "id": "81ad537c", "metadata": {}, "source": [ "# Preparations\n" ] }, { "cell_type": "markdown", "id": "ce68eb89", "metadata": {}, "source": [ "Execute the following code blocks to configure the session and import relevant modules." ] }, { "cell_type": "code", "execution_count": null, "id": "ce6938f3", "metadata": {}, "outputs": [], "source": [ "%config InlineBackend.figure_format ='retina'\n", "%load_ext autoreload\n", "%autoreload 2\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "id": "b5fdd8c0", "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "import math\n", "import numpy as np\n", "import pandas as pd\n", "import tensorflow as tf\n", "from keras.models import Sequential\n", "from keras.layers import Dense, LSTM\n", "from keras.utils import np_utils\n", "from keras.preprocessing.sequence import pad_sequences\n", "from keras.optimizer_v2.adam import Adam\n", "# Alternatively:\n", "# from tensorflow.keras.optimizers import Adam\n", "from sklearn.preprocessing import MinMaxScaler\n", "from sklearn.metrics import mean_squared_error\n", "import matplotlib.pyplot as plt\n", "import rnnutils" ] }, { "cell_type": "markdown", "id": "78ee2654", "metadata": {}, "source": [ "# Lab session: predicting time series, discrete state space\n" ] }, { "cell_type": "markdown", "id": "22e7c523", "metadata": {}, "source": [ "## Aims\n", "\n", "The aim of this lab is to predict the next character of the alphabet given a short subsequence of preceding characters. In other words, the network learns to output the probability distribution of the next character conditional on a sequence of input characters. Since the state space is discrete, you need to think about which output activation and loss function to use.\n", "\n", "As in the previous lab, to help you along the way, some of the steps have been prepared in advance, but in most cases, your task is to complete missing code. 
Don't hesitate to change parameter settings and experiment with the model architectures." ] }, { "cell_type": "markdown", "id": "23ef8428", "metadata": {}, "source": [ "## Prepare data" ] }, { "cell_type": "markdown", "id": "117bd55b", "metadata": {}, "source": [ "We will work with the English alphabet, which consists of 26 characters (states). The predictions will be based on alphabet substrings: given the input \"CDE\" the model should output \"F\", given \"STUV\" it should output \"W\", and so on." ] }, { "cell_type": "code", "execution_count": null, "id": "8af74afe", "metadata": {}, "outputs": [], "source": [ "alphabet = \"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"" ] }, { "cell_type": "markdown", "id": "2773510f", "metadata": {}, "source": [ "Since a neural network cannot deal directly with characters, we map each individual letter to an integer (integer encoding):" ] }, { "cell_type": "code", "execution_count": null, "id": "a7392e56", "metadata": {}, "outputs": [], "source": [ "char_to_int = dict((c, i) for i, c in enumerate(alphabet))\n", "int_to_char = dict((i, c) for i, c in enumerate(alphabet))" ] }, { "cell_type": "markdown", "id": "fc815f9e", "metadata": {}, "source": [ "Training data are generated by selecting slices of n consecutive characters (2 <= n <= 6) from the alphabet. The last character of a slice is the output and the preceding characters are the input. The following code generates training data. 
" ] }, { "cell_type": "code", "execution_count": null, "id": "8bf695ef", "metadata": { "code_folding": [] }, "outputs": [], "source": [ "num_inputs = 200 # number of training samples (randomly generated)\n", "max_len = 5 # maximum number of sequence length\n", "dataX = []\n", "dataY = []\n", "for i in range(num_inputs):\n", " start = np.random.randint(len(alphabet)-2)\n", " end = np.random.randint(start, min(start+max_len,len(alphabet)-1))\n", " sequence_in = alphabet[start:end+1]\n", " sequence_out = alphabet[end + 1]\n", " dataX.append([char_to_int[char] for char in sequence_in])\n", " dataY.append(char_to_int[sequence_out])" ] }, { "cell_type": "markdown", "id": "a1bf6e87", "metadata": {}, "source": [ "Take a minute to inspect the dataX inputs. As you will see, the length of different entries differ. Prior to training, we need to pad input sequences shorter than five characters with zeros. Can you think of why this is necessary?" ] }, { "cell_type": "code", "execution_count": null, "id": "8a3a01f8", "metadata": {}, "outputs": [], "source": [ "# convert list of lists to array and pad sequences if needed (with zero)\n", "X = pad_sequences(dataX, maxlen=max_len, dtype='float32')\n", "# reshape X to be [samples, time steps, features]\n", "X = np.reshape(X, (X.shape[0], max_len, 1))\n", "# normalize by length of alphabet (26)\n", "X = X / float(len(alphabet))\n", "# one hot encode the output variable\n", "y = np_utils.to_categorical(dataY)" ] }, { "cell_type": "markdown", "id": "cf5e4a41", "metadata": {}, "source": [ "For convenience, we define parameters related to training and instantiate the optimizer (Adam). Test different values to see what happens. For more information, see [the keras documentation on Adam](https://keras.io/api/optimizers/adam)." 
] }, { "cell_type": "code", "execution_count": null, "id": "5095c478", "metadata": {}, "outputs": [], "source": [ "epochsVal = 100\n", "learnRateVal = 0.001\n", "batchSizeVal = 10\n", "opt = Adam(learning_rate=learnRateVal, decay=learnRateVal / epochsVal)" ] }, { "cell_type": "markdown", "id": "85a31fa6", "metadata": {}, "source": [ "Finally, build your model. See the [keras LSTM documentation](https://keras.io/api/layers/recurrent_layers/lstm/) for information on more parameter settings, and [documentation on losses](https://keras.io/api/losses/) to learn more about different loss functions." ] }, { "cell_type": "code", "execution_count": null, "id": "315f498f", "metadata": { "solution": "hidden", "solution_first": true }, "outputs": [], "source": [ "model = Sequential()\n", "# Add LSTM layers\n", "# model.add(LSTM(units=32, input_shape=(...), return_sequences=..., activation=...))\n", "# Add dense layer with activation for categorical output\n", "# model.add(Dense(..., activation=\"\"))\n", "# Compile model using loss function for categorical data\n", "# model.compile(loss=... , optimizer=opt, metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": null, "id": "d2d156c4", "metadata": { "solution": "hidden" }, "outputs": [], "source": [ "model = Sequential()\n", "# Add LSTM layer; X.shape[1] is the second dimension of X, i.e. the number of time steps (window size)\n", "model.add(LSTM(units=8, input_shape=(X.shape[1], 1), return_sequences=False, activation=\"tanh\"))\n", "# Add dense layer with one unit per character and softmax activation for categorical output\n", "model.add(Dense(len(alphabet), activation=\"softmax\"))\n", "# Compile model using loss function for categorical data\n", "model.compile(loss=\"categorical_crossentropy\", optimizer=opt, metrics=[\"accuracy\"])" ] }, { "cell_type": "markdown", "id": "569bc966", "metadata": {}, "source": [ "Once you have compiled the model, fit it using the parameters you set above and evaluate it."
] }, { "cell_type": "code", "execution_count": null, "id": "df0f6e14", "metadata": {}, "outputs": [], "source": [ "H = model.fit(X, y, epochs=epochsVal, batch_size=batchSizeVal, validation_split=0.1, verbose=1)" ] }, { "cell_type": "code", "execution_count": null, "id": "1a1bc681", "metadata": {}, "outputs": [], "source": [ "scores = model.evaluate(X, y, verbose=0)\n", "print(\"Model train accuracy: %.2f%%\" % (scores[1]*100))" ] }, { "cell_type": "code", "execution_count": null, "id": "cb6aa0ae", "metadata": {}, "outputs": [], "source": [ "# Uncomment to plot loss and accuracy from the training history H\n", "# rnnutils.plot_loss_acc(H)" ] }, { "cell_type": "markdown", "id": "3f98222c", "metadata": {}, "source": [ "## Printing predictions" ] }, { "cell_type": "markdown", "id": "4b630a14", "metadata": {}, "source": [ "Finally, to test some predictions, you can select an entry from the input data and run `model.predict`. The code below draws random input sequences from the training data, pads and normalizes them in the same way as before training, and prints the character the model predicts for each. If you increase the number of examples you will probably see cases where the predictions are wrong."
] }, { "cell_type": "code", "execution_count": null, "id": "ffb7a1a2", "metadata": { "solution": "hidden" }, "outputs": [], "source": [ "num_examples = 2\n", "for i in range(num_examples):\n", "    pattern_index = np.random.randint(len(dataX))\n", "    pattern = dataX[pattern_index]\n", "    x = pad_sequences([pattern], maxlen=max_len, dtype='float32')\n", "    x = np.reshape(x, (1, max_len, 1))\n", "    x = x / float(len(alphabet))\n", "    prediction = model.predict(x, verbose=0)\n", "    index = np.argmax(prediction)\n", "    result = int_to_char[index]\n", "    seq_in = [int_to_char[value] for value in pattern]\n", "    print(pattern_index, pattern, seq_in, \"->\", result)" ] } ], "metadata": { "kernelspec": { "display_name": "Python (nn_dl_python)", "language": "python", "name": "nn_dl_python" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 5 }