{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Creating a Siamese model using Trax: Ungraded Lecture Notebook" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:tokens_length=568 inputs_length=512 targets_length=114 noise_density=0.15 mean_noise_span_length=3.0 \n" ] } ], "source": [ "import trax\n", "from trax import layers as tl\n", "import trax.fastmath.numpy as np\n", "import numpy\n", "\n", "# Setting random seeds\n", "trax.supervised.trainer_lib.init_random_number_generators(10)\n", "numpy.random.seed(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## L2 Normalization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before building the model you will need to define a function that applies L2 normalization to a tensor. This is very important because in this week's assignment you will create a custom loss function which expects the tensors it receives to be normalized. Luckily this is pretty straightforward:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def normalize(x):\n", " return x / np.sqrt(np.sum(x * x, axis=-1, keepdims=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the denominator can be replaced by `np.linalg.norm(x, axis=-1, keepdims=True)` to achieve the same results and that Trax's numpy is being used within the function." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The tensor is of type: \n", "\n", "And looks like this:\n", "\n", " [[0.77132064 0.02075195 0.63364823 0.74880388 0.49850701]\n", " [0.22479665 0.19806286 0.76053071 0.16911084 0.08833981]]\n" ] } ], "source": [ "tensor = numpy.random.random((2,5))\n", "print(f'The tensor is of type: {type(tensor)}\\n\\nAnd looks like this:\\n\\n {tensor}')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The normalized tensor is of type: \n", "\n", "And looks like this:\n", "\n", " [[0.57393795 0.01544148 0.4714962 0.55718327 0.37093794]\n", " [0.26781026 0.23596111 0.9060541 0.20146926 0.10524315]]\n" ] } ], "source": [ "norm_tensor = normalize(tensor)\n", "print(f'The normalized tensor is of type: {type(norm_tensor)}\\n\\nAnd looks like this:\\n\\n {norm_tensor}')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the initial tensor was converted from a numpy array to a jax array in the process." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Siamese Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To create a `Siamese` model you will first need to create an LSTM model using the `Serial` combinator layer and then use another combinator layer called `Parallel` to create the Siamese model. You should be familiar with the following layers (each layer name links to its documentation):\n", " - [`Serial`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.combinators.Serial) A combinator layer that stacks layers serially using function composition.\n", " - [`Embedding`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.core.Embedding) Maps discrete tokens to vectors. Its weight matrix has shape `(vocabulary length X dimension of output vectors)`. 
The dimension of output vectors (also called `d_feature`) is the number of elements in the word embedding.\n", " - [`LSTM`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.rnn.LSTM) The LSTM layer. It leverages another Trax layer called [`LSTMCell`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.rnn.LSTMCell). The number of units should be specified and should match the number of elements in the word embedding.\n", " - [`Mean`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.core.Mean) Computes the mean across a desired axis. Mean uses one tensor axis to form groups of values and replaces each group with the mean value of that group.\n", " - [`Fn`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.base.Fn) Layer with no weights that applies the function f, which should be specified using a lambda syntax. \n", " - [`Parallel`](https://trax-ml.readthedocs.io/en/latest/trax.layers.html#trax.layers.combinators.Parallel) It is a combinator layer (like `Serial`) that applies a list of layers in parallel to its inputs.\n", "\n", "Putting everything together the Siamese model will look like this:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "vocab_size = 500\n", "model_dimension = 128\n", "\n", "# Define the LSTM model\n", "LSTM = tl.Serial(\n", " tl.Embedding(vocab_size=vocab_size, d_feature=model_dimension),\n", " tl.LSTM(model_dimension),\n", " tl.Mean(axis=1),\n", " tl.Fn('Normalize', lambda x: normalize(x))\n", " )\n", "\n", "# Use the Parallel combinator to create a Siamese model out of the LSTM \n", "Siamese = tl.Parallel(LSTM, LSTM)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next is a helper function that prints information for every layer (sublayer within `Serial`):" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Siamese 
model:\n", "\n", "Total layers: 2\n", "\n", "========\n", "Parallel.sublayers_0: Serial[\n", " Embedding_500_128\n", " LSTM_128\n", " Mean\n", " Normalize\n", "]\n", "\n", "========\n", "Parallel.sublayers_1: Serial[\n", " Embedding_500_128\n", " LSTM_128\n", " Mean\n", " Normalize\n", "]\n", "\n", "Detail of LSTM models:\n", "\n", "Total layers: 4\n", "\n", "========\n", "Serial.sublayers_0: Embedding_500_128\n", "\n", "========\n", "Serial.sublayers_1: LSTM_128\n", "\n", "========\n", "Serial.sublayers_2: Mean\n", "\n", "========\n", "Serial.sublayers_3: Normalize\n", "\n" ] } ], "source": [ "def show_layers(model, layer_prefix):\n", " print(f\"Total layers: {len(model.sublayers)}\\n\")\n", " for i in range(len(model.sublayers)):\n", " print('========')\n", " print(f'{layer_prefix}_{i}: {model.sublayers[i]}\\n')\n", "\n", "print('Siamese model:\\n')\n", "show_layers(Siamese, 'Parallel.sublayers')\n", "\n", "print('Detail of LSTM models:\\n')\n", "show_layers(LSTM, 'Serial.sublayers')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try changing the parameters defined before the Siamese model and see how the printed layers change!\n", "\n", "You will actually train this model in this week's assignment. By now you should be more familiar with creating Siamese models using Trax. **Keep it up!**" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 4 }