{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Finding Similar Songs - Part 2: Siamese Networks\n",
    "\n",
    "In the first part of this tutorial I introduced the traditional distance-based approach to similarity estimation. The main idea is that features are extracted from the audio content. These features are numeric descriptions of semantically relevant information. An example of a high-level feature is the number of beats per minute, which describes the tempo of a song. Music feature sets are often more abstract and describe the spectral or rhythmic distribution of energy. Such features are not single values but vectors of numbers. A song is thus described by a feature vector, and if the extracted features span various music characteristics such as rhythm, timbre, harmonics, complexity, etc., then the numerical similarity of two vectors is considered an approximation of music similarity: the lower the distance between two vectors, the higher their acoustic similarity. For this reason these approaches are known as *distance-based* methods. They depend mainly on the selected feature sets and on the metric chosen to compare their values.\n",
    "\n",
    "In the second part of this tutorial we focus on an approach in which both the feature representation and the similarity function are learned from the underlying dataset.\n",
    "\n",
    "\n",
    "## Tutorial Overview\n",
    "\n",
    "1. Load data\n",
    "2. Preprocess data\n",
    "3. Define model\n",
    "4. Fit model\n",
    "5. Evaluate model\n",
    "\n"
   ]
  },
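  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a reminder of the distance-based approach from part one, the following minimal sketch ranks a few songs by their Euclidean distance to a seed song. The feature values and the names `toy_features` and `rank_by_distance` are made up for illustration; they are not part of the tutorial's dataset or code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# hypothetical 3-dimensional feature vectors for four songs (illustrative values only)\n",
    "toy_features = np.array([[0.0, 0.0, 0.0],\n",
    "                         [1.0, 0.0, 0.0],\n",
    "                         [0.0, 3.0, 4.0],\n",
    "                         [0.5, 0.0, 0.0]])\n",
    "\n",
    "def rank_by_distance(features, seed_index):\n",
    "    # Euclidean distance of every song to the seed song\n",
    "    distances = np.linalg.norm(features - features[seed_index], axis=1)\n",
    "    # ascending order: smallest distance (most similar) first\n",
    "    return np.argsort(distances), distances\n",
    "\n",
    "ranking, distances = rank_by_distance(toy_features, seed_index=0)\n",
    "ranking, distances"
   ]
  },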
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Requirements\n",
    "\n",
    "The requirements are the same as for the first part of the tutorial. Please follow the instructions in part one if you have trouble running this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\n",
    "\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# visualization\n",
    "%matplotlib inline\n",
    "\n",
    "# numeric and scientific processing\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# misc\n",
    "import os\n",
    "import progressbar\n",
    "\n",
    "from IPython.display import IFrame\n",
    "from IPython.display import HTML, display\n",
    "\n",
    "pd.set_option('display.max_colwidth', -1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Loading Data\n",
    "\n",
    "Before we can train our models we first have to get some data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "DATASET_PATH    = \"D:/Research/Data/MIR/MagnaTagATune/ISMIR2018\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Feature Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load the feature data (spectrograms and clip IDs) from a NumPy `.npz` archive."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(6380, 80, 80)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "with np.load(\"%s/ISMIR2018_tut_Magnagtagatune_spectrograms.npz\" % DATASET_PATH) as npz:\n",
    "    melspecs      = npz[\"features\"]\n",
    "    clip_id       = npz[\"clip_id\"]\n",
    "    \n",
    "melspecs.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Prepare the feature metadata so it can be aligned with the dataset metadata: each clip ID is mapped to its row index in the feature array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "feature_metadata = pd.DataFrame({\"featurespace_id\": np.arange(melspecs.shape[0]), \n",
    "                                 \"clip_id\"        : clip_id})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Metadata\n",
    "\n",
    "Load the metadata from a CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2617, 10)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "metadata = pd.read_csv(\"./metadata/ismir2018_tut_part_2_genre_metadata.csv\", index_col=0)\n",
    "metadata.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Align the feature data with the metadata by joining on `clip_id`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "metadata = metadata.reset_index()\n",
    "# join on the clip_id column only; mixing left_on with left_index raises a MergeError in recent pandas versions\n",
    "metadata = metadata.merge(feature_metadata, on=\"clip_id\", how=\"inner\")\n",
    "metadata = metadata.set_index(\"index\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add an HTML5 audio player component for listening to the similarity retrieval results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "tmp                = metadata.mp3_path.str.split(\"/\", expand=True)\n",
    "metadata[\"player\"] = '<audio src=\"http://localhost:9999/' + tmp[6] + '/' + tmp[7] +'\" controls>'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Labels"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load the labels from a CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2617, 7)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "labels = pd.read_csv(\"./metadata/ismir2018_tut_part_2_genre_labels.csv\", index_col=0)\n",
    "\n",
    "labels.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Align Meta-data and Feature-data\n",
    "\n",
    "Sort the metadata by feature-space ID so that its row order matches the feature data, then select the matching spectrograms and labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "metadata.sort_values(\"featurespace_id\", inplace=True)\n",
    "\n",
    "# subsample feature-space\n",
    "melspecs = melspecs[metadata.featurespace_id].astype(np.float32)\n",
    "\n",
    "# re-enumerate sub-sampled feature-space (alignment)\n",
    "metadata[\"featurespace_id\"] = np.arange(metadata.shape[0])\n",
    "\n",
    "# align labels\n",
    "labels = labels.loc[metadata.index]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add a channel dimension to the feature space, as required by the convolutional layers. Only one channel is added because the audio is mono."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "melspecs = np.expand_dims(melspecs, 3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Siamese Networks\n",
    "\n",
    "A Siamese neural network is an architecture in which two inputs are fed into the same stack of network layers. This is where the name comes from: the shared layers are identical, like Siamese twins. Feeding two inputs through the shared layers generates two representations that can be compared. Training the network for a particular task requires labelled data. To learn a similarity function, the labels should indicate whether the two inputs are similar or dissimilar.\n",
    "\n",
    "This is the approach initially described by Hadsell et al. (2006) (http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf). The authors create pairs of similar and dissimilar images. These are fed into a Siamese network stack. Finally, the model calculates the Euclidean distance between the two generated representations. A contrastive loss is used to optimize the learned similarity.\n",
    "\n",
    "To calculate the similarity between a seed image and the rest of the collection, the model is applied to predict the distance between this seed image and every other image. The result is a list of distances, which is sorted in ascending order so that the most similar items come first.\n",
    "\n",
    "The following code example follows this approach:"
   ]
  },
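  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea of a contrastive loss more concrete, here is a minimal NumPy sketch that follows the general form of the loss in the paper. The label convention (1 = similar, 0 = dissimilar), the margin of 1.0 and the example values are assumptions chosen for illustration; the loss used later for training may differ in these details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def contrastive_loss(is_similar, distance, margin=1.0):\n",
    "    # similar pairs are penalized by their squared distance\n",
    "    similar_term = is_similar * np.square(distance)\n",
    "    # dissimilar pairs are penalized only while they are closer than the margin\n",
    "    dissimilar_term = (1.0 - is_similar) * np.square(np.maximum(margin - distance, 0.0))\n",
    "    return np.mean(0.5 * (similar_term + dissimilar_term))\n",
    "\n",
    "# hypothetical pair labels (assumed: 1 = similar) and predicted distances\n",
    "is_similar = np.array([1.0, 1.0, 0.0, 0.0])\n",
    "distance   = np.array([0.0, 2.0, 0.5, 3.0])\n",
    "\n",
    "loss = contrastive_loss(is_similar, distance)\n",
    "loss"
   ]
  },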
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Keras**\n",
    "\n",
    "We use the high-level deep learning API Keras: https://keras.io/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    }
   ],
   "source": [
    "from keras.models       import Model\n",
    "from keras.layers       import Input, Lambda, Dense, Conv2D, Flatten, MaxPooling2D, Concatenate\n",
    "from keras.layers       import Dropout, BatchNormalization, GaussianNoise\n",
    "from keras.optimizers   import Nadam, SGD, Adam, RMSprop\n",
    "from keras.constraints  import unit_norm\n",
    "from keras.regularizers import l2\n",
    "from keras import backend as K"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First we define a distance measure to compare the two representations. We will use the well-known Euclidean distance:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "def euclidean_distance(vects):\n",
    "    x, y = vects\n",
    "    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))"
   ]
  },
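  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a quick sanity check outside of Keras, the same computation can be sketched in NumPy. The function name `euclidean_distance_np` is hypothetical, and the default `eps` of 1e-7 is assumed to match the default Keras fuzz factor. The squared sum is floored at `eps` before the square root is taken, which keeps the gradient finite when two representations are identical. Note that the result holds one distance per pair, i.e. it has shape `(n_pairs, 1)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "def euclidean_distance_np(x, y, eps=1e-7):\n",
    "    # sum of squared differences per row, floored at eps before taking the square root\n",
    "    squared = np.sum(np.square(x - y), axis=1, keepdims=True)\n",
    "    return np.sqrt(np.maximum(squared, eps))\n",
    "\n",
    "# two hypothetical pairs of 2-dimensional representations\n",
    "x = np.array([[0.0, 0.0], [1.0, 1.0]])\n",
    "y = np.array([[3.0, 4.0], [1.0, 1.0]])\n",
    "\n",
    "d = euclidean_distance_np(x, y)\n",
    "d.shape"
   ]
  },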
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The Siamese Network Architecture\n",
    "\n",
    "Now we define the Siamese network architecture. It consists of two parallel convolutional stacks whose flattened outputs are concatenated and passed to a fully connected layer, which provides the learned representation. These layers are shared among the \"Siamese twins\". The network takes two inputs: one goes to the left twin, the other to the right one. The Euclidean distance between the outputs of the two twins is the final output of the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "__________________________________________________________________________________________________\n",
      "Layer (type)                    Output Shape         Param #     Connected to                     \n",
      "==================================================================================================\n",
      "input_ref (InputLayer)          (None, 80, 80, 1)    0                                            \n",
      "__________________________________________________________________________________________________\n",
      "input_dif (InputLayer)          (None, 80, 80, 1)    0                                            \n",
      "__________________________________________________________________________________________________\n",
      "gauss_noise (GaussianNoise)     (None, 80, 80, 1)    0           input_ref[0][0]                  \n",
      "                                                                 input_ref[0][0]                  \n",
      "                                                                 input_dif[0][0]                  \n",
      "                                                                 input_dif[0][0]                  \n",
      "__________________________________________________________________________________________________\n",
      "bnorm_input (BatchNormalization (None, 80, 80, 1)    4           gauss_noise[0][0]                \n",
      "                                                                 gauss_noise[1][0]                \n",
      "                                                                 gauss_noise[2][0]                \n",
      "                                                                 gauss_noise[3][0]                \n",
      "__________________________________________________________________________________________________\n",
      "conv_a (Conv2D)                 (None, 80, 80, 16)   3712        bnorm_input[0][0]                \n",
      "                                                                 bnorm_input[2][0]                \n",
      "__________________________________________________________________________________________________\n",
      "conv_b (Conv2D)                 (None, 80, 80, 16)   3712        bnorm_input[1][0]                \n",
      "                                                                 bnorm_input[3][0]                \n",
      "__________________________________________________________________________________________________\n",
      "max_pooling_a (MaxPooling2D)    (None, 4, 20, 16)    0           conv_a[0][0]                     \n",
      "                                                                 conv_a[1][0]                     \n",
      "__________________________________________________________________________________________________\n",
      "max_pooling_b (MaxPooling2D)    (None, 20, 4, 16)    0           conv_b[0][0]                     \n",
      "                                                                 conv_b[1][0]                     \n",
      "__________________________________________________________________________________________________\n",
      "bnorm_a (BatchNormalization)    (None, 4, 20, 16)    64          max_pooling_a[0][0]              \n",
      "                                                                 max_pooling_a[1][0]              \n",
      "__________________________________________________________________________________________________\n",
      "bnorm_b (BatchNormalization)    (None, 20, 4, 16)    64          max_pooling_b[0][0]              \n",
      "                                                                 max_pooling_b[1][0]              \n",
      "__________________________________________________________________________________________________\n",
      "flatten_a (Flatten)             (None, 1280)         0           bnorm_a[0][0]                    \n",
      "                                                                 bnorm_a[1][0]                    \n",
      "__________________________________________________________________________________________________\n",
      "flatten_b (Flatten)             (None, 1280)         0           bnorm_b[0][0]                    \n",
      "                                                                 bnorm_b[1][0]                    \n",
      "__________________________________________________________________________________________________\n",
      "concatenate (Concatenate)       (None, 2560)         0           flatten_a[0][0]                  \n",
      "                                                                 flatten_b[0][0]                  \n",
      "                                                                 flatten_a[1][0]                  \n",
      "                                                                 flatten_b[1][0]                  \n",
      "__________________________________________________________________________________________________\n",
      "dense_1 (Dense)                 (None, 256)          655616      concatenate[0][0]                \n",
      "                                                                 concatenate[1][0]                \n",
      "__________________________________________________________________________________________________\n",
      "lambda_1 (Lambda)               (None, 256)          0           dense_1[0][0]                    \n",
      "                                                                 dense_1[1][0]                    \n",
      "==================================================================================================\n",
      "Total params: 663,172\n",
      "Trainable params: 663,106\n",
      "Non-trainable params: 66\n",
      "__________________________________________________________________________________________________\n"
     ]
    }
   ],
   "source": [
    "def create_siamese_network():\n",
    "\n",
    "    # --- input layers\n",
    "    input_ref = Input((80,80,1), name=\"input_ref\") # reference track\n",
    "    input_dif = Input((80,80,1), name=\"input_dif\") # different track\n",
    "    \n",
    "    # --- input pre-processing\n",
    "    gn = GaussianNoise(0.2, name=\"gauss_noise\")    # add noise to input during training to avoid overfitting\n",
    "    bn = BatchNormalization(name=\"bnorm_input\")    # normalize input\n",
    "\n",
    "    # --- CNN-Stack A\n",
    "    cnn_a_1 = Conv2D(16, (21,11), \n",
    "                     padding            = \"same\", \n",
    "                     activation         = \"relu\", \n",
    "                     kernel_regularizer = l2(0.0001), \n",
    "                     name               = \"conv_a\")\n",
    "    \n",
    "    cnn_a_2 = MaxPooling2D((20,4), name = \"max_pooling_a\")\n",
    "    cnn_a_3 = BatchNormalization(  name = \"bnorm_a\")\n",
    "    cnn_a_4 = Flatten(             name = \"flatten_a\")\n",
    "\n",
    "    # --- CNN-Stack B\n",
    "    cnn_b_1 = Conv2D(16, (11,21), \n",
    "                     padding            = \"same\", \n",
    "                     activation         = \"relu\", \n",
    "                     kernel_regularizer = l2(0.0001), \n",
    "                     name               = \"conv_b\")\n",
    "    \n",
    "    cnn_b_2 = MaxPooling2D((4,20), name = \"max_pooling_b\")\n",
    "    cnn_b_3 = BatchNormalization(  name = \"bnorm_b\")\n",
    "    cnn_b_4 = Flatten(             name = \"flatten_b\")\n",
    "\n",
    "    # --- merge parallel CNN Stacks\n",
    "    mrg = Concatenate(axis=1, name=\"concatenate\")\n",
    "\n",
    "    # --- Fully connected layer => learned representation layer\n",
    "    hidden_layer = Dense(256, activation=\"elu\", kernel_constraint=unit_norm())\n",
    "    \n",
    "    # --- function to assemble shared layers\n",
    "    def get_shared_dnn(m_input):\n",
    "        shared_cnn_a = cnn_a_4(cnn_a_3(cnn_a_2(cnn_a_1(bn(gn(m_input))))))\n",
    "        shared_cnn_b = cnn_b_4(cnn_b_3(cnn_b_2(cnn_b_1(bn(gn(m_input))))))\n",
    "\n",
    "        return hidden_layer(mrg([shared_cnn_a,shared_cnn_b]))\n",
    "\n",
    "    # --- instantiate  shared layers\n",
    "    siamese_ref = get_shared_dnn(input_ref)\n",
    "    siamese_dif = get_shared_dnn(input_dif)\n",
    "\n",
    "    # --- calculate dissimilarity\n",
    "    dist  = Lambda(euclidean_distance, output_shape=lambda x: x[0])([siamese_ref, siamese_dif])\n",
    "    \n",
    "    # --- build model\n",
    "    model = Model(inputs=[input_ref, input_dif], outputs=dist)\n",
    "    \n",
    "    return model\n",
    "\n",
    "model = create_siamese_network()\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<svg height=\"802pt\" viewBox=\"0.00 0.00 765.50 802.00\" width=\"766pt\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g class=\"graph\" id=\"graph0\" transform=\"scale(1 1) rotate(0) translate(4 798)\">\n",
       "<title>G</title>\n",
       "<polygon fill=\"white\" points=\"-4,4 -4,-798 761.5,-798 761.5,4 -4,4\" stroke=\"none\"/>\n",
       "<!-- 1667702335584 -->\n",
       "<g class=\"node\" id=\"node1\"><title>1667702335584</title>\n",
       "<polygon fill=\"none\" points=\"46.5,-747.5 46.5,-793.5 348.5,-793.5 348.5,-747.5 46.5,-747.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"113.5\" y=\"-766.8\">input_ref: InputLayer</text>\n",
       "<polyline fill=\"none\" points=\"180.5,-747.5 180.5,-793.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"208.5\" y=\"-778.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"180.5,-770.5 236.5,-770.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"208.5\" y=\"-755.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"236.5,-747.5 236.5,-793.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"292.5\" y=\"-778.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"236.5,-770.5 348.5,-770.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"292.5\" y=\"-755.3\">(None, 80, 80, 1)</text>\n",
       "</g>\n",
       "<!-- 1667702336872 -->\n",
       "<g class=\"node\" id=\"node3\"><title>1667702336872</title>\n",
       "<polygon fill=\"none\" points=\"186,-664.5 186,-710.5 529,-710.5 529,-664.5 186,-664.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"273.5\" y=\"-683.8\">gauss_noise: GaussianNoise</text>\n",
       "<polyline fill=\"none\" points=\"361,-664.5 361,-710.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"389\" y=\"-695.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"361,-687.5 417,-687.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"389\" y=\"-672.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"417,-664.5 417,-710.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"473\" y=\"-695.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"417,-687.5 529,-687.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"473\" y=\"-672.3\">(None, 80, 80, 1)</text>\n",
       "</g>\n",
       "<!-- 1667702335584&#45;&gt;1667702336872 -->\n",
       "<g class=\"edge\" id=\"edge1\"><title>1667702335584-&gt;1667702336872</title>\n",
       "<path d=\"M241.221,-747.366C260.887,-737.41 284.264,-725.576 304.881,-715.138\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"306.491,-718.246 313.832,-710.607 303.329,-712.001 306.491,-718.246\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702335920 -->\n",
       "<g class=\"node\" id=\"node2\"><title>1667702335920</title>\n",
       "<polygon fill=\"none\" points=\"366.5,-747.5 366.5,-793.5 668.5,-793.5 668.5,-747.5 366.5,-747.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"433.5\" y=\"-766.8\">input_dif: InputLayer</text>\n",
       "<polyline fill=\"none\" points=\"500.5,-747.5 500.5,-793.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"528.5\" y=\"-778.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"500.5,-770.5 556.5,-770.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"528.5\" y=\"-755.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"556.5,-747.5 556.5,-793.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"612.5\" y=\"-778.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"556.5,-770.5 668.5,-770.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"612.5\" y=\"-755.3\">(None, 80, 80, 1)</text>\n",
       "</g>\n",
       "<!-- 1667702335920&#45;&gt;1667702336872 -->\n",
       "<g class=\"edge\" id=\"edge3\"><title>1667702335920-&gt;1667702336872</title>\n",
       "<path d=\"M473.779,-747.366C454.113,-737.41 430.736,-725.576 410.119,-715.138\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"411.671,-712.001 401.168,-710.607 408.509,-718.246 411.671,-712.001\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702336760 -->\n",
       "<g class=\"node\" id=\"node4\"><title>1667702336760</title>\n",
       "<polygon fill=\"none\" points=\"171,-581.5 171,-627.5 544,-627.5 544,-581.5 171,-581.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"273.5\" y=\"-600.8\">bnorm_input: BatchNormalization</text>\n",
       "<polyline fill=\"none\" points=\"376,-581.5 376,-627.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"404\" y=\"-612.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"376,-604.5 432,-604.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"404\" y=\"-589.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"432,-581.5 432,-627.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"488\" y=\"-612.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"432,-604.5 544,-604.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"488\" y=\"-589.3\">(None, 80, 80, 1)</text>\n",
       "</g>\n",
       "<!-- 1667702336872&#45;&gt;1667702336760 -->\n",
       "<g class=\"edge\" id=\"edge5\"><title>1667702336872-&gt;1667702336760</title>\n",
       "<path d=\"M357.5,-664.366C357.5,-656.152 357.5,-646.658 357.5,-637.725\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"361,-637.607 357.5,-627.607 354,-637.607 361,-637.607\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702336424 -->\n",
       "<g class=\"node\" id=\"node5\"><title>1667702336424</title>\n",
       "<polygon fill=\"none\" points=\"61,-498.5 61,-544.5 348,-544.5 348,-498.5 61,-498.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"117\" y=\"-517.8\">conv_a: Conv2D</text>\n",
       "<polyline fill=\"none\" points=\"173,-498.5 173,-544.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"201\" y=\"-529.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"173,-521.5 229,-521.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"201\" y=\"-506.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"229,-498.5 229,-544.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"288.5\" y=\"-529.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"229,-521.5 348,-521.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"288.5\" y=\"-506.3\">(None, 80, 80, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702336760&#45;&gt;1667702336424 -->\n",
       "<g class=\"edge\" id=\"edge9\"><title>1667702336760-&gt;1667702336424</title>\n",
       "<path d=\"M315.692,-581.366C297.055,-571.5 274.933,-559.788 255.348,-549.419\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"256.733,-546.193 246.257,-544.607 253.458,-552.379 256.733,-546.193\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702432320 -->\n",
       "<g class=\"node\" id=\"node6\"><title>1667702432320</title>\n",
       "<polygon fill=\"none\" points=\"407.5,-498.5 407.5,-544.5 695.5,-544.5 695.5,-498.5 407.5,-498.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"464\" y=\"-517.8\">conv_b: Conv2D</text>\n",
       "<polyline fill=\"none\" points=\"520.5,-498.5 520.5,-544.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"548.5\" y=\"-529.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"520.5,-521.5 576.5,-521.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"548.5\" y=\"-506.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"576.5,-498.5 576.5,-544.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"636\" y=\"-529.3\">(None, 80, 80, 1)</text>\n",
       "<polyline fill=\"none\" points=\"576.5,-521.5 695.5,-521.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"636\" y=\"-506.3\">(None, 80, 80, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702336760&#45;&gt;1667702432320 -->\n",
       "<g class=\"edge\" id=\"edge11\"><title>1667702336760-&gt;1667702432320</title>\n",
       "<path d=\"M410.255,-581.473C434.63,-571.296 463.75,-559.138 489.199,-548.512\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"490.876,-551.605 498.755,-544.522 488.179,-545.145 490.876,-551.605\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702337208 -->\n",
       "<g class=\"node\" id=\"node7\"><title>1667702337208</title>\n",
       "<polygon fill=\"none\" points=\"0,-415.5 0,-461.5 369,-461.5 369,-415.5 0,-415.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"97\" y=\"-434.8\">max_pooling_a: MaxPooling2D</text>\n",
       "<polyline fill=\"none\" points=\"194,-415.5 194,-461.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"222\" y=\"-446.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"194,-438.5 250,-438.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"222\" y=\"-423.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"250,-415.5 250,-461.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"309.5\" y=\"-446.3\">(None, 80, 80, 16)</text>\n",
       "<polyline fill=\"none\" points=\"250,-438.5 369,-438.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"309.5\" y=\"-423.3\">(None, 4, 20, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702336424&#45;&gt;1667702337208 -->\n",
       "<g class=\"edge\" id=\"edge13\"><title>1667702336424-&gt;1667702337208</title>\n",
       "<path d=\"M199.035,-498.366C196.985,-490.062 194.611,-480.451 192.385,-471.434\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"195.754,-470.476 189.958,-461.607 188.958,-472.154 195.754,-470.476\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702432712 -->\n",
       "<g class=\"node\" id=\"node8\"><title>1667702432712</title>\n",
       "<polygon fill=\"none\" points=\"387.5,-415.5 387.5,-461.5 757.5,-461.5 757.5,-415.5 387.5,-415.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"485\" y=\"-434.8\">max_pooling_b: MaxPooling2D</text>\n",
       "<polyline fill=\"none\" points=\"582.5,-415.5 582.5,-461.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"610.5\" y=\"-446.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"582.5,-438.5 638.5,-438.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"610.5\" y=\"-423.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"638.5,-415.5 638.5,-461.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"698\" y=\"-446.3\">(None, 80, 80, 16)</text>\n",
       "<polyline fill=\"none\" points=\"638.5,-438.5 757.5,-438.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"698\" y=\"-423.3\">(None, 20, 4, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702432320&#45;&gt;1667702432712 -->\n",
       "<g class=\"edge\" id=\"edge15\"><title>1667702432320-&gt;1667702432712</title>\n",
       "<path d=\"M557.238,-498.366C559.391,-490.062 561.883,-480.451 564.221,-471.434\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"567.647,-472.165 566.769,-461.607 560.871,-470.408 567.647,-472.165\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702431872 -->\n",
       "<g class=\"node\" id=\"node9\"><title>1667702431872</title>\n",
       "<polygon fill=\"none\" points=\"18,-332.5 18,-378.5 369,-378.5 369,-332.5 18,-332.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"109.5\" y=\"-351.8\">bnorm_a: BatchNormalization</text>\n",
       "<polyline fill=\"none\" points=\"201,-332.5 201,-378.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"229\" y=\"-363.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"201,-355.5 257,-355.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"229\" y=\"-340.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"257,-332.5 257,-378.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"313\" y=\"-363.3\">(None, 4, 20, 16)</text>\n",
       "<polyline fill=\"none\" points=\"257,-355.5 369,-355.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"313\" y=\"-340.3\">(None, 4, 20, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702337208&#45;&gt;1667702431872 -->\n",
       "<g class=\"edge\" id=\"edge17\"><title>1667702337208-&gt;1667702431872</title>\n",
       "<path d=\"M186.959,-415.366C187.872,-407.152 188.927,-397.658 189.919,-388.725\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"193.418,-388.932 191.044,-378.607 186.461,-388.159 193.418,-388.932\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702432880 -->\n",
       "<g class=\"node\" id=\"node10\"><title>1667702432880</title>\n",
       "<polygon fill=\"none\" points=\"391,-332.5 391,-378.5 744,-378.5 744,-332.5 391,-332.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"483.5\" y=\"-351.8\">bnorm_b: BatchNormalization</text>\n",
       "<polyline fill=\"none\" points=\"576,-332.5 576,-378.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"604\" y=\"-363.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"576,-355.5 632,-355.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"604\" y=\"-340.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"632,-332.5 632,-378.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"688\" y=\"-363.3\">(None, 20, 4, 16)</text>\n",
       "<polyline fill=\"none\" points=\"632,-355.5 744,-355.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"688\" y=\"-340.3\">(None, 20, 4, 16)</text>\n",
       "</g>\n",
       "<!-- 1667702432712&#45;&gt;1667702432880 -->\n",
       "<g class=\"edge\" id=\"edge19\"><title>1667702432712-&gt;1667702432880</title>\n",
       "<path d=\"M571.134,-415.366C570.627,-407.152 570.041,-397.658 569.489,-388.725\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"572.974,-388.372 568.865,-378.607 565.987,-388.804 572.974,-388.372\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702432152 -->\n",
       "<g class=\"node\" id=\"node11\"><title>1667702432152</title>\n",
       "<polygon fill=\"none\" points=\"94.5,-249.5 94.5,-295.5 370.5,-295.5 370.5,-249.5 94.5,-249.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"148.5\" y=\"-268.8\">flatten_a: Flatten</text>\n",
       "<polyline fill=\"none\" points=\"202.5,-249.5 202.5,-295.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"230.5\" y=\"-280.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"202.5,-272.5 258.5,-272.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"230.5\" y=\"-257.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"258.5,-249.5 258.5,-295.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"314.5\" y=\"-280.3\">(None, 4, 20, 16)</text>\n",
       "<polyline fill=\"none\" points=\"258.5,-272.5 370.5,-272.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"314.5\" y=\"-257.3\">(None, 1280)</text>\n",
       "</g>\n",
       "<!-- 1667702431872&#45;&gt;1667702432152 -->\n",
       "<g class=\"edge\" id=\"edge21\"><title>1667702431872-&gt;1667702432152</title>\n",
       "<path d=\"M204.157,-332.366C208.241,-323.884 212.982,-314.037 217.404,-304.853\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"220.671,-306.135 221.856,-295.607 214.364,-303.098 220.671,-306.135\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702433160 -->\n",
       "<g class=\"node\" id=\"node12\"><title>1667702433160</title>\n",
       "<polygon fill=\"none\" points=\"408.5,-249.5 408.5,-295.5 686.5,-295.5 686.5,-249.5 408.5,-249.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"463.5\" y=\"-268.8\">flatten_b: Flatten</text>\n",
       "<polyline fill=\"none\" points=\"518.5,-249.5 518.5,-295.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"546.5\" y=\"-280.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"518.5,-272.5 574.5,-272.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"546.5\" y=\"-257.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"574.5,-249.5 574.5,-295.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"630.5\" y=\"-280.3\">(None, 20, 4, 16)</text>\n",
       "<polyline fill=\"none\" points=\"574.5,-272.5 686.5,-272.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"630.5\" y=\"-257.3\">(None, 1280)</text>\n",
       "</g>\n",
       "<!-- 1667702432880&#45;&gt;1667702433160 -->\n",
       "<g class=\"edge\" id=\"edge23\"><title>1667702432880-&gt;1667702433160</title>\n",
       "<path d=\"M562.035,-332.366C559.985,-324.062 557.611,-314.451 555.385,-305.434\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"558.754,-304.476 552.958,-295.607 551.958,-306.154 558.754,-304.476\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702433272 -->\n",
       "<g class=\"node\" id=\"node13\"><title>1667702433272</title>\n",
       "<polygon fill=\"none\" points=\"192.5,-166.5 192.5,-212.5 586.5,-212.5 586.5,-166.5 192.5,-166.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"271.5\" y=\"-185.8\">concatenate: Concatenate</text>\n",
       "<polyline fill=\"none\" points=\"350.5,-166.5 350.5,-212.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"378.5\" y=\"-197.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"350.5,-189.5 406.5,-189.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"378.5\" y=\"-174.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"406.5,-166.5 406.5,-212.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"496.5\" y=\"-197.3\">[(None, 1280), (None, 1280)]</text>\n",
       "<polyline fill=\"none\" points=\"406.5,-189.5 586.5,-189.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"496.5\" y=\"-174.3\">(None, 2560)</text>\n",
       "</g>\n",
       "<!-- 1667702432152&#45;&gt;1667702433272 -->\n",
       "<g class=\"edge\" id=\"edge25\"><title>1667702432152-&gt;1667702433272</title>\n",
       "<path d=\"M275.401,-249.366C294.612,-239.455 317.431,-227.682 337.596,-217.279\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"339.369,-220.302 346.651,-212.607 336.159,-214.081 339.369,-220.302\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702433160&#45;&gt;1667702433272 -->\n",
       "<g class=\"edge\" id=\"edge26\"><title>1667702433160-&gt;1667702433272</title>\n",
       "<path d=\"M504.326,-249.366C484.993,-239.455 462.028,-227.682 441.735,-217.279\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"443.117,-214.054 432.622,-212.607 439.924,-220.283 443.117,-214.054\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702433384 -->\n",
       "<g class=\"node\" id=\"node14\"><title>1667702433384</title>\n",
       "<polygon fill=\"none\" points=\"264.5,-83.5 264.5,-129.5 514.5,-129.5 514.5,-83.5 264.5,-83.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"316.5\" y=\"-102.8\">dense_1: Dense</text>\n",
       "<polyline fill=\"none\" points=\"368.5,-83.5 368.5,-129.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"396.5\" y=\"-114.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"368.5,-106.5 424.5,-106.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"396.5\" y=\"-91.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"424.5,-83.5 424.5,-129.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"469.5\" y=\"-114.3\">(None, 2560)</text>\n",
       "<polyline fill=\"none\" points=\"424.5,-106.5 514.5,-106.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"469.5\" y=\"-91.3\">(None, 256)</text>\n",
       "</g>\n",
       "<!-- 1667702433272&#45;&gt;1667702433384 -->\n",
       "<g class=\"edge\" id=\"edge29\"><title>1667702433272-&gt;1667702433384</title>\n",
       "<path d=\"M389.5,-166.366C389.5,-158.152 389.5,-148.658 389.5,-139.725\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"393,-139.607 389.5,-129.607 386,-139.607 393,-139.607\" stroke=\"black\"/>\n",
       "</g>\n",
       "<!-- 1667702433776 -->\n",
       "<g class=\"node\" id=\"node15\"><title>1667702433776</title>\n",
       "<polygon fill=\"none\" points=\"216,-0.5 216,-46.5 563,-46.5 563,-0.5 216,-0.5\" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"278\" y=\"-19.8\">lambda_1: Lambda</text>\n",
       "<polyline fill=\"none\" points=\"340,-0.5 340,-46.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"368\" y=\"-31.3\">input:</text>\n",
       "<polyline fill=\"none\" points=\"340,-23.5 396,-23.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"368\" y=\"-8.3\">output:</text>\n",
       "<polyline fill=\"none\" points=\"396,-0.5 396,-46.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"479.5\" y=\"-31.3\">[(None, 256), (None, 256)]</text>\n",
       "<polyline fill=\"none\" points=\"396,-23.5 563,-23.5 \" stroke=\"black\"/>\n",
       "<text font-family=\"Times New Roman,serif\" font-size=\"14.00\" text-anchor=\"middle\" x=\"479.5\" y=\"-8.3\">(None, 256)</text>\n",
       "</g>\n",
       "<!-- 1667702433384&#45;&gt;1667702433776 -->\n",
       "<g class=\"edge\" id=\"edge31\"><title>1667702433384-&gt;1667702433776</title>\n",
       "<path d=\"M389.5,-83.3664C389.5,-75.1516 389.5,-65.6579 389.5,-56.7252\" fill=\"none\" stroke=\"black\"/>\n",
       "<polygon fill=\"black\" points=\"393,-56.6068 389.5,-46.6068 386,-56.6069 393,-56.6068\" stroke=\"black\"/>\n",
       "</g>\n",
       "</g>\n",
       "</svg>"
      ],
      "text/plain": [
       "<IPython.core.display.SVG object>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython.display       import SVG\n",
    "from keras.utils.vis_utils import model_to_dot\n",
    "\n",
    "SVG(model_to_dot(model, \n",
    "                 show_shapes      = True, \n",
    "                 show_layer_names = True).create(prog='dot', format='svg'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Contrastive Loss**\n",
    "\n",
    "The contrastive loss is based on the Euclidean distance and measures the cost of data pairs. The objective of the contrastive loss is to minimize the distance between a similar pair and to separate any two dissimilar data with a distance margin"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "def contrastive_loss(y_true, y_pred):\n",
    "    margin = 1\n",
    "    return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))"
   ]
  },
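  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch of what the function above computes, assuming the model outputs one distance value per pair: let $y_i$ be the pair label (`y_true`, 1 for a similar pair and 0 for a dissimilar one), $d_i$ the predicted distance (`y_pred`), $m$ the margin and $N$ the number of pairs in a batch. These symbols are notation introduced here for illustration and are not names from the code. The loss can then be written as\n",
    "\n",
    "$$L = \\frac{1}{N} \\sum_{i=1}^{N} \\left[ y_i d_i^2 + (1 - y_i) \\max(m - d_i, 0)^2 \\right]$$\n",
    "\n",
    "With $m = 1$, a similar pair at distance $d = 0.5$ costs $0.5^2 = 0.25$, a dissimilar pair at $d = 0.3$ costs $(1 - 0.3)^2 = 0.49$, and a dissimilar pair at $d = 1.2$ costs $0$ because it already lies beyond the margin."
   ]
  },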
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create Data-Pairs\n",
    "\n",
    "Now we have to prepare and partition the input data. Because it is a pair-wise comparison approach, we have to create pairs of input instances with sequentially one row containing the reference track and a similar example and the consecutive row containing the reference track and a dissimilar example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_pairs(feature_data, metadata, labels, num_triplets_per_track):\n",
    "    \n",
    "    data_ref  = []\n",
    "    data_dif  = []\n",
    "    gt_labels = []\n",
    "    \n",
    "    pbar = progressbar.ProgressBar(max_value=metadata.shape[0])\n",
    "        \n",
    "    for row_id, q_track in pbar(metadata.iterrows()):\n",
    "        \n",
    "        for _ in range(num_triplets_per_track):\n",
    "            \n",
    "            label_differences = np.abs(labels - labels.loc[row_id]).sum(axis=1)\n",
    "            \n",
    "            similar_instances    = label_differences[label_differences == 0]\n",
    "            dissimilar_instances = label_differences[label_differences != 0] \n",
    "            \n",
    "            # search similar and dissimilar examples\n",
    "            pos_example_idx      = similar_instances.sample(1).index.values[0]\n",
    "            neg_example_idx      = dissimilar_instances.sample(1).index.values[0]\n",
    "            \n",
    "            # create feature triplets\n",
    "            feat_id_ref          = metadata.loc[row_id].featurespace_id\n",
    "            feat_id_pos          = metadata.loc[pos_example_idx].featurespace_id\n",
    "            feat_id_neg          = metadata.loc[neg_example_idx].featurespace_id\n",
    "            \n",
    "            # genuine pair\n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_pos])\n",
    "            gt_labels.append(1)\n",
    "            \n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_neg])\n",
    "            gt_labels.append(0)\n",
    "\n",
    "    return [np.asarray(data_ref), np.asarray(data_dif)], np.asarray(gt_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Exectue the function to prepare the input data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100% (2617 of 2617) |####################| Elapsed Time: 0:01:02 Time:  0:01:02\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(26170, 80, 80, 1)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create pairs\n",
    "data_pairs, paired_labels = create_pairs(melspecs, metadata, labels, 5)\n",
    "\n",
    "# check - how many instances have we created?\n",
    "data_pairs[0].shape"
   ]
  },
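  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The number of instances follows directly from the arguments. The progress bar reports 2617 tracks, and for each track the function draws 5 similar and 5 dissimilar examples, so\n",
    "\n",
    "$$2617 \\times 5 \\times 2 = 26170$$\n",
    "\n",
    "pairs are created. Because genuine and impostor pairs are appended alternately, rows with an even index hold genuine pairs (label 1) and rows with an odd index hold impostor pairs (label 0)."
   ]
  },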
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Prepare the Siamese Neural Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the model\n",
    "model_orig = create_siamese_network()\n",
    "\n",
    "# define the optimizer\n",
    "opt = Adam(lr=0.0001, decay=0.001)\n",
    "\n",
    "# compile the model\n",
    "model_orig.compile(loss      = contrastive_loss, \n",
    "                   optimizer = opt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train the network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train on 24170 samples, validate on 2000 samples\n",
      "Epoch 1/10\n",
      "24170/24170 [==============================] - 26s 1ms/step - loss: 3.9755 - val_loss: 0.5766\n",
      "Epoch 2/10\n",
      "24170/24170 [==============================] - 23s 933us/step - loss: 0.5068 - val_loss: 353.1192\n",
      "Epoch 3/10\n",
      "24170/24170 [==============================] - 22s 930us/step - loss: 0.3829 - val_loss: 0.4764\n",
      "Epoch 4/10\n",
      "24170/24170 [==============================] - 22s 929us/step - loss: 0.3252 - val_loss: 300.5028\n",
      "Epoch 5/10\n",
      "24170/24170 [==============================] - 22s 928us/step - loss: 0.2902 - val_loss: 0.3298\n",
      "Epoch 6/10\n",
      "24170/24170 [==============================] - 23s 939us/step - loss: 0.2685 - val_loss: 0.3312\n",
      "Epoch 7/10\n",
      "24170/24170 [==============================] - 23s 938us/step - loss: 0.2536 - val_loss: 0.3015\n",
      "Epoch 8/10\n",
      "24170/24170 [==============================] - 23s 939us/step - loss: 0.2423 - val_loss: 0.3134\n",
      "Epoch 9/10\n",
      "24170/24170 [==============================] - 23s 937us/step - loss: 0.2332 - val_loss: 0.2901\n",
      "Epoch 10/10\n",
      "24170/24170 [==============================] - 23s 935us/step - loss: 0.2256 - val_loss: 0.2880\n"
     ]
    }
   ],
   "source": [
    "model_orig.fit([data_pairs[0][:-2000], data_pairs[1][:-2000]], \n",
    "                paired_labels[:-2000], \n",
    "                batch_size       = 64, \n",
    "                verbose          = 1, \n",
    "                epochs           = 10,\n",
    "                shuffle          = False, # important !\n",
    "                validation_data = [[data_pairs[0][-2000:], data_pairs[1][-2000:]], paired_labels[-2000:]]);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_orig.save_weights(\"./models/part_2b_siamese_network.h5\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Evaluate\n",
    "\n",
    "Now that we have a trained model, we want to evaluate its performance. We will first play around with some examples, listen to the results and judge by our subjective interpretation before we persue a general evaluation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "scrolled": true
   },
   "source": [
    "### Evaluate by Example\n",
    "\n",
    "The following function calculated the distances between a given query track and all other tracks of the collection. The result is a list of distances where the smallest distance coresponds with the most similar track. The list is sorted descendingly and the top-ten similar tracks are presented below the information of the query track. The Spotify playlist we created at the beginning will also be updated with the query results. Thus, you can listen to it in your Spotify client."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "def similar(model, query_idx):\n",
    "    \n",
    "\n",
    "    ref_cols = [\"artist\",\"title\",\"album\",\"url\",\"original_url\"]\n",
    "    print(metadata.iloc[query_idx][ref_cols])\n",
    "    \n",
    "    ref_feat = melspecs[[metadata.iloc[query_idx].featurespace_id]]\n",
    "    ref_feat = np.repeat(ref_feat, melspecs.shape[0], axis=0)\n",
    "    \n",
    "    # calclulate predicted distances between query track and all others\n",
    "    res = model.predict([ref_feat, melspecs])\n",
    "\n",
    "    # reshape\n",
    "    res = res.flatten()\n",
    "    \n",
    "    # get sorted indexes in ascending order (smallest distance to query track first)\n",
    "    si = np.argsort(res)\n",
    "    \n",
    "    # output filter\n",
    "    display_cols = [\"artist\", \"title\", \"album\", \"player\"]\n",
    "    \n",
    "    \n",
    "    \n",
    "    return HTML(metadata.set_index(\"featurespace_id\").loc[si[:11]][display_cols].to_html(escape=False))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's check the results for some individual tracks by supplying the index to our dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "artist          Sun Palace                                                                     \n",
      "title           Man of the Severn Wave                                                         \n",
      "album           Give Me a Perfect World                                                        \n",
      "url             http://www.magnatune.com/artists/albums/sunpalace-perfectworld/                \n",
      "original_url    http://he3.magnatune.com/all/07-Man%20of%20the%20Severn%20Wave-Sun%20Palace.mp3\n",
      "Name: 14722, dtype: object\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>artist</th>\n",
       "      <th>title</th>\n",
       "      <th>album</th>\n",
       "      <th>player</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>featurespace_id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1511</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Man of the Severn Wave</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-07-man_of_the_severn_wave-117-146.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2007</th>\n",
       "      <td>The Kokoon</td>\n",
       "      <td>Face</td>\n",
       "      <td>Erase</td>\n",
       "      <td><audio src=\"http://localhost:9999/6/the_kokoon-erase-02-face-175-204.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2023</th>\n",
       "      <td>Hybris</td>\n",
       "      <td>Rotten Flowers</td>\n",
       "      <td>The First Words</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/hybris-the_first_words-06-rotten_flowers-146-175.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1805</th>\n",
       "      <td>The Kokoon</td>\n",
       "      <td>Tap At Floes</td>\n",
       "      <td>Berlin</td>\n",
       "      <td><audio src=\"http://localhost:9999/9/the_kokoon-berlin-10-tap_at_floes-146-175.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1898</th>\n",
       "      <td>Norine Braun</td>\n",
       "      <td>Jenny</td>\n",
       "      <td>Miles to Go</td>\n",
       "      <td><audio src=\"http://localhost:9999/6/norine_braun-miles_to_go-11-jenny-146-175.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1708</th>\n",
       "      <td>Mandrake Root</td>\n",
       "      <td>Solitaire</td>\n",
       "      <td>The Seventh Mirror</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/mandrake_root-the_seventh_mirror-09-solitaire-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2037</th>\n",
       "      <td>Chris Juergensen</td>\n",
       "      <td>Some Sympathy</td>\n",
       "      <td>Big Bad Sun</td>\n",
       "      <td><audio src=\"http://localhost:9999/f/chris_juergensen-big_bad_sun-08-some_sympathy-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>200</th>\n",
       "      <td>Arthur Yoria</td>\n",
       "      <td>It's Now Something Else</td>\n",
       "      <td>Of the Lovely</td>\n",
       "      <td><audio src=\"http://localhost:9999/5/arthur_yoria-of_the_lovely-07-its_now_something_else-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1880</th>\n",
       "      <td>Arthur Yoria</td>\n",
       "      <td>She Looks Like You</td>\n",
       "      <td>Ill Be Here Awake</td>\n",
       "      <td><audio src=\"http://localhost:9999/9/arthur_yoria-ill_be_here_awake-06-she_looks_like_you-117-146.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>87</th>\n",
       "      <td>Drop Trio</td>\n",
       "      <td>Melody-Melody</td>\n",
       "      <td>Big Dipper</td>\n",
       "      <td><audio src=\"http://localhost:9999/2/drop_trio-big_dipper-03-melodymelody-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1319</th>\n",
       "      <td>William Brooks</td>\n",
       "      <td>I Didn't Think</td>\n",
       "      <td>Karma Dogs</td>\n",
       "      <td><audio src=\"http://localhost:9999/1/william_brooks-karma_dogs-02-i_didnt_think-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "similar(model_orig, 1511)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Improve Performance through prior Knowledge\n",
    "\n",
    "The original approach only uses genuine and impostor pairs and does not consider any further prior knowledge. In that sense if two tracks belong to the same playlist, they are considered similar, if not, than they are not. But, because we have chosen genre-playlists, there are genres that are more similar than others. This is of course highly subjective and depends on the listening behaviour and experience of a listener.\n",
    "\n",
    "The following list represents my own interpretation of genre similarities:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "sim = [[[\"blues\",      \"blues\"],     1.0],\n",
    "       [[\"blues\",      \"classical\"], 0.0],\n",
    "       [[\"blues\",      \"country\"],   0.8],\n",
    "       [[\"blues\",      \"jazz\"],      0.3],\n",
    "       [[\"blues\",      \"pop\"],       0.0],\n",
    "       [[\"blues\",      \"rock\"],      0.1],\n",
    "       [[\"blues\",      \"techno\"],    0.0],\n",
    "       [[\"classical\",  \"classical\"], 1.0],\n",
    "       [[\"classical\",  \"country\"],   0.0],\n",
    "       [[\"classical\",  \"jazz\"],      0.0],\n",
    "       [[\"classical\",  \"pop\"],       0.0],\n",
    "       [[\"classical\",  \"rock\"],      0.0],\n",
    "       [[\"classical\",  \"techno\"],    0.0],\n",
    "       [[\"country\",    \"country\"],   1.0],\n",
    "       [[\"country\",    \"jazz\"],      0.1],\n",
    "       [[\"country\",    \"pop\"],       0.2],\n",
    "       [[\"country\",    \"rock\"],      0.3],\n",
    "       [[\"country\",    \"techno\"],    0.0],\n",
    "       [[\"jazz\",       \"jazz\"],      1.0],\n",
    "       [[\"jazz\",       \"pop\"],       0.0],\n",
    "       [[\"jazz\",       \"rock\"],      0.1],\n",
    "       [[\"jazz\",       \"techno\"],    0.0],\n",
    "       [[\"pop\",        \"pop\"],       1.0],\n",
    "       [[\"pop\",        \"rock\"],      0.2],\n",
    "       [[\"pop\",        \"techno\"],    0.8],\n",
    "       [[\"rock\",       \"rock\"],      1.0],\n",
    "       [[\"rock\",       \"techno\"],    0.0],\n",
    "       [[\"techno\",     \"techno\"],    1.0]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following code creates a symmetric lookup-table from the list above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>blues</th>\n",
       "      <th>classical</th>\n",
       "      <th>country</th>\n",
       "      <th>jazz</th>\n",
       "      <th>pop</th>\n",
       "      <th>rock</th>\n",
       "      <th>techno</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>blues</th>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>classical</th>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country</th>\n",
       "      <td>0.8</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jazz</th>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>pop</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rock</th>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.2</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>techno</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           blues  classical  country  jazz  pop  rock  techno\n",
       "blues      1.0    0.0        0.8      0.3   0.0  0.1   0.0   \n",
       "classical  0.0    1.0        0.0      0.0   0.0  0.0   0.0   \n",
       "country    0.8    0.0        1.0      0.1   0.2  0.3   0.0   \n",
       "jazz       0.3    0.0        0.1      1.0   0.0  0.1   0.0   \n",
       "pop        0.0    0.0        0.2      0.0   1.0  0.2   0.8   \n",
       "rock       0.1    0.0        0.3      0.1   0.2  1.0   0.0   \n",
       "techno     0.0    0.0        0.0      0.0   0.8  0.0   1.0   "
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get all playlist-names from our dataset\n",
    "playlist_names = [pl for pl in labels.columns]\n",
    "\n",
    "# create the lookup-table\n",
    "playlist_similarities = pd.DataFrame(np.zeros((len(labels.columns),len(labels.columns))), \n",
    "                                     index   = playlist_names, \n",
    "                                     columns = playlist_names)\n",
    "\n",
    "# self-similarity\n",
    "for i in range(len(playlist_names)):\n",
    "    for j in range(len(playlist_names)):\n",
    "        if i == j:\n",
    "            playlist_similarities.iloc[i,j] = 1.0\n",
    "\n",
    "# genre-similarities\n",
    "for s in sim:\n",
    "    playlist_similarities.loc[s[0][0],s[0][1]] = s[1]\n",
    "    playlist_similarities.loc[s[0][1],s[0][0]] = s[1]\n",
    "\n",
    "# show results\n",
    "playlist_similarities"
   ]
  },
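  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a cross-check of the cell above, the same kind of table can be sketched on a tiny example. The names `toy_names`, `toy_sim` and `toy_table` are made-up stand-ins for `playlist_names`, `sim` and `playlist_similarities`; they are assumptions for illustration and not part of the dataset.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical stand-ins for `playlist_names` and `sim`\n",
    "toy_names = ['blues', 'country', 'jazz']\n",
    "toy_sim   = [[['blues',   'country'], 0.8],\n",
    "             [['blues',   'jazz'],    0.3],\n",
    "             [['country', 'jazz'],    0.1]]\n",
    "\n",
    "# start with zeros and set the self-similarity on the diagonal\n",
    "values = np.zeros((len(toy_names), len(toy_names)))\n",
    "np.fill_diagonal(values, 1.0)\n",
    "toy_table = pd.DataFrame(values, index=toy_names, columns=toy_names)\n",
    "\n",
    "# write every pair in both directions to keep the table symmetric\n",
    "for (genre_a, genre_b), similarity in toy_sim:\n",
    "    toy_table.loc[genre_a, genre_b] = similarity\n",
    "    toy_table.loc[genre_b, genre_a] = similarity\n",
    "\n",
    "# a symmetric table equals its own transpose\n",
    "is_symmetric = np.allclose(toy_table.to_numpy(), toy_table.to_numpy().T)\n",
    "```\n",
    "\n",
    "Writing each pair in both directions makes later lookups independent of the order of the two tracks."
   ]
  },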
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_pairs_with_sims(feature_data, metadata, labels, playlist_similarities, num_triplets_per_track):\n",
    "    \n",
    "    data_ref  = []\n",
    "    data_dif  = []\n",
    "    gt_labels = []\n",
    "    \n",
    "    pbar = progressbar.ProgressBar(max_value=metadata.shape[0])\n",
    "    \n",
    "    for row_id, q_track in pbar(metadata.iterrows()):\n",
    "        \n",
    "        for _ in range(num_triplets_per_track):\n",
    "            \n",
    "            label_differences = np.abs(labels - labels.loc[row_id]).sum(axis=1)\n",
    "            \n",
    "            similar_instances    = label_differences[label_differences == 0]\n",
    "            dissimilar_instances = label_differences[label_differences != 0] \n",
    "            \n",
    "            # search similar and dissimilar examples\n",
    "            pos_example_idx      = similar_instances.sample(1).index.values[0]\n",
    "            neg_example_idx      = dissimilar_instances.sample(1).index.values[0]\n",
    "            \n",
    "            # create feature triplets\n",
    "            feat_id_ref          = metadata.loc[row_id].featurespace_id\n",
    "            feat_id_pos          = metadata.loc[pos_example_idx].featurespace_id\n",
    "            feat_id_neg          = metadata.loc[neg_example_idx].featurespace_id\n",
    "            \n",
    "            # genuine pair\n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_pos])\n",
    "            \n",
    "            label = playlist_similarities.loc[labels.loc[row_id].idxmax(axis=1), \n",
    "                                              labels.loc[pos_example_idx].idxmax(axis=1)]\n",
    "                        \n",
    "            gt_labels.append(label)\n",
    "            \n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_neg])\n",
    "            \n",
    "            label = playlist_similarities.loc[labels.loc[row_id].idxmax(axis=1), \n",
    "                                              labels.loc[neg_example_idx].idxmax(axis=1)]\n",
    "            \n",
    "            gt_labels.append(label)\n",
    "            \n",
    "    return [np.asarray(data_ref), np.asarray(data_dif)], np.asarray(gt_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Train network with prior knowledge\n",
    "\n",
    "With this lookup-table we can create more accurate input pairs. Insted of similar/dissimilar we can now apply the supplied similarites:"
   ]
  },
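  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the idea concrete, the label of a pair can be looked up from the playlists of its two tracks. The sketch below uses a hypothetical one-hot frame `toy_labels` and a small `toy_table`; all names and values are illustrative assumptions and are not taken from the dataset.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical one-hot playlist assignments for three tracks\n",
    "toy_labels = pd.DataFrame([[1, 0, 0],\n",
    "                           [0, 1, 0],\n",
    "                           [0, 0, 1]],\n",
    "                          index=[10, 11, 12],\n",
    "                          columns=['blues', 'country', 'techno'])\n",
    "\n",
    "# hypothetical similarity lookup-table for the same playlists\n",
    "toy_table = pd.DataFrame([[1.0, 0.8, 0.0],\n",
    "                          [0.8, 1.0, 0.0],\n",
    "                          [0.0, 0.0, 1.0]],\n",
    "                         index=toy_labels.columns,\n",
    "                         columns=toy_labels.columns)\n",
    "\n",
    "def pair_label(track_a, track_b):\n",
    "    # idxmax returns the column name of the single 1 in a one-hot row\n",
    "    playlist_a = toy_labels.loc[track_a].idxmax()\n",
    "    playlist_b = toy_labels.loc[track_b].idxmax()\n",
    "    return toy_table.loc[playlist_a, playlist_b]\n",
    "\n",
    "blues_country = pair_label(10, 11)  # graded value instead of a hard 0\n",
    "blues_techno  = pair_label(10, 12)  # unrelated playlists stay at 0\n",
    "```\n",
    "\n",
    "With binary labels both pairs would have been marked as dissimilar; with the lookup-table the blues/country pair keeps most of its similarity."
   ]
  },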
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Execute the function to prepare the data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100% (2617 of 2617) |####################| Elapsed Time: 0:00:54 Time:  0:00:54\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(26170, 80, 80, 1)"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# get pairs\n",
    "data_pairs, paired_labels = create_pairs_with_sims(melspecs, metadata, labels, playlist_similarities, 5)\n",
    "\n",
    "# check - how many instances have we created?\n",
    "data_pairs[0].shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Prepare the Siamese Neural Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the model\n",
    "model_sim = create_siamese_network()\n",
    "\n",
    "# define the optimizer\n",
    "opt = Adam(lr=0.0001, decay=0.001)\n",
    "\n",
    "# compile the model\n",
    "model_sim.compile(loss      = contrastive_loss, \n",
    "                  optimizer = opt)"
   ]
  },
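  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "How graded labels act on the loss can be sketched in NumPy. This assumes that `contrastive_loss` follows the common form `y * d^2 + (1 - y) * max(margin - d, 0)^2`, where `d` is the distance predicted for a pair; the function actually used in this notebook is defined earlier and may differ. The name `contrastive_loss_np` and the value `margin = 1.0` are assumptions as well.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def contrastive_loss_np(y_true, distance, margin=1.0):\n",
    "    # assumed form; not necessarily identical to the notebook's contrastive_loss\n",
    "    pull = y_true * np.square(distance)\n",
    "    push = (1.0 - y_true) * np.square(np.maximum(margin - distance, 0.0))\n",
    "    return np.mean(pull + push)\n",
    "\n",
    "# one pair whose embeddings are close to each other\n",
    "distance = np.array([0.2])\n",
    "\n",
    "loss_similar    = contrastive_loss_np(np.array([1.0]), distance)  # only the pull term\n",
    "loss_dissimilar = contrastive_loss_np(np.array([0.0]), distance)  # only the push term\n",
    "loss_graded     = contrastive_loss_np(np.array([0.8]), distance)  # weighted mix of both\n",
    "```\n",
    "\n",
    "Under this assumption a label between 0 and 1 blends the two terms, so a pair labelled 0.8 is pulled together less strictly than an identical or same-playlist pair."
   ]
  },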
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train the model on the adapted data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train on 24170 samples, validate on 2000 samples\n",
      "Epoch 1/10\n",
      "24170/24170 [==============================] - 24s 984us/step - loss: 4.0022 - val_loss: 0.9355\n",
      "Epoch 2/10\n",
      "24170/24170 [==============================] - 81s 3ms/step - loss: 0.4463 - val_loss: 0.7617\n",
      "Epoch 3/10\n",
      "24170/24170 [==============================] - 86s 4ms/step - loss: 0.3375 - val_loss: 0.3240\n",
      "Epoch 4/10\n",
      "24170/24170 [==============================] - 89s 4ms/step - loss: 0.2946 - val_loss: 0.2845\n",
      "Epoch 5/10\n",
      "24170/24170 [==============================] - 71s 3ms/step - loss: 0.2716 - val_loss: 1.0104\n",
      "Epoch 6/10\n",
      "24170/24170 [==============================] - 72s 3ms/step - loss: 0.2569 - val_loss: 0.3103\n",
      "Epoch 7/10\n",
      "24170/24170 [==============================] - 72s 3ms/step - loss: 0.2464 - val_loss: 0.2793\n",
      "Epoch 8/10\n",
      "24170/24170 [==============================] - 49s 2ms/step - loss: 0.2383 - val_loss: 0.2695\n",
      "Epoch 9/10\n",
      "24170/24170 [==============================] - 23s 938us/step - loss: 0.2316 - val_loss: 0.2960\n",
      "Epoch 10/10\n",
      "24170/24170 [==============================] - 22s 899us/step - loss: 0.2259 - val_loss: 90.0341\n"
     ]
    }
   ],
   "source": [
    "model_sim.fit([data_pairs[0][:-2000], data_pairs[1][:-2000]], \n",
    "                paired_labels[:-2000], \n",
    "                batch_size       = 64, \n",
    "                verbose          = 1, \n",
    "                epochs           = 10,\n",
    "                shuffle          = False, # important !\n",
    "                validation_data = [[data_pairs[0][-2000:], data_pairs[1][-2000:]], paired_labels[-2000:]]);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_sim.save_weights(\"./models/part_2b_siamese_network_with_genre_similarities.h5\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's check the results for some individual tracks by supplying the index to our dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "artist          Sun Palace                                                                     \n",
      "title           Man of the Severn Wave                                                         \n",
      "album           Give Me a Perfect World                                                        \n",
      "url             http://www.magnatune.com/artists/albums/sunpalace-perfectworld/                \n",
      "original_url    http://he3.magnatune.com/all/07-Man%20of%20the%20Severn%20Wave-Sun%20Palace.mp3\n",
      "Name: 14722, dtype: object\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>artist</th>\n",
       "      <th>title</th>\n",
       "      <th>album</th>\n",
       "      <th>player</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>featurespace_id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1511</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Man of the Severn Wave</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-07-man_of_the_severn_wave-117-146.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2160</th>\n",
       "      <td>Etherine</td>\n",
       "      <td>Where the sky ends</td>\n",
       "      <td>24 Days</td>\n",
       "      <td><audio src=\"http://localhost:9999/1/etherine-24_days-11-where_the_sky_ends-30-59.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2264</th>\n",
       "      <td>Emma's Mini</td>\n",
       "      <td>Disconnected</td>\n",
       "      <td>Beat Generation Mad Trick</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/emmas_mini-beat_generation_mad_trick-02-disconnected-291-320.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1874</th>\n",
       "      <td>Spinecar</td>\n",
       "      <td>Autophile</td>\n",
       "      <td>Autophile</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/spinecar-autophile-04-autophile-349-378.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1573</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Man of the Severn Wave</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-07-man_of_the_severn_wave-204-233.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1373</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Round and Round</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-02-round_and_round-146-175.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2088</th>\n",
       "      <td>Spinecar</td>\n",
       "      <td>Soul Patch</td>\n",
       "      <td>Autophile</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/spinecar-autophile-03-soul_patch-204-233.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1888</th>\n",
       "      <td>Rocket City Riot</td>\n",
       "      <td>-Inside My Head-</td>\n",
       "      <td>Last Of The Pleasure Seekers</td>\n",
       "      <td><audio src=\"http://localhost:9999/0/rocket_city_riot-last_of_the_pleasure_seekers-06-inside_my_head-59-88.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2167</th>\n",
       "      <td>Monoide</td>\n",
       "      <td>Zara</td>\n",
       "      <td>Zeitpunkt</td>\n",
       "      <td><audio src=\"http://localhost:9999/7/monoide-zeitpunkt-06-zara-320-349.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1391</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Man of the Severn Wave</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-07-man_of_the_severn_wave-59-88.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2169</th>\n",
       "      <td>Kenji Williams</td>\n",
       "      <td>Aura</td>\n",
       "      <td>Worldspirit Soundtrack</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/kenji_williams-worldspirit_soundtrack-06-aura-494-523.mp3\" controls></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "similar(model_sim, 1511)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Improve Performance through Identity\n",
    "\n",
    "So far we have taught the network what is similar and what not, but we have not shown it, what is identical. All input pairs created so far missed to pass identical data. In the following step, we will include identical pairs into the training instances. To emphasis the identity, only identical pairs will be assigned a label of 1. All other similarity values of the lookup-table will be decreased by 0.1. Thus, tracks of the same playlist will have a similarity value 0f 0.9."
   ]
  },
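  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of this label scheme, assuming (as the notebook's own code does) that only the same-playlist value of 1.0 is lowered to 0.9. `toy_table` and `pair_label` are hypothetical names used for illustration only.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical two-playlist lookup-table\n",
    "toy_table = pd.DataFrame([[1.0, 0.8],\n",
    "                          [0.8, 1.0]],\n",
    "                         index=['pop', 'techno'],\n",
    "                         columns=['pop', 'techno'])\n",
    "\n",
    "# lower the same-playlist similarity from 1.0 to 0.9\n",
    "values = toy_table.to_numpy().copy()\n",
    "np.fill_diagonal(values, 0.9)\n",
    "toy_table = pd.DataFrame(values, index=toy_table.index, columns=toy_table.columns)\n",
    "\n",
    "def pair_label(track_a, playlist_a, track_b, playlist_b):\n",
    "    # only a track paired with itself receives the maximum label\n",
    "    if track_a == track_b:\n",
    "        return 1.0\n",
    "    return toy_table.loc[playlist_a, playlist_b]\n",
    "\n",
    "identical     = pair_label(1, 'pop', 1, 'pop')\n",
    "same_playlist = pair_label(1, 'pop', 2, 'pop')\n",
    "related       = pair_label(1, 'pop', 3, 'techno')\n",
    "```\n",
    "\n",
    "The gap between 1.0 and 0.9 is what lets the network distinguish the same recording from a different track of the same playlist."
   ]
  },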
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_pairs_with_sims_and_identity(feature_data, metadata, labels, playlist_similarities, num_triplets_per_track):\n",
    "    \n",
    "    data_ref  = []\n",
    "    data_dif  = []\n",
    "    gt_labels = []\n",
    "    \n",
    "    pbar = progressbar.ProgressBar(max_value=metadata.shape[0])\n",
    "    \n",
    "    i = 0\n",
    "    \n",
    "    for row_id, q_track in pbar(metadata.iterrows()):\n",
    "        \n",
    "        for j in range(num_triplets_per_track):\n",
    "            \n",
    "            label_differences = np.abs(labels - labels.loc[row_id]).sum(axis=1)\n",
    "            \n",
    "            similar_instances    = label_differences[label_differences == 0]\n",
    "            dissimilar_instances = label_differences[label_differences != 0] \n",
    "            \n",
    "            # search similar and dissimilar examples\n",
    "            pos_example_idx      = similar_instances.sample(1).index.values[0]\n",
    "            neg_example_idx      = dissimilar_instances.sample(1).index.values[0]\n",
    "            \n",
    "            # create feature triplets\n",
    "            feat_id_ref          = metadata.loc[row_id].featurespace_id\n",
    "            feat_id_pos          = metadata.loc[pos_example_idx].featurespace_id\n",
    "            feat_id_neg          = metadata.loc[neg_example_idx].featurespace_id\n",
    "            \n",
    "            #print([feat_id_ref, feat_id_pos, feat_id_neg])\n",
    "            \n",
    "            if j == 0:\n",
    "                \n",
    "                data_ref.append(feature_data[feat_id_ref])\n",
    "                data_dif.append(feature_data[feat_id_ref])\n",
    "                gt_labels.append(1)\n",
    "                \n",
    "                data_ref.append(feature_data[feat_id_ref])\n",
    "                data_dif.append(feature_data[feat_id_neg])\n",
    "                \n",
    "                label = playlist_similarities.loc[labels.loc[row_id].idxmax(axis=1), \n",
    "                                                  labels.loc[neg_example_idx].idxmax(axis=1)]\n",
    "                gt_labels.append(label)\n",
    "            \n",
    "            # genuine pair\n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_pos])\n",
    "            \n",
    "            label = playlist_similarities.loc[labels.loc[row_id].idxmax(axis=1), \n",
    "                                              labels.loc[pos_example_idx].idxmax(axis=1)]\n",
    "                        \n",
    "            gt_labels.append(label)\n",
    "            \n",
    "            data_ref.append(feature_data[feat_id_ref])\n",
    "            data_dif.append(feature_data[feat_id_neg])\n",
    "            \n",
    "            label = playlist_similarities.loc[labels.loc[row_id].idxmax(axis=1), \n",
    "                                              labels.loc[neg_example_idx].idxmax(axis=1)]\n",
    "            \n",
    "            gt_labels.append(label)\n",
    "            \n",
    "        #i += 1\n",
    "        #\n",
    "        #if i > 10:\n",
    "        #    break\n",
    "\n",
    "    return [np.asarray(data_ref), np.asarray(data_dif)], np.asarray(gt_labels)\n",
    "\n",
    "#data_pairs, paired_labels = create_pairs_with_sims_and_identity(melspecs, metadata, labels, playlist_similarities, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>blues</th>\n",
       "      <th>classical</th>\n",
       "      <th>country</th>\n",
       "      <th>jazz</th>\n",
       "      <th>pop</th>\n",
       "      <th>rock</th>\n",
       "      <th>techno</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>blues</th>\n",
       "      <td>0.9</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>classical</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.9</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>country</th>\n",
       "      <td>0.8</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.9</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jazz</th>\n",
       "      <td>0.3</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.9</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>pop</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.9</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>rock</th>\n",
       "      <td>0.1</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.3</td>\n",
       "      <td>0.1</td>\n",
       "      <td>0.2</td>\n",
       "      <td>0.9</td>\n",
       "      <td>0.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>techno</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.8</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.9</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "           blues  classical  country  jazz  pop  rock  techno\n",
       "blues      0.9    0.0        0.8      0.3   0.0  0.1   0.0   \n",
       "classical  0.0    0.9        0.0      0.0   0.0  0.0   0.0   \n",
       "country    0.8    0.0        0.9      0.1   0.2  0.3   0.0   \n",
       "jazz       0.3    0.0        0.1      0.9   0.0  0.1   0.0   \n",
       "pop        0.0    0.0        0.2      0.0   0.9  0.2   0.8   \n",
       "rock       0.1    0.0        0.3      0.1   0.2  0.9   0.0   \n",
       "techno     0.0    0.0        0.0      0.0   0.8  0.0   0.9   "
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "playlist_similarities[playlist_similarities == 1] = 0.9\n",
    "playlist_similarities"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Execute the function to prepare the data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100% (2617 of 2617) |####################| Elapsed Time: 0:00:49 Time:  0:00:49\n"
     ]
    }
   ],
   "source": [
    "data_pairs, paired_labels = create_pairs_with_sims_and_identity(melspecs, metadata, labels, playlist_similarities, 5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Prepare the Neural Network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "# define the model\n",
    "model_sim_id = create_siamese_network()\n",
    "\n",
    "# define the optimizer\n",
    "opt = Adam(lr=0.0001, decay=0.001)\n",
    "\n",
    "# compile the model\n",
    "model_sim_id.compile(loss      = contrastive_loss, \n",
    "                  optimizer = opt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train the model on the adapted data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Train on 29404 samples, validate on 2000 samples\n",
      "Epoch 1/10\n",
      "29404/29404 [==============================] - 28s 938us/step - loss: 3.4786 - val_loss: 0.5906\n",
      "Epoch 2/10\n",
      "29404/29404 [==============================] - 26s 900us/step - loss: 0.4205 - val_loss: 0.3831\n",
      "Epoch 3/10\n",
      "29404/29404 [==============================] - 26s 891us/step - loss: 0.3181 - val_loss: 63.9344\n",
      "Epoch 4/10\n",
      "29404/29404 [==============================] - 26s 883us/step - loss: 0.2753 - val_loss: 0.3352\n",
      "Epoch 5/10\n",
      "29404/29404 [==============================] - 26s 891us/step - loss: 0.2506 - val_loss: 0.2608\n",
      "Epoch 6/10\n",
      "29404/29404 [==============================] - 26s 887us/step - loss: 0.2348 - val_loss: 0.2809\n",
      "Epoch 7/10\n",
      "29404/29404 [==============================] - 26s 880us/step - loss: 0.2234 - val_loss: 0.2594\n",
      "Epoch 8/10\n",
      "29404/29404 [==============================] - 26s 891us/step - loss: 0.2150 - val_loss: 0.2471\n",
      "Epoch 9/10\n",
      "29404/29404 [==============================] - 26s 881us/step - loss: 0.2082 - val_loss: 0.2295\n",
      "Epoch 10/10\n",
      "29404/29404 [==============================] - 26s 881us/step - loss: 0.2025 - val_loss: 0.2946\n"
     ]
    }
   ],
   "source": [
    "model_sim_id.fit([data_pairs[0][:-2000], data_pairs[1][:-2000]], \n",
    "                  paired_labels[:-2000], \n",
    "                  batch_size       = 64, \n",
    "                  verbose          = 1, \n",
    "                  epochs           = 10,\n",
    "                  shuffle          = False, # important !\n",
    "                  validation_data = [[data_pairs[0][-2000:], data_pairs[1][-2000:]], paired_labels[-2000:]]);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_sim.save_weights(\"./models/part_2b_siamese_network_with_genre_similarities_and_identity.h5\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's check the results for some individual tracks by supplying the index to our dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "artist          Sun Palace                                                                     \n",
      "title           Man of the Severn Wave                                                         \n",
      "album           Give Me a Perfect World                                                        \n",
      "url             http://www.magnatune.com/artists/albums/sunpalace-perfectworld/                \n",
      "original_url    http://he3.magnatune.com/all/07-Man%20of%20the%20Severn%20Wave-Sun%20Palace.mp3\n",
      "Name: 14722, dtype: object\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>artist</th>\n",
       "      <th>title</th>\n",
       "      <th>album</th>\n",
       "      <th>player</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>featurespace_id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1511</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Man of the Severn Wave</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-07-man_of_the_severn_wave-117-146.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1490</th>\n",
       "      <td>Norine Braun</td>\n",
       "      <td>I'd Guess You'd Say</td>\n",
       "      <td>Modern Anguish</td>\n",
       "      <td><audio src=\"http://localhost:9999/4/norine_braun-modern_anguish-02-id_guess_youd_say-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1633</th>\n",
       "      <td>Sun Palace</td>\n",
       "      <td>Palace Welcome</td>\n",
       "      <td>Give Me a Perfect World</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/sun_palace-give_me_a_perfect_world-06-palace_welcome-0-29.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>895</th>\n",
       "      <td>Various Artists</td>\n",
       "      <td>Suerte Mijo (Arthur Yoria)</td>\n",
       "      <td>The 2007 Magnatune Records Sampler</td>\n",
       "      <td><audio src=\"http://localhost:9999/9/various_artists-the_2007_magnatune_records_sampler-06-suerte_mijo_arthur_yoria-146-175.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2358</th>\n",
       "      <td>Mercy Machine</td>\n",
       "      <td>Invisible (Cosmic Sea Shanty Mix)</td>\n",
       "      <td>In Your Bed - the remixes</td>\n",
       "      <td><audio src=\"http://localhost:9999/8/mercy_machine-in_your_bed__the_remixes-08-invisible_cosmic_sea_shanty_mix-262-291.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1001</th>\n",
       "      <td>William Brooks</td>\n",
       "      <td>The Hanging of Allen Scott Johnson</td>\n",
       "      <td>Blue Ribbon - The Best of William Brooks</td>\n",
       "      <td><audio src=\"http://localhost:9999/c/william_brooks-blue_ribbon__the_best_of_william_brooks-12-the_hanging_of_allen_scott_johnson-0-29.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1816</th>\n",
       "      <td>Somadrone</td>\n",
       "      <td>WNQD</td>\n",
       "      <td>Trancelucent</td>\n",
       "      <td><audio src=\"http://localhost:9999/a/somadrone-trancelucent-03-wnqd-0-29.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2558</th>\n",
       "      <td>Etherfysh</td>\n",
       "      <td>Orange</td>\n",
       "      <td>Box of Fysh</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/etherfysh-box_of_fysh-03-orange-88-117.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1551</th>\n",
       "      <td>Self Delusion</td>\n",
       "      <td>Dead Star</td>\n",
       "      <td>Happiness Hurts Me</td>\n",
       "      <td><audio src=\"http://localhost:9999/9/self_delusion-happiness_hurts_me-10-dead_star-30-59.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>606</th>\n",
       "      <td>Les Filles de Sainte Colombe</td>\n",
       "      <td>Pieces a tre Viola di Gamba (Schwartzkopff)</td>\n",
       "      <td>German music for Viols and Harpsichord</td>\n",
       "      <td><audio src=\"http://localhost:9999/3/les_filles_de_sainte_colombe-german_music_for_viols_and_harpsichord-01-pieces_a_tre_viola_di_gamba_schwartzkopff-320-349.mp3\" controls></td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2452</th>\n",
       "      <td>Magnatune Compilation</td>\n",
       "      <td>Mr Gelatine_ Unknown Quantity</td>\n",
       "      <td>Electronica</td>\n",
       "      <td><audio src=\"http://localhost:9999/2/magnatune_compilation-electronica-01-mr_gelatine_unknown_quantity-117-146.mp3\" controls></td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "similar(model_sim_id, 1511)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "notify_time": "5",
  "toc": {
   "toc_cell": false,
   "toc_number_sections": true,
   "toc_threshold": 6,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}