{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Finding Similar Songs - Part 1: Distance Based Search\n", "\n", "The first part of this tutorial series demonstrates a distance based similarity search approach, based on extracted content descriptors and distance metrics. \n", "\n", "## Part 1 - Overview\n", "\n", "1. Introductions & Requirements\n", "2. Loading data\n", "3. Preprocess data\n", "4. Define the Similarity Model\n", "5. Optimize the Model\n", "\n", "# Short Introduction to Music Similarity Retrieval\n", "\n", "The objective of Music Similarity estimation or retrieval is to estimate the notion of similarity between two given tracks. A central part of such an approaches is the definition of a measure for similarity which is further affected by the approach taken to extract the relevant information. One approach is to analyze contextual data such as user generated listening behaviour data (e.g. play/skip-counts, user-tags, ratings, etc.). The approach followed by this tutorial is based on the music content itself and largely focuses on the notion of *acoustic similarity*. Music features are extracted from the audio content. The resulting music descriptors are high-dimensional numeric vectors and the accumulation of all feature vectors of a collection forms a vector-space. The general principle of content based similarity estimations is based on the assumption that numerical differences are an expression of perceptual dissimilarity. Different metrics such as the Manhattan (L1) or the Euclidean Distance (L2) or non-metric similarity functions such as the Kullback-Leibler divergence are used to estimate the numerical similarity of the feature vectors." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Requirements\n", "\n", "Please follow the instructions on the tutorial's Github page to install the following dependencies to run this tutorial:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2017-08-24T10:20:32.488000Z", "start_time": "2017-08-24T10:20:32.483000Z" } }, "outputs": [], "source": [ "# visualization\n", "%matplotlib inline\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "matplotlib.style.use('ggplot')\n", "\n", "# numeric and scientific processing\n", "import numpy as np\n", "import pandas as pd\n", "\n", "# misc\n", "import os\n", "import progressbar\n", "\n", "from IPython.display import HTML, display\n", "pd.set_option('display.max_colwidth', -1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Loading Data\n", "\n", "Before we can train our models we first have to get some data." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "DATASET_PATH = \"D:/Research/Data/MIR/MagnaTagATune/ISMIR2018\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Feature Data\n", "\n", "load feature data from numpy pickle" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "with np.load(\"%s/ISMIR2018_tut_Magnagtagatune_rp_features.npz\" % DATASET_PATH) as npz:\n", " features_rp = npz[\"rp\"]\n", " features_ssd = npz[\"ssd\"]\n", " clip_id = npz[\"clip_id\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "prepare feature-metadata for alignment with dataset meta-data" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "feature_metadata = pd.DataFrame({\"featurespace_id\": np.arange(features_rp.shape[0]), \n", " \"clip_id\" : clip_id})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Metadata\n", "\n", "load meta-data from csv-file." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | clip_id | \n", "mp3_path | \n", "track_number | \n", "title | \n", "artist | \n", "album | \n", "url | \n", "segmentStart | \n", "segmentEnd | \n", "original_url | \n", "
---|---|---|---|---|---|---|---|---|---|---|
19200 | \n", "42150 | \n", "D:/Research/Data/MIR/MagnaTagATune/mp3_full/f/professor_armchair-too_much_mustard-10-bethena-117-146.mp3 | \n", "10 | \n", "Bethena | \n", "Professor Armchair | \n", "Too Much Mustard | \n", "http://www.magnatune.com/artists/albums/armchair-octopants/ | \n", "117 | \n", "146 | \n", "http://he3.magnatune.com/all/10-Bethena-Professor%20Armchair.mp3 | \n", "
7745 | \n", "16994 | \n", "D:/Research/Data/MIR/MagnaTagATune/mp3_full/c/liquid_zen-seventythree-04-close-262-291.mp3 | \n", "4 | \n", "Close | \n", "Liquid Zen | \n", "Seventythree | \n", "http://www.magnatune.com/artists/albums/liquid-seventythree/ | \n", "262 | \n", "291 | \n", "http://he3.magnatune.com/all/04-Close%20-%20Liquid%20Zen.mp3 | \n", "
25416 | \n", "57455 | \n", "D:/Research/Data/MIR/MagnaTagATune/mp3_full/6/doc_rossi-demarzi6_sonatas_for_cetra_o_kitara-23-sonata_vi_allegro-59-88.mp3 | \n", "23 | \n", "Sonata VI Allegro | \n", "Doc Rossi | \n", "Demarzi-6 Sonatas for Cetra o Kitara | \n", "http://www.magnatune.com/artists/albums/rossi-demarzi/ | \n", "59 | \n", "88 | \n", "http://he3.magnatune.com/all/23-Sonata%20VI%20Allegro-Doc%20Rossi.mp3 | \n", "
613 | \n", "1473 | \n", "D:/Research/Data/MIR/MagnaTagATune/mp3_full/b/philharmonia_baroque-beethoven_symphonies_no_3_eroica_and_no_8-01-eroica_1st-117-146.mp3 | \n", "1 | \n", "Eroica 1st | \n", "Philharmonia Baroque | \n", "Beethoven Symphonies No 3 Eroica and No 8 | \n", "http://www.magnatune.com/artists/albums/pb-eroica/ | \n", "117 | \n", "146 | \n", "http://he3.magnatune.com/all/01-Eroica%201st-Philharmonia%20Baroque.mp3 | \n", "
15095 | \n", "33011 | \n", "D:/Research/Data/MIR/MagnaTagATune/mp3_full/0/barbara_leoni-human_needs-07-ring_around_the_rosey-88-117.mp3 | \n", "7 | \n", "Ring around the rosey | \n", "Barbara Leoni | \n", "Human Needs | \n", "http://www.magnatune.com/artists/albums/leoni-human/ | \n", "88 | \n", "117 | \n", "http://he3.magnatune.com/all/07-Ring%20around%20the%20rosey-Barbara%20Leoni.mp3 | \n", "
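With the features and the metadata in memory, the distance-based retrieval described in the introduction can already be sketched. The snippet below is a minimal illustration rather than the tutorial's own similarity model (which is defined and optimized in the following sections). It assumes SciPy is available and that the metadata loaded above sits in a DataFrame named `metadata` (the actual variable name is not visible in this excerpt); it joins that table with `feature_metadata` on `clip_id` and ranks all clips by their Euclidean (L2) distance to an arbitrary query clip in the SSD feature space:

```python
from scipy.spatial.distance import cdist

# align the dataset metadata with the feature space via the shared clip_id
# ("metadata" is an assumed name for the DataFrame loaded from the CSV above)
aligned = feature_metadata.merge(metadata, on="clip_id", how="inner")

# pick an arbitrary clip as the query and compute its distance to every clip
query_idx = aligned["featurespace_id"].iloc[0]
query_vec = features_ssd[query_idx].reshape(1, -1)
distances = cdist(query_vec, features_ssd, metric="euclidean")[0]

# attach the distances and list the closest clips
# (the query itself appears first with distance 0)
aligned = aligned.assign(distance=distances[aligned["featurespace_id"].values])
print(aligned.sort_values("distance")[["clip_id", "artist", "title", "distance"]].head(10))
```

Replacing `metric="euclidean"` with `"cityblock"` gives the Manhattan (L1) ranking; the remaining steps of Part 1 (preprocessing, defining and optimizing the similarity model) refine this basic scheme.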