{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Classical Music Recommendation Playground\n", "\n", "This notebook shows how to implement a simple recommender system following two different approaches: **Collaborative Filtering** (user-based) and **Content Based** recommendation.\n", "\n", "> DISCLAIMER:\n", "> The dataset used here is NOT a real dataset: it has been artificially generated for the purposes of this tutorial.\n", "> It absolutely should NOT be used as training data for any application." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import scipy.spatial.distance as distance" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data import\n", "\n", "### data.csv\n", "\n", "User listening experience dataset. 100 **users** -- numeric identifiers from 0 to 99 -- interact (or not) with 100 **items** (classical composers).\n", "\n", "1 = interaction, 0 = no interaction (implicit feedback)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Wolfgang Amadeus Mozart | Franz Liszt | Joseph Haydn | Johannes Brahms | Robert Schumann | Antonio Vivaldi | Roland de Lassus | Frédéric Chopin | Franz Schubert | Domenico Scarlatti | ... | Arnold Schoenberg | Bruno Mantovani | Antonín Dvořák | Piotr Ilitch Tchaïkovski | Johann Christian Bach | Aaron Copland | Ferruccio Busoni | Ralph Vaughan Williams | Zoltán Kodály | Leonard Bernstein |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
| 2 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
5 rows × 100 columns
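
The interaction table above is the kind of input that user-based collaborative filtering works on. The following is only a minimal sketch of that idea, not the notebook's own implementation: the toy matrix, the function name `recommend_user_based`, the choice of cosine similarity, and the parameters `k` and `n` are all assumptions made for illustration. It also assumes every user has at least one interaction, because the cosine distance of an all-zero row is undefined.

```python
import numpy as np
import pandas as pd
import scipy.spatial.distance as distance

# Toy implicit-feedback matrix (hypothetical values, NOT taken from data.csv):
# rows are users, columns are composers, 1 = interaction, 0 = no interaction.
toy = pd.DataFrame(
    [[1, 1, 0, 0],
     [1, 1, 1, 0],
     [0, 0, 1, 1],
     [1, 0, 0, 0]],
    columns=["Mozart", "Liszt", "Haydn", "Brahms"],
)

def recommend_user_based(interactions, user, k=2, n=2):
    """Rank unseen items for `user` by similarity-weighted votes of the k nearest users."""
    matrix = interactions.to_numpy(dtype=float)
    # cdist returns the cosine *distance*; similarity is 1 - distance.
    sims = 1.0 - distance.cdist(matrix, matrix, metric="cosine")
    user_pos = interactions.index.get_loc(user)
    user_sims = sims[user_pos].copy()
    user_sims[user_pos] = -np.inf  # a user is never their own neighbour
    neighbours = np.argsort(user_sims)[::-1][:k]
    # Each neighbour votes for the items they interacted with, weighted by similarity.
    scores = pd.Series(user_sims[neighbours] @ matrix[neighbours],
                       index=interactions.columns)
    unseen = interactions.loc[user] == 0
    return scores[unseen].sort_values(ascending=False).head(n)

recommendations = recommend_user_based(toy, user=0)
print(recommendations)
```

With the real data, the same function could be called on the DataFrame loaded from `data.csv`, provided users are on the rows and composers on the columns as in the table above.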

| uri | label | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| http://data.doremus.org/artist/4802a043-23bb-3b8d-a443-4a3bd22ccc63 | Wolfgang Amadeus Mozart | -0.049424 | 0.012972 | 0.030435 | 0.672381 | 0.705714 | -0.003292 | 0.040397 | 0.033374 | 0.023954 | -0.085231 | 0.158234 | 0.044664 | 0.018166 | 0.010425 | -0.111956 | 0.142959 | -0.030154 |
| http://data.doremus.org/artist/aabcd2ee-ac9b-30f2-8096-e9de8b3c7a81 | Franz Liszt | 0.000628 | -0.008797 | -0.007513 | 0.724762 | 0.796191 | 0.010984 | 0.017313 | 0.072120 | 0.025766 | -0.085152 | 0.156424 | 0.043058 | 0.020398 | 0.032555 | -0.048986 | 0.033035 | -0.001050 |
| http://data.doremus.org/artist/12fa21ff-cfa4-31d6-87d9-a22315193b04 | Joseph Haydn | -2.000000 | -2.000000 | -2.000000 | 0.649524 | 0.722857 | -0.001422 | 0.035583 | 0.029520 | 0.023965 | -0.085292 | 0.158400 | 0.042192 | 0.008899 | 0.007677 | -0.091435 | 0.147538 | 0.000403 |
| http://data.doremus.org/artist/f9a2ac39-a62d-3be2-8abb-e564de0ec96d | Johannes Brahms | -0.010495 | -0.003960 | 0.000920 | 0.745714 | 0.806667 | 0.003321 | 0.019724 | 0.059037 | 0.024170 | -0.085409 | 0.158257 | 0.044797 | 0.028943 | 0.022953 | -0.063101 | 0.111707 | -0.025850 |
| http://data.doremus.org/artist/f753314d-87a7-32a9-9218-da98ae4f9812 | Robert Schumann | 0.000628 | -0.008797 | -0.007513 | 0.723810 | 0.767619 | 0.003386 | 0.021836 | 0.059332 | 0.023935 | -0.084914 | 0.158104 | 0.045242 | 0.030560 | 0.023502 | -0.129961 | 0.110820 | -0.069069 |

| label | \n", "Wolfgang Amadeus Mozart | \n", "Franz Liszt | \n", "Joseph Haydn | \n", "Johannes Brahms | \n", "Robert Schumann | \n", "Antonio Vivaldi | \n", "Roland de Lassus | \n", "Frédéric Chopin | \n", "Franz Schubert | \n", "Domenico Scarlatti | \n", "... | \n", "Arnold Schoenberg | \n", "Bruno Mantovani | \n", "Antonín Dvořák | \n", "Piotr Ilitch Tchaïkovski | \n", "Johann Christian Bach | \n", "Aaron Copland | \n", "Ferruccio Busoni | \n", "Ralph Vaughan Williams | \n", "Zoltán Kodály | \n", "Leonard Bernstein | \n", "
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| label | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
| Wolfgang Amadeus Mozart | \n", "1 | \n", "0.977317 | \n", "0.818705 | \n", "0.981598 | \n", "0.985071 | \n", "0.814204 | \n", "0.624089 | \n", "0.976958 | \n", "0.987308 | \n", "0.813783 | \n", "... | \n", "0.803884 | \n", "0.573075 | \n", "0.808275 | \n", "0.808963 | \n", "0.818815 | \n", "0.628535 | \n", "0.805652 | \n", "0.80302 | \n", "0.797534 | \n", "0.627795 | \n", "
| Franz Liszt | \n", "0.977317 | \n", "1 | \n", "0.806518 | \n", "0.989032 | \n", "0.983511 | \n", "0.798843 | \n", "0.616185 | \n", "0.98243 | \n", "0.979911 | \n", "0.800846 | \n", "... | \n", "0.807869 | \n", "0.576931 | \n", "0.813884 | \n", "0.813808 | \n", "0.805256 | \n", "0.6364 | \n", "0.807672 | \n", "0.80757 | \n", "0.798923 | \n", "0.635377 | \n", "
| Joseph Haydn | \n", "0.994142 | \n", "0.979343 | \n", "1 | \n", "0.982494 | \n", "0.984184 | \n", "0.988096 | \n", "0.794793 | \n", "0.977726 | \n", "0.988863 | \n", "0.990187 | \n", "... | \n", "0.975743 | \n", "0.742932 | \n", "0.981111 | \n", "0.982625 | \n", "0.993293 | \n", "0.799882 | \n", "0.976397 | \n", "0.974162 | \n", "0.970276 | \n", "0.798628 | \n", "
| Johannes Brahms | \n", "0.981598 | \n", "0.989032 | \n", "0.809113 | \n", "1 | \n", "0.988774 | \n", "0.800753 | \n", "0.614646 | \n", "0.979436 | \n", "0.982709 | \n", "0.802281 | \n", "... | \n", "0.815153 | \n", "0.578398 | \n", "0.818782 | \n", "0.817388 | \n", "0.807047 | \n", "0.638088 | \n", "0.81323 | \n", "0.811394 | \n", "0.803853 | \n", "0.637382 | \n", "
| Robert Schumann | \n", "0.985071 | \n", "0.983511 | \n", "0.810504 | \n", "0.988774 | \n", "1 | \n", "0.804982 | \n", "0.617933 | \n", "0.976734 | \n", "0.981611 | \n", "0.804426 | \n", "... | \n", "0.810159 | \n", "0.576856 | \n", "0.813493 | \n", "0.812816 | \n", "0.810574 | \n", "0.634682 | \n", "0.814013 | \n", "0.809981 | \n", "0.798908 | \n", "0.634023 | \n", "
5 rows × 100 columns
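
A composer-by-composer similarity table like the one above can be derived from per-composer embedding vectors. The sketch below uses plain cosine similarity on a tiny, made-up embedding table. The function name `cosine_similarity_matrix` and the three "Composer" rows are hypothetical. Note that plain cosine similarity is symmetric, whereas the table above is not (compare the Joseph Haydn row with the Joseph Haydn column), so the measure actually used to produce it presumably differs from this one.

```python
import numpy as np
import pandas as pd
import scipy.spatial.distance as distance

# Hypothetical 3-dimensional embeddings, NOT taken from the dataset.
emb = pd.DataFrame(
    [[1.0, 0.0, 0.0],
     [0.8, 0.6, 0.0],
     [0.0, 0.0, 1.0]],
    index=["Composer A", "Composer B", "Composer C"],
)

def cosine_similarity_matrix(vectors):
    """Return a square DataFrame of pairwise cosine similarities between rows."""
    # pdist gives the condensed cosine *distances*; squareform expands them
    # into a full matrix with zeros on the diagonal.
    dists = distance.squareform(distance.pdist(vectors.to_numpy(), metric="cosine"))
    return pd.DataFrame(1.0 - dists, index=vectors.index, columns=vectors.index)

item_sim = cosine_similarity_matrix(emb)
print(item_sim.round(3))
```

Missing-value placeholders in the embeddings would need to be handled before calling a function like this, since they would otherwise be treated as ordinary coordinates.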

|  | Wolfgang Amadeus Mozart | Franz Liszt | Joseph Haydn | Johannes Brahms | Robert Schumann | Antonio Vivaldi | Roland de Lassus | Frédéric Chopin | Franz Schubert | Domenico Scarlatti | ... | Arnold Schoenberg | Bruno Mantovani | Antonín Dvořák | Piotr Ilitch Tchaïkovski | Johann Christian Bach | Aaron Copland | Ferruccio Busoni | Ralph Vaughan Williams | Zoltán Kodály | Leonard Bernstein |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
1 rows × 100 columns
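
A profile with very few interactions, such as the single-row table above, is where a content-based approach can help, because it needs only item-to-item similarities rather than overlapping users. The following is a hedged sketch under assumed inputs: the similarity values, the item names, and the function `recommend_content_based` are invented for illustration and are not the notebook's own code.

```python
import numpy as np
import pandas as pd

items = ["Composer A", "Composer B", "Composer C"]

# Hypothetical item-item similarity matrix (values made up for the example).
item_sim = pd.DataFrame(
    [[1.0, 0.9, 0.2],
     [0.9, 1.0, 0.3],
     [0.2, 0.3, 1.0]],
    index=items, columns=items,
)

# New user who interacted with a single item only.
profile = pd.Series([0.0, 0.0, 1.0], index=items)

def recommend_content_based(profile, item_sim, n=2):
    """Score each unseen item by its mean similarity to the items the user interacted with."""
    liked = profile[profile > 0].index
    scores = item_sim.loc[liked].mean(axis=0)
    # Items already in the profile are not recommended again.
    return scores.drop(liked).sort_values(ascending=False).head(n)

content_recs = recommend_content_based(profile, item_sim)
print(content_recs)
```

Averaging over the liked items is one simple aggregation choice; taking the maximum similarity instead would favour items that closely match any single liked composer.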