{ "metadata": { "name": "", "signature": "sha256:6b70441f3cae996ad86e7811d9b5b522459dd633a58161c0b351ee9ce3a9f716" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#Last.fm\n", "\n", "###Author : Kartik Jagdale (https://github.com/kartikjagdale)\n", "`Last.fm is a music discovery service that gives you personalised recommendations based on the music you listen to.`\n", "\n", "Here we are going to do some machine learning and data analysis on the Last.fm dataset in order to recommend the next songs to the user.\n", "\n", "We are going to use the `NearestNeighbors` algorithm to predict the next songs that the user will like to hear.\n", "\n", "**Note**: The dataset retrieved from Last.fm *[LastFM_Matrix.csv]* contains `1257 records and 285 songs`.\n", "\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# First import some essential libraries\n", "import os\n", "import pandas as pd\n", "import numpy as np\n", "from sklearn.metrics.pairwise import cosine_similarity # For calculating the similarity matrix\n", "from sklearn.neighbors import NearestNeighbors" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "DIR_PATH = os.getcwd() # Get current directory\n", "\n", "lfm = pd.read_csv(os.path.join(DIR_PATH, \"LastFM_Matrix.csv\")) # Load dataset\n", "lfm.head() # Display head of the dataset" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ " user a perfect circle abba ac/dc adam green aerosmith afi air \\\n", "0 1 0 0 0 0 0 0 0 \n", "1 33 0 0 0 1 0 0 0 \n", "2 42 0 0 0 0 0 0 0 \n", "3 51 0 0 0 0 0 0 0 \n", "4 62 0 0 0 0 0 0 0 \n", "\n", " alanis morissette alexisonfire ... timbaland tom waits tool \\\n", "0 0 0 ... 0 0 0 \n", "1 0 0 ... 0 0 0 \n", "2 0 0 ... 0 0 0 \n", "3 0 0 ... 0 0 0 \n", "4 0 0 ... 0 0 0 \n", "\n", " tori amos travis trivium u2 underoath volbeat yann tiersen \n", "0 0 0 0 0 0 0 0 \n", "1 0 0 0 0 0 0 0 \n", "2 0 0 0 0 0 0 0 \n", "3 0 0 0 0 0 0 0 \n", "4 0 0 0 0 0 0 0 \n", "\n", "[5 rows x 286 columns]" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's list the column names in the dataset: the `user` column followed by the song names." ] }, { "cell_type": "code", "collapsed": false, "input": [ "songs = pd.DataFrame(lfm.columns)\n", "songs.head(10)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ " 0\n", "0 user\n", "1 a perfect circle\n", "2 abba\n", "3 ac/dc\n", "4 adam green\n", "5 aerosmith\n", "6 afi\n", "7 air\n", "8 alanis morissette\n", "9 alexisonfire" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's keep only the song columns and make a new DataFrame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "lfm_songs = lfm.drop(\"user\", axis=1) # Drop the user column\n", "lfm_songs.head() # Show head" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ " a perfect circle abba ac/dc adam green aerosmith afi air \\\n", "0 0 0 0 0 0 0 0 \n", "1 0 0 0 1 0 0 0 \n", "2 0 0 0 0 0 0 0 \n", "3 0 0 0 0 0 0 0 \n", "4 0 0 0 0 0 0 0 \n", "\n", " alanis morissette alexisonfire alicia keys ... timbaland \\\n", "0 0 0 0 ... 0 \n", "1 0 0 0 ... 0 \n", "2 0 0 0 ... 0 \n", "3 0 0 0 ... 0 \n", "4 0 0 0 ... 0 \n", "\n", " tom waits tool tori amos travis trivium u2 underoath volbeat \\\n", "0 0 0 0 0 0 0 0 0 \n", "1 0 0 0 0 0 0 0 0 \n", "2 0 0 0 0 0 0 0 0 \n", "3 0 0 0 0 0 0 0 0 \n", "4 0 0 0 0 0 0 0 0 \n", "\n", " yann tiersen \n", "0 0 \n", "1 0 \n", "2 0 \n", "3 0 \n", "4 0 \n", "\n", "[5 rows x 285 columns]" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "lfm_songs.shape # Gives the total number of rows and columns" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "(1257, 285)" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate the `cosine_similarity` between songs in order to get the similarity matrix. The user-by-song matrix is transposed so that each row represents a song." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data_similarity = cosine_similarity(lfm_songs.T) # Transpose so that each row is a song\n", "data_similarity" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "array([[ 1. , 0. , 0.01791723, ..., 0.06506 ,\n", " 0.05216405, 0. ],\n", " [ 0. , 1. , 0.05227877, ..., 0. ,\n", " 0.02536731, 0. ],\n", " [ 0.01791723, 0.05227877, 1. , ..., 0.02039967,\n", " 0.13084898, 0. ],\n", " ..., \n", " [ 0.06506 , 0. , 0.02039967, ..., 1. ,\n", " 0. , 0. ],\n", " [ 0.05216405, 0.02536731, 0.13084898, ..., 0. ,\n", " 1. , 0.02969569],\n", " [ 0. , 0. , 0. , ..., 0. ,\n", " 0.02969569, 1. 
]])" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have obtained the similarity matrix, let's use the K-nearest neighbour algorithm to predict the recommendations,\n", "but first we will label the matrix." ] }, { "cell_type": "code", "collapsed": false, "input": [ "type(data_similarity)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "numpy.ndarray" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's convert it into a DataFrame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data_similarity_df = pd.DataFrame(data_similarity, columns=(lfm_songs.columns), index=(lfm_songs.columns))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "data_similarity_df.head() # Similarity matrix" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ " a perfect circle abba ac/dc adam green aerosmith \\\n", "a perfect circle 1.000000 0.000000 0.017917 0.051554 0.062776 \n", "abba 0.000000 1.000000 0.052279 0.025071 0.061056 \n", "ac/dc 0.017917 0.052279 1.000000 0.113154 0.177153 \n", "adam green 0.051554 0.025071 0.113154 1.000000 0.056637 \n", "aerosmith 0.062776 0.061056 0.177153 0.056637 1.000000 \n", "\n", " afi air alanis morissette alexisonfire \\\n", "a perfect circle 0.000000 0.051755 0.060718 0 \n", "abba 0.000000 0.016779 0.029527 0 \n", "ac/dc 0.067894 0.075730 0.038076 0 \n", "adam green 0.000000 0.093386 0.000000 0 \n", "aerosmith 0.000000 0.113715 0.100056 0 \n", "\n", " alicia keys ... timbaland tom waits tool \\\n", "a perfect circle 0.000000 ... 0.047338 0.081200 0.394709 \n", "abba 0.000000 ... 0.000000 0.000000 0.000000 \n", "ac/dc 0.088333 ... 0.044529 0.067894 0.058241 \n", "adam green 0.025416 ... 0.000000 0.146516 0.083789 \n", "aerosmith 0.061898 ... 
0.052005 0.029735 0.025507 \n", "\n", " tori amos travis trivium u2 underoath \\\n", "a perfect circle 0.125553 0.030359 0.111154 0.024398 0.06506 \n", "abba 0.061056 0.029527 0.000000 0.094916 0.00000 \n", "ac/dc 0.039367 0.000000 0.087131 0.122398 0.02040 \n", "adam green 0.056637 0.082169 0.025071 0.022011 0.00000 \n", "aerosmith 0.068966 0.033352 0.000000 0.214423 0.00000 \n", "\n", " volbeat yann tiersen \n", "a perfect circle 0.052164 0.000000 \n", "abba 0.025367 0.000000 \n", "ac/dc 0.130849 0.000000 \n", "adam green 0.023531 0.088045 \n", "aerosmith 0.057307 0.000000 \n", "\n", "[5 rows x 285 columns]" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "data_similarity_df.index.is_unique # Check that no song is repeated" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "True" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we will apply the `NearestNeighbors` algorithm to the similarity matrix to get the recommendations. With `n_neighbors=285`, every song gets a ranking of all 285 songs, nearest first." ] }, { "cell_type": "code", "collapsed": false, "input": [ "neigh = NearestNeighbors(n_neighbors=285)\n", "neigh.fit(data_similarity_df) # Fit the data" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "NearestNeighbors(algorithm='auto', leaf_size=30, metric='minkowski',\n", " metric_params=None, n_neighbors=285, p=2, radius=1.0)" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "# Copy the predicted data to a new DataFrame\n", "model = pd.DataFrame(neigh.kneighbors(data_similarity_df, return_distance=False))\n", "model.head() # Gives integer positions instead of song names" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ " 0 1 2 3 4 5 6 7 8 9 ... 275 276 277 278 \\\n", "0 0 277 81 70 189 206 108 235 264 80 ... 216 147 60 90 \n", "1 1 221 88 165 174 175 83 208 113 103 ... 230 33 213 172 \n", "2 2 128 172 36 190 75 182 116 258 140 ... 218 39 263 248 \n", "3 3 255 267 25 276 47 84 104 266 59 ... 213 11 90 20 \n", "4 4 281 157 158 115 93 106 78 103 262 ... 253 10 19 162 \n", "\n", " 279 280 281 282 283 284 \n", "0 159 254 261 57 32 218 \n", "1 19 79 162 150 125 241 \n", "2 57 68 179 261 17 32 \n", "3 238 79 92 162 150 125 \n", "4 22 241 39 125 20 150 \n", "\n", "[5 rows x 285 columns]" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "# Map the neighbour positions back to song names (plain NumPy arrays, since newer pandas rejects 2-D indexing on an Index)\n", "final_model = pd.DataFrame(data_similarity_df.columns.values[model.values], index=data_similarity_df.index)\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "final_model.head() # Preview the final model" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ " 0 1 2 \\\n", "a perfect circle a perfect circle tool dredg \n", "abba abba robbie williams elvis presley \n", "ac/dc ac/dc iron maiden metallica \n", "adam green adam green the libertines the strokes \n", "aerosmith aerosmith u2 led zeppelin \n", "\n", " 3 4 5 6 \\\n", "a perfect circle deftones nine inch nails porcupine tree godsmack \n", "abba madonna michael jackson mika duffy \n", "ac/dc black sabbath nirvana die toten hosen motorhead \n", "adam green babyshambles tom waits bright eyes editors \n", "aerosmith lenny kravitz guns n roses eric clapton genesis \n", "\n", " 7 8 9 \\\n", "a perfect circle staind the smashing pumpkins dream theater \n", "abba queen groove coverage frank sinatra \n", "ac/dc hammerfall the offspring judas priest \n", "adam green franz ferdinand the streets cocorosie \n", "aerosmith dire straits frank sinatra the rolling stones \n", "\n", " ... 275 276 \\\n", "a perfect circle ... red hot chili peppers katy perry \n", "abba ... slipknot billy talent \n", "ac/dc ... rihanna bloc party \n", "adam green ... rammstein amon amarth \n", "aerosmith ... 
the killers all that remains \n", "\n", " 277 278 279 \\\n", "a perfect circle coldplay ensiferum leona lewis \n", "abba rammstein metallica arctic monkeys \n", "ac/dc the shins the decemberists christina aguilera \n", "adam green ensiferum as i lay dying subway to sally \n", "aerosmith arctic monkeys linkin park atreyu \n", "\n", " 280 281 282 \\\n", "a perfect circle the kooks the pussycat dolls christina aguilera \n", "abba disturbed linkin park killswitch engage \n", "ac/dc death cab for cutie modest mouse the pussycat dolls \n", "adam green disturbed equilibrium linkin park \n", "aerosmith system of a down bloc party in flames \n", "\n", " 283 284 \n", "a perfect circle beyonce rihanna \n", "abba in flames system of a down \n", "ac/dc arcade fire beyonce \n", "adam green killswitch engage in flames \n", "aerosmith as i lay dying killswitch engage \n", "\n", "[5 rows x 285 columns]" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The model above gives us all 285 recommendations, but we want only the **top 10 recommendations**, so let's trim the DataFrame. Column 0 is the song itself, so we keep columns 0 to 10." ] }, { "cell_type": "code", "collapsed": false, "input": [ "top10 = final_model[list(final_model.columns[:11])]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "top10.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ " 0 1 2 \\\n", "a perfect circle a perfect circle tool dredg \n", "abba abba robbie williams elvis presley \n", "ac/dc ac/dc iron maiden metallica \n", "adam green adam green the libertines the strokes \n", "aerosmith aerosmith u2 led zeppelin \n", "\n", " 3 4 5 6 \\\n", "a perfect circle deftones nine inch nails porcupine tree godsmack \n", "abba madonna michael jackson mika duffy \n", "ac/dc black sabbath nirvana die toten hosen motorhead \n", "adam green babyshambles tom waits bright eyes editors \n", "aerosmith lenny kravitz guns n roses eric clapton genesis \n", "\n", " 7 8 9 \\\n", "a perfect circle staind the smashing pumpkins dream theater \n", "abba queen groove coverage frank sinatra \n", "ac/dc hammerfall the offspring judas priest \n", "adam green franz ferdinand the streets cocorosie \n", "aerosmith dire straits frank sinatra the rolling stones \n", "\n", " 10 \n", "a perfect circle opeth \n", "abba hans zimmer \n", "ac/dc bloodhound gang \n", "adam green queens of the stone age \n", "aerosmith deep purple " ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's put our results in a `CSV` file called `top10.csv`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "top10.to_csv(\"top10.csv\", index_label=\"Index\") # Store data in a CSV file" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's read the `CSV` file back to check that it was saved." ] }, { "cell_type": "code", "collapsed": false, "input": [ "pd.read_csv(\"top10.csv\").head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ " Index 0 1 2 \\\n", "0 a perfect circle a perfect circle tool dredg \n", "1 abba abba robbie williams elvis presley \n", "2 ac/dc ac/dc iron maiden metallica \n", "3 adam green adam green the libertines the strokes \n", "4 aerosmith aerosmith u2 led zeppelin \n", "\n", " 3 4 5 6 \\\n", "0 deftones nine inch nails porcupine tree godsmack \n", "1 madonna michael jackson mika duffy \n", "2 black sabbath nirvana die toten hosen motorhead \n", "3 babyshambles tom waits bright eyes editors \n", "4 lenny kravitz guns n roses eric clapton genesis \n", "\n", " 7 8 9 \\\n", "0 staind the smashing pumpkins dream theater \n", "1 queen groove coverage frank sinatra \n", "2 hammerfall the offspring judas priest \n", "3 franz ferdinand the streets cocorosie \n", "4 dire straits frank sinatra the rolling stones \n", "\n", " 10 \n", "0 opeth \n", "1 hans zimmer \n", "2 bloodhound gang \n", "3 queens of the stone age \n", "4 deep purple " ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##`Conclusion`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To conclude, we have created a model that uses Last.fm data to recommend the next songs a user will like to hear.\n", "\n", "Further, we can now use this model to build an `API` and use it in our website or web app to recommend songs to the user.\n", "\n", "\n", "***GitHub link***: https://github.com/kartikjagdale/Last.fm-Song-Recommender" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 21 } ], "metadata": {} } ] }