{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# [FMA: A Dataset For Music Analysis](https://github.com/mdeff/fma)\n",
"\n",
"Michaƫl Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.\n",
"\n",
"## Free Music Archive web API\n",
"\n",
"All the data in the `raw_*.csv` tables was collected from the Free Music Archive [public API](https://freemusicarchive.org/api). With this notebook, you can:\n",
"* reconstruct the original data, \n",
"* update some fields, e.g. the `track listens` (play count),\n",
"* augment the data with newer fields wich may have been introduced in their API,\n",
"* update the dataset with new songs added to the archive.\n",
"\n",
"Notes:\n",
"* You need a key to access the API, which you can [request online](https://freemusicarchive.org/api/agreement) and write into your `.env` file as a new line reading `FMA_KEY=MYPERSONALKEY`.\n",
"* Requests take some hunderd milliseconds to complete."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import IPython.display as ipd\n",
"import utils"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"fma = utils.FreeMusicArchive(os.environ.get('FMA_KEY'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1 Get recently added tracks\n",
"\n",
"* `track_id` are assigned in monotonically increasing order.\n",
"* Tracks can be removed, so that number does not indicate the number of available tracks."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"156413 4/25/2017 11:28:59 AM Krestovsky\n",
"156409 4/25/2017 03:05:50 PM Parvus Decree\n",
"156408 4/25/2017 03:05:50 PM Parvus Decree\n",
"156407 4/25/2017 02:33:27 PM Parvus Decree\n",
"156406 4/25/2017 02:33:26 PM Parvus Decree\n",
"156405 4/25/2017 02:30:13 PM Surfer Blood\n",
"156404 4/25/2017 02:30:13 PM Surfer Blood\n",
"156403 4/25/2017 02:30:12 PM Surfer Blood\n",
"156402 4/25/2017 02:30:12 PM Surfer Blood\n",
"156401 4/25/2017 02:30:11 PM Surfer Blood\n",
"156400 4/25/2017 02:30:10 PM Surfer Blood\n",
"156399 4/25/2017 01:37:34 PM Jared C. Balogh\n",
"156398 4/25/2017 01:37:34 PM Jared C. Balogh\n",
"156397 4/25/2017 01:37:33 PM Jared C. Balogh\n",
"156396 4/25/2017 01:37:32 PM Jared C. Balogh\n",
"156395 4/25/2017 01:37:32 PM Jared C. Balogh\n",
"156394 4/25/2017 01:37:31 PM Jared C. Balogh\n",
"156393 4/25/2017 01:37:30 PM Jared C. Balogh\n",
"156392 4/25/2017 01:37:30 PM Jared C. Balogh\n",
"156391 4/25/2017 01:37:29 PM Jared C. Balogh\n"
]
}
],
"source": [
"for track_id, artist_name, date_created in zip(*fma.get_recent_tracks()):\n",
" print(track_id, date_created, artist_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2 Get metadata about tracks, albums and artists\n",
"\n",
"Given IDs, we can get information about tracks, albums and artists. See the available fields in the [API documentation](https://freemusicarchive.org/api)."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'album_id': '1',\n",
" 'artist_id': '1',\n",
" 'track_bit_rate': '256000',\n",
" 'track_comments': '0',\n",
" 'track_date_created': '11/26/2008 01:44:43 AM',\n",
" 'track_duration': '02:48',\n",
" 'track_favorites': '2',\n",
" 'track_interest': '4668',\n",
" 'track_listens': '1304',\n",
" 'track_title': 'Food'}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fma.get_track(track_id=2, fields=['track_title', 'track_date_created',\n",
" 'track_duration', 'track_bit_rate',\n",
" 'track_listens', 'track_interest', 'track_comments', 'track_favorites',\n",
" 'artist_id', 'album_id'])"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(['76', '103'], ['Experimental Pop', 'Singer-Songwriter'])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fma.get_track_genres(track_id=20)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'album_comments': '0',\n",
" 'album_date_created': '11/26/2008 01:44:41 AM',\n",
" 'album_date_released': '1/05/2009',\n",
" 'album_favorites': '4',\n",
" 'album_listens': '6101',\n",
" 'album_title': 'AWOL - A Way Of Life',\n",
" 'album_tracks': '7'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fma.get_album(album_id=1, fields=['album_title', 'album_tracks',\n",
" 'album_listens', 'album_comments', 'album_favorites',\n",
" 'album_date_created', 'album_date_released'])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'artist_comments': '0',\n",
" 'artist_favorites': '9',\n",
" 'artist_location': 'New Jersey',\n",
" 'artist_name': 'AWOL'}"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fma.get_artist(artist_id=1, fields=['artist_name', 'artist_location',\n",
" 'artist_comments', 'artist_favorites'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3 Get data, i.e. raw audio\n",
"\n",
"We can download the original audio as well. Tracks are provided by the archive as MP3 with various bit and sample rates."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"track_file = fma.get_track(2, 'track_file')\n",
"fma.download_track(track_file, path='track.mp3')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4 Get genres\n",
"\n",
"Instead of compiling the genres of each track, we can get all the genres present on the archive with some API calls."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"164 genres\n"
]
},
{
"data": {
"text/html": [
"
\n",
"
\n",
" \n",
" \n",
" | \n",
" genre_parent_id | \n",
" genre_title | \n",
" genre_handle | \n",
" genre_color | \n",
"
\n",
" \n",
" genre_id | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 11 | \n",
" 14 | \n",
" Disco | \n",
" Disco | \n",
" #E40089 | \n",
"
\n",
" \n",
" 12 | \n",
" None | \n",
" Rock | \n",
" Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 13 | \n",
" 126 | \n",
" Easy Listening | \n",
" Easy_Listening | \n",
" #5B747C | \n",
"
\n",
" \n",
" 14 | \n",
" None | \n",
" Soul-RnB | \n",
" Soul-RB | \n",
" #330033 | \n",
"
\n",
" \n",
" 15 | \n",
" None | \n",
" Electronic | \n",
" Electronic | \n",
" #FF6600 | \n",
"
\n",
" \n",
" 16 | \n",
" 6 | \n",
" Sound Effects | \n",
" Sound_Effects | \n",
" #003366 | \n",
"
\n",
" \n",
" 17 | \n",
" None | \n",
" Folk | \n",
" Folk | \n",
" #5E6D3F | \n",
"
\n",
" \n",
" 18 | \n",
" 1235 | \n",
" Soundtrack | \n",
" Soundtrack | \n",
" #669933 | \n",
"
\n",
" \n",
" 19 | \n",
" 14 | \n",
" Funk | \n",
" Funk | \n",
" #5E6D3F | \n",
"
\n",
" \n",
" 20 | \n",
" None | \n",
" Spoken | \n",
" Spoken | \n",
" #006699 | \n",
"
\n",
" \n",
" 21 | \n",
" None | \n",
" Hip-Hop | \n",
" Hip-Hop | \n",
" #CC0000 | \n",
"
\n",
" \n",
" 22 | \n",
" 38 | \n",
" Audio Collage | \n",
" Audio_Collage | \n",
" #dddd00 | \n",
"
\n",
" \n",
" 25 | \n",
" 12 | \n",
" Punk | \n",
" Punk | \n",
" #840000 | \n",
"
\n",
" \n",
" 26 | \n",
" 12 | \n",
" Post-Rock | \n",
" Post-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 27 | \n",
" 12 | \n",
" Lo-Fi | \n",
" Lo-fi | \n",
" #840000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" genre_parent_id genre_title genre_handle genre_color\n",
"genre_id \n",
"11 14 Disco Disco #E40089\n",
"12 None Rock Rock #840000\n",
"13 126 Easy Listening Easy_Listening #5B747C\n",
"14 None Soul-RnB Soul-RB #330033\n",
"15 None Electronic Electronic #FF6600\n",
"16 6 Sound Effects Sound_Effects #003366\n",
"17 None Folk Folk #5E6D3F\n",
"18 1235 Soundtrack Soundtrack #669933\n",
"19 14 Funk Funk #5E6D3F\n",
"20 None Spoken Spoken #006699\n",
"21 None Hip-Hop Hip-Hop #CC0000\n",
"22 38 Audio Collage Audio_Collage #dddd00\n",
"25 12 Punk Punk #840000\n",
"26 12 Post-Rock Post-Rock #840000\n",
"27 12 Lo-Fi Lo-fi #840000"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"genres = fma.get_all_genres()\n",
"print('{} genres'.format(genres.shape[0]))\n",
"genres[10:25]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And look for genres related to Rock."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" genre_parent_id | \n",
" genre_title | \n",
" genre_handle | \n",
" genre_color | \n",
"
\n",
" \n",
" genre_id | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 12 | \n",
" None | \n",
" Rock | \n",
" Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 26 | \n",
" 12 | \n",
" Post-Rock | \n",
" Post-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 45 | \n",
" 12 | \n",
" Loud-Rock | \n",
" Loud-Rock | \n",
" #666666 | \n",
"
\n",
" \n",
" 53 | \n",
" 45 | \n",
" Noise-Rock | \n",
" Noise-Rock | \n",
" #666666 | \n",
"
\n",
" \n",
" 58 | \n",
" 12 | \n",
" Psych-Rock | \n",
" Psych-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 66 | \n",
" 12 | \n",
" Indie-Rock | \n",
" Indie-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 113 | \n",
" 26 | \n",
" Space-Rock | \n",
" Space-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 169 | \n",
" 9 | \n",
" Rockabilly | \n",
" Rockabilly | \n",
" #663366 | \n",
"
\n",
" \n",
" 440 | \n",
" 12 | \n",
" Rock Opera | \n",
" Rock_Opera | \n",
" #840000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" genre_parent_id genre_title genre_handle genre_color\n",
"genre_id \n",
"12 None Rock Rock #840000\n",
"26 12 Post-Rock Post-Rock #840000\n",
"45 12 Loud-Rock Loud-Rock #666666\n",
"53 45 Noise-Rock Noise-Rock #666666\n",
"58 12 Psych-Rock Psych-Rock #840000\n",
"66 12 Indie-Rock Indie-Rock #840000\n",
"113 26 Space-Rock Space-Rock #840000\n",
"169 9 Rockabilly Rockabilly #663366\n",
"440 12 Rock Opera Rock_Opera #840000"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"genres[['Rock' in title for title in genres['genre_title']]]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" genre_parent_id | \n",
" genre_title | \n",
" genre_handle | \n",
" genre_color | \n",
"
\n",
" \n",
" genre_id | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 25 | \n",
" 12 | \n",
" Punk | \n",
" Punk | \n",
" #840000 | \n",
"
\n",
" \n",
" 26 | \n",
" 12 | \n",
" Post-Rock | \n",
" Post-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 27 | \n",
" 12 | \n",
" Lo-Fi | \n",
" Lo-fi | \n",
" #840000 | \n",
"
\n",
" \n",
" 31 | \n",
" 12 | \n",
" Metal | \n",
" Metal | \n",
" #777777 | \n",
"
\n",
" \n",
" 36 | \n",
" 12 | \n",
" Krautrock | \n",
" Krautrock | \n",
" #840000 | \n",
"
\n",
" \n",
" 45 | \n",
" 12 | \n",
" Loud-Rock | \n",
" Loud-Rock | \n",
" #666666 | \n",
"
\n",
" \n",
" 58 | \n",
" 12 | \n",
" Psych-Rock | \n",
" Psych-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 66 | \n",
" 12 | \n",
" Indie-Rock | \n",
" Indie-Rock | \n",
" #840000 | \n",
"
\n",
" \n",
" 70 | \n",
" 12 | \n",
" Industrial | \n",
" Industrial | \n",
" #8400FF | \n",
"
\n",
" \n",
" 85 | \n",
" 12 | \n",
" Garage | \n",
" Garage | \n",
" #840000 | \n",
"
\n",
" \n",
" 88 | \n",
" 12 | \n",
" New Wave | \n",
" New_Wave | \n",
" #840000 | \n",
"
\n",
" \n",
" 98 | \n",
" 12 | \n",
" Progressive | \n",
" Progressive | \n",
" #840000 | \n",
"
\n",
" \n",
" 314 | \n",
" 12 | \n",
" Goth | \n",
" Goth | \n",
" #840000 | \n",
"
\n",
" \n",
" 359 | \n",
" 12 | \n",
" Shoegaze | \n",
" Shoegaze | \n",
" #840000 | \n",
"
\n",
" \n",
" 440 | \n",
" 12 | \n",
" Rock Opera | \n",
" Rock_Opera | \n",
" #840000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" genre_parent_id genre_title genre_handle genre_color\n",
"genre_id \n",
"25 12 Punk Punk #840000\n",
"26 12 Post-Rock Post-Rock #840000\n",
"27 12 Lo-Fi Lo-fi #840000\n",
"31 12 Metal Metal #777777\n",
"36 12 Krautrock Krautrock #840000\n",
"45 12 Loud-Rock Loud-Rock #666666\n",
"58 12 Psych-Rock Psych-Rock #840000\n",
"66 12 Indie-Rock Indie-Rock #840000\n",
"70 12 Industrial Industrial #8400FF\n",
"85 12 Garage Garage #840000\n",
"88 12 New Wave New_Wave #840000\n",
"98 12 Progressive Progressive #840000\n",
"314 12 Goth Goth #840000\n",
"359 12 Shoegaze Shoegaze #840000\n",
"440 12 Rock Opera Rock_Opera #840000"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"genres[genres['genre_parent_id'] == '12']"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 2
}