{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy, scipy, matplotlib.pyplot as plt, sklearn, librosa, mir_eval, urllib, IPython.display, stanford_mir\n", "plt.rcParams['figure.figsize'] = (14,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[← Back to Index](index.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise: Instrument Classification using K-NN" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This exercise is loosely based upon \"Lab 1\" from previous MIR workshops ([2010](https://ccrma.stanford.edu/workshops/mir2010/Lab1_2010.pdf))." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more on K-NN, see the [notebook on K-NN](knn.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For help from a similar exercise, [follow the steps in the feature sonification exercise](feature_sonification.ipynb#Step-1:-Retrieve-Audio) first." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Goals" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Extract spectral features from an audio signal.\n", "2. Train a K-Nearest Neighbor classifier.\n", "3. Use the classifier to classify beats in a drum loop." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 1: Retrieve Audio, Detect Onsets, and Segment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Download the file `simple_loop.wav` onto your local machine." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "filename = 'simple_loop.wav'\n", "urllib.urlretrieve?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the audio file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.load?" 
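] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For example, a minimal sketch of this step (the synthesized tone and the name `example_tone.wav` are stand-ins, used here only so the cell runs even before `simple_loop.wav` has been downloaded):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import scipy.io.wavfile\n", "\n", "# Write a one-second, 440 Hz test tone to disk as a stand-in audio file:\n", "sr_out = 22050\n", "tone = (0.5*numpy.sin(2*numpy.pi*440*numpy.arange(sr_out)/float(sr_out))).astype(numpy.float32)\n", "scipy.io.wavfile.write('example_tone.wav', sr_out, tone)\n", "\n", "# librosa.load resamples to 22050 Hz by default and returns a mono float signal:\n", "x, fs = librosa.load('example_tone.wav')\n", "print(x.shape, fs)"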
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Play the audio file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "IPython.display.Audio?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Detect onsets:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.onset.onset_detect?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert onsets from frames to seconds (and samples):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.frames_to_time?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.frames_to_samples?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Listen to a click track, with clicks located at each onset, plus the original audio:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "mir_eval.sonify.clicks?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "IPython.display.Audio?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2: Extract Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For each segment, compute the zero crossing rate and spectral centroid." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.zero_crossings?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "librosa.feature.spectral_centroid?" 
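] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a sanity check, here is one way these two features might look for a single synthetic segment (the 440 Hz tone and the 2048-sample length are illustrative choices, not part of the exercise):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# A 2048-sample, 440 Hz sine segment at 22050 Hz:\n", "seg = numpy.sin(2*numpy.pi*440*numpy.arange(2048)/22050.0)\n", "\n", "# Zero crossing rate: fraction of adjacent samples with a sign change\n", "zcr = librosa.zero_crossings(seg).sum() / float(len(seg))\n", "\n", "# Spectral centroid in Hz, averaged across frames\n", "centroid = librosa.feature.spectral_centroid(y=seg, sr=22050).mean()\n", "print(zcr, centroid)"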
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Scale the features to be in the range [-1, 1]:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sklearn.preprocessing.MinMaxScaler?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sklearn.preprocessing.MinMaxScaler.fit_transform?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 3: Collect Training Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use `stanford_mir.download_drum_samples` to download ten kick drum samples and ten snare drum samples. Each audio file contains a single drum hit at the beginning of the file." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "stanford_mir.download_drum_samples?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For each audio file, extract one feature vector. Concatenate all of these feature vectors into one feature table." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "numpy.concatenate?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 4: Train and Run the Classifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a K-NN classifier model object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sklearn.neighbors.KNeighborsClassifier?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Train the classifier:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "sklearn.neighbors.KNeighborsClassifier.fit?"
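] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A toy sketch of the fit/predict pattern (the feature values and labels below are made up; in the exercise, the rows would be the scaled features of the twenty drum samples, with one label per sample):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import sklearn.neighbors\n", "\n", "# Rows: [zero crossing rate, spectral centroid], scaled to [-1, 1]\n", "training_features = numpy.array([[-0.8, -0.9], [-0.7, -0.6], [0.6, 0.8], [0.9, 0.7]])\n", "training_labels = numpy.array([0, 0, 1, 1])  # 0 = kick, 1 = snare\n", "\n", "model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=3)\n", "model.fit(training_features, training_labels)\n", "\n", "# Each test row is classified by majority vote of its 3 nearest training rows:\n", "print(model.predict([[-0.5, -0.5], [0.7, 0.9]]))"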
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, run the classifier on the test input audio file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "sklearn.neighbors.KNeighborsClassifier.predict?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 5: Sonify the Classifier Output" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Play a \"beep\" for each detected kick drum. Repeat for the snare drum." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "mir_eval.sonify.clicks?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## For Further Exploration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to the features used above, extract the following features:\n", "\n", "- spectral spread\n", "- spectral skewness\n", "- spectral kurtosis\n", "- spectral rolloff\n", "- MFCCs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Re-train the classifier, and re-run it over the test audio signal. Do the results change?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Repeat the steps above for more audio files." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[← Back to Index](index.html)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }