{ "metadata": { "name": "", "signature": "sha256:ffe9bd0d88be9549f042be42f3735f37a986bd220bfce8d245bb081f638cf3f3" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Melody analysis - MusicBricks Tutorial" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial guides you through tools for melody analysis using the Essentia library (http://www.essentia.upf.edu). The melody analysis tools extract a pitch curve from a monophonic or polyphonic audio recording [1]. They output a time series (a sequence of values) giving the instantaneous pitch, in Hertz, of the perceived melody. \n", "We provide two operation modes: \n", " 1) using executable binaries; \n", " 2) using Python wrappers. \n", " \n", " \n", "References:\n", "\n", "[1] J. Salamon and E. G\u00f3mez, \"Melody extraction from polyphonic music signals using pitch contour characteristics,\" IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 6, pp. 1759\u20131770, 2012. \n" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "1-Using executable binaries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can download the executable binaries for Linux (Ubuntu 14) and OSX from this link: http://tinyurl.com/melody-mbricks\n", "To run a binary, specify the input audio file and an output YAML file, where the melody values will be stored." ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Extracting melody from monophonic audio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Locate the WAV audio file to be processed (input_audiofile)."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Usage: ./streaming_pitchyinfft input_audiofile output_yamlfile " ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Extracting melody from polyphonic audio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Usage: ./streaming_predominantmelody input_audiofile output_yamlfile " ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "2-Using Python wrappers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First install the Essentia library with its Python bindings. Installation instructions are available at http://essentia.upf.edu/documentation/installing.html . \n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# import essentia in standard mode\n", "import essentia\n", "import essentia.standard\n", "from essentia.standard import *" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "After importing the Essentia library, import the numerical and plotting tools." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# import matplotlib for plotting\n", "import matplotlib.pyplot as plt\n", "import numpy" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load an audio file" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# create an audio loader and load the audio file as a mono signal at 44100 Hz\n", "loader = essentia.standard.MonoLoader(filename = 'predom.wav', sampleRate = 44100)\n", "audio = loader()\n", "print(\"Duration of the audio sample [sec]:\")\n", "print(len(audio)/44100.0)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Duration of the audio sample [sec]:\n", "14.2285941043\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Extract the pitch curve from the 
audio example" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# PredominantPitchMelodia takes the entire audio signal as input, so no frame-wise processing is required here\n", "pExt = PredominantPitchMelodia(frameSize = 2048, hopSize = 128)\n", "pitch, pitchConf = pExt(audio)\n", "# approximate time axis [sec] with one value per pitch frame\n", "time = numpy.linspace(0.0, len(audio)/44100.0, len(pitch))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the extracted pitch contour" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# plot the pitch contour and confidence over time\n", "f, axarr = plt.subplots(2, sharex=True)\n", "axarr[0].plot(time, pitch)\n", "axarr[0].set_title('estimated pitch [Hz]')\n", "axarr[1].plot(time, pitchConf)\n", "axarr[1].set_title('pitch confidence')\n", "axarr[1].set_xlabel('time [sec]')\n", "plt.show()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] } ], "metadata": {} } ] }