{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading and Writing Audio Files in Python\n", "\n", "[back to main page](../index.ipynb)\n", "\n", "One-line summary:\n", "\n", "> Use the [soundfile](audio-files-with-pysoundfile.ipynb) module except if you're a minimalist (then use the built-in [wave](audio-files-with-wave.ipynb) module).\n", "\n", "There are several libraries for reading and writing audio files with Python but\n", "it's often not clear which is the best one. That's mostly because there is no\n", "single best one. Each one has a different set of features and limitations,\n", "different availability on different platforms and different ease of use.\n", "\n", "This directory shows a few of the available libraries along with some advantages\n", "and disadvantages as well as some usage examples.\n", "\n", "Most of the mentioned libraries can be used on multiple platforms." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## File Formats\n", "\n", "There are a lot of digital audio formats available, but each one of them falls\n", "into one of two categories: either it's a lossless or a lossy format. Files in\n", "lossy formats (e.g. Ogg-Vorbis, MP3) are normally much smaller because\n", "information which is contained in the original signal but is (ideally) not\n", "audible is thrown away. When using a lossless format, the original signal can be\n", "perfectly restored, either because no compression is used at all (e.g. WAV) or\n", "the compression doesn't loose information (e.g. FLAC).\n", "\n", "If you want to distibute your latest song on the internet you might want to use\n", "a lossy format to save some bandwidth, if you plan to further process your data\n", "(or if you are aiming at a high-end crowd) you should choose a lossless format.\n", "\n", "The lowest common denominator for exchanging audio files is probably the MS WAV\n", "format. But although it is widely supported, only few programs support all\n", "variants of this format. WAV files support both fixed point and floating point\n", "numbers. Very popular are 16-bit fixed point (PCM) values, as this is the format\n", "of Audio CDs.\n", "Some professional recordings use 24-bit PCM data. 32-bit PCM data is also\n", "possible, but this is practically never used.\n", "\n", "32-bit floating-point data, however, are widely used by digital audio\n", "workstations (DAWs), so you may encounter those.\n", "As most audio processing applications use 32-bit floating-point data internally,\n", "this is also a meaningful data type for file exchange between applications." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "TODO: `WAV_PCM` vs. `WAVEX`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "TODO: range of values, e.g. 16-bit: -32768 to 32767, http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html, http://blog.bjornroche.com/2009/12/linearity-and-dynamic-range-in-int.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conversion to Other Formats\n", "\n", "For most cases it should be sufficient to support the various variants of WAV\n", "files. There are many conversion programs which allow to convert to and from\n", "many other formats.\n", "Probably the most powerful is [SoX](http://sox.sf.net/) -- highly recommended!\n", "It's as easy as that to convert a file named `myfile.mp3` to the WAV format (you can use any output file name you want):\n", "\n", " sox myfile.mp3 myfile.wav\n", "\n", "To create `.ogg` and `.mp3` files, [oggenc](http://www.vorbis.com/)\n", "and [LAME](http://lame.sourceforge.net/) can be used, respectively.\n", "I like to use this for creating OGG files (the output will get the same name, but with `.ogg` extension):\n", "\n", " oggenc -q6 myfile.wav" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python Sound File Libraries\n", "\n", "Here I provide some IPython notebook files which show how to use some of the available\n", "libraries.\n", "Special attention is paid to the support for different variants of the WAV format.\n", "\n", "All of the described libraries can read and write 16-bit WAV files. Most (or\n", "maybe all) can also read 32-bit PCM (fixed point) file, but those hardly exist\n", "*in the wild*.\n", "Only a few can read 24-bit files, which are quite commonly used, mostly in\n", "professional contexts.\n", "Also only a few can read 32-bit floating point files. This is unfortunate,\n", "because many DAW produce such files and most audio processing software uses this\n", "format internally.\n", "\n", "A wide range of sampling rates is available, 44100 Hz and 48000 Hz being the\n", "most common.\n", "None of the libraries has problems with different sampling rates, therefore I'm using only one sample rate in the examples, namely 44100 Hz.\n", "\n", "Only a few libraries support WAVEX files, some display a warning but if you're\n", "lucky the audio data can still be read successfully.\n", "\n", "[Reading and Writing Audio Files with `soundfile`](audio-files-with-pysoundfile.ipynb) (highly recommended!)\n", "\n", "[Reading and Writing Audio Files with `scipy.io`](audio-files-with-scipy-io.ipynb)\n", "\n", "[Reading and Writing Audio Files with `wave`](audio-files-with-wave.ipynb) (part of Python's standard library)\n", "\n", "[Reading and Writing Audio Files with `ewave`](audio-files-with-ewave.ipynb)\n", "\n", "[Reading and Writing Audio Files with `wavefile`](audio-files-with-python-wavefile.ipynb)\n", "\n", "[Reading and Writing Audio Files with `audioread`](audio-files-with-audioread.ipynb)\n", "\n", "[Reading and Writing Audio Files with `scikits.audiolab`](audio-files-with-audiolab.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Other Libraries (Untested)\n", "\n", "sndfileio: \n", "https://github.com/gesellkammer/sndfileio\n", "\n", "pysox: \n", "https://pythonhosted.org/pysox/ \n", "https://pypi.org/project/pysox/\n", "\n", "(py)sox: \n", "https://github.com/rabitt/pysox \n", "https://pysox.readthedocs.io/\n", "\n", "xbob.sox: \n", "https://github.com/bioidiap/xbob.sox\n", "\n", "pysndfile: \n", "http://pythonhosted.org/pysndfile/ \n", "https://forge.ircam.fr/p/pysndfile/\n", "\n", "essentia.standard.AudioLoader: \n", "http://essentia.upf.edu/documentation/reference/std_AudioLoader.html\n", "\n", "librosa.load: \n", "http://bmcfee.github.io/librosa/\n", "\n", "Python Audio Tools: \n", "http://audiotools.sourceforge.net/\n", "\n", "pyffmpeg: \n", "https://github.com/mhaller/pyffmpeg\n", "\n", "PyMedia: \n", "http://pymedia.org/\n", "\n", "openal.loaders (PyAL): \n", "http://pythonhosted.org/PyAL/loaders.html\n", "\n", "Pydub: \n", "http://pydub.com/ \n", "https://github.com/jiaaro/pydub\n", "\n", "wavio: \n", "https://github.com/WarrenWeckesser/wavio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example Files (for Testing)\n", "\n", "Example WAV files with different encodings are available in the directory\n", "[data/](data/). These files were generated with the script\n", "[generate_example_wav_files.py](data/generate_example_wav_files.py).\n", "\n", "More examples can be downloaded from:\n", "\n", "* http://www.nch.com.au/acm/formats.html\n", "\n", "* http://www-mmsp.ece.mcgill.ca/documents/audioformats/\n", "\n", "* http://www.cs.bath.ac.uk/~rwd/cardattrit.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", " \n", " \"CC0\"\n", " \n", "
\n", " To the extent possible under law,\n", " the person who associated CC0\n", " with this work has waived all copyright and related or neighboring\n", " rights to this work.\n", "

" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }