{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Analysis of acoustic signals with giotto-time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Data:* The acoustic impulse responses are the simulated and measured impulse responses used in the following publication:\n", "H. P. Tukuljac, V. Pulkki, H. Gamper, K. Godin, I. J. Tashev and N. Raghuvanshi, \"A Sparsity Measure for Echo Density Growth in General Environments,\" ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, 2019, pp. 1-5.\n", "[paper](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8682878&isnumber=8682151) and [code](https://github.com/microsoft/EchoDensity/)\n", "\n", "*Task:* Recognize the type of acoustic space from its impulse response. There are three types of acoustic space that we can distinguish: parallel walls, a room without a ceiling, and a room with a ceiling.\n", "\n", "*More information:* For more on acoustics and acoustic data processing, see: Helena Peic Tukuljac, “Sparse and parametric modeling with applications to acoustics and audio,” Lausanne, 2020, PhD Thesis, EPFL. [thesis](https://infoscience.epfl.ch/record/273930?ln=en)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The task: Detection of the type of the acoustic environment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An acoustic impulse response (AIR) is the response of an environment to a Dirac impulse as the input signal. An AIR consists of the direct sound and numerous echoes that reach the sensor after the sound reflects off the walls. Wall reflections are modeled with the so-called *image-source model*: each reflection is represented by a virtual (image) source located at the mirror position of the real source with respect to the wall. Higher-order virtual sources are determined recursively. 
Each image source models the propagation delay of one echo.\n", "\n", "Here we see the real source and its image sources for a single wall and for a 2D case with four walls. Sound travels at speed $c\\approx340\\,\\mathrm{m/s}$, so the delay of each echo is determined by the position of its image source. We are interested in characterizing how the density of early reflections evolves over time. Late reflections are usually modeled as attenuated Gaussian noise." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this demo, the task is to detect the type of the acoustic environment based on the distribution of echoes over time. We will distinguish the following four situations in a _domino box_ (sliding lid) scenario: closed, almost closed, almost open, and open." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import h5py\n", "import os\n", "\n", "from collections import defaultdict" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from gtime.custom.sorted_density import SortedDensity\n", "from gtime.custom.crest_factor_detrending import CrestFactorDetrending\n", "from src.helpers import remove_direct_sound, curve_fitting_echo_density" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "with h5py.File(os.path.join('data', 'acoustic_responses.h5'), 'r') as hf:\n", "    dataset_simulation_sliding_lid = list(hf['dataset_simulation_sliding_lid'])\n", "    Fs = list(hf['Fs'])[0]\n", "labels = ['closed', 'almost_closed', 'almost_open', 'open']" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | closed | \n", "almost_closed | \n", "almost_open | \n", "open | \n", "
---|---|---|---|---|
00:00:00 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "
00:00:00.000170 | \n", "4.443116e-07 | \n", "4.372601e-07 | \n", "4.443116e-07 | \n", "4.456350e-07 | \n", "
00:00:00.000340 | \n", "1.300656e-05 | \n", "1.283440e-05 | \n", "1.300656e-05 | \n", "1.303693e-05 | \n", "
00:00:00.000510 | \n", "4.364509e-05 | \n", "4.311583e-05 | \n", "4.364509e-05 | \n", "4.377377e-05 | \n", "
00:00:00.000680 | \n", "1.987153e-05 | \n", "1.957845e-05 | \n", "1.987153e-05 | \n", "1.991281e-05 | \n", "