{ "cells": [ { "cell_type": "markdown", "id": "1927572e-476a-4f72-aa50-b103e392b31d", "metadata": {}, "source": [ "

BIRDSONG

\n", "

Tutorial

" ] }, { "cell_type": "markdown", "id": "5dde1b43-316b-4adf-b034-17f06074df11", "metadata": {}, "source": [ "# Import\n", "\n", "Import the package and some utils functions" ] }, { "cell_type": "code", "execution_count": 1, "id": "bbfd4bfb-7dad-4d64-9b31-4cbdc3887408", "metadata": {}, "outputs": [], "source": [ "# the following line enable interact with figures, \n", "# you can make zoom and save images from a poup matplotlib window\n", "%matplotlib qt\n", "#notebook qt ipympl tk qt\n", "\n", "#with warnings.catch_warnings(): warnings.simplefilter('ignore')\n", " \n", "import birdsongs as bs\n", "from birdsongs.utils import *" ] }, { "cell_type": "markdown", "id": "a7f76123", "metadata": {}, "source": [ "Define a **path** object, it manages the folders directions of results, auxiliar data, audios, and birdsongs paths; and **ploter** object, to visualize syllables. The path object look for all the wav files in the audio file, located at `root_path/Audios`.\n", "\n", "You must fill the root path according where you clone the repositry and your Operative System." 
] }, { "cell_type": "code", "execution_count": 2, "id": "4430d165-0fa2-4541-b205-dcc60aed0055", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The folder has 4 songs\n" ] } ], "source": [ "# root_path = \"path_to_repository\\\\\"\n", "# audios_path = \"audios_path\\\\\"\n", "# bird_name = \"Zonotrichia capensis\"\n", "# audios_path = \"C:\\\\Users\\\\sebas\\\\Documents\\\\GitHub\\\\audios\\\\Dissertation-xeno\\\\\"\n", "\n", "paths = bs.Paths() # root_path, audios_path, bird_name\n", "ploter = bs.Ploter(save=False) # to save figures, set save=True " ] }, { "cell_type": "code", "execution_count": 3, "id": "ea6afe9e-c1da-4305-86aa-0d0bc082e742", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1-humman.wav\n", "2-XC104508 - Ocellated Tapaculo - Acropternis orthonyx.wav\n", "3-XC11293 - Rufous-collared Sparrow - Zonotrichia capensis.wav\n", "4-XC513182 - Rufous-collared Sparrow - Zonotrichia capensis.wav\n" ] } ], "source": [ "paths.ShowFiles()" ] }, { "cell_type": "markdown", "id": "6a5aae15-5419-4d48-ab11-8fb9d2ac8938", "metadata": {}, "source": [ "# Objects Definition\n", "\n", "## Song\n", "\n", "Choose which wav file to import, `no_file`. 
The **song** object is defined from the paths object and the chosen file number; the path of the song is stored in the paths object" ] }, { "cell_type": "code", "execution_count": 7, "id": "b21005eb-4d9b-4bc1-adbc-d233571d1ce0", "metadata": {}, "outputs": [], "source": [ "no_file = 2 # int(input(\"Enter the number of song (1 to {0}): \".format(paths.no_files)))" ] }, { "cell_type": "markdown", "id": "6c1b42c2-2432-4867-9c9a-132d2f08ef0b", "metadata": {}, "source": [ "Define the song by its file number" ] }, { "cell_type": "code", "execution_count": 16, "id": "dd9f85bc-e4e2-4bfa-8857-3f9fb9f57cb8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The son has 38 syllables\n" ] }, { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "complet_bird = bs.Song(paths, no_file=no_file, Nt=5000, \n", " flim=(1e3,20e3), split_method=\"amplitud\", umbral=0.15)\n", "ploter.Plot(complet_bird, FF_on=False)\n", "AudioPlay(complet_bird)" ] }, { "cell_type": "code", "execution_count": 24, "id": "b5740c70-a96c-4309-8629-9a55d22bac80", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The son has 169 syllables\n" ] }, { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clip_bird = bs.Song(paths, no_file=no_file, umbral_FF=1., split_method=\"freq\",\n", " flim=(1e3,20e3)) # , tlim=(12.5,14)\n", "ploter.Plot(clip_bird, FF_on=True)\n", "AudioPlay(clip_bird)" ] }, { "cell_type": "code", "execution_count": 25, "id": "e16e73d6-21bd-424a-89ec-1eea8f833d29", "metadata": {}, "outputs": [], "source": [ "klicker = ploter.FindTimes(complet_bird, FF_on=False)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 40, "id": "36fe37c4-ad79-4456-bb37-aa26ee1ea6ea", "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "array([[558507, 572379],\n", " [572893, 583169],\n", " [586509, 596785],\n", " [600510, 609373],\n", " [622089, 633521],\n", " [816388, 829952],\n", " [835203, 841694],\n", " [844611, 851612],\n", " [854674, 862405]], dtype=int64)" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# time_intervals = Positions(klicker)\n", "#time_intervals\n", "\n", "time_intervals = np.array([[12.66456702, 12.97913565],\n", " [12.99078634, 13.22380014],\n", " [13.29952962, 13.53254342],#]) - 12.5#,,\n", " [13.61701092, 13.81798533],\n", " [14.1063399 , 14.36556776],\n", " [18.5122157 , 18.81979551],\n", " [18.93885867, 19.08603395],\n", " [19.15218015, 19.31093102],\n", " [19.38038453, 19.55567195]])\n", "#[t[] for t in time_intervals]\n", "np.int64(time_intervals*complet_bird.fs)" ] }, { "cell_type": "markdown", "id": "9e54feb3-73b9-414d-afb4-177961d468d5", "metadata": {}, "source": [ "define the song by a time interval, interval of interest from a complete song" ] }, { "cell_type": "markdown", "id": "2ca73df1-5e39-44a9-a877-f5651231c550", "metadata": {}, "source": [ "### Plot\n", "\n", "Visualize the song with the ploter object use ploter function Plot and enter the object you want to visualize\n", "\n", "The syllable extraction has three methods:\n", "\n", "- **amplitud**: \n", "find where the normalized audio amplitud crosses an umbral, 0.05. \n", "- **freq**: \n", "find where the fundamental frequency changes drastically, where the changes is more than 500 Hz.\n", "- **maad**: \n", "use the maad segmentation tools (from scitik-maad) to find Region Of Interest (not implemented yet)." ] }, { "cell_type": "markdown", "id": "e705c9dc-51a2-417d-8301-e1b70e79f9d2", "metadata": {}, "source": [ "## Syllable\n", "\n", "This is the kernel of the model, the **Syllable** object. This object extract the syllable tempo-spectral features and define the neccesary variables to implement and solve the Motor Gesture. 
It returns a synthetic syllable as a syllable object.\n", "\n", "\n", "To solve the syllable, that is, to find its synthetic syllable from some parameters ($\\alpha, \\beta, \\gamma$), use its method **Solve**. This method depends on the model parameters $p$, defined in each syllable object, which are $\\alpha_i$, $\\beta_i$, and $\\gamma$. Although it is possible to define a custom parameter set, the syllable object has a predefined initial set (it lies in the feasible region). \n", "\n", "To display the parameter set use the command `Display(object.p)`. To change one of its values use `syllable_object.p[\"a0\"].set(value=value_a0)`." ] }, { "cell_type": "markdown", "id": "363f031e-a9a2-44e7-8a84-0f5fab0d7165", "metadata": {}, "source": [ "### Definition\n", "\n", "Defining with the Song object: this uses the song's syllable divider and requires the number of the syllable of interest" ] }, { "cell_type": "code", "execution_count": null, "id": "88c48483-6353-4c2a-b911-e5dcf62392e2", "metadata": {}, "outputs": [], "source": [ "%%time\n", "no_syllable = 4 # int(input(\"Enter the syllable number (1 to {0}): \".format(bird.no_syllables)))\n", "syllable = clip_bird.Syllable(no_syllable)\n", "ploter.Plot(syllable)\n", "AudioPlay(syllable)" ] }, { "cell_type": "markdown", "id": "f4b708c3-aee0-4f03-9ddf-3f0785e72349", "metadata": {}, "source": [ "Defining with the Syllable object: it defines a syllable from a bird audio. 
You can optionally give a time range to clip the bird audio " ] }, { "cell_type": "code", "execution_count": null, "id": "72bb0f66-e762-4247-944d-d7a1dfd83e26", "metadata": {}, "outputs": [], "source": [ "syl_test = bs.Syllable(clip_bird, tlim=[1.131,1.188], \n", " umbral_FF=1.2, Nt=30, NN=128)\n", "ploter.Plot(syl_test)\n", "AudioPlay(syl_test)" ] }, { "cell_type": "markdown", "id": "139c0342-4e5b-45ed-9c42-5bae73b9e776", "metadata": {}, "source": [ "### Solution" ] }, { "cell_type": "code", "execution_count": null, "id": "617b84cb-4f4d-42d3-8f3d-20714cf25f1e", "metadata": {}, "outputs": [], "source": [ "%%time\n", "syllable_synth = syllable.Solve(syllable.p)\n", "syl_test_synth = syl_test.Solve(syl_test.p)" ] }, { "cell_type": "markdown", "id": "db5d56f8-3fdd-45e4-9101-6e355037178c", "metadata": {}, "source": [ "### Synthetic Plots" ] }, { "cell_type": "code", "execution_count": null, "id": "045c6447-dab3-4cc0-ab42-2cbce9e66d55", "metadata": {}, "outputs": [], "source": [ "# ploter.PlotVs(syllable_synth)\n", "ploter.PlotAlphaBeta(syllable_synth)\n", "ploter.Result(syllable, syllable_synth)\n", "ploter.Plot(syllable_synth)\n", "AudioPlay(syllable_synth)" ] }, { "cell_type": "code", "execution_count": null, "id": "b5816ad5-6a58-4608-abed-b17d4483aa3a", "metadata": {}, "outputs": [], "source": [ "# ploter.PlotVs(syl_test_synth)\n", "ploter.PlotAlphaBeta(syl_test_synth)\n", "ploter.Result(syl_test, syl_test_synth)\n", "ploter.Plot(syl_test_synth)\n", "AudioPlay(syl_test_synth)" ] }, { "cell_type": "markdown", "id": "2b53cda9-beea-4862-a989-0d081200e589", "metadata": {}, "source": [ "Syllable variables are visualized with the ploter object; the execution time of the Motor Gesture definition and solution is also measured here.\n", "\n", "One of the biggest advantages of the model implementation is how easily its parameters can be explored. As an example, let's use the previously defined syllable but vary the air-sac pressure input at three levels: low, medium, and high.\n", "\n", "To plot 
each object, song or syllable, use the command `ploter.Plot(obj)`.\n", "\n", "It is also possible to define a syllable object from a time interval and the bird object; however, you have to define some attributes of the object for the ploter to work properly. Avoid entering wrong time limits, otherwise the object will not be defined. \n", "\n", "To improve the syllable frequency resolution, modify the Short Time Fourier Transform window length, but remember that better frequency resolution implies a loss of time resolution. This feature is useful when the syllable spectrum is \"complex\" (trilled syllables)." ] }, { "cell_type": "markdown", "id": "c7a76300-9f69-45c3-b847-efdfa9450bdb", "metadata": { "tags": [] }, "source": [ "## Varying Parameters" ] }, { "cell_type": "markdown", "id": "1779d521-e92f-4ea4-987a-8939548d5c4c", "metadata": { "tags": [] }, "source": [ "### Low\n", "\n", "$ a_0 = 0.01 $" ] }, { "cell_type": "code", "execution_count": null, "id": "4f6b6496-512c-4464-a897-b62d404d7bdb", "metadata": { "tags": [] }, "outputs": [], "source": [ "%%time\n", "syllable.p[\"a0\"].set(value=0.01)\n", "syllable_synth = syllable.Solve(syllable.p)\n", "\n", "# ploter.PlotVs(syllable_synth)\n", "ploter.PlotAlphaBeta(syllable_synth)\n", "ploter.Result(syllable, syllable_synth)" ] }, { "cell_type": "markdown", "id": "1eaec9a6-7886-419a-8c59-205777c6b4ab", "metadata": { "jp-MarkdownHeadingCollapsed": true, "tags": [] }, "source": [ "### Medium\n", "\n", "$ a_0 = 0.11 $" ] }, { "cell_type": "code", "execution_count": null, "id": "c3d9e2b0-30c5-4eb1-84d4-d7eeb2e48edb", "metadata": {}, "outputs": [], "source": [ "syllable.p[\"a0\"].set(value=0.11)\n", "syllable_synth = syllable.Solve(syllable.p)\n", "\n", "ploter.PlotAlphaBeta(syllable_synth)\n", "ploter.Result(syllable, syllable_synth)" ] }, { "cell_type": "markdown", "id": "e5645042-8ae1-4a42-8958-2d4372224ba0", "metadata": { "jp-MarkdownHeadingCollapsed": true, "tags": [] }, "source": [ "### High\n", "\n", "$ a_0 = 1.25 $" ] }, 
{ "cell_type": "code", "execution_count": null, "id": "0dfa3062-6c34-48e9-bcf6-d41718f50cb0", "metadata": {}, "outputs": [], "source": [ "%%time\n", "syllable.p[\"a0\"].set(value=1.25)\n", "syllable_synth = syllable.Solve(syllable.p)\n", "\n", "ploter.PlotAlphaBeta(syllable_synth)\n", "ploter.Result(syllable, syllable_synth)" ] }, { "cell_type": "markdown", "id": "82d6b070-3242-45bc-b4a1-8db9840aacfc", "metadata": { "tags": [] }, "source": [ "## Chunck\n", "\n", "Although the syllable division is a good approximation to solve the problem, a better methodology is to take chuncks of a syllable, that is, to divide the syllable into fractions. Since the syllable object is already defined, it is worth using it again to define the **chunck** object. \n", "\n", "The biggest difference from the syllable object is the Fourier Transform window length and the envelope parameters. Since these chuncks are shorter than the syllables, their curves in parameter space are also smaller." ] }, { "cell_type": "markdown", "id": "0c4a09e8-c296-4f3d-850c-df07c76a46af", "metadata": {}, "source": [ "### Definition\n", "\n", "Choose which fraction of the syllable you are interested in, `no_chunck`, and define it as a syllable object" ] }, { "cell_type": "code", "execution_count": null, "id": "3da226ac-c0f8-4ffb-bfeb-c4565a9b5eab", "metadata": {}, "outputs": [], "source": [ "no_chunck = 0 # int(input(\"Enter the number of song (1 to {0}): \".format(bird.no_chuncks)))\n", "chunck = clip_bird.Chunck(no_chunck)\n", "\n", "ploter.Plot(chunck)\n", "AudioPlay(chunck)" ] }, { "cell_type": "markdown", "id": "8fbfed2a-f768-42df-b308-8a98670c0d3e", "metadata": {}, "source": [ "### Solution\n", "Show the parameters used and generate the synthetic chunck" ] }, { "cell_type": "code", "execution_count": null, "id": "d94ff9c1-6ce0-4016-b245-7da6e2d6a905", "metadata": {}, "outputs": [], "source": [ "Display(chunck.p)\n", "chunck_synth = chunck.Solve(chunck.p)\n", "\n", "# ploter.PlotVs(chunck_synth)\n", 
"ploter.Plot(chunck_synth)\n", "AudioPlay(chunck_synth)" ] }, { "cell_type": "markdown", "id": "45e728ff-a0b0-462a-8c74-48838dd3debd", "metadata": {}, "source": [ "### Plot Results\n", "\n", "Visualize the chunck tempo-spectral features and the scored variables, scored defined to compare real and synthetic syllables " ] }, { "cell_type": "code", "execution_count": null, "id": "3d5b7eda-e580-479d-b52b-911f86888bff", "metadata": {}, "outputs": [], "source": [ "ploter.Syllables(chunck, chunck_synth)\n", "ploter.PlotAlphaBeta(chunck_synth)\n", "ploter.Result(chunck, chunck_synth)\n", "AudioPlay(chunck_synth)" ] }, { "cell_type": "markdown", "id": "845cbb12-10ae-47d2-857d-68eefd2fe558", "metadata": {}, "source": [ "## Plot All Objects\n", "\n", "Visualize again the song but with the chunck and syllables objects also plotted. The syllable and chunck must be previously defined " ] }, { "cell_type": "code", "execution_count": null, "id": "5bb2b5ec-1897-4e63-8a0b-94845d581ba4", "metadata": {}, "outputs": [], "source": [ "ploter.Plot(clip_bird, FF_on=True, syllable_on=True, chunck_on=True)" ] }, { "cell_type": "markdown", "id": "d152ee85-6b91-4ee3-b3d0-429f9d871bbf", "metadata": {}, "source": [ "# Optimization Problem" ] }, { "cell_type": "markdown", "id": "1f0937a1-e9d1-419a-9973-3cf6ff457e40", "metadata": { "tags": [] }, "source": [ "## General Problem" ] }, { "cell_type": "markdown", "id": "a85e4aaf-6bc9-4954-81b0-30f5d04c79dc", "metadata": {}, "source": [ "\\begin{equation}\\label{opt_general}\n", "\\begin{aligned}\n", "\\underset{ \\gamma \\in \\mathbb{R},\\; \\alpha,\\beta\\in \\mathbb{R}^n}{\\text{min}} &\\qquad ||\\hat{SCI}_{real} - \\hat{SCI}_{synt} ( \\gamma,\\alpha,\\beta)||_2 + || (\\hat{FF}_{real} - \\hat{FF}_{synt}(\\gamma,\\alpha,\\beta)||_2 \\\\\n", " & \\qquad \\qquad - corr(FC_{real},FC_{synt}(\\gamma, \\alpha, \\beta)) \\\\\n", " \\text { subject to } & \\qquad \\gamma \\in \\Omega_\\gamma, \\quad \\beta \\in \\Omega_\\beta , \\quad \\alpha \\in 
\\Omega_\\alpha\n", "\\end{aligned}\n", "\\end{equation}\n", "\n", "with $\\Omega_\\gamma , \\Omega_\\alpha$, and $\\Omega_\\beta$ the known feasible regions for each variable. To make the objective function dimensionless, the following two variables are defined\n", "\n", "$$\n", "\\hat{SCI} := \\frac{SCI}{dim(SCI)} , \\qquad\n", "\\hat{FF} := \\frac{1}{dim(FF)} \\frac{FF}{1 \\; \\text{kHz}}\n", "$$\n", "\n", "where $dim()$ is the dimension of the corresponding vector. " ] }, { "cell_type": "markdown", "id": "79eff06e-8b8c-4cfa-9589-6fd8faa3522d", "metadata": {}, "source": [ "## Sub-Optimization Problems" ] }, { "cell_type": "markdown", "id": "bde7fd4e-327f-4ef3-ad67-98e389bc4cd0", "metadata": {}, "source": [ "The general problem is computationally expensive since it depends on many variables. Although solving the problem in one shot is the ideal method, a more practical approach is to split the general problem into three auxiliary problems\n", "\n", "### Optimal $\\gamma$\n", "\n", "\\begin{equation}\\label{eq_optimal_gamma}\n", "\\begin{aligned}\n", "\\underset{ \\gamma \\in \\mathbb{R}}{\\text{min}} &\\qquad || \\hat{SCI}_{real} - \\hat{SCI}_{synt} ( \\gamma)||_2 + || \\hat{FF}_{real} - \\hat{FF}_{synt}(\\gamma)||_2\\\\\n", " \\text { subject to } & \\qquad \\; \\gamma \\in \\Omega_\\gamma = [10000, 100000]\n", "\\end{aligned}\n", "\\end{equation}\n", "\n", "### Optimal $\\alpha$ Coefficients\n", "\n", "The coefficients of $\\alpha$ are calculated from the correlation between the spectrum coefficients of the real syllable and those of the synthetic one \n", "\n", "\\begin{equation}\\label{optimal_a_min}\n", "\\begin{aligned}\n", "\\underset{a \\in \\mathbb{R}^3}{\\text{min}} & \\qquad - corr (FC_{real}, FC_{synt}(a)) \\\\\n", " \\text { subject to } & \\qquad \\; a\\in\\Omega_a\n", "\\end{aligned}\n", "\\end{equation}\n", "\n", "### Optimal $\\beta$ Coefficients\n", "\n", "The last step is to find the beta coefficients $b_i$\n", "\n", "\\begin{equation}\\label{optimal_b}\n", "\\begin{aligned}\n", "\\underset{b \\in 
\\mathbb{R}^3}{\\text{min}} &\\qquad || \\hat{FF}_{real} - \\hat{FF}_{synt} (b)||_2 \\\\\n", " \\text { subject to } & \\qquad \\; b \\in \\Omega_b\n", "\\end{aligned}\n", "\\end{equation}\n", "\n", "with $t\\in [0,T]$ where $T$ is the duration of the syllable (chunck).\n", "\n", "\n", "The air-sac pressure and labial tension are defined by six coefficients, three each, which means that their time curves are parabolic functions (the motor gestures are parabolas). This can be modified by omitting the third coefficients $a_2, b_2$ and working with straight lines as motor gestures." ] }, { "cell_type": "markdown", "id": "f230acff-298a-4036-a424-80fc27ac3e95", "metadata": {}, "source": [ "## Optimization Solvers\n", "\n", "Define the method and its parameters to solve the optimization problem" ] }, { "cell_type": "code", "execution_count": null, "id": "b299d976-7331-49f4-bae0-69f41a238762", "metadata": {}, "outputs": [], "source": [ "brute = {'method':'brute', 'Ns':11} #, 'workers':-1} \n", "DualAnnealing = {'method':'dual_annealing','max_nfev':200, 'maxiter': 100}" ] }, { "cell_type": "markdown", "id": "3c29adb4-a2df-45bc-b398-4515cfb920fe", "metadata": {}, "source": [ "Define the object to optimize and its corresponding optimizer" ] }, { "cell_type": "code", "execution_count": null, "id": "5052b8fd-bf3f-4429-98eb-f540875d5c10", "metadata": {}, "outputs": [], "source": [ "obj = syl_test # syllable # chunck\n", "optimizer = bs.Optimizer(obj, method_kwargs=brute)" ] }, { "cell_type": "code", "execution_count": null, "id": "064e9b8d-e33b-4cfc-a181-b91ed99b50a2", "metadata": {}, "outputs": [], "source": [ "obj.id" ] }, { "cell_type": "code", "execution_count": null, "id": "f86adcb1-94c0-4043-8913-8bbf4ef3ba31", "metadata": {}, "outputs": [], "source": [ "ploter.Plot(obj)\n", "AudioPlay(obj)" ] }, { "cell_type": "markdown", "id": "9feacb99-c9d8-4c50-83db-349fe7a3838b", "metadata": {}, "source": [ "You can check all the available methods by uncommenting the following line; check the 
**method** argument" ] }, { "cell_type": "code", "execution_count": null, "id": "f9c39768-9dc5-4e41-b93d-b4085d7d6ceb", "metadata": {}, "outputs": [], "source": [ "#?lmfit.minimize" ] }, { "cell_type": "markdown", "id": "d3f92ed2-2763-409d-8511-0ada344b8791", "metadata": {}, "source": [ "### Initial Synthetic Syllable\n", "\n", "Show the model parameters and plots of the initial synthetic syllable" ] }, { "cell_type": "code", "execution_count": null, "id": "86def702-9fa9-4695-bae6-43a4921831d3", "metadata": {}, "outputs": [], "source": [ "Display(obj.p)\n", "obj_synth = obj.Solve(obj.p)\n", "\n", "ploter.PlotAlphaBeta(obj_synth)\n", "ploter.Result(obj, obj_synth)\n", "AudioPlay(obj_synth)" ] }, { "cell_type": "markdown", "id": "44b58a88-3311-4f37-b30c-dd132543ac81", "metadata": {}, "source": [ "# Solution\n", "\n", "## Optimal $\\gamma$\n", "\n", "Find the optimal time constant parameter ($\\gamma^*$) by solving the first sub-optimization problem for each syllable." ] }, { "cell_type": "code", "execution_count": null, "id": "2f832e04-e2d0-43e7-947a-13d439a08d5d", "metadata": {}, "outputs": [], "source": [ "Gammas = optimizer.AllGammas(clip_bird)\n", "#optimizer.OptimalGamma(syl_test_synth)" ] }, { "cell_type": "markdown", "id": "20e2b7b1-b3c5-425b-99ed-24620a159d11", "metadata": {}, "source": [ "Although the optimal parameters are stored in the optimizer, it is recommended to save this value in the object's parameter set " ] }, { "cell_type": "code", "execution_count": null, "id": "1ff8ac7e-f8c8-4291-a564-1ea6f11a0622", "metadata": {}, "outputs": [], "source": [ "obj.p = optimizer.obj.p\n", "optimizer.optimal_gamma" ] }, { "cell_type": "markdown", "id": "30cab29c-e91a-420e-9df1-1886761ca556", "metadata": {}, "source": [ "### Plot \n", "\n", "Show the parameter set, solve the object with these parameters, and visualize the synthetic syllable" ] }, { "cell_type": "code", "execution_count": null, "id": "67f2b323-5679-450a-8b81-ef559414e912", "metadata": {}, "outputs": [], "source": [ 
"obj.id" ] }, { "cell_type": "code", "execution_count": null, "id": "4f5bc172-e791-4165-9f05-3caef48edaf9", "metadata": {}, "outputs": [], "source": [ "Display(obj.p)\n", "obj_synth = obj.Solve(obj.p)\n", "\n", "ploter.PlotAlphaBeta(obj_synth)\n", "ploter.Result(obj, obj_synth)\n", "AudioPlay(obj_synth)" ] }, { "cell_type": "markdown", "id": "002cc1f4-4659-42a4-b2b9-984b0847d6e3", "metadata": {}, "source": [ "## Optimal Parameters $\\alpha_i$ and $\\beta_i$\n", "\n", "Dependeing of what gesture approximation are you interested (linear or quadratic curves for $\\alpha$ or $\\beta$), the optimizer object find the optimal parameteres using its `OptimalVariable` method " ] }, { "cell_type": "code", "execution_count": null, "id": "a2cec92d-698d-47b3-9411-4446f6396d16", "metadata": {}, "outputs": [], "source": [ "# optimizer.OptimalAs(obj)\n", "# optimizer.OptimalBs(obj)\n", "\n", "optimizer.OptimalParams(obj, Ns=11)" ] }, { "cell_type": "markdown", "id": "1916e08f-73e7-4fac-a07f-abdd88f03167", "metadata": {}, "source": [ "### Optimal Parameters" ] }, { "cell_type": "code", "execution_count": null, "id": "8d118d91-9bd2-4c89-982e-8a077da6cc96", "metadata": {}, "outputs": [], "source": [ "Display(obj.p)" ] }, { "cell_type": "markdown", "id": "6bc2021c-1182-450f-9869-bcad917f9233", "metadata": {}, "source": [ "One shot solving is also implemented but the execution is very slow, since the parameter space has a dimension of at least 5 parameters." 
] }, { "cell_type": "code", "execution_count": null, "id": "6e1daf32-5d2a-4ac6-8407-0cc3b01a2e01", "metadata": {}, "outputs": [], "source": [ "# optimizer.OptimalParameters()\n", "# Display(obj.p)" ] }, { "cell_type": "markdown", "id": "9bb100e7-bdf7-4da7-abf0-64bdf2cd8a59", "metadata": {}, "source": [ "Finding optimal $\\gamma$, $b_0$, and $b_1$ by the brute method" ] }, { "cell_type": "markdown", "id": "bb659208-8c70-4fe9-8db1-9c8c69ff20f7", "metadata": {}, "source": [ "### Plot Best Syllable\n", "\n", "Solve and visualize the optimal synthetic syllable and its features" ] }, { "cell_type": "code", "execution_count": null, "id": "c1dc6f7e-656d-4ede-877d-cb4ab4944df3", "metadata": {}, "outputs": [], "source": [ "#Display(obj.p)\n", "obj_synth_optimal = obj.Solve(obj.p)" ] }, { "cell_type": "code", "execution_count": null, "id": "fb1048f4-34c8-483a-a6a4-fe999c7866d3", "metadata": {}, "outputs": [], "source": [ "ploter.Syllables(obj, obj_synth_optimal)" ] }, { "cell_type": "code", "execution_count": null, "id": "0f0d3523-ede2-4e04-921a-dc06ef46e4dd", "metadata": {}, "outputs": [], "source": [ "# ploter.PlotVs(obj_synth_optimal)\n", "ploter.PlotAlphaBeta(obj_synth_optimal)\n", "ploter.Result(obj, obj_synth_optimal)\n", "AudioPlay(obj_synth_optimal)" ] }, { "cell_type": "markdown", "id": "8b92caca-52e7-405e-8638-42080d3eb66c", "metadata": {}, "source": [ "### Write Audio\n", "\n", "The final step is to write the audio. 
To export the syllable in audio format use the syllable method `WriteAudio`" ] }, { "cell_type": "code", "execution_count": null, "id": "c635771a-bcd7-4218-850a-765d29062870", "metadata": {}, "outputs": [], "source": [ "obj.WriteAudio()\n", "obj_synth.WriteAudio()" ] }, { "cell_type": "markdown", "id": "848194a9-20ee-4af2-9a4e-7c10d5ca1e19", "metadata": {}, "source": [ "# Times" ] }, { "cell_type": "code", "execution_count": 27, "id": "90399fce-f1ee-4bf6-aa24-953a4622ab1e", "metadata": {}, "outputs": [], "source": [ "brute = {'method':'brute', 'Ns':11} #, 'workers':-1} " ] }, { "cell_type": "code", "execution_count": 28, "id": "d9a89a28-5423-4839-acc0-ec1fe739bb01", "metadata": {}, "outputs": [], "source": [ "optimizer_bird = bs.Optimizer(clip_bird, method_kwargs=brute)" ] }, { "cell_type": "code", "execution_count": null, "id": "4bc05464-01d9-4095-8dcb-c3e664bfb564", "metadata": {}, "outputs": [], "source": [ "#optimizer_bird.AllGammasByTimes(times)\n", "synth_bird = optimizer_bird.SongByTimes(time_intervals)" ] }, { "cell_type": "code", "execution_count": 30, "id": "78aeacf7-d5cc-40b1-a56e-0c261710d04d", "metadata": {}, "outputs": [], "source": [ "# # # plt.plot(complet_bird.time_s, optimizer_bird.synth_bird_s)\n", "# # # plt.xlim((12, 14))\n", "\n", "# #plt.plot(optimizer_bird.synth_bird_s)\n", "\n", "# synth_bird = bs.Song(complet_bird.paths, complet_bird.no_file, \n", "# sfs=[optimizer_bird.synth_bird_s, complet_bird.fs],\n", "# split_method=\"amplitud\", umbral=-1.01)\n", "\n", "# # #optimizer_bird.obj0.id" ] }, { "cell_type": "code", "execution_count": 35, "id": "c95ecc4b-5bdf-4381-81ef-68a8393cbfbc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ploter.Plot(clip_bird)\n", "AudioPlay(clip_bird)" ] }, { "cell_type": "code", "execution_count": 24, "id": "19f3f9a6-6407-4c8c-9571-d9fc00dba1e4", "metadata": {}, "outputs": 
[ { "data": { "text/plain": [ "'song-synth'" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "optimizer_bird.synth_bird.id" ] }, { "cell_type": "code", "execution_count": 34, "id": "09c89453-7ec5-422f-a8f5-39b2448f361f", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#synth_bird.id = \"song-synth\"\n", "ploter.Plot(optimizer_bird.synth_bird)\n", "AudioPlay(optimizer_bird.synth_bird)" ] }, { "cell_type": "code", "execution_count": null, "id": "b3a27484-c060-4b51-8a3b-ea4c48916d59", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": 22, "id": "9b417941-664a-4654-8e2d-24d6debe6c42", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "C:\\Users\\sebas\\anaconda3\\lib\\site-packages\\maad\\sound\\input_output.py:390: UserWarning: Values for bit depth should be 8, 16 or 32. Argument ignored.\n", " warn('Values for bit depth should be 8, 16 or 32. 
Argument ignored.')\n" ] } ], "source": [ "clip_bird.WriteAudio()\n", "optimizer_bird.synth_bird.WriteAudio()" ] }, { "cell_type": "code", "execution_count": null, "id": "284a870f-5c68-43e2-9502-7f892d0e86e5", "metadata": {}, "outputs": [], "source": [ "# plt.plot(synth_bird.betas_bird)" ] }, { "cell_type": "code", "execution_count": null, "id": "b970eb1d-e4d0-4eec-b172-fc1dd26b6b74", "metadata": {}, "outputs": [], "source": [ "# AudioPlay(synth_bird)" ] }, { "cell_type": "markdown", "id": "21cf8002-9093-4216-a298-782918952f0e", "metadata": {}, "source": [ "## Whole Song\n", "\n", "This function attempts to solve the whole song: it calculates the optimal gamma and finds the optimal parameters for each syllable" ] }, { "cell_type": "code", "execution_count": null, "id": "f120a1f2-cfd7-4fb0-a208-94d2c133a562", "metadata": {}, "outputs": [], "source": [ "# bird.WholeSong(brute, plot=True, syll_max=0)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "vscode": { "interpreter": { "hash": "3ad933181bd8a04b432d3370b9dc3b0662ad032c4dfaa4e4f1596c548f763858" } } }, "nbformat": 4, "nbformat_minor": 5 }