{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Sascha Spors,\n", "Professorship Signal Theory and Digital Signal Processing,\n", "Institute of Communications Engineering (INT),\n", "Faculty of Computer Science and Electrical Engineering (IEF),\n", "University of Rostock,\n", "Germany\n", "\n", "# Data Driven Audio Signal Processing - A Tutorial with Computational Examples\n", "\n", "Winter Semester 2025/26 (Master Course #24512)\n", "\n", "- lecture: https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture\n", "- tutorial: https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise\n", "\n", "Feel free to contact lecturer frank.schultz@uni-rostock.de" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 1: Introduction to DDASP" ] }, { "cell_type": "markdown", "metadata": { "vscode": { "languageId": "plaintext" } }, "source": [ "## Mindset\n", "\n", "When and why machine learning?!\n", "\n", "ChatGPT Text Synthesis vs. Prediction Model for Exam Grades ?!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motivation: Binary Classification with a Non-Linear Model\n", "\n", "\n", "\n", "Binary Logistic Regression\n", "- [binary_logistic_regression_manual.ipynb](binary_logistic_regression_manual.ipynb)\n", "- [binary_logistic_regression_torch.ipynb](binary_logistic_regression_torch.ipynb)\n", "- [binary_logistic_regression_tensorflow.ipynb](binary_logistic_regression_tensorflow.ipynb)\n", "\n", "Binary Classification with Non-Linear Models\n", "- [binary_logistic_regression_torch_with_hidden_layers.ipynb](binary_logistic_regression_torch_with_hidden_layers.ipynb) (above plot is created with this code)\n", "- [binary_logistic_regression_tf_with_hidden_layers.ipynb](binary_logistic_regression_tf_with_hidden_layers.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TensorFlow Playground\n", "- https://playground.tensorflow.org" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Machine Learning Ingredients\n", "- **Human Intelligence and Creativity**\n", "- Vector Calculus / Analysis\n", "- Matrix Calculus / Linear Algebra\n", "- Statistics\n", "- Signal Processing\n", "- Optimisation\n", "- Programming (Python!)\n", "- Data Handling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## IEF BinderHub\n", "- virtual machine storage is lost when the virtual machine is abandoned\n", "- persistent storage only at `mnt/home`\n", "- File -> New Terminal\n", "- `cd mnt/home`\n", "- we can clone the tutorial material into persistent storage by:\n", "- `git clone https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise.git`\n", "- the same is possible for the lecture material:\n", "- `git clone https://github.com/spatialaudio/data-driven-audio-signal-processing-lecture`\n", "\n", "## Useful Python Packages\n", "\n", "- `numpy` for matrix / tensor linear algebra\n", "- `scipy` for scientific computing (e.g. optimisation, signal processing, statistics)\n", "- `matplotlib` for plotting\n", "- 
`scikit-learn` for predictive data analysis, machine learning\n", "- `statsmodels` for statistical models, i.e. machine learning driven from the statistics community\n", "- `tensorflow` for deep learning with DNNs, CNNs...\n", "- `keras-tuner` for convenient hyperparameter tuning in tensorflow\n", "- `torch` for deep learning with DNNs, CNNs... and audio handling\n", "- `pandas` for data handling\n", "\n", "audio related packages that we might use here and there\n", "- `librosa`+`ffmpeg` music/audio analysis + en-/decoding/stream support\n", "- `soundfile` for reading and writing audio files\n", "- `pyloudnorm` to calculate a technical loudness measure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Most Recommended Books\n", "- Gilbert Strang *Linear Algebra and Learning From Data*, Wellesley, 2019\n", "- Kevin P. Murphy *Probabilistic Machine Learning: An Introduction*, MIT Press, 2022, free draft of most current version at https://probml.github.io/pml-book/book1.html\n", "- Sebastian Raschka *Machine Learning with PyTorch and Scikit-Learn*, Packt, 2022, https://www.packtpub.com/en-us/product/machine-learning-with-pytorch-and-scikit-learn-9781801819312\n", "\n", "Please do not learn from AI-written books! There are more textbook recommendations at the end of [index.ipynb](index.ipynb)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Homework Assignment\n", "Learning, and thus improving our own skills, comes from doing things ourselves and manually.\n", "We should read textbooks; we need to use our brain!\n", "Relying on ChatGPT or comparable tools is the wrong approach to learning and comprehension, because we never know whether these models tell the truth.\n", "\n", "We go for a manual, human solution for these two tasks\n", "\n", "1. Matrix Fundamentals\n", "- in StudIP `MatrixFundamentals.pdf`\n", "2. 
Regression with a Neural Network Model\n", "- in StudIP `RegressionWithNonLinearModel_Task.pdf`\n", "- a hopefully helpful template to start with: [homework/homework_template.ipynb](homework/homework_template.ipynb)\n", "\n", "It might be more painful in the beginning, but it is more rewarding by orders of magnitude than a cheated ChatGPT solution." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear Models vs. Non-Linear Models\n", "\n", "### Linear Model\n", "Forward Problem\n", "$$\\bm{y} = \\bm{X} \\bm{\\theta}$$\n", "Inverse Problem (exact inverse if $\\bm{X}$ is square and invertible, pseudo-inverse $\\bm{X}^{\\dagger}$ in general)\n", "$$\\hat{\\bm{\\theta}} = \\bm{X}^{-1} \\bm{y},\\qquad\n", "\\hat{\\bm{\\theta}} = \\bm{X}^{\\dagger} \\bm{y}\n", "$$\n", "\n", "### Non-Linear Model\n", "Forward Problem\n", "$$\\bm{y} = f_3(f_2(f_1(\\bm{X},\\bm{\\theta}_1),\\bm{\\theta}_2), \\bm{\\theta}_3)$$\n", "How to solve the inverse problem???\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Didactic Story\n", "- First, we should familiarise ourselves with all the ingredients of machine learning using only linear models. 
This requires a fair understanding of matrix calculus and linear algebra.\n", "- Then, we can move on to non-linear models, since this extension involves very few changes to key concepts and mindsets.\n", "- Binary logistic regression is a perfect model for initially learning how non-linear models work.\n", "- Small non-linear models (such as the regression model from the homework task and the small binary classification models in this tutorial) are perfectly suited to manual implementation.\n", "\n", "Hence:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## General Objective\n", "\n", "- For engineers **understanding the essence** of a concept is more important than a strict math proof\n", " - as engineers we can leave proofs to mathematicians\n", " - *example*: understanding the 4 matrix subspaces and the matrix (pseudo)-inverse based on the SVD is essential need-to-know, whereas in-depth proofs on this fundamental topic are nice to have\n", "- We should\n", " - understand building blocks of machine learning for (audio) data processing\n", " - create simple tool chains from these building blocks\n", " - create simple applications from these tool chains\n", " - get an impression about real industrial applications and their algorithmic and data effort\n", " - get in touch with scientific literature\n", " - where to find, how to read\n", " - there we will find the latest tool chain inventions (if published at all; a lot of material is either unavailable due to company secrets, or only patent specifications exist, which usually omit heavy math and important details)\n", " - interpretation of results\n", " - reproducibility\n", " - re-inventing a tool chain\n", " - get in touch with major software libraries (in Python), see above" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Best Engineering Practice\n", "\n", "- engineering is about creating (hopefully useful) tools by using existing tools\n", "- models are tools and thus perfectly fit to 
the engineering community\n", "- we should know the tools we use and create in great detail\n", "- aspects of responsibility, ethics, and morals\n", "- substantially reflecting on our engineering task before starting is a good idea\n", " - critical reflection (higher good vs. earning money)\n", " - do we really need machine support here\n", " - if so, how can machines support us here, how do humans solve this task\n", " - what do machines do better here than humans and vice versa\n", " - what is our expectation of the model performance\n", " - handcrafted model vs. machine learned model (problem: model transparency)\n", "- ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Established Procedure\n", "for structured development of data-driven methods (cf. the lecture)\n", "\n", "1. Definition of the problem and of performance measures\n", "2. Data preparation and feature extraction\n", "3. Spot check potential model architectures\n", "4. Model selection\n", "5. Evaluation and reporting\n", "6. Application\n", "\n", "If we neglect 1. and 2., we will almost certainly under-perform in 3. and 4., which directly affects 5. and 6.\n", "\n", "Thus, we really should take the whole procedure chain seriously. We hopefully do this all the time in the lecture and exercise." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Copyright\n", "\n", "- the notebooks are provided as [Open Educational Resources](https://en.wikipedia.org/wiki/Open_educational_resources)\n", "- the text is licensed under [Creative Commons Attribution 4.0](https://creativecommons.org/licenses/by/4.0/)\n", "- the code of the IPython examples is licensed under the [MIT license](https://opensource.org/licenses/MIT)\n", "- feel free to use the notebooks for your own purposes\n", "- please attribute the work as follows: *Frank Schultz, Data Driven Audio Signal Processing - A Tutorial Featuring Computational Examples, University of Rostock* ideally with relevant file(s), github URL https://github.com/spatialaudio/data-driven-audio-signal-processing-exercise, commit number and/or version tag, year." ] } ], "metadata": { "interpreter": { "hash": "1743232f157dd0954c61aae30535e75a2972519a625c7e796bafe0cd9a07bf7e" }, "kernelspec": { "display_name": "myddasp", "language": "python", "name": "myddasp" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.6" } }, "nbformat": 4, "nbformat_minor": 4 }