{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic usage\n", "\n", "\n", "\n", "Let us start with a simple example to illustrate the use of `ruptures`: generate a 3-dimensional piecewise constant signal with noise and estimate the change points.\n", "\n", "## Setup\n", "First, we make the necessary imports." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt # for display purposes\n", "\n", "import ruptures as rpt # our package" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate and display the signal\n", "\n", "Let us generate a 3-dimensional piecewise constant signal with Gaussian noise." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "n_samples, n_dims, sigma = 1000, 3, 2\n", "n_bkps = 4 # number of breakpoints\n", "signal, bkps = rpt.pw_constant(n_samples, n_dims, n_bkps, noise_std=sigma)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The true change points of this synthetic signal are available in the `bkps` variable." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(bkps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the first four elements are change point indexes, while the last is simply the number of samples.\n", "(This is a technical convention so that functions in `ruptures` always know the length of the signal at hand.)\n", "\n", "It is also possible to plot our \\(\\mathbb{R}^3\\)-valued signal along with the true change points with the `rpt.display` function.\n", "In the following image, the color changes whenever the mean of the signal shifts." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax_array = rpt.display(signal, bkps)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Change point detection\n", "We can now perform change point detection, meaning that we find the indexes where the signal mean changes.\n", "To that end, we minimize the sum of squared errors when approximating the signal by a piecewise constant signal.\n", "Formally, for a signal \\( y_0 , y_1 , \\dots , y_{T-1} \\) (\\( T \\) samples), we solve the following optimization problem, over all possible change positions \\( t_1 < t_2 < \\dots < t_K \\)\n", "where the number \\( K \\) of changes is defined by the user:\n", "\n", "\\[\n", " \\hat{t}_1, \\hat{t}_2,\\dots,\\hat{t}_K = \\arg\\min_{t_1,\\dots,t_K} V(t_1,t_2,\\dots,t_K)\n", "\\]\n", "\n", "with\n", "\n", "\\[\n", " V(t_1,t_2,\\dots,t_K) := \\sum_{k=0}^K\\sum_{t=t_k}^{t_{k+1}-1} \\|y_t-\\bar{y}_{t_k..t_{k+1}}\\|^2\n", "\\]\n", "\n", "\n", "where \\( \\bar{y}_{t_k..t_{k+1}} \\) is the empirical mean of the sub-signal \\( y_{t_k}, y_{t_k+1},\\dots,y_{t_{k+1}-1} \\).\n", "(By convention \\( t_0=0 \\) and \\( t_{K+1}=T \\).)\n", "\n", "This optimization is solved with dynamic programming, using the [`Dynp`](../user-guide/detection/dynp.md) class. (More information in the section [What is change point detection?](/what-is-cpd) and the [User guide](/user-guide).)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# detection\n", "algo = rpt.Dynp(model=\"l2\").fit(signal)\n", "result = algo.predict(n_bkps=4)\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again the first elements are change point indexes and the last is the number of samples." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Display the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To visually compare the true segmentation (`bkps`) and the estimated one (`result`), we can call `rpt.display` a second time.\n", "In the following image, the alternating colors indicate the true breakpoints, and the dashed vertical lines mark the estimated breakpoints." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# display\n", "rpt.display(signal, bkps, result)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this simple example, the true and estimated segmentations are nearly indistinguishable." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }