{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Topology in time series forecasting\n", "\n", "This notebook shows how ``giotto-tda`` can be used to create topological features for time series forecasting tasks, and how to integrate them into ``scikit-learn``–compatible pipelines.\n", "\n", "In particular, we will concentrate on topological features which are created from consecutive **sliding windows** over the data. In sliding window models, a single time series array ``X`` of shape ``(n_timestamps, n_features)`` is turned into a time series of windows over the data, with a new shape ``(n_windows, n_samples_per_window, n_features)``. There are two main issues that arise when building forecasting models with sliding windows:\n", "\n", "1. ``n_windows`` is smaller than ``n_timestamps``. This is because we cannot have more windows than there are timestamps without padding ``X``, and this is not done by ``giotto-tda``. ``n_timestamps - n_windows`` is even larger if we decide to pick a large stride between consecutive windows.\n", "2. The target variable ``y`` needs to be properly \"aligned\" with each window so that the forecasting problem is meaningful and e.g. we don't \"leak\" information from the future. In particular, ``y`` needs to be \"resampled\" so that it too has length ``n_windows``.\n", "\n", "To deal with these issues, ``giotto-tda`` provides a selection of transformers with ``resample``, ``transform_resample`` and ``fit_transform_resample`` methods. These are inherited from a ``TransformerResamplerMixin`` base class. Furthermore, ``giotto-tda`` provides a drop-in replacement for ``scikit-learn``'s ``Pipeline`` which extends it to allow chaining ``TransformerResamplerMixin``s with regular ``scikit-learn`` estimators.\n", "\n", "If you are looking at a static version of this notebook and would like to run its contents, head over to [GitHub](https://github.com/giotto-ai/giotto-tda/blob/master/examples/time_series_forecasting.ipynb) and download the source.\n", "\n", "## See also\n", "\n", "- [Topology of time series](https://giotto-ai.github.io/gtda-docs/latest/notebooks/topology_time_series.html), in which the *Takens embedding* technique used here is explained in detail and illustrated via simple examples.\n", "- [Gravitational waves detection](https://giotto-ai.github.io/gtda-docs/latest/notebooks/gravitational_waves_detection.html), where,following [arXiv:1910.08245](https://arxiv.org/abs/1910.08245), the Takens embedding technique is shown to be effective for the detection of gravitational waves signals buried in background noise.\n", "- [Topological feature extraction using VietorisRipsPersistence and PersistenceEntropy](https://giotto-ai.github.io/gtda-docs/latest/notebooks/vietoris_rips_quickstart.html) for a quick introduction to general topological feature extraction in ``giotto-tda``.\n", "\n", "**License: AGPLv3**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ``SlidingWindow``\n", "\n", "Let us start with a simple example of a \"time series\" ``X`` with a corresponding target ``y`` of the same length." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "n_timestamps = 10\n", "X, y = np.arange(n_timestamps), np.arange(n_timestamps) - n_timestamps\n", "X, y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can instantiate our sliding window transformer-resampler and run it on the pair ``(X, y)``:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from gtda.time_series import SlidingWindow\n", "\n", "window_size = 3\n", "stride = 2\n", "\n", "SW = SlidingWindow(size=window_size, stride=stride)\n", "X_sw, yr = SW.fit_transform_resample(X, y)\n", "X_sw, yr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We note a couple of things:\n", "- ``fit_transform_resample`` returns a pair: the window-transformed ``X`` and the resampled and aligned ``y``.\n", "- ``SlidingWindow`` has made a choice for us on how to resample ``y`` and line it up with the windows from ``X``: a window on ``X`` corresponds to the *last* value in a corresponding window over ``y``. This is common in time series forecasting where, for example, ``y`` could be a shift of ``X`` by one timestamp.\n", "- Some of the initial values of ``X`` may not be found in ``X_sw``. This is because ``SlidingWindow`` only ensures the *last* value is represented in the last window, regardless of the stride. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multivariate time series example: Sliding window + topology ``Pipeline``" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "``giotto-tda``'s topology transformers expect 3D input. But our ``X_sw`` above is 2D. How do we capture interesting properties of the topology of input time series then? For univariate time series, it turns out that a good way is to use the \"time delay embedding\" or \"Takens embedding\" technique explained in [Topology of time series](https://github.com/giotto-ai/giotto-tda/blob/master/examples/topology_time_series.ipynb). But as this involves an extra layer of complexity, we leave it for later and concentrate for now on an example with a simpler API which also demonstrates the use of a ``giotto-tda`` ``Pipeline``.\n", "\n", "Surprisingly, this involves multivariate time series input!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rng = np.random.default_rng(42)\n", "\n", "n_features = 2\n", "\n", "X = rng.integers(0, high=20, size=(n_timestamps, n_features), dtype=int)\n", "X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are interpreting this input as a time series in two variables, of length ``n_timestamps``. The target variable is the same ``y`` as before." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SW = SlidingWindow(size=window_size, stride=stride)\n", "X_sw, yr = SW.fit_transform_resample(X, y)\n", "X_sw, yr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "``X_sw`` is now a complicated-looking array, but it has a simple interpretation. Again, ``X_sw[i]`` is the ``i``-th window on ``X``, and it contains ``window_size`` samples from the original time series. This time, the samples are not scalars but 1D arrays.\n", "\n", "What if we suspect that the way in which the **correlations** between the variables evolve over time can help forecast the target ``y``? 
, { "cell_type": "markdown", "metadata": {}, "source": [ "``giotto-tda`` exposes a ``PearsonDissimilarity`` transformer which creates a 2D dissimilarity matrix from each window in ``X_sw``, and stacks them together into a single 3D object. This is the correct format (and information content!) for a typical topological transformer in ``gtda.homology``. See also [Topological feature extraction from graphs](https://github.com/giotto-ai/giotto-tda/blob/master/examples/persistent_homology_graphs.ipynb) for an in-depth look. Finally, we can extract simple scalar features using a selection of transformers in ``gtda.diagrams``." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from gtda.time_series import PearsonDissimilarity\n", "from gtda.homology import VietorisRipsPersistence\n", "from gtda.diagrams import Amplitude\n", "\n", "PD = PearsonDissimilarity()\n", "X_pd = PD.fit_transform(X_sw)\n", "VR = VietorisRipsPersistence(metric=\"precomputed\")\n", "X_vr = VR.fit_transform(X_pd)  # \"precomputed\": inputs are dissimilarity matrices, not point clouds\n", "Ampl = Amplitude()\n", "X_a = Ampl.fit_transform(X_vr)\n", "X_a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we are not acting on ``y`` above. We are simply creating features from each window using topology! *Note*: it is two features per window because we used the default value for ``homology_dimensions`` in ``VietorisRipsPersistence``, not because we had two variables in the time series initially!\n", "\n", "We can now put this all together into a ``giotto-tda`` ``Pipeline`` which combines both the sliding window transformation on ``X`` and resampling of ``y`` with the feature extraction from the windows on ``X``.\n", "\n", "*Note*: while we could import the ``Pipeline`` class and use its constructor, we use the convenience function ``make_pipeline`` instead, which is a drop-in replacement for [scikit-learn's](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.make_pipeline.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn import set_config\n", "set_config(display='diagram')  # For HTML representations of pipelines\n", "\n", "from gtda.pipeline import make_pipeline\n", "\n", "pipe = make_pipeline(SW, PD, VR, Ampl)\n", "pipe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, if we have a *regression* task on ``y``, we can append an estimator such as scikit-learn's ``RandomForestRegressor`` as a final step in the previous pipeline, and fit the whole thing!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.ensemble import RandomForestRegressor\n", "\n", "RFR = RandomForestRegressor()\n", "\n", "pipe = make_pipeline(SW, PD, VR, Ampl, RFR)\n", "pipe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipe.fit(X, y)\n", "y_pred = pipe.predict(X)\n", "score = pipe.score(X, y)\n", "y_pred, score" ] }
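, { "cell_type": "markdown", "metadata": {}, "source": [ "A word of caution: above we fitted and scored on the *same* data, so the score is purely in-sample and says nothing about forecasting skill. Echoing the warning about leakage from the introduction, below is a minimal sketch of a time-respecting evaluation; the 70/30 cut-off is our own arbitrary choice, and a real evaluation would use far more than 10 timestamps." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Minimal time-respecting split (assumptions: 70/30 ordered cut, no shuffling,\n", "# toy-sized data): fit on the past, predict on the held-out future.\n", "cut = int(0.7 * n_timestamps)\n", "pipe.fit(X[:cut], y[:cut])\n", "pipe.predict(X[cut:])" ] }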
, { "cell_type": "markdown", "metadata": {}, "source": [ "## Univariate time series – ``TakensEmbedding`` and ``SingleTakensEmbedding``\n", "\n", "The notebook [Topology of time series](https://github.com/giotto-ai/giotto-tda/blob/master/examples/topology_time_series.ipynb) explains a commonly used technique for converting a univariate time series into a single **point cloud**. Since topological features can be extracted from any point cloud, this is a gateway to time series analysis using topology. The second part of that notebook shows how to transform a *batch* of time series into a batch of point clouds, and how to extract topological descriptors from each of them independently. While in that notebook this is applied to a time series classification task, in this notebook we are concerned with topology-powered *forecasting* from a single time series.\n", "\n", "Reasoning by analogy with the multivariate case above, we can look at sliding windows over ``X`` as small time series in their own right and track the evolution of *their* topology against the variable of interest (or against itself, if we are interested in unsupervised tasks such as anomaly detection).\n", "\n", "There are two ways in which we can implement this idea in ``giotto-tda``:\n", "1. We can first apply a ``SlidingWindow``, and then an instance of ``TakensEmbedding``.\n", "2. We can *first* compute a global Takens embedding of the time series via ``SingleTakensEmbedding``, which takes us from 1D/column data to a 2D array of embedding vectors, and *then* partition that 2D array into sliding windows via ``SlidingWindow``.\n", "\n", "The first route ensures that we can run our \"topological feature extraction track\" in parallel with other feature-generation pipelines from sliding windows, without experiencing shape mismatches. The second route seems a little upside-down and is not generally recommended, but it has the advantage that globally \"optimal\" values of the \"time delay\" and \"embedding dimension\" parameters can be computed automatically by ``SingleTakensEmbedding``.\n", "\n", "Below is what each route would look like.\n", "\n", "*Remark:* In the presence of noise, a small sliding window size is likely to reduce the reliability of the estimate of the time series' local topology." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Option 1: ``SlidingWindow`` + ``TakensEmbedding``\n", "\n", "``TakensEmbedding`` is not a ``TransformerResamplerMixin``, but this is not a problem in the context of a ``Pipeline`` when we order things in this way. Before running it, the sketch below spells out how many embedding vectors each window will produce."
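] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By definition, a Takens embedding with parameters ``time_delay`` and ``dimension`` turns a window of ``size`` samples into ``size - time_delay * (dimension - 1)`` vectors of length ``dimension``. The next cell is just this arithmetic for the parameter values used below; the variable names are ours, not part of the ``giotto-tda`` API." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Expected point-cloud size per window; plain arithmetic, no gtda call.\n", "# A window of `size` samples yields size - time_delay * (dimension - 1)\n", "# embedding vectors, each of length `dimension`.\n", "size, delay, dim = 5, 1, 2  # the values used in the cells below\n", "size - delay * (dim - 1)  # points per window"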
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from gtda.time_series import TakensEmbedding\n", "\n", "X = np.arange(n_timestamps)\n", "\n", "window_size = 5\n", "stride = 2\n", "\n", "SW = SlidingWindow(size=window_size, stride=stride)\n", "X_sw, yr = SW.fit_transform_resample(X, y)\n", "X_sw, yr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time_delay = 1\n", "dimension = 2\n", "\n", "TE = TakensEmbedding(time_delay=time_delay, dimension=dimension)\n", "X_te = TE.fit_transform(X_sw)\n", "X_te" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "VR = VietorisRipsPersistence() # No \"precomputed\" for point clouds\n", "Ampl = Amplitude()\n", "RFR = RandomForestRegressor()\n", "\n", "pipe = make_pipeline(SW, TE, VR, Ampl, RFR)\n", "pipe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipe.fit(X, y)\n", "y_pred = pipe.predict(X)\n", "score = pipe.score(X, y)\n", "y_pred, score" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Option 2: ``SingleTakensEmbeding`` + ``SlidingWindow``\n", "\n", "Note that ``SingleTakensEmbedding`` is also a ``TransformerResamplerMixin``, and that the logic for resampling/aligning ``y`` is the same as in ``SlidingWindow``." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from gtda.time_series import SingleTakensEmbedding\n", "\n", "X = np.arange(n_timestamps)\n", "\n", "STE = SingleTakensEmbedding(parameters_type=\"search\", time_delay=2, dimension=3)\n", "X_ste, yr = STE.fit_transform_resample(X, y)\n", "X_ste, yr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "window_size = 5\n", "stride = 2\n", "\n", "SW = SlidingWindow(size=window_size, stride=stride)\n", "X_sw, yr = SW.fit_transform_resample(X_ste, yr)\n", "X_sw, yr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From here on, it is easy to push a very similar pipeline through as in the multivariate case:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "VR = VietorisRipsPersistence() # No \"precomputed\" for point clouds\n", "Ampl = Amplitude()\n", "RFR = RandomForestRegressor()\n", "\n", "pipe = make_pipeline(STE, SW, VR, Ampl, RFR)\n", "pipe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipe.fit(X, y)\n", "y_pred = pipe.predict(X)\n", "score = pipe.score(X, y)\n", "y_pred, score" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Integrating non-topological features\n", "\n", "The best results are obtained when topological methods are used not in isolation but in **combination** with other methods. Here's an example where, in parallel with the topological feature extraction from local sliding windows using **Option 2** above, we also compute the mean and variance in each sliding window. A ``scikit-learn`` ``FeatureUnion`` is used to combine these very different sets of features into a single pipeline object." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from functools import partial\n", "from sklearn.preprocessing import FunctionTransformer\n", "from sklearn.pipeline import FeatureUnion\n", "from sklearn.base import clone\n", "\n", "mean = FunctionTransformer(partial(np.mean, axis=1, keepdims=True))\n", "var = FunctionTransformer(partial(np.var, axis=1, keepdims=True))\n", "\n", "pipe_topology = make_pipeline(TE, VR, Ampl)\n", "\n", "feature_union = FeatureUnion([(\"window_mean\", mean),\n", " (\"window_variance\", var),\n", " (\"window_topology\", pipe_topology)])\n", " \n", "pipe = make_pipeline(SW, feature_union, RFR)\n", "pipe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipe.fit(X, y)\n", "y_pred = pipe.predict(X)\n", "score = pipe.score(X, y)\n", "y_pred, score" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Endogeneous target preparation with ``Labeller``\n", "\n", "Let us say that we simply wish to predict the future of a time series from itself. This is very common in the study of financial markets for example. ``giotto-tda`` provides convenience classes for target preparation from a time series. This notebook only shows a very simple example: many more options are described in ``Labeller``'s documentation.\n", "\n", "If we wished to create a target ``y`` from ``X`` such that ``y[i]`` is equal to ``X[i + 1]``, while also modifying ``X`` and ``y`` so that they still have the same length, we could proceed as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from gtda.time_series import Labeller\n", "\n", "X = np.arange(10)\n", "\n", "Lab = Labeller(size=1, func=np.max)\n", "Xl, yl = Lab.fit_transform_resample(X, X)\n", "Xl, yl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we are feeding two copies of ``X`` to ``fit_transform_resample`` in this case!\n", "\n", "This is what fitting an end-to-end pipeline for future prediction using topology could look like. Again, you are encouraged to include your own non-topological features in the mix!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SW = SlidingWindow(size=5)\n", "TE = TakensEmbedding(time_delay=1, dimension=2)\n", "VR = VietorisRipsPersistence()\n", "Ampl = Amplitude()\n", "RFR = RandomForestRegressor()\n", "\n", "# Full pipeline including the regressor\n", "pipe = make_pipeline(Lab, SW, TE, VR, Ampl, RFR)\n", "pipe" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipe.fit(X, X)\n", "y_pred = pipe.predict(X)\n", "y_pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Where to next?\n", "\n", "1. There are two additional simple ``TransformerResamplerMixin``s in ``gtda.time_series``: ``Resampler`` and ``Stationarizer``.\n", "2. The sort of pipeline for topological feature extraction using Takens embedding is a bit crude. More sophisticated methods exist for extracting robust topological summaries from (windows on) time series. A good source of inspiration is the following paper:\n", "\n", " > [Persistent Homology of Complex Networks for Dynamic State Detection](https://arxiv.org/abs/1904.07403), by A. Myers, E. Munch, and F. A. Khasawneh.\n", " \n", " The module ``gtda.graphs`` contains several transformers implementing the main algorithms proposed there.\n", "3. 
Advanced users may be interested in ``ConsecutiveRescaling``, which can be found in ``gtda.point_clouds``.\n", "4. The notebook [Lorenz attractor](https://github.com/giotto-ai/giotto-tda/blob/master/examples/lorenz_attractor.ipynb) is an advanced use case for ``TakensEmbedding`` and other topology-inspired time series forecasting techniques." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }