{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Convolution based time series classification in aeon\n", "\n", "This notebook is a high-level introduction to using and configuring convolution-based\n", "classifiers in aeon. Convolution-based classifiers are built on the ROCKET transform\n", "[1] and its subsequent extensions MiniROCKET [2] and MultiROCKET [3]. These\n", "transforms can be used in pipelines, but we provide two convolution-based classifiers\n", " based on ROCKET for ease of use and reproducibility. The RocketClassifier combines\n", " the transform with a scikit-learn RidgeClassifierCV classifier. The terms\n", " convolution and kernel are used interchangeably in this notebook. A convolution is a\n", " subseries that is used to create features for a time series. To do this, the\n", " convolution is slid along the series and the dot product is calculated at each position. This creates a\n", " new series (often called an activation map or feature map) where large values\n", " correspond to a close correlation to the convolution.\n", "\n", "[Figure: Windowing.]\n", "\n", "ROCKET computes two features from the resulting feature maps: the maximum value\n", "(sometimes called a max pooling operation), and the proportion of positive values\n", " (PPV). In the above example the first entry of the activation map is the result of the\n", " dot product $T_{1:3} * \\omega = T_{1:3} \\cdot \\omega = 0 + 0 + 3 = 3$. Max\n", " pooling extracts the maximum from the activation map as a feature. The proportion of\n", " positive values (PPV) is $8 / 11$ in this example.\n", "\n", "A large number of random convolutions are generated, and the two features from each are\n", "combined to produce a transformed training data set. This is used to train a linear\n", "classifier. [1] recommend a ridge regression classifier using cross-validation to\n", "tune the $L_2$-regularisation parameter $\\alpha$. 
A logistic regression classifier\n", "is suggested as a replacement for larger datasets.\n", "[Figure: ROCKET.]\n", "\n", "ROCKET employs dilation. Dilation is a form of downsampling, in that it defines\n", "spaces between time points. Hence, a convolution with dilation $d$ is compared to\n", "time points $d$ steps apart when calculating the dot product.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": "[('Arsenal', aeon.classification.convolution_based._arsenal.Arsenal),\n ('RocketClassifier',\n aeon.classification.convolution_based._rocket_classifier.RocketClassifier)]" }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import warnings\n", "\n", "from aeon.registry import all_estimators\n", "\n", "warnings.filterwarnings(\"ignore\")\n", "all_estimators(\n", " \"classifier\", filter_tags={\"algorithm_type\": \"convolution\"}, as_dataframe=True\n", ")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(67, 1, 24)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.metrics import accuracy_score\n", "\n", "from aeon.classification.convolution_based import Arsenal, RocketClassifier\n", "from aeon.datasets import load_basic_motions # multivariate dataset\n", "from aeon.datasets import load_italy_power_demand # univariate dataset\n", "\n", "italy, italy_labels = load_italy_power_demand(split=\"train\")\n", "italy_test, italy_test_labels = load_italy_power_demand(split=\"test\")\n", "motions, motions_labels = load_basic_motions(split=\"train\")\n", "motions_test, motions_test_labels = load_basic_motions(split=\"test\")\n", "italy.shape" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "ROCKET compiles (via Numba) on import, which may take a few seconds. 
ROCKET does not\n", "produce estimates of class probabilities. Because of this, the Arsenal was developed\n", "for use with the HIVE-COTE meta-ensemble (in the hybrid package). The Arsenal is an\n", "ensemble of ROCKET classifiers that is no more accurate than ROCKET, but gives better\n", " probability estimates.\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.967930029154519" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rocket = RocketClassifier()\n", "rocket.fit(italy, italy_labels)\n", "y_pred = rocket.predict(italy_test)\n", "accuracy_score(italy_test_labels, y_pred)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.9698736637512148" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "afc = Arsenal()\n", "afc.fit(italy, italy_labels)\n", "y_pred = afc.predict(italy_test)\n", "accuracy_score(italy_test_labels, y_pred)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "MiniROCKET [2] is a fast version of ROCKET that uses hard-coded convolutions and only\n", " uses PPV. MultiROCKET [3] adds three new pooling operations extracted from each\n", " kernel: mean of positive values (MPV), mean of indices of positive values (MIPV) and\n", " longest stretch of positive values (LSPV). MultiROCKET generates a total of 50k\n", " features from 10k kernels and 5 pooling operations. It also extracts features from\n", " first order differences. The RocketClassifier and Arsenal can be configured to use\n", " MiniROCKET and MultiROCKET. Simply set `rocket_transform` to \"minirocket\" or\n", " \"multirocket\". Both work on multivariate series: channels are selected at\n", " random for each convolution." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.9718172983479106" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "multi_r = Arsenal(rocket_transform=\"multirocket\")\n", "multi_r.fit(italy, italy_labels)\n", "y_pred = multi_r.predict(italy_test)\n", "accuracy_score(italy_test_labels, y_pred)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mini_r = RocketClassifier(rocket_transform=\"minirocket\")\n", "mini_r.fit(motions, motions_labels)\n", "y_pred = mini_r.predict(motions_test)\n", "accuracy_score(motions_test_labels, y_pred)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "RocketClassifier has three other parameters that may affect performance.\n", "`num_kernels` (default 10,000) determines the number of convolutions/kernels generated\n", " and will influence the memory usage. `max_dilations_per_kernel` (default 32) and\n", "`n_features_per_kernel` (default 4) are used in MiniROCKET and MultiROCKET. 
For\n", "each candidate convolution, up to `max_dilations_per_kernel` dilations are assessed and\n", "`n_features_per_kernel` features are retained.\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## References\n", "\n", "[1] Dempster A, Petitjean F and Webb GI (2019) ROCKET: Exceptionally fast\n", "and accurate time series classification using random convolutional kernels.\n", "[arXiv:1910.13051](https://arxiv.org/abs/1910.13051),\n", "[Journal Paper](https://link.springer.com/article/10.1007/s10618-020-00701-z)\n", "\n", "[2] Dempster A, Schmidt D and Webb GI (2021) MINIROCKET: A Very Fast (Almost)\n", "Deterministic Transform for Time Series Classification.\n", "[arXiv:2012.08791](https://arxiv.org/abs/2012.08791),\n", "[Conference Paper](https://dl.acm.org/doi/abs/10.1145/3447548.3467231)\n", "\n", "[3] Tan CW, Dempster A, Bergmeir C and Webb GI (2022) MultiRocket: multiple pooling\n", "operators and transformations for fast and effective time series classification.\n", "[Journal Paper](https://link.springer.com/article/10.1007/s10618-022-00844-1)\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 0 }