{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MiniRocket\n", "\n", "MiniRocket [1] transforms input time series using a small, fixed set of convolutional\n", "kernels. For each resulting feature map, MiniRocket computes a single feature using PPV pooling (i.e., the proportion of positive values). The transformed features are used to train a linear classifier.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1 Univariate Time Series\n", "\n", "### 1.1 Imports\n", "\n", "Import example data, `MiniRocket`, `MiniRocketClassifier`, `MiniRocketRegressor`,\n", "`RidgeClassifierCV` and `StandardScaler` (scikit-learn), and `numpy`.\n", "\n", "You can use the `MiniRocket` transform directly, in a pipeline, or through the built-in `MiniRocketClassifier` or `MiniRocketRegressor`.\n", "\n", "**Note**: `MiniRocket` is compiled by `numba` on import. The compiled functions are\n", "cached, so this should only happen once (i.e., the first time you import `MiniRocket`)."
] }, { "cell_type": "code", "metadata": { "execution": { "iopub.execute_input": "2020-10-12T17:43:03.214929Z", "iopub.status.busy": "2020-10-12T17:43:03.214184Z", "iopub.status.idle": "2020-10-12T17:43:03.216304Z", "shell.execute_reply": "2020-10-12T17:43:03.216990Z" }, "ExecuteTime": { "end_time": "2024-11-25T11:08:58.368462Z", "start_time": "2024-11-25T11:08:58.349939Z" } }, "source": [ "# !pip install --upgrade numba" ], "outputs": [], "execution_count": 1 }, { "metadata": { "ExecuteTime": { "end_time": "2024-11-25T11:10:18.327182Z", "start_time": "2024-11-25T11:08:59.095253Z" } }, "cell_type": "code", "source": [ "import numpy as np\n", "from sklearn.linear_model import RidgeClassifierCV\n", "from sklearn.preprocessing import StandardScaler\n", "\n", "from aeon.classification.convolution_based import MiniRocketClassifier\n", "from aeon.datasets import load_arrow_head # univariate dataset\n", "from aeon.datasets import load_basic_motions # multivariate dataset\n", "from aeon.regression.convolution_based import MiniRocketRegressor\n", "from aeon.transformations.collection.convolution_based import MiniRocket" ], "outputs": [], "execution_count": 2 }, { "metadata": { "ExecuteTime": { "end_time": "2024-11-25T11:10:23.328664Z", "start_time": "2024-11-25T11:10:23.234728Z" } }, "cell_type": "code", "source": [ "X_train, y_train = load_arrow_head(split=\"train\")\n", "minirocket = MiniRocket() # by default, MiniRocket uses ~10_000 kernels\n", "minirocket.fit(X_train)\n", "X_train_transform = minirocket.transform(X_train)\n", "# test shape of transformed training data -> (n_cases, 9_996)\n", "X_train_transform.shape" ], "outputs": [ { "data": { "text/plain": [ "(36, 9996)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "execution_count": 5 }, { "metadata": {}, "cell_type": "markdown", "source": [ "### 1.4 Fit a Classifier" ] }, { "metadata": {}, "cell_type": "markdown", "source": [ "We suggest using `RidgeClassifierCV` (scikit-learn) for 
smaller datasets (fewer than ~10,000 training examples), and logistic regression trained with stochastic gradient descent for larger datasets.\n", "\n", "**Note**: For larger datasets, this means integrating MiniRocket with stochastic gradient descent so that the transform is performed per minibatch, *not* simply replacing `RidgeClassifierCV` with, e.g., `LogisticRegression`.\n", "\n", "**Note**: Although the time series passed to MiniRocket are left unscaled, the output features of MiniRocket may need to be scaled for the downstream model. For example, for `RidgeClassifierCV` we scale the features using scikit-learn's `StandardScaler`." ] }, { "metadata": { "ExecuteTime": { "end_time": "2024-11-25T11:10:26.380394Z", "start_time": "2024-11-25T11:10:26.343196Z" } }, "cell_type": "code", "source": [ "scaler = StandardScaler(with_mean=False)\n", "classifier = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10))\n", "X_train_scaled_transform = scaler.fit_transform(X_train_transform)\n", "classifier.fit(X_train_scaled_transform, y_train)" ], "outputs": [ { "data": { "text/plain": [ "RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01,\n", " 4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01,\n", " 2.15443469e+02, 1.00000000e+03]))" ], "text/html": [ "
RidgeClassifierCV(alphas=array([1.00000000e-03, 4.64158883e-03, 2.15443469e-02, 1.00000000e-01,\n",
" 4.64158883e-01, 2.15443469e+00, 1.00000000e+01, 4.64158883e+01,\n",
" 2.15443469e+02, 1.00000000e+03]))