{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Univariate time series classification with sktime-dl\n", "[GitHub](https://github.com/sktime/sktime-dl)\n", "\n", "In this notebook, we use sktime-dl to perform univariate time series classification with deep learning." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "from sklearn.model_selection import GridSearchCV\n", "from sktime.datasets import load_italy_power_demand\n", "from sktime_dl.deeplearning import CNNClassifier\n", "\n", "sns.set_style('whitegrid')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Load a dataset" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "<thead><tr><th></th><th>dim_0</th></tr></thead>\n", "<tbody>\n", "<tr><th>0</th><td>0 -0.710520\n", "1 -1.183300\n", "2 -1.372400\n", "3...</td></tr>\n", "<tr><th>1</th><td>0 -0.993010\n", "1 -1.426800\n", "2 -1.579900\n", "3...</td></tr>\n", "<tr><th>2</th><td>0 1.319100\n", "1 0.569770\n", "2 0.195130\n", "3...</td></tr>\n", "<tr><th>3</th><td>0 -0.812440\n", "1 -1.157600\n", "2 -1.416400\n", "3...</td></tr>\n", "<tr><th>4</th><td>0 -0.972840\n", "1 -1.390500\n", "2 -1.536700\n", "3...</td></tr>\n", "</tbody>\n", "</table>" ], "text/plain": [ " dim_0\n", "0 0 -0.710520\n", "1 -1.183300\n", "2 -1.372400\n", "3...\n", "1 0 -0.993010\n", "1 -1.426800\n", "2 -1.579900\n", "3...\n", "2 0 1.319100\n", "1 0.569770\n", "2 0.195130\n", "3...\n", "3 0 -0.812440\n", "1 -1.157600\n", "2 -1.416400\n", "3...\n", "4 0 -0.972840\n", "1 -1.390500\n", "2 -1.536700\n", "3..." ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train, y_train = load_italy_power_demand(split='train', return_X_y=True)\n", "X_test, y_test = load_italy_power_demand(split='test', return_X_y=True)\n", "X_train.head()" ] },
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-0.71052" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "i = 0\n", "# Each cell of X_train holds a whole series as a pandas Series; show its first value\n", "X_train.loc[i, \"dim_0\"].iloc[0]" ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def plot_data_samples(X, y, sample_numbers):\n", "    '''\n", "    Plot the time series data relating to the input list of sample numbers.\n", "\n", "    X: nested DataFrame with one series per row in column \"dim_0\"\n", "    y: array of class labels, one per row of X\n", "    sample_numbers: list of integers\n", "        E.g. [1, 7, 22, 42]\n", "    '''\n", "\n", "    unique_labels = np.unique(y).astype(int)\n", "    num_classes = len(unique_labels)\n", "    if num_classes <= 4:\n", "        class_colors = ['red', 'blue', 'green', 'orange']\n", "    else:\n", "        class_colors = sns.color_palette(n_colors=num_classes)\n", "\n", "    fig, ax = plt.subplots()\n", "    for i in sample_numbers:\n", "        print('sample', i, 'class', str(y[i]))\n", "        color_num = y[i].astype(int) - unique_labels.min()\n", "        # Plot from the X argument rather than the global X_train\n", "        X.loc[i, \"dim_0\"].plot(label=str(y[i]), color=class_colors[color_num])\n", "\n", "    print('')\n", "    plt.ylim([-3.5, 3.5])\n", "    # The two-line colour legend in the title only makes sense for exactly two classes\n", "    if num_classes == 2:\n", "        title = class_colors[0]+' : class '+str(unique_labels[0])\n", "        title = title + '\\n'+class_colors[1]+' : class '+str(unique_labels[1])\n", "        plt.title(title)\n", "    ax.set_ylabel('Data value')\n", "    ax.set_xlabel('Data point number')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Plot some data samples" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "sample 0 class 1\n", "sample 1 class 1\n", "sample 2 class 2\n", "sample 3 class 2\n", "\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_data_samples(X_train, y_train, [0, 1, 2, 3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Train a single deep neural network classifier\n", "Here we choose to use the CNN (convolutional neural network) classifier. Other classifiers provided by sktime-dl include MLP, ResNet and InceptionTime." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.6180758017492711" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "network = CNNClassifier(nb_epochs=200, verbose=False)\n", "network.fit(X_train, y_train)\n", "network.score(X_test, y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save the model to file" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "network.model.save(\"temp_model.h5\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Grid Search\n", "sktime-dl is compatible with scikit-learn and can use sklearn's GridSearchCV.\n", "\n", "Here we search over two parameters, number of epochs and CNN kernel size." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best cross-validation accuracy: 0.68\n", "Test set score: 0.53\n", "Best parameters: {'kernel_size': 9, 'nb_epochs': 100}\n" ] } ], "source": [ "param_grid = {'nb_epochs': [50, 100],\n", " 'kernel_size': [5, 7, 9] }\n", "grid = GridSearchCV(network, param_grid=param_grid, cv=5) \n", "grid.fit(X_train, y_train)\n", " \n", "print(\"Best cross-validation accuracy: {:.2f}\".format(grid.best_score_))\n", "print(\"Test set score: {:.2f}\".format(grid.score(X_test, y_test)))\n", "print(\"Best parameters: {}\".format(grid.best_params_))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 4 }