{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Signed Cumulative Distribution Transform Nearest Subspace (SCDT-NS) Classifier\n", "\n", "This tutorial demonstrates how to use the SCDT-NS classifier from the *PyTransKit* package to classify 1D signals." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Class:: SCDT_NS\n", "**Functions**:\n", "\n", "1. Constructor function:\n", "        scdt_ns_obj = SCDT_NS(num_classes, rm_edge)\n", "        \n", "        Inputs:\n", "        ----------------\n", "        num_classes : integer value\n", "            total number of classes in the dataset.\n", "        rm_edge : [optional] boolean \n", "            IF TRUE the first and last points of CDTs will be removed.\n", "        \n", "        Outputs:\n", "        ----------------\n", "        scdt_ns_obj : class object\n", "            Instance of the class SCDT_NS.\n", "        \n", "2. Fit function:\n", "        scdt_ns_obj.fit(Xtrain, Ytrain, Ttrain)\n", "        \n", "        Inputs:\n", "        ----------------\n", "        Xtrain : array-like, shape (n_samples, n_columns)\n", "            1D data for training.\n", "        Ytrain : ndarray of shape (n_samples,)\n", "            Labels of the training samples.\n", "        Ttrain : [optional] array-like, shape (n_samples, n_columns)\n", "            domain for corresponding training signals.\n", "        \n", "3. 
Predict function:\n", "        preds = scdt_ns_obj.predict(Xtest, Ttest, use_gpu)\n", "        \n", "        Inputs:\n", "        ----------------\n", "        Xtest : array-like, shape (n_samples, n_columns)\n", "            1D data for testing.\n", "        Ttest : [optional] array-like, shape (n_samples, n_columns)\n", "            domain for corresponding test signals.\n", "        use_gpu: [optional] boolean flag; IF TRUE, use GPU for calculations\n", "            default = False.\n", "        \n", "        Outputs:\n", "        ----------------\n", "        preds : 1d array, shape (n_samples,)\n", "            Predicted labels for test samples.\n", "        " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example\n", "The following example will demonstrate how to:\n", "* create and initialize an instance of the class SCDT_NS\n", "* train the model using 1D signals\n", "* apply the model to predict class labels of the test 1D samples\n", "\n", "In this example we have used a synthetic 1D dataset that contains three classes:\n", "\n", "* Class 0: 4-th degree polynomial applied to a Gabor wave\n", "* Class 1: 4-th degree polynomial applied to a sawtooth wave\n", "* Class 2: 4-th degree polynomial applied to a square wave" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import python libraries" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from numpy import interp\n", "import math\n", "import matplotlib.pyplot as plt\n", "from scipy.linalg import lstsq\n", "from scipy import signal\n", "\n", "import sys\n", "sys.path.append('../')\n", "from pytranskit.classification.utils import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Import SCDT-NS class from PyTransKit package" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from pytranskit.classification.scdt_ns import SCDT_NS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Define three template signals: Gabor wave, sawtooth wave, square wave" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def signal_gabor(t):\n", "    w = 30\n", "    s1 = np.real(math.pi**-0.25 * np.exp(2*math.pi*1j*w*(t-0.5)) * np.exp(-200*((t-0.5)**2)))\n", "    return s1\n", "\n", "def signal_sawtooth(t):\n", "    s2 = signal.sawtooth(2 * np.pi * 15 * (t-0.5))* np.exp(-250*((t-0.5)**2))\n", "    return s2\n", "\n", "def signal_square(t):\n", "    s3 = signal.square(2 * np.pi * 15 * (t-0.5))*np.exp(-250*((t-0.5)**2))\n", "    return s3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generate template signals for three classes" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "N = 300 # number of discrete samples per signal\n", "\n", "## Define a set of signals\n", "t = np.linspace(0.,1.,N)\n", "\n", "s_template = []\n", "s_template.append(signal_gabor(t)) \n", "s_template.append(signal_sawtooth(t)) \n", "s_template.append(signal_square(t))\n", "\n", "num_classes = len(s_template) # number of classes\n", "\n", "## Plotting\n", "fig, ax = plt.subplots(1, 3, sharex=False, sharey=False, figsize=(25,3))\n", "\n", "ax[0].plot(t,s_template[0],'b')\n", "ax[0].set_title(\"Gabor wave\")\n", "ax[0].set_yticks([])\n", "\n", "ax[1].plot(t,s_template[1],'r')\n", "ax[1].set_title(\"Sawtooth wave\")\n", "ax[1].set_yticks([])\n", "\n", "ax[2].plot(t,s_template[2],'k')\n", "ax[2].set_title(\"Square wave\")\n", "ax[2].set_yticks([])\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generate confounds (apply 4-th degree polynomial)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "Lp = 4\n", "p0 = np.linspace(-0.5,0.15,Lp) #[-0.5, 0.2]\n", "p1 = np.linspace(0.85,1.25,Lp) #[0.75, 1.5]\n", "p2 = np.linspace(0.5, 1.25,Lp) #[0.5, 1.5]\n", "p3 = np.linspace(0.5, 1.25,Lp) #[0.5, 1.5]\n", "p4 = np.linspace(0.5, 1.25,Lp) #[0.5, 1.5]\n", "\n", "s_conf = []\n", "y_conf = []\n", "coeff = np.zeros(5)\n", "for k in range(num_classes):\n", "    Lc = 0\n", "    for a in range(Lp):\n", "        coeff[0]=p0[a]\n", "        for b in range(Lp):\n", "            coeff[1]=p1[b]\n", "            for c in range(Lp):\n", "                coeff[2]=p2[c]\n", "                for d in range(Lp):\n", "                    coeff[3]=p3[d]\n", "                    for e in range(Lp):\n", "                        coeff[4]=p4[e]\n", "                        # g is the polynomial warp; g_prime is its derivative\n", "                        g = coeff[4]*(t**4)+coeff[3]*(t**3)+coeff[2]*(t**2)+coeff[1]*(t**1)+coeff[0]*(t**0)\n", "                        g_prime = 4*coeff[4]*(t**3)+3*coeff[3]*(t**2)+2*coeff[2]*(t**1)+coeff[1]\n", "                        if k==0:\n", "                            sc = g_prime*signal_gabor(g) \n", "                        elif k==1:\n", "                            sc = g_prime*signal_sawtooth(g) \n", "                        elif k==2:\n", "                            sc = g_prime*signal_square(g) \n", "                        s_conf.append(sc)\n", "                        y_conf.append(k)\n", "                        Lc = Lc+1\n", "s_conf = np.stack(s_conf, axis=0)\n", "y_conf = np.stack(y_conf, axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Split data into train and test sets" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def random_index_matrix(max_index, n_samples_perclass, num_classes, repeat, y_train):\n", "    seed = int('{}{}{}'.format(n_samples_perclass, num_classes, repeat))\n", "    np.random.seed(seed)\n", "    tr_index = []\n", "    te_index = []\n", "    for classidx in range(num_classes):\n", "        max_samples = (y_train == classidx).sum()\n", "        # randint samples with replacement, so training indices may repeat\n", "        tr_index.append(np.random.randint(0, max_samples, (n_samples_perclass)))\n", "        # ~i equals -i-1 for integers, so each test index mirrors a training index from the end\n", "        te_index.append(~tr_index[classidx])\n", "    return tr_index, te_index" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1536, 300)\n", "(1536, 300)\n" ] } ], "source": [ "data_train = []\n", "data_test = []\n", "y_train = []\n", "y_test = []\n", "\n", "max_index = len(y_conf) // num_classes\n", "train_index, test_index = random_index_matrix(max_index, max_index//2, num_classes, num_classes, y_conf)\n", "for i in range(num_classes):\n", "    data_class = s_conf[np.where(y_conf==i)]\n", "    y_class = y_conf[np.where(y_conf==i)]\n", "    data_train.append(data_class[train_index[i],:])\n", "    y_train.append(y_class[train_index[i]])\n", "    data_test.append(data_class[~train_index[i],:])\n", "    y_test.append(y_class[~train_index[i]])\n", "data_train = np.concatenate(data_train,axis=0)\n", "data_test = np.concatenate(data_test,axis=0)\n", "y_train = np.concatenate(y_train)\n", "y_test = np.concatenate(y_test)\n", "print(data_train.shape)\n", "print(data_test.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example we have used 256 randomly chosen samples per class 
to train the model. The *take_train_samples* function from the *utils.py* script is used for this; users can substitute their own script." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(768, 300)\n" ] } ], "source": [ "n_samples_perclass = 256\n", "x_train_sub, y_train_sub = take_train_samples(data_train, y_train, n_samples_perclass, num_classes, repeat=0)\n", "print(x_train_sub.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create an instance of the SCDT_NS class" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "scdt_ns_obj = SCDT_NS(num_classes, rm_edge=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training Phase" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Calculating SCDTs for training data ...\n", "Generating basis vectors for each class ...\n" ] } ], "source": [ "scdt_ns_obj.fit(x_train_sub, y_train_sub)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Testing Phase" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Calculating SCDTs for testing samples ...\n", "Finding nearest subspace for each test sample ...\n" ] } ], "source": [ "preds = scdt_ns_obj.predict(data_test, use_gpu=False)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Test accuracy: 100.0%\n" ] } ], "source": [ "from sklearn.metrics import accuracy_score\n", "\n", "print('\\nTest accuracy: {}%'.format(100*accuracy_score(y_test, preds)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": 
"python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }