{ "metadata": { "kernelspec": { "name": "python", "display_name": "Pyolite", "language": "python" }, "language_info": { "codemirror_mode": { "name": "python", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8" }, "toc-showcode": false }, "nbformat_minor": 4, "nbformat": 4, "cells": [ { "cell_type": "markdown", "source": "
\n", "metadata": {} }, { "cell_type": "markdown", "source": "# **Softmax Regression ,One-vs-All & One-vs-One for Multi-class Classification**\n", "metadata": {} }, { "cell_type": "markdown", "source": "Estimated time needed: **1** hour\n", "metadata": {} }, { "cell_type": "markdown", "source": "In this lab, we will study how to convert a linear classifier into a multi-class classifier, including multinomial logistic regression or softmax regression, One vs. All (One-vs-Rest) and One vs. One\n", "metadata": {} }, { "cell_type": "markdown", "source": "## **Objectives**\n", "metadata": {} }, { "cell_type": "markdown", "source": "After completing this lab you will be able to:\n", "metadata": {} }, { "cell_type": "markdown", "source": "* Understand and apply some theory behind:\n * Softmax regression\n * One vs. All (One-vs-Rest)\n * One vs. One\n", "metadata": {} }, { "cell_type": "markdown", "source": "## **Introduction**\n", "metadata": {} }, { "cell_type": "markdown", "source": "In Multi-class classification, we classify data into multiple class labels . Unlike classification trees and k-nearest neighbour, the concept of Multi-class classification for linear classifiers is not as straightforward. We can convert logistic regression to Multi-class classification using multinomial logistic regression or softmax regression; this is a generalization of logistic regression, this will not work for support vector machines. One vs. All (One-vs-Rest) and One vs. One are two other multi-class classification techniques can covert any two-class classifier to a multi-class classifier.\n", "metadata": {} }, { "cell_type": "markdown", "source": "***\n", "metadata": {} }, { "cell_type": "markdown", "source": "## **Install and Import the required libraries**\n", "metadata": {} }, { "cell_type": "markdown", "source": "For this lab, we are going to be using several Python libraries such as scit-learn, numpy and matplotlib for visualizations. Some of these libraries might be installed in your lab environment, others may need to be installed by you by removing the hash signs. 
The cells below will install these libraries when executed.\n", "metadata": {} }, { "cell_type": "code", "source": "import piplite\nawait piplite.install(['pandas'])\nawait piplite.install(['matplotlib'])\nawait piplite.install(['numpy'])\nawait piplite.install(['scikit-learn'])\nawait piplite.install(['scipy'])\n", "metadata": { "trusted": true }, "execution_count": 1, "outputs": [] }, { "cell_type": "code", "source": "\nfrom pyodide.http import pyfetch\n\nasync def download(url, filename):\n response = await pyfetch(url)\n if response.status == 200:\n with open(filename, \"wb\") as f:\n f.write(await response.bytes())\n", "metadata": { "trusted": true }, "execution_count": 2, "outputs": [] }, { "cell_type": "code", "source": "import numpy as np\nimport matplotlib.pyplot as plt\nfrom sklearn import datasets\nfrom sklearn.svm import SVC\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import accuracy_score\nimport pandas as pd", "metadata": { "trusted": true }, "execution_count": 3, "outputs": [] }, { "cell_type": "markdown", "source": "## Utility Functions\n", "metadata": {} }, { "cell_type": "markdown", "source": "This function plots the decision boundary of a fitted model.\n", "metadata": {} }, { "cell_type": "code", "source": "plot_colors = \"ryb\"\nplot_step = 0.02\n\ndef decision_boundary (X,y,model,iris, two=None):\n x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),\n np.arange(y_min, y_max, plot_step))\n plt.tight_layout(h_pad=0.5, w_pad=0.5, pad=2.5)\n \n Z = model.predict(np.c_[xx.ravel(), yy.ravel()])\n Z = Z.reshape(xx.shape)\n cs = plt.contourf(xx, yy, Z,cmap=plt.cm.RdYlBu)\n \n if two:\n cs = plt.contourf(xx, yy, Z,cmap=plt.cm.RdYlBu)\n for i, color in zip(np.unique(y), plot_colors):\n \n idx = np.where( y== i)\n plt.scatter(X[idx, 0], X[idx, 1], label=y,cmap=plt.cm.RdYlBu, s=15)\n plt.show()\n \n else:\n set_={0,1,2}\n print(set_)\n for i, color in zip(range(3), plot_colors):\n idx = np.where( y== i)\n if np.any(idx):\n\n set_.remove(i)\n\n plt.scatter(X[idx, 0], X[idx, 1], label=y,cmap=plt.cm.RdYlBu, edgecolor='black', s=15)\n\n\n for i in set_:\n idx = np.where( iris.target== i)\n plt.scatter(X[idx, 0], X[idx, 1], marker='x',color='black')\n\n plt.show()\n", "metadata": { "trusted": true }, "execution_count": 4, "outputs": [] }, { "cell_type": "markdown", "source": "This function plots the probability of belonging to each class; each column is the probability of belonging to a class, and the row number is the sample number.\n", "metadata": {} }, { "cell_type": "code", "source": "def plot_probability_array(X,probability_array):\n\n plot_array=np.zeros((X.shape[0],30))\n col_start=0\n ones=np.ones((X.shape[0],30))\n for class_,col_end in enumerate([10,20,30]):\n plot_array[:,col_start:col_end]= np.repeat(probability_array[:,class_].reshape(-1,1), 10,axis=1)\n col_start=col_end\n plt.imshow(plot_array)\n plt.xticks([])\n plt.ylabel(\"samples\")\n plt.xlabel(\"probability of 3 classes\")\n plt.colorbar()\n plt.show()", "metadata": { "trusted": true }, "execution_count": 5, "outputs": [] }, { "cell_type": "markdown", "source": "In this lab we will use the iris dataset. It consists of the petal and sepal measurements of 3 different types of irises (Setosa y=0, Versicolour y=1, and Virginica y=2), stored in a 150x4 numpy.ndarray.\n\nThe rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.\n\nThe plot below uses 
two of the four features: sepal width and petal width.\n", "metadata": {} }, { "cell_type": "code", "source": "pair=[1, 3]\niris = datasets.load_iris()\nX = iris.data[:, pair]\ny = iris.target\nnp.unique(y)", "metadata": { "trusted": true }, "execution_count": 6, "outputs": [ { "execution_count": 6, "output_type": "execute_result", "data": { "text/plain": "array([0, 1, 2])" }, "metadata": {} } ] }, { "cell_type": "code", "source": "plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.RdYlBu)\nplt.xlabel(\"sepal width (cm)\")\nplt.ylabel(\"petal width (cm)\")", "metadata": { "trusted": true }, "execution_count": 7, "outputs": [ { "execution_count": 7, "output_type": "execute_result", "data": { "text/plain": "Text(0, 0.5, 'petal width (cm)')" }, "metadata": {} } ] }, { "cell_type": "markdown", "source": "## **Softmax Regression**\n", "metadata": {} }, { "cell_type": "markdown", "source": "Softmax regression is similar to logistic regression. The softmax function converts the actual distances, i.e. the dot products of $x$ with each of the parameter vectors $\theta_i$ for the $K$ classes, into probabilities using the following:\n", "metadata": {} }, { "cell_type": "markdown", "source": "$softmax(x,i) = \frac{e^{\theta_i^T x}}{\sum_{j=1}^K e^{\theta_j^T x}}$\n", "metadata": {} }, { "cell_type": "markdown", "source": "The training procedure is almost identical to logistic regression. Consider the three-class example where $y \in \{0,1,2\}$ and we would like to classify a sample $x_1$. We can use the softmax function to generate a probability of how likely the sample belongs to each class:\n", "metadata": {} }, { "cell_type": "markdown", "source": "$[softmax(x_1,0),softmax(x_1,1),softmax(x_1,2)]=[0.97,0.02,0.01]$\n", "metadata": {} }, { "cell_type": "markdown", "source": "The index of each probability is the same as the class. We can make a prediction using the argmax function:\n", "metadata": {} }, { "cell_type": "markdown", "source": "$\hat{y}=argmax_i \; softmax(x,i)$\n", "metadata": {} }, { "cell_type": "markdown", "source": "For the above example, we can make a prediction as follows:\n", "metadata": {} }, { "cell_type": "markdown", "source": "$\hat{y}=argmax_i \; [0.97,0.02,0.01]=0$\n", "metadata": {} }, { "cell_type": "markdown", "source": "sklearn
does this automatically, but we can verify the prediction steps ourselves. First, we fit the model:\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "lr = LogisticRegression(random_state=0).fit(X, y)",
"metadata": {
"trusted": true
},
"execution_count": 8,
"outputs": []
},
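{
"cell_type": "markdown",
"source": "Each of the $K=3$ classes gets its own parameter vector $\theta_i$. As a quick sanity check (a small sketch added for illustration, not part of the original lab), we can inspect the fitted coefficients: one row of coefficients and one intercept per class.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "# one parameter vector theta_i per class:\n# coef_ has shape (3 classes, 2 features), intercept_ has shape (3,)\nlr.coef_.shape, lr.intercept_.shape",
"metadata": {
"trusted": true
},
"execution_count": null,
"outputs": []
},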
{
"cell_type": "markdown",
"source": "We generate the probability using the method predict
.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "yhat =lr.predict(X)\naccuracy_score(yhat,softmax_prediction)",
"metadata": {
"trusted": true
},
"execution_count": 15,
"outputs": [
{
"execution_count": 15,
"output_type": "execute_result",
"data": {
"text/plain": "1.0"
},
"metadata": {}
}
]
},
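{
"cell_type": "markdown",
"source": "To connect this back to the softmax formula, the cell below is a minimal sketch (added for illustration, not part of the original lab) that computes $softmax(x,i)$ by hand from the fitted parameters and checks that its argmax agrees with predict. This assumes sklearn fit a multinomial (softmax) model, which is the default for multi-class LogisticRegression in recent versions. We also use the utility function plot_probability_array to visualize the probability of each class for every sample.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "# scores theta_i^T x for each class i, shape (n_samples, 3)\nscores = X @ lr.coef_.T + lr.intercept_\n# softmax: exponentiate and normalize each row so it sums to one\nmanual_prob = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)\n# argmax recovers the predicted class; should agree with lr.predict(X)\nprint(accuracy_score(np.argmax(manual_prob, axis=1), lr.predict(X)))\n# visualize the per-class probabilities for every sample\nplot_probability_array(X, manual_prob)",
"metadata": {
"trusted": true
},
"execution_count": null,
"outputs": []
},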
{
"cell_type": "markdown",
"source": "We can't use Softmax regression for SVMs let explore two methods of Multi-class Classification. that we can apply to SVM.\n",
"metadata": {}
},
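{
"cell_type": "markdown",
"source": "Before building these strategies by hand, note that scikit-learn also ships ready-made wrappers for both in sklearn.multiclass. The cell below is a short sketch (added for illustration, not part of the original lab) showing how they would be applied; the rest of the lab constructs the same ideas manually to show how they work.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "# sklearn's built-in one-vs-rest and one-vs-one wrappers around a two-class SVM\nfrom sklearn.multiclass import OneVsRestClassifier, OneVsOneClassifier\n\novr = OneVsRestClassifier(SVC(kernel='linear', gamma=.5)).fit(X, y)\novo = OneVsOneClassifier(SVC(kernel='linear', gamma=.5)).fit(X, y)\nprint(accuracy_score(y, ovr.predict(X)), accuracy_score(y, ovo.predict(X)))",
"metadata": {
"trusted": true
},
"execution_count": null,
"outputs": []
},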
{
"cell_type": "markdown",
"source": "## SVM\n",
"metadata": {}
},
{
"cell_type": "markdown",
"source": "Sklean performs Multi-class Classification automatically, we can apply the method and calculate the accuracy. Train a SVM classifier with the `kernel` set to `linear`, `gamma` set to `0.5`, and the `probability` paramter set to `True`, then train the model using the `X` and `y` data.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "model = SVC(kernel='linear', gamma=.5, probability=True)\n\nmodel.fit(X,y)",
"metadata": {
"trusted": true
},
"execution_count": 17,
"outputs": [
{
"execution_count": 17,
"output_type": "execute_result",
"data": {
"text/plain": "SVC(gamma=0.5, kernel='linear', probability=True)"
},
"metadata": {}
}
]
},
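{
"cell_type": "markdown",
"source": "We can now calculate the accuracy on the training data (a short addition for completeness; the cell above only fits the model):\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "# accuracy of the multi-class SVM on the training data\nyhat = model.predict(X)\naccuracy_score(y, yhat)",
"metadata": {
"trusted": true
},
"execution_count": null,
"outputs": []
},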
{
"cell_type": "markdown",
"source": "my_models
. For each class we take the class samples we would like to classify, and the rest will be labelled as a dummy class. We repeat the process for each class. For each classifier, we plot the decision regions. The class we are interested in is in red, and the dummy class is in blue. Similarly, the class samples are marked in blue, and the dummy samples are marked with a black x.\n",
"metadata": {}
},
{
"cell_type": "code",
"source": "#dummy class\ndummy_class=y.max()+1\n#list used for classifiers \nmy_models=[]\n#iterate through each class\nfor class_ in np.unique(y):\n #select the index of our class\n select=(y==class_)\n temp_y=np.zeros(y.shape)\n #class, we are trying to classify \n temp_y[y==class_]=class_\n #set other samples to a dummy class \n temp_y[y!=class_]=dummy_class\n #Train model and add to list \n model=SVC(kernel='linear', gamma=.5, probability=True) \n my_models.append(model.fit(X,temp_y))\n #plot decision boundary \n decision_boundary (X,temp_y,model,iris)\n",
"metadata": {
"trusted": true
},
"execution_count": 20,
"outputs": [
{
"name": "stdout",
"text": "{0, 1, 2}\n",
"output_type": "stream"
},
{
"output_type": "display_data",
"data": {
"text/plain": "\n | 0 and 1 | \n0 and 2 | \n1 and 2 | \n
---|---|---|---|
0 | \n0 | \n0 | \n1 | \n
1 | \n0 | \n0 | \n1 | \n
2 | \n0 | \n0 | \n1 | \n
3 | \n0 | \n0 | \n1 | \n
4 | \n0 | \n0 | \n1 | \n
5 | \n0 | \n0 | \n1 | \n
6 | \n0 | \n0 | \n1 | \n
7 | \n0 | \n0 | \n1 | \n
8 | \n0 | \n0 | \n1 | \n
9 | \n0 | \n0 | \n1 | \n