# Example-1 (Comparison of three different classifiers)

A comparison of three classifiers from `scikit-learn` on the iris dataset.
The iris dataset is a classic, very easy multi-class classification dataset.

## Install scikit-learn

In [1]:
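import sys
import os

# Install scikit-learn into the environment of the running kernel.
!{sys.executable} -m pip install scikit-learn

# Create the output directory for the HTML reports if it does not exist yet.
os.makedirs("Example1_Files", exist_ok=True)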





## Load dataset

In [2]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from pycm import ConfusionMatrix
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
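
Because `train_test_split` is called here without `test_size` or `train_size`, scikit-learn falls back to its default 25% test share. The arithmetic below is a small sketch of how that works out for the 150 iris samples, assuming the test count is rounded up; the variable names are illustrative only.

```python
import math

n_samples = 150            # size of the iris dataset
default_test_size = 0.25   # scikit-learn default when test_size and train_size are both unset

# Assumption: the test count is rounded up and the remainder goes to training.
n_test = math.ceil(default_test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 112 38
```

The 38 test samples are why every confusion matrix in this example sums to 38.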


## Classifier 1 (C-Support vector)

In [3]:
from sklearn import svm
classifier_1 = svm.SVC(kernel='linear', C=0.01)

In [4]:
y_pred_1 = classifier_1.fit(X_train, y_train).predict(X_test)

In [5]:
cm1 = ConfusionMatrix(y_test, y_pred_1)
cm1.print_matrix()

Predict   0      1      2
Actual
0         13     0      0
1         0      10     6
2         0      0      9
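
For readers who want to see what the printed matrix encodes, here is a minimal pure-Python sketch that counts (actual, predicted) pairs. It is not PYCM's implementation, and both the toy label vectors and the helper name `count_confusion` are hypothetical.

```python
from collections import Counter


def count_confusion(actual, predicted):
    """Return {actual_label: {predicted_label: count}} for two equal-length label vectors."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    labels = sorted(set(actual) | set(predicted))
    pair_counts = Counter(zip(actual, predicted))
    # Outer keys are actual classes (rows), inner keys are predicted classes (columns).
    return {a: {p: pair_counts[(a, p)] for p in labels} for a in labels}


# Hypothetical toy vectors, not the iris predictions.
actual = [0, 0, 1, 1, 1, 2, 2]
predicted = [0, 0, 1, 2, 2, 2, 2]

matrix = count_confusion(actual, predicted)
for label, row in matrix.items():
    print(label, row)
```

As in the printout above, each row corresponds to an actual class and each column to a predicted class, so off-diagonal entries are misclassifications.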




In [6]:
cm1.print_normalized_matrix()

Predict   0        1        2
Actual
0         1.0      0.0      0.0
1         0.0      0.625    0.375
2         0.0      0.0      1.0
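
The normalized matrix can be reproduced by dividing every row of the count matrix by its row sum. The sketch below assumes that row-wise normalization is all that is involved; `normalize_rows` is a hypothetical helper, not a PYCM function.

```python
def normalize_rows(matrix):
    """Divide each row by its sum; rows that sum to zero are returned as zeros."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Counts copied from the cm1 printout (rows: actual classes 0, 1, 2).
cm1_counts = [
    [13, 0, 0],
    [0, 10, 6],
    [0, 0, 9],
]

cm1_normalized = normalize_rows(cm1_counts)
for row in cm1_normalized:
    print(row)
# [1.0, 0.0, 0.0]
# [0.0, 0.625, 0.375]
# [0.0, 0.0, 1.0]
```

Each diagonal entry of the normalized matrix is the share of that actual class that was predicted correctly, so class 1 is recovered only 62.5% of the time by this classifier.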




In [7]:
cm1.Kappa 

0.7673469387755101

In [8]:
cm1.Overall_ACC

0.8421052631578947
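
Both numbers can be checked by hand from the count matrix. Overall accuracy is the diagonal sum divided by the total, and Cohen's kappa is `(p_o - p_e) / (1 - p_e)`, where `p_o` is the observed agreement (the accuracy) and `p_e` is the agreement expected by chance from the row and column totals. The sketch below uses the standard textbook formulas with hypothetical helper names; it is not PYCM's code.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square count matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e). Undefined when p_e equals 1."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement: product of matching row and column totals, summed over classes.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1 - expected)


# Counts copied from the cm1 printout.
cm1_counts = [
    [13, 0, 0],
    [0, 10, 6],
    [0, 0, 9],
]

print(round(overall_accuracy(cm1_counts), 4))  # 0.8421
print(round(cohen_kappa(cm1_counts), 4))       # 0.7673
```

For cm1 this gives `p_o = 32/38` and `p_e = 464/1444`, which reduces to a kappa of `752/980`, matching the value reported above.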

In [9]:
cm1.SOA1 # Landis and Koch benchmark

'Substantial'

In [10]:
cm1.SOA2 # Fleiss’ benchmark

'Excellent'

In [11]:
cm1.SOA3 # Altman’s benchmark

'Good'

In [12]:
cm1.SOA4 # Cicchetti’s benchmark

'Excellent'
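
The four `SOA` attributes translate the Kappa value into a verbal rating on different published scales. The lookup below is a rough sketch that assumes the commonly cited cut-offs for each scale; the table `BENCHMARKS` and the function `interpret_kappa` are hypothetical names, and PYCM may treat values that fall exactly on a boundary differently.

```python
# Each scale is a list of (lower_bound, label) pairs, ordered from the highest bound down.
# Assumption: commonly cited cut-offs; boundary handling in PYCM may differ.
BENCHMARKS = {
    "Landis and Koch": [
        (0.80, "Almost Perfect"),
        (0.60, "Substantial"),
        (0.40, "Moderate"),
        (0.20, "Fair"),
        (0.00, "Slight"),
    ],
    "Fleiss": [(0.75, "Excellent"), (0.40, "Intermediate to Good")],
    "Altman": [(0.80, "Very Good"), (0.60, "Good"), (0.40, "Moderate"), (0.20, "Fair")],
    "Cicchetti": [(0.75, "Excellent"), (0.60, "Good"), (0.40, "Fair")],
}


def interpret_kappa(kappa, scale):
    """Return the first label whose lower bound the kappa value reaches, else 'Poor'."""
    for lower_bound, label in BENCHMARKS[scale]:
        if kappa >= lower_bound:
            return label
    return "Poor"


kappa_cm1 = 0.7673469387755101  # value reported for cm1 above
for scale in BENCHMARKS:
    print(scale, "->", interpret_kappa(kappa_cm1, scale))
```

Under these assumed cut-offs the cm1 Kappa maps to the same four labels that PYCM reports above, which also shows that the Fleiss and Cicchetti scales are coarser at the top end than the other two.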

In [13]:
cm1.save_html(os.path.join("Example1_Files","cm1"))

{'Message': 'D:\\For Asus Laptop\\projects\\pycm\\Document\\Example1_Files\\cm1.html',
 'Status': True}


## Classifier 2 (Decision tree)

In [14]:
from sklearn.tree import DecisionTreeClassifier
classifier_2 = DecisionTreeClassifier(max_depth=5)

In [15]:
y_pred_2 = classifier_2.fit(X_train, y_train).predict(X_test)

In [16]:
cm2 = ConfusionMatrix(y_test, y_pred_2)
cm2.print_matrix()

Predict   0      1      2
Actual
0         13     0      0
1         0      15     1
2         0      0      9




In [17]:
cm2.print_normalized_matrix()

Predict   0        1        2
Actual
0         1.0      0.0      0.0
1         0.0      0.9375   0.0625
2         0.0      0.0      1.0




In [18]:
cm2.Kappa 

0.95978835978836

In [19]:
cm2.Overall_ACC

0.9736842105263158

In [20]:
cm2.SOA1 # Landis and Koch benchmark

'Almost Perfect'

In [21]:
cm2.SOA2 # Fleiss’ benchmark

'Excellent'

In [22]:
cm2.SOA3 # Altman’s benchmark

'Very Good'

In [23]:
cm2.SOA4 # Cicchetti’s benchmark

'Excellent'

In [24]:
cm2.save_html(os.path.join("Example1_Files","cm2"))

{'Message': 'D:\\For Asus Laptop\\projects\\pycm\\Document\\Example1_Files\\cm2.html',
 'Status': True}


## Classifier 3 (AdaBoost)

In [25]:
from sklearn.ensemble import AdaBoostClassifier
classifier_3 = AdaBoostClassifier()



In [26]:
y_pred_3 = classifier_3.fit(X_train, y_train).predict(X_test)

In [27]:
cm3 = ConfusionMatrix(y_test, y_pred_3)
cm3.print_matrix()

Predict   0      1      2
Actual
0         13     0      0
1         0      15     1
2         0      3      6




In [28]:
cm3.print_normalized_matrix()

Predict   0        1        2
Actual
0         1.0      0.0      0.0
1         0.0      0.9375   0.0625
2         0.0      0.33333  0.66667




In [29]:
cm3.Kappa 

0.8354978354978355

In [30]:
cm3.Overall_ACC

0.8947368421052632

In [31]:
cm3.SOA1 # Landis and Koch benchmark

'Almost Perfect'

In [32]:
cm3.SOA2 # Fleiss’ benchmark

'Excellent'

In [33]:
cm3.SOA3 # Altman’s benchmark

'Very Good'

In [34]:
cm3.SOA4 # Cicchetti’s benchmark

'Excellent'

In [35]:
cm3.save_html(os.path.join("Example1_Files","cm3"))

{'Message': 'D:\\For Asus Laptop\\projects\\pycm\\Document\\Example1_Files\\cm3.html',
 'Status': True}


## How to compare classifiers?

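Classifiers can be compared using their benchmark ratings together with the underlying statistics.
In this run the second classifier (decision tree) is the best one: it has the highest Kappa (0.960) and overall accuracy (0.974), ahead of AdaBoost (0.835 and 0.895) and the support vector classifier (0.767 and 0.842). The Fleiss and Cicchetti benchmarks rate all three classifiers as 'Excellent', so the Landis and Koch and Altman benchmarks, or the raw Kappa values, are more useful for telling them apart. PYCM also reports many other statistics, such as `Kappa`, `Scott's pi`, and `Entropy`, to name but a handful.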
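
One simple way to turn the reported numbers into a ranking is to sort the classifiers by Kappa and break ties with overall accuracy. This is only an illustrative sketch using the values printed above; the dictionary layout and the tie-breaking rule are assumptions, not a PYCM feature.

```python
# Kappa and overall accuracy as reported in the cells above.
results = {
    "SVC (linear, C=0.01)": {"kappa": 0.7673469387755101, "accuracy": 0.8421052631578947},
    "DecisionTree (max_depth=5)": {"kappa": 0.95978835978836, "accuracy": 0.9736842105263158},
    "AdaBoost": {"kappa": 0.8354978354978355, "accuracy": 0.8947368421052632},
}

# Sort by Kappa first and by overall accuracy second, best classifier first.
ranking = sorted(
    results,
    key=lambda name: (results[name]["kappa"], results[name]["accuracy"]),
    reverse=True,
)

for rank, name in enumerate(ranking, start=1):
    scores = results[name]
    print(f"{rank}. {name}: kappa={scores['kappa']:.3f}, accuracy={scores['accuracy']:.3f}")
```

With a single 38-sample test split the gaps between classifiers rest on only a few samples, so a ranking like this is best read as indicative rather than conclusive.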