{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "81e0620e", "metadata": {}, "source": [ "Last updated: 16 Feb 2023\n", "\n", "# 👋 PyCaret Binary Classification Tutorial\n", "\n", "PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that substantially speeds up the experiment cycle and makes you more productive.\n", "\n", "Compared with other open-source machine learning libraries, PyCaret is a low-code alternative that can replace hundreds of lines of code with only a few. This makes experiments much faster and more efficient. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, and a few more.\n", "\n", "The design and simplicity of PyCaret are inspired by the emerging role of citizen data scientists, a term first used by Gartner. Citizen data scientists are power users who can perform both simple and moderately sophisticated analytical tasks that would previously have required more technical expertise.\n" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8116e19d", "metadata": {}, "source": [ "# 💻 Installation\n", "\n", "PyCaret is tested and supported on the following 64-bit systems:\n", "- Python 3.7 – 3.10\n", "- Python 3.9 for Ubuntu only\n", "- Ubuntu 16.04 or later\n", "- Windows 7 or later\n", "\n", "You can install PyCaret with Python's pip package manager:\n", "\n", "`pip install pycaret`\n", "\n", "PyCaret's default installation does not install all the extra dependencies automatically. 
To get them, install the full version:\n", "\n", "`pip install pycaret[full]`\n", "\n", "Or, depending on your use case, you may install one of the following variants:\n", "\n", "- `pip install pycaret[analysis]`\n", "- `pip install pycaret[models]`\n", "- `pip install pycaret[tuner]`\n", "- `pip install pycaret[mlops]`\n", "- `pip install pycaret[parallel]`\n", "- `pip install pycaret[test]`" ] }, { "cell_type": "code", "execution_count": 1, "id": "d7142a33", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'3.0.0'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check installed version\n", "import pycaret\n", "pycaret.__version__" ] }, { "attachments": {}, "cell_type": "markdown", "id": "fb66e98d", "metadata": {}, "source": [ "# 🚀 Quick start" ] }, { "attachments": {}, "cell_type": "markdown", "id": "00347d44", "metadata": {}, "source": [ "PyCaret’s Classification Module is a supervised machine learning module that is used for classifying elements into groups. The goal is to predict categorical class labels, which are discrete and unordered. \n", "\n", "Some common use cases include predicting customer default (Yes or No), predicting customer churn (customer will leave or stay), and predicting whether a disease is present (positive or negative). \n", "\n", "This module can be used for binary or multiclass problems. Through the `setup` function, it provides several preprocessing features that prepare the data for modeling. It has over 18 ready-to-use algorithms and several plots to analyze the performance of trained models.\n", "\n", "A typical workflow in PyCaret consists of the following 5 steps, in this order:\n", "\n", "## **Setup** ➡️ **Compare Models** ➡️ **Analyze Model** ➡️ **Prediction** ➡️ **Save Model**" ] }, { "cell_type": "code", "execution_count": 2, "id": "956dfdab", "metadata": {}, "outputs": [ { "data": { "text/html": [ "

| | Number of times pregnant | Plasma glucose concentration a 2 hours in an oral glucose tolerance test | Diastolic blood pressure (mm Hg) | Triceps skin fold thickness (mm) | 2-Hour serum insulin (mu U/ml) | Body mass index (weight in kg/(height in m)^2) | Diabetes pedigree function | Age (years) | Class variable |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |

| | Description | Value |
|---|---|---|
| 0 | Session id | 123 |
| 1 | Target | Class variable |
| 2 | Target type | Binary |
| 3 | Original data shape | (768, 9) |
| 4 | Transformed data shape | (768, 9) |
| 5 | Transformed train set shape | (537, 9) |
| 6 | Transformed test set shape | (231, 9) |
| 7 | Numeric features | 8 |
| 8 | Preprocess | True |
| 9 | Imputation type | simple |
| 10 | Numeric imputation | mean |
| 11 | Categorical imputation | mode |
| 12 | Fold Generator | StratifiedKFold |
| 13 | Fold Number | 10 |
| 14 | CPU Jobs | -1 |
| 15 | Use GPU | False |
| 16 | Log Experiment | False |
| 17 | Experiment Name | clf-default-name |
| 18 | USI | 52db |

| | Description | Value |
|---|---|---|
| 0 | Session id | 123 |
| 1 | Target | Class variable |
| 2 | Target type | Binary |
| 3 | Original data shape | (768, 9) |
| 4 | Transformed data shape | (768, 9) |
| 5 | Transformed train set shape | (537, 9) |
| 6 | Transformed test set shape | (231, 9) |
| 7 | Numeric features | 8 |
| 8 | Preprocess | True |
| 9 | Imputation type | simple |
| 10 | Numeric imputation | mean |
| 11 | Categorical imputation | mode |
| 12 | Fold Generator | StratifiedKFold |
| 13 | Fold Number | 10 |
| 14 | CPU Jobs | -1 |
| 15 | Use GPU | False |
| 16 | Log Experiment | False |
| 17 | Experiment Name | clf-default-name |
| 18 | USI | 0071 |

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| lr | Logistic Regression | 0.7689 | 0.8047 | 0.5602 | 0.7208 | 0.6279 | 0.4641 | 0.4736 | 1.3810 |
| ridge | Ridge Classifier | 0.7670 | 0.0000 | 0.5497 | 0.7235 | 0.6221 | 0.4581 | 0.4690 | 0.0370 |
| lda | Linear Discriminant Analysis | 0.7670 | 0.8055 | 0.5550 | 0.7202 | 0.6243 | 0.4594 | 0.4695 | 0.0500 |
| rf | Random Forest Classifier | 0.7485 | 0.7911 | 0.5284 | 0.6811 | 0.5924 | 0.4150 | 0.4238 | 0.1940 |
| nb | Naive Bayes | 0.7427 | 0.7955 | 0.5702 | 0.6543 | 0.6043 | 0.4156 | 0.4215 | 0.0400 |
| catboost | CatBoost Classifier | 0.7410 | 0.7993 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.0890 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7918 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.0770 |
| ada | Ada Boost Classifier | 0.7372 | 0.7799 | 0.5275 | 0.6585 | 0.5796 | 0.3926 | 0.4017 | 0.0870 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.1280 |
| qda | Quadratic Discriminant Analysis | 0.7282 | 0.7894 | 0.5281 | 0.6558 | 0.5736 | 0.3785 | 0.3910 | 0.0510 |
| lightgbm | Light Gradient Boosting Machine | 0.7133 | 0.7645 | 0.5398 | 0.6036 | 0.5650 | 0.3534 | 0.3580 | 0.2440 |
| knn | K Neighbors Classifier | 0.7001 | 0.7164 | 0.5020 | 0.5982 | 0.5413 | 0.3209 | 0.3271 | 0.0570 |
| dt | Decision Tree Classifier | 0.6928 | 0.6512 | 0.5137 | 0.5636 | 0.5328 | 0.3070 | 0.3098 | 0.0460 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7516 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.0520 |
| dummy | Dummy Classifier | 0.6518 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0380 |
| svm | SVM - Linear Kernel | 0.5954 | 0.0000 | 0.3395 | 0.4090 | 0.2671 | 0.0720 | 0.0912 | 0.0410 |

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| lr | Logistic Regression | 0.7689 | 0.8047 | 0.5602 | 0.7208 | 0.6279 | 0.4641 | 0.4736 | 0.0450 |
| ridge | Ridge Classifier | 0.7670 | 0.0000 | 0.5497 | 0.7235 | 0.6221 | 0.4581 | 0.4690 | 0.0330 |
| lda | Linear Discriminant Analysis | 0.7670 | 0.8055 | 0.5550 | 0.7202 | 0.6243 | 0.4594 | 0.4695 | 0.0370 |
| rf | Random Forest Classifier | 0.7485 | 0.7911 | 0.5284 | 0.6811 | 0.5924 | 0.4150 | 0.4238 | 0.1320 |
| nb | Naive Bayes | 0.7427 | 0.7955 | 0.5702 | 0.6543 | 0.6043 | 0.4156 | 0.4215 | 0.0360 |
| catboost | CatBoost Classifier | 0.7410 | 0.7993 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.0340 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7918 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.0730 |
| ada | Ada Boost Classifier | 0.7372 | 0.7799 | 0.5275 | 0.6585 | 0.5796 | 0.3926 | 0.4017 | 0.0750 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.1320 |
| qda | Quadratic Discriminant Analysis | 0.7282 | 0.7894 | 0.5281 | 0.6558 | 0.5736 | 0.3785 | 0.3910 | 0.0380 |
| lightgbm | Light Gradient Boosting Machine | 0.7133 | 0.7645 | 0.5398 | 0.6036 | 0.5650 | 0.3534 | 0.3580 | 0.0390 |
| knn | K Neighbors Classifier | 0.7001 | 0.7164 | 0.5020 | 0.5982 | 0.5413 | 0.3209 | 0.3271 | 0.0490 |
| dt | Decision Tree Classifier | 0.6928 | 0.6512 | 0.5137 | 0.5636 | 0.5328 | 0.3070 | 0.3098 | 0.0390 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7516 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.0440 |
| dummy | Dummy Classifier | 0.6518 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0330 |
| svm | SVM - Linear Kernel | 0.5954 | 0.0000 | 0.3395 | 0.4090 | 0.2671 | 0.0720 | 0.0912 | 0.0310 |

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=1000,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=123, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|---|
| 0 | Logistic Regression | 0.7576 | 0.8568 | 0.5309 | 0.7049 | 0.6056 | 0.4356 | 0.4447 |

| | Number of times pregnant | Plasma glucose concentration a 2 hours in an oral glucose tolerance test | Diastolic blood pressure (mm Hg) | Triceps skin fold thickness (mm) | 2-Hour serum insulin (mu U/ml) | Body mass index (weight in kg/(height in m)^2) | Diabetes pedigree function | Age (years) | Class variable | prediction_label | prediction_score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 537 | 6 | 114 | 88 | 0 | 0 | 27.799999 | 0.247 | 66 | 0 | 0 | 0.8037 |
| 538 | 1 | 97 | 70 | 15 | 0 | 18.200001 | 0.147 | 21 | 0 | 0 | 0.9648 |
| 539 | 2 | 90 | 70 | 17 | 0 | 27.299999 | 0.085 | 22 | 0 | 0 | 0.9393 |
| 540 | 2 | 105 | 58 | 40 | 94 | 34.900002 | 0.225 | 25 | 0 | 0 | 0.7998 |
| 541 | 11 | 138 | 76 | 0 | 0 | 33.200001 | 0.420 | 35 | 0 | 1 | 0.6391 |

| | Number of times pregnant | Plasma glucose concentration a 2 hours in an oral glucose tolerance test | Diastolic blood pressure (mm Hg) | Triceps skin fold thickness (mm) | 2-Hour serum insulin (mu U/ml) | Body mass index (weight in kg/(height in m)^2) | Diabetes pedigree function | Age (years) |
|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 |

| | Number of times pregnant | Plasma glucose concentration a 2 hours in an oral glucose tolerance test | Diastolic blood pressure (mm Hg) | Triceps skin fold thickness (mm) | 2-Hour serum insulin (mu U/ml) | Body mass index (weight in kg/(height in m)^2) | Diabetes pedigree function | Age (years) | prediction_label | prediction_score |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.599998 | 0.627 | 50 | 1 | 0.6939 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.600000 | 0.351 | 31 | 0 | 0.9419 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.299999 | 0.672 | 32 | 1 | 0.7975 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.100000 | 0.167 | 21 | 0 | 0.9453 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.099998 | 2.288 | 33 | 1 | 0.8393 |

Pipeline(memory=FastMemory(location=C:\Users\owner\AppData\Local\Temp\joblib),
         steps=[('clean_column_names',
                 TransformerWrapper(exclude=None, include=None,
                                    transformer=CleanColumnNames(match='[\\]\\[\\,\\{\\}\\"\\:]+'))),
                ('numerical_imputer',
                 TransformerWrapper(exclude=None,
                                    include=['Number of times pregnant',
                                             'Plasma glucose concentration a 2 '
                                             'hours in an oral glu...
                                    fill_value=None,
                                    missing_values=nan,
                                    strategy='most_frequent',
                                    verbose='deprecated'))),
                ('trained_model',
                 LogisticRegression(C=1.0, class_weight=None, dual=False,
                                    fit_intercept=True, intercept_scaling=1,
                                    l1_ratio=None, max_iter=1000,
                                    multi_class='auto', n_jobs=None,
                                    penalty='l2', random_state=123,
                                    solver='lbfgs', tol=0.0001, verbose=0,
                                    warm_start=False))],
         verbose=False)

TransformerWrapper(exclude=None, include=None,
                   transformer=CleanColumnNames(match='[\\]\\[\\,\\{\\}\\"\\:]+'))

CleanColumnNames()

TransformerWrapper(exclude=None,
                   include=['Number of times pregnant',
                            'Plasma glucose concentration a 2 hours in an oral '
                            'glucose tolerance test',
                            'Diastolic blood pressure (mm Hg)',
                            'Triceps skin fold thickness (mm)',
                            '2-Hour serum insulin (mu U/ml)',
                            'Body mass index (weight in kg/(height in m)^2)',
                            'Diabetes pedigree function', 'Age (years)'],
                   transformer=SimpleImputer(add_indicator=False, copy=True,
                                             fill_value=None,
                                             missing_values=nan,
                                             strategy='mean',
                                             verbose='deprecated'))

SimpleImputer()

TransformerWrapper(exclude=None, include=[],
                   transformer=SimpleImputer(add_indicator=False, copy=True,
                                             fill_value=None,
                                             missing_values=nan,
                                             strategy='most_frequent',
                                             verbose='deprecated'))

SimpleImputer(strategy='most_frequent')

LogisticRegression(max_iter=1000, random_state=123)

| | Description | Value |
|---|---|---|
| 0 | Session id | 123 |
| 1 | Target | Class variable |
| 2 | Target type | Binary |
| 3 | Original data shape | (768, 9) |
| 4 | Transformed data shape | (768, 9) |
| 5 | Transformed train set shape | (537, 9) |
| 6 | Transformed test set shape | (231, 9) |
| 7 | Numeric features | 8 |
| 8 | Preprocess | True |
| 9 | Imputation type | simple |
| 10 | Numeric imputation | mean |
| 11 | Categorical imputation | mode |
| 12 | Fold Generator | StratifiedKFold |
| 13 | Fold Number | 10 |
| 14 | CPU Jobs | -1 |
| 15 | Use GPU | False |
| 16 | Log Experiment | False |
| 17 | Experiment Name | clf-default-name |
| 18 | USI | 038a |

| | Number of times pregnant | Plasma glucose concentration a 2 hours in an oral glucose tolerance test | Diastolic blood pressure (mm Hg) | Triceps skin fold thickness (mm) | 2-Hour serum insulin (mu U/ml) | Body mass index (weight in kg/(height in m)^2) | Diabetes pedigree function | Age (years) |
|---|---|---|---|---|---|---|---|---|
| 0 | 13.0 | 152.0 | 90.0 | 33.0 | 29.0 | 26.799999 | 0.731 | 43.0 |
| 1 | 0.0 | 104.0 | 64.0 | 37.0 | 64.0 | 33.599998 | 0.510 | 22.0 |
| 2 | 5.0 | 137.0 | 108.0 | 0.0 | 0.0 | 48.799999 | 0.227 | 37.0 |
| 3 | 0.0 | 111.0 | 65.0 | 0.0 | 0.0 | 24.600000 | 0.660 | 31.0 |
| 4 | 6.0 | 105.0 | 70.0 | 32.0 | 68.0 | 30.799999 | 0.122 | 37.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 532 | 10.0 | 179.0 | 70.0 | 0.0 | 0.0 | 35.099998 | 0.200 | 37.0 |
| 533 | 0.0 | 100.0 | 88.0 | 60.0 | 110.0 | 46.799999 | 0.962 | 31.0 |
| 534 | 1.0 | 89.0 | 76.0 | 34.0 | 37.0 | 31.200001 | 0.192 | 23.0 |
| 535 | 1.0 | 121.0 | 78.0 | 39.0 | 74.0 | 39.000000 | 0.261 | 28.0 |
| 536 | 0.0 | 140.0 | 65.0 | 26.0 | 130.0 | 42.599998 | 0.431 | 24.0 |

537 rows × 8 columns

| | Description | Value |
|---|---|---|
| 0 | Session id | 123 |
| 1 | Target | Class variable |
| 2 | Target type | Binary |
| 3 | Original data shape | (768, 9) |
| 4 | Transformed data shape | (768, 9) |
| 5 | Transformed train set shape | (537, 9) |
| 6 | Transformed test set shape | (231, 9) |
| 7 | Numeric features | 8 |
| 8 | Preprocess | True |
| 9 | Imputation type | simple |
| 10 | Numeric imputation | mean |
| 11 | Categorical imputation | mode |
| 12 | Normalize | True |
| 13 | Normalize method | minmax |
| 14 | Fold Generator | StratifiedKFold |
| 15 | Fold Number | 10 |
| 16 | CPU Jobs | -1 |
| 17 | Use GPU | False |
| 18 | Log Experiment | False |
| 19 | Experiment Name | clf-default-name |
| 20 | USI | f18d |

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| ridge | Ridge Classifier | 0.7708 | 0.0000 | 0.5392 | 0.7353 | 0.6203 | 0.4618 | 0.4744 | 0.0340 |
| lr | Logistic Regression | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 | 0.0360 |
| lda | Linear Discriminant Analysis | 0.7670 | 0.8055 | 0.5550 | 0.7202 | 0.6243 | 0.4594 | 0.4695 | 0.0340 |
| svm | SVM - Linear Kernel | 0.7521 | 0.0000 | 0.5070 | 0.7363 | 0.5796 | 0.4154 | 0.4398 | 0.0340 |
| rf | Random Forest Classifier | 0.7485 | 0.7917 | 0.5336 | 0.6784 | 0.5946 | 0.4164 | 0.4245 | 0.1340 |
| nb | Naive Bayes | 0.7427 | 0.7957 | 0.5702 | 0.6543 | 0.6043 | 0.4156 | 0.4215 | 0.0390 |
| catboost | CatBoost Classifier | 0.7410 | 0.7994 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.0430 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7920 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.0730 |
| ada | Ada Boost Classifier | 0.7372 | 0.7799 | 0.5275 | 0.6585 | 0.5796 | 0.3926 | 0.4017 | 0.0690 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.1330 |
| qda | Quadratic Discriminant Analysis | 0.7282 | 0.7894 | 0.5281 | 0.6558 | 0.5736 | 0.3785 | 0.3910 | 0.0360 |
| lightgbm | Light Gradient Boosting Machine | 0.7113 | 0.7653 | 0.5181 | 0.6036 | 0.5533 | 0.3427 | 0.3479 | 0.0480 |
| knn | K Neighbors Classifier | 0.7002 | 0.7433 | 0.4860 | 0.5965 | 0.5311 | 0.3142 | 0.3210 | 0.0570 |
| dt | Decision Tree Classifier | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 | 0.0380 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7522 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.0390 |
| dummy | Dummy Classifier | 0.6518 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0380 |

| ID | Name | Reference | Turbo |
|---|---|---|---|
| lr | Logistic Regression | sklearn.linear_model._logistic.LogisticRegression | True |
| knn | K Neighbors Classifier | sklearn.neighbors._classification.KNeighborsCl... | True |
| nb | Naive Bayes | sklearn.naive_bayes.GaussianNB | True |
| dt | Decision Tree Classifier | sklearn.tree._classes.DecisionTreeClassifier | True |
| svm | SVM - Linear Kernel | sklearn.linear_model._stochastic_gradient.SGDC... | True |
| rbfsvm | SVM - Radial Kernel | sklearn.svm._classes.SVC | False |
| gpc | Gaussian Process Classifier | sklearn.gaussian_process._gpc.GaussianProcessC... | False |
| mlp | MLP Classifier | sklearn.neural_network._multilayer_perceptron.... | False |
| ridge | Ridge Classifier | sklearn.linear_model._ridge.RidgeClassifier | True |
| rf | Random Forest Classifier | sklearn.ensemble._forest.RandomForestClassifier | True |
| qda | Quadratic Discriminant Analysis | sklearn.discriminant_analysis.QuadraticDiscrim... | True |
| ada | Ada Boost Classifier | sklearn.ensemble._weight_boosting.AdaBoostClas... | True |
| gbc | Gradient Boosting Classifier | sklearn.ensemble._gb.GradientBoostingClassifier | True |
| lda | Linear Discriminant Analysis | sklearn.discriminant_analysis.LinearDiscrimina... | True |
| et | Extra Trees Classifier | sklearn.ensemble._forest.ExtraTreesClassifier | True |
| xgboost | Extreme Gradient Boosting | xgboost.sklearn.XGBClassifier | True |
| lightgbm | Light Gradient Boosting Machine | lightgbm.sklearn.LGBMClassifier | True |
| catboost | CatBoost Classifier | catboost.core.CatBoostClassifier | True |
| dummy | Dummy Classifier | sklearn.dummy.DummyClassifier | True |

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| rf | Random Forest Classifier | 0.7485 | 0.7917 | 0.5336 | 0.6784 | 0.5946 | 0.4164 | 0.4245 | 0.1200 |
| catboost | CatBoost Classifier | 0.7410 | 0.7994 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.0410 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7920 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.0780 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.1300 |
| lightgbm | Light Gradient Boosting Machine | 0.7113 | 0.7653 | 0.5181 | 0.6036 | 0.5533 | 0.3427 | 0.3479 | 0.0460 |
| dt | Decision Tree Classifier | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 | 0.0360 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7522 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.0420 |

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=None, max_features='sqrt',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_samples_leaf=1,
                       min_samples_split=2, min_weight_fraction_leaf=0.0,
                       n_estimators=100, n_jobs=-1, oob_score=False,
                       random_state=123, verbose=0, warm_start=False)

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| rf | Random Forest Classifier | 0.7485 | 0.7917 | 0.5336 | 0.6784 | 0.5946 | 0.4164 | 0.4245 | 0.120 |
| catboost | CatBoost Classifier | 0.7410 | 0.7994 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.041 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7920 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.078 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.130 |
| lightgbm | Light Gradient Boosting Machine | 0.7113 | 0.7653 | 0.5181 | 0.6036 | 0.5533 | 0.3427 | 0.3479 | 0.046 |
| dt | Decision Tree Classifier | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 | 0.036 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7522 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.042 |
| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|
| nb | Naive Bayes | 0.7427 | 0.7957 | 0.5702 | 0.6543 | 0.6043 | 0.4156 | 0.4215 | 0.0430 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7920 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 0.0710 |
| lda | Linear Discriminant Analysis | 0.7670 | 0.8055 | 0.5550 | 0.7202 | 0.6243 | 0.4594 | 0.4695 | 0.0330 |
| ridge | Ridge Classifier | 0.7708 | 0.0000 | 0.5392 | 0.7353 | 0.6203 | 0.4618 | 0.4744 | 0.0340 |
| rf | Random Forest Classifier | 0.7485 | 0.7917 | 0.5336 | 0.6784 | 0.5946 | 0.4164 | 0.4245 | 0.1190 |
| qda | Quadratic Discriminant Analysis | 0.7282 | 0.7894 | 0.5281 | 0.6558 | 0.5736 | 0.3785 | 0.3910 | 0.0370 |
| catboost | CatBoost Classifier | 0.7410 | 0.7994 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 0.0400 |
| ada | Ada Boost Classifier | 0.7372 | 0.7799 | 0.5275 | 0.6585 | 0.5796 | 0.3926 | 0.4017 | 0.0670 |
| lightgbm | Light Gradient Boosting Machine | 0.7113 | 0.7653 | 0.5181 | 0.6036 | 0.5533 | 0.3427 | 0.3479 | 0.0450 |
| dt | Decision Tree Classifier | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 | 0.0340 |
| svm | SVM - Linear Kernel | 0.7521 | 0.0000 | 0.5070 | 0.7363 | 0.5796 | 0.4154 | 0.4398 | 0.0340 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 0.1290 |
| lr | Logistic Regression | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 | 0.0410 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7522 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 0.0390 |
| knn | K Neighbors Classifier | 0.7002 | 0.7433 | 0.4860 | 0.5965 | 0.5311 | 0.3142 | 0.3210 | 0.0570 |
| dummy | Dummy Classifier | 0.6518 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0550 |

| ID | Name | Display Name | Score Function | Scorer | Target | Args | Greater is Better | Multiclass | Custom |
|---|---|---|---|---|---|---|---|---|---|
| acc | Accuracy | Accuracy | `<function accuracy_score at 0x000002E242711280>` | accuracy | pred | {} | True | True | False |
| auc | AUC | AUC | `<function roc_auc_score at 0x000002E24270B0D0>` | make_scorer(roc_auc_score, needs_proba=True, e... | pred_proba | {'average': 'weighted', 'multi_class': 'ovr'} | True | True | False |
| recall | Recall | Recall | `<pycaret.internal.metrics.BinaryMulticlassScor...` | make_scorer(recall_score, average=weighted) | pred | {'average': 'weighted'} | True | True | False |
| precision | Precision | Prec. | `<pycaret.internal.metrics.BinaryMulticlassScor...` | make_scorer(precision_score, average=weighted) | pred | {'average': 'weighted'} | True | True | False |
| f1 | F1 | F1 | `<pycaret.internal.metrics.BinaryMulticlassScor...` | make_scorer(f1_score, average=weighted) | pred | {'average': 'weighted'} | True | True | False |
| kappa | Kappa | Kappa | `<function cohen_kappa_score at 0x000002E242711...` | make_scorer(cohen_kappa_score) | pred | {} | True | True | False |
| mcc | MCC | MCC | `<function matthews_corrcoef at 0x000002E242711...` | make_scorer(matthews_corrcoef) | pred | {} | True | True | False |

| | Model | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC | Custom Metric | TT (Sec) |
|---|---|---|---|---|---|---|---|---|---|---|
| ridge | Ridge Classifier | 0.7708 | 0.0000 | 0.5392 | 0.7353 | 0.6203 | 0.4618 | 0.4744 | 991.5000 | 0.0340 |
| lr | Logistic Regression | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 | 915.0000 | 0.0390 |
| lda | Linear Discriminant Analysis | 0.7670 | 0.8055 | 0.5550 | 0.7202 | 0.6243 | 0.4594 | 0.4695 | 1019.0000 | 0.0470 |
| svm | SVM - Linear Kernel | 0.7521 | 0.0000 | 0.5070 | 0.7363 | 0.5796 | 0.4154 | 0.4398 | 929.5000 | 0.0330 |
| rf | Random Forest Classifier | 0.7485 | 0.7917 | 0.5336 | 0.6784 | 0.5946 | 0.4164 | 0.4245 | 976.0000 | 0.1230 |
| nb | Naive Bayes | 0.7427 | 0.7957 | 0.5702 | 0.6543 | 0.6043 | 0.4156 | 0.4215 | 1041.0000 | 0.0410 |
| catboost | CatBoost Classifier | 0.7410 | 0.7994 | 0.5278 | 0.6630 | 0.5851 | 0.4005 | 0.4078 | 964.5000 | 0.0400 |
| gbc | Gradient Boosting Classifier | 0.7373 | 0.7920 | 0.5550 | 0.6445 | 0.5931 | 0.4013 | 0.4059 | 1011.0000 | 0.0720 |
| ada | Ada Boost Classifier | 0.7372 | 0.7799 | 0.5275 | 0.6585 | 0.5796 | 0.3926 | 0.4017 | 963.5000 | 0.0690 |
| et | Extra Trees Classifier | 0.7299 | 0.7788 | 0.4965 | 0.6516 | 0.5596 | 0.3706 | 0.3802 | 904.5000 | 0.1370 |
| qda | Quadratic Discriminant Analysis | 0.7282 | 0.7894 | 0.5281 | 0.6558 | 0.5736 | 0.3785 | 0.3910 | 961.0000 | 0.0360 |
| lightgbm | Light Gradient Boosting Machine | 0.7113 | 0.7653 | 0.5181 | 0.6036 | 0.5533 | 0.3427 | 0.3479 | 937.5000 | 0.0450 |
| knn | K Neighbors Classifier | 0.7002 | 0.7433 | 0.4860 | 0.5965 | 0.5311 | 0.3142 | 0.3210 | 877.5000 | 0.0530 |
| dt | Decision Tree Classifier | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 | 923.5000 | 0.0330 |
| xgboost | Extreme Gradient Boosting | 0.6853 | 0.7522 | 0.4912 | 0.5620 | 0.5216 | 0.2887 | 0.2922 | 883.0000 | 0.0400 |
| dummy | Dummy Classifier | 0.6518 | 0.5000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0410 |

RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True, fit_intercept=True,
                max_iter=None, normalize='deprecated', positive=False,
                random_state=123, solver='auto', tol=0.001)

| ID | Name | Reference | Turbo |
|---|---|---|---|
| lr | Logistic Regression | sklearn.linear_model._logistic.LogisticRegression | True |
| knn | K Neighbors Classifier | sklearn.neighbors._classification.KNeighborsCl... | True |
| nb | Naive Bayes | sklearn.naive_bayes.GaussianNB | True |
| dt | Decision Tree Classifier | sklearn.tree._classes.DecisionTreeClassifier | True |
| svm | SVM - Linear Kernel | sklearn.linear_model._stochastic_gradient.SGDC... | True |
| rbfsvm | SVM - Radial Kernel | sklearn.svm._classes.SVC | False |
| gpc | Gaussian Process Classifier | sklearn.gaussian_process._gpc.GaussianProcessC... | False |
| mlp | MLP Classifier | sklearn.neural_network._multilayer_perceptron.... | False |
| ridge | Ridge Classifier | sklearn.linear_model._ridge.RidgeClassifier | True |
| rf | Random Forest Classifier | sklearn.ensemble._forest.RandomForestClassifier | True |
| qda | Quadratic Discriminant Analysis | sklearn.discriminant_analysis.QuadraticDiscrim... | True |
| ada | Ada Boost Classifier | sklearn.ensemble._weight_boosting.AdaBoostClas... | True |
| gbc | Gradient Boosting Classifier | sklearn.ensemble._gb.GradientBoostingClassifier | True |
| lda | Linear Discriminant Analysis | sklearn.discriminant_analysis.LinearDiscrimina... | True |
| et | Extra Trees Classifier | sklearn.ensemble._forest.ExtraTreesClassifier | True |
| xgboost | Extreme Gradient Boosting | xgboost.sklearn.XGBClassifier | True |
| lightgbm | Light Gradient Boosting Machine | lightgbm.sklearn.LGBMClassifier | True |
| catboost | CatBoost Classifier | catboost.core.CatBoostClassifier | True |
| dummy | Dummy Classifier | sklearn.dummy.DummyClassifier | True |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.8148 | 0.9023 | 0.5789 | 0.8462 | 0.6875 | 0.5624 | 0.5828 |
| 1 | 0.8333 | 0.7970 | 0.6316 | 0.8571 | 0.7273 | 0.6112 | 0.6260 |
| 2 | 0.8519 | 0.9383 | 0.6316 | 0.9231 | 0.7500 | 0.6499 | 0.6736 |
| 3 | 0.7222 | 0.7759 | 0.4211 | 0.6667 | 0.5161 | 0.3350 | 0.3524 |
| 4 | 0.8333 | 0.9083 | 0.5789 | 0.9167 | 0.7097 | 0.6010 | 0.6322 |
| 5 | 0.6852 | 0.6737 | 0.4211 | 0.5714 | 0.4848 | 0.2656 | 0.2720 |
| 6 | 0.7222 | 0.7820 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| 7 | 0.7547 | 0.8460 | 0.3333 | 0.8571 | 0.4800 | 0.3579 | 0.4263 |
| 8 | 0.7358 | 0.6952 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| 9 | 0.7358 | 0.7492 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| Mean | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 |
| Std | 0.0557 | 0.0857 | 0.0970 | 0.1236 | 0.1024 | 0.1353 | 0.1379 |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.8148 | 0.9023 | 0.5789 | 0.8462 | 0.6875 | 0.5624 | 0.5828 |
| 1 | 0.8333 | 0.7970 | 0.6316 | 0.8571 | 0.7273 | 0.6112 | 0.6260 |
| 2 | 0.8519 | 0.9383 | 0.6316 | 0.9231 | 0.7500 | 0.6499 | 0.6736 |
| 3 | 0.7222 | 0.7759 | 0.4211 | 0.6667 | 0.5161 | 0.3350 | 0.3524 |
| 4 | 0.8333 | 0.9083 | 0.5789 | 0.9167 | 0.7097 | 0.6010 | 0.6322 |
| 5 | 0.6852 | 0.6737 | 0.4211 | 0.5714 | 0.4848 | 0.2656 | 0.2720 |
| 6 | 0.7222 | 0.7820 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| 7 | 0.7547 | 0.8460 | 0.3333 | 0.8571 | 0.4800 | 0.3579 | 0.4263 |
| 8 | 0.7358 | 0.6952 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| 9 | 0.7358 | 0.7492 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| Mean | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 |
| Std | 0.0557 | 0.0857 | 0.0970 | 0.1236 | 0.1024 | 0.1353 | 0.1379 |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.8101 | 0.8526 | 0.5714 | 0.8372 | 0.6792 | 0.5510 | 0.5713 |
| 1 | 0.7486 | 0.7921 | 0.5000 | 0.6889 | 0.5794 | 0.4065 | 0.4172 |
| 2 | 0.7486 | 0.7804 | 0.4194 | 0.7429 | 0.5361 | 0.3815 | 0.4108 |
| Mean | 0.7691 | 0.8084 | 0.4969 | 0.7563 | 0.5983 | 0.4464 | 0.4664 |
| Std | 0.0290 | 0.0317 | 0.0621 | 0.0613 | 0.0599 | 0.0747 | 0.0742 |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.7963 | 0.8872 | 0.4737 | 0.9000 | 0.6207 | 0.4992 | 0.5472 |
| 1 | 0.8148 | 0.8030 | 0.5789 | 0.8462 | 0.6875 | 0.5624 | 0.5828 |
| 2 | 0.8519 | 0.9353 | 0.5789 | 1.0000 | 0.7333 | 0.6406 | 0.6865 |
| 3 | 0.7037 | 0.7684 | 0.3684 | 0.6364 | 0.4667 | 0.2812 | 0.3013 |
| 4 | 0.8519 | 0.9038 | 0.5789 | 1.0000 | 0.7333 | 0.6406 | 0.6865 |
| 5 | 0.6852 | 0.6737 | 0.4211 | 0.5714 | 0.4848 | 0.2656 | 0.2720 |
| 6 | 0.7222 | 0.7624 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| 7 | 0.7547 | 0.8302 | 0.3333 | 0.8571 | 0.4800 | 0.3579 | 0.4263 |
| 8 | 0.7358 | 0.6952 | 0.3333 | 0.7500 | 0.4615 | 0.3193 | 0.3654 |
| 9 | 0.7547 | 0.7587 | 0.4444 | 0.7273 | 0.5517 | 0.3961 | 0.4189 |
| Mean | 0.7671 | 0.8018 | 0.4585 | 0.7931 | 0.5765 | 0.4315 | 0.4647 |
| Std | 0.0561 | 0.0828 | 0.0922 | 0.1437 | 0.1039 | 0.1360 | 0.1440 |

LogisticRegression(C=0.5, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=0.15, max_iter=1000,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=123, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)

| Split | Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|---|
| CV-Train | 0 | 0.7660 | 0.8146 | 0.5000 | 0.7434 | 0.5979 | 0.4417 | 0.4589 |
| | 1 | 0.7764 | 0.8259 | 0.5000 | 0.7778 | 0.6087 | 0.4623 | 0.4845 |
| | 2 | 0.7702 | 0.8138 | 0.5000 | 0.7568 | 0.6022 | 0.4499 | 0.4690 |
| | 3 | 0.7909 | 0.8296 | 0.5417 | 0.7913 | 0.6431 | 0.5025 | 0.5205 |
| | 4 | 0.7764 | 0.8142 | 0.5060 | 0.7727 | 0.6115 | 0.4640 | 0.4845 |
| | 5 | 0.7888 | 0.8403 | 0.5417 | 0.7845 | 0.6408 | 0.4983 | 0.5154 |
| | 6 | 0.7826 | 0.8242 | 0.5238 | 0.7788 | 0.6263 | 0.4812 | 0.5000 |
| | 7 | 0.7748 | 0.8185 | 0.5148 | 0.7632 | 0.6148 | 0.4641 | 0.4820 |
| | 8 | 0.7810 | 0.8387 | 0.5266 | 0.7739 | 0.6268 | 0.4796 | 0.4974 |
| | 9 | 0.7851 | 0.8340 | 0.5266 | 0.7876 | 0.6312 | 0.4879 | 0.5076 |
| CV-Val | 0 | 0.8148 | 0.9023 | 0.5789 | 0.8462 | 0.6875 | 0.5624 | 0.5828 |
| | 1 | 0.8333 | 0.7970 | 0.6316 | 0.8571 | 0.7273 | 0.6112 | 0.6260 |
| | 2 | 0.8519 | 0.9383 | 0.6316 | 0.9231 | 0.7500 | 0.6499 | 0.6736 |
| | 3 | 0.7222 | 0.7759 | 0.4211 | 0.6667 | 0.5161 | 0.3350 | 0.3524 |
| | 4 | 0.8333 | 0.9083 | 0.5789 | 0.9167 | 0.7097 | 0.6010 | 0.6322 |
| | 5 | 0.6852 | 0.6737 | 0.4211 | 0.5714 | 0.4848 | 0.2656 | 0.2720 |
| | 6 | 0.7222 | 0.7820 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| | 7 | 0.7547 | 0.8460 | 0.3333 | 0.8571 | 0.4800 | 0.3579 | 0.4263 |
| | 8 | 0.7358 | 0.6952 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| | 9 | 0.7358 | 0.7492 | 0.4444 | 0.6667 | 0.5333 | 0.3592 | 0.3736 |
| CV-Train | Mean | 0.7792 | 0.8254 | 0.5181 | 0.7730 | 0.6203 | 0.4731 | 0.4920 |
| | Std | 0.0075 | 0.0096 | 0.0156 | 0.0141 | 0.0149 | 0.0191 | 0.0188 |
| CV-Val | Mean | 0.7689 | 0.8068 | 0.4959 | 0.7614 | 0.5968 | 0.4453 | 0.4673 |
| | Std | 0.0557 | 0.0857 | 0.0970 | 0.1236 | 0.1024 | 0.1353 | 0.1379 |
| Train | nan | 0.7765 | 0.8248 | 0.5187 | 0.7638 | 0.6178 | 0.4680 | 0.4855 |

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=1000,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=123, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.7222 | 0.9023 | 0.2105 | 1.0000 | 0.3478 | 0.2569 | 0.3839 |
| 1 | 0.7407 | 0.7970 | 0.2632 | 1.0000 | 0.4167 | 0.3165 | 0.4336 |
| 2 | 0.7037 | 0.9383 | 0.1579 | 1.0000 | 0.2727 | 0.1955 | 0.3292 |
| 3 | 0.7037 | 0.7759 | 0.2105 | 0.8000 | 0.3333 | 0.2188 | 0.2998 |
| 4 | 0.7037 | 0.9083 | 0.1579 | 1.0000 | 0.2727 | 0.1955 | 0.3292 |
| 5 | 0.6852 | 0.6737 | 0.2105 | 0.6667 | 0.3200 | 0.1818 | 0.2331 |
| 6 | 0.7222 | 0.7820 | 0.3158 | 0.7500 | 0.4444 | 0.2981 | 0.3477 |
| 7 | 0.6981 | 0.8460 | 0.1111 | 1.0000 | 0.2000 | 0.1417 | 0.2761 |
| 8 | 0.7170 | 0.6952 | 0.2222 | 0.8000 | 0.3478 | 0.2348 | 0.3138 |
| 9 | 0.6981 | 0.7492 | 0.1667 | 0.7500 | 0.2727 | 0.1703 | 0.2476 |
| Mean | 0.7095 | 0.8068 | 0.2026 | 0.8767 | 0.3228 | 0.2210 | 0.3194 |
| Std | 0.0152 | 0.0857 | 0.0554 | 0.1281 | 0.0690 | 0.0531 | 0.0575 |

CustomProbabilityThresholdClassifier(C=1.0, class_weight=None,
                                     classifier=LogisticRegression(C=1.0,
                                                                   class_weight=None,
                                                                   dual=False,
                                                                   fit_intercept=True,
                                                                   intercept_scaling=1,
                                                                   l1_ratio=None,
                                                                   max_iter=1000,
                                                                   multi_class='auto',
                                                                   n_jobs=None,
                                                                   penalty='l2',
                                                                   random_state=123,
                                                                   solver='lbfgs',
                                                                   tol=0.0001,
                                                                   verbose=0,
                                                                   warm_start=False),
                                     dual=False, fit_intercept=True,
                                     intercept_scaling=1, l1_ratio=None,
                                     max_iter=1000, multi_class='auto',
                                     n_jobs=None, penalty='l2',
                                     probability_threshold=0.66,
                                     random_state=123, solver='lbfgs',
                                     tol=0.0001, verbose=0, warm_start=False)

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.7222 | 0.6774 | 0.5263 | 0.6250 | 0.5714 | 0.3682 | 0.3711 |
| 1 | 0.7222 | 0.7015 | 0.6316 | 0.6000 | 0.6154 | 0.3982 | 0.3985 |
| 2 | 0.7407 | 0.7038 | 0.5789 | 0.6471 | 0.6111 | 0.4176 | 0.4190 |
| 3 | 0.5926 | 0.5053 | 0.2105 | 0.3636 | 0.2667 | 0.0116 | 0.0125 |
| 4 | 0.7778 | 0.7684 | 0.7368 | 0.6667 | 0.7000 | 0.5242 | 0.5259 |
| 5 | 0.6296 | 0.5940 | 0.4737 | 0.4737 | 0.4737 | 0.1880 | 0.1880 |
| 6 | 0.6296 | 0.5699 | 0.3684 | 0.4667 | 0.4118 | 0.1469 | 0.1491 |
| 7 | 0.8302 | 0.7770 | 0.6111 | 0.8462 | 0.7097 | 0.5940 | 0.6098 |
| 8 | 0.6604 | 0.6079 | 0.4444 | 0.5000 | 0.4706 | 0.2219 | 0.2227 |
| 9 | 0.6415 | 0.6206 | 0.5556 | 0.4762 | 0.5128 | 0.2319 | 0.2336 |
| Mean | 0.6947 | 0.6526 | 0.5137 | 0.5665 | 0.5343 | 0.3103 | 0.3130 |
| Std | 0.0720 | 0.0834 | 0.1410 | 0.1310 | 0.1292 | 0.1714 | 0.1739 |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.8519 | 0.8135 | 0.6842 | 0.8667 | 0.7647 | 0.6588 | 0.6686 |
| 1 | 0.7593 | 0.6940 | 0.4737 | 0.7500 | 0.5806 | 0.4236 | 0.4456 |
| 2 | 0.7593 | 0.7782 | 0.8421 | 0.6154 | 0.7111 | 0.5132 | 0.5318 |
| 3 | 0.7037 | 0.6511 | 0.4737 | 0.6000 | 0.5294 | 0.3175 | 0.3223 |
| 4 | 0.8333 | 0.7632 | 0.5263 | 1.0000 | 0.6897 | 0.5902 | 0.6470 |
| 5 | 0.6296 | 0.5820 | 0.4211 | 0.4706 | 0.4444 | 0.1680 | 0.1685 |
| 6 | 0.7222 | 0.6654 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| 7 | 0.7358 | 0.6246 | 0.2778 | 0.8333 | 0.4167 | 0.2973 | 0.3725 |
| 8 | 0.6604 | 0.5675 | 0.2778 | 0.5000 | 0.3571 | 0.1512 | 0.1633 |
| 9 | 0.7170 | 0.6643 | 0.5000 | 0.6000 | 0.5455 | 0.3424 | 0.3454 |
| Mean | 0.7372 | 0.6804 | 0.4950 | 0.6879 | 0.5585 | 0.3814 | 0.4026 |
| Std | 0.0653 | 0.0782 | 0.1608 | 0.1611 | 0.1258 | 0.1587 | 0.1654 |

DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',
                       max_depth=None, max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_samples_leaf=1,
                       min_samples_split=2, min_weight_fraction_leaf=0.0,
                       random_state=123, splitter='best')

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.7593 | 0.8008 | 0.7368 | 0.6364 | 0.6829 | 0.4906 | 0.4940 |
| 1 | 0.6667 | 0.7444 | 0.5263 | 0.5263 | 0.5263 | 0.2692 | 0.2692 |
| 2 | 0.7593 | 0.8241 | 0.5263 | 0.7143 | 0.6061 | 0.4384 | 0.4490 |
| 3 | 0.6667 | 0.6293 | 0.4211 | 0.5333 | 0.4706 | 0.2322 | 0.2357 |
| 4 | 0.8333 | 0.8962 | 0.6842 | 0.8125 | 0.7429 | 0.6209 | 0.6259 |
| 5 | 0.6667 | 0.6534 | 0.5789 | 0.5238 | 0.5500 | 0.2863 | 0.2872 |
| 6 | 0.6296 | 0.6759 | 0.3158 | 0.4615 | 0.3750 | 0.1248 | 0.1293 |
| 7 | 0.7736 | 0.7698 | 0.6111 | 0.6875 | 0.6471 | 0.4812 | 0.4830 |
| 8 | 0.6415 | 0.6817 | 0.4444 | 0.4706 | 0.4571 | 0.1899 | 0.1900 |
| 9 | 0.7547 | 0.7437 | 0.6111 | 0.6471 | 0.6286 | 0.4457 | 0.4461 |
| Mean | 0.7151 | 0.7419 | 0.5456 | 0.6013 | 0.5687 | 0.3579 | 0.3610 |
| Std | 0.0653 | 0.0796 | 0.1203 | 0.1100 | 0.1078 | 0.1508 | 0.1517 |

| Fold | Accuracy | AUC | Recall | Prec. | F1 | Kappa | MCC |
|---|---|---|---|---|---|---|---|
| 0 | 0.8519 | 0.8135 | 0.6842 | 0.8667 | 0.7647 | 0.6588 | 0.6686 |
| 1 | 0.7593 | 0.6940 | 0.4737 | 0.7500 | 0.5806 | 0.4236 | 0.4456 |
| 2 | 0.7593 | 0.7782 | 0.8421 | 0.6154 | 0.7111 | 0.5132 | 0.5318 |
| 3 | 0.7037 | 0.6511 | 0.4737 | 0.6000 | 0.5294 | 0.3175 | 0.3223 |
| 4 | 0.8333 | 0.7632 | 0.5263 | 1.0000 | 0.6897 | 0.5902 | 0.6470 |
| 5 | 0.6296 | 0.5820 | 0.4211 | 0.4706 | 0.4444 | 0.1680 | 0.1685 |
| 6 | 0.7222 | 0.6654 | 0.4737 | 0.6429 | 0.5455 | 0.3520 | 0.3605 |
| 7 | 0.7358 | 0.6246 | 0.2778 | 0.8333 | 0.4167 | 0.2973 | 0.3725 |
| 8 | 0.6604 | 0.5675 | 0.2778 | 0.5000 | 0.3571 | 0.1512 | 0.1633 |
| 9 | 0.7170 | 0.6643 | 0.5000 | 0.6000 | 0.5455 | 0.3424 | 0.3454 |
| Mean | 0.7372 | 0.6804 | 0.4950 | 0.6879 | 0.5585 | 0.3814 | 0.4026 |
| Std | 0.0653 | 0.0782 | 0.1608 | 0.1611 | 0.1258 | 0.1587 | 0.1654 |

DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',
                       max_depth=1, max_features=1.0, max_leaf_nodes=None,
                       min_impurity_decrease=0.01, min_samples_leaf=6,
                       min_samples_split=5, min_weight_fraction_leaf=0.0,
                       random_state=123, splitter='best')

RandomizedSearchCV(cv=StratifiedKFold(n_splits=10, random_state=None, shuffle=False),
                   error_score=nan,
                   estimator=Pipeline(memory=FastMemory(location=C:\Users\owner\AppData\Local\Temp\joblib),
                                      steps=[('clean_column_names',
                                              TransformerWrapper(exclude=None,
                                                                 include=None,
                                                                 transformer=CleanColumnNames(match='[\\]\\[\\,\\{\\}\\"\\:]+'))),
                                             ('numerical_imputer',
                                              Tra...
                                        'actual_estimator__max_features': [1.0,
                                                                           'sqrt',
                                                                           'log2'],
                                        'actual_estimator__min_impurity_decrease': [0,
                                                                                    0.0001,
                                                                                    0.001,
                                                                                    0.01,
                                                                                    0.0002,
                                                                                    0.002,
                                                                                    0.02,
                                                                                    0.0005,
                                                                                    0.005,
                                                                                    0.05,
                                                                                    0.1,
                                                                                    0.2,
                                                                                    0.3,
                                                                                    0.4,
                                                                                    0.5],
                                        'actual_estimator__min_samples_leaf': [2,
                                                                               3,
                                                                               4,
                                                                               5,
                                                                               6],
                                        'actual_estimator__min_samples_split': [2,
                                                                                5,
                                                                                7,
                                                                                9,
                                                                                10]},
                   pre_dispatch='2*n_jobs', random_state=123, refit=False,
                   return_train_score=False, scoring='accuracy', verbose=1)

Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ('actual_estimator',\n", " DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None,\n", " criterion='gini', max_depth=None,\n", " max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " random_state=123, splitter='best'))],\n", " verbose=False)
TransformerWrapper(transformer=CleanColumnNames())
CleanColumnNames()
CleanColumnNames()
TransformerWrapper(include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 hours in an oral '\n", " 'glucose tolerance test',\n", " 'Diastolic blood pressure (mm Hg)',\n", " 'Triceps skin fold thickness (mm)',\n", " '2-Hour serum insulin (mu U/ml)',\n", " 'Body mass index (weight in kg/(height in m)^2)',\n", " 'Diabetes pedigree function', 'Age (years)'],\n", " transformer=SimpleImputer())
SimpleImputer()
SimpleImputer()
TransformerWrapper(include=[],\n", " transformer=SimpleImputer(strategy='most_frequent'))
SimpleImputer(strategy='most_frequent')
SimpleImputer(strategy='most_frequent')
TransformerWrapper(transformer=MinMaxScaler())
MinMaxScaler()
MinMaxScaler()
DecisionTreeClassifier(random_state=123)
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7593 | \n", "0.7820 | \n", "0.5263 | \n", "0.7143 | \n", "0.6061 | \n", "0.4384 | \n", "0.4490 | \n", "
1 | \n", "0.7778 | \n", "0.7895 | \n", "0.6316 | \n", "0.7059 | \n", "0.6667 | \n", "0.5008 | \n", "0.5025 | \n", "
2 | \n", "0.7222 | \n", "0.7880 | \n", "0.3684 | \n", "0.7000 | \n", "0.4828 | \n", "0.3170 | \n", "0.3476 | \n", "
3 | \n", "0.6852 | \n", "0.5662 | \n", "0.4211 | \n", "0.5714 | \n", "0.4848 | \n", "0.2656 | \n", "0.2720 | \n", "
4 | \n", "0.7963 | \n", "0.8233 | \n", "0.6842 | \n", "0.7222 | \n", "0.7027 | \n", "0.5479 | \n", "0.5484 | \n", "
5 | \n", "0.6667 | \n", "0.6805 | \n", "0.5263 | \n", "0.5263 | \n", "0.5263 | \n", "0.2692 | \n", "0.2692 | \n", "
6 | \n", "0.6852 | \n", "0.6940 | \n", "0.4211 | \n", "0.5714 | \n", "0.4848 | \n", "0.2656 | \n", "0.2720 | \n", "
7 | \n", "0.8302 | \n", "0.8508 | \n", "0.6667 | \n", "0.8000 | \n", "0.7273 | \n", "0.6055 | \n", "0.6108 | \n", "
8 | \n", "0.6604 | \n", "0.6389 | \n", "0.6111 | \n", "0.5000 | \n", "0.5500 | \n", "0.2816 | \n", "0.2853 | \n", "
9 | \n", "0.6415 | \n", "0.6849 | \n", "0.4444 | \n", "0.4706 | \n", "0.4571 | \n", "0.1899 | \n", "0.1900 | \n", "
Mean | \n", "0.7225 | \n", "0.7298 | \n", "0.5301 | \n", "0.6282 | \n", "0.5689 | \n", "0.3681 | \n", "0.3747 | \n", "
Std | \n", "0.0615 | \n", "0.0859 | \n", "0.1080 | \n", "0.1073 | \n", "0.0949 | \n", "0.1356 | \n", "0.1352 | \n", "
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7407 | \n", "0.8383 | \n", "0.5263 | \n", "0.6667 | \n", "0.5882 | \n", "0.4028 | \n", "0.4088 | \n", "
1 | \n", "0.7963 | \n", "0.7797 | \n", "0.7368 | \n", "0.7000 | \n", "0.7179 | \n", "0.5587 | \n", "0.5591 | \n", "
2 | \n", "0.7593 | \n", "0.7669 | \n", "0.4737 | \n", "0.7500 | \n", "0.5806 | \n", "0.4236 | \n", "0.4456 | \n", "
3 | \n", "0.7222 | \n", "0.7842 | \n", "0.5263 | \n", "0.6250 | \n", "0.5714 | \n", "0.3682 | \n", "0.3711 | \n", "
4 | \n", "0.8148 | \n", "0.8421 | \n", "0.7368 | \n", "0.7368 | \n", "0.7368 | \n", "0.5940 | \n", "0.5940 | \n", "
5 | \n", "0.6852 | \n", "0.6759 | \n", "0.4211 | \n", "0.5714 | \n", "0.4848 | \n", "0.2656 | \n", "0.2720 | \n", "
6 | \n", "0.7037 | \n", "0.7677 | \n", "0.5263 | \n", "0.5882 | \n", "0.5556 | \n", "0.3344 | \n", "0.3355 | \n", "
7 | \n", "0.7925 | \n", "0.8405 | \n", "0.4444 | \n", "0.8889 | \n", "0.5926 | \n", "0.4734 | \n", "0.5245 | \n", "
8 | \n", "0.6792 | \n", "0.6659 | \n", "0.5000 | \n", "0.5294 | \n", "0.5143 | \n", "0.2751 | \n", "0.2754 | \n", "
9 | \n", "0.6792 | \n", "0.6508 | \n", "0.3333 | \n", "0.5455 | \n", "0.4138 | \n", "0.2103 | \n", "0.2224 | \n", "
Mean | \n", "0.7373 | \n", "0.7612 | \n", "0.5225 | \n", "0.6602 | \n", "0.5756 | \n", "0.3906 | \n", "0.4009 | \n", "
Std | \n", "0.0488 | \n", "0.0695 | \n", "0.1212 | \n", "0.1060 | \n", "0.0924 | \n", "0.1195 | \n", "0.1221 | \n", "
BaggingClassifier(base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,\n", " class_weight=None,\n", " criterion='gini',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " random_state=123,\n", " splitter='best'),\n", " bootstrap=True, bootstrap_features=False, max_features=1.0,\n", " max_samples=1.0, n_estimators=10, n_jobs=None,\n", " oob_score=False, random_state=123, verbose=0,\n", " warm_start=False)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
BaggingClassifier(base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,\n", " class_weight=None,\n", " criterion='gini',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " random_state=123,\n", " splitter='best'),\n", " bootstrap=True, bootstrap_features=False, max_features=1.0,\n", " max_samples=1.0, n_estimators=10, n_jobs=None,\n", " oob_score=False, random_state=123, verbose=0,\n", " warm_start=False)
DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',\n", " max_depth=None, max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0, min_samples_leaf=1,\n", " min_samples_split=2, min_weight_fraction_leaf=0.0,\n", " random_state=123, splitter='best')
DecisionTreeClassifier(random_state=123)
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7222 | \n", "0.6895 | \n", "0.5789 | \n", "0.6111 | \n", "0.5946 | \n", "0.3836 | \n", "0.3839 | \n", "
1 | \n", "0.7222 | \n", "0.6774 | \n", "0.5263 | \n", "0.6250 | \n", "0.5714 | \n", "0.3682 | \n", "0.3711 | \n", "
2 | \n", "0.7593 | \n", "0.7421 | \n", "0.6842 | \n", "0.6500 | \n", "0.6667 | \n", "0.4785 | \n", "0.4788 | \n", "
3 | \n", "0.6111 | \n", "0.5436 | \n", "0.3158 | \n", "0.4286 | \n", "0.3636 | \n", "0.0928 | \n", "0.0950 | \n", "
4 | \n", "0.8148 | \n", "0.8211 | \n", "0.8421 | \n", "0.6957 | \n", "0.7619 | \n", "0.6126 | \n", "0.6201 | \n", "
5 | \n", "0.5926 | \n", "0.5654 | \n", "0.4737 | \n", "0.4286 | \n", "0.4500 | \n", "0.1278 | \n", "0.1282 | \n", "
6 | \n", "0.6667 | \n", "0.6226 | \n", "0.4737 | \n", "0.5294 | \n", "0.5000 | \n", "0.2512 | \n", "0.2520 | \n", "
7 | \n", "0.7925 | \n", "0.7484 | \n", "0.6111 | \n", "0.7333 | \n", "0.6667 | \n", "0.5178 | \n", "0.5223 | \n", "
8 | \n", "0.6604 | \n", "0.6214 | \n", "0.5000 | \n", "0.5000 | \n", "0.5000 | \n", "0.2429 | \n", "0.2429 | \n", "
9 | \n", "0.6792 | \n", "0.6357 | \n", "0.5000 | \n", "0.5294 | \n", "0.5143 | \n", "0.2751 | \n", "0.2754 | \n", "
Mean | \n", "0.7021 | \n", "0.6667 | \n", "0.5506 | \n", "0.5731 | \n", "0.5589 | \n", "0.3350 | \n", "0.3370 | \n", "
Std | \n", "0.0698 | \n", "0.0820 | \n", "0.1342 | \n", "0.1008 | \n", "0.1117 | \n", "0.1598 | \n", "0.1613 | \n", "
AdaBoostClassifier(algorithm='SAMME.R',\n", " base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,\n", " class_weight=None,\n", " criterion='gini',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " random_state=123,\n", " splitter='best'),\n", " learning_rate=1.0, n_estimators=10, random_state=123)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
AdaBoostClassifier(algorithm='SAMME.R',\n", " base_estimator=DecisionTreeClassifier(ccp_alpha=0.0,\n", " class_weight=None,\n", " criterion='gini',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " random_state=123,\n", " splitter='best'),\n", " learning_rate=1.0, n_estimators=10, random_state=123)
DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',\n", " max_depth=None, max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0, min_samples_leaf=1,\n", " min_samples_split=2, min_weight_fraction_leaf=0.0,\n", " random_state=123, splitter='best')
DecisionTreeClassifier(random_state=123)
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7963 | \n", "0.8932 | \n", "0.6842 | \n", "0.7222 | \n", "0.7027 | \n", "0.5479 | \n", "0.5484 | \n", "
1 | \n", "0.7778 | \n", "0.8120 | \n", "0.6316 | \n", "0.7059 | \n", "0.6667 | \n", "0.5008 | \n", "0.5025 | \n", "
2 | \n", "0.8704 | \n", "0.9338 | \n", "0.6842 | \n", "0.9286 | \n", "0.7879 | \n", "0.6976 | \n", "0.7145 | \n", "
3 | \n", "0.7037 | \n", "0.7865 | \n", "0.4737 | \n", "0.6000 | \n", "0.5294 | \n", "0.3175 | \n", "0.3223 | \n", "
4 | \n", "0.8704 | \n", "0.8962 | \n", "0.6842 | \n", "0.9286 | \n", "0.7879 | \n", "0.6976 | \n", "0.7145 | \n", "
5 | \n", "0.7037 | \n", "0.6692 | \n", "0.4737 | \n", "0.6000 | \n", "0.5294 | \n", "0.3175 | \n", "0.3223 | \n", "
6 | \n", "0.7407 | \n", "0.7805 | \n", "0.6842 | \n", "0.6190 | \n", "0.6500 | \n", "0.4449 | \n", "0.4463 | \n", "
7 | \n", "0.7736 | \n", "0.8667 | \n", "0.4444 | \n", "0.8000 | \n", "0.5714 | \n", "0.4342 | \n", "0.4688 | \n", "
8 | \n", "0.6604 | \n", "0.6889 | \n", "0.4444 | \n", "0.5000 | \n", "0.4706 | \n", "0.2219 | \n", "0.2227 | \n", "
9 | \n", "0.6981 | \n", "0.7286 | \n", "0.4444 | \n", "0.5714 | \n", "0.5000 | \n", "0.2886 | \n", "0.2933 | \n", "
Mean | \n", "0.7595 | \n", "0.8056 | \n", "0.5649 | \n", "0.6976 | \n", "0.6196 | \n", "0.4469 | \n", "0.4555 | \n", "
Std | \n", "0.0683 | \n", "0.0868 | \n", "0.1103 | \n", "0.1407 | \n", "0.1104 | \n", "0.1575 | \n", "0.1616 | \n", "
VotingClassifier(estimators=[('Naive Bayes',\n", " GaussianNB(priors=None, var_smoothing=1e-09)),\n", " ('Gradient Boosting Classifier',\n", " GradientBoostingClassifier(ccp_alpha=0.0,\n", " criterion='friedman_mse',\n", " init=None,\n", " learning_rate=0.1,\n", " loss='log_loss',\n", " max_depth=3,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_wei...\n", " n_iter_no_change=None,\n", " random_state=123,\n", " subsample=1.0,\n", " tol=0.0001,\n", " validation_fraction=0.1,\n", " verbose=0,\n", " warm_start=False)),\n", " ('Linear Discriminant Analysis',\n", " LinearDiscriminantAnalysis(covariance_estimator=None,\n", " n_components=None,\n", " priors=None,\n", " shrinkage=None,\n", " solver='svd',\n", " store_covariance=False,\n", " tol=0.0001))],\n", " flatten_transform=True, n_jobs=-1, verbose=False,\n", " voting='soft', weights=None)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
VotingClassifier(estimators=[('Naive Bayes',\n", " GaussianNB(priors=None, var_smoothing=1e-09)),\n", " ('Gradient Boosting Classifier',\n", " GradientBoostingClassifier(ccp_alpha=0.0,\n", " criterion='friedman_mse',\n", " init=None,\n", " learning_rate=0.1,\n", " loss='log_loss',\n", " max_depth=3,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_wei...\n", " n_iter_no_change=None,\n", " random_state=123,\n", " subsample=1.0,\n", " tol=0.0001,\n", " validation_fraction=0.1,\n", " verbose=0,\n", " warm_start=False)),\n", " ('Linear Discriminant Analysis',\n", " LinearDiscriminantAnalysis(covariance_estimator=None,\n", " n_components=None,\n", " priors=None,\n", " shrinkage=None,\n", " solver='svd',\n", " store_covariance=False,\n", " tol=0.0001))],\n", " flatten_transform=True, n_jobs=-1, verbose=False,\n", " voting='soft', weights=None)
GaussianNB()
GradientBoostingClassifier(random_state=123)
LinearDiscriminantAnalysis()
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.8148 | \n", "0.9023 | \n", "0.6316 | \n", "0.8000 | \n", "0.7059 | \n", "0.5735 | \n", "0.5820 | \n", "
1 | \n", "0.7963 | \n", "0.7970 | \n", "0.6316 | \n", "0.7500 | \n", "0.6857 | \n", "0.5367 | \n", "0.5410 | \n", "
2 | \n", "0.8704 | \n", "0.9233 | \n", "0.6842 | \n", "0.9286 | \n", "0.7879 | \n", "0.6976 | \n", "0.7145 | \n", "
3 | \n", "0.7037 | \n", "0.7835 | \n", "0.4737 | \n", "0.6000 | \n", "0.5294 | \n", "0.3175 | \n", "0.3223 | \n", "
4 | \n", "0.8519 | \n", "0.8992 | \n", "0.6316 | \n", "0.9231 | \n", "0.7500 | \n", "0.6499 | \n", "0.6736 | \n", "
5 | \n", "0.6852 | \n", "0.6722 | \n", "0.4211 | \n", "0.5714 | \n", "0.4848 | \n", "0.2656 | \n", "0.2720 | \n", "
6 | \n", "0.7222 | \n", "0.7910 | \n", "0.5263 | \n", "0.6250 | \n", "0.5714 | \n", "0.3682 | \n", "0.3711 | \n", "
7 | \n", "0.7547 | \n", "0.8667 | \n", "0.3889 | \n", "0.7778 | \n", "0.5185 | \n", "0.3776 | \n", "0.4184 | \n", "
8 | \n", "0.6981 | \n", "0.6810 | \n", "0.4444 | \n", "0.5714 | \n", "0.5000 | \n", "0.2886 | \n", "0.2933 | \n", "
9 | \n", "0.7358 | \n", "0.7190 | \n", "0.5000 | \n", "0.6429 | \n", "0.5625 | \n", "0.3775 | \n", "0.3836 | \n", "
Mean | \n", "0.7633 | \n", "0.8035 | \n", "0.5333 | \n", "0.7190 | \n", "0.6096 | \n", "0.4453 | \n", "0.4572 | \n", "
Std | \n", "0.0628 | \n", "0.0879 | \n", "0.0989 | \n", "0.1300 | \n", "0.1061 | \n", "0.1479 | \n", "0.1514 | \n", "
StackingClassifier(cv=5,\n", " estimators=[('Naive Bayes',\n", " GaussianNB(priors=None, var_smoothing=1e-09)),\n", " ('Gradient Boosting Classifier',\n", " GradientBoostingClassifier(ccp_alpha=0.0,\n", " criterion='friedman_mse',\n", " init=None,\n", " learning_rate=0.1,\n", " loss='log_loss',\n", " max_depth=3,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,...\n", " solver='svd',\n", " store_covariance=False,\n", " tol=0.0001))],\n", " final_estimator=LogisticRegression(C=1.0, class_weight=None,\n", " dual=False,\n", " fit_intercept=True,\n", " intercept_scaling=1,\n", " l1_ratio=None,\n", " max_iter=1000,\n", " multi_class='auto',\n", " n_jobs=None, penalty='l2',\n", " random_state=123,\n", " solver='lbfgs',\n", " tol=0.0001, verbose=0,\n", " warm_start=False),\n", " n_jobs=-1, passthrough=True, stack_method='auto', verbose=0)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
StackingClassifier(cv=5,\n", " estimators=[('Naive Bayes',\n", " GaussianNB(priors=None, var_smoothing=1e-09)),\n", " ('Gradient Boosting Classifier',\n", " GradientBoostingClassifier(ccp_alpha=0.0,\n", " criterion='friedman_mse',\n", " init=None,\n", " learning_rate=0.1,\n", " loss='log_loss',\n", " max_depth=3,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_samples_leaf=1,\n", " min_samples_split=2,...\n", " solver='svd',\n", " store_covariance=False,\n", " tol=0.0001))],\n", " final_estimator=LogisticRegression(C=1.0, class_weight=None,\n", " dual=False,\n", " fit_intercept=True,\n", " intercept_scaling=1,\n", " l1_ratio=None,\n", " max_iter=1000,\n", " multi_class='auto',\n", " n_jobs=None, penalty='l2',\n", " random_state=123,\n", " solver='lbfgs',\n", " tol=0.0001, verbose=0,\n", " warm_start=False),\n", " n_jobs=-1, passthrough=True, stack_method='auto', verbose=0)
GaussianNB()
GradientBoostingClassifier(random_state=123)
LinearDiscriminantAnalysis()
LogisticRegression(max_iter=1000, random_state=123)
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7222 | \n", "0.8376 | \n", "0.4737 | \n", "0.6429 | \n", "0.5455 | \n", "0.3520 | \n", "0.3605 | \n", "
1 | \n", "0.7593 | \n", "0.7865 | \n", "0.7368 | \n", "0.6364 | \n", "0.6829 | \n", "0.4906 | \n", "0.4940 | \n", "
2 | \n", "0.6667 | \n", "0.8301 | \n", "0.4211 | \n", "0.5333 | \n", "0.4706 | \n", "0.2322 | \n", "0.2357 | \n", "
3 | \n", "0.6852 | \n", "0.7639 | \n", "0.5263 | \n", "0.5556 | \n", "0.5405 | \n", "0.3014 | \n", "0.3016 | \n", "
4 | \n", "0.7778 | \n", "0.8406 | \n", "0.6842 | \n", "0.6842 | \n", "0.6842 | \n", "0.5128 | \n", "0.5128 | \n", "
5 | \n", "0.6481 | \n", "0.6887 | \n", "0.3684 | \n", "0.5000 | \n", "0.4242 | \n", "0.1792 | \n", "0.1835 | \n", "
6 | \n", "0.7407 | \n", "0.7338 | \n", "0.5263 | \n", "0.6667 | \n", "0.5882 | \n", "0.4028 | \n", "0.4088 | \n", "
7 | \n", "0.8491 | \n", "0.8603 | \n", "0.6111 | \n", "0.9167 | \n", "0.7333 | \n", "0.6339 | \n", "0.6592 | \n", "
8 | \n", "0.6604 | \n", "0.6952 | \n", "0.5000 | \n", "0.5000 | \n", "0.5000 | \n", "0.2429 | \n", "0.2429 | \n", "
9 | \n", "0.6038 | \n", "0.6159 | \n", "0.3333 | \n", "0.4000 | \n", "0.3636 | \n", "0.0794 | \n", "0.0801 | \n", "
Mean | \n", "0.7113 | \n", "0.7653 | \n", "0.5181 | \n", "0.6036 | \n", "0.5533 | \n", "0.3427 | \n", "0.3479 | \n", "
Std | \n", "0.0689 | \n", "0.0766 | \n", "0.1235 | \n", "0.1346 | \n", "0.1141 | \n", "0.1610 | \n", "0.1655 | \n", "
\n", " | Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|
Fold | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "0.7037 | \n", "0.7338 | \n", "0.1579 | \n", "1.0000 | \n", "0.2727 | \n", "0.1955 | \n", "0.3292 | \n", "
1 | \n", "0.6296 | \n", "0.6767 | \n", "0.1053 | \n", "0.4000 | \n", "0.1667 | \n", "0.0235 | \n", "0.0322 | \n", "
2 | \n", "0.6667 | \n", "0.7677 | \n", "0.0526 | \n", "1.0000 | \n", "0.1000 | \n", "0.0672 | \n", "0.1864 | \n", "
3 | \n", "0.6667 | \n", "0.6940 | \n", "0.2105 | \n", "0.5714 | \n", "0.3077 | \n", "0.1459 | \n", "0.1774 | \n", "
4 | \n", "0.6481 | \n", "0.7962 | \n", "0.0000 | \n", "0.0000 | \n", "0.0000 | \n", "0.0000 | \n", "0.0000 | \n", "
5 | \n", "0.5926 | \n", "0.6045 | \n", "0.1053 | \n", "0.2857 | \n", "0.1538 | \n", "-0.0439 | \n", "-0.0534 | \n", "
6 | \n", "0.7222 | \n", "0.6752 | \n", "0.2105 | \n", "1.0000 | \n", "0.3478 | \n", "0.2569 | \n", "0.3839 | \n", "
7 | \n", "0.7358 | \n", "0.7476 | \n", "0.2222 | \n", "1.0000 | \n", "0.3636 | \n", "0.2740 | \n", "0.3984 | \n", "
8 | \n", "0.6226 | \n", "0.6151 | \n", "0.1111 | \n", "0.3333 | \n", "0.1667 | \n", "-0.0038 | \n", "-0.0047 | \n", "
9 | \n", "0.6792 | \n", "0.5683 | \n", "0.1667 | \n", "0.6000 | \n", "0.2609 | \n", "0.1328 | \n", "0.1774 | \n", "
Mean | \n", "0.6667 | \n", "0.6879 | \n", "0.1342 | \n", "0.6190 | \n", "0.2140 | \n", "0.1048 | \n", "0.1627 | \n", "
Std | \n", "0.0431 | \n", "0.0712 | \n", "0.0692 | \n", "0.3474 | \n", "0.1103 | \n", "0.1074 | \n", "0.1586 | \n", "
\n", " | Model Name | \n", "Model | \n", "Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "Custom Metric | \n", "
---|---|---|---|---|---|---|---|---|---|---|
Index | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "Logistic Regression | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7689 | \n", "0.8068 | \n", "0.4959 | \n", "0.7614 | \n", "0.5968 | \n", "0.4453 | \n", "0.4673 | \n", "NaN | \n", "
1 | \n", "K Neighbors Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7002 | \n", "0.7433 | \n", "0.4860 | \n", "0.5965 | \n", "0.5311 | \n", "0.3142 | \n", "0.3210 | \n", "NaN | \n", "
2 | \n", "Naive Bayes | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7427 | \n", "0.7957 | \n", "0.5702 | \n", "0.6543 | \n", "0.6043 | \n", "0.4156 | \n", "0.4215 | \n", "NaN | \n", "
3 | \n", "Decision Tree Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.6947 | \n", "0.6526 | \n", "0.5137 | \n", "0.5665 | \n", "0.5343 | \n", "0.3103 | \n", "0.3130 | \n", "NaN | \n", "
4 | \n", "SVM - Linear Kernel | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7521 | \n", "0.0000 | \n", "0.5070 | \n", "0.7363 | \n", "0.5796 | \n", "0.4154 | \n", "0.4398 | \n", "NaN | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
70 | \n", "Decision Tree Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7021 | \n", "0.6667 | \n", "0.5506 | \n", "0.5731 | \n", "0.5589 | \n", "0.3350 | \n", "0.3370 | \n", "NaN | \n", "
71 | \n", "Voting Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7595 | \n", "0.8056 | \n", "0.5649 | \n", "0.6976 | \n", "0.6196 | \n", "0.4469 | \n", "0.4555 | \n", "NaN | \n", "
72 | \n", "Stacking Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7633 | \n", "0.8035 | \n", "0.5333 | \n", "0.7190 | \n", "0.6096 | \n", "0.4453 | \n", "0.4572 | \n", "NaN | \n", "
73 | \n", "Light Gradient Boosting Machine | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.7113 | \n", "0.7653 | \n", "0.5181 | \n", "0.6036 | \n", "0.5533 | \n", "0.3427 | \n", "0.3479 | \n", "NaN | \n", "
74 | \n", "Decision Tree Classifier | \n", "(TransformerWrapper(exclude=None, include=None... | \n", "0.6667 | \n", "0.6879 | \n", "0.1342 | \n", "0.6190 | \n", "0.2140 | \n", "0.1048 | \n", "0.1627 | \n", "NaN | \n", "
75 rows × 10 columns
\n", "Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " strategy='most_frequent',\n", " verbose='deprecated'))),\n", " ('normalize',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ['trained_model',\n", " LinearDiscriminantAnalysis(covariance_estimator=None,\n", " n_components=None, priors=None,\n", " shrinkage=None, solver='svd',\n", " store_covariance=False,\n", " tol=0.0001)]],\n", " verbose=False)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " strategy='most_frequent',\n", " verbose='deprecated'))),\n", " ('normalize',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ['trained_model',\n", " LinearDiscriminantAnalysis(covariance_estimator=None,\n", " n_components=None, priors=None,\n", " shrinkage=None, solver='svd',\n", " store_covariance=False,\n", " tol=0.0001)]],\n", " verbose=False)
TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))
CleanColumnNames()
CleanColumnNames()
TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 hours in an oral '\n", " 'glucose tolerance test',\n", " 'Diastolic blood pressure (mm Hg)',\n", " 'Triceps skin fold thickness (mm)',\n", " '2-Hour serum insulin (mu U/ml)',\n", " 'Body mass index (weight in kg/(height in m)^2)',\n", " 'Diabetes pedigree function', 'Age (years)'],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='mean',\n", " verbose='deprecated'))
SimpleImputer()
SimpleImputer()
TransformerWrapper(exclude=None, include=[],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='most_frequent',\n", " verbose='deprecated'))
SimpleImputer(strategy='most_frequent')
SimpleImputer(strategy='most_frequent')
TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False, copy=True,\n", " feature_range=(0, 1)))
MinMaxScaler()
MinMaxScaler()
LinearDiscriminantAnalysis()
RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True, fit_intercept=True,\n", " max_iter=None, normalize='deprecated', positive=False,\n", " random_state=123, solver='auto', tol=0.001)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True, fit_intercept=True,\n", " max_iter=None, normalize='deprecated', positive=False,\n", " random_state=123, solver='auto', tol=0.001)
\n", " | Model | \n", "Accuracy | \n", "AUC | \n", "Recall | \n", "Prec. | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "
---|---|---|---|---|---|---|---|---|
0 | \n", "Ridge Classifier | \n", "0.7489 | \n", "0.6902 | \n", "0.4938 | \n", "0.7018 | \n", "0.5797 | \n", "0.4083 | \n", "0.4211 | \n", "
\n", " | Samples | \n", "Accuracy | \n", "Recall | \n", "Precision | \n", "F1 | \n", "Kappa | \n", "MCC | \n", "Selection Rate | \n", "
---|---|---|---|---|---|---|---|---|
Number of times pregnant | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
0 | \n", "27 | \n", "0.814815 | \n", "0.666667 | \n", "0.571429 | \n", "0.615385 | \n", "0.494382 | \n", "0.496929 | \n", "0.259259 | \n", "
1 | \n", "49 | \n", "0.734694 | \n", "0.181818 | \n", "0.333333 | \n", "0.235294 | \n", "0.091298 | \n", "0.097443 | \n", "0.122449 | \n", "
2 | \n", "27 | \n", "0.888889 | \n", "0.5 | \n", "1.0 | \n", "0.666667 | \n", "0.608696 | \n", "0.661438 | \n", "0.111111 | \n", "
3 | \n", "21 | \n", "0.761905 | \n", "0.5 | \n", "0.8 | \n", "0.615385 | \n", "0.455959 | \n", "0.482382 | \n", "0.238095 | \n", "
4 | \n", "23 | \n", "0.652174 | \n", "0.333333 | \n", "0.333333 | \n", "0.333333 | \n", "0.098039 | \n", "0.098039 | \n", "0.26087 | \n", "
5 | \n", "21 | \n", "0.714286 | \n", "0.375 | \n", "0.75 | \n", "0.5 | \n", "0.329787 | \n", "0.36863 | \n", "0.190476 | \n", "
6 | \n", "18 | \n", "0.722222 | \n", "0.571429 | \n", "0.666667 | \n", "0.615385 | \n", "0.4 | \n", "0.402911 | \n", "0.333333 | \n", "
7 | \n", "13 | \n", "0.692308 | \n", "0.5 | \n", "1.0 | \n", "0.666667 | \n", "0.434783 | \n", "0.527046 | \n", "0.307692 | \n", "
8 | \n", "12 | \n", "0.666667 | \n", "0.333333 | \n", "1.0 | \n", "0.5 | \n", "0.333333 | \n", "0.447214 | \n", "0.166667 | \n", "
9 | \n", "5 | \n", "0.8 | \n", "0.75 | \n", "1.0 | \n", "0.857143 | \n", "0.545455 | \n", "0.612372 | \n", "0.6 | \n", "
10 | \n", "7 | \n", "0.714286 | \n", "0.8 | \n", "0.8 | \n", "0.8 | \n", "0.3 | \n", "0.3 | \n", "0.714286 | \n", "
11 | \n", "2 | \n", "0.5 | \n", "1.0 | \n", "0.5 | \n", "0.666667 | \n", "0.0 | \n", "0.0 | \n", "1.0 | \n", "
12 | \n", "2 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "NaN | \n", "0.0 | \n", "1.0 | \n", "
13 | \n", "2 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "1.0 | \n", "0.5 | \n", "
14 | \n", "2 | \n", "0.5 | \n", "0.5 | \n", "1.0 | \n", "0.666667 | \n", "0.0 | \n", "0.0 | \n", "0.5 | \n", "
Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " verbose='deprecated'))),\n", " ('normalize',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ('actual_estimator',\n", " RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True,\n", " fit_intercept=True, max_iter=None,\n", " normalize='deprecated', positive=False,\n", " random_state=123, solver='auto', tol=0.001))],\n", " verbose=False)In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " verbose='deprecated'))),\n", " ('normalize',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ('actual_estimator',\n", " RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True,\n", " fit_intercept=True, max_iter=None,\n", " normalize='deprecated', positive=False,\n", " random_state=123, solver='auto', tol=0.001))],\n", " verbose=False)
TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))
CleanColumnNames()
CleanColumnNames()
TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 hours in an oral '\n", " 'glucose tolerance test',\n", " 'Diastolic blood pressure (mm Hg)',\n", " 'Triceps skin fold thickness (mm)',\n", " '2-Hour serum insulin (mu U/ml)',\n", " 'Body mass index (weight in kg/(height in m)^2)',\n", " 'Diabetes pedigree function', 'Age (years)'],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='mean',\n", " verbose='deprecated'))
SimpleImputer()
TransformerWrapper(exclude=None, include=[],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='most_frequent',\n", " verbose='deprecated'))
SimpleImputer(strategy='most_frequent')
TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False, copy=True,\n", " feature_range=(0, 1)))
MinMaxScaler()
RidgeClassifier(random_state=123)
Pipeline(memory=FastMemory(location=C:\\Users\\owner\\AppData\\Local\\Temp\\joblib),\n", " steps=[('clean_column_names',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))),\n", " ('numerical_imputer',\n", " TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 '\n", " 'hours in an oral glu...\n", " verbose='deprecated'))),\n", " ('normalize',\n", " TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False,\n", " copy=True,\n", " feature_range=(0,\n", " 1)))),\n", " ('trained_model',\n", " RidgeClassifier(alpha=1.0, class_weight=None, copy_X=True,\n", " fit_intercept=True, max_iter=None,\n", " normalize='deprecated', positive=False,\n", " random_state=123, solver='auto', tol=0.001))],\n", " verbose=False)
TransformerWrapper(exclude=None, include=None,\n", " transformer=CleanColumnNames(match='[\\\\]\\\\[\\\\,\\\\{\\\\}\\\\"\\\\:]+'))
CleanColumnNames()
TransformerWrapper(exclude=None,\n", " include=['Number of times pregnant',\n", " 'Plasma glucose concentration a 2 hours in an oral '\n", " 'glucose tolerance test',\n", " 'Diastolic blood pressure (mm Hg)',\n", " 'Triceps skin fold thickness (mm)',\n", " '2-Hour serum insulin (mu U/ml)',\n", " 'Body mass index (weight in kg/(height in m)^2)',\n", " 'Diabetes pedigree function', 'Age (years)'],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='mean',\n", " verbose='deprecated'))
SimpleImputer()
TransformerWrapper(exclude=None, include=[],\n", " transformer=SimpleImputer(add_indicator=False, copy=True,\n", " fill_value=None,\n", " missing_values=nan,\n", " strategy='most_frequent',\n", " verbose='deprecated'))
SimpleImputer(strategy='most_frequent')
TransformerWrapper(exclude=None, include=None,\n", " transformer=MinMaxScaler(clip=False, copy=True,\n", " feature_range=(0, 1)))
MinMaxScaler()
RidgeClassifier(random_state=123)
| | Description | Value |
|---|---|---|
| 0 | Session id | 123 |
| 1 | Target | Class variable |
| 2 | Target type | Binary |
| 3 | Original data shape | (768, 9) |
| 4 | Transformed data shape | (768, 9) |
| 5 | Transformed train set shape | (537, 9) |
| 6 | Transformed test set shape | (231, 9) |
| 7 | Numeric features | 8 |
| 8 | Preprocess | True |
| 9 | Imputation type | simple |
| 10 | Numeric imputation | mean |
| 11 | Categorical imputation | mode |
| 12 | Normalize | True |
| 13 | Normalize method | minmax |
| 14 | Fold Generator | StratifiedKFold |
| 15 | Fold Number | 10 |
| 16 | CPU Jobs | -1 |
| 17 | Use GPU | False |
| 18 | Log Experiment | False |
| 19 | Experiment Name | clf-default-name |
| 20 | USI | 3e8a |