{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# 👉 What is PyCaret?\n", "\n", "PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that speeds up the experiment cycle and makes you more productive.\n", "\n", "Compared with other open-source machine learning libraries, PyCaret is an alternative low-code library that can replace hundreds of lines of code with only a few. This makes experiments considerably faster and more efficient. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt, Ray, and many more.\n", "\n", "The design and simplicity of PyCaret are inspired by the emerging role of citizen data scientists, a term first used by Gartner. Citizen data scientists are power users who can perform both simple and moderately sophisticated analytical tasks that would previously have required more expertise. Seasoned data scientists are often difficult to find and expensive to hire, but citizen data scientists can be an effective way to help close this gap and address data-related challenges in the business setting.\n", "\n", "Official Website: https://www.pycaret.org\n", "Documentation: https://pycaret.readthedocs.io/en/latest/" ] }, { "attachments": { "image.png": { "image/png": "" } }, "cell_type": "markdown", "metadata": {}, "source": [ "![image.png](attachment:image.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 👉 Install PyCaret\n", "Installing PyCaret is very easy and takes only a few minutes. We strongly recommend using a virtual environment to avoid potential conflicts with other libraries.
PyCaret's default installation is a slim version that installs only the hard dependencies listed in [requirements.txt](https://github.com/pycaret/pycaret/blob/master/requirements.txt). To install the default version:\n", "\n", "- `pip install pycaret`\n", "\n", "When you install the full version of pycaret, all the optional dependencies listed [here](https://github.com/pycaret/pycaret/blob/master/requirements-optional.txt) are also installed. To install the full version:\n", "\n", "- `pip install pycaret[full]`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 👉 Dataset" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.900 | 0 | yes | southwest | 16884.92400 |
| 1 | 18 | male | 33.770 | 1 | no | southeast | 1725.55230 |
| 2 | 28 | male | 33.000 | 3 | no | southeast | 4449.46200 |
| 3 | 33 | male | 22.705 | 0 | no | northwest | 21984.47061 |
| 4 | 32 | male | 28.880 | 0 | no | northwest | 3866.85520 |
" ], "text/plain": [ " age sex bmi children smoker region charges\n", "0 19 female 27.900 0 yes southwest 16884.92400\n", "1 18 male 33.770 1 no southeast 1725.55230\n", "2 28 male 33.000 3 no southeast 4449.46200\n", "3 33 male 22.705 0 no northwest 21984.47061\n", "4 32 male 28.880 0 no northwest 3866.85520" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from pycaret.datasets import get_data\n", "data = get_data('insurance')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 👉 Data Preparation" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
|  | Description | Value |
|---|---|---|
| 0 | session_id | 123 |
| 1 | Target | charges |
| 2 | Original Data | (1338, 7) |
| 3 | Missing Values | False |
| 4 | Numeric Features | 2 |
| 5 | Categorical Features | 4 |
| 6 | Ordinal Features | False |
| 7 | High Cardinality Features | False |
| 8 | High Cardinality Method | None |
| 9 | Transformed Train Set | (936, 14) |
| 10 | Transformed Test Set | (402, 14) |
| 11 | Shuffle Train-Test | True |
| 12 | Stratify Train-Test | False |
| 13 | Fold Generator | KFold |
| 14 | Fold Number | 10 |
| 15 | CPU Jobs | -1 |
| 16 | Use GPU | False |
| 17 | Log Experiment | False |
| 18 | Experiment Name | reg-default-name |
| 19 | USI | 13da |
| 20 | Imputation Type | simple |
| 21 | Iterative Imputation Iteration | None |
| 22 | Numeric Imputer | mean |
| 23 | Iterative Imputation Numeric Model | None |
| 24 | Categorical Imputer | constant |
| 25 | Iterative Imputation Categorical Model | None |
| 26 | Unknown Categoricals Handling | least_frequent |
| 27 | Normalize | False |
| 28 | Normalize Method | None |
| 29 | Transformation | False |
| 30 | Transformation Method | None |
| 31 | PCA | False |
| 32 | PCA Method | None |
| 33 | PCA Components | None |
| 34 | Ignore Low Variance | False |
| 35 | Combine Rare Levels | False |
| 36 | Rare Level Threshold | None |
| 37 | Numeric Binning | False |
| 38 | Remove Outliers | False |
| 39 | Outliers Threshold | None |
| 40 | Remove Multicollinearity | False |
| 41 | Multicollinearity Threshold | None |
| 42 | Clustering | False |
| 43 | Clustering Iteration | None |
| 44 | Polynomial Features | False |
| 45 | Polynomial Degree | None |
| 46 | Trignometry Features | False |
| 47 | Polynomial Threshold | None |
| 48 | Group Features | False |
| 49 | Feature Selection | False |
| 50 | Feature Selection Method | classic |
| 51 | Features Selection Threshold | None |
| 52 | Feature Interaction | False |
| 53 | Feature Ratio | False |
| 54 | Interaction Threshold | None |
| 55 | Transform Target | False |
| 56 | Transform Target Method | box-cox |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from pycaret.regression import *\n", "s = setup(data, target = 'charges', session_id = 123)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | age | bmi | sex_female | children_0 | children_1 | children_2 | children_3 | children_4 | children_5 | smoker_yes | region_northeast | region_northwest | region_southeast | region_southwest |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 300 | 36.0 | 27.549999 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 904 | 60.0 | 35.099998 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 670 | 30.0 | 31.570000 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 617 | 49.0 | 25.600000 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 373 | 26.0 | 32.900002 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1238 | 37.0 | 22.705000 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 1147 | 20.0 | 31.920000 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 |
| 106 | 19.0 | 28.400000 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1041 | 18.0 | 23.084999 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 1122 | 53.0 | 36.860001 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 |

936 rows × 14 columns
" ], "text/plain": [ " age bmi sex_female children_0 children_1 children_2 \\\n", "300 36.0 27.549999 0.0 0.0 0.0 0.0 \n", "904 60.0 35.099998 1.0 1.0 0.0 0.0 \n", "670 30.0 31.570000 0.0 0.0 0.0 0.0 \n", "617 49.0 25.600000 0.0 0.0 0.0 1.0 \n", "373 26.0 32.900002 0.0 0.0 0.0 1.0 \n", "... ... ... ... ... ... ... \n", "1238 37.0 22.705000 0.0 0.0 0.0 0.0 \n", "1147 20.0 31.920000 1.0 1.0 0.0 0.0 \n", "106 19.0 28.400000 1.0 0.0 1.0 0.0 \n", "1041 18.0 23.084999 0.0 1.0 0.0 0.0 \n", "1122 53.0 36.860001 1.0 0.0 0.0 0.0 \n", "\n", " children_3 children_4 children_5 smoker_yes region_northeast \\\n", "300 1.0 0.0 0.0 0.0 1.0 \n", "904 0.0 0.0 0.0 0.0 0.0 \n", "670 1.0 0.0 0.0 0.0 0.0 \n", "617 0.0 0.0 0.0 1.0 0.0 \n", "373 0.0 0.0 0.0 1.0 0.0 \n", "... ... ... ... ... ... \n", "1238 1.0 0.0 0.0 0.0 1.0 \n", "1147 0.0 0.0 0.0 0.0 0.0 \n", "106 0.0 0.0 0.0 0.0 0.0 \n", "1041 0.0 0.0 0.0 0.0 1.0 \n", "1122 1.0 0.0 0.0 1.0 0.0 \n", "\n", " region_northwest region_southeast region_southwest \n", "300 0.0 0.0 0.0 \n", "904 0.0 0.0 1.0 \n", "670 0.0 1.0 0.0 \n", "617 0.0 0.0 1.0 \n", "373 0.0 0.0 1.0 \n", "... ... ... ... 
\n", "1238 0.0 0.0 0.0 \n", "1147 1.0 0.0 0.0 \n", "106 0.0 0.0 1.0 \n", "1041 0.0 0.0 0.0 \n", "1122 1.0 0.0 0.0 \n", "\n", "[936 rows x 14 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# check transformed X_train\n", "get_config('X_train')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Index(['age', 'bmi', 'sex_female', 'children_0', 'children_1', 'children_2',\n", " 'children_3', 'children_4', 'children_5', 'smoker_yes',\n", " 'region_northeast', 'region_northwest', 'region_southeast',\n", " 'region_southwest'],\n", " dtype='object')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# list columns of transformed X_train \n", "get_config('X_train').columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 👉 Model Training & Selection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare Models" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE | TT (Sec) |
|---|---|---|---|---|---|---|---|---|
| gbr | Gradient Boosting Regressor | 2702.7680 | 23242056.4409 | 4801.5704 | 0.8348 | 0.4397 | 0.3113 | 0.0480 |
| catboost | CatBoost Regressor | 2844.4446 | 24943135.5224 | 4977.1926 | 0.8228 | 0.4707 | 0.3364 | 2.0210 |
| rf | Random Forest Regressor | 2736.7455 | 24862762.2305 | 4970.6959 | 0.8213 | 0.4674 | 0.3294 | 0.2620 |
| lightgbm | Light Gradient Boosting Machine | 2959.5584 | 25236477.0456 | 5013.0892 | 0.8171 | 0.5427 | 0.3685 | 0.1770 |
| ada | AdaBoost Regressor | 4162.2323 | 28328260.0955 | 5316.6146 | 0.7985 | 0.6349 | 0.7263 | 0.0180 |
| et | Extra Trees Regressor | 2814.2964 | 28815493.0260 | 5339.0879 | 0.7964 | 0.4889 | 0.3350 | 0.2560 |
| xgboost | Extreme Gradient Boosting | 3302.3215 | 31739266.6000 | 5615.5941 | 0.7701 | 0.5661 | 0.4218 | 0.7540 |
| llar | Lasso Least Angle Regression | 4315.7895 | 38355976.5126 | 6173.8740 | 0.7311 | 0.6105 | 0.4415 | 0.0150 |
| ridge | Ridge Regression | 4336.2309 | 38381496.8000 | 6175.9541 | 0.7309 | 0.6193 | 0.4454 | 0.0160 |
| br | Bayesian Ridge | 4333.6881 | 38381669.3629 | 6175.9476 | 0.7308 | 0.6151 | 0.4450 | 0.0140 |
| lr | Linear Regression | 4323.6136 | 38380061.2000 | 6175.7164 | 0.7308 | 0.6175 | 0.4432 | 1.2800 |
| lasso | Lasso Regression | 4323.0688 | 38375137.8000 | 6175.3801 | 0.7308 | 0.6140 | 0.4431 | 0.0150 |
| lar | Least Angle Regression | 4447.7872 | 39653123.4167 | 6265.9028 | 0.7234 | 0.6468 | 0.4687 | 0.0170 |
| dt | Decision Tree Regressor | 3148.3402 | 43766011.6491 | 6584.7198 | 0.6855 | 0.5331 | 0.3455 | 0.0170 |
| huber | Huber Regressor | 3455.4030 | 48903724.4845 | 6970.8198 | 0.6545 | 0.4790 | 0.2174 | 0.0350 |
| omp | Orthogonal Matching Pursuit | 5754.7768 | 57503216.4290 | 7566.7093 | 0.5997 | 0.7418 | 0.8990 | 0.0140 |
| par | Passive Aggressive Regressor | 4164.7843 | 61324373.4835 | 7747.8332 | 0.5840 | 0.4724 | 0.2586 | 0.0190 |
| en | Elastic Net | 7369.0573 | 90443346.8000 | 9468.6782 | 0.3791 | 0.7377 | 0.9256 | 0.0150 |
| knn | K Neighbors Regressor | 7805.8425 | 126951808.0000 | 11221.6535 | 0.1218 | 0.8398 | 0.9147 | 0.0330 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# train all models using default hyperparameters\n", "best = compare_models()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "GradientBoostingRegressor(alpha=0.9, ccp_alpha=0.0, criterion='friedman_mse',\n", " init=None, learning_rate=0.1, loss='ls', max_depth=3,\n", " max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0, min_impurity_split=None,\n", " min_samples_leaf=1, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0, n_estimators=100,\n", " n_iter_no_change=None, presort='deprecated',\n", " random_state=123, subsample=1.0, tol=0.0001,\n", " validation_fraction=0.1, verbose=0, warm_start=False)\n" ] } ], "source": [ "print(best)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "sklearn.ensemble._gb.GradientBoostingRegressor" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(best)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create Model" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
|  | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| 0 | 3001.2294 | 37001480.2590 | 6082.8842 | 0.7790 | 0.4984 | 0.3140 |
| 1 | 3389.8885 | 49305179.5732 | 7021.7647 | 0.7133 | 0.5574 | 0.3361 |
| 2 | 2926.0191 | 42025684.6666 | 6482.7220 | 0.4679 | 0.6215 | 0.4025 |
| 3 | 2744.7144 | 34078761.4507 | 5837.7017 | 0.7154 | 0.5412 | 0.3740 |
| 4 | 3924.4816 | 59489464.3207 | 7712.9414 | 0.5575 | 0.6455 | 0.4796 |
| 5 | 3322.5435 | 42747575.4453 | 6538.1630 | 0.7250 | 0.4869 | 0.2928 |
| 6 | 3158.7047 | 49369669.1652 | 7026.3553 | 0.6641 | 0.4511 | 0.3089 |
| 7 | 2405.2970 | 31318616.6440 | 5596.3038 | 0.8278 | 0.4497 | 0.1434 |
| 8 | 3021.5461 | 39091793.3775 | 6252.3430 | 0.7475 | 0.5117 | 0.4381 |
| 9 | 3588.9772 | 53231891.5889 | 7296.0189 | 0.6571 | 0.5679 | 0.3653 |
| Mean | 3148.3402 | 43766011.6491 | 6584.7198 | 0.6855 | 0.5331 | 0.3455 |
| SD | 410.7953 | 8481549.4829 | 638.3390 | 0.1005 | 0.0631 | 0.0878 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# train individual model\n", "dt = create_model('dt')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DecisionTreeRegressor(ccp_alpha=0.0, criterion='mse', max_depth=None,\n", " max_features=None, max_leaf_nodes=None,\n", " min_impurity_decrease=0.0, min_impurity_split=None,\n", " min_samples_leaf=1, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0, presort='deprecated',\n", " random_state=123, splitter='best')\n" ] } ], "source": [ "print(dt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tune Hyperparameters" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| 0 | 2701.1273 | 21028661.9672 | 4585.7019 | 0.8744 | 0.4025 | 0.2937 |
| 1 | 3091.8449 | 30933374.9177 | 5561.7780 | 0.8201 | 0.4644 | 0.3233 |
| 2 | 2617.2545 | 22345453.1888 | 4727.0978 | 0.7171 | 0.4627 | 0.3155 |
| 3 | 2830.8796 | 22588574.3822 | 4752.7439 | 0.8114 | 0.4632 | 0.3918 |
| 4 | 2960.2007 | 25929500.5455 | 5092.1018 | 0.8071 | 0.4537 | 0.3175 |
| 5 | 2653.9498 | 18985846.9396 | 4357.2752 | 0.8779 | 0.3584 | 0.2672 |
| 6 | 2556.3558 | 18410040.3286 | 4290.6923 | 0.8747 | 0.3800 | 0.3059 |
| 7 | 2614.9736 | 22942326.3754 | 4789.8149 | 0.8738 | 0.4678 | 0.3049 |
| 8 | 2671.8309 | 19576326.3629 | 4424.5143 | 0.8736 | 0.3970 | 0.3326 |
| 9 | 2602.0013 | 23910249.5350 | 4889.8108 | 0.8460 | 0.4363 | 0.2869 |
| Mean | 2730.0418 | 22665035.4543 | 4747.1531 | 0.8376 | 0.4286 | 0.3139 |
| SD | 166.3070 | 3530111.2020 | 359.9627 | 0.0485 | 0.0385 | 0.0316 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# tune hyperparameters of model\n", "tuned_dt = tune_model(dt, search_library = 'optuna')" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DecisionTreeRegressor(ccp_alpha=0.0, criterion='friedman_mse', max_depth=5,\n", " max_features=0.9723036176467299, max_leaf_nodes=None,\n", " min_impurity_decrease=7.642419608769017e-06,\n", " min_impurity_split=None, min_samples_leaf=6,\n", " min_samples_split=6, min_weight_fraction_leaf=0.0,\n", " presort='deprecated', random_state=123, splitter='best')\n" ] } ], "source": [ "print(tuned_dt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ensemble Model" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| 0 | 2584.1702 | 20381846.7339 | 4514.6259 | 0.8783 | 0.4112 | 0.3067 |
| 1 | 2957.1900 | 30309405.6671 | 5505.3979 | 0.8238 | 0.4556 | 0.2961 |
| 2 | 2481.1893 | 19964450.5458 | 4468.1596 | 0.7472 | 0.4411 | 0.2926 |
| 3 | 2715.0783 | 20812332.5053 | 4562.0535 | 0.8262 | 0.4392 | 0.3649 |
| 4 | 2763.6315 | 25039494.3653 | 5003.9479 | 0.8138 | 0.4619 | 0.2862 |
| 5 | 2537.7935 | 18224009.0741 | 4268.9588 | 0.8828 | 0.3294 | 0.2451 |
| 6 | 2319.5499 | 16909113.0122 | 4112.0692 | 0.8849 | 0.3442 | 0.2728 |
| 7 | 2451.7525 | 22210287.0372 | 4712.7791 | 0.8779 | 0.4352 | 0.2625 |
| 8 | 2452.7959 | 19002500.6278 | 4359.1858 | 0.8773 | 0.4294 | 0.3517 |
| 9 | 2490.9265 | 23750364.4102 | 4873.4346 | 0.8470 | 0.4422 | 0.2904 |
| Mean | 2575.4078 | 21660380.3979 | 4638.0612 | 0.8459 | 0.4189 | 0.2969 |
| SD | 177.1968 | 3709070.0443 | 385.7052 | 0.0419 | 0.0432 | 0.0351 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bagged_tunned_dt = ensemble_model(tuned_dt)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "BaggingRegressor(base_estimator=DecisionTreeRegressor(ccp_alpha=0.0,\n", " criterion='friedman_mse',\n", " max_depth=5,\n", " max_features=0.9723036176467299,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=7.642419608769017e-06,\n", " min_impurity_split=None,\n", " min_samples_leaf=6,\n", " min_samples_split=6,\n", " min_weight_fraction_leaf=0.0,\n", " presort='deprecated',\n", " random_state=123,\n", " splitter='best'),\n", " bootstrap=True, bootstrap_features=False, max_features=1.0,\n", " max_samples=1.0, n_estimators=10, n_jobs=None, oob_score=False,\n", " random_state=123, verbose=0, warm_start=False)\n" ] } ], "source": [ "print(bagged_tunned_dt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Voting Ensemble" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
|  | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| 0 | 4317.6330 | 40225874.7120 | 6342.3871 | 0.7597 | 0.5377 | 0.5095 |
| 1 | 4603.8770 | 47671358.8264 | 6904.4449 | 0.7228 | 0.5528 | 0.4544 |
| 2 | 3526.5657 | 30025112.7222 | 5479.5176 | 0.6198 | 0.6098 | 0.5355 |
| 3 | 4059.9577 | 33028436.9777 | 5747.0372 | 0.7242 | 0.6159 | 0.6499 |
| 4 | 4719.1236 | 47134827.5838 | 6865.4809 | 0.6494 | 0.5783 | 0.5456 |
| 5 | 4036.9877 | 39416780.6753 | 6278.2785 | 0.7464 | 0.4745 | 0.3684 |
| 6 | 4098.2195 | 43644758.3056 | 6606.4180 | 0.7030 | 0.5116 | 0.4656 |
| 7 | 3938.6892 | 37270337.0487 | 6104.9437 | 0.7951 | 0.4289 | 0.3185 |
| 8 | 4493.1798 | 43777521.8925 | 6616.4584 | 0.7172 | 0.5800 | 0.6167 |
| 9 | 4452.8107 | 44338906.9474 | 6658.7466 | 0.7144 | 0.5613 | 0.4606 |
| Mean | 4224.7044 | 40653391.5692 | 6360.3713 | 0.7152 | 0.5451 | 0.4925 |
| SD | 341.8405 | 5548059.5247 | 446.1712 | 0.0480 | 0.0561 | 0.0970 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "dt = create_model('dt', verbose=False)\n", "lasso = create_model('lasso', verbose=False)\n", "knn = create_model('knn', verbose=False)\n", "blender = blend_models([dt,lasso,knn])" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "VotingRegressor(estimators=[('dt',\n", " DecisionTreeRegressor(ccp_alpha=0.0,\n", " criterion='mse',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_impurity_split=None,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " presort='deprecated',\n", " random_state=123,\n", " splitter='best')),\n", " ('lasso',\n", " Lasso(alpha=1.0, copy_X=True, fit_intercept=True,\n", " max_iter=1000, normalize=False,\n", " positive=False, precompute=False,\n", " random_state=123, selection='cyclic',\n", " tol=0.0001, warm_start=False)),\n", " ('knn',\n", " KNeighborsRegressor(algorithm='auto', leaf_size=30,\n", " metric='minkowski',\n", " metric_params=None, n_jobs=-1,\n", " n_neighbors=5, p=2,\n", " weights='uniform'))],\n", " n_jobs=-1, verbose=False, weights=None)\n" ] } ], "source": [ "print(blender)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "sklearn.ensemble._voting.VotingRegressor" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(blender)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Stacking Ensemble" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
|  | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| 0 | 3330.1372 | 26118092.6166 | 5110.5863 | 0.8440 | 0.4528 | 0.3698 |
| 1 | 3823.1056 | 36671240.6179 | 6055.6784 | 0.7868 | 0.5307 | 0.3557 |
| 2 | 3330.5109 | 29504475.5264 | 5431.8022 | 0.6264 | 0.5701 | 0.3860 |
| 3 | 3029.4249 | 21804106.6611 | 4669.4868 | 0.8179 | 0.4804 | 0.4295 |
| 4 | 3931.7435 | 36775712.8059 | 6064.2982 | 0.7265 | 0.5132 | 0.3751 |
| 5 | 3461.5432 | 29482846.8211 | 5429.8109 | 0.8103 | 0.4937 | 0.3541 |
| 6 | 3479.7848 | 31955531.2624 | 5652.9224 | 0.7826 | 0.4684 | 0.3600 |
| 7 | 3610.5786 | 30708816.3945 | 5541.5536 | 0.8311 | 0.5030 | 0.2904 |
| 8 | 3761.7961 | 27523649.5059 | 5246.2986 | 0.8222 | 0.5596 | 0.4627 |
| 9 | 3913.4878 | 35744866.0161 | 5978.7010 | 0.7698 | 0.6368 | 0.4228 |
| Mean | 3567.2113 | 30628933.8228 | 5518.1139 | 0.7818 | 0.5209 | 0.3806 |
| SD | 278.9752 | 4611695.3808 | 423.5013 | 0.0612 | 0.0525 | 0.0458 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "stacker = stack_models([dt,lasso,knn])" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "StackingRegressor(cv=KFold(n_splits=10, random_state=RandomState(MT19937) at 0x1B967A57468,\n", " shuffle=False),\n", " estimators=[('dt',\n", " DecisionTreeRegressor(ccp_alpha=0.0,\n", " criterion='mse',\n", " max_depth=None,\n", " max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_impurity_split=None,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " presor...\n", " positive=False, precompute=False,\n", " random_state=123, selection='cyclic',\n", " tol=0.0001, warm_start=False)),\n", " ('knn',\n", " KNeighborsRegressor(algorithm='auto',\n", " leaf_size=30,\n", " metric='minkowski',\n", " metric_params=None,\n", " n_jobs=-1, n_neighbors=5,\n", " p=2, weights='uniform'))],\n", " final_estimator=LinearRegression(copy_X=True,\n", " fit_intercept=True,\n", " n_jobs=-1, normalize=False),\n", " n_jobs=-1, passthrough=True, verbose=0)\n" ] } ], "source": [ "print(stacker)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Analyze Model" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "932ee7e2f8ab43dea7700551839b3862", "version_major": 2, "version_minor": 0 }, "text/plain": [ "interactive(children=(ToggleButtons(description='Plot Type:', icons=('',), options=(('Hyperparameters', 'param…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "evaluate_model(best)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "interpret_model(dt)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "interpret_model(dt, plot = 'reason', observation=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Model Predictions" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Model | MAE | MSE | RMSE | R2 | RMSLE | MAPE |
|---|---|---|---|---|---|---|---|
| 0 | Gradient Boosting Regressor | 2386.2018 | 17296249.1379 | 4158.8759 | 0.8789 | 0.3985 | 0.2922 |
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# predict on holdout / test set\n", "pred_holdout = predict_model(best);" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   age   bmi        sex_female  children_0  children_1  children_2  children_3  children_4  children_5  smoker_yes  region_northeast  region_northwest  region_southeast  region_southwest  charges       Label
0  49.0  42.680000  1.0         0.0         0.0         1.0         0.0         0.0         0.0         0.0         0.0               0.0               1.0               0.0               9800.888672   10621.483595
1  32.0  37.334999  0.0         0.0         1.0         0.0         0.0         0.0         0.0         0.0         1.0               0.0               0.0               0.0               4667.607422   7290.151941
2  27.0  31.400000  1.0         1.0         0.0         0.0         0.0         0.0         0.0         1.0         0.0               0.0               0.0               1.0               34838.871094  36012.959871
3  35.0  24.129999  0.0         0.0         1.0         0.0         0.0         0.0         0.0         0.0         0.0               1.0               0.0               0.0               5125.215820   7553.788882
4  60.0  25.740000  0.0         1.0         0.0         0.0         0.0         0.0         0.0         0.0         0.0               0.0               1.0               0.0               12142.578125  14904.032497
\n", "
" ], "text/plain": [ " age bmi sex_female children_0 children_1 children_2 \\\n", "0 49.0 42.680000 1.0 0.0 0.0 1.0 \n", "1 32.0 37.334999 0.0 0.0 1.0 0.0 \n", "2 27.0 31.400000 1.0 1.0 0.0 0.0 \n", "3 35.0 24.129999 0.0 0.0 1.0 0.0 \n", "4 60.0 25.740000 0.0 1.0 0.0 0.0 \n", "\n", " children_3 children_4 children_5 smoker_yes region_northeast \\\n", "0 0.0 0.0 0.0 0.0 0.0 \n", "1 0.0 0.0 0.0 0.0 1.0 \n", "2 0.0 0.0 0.0 1.0 0.0 \n", "3 0.0 0.0 0.0 0.0 0.0 \n", "4 0.0 0.0 0.0 0.0 0.0 \n", "\n", " region_northwest region_southeast region_southwest charges \\\n", "0 0.0 1.0 0.0 9800.888672 \n", "1 0.0 0.0 0.0 4667.607422 \n", "2 0.0 0.0 1.0 34838.871094 \n", "3 1.0 0.0 0.0 5125.215820 \n", "4 0.0 1.0 0.0 12142.578125 \n", "\n", " Label \n", "0 10621.483595 \n", "1 7290.151941 \n", "2 36012.959871 \n", "3 7553.788882 \n", "4 14904.032497 " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pred_holdout.head()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   age  sex     bmi     children  smoker  region
0  19   female  27.900  0         yes     southwest
1  18   male    33.770  1         no      southeast
2  28   male    33.000  3         no      southeast
3  33   male    22.705  0         no      northwest
4  32   male    28.880  0         no      northwest
\n", "
" ], "text/plain": [ " age sex bmi children smoker region\n", "0 19 female 27.900 0 yes southwest\n", "1 18 male 33.770 1 no southeast\n", "2 28 male 33.000 3 no southeast\n", "3 33 male 22.705 0 no northwest\n", "4 32 male 28.880 0 no northwest" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# simulate new, unseen data by dropping the target column\n", "data2 = data.copy()\n", "data2.drop('charges', axis=1, inplace=True)\n", "data2.head()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# finalize model: refit the pipeline on the full dataset, including the holdout set\n", "best_final = finalize_model(best)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   age  sex     bmi     children  smoker  region     Label
0  19   female  27.900  0         yes     southwest  18894.260073
1  18   male    33.770  1         no      southeast  3698.287534
2  28   male    33.000  3         no      southeast  6029.271578
3  33   male    22.705  0         no      northwest  8958.189116
4  32   male    28.880  0         no      northwest  3900.039002
\n", "
" ], "text/plain": [ " age sex bmi children smoker region Label\n", "0 19 female 27.900 0 yes southwest 18894.260073\n", "1 18 male 33.770 1 no southeast 3698.287534\n", "2 28 male 33.000 3 no southeast 6029.271578\n", "3 33 male 22.705 0 no northwest 8958.189116\n", "4 32 male 28.880 0 no northwest 3900.039002" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# predict on data2\n", "predictions = predict_model(best_final, data=data2)\n", "predictions.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 👉 Save / Load / Deploy Model" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Transformation Pipeline and Model Succesfully Saved\n" ] }, { "data": { "text/plain": [ "(Pipeline(memory=None,\n", " steps=[('dtypes',\n", " DataTypes_Auto_infer(categorical_features=[],\n", " display_types=True, features_todrop=[],\n", " id_columns=[], ml_usecase='regression',\n", " numerical_features=[], target='charges',\n", " time_features=[])),\n", " ('imputer',\n", " Simple_Imputer(categorical_strategy='not_available',\n", " fill_value_categorical=None,\n", " fill_value_numerical=None,\n", " numeric_strategy...\n", " learning_rate=0.1, loss='ls',\n", " max_depth=3, max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_impurity_split=None,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " n_estimators=100,\n", " n_iter_no_change=None,\n", " presort='deprecated',\n", " random_state=123, subsample=1.0,\n", " tol=0.0001, validation_fraction=0.1,\n", " verbose=0, warm_start=False)]],\n", " verbose=False),\n", " 'insurance-pipeline.pkl')" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "save_model(best_final, 'insurance-pipeline')" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "Transformation Pipeline and Model Successfully Loaded\n" ] } ], "source": [ "loaded_pipeline = load_model('insurance-pipeline')" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Pipeline(memory=None,\n", " steps=[('dtypes',\n", " DataTypes_Auto_infer(categorical_features=[],\n", " display_types=True, features_todrop=[],\n", " id_columns=[], ml_usecase='regression',\n", " numerical_features=[], target='charges',\n", " time_features=[])),\n", " ('imputer',\n", " Simple_Imputer(categorical_strategy='not_available',\n", " fill_value_categorical=None,\n", " fill_value_numerical=None,\n", " numeric_strategy...\n", " learning_rate=0.1, loss='ls',\n", " max_depth=3, max_features=None,\n", " max_leaf_nodes=None,\n", " min_impurity_decrease=0.0,\n", " min_impurity_split=None,\n", " min_samples_leaf=1,\n", " min_samples_split=2,\n", " min_weight_fraction_leaf=0.0,\n", " n_estimators=100,\n", " n_iter_no_change=None,\n", " presort='deprecated',\n", " random_state=123, subsample=1.0,\n", " tol=0.0001, validation_fraction=0.1,\n", " verbose=0, warm_start=False)]],\n", " verbose=False)\n" ] } ], "source": [ "print(loaded_pipeline)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model Succesfully Deployed on AWS S3\n" ] } ], "source": [ "# deploy model on AWS S3\n", "deploy_model(best_final, 'insurance-pipeline-aws', platform = 'aws',\n", " authentication = {'bucket' : 'pycaret-test'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## THE END" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "pycaret-dev", "language": "python", "name": "pycaret-dev" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": 
".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.10" } }, "nbformat": 4, "nbformat_minor": 2 }