\n "},"metadata":{},"execution_count":10}],"execution_count":10},{"cell_type":"markdown","source":"The first step is to define our objective function, which must return a dictionary containing at least the keys 'loss' and 'status'.","metadata":{"id":"Jne3otMi3UvY","cell_id":"a1502a53fb3449f6a30ddd8db4c08780","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"import csv\nfrom hyperopt import STATUS_OK\nfrom timeit import default_timer as timer\n\nMAX_EVALS = 500\nN_FOLDS = 10\n\ndef objective(params, n_folds = N_FOLDS):\n    \"\"\"Objective function for Gradient Boosting Machine hyperparameter optimization\"\"\"\n    # Keep track of the iteration count\n    global ITERATION\n    ITERATION += 1\n    # Retrieve the subsample if present; otherwise default to 1.0\n    subsample = params['boosting_type'].get('subsample', 1.0)\n    # Extract the boosting type\n    params['boosting_type'] = params['boosting_type']['boosting_type']\n    params['subsample'] = subsample\n\n    # Make sure parameters that must be integers are integers\n    for parameter_name in ['num_leaves', 'subsample_for_bin',\n                           'min_child_samples']:\n        params[parameter_name] = int(params[parameter_name])\n    start = timer()\n\n    # Run n_folds of cross validation\n    cv_results = lgb.cv(params, train_set, num_boost_round = 10000,\n                        nfold = n_folds, early_stopping_rounds = 100,\n                        metrics = 'auc', seed = 50)\n    run_time = timer() - start\n    # Extract the best score\n    best_score = np.max(cv_results['auc-mean'])\n    # The loss must be minimized\n    loss = 1 - best_score\n    # Number of boosting rounds that gave the highest CV score\n    n_estimators = int(np.argmax(cv_results['auc-mean']) + 1)\n    # Append the results to the CSV file ('a' means append); the with block closes the file\n    with open(out_file, 'a') as of_connection:\n        writer = csv.writer(of_connection)\n        writer.writerow([loss, params, ITERATION, n_estimators,\n                         run_time])\n    # Dictionary with information for the evaluation\n    return {'loss': loss, 'params': params, 'iteration': ITERATION,\n            'estimators': n_estimators, 'train_time': run_time,\n            'status': STATUS_OK}","metadata":{"id":"6G7jHRjJ3ZDJ","cell_id":"1861e6e115a24dceba9424b9aa72b020","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":708,"user_tz":240,"timestamp":1652643547820},"deepnote_cell_type":"code"},"outputs":[],"execution_count":11},{"cell_type":"markdown","source":"**Domain Space**: The domain represents the range of values we want to evaluate for each hyperparameter. At each iteration of the search, the Bayesian optimization algorithm chooses a value for each hyperparameter from the domain space. In a Random Search or a Grid Search, the domain space is a grid (a table of predefined values). In Bayesian optimization the idea is the same, except that the space holds a probability distribution for each hyperparameter instead of discrete values.","metadata":{"id":"QPaU5N-b33Dg","cell_id":"e56ce6526aa34867803155e0831b2ba3","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"from hyperopt import hp\nspace = {\n    'class_weight': hp.choice('class_weight', [None, 'balanced']),\n    'boosting_type': hp.choice('boosting_type', [{'boosting_type': 'gbdt', 'subsample': hp.uniform('gbdt_subsample', 0.5, 1)},\n                                                 {'boosting_type': 'dart', 'subsample': hp.uniform('dart_subsample', 0.5, 1)},\n                                                 {'boosting_type': 'goss', 'subsample': 1.0}]),\n    'num_leaves': hp.quniform('num_leaves', 30, 150, 1),\n    'learning_rate': hp.loguniform('learning_rate', np.log(0.01), np.log(0.2)),\n    'subsample_for_bin': hp.quniform('subsample_for_bin', 20000, 300000, 1000),\n    'min_child_samples': hp.quniform('min_child_samples', 20, 500, 5),\n    'reg_alpha': hp.uniform('reg_alpha', 0.0, 1.0),\n    'reg_lambda': hp.uniform('reg_lambda', 0.0, 1.0),\n    'colsample_bytree': hp.uniform('colsample_by_tree', 0.6, 1.0)\n}","metadata":{"id":"H7ZxRSYn5LRp","cell_id":"f24b7005b5c94eaab6fc6c32428f297f","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":303,"user_tz":240,"timestamp":1652643553783},"deepnote_cell_type":"code"},"outputs":[],"execution_count":12},{"cell_type":"markdown","source":"Several types of domain distribution can be used here (the full list of distributions is in the hyperopt documentation):\n\n**choice**: categorical variables\n\n**quniform**: discrete uniform (evenly spaced values; they are returned as floats, which is why the objective function converts them to integers)\n\n**uniform**: continuous uniform (floats drawn uniformly over the range)\n\n**loguniform**: continuous log-uniform (floats drawn uniformly on a logarithmic scale)","metadata":{"id":"G5gHyhPp4Vl6","cell_id":"fb0f1ce1e3fb4bc28e28744e8bc27fee","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"from hyperopt import tpe\nfrom hyperopt import Trials\n\n# Optimization algorithm\ntpe_algorithm = tpe.suggest\n\n# Keeps a record of the results\nbayes_trials = Trials()","metadata":{"id":"315g1l1U4Sja","cell_id":"d858eb57d6b84f7397d601a051d6fb0c","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":511,"user_tz":240,"timestamp":1652643556398},"deepnote_cell_type":"code"},"outputs":[],"execution_count":13},{"cell_type":"code","source":"from hyperopt import fmin\n\n# Global variable\nglobal ITERATION\nITERATION = 0\nMAX_EVALS = 100\n\n# Create an lgb dataset\ntrain_set = lgb.Dataset(X_train, label = y_train)","metadata":{"id":"VF6i8MlB4mLr","cell_id":"d425abd3dbe94af8bf8c7e3f8e504b54","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":334,"user_tz":240,"timestamp":1652643557949},"deepnote_cell_type":"code"},"outputs":[],"execution_count":14},{"cell_type":"code","source":"# File to store the first results\nout_file = './gbm_trials.csv'\nof_connection = open(out_file, 'w')\nwriter = csv.writer(of_connection)\n\n# Write the header row of the file\nwriter.writerow(['loss', 'params', 'iteration', 'estimators', 'train_time'])\nof_connection.close()","metadata":{"id":"T69y-Qy07SuV","cell_id":"a34c224abf0c4699ad7cf19732395d26","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":309,"user_tz":240,"timestamp":1652643562694},"deepnote_cell_type":"code"},"outputs":[],"execution_count":15},{"cell_type":"code","source":"# This takes quite a while\n# Note: newer hyperopt releases may require np.random.default_rng(50) for rstate\nbest = fmin(fn = objective, space = space, algo = tpe.suggest,\n            max_evals = MAX_EVALS, trials = bayes_trials,\n            rstate = np.random.RandomState(50))","metadata":{"id":"SP4WnFAN4nSO","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"61a06617c3e54448a6014dca18966593","outputId":"88419f31-247e-482f-8572-2a8c7e5fd81a","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":5588040,"user_tz":240,"timestamp":1652649152683},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":" 5%|▌ | 5/100 [00:04<01:11, 1.34it/s, best loss: 0.13747126630679263]"},{"output_type":"stream","name":"stderr","text":"/usr/local/lib/python3.7/dist-packages/lightgbm/callback.py:189: UserWarning: Early stopping is not available in dart mode\n warnings.warn('Early stopping is not available in dart mode')\n\n"},{"output_type":"stream","name":"stdout","text":" 13%|█▎ | 13/100 [07:12<1:33:45, 64.66s/it, best loss: 0.13023161268556005]"},{"output_type":"stream","name":"stdout","text":" 23%|██▎ | 23/100 [09:45<08:18, 6.48s/it, best loss: 0.1295262033288349]"},{"output_type":"stream","name":"stdout","text":" 71%|███████ | 71/100 [38:04<1:13:47, 152.66s/it, best loss: 0.1292669815564551]"},{"output_type":"stream","name":"stdout","text":" 72%|███████▏ | 72/100 [40:48<1:12:46, 155.94s/it, best loss: 0.12762910481331535]"},{"output_type":"stream","name":"stdout","text":" 82%|████████▏ | 82/100 [1:04:35<50:41, 168.98s/it, best loss: 0.12759834682860993]"},{"output_type":"stream","name":"stdout","text":" 83%|████████▎ | 83/100 [1:08:12<51:56, 183.34s/it, best loss: 0.12692223346828602]"},{"output_type":"stream","name":"stdout","text":"100%|██████████| 100/100 [1:33:07<00:00, 55.88s/it, best loss: 0.12692223346828602]\n"}],"execution_count":16},{"cell_type":"markdown","source":"The `fmin` call above starts the search for the best hyperparameter combination. 
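
To make the loop behind that search concrete, here is a minimal sketch that uses only the standard library. It is not TPE and not hyperopt's implementation: it swaps in plain random search, and `sample_space`, `toy_objective` and `random_search` are hypothetical names invented for this illustration, with a made-up loss in place of the cross-validated AUC. It only shows the shape of the process: draw a point from the domain, evaluate the objective, record the result, and keep the lowest loss.

```python
import math
import random


def sample_space(rng):
    # Rough stand-ins for hp.quniform('num_leaves', 30, 150, 1) and
    # hp.loguniform('learning_rate', log(0.01), log(0.2)); an assumption, not hyperopt code
    num_leaves = int(round(rng.uniform(30, 150)))
    learning_rate = math.exp(rng.uniform(math.log(0.01), math.log(0.2)))
    return {'num_leaves': num_leaves, 'learning_rate': learning_rate}


def toy_objective(params):
    # Made-up loss whose minimum is at num_leaves=90, learning_rate=0.05
    leaves_term = ((params['num_leaves'] - 90) / 60) ** 2
    rate_term = math.log(params['learning_rate'] / 0.05) ** 2
    return {'loss': leaves_term + rate_term, 'params': params, 'status': 'ok'}


def random_search(objective, max_evals, seed=50):
    # Evaluate max_evals random points and return the results sorted by loss
    rng = random.Random(seed)
    results = [objective(sample_space(rng)) for _ in range(max_evals)]
    return sorted(results, key=lambda r: r['loss'])


trials = random_search(toy_objective, max_evals=100)
best = trials[0]
print(best['loss'], best['params'])
```

The difference in the real run is that `tpe.suggest` uses the losses recorded in the Trials object to decide where to sample next, instead of drawing every point independently.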
\nOnce the process has finished, we can take the Trials object (bayes_trials in our case) and analyze its results:","metadata":{"id":"A5yZD49E7fWV","cell_id":"4eef4756ebe34cd8af3e0014ecda9ce0","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# Sort the trials with the lowest loss (highest AUC) first\nbayes_trials_results = sorted(bayes_trials.results, key = lambda x: x['loss'])\nbayes_trials_results[:2]","metadata":{"id":"_uWgB92T7iy-","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"909b0a4b114b4dfab56990cb31e8feeb","outputId":"bc382a3e-8bd7-40da-e796-8870861bf2d2","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":312,"user_tz":240,"timestamp":1652649430011},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"[{'estimators': 1390,\n 'iteration': 83,\n 'loss': 0.12692223346828602,\n 'params': {'boosting_type': 'dart',\n 'class_weight': 'balanced',\n 'colsample_bytree': 0.8617558102005193,\n 'learning_rate': 0.045091115529774406,\n 'min_child_samples': 40,\n 'num_leaves': 145,\n 'reg_alpha': 0.03906368016088817,\n 'reg_lambda': 0.8457944649575712,\n 'subsample': 0.5562695107489157,\n 'subsample_for_bin': 201000},\n 'status': 'ok',\n 'train_time': 216.78248231699945},\n {'estimators': 377,\n 'iteration': 82,\n 'loss': 0.12759834682860993,\n 'params': {'boosting_type': 'dart',\n 'class_weight': 'balanced',\n 'colsample_bytree': 0.941484025255672,\n 'learning_rate': 0.08594930906358782,\n 'min_child_samples': 40,\n 'num_leaves': 143,\n 'reg_alpha': 0.14142198699291497,\n 'reg_lambda': 0.5752120688348917,\n 'subsample': 0.5992524654733394,\n 'subsample_for_bin': 247000},\n 'status': 'ok',\n 'train_time': 212.17623683700003}]"},"metadata":{},"execution_count":17}],"execution_count":17},{"cell_type":"code","source":"results = pd.read_csv('./gbm_trials.csv')\n\n# Sort with the best score first and reset the index for slicing\nresults.sort_values('loss', ascending = True, inplace = True)\nresults.reset_index(inplace = True, drop = True)\nresults.head()","metadata":{"id":"4x3OGTAr7qCt","colab":{"height":206,"base_uri":"https://localhost:8080/"},"cell_id":"863f438adab5498b8c53328e89860c28","outputId":"94f53627-dd3b-400c-f533-daf40bf9de82","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":305,"user_tz":240,"timestamp":1652649437081},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":" loss params iteration \\\n0 0.126922 {'boosting_type': 'dart', 'class_weight': 'bal... 83 \n1 0.127598 {'boosting_type': 'dart', 'class_weight': 'bal... 82 \n2 0.127629 {'boosting_type': 'dart', 'class_weight': None... 72 \n3 0.128016 {'boosting_type': 'dart', 'class_weight': 'bal... 81 \n4 0.128408 {'boosting_type': 'dart', 'class_weight': 'bal... 97 \n\n estimators train_time \n0 1390 216.782482 \n1 377 212.176237 \n2 4473 163.547625 \n3 484 206.337745 \n4 299 256.564190 ","text/html":"\n
<table>\n<thead>\n<tr><th></th><th>loss</th><th>params</th><th>iteration</th><th>estimators</th><th>train_time</th></tr>\n</thead>\n<tbody>\n<tr><th>0</th><td>0.126922</td><td>{'boosting_type': 'dart', 'class_weight': 'bal...</td><td>83</td><td>1390</td><td>216.782482</td></tr>\n<tr><th>1</th><td>0.127598</td><td>{'boosting_type': 'dart', 'class_weight': 'bal...</td><td>82</td><td>377</td><td>212.176237</td></tr>\n<tr><th>2</th><td>0.127629</td><td>{'boosting_type': 'dart', 'class_weight': None...</td><td>72</td><td>4473</td><td>163.547625</td></tr>\n<tr><th>3</th><td>0.128016</td><td>{'boosting_type': 'dart', 'class_weight': 'bal...</td><td>81</td><td>484</td><td>206.337745</td></tr>\n<tr><th>4</th><td>0.128408</td><td>{'boosting_type': 'dart', 'class_weight': 'bal...</td><td>97</td><td>299</td><td>256.564190</td></tr>\n</tbody>\n</table>
\n "},"metadata":{},"execution_count":18}],"execution_count":18},{"cell_type":"code","source":"import ast\n# Convert the string back into a dictionary\nast.literal_eval(results.loc[0, 'params'])","metadata":{"id":"ternC_q-7_6q","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"846f51116d434c3aaff01a07fb85f2e9","outputId":"c506ec18-0091-4017-d547-81142b844091","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":245,"user_tz":240,"timestamp":1652649441071},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"{'boosting_type': 'dart',\n 'class_weight': 'balanced',\n 'colsample_bytree': 0.8617558102005193,\n 'learning_rate': 0.045091115529774406,\n 'min_child_samples': 40,\n 'num_leaves': 145,\n 'reg_alpha': 0.03906368016088817,\n 'reg_lambda': 0.8457944649575712,\n 'subsample': 0.5562695107489157,\n 'subsample_for_bin': 201000}"},"metadata":{},"execution_count":19}],"execution_count":19}],"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Optimizacion_Bayesiana.ipynb","provenance":[],"collapsed_sections":[]},"deepnote":{},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"},"deepnote_notebook_id":"3516e3d953f64aeaa0ec988c90faba0a","deepnote_execution_queue":[]}}