{"cells":[{"cell_type":"markdown","source":"# LOOCV (leave-one-out cross-validation)","metadata":{"id":"cgI2bTuUYkyu","cell_id":"43706f53629841febac0c0cb839eda12","deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## LOOCV for classification","metadata":{"id":"TJzaPMabWZ2A","cell_id":"9262b1c8ac9540be8415e88104576666","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# libraries\nfrom numpy import mean\nfrom numpy import std\nfrom pandas import read_csv\nfrom sklearn.model_selection import LeaveOneOut\nfrom sklearn.model_selection import cross_val_score\nfrom sklearn.ensemble import RandomForestClassifier","metadata":{"id":"4wZI6qHSWbo9","cell_id":"1d3e3504840e4ea99887804831dc9c6d","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":1743,"user_tz":240,"timestamp":1652612015824},"deepnote_cell_type":"code"},"outputs":[],"execution_count":1},{"cell_type":"markdown","source":"**Data description**\nhttps://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.names\n\n**Data link**\nhttps://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv\n\nThe file \"sonar.mines\" contains 111 patterns obtained by bouncing sonar signals off a metal cylinder at various angles and under various conditions. The file \"sonar.rocks\" contains 97 patterns obtained from rocks under similar conditions. The transmitted sonar signal is a frequency-modulated chirp, rising in frequency. The dataset contains signals obtained from a variety of aspect angles, spanning 90 degrees for the cylinder and 180 degrees for the rock.\n\nEach pattern is a set of 60 numbers in the range 0.0 to 1.0. Each number represents the energy within a particular frequency band, integrated over a certain period of time. The integration aperture for higher frequencies occurs later in time, since these frequencies are transmitted later during the chirp.\n\nThe label associated with each record contains the letter \"R\" if the object is a rock and \"M\" if it is a mine (metal cylinder). The numbers in the labels are in increasing order of aspect angle, but they do not encode the angle directly.","metadata":{"id":"c-uqzdzEWsyS","cell_id":"ae71068b25c34b96802ed0dc5b542619","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# data\nurl = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'\ndataframe = read_csv(url, header=None)\ndata = dataframe.values\ndataframe.head()","metadata":{"id":"HJveRtrNWfyH","colab":{"height":299,"base_uri":"https://localhost:8080/"},"cell_id":"c5d3baa572e94016a1a4e5db345209dd","outputId":"f377da84-82a6-467a-a8c3-6bbcddaab2a7","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":647,"user_tz":240,"timestamp":1652612028450},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":" 0 1 2 3 4 5 6 7 8 \\\n0 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109 \n1 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337 \n2 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598 \n3 0.0100 0.0171 0.0623 0.0205 0.0205 0.0368 0.1098 0.1276 0.0598 \n4 0.0762 0.0666 0.0481 0.0394 0.0590 0.0649 0.1209 0.2467 0.3564 \n\n 9 ... 51 52 53 54 55 56 57 \\\n0 0.2111 ... 0.0027 0.0065 0.0159 0.0072 0.0167 0.0180 0.0084 \n1 0.2872 ... 0.0084 0.0089 0.0048 0.0094 0.0191 0.0140 0.0049 \n2 0.6194 ... 0.0232 0.0166 0.0095 0.0180 0.0244 0.0316 0.0164 \n3 0.1264 ... 0.0121 0.0036 0.0150 0.0085 0.0073 0.0050 0.0044 \n4 0.4459 ... 
0.0031 0.0054 0.0105 0.0110 0.0015 0.0072 0.0048 \n\n 58 59 60 \n0 0.0090 0.0032 R \n1 0.0052 0.0044 R \n2 0.0095 0.0078 R \n3 0.0040 0.0117 R \n4 0.0107 0.0094 R \n\n[5 rows x 61 columns]","text/html":"\n
\n "},"metadata":{},"execution_count":2}],"execution_count":2},{"cell_type":"code","source":"dataframe.shape","metadata":{"id":"_3qcVkZ4XLdT","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"3b3cb9c618bc4cedb3a588f3ce5fc007","outputId":"a59ad652-14a7-4955-bf36-f3a50c584dac","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":251,"user_tz":240,"timestamp":1652612194045},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"(208, 61)"},"metadata":{},"execution_count":3}],"execution_count":3},{"cell_type":"code","source":"dataframe.isnull().sum()","metadata":{"id":"Aoc0O4lXXUui","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"a1b4a4f45fa2487fad3f79c3e1036599","outputId":"95698ff9-0c6d-4715-ab1e-7339f627fa68","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":11,"user_tz":240,"timestamp":1652612230772},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"0 0\n1 0\n2 0\n3 0\n4 0\n ..\n56 0\n57 0\n58 0\n59 0\n60 0\nLength: 61, dtype: int64"},"metadata":{},"execution_count":4}],"execution_count":4},{"cell_type":"code","source":"sum(dataframe.isna().sum())\n# total is 0: there are no missing values","metadata":{"id":"X3s35hdfXZTe","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"8e0e81dd79914498808ba9e5df5f80d9","outputId":"705327ea-19d6-4dbd-e8b5-741106a9712c","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":272,"user_tz":240,"timestamp":1652612259976},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"0"},"metadata":{},"execution_count":7}],"execution_count":7},{"cell_type":"code","source":"# split into features X (first 60 columns) and labels y (last column)\nX, y = data[:, :-1], data[:, -1]\nprint(X.shape, 
y.shape)","metadata":{"id":"rqoMoCdqWnU2","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"c8b0e5591a224666aeed9a4bf4d91127","outputId":"25073d4e-af56-4f34-9be6-546cbd925778","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":5,"user_tz":240,"timestamp":1652612280842},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(208, 60) (208,)\n"}],"execution_count":8},{"cell_type":"code","source":"# create the LOOCV procedure\ncv = LeaveOneOut()","metadata":{"id":"mWhTZ2BxXjdZ","cell_id":"46a1876303094ba7a1506831ddbbd92b","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":4,"user_tz":240,"timestamp":1652612298351},"deepnote_cell_type":"code"},"outputs":[],"execution_count":9},{"cell_type":"code","source":"LeaveOneOut?","metadata":{"id":"IgKmy3L6XnuX","cell_id":"f8b1300a87e8467e8122ad37c4e7b562","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":238,"user_tz":240,"timestamp":1652612306973},"deepnote_cell_type":"code"},"outputs":[],"execution_count":10},{"cell_type":"code","source":"# create the model\nmodel = RandomForestClassifier(random_state=1, n_estimators=30, criterion=\"gini\", max_depth=4)","metadata":{"id":"humzHVosXqW4","cell_id":"0fcc52685cf9415c824563d1184d9902","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":265,"user_tz":240,"timestamp":1652612359130},"deepnote_cell_type":"code"},"outputs":[],"execution_count":11},{"cell_type":"code","source":"cross_val_score?","metadata":{"id":"RB3FWnzJYDuM","cell_id":"072a5d38343c45d0beaddf258eb04e28","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos 
Usta"},"status":"ok","elapsed":3,"user_tz":240,"timestamp":1652612422663},"deepnote_cell_type":"code"},"outputs":[],"execution_count":14},{"cell_type":"code","source":"# evaluate the model: one fit per held-out sample (208 fits)\nscores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, verbose=1)","metadata":{"id":"o08GqjcxX2tQ","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"ee56be323948406a81deba7ae69d4cf5","outputId":"ef854deb-f95b-4278-dec2-65ca32e144de","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":14760,"user_tz":240,"timestamp":1652612463635},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stderr","text":"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n[Parallel(n_jobs=1)]: Done 208 out of 208 | elapsed: 14.7s finished\n"}],"execution_count":15},{"cell_type":"code","source":"# report performance: mean and standard deviation of the per-sample scores\nprint('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))","metadata":{"id":"Eb09HM8hX5MI","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"45046127248f4d5e9279241a5e637d00","outputId":"272b9eb0-9499-4ba8-ad74-b02ca37c7cb2","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":310,"user_tz":240,"timestamp":1652612478473},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"Accuracy: 0.808 (0.394)\n"}],"execution_count":16},{"cell_type":"markdown","source":"The model is evaluated with LOOCV; its estimated accuracy when predicting on new data is about 81% (0.808). Each LOOCV fold scores a single sample as 0 or 1, so the reported standard deviation (0.394) simply reflects that binary spread.","metadata":{"id":"t9ZWakCVYaXQ","cell_id":"22ed09541a7b451fb00942554bc3d801","deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## LOOCV for regression","metadata":{"id":"-uLI5tirYfwr","cell_id":"dcda38d3cdb04b399d458ca76127414b","deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"**Data description**\nhttps://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.names\n\n**Data link**\nhttps://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv\n\n1. CRIM: per capita crime rate by town\n2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.\n3. INDUS: proportion of non-retail business acres per town\n4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)\n5. NOX: nitric oxides concentration (parts per 10 million)\n6. RM: average number of rooms per dwelling\n7. AGE: proportion of owner-occupied units built prior to 1940\n8. DIS: weighted distances to five Boston employment centres\n9. RAD: index of accessibility to radial highways\n10. TAX: full-value property-tax rate per 10,000 dollars\n11. PTRATIO: pupil-teacher ratio by town\n12. B: 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town\n13. LSTAT: % lower status of the population\n14. MEDV: median value of owner-occupied homes in $1000's (target)","metadata":{"id":"_mKNx3NeYutA","cell_id":"18cd065b9ea2457db66fb6589e013ea2","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# libraries\nfrom pandas import read_csv\n# load the data\nurl = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'\ndataframe = read_csv(url, header=None)\n# shape\nprint(dataframe.shape)","metadata":{"id":"u_0oIqz5ZVZb","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"029b03a27c4a40c1967f7251cfc2ba75","outputId":"c5c46976-4ada-47a6-dc2f-e7857d375ae8","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":313,"user_tz":240,"timestamp":1652613616158},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(506, 
14)\n"}],"execution_count":42},{"cell_type":"code","source":"dataframe","metadata":{"id":"LE0y7f4PZdHk","colab":{"height":423,"base_uri":"https://localhost:8080/"},"cell_id":"195c449c13344ff6bc18c61af80fe3ca","outputId":"d7038669-d259-4fac-e65f-75bcb6fb7eb9","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":6,"user_tz":240,"timestamp":1652613617594},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":" 0 1 2 3 4 5 6 7 8 9 10 \\\n0 0.00632 18.0 2.31 0 0.538 6.575 65.2 4.0900 1 296.0 15.3 \n1 0.02731 0.0 7.07 0 0.469 6.421 78.9 4.9671 2 242.0 17.8 \n2 0.02729 0.0 7.07 0 0.469 7.185 61.1 4.9671 2 242.0 17.8 \n3 0.03237 0.0 2.18 0 0.458 6.998 45.8 6.0622 3 222.0 18.7 \n4 0.06905 0.0 2.18 0 0.458 7.147 54.2 6.0622 3 222.0 18.7 \n.. ... ... ... .. ... ... ... ... .. ... ... \n501 0.06263 0.0 11.93 0 0.573 6.593 69.1 2.4786 1 273.0 21.0 \n502 0.04527 0.0 11.93 0 0.573 6.120 76.7 2.2875 1 273.0 21.0 \n503 0.06076 0.0 11.93 0 0.573 6.976 91.0 2.1675 1 273.0 21.0 \n504 0.10959 0.0 11.93 0 0.573 6.794 89.3 2.3889 1 273.0 21.0 \n505 0.04741 0.0 11.93 0 0.573 6.030 80.8 2.5050 1 273.0 21.0 \n\n 11 12 13 \n0 396.90 4.98 24.0 \n1 396.90 9.14 21.6 \n2 392.83 4.03 34.7 \n3 394.63 2.94 33.4 \n4 396.90 5.33 36.2 \n.. ... ... ... \n501 391.99 9.67 22.4 \n502 396.90 9.08 20.6 \n503 396.90 5.64 23.9 \n504 393.45 6.48 22.0 \n505 396.90 7.88 11.9 \n\n[506 rows x 14 columns]","text/html":"\n
\n "},"metadata":{},"execution_count":43}],"execution_count":43},{"cell_type":"code","source":"dataframe.dtypes","metadata":{"id":"BAOnINNWbxhV","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"fbbb84e944b0493eadf8a08b59f87b6b","outputId":"c2d0f5ee-2ebc-4abb-8be6-54c75e7057ba","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":311,"user_tz":240,"timestamp":1652613620251},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"0 float64\n1 float64\n2 float64\n3 int64\n4 float64\n5 float64\n6 float64\n7 float64\n8 int64\n9 float64\n10 float64\n11 float64\n12 float64\n13 float64\ndtype: object"},"metadata":{},"execution_count":44}],"execution_count":44},{"cell_type":"code","source":"dataframe.isnull().sum()","metadata":{"id":"0jTYRZU-Zd6J","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"f215bd2565474479b2258fe3e94edd15","outputId":"5ba70f31-adcb-44c8-9348-8454f5699830","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":5,"user_tz":240,"timestamp":1652613621328},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"0 0\n1 0\n2 0\n3 0\n4 0\n5 0\n6 0\n7 0\n8 0\n9 0\n10 0\n11 0\n12 0\n13 0\ndtype: int64"},"metadata":{},"execution_count":45}],"execution_count":45},{"cell_type":"code","source":"# split into features X (first 13 columns) and target y (last column, MEDV)\ndata = dataframe.values\nX, y = data[:, :-1], data[:, -1]\nprint(X.shape, y.shape)","metadata":{"id":"COZGHu1YZf9Y","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"7188643088034c3493c43dfd41f6a5c3","outputId":"9742799e-9daf-426f-ca1a-0f90330a1b1c","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":7,"user_tz":240,"timestamp":1652613650258},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(506, 13) 
(506,)\n"}],"execution_count":47},{"cell_type":"code","source":"# create the LOOCV procedure\ncv = LeaveOneOut()","metadata":{"id":"zYna9Ra0ZN21","cell_id":"855fbc41f3f84c1488be171d65dc3041","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":306,"user_tz":240,"timestamp":1652613653325},"deepnote_cell_type":"code"},"outputs":[],"execution_count":48},{"cell_type":"code","source":"# create the model\nfrom sklearn.ensemble import RandomForestRegressor\nmodel = RandomForestRegressor(random_state=42, n_estimators=10, max_depth=4)","metadata":{"id":"LNl2xSzrZrlv","cell_id":"01c07148241b46b38ceba057c5646acf","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":285,"user_tz":240,"timestamp":1652613655862},"deepnote_cell_type":"code"},"outputs":[],"execution_count":49},{"cell_type":"code","source":"cross_val_score?","metadata":{"id":"aJ3b-1qfaGjI","cell_id":"a130f068815f4ca1afe97f5dd7d9be18","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":309,"user_tz":240,"timestamp":1652612956280},"deepnote_cell_type":"code"},"outputs":[],"execution_count":22},{"cell_type":"markdown","source":"**Available scoring methods**\n\nhttps://scikit-learn.org/stable/modules/model_evaluation.html","metadata":{"id":"WVWHVmiPaT_8","cell_id":"6db27c7ec72842dda3332b238c9ae507","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# evaluate the model (comparison criterion: MAE)\nfrom sklearn.metrics import mean_squared_error, make_scorer, mean_absolute_error\nMAE = make_scorer(mean_absolute_error)\nscores = cross_val_score(model, X, y, scoring=MAE, cv=cv, error_score='raise', verbose=1)\n# make_scorer defaults to greater_is_better=True, so these scores are already positive; abs() is only a safeguard\nscores = abs(scores)\n# report performance\nprint('MAE: %.3f (%.3f)' % (mean(scores), 
std(scores)))","metadata":{"id":"ALnUaNI0Z8Bb","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"185dae9f679d434b816d3b41daf91e7e","outputId":"c28a5f8c-30c5-4acf-f543-5495771ab8ba","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":11925,"user_tz":240,"timestamp":1652613833976},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stderr","text":"[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.\n"},{"output_type":"stream","name":"stdout","text":"MAE: 2.650 (2.727)\n"},{"output_type":"stream","name":"stderr","text":"[Parallel(n_jobs=1)]: Done 506 out of 506 | elapsed: 11.9s finished\n"}],"execution_count":54},{"cell_type":"markdown","source":"The model is evaluated with LOOCV; its estimated MAE when predicting on new data is about 2.650 (thousands of dollars).","metadata":{"id":"PUA4D5YYdUIj","cell_id":"676b248f2ac547f696b47d64159bce17","deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"# Simple validation (hold-out)","metadata":{"id":"PTEB59Ixdf8l","cell_id":"fb3ac5a0191c48f7be231e5ada69e9b7","deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Classification","metadata":{"id":"8hAF5MISdlOT","cell_id":"f9ead9c782984cacb38639c84e8ab932","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# data\nfrom sklearn.model_selection import train_test_split\nurl = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'\ndataframe = read_csv(url, header=None)\ndataframe.head()","metadata":{"id":"PRnQm-qFdpni","colab":{"height":299,"base_uri":"https://localhost:8080/"},"cell_id":"07873f7bc8cc493b8c81e75a3f40f929","outputId":"abc11f75-aad5-4087-9b45-7932a5654ee6","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos 
Usta"},"status":"ok","elapsed":472,"user_tz":240,"timestamp":1652613936163},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":" 0 1 2 3 4 5 6 7 8 \\\n0 0.0200 0.0371 0.0428 0.0207 0.0954 0.0986 0.1539 0.1601 0.3109 \n1 0.0453 0.0523 0.0843 0.0689 0.1183 0.2583 0.2156 0.3481 0.3337 \n2 0.0262 0.0582 0.1099 0.1083 0.0974 0.2280 0.2431 0.3771 0.5598 \n3 0.0100 0.0171 0.0623 0.0205 0.0205 0.0368 0.1098 0.1276 0.0598 \n4 0.0762 0.0666 0.0481 0.0394 0.0590 0.0649 0.1209 0.2467 0.3564 \n\n 9 ... 51 52 53 54 55 56 57 \\\n0 0.2111 ... 0.0027 0.0065 0.0159 0.0072 0.0167 0.0180 0.0084 \n1 0.2872 ... 0.0084 0.0089 0.0048 0.0094 0.0191 0.0140 0.0049 \n2 0.6194 ... 0.0232 0.0166 0.0095 0.0180 0.0244 0.0316 0.0164 \n3 0.1264 ... 0.0121 0.0036 0.0150 0.0085 0.0073 0.0050 0.0044 \n4 0.4459 ... 0.0031 0.0054 0.0105 0.0110 0.0015 0.0072 0.0048 \n\n 58 59 60 \n0 0.0090 0.0032 R \n1 0.0052 0.0044 R \n2 0.0095 0.0078 R \n3 0.0040 0.0117 R \n4 0.0107 0.0094 R \n\n[5 rows x 61 columns]","text/html":"\n
\n "},"metadata":{},"execution_count":55}],"execution_count":55},{"cell_type":"code","source":"train_test_split?","metadata":{"id":"QuhlzClXePEw","cell_id":"9d287aee66a5482bb4d88a1057d6712d","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":332,"user_tz":240,"timestamp":1652614040837},"deepnote_cell_type":"code"},"outputs":[],"execution_count":58},{"cell_type":"code","source":"data = dataframe.values\nX, y = data[:, :-1], data[:, -1]\n# hold out 20% of the rows for testing (no random_state is set, so the split changes on every run)\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=True)\nprint(X_train.shape, X_test.shape)","metadata":{"id":"ZyjM4djWd3jb","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"2db42b6074944382b6013849341c8267","outputId":"3adccef3-ee02-4876-f06c-7be3d4634ef7","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":306,"user_tz":240,"timestamp":1652614068872},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(166, 60) (42, 60)\n"}],"execution_count":59},{"cell_type":"code","source":"# model\nmodel = RandomForestClassifier(random_state=1, n_estimators=30, criterion=\"gini\", max_depth=4)\n# fit on the training split only\nmodel.fit(X_train, y_train)","metadata":{"id":"SJ16XBJweYR6","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"43d70ce9572b4c8eb2a096717c8839d0","outputId":"a8e2ce09-634f-4bb2-f452-ea2d1cc3b639","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":290,"user_tz":240,"timestamp":1652614152313},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"RandomForestClassifier(max_depth=4, n_estimators=30, random_state=1)"},"metadata":{},"execution_count":62}],"execution_count":62},{"cell_type":"code","source":"predicciones= 
model.predict(X_test)\npredicciones[0:5]","metadata":{"id":"0NrUxtqXei9j","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"eabf9d993e864b1dbf430740b3d47a4c","outputId":"530e8094-0a7d-46bb-84ad-bf6f8ebd34de","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":7,"user_tz":240,"timestamp":1652614155140},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"array(['M', 'M', 'M', 'M', 'M'], dtype=object)"},"metadata":{},"execution_count":63}],"execution_count":63},{"cell_type":"code","source":"# simple validation: score the predictions on the held-out test split\nfrom sklearn.metrics import classification_report\nprint(classification_report(y_true= y_test, y_pred= predicciones))","metadata":{"id":"Cgjvdp-Qed4y","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"db95d0f903e843fdbabc0027ab941ae1","outputId":"4976d1f7-76c6-420f-b36e-c1b2c8fc7b38","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":308,"user_tz":240,"timestamp":1652614178747},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":" precision recall f1-score support\n\n M 0.74 0.92 0.82 25\n R 0.82 0.53 0.64 17\n\n accuracy 0.76 42\n macro avg 0.78 0.72 0.73 42\nweighted avg 0.77 0.76 0.75 42\n\n"}],"execution_count":64},{"cell_type":"markdown","source":"## Regression","metadata":{"id":"ip48iwonezys","cell_id":"796da098020e4f41b4899b374ace335f","deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"# libraries\nfrom pandas import read_csv\n# load the data\nurl = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'\ndataframe = read_csv(url, header=None)\n# 
shape\nprint(dataframe.shape)","metadata":{"id":"MW-OOp-Pe1Bl","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"53d6d0d94e44481da1f4b18df0deb2f2","outputId":"4e1f9206-8e73-4817-9e3a-c94c12edf578","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":696,"user_tz":240,"timestamp":1652614208523},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(506, 14)\n"}],"execution_count":65},{"cell_type":"code","source":"data = dataframe.values\nX, y = data[:, :-1], data[:, -1]\nX_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=True)\nprint(X_train.shape, X_test.shape)","metadata":{"id":"qrdgNeXge5tj","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"dca0f639dc5e416f92a4adb4f092bb46","outputId":"f5f7825f-e779-48ef-ab81-b71eae6ad0ef","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":271,"user_tz":240,"timestamp":1652614218441},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"(404, 13) (102, 13)\n"}],"execution_count":66},{"cell_type":"code","source":"# model\nmodel = RandomForestRegressor(random_state=42, n_estimators=10, max_depth=4)\n# fit on the training split only\nmodel.fit(X_train, y_train)","metadata":{"id":"wYugQC8xe_cu","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"771e71c1128b45969215be21c01284c6","outputId":"e6000d04-49e0-40e9-b6d3-6b515b93c70d","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":307,"user_tz":240,"timestamp":1652614262922},"deepnote_cell_type":"code"},"outputs":[{"output_type":"execute_result","data":{"text/plain":"RandomForestRegressor(max_depth=4, n_estimators=10, random_state=42)"},"metadata":{},"execution_count":68}],"execution_count":68},{"cell_type":"code","source":"predicciones= 
model.predict(X_test)","metadata":{"id":"NqYWOO_mfPOh","cell_id":"1084fba5fca64f32b16ba395ae1fd3e6","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":406,"user_tz":240,"timestamp":1652614304703},"deepnote_cell_type":"code"},"outputs":[],"execution_count":70},{"cell_type":"code","source":"# simple validation: score the predictions on the held-out test split\nfrom sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score\nprint('MSE: ',mean_squared_error(y_true= y_test, y_pred= predicciones))\nprint('MAE: ',mean_absolute_error(y_true= y_test, y_pred= predicciones))\nprint('R2: ',r2_score(y_true= y_test, y_pred= predicciones))","metadata":{"id":"9rV9aCqafHUf","colab":{"base_uri":"https://localhost:8080/"},"cell_id":"de3dd30549e54e719663074dd9afdd78","outputId":"72229ecc-a1c7-45f5-87b2-acfe6e3a7576","executionInfo":{"user":{"userId":"09471607480253994520","displayName":"David Francisco Bustos Usta"},"status":"ok","elapsed":315,"user_tz":240,"timestamp":1652614380280},"deepnote_cell_type":"code"},"outputs":[{"output_type":"stream","name":"stdout","text":"MSE: 13.256900736696371\nMAE: 2.74655558241846\nR2: 0.810238703397846\n"}],"execution_count":73}],"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Ejemplo 1 - LOOCV y Validacion simple.ipynb","provenance":[],"authorship_tag":"ABX9TyO2aj2QNL/CW0qyF0G0GlOQ","collapsed_sections":[]},"deepnote":{},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"},"deepnote_notebook_id":"734c9c5141ef4980bc97ebbad5ae5d65","deepnote_execution_queue":[]}}