{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "1. K-Nearest Neighbors (K-NN)\n", "==\n", "The k-nearest neighbors (k-NN) algorithm is a supervised machine learning method used to predict categories. `sklearn.neighbors` provides functionality for both supervised and unsupervised neighbor-based learning methods. Supervised neighbor-based learning comes in two types: \n", "\n", ">Classification, for data with discrete labels\n", "\n", ">Regression, for data with continuous labels.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Step 1\n", "==\n", "First we import the libraries we will use:\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from matplotlib.colors import ListedColormap\n", "from sklearn import neighbors, datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Step 2\n", "==\n", "From `datasets` we load the Iris dataset with `load_iris()` and set the number of nearest neighbors to 15.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "iris = datasets.load_iris()\n", "n_neighbors = 15" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Iris dataset object is split into: \n", " \n", ">`data`, which holds all the features.\n", "\n", ">`target`, which holds the class associated with each sample. \n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[5.1 3.5]\n", " [4.9 3. ]\n", " [4.7 3.2]\n", " [4.6 3.1]\n", " [5. 3.6]\n", " [5.4 3.9]\n", " [4.6 3.4]\n", " [5. 3.4]\n", " [4.4 2.9]\n", " [4.9 3.1]\n", " [5.4 3.7]\n", " [4.8 3.4]\n", " [4.8 3. ]\n", " [4.3 3. ]\n", " [5.8 4. 
]\n", " [5.7 4.4]\n", " [5.4 3.9]\n", " [5.1 3.5]\n", " [5.7 3.8]\n", " [5.1 3.8]\n", " [5.4 3.4]\n", " [5.1 3.7]\n", " [4.6 3.6]\n", " [5.1 3.3]\n", " [4.8 3.4]\n", " [5. 3. ]\n", " [5. 3.4]\n", " [5.2 3.5]\n", " [5.2 3.4]\n", " [4.7 3.2]\n", " [4.8 3.1]\n", " [5.4 3.4]\n", " [5.2 4.1]\n", " [5.5 4.2]\n", " [4.9 3.1]\n", " [5. 3.2]\n", " [5.5 3.5]\n", " [4.9 3.6]\n", " [4.4 3. ]\n", " [5.1 3.4]\n", " [5. 3.5]\n", " [4.5 2.3]\n", " [4.4 3.2]\n", " [5. 3.5]\n", " [5.1 3.8]\n", " [4.8 3. ]\n", " [5.1 3.8]\n", " [4.6 3.2]\n", " [5.3 3.7]\n", " [5. 3.3]\n", " [7. 3.2]\n", " [6.4 3.2]\n", " [6.9 3.1]\n", " [5.5 2.3]\n", " [6.5 2.8]\n", " [5.7 2.8]\n", " [6.3 3.3]\n", " [4.9 2.4]\n", " [6.6 2.9]\n", " [5.2 2.7]\n", " [5. 2. ]\n", " [5.9 3. ]\n", " [6. 2.2]\n", " [6.1 2.9]\n", " [5.6 2.9]\n", " [6.7 3.1]\n", " [5.6 3. ]\n", " [5.8 2.7]\n", " [6.2 2.2]\n", " [5.6 2.5]\n", " [5.9 3.2]\n", " [6.1 2.8]\n", " [6.3 2.5]\n", " [6.1 2.8]\n", " [6.4 2.9]\n", " [6.6 3. ]\n", " [6.8 2.8]\n", " [6.7 3. ]\n", " [6. 2.9]\n", " [5.7 2.6]\n", " [5.5 2.4]\n", " [5.5 2.4]\n", " [5.8 2.7]\n", " [6. 2.7]\n", " [5.4 3. ]\n", " [6. 3.4]\n", " [6.7 3.1]\n", " [6.3 2.3]\n", " [5.6 3. ]\n", " [5.5 2.5]\n", " [5.5 2.6]\n", " [6.1 3. ]\n", " [5.8 2.6]\n", " [5. 2.3]\n", " [5.6 2.7]\n", " [5.7 3. ]\n", " [5.7 2.9]\n", " [6.2 2.9]\n", " [5.1 2.5]\n", " [5.7 2.8]\n", " [6.3 3.3]\n", " [5.8 2.7]\n", " [7.1 3. ]\n", " [6.3 2.9]\n", " [6.5 3. ]\n", " [7.6 3. ]\n", " [4.9 2.5]\n", " [7.3 2.9]\n", " [6.7 2.5]\n", " [7.2 3.6]\n", " [6.5 3.2]\n", " [6.4 2.7]\n", " [6.8 3. ]\n", " [5.7 2.5]\n", " [5.8 2.8]\n", " [6.4 3.2]\n", " [6.5 3. ]\n", " [7.7 3.8]\n", " [7.7 2.6]\n", " [6. 2.2]\n", " [6.9 3.2]\n", " [5.6 2.8]\n", " [7.7 2.8]\n", " [6.3 2.7]\n", " [6.7 3.3]\n", " [7.2 3.2]\n", " [6.2 2.8]\n", " [6.1 3. ]\n", " [6.4 2.8]\n", " [7.2 3. ]\n", " [7.4 2.8]\n", " [7.9 3.8]\n", " [6.4 2.8]\n", " [6.3 2.8]\n", " [6.1 2.6]\n", " [7.7 3. ]\n", " [6.3 3.4]\n", " [6.4 3.1]\n", " [6. 3. 
]\n", " [6.9 3.1]\n", " [6.7 3.1]\n", " [6.9 3.1]\n", " [5.8 2.7]\n", " [6.8 3.2]\n", " [6.7 3.3]\n", " [6.7 3. ]\n", " [6.3 2.5]\n", " [6.5 3. ]\n", " [6.2 3.4]\n", " [5.9 3. ]]\n", "[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0\n", " 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1\n", " 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2\n", " 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2\n", " 2 2]\n" ] } ], "source": [ "X = iris.data[:,:2] # use only the first 2 features, to keep the example simple.\n", "print(X)\n", "y = iris.target\n", "print(y)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Step 4\n", "==\n", "To inspect the data visually, we build color maps with ListedColormap from lists of colors, and we use a mesh grid with a step size of 0.02.\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "h = .02 \n", "cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])\n", "cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Step 5\n", "==\n", "Now we run two classifications: one with uniform weights and one with weights given by the inverse of the distance.\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 0 2]\n", "[2 0 1]\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# n_neighbors = 5\n", "for weights in ['uniform', 'distance']:\n", " # Creamos una instancia del clasificador de vecinos más cercanos y le pasamos los datos mediante fit().\n", " # El primer párametro de KNeighborsClassifier es con cuantos vecinos quiero clasificar y el segundo el tipo de peso a utilizar.\n", " clf = neighbors.KNeighborsClassifier(n_neighbors, weights=weights)\n", " clf.fit(X, y)\n", "\n", " # Establecemos los límites del gráfico y asignamos un color a cada punto de malla.\n", " x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n", " y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n", " xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n", " Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])\n", "\n", " # Agregamos el resultado al gráfico\n", " Z = Z.reshape(xx.shape)\n", " plt.figure()\n", " plt.pcolormesh(xx, yy, Z, shading='auto', cmap=cmap_light)\n", "\n", " # Ploteo los datos de entrenamiento\n", " plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)\n", " plt.xlim(xx.min(), xx.max())\n", " plt.ylim(yy.min(), yy.max())\n", " plt.title(\"3-Clasificación (k = %i, weights = '%s')\" % (n_neighbors, weights))\n", " \n", " #Ploteo un nuevo dato \n", " Xn = np.array([[7.3,3], [5.1,2.9], [6.4,3.2]])\n", " Yn = clf.predict(Xn)\n", " print(Yn)\n", "plt.show()\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2.- Arboles de decisiones\n", "==\n", "Un árbol de decisiones se asemeja a las raíces de un árbol, en donde partimos de un conjunto de datos con determinadas características, que llamaremos raíz principal y que iremos descomponiendo por atributos, en ramas a partir de una determinada clasificación. Cada descomposición lleva asociada una condición que puede resultar verdadera o falsa y que se encuentra relacionada a una caracterización específica. 
\n", "For example, we could have the attribute “vehicle type” with values:\n", " \n", ">Pickup trucks \n", "\n", ">Cars\n", "\n", "And the attribute “drive type”, with values:\n", " \n", ">Four-wheel drive\n", "\n", ">Two-wheel drive\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on these attributes we could build a tree whose first split is on “vehicle type” and the second on “drive type”, or the other way around. The split is chosen by an algorithm that optimizes how the partition is made, based on a probabilistic analysis.\n", "The deeper the tree, the more complex the decision rules and the more closely the model fits the training data.\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.datasets import load_iris\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.tree import export_graphviz\n", "import graphviz\n", "import matplotlib.pyplot as plt\n", "from matplotlib.colors import ListedColormap" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "iris=load_iris()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "X = iris.data[:,:2] # use only the first 2 features, to keep the example simple.\n", "Y = iris.target" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "X_entrenamiento, X_test, y_entrenamiento, y_test=train_test_split(X, Y)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "arbol=DecisionTreeClassifier(max_depth=3)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DecisionTreeClassifier(max_depth=3)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "arbol.fit(X_entrenamiento, y_entrenamiento)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.6578947368421053" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arbol.score(X_test, y_test)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8571428571428571" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arbol.score(X_entrenamiento, y_entrenamiento)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# import os\n", "# Usuarios de Windows:\n", "# os.environ[\"PATH\"] += os.pathsep + 'C:\\Program Files (x86)\\Graphviz2.38\\bin'\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "export_graphviz(arbol, out_file='arbol1.dot', class_names=iris.target_names, feature_names=iris.feature_names[:2], impurity=False, filled=True)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", "\n", "\n", "\n", "Tree\n", "\n", "\n", "\n", "0\n", "\n", "sepal length (cm) <= 5.45\n", "samples = 112\n", "value = [42, 37, 33]\n", "class = setosa\n", "\n", "\n", "\n", "1\n", "\n", "sepal width (cm) <= 2.7\n", "samples = 41\n", "value = [37, 3, 1]\n", "class = setosa\n", "\n", "\n", "\n", "0->1\n", "\n", "\n", "True\n", "\n", "\n", "\n", "6\n", "\n", "sepal length (cm) <= 6.25\n", "samples = 71\n", "value = [5, 34, 32]\n", "class = versicolor\n", "\n", "\n", "\n", "0->6\n", "\n", "\n", "False\n", "\n", "\n", "\n", "2\n", "\n", "sepal width (cm) <= 2.45\n", "samples = 4\n", "value = [0, 3, 1]\n", "class = versicolor\n", "\n", "\n", "\n", "1->2\n", "\n", "\n", "\n", "\n", "\n", "5\n", "\n", "samples = 37\n", "value = [37, 0, 0]\n", "class = setosa\n", "\n", "\n", "\n", "1->5\n", "\n", "\n", "\n", "\n", "\n", 
"3\n", "\n", "samples = 2\n", "value = [0, 2, 0]\n", "class = versicolor\n", "\n", "\n", "\n", "2->3\n", "\n", "\n", "\n", "\n", "\n", "4\n", "\n", "samples = 2\n", "value = [0, 1, 1]\n", "class = versicolor\n", "\n", "\n", "\n", "2->4\n", "\n", "\n", "\n", "\n", "\n", "7\n", "\n", "sepal width (cm) <= 3.45\n", "samples = 34\n", "value = [5, 24, 5]\n", "class = versicolor\n", "\n", "\n", "\n", "6->7\n", "\n", "\n", "\n", "\n", "\n", "10\n", "\n", "sepal length (cm) <= 7.05\n", "samples = 37\n", "value = [0, 10, 27]\n", "class = virginica\n", "\n", "\n", "\n", "6->10\n", "\n", "\n", "\n", "\n", "\n", "8\n", "\n", "samples = 29\n", "value = [0, 24, 5]\n", "class = versicolor\n", "\n", "\n", "\n", "7->8\n", "\n", "\n", "\n", "\n", "\n", "9\n", "\n", "samples = 5\n", "value = [5, 0, 0]\n", "class = setosa\n", "\n", "\n", "\n", "7->9\n", "\n", "\n", "\n", "\n", "\n", "11\n", "\n", "samples = 31\n", "value = [0, 10, 21]\n", "class = virginica\n", "\n", "\n", "\n", "10->11\n", "\n", "\n", "\n", "\n", "\n", "12\n", "\n", "samples = 6\n", "value = [0, 0, 6]\n", "class = virginica\n", "\n", "\n", "\n", "10->12\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with open('arbol1.dot') as f:\n", " dot_graph=f.read()\n", "graphviz.Source(dot_graph)\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "from sklearn.tree import plot_tree" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[Text(153.45000000000002, 190.26, 'X[0] <= 5.45\\ngini = 0.663\\nsamples = 112\\nvalue = [42, 37, 33]'),\n", " Text(83.7, 135.9, 'X[1] <= 2.7\\ngini = 0.18\\nsamples = 41\\nvalue = [37, 3, 1]'),\n", " Text(55.800000000000004, 81.53999999999999, 'X[1] <= 2.45\\ngini = 0.375\\nsamples = 4\\nvalue = [0, 3, 1]'),\n", " Text(27.900000000000002, 27.180000000000007, 'gini = 0.0\\nsamples = 2\\nvalue = [0, 
2, 0]'),\n", " Text(83.7, 27.180000000000007, 'gini = 0.5\\nsamples = 2\\nvalue = [0, 1, 1]'),\n", " Text(111.60000000000001, 81.53999999999999, 'gini = 0.0\\nsamples = 37\\nvalue = [37, 0, 0]'),\n", " Text(223.20000000000002, 135.9, 'X[0] <= 6.25\\ngini = 0.563\\nsamples = 71\\nvalue = [5, 34, 32]'),\n", " Text(167.4, 81.53999999999999, 'X[1] <= 3.45\\ngini = 0.458\\nsamples = 34\\nvalue = [5, 24, 5]'),\n", " Text(139.5, 27.180000000000007, 'gini = 0.285\\nsamples = 29\\nvalue = [0, 24, 5]'),\n", " Text(195.3, 27.180000000000007, 'gini = 0.0\\nsamples = 5\\nvalue = [5, 0, 0]'),\n", " Text(279.0, 81.53999999999999, 'X[0] <= 7.05\\ngini = 0.394\\nsamples = 37\\nvalue = [0, 10, 27]'),\n", " Text(251.10000000000002, 27.180000000000007, 'gini = 0.437\\nsamples = 31\\nvalue = [0, 10, 21]'),\n", " Text(306.90000000000003, 27.180000000000007, 'gini = 0.0\\nsamples = 6\\nvalue = [0, 0, 6]')]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plot_tree(arbol)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAckAAAEGCAYAAAAOgW4QAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8/fFQqAAAACXBIWXMAAAsTAAALEwEAmpwYAAAXnUlEQVR4nO3de7RedX3n8feHgAS5BDvgGLAY5KIjN4GYKSAUpnSWgIqMWFoZFXW0aIuXLtoy1Vo7qETrmrKktSyljnZ5qzdcDii3ch2QhgRyAUtQBFSgogKRu4R8549nn+WTw9nn7EPynOc5yfu11lnn7P3sy+dsQj757b2f/aSqkCRJT7fFsANIkjSqLElJklpYkpIktbAkJUlqYUlKktRiy2EH0Ma100471YIFC4YdQ5JmlWXLlv28qnYeP9+S3MQsWLCApUuXDjuGJM0qSe6aaL6nWyVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJklpYkpIktfBhApuYVXevYcEZFw47hjaCOxcfN+wI0mbPkaQkSS0sSUmSWliSkiS1sCQlSWphSUqS1MKSlCSphSUpSVILS1KSpBaWpCRJLSxJSZJaWJKSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJklpYkpIktbAkJUlqYUlKktTCkpQkqYUlKUlSC0tSkqQWlqQkSS0sSUmSWliSkiS1sCQlSWphSUqS1MKSlCSphSUpSVILS1KSpBaWpCRJLSxJSZJaWJKSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFptMSSY5MskFz2C9XZJ8reW1K5MsbH7+i775C5Lc3HH770nyxunmmmA7f5zkzRu6HUlSd5tMST5TVXVPVZ3YYdG/mHqR9SXZEngL8MVpB3u6zwDv2gjbkSR1NGMlmWTbJBcmWZHk5iQnNfMPTnJVkmVJLk4yv5l/ZZKzk1zXLL+omb+omXdT8/1FU+z320n2b36+KckHmp/PTPI/+keFSbZJ8uUkK5P8M7BNM38xsE2S5Um+0Gx6TpJPJ7klySVJtplg9/8FuLGq1jbb2TPJZc0xuDHJHs0I+KokX0lyW5LFSU5OsiTJqiR7AFTVo8CdY8dBkjR4MzmSfAVwT1UdUFX7Ahcl2Qo4Bzixqg6mN1r6cN8621bVocA7m9cAbgWOqKoDgQ8AH5liv1cDhyfZAVgLHNbMfzlwzbhl3wE8WlX7NzkOBqiqM4DHquqlVXVys+xewN9X1T7Ag8BrJ9j3YcCyvukvNOscABwK3NvMPwB4N7Af8AZg76paBJwHnNa3/lLg8PE7SfL2JEuTLH3q0TWTHQtJ0jTMZEmuAo5O8tEkh1fVGuBFwL7ApUmWA+8Hnt+3zpcAqupqYIckOwLzgK82o7+/BfaZYr/XAEfQK8ULge2SPBtYUFWrxy17BPD5Zp8rgZWTbPeOqlre/LwMWDDBMvOBnwEk2R7YtarOb7b/eDM6BLihqu6tqieA24FLmvmrxm33PmCX8Tupqk9V1cKqWjjn2fMmiSxJmo4tZ2pHVXVbkoOBY4GzklwCnA/cUlWHtK02wfSZwBVVdUKSBcCVU+z6BmAh8EPgUmAn4G2sP8KbbJ9tnuj7+SmaU7PjPAbMbX5Ox22t65tex/r/jeY225QkzYCZvCa5C71TmZ8HPg4cBKwGdk5ySLPMVkn6R4Zj1y1fDqxpRp/zgLub10+Zar9V9Svgx8DvAdfTG1meztNPtULv1OzJzT73Bfbve+3J
5vTwdPwbsGeT45fAT5K8ptn+1s2Idjr2BjrdVStJ2nAzebp1P2BJc1r1fcCHmgI7EfhokhXAcnrX6sY8kOQ64Fzgrc28j9EbiV4LzOm472uAnzanN6+hd0p3opL8B3qnY1cCfwYs6XvtU8DKvht3uvgOvVO4Y94AvKvZ/nXA86axLehd47xsmutIkp6hVHU9uzizklwJnF5VS4edZUMkOR/4s6r6/gZu50DgT6rqDZMtt/X8vWr+m87ekF1pRNy5+LhhR5A2G0mWVdXC8fM3+/dJzoAz6N3As6F2Av5yI2xHktTRjN24M11VdeSwM2wMzR204++ifSbbuXQjxJEkTYMjSUmSWliSkiS1sCQlSWphSUqS1MKSlCSphSUpSVILS1KSpBaWpCRJLSxJSZJaWJKSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktOn/ocpLjgH2AuWPzqup/DSKUJEmjoNNIMsm5wEnAaUCA1wEvGGAuSZKGruvp1kOr6o3AA1X118AhwG8OLpYkScPXtSQfa74/mmQX4Elg98FEkiRpNHS9JnlBkh2BvwFuBAo4b1ChJEkaBZ1KsqrObH78epILgLlVtWZwsSRJGr6uN+78UTOSpKqeALZI8s5BBpMkadi6XpN8W1U9ODZRVQ8AbxtIIkmSRkTXktwiScYmkswBnjWYSJIkjYauN+5cDHyleb9kAacCFw0slSRJI6BrSf458IfAO+g9TOASvLtVkrSJ63p36zrgH5ovSZI2C5OWZJKvVNXvJVlF7zTreqpq/4ElkyRpyKYaSb67+f7KQQeRJGnUTHp3a1Xd2/z4zqq6q/8L8H2SkqRNWte3gPzuBPOO2ZhBJEkaNVNdk3wHvRHjHklW9r20PXDtIINJkjRsU12T/CLwHeAs4Iy++Q9V1f0DSyVJ0giY6prkmqq6E3g/8O/Ntcjdgf8+9ixXSZI2VV2vSX4deCrJnsA/0ivKLw4slSRJI6DrE3fWVdXaJP8NOLuqzkly0yCD6ZnZb9d5LF183LBjSNImoetI8skkfwC8EbigmbfVYCJJkjQaupbkm4FDgA9X1R1Jdgc+P7hYkiQNX+vp1iSnAquq6tqq+h7wrrHXquoOYPEM5JMkaWgmuyb5JeATzV2si/HZrZKkzUxrSVbVGuBNSZ6Lz26VJG2Gutzd+gvg4qo6etBhJEkaJVPeuFNVTwGPJpk3A3kkSRoZXd8n+TiwKsmlwCNjM6vqXe2rSJI0u3UtyQubL0mSNhudSrKqPpdkG2C3qlo94EySJI2ETg8TSPIqYDlwUTP90iTfGmAuSZKGrusTdz4ILAIeBKiq5fQeci5J0iara0mubd432e9pDxeQJGlT0vXGnZuTvB6Yk2Qveo+ou25wsSRJGr6uI8nTgH2AJ+h9juQa4N2DCiVJ0ijoOpI8rqreB7xvbEaS1wFfHUgqSZJGQNeR5P/sOE+SpE3GpCPJJMcAxwK7JvlE30s7AGsHGUySpGGb6nTrPcBS4NXAsr75DwHvHVQoSZJGwaQlWVUrgBVJzgceaR52TpI5wNYzkE+SpKHpek3yEmCbvultgMs2fhxJkkZH17tb51bVw2MTVfVwkmcPKJM2wKq717DgDJ9FL2nzcufi4way3a4jyUeSHDQ2keRg4LGBJJIkaUR0HUm+B/hqknua6fnASQNJJEnSiOj6UVk3JHkx8CIgwK1V9eRAk0mSNGRdR5LQK8iXAHOBA5NQVf80mFiSJA1fp5JM8lfAkfRK8tvAMcD/AyxJSdImq+uNOycCvwP8e1W9GTgA3ycpSdrEdS3Jx6pqHbA2yQ7AfcALBxdLkqTh63pNcmmSHYFP03s83cPAkkGFkiRpFExZkkkCnFVVDwLnJrkI2KGqVg46nCRJwzTl6daqKuCbfdN3WpCSpM1B12uS1yd52UCTSJI0YrpekzwK+MMkdwGP0HugQFXV/gNLJknSkHUtyWMGmkKSpBHU9bF0dwEkeS69
J+5IkrTJ63RNMsmrk3wfuAO4CrgT+M4Ac0mSNHRdb9w5E/gt4Laq2p3e03euHVgqSZJGQNeSfLKqfgFskWSLqroCeOngYkmSNHxdb9x5MMl2wNXAF5LcB6wdXCxJkoZv0pJMsifwH4HjgceA9wInAy8ATht4OkmShmiq061nAw9V1SNVta6q1lbV5+h9XNYHBx1OkqRhmqokF0z0CLqqWgosGEgiSZJGxFQlOdl7IrfZmEEkSRo1U5XkDUneNn5mkrfS+8gsSZI2WVPd3foe4PwkJ/PrUlwIPAs4YYC5JEkauklLsqp+Chya5Chg32b2hVV1+cCTSZI0ZF2f3XoFcMWAs0iSNFK6PnFHkqTNjiUpSVILS1KSpBaWpCRJLSxJSZJaWJKSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJklpYkpIktZiVJZnkyCQXdJ2/Efb3miQv6Zu+MsnCDuvN3xh5kuyc5KIN3Y4kaXpmZUkOwWuAl0y10AT+BPj0hu68qn4G3JvksA3dliSpu4GUZJJtk1yYZEWSm5Oc1Mw/OMlVSZYluTjJ/Gb+lUnOTnJds/yiZv6iZt5NzfcXTTPDZ5Lc0Kx/fDP/lCTfSHJRku8n+VjfOm9NcluT59NJ/i7JocCrgb9JsjzJHs3ir0uypFn+8JYYrwUuarY9J8nHk6xKsjLJac38O5N8JMl3kyxNclBzbG5Pcmrftr4JnNz195ckbbgtB7TdVwD3VNVxAEnmJdkKOAc4vqp+1hTnh4G3NOtsW1WHJjkC+AywL3ArcERVrU1yNPAResXTxfuAy6vqLUl2BJYkuax57aXAgcATwOok5wBPAX8JHAQ8BFwOrKiq65J8C7igqr7W/D4AW1bVoiTHAn8FHN2/8yS7Aw9U1RPNrLcDuwMHNr/Pb/Qt/uOqOiTJ3wKfBQ4D5gK3AOc2yywFPjTRL5rk7c32mbPDzh0PjyRpKoMqyVXAx5N8lF65XJNkX3rFd2lTMnOAe/vW+RJAVV2dZIem2LYHPpdkL6CAraaR4b8Cr05yejM9F9it+flfqmoNQJLvAS8AdgKuqqr7m/lfBfaeZPvfaL4vAxZM8Pp84Gd900cD51bV2ub3vL/vtW8131cB21XVQ8BDSR5PsmNVPQjcB+wyUZCq+hTwKYCt5+9Vk2SWJE3DQEqyqm5LcjBwLHBWkkuA84FbquqQttUmmD4TuKKqTkiyALhyGjECvLaqVq83M/nP9EaQY56idxwyjW3Tt42x9cd7jF4x9+dpK7Cxba0bl21d37bnNtuUJM2QQV2T3AV4tKo+D3yc3inM1cDOSQ5pltkqyT59q41dt3w5sKYZ6c0D7m5eP2WaMS4GTkszbE1y4BTLLwF+O8lzkmzJ+qd1H6I3qp2O21h/hHkJcGqzbcadbu1ib+Dmaa4jSdoAg7q7dT961wCX07s2+KGq+hVwIvDRJCuA5cChfes8kOQ6etfg3trM+xi9kei19E7PTseZ9E7PrkxyczPdqqrupnfN81+By4DvAWual78M/GlzA9AeLZsYv71HgNuT7NnMOg/4UZNnBfD6af4+RwEXTnMdSdIGSNXwL2EluRI4vaqWDjnHdlX1cDPaOx/4TFWdvwHbOwE4uKrevxGyXU3vpqcHJltu6/l71fw3nb2hu5OkWeXOxcdt0PpJllXV097/7vsk1/fBZvR7M3AHvbddPGNNwd65oaGS7Az876kKUpK0cQ3q7tZpqaojh50BoKpOn3qpaW/zvI2wjZ+xgYUtSZo+R5KSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJklpYkpIktbAkJUlqYUlKktTCkpQkqYUlKUlSC0tSkqQWlqQkSS0sSUmSWliSkiS1sCQlSWphSUqS1MKSlCSphSUpSVILS1KSpBaWpCRJLSxJSZJaWJKSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJ
klpYkpIktbAkJUlqYUlKktTCkpQkqcWWww6gjWu/XeexdPFxw44hSZsER5KSJLWwJCVJamFJSpLUwpKUJKmFJSlJUgtLUpKkFpakJEktLElJklpYkpIktUhVDTuDNqIkDwGrh52jo52Anw87xDTMpryzKSvMrrxmHZxh5n1BVe08fqaPpdv0rK6qhcMO0UWSpbMlK8yuvLMpK8yuvGYdnFHM6+lWSZJaWJKSJLWwJDc9nxp2gGmYTVlhduWdTVlhduU16+CMXF5v3JEkqYUjSUmSWliSkiS1sCRnqSSvSLI6yQ+SnDHB60nyieb1lUkOGkbOJstUWV+c5LtJnkhy+jAy9mWZKuvJzfFcmeS6JAcMI2dfnqnyHt9kXZ5kaZKXDyNnk2XSrH3LvSzJU0lOnMl8E+SY6tgemWRNc2yXJ/nAMHI2WaY8tk3e5UluSXLVTGccl2WqY/unfcf15ubPw28MIytV5dcs+wLmALcDLwSeBawAXjJumWOB7wABfgv41xHO+lzgZcCHgdNH/LgeCjyn+fmYYR3XaeTdjl/fe7A/cOuoZu1b7nLg28CJI35sjwQuGFbGaWbdEfgesFsz/dxRzjtu+VcBlw8rryPJ2WkR8IOq+mFV/Qr4MnD8uGWOB/6peq4Hdkwyf6aD0iFrVd1XVTcATw4hX78uWa+rqgeayeuB589wxn5d8j5czd80wLbAsO7U6/JnFuA04OvAfTMZbgJd846CLllfD3yjqn4Evf/nZjhjv+ke2z8AvjQjySZgSc5OuwI/7pv+STNvusvMhFHJ0cV0s76V3mh9WDrlTXJCkluBC4G3zFC28abMmmRX4ATg3BnM1abrn4VDkqxI8p0k+8xMtKfpknVv4DlJrkyyLMkbZyzd03X+/yzJs4FX0PuH01D4WLrZKRPMGz9C6LLMTBiVHF10zprkKHolObRrfHTMW1XnA+cnOQI4Ezh60MEm0CXr2cCfV9VTyUSLz6gueW+k97zPh5McC3wT2GvQwSbQJeuWwMHA7wDbAN9Ncn1V3TbocBOYzt8JrwKurar7B5hnUpbk7PQT4Df7pp8P3PMMlpkJo5Kji05Zk+wPnAccU1W/mKFsE5nWsa2qq5PskWSnqprph0h3yboQ+HJTkDsBxyZZW1XfnJGE65syb1X9su/nbyf55Agf258AP6+qR4BHklwNHAAMoySn8+f29xniqVbAG3dm4xe9f9z8ENidX1/43mfcMsex/o07S0Y1a9+yH2S4N+50Oa67AT8ADp0lfw725Nc37hwE3D02PWpZxy3/WYZ7406XY/u8vmO7CPjRqB5b4D8B/9Is+2zgZmDfUT22zXLzgPuBbYf156CqHEnORlW1NskfAxfTu1PsM1V1S5JTm9fPpXd34LH0/kJ/FHjzqGZN8jxgKbADsC7Je+jd7fbLtu0OKyvwAeA/AJ9sRjxra0ifWtAx72uBNyZ5EngMOKmav4FGMOvI6Jj3ROAdSdbSO7a/P6rHtqr+LclFwEpgHXBeVd0801m75m0WPQG4pHqj36HxsXSSJLXw7lZJklpYkpIktbAkJUlqYUlKktTCkpQkqYUlKW2Gkjw8w/tbkOT1A9juwiSf2Njblcb4FhBpM5Tk4arabob2tSW9x/edXlWvnIl9ShuLI0lpM9Z8xuBVSb6S5LYki5vPzFySZFWSPZrlPpvk3CTXNMu9spk/N8n/aZa9qXmmLUlOSfLVJP8XuARYDBzefD7ge5uR5TVJbmy+Du3Lc2WSryW5NckX0jy1ofmcyeuaB4ovSbJ9s/wFzeuLmtdvar6/aAiHVJsYn7gj6QB6jy27n97jws6rqkVJ3k3vo6ve0yy3APhtYA/giiR7An8EUFX7JXkxcEmSvZvlDwH2r6r7kxxJ30iy+XSH362qx5PsRe/5nGNPLjoQ2Ife8zyvBQ5LsgT4Z3pPDLohyQ70nnLT71bgiOaJLkcDH6H3xCHpGbMkJd1QVfcCJLmd3sgPYBVwVN9yX6mq
dcD3k/wQeDG906jnAFTVrUnuovexTACXVvunN2wF/F2SlwJP9a0DvecM/6TJs5xeOa8B7q3e544y9sjCcZ8WMg/4XFO61exD2iCebpX0RN/P6/qm17H+P6TH38BQTPyxR2Mme+bme4Gf0hvFLqT3oOuJ8jzVZMgE+x/vTOCKqtqX3kcszZ1ieWlKlqSkrl6XZIvmOuULgdXA1cDJAM1p1t2a+eM9BGzfNz2P3shwHfAGeg+6nsytwC5JXtbsa/vmhqB+8+h9ygnAKV1/KWkylqSkrlYDV9H7CLZTq+px4JPAnCSr6F0zPKWqnphg3ZXA2uamm/c2670pyfX0TrVO+kkPVfUr4CTgnCQrgEt5+kjxY8BZSa5l6tKVOvEtIJKmlOSzwAVV9bVhZ5FmkiNJSZJaOJKUJKmFI0lJklpYkpIktbAkJUlqYUlKktTCkpQkqcX/B15W4FmVbr0JAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "caracteristica=2\n", "\n", "plt.barh(range(caracteristica),arbol.feature_importances_)\n", "plt.yticks(np.arange(caracteristica),iris.feature_names[:2])\n", "plt.xlabel('Importancia')\n", "plt.ylabel('Característica')\n", "plt.show()\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "h = .02 \n", "cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF'])\n", "cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF'])" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 0 2]\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Establecemos los límites del gráfico y asignamos un color a cada punto de malla.\n", "x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1\n", "y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1\n", "xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))\n", "Z = arbol.predict(np.c_[xx.ravel(), yy.ravel()])\n", "\n", "# Agregamos el resultado al gráfico\n", "Z = Z.reshape(xx.shape)\n", "plt.figure()\n", "plt.pcolormesh(xx, yy, Z, shading='auto', cmap=cmap_light)\n", "\n", "# Ploteo los datos de entrenamiento\n", "plt.scatter(X[:, 0], X[:, 1], c=y, cmap=cmap_bold, edgecolor='k', s=20)\n", "plt.xlim(xx.min(), xx.max())\n", "plt.ylim(yy.min(), yy.max())\n", "plt.title(\"Decision tree\")\n", " \n", "# #Ploteo un nuevo dato \n", "Xn = np.array([[7.3,3], [5.1,2.9], [6.4,3.2]])\n", "Yn = arbol.predict(Xn)\n", "print(Yn)\n", "plt.show()\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5" } }, "nbformat": 4, "nbformat_minor": 4 }