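{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Introduction\n", "CART stands for classification and regression tree. As the name suggests, it can be used for both classification and regression tasks. It is widely used, often as the base learner in ensemble methods. Overall, it differs from ID3/C4.5 in two ways: \n", "\n", "(1) It is a binary tree; \n", "\n", "(2) It selects features differently: the CART classification tree uses the Gini index, and the CART regression tree uses the squared error. \n", "\n", "The following sections introduce the CART classification tree and the CART regression tree in turn." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### CART classification tree\n", "\n", "We first introduce the feature selection criterion, the Gini index, where $p_k$ is the probability of class $k$: \n", "\n", "$$\n", "Gini(p)=\\sum_{k=1}^Kp_k(1-p_k)=1-\\sum_{k=1}^Kp_k^2\n", "$$ \n", "\n", "So, for a given sample set $D$, the Gini index is: \n", "\n", "$$\n", "Gini(D)=1-\\sum_{k=1}^K(\\frac{\\mid C_k \\mid}{\\mid D \\mid})^2\n", "$$ \n", "\n", "Here $C_k$ is the subset of samples in $D$ that belong to class $k$, and $K$ is the number of classes. Because a CART tree is binary, evaluating how much a feature $A$ contributes to the class label only requires testing whether the feature equals some value $a$, which splits the current data set into two parts, $D_1$ and $D_2$: \n", "\n", "$$\n", "D_1=\\{(x,y)\\in D\\mid A(x)=a\\},D_2=D-D_1\n", "$$ \n", "\n", "Under the condition $A(x)=a$, the Gini index of $D$ can then be defined as: \n", "\n", "$$\n", "Gini(D,A,a)=\\frac{\\mid D_1 \\mid}{\\mid D \\mid}Gini(D_1)+\\frac{\\mid D_2 \\mid}{\\mid D \\mid}Gini(D_2),\\quad \\text{where } D_1=\\{(x,y)\\in D\\mid A(x)=a\\},D_2=D-D_1\n", "$$ \n", "\n", "#### Implementation\n", "Next we implement the CART classification tree. The biggest difference from ID3/C4.5 is that each node is only ever split into two parts. The code picks the pair $(A,a)$ with the largest Gini gain $Gini(D)-Gini(D,A,a)$, which is the same as minimizing $Gini(D,A,a)$." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "Functions for computing the Gini index; the code is packaged in ml_models.utils\n", "\"\"\"\n", "import numpy as np\n", "def gini(x, sample_weight=None):\n", "    \"\"\"\n", "    Compute the Gini index Gini(D)\n", "    :param x: class labels of the samples\n", "    :param sample_weight: per-sample weights; defaults to all ones\n", "    :return: the Gini index\n", "    \"\"\"\n", "    x_num = len(x)\n", "    # if sample_weight is None, give every sample the same weight\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    x_counter = {}\n", "    weight_counter = {}\n", "    # count how often each value of x occurs and collect the matching sample weights\n", "    for index in range(0, x_num):\n", "        x_value = x[index]\n", "        if x_counter.get(x_value) is None:\n", "            x_counter[x_value] = 0\n", "            weight_counter[x_value] = []\n", "        x_counter[x_value] += 1\n", "        weight_counter[x_value].append(sample_weight[index])\n", "\n", "    # compute the Gini index\n", "    gini_value = 1.0\n", "    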
for key, value in x_counter.items():\n", "        # count * mean weight is the total weight of the class; this is the weighted class share only when the weights average to 1\n", "        p_i = 1.0 * value * np.mean(weight_counter.get(key)) / x_num\n", "        gini_value -= p_i * p_i\n", "    return gini_value\n", "\n", "\n", "def cond_gini(x, y, sample_weight=None):\n", "    \"\"\"\n", "    Compute the conditional Gini index Gini(y, x)\n", "    \"\"\"\n", "    x = np.asarray(x)\n", "    y = np.asarray(y)\n", "    # number of elements in x\n", "    x_num = len(x)\n", "    # if sample_weight is None, give every sample the same weight\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    # sum the Gini index of each group of x, weighted by the group's share of the samples\n", "    gini_value = .0\n", "    for x_value in set(x):\n", "        x_index = np.where(x == x_value)\n", "        new_x = x[x_index]\n", "        new_y = y[x_index]\n", "        new_sample_weight = sample_weight[x_index]\n", "        p_i = 1.0 * len(new_x) / x_num\n", "        gini_value += p_i * gini(new_y, new_sample_weight)\n", "    return gini_value\n", "\n", "\n", "def gini_gain(x, y, sample_weight=None):\n", "    \"\"\"\n", "    Gini gain: Gini(y) - Gini(y, x)\n", "    \"\"\"\n", "    x_num = len(x)\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    return gini(y, sample_weight) - cond_gini(x, y, sample_weight)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import os\n", "os.chdir('../')\n", "from ml_models import utils\n", "from ml_models.wrapper_models import DataBinWrapper\n", "\"\"\"\n", "Implementation of the CART classification tree; the code is packaged in the ml_models.tree module\n", "\"\"\"\n", "class CARTClassifier(object):\n", "    class Node(object):\n", "        \"\"\"\n", "        Tree node: stores the node's information and links to its child nodes\n", "        \"\"\"\n", "\n", "        def __init__(self, feature_index: int = None, feature_value=None, target_distribute: dict = None,\n", "                     weight_distribute: dict = None,\n", "                     left_child_node=None, right_child_node=None, num_sample: int = None):\n", "            \"\"\"\n", "            :param feature_index: index of the feature to split on\n", "            :param feature_value: feature value to split on\n", "            :param target_distribute: class distribution of the samples at this node\n", "            :param weight_distribute: mean sample weight of each class at this node\n", "            :param left_child_node: left child node\n", "            :param right_child_node: right child node\n", "            :param num_sample: number of samples at this node\n", "            \"\"\"\n", "            self.feature_index = feature_index\n", "            
self.feature_value = feature_value\n", "            self.target_distribute = target_distribute\n", "            self.weight_distribute = weight_distribute\n", "            self.left_child_node = left_child_node\n", "            self.right_child_node = right_child_node\n", "            self.num_sample = num_sample\n", "\n", "    def __init__(self, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1,\n", "                 min_impurity_decrease=0, max_bins=10):\n", "        \"\"\"\n", "        :param criterion: splitting criterion, 'gini' by default; any other value (e.g. 'entropy') uses the information gain ratio\n", "        :param max_depth: maximum depth of the tree\n", "        :param min_samples_split: minimum number of samples a node must hold to be split, 2 by default\n", "        :param min_samples_leaf: minimum number of samples in a leaf node, 1 by default\n", "        :param min_impurity_decrease: a node is split only if the split reduces the impurity (as measured by criterion) by more than this value, 0 by default\n", "        :param max_bins: maximum number of bins passed to DataBinWrapper, which bins the features\n", "        \"\"\"\n", "        self.criterion = criterion\n", "        if criterion == 'gini':\n", "            self.criterion_func = utils.gini_gain\n", "        else:\n", "            self.criterion_func = utils.info_gain_rate\n", "        self.max_depth = max_depth\n", "        self.min_samples_split = min_samples_split\n", "        self.min_samples_leaf = min_samples_leaf\n", "        self.min_impurity_decrease = min_impurity_decrease\n", "\n", "        self.root_node: self.Node = None\n", "        self.dbw = DataBinWrapper(max_bins=max_bins)\n", "\n", "    def _build_tree(self, current_depth, current_node: Node, x, y, sample_weight):\n", "        \"\"\"\n", "        Recursively select features and build the tree\n", "        :param current_depth: depth of the current node (the root has depth 1)\n", "        :param current_node: node to fill in and, if possible, split\n", "        :param x: binned feature matrix of the samples at this node\n", "        :param y: class labels\n", "        :param sample_weight: sample weights\n", "        :return:\n", "        \"\"\"\n", "        rows, cols = x.shape\n", "        # compute the distribution of y and the mean weight of each class\n", "        target_distribute = {}\n", "        weight_distribute = {}\n", "        for index, tmp_value in enumerate(y):\n", "            if tmp_value not in target_distribute:\n", "                target_distribute[tmp_value] = 0.0\n", "                weight_distribute[tmp_value] = []\n", "            target_distribute[tmp_value] += 1.0\n", "            weight_distribute[tmp_value].append(sample_weight[index])\n", "        for key, value in target_distribute.items():\n", "            target_distribute[key] = value / rows\n", "            weight_distribute[key] = np.mean(weight_distribute[key])\n", "        current_node.target_distribute =
target_distribute\n", "        current_node.weight_distribute = weight_distribute\n", "        current_node.num_sample = rows\n", "        # check the stopping conditions\n", "\n", "        if len(target_distribute) <= 1:\n", "            return\n", "\n", "        if rows < self.min_samples_split:\n", "            return\n", "\n", "        if self.max_depth is not None and current_depth > self.max_depth:\n", "            return\n", "\n", "        # search for the best feature and split value\n", "        best_index = None\n", "        best_index_value = None\n", "        best_criterion_value = 0\n", "        for index in range(0, cols):\n", "            for index_value in set(x[:, index]):\n", "                criterion_value = self.criterion_func((x[:, index] == index_value).astype(int), y, sample_weight)\n", "                if criterion_value > best_criterion_value:\n", "                    best_criterion_value = criterion_value\n", "                    best_index = index\n", "                    best_index_value = index_value\n", "\n", "        # stop if the impurity does not decrease enough\n", "        if best_index is None:\n", "            return\n", "        if best_criterion_value <= self.min_impurity_decrease:\n", "            return\n", "        # split\n", "        current_node.feature_index = best_index\n", "        current_node.feature_value = best_index_value\n", "        selected_x = x[:, best_index]\n", "\n", "        # create the left child node\n", "        left_selected_index = np.where(selected_x == best_index_value)\n", "        # skip the child if it would hold too few samples to be a leaf\n", "        if len(left_selected_index[0]) >= self.min_samples_leaf:\n", "            left_child_node = self.Node()\n", "            current_node.left_child_node = left_child_node\n", "            self._build_tree(current_depth + 1, left_child_node, x[left_selected_index], y[left_selected_index],\n", "                             sample_weight[left_selected_index])\n", "        # create the right child node\n", "        right_selected_index = np.where(selected_x != best_index_value)\n", "        # skip the child if it would hold too few samples to be a leaf\n", "        if len(right_selected_index[0]) >= self.min_samples_leaf:\n", "            right_child_node = self.Node()\n", "            current_node.right_child_node = right_child_node\n", "            self._build_tree(current_depth + 1, right_child_node, x[right_selected_index], y[right_selected_index],\n", "                             sample_weight[right_selected_index])\n", "\n", "    def fit(self, x, y,
sample_weight=None):\n", "        # default to uniform sample weights\n", "        n_sample = x.shape[0]\n", "        if sample_weight is None:\n", "            sample_weight = np.asarray([1.0] * n_sample)\n", "        # check the size of sample_weight\n", "        if len(sample_weight) != n_sample:\n", "            raise Exception('sample_weight size error:', len(sample_weight))\n", "\n", "        # create an empty root node\n", "        self.root_node = self.Node()\n", "\n", "        # fit the binning of x\n", "        self.dbw.fit(x)\n", "\n", "        # build the tree recursively\n", "        self._build_tree(1, self.root_node, self.dbw.transform(x), y, sample_weight)\n", "\n", "    # walk down to a leaf and return its class distribution\n", "    def _search_node(self, current_node: Node, x, class_num):\n", "        if current_node.left_child_node is not None and x[current_node.feature_index] == current_node.feature_value:\n", "            return self._search_node(current_node.left_child_node, x, class_num)\n", "        elif current_node.right_child_node is not None and x[current_node.feature_index] != current_node.feature_value:\n", "            return self._search_node(current_node.right_child_node, x, class_num)\n", "        else:\n", "            result = []\n", "            total_value = 0.0\n", "            # class labels are assumed to be the integers 0..class_num-1\n", "            for index in range(0, class_num):\n", "                value = current_node.target_distribute.get(index, 0) * current_node.weight_distribute.get(index, 1.0)\n", "                result.append(value)\n", "                total_value += value\n", "            # normalize\n", "            for index in range(0, class_num):\n", "                result[index] = result[index] / total_value\n", "            return result\n", "\n", "    def predict_proba(self, x):\n", "        # compute the predicted class probabilities\n", "        x = self.dbw.transform(x)\n", "        rows = x.shape[0]\n", "        results = []\n", "        class_num = len(self.root_node.target_distribute)\n", "        for row in range(0, rows):\n", "            results.append(self._search_node(self.root_node, x[row], class_num))\n", "        return np.asarray(results)\n", "\n", "    def predict(self, x):\n", "        return np.argmax(self.predict_proba(x), axis=1)\n", "\n", "    def _prune_node(self, current_node: Node, alpha):\n", "        # if there are child nodes, prune their subtrees first\n", "        if current_node.left_child_node is not None:\n", "            self._prune_node(current_node.left_child_node, alpha)\n", "        if
current_node.right_child_node is not None:\n", "            self._prune_node(current_node.right_child_node, alpha)\n", "        # then try to prune the current node\n", "        if current_node.left_child_node is not None or current_node.right_child_node is not None:\n", "            # a child may be missing when it was dropped by min_samples_leaf\n", "            child_nodes = [node for node in [current_node.left_child_node, current_node.right_child_node]\n", "                           if node is not None]\n", "            # avoid pruning across levels\n", "            for child_node in child_nodes:\n", "                # only prune when every child is a leaf\n", "                if child_node.left_child_node is not None or child_node.right_child_node is not None:\n", "                    return\n", "            # loss before pruning\n", "            pre_prune_value = alpha * len(child_nodes)\n", "            for child_node in child_nodes:\n", "                for key, value in child_node.target_distribute.items():\n", "                    pre_prune_value += -1 * child_node.num_sample * value * np.log(\n", "                        value) * child_node.weight_distribute.get(key, 1.0)\n", "            # loss after pruning\n", "            after_prune_value = alpha\n", "            for key, value in current_node.target_distribute.items():\n", "                after_prune_value += -1 * current_node.num_sample * value * np.log(\n", "                    value) * current_node.weight_distribute.get(key, 1.0)\n", "\n", "            if after_prune_value <= pre_prune_value:\n", "                # prune: turn the current node into a leaf\n", "                current_node.left_child_node = None\n", "                current_node.right_child_node = None\n", "                current_node.feature_index = None\n", "                current_node.feature_value = None\n", "\n", "    def prune(self, alpha=0.01):\n", "        \"\"\"\n", "        Prune the decision tree using the loss C(T)+alpha*|T|\n", "        :param alpha: penalty per leaf node\n", "        :return:\n", "        \"\"\"\n", "        # prune recursively\n", "        self._prune_node(self.root_node, alpha)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# generate synthetic data\n", "from sklearn.datasets import make_classification\n", "data, target = make_classification(n_samples=100, n_features=2, n_classes=2, n_informative=1, n_redundant=0,\n", "                                   n_repeated=0, n_clusters_per_class=1, class_sep=.5, random_state=21)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [
"# train and inspect the result\n", "tree = CARTClassifier()\n", "tree.fit(data, target)\n", "utils.plot_decision_function(data, target, tree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As before, an unconstrained tree overfits, so we can prune it..." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# prune\n", "tree.prune(5)\n", "utils.plot_decision_function(data, target, tree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### CART regression tree\n", "The regression tree selects features by squared error: choose a feature $j$ and a threshold $s$, split the training set into the two parts $x^j\\leq s$ and $x^j>s$, and look for the pair $j,s$ that minimizes the total squared error of the two parts (equivalently, that reduces it the most). This can be written as: \n", "\n", "$$\n", "\\min_{j,s}[\\min_{c_1}\\sum_{x_i\\in R_1(j,s)}(y_i-c_1)^2+\\min_{c_2}\\sum_{x_i\\in R_2(j,s)}(y_i-c_2)^2]\n", "$$ \n", "\n", "Here $R_1(j,s)=\\{x\\mid x^j\\leq s\\},R_2(j,s)=\\{x\\mid x^j> s\\}$, and the inner minima are attained at the region means $c_1=ave(y_i\\mid x_i\\in R_1(j,s)),c_2=ave(y_i\\mid x_i\\in R_2(j,s))$ \n", "\n", "Implementation: " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "Squared-error functions, packaged in ml_models.utils\n", "\"\"\"\n", "def square_error(x, sample_weight=None):\n", "    \"\"\"\n", "    Weighted sum of squared deviations from the (unweighted) mean\n", "    :param x: target values\n", "    :param sample_weight: per-sample weights; defaults to all ones\n", "    :return: the squared error\n", "    \"\"\"\n", "    x = np.asarray(x)\n", "    x_mean = np.mean(x)\n", "    x_num = len(x)\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    error = 0.0\n", "    for index in range(0, x_num):\n", "        error += (x[index] - x_mean) * (x[index] - x_mean) * sample_weight[index]\n", "    return error\n", "\n", "\n", "def cond_square_error(x, y, sample_weight=None):\n", "    \"\"\"\n", "    Compute the squared error of y grouped by the values of x\n", "    :param x: group labels\n", "    :param y: target values\n", "    :param sample_weight: per-sample weights; defaults to all ones\n", "    :return: the sum of the squared errors of the groups\n", "    \"\"\"\n", "    x = np.asarray(x)\n", "    y = np.asarray(y)\n", "    # number of elements in x\n", "    x_num = len(x)\n", "    # if sample_weight is None, give every sample the same weight\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    # sum the squared error of each group\n", "    error = .0\n", "    for
x_value in set(x):\n", "        x_index = np.where(x == x_value)\n", "        new_y = y[x_index]\n", "        new_sample_weight = sample_weight[x_index]\n", "        error += square_error(new_y, new_sample_weight)\n", "    return error\n", "\n", "\n", "def square_error_gain(x, y, sample_weight=None):\n", "    \"\"\"\n", "    Reduction in squared error obtained by grouping y by x\n", "    :param x: group labels\n", "    :param y: target values\n", "    :param sample_weight: per-sample weights; defaults to all ones\n", "    :return: square_error(y) - cond_square_error(x, y)\n", "    \"\"\"\n", "    x_num = len(x)\n", "    if sample_weight is None:\n", "        sample_weight = np.asarray([1.0] * x_num)\n", "    return square_error(y, sample_weight) - cond_square_error(x, y, sample_weight)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "\"\"\"\n", "Implementation of the CART regression tree, packaged in ml_models.tree\n", "\"\"\"\n", "class CARTRegressor(object):\n", "    class Node(object):\n", "        \"\"\"\n", "        Tree node: stores the node's information and links to its child nodes\n", "        \"\"\"\n", "\n", "        def __init__(self, feature_index: int = None, feature_value=None, y_hat=None, square_error=None,\n", "                     left_child_node=None, right_child_node=None, num_sample: int = None):\n", "            \"\"\"\n", "            :param feature_index: index of the feature to split on\n", "            :param feature_value: feature value (threshold) to split on\n", "            :param y_hat: predicted value\n", "            :param square_error: squared error of the current node\n", "            :param left_child_node: left child node\n", "            :param right_child_node: right child node\n", "            :param num_sample: number of samples at this node\n", "            \"\"\"\n", "            self.feature_index = feature_index\n", "            self.feature_value = feature_value\n", "            self.y_hat = y_hat\n", "            self.square_error = square_error\n", "            self.left_child_node = left_child_node\n", "            self.right_child_node = right_child_node\n", "            self.num_sample = num_sample\n", "\n", "    def __init__(self, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_std=1e-3,\n", "                 min_impurity_decrease=0, max_bins=10):\n", "        \"\"\"\n", "        :param criterion: splitting criterion; only the squared error is available for now\n", "        :param max_depth: maximum depth of the tree\n", "        :param min_samples_split: minimum number of samples a node must hold to be split, 2 by default\n", "        :param min_std: minimum standard deviation; a node whose target standard deviation is at most this value is not split\n", "        :param min_samples_leaf: minimum number of samples in a leaf node, 1 by default\n", "        :param
min_impurity_decrease: a node is split only if the split reduces the impurity (as measured by criterion) by more than this value, 0 by default\n", "        :param max_bins: maximum number of bins passed to DataBinWrapper, which bins the features\n", "        \"\"\"\n", "        self.criterion = criterion\n", "        if criterion == 'mse':\n", "            self.criterion_func = utils.square_error_gain\n", "        self.max_depth = max_depth\n", "        self.min_samples_split = min_samples_split\n", "        self.min_samples_leaf = min_samples_leaf\n", "        self.min_std = min_std\n", "        self.min_impurity_decrease = min_impurity_decrease\n", "\n", "        self.root_node: self.Node = None\n", "        self.dbw = DataBinWrapper(max_bins=max_bins)\n", "\n", "    def _build_tree(self, current_depth, current_node: Node, x, y, sample_weight):\n", "        \"\"\"\n", "        Recursively select features and build the tree\n", "        :param current_depth: depth of the current node (the root has depth 1)\n", "        :param current_node: node to fill in and, if possible, split\n", "        :param x: binned feature matrix of the samples at this node\n", "        :param y: target values\n", "        :param sample_weight: sample weights\n", "        :return:\n", "        \"\"\"\n", "        rows, cols = x.shape\n", "        # weighted mean of y at the current node\n", "        current_node.y_hat = np.dot(sample_weight / np.sum(sample_weight), y)\n", "        current_node.num_sample = rows\n", "        # check the stopping conditions\n", "        current_node.square_error = np.dot(y - np.mean(y), y - np.mean(y))\n", "        if np.sqrt(current_node.square_error / rows) <= self.min_std:\n", "            return\n", "\n", "        if rows < self.min_samples_split:\n", "            return\n", "\n", "        if self.max_depth is not None and current_depth > self.max_depth:\n", "            return\n", "\n", "        # search for the best feature and split value\n", "        best_index = None\n", "        best_index_value = None\n", "        best_criterion_value = 0\n", "        for index in range(0, cols):\n", "            for index_value in sorted(set(x[:, index])):\n", "                criterion_value = self.criterion_func((x[:, index] <= index_value).astype(int), y, sample_weight)\n", "                if criterion_value > best_criterion_value:\n", "                    best_criterion_value = criterion_value\n", "                    best_index = index\n", "                    best_index_value = index_value\n", "\n", "        # stop if the squared error does not decrease enough\n", "        if best_index is None:\n", "            return\n", "        if best_criterion_value <= self.min_impurity_decrease:\n", "            return\n", "        # split\n", "        current_node.feature_index = best_index\n", "        current_node.feature_value = best_index_value\n", "        selected_x = x[:, best_index]\n", "\n", "        #
create the left child node\n", "        left_selected_index = np.where(selected_x <= best_index_value)\n", "        # skip the child if it would hold too few samples to be a leaf\n", "        if len(left_selected_index[0]) >= self.min_samples_leaf:\n", "            left_child_node = self.Node()\n", "            current_node.left_child_node = left_child_node\n", "            self._build_tree(current_depth + 1, left_child_node, x[left_selected_index], y[left_selected_index],\n", "                             sample_weight[left_selected_index])\n", "        # create the right child node\n", "        right_selected_index = np.where(selected_x > best_index_value)\n", "        # skip the child if it would hold too few samples to be a leaf\n", "        if len(right_selected_index[0]) >= self.min_samples_leaf:\n", "            right_child_node = self.Node()\n", "            current_node.right_child_node = right_child_node\n", "            self._build_tree(current_depth + 1, right_child_node, x[right_selected_index], y[right_selected_index],\n", "                             sample_weight[right_selected_index])\n", "\n", "    def fit(self, x, y, sample_weight=None):\n", "        # default to uniform sample weights\n", "        n_sample = x.shape[0]\n", "        if sample_weight is None:\n", "            sample_weight = np.asarray([1.0] * n_sample)\n", "        # check the size of sample_weight\n", "        if len(sample_weight) != n_sample:\n", "            raise Exception('sample_weight size error:', len(sample_weight))\n", "\n", "        # create an empty root node\n", "        self.root_node = self.Node()\n", "\n", "        # fit the binning of x\n", "        self.dbw.fit(x)\n", "\n", "        # build the tree recursively\n", "        self._build_tree(1, self.root_node, self.dbw.transform(x), y, sample_weight)\n", "\n", "    # walk down to a leaf and return its predicted value\n", "    def _search_node(self, current_node: Node, x):\n", "        if current_node.left_child_node is not None and x[current_node.feature_index] <= current_node.feature_value:\n", "            return self._search_node(current_node.left_child_node, x)\n", "        elif current_node.right_child_node is not None and x[current_node.feature_index] > current_node.feature_value:\n", "            return self._search_node(current_node.right_child_node, x)\n", "        else:\n", "            return current_node.y_hat\n", "\n", "    def predict(self, x):\n", "        # compute the predicted values\n", "        x = self.dbw.transform(x)\n", "        rows = x.shape[0]\n", "        results = []\n", "        for row
in range(0, rows):\n", "            results.append(self._search_node(self.root_node, x[row]))\n", "        return np.asarray(results)\n", "\n", "    def _prune_node(self, current_node: Node, alpha):\n", "        # if there are child nodes, prune their subtrees first\n", "        if current_node.left_child_node is not None:\n", "            self._prune_node(current_node.left_child_node, alpha)\n", "        if current_node.right_child_node is not None:\n", "            self._prune_node(current_node.right_child_node, alpha)\n", "        # then try to prune the current node\n", "        if current_node.left_child_node is not None or current_node.right_child_node is not None:\n", "            # a child may be missing when it was dropped by min_samples_leaf\n", "            child_nodes = [node for node in [current_node.left_child_node, current_node.right_child_node]\n", "                           if node is not None]\n", "            # avoid pruning across levels\n", "            for child_node in child_nodes:\n", "                # only prune when every child is a leaf\n", "                if child_node.left_child_node is not None or child_node.right_child_node is not None:\n", "                    return\n", "            # loss before pruning\n", "            pre_prune_value = alpha * len(child_nodes)\n", "            for child_node in child_nodes:\n", "                pre_prune_value += child_node.square_error\n", "            # loss after pruning\n", "            after_prune_value = alpha + current_node.square_error\n", "\n", "            if after_prune_value <= pre_prune_value:\n", "                # prune: turn the current node into a leaf; it keeps its square_error so that its parent is evaluated correctly\n", "                current_node.left_child_node = None\n", "                current_node.right_child_node = None\n", "                current_node.feature_index = None\n", "                current_node.feature_value = None\n", "\n", "    def prune(self, alpha=0.01):\n", "        \"\"\"\n", "        Prune the decision tree using the loss C(T)+alpha*|T|\n", "        :param alpha: penalty per leaf node\n", "        :return:\n", "        \"\"\"\n", "        # prune recursively\n", "        self._prune_node(self.root_node, alpha)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# build the data\n", "data = np.linspace(1, 10, num=100)\n", "target = np.sin(data) + np.random.random(size=100)  # add uniform noise in [0, 1)\n", "data = data.reshape((-1, 1))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "tree = CARTRegressor(max_bins=50)\n", "tree.fit(data, target)" ] }, {
"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "
pF5QfjT58yBgwfNRaO8HLp1g8WLwzIdSXMrCEIssY/lX+7Ivx+o5e8kLc31fMoUI/4PPtjhaUh4pyAI8YB9xL+szByDsfy9mTYNbrvNnCvYiwgS3inEL9E0SgIZS4ykyGMft4/T8g9V/AHef7+1qWRVOZPnLWXI3HeYPG9pmxW+Ail2LQjRJprV6gIZS6rnRQf7iH9Zmdm56+7G8aJdIT/5ZLPg6/D7B/sllfBOIR6JplESyFhiJEUH24h/+fot7EzNZMgdiyyFPSAhV8pE+nz4ITQ3B/0l9RfGKeGdQiyJplESyFhiJEUHW4h/yapyKjduZXePPn6FPWAhnzbNRACtXBn0l1QKwwjxSDSNkkDGEiMpOthC/OcvLiWr9gD7evdtbfMW9oCF/LvfNXcAixcH/SWV8E4hHommURLIWGIkRQdbRPvsPVhH9pGD7Ondz6PdXdgD3nHbrx+ceiosXsxtf7km6I1aUhhGiDfCXq1Oa5M08ZVX4NVXYds211jnnIO69zEeWLrN71hSPS86hEX8lVJPARcD+7XWJ1u8PxV4E3DmTX5da31vOMYOhJO6HCOlpZm9vfp6tLsLu9WOW4VxEU2et9TzyzdtGsybR/GGf5PXs5bX1lfy+oBTyOqbJl9SISEJ2Si59FJ47z3zvKXF7KhPSoKpU+F73zN3ywcOwIIFXDLlBS65667Izkdol3BZ/s8AjwDPtdHnU631xWEaLyh+cVJPAPa6Wf7eFrq7tVFeU48CHKVdfOPxZ86E//f/4Ic/ZAIwAZj3wgtwZXEU/jWCEGdobSLgTj4ZzjzTtA0bBsXFkJ3t2ffQIfjtb2HGDDjllOjPVWglLOKvtf5EKVUQjnNFgqk9jwPQkpuLAr+3kU5rY/K8pT4uII90yxMnws6dcOSIsXKKimDlSkpGniW3qoL9qKkxv4XLLze1MNrikUfgo4/gqqvgyy8hJSU6cxR8iKbP/ztKqTVABXCr1np91EZ2bPB66q5iGDCg3e7tLf6a3YdbWkX+7SHDafzsS+7oIbt3BRuye7c5DhzYft9+/eCxx2DWLJM00Zllt39/c0fdxRbLkHFBtP7SXwODtdZ1SqkLgRJgmHcnpdR1wHUAgwYNCt/oZWXmS+V9C+qHthZ/rVI0LOmWw7nrllE/tcn4Nh1IcRYhrtEaSkuN1Q6mkt2JJ3p8hwNi1y5zDET8wawBXHstPPGEKbDkZMgQuP12uPpqk0RRiChRCfXUWtdqresczxcBKUqpfhb9Fmiti7TWRVlZWeGbQHk55OYGnIe/rVAzq/0Aa/sVkHn0EP3rDvicSzamCHHLsmUwcqRxWxYVmZrVH38c/Hmcln8wBtuCBbBvH+zdax5vvglZWfCzn0Fmputx4omm1nYIBJOCxU5ERfyVUgOUMuaEUmqiY1xfpYwUQSZiayse30rMN2YPAWDk/u0+78nGFCFu2e74vj72mMsCL+1ACoXdu82ddf/+gX9GKXMn3r+/y+WzbBl88AFcfz38+Mdm8bi0FDZvDn5ODiRPkH/CFer5IjAV6KeUKgPuBlIAtNaPAd8HblBKNQH1wGyttfZzupCwzAZYXg5jxgR1Hn+hZlYuoW+zCgA45cBO/n3ChNZ22ZgixDUHHPbXpZdCRoapX+HMfhsMu3ebhInJye33bQul4LzzzAPgq69g4UIzp3HjOnRKqZXtn3BF+1zRzvuPYEJBI4plyuTXvuHiXbvpctFFYRnDaj9AY+90juQO5LKUg7yWkRqxaB9JcyuElaoqI7iZmUa4c3I6Lv6B+vuDwXm37nQrdQDJE+SfTrW0bnWVT6mrpUtDfWipnN3wt/uw59rx9Pz2Wz6fe05YxvFGagEIYaeqCvr0cVns+fkdF/9AamNb0KZBk51t3EkdmZMDf8EbGnw3b9qMTiX+VlfzAYerzJMOFF/xh6VLaMwYeOstOHrURE2EGbl9FcJOVZUJvXQycCCsWxfcOVpajPh///tBD9+uQZOcbAI1QhB/q
zt1J3Y3oDpVYjerxdUBhx1+zTBZ/n4ZM8b8EPz9eA4dgg0bzGPjRmj2/TK2hdy+CmHHW/zz842QB7Mct38/NDYGF+njIKBMuh29G3HgHrxhhZ3rBHQq8bcK0Rx0tNo8ibT4jx1rjmvW+L63f7/Z7n7SSeYxahRMnw5NTQGfXtLcCuGkZFU5mzfs4IN9Ta7wx/x8E/NfWxv4iYLZ4OVFQAbNwIEhiT+YC8Dnc8/B3+4FuxpQnUr8rUI0/2Og42KQmxvZwQsKoHdvWL3a972774bqanjySXjpJbjnHlMQ5o47Aj69pLkVwoXT3dKr7hAHU9Na3R9fNTrclcEssIYg/v4MF6c/vvWCVFYW3N1IkOPZ1YDqVD5/cPjjs4Fjx0zDryrM5pFI7xhMSjKJqrwt//XrzYaWG2+Ea65xte/fD3/4g0kPPXt2u6eXNLdCuJi/uJT64030qa+luocpa1rf2MyTOxqZAEZsT/ZJzmtNCOIfiD/+BHozur7eFFDq29fiLMGP9/s3HmBchXH1JClFzfU3ApEJ1IhnOp34AyZuefly1+sJE/z3DSdjxsDzzxvfv3M38S9/aeoG3323Z9+HHjIXip/8xLh/nLWFx4/3uzgtaW6FcFBRU0+Pxga6NTdyMNVV03qd6m2eBOBmcUbpXP36x/y4S1fe3dVAsc+e/bbxzqTrTX1jM//a08Jo55xCFP/icXl0q9zHBb//iFU5I9jbN4fvbF3J1pffYHLaabYzpjqn+P/mN2Yxy0m0xH/sWPjb32DHDigsNPnNFy+GP/7R94vbtaspdFFUBD/6kat9yhT45JPozFewJbkZqaid+wCodhP/pPw8E/ffjvi7R+nk1FZS3rsfd7yxDpQKWjydBs2Que9g5djZkOSYX1lZ0Bs1rbigciMANfMf4hffKp54/nbSG+psGfnTOcX/wgtjM67zy3nZZSZ+eu1aGDrUuHysGDDARP9s2WJeP/II/POfxmUlia2ECHHbtBH84xETlXbQ4fZJTUnmFxeeBH8d0K74u0fp5B6upCItK+SwY3/x+C3Ou+AQF31bWbIEMjP5zc4u1Dce51C3XgyrM64ru4VOd6oFX2+intBp7FiTsbBrV5OMavhweOYZ89ofaWnG1TN+PFx0kRF+q4ghQQgTxePy+MW4PoCx/D1qSQcQWukeHZNTW8We3lk+7cHiL6BhzqXfMfH+4RB/rY34n302ZbWmxseh7r1Ib3AljrNT5E/ntPyJ0Y7Yrl3h9dc7/vlJk8xx2TJTMEYQIsTpGeb4+q9nGiPFSX4+bNrU5medVnpySzPZR6qpSOvX2t5R2gxoyMkJKcVDK1u3mvTTt99Obq35NxjxP2wuDErZKvKn01r+AW0giTfyHJbXsmWxnonQ2XEmdevntUrr3OjVBk4rvX/dAZJ1C3t69wtL2LEzHn/7vIv4fO45LiMtxI1erSxdao7nntv6b6jt3otuzU10bzpmu9DpTmv5J+yO2EmTRPyFyFNVZSLSMjI82/PzzSav2lpXBJoXTlF+73Fzh3A8J8/lNooE+flm/cyCoJIdLlli9vsMH06xo2DN5nWLARjRtYk5kfw3xCGd1vJP2A0dkyaZPOv79sV6JkJnpqrKRKB5FzhyLrCWt70+Vjwuj8fOMr7+P946I7Ki6dzl67XRK6hc/S0txvI/99zWSmXF4/K4bbZxtb75HyfZSvihE4t/wu6Idff7C0Kk8M7r4ySY6Jp2NniFLeDCmXbi0CGP5qBcu2vXmn/zued6tmdmmmN1dcfmlsB0WrdPwu6IHT8eUlKM+F9ySaxnI3RW/Im/U8gDFf/0dA/3kNMNU15Tj4LW2P2QAi7c8/q7uamCcu0uWWKO3uLfx0Q9ifh3MhJyR2xqqgkZFctfiCRVVWYPijfOHFgO8W/Tp+5VxMU7ws5701aH4+jd70ZGj3ZN1c/egFbXbl0d1NSY5++9Z6KavHfP29jy77Run4Rm0iRTwi6IrJ+CEBQHDlhb/t26m
SIqZWWWPvW7Xl3NwmVboaHBhE26ib+VG8abDgVc+HFFtenaXbPGRM8NHGgeH3zgKg/pjo3Fv1Nb/vFC0OUXJ02Cv/wF1q+npKVfu5+V8o5CUGjt3+0DreGe3mKe1lBHyXO/oPB/K1x9f/az1qeBCHuHAi5ycizTTvh17Q5OhaLvQa9eJnmiUmZhe8YM33Onp5ujiL8QLkLxfb7fu4DzgTtv/zsvjr2gzc9KeUchaA4fNgVY2hL/7dt9xPz65a9RWF3Bn0+fzc0zxhhRveyy1vf9uWGcdDjgIiXFb31hH9duczNcfLHp+8knrgAKfyQnmwuADcVf3D4RwP12Gfz7Pv199r+WH6KqRzrnb1rGuZuXc97m5RQeKLP8bEJuZhNiizPpYVviX1bmYaVn1R3kmhULeXPkWbw046cwdy7cfjsMGdLax8oN4yyg4pFCoiO0tdGrtNSskS1bZub03nvw5z+3L/xOMjNtKf5i+UeAUHyf8xeXUt/UwrKBo7m49DOmbl8JQGWPDE678VlakpI9Ppuwm9mE2BGI+FdXM/fMgfzPu1upb2zm5i/+RZeWJv529o/9Wu8RjbDLzzflT7158EG49VbPtjlz4PrrAz93ZqapF2AzRPwjQCi+T+dnb7vwFh6dZIpiT9q9jl8vfYJxFaWszB/l8dl2Ix4EwZv2xN+xiDujn6Z51mj++cJSZq9ZzMKJF3HDtdPaFPNIRNiVrCrn+F7NBVt3MH3eUs4+MYuPvq3kvCUv89sPH6f8vIvI++VNpnP37iYtuvJXtNECm1r+4vaJAO0Jb1u+T+dn67t2Z/2AoawfMJSXT/kujUnJnLflS5/PJuxmNiF2OMXfX3EUZ3TNwoUUV67n5U2vkNK9G5e+8XjU15GcLtQtXTPofbyept27Kfn4W8769+v89sPHeX/YJKZP+Bkl/UebuthTpxo/fjD06WNL8RfLPwJYladzLvrmtXMrbPXZum49WT7wZKZv/4qcWQ97fDZhN7MJscNfUjcnQ4cay/m221xtv/mNqT8RZZwu1PK0bACW/+3q1vc+KjyVm2bezvEWFVoefpta/mERf6XUU8DFwH6ttU/xT6WUAv4EXAgcBa7WWn8djrHjkVAE2d9nz8iZA7fcwpC0BsvPiNgL7eGMQLty4TKuT0ri7W11FI/P8O2Yn2+KDDn94N27w7hxMQkpdrpBPxw6kV+d/590azS1uY90TeWNk87meJeU1n4dnp9T/B1pne1CuCz/Z4BHgOf8vH8BMMzxOA141HHstIQiyJafTZ8Bt9wCb71ljoJtCIfouocEZx6tpTo1zW/pRTNehdt4g2F1RUxCip1rWsdSuvGPcf4r9KWnpnR8fpmZcPw41NdDjx5hm3u8Exafv9b6E6Ct5fJLgOe0YRmQoZTKCcfYtqGwEEaNMuIv2IagMle2gXsEWmZ9LQdT0yxDgv2N99u31sckpNhqTcub1JRklKLj87PpLt9oLfjmAe4VIsocbR4opa5TSq1QSq2orKyM0tQSiBkzzMYVZ74Sd7SG/fuhosI8vDIgColJuPZxuEeg9amvbS3c7h2Z5m+86qON7Z43EhSPM7UC8jJSUZg1sx9OGuTx+r5Zo6kJZX5tiH/US8FGkWgt+Fo50rz3PqG1XgAsACgqKvJ53/bMnAn33282scye7WrftQtuuAEWLXK1paSYsnV+0u0KiUG49nG4hwRnHq1la9/81vZQzxtpAnGhOnfTexPQ/PyIf1u7551jJnKQRbQs/zLAXYXygQo/fQV/nHaaidB49VXYscM8HnkETjoJPv4Y7r4bHn8c7r3XbN//7LNYz1gIkXAVJXJ3n2Q2GMvfKiTY33kzUlPiOqQ4pJBnP+Lv7y7onoXrw+KKizXREv+FwI+VYRJwSGu9J0pjdx6Sk03ektdeM9vqhwyBn/8cJk+GdevgnnvguuvM1vvu3U1mUCGhCdc+jlb3SXp3Mo/W0pjZxzLdgr/x7pl5ko/7JaKlG4PEyj0U8
PycOf29dvn6uwuqqW/sFClVwhXq+SIwFeinlCoD7gZSALTWjwGLMGGeWzChnnPCMa4tuf9+s5HFWdIuJwfOP98zRC0lxdQEEPFPeMK5j6N4XB7FQ3rCnS1cNbMILM7R3njxIvZWdDjCzo/l316iOm8SLaVKWMRfa31FO+9r4MZwjNVZCTicLzsbrrqq/RNOmABPPmlqAnSRvXyJTFj3cbSX2iHc48WIoMJj09ON8eQl/lYbLlNTkumekmS5AJ5oKVVEFeKAiKRlnjDB1ATYuLG1+pHk/RcCEf9EJ+jfU1KSZVpnf3dBgOVF4bZpIxLqNybiHwe0Fc7Xkc088xeX0n3rEZYAX7/6PuNHj5a8/3Zm507jLmxsdKVF7sTi36Hfk58UD23dBbV3UYj335iIfxwQrnA+d4FXffKo7dqDTW8vZVfx7LBeYIQE49ln4dFHXfV5R4829Ww7KR36PQWZ38fqojB53tKE+o2J+McB4UrL7C7wWiWxbsBQRlVs4gaHhWJFoi1SCR1gzRoYNgw2bYr1TKJCh35PYUjulmi/MUnpHAeEK5zP+0u2Jmc4J+7fQVVVbdjixYUEZPVqGDMm1rOIGh36PYVB/BPtNybiHweEFKPshveXbE3OMLq2NDGlvkLy/tuV2lrYts1W4t+h31MYcvon2m9M3D5xQjjC67xD077JGQbAz3tXM0by/icUwUaN+O2/1pGOYOzYKM08Pgj69xSGtM6JVltDxL8T4f3lUwMH0dCnH2P2bm59P16/iIKLYCOz2uo/5O2PGQOcvqgKtW5pXItRTHGmdT56FHr27PBpEuk3JuLfyfD58n1+GqxYEbsJCUHTXiZPb8uyrRw0dy7+lMHde1HRux/EeehhTHHf5RuC+CcS4vPv7EyYYKoyvfWWyQb68cfQ0hLrWQlt4C86xGnReycU85eCoKa+keF7t7Ehu7DVlZGIOWiigg1z+ovl39k54wzjx5w509X27rum2LUQf1x5JSveeo8W7ZvR/I2Tz+V/p3qmxapvbCZZKZot+ie1NDOicif/HOv5fx2voYex5PMDzUwGLp+3iLLRlbZwj4n4d3bOOcfEeR89asrUnXOOWQQU8Y9PFi+my4Bs3k07gWa3O7RTK0r5wZr3ue+sq9DK84a9WWtSU5J90g2MOLSX1KZjbMwe4tE/XkMPY0XJqnKeX1HJZCCtoS7ud+aGCxH/zo5ScMoprtfZ2VAqt/1xidZQW0v69dfT/Qc3evj2/3z8GzLv/m+GV+2iNKvA42N5br5/97WAvMXbATzEP55DD2PF/MWlJHUxtXvTG+qA+N6ZGy5E/O3GiBEi/vHKsWMmC2vv3r4L9zsK4e7/5ozy9R7i7xRzyyiTV3bT0qULR04YjqprjvvQw1hRUVNP7+69AJf4O9s7MyL+dmPECHjzzVjPQrCittYc09J83xs8GAYO5JqW3byXkRpYHPmaNSSNHMm/fyUuvrbIzUilorqFFhRpbuLf2d1jIv6dGMuNPyNGQGWliWpwRjgI8YFT/Hv39n1PKTjrLPLef5/PF58d2EakNWvg7LPDO8dOiHNzZG33nq2Wvx3cYxLq2UlxbvzxDgtc1sWRyldcP/HH4cPmaGX5A5x1FuzfH9j/XVUVlJfbbmdvR3CmgzjSozcZDXWhl6j8v/+D/HyzvpadDUOHQkX8lSwX8e+k+Nv48/Buh8Uo4h9/tOX2ATjzTHP85JP2z7VmjTnaKKdPKBSPyyOvMI/i0k/5/H+LKT5rpCmXunVr8Cd77TVzd/3978O0aeYcH30U9jmHioh/J8XfYtXKpAzo0oXSj1cwed5Shsx9h8nzllKyqjzKMxR8cFr+Vm4fMGmZBwwwG/WsmDfP1G9OTobzzjNtIv6B8/vfw803w7XXwtVXm2yoY8eaeggW+yj88tlnMHEi/O1v8PTTkJoKK1dGbNodRXz+nRR/Oc2z+/bmcP5gdn3xNeXZZiHQLnHNcU97lr/D78/HH/smIPvqK7jrL
rOPY9Ik01ZYCFlZkZ1zZ2LaNJg2rXWtjCvH8ch7DzPu6qvh4Yehl4kIYtw4+POfrc9x5IgR+ltvNa+7dDEX4DgUf7H8OyltpZdd1aM/g6vKPN6Tbf9xQFsLvk7OOsv48rdvd7UdOwZz5phKXa++Cr/7nXnMmeP/PIIl7mtl5WnZXPr93/HAuT+hMqUndO0KNTWmNrb739+dL7804bpTprjaiorg66/jLq2KiH8npa2c5ht65zC4poKkFs81gc4e1xz3tLfgCy6/v7vr5957Yf16WLDAFCIXOoz3WllLUjJ/K/oexbPuhSVLXGHS/sKlP/3U3JGdfrqr7dRToa4u7iqpidunE+MvvezB/CF0W95E/qH97MrMaW3v7HHNcU9trRGOtrJKjhpliq//4x+QlGQ+c//9xsq/4ILozbWT0m4pxsJCs2O+pARuucW342efmRrJGRmutqIic1yxAk48Mcwz7jhi+duQMy6aDEDhQZfrxw5xzXHP4cPG5dNWDL9SJi/T0qVmUfLmm2HQIPjjH6M2zc5MQKUYi4uNhV9V5dmpqcmEeZ5xhmf7iSfG5aJvWMRfKTVdKVWqlNqilJpr8f7VSqlKpdRqx+PacIxrZ0pWlXc4WufMGebLOb5+f0hlI4UwU1vbtsvHyVNPmdKMzsf69Z6WptBhAirFWFxs/PdvveX54TVrjHvH3d8PZtF37Ni4q6sRsttHKZUM/BX4LlAGfKWUWqi13uDV9SWt9U2hjicEX+nJh379oE8fbs5v4eZ5F0VyqkIQlO/ax/HjSZwz9522UzekpMCQIb7tDoItASm4CKgU49ix5m6rpMRzUf3TT83R2/IH4/p56ilobjahuHFAOHz+E4EtWuttAEqpfwGXAN7iL4SJtio9BfwjlwRvcUXJqnL6bd9Lz+RUjx3ZEFz4bciGgdB+KUaljPW/YIEJ7XSu0Xz6KRQUmN293px6qokS2rQJRo6MyLyDJRxunzxgt9vrMkebN5cqpb5RSr2qlBpodSKl1HVKqRVKqRWVlZVhmFrnpN1FqUAQ8Y8r5i8upUfDEQ5369Ha1pHw2/ZKQAphorgYGhrg/ffNa63NYq+3y8fJqaeaYxy5fsIh/larU97b4d4CCrTWpwAfAs9anUhrvUBrXaS1LsqSzSl+CWhRqj1GjIA9e1yx5UJMqaipp+fxeo50TfVpD/Y8wbQLHWTKFJMYsaTEvN682eRdcnP5uK/LnfnmHpq6x9eibzjcPmWAuyWfD3hkMdJaH3B7+Xfg/jCMa1ucWQi9KzcFFa0zwtF30SITjZCSYm5HkyQALBbkZqTS61g9dV17+LQHex6RXmy9AAAa2UlEQVSrnd0SxhtmunSBGTPg+edNLp+mJtPusPy93W+7Dh9nTd8Ckt74kFnd3yE9NQWloOZoY8zWZcIh/l8Bw5RSQ4ByYDZwpXsHpVSO1nqP4+VMYGMYxrUtAS1KtcfJJ5vjFVe42p5+2oQPClHntmkj6H3PUeq6uUS6I+G3YTEMhMD41a+gf3/Xzt28vNY4fiv32zcDhnL5N++jWppxvz7Hal0mZPHXWjcppW4CFgPJwFNa6/VKqXuBFVrrhcDNSqmZQBNwELg61HHtTruLUu0xbJjJDnnggPFXzpoFO3eGb4JCUBSPzUU31qPS0lDQYWswLIaBEBjDhsEDD1i+ZeVmWztgKHNWvsUjb95PfUo3GpNT+OMZ/8H+3n1jUjYyLDt8tdaLgEVebb9xe34HcEc4xhLCiPviVHo6HDwYu7nYnaNHUS0tXD39FK7+n9DCb0M2DISQsXK/fTZ4LBuzCjh5n0kTPejQPirSsvjzZHP3He11GUnvIBj69jV3AUJsaC+dswUSzx+/WLnf9vfuywXXPNL6+rXnb+W8LctbxT/a6zKyuicY+vQRyz+WtJfO2Qt/ldqkLkN84J1YMSM1hZRkz8DIJUMncsreLfQ/XBWTdRmx/AVDnz5i+ceSQNI5uxGWjX5CRPF2v7nfqaWnp
rB89GT45DkurVjN8GtvTchoH6Ez0LevyRMjxIZA0jm7IfH8iYfPWozW8O79/M/xTRCDC7a4fQRKVpXz2vaj1JTt9ZskLpREckIABOn2CctGPyG2KAUzZ5o6AUeORH14EX+b4/QdlyX1IK3hCHsO1vn4jsW/HAWCXPANKPukEP/MmGEqsX3wARBdI0vE3+Y4fcc1qb1IQpN27IhPLhjJFxMFgrT826rUJiQQU6aYMOuFC6NuZInP3+Y4fcTVqUZ0MuoPU5Oa5uE7Fv9yeLEM0QxywRcknj/aRCS0NiUFLrwQ3n6bB4deEdVFfBF/m+PcjFLTvRcAmfWH2YGn7zjQfDESd94+/lIuj6yoYERysqn4JMQdkUyV/dUpZzDhxRe57tWHqeluLv6fF4xh2aBTgMgZWeL2sTlO3/Ehx5cuveGwj+84EP+yrAsEhj8X2tqNu43Lp60SjkLMiJTrs2RVOf95oD8Vvftxxer3uGHZK9yw7BXGl3/b2idSi/hi+dscp9Xy4gv7ARiSdIxiL99xIPliJO48MPxacc76vUJcEinX5/zFpVQmd+f0/3zG8v1ILuKL+AvGdzzoIngQ7j59gGXMcXv+ZVkX8MXKDebPhZaljwW82CtEn0ilym7r95EXYdepiL9gyMgwLocOpniQPPKeePuI9x6s4/hPruXJ5KPsOdRAI0n86fTZrB8wlNSUZEb2BJLF8o9XIpUq29/vJi8jlc/nnhPSudtDfP6CITnZXAA6mOJB4s498XaDjdy/nctWvUfXrVvoX3eQc7Z+ySUbP2kN0czmuFj+cUykQmtj+bsRy19w0bdvhy1/ySPvifft/Cl7NwNw1WX3sjtjAEueuIGLex7lOqd1V1sLgwdHe5pCEEQitDaWvxsRf8FFiMndJO7chfft/Ml7t1DTvRe70/sDsCNjAIM2bnJ9QBZ8bUusfjfi9hFctJfW+aGHoLjY9XjxxejNLcHwvp0fvXcLa/sPbQ3l3JUxgJyDe0xyLzCWv7h9hCgi4i+4aK+gy733whdfwI4d5njLLXD8eNSml0i4+4i7NjUyonIn6wYMbX1/Z0YOvY7XQ2WlqQErlr8QZUT8BRdtWf7V1VBTA3PnwurV8MwzsH8/vPVWVKeYSBSPy+PzueewYFxXurY0sdZN/Pf2c9zmb93qyugolr8QRUT8BRd9+8KhQ9DU5PueM9d/YaE5TpsGAwfC3/8e1BB2TA09tW43APuHn9waKfL9H5xp3ty6NeikboIQDmTBV3DRp485VldDVpbne97in5wM11xjXEE7dkBBQbunj2R+lLhmxQrIzOTV+690pW9oaDDPt22DoiLTJm4fIYqI5S+46NvXHK1cP07xHzLE1XbNNeb41FMBnd62qaFXroRTT/XM29O9O+TlieUvxAwRf8GF0/K3WvTdts3cDbhbp4MGwfTpRvytXEVe2DIFREMDrFvnsu7dOeEET/EXy1+IIiL+gov2LH+ny8edn/4Uysvh3XfbPb0tSw+uXQuNjcby98Yp/kHW7xWEcBAW8VdKTVdKlSqltiil5lq8300p9ZLj/eVKqYJwjCuEmfYsf4f4uy/anrmhJw19s+Cyy6BfP/NwuoO8sGUKiJUrzdFK/AsLYe9e8wCx/IWoErL4K6WSgb8CFwCjgCuUUqO8uv0EqNZaDwUeAu4PdVwhAjjF39vyb2qCnTuhsNAnb/+uw43c/N2fs23m5TB7NuTmwqJFlqe3ZenBlSvN39VqQfyEE8xxzRpzFMtfiCLhiPaZCGzRWm8DUEr9C7gE2ODW5xLgHsfzV4FHlFJKa+f2RiEuSE+HpCRfy3/3bmhuhsJCy0Xb9wePZ33GZJOF8J57TARQY6MpUeeF7VJArFjhu9jrxCn+q1aZo4h/QpKoFezCIf55wG6312XAaf76aK2blFKHgL5AVRjGF8JFUhJkZvpa/m5hnhWbjlh+tHXRNjfXpCzYtw/y8yM42TilpQVeeQXq6szfYd06uPVW675O8
V+71lwou3WL3jyFsJDI4cvhEH+runPeFn0gfVBKXQdcBzBo0KDQZyYEj1WKBzfxz83Y0nbe/txcc6yo6FTiH7B1t2SJcX+5M3Wq9Un79DFptGtqXIvtQkKRyBXswrHgWwYMdHudD1T466OU6gKkAz4hJVrrBVrrIq11UZb3JiMhOlileNi61VimeXntL9q6i38nIaj6xF98YVw8334Lu3aZO6Bp0/yf3BlBJYu9CUkihy+HQ/y/AoYppYYopboCs4GFXn0WAlc5nn8fWCr+/jjFn+VfUADJye0v2jrFf8+eaM46oviz7m55abVvioply+Dkk2HECJP+Iju77ZM7XT/i709IEjl8OWS3j8OHfxOwGEgGntJar1dK3Qus0FovBJ4EnldKbcFY/LP9n1GIKX36GD+1O14x/m0u2mZlmdQPncjyb8uK8/Dxjs2F5cvh0ksDP7mIf0ITqfKO0SAsuX201ouARV5tv3F73gD8IBxjCRHGyu2zbRtMnBjY55OTYcCAoMU/niMm/NVZddLq4+15xORFmjQp8JM7xV/cPglJIlewk8Ruggcbj6cw8vBhht9WQlbfNO78Tn8uqq623t3rj9zctsW/sdGENzYba+mD/U3cseJI3EZMWFl33lTU1BuXDwQl/p+1pHEG8Nb2OubNW5owwiG4SNTwZUnvILRSsqqcV7cfBSCtoY7ymnqeev4j82Y4xf/RR+G00+D00+H00znne1PpWeMZ9RtPCd/c1zn8kZuRalw+aWkwcmRA5y1ZVc7da80dRV3X1LYXkgUhzIj4C63MX1zK/pSeAGTUm3wz2VUOIQqn+K9ZY9JAvPce3HcfybqFgmrf/vEUMeEszPLw5WP9RzstWwYTJpj9EgEwf3Ep21MzOZLSnYM90oH4uugJnRtx+witVNTUU5hqfM8ZDUb8Bx1y5J1xT+XshqWvPicHqqrg2DHrjUubNsGoUSYEctAguOMO8g7tZ0X+SR7d4jFiwq+Pd0SmuajN9Ult5ZeKmnp0UjI/+I8HKE/L8mgXhEgj4i+0kpuRSnWqiTrJdFj+g2r2UtMjjYz0dJ/+/nY3Dk5KZRyYhGWDB/sOtGkTzJxpnjs28xXUebp9Ih0xEcoCs6WP99NPzRpGEP5+50Lyhv6FPu2CEGnE7SO0ctu0ETT0MiLvdPsUHNqH9mP1+4t//8euRvPCyvVTU2Nq/w4fbl737AlZWXwvszFqCd+C2rQVKMuXm+Np3plN/GPLLKdC3CCWv9BK8bg8ulx2GvwVrljzHlOqNjPhwDa6Tpxu2d+fe2Kj6uXoYCH+mzebo1P8AQYPpuBIlUkMFwUisiV/2TKzLhLEzvREDhMUEh8Rf8GDi6ecCGefzfjSUsaXr4beveCiiyz7+ot/V7kO8bIS/02bzNFL/H02lkWQoLbkP/883HijSdgGZh/E3LmmiI171tJly+Css4J2JyVqmKCQ+Ij4C54oBUuXBtTV3+7GnxZPgAdSrFM8bNpkomGcm5vAiP+iRSYLplXq4zDj76Jl6Wt/+mmT6vryy83rL780F4OHH4Y77zQb2mprobycb/JPTNgMj4L9EPEX/NKeFdum2yInp9Xydz/P39/9hEm5A+nVtatroMGDob4eKivbz4UTKvfdxzvPvmAibTR8OHQiD035obWv/dAhs5D7y1/CvHmmTWtzobr9dpgzx6P7gw0DqFeJmeFRsB8i/oIlgeYp9+u2cMT6e59nwL7dfN0ri4Oryl2fc0YE7dwZUfEvWVXOmIcfp/vROvblDSO/ei83LHuVd8+9nJ99b6zvv+ODD0wVM3e3l1Lm9fTpZpeys3B9WhqfPLfdclwJ3RTiERF/wZKQF0Vzc6G01PM8WjPkYDlf5p/Ek46NTPMXl5K+aReLgC8/+pqJEyYEPMdg/OvOi9AXtQd5e+QUfn3+fzL24E5K/n4j7+XsAavPvfOOKW7zne/4vpecDEVFnv/kjL2Bu5MEIcZIqKdgSch5yh2Wv3v/7LqD9
GxsYFufvNY7ifKaesrSjLX/7/e/CjjcMthwzfmLS2lsOEZmw2GqemQAsLrPYDYPKITnnvP9QEuLce9MmwZdArORJHRTSCRE/AVLQs5TnpsL1dUU9HR9xQqrjTBv75NHslKtdwS13XtR27UH2dX7Ak5t0NadiRUVNfX0PVoDQFXPjNb2l0dONZE6pV6fW7nS7EfwE+lkhS0L1AsJi4i/YEnIVqyjqMsdY9NbzzPkoFkA3pM9kGavWj7l6dnk1e4P+M4i2DuT3IxU+h3xFf/lk6ab6KPnn/f8wKJFxr8/3XqPgz+cOYC2z7uIz+eeI8IvxC3i8xcsCXkDUk4OAOdntnDfrNHMX1xK4cFyjnXpys1Xnc38DzZ7+MfL07LIP7Q/4DuLoMI1MRezRWs+B6CyZyZgLmbXzDqdfR+fif7r3zm98TTSenRDKXjm8Rcg70Su/tvX1Bxt9Pvvj+c6BILQFiL+gl9C2oDkVsu3+LIzzHm+eBhGjqD41IGQlOQZTZSezcSyDQHfWQRbQal4XB6DTjQ7j6t6ZpLnEGqAB7JP48Ev/s3E3etYNugU+h2pZsyezfxhyg+pPmpSVVhFOwUaESUI8YiIvxAZrAq5b9pk6tvie2dR1z+PtK+PUDykZ0Cn78idyfhuxwD45MHZJqcQMHneUg4UTuCerqnc8tk/+eiETQyv2gXARyd4Rh55RztFJE2EIEQJEX8hMmRmmnTOTvFvaoKtWz3q23rcWbxyFN5dYGL9MzIsTuhL0Hcm+/YZ0e/pusBU1NSjU7rzyujvcs3KhUzabdJMbOo7iPXZvjUM3NcUQo6IEoQYIuIvRAalPIu6bN9uLgDuOX3ccd/oNWZMZOa0bx/07+/R5Fw7uPfcn/LAWT9ubT+enGKZasJ9TSHYdQdBiCck2keIHLm5rvw+Vgnd3HEX/0ixb5/JxeNGa1STUjSkdG99tCQl+3zce01B4vqFREYsfyFy5ObChx/CtdfCt9+aNn/in50N3btHXvy9xvdeO0hPTUEpqDna6PE86NxGghDniPgLkeOCC+D//s/U6gWzW7ZvX+u+SpmqXpEW/ylTfJpDiWqSlMxCoiLiL0SOOXN8Ml+2yeDBkRP/xkZTV9jL5y8IdiUkn79Sqo9S6gOl1GbHMdNPv2al1GrHY2EoYwqdmIIC2LEjMueurDRHEX9BAEJf8J0LLNFaDwOWOF5bUa+1Hut4zAxxTKGzMniwEenaWhMZ1Nzc/mcCZd8+c/Ra8BUEuxKq2+cSYKrj+bPAv4HbQzynYFecheLTTRF5UlNNzvwRwUXPWKZccIq/WP6CAIQu/v211nsAtNZ7lFL+KnF0V0qtAJqAeVrrkhDHFTojM2fCH/5gqnodOGBKJX79dVDi7y/lwsCUbzkVRPwFwUG74q+U+hCwule+K4hxBmmtK5RShcBSpdRarfVWi7GuA64DGDRoUBCnFzoFvXqZkokAdXVG/INcAPaXcmHZig0i/oLgRrvir7U+z997Sql9Sqkch9WfA+z3c44Kx3GbUurfwDjAR/y11guABQBFRUXa+30hcQg522WvXtCnT9Di7y+1Qreq/dCjhzmvIAghL/guBK5yPL8KeNO7g1IqUynVzfG8HzAZ2BDiuEIcE2yVLb90IPTTX2qFQY11stgrCG6EKv7zgO8qpTYD33W8RilVpJR6wtFnJLBCKbUG+Ajj8xfx78QEW2XLLx0Qf38pF8Z2OyYuH0FwI6QFX631AeBci/YVwLWO518Ao0MZR0gswpbtcvBg+OAD0NoyyZoV/lIuZJfUwNChwY0vCJ0Y2eErhJ2wZbscPBiOHIGDB/2nhbDAMuXCvn0weXJw4wtCJ0ayegphJ2zZLgsKzDHUXb9NTZLaQRC8EPEXwk7xuDzumzWavIxUFJCXkcp9s0YHnwAtXGmeKyuN60gWfAWhFXH7CBEhLNku2xP/hgYYNcr1vlLwpz/BjTd69pPdvYLgg1j+QvzSp48puehP/NesMRXCZs+GO
++Ek06C++4zGTzdEfEXBB/E8hfiF6XaDvdcscIc582DgQPhtNNgxgx4/XW4/HJXPxF/QfBBLH8hvmlP/LOyID/fvL7wQhPO+fDDnv1E/AXBBxF/Ib4ZPNh/tM+KFVBU5NoDkJQEN98My5bBl1+6+u3bJ6kdBMELEX8hvikogOpqOHzYs/3oUdiwwYi/O1dfDWlpZuHXyd69xuoPcKOYINgBEX8hvvEX8bN6NbS0+Ip/795wzTXw8stQUWHa9u0Tl48geCHiL8Q3/sTfudh76qm+n/n5z00VsOHDTcTQRx+J+AuCFxLtI8Q3/sR/5UqzaSs31/czhYXwxBPm7sDJFVdEbo6CkICI+AvxzYABNKd05cWXPuHXuwa7agN4L/Z6c8010Z2nICQY4vYR4pqSNXso69WX9P0VrbUBfvevL9EbN/r6+wVBCBgRfyGumb+4lLK0LPIPuYrEDSnfjNLa2t8vCEJAiPgLcU1FTT1laf3Jq3WJ/yl7tpgnIv6C0GHE5y/EHe71f5OUojw9m+wj1XRrOs6xLl0ZvXcz+9P6kZ2TE+upCkLCIpa/EFd41/9t1prytGwAcmorAThl31aax42P4SwFIfERy1+IK6zq/5anZwHw5Gu/ozG1B4UHy1Dn/jQW0xOEToOIvxBXWNX5XTNgOCWjplI8pKdpmHiSSeMsCEKHEbePEFdY1fmt79qd+T/6Nbz9NiW/e5zJE29iyJObmDxvKSWrymMwS0FIfET8hbiirfq/3usB5TX13PH6WrkACEIHELePEFc4Sz86o31ad/SOy2PyvKU+6wH1jc3MX1zqt2Ske+SQ+7kEwe6I+Atxh7/6v1brAW21O+8UnBcM552CcwxBsDMi/kLCkJuRSrmF0OdmpFpa+FaRQ+3dKQiCXQjJ56+U+oFSar1SqkUp5TfRilJqulKqVCm1RSk1N5QxBfvibz3g7BOzLNcCrC4U4P9OQRDsRKgLvuuAWcAn/joopZKBvwIXAKOAK5RSo0IcV7AhxePyuG/WaPIyUlFAXkYq980azUffVlpa+Ml+Mn5aRRQJgt0Iye2jtd4IoNoujzcR2KK13ubo+y/gEmBDKGML9sRqPeC/X1pt2bdZa1JTkj0uDM7IIUGwO9EI9cwDdru9LnO0+aCUuk4ptUIptaKysjIKUxM6A/4seeedgfedgvj7BSEAy18p9SEwwOKtu7TWbwYwhtVtgbbqqLVeACwAKCoqsuwjCN7cNm2ER1QPuCx8f5FDgmB32hV/rfV5IY5RBgx0e50PVIR4TkFopa29AYIgWBONUM+vgGFKqSFAOTAbuDIK4wo2Qix8QQiOUEM9v6eUKgO+A7yjlFrsaM9VSi0C0Fo3ATcBi4GNwMta6/WhTVsQBEEIhVCjfd4A3rBorwAudHu9CFgUyliCIAhC+JDEboIgCDZExF8QBMGGiPgLgiDYEBF/QRAEG6K0js+9VEqpSmBnrOcRBvoBVbGeRJwgfwtP5O/hQv4WnoTy9xistc5qr1Pcin9nQSm1QmvtN+OpnZC/hSfy93AhfwtPovH3ELePIAiCDRHxFwRBsCEi/pFnQawnEEfI38IT+Xu4kL+FJxH/e4jPXxAEwYaI5S8IgmBDRPwjgFJqoFLqI6XURkeN4/+K9ZziAaVUslJqlVLq7VjPJZYopTKUUq8qpb51fEe+E+s5xRKl1H87fifrlFIvKqW6x3pO0UQp9ZRSar9Sap1bWx+l1AdKqc2OY2a4xxXxjwxNwC+11iOBScCNUrcYgP/CZHa1O38C3tNanwiMwcZ/E6VUHnAzUKS1PhlIxqR9txPPANO92uYCS7TWw4AljtdhRcQ/Amit92itv3Y8P4z5cds62bxSKh+4CHgi1nOJJUqpNOBM4EkArfVxrXVNbGcVc7oAqUqpLkAPbFbsSWv9CXDQq/kS4FnH82eB4nCPK+IfYZRSBcA4YHlsZxJzHgb+B2iJ9URiTCFQCTztcIE9oZTqGetJxQqtdTnwB2AXs
\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "plt.scatter(data, target)\n", "plt.plot(data, tree.predict(data), color='r')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Prune the tree, then re-plot the fitted curve\n", "tree.prune(1)\n", "plt.scatter(data, target)\n", "plt.plot(data, tree.predict(data), color='r')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }