{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Decision Tree Regression with AdaBoost\n\nA decision tree is boosted using the AdaBoost.R2 [1] algorithm on a 1D\nsinusoidal dataset with a small amount of Gaussian noise.\n299 boosts (300 decision trees) are compared with a single decision tree\nregressor. As the number of boosts increases, the regressor can fit more\ndetail.\n\nSee `sphx_glr_auto_examples_ensemble_plot_hgbt_regression.py` for an\nexample showcasing the benefits of using more efficient regression models such\nas `sklearn.ensemble.HistGradientBoostingRegressor`.\n\n[1] [H. Drucker, \"Improving Regressors using Boosting Techniques\", 1997.](https://citeseerx.ist.psu.edu/doc_view/pid/8d49e2dedb817f2c3330e74b63c5fc86d2399ce3)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing the data\nFirst, we prepare dummy data with a sinusoidal relationship and some Gaussian noise.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport numpy as np\n\n# Seeded generator so the noise is reproducible\nrng = np.random.RandomState(1)\n# 100 evenly spaced points in [0, 6], shaped (100, 1) as scikit-learn expects\nX = np.linspace(0, 6, 100)[:, np.newaxis]\n# Sum of two sinusoids plus Gaussian noise with standard deviation 0.1\ny = np.sin(X).ravel() + np.sin(6 * X).ravel() + rng.normal(0, 0.1, X.shape[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training and prediction with DecisionTree and AdaBoost Regressors\nNow, we define the regressors and fit them to the data.\nThen we predict on that same data to see how well they fit it.\nThe first regressor is a `DecisionTreeRegressor` with `max_depth=4`.\nThe second regressor is an `AdaBoostRegressor` that uses a `DecisionTreeRegressor`\nwith `max_depth=4` as its base learner and is built with `n_estimators=300`\nof those base learners.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.ensemble import AdaBoostRegressor\nfrom sklearn.tree import DecisionTreeRegressor\n\n# Baseline: a single depth-4 tree\nregr_1 = DecisionTreeRegressor(max_depth=4)\n\n# Boosted ensemble of 300 depth-4 trees\nregr_2 = AdaBoostRegressor(\n    DecisionTreeRegressor(max_depth=4), n_estimators=300, random_state=rng\n)\n\nregr_1.fit(X, y)\nregr_2.fit(X, y)\n\n# Predict on the training inputs to inspect the quality of the fit\ny_1 = regr_1.predict(X)\ny_2 = regr_2.predict(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting the results\nFinally, we plot how well our two regressors,\nthe single decision tree regressor and the AdaBoost regressor, fit the data.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\nimport seaborn as sns\n\ncolors = sns.color_palette(\"colorblind\")\n\nplt.figure()\nplt.scatter(X, y, color=colors[0], label=\"training samples\")\nplt.plot(X, y_1, color=colors[1], label=\"n_estimators=1\", linewidth=2)\nplt.plot(X, y_2, color=colors[2], label=\"n_estimators=300\", linewidth=2)\nplt.xlabel(\"data\")\nplt.ylabel(\"target\")\nplt.title(\"Boosted Decision Tree Regression\")\nplt.legend()\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }