{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Plot individual and voting regression predictions\n\n.. currentmodule:: sklearn\n\nA voting regressor is an ensemble meta-estimator that fits several base\nregressors, each on the whole dataset. Then it averages the individual\npredictions to form a final prediction.\nWe will use three different regressors to predict the data:\n:class:`~ensemble.GradientBoostingRegressor`,\n:class:`~ensemble.RandomForestRegressor`, and\n:class:`~linear_model.LinearRegression`.\nThese three regressors will then be combined in a\n:class:`~ensemble.VotingRegressor`.\n\nFinally, we will plot the predictions made by all models for comparison.\n\nWe will work with the diabetes dataset, which consists of 10 features\ncollected from a cohort of diabetes patients. The target is a quantitative\nmeasure of disease progression one year after baseline.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport matplotlib.pyplot as plt\n\nfrom sklearn.datasets import load_diabetes\nfrom sklearn.ensemble import (\n GradientBoostingRegressor,\n RandomForestRegressor,\n VotingRegressor,\n)\nfrom sklearn.linear_model import LinearRegression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training regressors\n\nFirst, we will load the diabetes dataset and instantiate a gradient boosting\nregressor, a random forest regressor and a linear regression. 
Next, we will\nuse the three regressors to build the voting regressor:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X, y = load_diabetes(return_X_y=True)\n\n# Train regressors\nreg1 = GradientBoostingRegressor(random_state=1)\nreg2 = RandomForestRegressor(random_state=1)\nreg3 = LinearRegression()\n\nreg1.fit(X, y)\nreg2.fit(X, y)\nreg3.fit(X, y)\n\nereg = VotingRegressor([(\"gb\", reg1), (\"rf\", reg2), (\"lr\", reg3)])\nereg.fit(X, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making predictions\n\nNow we will use each of the regressors to make predictions for the first 20\nsamples.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "xt = X[:20]\n\npred1 = reg1.predict(xt)\npred2 = reg2.predict(xt)\npred3 = reg3.predict(xt)\npred4 = ereg.predict(xt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot the results\n\nFinally, we will visualize the 20 predictions. 
The red stars show the average\nprediction made by :class:`~ensemble.VotingRegressor`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plt.figure()\nplt.plot(pred1, \"gd\", label=\"GradientBoostingRegressor\")\nplt.plot(pred2, \"b^\", label=\"RandomForestRegressor\")\nplt.plot(pred3, \"ys\", label=\"LinearRegression\")\nplt.plot(pred4, \"r*\", ms=10, label=\"VotingRegressor\")\n\nplt.tick_params(axis=\"x\", which=\"both\", bottom=False, top=False, labelbottom=False)\nplt.ylabel(\"predicted\")\nplt.xlabel(\"training samples\")\nplt.legend(loc=\"best\")\nplt.title(\"Regressor predictions and their average\")\n\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }