{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# StackingRegressor: a simple stacking implementation for regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ensemble-learning meta-regressor for stacking regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> from mlxtend.regressor import StackingRegressor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Stacking regression is an ensemble learning technique to combine multiple regression models via a meta-regressor. The individual regression models are trained based on the complete training set; then, the meta-regressor is fitted based on the outputs -- meta-features -- of the individual regression models in the ensemble." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./StackingRegressor_files/stackingregression_overview.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### References\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Breiman, Leo. \"[Stacked regressions.](https://link.springer.com/article/10.1023/A:1018046112532#page-1)\" Machine learning 24.1 (1996): 49-64." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 1 - Simple Stacked Regression" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from mlxtend.regressor import StackingRegressor\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.linear_model import Ridge\n", "from sklearn.svm import SVR\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import warnings\n", "\n", "warnings.simplefilter('ignore')\n", "\n", "# Generating a sample dataset\n", "np.random.seed(1)\n", "X = np.sort(5 * np.random.rand(40, 1), axis=0)\n", "y = np.sin(X).ravel()\n", "y[::5] += 3 * (0.5 - np.random.rand(8))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean Squared Error: 0.1846\n", "Variance Score: 0.7329\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Initializing models\n", "\n", "lr = LinearRegression()\n", "svr_lin = SVR(kernel='linear')\n", "ridge = Ridge(random_state=1)\n", "svr_rbf = SVR(kernel='rbf')\n", "\n", "stregr = StackingRegressor(regressors=[svr_lin, lr, ridge],\n", "                           meta_regressor=svr_rbf)\n", "\n", "# Training the stacking regressor\n", "\n", "stregr.fit(X, y)\n", "\n", "# Evaluate and visualize the fit\n", "\n", "print(\"Mean Squared Error: %.4f\"\n", "      % np.mean((stregr.predict(X) - y) ** 2))\n", "print('Variance Score: %.4f' % stregr.score(X, y))\n", "\n", "with plt.style.context(('seaborn-whitegrid')):\n", "    plt.scatter(X, y, c='lightgray')\n", "    plt.plot(X, stregr.predict(X), c='darkgreen', lw=2)\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "StackingRegressor(meta_regressor=SVR(),\n", " regressors=[SVR(kernel='linear'), LinearRegression(),\n", " Ridge(random_state=1)])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stregr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example 2 - Stacked Regression and GridSearch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this second example, we demonstrate how `StackingRegressor` works in combination with `GridSearchCV`. The stack still allows tuning the hyperparameters of the base and meta models.\n", "\n", "For instance, we can use `estimator.get_params().keys()` to get a full list of tunable parameters." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best: -0.082717 using {'lasso__alpha': 0.1, 'meta_regressor__C': 1.0, 'meta_regressor__gamma': 1.0, 'ridge__alpha': 0.1, 'svr__C': 10.0}\n" ] } ], "source": [ "from sklearn.model_selection import GridSearchCV\n", "from sklearn.linear_model import Lasso\n", "\n", "# Initializing models\n", "\n", "lr = LinearRegression()\n", "svr_lin = SVR(kernel='linear')\n", "ridge = Ridge(random_state=1)\n", "lasso = Lasso(random_state=1)\n", "svr_rbf = SVR(kernel='rbf')\n", "regressors = [svr_lin, lr, ridge, lasso]\n", "stregr = StackingRegressor(regressors=regressors, \n", " meta_regressor=svr_rbf)\n", "\n", "params = {'lasso__alpha': [0.1, 1.0, 10.0],\n", " 'ridge__alpha': [0.1, 1.0, 10.0],\n", " 'svr__C': [0.1, 1.0, 10.0],\n", " 'meta_regressor__C': [0.1, 1.0, 10.0, 100.0],\n", " 'meta_regressor__gamma': [0.1, 1.0, 10.0]}\n", "\n", "grid = GridSearchCV(estimator=stregr, \n", " param_grid=params, \n", " cv=5,\n", " refit=True)\n", "grid.fit(X, y)\n", "\n", "print(\"Best: %f using %s\" % (grid.best_score_, grid.best_params_))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-9.810 +/- 6.86 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 0.1, 'svr__C': 0.1}\n", "-9.591 +/- 6.67 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 0.1, 'svr__C': 1.0}\n", "-9.591 +/- 6.67 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 0.1, 'svr__C': 10.0}\n", "-9.819 +/- 6.87 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 1.0, 'svr__C': 0.1}\n", "-9.600 +/- 6.68 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 1.0, 'svr__C': 1.0}\n", "-9.600 +/- 6.68 
{'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 1.0, 'svr__C': 10.0}\n", "-9.878 +/- 6.91 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 10.0, 'svr__C': 0.1}\n", "-9.665 +/- 6.71 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 10.0, 'svr__C': 1.0}\n", "-9.665 +/- 6.71 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 0.1, 'ridge__alpha': 10.0, 'svr__C': 10.0}\n", "-4.839 +/- 3.98 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 1.0, 'ridge__alpha': 0.1, 'svr__C': 0.1}\n", "-3.986 +/- 3.16 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 1.0, 'ridge__alpha': 0.1, 'svr__C': 1.0}\n", "-3.986 +/- 3.16 {'lasso__alpha': 0.1, 'meta_regressor__C': 0.1, 'meta_regressor__gamma': 1.0, 'ridge__alpha': 0.1, 'svr__C': 10.0}\n", "...\n", "Best parameters: {'lasso__alpha': 0.1, 'meta_regressor__C': 1.0, 'meta_regressor__gamma': 1.0, 'ridge__alpha': 0.1, 'svr__C': 10.0}\n", "Best score: -0.08\n" ] } ], "source": [ "cv_keys = ('mean_test_score', 'std_test_score', 'params')\n", "\n", "# Print the mean score, spread, and parameters of the first few grid points\n", "for r, _ in enumerate(grid.cv_results_['mean_test_score']):\n", "    print(\"%0.3f +/- %0.2f %r\"\n", "          % (grid.cv_results_[cv_keys[0]][r],\n", "             grid.cv_results_[cv_keys[1]][r] / 2.0,\n", "             grid.cv_results_[cv_keys[2]][r]))\n", "    if r > 10:\n", "        break\n", "print('...')\n", "\n", "print('Best parameters: %s' % grid.best_params_)\n", "print('Best score: %.2f' % grid.best_score_)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean Squared Error: 0.1845\n", "Variance Score: 0.7330\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Evaluate and visualize the fit\n", "print(\"Mean Squared Error: %.4f\"\n", "      % np.mean((grid.predict(X) - y) ** 2))\n", "print('Variance Score: %.4f' % grid.score(X, y))\n", "\n", "with plt.style.context(('seaborn-whitegrid')):\n", "    plt.scatter(X, y, c='lightgray')\n", "    plt.plot(X, grid.predict(X), c='darkgreen', lw=2)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**\n", "\n", "The `StackingRegressor` also enables grid search over the `regressors` argument and even over a single base regressor. When there are level-mixed hyperparameters, `GridSearchCV` will try to replace hyperparameters in a top-down order, i.e., `regressors` -> single base regressor -> regressor hyperparameter. For instance, given a hyperparameter grid such as\n", "\n", "    params = {'randomforestregressor__n_estimators': [1, 100],\n", "              'regressors': [(regr1, regr1, regr1), (regr2, regr3)]}\n", "\n", "it will first use the instance settings of either `(regr1, regr1, regr1)` or `(regr2, regr3)`. Then it will replace the `'n_estimators'` settings for a matching regressor based on `'randomforestregressor__n_estimators': [1, 100]`." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## API" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "## StackingRegressor\n", "\n", "*StackingRegressor(regressors, meta_regressor, verbose=0, use_features_in_secondary=False, store_train_meta_features=False, refit=True, multi_output=False)*\n", "\n", "A Stacking regressor for scikit-learn estimators for regression.\n", "\n", "**Parameters**\n", "\n", "- `regressors` : array-like, shape = [n_regressors]\n", "\n", " A list of regressors.\n", " Invoking the `fit` method on the `StackingRegressor` will fit clones\n", " of those original regressors that will\n", " be stored in the class attribute\n", " `self.regr_`.\n", "\n", "\n", "- `meta_regressor` : object\n", "\n", " The meta-regressor to be fitted on the ensemble of\n", " regressors\n", "\n", "\n", "- `verbose` : int, optional (default=0)\n", "\n", " Controls the verbosity of the building process.\n", " - `verbose=0` (default): Prints nothing\n", " - `verbose=1`: Prints the number & name of the regressor being fitted\n", " - `verbose=2`: Prints info about the parameters of the\n", " regressor being fitted\n", " - `verbose>2`: Changes `verbose` param of the underlying regressor to\n", " self.verbose - 2\n", "\n", "\n", "- `use_features_in_secondary` : bool (default: False)\n", "\n", " If True, the meta-regressor will be trained both on\n", " the predictions of the original regressors and the\n", " original dataset.\n", " If False, the meta-regressor will be trained only on\n", " the predictions of the original regressors.\n", "\n", "\n", "- `store_train_meta_features` : bool (default: False)\n", "\n", " If True, the meta-features computed from the training data\n", " used for fitting the\n", " meta-regressor are stored in the `self.train_meta_features_` array,\n", " which can be\n", " accessed after calling `fit`.\n", "\n", "\n", "- `refit` : bool (default: True)\n", "\n", " Clones the regressors for stacking regression if True (default)\n", " or else uses the original ones, which will be refitted on the dataset\n", " upon calling the `fit` method. Setting refit=False is\n", " recommended if you are working with estimators that support\n", " the scikit-learn fit/predict API interface but are not compatible\n", " with scikit-learn's `clone` function.\n", "\n", "\n", "**Attributes**\n", "\n", "- `regr_` : 
list, shape=[n_regressors]\n", "\n", " Fitted regressors (clones of the original regressors)\n", "\n", "\n", "- `meta_regr_` : estimator\n", "\n", " Fitted meta-regressor (clone of the original meta-estimator)\n", "\n", "\n", "- `coef_` : array-like, shape = [n_features]\n", "\n", " Model coefficients of the fitted meta-estimator\n", "\n", "\n", "- `intercept_` : float\n", "\n", " Intercept of the fitted meta-estimator\n", "\n", "\n", "- `train_meta_features_` : numpy array, shape = [n_samples, len(self.regressors)]\n", "\n", " Meta-features for training data, where n_samples is the\n", " number of samples\n", " in training data and len(self.regressors) is the number of regressors.\n", "\n", "**Examples**\n", "\n", "For usage examples, please see\n", " https://rasbt.github.io/mlxtend/user_guide/regressor/StackingRegressor/\n", "\n", "### Methods\n", "\n", "
\n", "\n", "*fit(X, y, sample_weight=None)*\n", "\n", "Learn weight coefficients from training data for each regressor.\n", "\n", "**Parameters**\n", "\n", "- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]\n", "\n", " Training vectors, where n_samples is the number of samples and\n", " n_features is the number of features.\n", "\n", "\n", "- `y` : numpy array, shape = [n_samples] or [n_samples, n_targets]\n", "\n", " Target values. Multiple targets are supported only if\n", " self.multi_output is True.\n", "\n", "\n", "- `sample_weight` : array-like, shape = [n_samples], optional\n", "\n", " Sample weights passed as sample_weights to each regressor\n", " in the regressors list as well as the meta_regressor.\n", " Raises error if some regressor does not support\n", " sample_weight in the fit() method.\n", "\n", "**Returns**\n", "\n", "- `self` : object\n", "\n", "\n", "
\n", "\n", "*fit_transform(X, y=None, **fit_params)*\n", "\n", "Fit to data, then transform it.\n", "\n", " Fits transformer to `X` and `y` with optional parameters `fit_params`\n", " and returns a transformed version of `X`.\n", "\n", "**Parameters**\n", "\n", "- `X` : array-like of shape (n_samples, n_features)\n", "\n", " Input samples.\n", "\n", "\n", "- `y` : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None\n", "\n", " Target values (None for unsupervised transformations).\n", "\n", "\n", "- `**fit_params` : dict\n", "\n", " Additional fit parameters.\n", "\n", "**Returns**\n", "\n", "- `X_new` : ndarray array of shape (n_samples, n_features_new)\n", "\n", " Transformed array.\n", "\n", "
\n", "\n", "*get_params(deep=True)*\n", "\n", "Return estimator parameter names for GridSearch support.\n", "\n", "
\n", "\n", "*predict(X)*\n", "\n", "Predict target values for X.\n", "\n", "**Parameters**\n", "\n", "- `X` : {array-like, sparse matrix}, shape = [n_samples, n_features]\n", "\n", " Input vectors to predict on, where n_samples is the number of samples and\n", " n_features is the number of features.\n", "\n", "**Returns**\n", "\n", "- `y_target` : array-like, shape = [n_samples] or [n_samples, n_targets]\n", "\n", " Predicted target values.\n", "\n", "
\n", "\n", "*predict_meta_features(X)*\n", "\n", "Get meta-features of test-data.\n", "\n", "**Parameters**\n", "\n", "- `X` : numpy array, shape = [n_samples, n_features]\n", "\n", " Test vectors, where n_samples is the number of samples and\n", " n_features is the number of features.\n", "\n", "**Returns**\n", "\n", "- `meta-features` : numpy array, shape = [n_samples, len(self.regressors)]\n", "\n", " meta-features for test data, where n_samples is the number of\n", " samples in test data and len(self.regressors) is the number\n", " of regressors. If self.multi_output is True, then the number of\n", " columns is len(self.regressors) * n_targets\n", "\n", "
\n", "\n", "*score(X, y, sample_weight=None)*\n", "\n", "Return the coefficient of determination $R^2$ of the prediction.\n", "\n", "The coefficient $R^2$ is defined as $1 - \\frac{u}{v}$,\n", " where $u$ is the residual sum of squares `((y_true - y_pred) ** 2).sum()`\n", " and $v$ is the total sum of squares `((y_true - y_true.mean()) ** 2).sum()`.\n", " The best possible score is 1.0, and it can be negative (because the\n", " model can be arbitrarily worse). A constant model that always predicts\n", " the expected value of `y`, disregarding the input features, would get\n", " an $R^2$ score of 0.0.\n", "\n", "**Parameters**\n", "\n", "- `X` : array-like of shape (n_samples, n_features)\n", "\n", " Test samples. For some estimators this may be a precomputed\n", " kernel matrix or a list of generic objects instead with shape\n", " `(n_samples, n_samples_fitted)`, where `n_samples_fitted`\n", " is the number of samples used in the fitting for the estimator.\n", "\n", "\n", "- `y` : array-like of shape (n_samples,) or (n_samples, n_outputs)\n", "\n", " True values for `X`.\n", "\n", "\n", "- `sample_weight` : array-like of shape (n_samples,), default=None\n", "\n", " Sample weights.\n", "\n", "**Returns**\n", "\n", "- `score` : float\n", "\n", " $R^2$ of `self.predict(X)` with respect to `y`.\n", "\n", "**Notes**\n", "\n", "The $R^2$ score used when calling `score` on a regressor uses\n", " `multioutput='uniform_average'` from version 0.23 to keep consistent\n", " with the default value of `sklearn.metrics.r2_score`.\n", " This influences the `score` method of all the multioutput\n", " regressors (except for\n", " `sklearn.multioutput.MultiOutputRegressor`).\n", "\n", "
\n", "\n", "*set_params(**params)*\n", "\n", "Set the parameters of this estimator.\n", "\n", " Valid parameter keys can be listed with ``get_params()``.\n", "\n", "**Returns**\n", "\n", "self\n", "\n", "### Properties\n", "\n", "
\n", "\n", "*coef_*\n", "\n", "None\n", "\n", "
\n", "\n", "*intercept_*\n", "\n", "None\n", "\n", "
\n", "\n", "*named_regressors*\n", "\n", "None\n", "\n", "\n" ] } ], "source": [ "with open('../../api_modules/mlxtend.regressor/StackingRegressor.md', 'r') as f:\n", " print(f.read())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }