{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Isotonic Regression\n\nAn illustration of the isotonic regression on generated data (non-linear\nmonotonic trend with homoscedastic uniform noise).\n\nThe isotonic regression algorithm finds a non-decreasing approximation of a\nfunction while minimizing the mean squared error on the training data. The\nbenefit of such a non-parametric model is that it does not assume any shape for\nthe target function besides monotonicity. For comparison a linear regression is\nalso presented.\n\nThe plot on the right-hand side shows the model prediction function that\nresults from the linear interpolation of thresholds points. The thresholds\npoints are a subset of the training input observations and their matching\ntarget values are computed by the isotonic non-parametric fit.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport matplotlib.pyplot as plt\nimport numpy as np\nfrom matplotlib.collections import LineCollection\n\nfrom sklearn.isotonic import IsotonicRegression\nfrom sklearn.linear_model import LinearRegression\nfrom sklearn.utils import check_random_state\n\nn = 100\nx = np.arange(n)\nrs = check_random_state(0)\ny = rs.randint(-50, 50, size=(n,)) + 50.0 * np.log1p(np.arange(n))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit IsotonicRegression and LinearRegression models:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ir = IsotonicRegression(out_of_bounds=\"clip\")\ny_ = ir.fit_transform(x, y)\n\nlr = LinearRegression()\nlr.fit(x[:, np.newaxis], y) # x needs to be 2d for LinearRegression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot results:\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "segments = [[[i, y[i]], [i, y_[i]]] for i in range(n)]\nlc = LineCollection(segments, zorder=0)\nlc.set_array(np.ones(len(y)))\nlc.set_linewidths(np.full(n, 0.5))\n\nfig, (ax0, ax1) = plt.subplots(ncols=2, figsize=(12, 6))\n\nax0.plot(x, y, \"C0.\", markersize=12)\nax0.plot(x, y_, \"C1.-\", markersize=12)\nax0.plot(x, lr.predict(x[:, np.newaxis]), \"C2-\")\nax0.add_collection(lc)\nax0.legend((\"Training data\", \"Isotonic fit\", \"Linear fit\"), loc=\"lower right\")\nax0.set_title(\"Isotonic regression fit on noisy data (n=%d)\" % n)\n\nx_test = np.linspace(-10, 110, 1000)\nax1.plot(x_test, ir.predict(x_test), \"C1-\")\nax1.plot(ir.X_thresholds_, ir.y_thresholds_, \"C1.\", markersize=12)\nax1.set_title(\"Prediction function (%d thresholds)\" % len(ir.X_thresholds_))\n\nplt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that we explicitly passed `out_of_bounds=\"clip\"` to the constructor of\n`IsotonicRegression` to control the way the model extrapolates outside of the\nrange of data observed in the training set. This \"clipping\" extrapolation can\nbe seen on the plot of the decision function on the right-hand.\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }