{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Regularization path of L1-Logistic Regression\n\n\nTrain L1-penalized logistic regression models on a binary classification\nproblem derived from the Iris dataset.\n\nThe models are ordered from most strongly regularized to least regularized. The 4\ncoefficients of each model are collected and plotted as a \"regularization\npath\": on the left-hand side of the figure (strong regularization, small `C`), all the\ncoefficients are exactly 0. As regularization gets progressively looser,\nthe coefficients become non-zero one after the other.\n\nHere we choose the liblinear solver because it can efficiently optimize the\nlogistic regression loss with a non-smooth, sparsity-inducing L1 penalty.\n\nAlso note that we set a low value for the tolerance to make sure that the model\nhas converged before collecting the coefficients.\n\nWe also use `warm_start=True`, which means that the coefficients of each fitted model are\nreused to initialize the next fit, to speed up the computation of the\nfull path.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn import datasets\n\niris = datasets.load_iris()\nX = iris.data\ny = iris.target\nfeature_names = iris.feature_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we remove the third class to turn the problem into a binary classification problem.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "X = X[y != 2]\ny = y[y != 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compute regularization path\n\n" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.svm import l1_min_c\n\n# Grid of 16 C values, log-spaced from the lower bound returned by l1_min_c\n# up to 10 times that bound.\ncs = l1_min_c(X, y, loss=\"log\") * np.logspace(0, 1, 16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a pipeline with `StandardScaler` and `LogisticRegression` to standardize\nthe data before fitting a linear model, which speeds up convergence and\nmakes the coefficients comparable. As a side effect, since the data is now\ncentered around 0, we don't need to fit an intercept.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "clf = make_pipeline(\n    StandardScaler(),\n    LogisticRegression(\n        l1_ratio=1,\n        solver=\"liblinear\",\n        tol=1e-6,\n        max_iter=int(1e6),\n        warm_start=True,\n        fit_intercept=False,\n    ),\n)\n\n# Fit one model per value of C and store a copy of its coefficients.\ncoefs_ = []\nfor c in cs:\n    clf.set_params(logisticregression__C=c)\n    clf.fit(X, y)\n    coefs_.append(clf[\"logisticregression\"].coef_.ravel().copy())\n\n# Array of shape (n_values_of_C, n_features).\ncoefs_ = np.array(coefs_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot regularization path\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n\n# Colorblind-friendly palette (IBM Color Blind Safe palette)\ncolors = [\"#648FFF\", \"#785EF0\", \"#DC267F\", \"#FE6100\"]\n\nplt.figure(figsize=(10, 6))\n# One curve per feature: coefficient value as a function of C (log scale).\nfor i in range(coefs_.shape[1]):\n    plt.semilogx(cs, coefs_[:, i], marker=\"o\", color=colors[i], label=feature_names[i])\n\nplt.xlabel(\"C\")\nplt.ylabel(\"Coefficients\")\nplt.title(\"Logistic Regression Path\")\nplt.legend()\nplt.axis(\"tight\")\nplt.show()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, 
"language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.14" } }, "nbformat": 4, "nbformat_minor": 0 }