{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Release Highlights for scikit-learn 1.7\n\n.. currentmodule:: sklearn\n\nWe are pleased to announce the release of scikit-learn 1.7! Many bug fixes\nand improvements were added, as well as some key new features. Below we\ndetail the highlights of this release. **For an exhaustive list of\nall the changes**, please refer to the `release notes`.\n\nTo install the latest version (with pip)::\n\n pip install --upgrade scikit-learn\n\nor with conda::\n\n conda install -c conda-forge scikit-learn\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Improved estimator's HTML representation\nThe HTML representation of estimators now includes a section containing the list of\nparameters and their values. Non-default parameters are highlighted in orange. A copy\nbutton is also available to copy the \"fully-qualified\" parameter name without the\nneed to call the `get_params` method. It is particularly useful when defining a\nparameter grid for a grid-search or a randomized-search with a complex pipeline.\n\nSee the example below and click on the different estimator blocks to see the\nimproved HTML representation.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.preprocessing import StandardScaler\n\nmodel = make_pipeline(StandardScaler(with_std=False), LogisticRegression(C=2.0))\nmodel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Custom validation set for histogram-based Gradient Boosting estimators\nThe :class:`ensemble.HistGradientBoostingClassifier` and\n:class:`ensemble.HistGradientBoostingRegressor` now support directly passing a custom\nvalidation set for early stopping to the `fit` method, using the `X_val`, `y_val`, and\n`sample_weight_val` parameters.\nIn a 
:class:`pipeline.Pipeline`, the validation set `X_val` can be transformed along\nwith `X` using the `transform_input` parameter.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import sklearn\nfrom sklearn.datasets import make_classification\nfrom sklearn.ensemble import HistGradientBoostingClassifier\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import StandardScaler\n\nsklearn.set_config(enable_metadata_routing=True)\n\nX, y = make_classification(random_state=0)\nX_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)\n\nclf = HistGradientBoostingClassifier()\nclf.set_fit_request(X_val=True, y_val=True)\n\nmodel = Pipeline([(\"sc\", StandardScaler()), (\"clf\", clf)], transform_input=[\"X_val\"])\n# Fit on the training split; the held-out split is passed as the validation set.\nmodel.fit(X_train, y_train, X_val=X_val, y_val=y_val)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting ROC curves from cross-validation results\nThe class :class:`metrics.RocCurveDisplay` has a new class method `from_cv_results`\nthat makes it easy to plot multiple ROC curves from the results of\n:func:`model_selection.cross_validate`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.datasets import make_classification\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.metrics import RocCurveDisplay\nfrom sklearn.model_selection import cross_validate\n\nX, y = make_classification(n_samples=150, random_state=0)\nclf = LogisticRegression(random_state=0)\ncv_results = cross_validate(clf, X, y, cv=5, return_estimator=True, return_indices=True)\n_ = RocCurveDisplay.from_cv_results(cv_results, X, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Array API support\nSeveral functions have been updated to support array API compatible inputs since\nversion 1.6, especially 
metrics from the :mod:`sklearn.metrics` module.\n\nIn addition, it is no longer required to install the `array-api-compat` package to use\nthe experimental array API support in scikit-learn.\n\nPlease refer to the `array API support` page for instructions on using\nscikit-learn with array API compatible libraries such as PyTorch or CuPy.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Improved API consistency of Multi-layer Perceptron\nThe :class:`neural_network.MLPRegressor` has a new parameter `loss` and now supports\nthe \"poisson\" loss in addition to the default \"squared_error\" loss.\nMoreover, the :class:`neural_network.MLPClassifier` and\n:class:`neural_network.MLPRegressor` estimators now support sample weights.\nThese changes make these estimators more consistent with the other estimators\nin scikit-learn.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Migration toward sparse arrays\nTo prepare for the [SciPy migration from sparse matrices to sparse arrays](https://docs.scipy.org/doc/scipy/reference/sparse.migration_to_sparray.html),\nall scikit-learn estimators that accept sparse matrices as input now also accept\nsparse arrays.\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.18" } }, "nbformat": 4, "nbformat_minor": 0 }