{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Examples of Using `FrozenEstimator`\n\nThis examples showcases some use cases of :class:`~sklearn.frozen.FrozenEstimator`.\n\n:class:`~sklearn.frozen.FrozenEstimator` is a utility class that allows to freeze a\nfitted estimator. This is useful, for instance, when we want to pass a fitted estimator\nto a meta-estimator, such as :class:`~sklearn.model_selection.FixedThresholdClassifier`\nwithout letting the meta-estimator refit the estimator.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting a decision threshold for a pre-fitted classifier\nFitted classifiers in scikit-learn use an arbitrary decision threshold to decide\nwhich class the given sample belongs to. The decision threshold is either `0.0` on the\nvalue returned by :term:`decision_function`, or `0.5` on the probability returned by\n:term:`predict_proba`.\n\nHowever, one might want to set a custom decision threshold. We can do this by\nusing :class:`~sklearn.model_selection.FixedThresholdClassifier` and wrapping the\nclassifier with :class:`~sklearn.frozen.FrozenEstimator`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.datasets import make_classification\nfrom sklearn.frozen import FrozenEstimator\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.model_selection import FixedThresholdClassifier, train_test_split\n\nX, y = make_classification(n_samples=1000, random_state=0)\nX_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\nclassifier = LogisticRegression().fit(X_train, y_train)\n\nprint(\n \"Probability estimates for three data points:\\n\"\n f\"{classifier.predict_proba(X_test[-3:]).round(3)}\"\n)\nprint(\n \"Predicted class for the same three data points:\\n\"\n f\"{classifier.predict(X_test[-3:])}\"\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now imagine you'd want to set a different decision threshold on the probability\nestimates. We can do this by wrapping the classifier with\n:class:`~sklearn.frozen.FrozenEstimator` and passing it to\n:class:`~sklearn.model_selection.FixedThresholdClassifier`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "threshold_classifier = FixedThresholdClassifier(\n estimator=FrozenEstimator(classifier), threshold=0.9\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that in the above piece of code, calling `fit` on\n:class:`~sklearn.model_selection.FixedThresholdClassifier` does not refit the\nunderlying classifier.\n\nNow, let's see how the predictions changed with respect to the probability\nthreshold.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "print(\n \"Probability estimates for three data points with FixedThresholdClassifier:\\n\"\n f\"{threshold_classifier.predict_proba(X_test[-3:]).round(3)}\"\n)\nprint(\n \"Predicted class for the same three data points with FixedThresholdClassifier:\\n\"\n f\"{threshold_classifier.predict(X_test[-3:])}\"\n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the probability estimates stay the same, but since a different decision\nthreshold is used, the predicted classes are different.\n\nPlease refer to\n`sphx_glr_auto_examples_model_selection_plot_cost_sensitive_learning.py`\nto learn about cost-sensitive learning and decision threshold tuning.\n\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calibration of a pre-fitted classifier\nYou can use :class:`~sklearn.frozen.FrozenEstimator` to calibrate a pre-fitted\nclassifier using :class:`~sklearn.calibration.CalibratedClassifierCV`.\n\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from sklearn.calibration import CalibratedClassifierCV\nfrom sklearn.metrics import brier_score_loss\n\ncalibrated_classifier = CalibratedClassifierCV(\n estimator=FrozenEstimator(classifier)\n).fit(X_train, y_train)\n\nprob_pos_clf = classifier.predict_proba(X_test)[:, 1]\nclf_score = brier_score_loss(y_test, prob_pos_clf)\nprint(f\"No calibration: {clf_score:.3f}\")\n\nprob_pos_calibrated = calibrated_classifier.predict_proba(X_test)[:, 1]\ncalibrated_score = brier_score_loss(y_test, prob_pos_calibrated)\nprint(f\"With calibration: {calibrated_score:.3f}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 0 }