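{ "cells": [ { "cell_type": "markdown", "id": "d0831f1b", "metadata": {}, "source": [ "# Machine Learning Hyperparameter Tuning Exercise" ] }, { "cell_type": "markdown", "id": "62875267", "metadata": {}, "source": [ "In machine learning, hyperparameters are parameters that cannot be learned from the data and must be supplied before training. A model's performance depends heavily on finding a good set of hyperparameters.\n", "\n", "Hyperparameter tuning means searching for the hyperparameter values that give the model its best performance, which is usually a very time-consuming process. In this article we walk through three of the most popular tuning techniques:\n", "\n", "- **Grid search**\n", "\n", "- **Random search**\n", "\n", "- **Bayesian search**" ] }, { "cell_type": "markdown", "id": "495f25f8", "metadata": {}, "source": [ "There is also a zeroth method, manual tuning, but it is simple and mechanical, so it is outside the scope of this article.\n", "\n", "For ease of reading, the article is organized as follows:\n", "\n", "1. Getting and preparing the data  \n", "\n", "2. Grid search  \n", "\n", "3. Random search  \n", "\n", "4. Bayesian search  \n", "\n", "5. Closing remarks\n", "\n", "## Getting and preparing the data\n", "\n", "\n", "\n", "For demonstration purposes, this article trains a **support vector classifier** (SVC) on scikit-learn's built-in breast cancer dataset, which can be loaded with the `load_breast_cancer` function." ] }, { "cell_type": "code", "execution_count": 1, "id": "d9ccf734", "metadata": {}, "outputs": [ { "data": { "text/html": [ "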
|  | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst radius | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 17.99 | 10.38 | 122.80 | 1001.0 | 0.11840 | 0.27760 | 0.3001 | 0.14710 | 0.2419 | 0.07871 | ... | 25.38 | 17.33 | 184.60 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.11890 |
| 1 | 20.57 | 17.77 | 132.90 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | 0.05667 | ... | 24.99 | 23.41 | 158.80 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.1860 | 0.2750 | 0.08902 |
| 2 | 19.69 | 21.25 | 130.00 | 1203.0 | 0.10960 | 0.15990 | 0.1974 | 0.12790 | 0.2069 | 0.05999 | ... | 23.57 | 25.53 | 152.50 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.2430 | 0.3613 | 0.08758 |
| 3 | 11.42 | 20.38 | 77.58 | 386.1 | 0.14250 | 0.28390 | 0.2414 | 0.10520 | 0.2597 | 0.09744 | ... | 14.91 | 26.50 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.17300 |
| 4 | 20.29 | 14.34 | 135.10 | 1297.0 | 0.10030 | 0.13280 | 0.1980 | 0.10430 | 0.1809 | 0.05883 | ... | 22.54 | 16.67 | 152.20 | 1575.0 | 0.1374 | 0.2050 | 0.4000 | 0.1625 | 0.2364 | 0.07678 |

5 rows × 30 columns

|  | Cancer |
|---|---|
| 0 | 0 |
| 1 | 0 |
| 2 | 0 |
| 3 | 0 |
| 4 | 0 |