{
"cells": [
{
"cell_type": "markdown",
"id": "eb50e2ce",
"metadata": {},
"source": [
"**实验五:支持向量机**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "95f5ebed",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import math\n",
"import cvxopt\n",
"from cvxopt import matrix\n",
"from cvxopt import solvers"
]
},
{
"cell_type": "markdown",
"id": "714b16e8",
"metadata": {},
"source": [
"**第一部分:函数介绍**"
]
},
{
"cell_type": "markdown",
"id": "d3531f8a",
"metadata": {},
"source": [
"二次规划问题是形式如下的一类最优化问题:\n",
"$$\n",
"\\begin{align}\n",
"\\min_x \\quad &\\frac{1}{2}x^TPx+q^Tx \\\\\n",
"s.t. \\quad &Gx\\leq h \\\\\n",
" &Ax=b\n",
"\\end{align}\n",
"$$\n",
"对于这一类问题可以使用[cvxopt](https://cvxopt.org/userguide/coneprog.html#quadratic-programming)库的solvers.qp()函数进行求解。"
]
},
{
"cell_type": "markdown",
"id": "20b0f664",
"metadata": {},
"source": [
"以下是一个例子(参考[Solving a quadratic program](https://cvxopt.org/examples/tutorial/qp.html)):\n",
"$$\n",
"\\begin{align}\n",
"\\min_x \\quad &2x_1^2+x_2^2+x_1x_2+x_1+x_2 \\\\\n",
"s.t. \\quad &x_1\\geq 0 \\\\\n",
" &x_2\\geq 0 \\\\\n",
" &x_1+x_2=1\n",
"\\end{align}\n",
"$$\n",
"为了使用solvers.qp()函数,我们需要知道在该二次规划问题中的$P,q,G,h,A,b$矩阵分别是什么。\n",
"在该优化问题中,\n",
"\n",
"* $P:=\\begin{bmatrix}\n",
" 4 & 1 \\\\ 1 & 2\n",
" \\end{bmatrix}$,\n",
"* $q:=\\begin{bmatrix}\n",
" 1 \\\\ 1\n",
" \\end{bmatrix}$,\n",
"* $G:=\\begin{bmatrix}\n",
" -1 & 0 \\\\ 0 & -1\n",
" \\end{bmatrix}$,\n",
"* $h:=\\begin{bmatrix}\n",
" 0 \\\\ 0\n",
" \\end{bmatrix}$,\n",
"* $A:=\\begin{bmatrix}\n",
" 1 & 1\n",
" \\end{bmatrix}$,\n",
"* $b:=\\begin{bmatrix}\n",
" 1\n",
" \\end{bmatrix}$,\n",
" \n",
"把这些参数送入solvers.qp()函数中即可求出解。"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "9e5b5531",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" pcost dcost gap pres dres\n",
" 0: 1.8889e+00 7.7778e-01 1e+00 3e-16 2e+00\n",
" 1: 1.8769e+00 1.8320e+00 4e-02 2e-16 6e-02\n",
" 2: 1.8750e+00 1.8739e+00 1e-03 2e-16 5e-04\n",
" 3: 1.8750e+00 1.8750e+00 1e-05 6e-17 5e-06\n",
" 4: 1.8750e+00 1.8750e+00 1e-07 2e-16 5e-08\n",
"Optimal solution found.\n",
"[ 2.50e-01]\n",
"[ 7.50e-01]\n",
"\n"
]
}
],
"source": [
"# Tips1: cvxopt库中的matrix只接受double类型的数据\n",
"# Tips2: matrix使用列表作为参数创建矩阵和numpy.array使用列表作为参数创建矩阵是不同的\n",
"# print(matrix([[1.0, 1.0]]))\n",
"# print(np.array([[1.0, 1.0]]))\n",
"# print(matrix(np.array([[1.0, 1.0]])))\n",
"Q = 2*matrix([ [2, .5], [.5, 1] ])\n",
"p = matrix([1.0,1.0])\n",
"G = matrix([[-1.0,0.0],[0.0,-1.0]])\n",
"h = matrix([0.0,0.0])\n",
"A = matrix([1.0, 1.0], (1,2))\n",
"b = matrix(1.0)\n",
"sol=solvers.qp(Q, p, G, h, A, b)\n",
"print(sol['x'])"
]
},
{
"cell_type": "markdown",
"id": "2384abb1",
"metadata": {},
"source": [
"**第二部分:实验任务**"
]
},
{
"cell_type": "markdown",
"id": "68091bb5",
"metadata": {},
"source": [
"1.线性可分支持向量机与硬间隔最大化"
]
},
{
"cell_type": "markdown",
"id": "ad91cf52",
"metadata": {},
"source": [
"1) 读入数据集'dataset1.csv',把数据类型都转换成np.double类型,并画出数据集的散点图,给正样本(y为+1)和负样本(y为-1)分别标上不同的颜色。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1b25ddd4",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "bbcf4613",
"metadata": {},
"source": [
"2) 求解如下对偶问题(参考课件):\n",
"$$\n",
"\\begin{align}\n",
"\\min_\\alpha \\quad &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_j\\pmb{x}_i^T\\pmb{x}_j-\\sum_{i=1}^m\\alpha_i \\\\\n",
"s.t. \\quad &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
" &\\pmb{\\alpha}\\geq \\pmb{0}\n",
"\\end{align}\n",
"$$\n",
"\n",
"这个优化问题是一个二次规划问题。\n",
"* $P是一个m\\times m的矩阵,其中P_{ij}=y_iy_j\\pmb{x}_i^T\\pmb{x}_j$,\n",
"* $q是一个m\\times 1的所有值都为-1的列向量,即q:=\\begin{bmatrix}\n",
" -1 & -1 & \\cdots & -1\n",
" \\end{bmatrix}^T$,\n",
"* $G:=\\begin{bmatrix}\n",
" -1 & 0 & \\cdots & 0 \\\\\n",
" 0 & -1 & \\cdots & 0 \\\\\n",
" \\vdots & \\vdots & \\ddots &0 \\\\\n",
" 0 & 0 & 0 & -1\n",
" \\end{bmatrix}_{m\\times m}=-\\pmb{I},\\pmb{I}为单位矩阵,$\n",
"* $h是一个m\\times 1的零向量,即h:=\\begin{bmatrix}\n",
" 0 & 0 & \\cdots & 0\n",
" \\end{bmatrix}^T$,\n",
"* $A:=\\begin{bmatrix}\n",
" y_1 & y_2 & \\cdots & y_m\n",
" \\end{bmatrix}^T$,\n",
"* $b:=\\begin{bmatrix}\n",
" 0\n",
" \\end{bmatrix},一个标量$\n",
" \n",
"把上述参数送入求解器solvers.qp()中即可得到最优解$\\alpha^*$。 \n",
" \n",
"附:$P$矩阵的一个计算方法:\n",
"设$X=\\begin{bmatrix}\n",
" x_{11} & x_{12} \\\\\n",
" x_{21} & x_{22} \\\\\n",
" \\vdots & \\vdots \\\\\n",
" x_{m1} & x_{m2}\n",
" \\end{bmatrix}$,\n",
" $Y=\\begin{bmatrix}\n",
" y_{1} \\\\\n",
" y_{2} \\\\\n",
" \\vdots \\\\\n",
" y_{m}\n",
" \\end{bmatrix}$,\n",
" \n",
"计算$X'=\\begin{bmatrix}\n",
" x_{11}y_1 & x_{12}y_1 \\\\\n",
" x_{21}y_2 & x_{22}y_2 \\\\\n",
" \\vdots & \\vdots \\\\\n",
" x_{m1}y_m & x_{m2}y_m\n",
" \\end{bmatrix}=X*Y(注意这里是星乘)$\n",
" \n",
"则$P=X'X'^T$。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1537fe6b",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"#如果求解报错可以尝试在solvers.qp()中添加参数kktsolver='ldl'\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "825c6609",
"metadata": {},
"source": [
"3) 求出$\\pmb{\\omega}^*=\\sum_{i=1}^m\\alpha_i^*y_i\\pmb{x}_i$和$b^*=y_j-\\pmb{\\omega}^{*T}\\pmb{x_j}$, 其中$j$为$\\alpha^*$中的一个正分量$\\alpha_j^*>0$的下标。注意:由于求解器求出来的是一个近似解,所以$\\alpha^*$中很多实际上为0的分量会略大于0,这时候可以设置一个阈值把非常靠近0的那些分量筛去,再从剩下的分量中选取一个正分量来计算$b^*$,或者也可以直接取$\\alpha^*$中最大的分量来计算$b^*$。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "388c3510",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "ad94df88",
"metadata": {},
"source": [
"4) 画出数据集的散点图,给正样本(y为+1)和负样本(y为-1)分别标上不同的颜色,再为支持向量(训练数据中$\\alpha_j^*>0$的对应的样本)标上不同的颜色,并画出决策边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=0$和间隔边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=1$与$\\pmb{\\omega}^{*T}\\pmb{x}+b=-1$。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "36428566",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "f33d9ee3",
"metadata": {},
"source": [
"2.线性支持向量机与软间隔最大化"
]
},
{
"cell_type": "markdown",
"id": "bdbeff31",
"metadata": {},
"source": [
"1) 读入数据集'dataset2.csv',把数据类型都转换成np.double类型,并画出数据集的散点图,给正样本(y为+1)和负样本(y为-1)分别标上不同的颜色。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5caeaadf",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "3d108a81",
"metadata": {},
"source": [
"2) 选择一个参数C,求解如下对偶问题(参考课件):\n",
"$$\n",
"\\begin{align}\n",
"\\min_\\alpha \\quad &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_j\\pmb{x}_i^T\\pmb{x}_j-\\sum_{i=1}^m\\alpha_i \\\\\n",
"s.t. \\quad &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
" &\\pmb{0}\\leq \\pmb{\\alpha}\\leq C \n",
"\\end{align}\n",
"$$\n",
"* $G:=\\begin{bmatrix}\n",
" -1 & 0 & \\cdots & 0 \\\\\n",
" 0 & -1 & \\cdots & 0 \\\\\n",
" \\vdots & \\vdots & \\ddots &0 \\\\\n",
" 0 & 0 & 0 & -1 \\\\\n",
" 1 & 0 & \\cdots & 0 \\\\\n",
" 0 & 1 & \\cdots & 0 \\\\\n",
" \\vdots & \\vdots & \\ddots &0 \\\\\n",
" 0 & 0 & 0 & 1\n",
" \\end{bmatrix}_{2m\\times m}=\\begin{bmatrix}\n",
" -\\pmb{I} \\\\\n",
" \\pmb{I}\n",
" \\end{bmatrix},\\pmb{I}为单位矩阵,$\n",
"* $h:=\\begin{bmatrix}\n",
" 0 \\\\\n",
" 0 \\\\\n",
" \\vdots \\\\\n",
" 0 \\\\\n",
" C \\\\\n",
" C \\\\\n",
" \\vdots \\\\\n",
" C\n",
" \\end{bmatrix}_{2m\\times 1}, 即一个m\\times 1的零列向量与一个m\\times 1的分量全为C的列向量上下拼接$,\n",
"* $P,q,A,b$与硬间隔优化问题中的矩阵相同。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "de031f65",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "7093e867",
"metadata": {},
"source": [
"3) 求出$\\pmb{\\omega}^*=\\sum_{i=1}^m\\alpha_i^*y_i\\pmb{x}_i$和$b^*=y_j-\\pmb{\\omega}^{*T}\\pmb{x_j}$, 其中$j$为$\\alpha^*$中的一个正分量$0<\\alpha_j^*"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a7801c9",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "7b97abec",
"metadata": {},
"source": [
"4) 画出数据集的散点图,给正样本(y为+1)和负样本(y为-1)分别标上不同的颜色,再为支持向量(训练数据中$\\alpha_j^*>0$的对应的样本)标上不同的颜色,并画出决策边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=0$和间隔边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=1$与$\\pmb{\\omega}^{*T}\\pmb{x}+b=-1$。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4446b508",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "e20c4293",
"metadata": {},
"source": [
"3.非线性支持向量机与核函数"
]
},
{
"cell_type": "markdown",
"id": "f830c3b3",
"metadata": {},
"source": [
"[Raisin Dataset](https://www.kaggle.com/datasets/muratkokludataset/raisin-dataset)是一个葡萄干的数据集,总共有900个样本,每个样本包含7个(都是连续的)特征以及1个标签,每个标签只有两种可能取值。本次实验已经按照8:2的比例划分成了训练数据集'Raisin_train.csv'以及测试数据集'Raisin_test.csv',且每个数据集都已经做了特征归一化处理以及把标签的值替换成了+1和-1。"
]
},
{
"cell_type": "markdown",
"id": "7a6c8ca9",
"metadata": {},
"source": [
"1) 读入训练数据集'Raisin_train.csv',把数据类型都转换成np.double类型。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b020812",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "89cf5ba7",
"metadata": {},
"source": [
"2) 选择一个核函数$K(\\pmb{x},\\pmb{z})$以及参数C,求解如下对偶问题(参考课件):\n",
"$$\n",
"\\begin{align}\n",
"\\min_\\alpha\\quad &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_jK(\\pmb{x}_i,\\pmb{x}_j)-\\sum_{i=1}^m\\alpha_i \\\\\n",
"s.t. \\quad &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
" &\\pmb{0}\\leq \\pmb{\\alpha}\\leq C \n",
"\\end{align}\n",
"$$\n",
"\n",
"相较于软间隔最大化的优化问题,该优化问题仅需要对矩阵$P$做改动。\n",
"从以下常用的核函数中选择一个作为该优化问题中的$K$(参数自己进行调整):\n",
"* 线性核:$K(\\pmb{x},\\pmb{z})=\\pmb{x}^T\\pmb{z}$\n",
"* 多项式核:$K(\\pmb{x},\\pmb{z})=(\\pmb{x}^T\\pmb{z}+1)^p$\n",
"* 高斯核:$K(\\pmb{x},\\pmb{z})=exp(-\\frac{\\parallel \\pmb{x}-\\pmb{z} \\parallel^2}{2\\sigma^2})$\n",
"* 拉普拉斯核:$K(\\pmb{x},\\pmb{z})=exp(-\\frac{\\parallel \\pmb{x}-\\pmb{z} \\parallel}{\\sigma})$\n",
"* Sigmoid核:$K(\\pmb{x},\\pmb{z})=tanh(\\beta\\pmb{x}^T\\pmb{z}+\\theta)$\n",
"\n",
"则$P是一个m\\times m的矩阵,其中P_{ij}=y_iy_jK(\\pmb{x_i},\\pmb{x_j})$。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2716c531",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "1a1a084b",
"metadata": {},
"source": [
"3) 求出$b^*=y_j-\\sum_{i=1}^m \\alpha_i^*y_iK(\\pmb{x_i},\\pmb{x_j})$, 其中$j$为$\\alpha^*$中的一个正分量$0<\\alpha_j^*"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51cca8c9",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "16b41092",
"metadata": {},
"source": [
"4) 读入测试数据集'Raisin_test.csv',用分类决策函数$f(\\pmb{x})=sign(\\sum_{i=1}^m \\alpha_i^*y_iK(\\pmb{x}_i,\\pmb{x})+b^*)$(注意这里的$m,\\alpha_i^*,y_i,\\pmb{x}_i$是训练集的, $\\pmb{x}$是测试集的)进行预测,输出预测准确率。"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2e528d4",
"metadata": {},
"outputs": [],
"source": [
"# ---- Your code here ----\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}