{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "eb50e2ce",
   "metadata": {},
   "source": [
    "**<font color = black size=6>实验五：支持向量机</font>**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "95f5ebed",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import math\n",
    "import cvxopt\n",
    "from cvxopt import matrix\n",
    "from cvxopt import solvers"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "714b16e8",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第一部分:函数介绍</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3531f8a",
   "metadata": {},
   "source": [
    "二次规划问题是形式如下的一类最优化问题：\n",
    "$$\n",
    "\\begin{align}\n",
    "\\min_x \\quad  &\\frac{1}{2}x^TPx+q^Tx \\\\\n",
    "s.t. \\quad  &Gx\\leq h \\\\\n",
    "      &Ax=b\n",
    "\\end{align}\n",
    "$$\n",
    "对于这一类问题可以使用[cvxopt](https://cvxopt.org/userguide/coneprog.html#quadratic-programming)库的solvers.qp()函数进行求解。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20b0f664",
   "metadata": {},
   "source": [
    "以下是一个例子（参考[Solving a quadratic program](https://cvxopt.org/examples/tutorial/qp.html)）:\n",
    "$$\n",
    "\\begin{align}\n",
    "\\min_x \\quad  &2x_1^2+x_2^2+x_1x_2+x_1+x_2 \\\\\n",
    "s.t. \\quad  &x_1\\geq 0 \\\\\n",
    "      &x_2\\geq 0 \\\\\n",
    "      &x_1+x_2=1\n",
    "\\end{align}\n",
    "$$\n",
    "为了使用solvers.qp()函数，我们需要知道在该二次规划问题中的$P,q,G,h,A,b$矩阵分别是什么。\n",
    "在该优化问题中，\n",
    "\n",
    "* $P:=\\begin{bmatrix}\n",
    "    4 & 1 \\\\ 1 & 2\n",
    "   \\end{bmatrix}$,\n",
    "* $q:=\\begin{bmatrix}\n",
    "    1 \\\\ 1\n",
    "   \\end{bmatrix}$,\n",
    "* $G:=\\begin{bmatrix}\n",
    "    -1 & 0 \\\\ 0 & -1\n",
    "   \\end{bmatrix}$,\n",
    "* $h:=\\begin{bmatrix}\n",
    "    0 \\\\ 0\n",
    "   \\end{bmatrix}$,\n",
    "* $A:=\\begin{bmatrix}\n",
    "    1 & 1\n",
    "   \\end{bmatrix}$,\n",
    "* $b:=\\begin{bmatrix}\n",
    "    1\n",
    "   \\end{bmatrix}$,\n",
    "   \n",
    "把这些参数送入solvers.qp()函数中即可求出解。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "9e5b5531",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "     pcost       dcost       gap    pres   dres\n",
      " 0:  1.8889e+00  7.7778e-01  1e+00  3e-16  2e+00\n",
      " 1:  1.8769e+00  1.8320e+00  4e-02  2e-16  6e-02\n",
      " 2:  1.8750e+00  1.8739e+00  1e-03  2e-16  5e-04\n",
      " 3:  1.8750e+00  1.8750e+00  1e-05  6e-17  5e-06\n",
      " 4:  1.8750e+00  1.8750e+00  1e-07  2e-16  5e-08\n",
      "Optimal solution found.\n",
      "[ 2.50e-01]\n",
      "[ 7.50e-01]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Tips1: cvxopt库中的matrix只接受double类型的数据\n",
    "# Tips2: matrix使用列表作为参数创建矩阵和numpy.array使用列表作为参数创建矩阵是不同的\n",
    "# print(matrix([[1.0, 1.0]]))\n",
    "# print(np.array([[1.0, 1.0]]))\n",
    "# print(matrix(np.array([[1.0, 1.0]])))\n",
    "Q = 2*matrix([ [2, .5], [.5, 1] ])\n",
    "p = matrix([1.0,1.0])\n",
    "G = matrix([[-1.0,0.0],[0.0,-1.0]])\n",
    "h = matrix([0.0,0.0])\n",
    "A = matrix([1.0, 1.0], (1,2))\n",
    "b = matrix(1.0)\n",
    "sol=solvers.qp(Q, p, G, h, A, b)\n",
    "print(sol['x'])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2384abb1",
   "metadata": {},
   "source": [
    "**<font color = blue size=4>第二部分:实验任务</font>**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68091bb5",
   "metadata": {},
   "source": [
    "1.线性可分支持向量机与硬间隔最大化"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad91cf52",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 读入数据集'dataset1.csv',把数据类型都转换成np.double类型，并画出数据集的散点图，给正样本（y为+1）和负样本（y为-1）分别标上不同的颜色。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b25ddd4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bbcf4613",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2) 求解如下对偶问题（参考课件）：</span>\n",
    "$$\n",
    "\\begin{align}\n",
    "\\min_\\alpha \\quad  &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_j\\pmb{x}_i^T\\pmb{x}_j-\\sum_{i=1}^m\\alpha_i \\\\\n",
    "s.t. \\quad  &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
    "      &\\pmb{\\alpha}\\geq \\pmb{0}\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "这个优化问题是一个二次规划问题。\n",
    "* $P是一个m\\times m的矩阵，其中P_{ij}=y_iy_j\\pmb{x}_i^T\\pmb{x}_j$,\n",
    "* $q是一个m\\times 1的所有值都为-1的列向量，即q:=\\begin{bmatrix}\n",
    "    -1 & -1 & \\cdots & -1\n",
    "   \\end{bmatrix}^T$,\n",
    "* $G:=\\begin{bmatrix}\n",
    "    -1 & 0 & \\cdots & 0 \\\\\n",
    "    0 & -1 & \\cdots & 0 \\\\\n",
    "   \\vdots & \\vdots & \\ddots &0 \\\\\n",
    "   0 & 0 & 0 & -1\n",
    "   \\end{bmatrix}_{m\\times m}=-\\pmb{I},\\pmb{I}为单位矩阵,$\n",
    "* $h是一个m\\times 1的零向量,即h:=\\begin{bmatrix}\n",
    "    0 & 0 & \\cdots & 0\n",
    "   \\end{bmatrix}^T$,\n",
    "* $A:=\\begin{bmatrix}\n",
    "    y_1 & y_2 & \\cdots & y_m\n",
    "   \\end{bmatrix}^T$,\n",
    "* $b:=\\begin{bmatrix}\n",
    "    0\n",
    "   \\end{bmatrix},一个标量$\n",
    "   \n",
    "<span style=\"color:purple\">把上述参数送入求解器solvers.qp()中即可得到最优解$\\alpha^*$</span>。 \n",
    " \n",
    "附：$P$矩阵的一个计算方法：\n",
    "设$X=\\begin{bmatrix}\n",
    "    x_{11} & x_{12} \\\\\n",
    "    x_{21} & x_{22} \\\\\n",
    "    \\vdots & \\vdots \\\\\n",
    "    x_{m1} & x_{m2}\n",
    "   \\end{bmatrix}$,\n",
    "   $Y=\\begin{bmatrix}\n",
    "    y_{1} \\\\\n",
    "    y_{2} \\\\\n",
    "    \\vdots \\\\\n",
    "    y_{m}\n",
    "   \\end{bmatrix}$,\n",
    "   \n",
    "计算$X'=\\begin{bmatrix}\n",
    "    x_{11}y_1 & x_{12}y_1 \\\\\n",
    "    x_{21}y_2 & x_{22}y_2 \\\\\n",
    "    \\vdots & \\vdots \\\\\n",
    "    x_{m1}y_m & x_{m2}y_m\n",
    "   \\end{bmatrix}=X*Y(注意这里是星乘)$\n",
    "   \n",
    "则$P=X'X'^T$。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1537fe6b",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "#如果求解报错可以尝试在solvers.qp()中添加参数kktsolver='ldl'\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "825c6609",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 求出$\\pmb{\\omega}^*=\\sum_{i=1}^m\\alpha_i^*y_i\\pmb{x}_i$和$b^*=y_j-\\pmb{\\omega}^{*T}\\pmb{x_j}$, 其中$j$为$\\alpha^*$中的一个正分量$\\alpha_j^*>0$的下标。注意：由于求解器求出来的是一个近似解，所以$\\alpha^*$中很多实际上为0的分量会略大于0，这时候可以设置一个阈值把非常靠近0的那些分量筛去，再从剩下的分量中选取一个正分量来计算$b^*$,或者也可以直接取$\\alpha^*$中最大的分量来计算$b^*$。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "388c3510",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad94df88",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">4) 画出数据集的散点图，给正样本（y为+1）和负样本（y为-1）分别标上不同的颜色，再为支持向量（训练数据中$\\alpha_j^*>0$的对应的样本）标上不同的颜色，并画出决策边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=0$和间隔边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=1$与$\\pmb{\\omega}^{*T}\\pmb{x}+b=-1$。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "36428566",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f33d9ee3",
   "metadata": {},
   "source": [
    "2.线性支持向量机与软间隔最大化"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bdbeff31",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 读入数据集'dataset2.csv',把数据类型都转换成np.double类型，并画出数据集的散点图，给正样本（y为+1）和负样本（y为-1）分别标上不同的颜色。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5caeaadf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d108a81",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2) 选择一个参数C，求解如下对偶问题（参考课件）：</span>\n",
    "$$\n",
    "\\begin{align}\n",
    "\\min_\\alpha \\quad  &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_j\\pmb{x}_i^T\\pmb{x}_j-\\sum_{i=1}^m\\alpha_i \\\\\n",
    "s.t. \\quad  &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
    "      &\\pmb{0}\\leq \\pmb{\\alpha}\\leq C \n",
    "\\end{align}\n",
    "$$\n",
    "* $G:=\\begin{bmatrix}\n",
    "    -1 & 0 & \\cdots & 0 \\\\\n",
    "    0 & -1 & \\cdots & 0 \\\\\n",
    "   \\vdots & \\vdots & \\ddots &0 \\\\\n",
    "   0 & 0 & 0 & -1 \\\\\n",
    "   1 & 0 & \\cdots & 0 \\\\\n",
    "    0 & 1 & \\cdots & 0 \\\\\n",
    "   \\vdots & \\vdots & \\ddots &0 \\\\\n",
    "   0 & 0 & 0 & 1\n",
    "   \\end{bmatrix}_{2m\\times m}=\\begin{bmatrix}\n",
    "    -\\pmb{I} \\\\\n",
    "    \\pmb{I}\n",
    "   \\end{bmatrix},\\pmb{I}为单位矩阵,$\n",
    "* $h:=\\begin{bmatrix}\n",
    "    0 \\\\\n",
    "    0 \\\\\n",
    "    \\vdots \\\\\n",
    "    0 \\\\\n",
    "    C \\\\\n",
    "    C \\\\\n",
    "    \\vdots \\\\\n",
    "    C\n",
    "   \\end{bmatrix}_{2m\\times 1}, 即一个m\\times 1的零列向量与一个m\\times 1的分量全为C的列向量上下拼接$,\n",
    "* $P,q,A,b$与硬间隔优化问题中的矩阵相同。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de031f65",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7093e867",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 求出$\\pmb{\\omega}^*=\\sum_{i=1}^m\\alpha_i^*y_i\\pmb{x}_i$和$b^*=y_j-\\pmb{\\omega}^{*T}\\pmb{x_j}$, 其中$j$为$\\alpha^*$中的一个正分量$0<\\alpha_j^*<C$的下标。与硬间隔优化问题同理，应该筛掉非常接近0和非常接近$C$的分量。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7a7801c9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7b97abec",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">4) 画出数据集的散点图，给正样本（y为+1）和负样本（y为-1）分别标上不同的颜色，再为支持向量（训练数据中$\\alpha_j^*>0$的对应的样本）标上不同的颜色，并画出决策边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=0$和间隔边界$\\pmb{\\omega}^{*T}\\pmb{x}+b=1$与$\\pmb{\\omega}^{*T}\\pmb{x}+b=-1$。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4446b508",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e20c4293",
   "metadata": {},
   "source": [
    "3.非线性支持向量机与核函数"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f830c3b3",
   "metadata": {},
   "source": [
    "[Raisin Dataset](https://www.kaggle.com/datasets/muratkokludataset/raisin-dataset)是一个葡萄干的数据集，总共有900个样本，每个样本包含7个(都是连续的)特征以及1个标签，每个标签只有两种可能取值。本次实验已经按照8：2的比例划分成了训练数据集'Raisin_train.csv'以及测试数据集'Raisin_test.csv'，且每个数据集都已经做了特征归一化处理以及把标签的值替换成了+1和-1。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7a6c8ca9",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">1) 读入训练数据集'Raisin_train.csv',把数据类型都转换成np.double类型。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b020812",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89cf5ba7",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">2) 选择一个核函数$K(\\pmb{x},\\pmb{z})$以及参数C，求解如下对偶问题（参考课件）：</span>\n",
    "$$\n",
    "\\begin{align}\n",
    "\\min_\\alpha\\quad   &\\frac{1}{2}\\sum_{i=1}^m\\sum_{j=1}^m\\alpha_i\\alpha_jy_iy_jK(\\pmb{x}_i,\\pmb{x}_j)-\\sum_{i=1}^m\\alpha_i \\\\\n",
    "s.t. \\quad  &\\sum_{i=1}^m\\alpha_iy_i=0 \\\\\n",
    "      &\\pmb{0}\\leq \\pmb{\\alpha}\\leq C \n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "相较于软间隔最大化的优化问题，该优化问题仅需要对矩阵$P$做改动。\n",
    "从以下常用的核函数中选择一个作为该优化问题中的$K$（参数自己进行调整）：\n",
    "* 线性核：$K(\\pmb{x},\\pmb{z})=\\pmb{x}^T\\pmb{z}$\n",
    "* 多项式核：$K(\\pmb{x},\\pmb{z})=(\\pmb{x}^T\\pmb{z}+1)^p$\n",
    "* 高斯核：$K(\\pmb{x},\\pmb{z})=exp(-\\frac{\\parallel \\pmb{x}-\\pmb{z} \\parallel^2}{2\\sigma^2})$\n",
    "* 拉普拉斯核：$K(\\pmb{x},\\pmb{z})=exp(-\\frac{\\parallel \\pmb{x}-\\pmb{z} \\parallel}{\\sigma})$\n",
    "* Sigmoid核：$K(\\pmb{x},\\pmb{z})=tanh(\\beta\\pmb{x}^T\\pmb{z}+\\theta)$\n",
    "\n",
    "则$P是一个m\\times m的矩阵，其中P_{ij}=y_iy_jK(\\pmb{x_i},\\pmb{x_j})$。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2716c531",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a1a084b",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">3) 求出$b^*=y_j-\\sum_{i=1}^m \\alpha_i^*y_iK(\\pmb{x_i},\\pmb{x_j})$, 其中$j$为$\\alpha^*$中的一个正分量$0<\\alpha_j^*<C$的下标。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "51cca8c9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16b41092",
   "metadata": {},
   "source": [
    "<span style=\"color:purple\">4) 读入测试数据集'Raisin_test.csv',用分类决策函数$f(\\pmb{x})=sign(\\sum_{i=1}^m \\alpha_i^*y_iK(\\pmb{x}_i,\\pmb{x})+b^*)$（注意这里的$m,\\alpha_i^*,y_i,\\pmb{x}_i$是训练集的, $\\pmb{x}$是测试集的）进行预测，输出预测准确率。</span>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d2e528d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ---- Your code here ----\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}