{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "第$j$个输入实例$x$的特征向量\n", "\\begin{align*} \\\\& x_{j} = \\left( x_{j}^{\\left(1\\right)},x_{j}^{\\left(2\\right)}, \\cdots, x_{j}^{\\left(i\\right)}, \\cdots, x_{j}^{\\left(n\\right)} \\right)^{T}, \\quad i=1,2,\\cdots,n; \\quad j=1,2,\\cdots,N \\end{align*} \n", "其中,$x_{j}^{\\left(i\\right)}$表示第$j$个输入实例的第$i$个特征。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "监督学习的训练数据集合由输入(特征向量)与输出对组成\n", "\\begin{align*} \\\\& T = \\left\\{ \\left( x_{1}, y_{1} \\right), \\left( x_{2}, y_{2} \\right), \\cdots, \\left( x_{N}, y_{N} \\right) \\right\\} \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设空间$\\mathcal{F}$定义为决策函数的集合\n", "\\begin{align*} \\\\& \\mathcal{F} = \\left\\{ f | Y = f \\left( X \\right) \\right\\} \\end{align*}\n", "其中,$X$是定义在输入空间$\\mathcal{X}$上的变量,$Y$是定义在输入空间$\\mathcal{}$上的变量。" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "假设空间$\\mathcal{F}$通常是由一个参数向量决定的函数族\n", "\\begin{align*} \\\\& \\mathcal{F} = \\left\\{ f | Y = f_{\\theta} \\left( X \\right), \\theta \\in R^{n} \\right\\} \\end{align*}\n", "其中,参数向量$\\theta$取值于$n$维向量空间$R^{n}$,称为参数空间。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设空间$\\mathcal{F}$也可定义为条件概率的集合\n", "\\begin{align*} \\\\& \\mathcal{F} = \\left\\{ P | P \\left( Y | X \\right) \\right\\} \\end{align*}\n", "其中,$X$是定义在输入空间$\\mathcal{X}$上的随机变量,$Y$是定义在输入空间$\\mathcal{}$上的随机变量。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设空间$\\mathcal{F}$通常是由一个参数向量决定的概率分布族\n", "\\begin{align*} \\\\& \\mathcal{F} = \\left\\{ P | P_{\\theta} \\left( Y | X \\right), \\theta \\in R^{n} \\right\\} \\end{align*}\n", "其中,参数向量$\\theta$取值于$n$维向量空间$R^{n}$,称为参数空间。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "损失函数(代价函数)来度量预测错误的程度,是预测输出$f\\left(X\\right)$和实际输出$Y$的非负实值函数,记作$L \\left(Y, f \\left( X \\right) \\right)$。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"0-1损失函数\n", "\\begin{align*} L \\left(Y, f \\left( X \\right) \\right) = \\left\\{\n", "\\begin{aligned} \n", "\\ & 1, Y \\neq f \\left( X \\right)\n", "\\\\ & 0, Y = f \\left( X \\right)\n", "\\end{aligned}\n", "\\right.\\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "平方损失函数\n", "\\begin{align*} L \\left(Y, f \\left( X \\right) \\right) = \\left( Y - f \\left( X \\right) \\right)^{2} \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "绝对值损失函数\n", "\\begin{align*} L \\left(Y, f \\left( X \\right) \\right) = \\left| Y - f \\left( X \\right) \\right| \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "绝对值损失函数(对数似然损失函数)\n", "\\begin{align*} L \\left(Y, f \\left( X \\right) \\right) = - \\log P \\left( Y | X \\right) \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "风险损失(期望损失)是模型$f\\left(X\\right)$关于联合概率分布$P\\left(X,Y\\right)$的平均意义下的损失\n", "\\begin{align*} R_{exp} \\left( f \\right) = E_{P} \\left[L \\left(Y, f \\left( X \\right) \\right) \\right] = \\int_{\\mathcal{X} \\times \\mathcal{Y}} L \\left(Y, f \\left( X \\right) \\right) P \\left(x,y\\right) dxdy \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "经验风险(经验损失)是模型$f\\left(X\\right)$关于训练数据集\n", "\\begin{align*} \\\\& T = \\left\\{ \\left( x_{1}, y_{1} \\right), \\left( x_{2}, y_{2} \\right), \\cdots, \\left( x_{N}, y_{N} \\right) \\right\\} \\end{align*} \n", "的平均损失\n", "\\begin{align*} R_{emp} \\left( f \\right) = \\dfrac{1}{N} \\sum_{i=1}^{N} L \\left(y_{i}, f \\left( x_{i} \\right) \\right) \\end{align*} " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "经验风险最小化\n", "\\begin{align*} \\min_{f \\in \\mathcal{F}} \\dfrac{1}{N} \\sum_{i=1}^{N} L \\left(y_{i}, f \\left( x_{i} \\right) \\right) \\end{align*}\n", "其中,$\\mathcal{F}$是假设空间。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "结构风险最小化\n", "\\begin{align*} \\min_{f \\in \\mathcal{F}} \\dfrac{1}{N} 
\\sum_{i=1}^{N} L \\left(y_{i}, f \\left( x_{i} \\right) \\right) + \\lambda J \\left(f\\right) \\end{align*}\n", "where $J \\left(f\\right)$ is the model complexity, i.e. the regularization term, a functional defined on the hypothesis space $\\mathcal{F}$; $\\lambda \\geq 0$ is a coefficient that trades off the empirical risk against the model complexity." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The regularization term can be the $L_{2}$ norm of the parameter vector\n", "\\begin{align*} L_{2} = \\| w \\|\\end{align*} \n", "where $\\|w\\|$ denotes the $L_{2}$ norm of the parameter vector $w$.  \n", "The regularization term can also be the $L_{1}$ norm of the parameter vector\n", "\\begin{align*} L_{1} = \\| w \\|_{1} \\end{align*} \n", "where $\\|w\\|_{1}$ denotes the $L_{1}$ norm of the parameter vector $w$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The training error is the average loss of the learned model $Y = \\hat f \\left(X\\right)$ over the training data set\n", "\\begin{align*} R_{emp} \\left( \\hat f \\right) = \\dfrac{1}{N} \\sum_{i=1}^{N} L \\left(y_{i}, \\hat f \\left( x_{i} \\right) \\right) \\end{align*} \n", "where $N$ is the training sample size." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The test error is the average loss of the learned model $Y = \\hat f \\left(X\\right)$ over the test data set\n", "\\begin{align*} e_{test} = \\dfrac{1}{N'} \\sum_{i=1}^{N'} L \\left(y_{i}, \\hat f \\left( x_{i} \\right) \\right) \\end{align*} \n", "where $N'$ is the test sample size." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the loss function is the 0-1 loss, the test error is the error rate on the test set\n", "\\begin{align*} e_{test} = \\dfrac{1}{N'} \\sum_{i=1}^{N'} I \\left( y_{i} \\neq \\hat f \\left(x_{i} \\right) \\right) \\end{align*} \n", "where $I$ is the indicator function, equal to 1 when $y \\neq \\hat f \\left( x \\right)$ and 0 otherwise." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The accuracy on the test set is\n", "\\begin{align*} r_{test} = \\dfrac{1}{N'} \\sum_{i=1}^{N'} I \\left( y_{i} = \\hat f \\left(x_{i} \\right) \\right) \\end{align*} \n", "Hence $r_{test} + e_{test} = 1$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A generative approach learns the joint probability distribution $P\\left(X,Y\\right)$ from the data and then derives the conditional probability distribution $P\\left(Y|X\\right)$ as the prediction model, i.e. a generative model\n", "\\begin{align*} P\\left(Y|X\\right) = \\dfrac{P\\left(X,Y\\right)}{P\\left(X\\right)}\\end{align*} \n", "A discriminative approach learns the decision function $f\\left(X\\right)$ or the conditional probability distribution $P\\left(Y|X\\right)$ directly from the data as the prediction model, i.e. a discriminative model." ] }, { "cell_type": "markdown", "metadata": {}, "source": 
[ "TP——将正类预测为正类;\n", "FN——将正类预测为负类;\n", "FP——将负类预测为正类;\n", "TN——将负类预测为负类。 \n", "精确率\n", "\\begin{align*} P = \\dfrac{TP}{TP+FP}\\end{align*} \n", "召回率\n", "\\begin{align*} R = \\dfrac{TP}{TP+FN}\\end{align*} \n", "$F_{1}$值是精确率和召回率的调和均值\n", "\\begin{align*} \\\\ & \\dfrac{2}{F_{1}} = \\dfrac{1}{P} + \\dfrac{1}{R} \n", "\\\\ & F_{1} = \\dfrac{2TP}{2TP+FP+FN}\\end{align*} " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 1 }