{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Additive Model\n", "\n", "Chapter 9. Additive Models, Trees, and Related Methods\n", "\n", "根据引言,本章开始到第11章节,将介绍具体的监督学习模型。\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Outline\n", "\n", "1. *Generalized Additive Models*\n", "\\begin{align}\n", "g[\\mu(X)] = \\alpha + f_1(X_1)+\\cdots+f_p(X_p)\n", "\\end{align}\n", "其中$g[\\mu] = \\mu$, $logit(\\mu)$, or $\\log(\\mu)$, 是 *link function*, 默认 $\\alpha = ave (y_i)$, 也就是 $\\sum_{i=1}^N f_j (x_{i,j}) = 0$。\n", "\n", "1. *Backfitting Algorithm for Additive Models* 其实是一个循环递归收敛过程。\n", " ```\n", " - Initialize:$\\hat{\\alpha} = \\tfrac{1}{N}\\sum_1^N y_i, \\hat{f}_j \\equiv 0, \\forall i,j. $\n", " - Cycle: j = 1,2,...,p,...,1,2,...,p,..., \n", " ```\n", " \\begin{align}\n", " \\hat{f}_j \\leftarrow & S_j\\left[ \\{ y_i - \\hat{\\alpha} -\\sum_{k\\ neq j}\\hat{f}_k(x_{ik}) \\}_1^N\\right] \\\\\n", " \\hat{f}_j \\leftarrow & \\hat{f}_j - \\frac{1}{N}\\sum_{i=1}^N \\hat{f}_j(x_{ij})\n", " \\end{align}\n", " 上述 Cycle 里面的第二步 理论上是不需要的,因为所有的 $\\hat{f}_j$ 总和为0。但实际程序运行时,会有偏差,所以需要归零。\n", "\n", "1. *Tree-Based Models* 基本就是一个二叉树遍历所有的组合,然后每个组合上的值为常数$c_m$,\n", "\\begin{align}\n", "f(x) = \\sum_{m=1}^M c_m I(x \\in R_m).\n", "\\end{align}\n", "由于遍历所有组合会导致过拟合,所以需要减枝,于是就有了 *regression tree* 和 *classification tree*。\n", "\n", "1. *CART* (classification and regression tree) 算法。就是上述的生成树模型的算法。 \n", "\n", "1. *PRIM* (patient rule induction method) 算法。从所有数据出发,先捕捉一块特征值(response mean)最高的数据,然后移除该数据,重复上述操作,再次捕捉。类似与捕鱼,先捕大鱼,再捕小鱼,最后是虾米。原理上,PRIM 与 CART 类似,也是将所有数据分割,但分割的更细致,不像 CART 快刀斩乱麻,一分为二。\n", "\n", "1. *MARS* (Maltivariate Adaptive Regression Splines) 算法,尤其适用于高维空间。定义了一个 piecewise linear 的 basis functions,\n", "\\begin{align}\n", "\\mathbf{C} = \\left\\{ (X_j - t)_+, (t-X_j)_+ \\right\\} _{t \\in \\{x_{1j}, x_{2j}, \\cdots, x_{Nj}\\},\\ j=1,2,\\cdots,p.}\n", "\\end{align}\n", "MARS 可以表示为 \n", "\\begin{align}\n", "f(X) = \\beta_0 + \\sum_{m=1}^M \\beta_m h_m(X),\n", "\\end{align}\n", "其中, $h_m(X) \\in \\mathbf{C}$.\n", "*MARS 有点: 可以同时处理定量和定性的混合输入*.\n", "> MARS can handle \"mixed\" predictors--quantitative and qualitative--in a natural way, much like CART does.\n", "\n", "1. *HME* (Hierarchical Mixtures of Experts) 算法。把 CART 的 hard dicision 改成了 soft decision,以概率的形式。树枝叫做 Gating Network,树叶叫做 Expert Network。每层 Gating Network 有概率判别\n", "\\begin{align}\n", "g_j(x,\\gamma_j) =& \\frac{e^{\\gamma_j^{T}x}}{\\sum_{k=1}^K e^{\\gamma_k^{T}x}} , \\quad j=1,2,\\cdots, K \\\\\n", "g_{l|j}(x,\\gamma_{jl}) =& \\frac{e^{\\gamma_{jl}^{T}x}}{\\sum_{k=1}^K e^{\\gamma_{jk}^{T}x}} , \\quad l=1,2,\\cdots, K\n", "\\end{align}\n", "Expert Network 可以表示为 $Y \\sim \\Pr(y|x,\\theta_{jl})$, 其中根据 regression 或者 classification 不同,\n", "\\begin{align}\n", "\\Pr(y|x,\\theta_{jl}) =& \\beta_{jl}^Tx+\\epsilon \\\\\n", "\\Pr(y=1|x,\\theta_{jl}) =& \\frac{1}{1+ e^{-\\theta_{jl}^Tx}}.\n", "\\end{align}\n", "最终 HME 表达模型为\n", "\\begin{align}\n", "\\Pr(y|x,\\Psi) = \\sum_{j=1}^Kg_j(x,\\gamma_j)\\sum_{l=1}^Kg_{l|j}(x,\\gamma_{jl})\\Pr(y|x,\\theta_{jl}).\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## 具体内容\n", "\n", "### regression tree\n", "\n", "定义\n", "\\begin{align}\n", "N_m =& \\#\\{x_i \\in R_m\\} \\\\\n", "\\hat{c}_m =& \\frac{1}{N}\\sum_{x_i \\in R_i} y_i, \\\\\n", "Q_{m}(T) =& \\frac{1}{N}\\sum_{x_i \\in R_i} (y_i - \\hat{c}_m)^2\n", "\\end{align}\n", "从而最小化 \n", "\\begin{align}\n", "C_{\\alpha}(T) = \\sum_{m=1}^{|T|}N_m Q_m(T) + \\alpha |T|\n", "\\end{align}\n", "其中 $\\alpha$ 就是可调的参数,决定树的大小。\n", " \n", "### classification tree\n", "和 regression tree 类似,需要重新定义 $Q_m(T)$。引入 class-k 在 node-m\n", "的概率\n", "\\begin{align}\n", "\\hat{p}_{mk} = \\frac{1}{N_m} \\sum_{x_i \\in R_m}I(y_i=k)\n", "\\end{align}\n", "$Q_m(T)$ 可以表示为三类:\n", "- Misclassification error (预测误差): $Q_m(T) = 1 - \\hat{p}_{mk}$\n", "- Gini index (预测方差): $Q_m(T) = \\sum_{k=1}^K \\hat{p}_{mk} (1- \\hat{p}_{mk})$\n", "- Cross-entropy or deviance (熵): $Q_m(T) = -\\sum_{k=1}^K\\hat{p}_{mk}\\log \\hat{p}_{mk}$" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }