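{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## I. [Linear Regression](code/1-LinearRegression)\n", "- [Full code](../code/1-LinearRegression/LinearRegression.py)\n", "\n", "### 1. Cost Function\n", "- $J(\\theta) = \\frac{1}{2m}\\sum\\limits_{i = 1}^m {({h_\\theta}({x^{(i)}}) - {y^{(i)}})^2}$\n", "- where: $h_\\theta(x) = \\theta_0 + \\theta_1 x_1 + \\theta_2 x_2 + \\cdots + \\theta_n x_n$\n", "\n", "- The goal is to find the theta that minimizes the cost, i.e. the fitted function that lies closest to the true values.\n", "- There are m training examples. The term $({h_\\theta}({x^{(i)}}) - {y^{(i)}})^2$ is the squared distance between the fitted function and the true value; it is squared because the raw differences can be negative, and positive and negative differences would otherwise cancel out.\n", "- The factor `2` in the denominator is there because the gradient below takes the partial derivative with respect to each parameter, and differentiating the square produces a `2` that cancels it.\n", "\n", "- Implementation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Compute the cost function\n", "def computerCost(X, y, theta):\n", "    m = len(y)  # number of training examples\n", "    residual = np.dot(X, theta) - y  # prediction error for each example\n", "    # np.dot is a matrix product for both ndarray and np.matrix inputs\n", "    J = np.dot(np.transpose(residual), residual) / (2 * m)  # cost J\n", "    return J" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Note that `X` here is the raw data with a column of ones prepended, to account for the intercept term theta(0).\n", "\n", "### 2. Gradient Descent\n", "- Taking the partial derivative of the cost function with respect to $\\theta_j$ gives: \n", "![\\frac{{\\partial J(\\theta )}}{{\\partial {\\theta _j}}} = \\frac{1}{m}\\sum\\limits_{i = 1}^m {[({h_\\theta }({x^{(i)}}) - {y^{(i)}})x_j^{(i)}]} ](http://chart.apis.google.com/chart?cht=tx&chs=1x0&chf=bg,s,FFFFFF00&chco=000000&chl=%5Cfrac%7B%7B%5Cpartial%20J%28%5Ctheta%20%29%7D%7D%7B%7B%5Cpartial%20%7B%5Ctheta%20_j%7D%7D%7D%20%3D%20%5Cfrac%7B1%7D%7Bm%7D%5Csum%5Climits_%7Bi%20%3D%201%7D%5Em%20%7B%5B%28%7Bh_%5Ctheta%20%7D%28%7Bx%5E%7B%28i%29%7D%7D%29%20-%20%7By%5E%7B%28i%29%7D%7D%29x_j%5E%7B%28i%29%7D%5D%7D%20)\n", "- So the update for theta can be written as: \n", "![{\\theta _j} = {\\theta _j} - \\alpha \\frac{1}{m}\\sum\\limits_{i = 1}^m {[({h_\\theta }({x^{(i)}}) - {y^{(i)}})x_j^{(i)}]} ](http://chart.apis.google.com/chart?cht=tx&chs=1x0&chf=bg,s,FFFFFF00&chco=000000&chl=%7B%5Ctheta%20_j%7D%20%3D%20%7B%5Ctheta%20_j%7D%20-%20%5Calpha%20%5Cfrac%7B1%7D%7Bm%7D%5Csum%5Climits_%7Bi%20%3D%201%7D%5Em%20%7B%5B%28%7Bh_%5Ctheta%20%7D%28%7Bx%5E%7B%28i%29%7D%7D%29%20-%20%7By%5E%7B%28i%29%7D%7D%29x_j%5E%7B%28i%29%7D%5D%7D%20)\n", "- Here $\\alpha$ is the learning rate, which controls the step size of gradient descent; commonly tried values are **0.01, 0.03, 0.1, 0.3, ...**\n", "- Why gradient descent gradually reduces the cost function\n", "    - Consider a function `f(x)`\n", "    - Taylor expansion: `f(x+△x)=f(x)+f'(x)*△x+o(△x)`\n", "    - Let `△x=-α*f'(x)`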
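, i.e. a small step size `α` times the negative gradient direction\n", "    - Substituting `△x` into the Taylor expansion: `f(x+△x)=f(x)-α*[f'(x)]²+o(△x)`\n", "    - Since `α` is a small positive number and `[f'(x)]²` is non-negative, for sufficiently small `α` it follows that `f(x+△x)<=f(x)`\n", "    - So moving along the **negative gradient** direction decreases the function; the same holds in the multidimensional case.\n", "- Implementation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from __future__ import print_function  # keeps print() working on the Python 2 kernel\n", "import numpy as np\n", "\n", "# Gradient descent\n", "def gradientDescent(X, y, theta, alpha, num_iters):\n", "    m = len(y)       # number of training examples\n", "    n = len(theta)   # number of parameters\n", "\n", "    temp = np.matrix(np.zeros((n, num_iters)))  # theta from each iteration, stored as a matrix\n", "    J_history = np.zeros((num_iters, 1))        # cost recorded at each iteration\n", "\n", "    for i in range(num_iters):  # loop over iterations\n", "        h = np.dot(X, theta)    # predictions (matrix product of X and theta)\n", "        temp[:, i] = theta - (alpha / m) * np.dot(np.transpose(X), h - y)  # gradient step\n", "        theta = temp[:, i]\n", "        J_history[i] = computerCost(X, y, theta)  # evaluate the cost function\n", "        print('.', end='')  # progress indicator\n", "    return theta, J_history" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3. Mean Normalization\n", "- The goal is to scale all features into a similar range, which makes gradient descent easier to apply.\n", "- $x_j := \\frac{x_j - \\mu_j}{s_j}$\n", "- where $\\mu_j$ is the mean of all values of that feature\n", "- $s_j$ can be the **maximum minus the minimum**, or the **standard deviation** of that feature's values\n", "- Implementation:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Normalize features\n", "def featureNormaliza(X):\n", "    X_norm = np.array(X, dtype=float)  # float copy, so integer inputs are not truncated\n", "\n", "    mu = np.mean(X_norm, 0)    # mean of each column (axis 0 gives per-column statistics)\n", "    sigma = np.std(X_norm, 0)  # standard deviation of each column\n", "    for i in range(X_norm.shape[1]):  # loop over columns\n", "        X_norm[:, i] = (X_norm[:, i] - mu[i]) / sigma[i]  # normalize\n", "\n", "    return X_norm, mu, sigma" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Note that data must be normalized the same way at prediction time.\n", "\n", "### 4. Final Results\n", "- Cost as a function of the number of iterations \n", "\n", "### 5. Implementation Using the Linear Model in scikit-learn\n", "- [Full code](../../code/1-LinearRegression/LinearRegression_scikit-learn.py)\n", "- Import packages" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn import 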
linear_model\n", "from sklearn.preprocessing import StandardScaler  # feature scaling\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Normalization" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Normalization\n", "scaler = StandardScaler()\n", "scaler.fit(X)\n", "x_train = scaler.transform(X)\n", "x_test = scaler.transform(np.array([[1650, 3]]))  # transform expects a 2-D array, one row per sample" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Fitting the linear model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Fit the linear model\n", "model = linear_model.LinearRegression()\n", "model.fit(x_train, y)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 2 }