{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 自回归模型" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 原理讲解" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "到目前为止,我们强调的是以回归模型来推断因果关系。如果仅关心某变量(比如股价)未来值的预测,则可以用该变量的过去值来预测其未来值(因为时间序列一般存在自相关),这种模型被称为“单变量时间序列”模型。此时不必理会因果关系,只需考虑相关关系。比如,看到街上有人带伞,可以预测今天要下雨,但行人带伞并不导致下雨。对于样本数据 $y_1, y_2, \\cdots, y_T$,最简单的预测方法为一阶自回归 $AR(1)$:\n", "\n", "$$y_t = \\beta_0 + \\beta_1 y_{t-1} + \\varepsilon_t \\quad \\left( t = 2, \\cdots, T \\right)\\tag{1}$$\n", "\n", "其中,扰动项 $\\varepsilon_t$ 为白噪声,满足零期望 $E\\left( \\varepsilon_t \\right) = 0$,同方差 $Var\\left( \\varepsilon_t \\right) = \\sigma_\\varepsilon^2$,且无自相关 $Cov\\left( \\varepsilon_t, \\varepsilon_s \\right) = 0$($t \\ne s$)。假设 $|\\beta_1| < 1$,则 {$y_t$} 为渐近独立的平稳过程。由于 $y_{t-1}$ 依赖于 {$\\varepsilon_{t-1},\\cdots,\\varepsilon_1$},而 $\\varepsilon_t$ 与 {$\\varepsilon_{t-1},\\cdots,\\varepsilon_1$} 不相关,故 $y_{t-1}$ 与 $\\varepsilon_t$ 不相关,因此用 OLS 估计方程(1)是一致的。然而,使用 OLS 只能用观测值 $t = 2, \\cdots, T$ 进行回归,将损失一个观测值。\n", "\n", "为了提高估计效率,考虑最大似然估计。为此,进一步假设 $\\varepsilon_t$ 为独立同分布,且服从正态分布 $N\\left(0, \\sigma_{\\varepsilon}^{2}\\right)$。由于 {$y_t$} 为平稳过程,其期望与方差均不随时间而变。对方程(1)两边同时取期望可得\n", "\n", "$$\\mathrm{E}(y)=\\beta_{0}+\\beta_{1} \\mathrm{E}(y)\\tag{2}$$\n", "\n", "经移项整理可知,{$y_t$} 的无条件期望为 $\\frac{\\beta_{0}}{1-\\beta_{1}}$。对方程(1)两边同时取方差(注意 $y_{t-1}$ 与 $\\varepsilon_t$ 不相关)可得\n", "\n", "$$\\operatorname{Var}(y)=\\beta_{1}^{2} \\operatorname{Var}(y)+\\sigma_{\\varepsilon}^{2}\\tag{3}$$\n", "\n", "故 $y_t$ 的无条件方差为 $\\frac{\\sigma_{\\varepsilon}^{2}}{1-\\beta_{1}^{2}}$。因此,$y_1$ 服从正态分布 $N\\left(\\frac{\\beta_{0}}{1-\\beta_{1}}, \\frac{\\sigma_{\\varepsilon}^{2}}{1-\\beta_{1}^{2}}\\right)$,其(无条件)密度函数为\n", "\n", "$$f_{y_{1}}\\left(y_{1}\\right)=\\frac{1}{\\sqrt{2 \\pi \\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)}} \\exp 
\\left\\{\\frac{-\\left[y_{1}-\\left(\\beta_{0} /\\left(1-\\beta_{1}\\right)\\right)\\right]^{2}}{2 \\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)}\\right\\}\\tag{4}$$\n", "\n", "在给定 $y_1$ 的条件下,$y_2$ 的条件分布为 $y_{2} | y_{1} \\sim N\\left(\\beta_{0}+\\beta_{1} y_{1}, \\sigma_{\\varepsilon}^{2}\\right)$,其条件密度为\n", "\n", "$$f_{y_{2} | y_{1}}\\left(y_{2} | y_{1}\\right)=\\frac{1}{\\sqrt{2 \\pi \\sigma_{\\varepsilon}^{2}}} \\exp \\left\\{\\frac{-\\left(y_{2}-\\beta_{0}-\\beta_{1} y_{1}\\right)^{2}}{2 \\sigma_{\\varepsilon}^{2}}\\right\\}\\tag{5}$$\n", "\n", "则 $y_1$ 与 $y_2$ 的联合密度为\n", "\n", "$$f_{y_{1}, y_{2}}\\left(y_{1}, y_{2}\\right)=f_{y_{1}}\\left(y_{1}\\right) f_{y_{2} | y_{1}}\\left(y_{2} | y_{1}\\right)\\tag{6}$$\n", "\n", "依此类推,$y_{t} | y_{t-1} \\sim N\\left(\\beta_{0}+\\beta_{1} y_{t-1}, \\sigma_{\\varepsilon}^{2}\\right), t=2, \\cdots, T$,可写出整个样本数据 $\\left\\{y_{1}, y_{2}, \\cdots, y_{T}\\right\\}$ 的联合概率密度(即似然函数)为\n", "\n", "$$f_{y_{1}, \\cdots, y_{T}}\\left(y_{1}, \\cdots, y_{T}\\right)=f_{y_{1}}\\left(y_{1}\\right) \\prod_{t=2}^{T} f_{y_{t} | y_{t-1}}\\left(y_{t} | y_{t-1}\\right)\\tag{7}$$\n", "\n", "取对数可得对数似然函数:\n", "\n", "$$\\ln L\\left(\\beta_{0}, \\beta_{1}, \\sigma_{\\varepsilon}^{2} ; y_{1}, \\cdots, y_{T}\\right)=\\ln f_{y_{1}}\\left(y_{1}\\right)+\\sum_{t=2}^{T} \\ln f_{y_{t} | y_{t-1}}\\left(y_{t} | y_{t-1}\\right)\\tag{8}$$\n", "\n", "代入 $f_{y_{1}}\\left(y_{1}\\right)$ 与 $f_{y_{t} | y_{t-1}}\\left(y_{t} | y_{t-1}\\right)$ 的表达式可得\n", "\n", "$$\\ln L=-\\frac{1}{2} \\ln 2 \\pi-\\frac{1}{2} \\ln \\left[\\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)\\right]-\\frac{\\left[y_{1}-\\left(\\beta_{0} /\\left(1-\\beta_{1}\\right)\\right)\\right]^{2}}{2 \\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)}-\\frac{T-1}{2} \\ln 2 \\pi-\\frac{T-1}{2} \\ln \\sigma_{\\varepsilon}^{2}-\\sum_{t=2}^{T} \\frac{\\left(y_{t}-\\beta_{0}-\\beta_{1} y_{t-1}\\right)^{2}}{2 
\\sigma_{\\varepsilon}^{2}}\\tag{9}$$\n", "\n", "寻找最优参数 $\\left(\\beta_{0}, \\beta_{1}, \\sigma_{\\varepsilon}^{2}\\right)$,使得 $\\ln L$ 最大化,所得估计量被称为“精确最大似然估计量”。由于 $\\ln L$ 为 $\\left(\\beta_{0}, \\beta_{1}, \\sigma_{\\varepsilon}^{2}\\right)$ 的非线性函数,故需要使用迭代法进行数值计算。\n", "\n", "如果样本容量 $T$ 较大,则第一个观测值对似然函数的贡献较小,可以忽略。此时,相当于考虑在第一个观测值 $y_1$ 给定的情况下,{$y_2,\\cdots,y_T$} 的条件分布,则对数似然函数简化为\n", "\n", "$$\\ln L=-\\frac{T-1}{2} \\ln 2 \\pi-\\frac{T-1}{2} \\ln \\sigma_{\\varepsilon}^{2}-\\sum_{t=2}^{T} \\frac{\\left(y_{t}-\\beta_{0}-\\beta_{1} y_{t-1}\\right)^{2}}{2 \\sigma_{\\varepsilon}^{2}}\\tag{10}$$\n", "\n", "最大化上式所得到的估计量称为“条件最大似然估计量”(conditional MLE)。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "更一般地,可以考虑 $p$ 阶自回归模型,记为 $AR(p)$:\n", "\n", "$$y_t = \\beta_0 + \\beta_1 y_{t-1}+\\cdots+\\beta_p y_{t-p}+\\varepsilon_t\\tag{11}$$\n", "\n", "对 $AR(p)$ 的估计与 $AR(1)$ 类似。在使用“条件 MLE”时,考虑在给定 {$y_1,\\cdots,y_p$} 的情况下,{$y_{p+1},\\cdots,y_T$} 的条件分布。由于条件 MLE 等价于 OLS(给定 $\\sigma_{\\varepsilon}^{2}$,最大化条件对数似然函数等价于最小化残差平方和),而 OLS 并不依赖于正态性假定,故条件 MLE 的一致性也不依赖于正态性假定。\n", "\n", "然而,通常我们并不知道滞后阶数 $p$。因此,如何估计 $\\hat{p}$ 在实践中有着重要的意义。\n", "\n", "估计滞后阶数 $p$ 有两种方法:\n", "1. 由大到小的序贯 $t$ 规则\n", "2. 
使用信息准则:AIC、BIC 准则等\n", "\n", "在实践中,可以结合以上两种方法来确定 $\\hat{p}$。如果二者不一致,为了保守起见(即尽量避免遗漏变量偏差),可取二者滞后阶数的大者。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 最大似然估计原理" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "如果回归模型存在非线性,可能不方便使用 OLS,这时常常采用最大似然估计法(MLE)或非线性最小二乘法(NLS)。\n", "\n", "假设随机向量 $y$ 的概率密度函数为 $f(y; \\theta)$,其中 $\\theta$ 为 $K$ 维未知参数向量,通过抽取随机样本 {$y_1,\\cdots,y_n$} 来估计 $\\theta$。假设 {$y_1,\\cdots,y_n$} 为独立同分布(iid),则样本数据的联合密度函数为 $f(y_1; \\theta)f(y_2; \\theta) \\cdots f(y_n; \\theta)$。\n", "\n", "在抽样之前,{$y_1,\\cdots,y_n$} 被视为随机向量。抽样之后,{$y_1,\\cdots,y_n$} 就有了特定的样本值。因此,可以将样本的联合密度函数视为在 {$y_1,\\cdots,y_n$} 给定的情况下,未知参数 $\\theta$ 的函数。定义“似然函数”(likelihood function)为\n", "\n", "$$L\\left(\\boldsymbol{\\theta} ; y_{1}, \\cdots, y_{n}\\right)=\\prod_{i=1}^{n} f\\left(y_{i} ; \\boldsymbol{\\theta}\\right)\\tag{12}$$\n", "\n", "由此可知,似然函数与联合密度函数的表达式完全相同,只是 $\\theta$ 与 {$y_1,\\cdots,y_n$} 的角色互换,即把 $\\theta$ 作为自变量,而视 {$y_1,\\cdots,y_n$} 为已给定的。为了运算方便,常把似然函数取对数,将乘积的形式转化为求和的形式\n", "\n", "$$\\ln L\\left(\\boldsymbol{\\theta} ; y_{1}, \\cdots, y_{n}\\right)=\\sum_{i=1}^{n} \\ln f\\left(y_{i} ; \\boldsymbol{\\theta}\\right)\\tag{13}$$\n", "\n", "“最大似然估计法”(Maximum Likelihood Estimation,简记 MLE 或 ML)来源于一个简单而深刻的思想:给定样本取值后,该样本最有可能来自参数 $\\theta$ 为何值的总体。换言之,寻找 $\\hat{\\boldsymbol{\\theta}}_{\\mathrm{ML}}$,使得观测到样本数据的可能性最大,即最大化对数似然函数(log-likelihood function):\n", "\n", "$$\\max _{\\theta \\in \\Theta} \\ln L(\\theta ; y)\\tag{14}$$\n", "\n", "在数学上,常把最大似然估计量 $\\hat{\\boldsymbol{\\theta}}_{\\mathrm{ML}}$ 写为\n", "\n", "$$\\hat{\\boldsymbol{\\theta}}_{\\mathrm{ML}} \\equiv \\operatorname{argmax} \\ln L(\\boldsymbol{\\theta} ; y)\\tag{15}$$\n", "\n", "其中,“argmax”(即 argument of the maximum)表示能使 $\\ln L(\\boldsymbol{\\theta} ; y)$ 最大化的 $\\theta$ 取值。假设存在唯一内点解,则该无约束极值问题的一阶条件为\n", "\n", "$$s(\\boldsymbol{\\theta} ; y) \\equiv \\frac{\\partial \\ln L(\\boldsymbol{\\theta} ; y)}{\\partial \\boldsymbol{\\theta}} \\equiv\\left(\\begin{array}{c}\n", "\\frac{\\partial \\ln L(\\boldsymbol{\\theta} ; 
y)}{\\partial \\theta_{1}} \\\\\n", "\\vdots \\\\\n", "\\frac{\\partial \\ln L(\\boldsymbol{\\theta} ; y)}{\\partial \\theta_{K}}\n", "\\end{array}\\right)=\\mathbf{0}\\tag{16}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 原理实现" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<table class=\"simpletable\">\n", "<caption>OLS Regression Results</caption>\n", "<tr>\n", " <th>Dep. Variable:</th> <td>y</td> <th> R-squared: </th> <td> 0.979</td> \n", "</tr>\n", "<tr>\n", " <th>Model:</th> <td>OLS</td> <th> Adj. R-squared: </th> <td> 0.979</td> \n", "</tr>\n", "<tr>\n", " <th>Method:</th> <td>Least Squares</td> <th> F-statistic: </th> <td>2.126e+04</td>\n", "</tr>\n", "<tr>\n", " <th>Date:</th> <td>Fri, 29 May 2020</td> <th> Prob (F-statistic):</th> <td> 0.00</td> \n", "</tr>\n", "<tr>\n", " <th>Time:</th> <td>21:02:41</td> <th> Log-Likelihood: </th> <td> -2286.7</td> \n", "</tr>\n", "<tr>\n", " <th>No. Observations:</th> <td> 459</td> <th> AIC: </th> <td> 4577.</td> \n", "</tr>\n", "<tr>\n", " <th>Df Residuals:</th> <td> 457</td> <th> BIC: </th> <td> 4586.</td> \n", "</tr>\n", "<tr>\n", " <th>Df Model:</th> <td> 1</td> <th> </th> <td> </td> \n", "</tr>\n", "<tr>\n", " <th>Covariance Type:</th> <td>nonrobust</td> <th> </th> <td> </td> \n", "</tr>\n", "</table>\n", "<table class=\"simpletable\">\n", "<tr>\n", " <td></td> <th>coef</th> <th>std err</th> <th>t</th> <th>P>|t|</th> <th>[0.025</th> <th>0.975]</th> \n", "</tr>\n", "<tr>\n", " <th>const</th> <td> 39.1999</td> <td> 19.891</td> <td> 1.971</td> <td> 0.049</td> <td> 0.111</td> <td> 78.288</td>\n", "</tr>\n", "<tr>\n", " <th>var_1</th> <td> 0.9863</td> <td> 0.007</td> <td> 145.803</td> <td> 0.000</td> <td> 0.973</td> <td> 1.000</td>\n", "</tr>\n", "</table>\n", "<table class=\"simpletable\">\n", "<tr>\n", " <th>Omnibus:</th> <td>50.265</td> <th> Durbin-Watson: </th> <td> 1.999</td>\n", "</tr>\n", "<tr>\n", " <th>Prob(Omnibus):</th> <td> 
0.000</td> <th> Jarque-Bera (JB): </th> <td> 186.067</td>\n", "</tr>\n", "<tr>\n", " <th>Skew:</th> <td>-0.417</td> <th> Prob(JB): </th> <td>3.94e-41</td>\n", "</tr>\n", "<tr>\n", " <th>Kurtosis:</th> <td> 6.006</td> <th> Cond. No. </th> <td>3.55e+04</td>\n", "</tr>\n", "</table><br/><br/>Warnings:<br/>[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.<br/>[2] The condition number is large, 3.55e+04. This might indicate that there are<br/>strong multicollinearity or other numerical problems." ], "text/plain": [ "<class 'statsmodels.iolib.summary.Summary'>\n", "\"\"\"\n", " OLS Regression Results \n", "==============================================================================\n", "Dep. Variable: y R-squared: 0.979\n", "Model: OLS Adj. R-squared: 0.979\n", "Method: Least Squares F-statistic: 2.126e+04\n", "Date: Fri, 29 May 2020 Prob (F-statistic): 0.00\n", "Time: 21:02:41 Log-Likelihood: -2286.7\n", "No. Observations: 459 AIC: 4577.\n", "Df Residuals: 457 BIC: 4586.\n", "Df Model: 1 \n", "Covariance Type: nonrobust \n", "==============================================================================\n", " coef std err t P>|t| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "const 39.1999 19.891 1.971 0.049 0.111 78.288\n", "var_1 0.9863 0.007 145.803 0.000 0.973 1.000\n", "==============================================================================\n", "Omnibus: 50.265 Durbin-Watson: 1.999\n", "Prob(Omnibus): 0.000 Jarque-Bera (JB): 186.067\n", "Skew: -0.417 Prob(JB): 3.94e-41\n", "Kurtosis: 6.006 Cond. No. 3.55e+04\n", "==============================================================================\n", "\n", "Warnings:\n", "[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n", "[2] The condition number is large, 3.55e+04. 
This might indicate that there are\n", "strong multicollinearity or other numerical problems.\n", "\"\"\"" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "import statsmodels.api as sm\n", "\n", "data = pd.read_excel('../数据/上证指数与沪深300.xlsx')\n", "\n", "def ARxx(X, P):\n", " \"\"\"\n", " 获取自回归的估计向量\n", " X - 输入数据,列向量\n", " P - AR 阶数,标量\n", " \"\"\"\n", " N = len(X)\n", " ARx = pd.DataFrame()\n", " for m in range(1,P+1):\n", " ARx.insert(m-1, f'var_{m}', list(X[P-m:N-m]))\n", " ARy = list(X[P:N])\n", " return ARy, ARx\n", "\n", "\n", "\n", "y, X = ARxx(data['sz'], 1)\n", "X = sm.add_constant(X)\n", "mod = sm.OLS(y, X)\n", "res = mod.fit()\n", "res.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## statsmodels 库实现\n", "详情请参考:[AutoReg 模型](https://www.statsmodels.org/stable/generated/statsmodels.tsa.ar_model.AutoReg.html?highlight=auto#statsmodels.tsa.ar_model.AutoReg)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<table class=\"simpletable\">\n", "<caption>AutoReg Model Results</caption>\n", "<tr>\n", " <th>Dep. Variable:</th> <td>sz</td> <th> No. Observations: </th> <td>460</td> \n", "</tr>\n", "<tr>\n", " <th>Model:</th> <td>AutoReg(1)</td> <th> Log Likelihood </th> <td>-2286.658</td>\n", "</tr>\n", "<tr>\n", " <th>Method:</th> <td>Conditional MLE</td> <th> S.D. 
of innovations</th> <td>35.265</td> \n", "</tr>\n", "<tr>\n", " <th>Date:</th> <td>Fri, 29 May 2020</td> <th> AIC </th> <td>7.139</td> \n", "</tr>\n", "<tr>\n", " <th>Time:</th> <td>21:02:41</td> <th> BIC </th> <td>7.166</td> \n", "</tr>\n", "<tr>\n", " <th>Sample:</th> <td>1</td> <th> HQIC </th> <td>7.149</td> \n", "</tr>\n", "<tr>\n", " <th></th> <td>460</td> <th> </th> <td> </td> \n", "</tr>\n", "</table>\n", "<table class=\"simpletable\">\n", "<tr>\n", " <td></td> <th>coef</th> <th>std err</th> <th>z</th> <th>P>|z|</th> <th>[0.025</th> <th>0.975]</th> \n", "</tr>\n", "<tr>\n", " <th>intercept</th> <td> 39.1999</td> <td> 19.847</td> <td> 1.975</td> <td> 0.048</td> <td> 0.300</td> <td> 78.100</td>\n", "</tr>\n", "<tr>\n", " <th>sz.L1</th> <td> 0.9863</td> <td> 0.007</td> <td> 146.121</td> <td> 0.000</td> <td> 0.973</td> <td> 1.000</td>\n", "</tr>\n", "</table>\n", "<table class=\"simpletable\">\n", "<caption>Roots</caption>\n", "<tr>\n", " <td></td> <th> Real</th> <th> Imaginary</th> <th> Modulus</th> <th> Frequency</th>\n", "</tr>\n", "<tr>\n", " <th>AR.1</th> <td> 1.0139</td> <td> +0.0000j</td> <td> 1.0139</td> <td> 0.0000</td>\n", "</tr>\n", "</table>" ], "text/plain": [ "<class 'statsmodels.iolib.summary.Summary'>\n", "\"\"\"\n", " AutoReg Model Results \n", "==============================================================================\n", "Dep. Variable: sz No. Observations: 460\n", "Model: AutoReg(1) Log Likelihood -2286.658\n", "Method: Conditional MLE S.D. 
of innovations 35.265\n", "Date: Fri, 29 May 2020 AIC 7.139\n", "Time: 21:02:41 BIC 7.166\n", "Sample: 1 HQIC 7.149\n", " 460 \n", "==============================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "intercept 39.1999 19.847 1.975 0.048 0.300 78.100\n", "sz.L1 0.9863 0.007 146.121 0.000 0.973 1.000\n", " Roots \n", "=============================================================================\n", " Real Imaginary Modulus Frequency\n", "-----------------------------------------------------------------------------\n", "AR.1 1.0139 +0.0000j 1.0139 0.0000\n", "-----------------------------------------------------------------------------\n", "\"\"\"" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from statsmodels.tsa.ar_model import AutoReg\n", "import pandas as pd\n", "\n", "data = pd.read_excel('../数据/上证指数与沪深300.xlsx')\n", "res = AutoReg(data['sz'], lags=1).fit()\n", "res.summary()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## matlab实现" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 使用 OLS 估计代码\n", "\n", "ARxx.m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```matlab\n", "function [ARy,ARx] = ARxx(X,P)\n", "\n", "% 获取自回归的估计向量\n", "% X - 输入数据,列向量\n", "% P - AR 阶数,标量\n", "\n", "N = length(X);\n", "ARx = zeros(N-P,P+1);\n", "ARx(:,1) = ones(N-P,1); %常数项\n", "for m = 1:P\n", " ARx(:,m+1) = X(P-m+1:N-m,1); %P-m+1 = N-m-(N-P)+1\n", "end\n", "ARy = X(P+1:N,1);\n", "\n", "% % 1.载入数据\n", "% data = xlsread('../数据/上证指数与沪深300.xlsx');\n", "% f1 = data(:,2);\n", "% \n", "% % 2.估计自回归模型f1 = c + f1(-1) + f1(-2)\n", "% [ARy,ARx] = ARxx(f1,2);\n", "% beta = regress(ARy,ARx);\n", "% \n", "% % 3.计算各滞后阶AIC值,选取最优滞后阶数(最大阶数为6)\n", "% T = length(f1);\n", "% AR_AIC = zeros(6,1);\n", "% for p = 1:6\n", "% [ARy,ARx] = ARxx(f1,p);\n", "% beta = 
regress(ARy,ARx);\n", "% y_hat = ARx*beta;\n", "% resid = ARy - y_hat;\n", "% AR_AIC(p) = log(resid'*resid/T)+2*(p+1)/T;\n", "% end\n", "% \n", "% % 4.找到最小的AIC值所对应的阶数\n", "% Best_p = find(AR_AIC == min(AR_AIC));\n", "% [ARy,ARx] = ARxx(f1,Best_p);\n", "% beta = regress(ARy,ARx);\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### fminsearch 函数(最大似然估计)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 使用" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "fminsearch 函数用来求解多维无约束的非线性优化问题,它用 derivative-free(无需导数)的方法寻找多变量函数的最小值。\n", "\n", "**语法:**\n", "```matlab\n", "x = fminsearch(fun,x0)\n", "x = fminsearch(fun,x0,options)\n", "[x,fval] = fminsearch(...)\n", "[x,fval,exitflag] = fminsearch(...)\n", "[x,fval,exitflag,output] = fminsearch(...)\n", "```\n", "\n", "**解释:**\n", "\n", "fminsearch 能够从一个初始值开始,找到一个标量函数的局部最小值,这通常被称为无约束非线性优化。\n", "\n", "* x = fminsearch(fun,x0) 从 x0 开始,找到函数 fun 的局部最小值点 x。x0 可以是标量、向量或矩阵;fun 是一个函数句柄 \n", "\n", "* x = fminsearch(fun,x0,options) 按结构体 options 中指定的优化参数进行最小化,可以用 optimset 函数设置这些参数(见 matlab help)\n", "\n", "* [x,fval] = fminsearch(...) 同时返回在结果 x 处的目标函数值\n", "\n", "* [x,fval,exitflag] = fminsearch(...) 返回 exitflag 值来表示 fminsearch 退出的条件:\n", "    * 1--函数收敛到解 x\n", "    * 0--达到最大函数评价次数或最大迭代次数\n", "    * -1--算法被输出函数终止\n", "\n", "* [x,fval,exitflag,output] = fminsearch(...) 
返回一个结构输出 output,包含最优化函数的信息:\n", "    * output.algorithm 使用的优化算法\n", "    * output.funcCount 函数计算次数\n", "    * output.iterations 迭代次数\n", "    * output.message 退出信息" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 优化参数选项 options" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以通过 optimset 函数设置或改变这些参数。其中有的参数适用于所有的优化算法,有的则只适用于大型优化问题,另外一些则只适用于中型问题。\n", "\n", "首先描述适用于大型问题的选项。这仅仅是一个参考,因为使用大型问题算法有一些条件:对于 fminunc 函数来说,使用大型算法必须提供梯度信息。\n", "\n", "> LargeScale – 当设为'on'时使用大型算法,若设为'off'则使用中型问题的算法。\n", " \n", "**适用于大型和中型算法的参数:**\n", "* Diagnostics – 打印最小化函数的诊断信息。 \n", "\n", "* Display – 显示水平。选择'off',不显示输出;选择'iter',显示每一步迭代过程的输出;选择'final',显示最终结果。 \n", "\n", "* GradObj – 用户定义的目标函数的梯度。对于大型问题此参数是必选的,对于中型问题则是可选项。 \n", "\n", "* MaxFunEvals – 函数评价的最大次数。 \n", "\n", "* MaxIter – 最大允许迭代次数。 \n", "\n", "* TolFun – 函数值的终止容限。 \n", "\n", "* TolX – x 处的终止容限。\n", "\n", "**只用于大型算法的参数:** \n", "* Hessian – 用户定义的目标函数的 Hessian 矩阵。\n", "\n", "* HessPattern – 用于有限差分的 Hessian 矩阵的稀疏形式。若不方便求 fun 函数的稀疏 Hessian 矩阵 H,可以通过用梯度的有限差分获得的 H 的稀疏结构(如非零值的位置等)来得到近似的 Hessian 矩阵 H。若连矩阵的稀疏结构都不知道,则可以将 HessPattern 设为密集矩阵,在每一次迭代过程中,都将进行密集矩阵的有限差分近似(这是缺省设置)。这将非常麻烦,所以花一些力气得到 Hessian 矩阵的稀疏结构还是值得的。\n", "\n", "* MaxPCGIter – PCG 迭代的最大次数。\n", "\n", "* PrecondBandWidth – PCG 前处理的上带宽,缺省时为零。对于有些问题,增加带宽可以减少迭代次数。\n", "\n", "* TolPCG – PCG 迭代的终止容限。\n", "\n", "* TypicalX – 典型 x 值。\n", " \n", "**只用于中型算法的参数:** \n", "* DerivativeCheck – 对用户提供的导数和有限差分求出的导数进行对比。\n", "\n", "* DiffMaxChange – 变量有限差分梯度的最大变化。\n", "\n", "* DiffMinChange – 变量有限差分梯度的最小变化。\n", "\n", "* LineSearchType – 一维搜索算法的选择。\n", "\n", "\n", "* exitflag:描述退出条件\n", "    * exitflag>0 表示目标函数收敛于解 x 处。\n", "    * exitflag=0 表示已经达到函数评价或迭代的最大次数。\n", "    * exitflag<0 表示目标函数不收敛。\n", "\n", "\n", "* output:该参数包含下列优化信息:\n", "    * output.iterations – 迭代次数。\n", "    * output.algorithm – 所采用的算法。\n", "    * output.funcCount – 函数评价次数。\n", "    * output.cgiterations – PCG 迭代次数(只适用于大型规划问题)。\n", "    * output.stepsize – 最终步长的大小(只用于中型问题)。\n", "    * output.firstorderopt – 一阶优化的度量:解 x 
处梯度的范数。\n", "\n", "\n", "* fminunc 为中型优化算法的搜索方向提供了如下几种算法,由 options 中的参数 HessUpdate 控制:\n", "    * HessUpdate='bfgs'(默认值),为拟牛顿的 BFGS 法\n", "    * HessUpdate='dfp',为拟牛顿 DFP 法\n", "    * HessUpdate='steepdesc',为最速下降法\n", "\n", "\n", "* fminunc 为中型优化算法的步长一维搜索提供了两种算法,由 options 中参数 LineSearchType 控制:\n", "    * LineSearchType='quadcubic',混合的二次和三次多项式插值\n", "    * LineSearchType='cubicpoly',三次多项式插值\n", "\n", "> 使用 fminunc 和 fminsearch 都可能只得到局部最优解。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### fminsearch 函数的缺陷" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. 依赖于初值\n", "2. 不同目标函数需要选择不同的算法\n", "3. 仅是局部最优算法,并不是全局最优\n", "4. 迭代法与解析解的速度不在一个量级" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### fminsearch 函数实例" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**1. 求解 $z=(x+1)^2+(y-1)^2$ 的最小值**\n", "\n", "三维图像显示:\n", "<div align=center><img src=\"https://lei-picture.oss-cn-beijing.aliyuncs.com/img/20200428003742.png\" width=\"350\" ></div>\n", "\n", "\n", "可能不太容易从图像里看出极值,这里再举一个例子:\n", "\n", "$z=-\\sqrt{(x+1)^2+(y-1)^2}$ 的三维图 \n", "<div align=center><img src=\"https://lei-picture.oss-cn-beijing.aliyuncs.com/img/20200428003850.png\" width=\"350\" ></div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```matlab\n", "% 1.编写目标函数(保存为单独的 ObjectFunction.m 文件)\n", "function z = ObjectFunction(Beta)\n", "\tz = (Beta(1)+1)^2 + (Beta(2)-1)^2;\n", "\n", "% 2.初始参数设定\n", "maxsize = 2000; % 生成均匀随机数的个数(用于赋初值)\n", "REP\t\t\t = 100; % 若发散则继续进行迭代的次数\n", "nInitialVectors = [maxsize, 2]; % 生成随机数向量\n", "MaxFunEvals = 5000; % 函数评价的最大次数\n", "MaxIter = 5000; % 允许迭代的最大次数\n", "options = optimset('LargeScale', 'off','HessUpdate', 'dfp','MaxFunEvals', ...\n", "MaxFunEvals, 'display', 'on', 'MaxIter', MaxIter, 'TolFun', 1e-6, 'TolX', 1e-6,'TolCon',10^-12);\n", "\n", "% 3.寻找最优初值\n", "initialTargetVectors = unifrnd(-5,5, nInitialVectors);\n", "RQfval = zeros(nInitialVectors(1), 1);\n", "for i = 1:nInitialVectors(1)\n", "    RQfval(i) = 
ObjectFunction(initialTargetVectors(i,:));\n", "end\n", "Results = [RQfval, initialTargetVectors];\n", "SortedResults = sortrows(Results,1);\n", "BestInitialCond = SortedResults(1,2:size(Results,2));\n", "\n", "% 4.迭代求出最优估计值Beta\n", "[Beta, fval, exitflag] = fminsearch(@ObjectFunction, BestInitialCond, options);\n", "for it = 1:REP\n", "    if exitflag == 1, break, end\n", "    % 若未收敛,则从当前结果出发继续迭代\n", "    [Beta, fval, exitflag] = fminsearch(@ObjectFunction, Beta, options);\n", "end\n", "if exitflag ~= 1, warning('警告:迭代并没有完成'), end\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**2. 求解 $\\ln L=-\\frac{1}{2} \\ln 2 \\pi-\\frac{1}{2} \\ln \\left[\\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)\\right]-\\frac{\\left[y_{1}-\\left(\\beta_{0} /\\left(1-\\beta_{1}\\right)\\right)\\right]^{2}}{2 \\sigma_{\\varepsilon}^{2} /\\left(1-\\beta_{1}^{2}\\right)}-\\frac{T-1}{2} \\ln 2 \\pi-\\frac{T-1}{2} \\ln \\sigma_{\\varepsilon}^{2}-\\sum_{t=2}^{T} \\frac{\\left(y_{t}-\\beta_{0}-\\beta_{1} y_{t-1}\\right)^{2}}{2 \\sigma_{\\varepsilon}^{2}}$ 的最大值**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```matlab\n", "% 导入数据\n", "data = xlsread('../数据/上证指数与沪深300.xlsx');\n", "f1 = data(:,2);\n", "\n", "% 1.编写目标函数(保存为单独的 ARlnL.m 文件)\n", "function lnL = ARlnL(Beta,Y)\n", "% Beta(1) = Beta0; Beta(2) = Beta1; Beta(3) = sigma;\n", "T = length(Y);\n", "lnL1 = -0.5*log(2*pi)-0.5*log(Beta(3)^2/(1-Beta(2)^2))-(Y(1)-(Beta(1)/(1-Beta(2))))^2/(2*Beta(3)^2/(1-Beta(2)^2))-0.5*log(2*pi)*(T-1)-0.5*(T-1)*log(Beta(3)^2);\n", "lnL2 = zeros(T,1); % 预分配\n", "for i = 2:T\n", "    lnL2(i) = (Y(i)-Beta(1)-Beta(2)*Y(i-1))^2;\n", "end\n", "lnL = lnL1-0.5*sum(lnL2)/Beta(3)^2;\n", "lnL = -lnL; % fminsearch 求最小值,故取负的对数似然\n", "\n", "% 2.初始参数设定\n", "maxsize = 2000; % 生成均匀随机数的个数(用于赋初值)\n", "REP\t\t\t = 100; % 若发散则继续进行迭代的次数\n", "nInitialVectors = [maxsize, 3]; % 生成随机数向量\n", "MaxFunEvals = 5000; % 函数评价的最大次数\n", "MaxIter = 5000; % 允许迭代的最大次数\n", "options = optimset('LargeScale', 'off','HessUpdate', 'dfp','MaxFunEvals', ...\n", "MaxFunEvals, 'display', 'on', 'MaxIter', MaxIter, 'TolFun', 1e-6, 
'TolX', 1e-6,'TolCon',10^-12);\n", "\n", "% 3.寻找最优初值\n", "initialTargetVectors = unifrnd(-1,1, nInitialVectors);\n", "RQfval = zeros(nInitialVectors(1), 1);\n", "for i = 1:nInitialVectors(1)\n", "    RQfval(i) = ARlnL(initialTargetVectors(i,:), f1);\n", "end\n", "Results = [RQfval, initialTargetVectors];\n", "SortedResults = sortrows(Results,1);\n", "BestInitialCond = SortedResults(1,2:size(Results,2));\n", "\n", "% 4.迭代求出最优估计值Beta\n", "[Beta, fval, exitflag] = fminsearch(@(b) ARlnL(b,f1), BestInitialCond, options);\n", "for it = 1:REP\n", "    if exitflag == 1, break, end\n", "    % 若未收敛,则从当前结果出发继续迭代\n", "    [Beta, fval, exitflag] = fminsearch(@(b) ARlnL(b,f1), Beta, options);\n", "end\n", "if exitflag ~= 1, warning('警告:迭代并没有完成'), end\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "最终结果为:\n", "\n", "| c | AR(1) | sigma |\n", "| :----: | :----: | :----: |\n", "| 38.11 | 1.00 | 35.96 |" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }