{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Return Concepts and Stationarity"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Asset Returns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
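    "Most financial studies work with asset returns rather than asset prices.\n",
    "\n",
    "There are two main reasons for using returns:\n",
    "1. For the average investor, the return of an asset is a complete summary of the investment opportunity, and it does not depend on the size of the investment.\n",
    "2. Return series are easier to handle than price series because they have more attractive statistical properties.\n",
    "\n",
    "There are, however, several definitions of an asset return. Let $P_t$ be the price of an asset at time $t$. The common definitions are given below, assuming for now that the asset pays no dividends."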
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### One-Period Simple Return"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Holding an asset for one period, from day $t-1$ to day $t$, yields the **simple gross return**:\n",
    "\n",
    "$$1+R_{t}=\\frac{P_{t}}{P_{t-1}} \\quad \\text{or} \\quad P_{t}=P_{t-1}\\left(1+R_{t}\\right)\\tag{1}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The corresponding **one-period simple net return**, or simply the **simple return**, is:\n",
    "\n",
    "$$R_{t}=\\frac{P_{t}}{P_{t-1}}-1=\\frac{P_{t}-P_{t-1}}{P_{t-1}}\\tag{2}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Continuously Compounded Return"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The natural logarithm of the simple gross return of an asset is called the **continuously compounded return** or **log return**:\n",
    "\n",
    "$$r_{t}=\\ln \\left(1+R_{t}\\right)=\\ln \\frac{P_{t}}{P_{t-1}}=p_{t}-p_{t-1}\\tag{3}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
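    "where $p_{t}=\\ln P_{t}$. Compared with the simple net return $R_t$, the continuously compounded return $r_t$ has some advantages. First, writing $1+R_{t}[k]=P_{t}/P_{t-k}$ for the $k$-period simple gross return, the multiperiod log return satisfies:\n",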
    "\n",
    "$$\\begin{aligned}\n",
    "r_{t}[k] &=\\ln \\left(1+R_{t}[k]\\right)=\\ln \\left[\\left(1+R_{t}\\right)\\left(1+R_{t-1}\\right) \\cdots\\left(1+R_{t-k+1}\\right)\\right] \\\\\n",
    "&=\\ln \\left(1+R_{t}\\right)+\\ln \\left(1+R_{t-1}\\right)+\\cdots+\\ln \\left(1+R_{t-k+1}\\right) \\\\\n",
    "&=r_{t}+r_{t-1}+\\cdots+r_{t-k+1}\n",
    "\\end{aligned}\\tag{4}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
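    "Thus the continuously compounded multiperiod return is simply the sum of the continuously compounded one-period returns it spans. Second, log returns have statistical properties that are easier to handle."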
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>日期</th>\n",
       "      <th>hs300</th>\n",
       "      <th>sz</th>\n",
       "      <th>sz收益率</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2018-01-03</td>\n",
       "      <td>4111.3925</td>\n",
       "      <td>3369.1084</td>\n",
       "      <td>0.618765</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2018-01-04</td>\n",
       "      <td>4128.8119</td>\n",
       "      <td>3385.7102</td>\n",
       "      <td>0.491555</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2018-01-05</td>\n",
       "      <td>4138.7505</td>\n",
       "      <td>3391.7501</td>\n",
       "      <td>0.178235</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2018-01-08</td>\n",
       "      <td>4160.1595</td>\n",
       "      <td>3409.4795</td>\n",
       "      <td>0.521360</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2018-01-09</td>\n",
       "      <td>4189.2977</td>\n",
       "      <td>3413.8996</td>\n",
       "      <td>0.129558</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          日期      hs300         sz     sz收益率\n",
       "0 2018-01-03  4111.3925  3369.1084  0.618765\n",
       "1 2018-01-04  4128.8119  3385.7102  0.491555\n",
       "2 2018-01-05  4138.7505  3391.7501  0.178235\n",
       "3 2018-01-08  4160.1595  3409.4795  0.521360\n",
       "4 2018-01-09  4189.2977  3413.8996  0.129558"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "stock = pd.read_excel('../数据/上证指数与沪深300.xlsx')\n",
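    "# Log return in percent: 100 * ln(P_t / P_{t-1}); the first row has no previous price\n",
    "stock['sz收益率'] = 100*np.log(stock['sz']/stock['sz'].shift(1))\n",
    "stock = stock.dropna()   # drop rows with missing values\n",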
    "stock = stock.reset_index(drop=True)\n",
    "stock.head()"
   ]
  },
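  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of equation (4), take the rows for 2018-01-03 to 2018-01-05 in the output above. The `sz收益率` column holds log returns in percent, so the two-day log return ending on 2018-01-05 should equal the sum of the two daily values. The arithmetic below is only an illustrative sketch based on the rounded figures displayed above:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "r_{t}[2] &=100 \\ln \\frac{3391.7501}{3369.1084} \\approx 0.6698 \\\\\n",
    "r_{t}+r_{t-1} &=0.178235+0.491555=0.669790\n",
    "\\end{aligned}$$\n",
    "\n",
    "Simple returns lack this property. Over the same two days the daily simple returns are about $0.4928\\%$ and $0.1784\\%$, which sum to $0.6712\\%$, whereas the two-day simple return is $3391.7501/3369.1084-1 \\approx 0.6720\\%$."
   ]
  },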
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Stationarity\n",
    "### Concepts"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
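    "Stationarity is the foundation of time series analysis.\n",
    "\n",
    "A time series {$r_t$} is said to be **strictly stationary** if the joint distribution of ($r_{t_1},\\cdots,r_{t_k}$) is identical to that of ($r_{t_1+t},\\cdots,r_{t_k+t}$) for all $t$, where $k$ is an arbitrary positive integer and ($t_1,\\cdots,t_k$) is a collection of $k$ positive integers. In other words, strict stationarity requires that the joint distribution of ($r_{t_1},\\cdots,r_{t_k}$) be invariant under time shifts. This is a very strong condition that is hard to verify empirically, so a weaker form of stationarity is often assumed.\n",
    "\n",
    "A time series {$r_t$} is **weakly stationary** if both the mean of $r_t$ and the covariance between $r_t$ and $r_{t-l}$ are time invariant, where $l$ is an arbitrary integer. More specifically, {$r_t$} is weakly stationary if:\n",
    "1. $E(r_t)=\\mu$, where $\\mu$ is a constant;\n",
    "2. $\\operatorname{Cov}\\left(r_{t}, r_{t-l}\\right)=\\gamma_{l}$, where $\\gamma_{l}$ depends only on $l$.\n",
    "\n",
    "In practice, suppose we have observed $T$ data points $\\left\\{r_{t} | t=1, \\cdots, T\\right\\}$. Weak stationarity implies that the time plot of the data shows the $T$ values fluctuating with constant variation around a fixed level. In applications, weak stationarity lets us make inferences about future observations, that is, forecasts.\n",
    "\n",
    "The conditions for weak stationarity implicitly assume that the first two moments of $r_t$ are finite. From the definitions, if $r_t$ is strictly stationary and its first two moments are finite, then $r_t$ is also weakly stationary. The converse does not hold in general. However, if the time series $r_t$ is normally distributed, weak stationarity is equivalent to strict stationarity. This material is mainly concerned with weakly stationary series.\n",
    "\n",
    "The covariance $\\gamma_{l}=\\operatorname{Cov}\\left(r_{t}, r_{t-l}\\right)$ is called the lag-$l$ autocovariance of $r_t$. It has two important properties:\n",
    "1. $\\gamma_{0}=\\operatorname{Var}\\left(r_{t}\\right)$\n",
    "2. $\\gamma_{-l}=\\gamma_{l}$\n",
    "\n",
    "The second property holds because $\\gamma_{-l}=\\operatorname{Cov}\\left(r_{t}, r_{t-(-l)}\\right)=\\operatorname{Cov}\\left(r_{t-(-l)}, r_{t}\\right)=\\operatorname{Cov}\\left(r_{t+l}, r_{t}\\right)=\\operatorname{Cov}\\left(r_{t_{1}}, r_{t_{1}-l}\\right)=\\gamma_{l}$, where $t_{1}=t+l$.\n",
    "\n",
    "In the finance literature, it is common to assume that an asset return series is weakly stationary. Given enough historical return data, this assumption can be checked empirically; for example, we can divide the data into subsamples and check the consistency of the results obtained across them."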
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
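    "### Problems Caused by a Unit Root\n",
    "For an AR(1) process with autoregressive coefficient $\\beta_{1}$, the case $|\\beta_{1}|> 1$ is generally regarded as implausible in theory, because any shock to the economy would then be amplified without bound. Economists therefore usually worry only about the unit-root case, $\\beta_{1}=1$. A time series with a unit root is nonstationary, which can cause the following problems:\n",
    "1. The estimate of the autoregressive coefficient is biased downward, toward 0.\n",
    "2. The conventional $t$ test is no longer valid.\n",
    "3. Two mutually independent unit-root variables may exhibit spurious regression or spurious correlation."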
   ]
  },
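  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why a unit root implies nonstationarity, here is a short sketch under assumptions that are not stated in the text above: the AR(1) model is written as $y_{t}=\\beta_{1} y_{t-1}+\\varepsilon_{t}$ with no intercept, the starting value $y_0$ is fixed, and the shocks $\\varepsilon_t$ are uncorrelated with mean 0 and variance $\\sigma^2$. Setting $\\beta_{1}=1$ and substituting recursively gives:\n",
    "\n",
    "$$\\begin{aligned}\n",
    "y_{t} &=y_{0}+\\sum_{s=1}^{t} \\varepsilon_{s} \\\\\n",
    "\\operatorname{Var}\\left(y_{t}\\right) &=t \\sigma^{2}\n",
    "\\end{aligned}$$\n",
    "\n",
    "The variance grows linearly with $t$, so $\\gamma_0$ is not constant and the weak-stationarity conditions fail. Under the same assumptions, a stationary AR(1) with $|\\beta_{1}|<1$ has the constant variance $\\sigma^{2} /\\left(1-\\beta_{1}^{2}\\right)$."
   ]
  },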
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Augmented Dickey-Fuller unit root test (ADF test)"
   ]
  },
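  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As background for reading the output, the ADF test with a constant term is usually based on a regression of the following form. The symbols $\\alpha$, $\\gamma$, $\\delta_i$ and $p$ are notation chosen here for illustration, and $y_t$ stands for the series being tested:\n",
    "\n",
    "$$\\Delta y_{t}=\\alpha+\\gamma y_{t-1}+\\sum_{i=1}^{p} \\delta_{i} \\Delta y_{t-i}+\\varepsilon_{t}, \\qquad H_{0}: \\gamma=0 \\quad \\text{vs.} \\quad H_{1}: \\gamma<0$$\n",
    "\n",
    "The test statistic is the $t$ ratio of the estimated $\\gamma$, compared with Dickey-Fuller critical values rather than the usual $t$ distribution. The null hypothesis of a unit root is rejected only when the statistic falls below the chosen critical value."
   ]
  },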
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(-2.3069400422968207,\n",
       " 0.16973943854299967,\n",
       " 7,\n",
       " 451,\n",
       " {'1%': -3.444932949082776,\n",
       "  '5%': -2.867969899953726,\n",
       "  '10%': -2.57019489663276},\n",
       " 4390.386487977853)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import statsmodels.tsa.stattools as ts\n",
    "ts.adfuller(stock['sz'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>Augmented Dickey-Fuller Results</caption>\n",
       "<tr>\n",
       "  <td>Test Statistic</td>    <td>-2.307</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>P-value</td>            <td>0.170</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Lags</td>                   <td>7</td>\n",
       "</tr>\n",
       "</table><br/><br/>Trend: Constant<br/>Critical Values: -3.44 (1%), -2.87 (5%), -2.57 (10%)<br/>Null Hypothesis: The process contains a unit root.<br/>Alternative Hypothesis: The process is weakly stationary."
      ],
      "text/plain": [
       "<class 'arch.unitroot.unitroot.ADF'>\n",
       "\"\"\"\n",
       "   Augmented Dickey-Fuller Results   \n",
       "=====================================\n",
       "Test Statistic                 -2.307\n",
       "P-value                         0.170\n",
       "Lags                                7\n",
       "-------------------------------------\n",
       "\n",
       "Trend: Constant\n",
       "Critical Values: -3.44 (1%), -2.87 (5%), -2.57 (10%)\n",
       "Null Hypothesis: The process contains a unit root.\n",
       "Alternative Hypothesis: The process is weakly stationary.\n",
       "\"\"\""
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from arch.unitroot import ADF\n",
    "ADF(stock['sz'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### KPSS stationarity test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "G:\\anaconda\\lib\\site-packages\\statsmodels\\tsa\\stattools.py:1685: InterpolationWarning: p-value is smaller than the indicated p-value\n",
      "  warn(\"p-value is smaller than the indicated p-value\", InterpolationWarning)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "(0.8735590412113318,\n",
       " 0.01,\n",
       " 12,\n",
       " {'10%': 0.347, '5%': 0.463, '2.5%': 0.574, '1%': 0.739})"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ts.kpss(stock['sz'], nlags='auto')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>KPSS Stationarity Test Results</caption>\n",
       "<tr>\n",
       "  <td>Test Statistic</td>     <td>0.874</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>P-value</td>            <td>0.005</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Lags</td>                  <td>12</td>\n",
       "</tr>\n",
       "</table><br/><br/>Trend: Constant<br/>Critical Values: 0.74 (1%), 0.46 (5%), 0.35 (10%)<br/>Null Hypothesis: The process is weakly stationary.<br/>Alternative Hypothesis: The process contains a unit root."
      ],
      "text/plain": [
       "<class 'arch.unitroot.unitroot.KPSS'>\n",
       "\"\"\"\n",
       "    KPSS Stationarity Test Results   \n",
       "=====================================\n",
       "Test Statistic                  0.874\n",
       "P-value                         0.005\n",
       "Lags                               12\n",
       "-------------------------------------\n",
       "\n",
       "Trend: Constant\n",
       "Critical Values: 0.74 (1%), 0.46 (5%), 0.35 (10%)\n",
       "Null Hypothesis: The process is weakly stationary.\n",
       "Alternative Hypothesis: The process contains a unit root.\n",
       "\"\"\""
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from arch.unitroot import KPSS\n",
    "KPSS(stock['sz'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### DFGLS test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>Dickey-Fuller GLS Results</caption>\n",
       "<tr>\n",
       "  <td>Test Statistic</td>    <td>-0.563</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>P-value</td>            <td>0.492</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Lags</td>                   <td>7</td>\n",
       "</tr>\n",
       "</table><br/><br/>Trend: Constant<br/>Critical Values: -2.61 (1%), -1.99 (5%), -1.67 (10%)<br/>Null Hypothesis: The process contains a unit root.<br/>Alternative Hypothesis: The process is weakly stationary."
      ],
      "text/plain": [
       "<class 'arch.unitroot.unitroot.DFGLS'>\n",
       "\"\"\"\n",
       "      Dickey-Fuller GLS Results      \n",
       "=====================================\n",
       "Test Statistic                 -0.563\n",
       "P-value                         0.492\n",
       "Lags                                7\n",
       "-------------------------------------\n",
       "\n",
       "Trend: Constant\n",
       "Critical Values: -2.61 (1%), -1.99 (5%), -1.67 (10%)\n",
       "Null Hypothesis: The process contains a unit root.\n",
       "Alternative Hypothesis: The process is weakly stationary.\n",
       "\"\"\""
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from arch.unitroot import DFGLS\n",
    "DFGLS(stock['sz'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Phillips-Perron (PP) test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table class=\"simpletable\">\n",
       "<caption>Phillips-Perron Test (Z-tau)</caption>\n",
       "<tr>\n",
       "  <td>Test Statistic</td>    <td>-2.108</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>P-value</td>            <td>0.241</td>\n",
       "</tr>\n",
       "<tr>\n",
       "  <td>Lags</td>                  <td>18</td>\n",
       "</tr>\n",
       "</table><br/><br/>Trend: Constant<br/>Critical Values: -3.44 (1%), -2.87 (5%), -2.57 (10%)<br/>Null Hypothesis: The process contains a unit root.<br/>Alternative Hypothesis: The process is weakly stationary."
      ],
      "text/plain": [
       "<class 'arch.unitroot.unitroot.PhillipsPerron'>\n",
       "\"\"\"\n",
       "     Phillips-Perron Test (Z-tau)    \n",
       "=====================================\n",
       "Test Statistic                 -2.108\n",
       "P-value                         0.241\n",
       "Lags                               18\n",
       "-------------------------------------\n",
       "\n",
       "Trend: Constant\n",
       "Critical Values: -3.44 (1%), -2.87 (5%), -2.57 (10%)\n",
       "Null Hypothesis: The process contains a unit root.\n",
       "Alternative Hypothesis: The process is weakly stationary.\n",
       "\"\"\""
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from arch.unitroot import PhillipsPerron\n",
    "PhillipsPerron(stock['sz'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more unit root tests, see [Unit Root Testing](https://arch.readthedocs.io/en/latest/unitroot/unitroot.html)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}