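East China University of Science and Technology, "Python and Financial Computing"

Empirical Test of the Single-Factor Asset Pricing Model (Single Asset)

Jiang Zhiqiang (蒋志强) 2022-03-17 18:00-21:00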
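
For reference, the regression estimated in this notebook can be written as below. This is a sketch that assumes the standard excess-return form of the single-factor (market) model. The symbols $\alpha_i$, $\beta_i$ and $\varepsilon_{i,t}$ are notation introduced here, while $r_i$, $r_m$ and $r_f$ match the axis labels of the notebook's scatter plot.

```latex
r_{i,t} - r_{f,t} = \alpha_i + \beta_i \,\bigl(r_{m,t} - r_{f,t}\bigr) + \varepsilon_{i,t},
\qquad H_0:\ \alpha_i = 0
```

Under this reading, the intercept of the OLS fit estimates $\alpha_i$ and the slope estimates $\beta_i$.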

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
In [2]:
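# Read three columns by position (date, closing price, daily risk-free rate) from the GB2312-encoded CSV
stock_data = pd.read_csv('1EShowData_data_stock_daily_price_2001-2018.csv', encoding='GB2312', usecols=[1, 5, 10])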
stock_data
Out[2]:
日期_Date 收盘价_Clpr 日无风险收益率_DRfRet
0 2001/8/27 35.55 0.000054
1 2001/8/28 36.86 0.000054
2 2001/8/29 36.38 0.000054
3 2001/8/30 37.10 0.000054
4 2001/8/31 37.01 0.000054
... ... ... ...
4232 2018/12/24 568.00 0.000088
4233 2018/12/25 565.79 0.000090
4234 2018/12/26 560.08 0.000091
4235 2018/12/27 563.00 0.000091
4236 2018/12/28 590.01 0.000092

4237 rows × 3 columns

In [3]:
stock_data.columns = ['date', 'close', 'rfreturn']
stock_data
Out[3]:
date close rfreturn
0 2001/8/27 35.55 0.000054
1 2001/8/28 36.86 0.000054
2 2001/8/29 36.38 0.000054
3 2001/8/30 37.10 0.000054
4 2001/8/31 37.01 0.000054
... ... ... ...
4232 2018/12/24 568.00 0.000088
4233 2018/12/25 565.79 0.000090
4234 2018/12/26 560.08 0.000091
4235 2018/12/27 563.00 0.000091
4236 2018/12/28 590.01 0.000092

4237 rows × 3 columns

In [4]:
# Drop rows that contain missing values (4237 -> 4135 rows)
stock_data.dropna(inplace=True)
stock_data
Out[4]:
date close rfreturn
0 2001/8/27 35.55 0.000054
1 2001/8/28 36.86 0.000054
2 2001/8/29 36.38 0.000054
3 2001/8/30 37.10 0.000054
4 2001/8/31 37.01 0.000054
... ... ... ...
4232 2018/12/24 568.00 0.000088
4233 2018/12/25 565.79 0.000090
4234 2018/12/26 560.08 0.000091
4235 2018/12/27 563.00 0.000091
4236 2018/12/28 590.01 0.000092

4135 rows × 3 columns
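
Before dropping rows it can help to see where the missing values are. The following is a minimal sketch on a small hypothetical frame (the values and the NaN positions are made up for illustration and are not taken from the course CSV). It counts missing entries per column and reports how many rows `dropna` removes.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the same column names as the notebook; NaN positions are invented.
df = pd.DataFrame({
    'date': ['2001/8/27', '2001/8/28', '2001/8/29', '2001/8/30'],
    'close': [35.55, np.nan, 36.38, 37.10],
    'rfreturn': [0.000054, 0.000054, np.nan, 0.000054],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# dropna() without inplace=True returns a new frame and leaves df untouched.
cleaned = df.dropna()
rows_dropped = len(df) - len(cleaned)

print(missing_per_column)
print('rows dropped:', rows_dropped)
```

Applied to `stock_data`, the same two calls would show which column accounts for the 102 rows removed in the cell above.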

In [5]:
# Daily log return ln(P_t) - ln(P_{t-1}); the first row has no previous price and becomes NaN
stock_data['return'] = np.log(stock_data['close']) - np.log(stock_data['close'].shift(periods=1))
stock_data
Out[5]:
date close rfreturn return
0 2001/8/27 35.55 0.000054 NaN
1 2001/8/28 36.86 0.000054 0.036187
2 2001/8/29 36.38 0.000054 -0.013108
3 2001/8/30 37.10 0.000054 0.019598
4 2001/8/31 37.01 0.000054 -0.002429
... ... ... ... ...
4232 2018/12/24 568.00 0.000088 0.001039
4233 2018/12/25 565.79 0.000090 -0.003898
4234 2018/12/26 560.08 0.000091 -0.010143
4235 2018/12/27 563.00 0.000091 0.005200
4236 2018/12/28 590.01 0.000092 0.046860

4135 rows × 4 columns
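
The log-return calculation above can be checked on a few prices. This sketch reuses the first five closing prices shown in the output, and compares the notebook's expression with two equivalent formulations and with the simple return. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# First five closing prices from the output above.
close = pd.Series([35.55, 36.86, 36.38, 37.10, 37.01])

# Expression used in the notebook: ln(P_t) - ln(P_{t-1}).
log_ret = np.log(close) - np.log(close.shift(periods=1))

# Equivalent formulations: difference of logs, and log of the price ratio.
log_ret_diff = np.log(close).diff()
log_ret_ratio = np.log(close / close.shift(1))

# Simple return P_t / P_{t-1} - 1, related to the log return by ln(1 + simple).
simple_ret = close.pct_change()

print(pd.DataFrame({'log': log_ret, 'simple': simple_ret}))
```

For small daily moves the two return definitions are close, but they diverge as the move gets larger, which matters when a threshold such as 0.1 is applied to log returns.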

In [6]:
stock_data.dropna(inplace=True)
stock_data
Out[6]:
date close rfreturn return
1 2001/8/28 36.86 0.000054 0.036187
2 2001/8/29 36.38 0.000054 -0.013108
3 2001/8/30 37.10 0.000054 0.019598
4 2001/8/31 37.01 0.000054 -0.002429
5 2001/9/3 36.99 0.000054 -0.000541
... ... ... ... ...
4232 2018/12/24 568.00 0.000088 0.001039
4233 2018/12/25 565.79 0.000090 -0.003898
4234 2018/12/26 560.08 0.000091 -0.010143
4235 2018/12/27 563.00 0.000091 0.005200
4236 2018/12/28 590.01 0.000092 0.046860

4134 rows × 4 columns

In [7]:
stock_data['return'].plot()
Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbc711fe2d0>
In [8]:
# Keep only days whose log return lies in [-0.1, 0.1] (4134 -> 4125 rows)
ind = (stock_data['return'] >= -0.1) & (stock_data['return'] <= 0.1)
stock_data = stock_data.loc[ind, :]
stock_data['return'].plot()
Out[8]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fbc71132250>
In [9]:
stock_data
Out[9]:
date close rfreturn return
1 2001/8/28 36.86 0.000054 0.036187
2 2001/8/29 36.38 0.000054 -0.013108
3 2001/8/30 37.10 0.000054 0.019598
4 2001/8/31 37.01 0.000054 -0.002429
5 2001/9/3 36.99 0.000054 -0.000541
... ... ... ... ...
4232 2018/12/24 568.00 0.000088 0.001039
4233 2018/12/25 565.79 0.000090 -0.003898
4234 2018/12/26 560.08 0.000091 -0.010143
4235 2018/12/27 563.00 0.000091 0.005200
4236 2018/12/28 590.01 0.000092 0.046860

4125 rows × 4 columns
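
The notebook does not state why returns outside [-0.1, 0.1] are removed. One plausible reading is that such values are treated as outliers relative to a daily price limit, but that is an assumption. The sketch below uses hypothetical returns to show the same filter written with `Series.between`, and keeps the excluded rows so they can be inspected.

```python
import pandas as pd

# Hypothetical returns; two of them fall outside the [-0.1, 0.1] band.
df = pd.DataFrame({'return': [0.036, -0.013, 0.25, -0.002, -0.18, 0.10]})

# Mask written as in the notebook.
ind = (df['return'] >= -0.1) & (df['return'] <= 0.1)

# Same mask with Series.between, which includes both endpoints by default.
ind_between = df['return'].between(-0.1, 0.1)

# .copy() gives an independent frame, so later column assignments on it
# do not raise SettingWithCopyWarning.
filtered = df.loc[ind_between].copy()
dropped = df.loc[~ind_between]

print(len(df), len(filtered), len(dropped))
```

Looking at `dropped` before discarding it is a cheap way to confirm that the excluded days are data problems or special events rather than ordinary observations.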

In [10]:
index_data = pd.read_csv('1EShowData_data_Index_daily_price_2001-2018.csv', encoding = 'GB2312', usecols=[1, 5])
index_data
Out[10]:
交易日期_TrdDt 收盘价(元/点)_ClPr
0 2005/1/4 982.79
1 2005/1/5 992.56
2 2005/1/6 983.17
3 2005/1/7 983.96
4 2005/1/10 993.88
... ... ...
3397 2018/12/24 3038.20
3398 2018/12/25 3017.28
3399 2018/12/26 3002.03
3400 2018/12/27 2990.51
3401 2018/12/28 3010.65

3402 rows × 2 columns

In [11]:
index_data.columns = ['date', 'close']
index_data
Out[11]:
date close
0 2005/1/4 982.79
1 2005/1/5 992.56
2 2005/1/6 983.17
3 2005/1/7 983.96
4 2005/1/10 993.88
... ... ...
3397 2018/12/24 3038.20
3398 2018/12/25 3017.28
3399 2018/12/26 3002.03
3400 2018/12/27 2990.51
3401 2018/12/28 3010.65

3402 rows × 2 columns

In [12]:
index_data.dropna(inplace=True)
index_data
Out[12]:
date close
0 2005/1/4 982.79
1 2005/1/5 992.56
2 2005/1/6 983.17
3 2005/1/7 983.96
4 2005/1/10 993.88
... ... ...
3397 2018/12/24 3038.20
3398 2018/12/25 3017.28
3399 2018/12/26 3002.03
3400 2018/12/27 2990.51
3401 2018/12/28 3010.65

3402 rows × 2 columns

In [13]:
index_data['return'] = np.log(index_data['close']) - np.log(index_data['close'].shift(periods=1))
index_data
Out[13]:
date close return
0 2005/1/4 982.79 NaN
1 2005/1/5 992.56 0.009892
2 2005/1/6 983.17 -0.009505
3 2005/1/7 983.96 0.000803
4 2005/1/10 993.88 0.010031
... ... ... ...
3397 2018/12/24 3038.20 0.002901
3398 2018/12/25 3017.28 -0.006909
3399 2018/12/26 3002.03 -0.005067
3400 2018/12/27 2990.51 -0.003845
3401 2018/12/28 3010.65 0.006712

3402 rows × 3 columns

In [14]:
index_data.dropna(inplace=True)
index_data
Out[14]:
date close return
1 2005/1/5 992.56 0.009892
2 2005/1/6 983.17 -0.009505
3 2005/1/7 983.96 0.000803
4 2005/1/10 993.88 0.010031
5 2005/1/11 997.13 0.003265
... ... ... ...
3397 2018/12/24 3038.20 0.002901
3398 2018/12/25 3017.28 -0.006909
3399 2018/12/26 3002.03 -0.005067
3400 2018/12/27 2990.51 -0.003845
3401 2018/12/28 3010.65 0.006712

3401 rows × 3 columns

In [15]:
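# Inner join on date keeps only days present in both series;
# the shared 'return' column gets the default _x / _y suffixes
merge_data = pd.merge(left=stock_data[['date', 'return', 'rfreturn']],
                      right=index_data[['date', 'return']],
                      on='date',
                      how='inner')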
merge_data
Out[15]:
date return_x rfreturn return_y
0 2005/1/5 0.021442 0.000073 0.009892
1 2005/1/6 -0.017335 0.000071 -0.009505
2 2005/1/7 0.008163 0.000071 0.000803
3 2005/1/10 0.035932 0.000071 0.010031
4 2005/1/11 0.001306 0.000071 0.003265
... ... ... ... ...
3321 2018/12/24 0.001039 0.000088 0.002901
3322 2018/12/25 -0.003898 0.000090 -0.006909
3323 2018/12/26 -0.010143 0.000091 -0.005067
3324 2018/12/27 0.005200 0.000091 -0.003845
3325 2018/12/28 0.046860 0.000092 0.006712

3326 rows × 4 columns
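
As an alternative to renaming the `_x` / `_y` columns afterwards, `pd.merge` accepts a `suffixes` argument. The sketch below uses small hypothetical frames (dates and values are illustrative). It also uses `validate` to check that each date appears once on each side, and an outer merge with `indicator=True` to count the dates that an inner join would discard.

```python
import pandas as pd

# Hypothetical stock and index frames with the notebook's column names.
stock = pd.DataFrame({
    'date': ['2005/1/4', '2005/1/5', '2005/1/6', '2005/1/7'],
    'return': [0.010, 0.021, -0.017, 0.008],
    'rfreturn': [0.000073, 0.000073, 0.000071, 0.000071],
})
index = pd.DataFrame({
    'date': ['2005/1/5', '2005/1/6', '2005/1/10'],
    'return': [0.0099, -0.0095, 0.0100],
})

# suffixes names the overlapping 'return' columns directly;
# validate raises an error if a date is duplicated on either side.
merged = pd.merge(stock, index, on='date', how='inner',
                  suffixes=('_stk', '_ind'), validate='one_to_one')

# An outer merge with indicator=True adds a '_merge' column that marks
# rows as 'both', 'left_only' or 'right_only'.
outer = pd.merge(stock, index, on='date', how='outer',
                 suffixes=('_stk', '_ind'), indicator=True)

print(merged)
print(outer['_merge'].value_counts())
```

Because the merge key here is a plain string, both files must format dates identically for rows to match, as they do in the two outputs above.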

In [16]:
merge_data.columns = ['date', 'return_stk', 'rfreturn', 'return_ind']
merge_data
Out[16]:
date return_stk rfreturn return_ind
0 2005/1/5 0.021442 0.000073 0.009892
1 2005/1/6 -0.017335 0.000071 -0.009505
2 2005/1/7 0.008163 0.000071 0.000803
3 2005/1/10 0.035932 0.000071 0.010031
4 2005/1/11 0.001306 0.000071 0.003265
... ... ... ... ...
3321 2018/12/24 0.001039 0.000088 0.002901
3322 2018/12/25 -0.003898 0.000090 -0.006909
3323 2018/12/26 -0.010143 0.000091 -0.005067
3324 2018/12/27 0.005200 0.000091 -0.003845
3325 2018/12/28 0.046860 0.000092 0.006712

3326 rows × 4 columns

In [17]:
stk_ret = merge_data['return_stk'].values
rf_ret = merge_data['rfreturn'].values
ind_ret = merge_data['return_ind'].values
In [19]:
# Scatter plot of stock excess returns against market excess returns
plt.plot(ind_ret - rf_ret, stk_ret - rf_ret, 'o', ms=5, mfc='w', lw=2)
plt.xlabel(r'$r_m - r_f$', fontsize = 20)
plt.ylabel(r'$r_i - r_f$', fontsize = 20)
Out[19]:
Text(0, 0.5, '$r_i - r_f$')
In [20]:
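# add_constant prepends a column of ones, so the fit includes an intercept (const);
# the x1 coefficient is the slope on the market excess return
x = sm.add_constant(ind_ret-rf_ret)
y = stk_ret - rf_ret
model = sm.OLS(y, x)
results = model.fit()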
print(results.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.248
Model:                            OLS   Adj. R-squared:                  0.248
Method:                 Least Squares   F-statistic:                     1097.
Date:                Thu, 17 Mar 2022   Prob (F-statistic):          3.53e-208
Time:                        18:44:08   Log-Likelihood:                 8498.5
No. Observations:                3326   AIC:                        -1.699e+04
Df Residuals:                    3324   BIC:                        -1.698e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0011      0.000      3.391      0.001       0.000       0.002
x1             0.6159      0.019     33.125      0.000       0.579       0.652
==============================================================================
Omnibus:                      437.950   Durbin-Watson:                   1.838
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             1855.618
Skew:                           0.583   Prob(JB):                         0.00
Kurtosis:                       6.468   Cond. No.                         57.0
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.