{ "cells": [ { "cell_type": "markdown", "id": "59bd68c1", "metadata": {}, "source": [ "# Custom Factors\n", "\n", "The previous chapter ([lecture/Factors.ipynb](https://github.com/tejtw/TQuant-Lab/blob/main/lecture/Factors.ipynb)) introduced what factors are and how to use them, and TQuant Lab ships with many built-in factors. Factor research keeps advancing, though, and new price-volume factors appear regularly; you may also have factors specific to your own strategy. This chapter therefore shows how to write a custom factor and use it in TQuant Lab.\n", "\n", "Conceptually, a custom factor works the same way as a built-in factor: both take _inputs_, _window_length_, and _mask_ as arguments, and both produce a _Factor_ object.\n", "\n", "For example, to compute a rolling standard deviation ([standard deviation](https://zh.wikipedia.org/zh-tw/%E6%A8%99%E6%BA%96%E5%B7%AE)) for each stock on each day, we can subclass `zipline.pipeline.CustomFactor` and implement its `compute` method.\n", "\n", "### _class_ zipline.pipeline.CustomFactor\n", "\n", "#### Parameters:\n", "* inputs: _iterable_, optional\n", " \n", " The input data.\n", " \n", "* outputs: _iterable[str]_, optional\n", " \n", " The names of the factor's outputs.\n", "\n", "* window_length: _int_, optional\n", " \n", " The lookback window of the input data, in trading days.\n", " \n", "* mask: _zipline.pipeline.Filter_, optional\n", " \n", " Determines which assets the factor is computed for.\n", "\n", "#### def compute(self, today, assets, out, *inputs)\n", "\n", "- today: a `pandas.Timestamp` giving the date for which `compute` is being called.\n", "- assets: a NumPy array of length N holding the sids (assets).\n", "- `*inputs`: M x N NumPy arrays, where M is `window_length` and N is the number of assets; more than one input can be given.\n", "- out: a NumPy array of length N. Write the CustomFactor results for the current day into `out`." ] }, { "cell_type": "markdown", "id": "a5d69b40", "metadata": {}, "source": [ "## Import Price-Volume Data and Required Modules" ] }, { "cell_type": "code", "execution_count": 1, "id": "9b1e34cb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Merging daily equity files:\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[2023-10-25 08:14:55.609378] INFO: zipline.data.bundles.core: Ingesting tquant.\n" ] } ], "source": [ "import os\n", "import pandas as pd\n", "import numpy as np\n", "import tejapi\n", "import warnings\n", "warnings.filterwarnings('ignore')\n", "\n", "os.environ['TEJAPI_BASE'] = 'https://api.tej.com.tw'\n", "os.environ['TEJAPI_KEY'] = 'YOUR KEY'\n", "\n", "os.environ['mdate'] = '20080401 20230702'\n", "os.environ['ticker'] = '2330 2409'\n", "\n", "from zipline.pipeline import Pipeline, CustomFactor\n", "from zipline.TQresearch.tej_pipeline import run_pipeline\n", "from zipline.pipeline.data import TWEquityPricing\n", "from zipline.pipeline.filters import StaticAssets, StaticSids\n", "from zipline.api import sid, symbol\n", "\n", "# ingest stock data\n", "!zipline ingest -b tquant" ] }, { "cell_type": "markdown", "id": "49e24285", "metadata": {}, "source": [ "## Build a Factor That Computes the Standard Deviation\n", "\n", "In this example we use `np.nanstd` to compute the standard deviation of the inputs. The input data and the lookback window are determined by the __inputs__ and __window_length__ passed to `StdDev` inside `make_pipeline()`. To compute the 7-day standard deviation of closing prices for TSMC (2330) and AUO (2409), set:\n", "\n", "1. inputs = [TWEquityPricing.close]; TWEquityPricing holds the price-volume data of every stock in the bundle.\n", "2. window_length = 7\n", "\n", "Next, `run_pipeline` runs the `Pipeline`, computing the factor day by day over the backtest period and returning a DataFrame. The DataFrame has a MultiIndex of date and asset, with one 7-day closing-price standard deviation for each asset on each day.\n", "\n", "### zipline.TQresearch.tej_pipeline.run_pipeline\n", "\n", "Runs a Pipeline and returns the result as a table.\n", "\n", "#### Parameters:\n", "* pipeline: _zipline.pipeline.Pipeline_\n", " The pipeline to run.\n", "* start_date: _pd.Timestamp_\n", " The first date on which the pipeline is computed. It must fall within the bundle's date range.\n", "* end_date: _pd.Timestamp_\n", " The last date on which the pipeline is computed. It must fall within the bundle's date range.\n", " \n", "#### Returns\n", " _pd.DataFrame_, the result of running the Pipeline." ] }, { "cell_type": "code", "execution_count": 2, "id": "6bf90d88", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"|  |  | std_dev |\n",
"|---|---|---|\n",
"| 2013-01-03 00:00:00+00:00 | Equity(0 [2330]) | 1.375737 |\n",
"|  | Equity(1 [2409]) | 0.350946 |\n",
"| 2013-01-04 00:00:00+00:00 | Equity(0 [2330]) | 2.024644 |\n",
"|  | Equity(1 [2409]) | 0.410947 |\n",
"| 2013-01-07 00:00:00+00:00 | Equity(0 [2330]) | 2.287053 |\n",
"| ... | ... | ... |\n",
"| 2022-12-29 00:00:00+00:00 | Equity(1 [2409]) | 0.314772 |\n",
"| 2022-12-30 00:00:00+00:00 | Equity(0 [2330]) | 6.326975 |\n",
"|  | Equity(1 [2409]) | 0.277562 |\n",
"| 2023-01-03 00:00:00+00:00 | Equity(0 [2330]) | 6.689163 |\n",
"|  | Equity(1 [2409]) | 0.184888 |\n",
"\n",
"4904 rows × 1 columns
\n", "\n",
"|  |  | close_open_diff |\n",
"|---|---|---|\n",
"| 2013-01-03 00:00:00+00:00 | Equity(0 [2330]) | 0.100 |\n",
"|  | Equity(1 [2409]) | -0.040 |\n",
"| 2013-01-04 00:00:00+00:00 | Equity(0 [2330]) | 0.090 |\n",
"|  | Equity(1 [2409]) | -0.065 |\n",
"| 2013-01-07 00:00:00+00:00 | Equity(0 [2330]) | 0.200 |\n",
"| ... | ... | ... |\n",
"| 2022-12-29 00:00:00+00:00 | Equity(1 [2409]) | -0.060 |\n",
"| 2022-12-30 00:00:00+00:00 | Equity(0 [2330]) | -0.150 |\n",
"|  | Equity(1 [2409]) | -0.050 |\n",
"| 2023-01-03 00:00:00+00:00 | Equity(0 [2330]) | -1.250 |\n",
"|  | Equity(1 [2409]) | -0.055 |\n",
"\n",
"4904 rows × 1 columns
\n", "\n",
"|  |  | close_open_diff |\n",
"|---|---|---|\n",
"| 2013-01-03 00:00:00+00:00 | Equity(0 [2330]) | 1.540 |\n",
"|  | Equity(1 [2409]) | 0.520 |\n",
"| 2013-01-04 00:00:00+00:00 | Equity(0 [2330]) | 1.630 |\n",
"|  | Equity(1 [2409]) | 0.515 |\n",
"| 2013-01-07 00:00:00+00:00 | Equity(0 [2330]) | 1.600 |\n",
"| ... | ... | ... |\n",
"| 2022-12-29 00:00:00+00:00 | Equity(1 [2409]) | 0.375 |\n",
"| 2022-12-30 00:00:00+00:00 | Equity(0 [2330]) | 5.850 |\n",
"|  | Equity(1 [2409]) | 0.370 |\n",
"| 2023-01-03 00:00:00+00:00 | Equity(0 [2330]) | 6.100 |\n",
"|  | Equity(1 [2409]) | 0.360 |\n",
"\n",
"4904 rows × 1 columns
\n", "\n",
"|  |  | TenDaysLowest |\n",
"|---|---|---|\n",
"| 2013-03-18 00:00:00+00:00 | Equity(0 [2330]) | 102.00 |\n",
"|  | Equity(1 [2409]) | 12.70 |\n",
"| 2013-03-19 00:00:00+00:00 | Equity(0 [2330]) | 100.50 |\n",
"|  | Equity(1 [2409]) | 12.65 |\n",
"| 2013-03-20 00:00:00+00:00 | Equity(0 [2330]) | 100.00 |\n",
"| ... | ... | ... |\n",
"| 2022-12-29 00:00:00+00:00 | Equity(1 [2409]) | 14.65 |\n",
"| 2022-12-30 00:00:00+00:00 | Equity(0 [2330]) | 446.00 |\n",
"|  | Equity(1 [2409]) | 14.65 |\n",
"| 2023-01-03 00:00:00+00:00 | Equity(0 [2330]) | 446.00 |\n",
"|  | Equity(1 [2409]) | 14.65 |\n",
"\n",
"4814 rows × 1 columns
\n", "\n",
"|  | date | close |\n",
"|---|---|---|\n",
"| 66 | 2013-03-04 00:00:00+00:00 | 102.0 |\n",
"| 68 | 2013-03-05 00:00:00+00:00 | 104.0 |\n",
"| 70 | 2013-03-06 00:00:00+00:00 | 104.0 |\n",
"| 72 | 2013-03-07 00:00:00+00:00 | 103.0 |\n",
"| 74 | 2013-03-08 00:00:00+00:00 | 103.5 |\n",
"| 76 | 2013-03-11 00:00:00+00:00 | 102.0 |\n",
"| 78 | 2013-03-12 00:00:00+00:00 | 102.5 |\n",
"| 80 | 2013-03-13 00:00:00+00:00 | 104.5 |\n",
"| 82 | 2013-03-14 00:00:00+00:00 | 104.0 |\n",
"| 84 | 2013-03-15 00:00:00+00:00 | 103.0 |\n",
"| 86 | 2013-03-18 00:00:00+00:00 | 100.5 |\n",
"| 88 | 2013-03-19 00:00:00+00:00 | 100.0 |\n", "