{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "


\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Labeling and MetaLabeling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "In this chapter of the book AFML, De Prado introduces several novel techniques for labeling returns for the purposes of supervised machine learning. \n", "\n", "First he identifies the typical issues of fixed-time horizon labeling methods - primarily that it is easy to mislabel a return due to dynamic nature of volatility throughout a trading period.\n", "\n", "More importantly he addresses a major overlooked aspect of the financial literature. He emphasizes that every investment strategy makes use of stop-loss limits of some kind, whether those are enforced by a margin call, risk department or self-imposed. He highlights how unrealistic it is to test/implement/propagate a strategy that profits from positions that would have been stopped out. \n", "\n", "> That virtually no publication accounts for that when labeling observations tells you something about the current state of financial literature.\n", ">\n", "> -De Prado, \"Advances in Financial Machine Learning\", pg.44\n", "\n", "He also introduces a technique called metalabeling, which is used to augment a strategy by improving recall while also reducing the likelihood of overfitting." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:13.812868Z", "start_time": "2019-03-01T17:54:08.567716Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2019-03-01T10:54:08-07:00\n", "\n", "CPython 3.7.2\n", "IPython 6.5.0\n", "\n", "compiler : GCC 7.3.0\n", "system : Linux\n", "release : 4.19.11-041911-generic\n", "machine : x86_64\n", "processor : x86_64\n", "CPU cores : 12\n", "interpreter: 64bit\n", "\n", "pandas 0.23.4\n", "pandas_datareader 0.7.0\n", "dask 1.0.0\n", "numpy 1.15.4\n", "sklearn 0.20.2\n", "statsmodels 0.9.0\n", "scipy 1.1.0\n", "ffn (0, 3, 4)\n", "matplotlib 3.0.2\n", "seaborn 0.9.0\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark\n", "\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "# import standard libs\n", "from IPython.display import display\n", "from IPython.core.debugger import set_trace as bp\n", "from pathlib import PurePath, Path\n", "import sys\n", "import time\n", "from collections import OrderedDict as od\n", "import re\n", "import os\n", "import json\n", "\n", "# import python scientific stack\n", "import pandas as pd\n", "import pandas_datareader.data as web\n", "pd.set_option('display.max_rows', 100)\n", "from dask import dataframe as dd\n", "from dask.diagnostics import ProgressBar\n", "from multiprocessing import cpu_count\n", "pbar = ProgressBar()\n", "pbar.register()\n", "import numpy as np\n", "import scipy.stats as stats\n", "import statsmodels.api as sm\n", "from numba import jit\n", "import math\n", "import ffn\n", "\n", "# import visual tools\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "import matplotlib.gridspec as gridspec\n", "%matplotlib inline\n", "import seaborn as sns\n", "\n", "plt.style.use('seaborn-talk')\n", "plt.style.use('bmh')\n", "#plt.rcParams['font.family'] = 'DejaVu Sans Mono'\n", "plt.rcParams['font.size'] = 9.5\n", "plt.rcParams['font.weight'] = 'medium'\n", 
"plt.rcParams['figure.figsize'] = 10,7\n", "blue, green, red, purple, gold, teal = sns.color_palette('colorblind', 6)\n", "\n", "# import util libs\n", "from tqdm import tqdm, tqdm_notebook\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")\n", "import missingno as msno\n", "from src.utils.utils import *\n", "import src.features.bars as brs\n", "import src.features.snippets as snp\n", "\n", "RANDOM_STATE = 777\n", "\n", "pdir = get_relative_project_dir('Adv_Fin_ML_Exercises', partial=False)\n", "data_dir = pdir / 'data'\n", "\n", "print()\n", "%watermark -p pandas,pandas_datareader,dask,numpy,sklearn,statsmodels,scipy,ffn,matplotlib,seaborn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Code Snippets\n", "\n", "Below I reproduce all the relevant code snippets found in the book that are necessary to work through the excercises found at the end of chapter 3." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Symmetric CUSUM Filter [2.5.2.1]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:14.136573Z", "start_time": "2019-03-01T17:54:13.814830Z" } }, "outputs": [], "source": [ "def getTEvents(gRaw, h):\n", " tEvents, sPos, sNeg = [], 0, 0\n", " diff = np.log(gRaw).diff().dropna()\n", " for i in tqdm(diff.index[1:]):\n", " try:\n", " pos, neg = float(sPos+diff.loc[i]), float(sNeg+diff.loc[i])\n", " except Exception as e:\n", " print(e)\n", " print(sPos+diff.loc[i], type(sPos+diff.loc[i]))\n", " print(sNeg+diff.loc[i], type(sNeg+diff.loc[i]))\n", " break\n", " sPos, sNeg=max(0., pos), min(0., neg)\n", " if sNeg<-h:\n", " sNeg=0;tEvents.append(i)\n", " elif sPos>h:\n", " sPos=0;tEvents.append(i)\n", " return pd.DatetimeIndex(tEvents)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Daily Volatility Estimator [3.1]" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:14.430421Z", 
"start_time": "2019-03-01T17:54:14.138108Z" } }, "outputs": [], "source": [ "def getDailyVol(close,span0=100):\n", " # daily vol reindexed to close\n", " df0=close.index.searchsorted(close.index-pd.Timedelta(days=1))\n", " df0=df0[df0>0] \n", " df0=(pd.Series(close.index[df0-1], \n", " index=close.index[close.shape[0]-df0.shape[0]:])) \n", " try:\n", " df0=close.loc[df0.index]/close.loc[df0.values].values-1 # daily rets\n", " except Exception as e:\n", " print(f'error: {e}\\nplease confirm no duplicate indices')\n", " df0=df0.ewm(span=span0).std().rename('dailyVol')\n", " return df0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Triple-Barrier Labeling Method [3.2]" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:14.746120Z", "start_time": "2019-03-01T17:54:14.432530Z" } }, "outputs": [], "source": [ "def applyPtSlOnT1(close,events,ptSl,molecule):\n", " # apply stop loss/profit taking, if it takes place before t1 (end of event)\n", " events_=events.loc[molecule]\n", " out=events_[['t1']].copy(deep=True)\n", " if ptSl[0]>0: pt=ptSl[0]*events_['trgt']\n", " else: pt=pd.Series(index=events.index) # NaNs\n", " if ptSl[1]>0: sl=-ptSl[1]*events_['trgt']\n", " else: sl=pd.Series(index=events.index) # NaNs\n", " for loc,t1 in events_['t1'].fillna(close.index[-1]).iteritems():\n", " df0=close[loc:t1] # path prices\n", " df0=(df0/close[loc]-1)*events_.at[loc,'side'] # path returns\n", " out.loc[loc,'sl']=df0[df0pt[loc]].index.min() # earliest profit taking\n", " return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gettting Time of First Touch (getEvents) [3.3], [3.6]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:15.090397Z", "start_time": "2019-03-01T17:54:14.748256Z" } }, "outputs": [], "source": [ "def getEvents(close, tEvents, ptSl, trgt, minRet, numThreads, t1=False, side=None):\n", " #1) get 
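target\n", "    trgt=trgt.loc[tEvents]\n", "    trgt=trgt[trgt>minRet] # minRet\n", "    #2) get t1 (max holding period)\n", "    if t1 is False:t1=pd.Series(pd.NaT, index=tEvents)\n", "    #3) form events object, apply stop loss on t1\n", "    if side is None:side_,ptSl_=pd.Series(1.,index=trgt.index), [ptSl[0],ptSl[0]]\n", "    else: side_,ptSl_=side.loc[trgt.index],ptSl[:2]\n", "    events=(pd.concat({'t1':t1,'trgt':trgt,'side':side_}, axis=1)\n", "            .dropna(subset=['trgt']))\n", "    df0=mpPandasObj(func=applyPtSlOnT1,pdObj=('molecule',events.index),\n", "                    numThreads=numThreads,close=close,events=events,\n", "                    ptSl=ptSl_)\n", "    events['t1']=df0.dropna(how='all').min(axis=1) # pd.min ignores nan\n", "    if side is None:events=events.drop('side',axis=1)\n", "    return events" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding Vertical Barrier [3.4]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:15.400850Z", "start_time": "2019-03-01T17:54:15.092280Z" } }, "outputs": [], "source": [ "def addVerticalBarrier(tEvents, close, numDays=1):\n", "    t1=close.index.searchsorted(tEvents+pd.Timedelta(days=numDays))\n", "    t1=t1[t1<close.shape[0]]\n", "    t1=(pd.Series(close.index[t1],index=tEvents[:t1.shape[0]]))\n", "    return t1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Labeling for Side and Size [3.5], [3.7]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def getBins(events, close):\n", "    '''\n", "    Compute event's outcome (including side information, if provided).\n", "    events is a DataFrame where:\n", "    -events.index is event's starttime\n", "    -events['t1'] is event's endtime\n", "    -events['trgt'] is event's target\n", "    -events['side'] (optional) implies the algo's position side\n", "    Case 1: ('side' not in events): bin in (-1,1) <-label by price action\n", "    Case 2: ('side' in events): bin in (0,1) <-label by pnl (meta-labeling)\n", "    '''\n", "    #1) prices aligned with events\n", "    events_=events.dropna(subset=['t1'])\n", "    px=events_.index.union(events_['t1'].values).drop_duplicates()\n", "    px=close.reindex(px,method='bfill')\n", "    #2) create out object\n", "    out=pd.DataFrame(index=events_.index)\n", "    out['ret']=px.loc[events_['t1'].values].values/px.loc[events_.index]-1\n", "    if 'side' in events_:out['ret']*=events_['side'] # meta-labeling\n", "    out['bin']=np.sign(out['ret'])\n", "    if 'side' in events_:out.loc[out['ret']<=0,'bin']=0 # meta-labeling\n", "    return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dropping Under-Populated Labels [3.8]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def dropLabels(events, minPct=.05):\n", "    # apply weights, drop labels with insufficient examples\n", "    while True:\n", "        df0=events['bin'].value_counts(normalize=True)\n", "        if df0.min()>minPct or df0.shape[0]<3:break\n", "        print('dropped label: ', df0.argmin(),df0.min())\n", "        events=events[events['bin']!=df0.argmin()]\n", "    return events" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linear Partitions [20.4.1]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:16.614458Z", "start_time": "2019-03-01T17:54:16.306320Z" } }, "outputs": [], "source": [ "def linParts(numAtoms,numThreads):\n", "    # partition of atoms with a single loop\n", "    parts=np.linspace(0,numAtoms,min(numThreads,numAtoms)+1)\n", "    parts=np.ceil(parts).astype(int)\n", "    return parts" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:16.943498Z", "start_time": 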
"2019-03-01T17:54:16.616483Z" } }, "outputs": [], "source": [ "def nestedParts(numAtoms,numThreads,upperTriang=False):\n", " # partition of atoms with an inner loop\n", " parts,numThreads_=[0],min(numThreads,numAtoms)\n", " for num in range(numThreads_):\n", " part=1+4*(parts[-1]**2+parts[-1]+numAtoms*(numAtoms+1.)/numThreads_)\n", " part=(-1+part**.5)/2.\n", " parts.append(part)\n", " parts=np.round(parts).astype(int)\n", " if upperTriang: # the first rows are heaviest\n", " parts=np.cumsum(np.diff(parts)[::-1])\n", " parts=np.append(np.array([0]),parts)\n", " return parts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### multiprocessing snippet [20.7]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:17.274840Z", "start_time": "2019-03-01T17:54:16.945515Z" } }, "outputs": [], "source": [ "def mpPandasObj(func,pdObj,numThreads=24,mpBatches=1,linMols=True,**kargs):\n", " '''\n", " Parallelize jobs, return a dataframe or series\n", " + func: function to be parallelized. 
Returns a DataFrame\n", "    + pdObj[0]: Name of argument used to pass the molecule\n", "    + pdObj[1]: List of atoms that will be grouped into molecules\n", "    + kargs: any other argument needed by func\n", "    \n", "    Example: df1=mpPandasObj(func,('molecule',df0.index),24,**kargs)\n", "    '''\n", "    import pandas as pd\n", "    #if linMols:parts=linParts(len(argList[1]),numThreads*mpBatches)\n", "    #else:parts=nestedParts(len(argList[1]),numThreads*mpBatches)\n", "    if linMols:parts=linParts(len(pdObj[1]),numThreads*mpBatches)\n", "    else:parts=nestedParts(len(pdObj[1]),numThreads*mpBatches)\n", "    \n", "    jobs=[]\n", "    for i in range(1,len(parts)):\n", "        job={pdObj[0]:pdObj[1][parts[i-1]:parts[i]],'func':func}\n", "        job.update(kargs)\n", "        jobs.append(job)\n", "    if numThreads==1:out=processJobs_(jobs)\n", "    else: out=processJobs(jobs,numThreads=numThreads)\n", "    if isinstance(out[0],pd.DataFrame):df0=pd.DataFrame()\n", "    elif isinstance(out[0],pd.Series):df0=pd.Series()\n", "    else:return out\n", "    for i in out:df0=df0.append(i)\n", "    df0=df0.sort_index()\n", "    return df0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### single-thread execution for debugging [20.8]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:17.601019Z", "start_time": "2019-03-01T17:54:17.277000Z" } }, "outputs": [], "source": [ "def processJobs_(jobs):\n", "    # Run jobs sequentially, for debugging\n", "    out=[]\n", "    for job in jobs:\n", "        out_=expandCall(job)\n", "        out.append(out_)\n", "    return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example of async call to multiprocessing lib [20.9]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:17.934204Z", "start_time": "2019-03-01T17:54:17.602756Z" } }, "outputs": [], "source": [ "import multiprocessing as mp\n", "import datetime as dt\n", "\n", "#________________________________\n", "def 
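reportProgress(jobNum,numJobs,time0,task):\n", "    # Report progress as asynch jobs are completed\n", "    msg=[float(jobNum)/numJobs, (time.time()-time0)/60.]\n", "    msg.append(msg[1]*(1/msg[0]-1))\n", "    timeStamp=str(dt.datetime.fromtimestamp(time.time()))\n", "    msg=timeStamp+' '+str(round(msg[0]*100,2))+'% '+task+' done after '+ \\\n", "        str(round(msg[1],2))+' minutes. Remaining '+str(round(msg[2],2))+' minutes.'\n", "    if jobNum<numJobs:sys.stderr.write(msg+'\\r')\n", "    else:sys.stderr.write(msg+'\\n')\n", "    return\n", "#________________________________\n", "def processJobs(jobs,task=None,numThreads=24):\n", "    # Run in parallel.\n", "    # jobs must contain a 'func' callback, for expandCall\n", "    if task is None:task=jobs[0]['func'].__name__\n", "    pool=mp.Pool(processes=numThreads)\n", "    outputs,out,time0=pool.imap_unordered(expandCall,jobs),[],time.time()\n", "    # Process asynchronous output, report progress\n", "    for i,out_ in enumerate(outputs,1):\n", "        out.append(out_)\n", "        reportProgress(i,len(jobs),time0,task)\n", "    pool.close();pool.join() # this is needed to prevent memory leaks\n", "    return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unwrapping the Callback [20.10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def expandCall(kargs):\n", "    # Expand the arguments of a callback function, kargs['func']\n", "    func=kargs['func']\n", "    del kargs['func']\n", "    out=func(**kargs)\n", "    return out" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "DatetimeIndex: 941297 entries, 2009-09-28 09:30:00 to 2018-02-26 18:30:00\n", "Data columns (total 6 columns):\n", "price 941297 non-null float64\n", "bid 941297 non-null float64\n", "ask 941297 non-null float64\n", "size 941297 non-null float64\n", "v 941297 non-null float64\n", "dv 941297 non-null float64\n", "dtypes: float64(6)\n", "memory usage: 50.3 MB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "infp = PurePath(data_dir/'processed'/'IVE_dollarValue_resampled_1s.parquet')\n", "df = pd.read_parquet(infp)\n", "cprint(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [3.1] Form Dollar Bars" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:19.835164Z", "start_time": "2019-03-01T17:54:19.142594Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 941297/941297 [00:00<00:00, 2874489.73it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " price bid ask size \\\n", "2018-02-26 15:31:06 115.29 115.280000 115.290000 2022.000000 \n", "2018-02-26 15:40:15 115.41 115.400000 115.410000 723.000000 \n", "2018-02-26 15:49:42 115.20 115.176667 115.186667 4487.166667 \n", "2018-02-26 15:59:04 115.27 115.260000 115.270000 300.000000 \n", 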
"2018-02-26 16:16:14 115.30 114.720000 115.620000 778677.000000 \n", "\n", " v dv \n", "2018-02-26 15:31:06 2022.000000 2.331164e+05 \n", "2018-02-26 15:40:15 723.000000 8.344143e+04 \n", "2018-02-26 15:49:42 4487.166667 5.171190e+05 \n", "2018-02-26 15:59:04 300.000000 3.458100e+04 \n", "2018-02-26 16:16:14 778677.000000 8.978146e+07 \n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 30860 entries, 2009-09-28 09:53:49 to 2018-02-26 16:16:14\n", "Data columns (total 6 columns):\n", "price 30860 non-null float64\n", "bid 30860 non-null float64\n", "ask 30860 non-null float64\n", "size 30860 non-null float64\n", "v 30860 non-null float64\n", "dv 30860 non-null float64\n", "dtypes: float64(6)\n", "memory usage: 1.6 MB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "dbars = brs.dollar_bar_df(df, 'dv', 1_000_000).drop_duplicates().dropna()\n", "cprint(dbars)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (a) Run cusum filter with threshold equal to std dev of daily returns" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:20.241528Z", "start_time": "2019-03-01T17:54:19.837138Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " dailyVol\n", "2018-02-26 15:31:06 0.006852\n", "2018-02-26 15:40:15 0.006893\n", "2018-02-26 15:49:42 0.006889\n", "2018-02-26 15:59:04 0.006894\n", "2018-02-26 16:16:14 0.006902\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 30843 entries, 2009-09-29 10:03:18 to 2018-02-26 16:16:14\n", "Data columns (total 1 columns):\n", "dailyVol 30842 
non-null float64\n", "dtypes: float64(1)\n", "memory usage: 481.9 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "close = dbars.price.copy()\n", "dailyVol = getDailyVol(close)\n", "cprint(dailyVol.to_frame())" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:20.789871Z", "start_time": "2019-03-01T17:54:20.243279Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "f,ax=plt.subplots()\n", "dailyVol.plot(ax=ax)\n", "ax.axhline(dailyVol.mean(),ls='--',color='r')" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:23.002687Z", "start_time": "2019-03-01T17:54:20.791410Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 30858/30858 [00:01<00:00, 16495.86it/s]\n" ] }, { "data": { "text/plain": [ "DatetimeIndex(['2009-09-29 09:33:01', '2009-09-30 09:45:21',\n", " '2009-09-30 13:31:12', '2009-10-01 09:43:58',\n", " '2009-10-01 11:12:07', '2009-10-02 09:44:14',\n", " '2009-10-02 10:35:05', '2009-10-05 09:51:42',\n", " '2009-10-05 14:55:48', '2009-10-06 09:29:52',\n", " ...\n", " '2018-02-16 14:23:51', '2018-02-20 09:30:00',\n", " '2018-02-20 15:21:07', '2018-02-21 14:04:12',\n", " '2018-02-21 15:12:30', '2018-02-22 12:18:21',\n", " '2018-02-22 14:56:14', '2018-02-23 11:37:32',\n", " '2018-02-23 15:58:39', '2018-02-26 13:06:34'],\n", " dtype='datetime64[ns]', length=2278, freq=None)" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tEvents = getTEvents(close,h=dailyVol.mean())\n", "tEvents" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (b) Add vertical barrier" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:23.325936Z", "start_time": "2019-03-01T17:54:23.005620Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "2009-09-29 09:33:01 2009-09-30 09:45:21\n", "2009-09-30 09:45:21 2009-10-01 10:00:48\n", "2009-09-30 13:31:12 2009-10-01 13:33:25\n", "2009-10-01 09:43:58 2009-10-02 09:44:14\n", "2009-10-01 11:12:07 2009-10-02 11:50:21\n", "2009-10-02 09:44:14 2009-10-05 09:51:42\n", "2009-10-02 10:35:05 2009-10-05 09:51:42\n", "2009-10-05 09:51:42 2009-10-06 10:16:02\n", "2009-10-05 14:55:48 2009-10-06 
15:35:49\n", "2009-10-06 09:29:52 2009-10-07 09:47:16\n", "2009-10-06 11:32:02 2009-10-07 11:48:22\n", "2009-10-06 14:07:37 2009-10-07 14:22:36\n", "2009-10-08 09:29:51 2009-10-09 09:31:12\n", "2009-10-12 09:31:02 2009-10-13 09:47:54\n", "2009-10-13 10:52:10 2009-10-14 11:12:03\n", "2009-10-14 09:29:52 2009-10-15 09:37:24\n", "2009-10-14 15:30:48 2009-10-15 15:57:25\n", "2009-10-16 09:55:03 2009-10-19 09:37:41\n", "2009-10-16 15:40:15 2009-10-19 09:37:41\n", "2009-10-19 11:39:38 2009-10-20 11:50:28\n", "2009-10-20 11:50:28 2009-10-21 12:44:38\n", "2009-10-21 10:11:57 2009-10-22 10:47:06\n", "2009-10-21 15:32:09 2009-10-22 15:49:30\n", "2009-10-22 09:55:51 2009-10-23 10:03:53\n", "2009-10-22 14:33:52 2009-10-23 14:49:39\n", "2009-10-23 10:57:52 2009-10-26 09:52:17\n", "2009-10-26 09:52:17 2009-10-27 09:57:46\n", "2009-10-26 11:32:02 2009-10-27 12:04:42\n", "2009-10-26 11:59:14 2009-10-27 12:04:42\n", "2009-10-27 13:37:35 2009-10-28 14:04:15\n", "2009-10-28 10:00:16 2009-10-29 10:00:59\n", "2009-10-28 14:41:52 2009-10-29 15:00:53\n", "2009-10-29 09:32:01 2009-10-30 09:43:02\n", "2009-10-29 13:40:22 2009-10-30 13:54:51\n", "2009-10-30 09:58:07 2009-11-02 09:51:15\n", "2009-10-30 11:51:20 2009-11-02 09:51:15\n", "2009-10-30 12:57:50 2009-11-02 09:51:15\n", "2009-10-30 15:06:13 2009-11-02 09:51:15\n", "2009-10-30 15:44:12 2009-11-02 09:51:15\n", "2009-11-02 10:17:36 2009-11-03 10:42:33\n", "2009-11-02 12:23:50 2009-11-03 12:24:26\n", "2009-11-02 12:58:06 2009-11-03 13:10:26\n", "2009-11-02 14:07:16 2009-11-03 14:22:31\n", "2009-11-02 14:55:04 2009-11-03 15:18:16\n", "2009-11-03 14:22:31 2009-11-04 14:41:42\n", "2009-11-04 09:34:15 2009-11-05 09:59:36\n", "2009-11-04 15:46:56 2009-11-05 16:09:46\n", "2009-11-05 09:59:36 2009-11-06 10:06:33\n", "2009-11-05 16:09:46 2009-11-09 09:54:17\n", "2009-11-09 09:54:17 2009-11-10 10:09:52\n", " ... 
\n", "2018-02-06 09:36:34 2018-02-07 09:43:03\n", "2018-02-06 09:58:38 2018-02-07 10:04:28\n", "2018-02-06 10:18:08 2018-02-07 10:22:20\n", "2018-02-06 10:38:41 2018-02-07 10:39:35\n", "2018-02-06 11:35:33 2018-02-07 11:46:44\n", "2018-02-06 11:53:57 2018-02-07 11:57:50\n", "2018-02-06 12:32:24 2018-02-07 12:42:28\n", "2018-02-06 13:04:03 2018-02-07 13:08:44\n", "2018-02-06 14:19:57 2018-02-07 14:20:37\n", "2018-02-06 14:49:56 2018-02-07 14:53:22\n", "2018-02-06 15:05:41 2018-02-07 15:11:44\n", "2018-02-06 15:42:53 2018-02-07 15:47:02\n", "2018-02-07 09:43:03 2018-02-08 09:57:38\n", "2018-02-07 11:15:27 2018-02-08 11:18:31\n", "2018-02-07 13:16:25 2018-02-08 13:17:26\n", "2018-02-07 15:28:09 2018-02-08 15:33:11\n", "2018-02-07 15:58:58 2018-02-08 15:59:48\n", "2018-02-08 10:33:27 2018-02-09 10:41:40\n", "2018-02-08 12:29:28 2018-02-09 12:40:46\n", "2018-02-08 13:45:14 2018-02-09 13:52:34\n", "2018-02-08 15:07:57 2018-02-09 15:09:17\n", "2018-02-08 15:45:50 2018-02-09 15:47:50\n", "2018-02-09 09:30:00 2018-02-12 09:30:00\n", "2018-02-09 10:41:40 2018-02-12 09:30:00\n", "2018-02-09 12:05:08 2018-02-12 09:30:00\n", "2018-02-09 13:27:21 2018-02-12 09:30:00\n", "2018-02-09 13:52:34 2018-02-12 09:30:00\n", "2018-02-09 14:11:06 2018-02-12 09:30:00\n", "2018-02-09 15:05:41 2018-02-12 09:30:00\n", "2018-02-09 15:29:15 2018-02-12 09:30:00\n", "2018-02-09 15:47:50 2018-02-12 09:30:00\n", "2018-02-12 09:30:00 2018-02-13 09:30:00\n", "2018-02-12 10:25:02 2018-02-13 10:36:48\n", "2018-02-12 12:12:51 2018-02-13 12:34:24\n", "2018-02-13 09:30:00 2018-02-14 09:30:00\n", "2018-02-13 13:43:37 2018-02-14 13:53:59\n", "2018-02-14 10:30:48 2018-02-15 10:42:27\n", "2018-02-14 13:36:02 2018-02-15 13:42:09\n", "2018-02-15 09:31:56 2018-02-16 09:42:36\n", "2018-02-15 14:05:41 2018-02-16 14:15:08\n", "2018-02-16 11:11:50 2018-02-20 09:30:00\n", "2018-02-16 14:23:51 2018-02-20 09:30:00\n", "2018-02-20 09:30:00 2018-02-21 09:34:28\n", "2018-02-20 15:21:07 2018-02-21 15:22:14\n", "2018-02-21 
14:04:12 2018-02-22 14:20:25\n", "2018-02-21 15:12:30 2018-02-22 15:16:50\n", "2018-02-22 12:18:21 2018-02-23 12:30:16\n", "2018-02-22 14:56:14 2018-02-23 15:02:21\n", "2018-02-23 11:37:32 2018-02-26 09:30:00\n", "2018-02-23 15:58:39 2018-02-26 09:30:00\n", "Length: 2277, dtype: datetime64[ns]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t1 = addVerticalBarrier(tEvents, close, numDays=1)\n", "t1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (c) Apply triple-barrier method where `ptSl = [1,1]` and `t1` is the series created in `1.b`" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:24.141594Z", "start_time": "2019-03-01T17:54:23.327391Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2019-03-01 10:54:23.908117 9.09% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.03 minutes.\r", "2019-03-01 10:54:23.918158 18.18% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:23.919821 27.27% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:23.921352 36.36% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:23.921910 45.45% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.923873 54.55% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.924893 63.64% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.935960 72.73% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.940660 81.82% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.972865 90.91% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:23.983755 100.0% applyPtSlOnT1 done after 0.0 minutes. 
Remaining 0.0 minutes.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " t1 trgt\n", "2018-02-13 13:43:37 2018-02-14 13:53:59 0.014365\n", "2018-02-14 10:30:48 2018-02-15 09:31:56 0.012136\n", "2018-02-14 13:36:02 2018-02-15 13:42:09 0.011688\n", "2018-02-15 09:31:56 2018-02-16 09:42:36 0.011244\n", "2018-02-15 14:05:41 2018-02-16 12:05:18 0.010183\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 2 columns):\n", "t1 929 non-null datetime64[ns]\n", "trgt 929 non-null float64\n", "dtypes: datetime64[ns](1), float64(1)\n", "memory usage: 21.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "# create target series\n", "ptsl = [1,1]\n", "target=dailyVol\n", "# select minRet\n", "minRet = 0.01\n", "\n", "# Run in single-threaded mode on Windows\n", "import platform\n", "if platform.system() == \"Windows\":\n", " cpus = 1\n", "else:\n", " cpus = cpu_count() - 1\n", " \n", "events = getEvents(close,tEvents,ptsl,target,minRet,cpus,t1=t1)\n", "\n", "cprint(events)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (d) Apply `getBins` to generate labels" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:24.550849Z", "start_time": "2019-03-01T17:54:24.143356Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin\n", "2018-02-13 13:43:37 0.010108 1.0\n", "2018-02-14 10:30:48 0.015045 
1.0\n", "2018-02-14 13:36:02 0.005056 1.0\n", "2018-02-15 09:31:56 0.003964 1.0\n", "2018-02-15 14:05:41 0.010431 1.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 2 columns):\n", "ret 929 non-null float64\n", "bin 929 non-null float64\n", "dtypes: float64(2)\n", "memory usage: 61.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", " 1.0 523\n", "-1.0 406\n", "Name: bin, dtype: int64\n" ] } ], "source": [ "labels = getBins(events, close)\n", "cprint(labels)\n", "print(labels.bin.value_counts())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [3.2] Use snippet 3.8 to drop under-populated labels" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:24.976600Z", "start_time": "2019-03-01T17:54:24.552234Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin\n", "2018-02-13 13:43:37 0.010108 1.0\n", "2018-02-14 10:30:48 0.015045 1.0\n", "2018-02-14 13:36:02 0.005056 1.0\n", "2018-02-15 09:31:56 0.003964 1.0\n", "2018-02-15 14:05:41 0.010431 1.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 2 columns):\n", "ret 929 non-null float64\n", "bin 929 non-null float64\n", "dtypes: float64(2)\n", "memory usage: 61.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "clean_labels = dropLabels(labels)\n", "cprint(clean_labels)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": 
"2019-03-01T17:54:25.770917Z", "start_time": "2019-03-01T17:54:24.979035Z" } }, "outputs": [ { "data": { "text/plain": [ " 1.0 523\n", "-1.0 406\n", "Name: bin, dtype: int64" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "clean_labels.bin.value_counts()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [3.3] Adjust the `getBins` function to return a `0` whenever the vertical barrier is the one touched first." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:29.719766Z", "start_time": "2019-03-01T17:54:25.772774Z" } }, "outputs": [], "source": [ "def getBinsNew(events, close, t1=None):\n", " '''\n", " Compute event's outcome (including side information, if provided).\n", " events is a DataFrame where:\n", " -events.index is event's starttime\n", " -events['t1'] is event's endtime\n", " -events['trgt'] is event's target\n", " -events['side'] (optional) implies the algo's position side\n", " -t1 is original vertical barrier series\n", " Case 1: ('side' not in events): bin in (-1,1) <-label by price action\n", " Case 2: ('side' in events): bin in (0,1) <-label by pnl (meta-labeling)\n", " '''\n", " #1) prices aligned with events\n", " events_=events.dropna(subset=['t1'])\n", " px=events_.index.union(events_['t1'].values).drop_duplicates()\n", " px=close.reindex(px,method='bfill')\n", " #2) create out object\n", " out=pd.DataFrame(index=events_.index)\n", " out['ret']=px.loc[events_['t1'].values].values/px.loc[events_.index]-1\n", " if 'side' in events_:out['ret']*=events_['side'] # meta-labeling\n", " out['bin']=np.sign(out['ret'])\n", " \n", " if 'side' not in events_:\n", " # only applies when not meta-labeling\n", " # to update bin to 0 when vertical barrier is touched, we need the original\n", " # vertical barrier series since the events['t1'] is the time of first \n", " # touch of any barrier and not the vertical barrier specifically. 
\n", "        # An event is labeled 0 only when its first-touch time, events['t1'], \n", "        # equals that same event's vertical barrier time in t1\n", "        vbar = t1.reindex(events_.index)\n", "        vtouch_first_idx = events_.index[(events_['t1'] == vbar).values]\n", "        out.loc[vtouch_first_idx, 'bin'] = 0.\n", "    \n", "    if 'side' in events_:out.loc[out['ret']<=0,'bin']=0 # meta-labeling\n", "    return out" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [3.4] Develop a moving average crossover strategy. For each observation, the model suggests a side but not the size of the bet" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:32.472289Z", "start_time": "2019-03-01T17:54:29.721345Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " price fast slow\n", "2018-02-26 15:31:06 115.29 115.227691 115.057569\n", "2018-02-26 15:40:15 115.41 115.273268 115.101623\n", "2018-02-26 15:49:42 115.20 115.254951 115.113920\n", "2018-02-26 15:59:04 115.27 115.258713 115.133430\n", "2018-02-26 16:16:14 115.30 115.269035 115.154251\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 30860 entries, 2009-09-28 09:53:49 to 2018-02-26 16:16:14\n", "Data columns (total 3 columns):\n", "price 30860 non-null float64\n", "fast 30860 non-null float64\n", "slow 30860 non-null float64\n", "dtypes: float64(3)\n", "memory usage: 964.4 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "fast_window = 3\n", "slow_window = 7\n", "\n", "close_df = (pd.DataFrame()\n", "            .assign(price=close)\n", "            .assign(fast=close.ewm(fast_window).mean())\n", "            .assign(slow=close.ewm(slow_window).mean()))\n", "cprint(close_df)" ] }, { "cell_type": "code", "execution_count": 
29, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:33.757620Z", "start_time": "2019-03-01T17:54:32.473676Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "def get_up_cross(df):\n", " crit1 = df.fast.shift(1) < df.slow.shift(1)\n", " crit2 = df.fast > df.slow\n", " return df.fast[(crit1) & (crit2)]\n", "\n", "def get_down_cross(df):\n", " crit1 = df.fast.shift(1) > df.slow.shift(1)\n", " crit2 = df.fast < df.slow\n", " return df.fast[(crit1) & (crit2)]\n", "\n", "up = get_up_cross(close_df)\n", "down = get_down_cross(close_df)\n", "\n", "f, ax = plt.subplots(figsize=(11,8))\n", "\n", "close_df.loc['2014':].plot(ax=ax, alpha=.5)\n", "up.loc['2014':].plot(ax=ax,ls='',marker='^', markersize=7,\n", " alpha=0.75, label='upcross', color='g')\n", "down.loc['2014':].plot(ax=ax,ls='',marker='v', markersize=7, \n", " alpha=0.75, label='downcross', color='r')\n", "\n", "ax.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (a) Derive meta-labels for `ptSl = [1,2]` and `t1` where `numdays=1`. Use as `trgt` dailyVol computed by snippet 3.1 (get events with sides)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:34.189161Z", "start_time": "2019-03-01T17:54:33.760127Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " 0\n", "2018-02-21 11:10:00 1\n", "2018-02-21 15:12:30 -1\n", "2018-02-22 11:48:39 1\n", "2018-02-22 13:34:29 -1\n", "2018-02-23 10:01:41 1\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 1712 entries, 2009-09-30 09:45:21 to 2018-02-23 10:01:41\n", "Data columns (total 1 columns):\n", "0 1712 non-null int64\n", "dtypes: int64(1)\n", "memory usage: 26.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], 
"source": [ "side_up = pd.Series(1, index=up.index)\n", "side_down = pd.Series(-1, index=down.index)\n", "side = pd.concat([side_up,side_down]).sort_index()\n", "cprint(side)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:36.918898Z", "start_time": "2019-03-01T17:54:34.191595Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|██████████| 30858/30858 [00:01<00:00, 17281.84it/s]\n", "2019-03-01 10:54:36.730759 100.0% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes..\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " side t1 trgt\n", "2018-02-13 13:43:37 NaN 2018-02-14 13:53:59 0.014365\n", "2018-02-14 10:30:48 NaN 2018-02-15 10:42:27 0.012136\n", "2018-02-14 13:36:02 NaN 2018-02-15 13:42:09 0.011688\n", "2018-02-15 09:31:56 NaN 2018-02-16 09:42:36 0.011244\n", "2018-02-15 14:05:41 NaN 2018-02-16 14:15:08 0.010183\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 3 columns):\n", "side 102 non-null float64\n", "t1 929 non-null datetime64[ns]\n", "trgt 929 non-null float64\n", "dtypes: datetime64[ns](1), float64(2)\n", "memory usage: 29.0 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "minRet = .01 \n", "ptsl=[1,2]\n", "\n", "dailyVol = getDailyVol(close_df['price'])\n", "tEvents = getTEvents(close_df['price'],h=dailyVol.mean())\n", "t1 = addVerticalBarrier(tEvents, close_df['price'], numDays=1)\n", "\n", "ma_events = getEvents(close_df['price'],tEvents,ptsl,target,minRet,cpus,\n", " t1=t1,side=side)\n", "cprint(ma_events)" ] }, { "cell_type": "code", 
"execution_count": 32, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:37.223058Z", "start_time": "2019-03-01T17:54:36.920513Z" } }, "outputs": [ { "data": { "text/plain": [ " 1.0 53\n", "-1.0 49\n", "Name: side, dtype: int64" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ma_events.side.value_counts()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:37.538874Z", "start_time": "2019-03-01T17:54:37.225750Z" } }, "outputs": [], "source": [ "ma_side = ma_events.dropna().side" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:37.883132Z", "start_time": "2019-03-01T17:54:37.540690Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin\n", "2016-07-07 14:28:00 -0.018703 0.0\n", "2016-07-08 09:30:57 0.010571 1.0\n", "2018-02-06 10:18:08 -0.026702 0.0\n", "2018-02-07 15:28:09 -0.030792 0.0\n", "2018-02-13 09:30:00 -0.001803 0.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns (total 2 columns):\n", "ret 102 non-null float64\n", "bin 102 non-null float64\n", "dtypes: float64(2)\n", "memory usage: 2.4 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "ma_bins = getBinsNew(ma_events,close_df['price'], t1).dropna()\n", "cprint(ma_bins)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:38.190267Z", "start_time": "2019-03-01T17:54:37.886249Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin side\n", "2016-07-07 14:28:00 -0.018703 0.0 -1\n", "2016-07-08 09:30:57 0.010571 1.0 1\n", "2018-02-06 10:18:08 -0.026702 0.0 -1\n", "2018-02-07 15:28:09 -0.030792 0.0 1\n", "2018-02-13 09:30:00 -0.001803 0.0 -1\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns (total 3 columns):\n", "ret 102 non-null float64\n", "bin 102 non-null float64\n", "side 102 non-null int64\n", "dtypes: float64(2), int64(1)\n", "memory usage: 3.2 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "Xx = pd.merge_asof(ma_bins, side.to_frame().rename(columns={0:'side'}),\n", " left_index=True, right_index=True, direction='forward')\n", "cprint(Xx)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (b) Train Random Forest to decide whether to trade or not `{0,1}` since underlying model (crossing m.a.) 
has decided the side, `{-1,1}`" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:38.523365Z", "start_time": "2019-03-01T17:54:38.192661Z" } }, "outputs": [], "source": [ "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.metrics import roc_curve, classification_report" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:44.603030Z", "start_time": "2019-03-01T17:54:38.525191Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " precision recall f1-score support\n", "\n", " 0.0 0.00 0.00 0.00 24\n", " 1.0 0.53 1.00 0.69 27\n", "\n", " micro avg 0.53 0.53 0.53 51\n", " macro avg 0.26 0.50 0.35 51\n", "weighted avg 0.28 0.53 0.37 51\n", "\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "X = ma_side.values.reshape(-1,1)\n", "#X = Xx.side.values.reshape(-1,1)\n", "y = ma_bins.bin.values\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,\n", " shuffle=False)\n", "\n", "n_estimator = 10000\n", "rf = RandomForestClassifier(max_depth=2, n_estimators=n_estimator,\n", " criterion='entropy', random_state=RANDOM_STATE)\n", "rf.fit(X_train, y_train)\n", "\n", "# The random forest model by itself\n", "y_pred_rf = rf.predict_proba(X_test)[:, 1]\n", "y_pred = rf.predict(X_test)\n", "fpr_rf, tpr_rf, _ = roc_curve(y_test, y_pred_rf)\n", "print(classification_report(y_test, y_pred))\n", "\n", "plt.figure(1)\n", "plt.plot([0, 1], [0, 1], 'k--')\n", "plt.plot(fpr_rf, tpr_rf, label='RF')\n", "plt.xlabel('False positive rate')\n", "plt.ylabel('True positive rate')\n", "plt.title('ROC curve')\n", "plt.legend(loc='best')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## [3.5] Develop mean-reverting Bollinger Band Strategy. For each obs. model suggests a side but not size of the bet." 
] }, { "cell_type": "code", "execution_count": 38, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:44.934155Z", "start_time": "2019-03-01T17:54:44.604514Z" } }, "outputs": [], "source": [ "def bbands(price, window=None, width=None, numsd=None):\n", "    \"\"\"Return price, rolling average, upper band, and lower band.\"\"\"\n", "    ave = price.rolling(window).mean()\n", "    sd = price.rolling(window).std(ddof=0)\n", "    if width:\n", "        upband = ave * (1+width)\n", "        dnband = ave * (1-width)\n", "        return price, np.round(ave,3), np.round(upband,3), np.round(dnband,3)\n", "    if numsd:\n", "        upband = ave + (sd*numsd)\n", "        dnband = ave - (sd*numsd)\n", "        return price, np.round(ave,3), np.round(upband,3), np.round(dnband,3)\n", "    raise ValueError('bbands requires either width or numsd')" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:45.316561Z", "start_time": "2019-03-01T17:54:44.935929Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " price ave upper lower\n", "2018-02-26 15:31:06 115.29 114.005 114.959 113.051\n", "2018-02-26 15:40:15 115.41 114.069 115.008 113.129\n", "2018-02-26 15:49:42 115.20 114.124 115.047 113.202\n", "2018-02-26 15:59:04 115.27 114.183 115.083 113.282\n", "2018-02-26 16:16:14 115.30 114.231 115.125 113.338\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 30811 entries, 2009-10-01 15:51:02 to 2018-02-26 16:16:14\n", "Data columns (total 4 columns):\n", "price 30811 non-null float64\n", "ave 30811 non-null float64\n", "upper 30811 non-null float64\n", "lower 30811 non-null float64\n", "dtypes: float64(4)\n", "memory usage: 1.2 MB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "window=50\n", "bb_df = pd.DataFrame()\n", 
"bb_df['price'],bb_df['ave'],bb_df['upper'],bb_df['lower']=bbands(close, window=window, numsd=1)\n", "bb_df.dropna(inplace=True)\n", "cprint(bb_df)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:45.892818Z", "start_time": "2019-03-01T17:54:45.319710Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "f,ax=plt.subplots(figsize=(11,8))\n", "bb_df.loc['2014'].plot(ax=ax)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:46.625944Z", "start_time": "2019-03-01T17:54:45.894447Z" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "def get_up_cross(df, col):\n", " # col is price column\n", " crit1 = df[col].shift(1) < df.upper.shift(1) \n", " crit2 = df[col] > df.upper\n", " return df[col][(crit1) & (crit2)]\n", "\n", "def get_down_cross(df, col):\n", " # col is price column \n", " crit1 = df[col].shift(1) > df.lower.shift(1) \n", " crit2 = df[col] < df.lower\n", " return df[col][(crit1) & (crit2)]\n", "\n", "bb_down = get_down_cross(bb_df, 'price')\n", "bb_up = get_up_cross(bb_df, 'price') \n", "\n", "f, ax = plt.subplots(figsize=(11,8))\n", "\n", "bb_df.loc['2014':].plot(ax=ax, alpha=.5)\n", "bb_up.loc['2014':].plot(ax=ax, ls='', marker='^', markersize=7,\n", " alpha=0.75, label='upcross', color='g')\n", "bb_down.loc['2014':].plot(ax=ax, ls='', marker='v', markersize=7, \n", " alpha=0.75, label='downcross', color='r')\n", "ax.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### (a) Derive meta-labels for `ptSl=[0,2]` and `t1` where `numdays=1`. Use as `trgt` dailyVol." 
] }, { "cell_type": "code", "execution_count": 42, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:47.459211Z", "start_time": "2019-03-01T17:54:46.627389Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " 0\n", "2018-02-22 13:34:29 1\n", "2018-02-22 14:20:25 1\n", "2018-02-22 14:44:33 1\n", "2018-02-23 13:41:26 -1\n", "2018-02-23 14:40:49 -1\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 2040 entries, 2009-10-06 09:29:52 to 2018-02-23 14:40:49\n", "Data columns (total 1 columns):\n", "0 2040 non-null int64\n", "dtypes: int64(1)\n", "memory usage: 31.9 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "2019-03-01 10:54:47.213061 9.09% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.03 minutes.\r", "2019-03-01 10:54:47.217601 18.18% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:47.219915 27.27% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:47.222889 36.36% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:47.224198 45.45% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.224620 54.55% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.226273 63.64% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.227236 72.73% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.230049 81.82% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.235371 90.91% applyPtSlOnT1 done after 0.0 minutes. 
Remaining 0.0 minutes.\r", "2019-03-01 10:54:47.236315 100.0% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " side t1 trgt\n", "2018-02-13 13:43:37 -1.0 2018-02-14 13:53:59 0.014365\n", "2018-02-14 10:30:48 NaN 2018-02-15 10:42:27 0.012136\n", "2018-02-14 13:36:02 NaN 2018-02-15 13:42:09 0.011688\n", "2018-02-15 09:31:56 NaN 2018-02-16 09:42:36 0.011244\n", "2018-02-15 14:05:41 NaN 2018-02-16 14:15:08 0.010183\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 3 columns):\n", "side 139 non-null float64\n", "t1 929 non-null datetime64[ns]\n", "trgt 929 non-null float64\n", "dtypes: datetime64[ns](1), float64(2)\n", "memory usage: 29.0 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " side\n", "2016-07-07 10:17:10 -1.0\n", "2016-07-08 09:30:57 -1.0\n", "2018-02-06 10:18:08 1.0\n", "2018-02-06 14:19:57 1.0\n", "2018-02-13 13:43:37 -1.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 139 entries, 2009-10-06 09:29:52 to 2018-02-13 13:43:37\n", "Data columns (total 1 columns):\n", "side 139 non-null float64\n", "dtypes: float64(1)\n", "memory usage: 2.2 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "bb_side_up = pd.Series(-1, index=bb_up.index) # sell on up cross for mean reversion\n", "bb_side_down = pd.Series(1, 
index=bb_down.index) # buy on down cross for mean reversion\n", "bb_side_raw = pd.concat([bb_side_up,bb_side_down]).sort_index()\n", "cprint(bb_side_raw)\n", "\n", "minRet = .01 \n", "ptsl=[0,2]\n", "bb_events = getEvents(close,tEvents,ptsl,target,minRet,cpus,t1=t1,side=bb_side_raw)\n", "cprint(bb_events)\n", "\n", "bb_side = bb_events.dropna().side\n", "cprint(bb_side)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:47.820661Z", "start_time": "2019-03-01T17:54:47.460801Z" } }, "outputs": [ { "data": { "text/plain": [ " 1.0 72\n", "-1.0 67\n", "Name: side, dtype: int64" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bb_side.value_counts()" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:48.192517Z", "start_time": "2019-03-01T17:54:47.823058Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin\n", "2016-07-07 10:17:10 -0.003791 0.0\n", "2016-07-08 09:30:57 -0.010571 0.0\n", "2018-02-06 10:18:08 0.025085 1.0\n", "2018-02-06 14:19:57 0.028779 1.0\n", "2018-02-13 13:43:37 -0.010108 0.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 139 entries, 2009-10-06 09:29:52 to 2018-02-13 13:43:37\n", "Data columns (total 2 columns):\n", "ret 139 non-null float64\n", "bin 139 non-null float64\n", "dtypes: float64(2)\n", "memory usage: 3.3 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", "0.0 79\n", "1.0 60\n", "Name: bin, dtype: int64\n" ] } ], "source": [ "bb_bins = getBins(bb_events,close).dropna()\n", "cprint(bb_bins)\n", "print(bb_bins.bin.value_counts())" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### (b) Train a random forest to decide whether to trade or not. Use as features: volatility, serial correlation, and the crossing moving averages from exercise 2." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:48.508454Z", "start_time": "2019-03-01T17:54:48.195468Z" } }, "outputs": [], "source": [ "def returns(s):\n", "    arr = np.diff(np.log(s))\n", "    return (pd.Series(arr, index=s.index[1:]))\n", "\n", "def df_rolling_autocorr(df, window, lag=1):\n", "    \"\"\"Compute rolling column-wise autocorrelation for a DataFrame.\"\"\"\n", "\n", "    return (df.rolling(window=window)\n", "            .corr(df.shift(lag))) # could .dropna() here\n", "\n", "#df_rolling_autocorr(d1, window=21).dropna().head()" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:48.863084Z", "start_time": "2019-03-01T17:54:48.511289Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " srl_corr\n", "2018-02-26 15:31:06 0.028037\n", "2018-02-26 15:40:15 0.015957\n", "2018-02-26 15:49:42 0.032877\n", "2018-02-26 15:59:04 0.046014\n", "2018-02-26 16:16:14 0.109129\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 30859 entries, 2009-09-28 10:06:04 to 2018-02-26 16:16:14\n", "Data columns (total 1 columns):\n", "srl_corr 30809 non-null float64\n", "dtypes: float64(1)\n", "memory usage: 482.2 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "srl_corr = df_rolling_autocorr(returns(close), window=window).rename('srl_corr')\n", "cprint(srl_corr)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "ExecuteTime": { "end_time": 
"2019-03-01T17:54:49.215766Z", "start_time": "2019-03-01T17:54:48.864737Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " vol ma_side srl_corr\n", "2016-07-07 14:28:00 0.012624 -1.0 0.251865\n", "2016-07-08 09:30:57 0.011944 1.0 0.238590\n", "2018-02-06 10:18:08 0.013317 -1.0 0.123961\n", "2018-02-07 15:28:09 0.024870 1.0 -0.005597\n", "2018-02-13 09:30:00 0.017363 -1.0 0.198935\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns (total 3 columns):\n", "vol 102 non-null float64\n", "ma_side 102 non-null float64\n", "srl_corr 102 non-null float64\n", "dtypes: float64(3)\n", "memory usage: 3.2 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "features = (pd.DataFrame()\n", " .assign(vol=bb_events.trgt)\n", " .assign(ma_side=ma_side)\n", " .assign(srl_corr=srl_corr)\n", " .drop_duplicates()\n", " .dropna())\n", "cprint(features)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:49.534835Z", "start_time": "2019-03-01T17:54:49.218405Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " vol ma_side srl_corr bin\n", "2016-07-07 14:28:00 0.012624 -1.0 0.251865 0.0\n", "2016-07-08 09:30:57 0.011944 1.0 0.238590 0.0\n", "2018-02-06 10:18:08 0.013317 -1.0 0.123961 1.0\n", "2018-02-07 15:28:09 0.024870 1.0 -0.005597 0.0\n", "2018-02-13 09:30:00 0.017363 -1.0 0.198935 0.0\n", 
"--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns (total 4 columns):\n", "vol 102 non-null float64\n", "ma_side 102 non-null float64\n", "srl_corr 102 non-null float64\n", "bin 102 non-null float64\n", "dtypes: float64(4)\n", "memory usage: 4.0 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n" ] } ], "source": [ "Xy = (pd.merge_asof(features, bb_bins[['bin']], \n", " left_index=True, right_index=True, \n", " direction='forward').dropna())\n", "cprint(Xy)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:49.873144Z", "start_time": "2019-03-01T17:54:49.536775Z" } }, "outputs": [ { "data": { "text/plain": [ "0.0 60\n", "1.0 42\n", "Name: bin, dtype: int64" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Xy.bin.value_counts()" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:54:56.011226Z", "start_time": "2019-03-01T17:54:49.875226Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " precision recall f1-score support\n", "\n", " no_trade 0.47 0.73 0.58 26\n", " trade 0.36 0.16 0.22 25\n", "\n", " micro avg 0.45 0.45 0.45 51\n", " macro avg 0.42 0.45 0.40 51\n", "weighted avg 0.42 0.45 0.40 51\n", "\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "X = Xy.drop('bin',axis=1).values\n", "y = Xy['bin'].values\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,\n", " shuffle=False)\n", "\n", "n_estimator = 10000\n", "rf = RandomForestClassifier(max_depth=2, n_estimators=n_estimator,\n", " criterion='entropy', random_state=RANDOM_STATE)\n", "rf.fit(X_train, y_train)\n", "\n", "# The random forest model by itself\n", "y_pred_rf = rf.predict_proba(X_test)[:, 1]\n", "y_pred = rf.predict(X_test)\n", "fpr_rf, tpr_rf, _ = roc_curve(y_test, y_pred_rf)\n", "print(classification_report(y_test, y_pred, target_names=['no_trade','trade']))\n", "\n", "plt.figure(1)\n", "plt.plot([0, 1], [0, 1], 'k--')\n", "plt.plot(fpr_rf, tpr_rf, label='RF')\n", "plt.xlabel('False positive rate')\n", "plt.ylabel('True positive rate')\n", "plt.title('ROC curve')\n", "plt.legend(loc='best')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2018-06-12T14:52:40.235268Z", "start_time": "2018-06-12T14:52:39.912077Z" } }, "source": [ "### (c) What is the accuracy of predictions from the primary model if the secondary model does not filter the bets? What is the classification report?" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "ExecuteTime": { "end_time": "2019-03-01T17:55:02.893334Z", "start_time": "2019-03-01T17:54:56.012559Z" }, "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2019-03-01 10:54:56.611727 9.09% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.03 minutes.\r", "2019-03-01 10:54:56.612903 18.18% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:56.613181 27.27% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.01 minutes.\r", "2019-03-01 10:54:56.615975 36.36% applyPtSlOnT1 done after 0.0 minutes. 
Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.624309 45.45% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.626470 54.55% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.629972 63.64% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.632532 72.73% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.633112 81.82% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.639612 90.91% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\r", "2019-03-01 10:54:56.657881 100.0% applyPtSlOnT1 done after 0.0 minutes. Remaining 0.0 minutes.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " t1 trgt\n", "2018-02-13 13:43:37 2018-02-14 13:53:59 0.014365\n", "2018-02-14 10:30:48 2018-02-15 10:42:27 0.012136\n", "2018-02-14 13:36:02 2018-02-15 13:42:09 0.011688\n", "2018-02-15 09:31:56 2018-02-16 09:42:36 0.011244\n", "2018-02-15 14:05:41 2018-02-16 14:15:08 0.010183\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 2 columns):\n", "t1 929 non-null datetime64[ns]\n", "trgt 929 non-null float64\n", "dtypes: datetime64[ns](1), float64(1)\n", "memory usage: 21.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " ret bin\n", "2018-02-13 13:43:37 0.010108 1.0\n", "2018-02-14 10:30:48 0.010876 1.0\n", "2018-02-14 13:36:02 0.005056 
1.0\n", "2018-02-15 09:31:56 0.003964 1.0\n", "2018-02-15 14:05:41 0.004842 1.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 929 entries, 2009-10-05 14:55:48 to 2018-02-15 14:05:41\n", "Data columns (total 2 columns):\n", "ret 929 non-null float64\n", "bin 929 non-null float64\n", "dtypes: float64(2)\n", "memory usage: 21.8 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " vol ma_side srl_corr\n", "2016-07-07 14:28:00 0.012624 -1.0 0.251865\n", "2016-07-08 09:30:57 0.011944 1.0 0.238590\n", "2018-02-06 10:18:08 0.013317 -1.0 0.123961\n", "2018-02-07 15:28:09 0.024870 1.0 -0.005597\n", "2018-02-13 09:30:00 0.017363 -1.0 0.198935\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns (total 3 columns):\n", "vol 102 non-null float64\n", "ma_side 102 non-null float64\n", "srl_corr 102 non-null float64\n", "dtypes: float64(3)\n", "memory usage: 3.2 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", "-------------------------------------------------------------------------------\n", "dataframe information\n", "-------------------------------------------------------------------------------\n", " vol ma_side srl_corr bin\n", "2016-07-07 14:28:00 0.012624 -1.0 0.251865 1.0\n", "2016-07-08 09:30:57 0.011944 1.0 0.238590 1.0\n", "2018-02-06 10:18:08 0.013317 -1.0 0.123961 1.0\n", "2018-02-07 15:28:09 0.024870 1.0 -0.005597 -1.0\n", "2018-02-13 09:30:00 0.017363 -1.0 0.198935 1.0\n", "--------------------------------------------------\n", "\n", "DatetimeIndex: 102 entries, 2009-10-29 13:40:22 to 2018-02-13 09:30:00\n", "Data columns 
(total 4 columns):\n", "vol 102 non-null float64\n", "ma_side 102 non-null float64\n", "srl_corr 102 non-null float64\n", "bin 102 non-null float64\n", "dtypes: float64(4)\n", "memory usage: 4.0 KB\n", "None\n", "-------------------------------------------------------------------------------\n", "\n", " precision recall f1-score support\n", "\n", " no_trade 0.39 0.43 0.41 21\n", " trade 0.57 0.53 0.55 30\n", "\n", " micro avg 0.49 0.49 0.49 51\n", " macro avg 0.48 0.48 0.48 51\n", "weighted avg 0.50 0.49 0.49 51\n", "\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEfCAYAAABSy/GnAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzsnXl8FdXZx78PWUgMSUhIDC6AC1SrraKiEnFBqdSlluXFVy1Ua0WwUkBl0YpapNoqq6C+1l2pS90qCmJBQW1QECt1r5SggAgJ2chGkpuE5/3jTm5jvElukpm5y5zv5zOf3DnnzJznl5PcZ84yzxFVxWAwGAyG9ugWbgMMBoPBEB0Yh2EwGAyGkDAOw2AwGAwhYRyGwWAwGELCOAyDwWAwhIRxGAaDwWAICeMwDAaDwRASxmEYYhoReUJE1DoaRWSniCwVkUOClM0RkXtFZJuI+ESkSEReFJGBQcrGi8hkEdkoIpUiUi4i/xKRWSKS4Y46g8FdjMMweIE84CCgL/AL4ATgheYFRKQP8E/gNOA3QH/gQqAe2CAi5zUrmwC8BtwJPA+cAxwPzAIGA1c4K+e7iEiim/UZvIuYN70NsYyIPAEcqqo/aZY2GVgCpKtqhZX2KnAK8IOmtGblVwInAoerao2ITAPmAUNUdX2QOjNUtawVe+KBm/E7lUOBYuBvqjrZylfgl6r6VLNr3gR2quqvrPNtwFNAJnAJ8BXwHyBbVYe3qO91oFxVL7XOzwVmW3pKgdXAdFUtaf23aDD4MT0Mg6cQkYOBMUCjdWANIV0I3NfSWVj8CcgBzrXOfwmsDeYsAFpzFhaPAr/F/6V9DPA/+L/wO8oUYA+Qi9/5LAWGNR9qE5Emm5+0zs8BXgH+ChwHjAQOA14WEemEDQaPER9uAwwGFxgqIlX4H5CSrbQFqlptfR5g5X3eyvVN6UdZP38A/KOjRohIf+By4GJVfdFK3gps6Oi9gA9UdXaze38JFADjgLut5LFAEf5eBMBtwBJVvbfZdVcA2/EPqX3UCTsMHsL0MAxe4H1gIP4hpz/g/4K+tVl+e0/XLcdtJUhaKJxo/VzdZqnQ2Nj8RFX3A0/j7/008UvgaVVttM5PBq4TkaqmA/jCyhtgg02GGMf0MAxeoEZV863Pn4nID4D7gV9baVuA/cCPgJeDXP8j6+fmZj+PdchW5fsOLCFIueogaU8CM0TkJKAOv5NsPgHfDX/v4y9Bri3ouKkGr2F6GAYvMhu4QkQGAahqKfA6MElE0oKUvxkoBN6wzp8CzhGR3GA3b2NZ7Sbr5/BW8sE/L3Fws3t1xz/X0S6q+rlVx+XW8ZGqftKsyD+BY1U1P8hRFUodBm9jHIbBc6jql8AK/JPZTUzCPwm+VkTOE5E+InKyiDwDnA38SlVrrLKLgTXAKhGZLiKDRK
Sfdd0y/F/WwerNxz9s9H8iMk5EjrTqmNqs2JvANSKSKyI/Ap4AOrJs9kngMvzzF0tb5N0GjBCRRSIy0Kr/PBF5VESSv3cng6EFxmEYvMpc4CciMgxAVbcDg/DPdzyIfzL6daA7kKuqf2+6UFXrgfPxz4NcCrwDfIrfAW3EWpXUClda978D+Df+IbDDm+VPBz4DVln1/wP4oAO6ngF6AgdanwOo6lv43xn5Mf53Uz4BFgGV+N83MRjaxLyHYTAYDIaQMD0Mg8FgMISEcRgGg8FgCAnjMAwGg8EQEsZhGAwGgyEkYvbFvbffflu7d+/eqWsbGhqIj4/ZX01QjGZvYDR7g65o3rdvX/GwYcOyg+XF7G+xe/fuHH300Z26tq6ujs46m2jFaPYGRrM36IrmTZs2bW8tzwxJBaGgwHtREoxmb2A0ewOnNBuHEYSEhGChe2Ibo9kbGM3ewCnNxmEEIT09PdwmuI7R7A2MZm/glGbjMIJQXFwcbhNcx2j2BkazN3BKs3EYQTBPJN7AaPYGRrN9uLZKSkQuxR8R9HjgAFVts24r9PT/4d+LYDfw++b7HHcFVaWqqorW4mj5fD4qKoLt1Bm9qCpJSUmtrpzw+XwuWxR+jGZvYDTbh5vLasvwO4Bk4KG2CopIOv5InfOBM4Az8e87vLW1fZQ7QlVVFd27dycxMXjU6O7du8fcMjxVZd++fTQ0NJCSkvK9/JqamiBXxTZGszcwmu3DNYehqqsARGRoCMVHAzXAXPV3A94QkZeBCUBIDqOhoYH8/PzAeWZmJpmZmU22tOosIDZXVYgIKSkprfacevfu7bJF4cdo9gZe0rxixQqWFBwCwOrx/Wy/f6S+uHc8sEm/O2a0ie/uV9wme/bsYcyYMYHziRMnMnv2bAoKCkhOTuaAAw6gsbGR+Ph4Ghv9Wx7Hx8dTX1/P/v37SUhIoLGxkYSEBBoaGgCIi4ujoaGBuLg4VDVQrr6+HhEJOT8+Pp79+/d/J79bt25069YtkN/Y2IiqfidfRL5jc3v5zTXFxcUBUF1dTWJiIkVFRYgImZmZFBUVUVNTQ1ZWFtXV1fTu3ZuCggISEhJIT0+nuLiY9PR0fD4fNTU1gfzExERSU1MpKSkhIyODmpoaamtrA/lJSUkkJydTVlZGr169qKysxOfzBfKTk5NJTEykvLycrKwsysvLqa+vD+SnpKQQFxdHRUUF2dnZlJaWoqpkZ2dTWFhIjx49AH+PMScn53ua0tLSaGxsbFVTbW1twO5Y0dReO9XV1ZGenh5TmtprJ5/PR48ePWJKU8t22r17N7fddhsrVqxg0Nw1AOzatatTmtrC9f0wrB7Gm23NYYjIo0C8ql7RLO1KYJaq9g+lnry8PE1KSgqcN+9hVFRUkJYWbCdOPz6fr80eSDTTmvbdu3dz0EEHhcGi8GE0ewOvaL777ru5//77Ofq2VwBYPf6ETt1n06ZNHw4bNmxQsLxIXSVVCbSc5u8JhDwTHR8fT//+/QNHk7MIhaancS+RmpoabhNcx2j2BrGq+ZtvvmHdunWB8+uvv5733nvP0Toj1WF8DLR0jydY6Y7TNATlJUpKSsJtgusYzd4g1jTv37+fRx55hCFDhvDrX/86oC8xMZFDDz3U0bpdcxgiEiciSVgb2otIknVIkOIvAweIyAwRSbT2XR5NO6ur7CJSIltedNFF9O7dmz59+tCvXz/OPPNMli1bFjS/6ZgyZUqn6srIyLDL7KjBaPYGsaR5y5YtXHjhhcycOZOqqipOO+20Vl8PcAI3exi/xL/yaRUQZ32uAfqJyBkiUiUifQFUdS9wAXAxUA48DFxjx5LaUNi/f78b1YTE9OnT+eabb9i6dSuXXXYZEyZM4KuvvvpeftOxZMmSTtVjlh56A6M5Oqmvr2fRokWceeaZvP/+++Tk5PDkk0/yxBNPkJWV5Zodbi6rfQJ4opXsbcB3puhV9QPgFEeNshj+yL/cqCZAZyaj4uPjufzyy5k1ax
affvopRxxxhK021dbW2nq/aMBo9gaxoHnChAm88op/Mnvs2LH84Q9/oGfPnq7bERljL4Z28fl8PPbYYwD07x/SQrEO4aW16k0Yzd4gFjSPHz+ejz/+mIULFzJ06NCw2WEcBt9/4o+kDVcWLlzIfffdR1VVFQkJCSxevJhjjz32e/lNvPDCC5x88skdrqegoIB+/ex/0SeSMZq9QTRq3rBhA++++y7Tpk0DYMiQIbz//vthf6k4UldJhZVu3SLn13LDDTewbds28vPzOffcc8nLywua33R0xlkANH9nxSsYzd4gmjRXVlZy4403cuGFF3LnnXeyYcOGQF64nQUYhxGUSHIYTfTs2ZPFixfzxhtvsHLlStvvn5ycbPs9Ix2j2RtEi+Y1a9YwZMgQHn74YeLi4pg2bRonnNC5l++cIvK+GSOASH0PIyMjg2uvvZY//OEPtq/kKisrs/V+0YDR7A0iXXNZWRmTJk3i4osvZufOnQwcOJC1a9cya9asiBkab8LMYQQhUt7DCMbEiRN54IEH+Otf/2rrfXv16mXr/aIBo9kbdFXzLau2svEbh7c7OGE8g04YHzi9YUM9bHB39WYoRO43YxhpbGyMiPAgy5cv/15aWlpa4D2MX/ziF7bVVVlZGVLwsVjCaPYGXdXsuLNwgOOznemZGIcRBLcDMkYCZpMZb2A0d57OBvNrjqryzDPP8NRTT7Fs2TLHhpy2b9/uyH3NHEYQImE1gtvEwlr1jmI0e4NI0bxjxw7+53/+h8mTJ/P+++/z8ssvO1aXU5qNwwhCfX19uE1wnYKCgnCb4DpGszcIt+bGxkYefPBBhgwZwttvv01mZiYPPvggl1xyiWN1OqXZDEkFIRKX1TpNtCw9tBOj2RuEU/PmzZuZOnUqGzduBGDUqFHcddddZGdnO1qvU5qNwwhC8AC6sU2sbhjVFkazNwin5k2bNrFx40Z69+7N/PnzueCCC1yp1ynNnnQYItLmrnpN25zGEqrKvn37Wl39VV5eHpZgZuHEaPYGbmsuLS0NbNh26aWXsnfvXn7xi1+Qnt5yTzjncEpzbH0rhkiPHj2oqqpqNYplQ0MDdXV1LlvlLKpKUlJSq6sy3AyRHCkYzd7ALc01NTXMnTuXRx99lLVr19K/f39EhN/85jeu1N8cpzR70mGISJvbNu7atYuDDz7YRYvCT3l5OSkpKeE2w1WMZm/ghub33nuPqVOnsnXrVrp160ZeXp4jUaVDxSnNnnQY7eHFVVJGszcwmu2loqKCOXPmBLYeOOqoo7j33nsZNGiQY3WGglOajcMIQqSs23YTo9kbGM32sWHDBq6++mq+/fZb4uPjueGGG7j++usjIv6TeQ/DRcK9bjscGM3ewGi2j169elFcXMwJJ5zA22+/zU033RQRzgLMexiu4rUxXjCavYLR3HlUlby8PM444wxEhAEDBrBixQqOP/74iFtV6VQ7mx5GECIh8KDbGM3ewGjuPL/85S8ZOXLkdyJFn3TSSRHnLMC5djYOIwgVFdEXnbKrGM3ewGjuGM0Dka5cuZLU1NSoiAThVDtHvvIw4PRr+5GI0ewNjObQ2bZtG6NGjQqc//SnP+W9995zNAaUXTjVzsZhBKG0tDTcJriO0ewNjObQeP/99xkyZAj/+Mc/AmnPPPMMhxxyiJ2mOYZT7Rx5g28RgBf3wzCavYHRHBoDBw6kT58+HHfccWyz0qIpxpxT7Wx6GEEw3XZvYDR7g1A0+3w+Fi9eHNj/u3v37qxevZqHHnrIafMcwal2Nj2MIBQWFtKvX79wm+EqRrM38LLmdvfmTjmT117aBoE+RfTiVDubHkYQvLbnMRjNXsHLmruyN/cpfdLsMscVnGpn08MwGAyeYvX4E1i3bh1Tp07l66+/plu3bkyaNIkbb7yRAw44INzmRTSmhxGEqqqqcJvgOkazNzCaIT8/nxEjRvD1119zzDHH8MYbb3D77bfHlLNwqp1NDyMIOTk54TbBdYxmb2
A0Q//+/Rk/fjxZWVlMnTo1JnchdKqdXethiEiciMwTkSIRqRSRl0Sk1V0+RGS6iGy1ym4RkWvdsrWoqMitqiIGo9kbeFHz5s2bmTBhwnfS7r77bmbMmBGTzgKca2c3h6RuAkYApwKHWml/CVZQRH4O3A6MVdVU4HJgnoic64ah0bTe2i6MZm/gJc2qyosvvsioUaN48cUXw22OqzjVzm4OSU0A5qjqVwAiMhPIF5HDVHVbi7L9gY9VdQOAqq4XkU+A44E3QqmsoaGB/Pz8wHlmZmZgn932CLVcLGE0e4No19zu0tjvcST9b37JMXsiFafa2RWHISLpQF/gw6Y0Vd0qIhXAcXx/4fNfgV+LyBBgPTAE+AHw91Dr3LNnD2PGjAmcT5w4kdmzZ1NQUEBKSgpxcXFUVFSQnZ1NaWkpqkp2djaFhYXU1NSQnZ1NVVUVOTk5FBUVISJkZmZSVFREWloajY2NVFdX07t3bwoKCkhISCA9PZ3i4mLS09Px+XzU1NQE8hMTE0lNTaWkpISMjAxqamqora0N5CclJZGcnExZWRm9evWisrISn88XyE9OTiYxMZHy8nKysrIoLy+nvr4+kN+epqZldq1pqqmpISsrK6Y0tddOtbW1AbtjRVN77VRXV0d6enrUaurK0liAQYf0YPv27RGlyYm/PZ/PR0pKSqc0tYW4ESpARPoAO4AjVPXrZunbgVmq+lSL8vHALcDN/HfY7DpVvS/UOvPy8jQpKSlw3pEeRllZGRkZGaFWFRMYzd4g2jUPf+RfgH9pbDDKyso45ZRTKCkp4YILLmDevHkkJSVFtebO0JV23rRp04fDhg0LusesW0NSldbP9BbpPYFgjwy3ApcBA4F/A8cAr4pIjao+GkqF8fHxnd6EvbGxsVPXRTNGszeIRc0NDQ2oKgkJCWRkZHDPPffg8/kYOXIkIkJxcXG4TXQdp9rZlUlvVd2Lv4dxYlOaiBwBpAGfBLnkJOBlVf1C/XwOLAN+5oa91dXVblQTURjN3iDWNH/xxRecd955LFq0KJB24YUXMmrUqMDEb6xpDgWnNLu5Suoh4EYROVxE0oC7gVVBJrwB3gVGisgAABH5ITAS2OSGoU5toB7JGM3eIFY019XV8ac//YmhQ4eyadMmnnvuOerq6oKWjRXNHcEpzW6ukroLyAA+ALrjX+00DkBExgIPqmrTrMs8/MNXb1jvapQCL1j3cJyCggLPBWgzmr2BnZo7vmLJPoYOHcrmzZsBuOqqq7j11lvp3r170LKmne3DNYehqo3AdOtomfc08HSz8wb8723c5JZ9zUlISAhHtWHFaPYGdmoOl7Mo//f7bNm8mSOPPJLFixdz2mmntVnetLN9hOwwrJVLJwOHqOqLIpIMoKo1jlgWRtLTW87Nxz5GszdwQnNrK5bsRlUZPXo0/7ICB86cOZPk5OR2rzPtbB8hOQwRORZ4xTrtDbwIDAPG4l/NFFMUFxeTkpISbjNcxWj2BtGmuby8nKqqKg455BBEhHvuuYeysjIGDhwY8j2iTbMdOKU51EnvB4A7VLU/UG+lvQ2cYbtFEYB5IvEGRnNks3LlSnJzc7nmmmvYv38/AP369euQs4Do0mwXTmkO1WH8GHjS+qwAqloFxE484Gb4fL5wm+A6RrM3iAbNe/bs4de//jXjxo2joKAAn89HeXl5p+8XDZrtxinNoTqM7cB3BipFZBCw1XaLIoCampiblmkXo9kbRLJmVeX5558nNzeXZcuWkZKSwl133cXKlSu79KZ2JGt2Cqc0hzrpfRvwmoj8H5AoIjOAScBvHLEqzJh1297AaI4cVJUrrriCFStWAP5ls/fccw99+/bt8r0jVbOTOKU5pB6Gqr4K/Bzog/+luqOA/1XV1x2xKswUFBSE2wTXMZq9QaRqFhGOP/540tPTue+++3jppZdscRYQuZqdxCnNoa6SGqWqL+N/6a55+khVXeaIZWEkVjdVaQuj2RtEkub8/Hx27NjBOe
ecA8CUKVMYN26c7bvFRZJmt3BKc6hzGE+2kv6YXYZEEqmpqeE2wXWMZm8QCZobGhpYvHgxZ5xxBhMmTAjsDpeQkODI1qKRoNltnNLcZg9DRJr6hN2sEOXNt3E6AggevCXKKSkpCSk2fCxhNHuDcGv+7LPPmDx5Mh9//DEAo0ePdvxN7HBrDgdOaW5vSGob1jJa/CulmlMM/N5ugyIBr8XOB6PZK4RLc21tLQsWLGDx4sU0NDTQp08fFi1aFBiOchLTzvbR3pBUAv5AgRusz01HvKoeqKoPOGJVmDHL8LyB0eweV199NQsWLKCxsZEJEybw7rvvuuIswLSznbTZw7ACBoJ/i1TPUFtbG24TXMdo9gbh0jxp0iTy8/NZtGgRgwcPdrVu0872EeoqqThgInAWkEWzuQxVdecxwUXMum1vYDQ7x9q1a1m/fj2zZs0CYPDgwaxbt464uDhX6m+OaWf7CHWV1EJgCrAROBV4DTgUWOeIVWHGrNv2Bkaz/ezdu5dJkyYxZswYFixYwPr16wN54XAWYNrZTkJ1GGOA81R1AdBo/RwBnOmIVWEmKSkp3Ca4jtHsDZzUvHz5cnJzc3n22Wfp3r07v//97zn55JMdqy9UTDvbR6ihQQ7gv6uk9olIsqr+W0RObOuiaCWUGPuxhtHsDZzQXFhYyMyZM1m+fDngH35avHgxAwYMsL2uzmDa2T5C7WF8CQyyPn8I3CYiNwG7HLEqzJSVlYXbBNcxmr2BE5oXLlzI8uXL6dGjB3PnzmXFihUR4yzAtLOdhNrDuB7Yb32eBjwIpALXOGFUuOnVq1e4TXAdo9kbBNPc1b25b775ZsrLy5k1axZ9+vTpinmOYNrZPtrtYVgrpH4AfAagqptVdaiqnqSqbztiVZiprKwMtwmuYzR7g2Cau+IsTumTRnp6On/+858j0lmAaWc7abeHoaqNInKvqi51xIIIxGy44g2M5u/S3t7c//nPf5g6dSrvv/8+AA8++CAX//RiW+1zAtPO9hHqHMZrInKBIxZEIGbdtjcwmkOjvr6ehQsXcuaZZ/L++++Tk5PD0qVLufjiyHcWYNrZTkJ1GN2Av4nImyLyuIg81nQ4YlWYMeu2vYHR3D6ffPIJP/nJT7jjjjvw+XyMHTuW9evX87Of/cwhC+3HtLN9hDrpvQWY54gFEYhZhucNjOb22bx5M59++il9+/blnnvuYejQoc4Y5iCmne0jJIehqrc6UnuEYjZc8QZGc3AKCgoCQxpjxoyhurqaiy++mJSUFKfNcwTTzvYR6pCUpygvLw+3Ca5jNHuDtjRXVlYyc+ZMTjzxRDZv3gz4t0791a9+FbXOAkw720moQ1KeIisrK9wmuI7R7A3a0jxkyBB27txJfHw8//znPznqqKNctMw5TDvbh+lhBME8kXgDoxlKS0sDn3fu3MnAgQNZu3YtY8eOdds0xzDtbB/GYQShvr4+3Ca4jtHsDZprXrduHbm5uYHz2bNns3r1an70ox+FwzTH8Ho720nIDkNErhSR1SLyL+v8DBEZ04Hr40RknogUiUiliLwkIq32m0TkQBF5UkRKRKRCRD4SkYNDra8rmHXb3sDrmg8++GCqqqoC51OmTCE+PvZGqb3eznYSksMQkduBa4GlwOFW8i7gdx2o6yb8IdFPxb+XBsBfWqkvCVgD+ICjgJ7AWKAqWHm7Meu2vYHXNKsqzz//PKoKwBFHHMHq1avDbJXzeK2dIfzvYVwJDFLVPSJyr5X2FXBEB+qaAMxR1a8ARGQmkC8ih6nqthZlr8DvJK5V1aa+1ecdqIuGhgby8/MD55mZmWRmZoZ0bTSvCOksRnN0E3oAwR/xl0c/ctyeSCKW2jlUnNIcqsOIB5pmUdT62YMQn/hFJB3oiz80uv8mqltFpAI4DtjW4pKzgS+AB0VkBFAEPKSqC0O0lz179jBmzH9HzCZOnMjs2bMpKCggJSWFuLg4KioqyM7OprS0FFUlOzubwsJCRISSkh
KqqqrIycmhqKgIESEzM5OioiLS0tJobGykurqa3r17U1BQQEJCAunp6RQXF5Oeno7P56OmpiaQn5iYSGpqKiUlJWRkZFBTU0NtbW0gPykpieTkZMrKyujVqxeVlZX4fL5AfnJyMomJiZSXl5OVlUV5eTn19fWB/PY09ejRA6BVTd26daO4uDimNLXXTnFxcezZsycmNHUlgCDAiQcdwPbt2yNKk11/ewkJCezevTumNLXXTomJiezatatTmtpCmrqnbRbyhwCpBm4AClU1U0QWACmq2m6IcxHpA+wAjlDVr5ulbwdmqepTLcq/CQwDrgMewO9U/g5MVdWn2zUYyMvL0+a7TnWkh7F9+3b69esXUtlYwWiOboY/8i/guwEEv/zyS6ZOncoHH3wAwOjRo5k0aRInnNB2kMFYI5baOVS6onnTpk0fDhs2bFCwvI7sh/E0/l5Gd6tn8A4wLsTrm2LtprdI7wkEezSqBL5V1cXW+T9F5Cn8cyAhOYz4+Hj69+8fonnfJTs7u1PXRTNGc2yxfv16Ro0ahc/n46CDDmL+/Pmcf/757Nu3L9ymuU4st3NrOKU5pElvVS1X1Z8BA4DTgWNU9SJVDWmxr6ruxd/DCGzpKiJHAGnAJ0Eu+Yj/Dn1951ah1NdVmq9N9wpGc2xx0kknMWDAAC6//HLWr1/P+eefD8S25tYwmu0jpB6GiMwHnlbVfwHfdrKuh4AbReQtoAS4G1gVZMIb4Amr7CTgz8CP8K+S+m0n6+4QoQzTxRpGc2xQUlJCr169SExMZNWqVRxwwAHfyY9Fze1hNNtHqO9hJAN/F5EvReRWETmyE3XdBSwHPsDvdOKwhrREZKyIBCbQVXU7cAEwHv+Q1YvAbFV9rhP1dhjThfUGsaL53XffDXy++eabA59bOguIHc0dwWi2j1CHpCYBB+OfyxgAbBKR90VkSqgVqWqjqk5X1SxVTVXV0apabOU9rao9WpR/W1VPUNUUVR2gqveHLqtrFBYWulVVxGA0Rx8VFRVMmzaNiy66KJA2YcKENq+Jds2dwWi2j5Df9La+8F9X1cuBY/BPgC9yxKowE8rysljDaI4u3njjDU477TQef/xxEhISAuknnXRSm9dFs+bOYjTbR0dCgySLyKUi8grwHyv5KkesMhgMrbJlyxYuvfRSdu3axYknnshbb70VbpMMHiHU0CDPAoXAFOBN4HBVHa6qTzhoW9hoHl/HKxjN0cOAAQOYNGkSc+bMYdWqVRxzzDEhXxutmruC0Wwfob6H8Rlwc/OX7mKZnJyccJvgOkZz5LJ7925mzJjB1VdfzVlnnQXAnDlzOnWvaNFsJ0azfYQ66X2nV5wFQFFRUbhNcB2jOfJQVZYuXUpubi4rV67k97//fZeXS0a6Zicwmu2j1R6GiHyqqj+2Pn9NKy/NqWpHAhBGBSISbhNcx2iOLL7++muuu+468vLyADjvvPOYP39+l22OZM1OYTTbR1tDUs1fkhvvSO0RSqgxp2IJozkyaGxs5M9//jN//OMfqampISsri7vuuotRo0bZ8iUQiZqdxmi2j1aHpFT1nWanaaq6puUBpDpiVZgxXVhvEImaKysruffee6mpqeHiiy9m/fr1jB492rYnxkjU7DRGs32EOun9JPBykPTHgGX2mRMZpKWlhdsE1zGaw4fP5wP8vIoNAAAgAElEQVQgMTGRnj17smTJEgCGDx9ue12RotlNjGb7aHPSW0T6ikhfoJuI9Gk6t46hQJ0jVoWZxsbGcJvgOkZzePjwww85++yzmT9/fiBt+PDhjjgLiAzNbmM020d7q6S2AV/jjyW13TpvOp4HOre2L8Kprq4OtwmuYzS7y759+7j11lv56U9/yr///W9effXVQE/DSUw7ewOnNLfnMBKA7sAG63PTEa+qB6rqA45YFWbMpvHeIFya8/LyOP3007n/fn94tMmTJ7N27VoSExMdr9u0szdwSnObDsOKH9WgqkOsz03HfkesiRDMpv
HewG3NdXV1XH/99YwYMYJt27ZxzDHH8MYbb3D77bcHjSzrBKadvYFTmtt6D+M1Vb3Q+vwWrb+HcY4jloWR5sHcvILR7DyJiYns3LmThIQEpk+fztSpU13pVTTHtLM3cEpzW6ukmu898VSrpWKQ9PSWO8nGPkazMxQXF7Nv3z769u2LiLBo0SKqqqo4+uijHa87GKadvYFTmlt1GKq6tNnnRx2pPUIpLi4mJSUl3Ga4itFsL6rKSy+9xE033cQPfvADVqxYQbdu3Tj00EMdqS9UTDt7A6c0hxqt9n9F5Gjr8wARWSMiq0XkB7ZbFAGYJxJv4JTmnTt3ctlllzFhwgRKS0tJTEykoqLCkbo6imlnb+CU5lD3w/gjUGZ9XgB8AmwEYnKVlBvLGyMNo7nr7N+/nyeeeILTTjuN1atXk5aWxuLFi3n55Zfp2bOnrXV1FtPO3sApzaG+6X2gqhaKSHfgDOBioAGIyXfua2pqwm2C6xjNXUNVueSSS1izZg0AF154IXPnzuWggw6yrQ47MO3sDZzSHGoPo0REjgDOA/6pqnVAIhCTYSDNum1vYKdmEeGMM84gOzubxx57jKVLl0acswDTzl4hLO9hNOMOYBPwONAUw2AY/qGpmMOs2/YGXdX8+eefs2rVqsD5tddey4YNGxg5cmTEhtQ27ewNXH8Pozmq+qiIPG99rrSS/wn8whGrwozba+MjAaM5dOrq6liwYAH33HMPKSkpbNiwgZycHOLj48nIyLDZSnsx7ewNnNIc6hwG+HsjF4rIIcC3wEpV3euIVWEmNTUmo7a3idEcGh988AFTpkxh8+bNAIwZM4bk5GS7TXMM087ewCnNoS6rPQXYClwHnAJMBfJF5FRHrAozJSUl4TbBdYzmtqmurubmm2/mvPPOY/PmzRx55JG89tprzJs3L6rCZ5t29gZOaQ61h7EEmKqqTzcliMgvrPSYcxqRPqzgBEZz24wfP55Vq1YRFxfH5MmTmTFjRlT1LJow7ewNnNIcqsM4Cni2RdpzwP32mhMZ1NTURNVTox1EuuZbVm1l4zdhfPlt2EwGDZsJwDpg3dNfhs+WLhDp7ewERrN9hLpKKh8Y0yJtNPCVveZEBrW1teE2wXUiXXNYnUWUcEqf9r8gIr2dncBoto9QexjXAytEZAr+jZQOA44Bfu6IVWHGrNuOXFaPP8G2e9XV1dG9e/fvpO3Zs4ebbrqJZcv8Ow+/9tpr5Obm2lZnuImWdrYTo9k+QuphqOo6oD/wCPA58DAwQFXzHLEqzJh1296guWZV5bnnniM3N5dly5aRkpLCXXfdxamnxtYUndfb2SuE9T0MAFUtFpHXgYOBXapa7IhFEUBSUlK4TXAdL2veuXMnN9xwA2+++SYAQ4cO5Z577qFv377hNM8RvNzOXsIpzaEuqz3U2kTpW2AN8K2IvC0ifUKtSETiRGSeiBSJSKWIvCQiWSFc9xsRURG5JdS6uko0rn7pKl7WfN999/Hmm2/Ss2dP7r//fl566aWYdBbg7Xb2Ek5pDnXS+0n8Q1GZqpoJ9AI+BZa2edV3uQkYgX8ZbtOmAH9p6wIR6QdMs+pyjbKysvYLxRhe09zQ0BDQfPPNN3PFFVewfv16LrvssogN62EHXmtnMJrtJNQhqZOB81S1HkBVy0VkOh2LVjsBmKOqXwGIyEz8L/8dpqrbWrnmUWAW8JsO1AP4vxDy8/MD55mZmWRmZoZ0ba9evTpaXdTjhuawL43F/3dx33338cILLwQmttPS0li0aFFY7XIL87ftDZzSHKrD2AicBGxoljbQSm8XEUkH+gIfNqWp6lYRqQCOA7YFuWYisE9VnxORDjuMPXv2MGbMf1cCT5w4kdmzZ1NQUEBKSgpxcXFUVFSQnZ1NaWkpqkp2djaFhYXU19eTkZFBVVUVOTk5FBUVISJkZmZSVFREWloajY2NVFdX07t3bwoKCkhISCA9PZ3i4mLS09Px+XzU1NQE8hMTE0lNTaWkpI
SMjAxqamqora0N5CclJZGcnExZWRm9evWisrISn88XyE9OTiYxMZHy8nKysrIoLy+nvr4+kN+eph49egC0qqmhoYHa2lpHNXXVWRyXncg333wTsqaW7fTOO+8wZ84cvvjiCwBeeuklRo8eHVXt1NW/vf3791NdXR1TmtprJ1WNOU3ttZOqUlFR0SlNbSGq2n4hkfuAscBy4BugD3AR/r2+A70MVZ3TyvV9gB3AEar6dbP07cAsVX2qRfm+wLvAYFX9VkTeBt5U1TvaNdYiLy9Pm0/8dKSHsX37dvr16xdqVTGBG5qHP/IvwN6lsaFQW1vL/PnzWbx4MY2NjfTp04eFCxfSv39/084ewGjuGJs2bfpw2LBhg4LlhdrDSAdW4N//omk2cAXQ0zoA2vI8TRFuW+4b2BMI9tj5CHCHqn4bon3fIz4+nv79+3fqWrNuO3bYuHEjkydPZsuWLYgIEyZM4JZbbqFHjx7U1dWF2zzXidV2bguj2T5CDW/+y65Uoqp7RWQHcCLwEYC1IVMawffUOBc4SUTutM7TgZNF5KeqekZXbAmFgoICzz2RxKrmb7/9li1btjBgwAAWL17M4MGDA3mxqrktjGZv4JTmjoQ37yoPATday3NLgLuBVa1MeLdcrvsCkId/P3HHMcvwopsdO3YElsWOHDkSn8/HiBEjvrc2PZY0h4rR7A3CvazWDu7CPwfyAf73OeKAcQAiMlZEqpoKqurO5gdQB1SoaqEbhpoNV6KTsrIyJk2axMknnxyY2BYRLrnkkqAvMsWC5o5iNHsDpzS75jBUtVFVp6tqlqqmquroprfFVfVpVW11il5Vh3ZkwrurlJeXu1VVxBDtml999VVyc3N59tln6datG5999lm710S75s5gNHsDpzS7OSQVNWRltfsCeswRrZoLCwuZOXMmy5cvB2Dw4MEsXryYAQMGtHtttGruCkazN3BKc8g9DBE5W0QeFJFl1vmJInKWI1aFGfNEEh2sXbuW3Nxcli9fTo8ePZg3bx4rVqwIyVlAdGruKkazN3BKc6ixpK7F/9b1N8DZVrIPuLPVi6KY+vr6cJvgOtGo+cgjj8Tn8zFs2DDee+89rrrqKrp1C32UNRo1dxWj2Rs4pTnU/65pwE+seYT9Vtq/gR86YlWYMeu2I5P9+/fz8ssvs3+//0+wX79+vPXWWzz//PMceuih7Vz9faJBs90Yzd4grPthAKn4N06C/76gF4+/lxFzmPj5kcfmzZu54IILuOqqq1i69L8xLwcMGNDpYIGRrtkJjGZv4JTmUB3GOmB6i7RJwDv2mhMZpKSkhNsE14lUzfX19SxcuJCzzjqLjRs3kpOTY9vTU6RqdhKj2Rs4pTnUVVKT8W/RejWQKiKf4+9dXOCIVWEmLi4u3Ca4TiRq/vjjj5kyZQqffuqPbj9u3DjmzJlDz54927kyNCJRs9MYzd7AKc2hhgb5VkROAnLxx5L6Blivqo2OWBVmKioqyMjICLcZrhJpmtetW8eoUaNobGykX79+LFq0iKFDh9paR6RpdgOj2Rs4pbkjW7Tuxx9B9l3brYgwsrOzw22C60Sa5lNPPZUf//jHDB48mFmzZjnSxY40zW5gNHsDpzSH5DBE5GtaiUarqkfYalEEUFpaygEHHBBuM1wl3JorKyuZO3cukydP5sADDyQhIYG///3vjoZ1CLfmcGA0ewOnNIfawxjf4vwg/PMaz9prTmQQyh4hsUY4Nb/xxhvccMMNfPvttxQUFPDwww8DzscAMu3sDYxm+wh1DmNNyzQRWQOsBO6x26hwY7qw7lBaWsqsWbN47rnnABg4cCDXXXeda/WbdvYGRrN9dCX4YA0Qc8NR4I9P5DXc1KyqLFu2jNzcXJ577jmSkpKYPXs2q1ev5thjj3XNDtPO3sBoto9Q5zBua5F0AHAhsNp2iyKAUPa2jTXc1Pyf//yHq666ClVlyJAh3HPPPRx55JGu1d+EaWdvYDTbR6hzGC2juVUD9w
NP2GqNIWZpPqZ61FFHccMNN3DIIYdw+eWXdyj+k8FgCB/tOgwRiQPeAJ5X1VrnTQo/VVVV9OrVK9xmuIqTmrdt28b1118PF/63ozpr1ixH6uoIpp29gdFsH+0+2lkv593rFWcBkJOTE24TXMcJzY2NjTzwwAOcfvrpvPNO5EWRMe3sDYxm+wh1LOA1EYnJMCDBKCoqCrcJrmO35i+//JLzzz+fWbNmsW/fPkaPHm3r/e3AtLM3MJrtI9Q5jG7A30RkHf6wIIEBaVX9tROGhZPORj+NZuzS7PP5WLx4MQsWLMDn83HQQQcxf/58zj//fIY/8i9b6rAL087ewGi2j1AdxhZgniMWRCCZmZnhNsF17NJcU1PD448/js/n4/LLL+f2228nPT3dlnvbjWlnb2A020ebDkNELlPVZ1X1Vkdqj1CKioro169fuM1wla5o3rdvH926dSMpKYn09HTuu+8+4uPjOfPMM2220l5MO3sDo9k+2pvDeND2GqOAtLS0cJvgOp3V/O6773LmmWcyd+7cQNo555wT8c4CTDt7BaPZPtpzGN4b/MO/usdrdFRzRUUF06ZN46KLLuKrr77izTffxOeLrg0YTTt7A6PZPtpzGHEicraInNPa4YhVYaa6ujrcJrhORzSvXr2a0047jccff5yEhARmzpzJm2++6XiwQLsx7ewNjGb7aG/SuzvwKK33NJQYjCdlNo0PTm1tLVOnTuWFF14A4MQTT2TJkiUcc8wxTpvnCKadvYHRbB/t9TCqVfUIVT28lSPmnAWYTeNbo3v37lRUVJCcnMwf/vAHVq1aFbXOAkw7ewWj2T5C3nHPSyQkJITbBNdpTfOuXbuoq6vj8MMPR0RYsGABtbW1HHFE9D8rmHb2BkazfZhJ7yBE6nsDTtJSs6ry5JNPkpubyzXXXBOYRDv44INjwlmAaWevYDTbR5sOQ1VTHak1wikuLg63Ca7TXPPXX3/NyJEjuf7666msrKRXr14xOXHo9Xb2CkazfbgWV1pE4kRknogUiUiliLwkIlmtlL1ARNaKSLGIlIlInoic4ZatXn0iaWxs5P777+f0008nLy+PrKwsHnnkEZ5++umYXMvu1Xb2Gkazfbi5EcFNwAjgVOBQK+0vrZTNAO4F+gPZwDPA6yLSx2kjgah7n8AO6urqGD16NLfeeis1NTVcfPHFrF+/ntGjR8dsLB4vtrPR7A2c0uzmpPcEYI6qfgUgIjOBfBE5TFW3NS+oqk+3uPYBEZkDDMIf/LBdGhoayM/PD5xnZmaGHF+lpqYmpHKxRG1tLcOHD2fr1q0sXLiQ4cOHh9skx/FiOxvN3sApza44DBFJB/oCHzalqepWEakAjgO2tXP9cUAv4LNQ69yzZw9jxowJnE+cOJHZs2dTUFBASkoKcXFxVFRUkJ2dTWlpKapKdnY2hYWFJCUlUVJSQlVVFTk5ORQVFSEiZGZmUlRURFpaGo2NjVRXV9O7d28KCgpISEggPT2d4uJi0tPT8fl81NTUBPITExNJTU2lpKSEjIwMampqqK2tDeQnJSWRnJxMWVkZvXr1orKyEp/PF8hPTk4mMTGR8vJysrKyKC8vp76+PpDfnqamLRuba/rss8+oqqrilFNOITk5mTFjxjB06FD69+/P9u3bbdfURF1dnWOaOtJOBxxwAHv27In4drLzby8lJYXCwsKY0tReO6WmprJ79+6Y0tReO6WlpbFr165OaWrzu7j51plOYQ0l7QCOUNWvm6VvB2ap6lNtXHsgsA74m6reFGqdeXl5mpSUFDjvSA9j+/btMR2sbN++ffzxj3/kz3/+MykpKaxfv576+nrHNTeFN189/gRH6wmVWG/nYBjN3qArmjdt2vThsGHDBgXLc2tIqtL62XImpidQ0dpFInIw/u1hVwO/60iF8fHx9O/fvyOXBIi2EBcdIS8vj6lTp7Jt2za6devGlVdeSUZGBnv37g23aa4Ty+3cGkazN3BKsysOQ1X3isgO4E
TgIwAROQJIAz4Jdo2IHAasAV5W1elu2NlEamrsrSauqKjgtttuY+nSpQAcc8wx3HvvvZxwgv9p34sB2mKxndvDaPYGTml2c5XUQ8CNInK4iKQBdwOrWk54A4jI0fiHoZ5121kAlJSUuF2l44wfP56lS5eSkJDA7373O9auXRtwFhCbmtvDaPYGRrN9uOkw7gKWAx8A3wJxwDgAERkrIlXNyt4IHAJcJyJVzY6xbhiakZHhRjWucvPNN3Pqqafy9ttvM2PGjO91WWNRc3sYzd7AaLYP1xyGqjaq6nRVzVLVVFUdrarFVt7TqtqjWdkrVVVUtUeLo+VyW0eI9mV4qsoLL7zAjBkzAmkDBw5k5cqV/PCHPwx6TbRr7gxGszcwmu3DBB8MQm1tbbhN6DQ7d+5k+vTprF69GoDRo0eTm5sLtL0xfDRr7ixGszcwmu3DzSGpqCEa4+fv37+fxx9/nNNOO43Vq1eTlpbGkiVLGDx4cEjXR6PmrmI0ewOj2T6MwwhCtMXP37p1KyNGjGDatGlUVVVxwQUXsH79esaNGxdyWI9o02wHRrM3MJrtwwxJBaH5C3/RwMMPP8y7775LdnY2d999NyNGjOhw/Kdo02wHRrM3MJrtwziMICQnJ4fbhHapq6uje/fuAMyaNQsRYcaMGSG/zd6SaNBsN0azNzCa7cM4jCCUlZVFbDjvuro65s+fzyuvvMJbb71FSkoKqamp/OlPf+rSfTui+ZZVW9n4Tasv6EcNkdzOTmE0ewOnNJs5jCA0D5QXSWzcuJGzzjqLBQsWkJ+fz5o1a2y7d0c0d8VZnNIncv5xI7WdncRo9gZOaTY9jCBUVlaGFLnRLaqqqrjzzjt56KGHUFWOPPJIlixZElguawed0RwpQQQ7S6S1sxsYzd7AKc3GYQQhkjZcWbduHb/97W/ZsWMHcXFxTJ48mRkzZtg+RhlJmt3CaPYGRrN9GIcRhEhat11WVsaOHTv48Y9/zJIlSzj++OMdqSeSNLuF0ewNjGb7MHMYQQj3uu3//Oc/gc8XXXQRjz/+OG+++aZjzgLCrzkcGM3ewGi2D+MwghCuZXh79uzhyiuv5PTTT+fTTz8NpI8YMYKEhARH6zZLD72B0ewNnNJsHEYQ3N5wRVX561//yuDBg3nllVfo3r07W7duddUGs8mMNzCavYFTmo3DCEJ5eblrdX3zzTdcfPHFXHvttezdu5ezzz6bd999l5EjR7pmA7irOVIwmr2B0WwfZtI7CFlZWa7Us3r1asaPH09VVRU9e/bkzjvv5NJLL+1wWA87cEtzJGE0ewOj2T5MDyMIbj2RHHPMMYB/Ynv9+vVcdtllYXEWYJ7CvILR7A1MD8NF6uvrHbvvCy+8wKWXXkq3bt049NBDWbduHX379nWkvo7a5jWMZm9gNNuHcRhBcGIN8yeffMKUKVP45JNPqK6u5uqrrwaICGcBZq26VzCavYF5D8NF7FzDXFtbyx133MGwYcP45JNP6NOnD/3797ft/nZh1qp7A6PZG5j9MFwkJSXFlvts2LCBqVOnsmXLFkSECRMmcMstt0REXJvgEWdLw2JLuLCrnaMJo9kbOKXZOIwgxMXFdfke77zzDqNHj0ZVGTBgAIsXLw55u1Q36Gp48kiKOttZ7GjnaMNo9gZOaTYOIwgVFRVkZGR06R5Dhgxh0KBBnHHGGUyfPj1id/1qiji7fft2+vXrF2Zr3MWOdo42jGZv4JRm4zCCkJ2d3eFrysrKuPPOO5k2bRoHHXQQ8fHxvPbaa8THR8evuDOaox2j2RsYzfZhJr2DUFrasbH8V199ldzcXB577DFmzZoVSI8WZwEd1xwLGM3ewGi2j+j5RnMRVQ2pXEFBATNnzmTFihUADB48mN/97ndOmuYYoWqOJYxmb2A024fpYQShve6cqvLMM8+Qm5vLihUr6NGjB/PmzWPFihUMGDDAJSvtxXTbvYHR7A3MkJSLFBYWtpm/efNmpkyZQnl5OcOGDe
O9997jqquuolu36P11tqc5FjGavYHRbB9mSCoIwd6TUNVAnKejjz6aG2+8kb59+/K///u/YYv/ZCeR8G6I2xjN3sBoto/ofSR2kc2bN3P++eezevXqQNqMGTO45JJLYsJZGAwGQygYhxGEqqoqwB/Aa8GCBZx11lls3LiRuXPnxuwEWpNmL2E0ewOj2T5ccxgiEici80SkSEQqReQlEWk1aLuInCcin4tIjYh8JiLD3bI1JyeHjz76iGHDhnHnnXfi8/kYN24cL774Ysz2KHJycsJtgusYzd7AaLYPN3sYNwEjgFOBQ620vwQrKCJHAH8D/gSkWz9fFpHDnDaypqaGWbNmce655/LZZ5/Rr18//va3v7FkyRJ69uzpdPVho6ioKNwmuI7R7A2MZvtwc9J7AjBHVb8CEJGZQL6IHKaq21qUvQL4UFWfss6fFpFrrPTbQ6msoaGB/Pz8wHlmZiaZmZntXjfi6S/hh+M44U/jAml/zIc/5v8rlGqjlljtObWF0ewNjGb7cMVhiEg60Bf4sClNVbeKSAVwHLCtxSXHNy9rsclKD4k9e/YwZsyYwPnEiROZPXs2BQUFpKSkEBcXR0VFBdnZ2ZSWlqKqnluvfVx2Ivv27aOoqIju3btTXFxMdXU1vXv3pqCggISEBNLT0ykuLiY9PR2fz0dNTU0gPzExkdTUVEpKSsjIyKCmpoba2tpAflJSEsnJyZSVldGrVy8qKyvx+XyB/OTkZBITEykvLycrK4vy8nLq6+sD+W21U2FhYWAlSFVVFTk5ORQVFSEiZGZmUlRURFpaGo2Nja1qSkpKYs+ePTGlqb12Sk5OprCwMKY0tddOKSkp7N69O6Y0tddOPXr0YNeuXZ3S1BbixiSuiPQBdgBHqOrXzdK3A7Oa9SSa0tcA61T1983SbgeGqOpPQqkzLy9Pmwf8C7WHAd4MxGc0ewOj2Rt0RfOmTZs+HDZs2KBgeW4NSVVaP9NbpPcEgsXZruxA2aDEx8d3eqOitLToD93dUYxmb2A0ewOnNLsy6a2qe/H3ME5sSrMmttOAT4Jc8nHzshYnWOmO09jY6EY1EYXR7A2MZm/glGY3V0k9BNwoIoeLSBpwN7AqyIQ3wFJgkIhcJiIJInIZcBLwpBuGVldXu1FNRGE0ewOj2Rs4pdlNh3EXsBz4APgWiAPGAYjIWBEJvGmiqluB0cAt+IehbgFGteJcbMdsGu8NjGZvYDTbh2sOQ1UbVXW6qmapaqqqjlbVYivvaVXt0aL831X1WFVNtn6uDn5n+zGbxnsDo9kbGM32YUKDtKC0tJSHHnrIU5uuGM3ewGj2Bk5qNg6jBaWlpTzwwAOe+wMzmmMfo9kbOKnZOAyDwWAwhIQrL+6FgzVr1hQB2zt6XUNDQ1xpaWlOZmZmYXx8vCfW4xnNRnOsYjR3SnO/YcOGBQ17EbMOw2AwGAz2YoakDAaDwRASxmEYDAaDISSMwzAYDAZDSBiHYTAYDIaQMA7DYDAYDCFhHIbBYDAYQsI4DIPBYDCEhHEYBoPBYAgJ4zAMBoPBEBKedBgiEici80SkSEQqReQlEclqo/x5IvK5iNSIyGciMtxNe+2gI5pF5AIRWSsixSJSJiJ5InKG2zZ3lY62c7PrfiMiKiK3uGGnXXTi7/pAEXlSREpEpEJEPhKRg920uat0QvN0Edlqld0iIte6aa8diMil1v9khYg0hFB+kIhsFJF9lvZxna3bkw4DuAkYAZwKHGql/SVYQWsr2b8Bf8K/z/ifgJdF5DDHrbSXkDUDGcC9QH8gG3gGeF1E+jhtpM10RDMAItIPmAZ86qxpjtCRv+skYA3gA44CegJjgapg5SOYjmj+OXA7MFZVU4HLgXkicq4bhtpIGfB/wHXtFRSRdOB14CX8/9fXAH8WkdxO1ayqnjvwByW8qtn5kYAChwUpezuQ1yItD/h9uHU4pbmV64vw73oYdi1OagbeBC4B3gZuCbcGp/QCE4FvgIRw2+2i5h
uA91qkrQemh1tHJ7UPBRraKXMlsAMrbqCV9hfg8c7U6bkehuVx+wIfNqWpf0vYCuC4IJcc37ysxSYrPSrohOaW1x8H9AI+c8pGu+mMZhGZCOxT1edcMdJGOqH3bOAL4EFrSOpLEbnBFWNtohOa/wqkicgQEelmDbP+APi7G/aGieOBTWp5CotOf3/F22JSdJFm/Sxvkb63WV5zUlspe6zNdjlJRzUHEJEDgReBuaq6xQHbnKJDmkWkL/694wc7bJdTdLSNs4Bh+Ic1rsH/Bft3ESlU1acds9JeOqp5D/6/5bf473D8daoaNQ9CnaC17682/+9bw3M9DKDS+pneIr0n/ieTYOVDLRupdFQzANYE6FvAauB3zpjmGB3V/Ahwh6p+66hVztGZv+tvVXWxqvpU9Z/AU/jnA6KFjmq+FfgFMBBIwP+Ufb2IXOWYheHH1u8vzzkMVd2Lf0zvxKY0a2I7DfgkyCUfNy9rcYKVHhV0QjPWpH4e8Lqq/rZFlzbi6YTmc4E/WivDioEhwO9EJM8Ne7tKJ/R+hH+s/3u3csRAB+iE5pOAl1X1C/XzObAM+Jkb9oaJj/F/XzWn899f4Z64CdNk0SxgM3A4/j+uFw3KnQkAAAYlSURBVIC/t1L2SGAfcBn+p5LLgGpCnCyOlKODmo8GduJ/4g677S5pPrTFsR6YC+SEW4dDevtZf9eTgDj8T9tFwCXh1uGg5t9ZZQdY5z8EtgK3hltHBzXHAUnAcKDB+pxEs4ntZmV7Wu06A0jEPwxZBeR2qu5wiw/jL3w+UIy/y/Y3IMvKGwtUtSh/HvA5UGP9HB5uDU5qBh7H/6RZ1eIYG24dTrZzi2vfJvpWSXX073oo8C/8D0BbgEnh1uCkZvxztncB26y/5x3AAqJspRjwK+v/s+VxGHCGpa1vs/InAxut76+vgHGdrdts0WowGAyGkPDcHIbBYDAYOodxGAaDwWAICeMwDAaDwRASxmEYDAaDISSMwzAYDAZDSBiHYTAYDIaQMA7DEHOIyFMiMjvcdrSHiFwhIq+3kT9URD530yaDoS2MwzBELCKyzdq0qqrZEVUb/LSFqj6pqucDiEi8tWnTYc3y31bVsAe5DGabwZsYh2GIdC5S1R7Njl3hNijWEBEvRq02dALjMAxRh7WXwYsiUiAie0XkbRH5YStlDxSRlVa5UhH5R7O8Q0XkZWt7z69FZFIbdT4lIveLyBpre8+3mu9AKCKni8g/RaTc2g7z1GZ5V1m9pUoR+UpELrXSx4vI21axJrs+t3pS/yMiPxGRbVbZW0Tkry1sul9EFlqfe4rI4yKyW0R2isgcEQn6/y0id4jIcyLyrIhUAuNEJFdENli/p90iskREElqzzbrPz0XkY+uadSLyo9Z+f4bYwDgMQ7SyAhgA9Ma/sVNrW6/OwB8/J9sqeyv494K27vEBcAj+aLUzRGRYG3WOA27Dv5fEF011in8P6dfwxyXqBSwBVopIhoikAQuBc9W/LegQgkdSPdP6eazVk3qpRf4zwM9EJMWqMx642EoHf2jyGvzBMgcBF+Lfba01RlnXpgPP4Q9iN9XSNgR//LSJrdkmIicDDwPjLc2PAa+ISGIbdRqiHOMwDJHOMusJdq+ILANQ1f2q+oSqVqpqLTAbOKnpy7QF9cDB+IOx+VT1HSt9MJCmqn+00vOBR4FL27Bluaq+q6p1wM3AmSJyEHAR8LmqPquqDar6FH4ndaF1nQI/EpEkVd2tql909Jegql/hd4xN+1WcC+xV1X+KyCH4o5Ber6r7VLUAuKcdLetUdbn1u6xR1Q9U9X3L/q+Ah4Cz2rh+AvB/1nWNqvqYlX5yR7UZogfjMAyRzkhV7WkdI8HfOxCRudbwTgWQb5XNCnL9Xfj3fV4jIltFZIaV3g/o28wZ7QVm4u+FtMY3TR9UtRz/TmYHW8f2FmW3A4eoagX+kPiTgAIRWSEiP+iA/uY8Y90L/BsBNe2M1w/oDhQ203I/kBOKFgAROV
pEXrOG+SqAOQT/fTbRD7ixxe/vIPy9NUOMYhyGIRq5HLgAOAf/kEp/K11aFlTVClW9XlUPA0bi/5I7C/8X5pZmzqinqqaq6kVt1Nt8ziLdqnuXdfRrUbYv8K1lw+uq+hP8X6j5wINB7h1K2OjngJ+IyKH4expNw1Hf4N/bIrOZljRVbWu/9pb1PYi/B9NfVdPwD71JK2Wb6ry9xe/vAFV9PgQdhijFOAxDNJIK1AElwAHAna0VFJGLRORIERH8PYJG61gP+ERkmogkWb2WH4vISW3Ue5E1OdwduAP/sM5u/HMhx4rIJdYS1F/gd2IrReQgy4YDAB/+vScaW95YVRstPUe0VrmqFgLr8O9XslmtPdZV9RvgHWC+iKRZiwL6i8iZrd0rCE17P1dbCwia5i9as+0hYJKInCx+elg6gw0LGmIE4zAM0cjj/PfJ/nPgvTbKHgWsxb+pzLvAYlVdp6oN+Hspp+DfUKcY/1N2Whv3egq/oygGjgN+CaCqRcDPgRvxf7FeD/xMVUvxb/AzA9ht5Z0G/LaV+/8eeMYa4hndSplngJ/w395FE+OAFPyT8WX4d55ra3itJdOAK/BvQvQg/t5Mq7ap6vvAb4AH/r+9O6YBGIahKGgzL5iiisrFXQLgb5GiOwL29uQs2fPW3oGL+UAJAt39VtU3M8/pXeAUFwYAEcEAIOJJCoCICwOAiGAAEBEMACKCAUBEMACI/LhQ5VuwJFgxAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "minRet = .01 \n", "ptsl=[0,2]\n", "bb_events = getEvents(close,tEvents,ptsl,target,minRet,cpus,t1=t1)\n", "cprint(bb_events)\n", "\n", "bb_bins = getBins(bb_events,close).dropna()\n", "cprint(bb_bins)\n", "\n", "features = (pd.DataFrame()\n", " .assign(vol=bb_events.trgt)\n", " .assign(ma_side=ma_side)\n", " .assign(srl_corr=srl_corr)\n", " .drop_duplicates()\n", " .dropna())\n", "cprint(features)\n", "\n", "Xy = (pd.merge_asof(features, bb_bins[['bin']], \n", " left_index=True, right_index=True, \n", " direction='forward').dropna())\n", "cprint(Xy)\n", "\n", "### run model ###\n", "X = Xy.drop('bin',axis=1).values\n", "y = Xy['bin'].values\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,\n", " shuffle=False)\n", "\n", "n_estimator = 10000\n", "rf = RandomForestClassifier(max_depth=2, n_estimators=n_estimator,\n", " criterion='entropy', random_state=RANDOM_STATE)\n", "rf.fit(X_train, y_train)\n", "\n", "# The random forest model by itself\n", "y_pred_rf = rf.predict_proba(X_test)[:, 1]\n", "y_pred = rf.predict(X_test)\n", "fpr_rf, tpr_rf, _ = roc_curve(y_test, y_pred_rf)\n", "print(classification_report(y_test, y_pred, \n", " target_names=['no_trade','trade']))\n", "\n", "plt.figure(1)\n", "plt.plot([0, 1], [0, 1], 'k--')\n", "plt.plot(fpr_rf, tpr_rf, label='RF')\n", "plt.xlabel('False positive rate')\n", "plt.ylabel('True positive rate')\n", "plt.title('ROC curve')\n", "plt.legend(loc='best')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:py37]", "language": "python", "name": "conda-env-py37-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": 
"ipython3", "version": "3.7.2" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "490px", "width": "355px" }, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "346px" }, "toc_section_display": true, "toc_window_display": true }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }