{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Quickstart or _\"How to get 100% return per year\"_" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, do some initialization and set the debugging level to `debug` to see the progress of computation." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%load_ext autoreload\n", "%autoreload 2\n", "%config InlineBackend.figure_format = 'svg'\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import datetime as dt\n", "import matplotlib.pyplot as plt\n", "\n", "import universal as up\n", "from universal import tools, algos\n", "from universal.algos import *\n", "\n", "sns.set_context(\"notebook\")\n", "plt.rcParams[\"figure.figsize\"] = (16, 8)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Ignore logged warnings\n", "import logging\n", "logging.getLogger().setLevel(logging.ERROR)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try to replicate the results of B. Li and S. Hoi from their article [On-Line Portfolio Selection with Moving Average Reversion](http://arxiv.org/abs/1206.4626). They claim superior performance on several datasets using their OLMAR algorithm. These datasets are available in the `data/` directory in `.csv` format. These are all raw prices and artificial tickers. We can start with NYSE stocks from the period 1/1/1985 - 30/6/2010." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Load data using the tools module\n", "data = tools.dataset('nyse_o')\n", "\n", "# Plot the first three as an example\n", "data.iloc[:,:3].plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we need an implementation of the OLMAR algorithm. Fortunately, it is already implemented in the `algos` module, so all we have to do is load it and set its parameters. The authors recommend a lookback window $w = 5$ and threshold $\\epsilon = 10$ (these are the default parameters anyway). Just call the `run` method on our data to get results for analysis." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Set algo parameters\n", "algo = algos.OLMAR(window=5, eps=10)\n", "\n", "# Run\n", "result = algo.run(data)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, let's see some results. First, print some basic summary metrics and plot portfolio equity with UCRP (uniform constant rebalanced portfolio)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Summary:\n", " Profit factor: 1.89\n", " Sharpe ratio: 3.34 ± 0.19\n", " Ulcer index: 23.24\n", " Information ratio (wrt benchmark): 3.27\n", " Benchmark Sharpe: 1.16 ± 0.21\n", " Appraisal ratio (wrt benchmark): 3.13 ± 0.21\n", " Beta / Alpha: 1.56 / 165.627%\n", " Annualized return: 189.87%\n", " Annualized volatility: 56.79%\n", " Longest drawdown: 185 days\n", " Max drawdown: 46.29%\n", " Winning days: 58.2%\n", " Annual turnover: 339.6\n", " Utility (q=0.5): 122.53%\n", " Utility (q=0.7): 141.77%\n", " Utility (q=1.0): 156.20%\n", " Utility (q=2.0): 173.03%\n", " \n" ] }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "print(result.summary())\n", "result.plot(weights=False, assets=False, ucrp=True, logy=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That seems really impressive; in fact, it looks too good to be true. Let's see how individual stocks contribute to portfolio equity and disable the legend to keep the graph clean." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result.plot_decomposition(legend=False, logy=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, almost all wealth comes from a single stock (don't forget it has a logarithmic scale!). So if we used just 5 of all these stocks, we would get almost the same equity as if we used all of them. To stress test the strategy, we can remove that stock and rerun the algorithm." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Summary:\n", " Profit factor: 1.55\n", " Sharpe ratio: 2.48 ± 0.20\n", " Ulcer index: 12.70\n", " Information ratio (wrt benchmark): 2.36\n", " Benchmark Sharpe: 1.12 ± 0.21\n", " Appraisal ratio (wrt benchmark): 2.21 ± 0.21\n", " Beta / Alpha: 1.48 / 97.131%\n", " Annualized return: 119.24%\n", " Annualized volatility: 48.01%\n", " Longest drawdown: 202 days\n", " Max drawdown: 45.91%\n", " Winning days: 56.5%\n", " Annual turnover: 338.3\n", " Utility (q=0.5): 72.02%\n", " Utility (q=0.7): 85.51%\n", " Utility (q=1.0): 95.63%\n", " Utility (q=2.0): 107.43%\n", " \n" ] }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Find the name of the most profitable asset\n", "most_profitable = result.equity_decomposed.iloc[-1].idxmax()\n", "\n", "# Rerun the algorithm on data without it\n", "result_without = algo.run(data.drop(columns=[most_profitable]))\n", "\n", "# And print results\n", "print(result_without.summary())\n", "result_without.plot(weights=False, assets=False, ucrp=True, logy=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We lost about 7 orders of wealth, but the results are more realistic now. Let's move on and try adding fees of 0.1% per transaction (we pay \\\\$1 for every \\\\$1000 of stocks bought or sold)." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Summary:\n", " Profit factor: 1.35\n", " Sharpe ratio: 1.78 ± 0.20\n", " Ulcer index: 6.79\n", " Information ratio (wrt benchmark): 1.59\n", " Benchmark Sharpe: 1.12 ± 0.21\n", " Appraisal ratio (wrt benchmark): 1.44 ± 0.21\n", " Beta / Alpha: 1.48 / 63.323%\n", " Annualized return: 85.41%\n", " Annualized volatility: 48.08%\n", " Longest drawdown: 382 days\n", " Max drawdown: 50.21%\n", " Winning days: 50.4%\n", " Annual turnover: 338.3\n", " Utility (q=0.5): 38.61%\n", " Utility (q=0.7): 51.98%\n", " Utility (q=1.0): 62.01%\n", " Utility (q=2.0): 73.71%\n", " \n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "result_without.fee = 0.001\n", "print(result_without.summary())\n", "result_without.plot(weights=False, assets=False, ucrp=True, logy=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Results still hold, although our Sharpe ratio decreased from 3.14 to 1.56 and annualized return from 466% to 109%. Now, some of you trained in quantitative finance might start asking: \"_Isn't there some [survivorship bias](http://en.wikipedia.org/wiki/Survivorship_bias)?_\". Yes, there is. In fact, a huge one considering that we have almost 25 years of data and a mean-reversion type of strategy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Testing Yahoo data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see whether the algo works on recent data, too. First, download closing prices of several (randomly chosen) stocks from Yahoo." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import yfinance as yf\n", "\n", "# Load data from Yahoo\n", "yahoo_data_raw = yf.download(['MSFT', 'IBM', 'AAPL', 'GOOG'], start=dt.datetime(2005,1,1), auto_adjust=False, multi_level_index=False, progress=False)\n", "yahoo_data = yahoo_data_raw['Adj Close']\n", "\n", "# Plot normalized prices of these stocks\n", "(yahoo_data / yahoo_data.iloc[0,:]).plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of using fixed parameters, we will test several `window` parameters with the function `run_combination`. It is the same as `run`, just use it as a classmethod and use lists for combinations of values. `run_combination` returns a list of results which can be used similarly to `result`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Summary for window=3:\n", " Profit factor: 1.08\n", " Sharpe ratio: 0.55 ± 0.22\n", " Ulcer index: 1.01\n", " Information ratio (wrt benchmark): -0.23\n", " Benchmark Sharpe: 0.94 ± 0.22\n", " Appraisal ratio (wrt benchmark): -0.29 ± 0.22\n", " Beta / Alpha: 1.06 / -5.508%\n", " Annualized return: 16.55%\n", " Annualized volatility: 30.23%\n", " Longest drawdown: 649 days\n", " Max drawdown: 51.24%\n", " Winning days: 51.9%\n", " Annual turnover: 285.1\n", " Utility (q=0.5): -1.74%\n", " Utility (q=0.7): 3.49%\n", " Utility (q=1.0): 7.41%\n", " Utility (q=2.0): 11.98%\n", " \n", "Summary for window=5:\n", " Profit factor: 1.11\n", " Sharpe ratio: 0.71 ± 0.22\n", " Ulcer index: 1.54\n", " Information ratio (wrt benchmark): 0.04\n", " Benchmark Sharpe: 0.94 ± 0.22\n", " Appraisal ratio (wrt benchmark): -0.04 ± 0.22\n", " Beta / Alpha: 1.07 / -0.757%\n", " Annualized return: 21.53%\n", " Annualized volatility: 30.20%\n", " Longest drawdown: 807 days\n", " Max drawdown: 48.12%\n", " Winning days: 52.5%\n", " Annual turnover: 213.0\n", " Utility (q=0.5): 3.25%\n", " Utility (q=0.7): 8.47%\n", " Utility (q=1.0): 12.39%\n", " Utility (q=2.0): 16.96%\n", " \n", "Summary for window=10:\n", " Profit factor: 1.13\n", " Sharpe ratio: 0.79 ± 0.22\n", " Ulcer index: 1.64\n", " Information ratio (wrt benchmark): 0.14\n", " Benchmark Sharpe: 0.94 ± 0.22\n", " Appraisal ratio (wrt benchmark): 0.09 ± 0.22\n", " Beta / Alpha: 1.04 / 1.749%\n", " Annualized return: 23.52%\n", " Annualized volatility: 29.92%\n", " Longest drawdown: 570 days\n", " Max drawdown: 53.78%\n", " Winning days: 52.3%\n", " Annual turnover: 148.4\n", " Utility (q=0.5): 5.57%\n", " Utility (q=0.7): 10.70%\n", " Utility (q=1.0): 14.54%\n", " Utility (q=2.0): 19.03%\n", " \n", "Summary for window=15:\n", " Profit factor: 1.10\n", " Sharpe ratio: 0.65 ± 0.22\n", " Ulcer index: 1.33\n", " Information ratio (wrt benchmark): -0.08\n", " Benchmark Sharpe: 0.94 ± 0.22\n", " Appraisal ratio (wrt benchmark): -0.12 ± 0.22\n", " Beta / Alpha: 1.04 / -2.307%\n", " Annualized return: 19.35%\n", " Annualized volatility: 29.85%\n", " Longest drawdown: 627 days\n", " Max drawdown: 57.15%\n", " Winning days: 52.0%\n", " Annual turnover: 115.7\n", " Utility (q=0.5): 1.50%\n", " Utility (q=0.7): 6.60%\n", " Utility (q=1.0): 10.43%\n", " Utility (q=2.0): 14.89%\n", " \n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "list_result = algos.OLMAR.run_combination(yahoo_data, window=[3,5,10,15], eps=10)\n", "print(list_result.summary())\n", "list_result.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we don't know the best parameters in hindsight, we will invest equal money in each of them at the beginning and let them run. This is called a _buy and hold_ strategy. Portfolio equities in `list_result` can be regarded as stock prices and used as input for a new algo (_buy and hold_ in this case). This way you can chain algorithms however you like, for example OLMAR on OLMAR, etc.\n", "\n", "To compare it with individual assets or a uniform constant rebalanced portfolio, use the parameters `assets` and `ucrp`." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Run buy and hold on OLMAR results and show its equity together with original assets\n", "algos.BAH().run(list_result).plot(assets=True, weights=False, ucrp=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Ok, so that was enough for the start. There are plenty of other algorithms in the `algos` module collected from research papers about online portfolios, including the famous [Universal portfolio](http://en.wikipedia.org/wiki/Universal_portfolio_algorithm) by Thomas Cover." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# How to write your own algorithm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The entire package is actually pretty simple. Algorithms are subclasses of the base `Algo` class, and methods for reporting, plotting, and analyzing are built on top of this class. I will illustrate it on this mean-reversion strategy:\n", "\n", "1. Use the logarithm of price.\n", "2. Calculate the difference $\\delta_i$ between the current price of the $i$-th stock and its moving average of $n$ days.\n", "3. If $\\delta_i > 0$, assign zero portfolio weight $w_i = 0$ for the $i$-th stock.\n", "4. If $\\delta_i < 0$, assign weight $w_i = -\\delta_i$ for the $i$-th stock.\n", "5. Normalize all weights so that $\\sum w_i = 1$.\n", "\n", "The idea is that badly performing stocks will revert to their mean and have higher returns than those above their mean. Here is the complete code; comments should be self-explanatory." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "from universal.algo import Algo\n", "import numpy as np\n", "\n", "class MeanReversion(Algo):\n", " # Use logarithm of prices\n", " PRICE_TYPE = 'log'\n", "\n", " def __init__(self, n):\n", " # Length of moving average\n", " self.n = n\n", " # step function will be called after min_history days\n", " super().__init__(min_history=n)\n", "\n", " def init_weights(self, cols):\n", " # Use zero weights for start\n", " return pd.Series(np.zeros(len(cols)), cols)\n", "\n", " def step(self, x, last_b, history):\n", " # Calculate moving average\n", " ma = history.iloc[-self.n:].mean()\n", "\n", " # Weights\n", " delta = x - ma\n", " w = np.maximum(-delta, 0.)\n", "\n", " # Normalize so that they sum to 1\n", " return w / sum(w)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's all. Now let's try it on NYSE data." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Summary:\n", " Profit factor: 1.46\n", " Sharpe ratio: 2.13 ± 0.20\n", " Ulcer index: 4.84\n", " Information ratio (wrt benchmark): 1.94\n", " Benchmark Sharpe: 1.16 ± 0.21\n", " Appraisal ratio (wrt benchmark): 1.81 ± 0.21\n", " Beta / Alpha: 1.15 / 32.163%\n", " Annualized return: 50.05%\n", " Annualized volatility: 23.47%\n", " Longest drawdown: 808 days\n", " Max drawdown: 41.29%\n", " Winning days: 52.4%\n", " Annual turnover: 361.7\n", " Utility (q=0.5): 38.84%\n", " Utility (q=0.7): 42.04%\n", " Utility (q=1.0): 44.44%\n", " Utility (q=2.0): 47.25%\n", " \n" ] }, { "data": { "image/svg+xml": [ "\n", "\n", "\n" ], "text/plain": [ "

" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mr = MeanReversion(n=20)\n", "result = mr.run(data)\n", "\n", "print(result.summary())\n", "result.plot(assets=False, logy=True, weights=False, ucrp=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not bad considering how simple that strategy is. The next step could be performance optimization. To profile your strategy, you can use the function `profile` in `universal.tools`, which profiles the code using the fantastic [line_profiler](http://pythonhosted.org/line_profiler/). After identifying the most critical parts of the code, you have two options. Either optimize your `step` function (using tools such as [weave](http://docs.scipy.org/doc/scipy/reference/tutorial/weave.html), [numba](http://numba.pydata.org/), [theano](http://deeplearning.net/software/theano/) or [cython](http://cython.org/)), or subclass the `weights` method if your code can be easily vectorized (beware the forward bias!). " ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 4 }