{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Epsilon Fund - Walk-Forward Validation\n", "Uses `infrastructure/walk_forward/` to run rolling Optuna optimisation and evaluate OOS robustness. Best used after the backtesting framework has turned up a strategy that seems to work, to confirm that its logic is valid.\n", "\n", "---\n", "### Iteration Guidelines\n", "\n", "**Overfitting the iteration process:** Each time you inspect OOS results and adjust parameters, you leak OOS information into your design decisions. Cap yourself at **3–4 iterations** — first run with everything free, second with obvious fixes from CV + plateau analysis, third to tighten remaining params. \n", "\n", "If the strategy still shows heavy overfitting signals after three passes, **the problem is the strategy architecture, not the parameters**.\n", "\n", "**WFE:** Walk-forward efficiency, the ratio of OOS to IS performance (the simplest check).\n", "\n", "**Perturbation degradation:** Check the perturbation table to see whether degradation shrinks across runs.\n", "\n", "| Signal | Meaning | Action |\n", "|--------|---------|--------|\n", "| IS Sharpe drops, OOS Sharpe holds or rises, WFE improves | Removing noise-fitting degrees of freedom | Continue iterating |\n", "| Perturbation degradation shrinks across iterations | Parameters becoming more robust | Continue iterating |\n", "| Number of `N/A` plateau params decreasing across iterations | Strategy becoming more tolerant of parameter movement | Continue iterating |\n", "| WFE improvement flattens (e.g. 
0.55 → 0.65 → 0.67) | Diminishing returns — further fixes won't help much | Stop iterating |\n", "| IS and OOS both decline but WFE rises (IS falls faster) | Constraining away real signal, not just noise | Stop iterating |\n", "| OOS Sharpe keeps declining despite \"better\" param setup | Overfitting the iteration process itself | Stop — problem is strategy architecture, not parameters |\n", "| WFE decreases after fixing a parameter | Locked in a param that was legitimately adapting across folds | Unfix that parameter and re-run |\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": null, "id": "1", "metadata": {}, "outputs": [], "source": [ "import sys\n", "import os\n", "import pandas as pd\n", "import numpy as np\n", "\n", "\n", "# ── repo root — works on both Mac and Windows ────────────────────────────────\n", "ROOT = os.path.expanduser('~/Desktop/epsilon/github/Epsilon-Quant-Research')\n", "# ROOT = r'C:\\Users\\user\\Documents\\Epsilon Fund\\Epsilon-Quant-Research' # ← Windows path\n", "# ─────────────────────────────────────────────────────────────────────────────\n", "\n", "sys.path.append(os.path.join(ROOT, 'infrastructure', 'data'))\n", "sys.path.append(os.path.join(ROOT, 'infrastructure', 'walkforward'))\n", "sys.path.append(os.path.join(ROOT, 'infrastructure', 'backtester'))\n", "\n", "\n", "from binance_client import get_binance_client, get_data\n", "from wf_engine import walk_forward, plateau_analysis, plateau_summary, perturbation_test, cost_stress_test\n", "from wf_visualizer import plot_walk_forward_results, plot_plateau_analysis\n", "from engine import backtest\n" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "---\n", "## Data\n", "\n", "**Pairs** — any Binance pair in `BASEQUOTE` format (e.g. `BTCUSDT`, `ETHUSDT`, `SOLUSDT`, `BNBUSDT`). 
\n", "Verify availability at [binance.com/en/trade](https://www.binance.com/en/trade).\n", "\n", "**Intervals** — `'1m'` `'5m'` `'15m'` `'1h'` `'4h'` `'1d'` `'1w'`\n", "\n", "**Lookback** — days of history: must cover at least `train_bars + n_folds * test_bars` bars for the desired `n_folds` (assuming each fold rolls forward by `test_bars`)" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "SYMBOL = 'BTCUSDT'\n", "INTERVAL = '1d'\n", "LOOKBACK = 2150 \n", "\n", "\n", "# ── Multiple pairs (optional) ──────────────────────────────────────────────────\n", "# SYMBOLS = ['BTCUSDT', 'ETHUSDT', 'SOLUSDT']\n", "# data_dict = get_multiple_data(client, SYMBOLS, INTERVAL, LOOKBACK)\n", "# Access via: data_dict['BTCUSDT_1d'], data_dict['ETHUSDT_1d'] ...\n", "# ──────────────────────────────────────────────────────────────────────────────\n", "\n", "client = get_binance_client()\n", "df = get_data(client, SYMBOL, INTERVAL, LOOKBACK)\n", "print(f'Data: {df.index[0].date()} → {df.index[-1].date()} ({len(df)} bars)')\n", "df.tail(3)" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "---\n", "## Parameter Configuration\n", "\n", "Define which parameters to optimise and which to anchor - **recommended to do after the strategy writeup**\n", "\n", "`FIXED_PARAMS`: choose parameters with CV < 0.15 from a prior stability run, cross-referencing the perturbation verdict table, to reduce the search space and improve OOS credibility.\n", "\n", "**Practical rule**: keep the free parameter count to **at most** `n_trials` / 20 for meaningful convergence. \n", "\n", "> e.g. 400 trials: ~20 free params is the theoretical ceiling, but in practice you want far fewer because the efficiency of TPE (Optuna's default sampler) degrades exponentially with dimensionality, not linearly. A good target for 400 trials is 6–10 free parameters." 
] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "# ── parameter search space ────────────────────────────────────────────────────\n", "# Format: 'param_name': ('int' | 'float', lo, hi)\n", "# Only keys present in PARAM_DEFS (and not anchored in FIXED_PARAMS) are searched; remove a key from PARAM_DEFS to exclude it entirely.\n", "\n", "PARAM_DEFS = {\n", " 'ema_span': ('int', 5, 40),\n", " 'swing_caution': ('int', 3, 14),\n", " 'swing_stop': ('int', 3, 10),\n", " 'atr_caution': ('int', 10, 30),\n", " 'atr_stop': ('int', 10, 30),\n", " 'atr_size': ('int', 3, 14),\n", " 'adx_override': ('int', 52, 65),\n", " 'stop_atr_scale': ('float', 0.5, 2.0),\n", " 'risk_per_trade': ('float', 0.005, 0.05),\n", " 'max_leverage': ('float', 1.0, 3.0),\n", " 'stop_mult_pos_caution': ('float', 0.1, 0.6),\n", " 'stop_mult_pos_normal': ('float', 0.8, 2.0),\n", " 'stop_mult_ent_both': ('float', 1, 2.5),\n", " 'stop_mult_ent_caution': ('float', 0.1, 0.9),\n", " 'stop_mult_ent_normal': ('float', 0.5, 1.5),\n", " 'obv_ma_period': ('int', 10, 40), # OBV smoothing window\n", " 'obv_lookback': ('int', 10, 30), # bars back to compare price vs OBV direction\n", "}\n", "\n", "# ── anchored params (set after a stability run; empty first time) ─────────────\n", "# These bypass Optuna and are held constant across all folds.\n", "# Populate using stability_df results: fix params with CV < 0.15\n", "FIXED_PARAMS = {\n", " 'risk_per_trade': 0.0426,\n", " 'max_leverage': 2.8325,\n", "\n", " 'stop_atr_scale': 1,\n", " 'stop_mult_pos_normal': 1,\n", " 'stop_mult_ent_normal': 1,\n", "\n", "# 'obv_ma_period': 32,\n", "# 'obv_lookback': 16,\n", "# 'adx_override': 60,\n", "\n", "# 'stop_mult_ent_normal':1\n", " }" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "1. 
Economically intuitive range narrowing" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "### *Guide to parameter anchoring*\n", "\n", "| | Robust Plateau | Fragile Plateau |\n", "|--|-------------------|-------------------|\n", "| Low CV | Stable across folds and insensitive to exact value - keep free | Looks stable but is fitting the same noise patterns - fix at consensus |\n", "| High CV | Parameter unimportant - fix at any reasonable value | Unstable across folds and sitting on a cliff - strong candidate to eliminate |\n", "\n", "Copy the plateau analysis table (from the Parameter Robustness Checks section) into the fixed params section and decide manually which parameters to fix and which to keep free." ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "---\n", "## Strategy\n", "\n", "**CRUCIAL** - The strategy logic needs to work well in the backtesting notebook before it is run here; this saves running walk-forward on a broken strategy.\n", "\n", "**Available columns:** `Open` `High` `Low` `Close` `Volume`\n", "\n", "**Required output:** a `position` column — `1` long · `0` flat · `-1` short \n", "**Optional:** `position_size` column (0–1) to use fractional capital\n", "\n", "> The engine applies a 1-bar execution lag automatically. Inside the strategy loop, use `prev` for signal decisions and `curr` for execution — no manual shifting needed.\n", "\n", "**To implement your strategy:**\n", "1. Write the strategy logic (indicators, signals, and execution loop), reading every searched value as `params['param_name']`\n", "2. 
Update `indicator_cols` to list your longest-warmup indicators — the engine uses this to clean NaN rows after OOS trimming\n" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "def my_strategy(df_slice: pd.DataFrame, params: dict) -> tuple[pd.DataFrame, list[str]]:\n", "\n", "    df = df_slice.copy()\n", "\n", "    # ── strategy logic ────────────────────────────────────────────────────────\n", "    df['EMA'] = df['Close'].ewm(span=params['ema_span'], adjust=False).mean()\n", "    df['Swing_Hi_Cau'] = df['High'].rolling(params['swing_caution']).max()\n", "    df['Swing_Lo_Cau'] = df['Low'].rolling(params['swing_caution']).min()\n", "    df['Swing_Hi_Stp'] = df['High'].rolling(params['swing_stop']).max()\n", "\n", "    def atr(period):\n", "        # true range = max(high-low, |high-prev close|, |low-prev close|), EMA-smoothed\n", "        hl = df['High'] - df['Low']\n", "        hc = (df['High'] - df['Close'].shift(1)).abs()\n", "        lc = (df['Low'] - df['Close'].shift(1)).abs()\n", "        return pd.concat([hl, hc, lc], axis=1).max(axis=1).ewm(span=period, adjust=False).mean()\n", "\n", "    df['ATR_Cau'] = atr(params['atr_caution'])\n", "    df['ATR_Stp'] = atr(params['atr_stop'])\n", "    df['ATR_Sz'] = atr(params['atr_size'])\n", "\n", "    up = df['High'].diff(); down = -df['Low'].diff()\n", "    pdm = up.where((up > down) & (up > 0), 0.0)\n", "    ndm = down.where((down > up) & (down > 0), 0.0)\n", "    atr14 = atr(14)\n", "    pdi = 100 * pdm.ewm(span=14, adjust=False).mean() / atr14\n", "    ndi = 100 * ndm.ewm(span=14, adjust=False).mean() / atr14\n", "    dx = (100 * (pdi - ndi).abs() / (pdi + ndi).replace(0, np.nan)).fillna(0)\n", "    df['ADX_14'] = dx.ewm(span=14, adjust=False).mean()\n", "\n", "    direction = df['Close'].diff().apply(lambda x: 1 if x > 0 else -1) # removed Vol_MA\n", "    df['OBV'] = (df['Volume'] * direction).cumsum()\n", "    df['OBV_MA'] = df['OBV'].rolling(params['obv_ma_period']).mean()\n", "\n", "    df['Caution_OBV'] = (df['Close'] > df['Close'].shift(params['obv_lookback'])) & (df['OBV'] < df['OBV_MA'])\n", "    df['Caution_Long'] = ((df['Swing_Hi_Cau'] - df['Low']) > 1.5 * df['ATR_Cau']) | df['Caution_OBV']\n", "    df['Caution_Short'] = ((df['High'] - df['Swing_Lo_Cau']) > 1.5 * df['ATR_Cau']) | (df['Close'] > df['EMA'])\n", "    # guard: only allow entry when stop/size indicators are fully warmed up (no NaN stop-loss trap)\n", "    warmed_up = df[['Swing_Hi_Stp', 'ATR_Stp', 'ATR_Sz']].notna().all(axis=1)\n", "    df['Entry_Long'] = warmed_up & (df['Close'] > df['EMA']) & (~df['Caution_Long'] | (df['ADX_14'] > params['adx_override'])) # removed volume filter\n", "    df['position_size_raw'] = (params['risk_per_trade'] / (df['ATR_Sz'] / df['Close'])).clip(0.1, params['max_leverage'])\n", "\n", "    n = len(df)\n", "    position = [0] * n\n", "    position_size = [0.0] * n\n", "    stop_arr = [np.nan] * n\n", "    in_position = 0\n", "    stop_loss = np.nan\n", "    current_size = 0.0\n", "\n", "    for i in range(1, n):\n", "        prev = df.iloc[i - 1]\n", "        curr = df.iloc[i]\n", "\n", "        if in_position == 0:\n", "            if prev['Entry_Long']:\n", "                in_position = 1\n", "                current_size = curr['position_size_raw']\n", "                cl = prev['Caution_Long']; cs = prev['Caution_Short']\n", "                if cl and cs: sm = params['stop_mult_ent_both']\n", "                elif cl: sm = params['stop_mult_ent_caution']\n", "                else: sm = params['stop_mult_ent_normal']\n", "                stop_loss = curr['Swing_Hi_Stp'] - curr['ATR_Stp'] * sm * params['stop_atr_scale']\n", "        else:\n", "            if prev['Close'] < stop_loss:\n", "                in_position = 0\n", "                current_size = 0.0\n", "                stop_loss = np.nan\n", "            else:\n", "                # trail the stop upward only; never loosen it\n", "                sm = params['stop_mult_pos_caution'] if curr['Caution_Long'] else params['stop_mult_pos_normal']\n", "                stop_loss = max(stop_loss, curr['Swing_Hi_Stp'] - curr['ATR_Stp'] * sm\n", "                                * params['stop_atr_scale'])\n", "\n", "        position[i] = in_position\n", "        position_size[i] = current_size\n", "        stop_arr[i] = stop_loss\n", "\n", "    df['position'] = position\n", "    df['position_size'] = position_size\n", "    df['stop_loss'] = stop_arr\n", "\n", "    # ── cleanup ───────────────────────────────────────────────────────────────\n", "    indicator_cols = ['EMA', 'ATR_Cau', 'ADX_14', 'Swing_Hi_Cau', 'OBV_MA']\n", "    df['position'] = df['position'].fillna(0).astype(int)\n", "    df['position_size'] = df['position_size'].fillna(0.0)\n", "    df['stop_loss'] = df['stop_loss'].fillna(0.0)\n", "\n", "    return df, indicator_cols" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "---\n", "## Run Walk-Forward\n", "Simulates how the strategy would have performed if re-optimised periodically\n", "in live trading, and exposes whether good IS performance survives unseen data.\n", "\n", "**Folds Setup**\n", "\n", "| Parameter | Description | Guidance |\n", "|---|---|---|\n", "| `TRAIN_BARS` | Bars per training window | Aim for 2:1 to 3:1 ratio vs `TEST_BARS` |\n", "| `TEST_BARS` | Bars per test window | `365` = ~1 year on daily data |\n", "| `BURNIN_BARS` | Bars prepended to test for indicator warmup | Match your longest indicator period |\n", "| `N_TRIALS` | Optuna trials per fold | 300–500 for daily; more = better but slower. Budget 10–20 trials per free parameter to avoid overfitting |\n", "| `COST` | Round-trip cost per trade | `0.001` = 0.1% |\n", "\n", "Use `N_TRIALS` as a robustness dial: if OOS degrades sharply as you increase it from 100→200→300, that is a direct signal that your parameter space still has too many degrees of freedom relative to the information content of the training window (consider reducing the number of free parameters). \n", "\n", "**Score and Rejection** — use these to calibrate what Optuna optimises in-sample: the default `score_fn(m)` is a weighted basket of Sharpe, Calmar and Return, each normalised by its \"max\" value; the default `reject_fn(m)` discards trials that fail minimum credibility criteria (e.g. too few trades).\n", "\n", "> Pay attention to the **degradation ratio** — OOS/IS Sharpe reveals overfitting." 
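, "\n", "\n", "As a rough formula (a sketch only; the exact metric the engine reports is not shown in this notebook), the degradation ratio for fold $k$ can be read as:\n", "\n", "$$\\text{ratio}_k = \\frac{\\text{Sharpe}^{\\text{OOS}}_k}{\\text{Sharpe}^{\\text{IS}}_k}$$\n", "\n", "With hypothetical values of IS Sharpe 1.8 and OOS Sharpe 1.1, the ratio is $1.1 / 1.8 \\approx 0.61$. Values near 1 mean IS performance carries over to unseen data; values near 0 point to overfitting."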
] }, { "cell_type": "code", "execution_count": null, "id": "11", "metadata": {}, "outputs": [], "source": [ "# ── walk-forward windows ──────────────────────────────────────────────────────\n", "TRAIN_BARS = 1050 \n", "TEST_BARS = 137 \n", "BURNIN_BARS = 100 \n", "N_TRIALS = 400 \n", "COST = 0.001 \n", "\n", "# ── SCORING FUNCTION ──────────────────────────────────────────────────────────\n", "# Modify weights or swap components. Must return a float (higher = better).\n", "\n", "def score_fn(m):\n", " SHARPE_MAX = 2.5\n", " CALMAR_MAX = 6\n", " RETURN_MAX = 15.0\n", "\n", " calmar = m['total_return'] / abs(m['max_drawdown']) if m['max_drawdown'] != 0 else 0\n", "\n", " s = np.clip(m['sharpe_ratio'] / SHARPE_MAX, 0, 1)\n", " c = np.clip(calmar / CALMAR_MAX, 0, 1)\n", " r = np.clip(m['total_return'] / RETURN_MAX, 0, 1)\n", "\n", " return 0.50 * s + 0.30 * c + 0.20 * r\n", "\n", "# ── REJECTION CRITERIA ────────────────────────────────────────────────────────\n", "# Trials that return True are discarded (score → -999).\n", "\n", "def reject_fn(m):\n", " if m is None: return True\n", " if m['num_trades'] < 15: return True\n", " if m['win_rate'] < 0.3: return True\n", " if m['max_drawdown'] < -0.7: return True\n", " if m['profit_factor'] < 0.6: return True\n", " return False\n", "\n", "\n", "results = walk_forward(\n", " df = df,\n", " strategy_fn = my_strategy,\n", " param_defs = PARAM_DEFS,\n", " fixed_params = FIXED_PARAMS,\n", " train_bars = TRAIN_BARS,\n", " test_bars = TEST_BARS,\n", " burnin_bars = BURNIN_BARS,\n", " n_trials = N_TRIALS,\n", " cost = COST,\n", " score_fn = score_fn, # ← your notebook definition\n", " reject_fn = reject_fn, # ← your notebook definition\n", " save_csv = None,\n", ")" ] }, { "cell_type": "markdown", "id": "12", "metadata": {}, "source": [ "8 folds: 1.12, (but higher returns decently)\n", "- most consistent! 
\n" ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "---\n", "## Granular Results and Parameter Stability\n", "\n", "Per-fold IS vs OOS performance. Each row is one fold — compare `train_*` vs `test_*` columns to assess overfitting.\n", "\n", "| Column | Description |\n", "|---|---|\n", "| `*_sharpe` `*_return` `*_drawdown` `*_calmar` | Core performance metrics |\n", "| `*_trades` `*_winrate` `*_profit_factor` | Trade statistics |\n", "| `optuna_score` | Best score achieved on training window |\n", "| `param_*` | Best parameter values per fold e.g. `param_ema_span` |\n", "\n", "**Consensus Parameters** - use these to anchor. The engine measures stability with the coefficient of variation (CV): the standard deviation of a parameter's best values across all folds divided by their median.\n", "\n", ">CV < 0.15 indicates that the strategy relies on the value itself rather than the value being noise-fitted to a specific period, which makes it safe to fix for future runs. A high CV means the parameter is period-sensitive and should stay free." 
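, "\n", "\n", "Written out for a parameter $p$ with per-fold best values $x_{p,1}, \\dots, x_{p,K}$ (a sketch of the definition above; taking the absolute value of the median and using the sample standard deviation are assumptions, not confirmed engine behaviour):\n", "\n", "$$\\mathrm{CV}_p = \\frac{\\operatorname{std}(x_{p,1}, \\dots, x_{p,K})}{\\left|\\operatorname{median}(x_{p,1}, \\dots, x_{p,K})\\right|}$$\n", "\n", "For example, hypothetical per-fold picks of 18, 20 and 22 for `ema_span` give a median of 20 and a sample standard deviation of 2, so $\\mathrm{CV} = 2 / 20 = 0.10$, which is below the 0.15 cutoff."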
] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "# ── fold summary table ────────────────────────────────────────────────────────\n", "display_cols = [\n", " 'train_sharpe', 'test_sharpe',\n", " 'train_return', 'test_return',\n", " 'train_drawdown', 'test_drawdown',\n", " 'optuna_score',\n", "]\n", "display(results['results_df'][display_cols].round(2))\n", "\n", "# ── parameter stability ───────────────────────────────────────────────────────\n", "stability = results['stability_df'].copy()\n", "stability['stable'] = stability['stable'].map({True: '✓', False: ''})\n", "stability['fixed'] = stability['fixed'].map({True: '✓', False: ''})\n", "stability = stability[['param', 'median', 'std', 'cv', 'stable', 'fixed']].round(2)\n", "display(stability.sort_values('cv'))\n", "\n", "# ── consensus params ──────────────────────────────────────────────────────────\n", "stable = results['stability_df'][results['stability_df']['cv'] < 0.15]\n", "\n", "print('Stable parameters (CV < 0.15) — copy into FIXED_PARAMS:')\n", "for _, row in stable.iterrows():\n", " v = results['consensus_params'][row['param']]\n", " v_fmt = int(round(v)) if isinstance(v, float) and v == int(v) else round(v, 4) if isinstance(v, float) else v\n", " print(f\" '{row['param']}': {v_fmt},\")\n", " \n", "print('\\nConsensus parameters (median across folds):')\n", "for k, v in results['consensus_params'].items():\n", " print(f' {k:<30} = {round(v, 2) if isinstance(v, float) else v}')" ] }, { "cell_type": "markdown", "id": "15", "metadata": {}, "source": [ "---\n", "## Parameter Robustness Checks\n", "\n", "### Plateau Analysis\n", "Sweeps each free parameter across its range while holding the others at their consensus (median) values, then evaluates the `score` at each point by backtesting over the entire lookback.\n", "\n", "The stability table (CV across folds) tells you *\"does the optimizer consistently pick the same value?\"* \n", "\n", "Plateau analysis tells 
you *\"if that value were slightly wrong, would performance collapse?\"* \n", "\n", "**Plateau %** - the fraction of each parameter's range that stays within `threshold`% (default 20) of the peak score: >60% = `robust plateau`, 30–60% = `moderate`, <30% = `fragile` (consider anchoring). `N/A` means every sweep point failed the rejection filters, i.e. the strategy is completely intolerant of movement on that dimension.\n", "\n", ">Run time scales with `n_free_params` × `n_steps` (one backtest per sweep point)\n", "\n", "### Perturbation test\n", "Jitters all free parameters by ±5/10/20% of their range (50 random samples per offset level) and measures how much the score degrades versus the base (consensus) score.\n", "\n", "This tests whether the optimum is a broad hill in `#free params`-D space or a narrow spike.\n", "\n", "**Degradation >15%** (mean score at the ±10% offset): fragile optimum, consider reducing free parameters" ] }, { "cell_type": "code", "execution_count": null, "id": "16", "metadata": {}, "outputs": [], "source": [ "# ── 1-D sensitivity sweeps around consensus params ─────────────────────────\n", "sweep_results = plateau_analysis(\n", " df = df,\n", " strategy_fn = my_strategy,\n", " base_params = results['consensus_params'],\n", " param_defs = PARAM_DEFS,\n", " fixed_params = FIXED_PARAMS,\n", " cost = COST,\n", " n_steps = 20, # Adjust for number of steps around consensus per parameter\n", ")\n", "\n", "# ── text verdicts ──────────────────────────────────────────────────────────\n", "verdict_df = plateau_summary(\n", " sweep_results,\n", " base_params = results['consensus_params'],\n", " stability_df = results['stability_df'], \n", " threshold = 0.20, # Adjust for % around peak score\n", ")\n", "\n", "# ── neighbourhood perturbation ────────────────────────────────────────────\n", "# Randomly jitters ALL free params simultaneously.\n", "# If mean score degrades >15% at ±10% offset, the optimum is fragile.\n", "\n", "perturb_df = perturbation_test(\n", " df = df,\n", " strategy_fn = my_strategy,\n", " base_params = results['consensus_params'],\n", " param_defs = PARAM_DEFS,\n", " 
fixed_params = FIXED_PARAMS,\n", " cost = COST,\n", " pct_offsets = (0.05, 0.10, 0.20), # ±5%, ±10%, ±20% of range\n", " n_samples = 50, # random perturbations per offset level\n", ")" ] }, { "cell_type": "markdown", "id": "17", "metadata": {}, "source": [ "### 1-D sweep charts:\n", "| Element | Meaning | Good | Bad |\n", "|---------|---------|------|-----|\n", "| **Blue curve** | Composite score at each value of the parameter, with all others held at consensus | Flat-topped curve — performance is insensitive to the exact value | Narrow spike — optimizer latched onto one specific value, everything nearby is worse |\n", "| **Red dashed line** | Where the consensus value sits | On the flat top of the curve | On a steep slope or at the edge of a cliff |\n", "| **Green dashed line** | Cutoff at 80% of peak score — the boundary between plateau and non-plateau | Blue curve stays above this line across most of the range | Blue curve dips below it quickly either side of the peak |\n", "| **Green shading** | Plateau region — all values where the score stays within 20% of the peak | Wide green band spanning most of the range (robust) | Thin sliver or no shading at all (fragile/overfit) |\n", "\n", "If the consensus value sits on a steep slope, the parameter is **REGIME SENSITIVE**: do not fix it, because the backtests disagree on its value. Only fix parameters whose consensus sits on a flat top." 
] }, { "cell_type": "code", "execution_count": null, "id": "18", "metadata": {}, "outputs": [], "source": [ "\n", "# ── visual sweep curves ───────────────────────────────────────────────────\n", "plot_plateau_analysis(\n", " sweep_results = sweep_results,\n", " consensus_params = results['consensus_params'],\n", " param_defs = PARAM_DEFS,\n", " fixed_params = FIXED_PARAMS,\n", " threshold = 0.20,\n", " show = False,\n", " save_html = None,\n", ")\n", "\n" ] }, { "cell_type": "markdown", "id": "19", "metadata": {}, "source": [ "---\n", "## Results Charts and Cost Stress Test\n", "\n", "| Parameter | Description | Default |\n", "|---|---|---|\n", "| `show_fold_perf` | IS vs OOS bars for return, Sharpe, drawdown per fold | `False` |\n", "| `show_param_evol` | Parameter evolution across folds with ±1 std bands | `False` |\n", "| `show_oos_equity` | Combined OOS equity curve + drawdown with fold boundaries | `True` |\n", "| `show_trades` | Overlay entry/exit markers on OOS equity chart | `False` |\n", "| `benchmark_data` | DataFrame with `Close` column for buy & hold comparison | `None` |\n", "| `save_html_dir` | Directory path to save charts as HTML files, or `None` | `None` |\n", "\n", "**Cost Stress Test:** re-run the combined OOS backtest at 1×, 1.5×, 2×, 3× the base cost. Fragile strategies collapse; robust ones degrade gradually." 
] }, { "cell_type": "code", "execution_count": null, "id": "20", "metadata": {}, "outputs": [], "source": [ "plot_walk_forward_results(\n", " results = results,\n", " param_defs = PARAM_DEFS,\n", " fixed_params = FIXED_PARAMS,\n", " benchmark_data = df,\n", " show = True,\n", " save_html_dir = None,\n", " show_fold_perf = False, # IS vs OOS bars by fold\n", " show_param_evol = False, # parameter evolution across folds\n", " show_oos_equity = True, # combined OOS equity curve\n", " show_trades = False, # trade markers on OOS equity chart\n", ")\n", "\n", "# ── transaction cost stress test ──────────────────────────────────────────\n", "\n", "if results['oos_combined_df'] is not None:\n", " cost_df = cost_stress_test(\n", " oos_combined_df = results['oos_combined_df'],\n", " cost_multipliers = (1.0, 1.5, 2.0, 3.0),\n", " base_cost = COST,\n", " )\n", "else:\n", " print('No combined OOS dataframe — skip cost stress test')" ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "results['oos_combined_df'].to_pickle('oos/btc_oos.pkl')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.14.0" } }, "nbformat": 4, "nbformat_minor": 5 }