{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Epsilon Fund - Walk-Forward Validation\n",
    "Uses `infrastructure/walk_forward/` to run rolling Optuna optimisation and evaluate OOS robustness. Ideal to use after finding strategy that seems to work using backtesting framework to ensure logic is valid.\n",
    "\n",
    "---\n",
    "### Iteration Guidelines\n",
    "\n",
    "**Overfitting the iteration process:** Each time you inspect OOS results and adjust parameters, you leak OOS information into your design decisions. Cap yourself at **3–4 iterations** — first run with everything free, second with obvious fixes from CV + plateau analysis, third to tighten remaining params. \n",
    "\n",
    "If the strategy still shows heavy overfitting signals after three passes, **the problem is the strategy architecture, not the parameters**.\n",
    "\n",
    "**WFE:** Walk-forward efficiency - examine IS/OOS ratio (simplest).\n",
    "\n",
    "**Pertubation degradation:** Examine pertubation table to see if degradation reduces across runs.\n",
    "\n",
    "| Signal | Meaning | Action |\n",
    "|--------|---------|--------|\n",
    "| IS Sharpe drops, OOS Sharpe holds or rises, WFE improves | Removing noise-fitting degrees of freedom | Continue iterating |\n",
    "| Perturbation degradation shrinks across iterations | Parameters becoming more robust | Continue iterating |\n",
    "| N/A plateau params decreasing across iterations | Strategy becoming more tolerant of parameter movement | Continue iterating |\n",
    "| WFE improvement flattens (e.g. 0.55 → 0.65 → 0.67) | Diminishing returns — further fixes won't help much | Stop iterating |\n",
    "| IS and OOS both decline but WFE rises (IS falls faster) | Constraining away real signal, not just noise | Stop iterating |\n",
    "| OOS Sharpe keeps declining despite \"better\" param setup | Overfitting the iteration process itself | Stop — problem is strategy architecture, not parameters |\n",
    "| WFE decreases after fixing a parameter | Locked in a param that was legitimately adapting across folds | Unfix that parameter and re-run |\n",
    "\n",
    "---"
   ]
  },
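  {
   "cell_type": "markdown",
   "id": "0a",
   "metadata": {},
   "source": [
    "A minimal sketch of the WFE check from the guidelines above, assuming WFE is taken as mean OOS Sharpe divided by mean IS Sharpe across folds. The function name `walk_forward_efficiency` and the sample numbers are hypothetical and not part of `wf_engine`, which may aggregate differently.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def walk_forward_efficiency(is_sharpe, oos_sharpe):\n",
    "    '''Ratio of mean OOS Sharpe to mean IS Sharpe (assumed WFE definition).'''\n",
    "    is_mean = float(np.mean(is_sharpe))\n",
    "    oos_mean = float(np.mean(oos_sharpe))\n",
    "    if is_mean <= 0:\n",
    "        return float('nan')  # ratio is not meaningful without positive IS performance\n",
    "    return oos_mean / is_mean\n",
    "\n",
    "# hypothetical per-fold Sharpe ratios, for illustration only\n",
    "is_sharpe = [2.0, 1.5, 2.5]\n",
    "oos_sharpe = [1.0, 1.5, 0.5]\n",
    "wfe = walk_forward_efficiency(is_sharpe, oos_sharpe)\n",
    "print(round(wfe, 2))\n",
    "```\n",
    "\n",
    "Once a run has finished, the same function could be fed the `train_sharpe` and `test_sharpe` columns of `results['results_df']`."
   ]
  },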
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "# ── repo root — works on both Mac and Windows ────────────────────────────────\n",
    "ROOT = os.path.expanduser('~/Desktop/epsilon/github/Epsilon-Quant-Research')\n",
    "# ROOT = r'C:\\Users\\user\\Documents\\Epsilon Fund\\Epsilon-Quant-Research'  # ← Windows path\n",
    "# ─────────────────────────────────────────────────────────────────────────────\n",
    "\n",
    "sys.path.append(os.path.join(ROOT, 'infrastructure', 'data'))\n",
    "sys.path.append(os.path.join(ROOT, 'infrastructure', 'walkforward'))\n",
    "sys.path.append(os.path.join(ROOT, 'infrastructure', 'backtester'))\n",
    "\n",
    "\n",
    "from binance_client import get_binance_client, get_data\n",
    "from wf_engine import walk_forward, plateau_analysis, plateau_summary, perturbation_test, cost_stress_test\n",
    "from wf_visualizer import plot_walk_forward_results, plot_plateau_analysis\n",
    "from engine import backtest\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "---\n",
    "## Data\n",
    "\n",
    "**Pairs** — any Binance pair in `BASEQUOTE` format (e.g. `BTCUSDT`, `ETHUSDT`, `SOLUSDT`, `BNBUSDT`).  \n",
    "Verify availability at [binance.com/en/trade](https://www.binance.com/en/trade).\n",
    "\n",
    "**Intervals** — `'1m'` `'5m'` `'15m'` `'1h'` `'4h'` `'1d'` `'1w'`\n",
    "\n",
    "**Lookback** — days of history: must be >= (train_bars + test_bars) * n_folds desired"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "SYMBOL   = 'BTCUSDT'\n",
    "INTERVAL = '1d'\n",
    "LOOKBACK = 2150 \n",
    "\n",
    "\n",
    "# ── Multiple pairs (optional) ──────────────────────────────────────────────────\n",
    "# SYMBOLS = ['BTCUSDT', 'ETHUSDT', 'SOLUSDT']\n",
    "# data_dict = get_multiple_data(client, SYMBOLS, INTERVAL, LOOKBACK)\n",
    "# Access via: data_dict['BTCUSDT_1d'], data_dict['ETHUSDT_1d'] ...\n",
    "# ──────────────────────────────────────────────────────────────────────────────\n",
    "\n",
    "client   = get_binance_client()\n",
    "df = get_data(client, SYMBOL, INTERVAL, LOOKBACK)\n",
    "print(f'Data: {df.index[0].date()} → {df.index[-1].date()}  ({len(df)} bars)')\n",
    "df.tail(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "---\n",
    "## Parameter Configuration\n",
    "\n",
    "Define which parameters to optimise and anchor - **recommended to do after strategy writeup**\n",
    "\n",
    "`FIXED_PARAMS`: choose parameters with CV < 0.15 from prior stability run, cross referencing with pertubation verdict table to reduce search space, improve OOS credibility.\n",
    "\n",
    "**Practical rule**: free parameter count to be **at most** `n_trials` / 20 for meaningful conversion. \n",
    "\n",
    "> e.g 400 trials: ~20 free params as the theoretical ceiling, in practice you want far fewer because TPE (Optuna method) efficiency degrades exponentially with dimensionality, not linearly. A good target for 400 trials is 6–10 free parameters."
   ]
  },
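  {
   "cell_type": "markdown",
   "id": "4a",
   "metadata": {},
   "source": [
    "The rule of thumb above can be checked mechanically. This is a rough sketch, assuming a parameter counts as free when it appears in the search space but not in the anchored set; `free_param_budget` and the toy dictionaries are hypothetical helpers, not part of `wf_engine`.\n",
    "\n",
    "```python\n",
    "def free_param_budget(param_defs, fixed_params, n_trials, trials_per_param=20):\n",
    "    '''Return (free parameter names, ceiling) for the n_trials / 20 rule of thumb.'''\n",
    "    free = [name for name in param_defs if name not in fixed_params]\n",
    "    ceiling = n_trials // trials_per_param\n",
    "    return free, ceiling\n",
    "\n",
    "# toy search space and anchors, for illustration only\n",
    "param_defs = {\n",
    "    'ema_span': ('int', 5, 40),\n",
    "    'atr_stop': ('int', 10, 30),\n",
    "    'risk_per_trade': ('float', 0.005, 0.05),\n",
    "}\n",
    "fixed_params = {'risk_per_trade': 0.02}\n",
    "\n",
    "free, ceiling = free_param_budget(param_defs, fixed_params, n_trials=400)\n",
    "print(f'{len(free)} free params, ceiling {ceiling}')\n",
    "if len(free) > ceiling:\n",
    "    print('Too many free params for this trial budget')\n",
    "```\n",
    "\n",
    "The ceiling is only an upper bound; the 6–10 target for 400 trials is the more practical number to aim for."
   ]
  },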
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── parameter search space ────────────────────────────────────────────────────\n",
    "# Format: 'param_name': ('int' | 'float', lo, hi)\n",
    "# Only keys present in PARAMS above are searched — remove a key from PARAMS to exclude it entirely.\n",
    "\n",
    "PARAM_DEFS = {\n",
    "    'ema_span':          ('int',   5,    40),\n",
    "    'swing_caution':     ('int',   3,    14),\n",
    "    'swing_stop':        ('int',   3,    10),\n",
    "    'atr_caution':       ('int',   10,   30),\n",
    "    'atr_stop':          ('int',   10,   30),\n",
    "    'atr_size':          ('int',   3,    14),\n",
    "    'adx_override':      ('int',   52,   65),\n",
    "    'stop_atr_scale':    ('float', 0.5,  2.0),\n",
    "    'risk_per_trade':    ('float', 0.005, 0.05),\n",
    "    'max_leverage':      ('float', 1.0,  3.0),\n",
    "    'stop_mult_pos_caution': ('float', 0.1, 0.6),\n",
    "    'stop_mult_pos_normal':  ('float', 0.8, 2.0),\n",
    "    'stop_mult_ent_both':    ('float', 1, 2.5),\n",
    "    'stop_mult_ent_caution': ('float', 0.1, 0.9),\n",
    "    'stop_mult_ent_normal':  ('float', 0.5, 1.5),\n",
    "    'obv_ma_period':  ('int',   10,  40),   # OBV smoothing window\n",
    "    'obv_lookback':   ('int',   10,  30),   # bars back to compare price vs OBV direction\n",
    "}\n",
    "\n",
    "# ── anchored params (set after a stability run; empty first time) ─────────────\n",
    "# These bypass Optuna and are held constant across all folds.\n",
    "# Populate using stability_df results: fix params with CV < 0.15\n",
    "FIXED_PARAMS = {\n",
    "    'risk_per_trade': 0.0426,\n",
    "    'max_leverage': 2.8325,\n",
    "\n",
    "    'stop_atr_scale': 1,\n",
    "    'stop_mult_pos_normal':1,\n",
    "    'stop_mult_ent_normal': 1,\n",
    "\n",
    "#    'obv_ma_period': 32,\n",
    "#    'obv_lookback': 16,\n",
    "#    'adx_override': 60,\n",
    "\n",
    "#    'stop_mult_ent_normal':1\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "1. Economiclally intuitive range narrowing"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "### *Guide to parameter anchoring*\n",
    "\n",
    "|  | Robust Plateau| Fragile Plateau |\n",
    "|--|-------------------|-------------------|\n",
    "| Low CV | Stable across folds and insensitive to exact value - keep free| Looks stable but is fitting the same noise patterns - fix at concensus|\n",
    "| High CV | Parameter unimportant - fix at any reasonable value | Unstable across folds and sitting on a cliff - strong candidate to eliminate |\n",
    "\n",
    "Copy-paste plateau analysis table above into fixed params section and decide manually on which to fix/keep free.c"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "---\n",
    "## Strategy\n",
    "\n",
    "**CRUCIAL** - Strategy logic needs to work well in backtesting notebook before running here, saves time not running walk-forward for a broken strategy.\n",
    "\n",
    "**Available columns:** `Open` `High` `Low` `Close` `Volume`\n",
    "\n",
    "**Required output:** a `position` column — `1` long · `0` flat · `-1` short  \n",
    "**Optional:** `position_size` column (0–1) to use fractional capital\n",
    "\n",
    "> The engine applies a 1-bar execution lag automatically. Inside the strategy loop, use `prev` for signal decisions and `curr` for execution — no manual shifting needed.\n",
    "\n",
    "**To implement your strategy:**\n",
    "1. Write strategy logic — compute indicators, signals, and execution loop: use `param_name`for those to be searched\n",
    "2. Update `indicator_cols` to list your longest-warmup indicators — the engine uses this to clean NaN rows after OOS trimming\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "def my_strategy(df_slice: pd.DataFrame, params: dict) -> pd.DataFrame:\n",
    "\n",
    "    df = df_slice.copy()\n",
    "\n",
    "    # ── strategy logic ────────────────────────────────────────────────────────\n",
    "    df['EMA']          = df['Close'].ewm(span=params['ema_span'], adjust=False).mean()\n",
    "    df['Swing_Hi_Cau'] = df['High'].rolling(params['swing_caution']).max()\n",
    "    df['Swing_Lo_Cau'] = df['Low'].rolling(params['swing_caution']).min()\n",
    "    df['Swing_Hi_Stp'] = df['High'].rolling(params['swing_stop']).max()\n",
    "\n",
    "    def atr(period):\n",
    "        hl = df['High'] - df['Low']\n",
    "        hc = (df['High'] - df['Close'].shift(1)).abs()\n",
    "        lc = (df['Low']  - df['Close'].shift(1)).abs()\n",
    "        return pd.concat([hl, hc, lc], axis=1).max(axis=1).ewm(span=period, adjust=False).mean()\n",
    "\n",
    "    df['ATR_Cau'] = atr(params['atr_caution'])\n",
    "    df['ATR_Stp'] = atr(params['atr_stop'])\n",
    "    df['ATR_Sz']  = atr(params['atr_size'])\n",
    "\n",
    "    up    = df['High'].diff();  down = -df['Low'].diff()\n",
    "    pdm   = up.where((up > down) & (up > 0), 0.0)\n",
    "    ndm   = down.where((down > up) & (down > 0), 0.0)\n",
    "    atr14 = atr(14)\n",
    "    pdi   = 100 * pdm.ewm(span=14, adjust=False).mean() / atr14\n",
    "    ndi   = 100 * ndm.ewm(span=14, adjust=False).mean() / atr14\n",
    "    dx    = (100 * (pdi - ndi).abs() / (pdi + ndi).replace(0, np.nan)).fillna(0)\n",
    "    df['ADX_14'] = dx.ewm(span=14, adjust=False).mean()\n",
    "\n",
    "    direction     = df['Close'].diff().apply(lambda x: 1 if x > 0 else -1)  # removed Vol_MA\n",
    "    df['OBV']     = (df['Volume'] * direction).cumsum()\n",
    "    df['OBV_MA']  = df['OBV'].rolling(params['obv_ma_period']).mean()\n",
    "\n",
    "    df['Caution_OBV']   = (df['Close'] > df['Close'].shift(params['obv_lookback'])) & (df['OBV'] < df['OBV_MA'])\n",
    "    df['Caution_Long']  = ((df['Swing_Hi_Cau'] - df['Low']) > 1.5 * df['ATR_Cau']) | df['Caution_OBV']\n",
    "    df['Caution_Short'] = ((df['High'] - df['Swing_Lo_Cau']) > 1.5 * df['ATR_Cau']) | (df['Close'] > df['EMA'])\n",
    "    # guard: only allow entry when stop/size indicators are fully warmed up (no NaN stop-loss trap)\n",
    "    df['Entry_Long'] = (df['Close'] > df['EMA']) & (~df['Caution_Long'] | (df['ADX_14'] > params['adx_override']))  # removed volume filter\n",
    "    df['position_size_raw'] = (params['risk_per_trade'] / (df['ATR_Sz'] / df['Close'])).clip(0.1, params['max_leverage'])\n",
    "\n",
    "    n             = len(df)\n",
    "    position      = [0]     * n\n",
    "    position_size = [0.0]   * n\n",
    "    stop_arr      = [np.nan] * n\n",
    "    in_position   = 0\n",
    "    stop_loss     = np.nan\n",
    "    current_size  = 0.0\n",
    "\n",
    "    for i in range(1, n):\n",
    "        prev = df.iloc[i - 1]\n",
    "        curr = df.iloc[i]\n",
    "\n",
    "        if in_position == 0:\n",
    "            if prev['Entry_Long']:\n",
    "                in_position  = 1\n",
    "                current_size = curr['position_size_raw']\n",
    "                cl = prev['Caution_Long']; cs = prev['Caution_Short']\n",
    "                if cl and cs: sm = params['stop_mult_ent_both']\n",
    "                elif cl:      sm = params['stop_mult_ent_caution']\n",
    "                else:         sm = params['stop_mult_ent_normal']\n",
    "                stop_loss = curr['Swing_Hi_Stp'] - curr['ATR_Stp'] * sm * params['stop_atr_scale']\n",
    "        else:\n",
    "            if prev['Close'] < stop_loss:\n",
    "                in_position  = 0\n",
    "                current_size = 0.0\n",
    "                stop_loss    = np.nan\n",
    "            else:\n",
    "                sm        = params['stop_mult_pos_caution'] if curr['Caution_Long'] else params['stop_mult_pos_normal']\n",
    "                stop_loss = max(stop_loss, curr['Swing_Hi_Stp'] - curr['ATR_Stp'] * sm \n",
    "                * params['stop_atr_scale'])\n",
    "\n",
    "        position[i]      = in_position\n",
    "        position_size[i] = current_size\n",
    "        stop_arr[i]      = stop_loss\n",
    "\n",
    "    df['position']      = position\n",
    "    df['position_size'] = position_size\n",
    "    df['stop_loss']     = stop_arr\n",
    "\n",
    "    # ── cleanup ───────────────────────────────────────────────────────────────\n",
    "    indicator_cols = ['EMA', 'ATR_Cau', 'ADX_14', 'Swing_Hi_Cau', 'OBV_MA']  \n",
    "    df['position']      = df['position'].fillna(0).astype(int)\n",
    "    df['position_size'] = df['position_size'].fillna(0.0)\n",
    "    df['stop_loss']     = df['stop_loss'].fillna(0.0)\n",
    "\n",
    "    return df, indicator_cols"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "---\n",
    "## Run Walk-Forward\n",
    "Simulates how the strategy would have performed if re-optimised periodically\n",
    "in live trading, and exposes whether good IS performance survives unseen data.\n",
    "\n",
    "**Folds Setup**\n",
    "| Parameter | Description | Guidance |\n",
    "|---|---|---|\n",
    "| `TRAIN_BARS` | Bars per training window | Aim for 2:1 to 3:1 ratio vs `TEST_BARS` |\n",
    "| `TEST_BARS` | Bars per test window | `365` = ~1 year on daily data |\n",
    "| `BURNIN_BARS` | Bars prepended to test for indicator warmup | Match your longest indicator period |\n",
    "| `N_TRIALS` | Optuna trials per fold | 300–500 for daily; more = better but slower.10-20 trials per free parameter to avoid overfit |\n",
    "| `COST` | Round-trip cost per trade | `0.001` = 0.1% |\n",
    "\n",
    "Use `N_TRIALS` as robustness dia: if OOS degrades sharply as you increase it from 100→200→300, direct signal your parameter space still has too many degrees of freedom relative to the information content of the training window (consider decreasing). \n",
    "\n",
    "**Score and Rejection** — use to calibrate what Optuna optimises IS: default `score_fn(m)` uses weighted basket of Sharpe, Calmar and Return, normalised using their \"max\" value; default `reject_fn(m)` discards runs failing certain criteria that limits credibility.\n",
    "\n",
    "> Pay attention to the **degradation ratio** — OOS/IS Sharpe reveals overfitting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── walk-forward windows ──────────────────────────────────────────────────────\n",
    "TRAIN_BARS  = 1050  \n",
    "TEST_BARS   = 137   \n",
    "BURNIN_BARS = 100   \n",
    "N_TRIALS    = 400  \n",
    "COST        = 0.001 \n",
    "\n",
    "# ── SCORING FUNCTION ──────────────────────────────────────────────────────────\n",
    "# Modify weights or swap components. Must return a float (higher = better).\n",
    "\n",
    "def score_fn(m):\n",
    "    SHARPE_MAX = 2.5\n",
    "    CALMAR_MAX = 6\n",
    "    RETURN_MAX = 15.0\n",
    "\n",
    "    calmar = m['total_return'] / abs(m['max_drawdown']) if m['max_drawdown'] != 0 else 0\n",
    "\n",
    "    s = np.clip(m['sharpe_ratio']  / SHARPE_MAX, 0, 1)\n",
    "    c = np.clip(calmar             / CALMAR_MAX, 0, 1)\n",
    "    r = np.clip(m['total_return']  / RETURN_MAX, 0, 1)\n",
    "\n",
    "    return 0.50 * s + 0.30 * c + 0.20 * r\n",
    "\n",
    "# ── REJECTION CRITERIA ────────────────────────────────────────────────────────\n",
    "# Trials that return True are discarded (score → -999).\n",
    "\n",
    "def reject_fn(m):\n",
    "    if m is None:                      return True\n",
    "    if m['num_trades']    < 15:        return True\n",
    "    if m['win_rate']      < 0.3:      return True\n",
    "    if m['max_drawdown']  < -0.7:     return True\n",
    "    if m['profit_factor'] < 0.6:       return True\n",
    "    return False\n",
    "\n",
    "\n",
    "results = walk_forward(\n",
    "    df           = df,\n",
    "    strategy_fn  = my_strategy,\n",
    "    param_defs   = PARAM_DEFS,\n",
    "    fixed_params = FIXED_PARAMS,\n",
    "    train_bars   = TRAIN_BARS,\n",
    "    test_bars    = TEST_BARS,\n",
    "    burnin_bars  = BURNIN_BARS,\n",
    "    n_trials     = N_TRIALS,\n",
    "    cost         = COST,\n",
    "    score_fn     = score_fn,    # ← your notebook definition\n",
    "    reject_fn    = reject_fn,   # ← your notebook definition\n",
    "    save_csv     = None,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "8 folds: 1.12, (but higher returns decently)\n",
    "- most consistent! \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13",
   "metadata": {},
   "source": [
    "---\n",
    "## Granular Results and Parameter Stability\n",
    "\n",
    "Per-fold IS vs OOS performance. Each row is one fold — compare `train_*` vs `test_*` columns to assess overfitting.\n",
    "\n",
    "| Column | Description |\n",
    "|---|---|\n",
    "| `*_sharpe` `*_return` `*_drawdown` `*_calmar` | Core performance metrics |\n",
    "| `*_trades` `*_winrate` `*_profit_factor` | Trade statistics |\n",
    "| `optuna_score` | Best score achieved on training window |\n",
    "| `param_*` | Best parameter values per fold e.g. `param_ema_span` |\n",
    "\n",
    "**Concensus Parameters** - use to anchor: the engine determines stability using the coefficient of variation (CV) — the standard deviation of a parameter's best values across all folds divided by their median.\n",
    "\n",
    ">CV < 0.15: indicates the strategy  relies on value rather than it being noise-fitted to a specific period — making it safe to fix for future runs. A high CV means the parameter is period-sensitive and should stay free."
   ]
  },
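  {
   "cell_type": "markdown",
   "id": "13a",
   "metadata": {},
   "source": [
    "To make the CV definition above concrete, here is a small sketch on made-up per-fold values. It assumes the sample standard deviation (pandas default, `ddof=1`); the engine's own `stability_df` may use a different convention, and the `fold_params` frame is purely illustrative.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# hypothetical best values per fold, for illustration only\n",
    "fold_params = pd.DataFrame({\n",
    "    'ema_span': [20, 21, 19, 20],\n",
    "    'obv_lookback': [10, 30, 12, 28],\n",
    "})\n",
    "\n",
    "# CV as described above: std across folds divided by the median\n",
    "cv = fold_params.std() / fold_params.median().abs()\n",
    "stable = cv < 0.15\n",
    "print(pd.DataFrame({'cv': cv.round(3), 'stable': stable}))\n",
    "```\n",
    "\n",
    "In this toy data `ema_span` barely moves between folds and clears the 0.15 cutoff, while `obv_lookback` swings across most of its range and does not."
   ]
  },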
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── fold summary table ────────────────────────────────────────────────────────\n",
    "display_cols = [\n",
    "    'train_sharpe', 'test_sharpe',\n",
    "    'train_return', 'test_return',\n",
    "    'train_drawdown', 'test_drawdown',\n",
    "    'optuna_score',\n",
    "]\n",
    "display(results['results_df'][display_cols].round(2))\n",
    "\n",
    "# ── parameter stability ───────────────────────────────────────────────────────\n",
    "stability = results['stability_df'].copy()\n",
    "stability['stable'] = stability['stable'].map({True: '✓', False: ''})\n",
    "stability['fixed']  = stability['fixed'].map({True: '✓', False: ''})\n",
    "stability = stability[['param', 'median', 'std', 'cv', 'stable', 'fixed']].round(2)\n",
    "display(stability.sort_values('cv'))\n",
    "\n",
    "# ── consensus params ──────────────────────────────────────────────────────────\n",
    "stable = results['stability_df'][results['stability_df']['cv'] < 0.15]\n",
    "\n",
    "print('Stable parameters (CV < 0.15) — copy into FIXED_PARAMS:')\n",
    "for _, row in stable.iterrows():\n",
    "    v = results['consensus_params'][row['param']]\n",
    "    v_fmt = int(round(v)) if isinstance(v, float) and v == int(v) else round(v, 4) if isinstance(v, float) else v\n",
    "    print(f\"    '{row['param']}': {v_fmt},\")\n",
    "    \n",
    "print('\\nConsensus parameters (median across folds):')\n",
    "for k, v in results['consensus_params'].items():\n",
    "    print(f'  {k:<30} = {round(v, 2) if isinstance(v, float) else v}')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15",
   "metadata": {},
   "source": [
    "---\n",
    "## Parameter Robustness Checks\n",
    "\n",
    "### Plateau Analysis\n",
    "Sweep each free parameter across its range while holding others at consensus (median) value then evaluates the `score` at each point by backtesting over entire lookback.\n",
    "\n",
    "The stability table (CV across folds) tells you *\"does the optimizer consistently pick the same value?\"*  \n",
    "\n",
    "Plateau analysis tells you *\"if that value were slightly wrong, would performance collapse?\"*  \n",
    "\n",
    "**Plateau %** - what fraction of each parameter's range stays within `threshold`% (default 20) of peak score: >60% = `robust plateau`, 30–60% = `moderate`, <30% = `fragile` (consider anchoring). `N/A` means every sweep point failed rejection filters — the strategy is completely intolerant of movement on that dimension.\n",
    "\n",
    ">Run time: `n_free_params` × `n_steps`\n",
    "\n",
    "### Perturbation test\n",
    "Jitters all free parameters by ±5/10/20% of their range (50 random samples per offset range). Measures how much the score degrades vs the base\n",
    "\n",
    "Test whether optimum is a broad hill in `#free params`-D space or a narrow spike\n",
    "\n",
    "**>15%:** fragile optimum, consider reducing free parameters"
   ]
  },
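  {
   "cell_type": "markdown",
   "id": "15a",
   "metadata": {},
   "source": [
    "As a rough illustration of the plateau % described above, the sketch below counts the share of sweep points that stay within the threshold of the peak. `plateau_pct` is a hypothetical helper, and treating rejected sweep points as NaN is an assumption; `plateau_summary` in `wf_engine` may compute the figure differently.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def plateau_pct(scores, threshold=0.20):\n",
    "    '''Share of sweep points whose score is within threshold of the peak.'''\n",
    "    scores = np.asarray(scores, dtype=float)\n",
    "    valid = scores[~np.isnan(scores)]  # NaN marks a rejected sweep point here\n",
    "    if valid.size == 0 or valid.max() <= 0:\n",
    "        return None  # nothing usable, reported as N/A\n",
    "    cutoff = (1.0 - threshold) * valid.max()\n",
    "    return float((valid >= cutoff).sum()) / scores.size\n",
    "\n",
    "# hypothetical scores from a 10-step sweep of one parameter\n",
    "scores = [0.30, 0.42, 0.48, 0.50, 0.49, 0.47, 0.41, 0.35, 0.20, float('nan')]\n",
    "pct = plateau_pct(scores)\n",
    "print(pct)\n",
    "```\n",
    "\n",
    "Dividing by the full number of sweep points, rejected ones included, means a parameter that often trips the rejection filters scores a lower plateau % than one that merely dips below the cutoff."
   ]
  },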
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ── 1-D sensitivity sweeps around consensus params ─────────────────────────\n",
    "sweep_results = plateau_analysis(\n",
    "    df           = df,\n",
    "    strategy_fn  = my_strategy,\n",
    "    base_params  = results['consensus_params'],\n",
    "    param_defs   = PARAM_DEFS,\n",
    "    fixed_params = FIXED_PARAMS,\n",
    "    cost         = COST,\n",
    "    n_steps      = 20, #Adjust for number of steps around concensus per parameter\n",
    ")\n",
    "\n",
    "# ── text verdicts ──────────────────────────────────────────────────────────\n",
    "verdict_df = plateau_summary(\n",
    "    sweep_results,\n",
    "    base_params = results['consensus_params'],\n",
    "    stability_df = results['stability_df'],  \n",
    "    threshold   = 0.20, #Adjust for % around peak score\n",
    ")\n",
    "\n",
    "# ── neighbouahood perturbation ────────────────────────────────────────────\n",
    "# Randomly jitters ALL free params simultaneously.\n",
    "# If mean score degrades >15% at ±10% offset, the optimum is fragile.\n",
    "\n",
    "perturb_df = perturbation_test(\n",
    "    df           = df,\n",
    "    strategy_fn  = my_strategy,\n",
    "    base_params  = results['consensus_params'],\n",
    "    param_defs   = PARAM_DEFS,\n",
    "    fixed_params = FIXED_PARAMS,\n",
    "    cost         = COST,\n",
    "    pct_offsets  = (0.05, 0.10, 0.20),   # ±5%, ±10%, ±20% of range\n",
    "    n_samples    = 50,                     # random perturbations per offset level\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17",
   "metadata": {},
   "source": [
    "### 1-D sweep charts:\n",
    "| Element | Meaning | Good | Bad |\n",
    "|---------|---------|------|-----|\n",
    "| **Blue curve** | Composite score at each value of the parameter, with all others held at consensus | Flat-topped curve — performance is insensitive to the exact value | Narrow spike — optimizer latched onto one specific value, everything nearby is worse |\n",
    "| **Red dashed line** | Where the consensus value sits | On the flat top of the curve | On a steep slope or at the edge of a cliff |\n",
    "| **Green dashed line** | Cutoff at 80% of peak score — the boundary between plateau and non-plateau | Blue curve stays above this line across most of the range | Blue curve dips below it quickly either side of the peak |\n",
    "| **Green shading** | Plateau region — all values where the score stays within 20% of the peak | Wide green band spanning most of the range (robust) | Thin sliver or no shading at all (fragile/overfit) |\n",
    "\n",
    " If concensus on steep slope: parameter **REGIME SENSITIVE** - do not fix, backtests are disagreeing, want to fix parameters on flat top."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "18",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "# ── visual sweep curves ───────────────────────────────────────────────────\n",
    "plot_plateau_analysis(\n",
    "    sweep_results    = sweep_results,\n",
    "    consensus_params = results['consensus_params'],\n",
    "    param_defs       = PARAM_DEFS,\n",
    "    fixed_params     = FIXED_PARAMS,\n",
    "    threshold        = 0.20,\n",
    "    show             = False,\n",
    "    save_html        = None,\n",
    ")\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19",
   "metadata": {},
   "source": [
    "---\n",
    "## Results Charts and Cost Stress Test\n",
    "\n",
    "| Parameter | Description | Default |\n",
    "|---|---|---|\n",
    "| `show_fold_perf` | IS vs OOS bars for return, Sharpe, drawdown per fold | `False` |\n",
    "| `show_param_evol` | Parameter evolution across folds with ±1 std bands | `False` |\n",
    "| `show_oos_equity` | Combined OOS equity curve + drawdown with fold boundaries | `True` |\n",
    "| `show_trades` | Overlay entry/exit markers on OOS equity chart | `False` |\n",
    "| `benchmark_data` | DataFrame with `Close` column for buy & hold comparison | `None` |\n",
    "| `save_html_dir` | Directory path to save charts as HTML files, or `None` | `None` |\n",
    "\n",
    "**Cost Stress Test:** re-run the combined OOS backtest at 1×, 1.5×, 2×, 3× the base cost. Fragile strategies collapse; robust ones degrade gradually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "20",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_walk_forward_results(\n",
    "    results          = results,\n",
    "    param_defs       = PARAM_DEFS,\n",
    "    fixed_params     = FIXED_PARAMS,\n",
    "    benchmark_data   = df,\n",
    "    show             = True,\n",
    "    save_html_dir    = None,\n",
    "    show_fold_perf   = False,   # IS vs OOS bars by fold\n",
    "    show_param_evol  = False,   # parameter evolution across folds\n",
    "    show_oos_equity  = True,   # combined OOS equity curve\n",
    "    show_trades      = False,  # trade markers on OOS equity chart\n",
    ")\n",
    "\n",
    "# ── transaction cost stress test ──────────────────────────────────────────\n",
    "\n",
    "if results['oos_combined_df'] is not None:\n",
    "    cost_df = cost_stress_test(\n",
    "        oos_combined_df  = results['oos_combined_df'],\n",
    "        cost_multipliers = (1.0, 1.5, 2.0, 3.0),\n",
    "        base_cost        = COST,\n",
    "    )\n",
    "else:\n",
    "    print('No combined OOS dataframe — skip cost stress test')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21",
   "metadata": {},
   "outputs": [],
   "source": [
    "results['oos_combined_df'].to_pickle('oos/btc_oos.pkl')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}