{ "cells": [ { "cell_type": "markdown", "id": "361e6e0a", "metadata": {}, "source": [ "---\n", "title: Anatomy of a Real Crypto Order Book\n", "summary: Spread, depth, and slippage measured on a real recorded BTC-USD order book from Coinbase — the microstructure that sets your true trading costs.\n", "tags: [crypto, microstructure, order-book, slippage]\n", "---\n", "\n", "# Anatomy of a Real Crypto Order Book\n", "\n", "Trading isn't free, and the \"price\" on a screen isn't the price you get. This post dissects a **real\n", "recorded BTC-USD limit order book** (Coinbase, ~600 snapshots over ~10 minutes — the same data that\n", "powers the [ConvexPi Arena](https://convexpi.ai/compete/arena-book)) to measure the three things that\n", "actually determine your cost: the **spread**, the **depth**, and the **slippage** of walking the book.\n", "See [how matching works](https://convexpi.ai/exchange) for the mechanics." ] }, { "cell_type": "code", "execution_count": null, "id": "fd722456", "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import json, urllib.request\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "URL = \"https://raw.githubusercontent.com/convexpi/arena/f75088c55d58c8693fec8e85c9ef4a8927ab2f2e/data/btcusd_book.jsonl\"\n", "raw = urllib.request.urlopen(URL).read().decode()\n", "frames = [json.loads(l) for l in raw.splitlines() if l.strip()]\n", "print(f\"{len(frames)} order-book snapshots; each has bids 'b' and asks 'a' = [[price, size], ...]\")" ] }, { "cell_type": "markdown", "id": "7725ac97", "metadata": {}, "source": [ "## 1. The spread\n", "\n", "The **best bid** and **best ask** bracket the mid; the gap is the **spread** — what you immediately\n", "pay to cross from buyer to seller. Coinbase BTC is famously tight, so we look at it in basis points." 
] }, { "cell_type": "code", "execution_count": null, "id": "aecb1e1f", "metadata": {}, "outputs": [], "source": [ "rows = []\n", "for f in frames:\n", "    # Levels are [price, size]; this code treats index 0 on each side as the best quote.\n", "    bb, ba = f[\"b\"][0][0], f[\"a\"][0][0]\n", "    mid = (bb + ba) / 2\n", "    rows.append({\"t\": f[\"t\"], \"mid\": mid, \"spread\": ba - bb, \"spread_bps\": (ba - bb) / mid * 1e4})\n", "df = pd.DataFrame(rows)\n", "df[\"time\"] = pd.to_datetime(df[\"t\"], unit=\"ms\")\n", "print(f\"mid price : ${df['mid'].mean():,.2f}\")\n", "print(f\"spread : ${df['spread'].mean():.3f} ({df['spread_bps'].mean():.2f} bps avg, \"\n", "      f\"{df['spread_bps'].median():.2f} bps median)\")\n", "\n", "fig, ax = plt.subplots(2, 1, figsize=(10, 5), sharex=True)\n", "ax[0].plot(df[\"time\"], df[\"mid\"], lw=1); ax[0].set_title(\"BTC-USD mid price\")\n", "ax[1].plot(df[\"time\"], df[\"spread_bps\"], lw=1, color=\"indianred\"); ax[1].set_title(\"Spread (bps)\")\n", "plt.tight_layout(); plt.show()" ] }, { "cell_type": "markdown", "id": "bb9b3ce6", "metadata": {}, "source": [ "## 2. Depth: the shape of the book\n", "\n", "The spread only tells you the cost of a *tiny* trade. For real size, what matters is **depth**: how\n", "much is resting at each price level. For each of the top 15 levels on each side, we average the\n", "cumulative size down to that level and the level's distance (in bps) from the mid."
] }, { "cell_type": "code", "execution_count": null, "id": "d4613595", "metadata": {}, "outputs": [], "source": [ "LEVELS = 15\n", "def cum_depth(side_key, sign):\n", "    # cumulative size and its avg price-offset (bps) for the first LEVELS levels, averaged over frames\n", "    sizes = np.zeros(LEVELS); offs = np.zeros(LEVELS); cnt = np.zeros(LEVELS)\n", "    for f in frames:\n", "        mid = (f[\"b\"][0][0] + f[\"a\"][0][0]) / 2\n", "        cum = 0.0\n", "        for i, (price, qty) in enumerate(f[side_key][:LEVELS]):\n", "            cum += qty\n", "            sizes[i] += cum\n", "            offs[i] += sign * (price - mid) / mid * 1e4\n", "            cnt[i] += 1  # counted per level, so snapshots with fewer than LEVELS levels still average correctly\n", "    return offs / cnt, sizes / cnt\n", "# sign=-1 makes bid offsets positive; they are negated again below so bids plot left of the mid\n", "bid_off, bid_cum = cum_depth(\"b\", -1)\n", "ask_off, ask_cum = cum_depth(\"a\", +1)\n", "\n", "fig, ax = plt.subplots(figsize=(9, 3.5))\n", "ax.step(-bid_off, bid_cum, where=\"mid\", color=\"seagreen\", label=\"bids (cumulative)\")\n", "ax.step(ask_off, ask_cum, where=\"mid\", color=\"indianred\", label=\"asks (cumulative)\")\n", "ax.axvline(0, color=\"grey\", lw=0.5)\n", "ax.set_xlabel(\"distance from mid (bps)\"); ax.set_ylabel(\"cumulative size (BTC)\")\n", "ax.set_title(\"Average book depth\"); ax.legend(); plt.tight_layout(); plt.show()\n", "print(f\"avg size in top {LEVELS} ask levels: {ask_cum[-1]:.2f} BTC (~${ask_cum[-1]*df['mid'].mean():,.0f})\")" ] }, { "cell_type": "markdown", "id": "03d3c42a", "metadata": {}, "source": [ "## 3. Slippage: what a market order really costs\n", "\n", "A market buy *walks the ask ladder*: it fills the cheapest offers first, then more expensive ones. The\n", "**volume-weighted price you pay** drifts above the mid. That gap is **slippage**, and it grows with order\n", "size. This is the cost the Arena makes you feel when you send a market order into the real book."
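, "\n", "\n", "As a sketch in symbols (the notation $q_i$, $p_i$, $m$ is introduced here for illustration): if a market\n", "buy fills quantity $q_i$ at ask price $p_i$ and the mid is $m$, the slippage in basis points is\n", "\n", "$$\\text{slippage}_{\\text{bps}} = \\left(\\frac{\\sum_i q_i\\,p_i}{m \\sum_i q_i} - 1\\right) \\times 10^4.$$\n", "\n", "With made-up numbers: at a mid of \\$100,000, a 1 BTC buy that fills 0.5 BTC at \\$100,005 and 0.5 BTC at\n", "\\$100,015 pays a volume-weighted \\$100,010, which is 1 bp above the mid."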
] }, { "cell_type": "code", "execution_count": null, "id": "cb0c0775", "metadata": {}, "outputs": [], "source": [ "def slippage_buy_bps(frame, size_btc):\n", "    mid = (frame[\"b\"][0][0] + frame[\"a\"][0][0]) / 2\n", "    remaining, cost, filled = size_btc, 0.0, 0.0\n", "    for price, qty in frame[\"a\"]:  # walk the asks, cheapest first\n", "        take = min(remaining, qty)\n", "        cost += take * price\n", "        filled += take\n", "        remaining -= take\n", "        if remaining <= 1e-12:\n", "            break\n", "    if filled < size_btc * 0.999:  # book too thin to fill this size\n", "        return np.nan\n", "    return (cost / filled / mid - 1) * 1e4\n", "\n", "sizes = [0.05, 0.1, 0.25, 0.5, 1, 2, 3, 5]\n", "# Compute each (snapshot, size) pair once; NaN marks snapshots that cannot fill the order.\n", "slip_by_size = [np.array([slippage_buy_bps(f, s) for f in frames]) for s in sizes]\n", "slip = [np.nanmean(x) for x in slip_by_size]\n", "fillable = [np.mean(~np.isnan(x)) for x in slip_by_size]\n", "\n", "fig, ax = plt.subplots(figsize=(9, 3.5))\n", "ax.plot(sizes, slip, \"o-\"); ax.set_xlabel(\"market-buy size (BTC)\"); ax.set_ylabel(\"avg slippage (bps)\")\n", "ax.set_title(\"Slippage vs order size\"); plt.tight_layout(); plt.show()\n", "for s, sl, fr in zip(sizes, slip, fillable):\n", "    print(f\" buy {s:>4} BTC (~${s*df['mid'].mean():>10,.0f}) -> {sl:5.2f} bps slippage [fillable in {fr:.0%} of snapshots]\")" ] }, { "cell_type": "markdown", "id": "c51af885", "metadata": {}, "source": [ "## Takeaways\n", "\n", "1. **The spread is the floor, not the cost.** Coinbase's BTC spread is ~1 bp, but that only covers a\n", "   dust-sized trade.\n", "2. **Depth is finite.** The top of the book holds only a few BTC; a larger order walks up the ladder.\n", "3. **Slippage grows with size** and rises sharply as an order approaches the limit of visible depth,\n", "   which is the real reason big orders are split, worked, or routed.\n", "\n", "This is exactly what the [Arena's real-order-book mode](https://convexpi.ai/compete/arena-book) makes\n", "tangible: your market orders pay *this* slippage, and a market maker earns it. Build an agent in the\n", "[market-making lesson](https://convexpi.ai/lessons/market-making)."
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }