{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Preprocessing of Option Quotes\n", "==============================\n", "\n", "This notebook demonstrates the preprocessing of equity option quotes, in preparation for estimating the parameters of a stochastic model.\n", "A number of preliminary calculations must be performed:\n", "\n", "1. Calculation of implied risk-free rate and dividend yield, and derivation of forward prices\n", "2. Calculation of forward at-the-money volatility. An option struck exactly at the forward price is rarely quoted, so this value must be computed by interpolation.\n", "3. Calculation of the Black-Scholes implied bid and ask volatility, given bid and ask option prices.\n", "4. Calculation of 'Quick Delta': a common measure of moneyness, useful for representing the volatility smile.\n", "\n", "Each step is now described.\n", "\n", "Calculation of implied dividend yield and risk-free rate\n", "--------------------------------------------------------\n", "\n", "Recall the put-call parity relationship with continuous dividends:\n", "\n", "$$\n", "C_t - P_t = S_t e^{-d (T-t)} - K e^{-r (T-t)}\n", "$$\n", "\n", "where\n", "\n", "* $C_t$ price of call at time $t$\n", "* $P_t$ price of put at time $t$\n", "* $S_t$ spot price of underlying asset\n", "* $d$ continuous dividend yield\n", "* $r$ risk-free rate\n", "* $T$ expiry\n", "\n", "For each maturity, we estimate the linear regression:\n", "\n", "$$\n", "C_t - P_t = a_0 + a_1 K\n", "$$\n", "\n", "Matching terms with put-call parity gives $a_0 = S_t e^{-d (T-t)}$ and $a_1 = -e^{-r (T-t)}$, which yields\n", "\n", "$$\n", "r = - \\frac{1}{T-t} \\ln (-a_1)\n", "$$\n", "$$\n", "d = \\frac{1}{T-t} \\ln \\left( \\frac{S_t}{a_0} \\right)\n", "$$\n", "\n", "Calculation of forward at-the-money volatility\n", "----------------------------------------------\n", "\n", "We next want to estimate the implied volatility of an option struck at the forward price. 
In general, such an option is not traded, and the volatility must therefore be estimated. The calculation involves 3 steps, performed separately on calls and puts:\n", "\n", "1. Estimate the bid ($\\sigma_b(K)$) and ask ($\\sigma_a(K)$) Black-Scholes volatility for each quote.\n", "2. Compute a mid-market implied volatility for each quote:\n", "$$\n", "\\sigma(K) = \\frac{\\sigma_b(K)+\\sigma_a(K)}{2}\n", "$$\n", "3. Let $F$ be the forward price. The corresponding mid-market implied volatility is computed by linear interpolation between the two quotes bracketing $F$.\n", "\n", "The forward ATM volatility is the average of the volatilities computed on calls and puts.\n", "\n", "Quick Delta\n", "-----------\n", "\n", "Recall that the delta of a European call on a non-dividend-paying asset is $N(d_1)$, where $N$ is the standard normal cumulative distribution function, $T$ now denotes the time to expiry, and\n", "\n", "$$\n", "d_{1} = \\frac{1}{\\sigma \\sqrt{T}} \\left[ \\ln \\left( \\frac{S}{K} \\right) + \\left( r + \\frac{1}{2}\\sigma^2 \\right)T \\right]\n", "$$\n", "\n", "The \"Quick Delta\" (QD) is a popular measure of moneyness, inspired by the definition of delta:\n", "\n", "$$\n", "QD(K) = N \\left( \\frac{1}{\\sigma \\sqrt{T}} \\ln \\left( \\frac{F_T}{K} \\right) \\right)\n", "$$\n", "\n", "where $F_T$ is the forward price and $\\sigma$ the forward ATM volatility for that maturity.\n", "\n", "Note that $QD(F_T)=0.5$ for all maturities, while the regular forward delta is a function of time to expiry. This property of Quick Delta makes it convenient for representing the volatility smile.\n", "\n", "Data Filters\n", "------------\n", "\n", "A number of filters may be applied, in an attempt to exclude inconsistent or erroneous data.\n", "\n", "1. Exclusion of maturities shorter than $t_{Min}$\n", "2. Exclusion of maturities with fewer than $n_{Min}$ quotes\n", "3. Exclusion of quotes with Quick Delta less than $QD_{Min}$ or higher than $QD_{Max}$\n", "\n", "Implementation\n", "--------------\n", "\n", "This logic is implemented in the function `Compute_IV`, presented below. 
The function takes a `pandas DataFrame` as its argument and returns another\n", "`DataFrame`, with one row per quote and 14 columns:\n", "\n", "1. Type: 'C'/'P'\n", "2. Strike\n", "3. dtExpiry\n", "4. dtTrade\n", "5. Spot\n", "6. IVBid: Black-Scholes implied volatility (bid)\n", "7. IVAsk: Black-Scholes implied volatility (ask)\n", "8. QD: Quick Delta\n", "9. iRate: risk-free rate (continuously compounded)\n", "10. iDiv: dividend yield (continuously compounded)\n", "11. Fwd: Forward price\n", "12. TTM: Time to maturity, in fraction of years (ACT/365)\n", "13. PBid: Premium (bid)\n", "14. PAsk: Premium (ask)\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas\n", "import dateutil\n", "import re\n", "import datetime\n", "import os\n", "import numpy as np\n", "from pandas import DataFrame\n", "from scipy.interpolate import interp1d\n", "from scipy.stats import norm\n", "\n", "import quantlib.reference.names as nm\n", "import quantlib.pricingengines.blackformula\n", "from quantlib.pricingengines.blackformula import blackFormulaImpliedStdDev\n", "\n", "def Compute_IV(optionDataFrame, tMin=0, nMin=0, QDMin=0, QDMax=1, keepOTMData=True):\n", "\n", "    \"\"\"\n", "    Pre-processing of a standard European option quote file.\n", "    - Calculation of implied risk-free rate and dividend yield\n", "    - Calculation of implied volatility\n", "    - Estimate ATM volatility for each expiry\n", "    - Compute implied volatility and Quick Delta for each quote\n", "\n", "    Options for filtering the input data set:\n", "    - maturities with fewer than nMin strikes are ignored\n", "    - maturities shorter than tMin (ACT/365 daycount) are ignored\n", "    - strikes with Quick Delta < QDMin or > QDMax are ignored\n", "    \"\"\"\n", "\n", "    grouped = optionDataFrame.groupby(nm.EXPIRY_DATE)\n", "\n", "    isFirst = True\n", "    for spec, group in grouped:\n", "        print('processing group %s' % spec)\n", "\n", "        # implied vol for this expiry group\n", "\n", "        indx = group.index\n", "\n", "        
dtTrade = group[nm.TRADE_DATE][indx[0]]\n", "        dtExpiry = group[nm.EXPIRY_DATE][indx[0]]\n", "        spot = group[nm.SPOT][indx[0]]\n", "        daysToExpiry = (dtExpiry-dtTrade).days\n", "        timeToMaturity = daysToExpiry/365.0\n", "\n", "        # exclude groups with too short time to maturity\n", "\n", "        if timeToMaturity < tMin:\n", "            continue\n", "\n", "        # exclude groups with too few data points\n", "\n", "        df_call = group[group[nm.OPTION_TYPE] == nm.CALL_OPTION]\n", "        df_put = group[group[nm.OPTION_TYPE] == nm.PUT_OPTION]\n", "\n", "        if (len(df_call) < nMin) or (len(df_put) < nMin):\n", "            continue\n", "\n", "        # calculate forward, implied interest rate and implied div. yield\n", "\n", "        df_C = DataFrame((df_call['PBid']+df_call['PAsk'])/2, columns=['PremiumC'])\n", "        df_C.index = df_call['Strike'].values\n", "\n", "        df_P = DataFrame((df_put['PBid']+df_put['PAsk'])/2, columns=['PremiumP'])\n", "        df_P.index = df_put['Strike'].values\n", "\n", "        # use 'inner' join because some strikes are not quoted for C and P\n", "        df_all = df_C.join(df_P, how='inner')\n", "        df_all['Strike'] = df_all.index\n", "        df_all['C-P'] = df_all['PremiumC'] - df_all['PremiumP']\n", "\n", "        # regress C-P on strike: columns of A are [strike, 1]\n", "        y = np.array(df_all['C-P'])\n", "        x = np.array(df_all['Strike'])\n", "        A = np.vstack((x, np.ones(x.shape))).T\n", "\n", "        b = np.linalg.lstsq(A, y)[0]\n", "\n", "        # b[0] is the slope a_1, b[1] is the intercept a_0\n", "        iRate = -np.log(-b[0])/timeToMaturity\n", "        dRate = np.log(spot/b[1])/timeToMaturity\n", "\n", "        discountFactor = np.exp(-iRate*timeToMaturity)\n", "        Fwd = spot * np.exp((iRate-dRate)*timeToMaturity)\n", "\n", "        print('Fwd: %f int rate: %f div yield: %f' % (Fwd, iRate, dRate))\n", "\n", "        # Black-Scholes implied volatility of a single quote (NaN if the solver fails)\n", "\n", "        def impvol(cp, strike, premium):\n", "            try:\n", "                vol = blackFormulaImpliedStdDev(cp, strike,\n", "                    forward=Fwd, blackPrice=premium, discount=discountFactor,\n", "                    TTM=timeToMaturity)\n", "            except Exception:\n", "                vol = np.nan\n", "            # convert the implied standard deviation to an annualized volatility\n", "            return vol/np.sqrt(timeToMaturity)\n", "\n", "        # implied bid/ask vol for all 
options\n", "\n", "        df_call['IVBid'] = [impvol('C', strike, price) for strike, price\n", "                           in zip(df_call['Strike'], df_call['PBid'])]\n", "        df_call['IVAsk'] = [impvol('C', strike, price) for strike, price\n", "                           in zip(df_call['Strike'], df_call['PAsk'])]\n", "\n", "        df_call['IVMid'] = (df_call['IVBid'] + df_call['IVAsk'])/2\n", "\n", "        df_put['IVBid'] = [impvol('P', strike, price) for strike, price\n", "                          in zip(df_put['Strike'], df_put['PBid'])]\n", "        df_put['IVAsk'] = [impvol('P', strike, price) for strike, price\n", "                          in zip(df_put['Strike'], df_put['PAsk'])]\n", "\n", "        df_put['IVMid'] = (df_put['IVBid'] + df_put['IVAsk'])/2\n", "\n", "        # linear interpolation of the mid vol at the forward price\n", "        f_call = interp1d(df_call['Strike'].values, df_call['IVMid'].values)\n", "        f_put = interp1d(df_put['Strike'].values, df_put['IVMid'].values)\n", "\n", "        atmVol = (f_call(Fwd)+f_put(Fwd))/2\n", "        print('ATM vol: %f' % atmVol)\n", "\n", "        # Quick Delta, computed with ATM vol\n", "        rv = norm()\n", "        df_call['QuickDelta'] = [rv.cdf(np.log(Fwd/strike)/(atmVol*np.sqrt(timeToMaturity))) \\\n", "                                for strike in df_call['Strike']]\n", "        df_put['QuickDelta'] = [rv.cdf(np.log(Fwd/strike)/(atmVol*np.sqrt(timeToMaturity))) \\\n", "                               for strike in df_put['Strike']]\n", "\n", "        # keep data within QD range\n", "\n", "        df_call = df_call[(df_call['QuickDelta'] >= QDMin) & \\\n", "                          (df_call['QuickDelta'] <= QDMax)]\n", "\n", "        df_put = df_put[(df_put['QuickDelta'] >= QDMin) & \\\n", "                        (df_put['QuickDelta'] <= QDMax)]\n", "\n", "        # final assembly...\n", "\n", "        df_cp = pandas.concat([df_call, df_put], ignore_index=True)\n", "        df_cp[nm.INTEREST_RATE] = iRate\n", "        df_cp[nm.DIVIDEND_YIELD] = dRate\n", "        df_cp[nm.ATMVOL] = atmVol\n", "        df_cp[nm.FORWARD] = Fwd\n", "\n", "        # keep only OTM data ?\n", "        if keepOTMData:\n", "            df_cp = df_cp[((df_cp[nm.STRIKE]>=Fwd) & (df_cp[nm.OPTION_TYPE] == nm.CALL_OPTION)) |\n", "                          ((df_cp[nm.STRIKE]