{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Receiver Operating Characteristic (ROC)\n", "\n", "A ROC curve shows the ability of a probabilistic forecast of a binary event (e.g., chance of rainfall) to discriminate between events and non-events. It plots the hit rate or probability of detection (POD) against the false alarm rate or probability of false detection (POFD) using a set of increasing probability thresholds (e.g., 0.1, 0.2, 0.3, ...) that convert the probabilistic forecast into a binary forecast.\n", "\n", "The area under the ROC curve is often used as a measure of discrimination ability. ROC is insensitive to forecast bias, so it should be used together with reliability or some other metric that measures bias." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from scores.categorical import probability_of_detection, probability_of_false_detection\n", "from scores.probability import roc_curve_data\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import xarray as xr\n", "from scipy.stats import beta, binom" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Imagine we have a sequence of binary forecasts and a sequence of binary observations, where 1 indicates a forecast or observed event and 0 indicates a forecast or observed non-event." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "fcst = xr.DataArray([1, 0, 1, 0, 1])\n", "obs = xr.DataArray([1, 0, 0, 1, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then use `scores` to calculate the POD ($\\frac{H}{H + M}$) and POFD ($\\frac{F}{C + F}$), where $H, M, F, C$ are the numbers of hits, misses, false alarms, and correct negatives, respectively." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'ctable_probability_of_detection' ()>\n",
"array(0.66666667)\n",
"<xarray.DataArray 'ctable_probability_of_false_detection' ()>\n",
"array(0.5)\n",
"<xarray.Dataset>\n",
"Dimensions: (threshold: 101)\n",
"Coordinates:\n",
" * threshold (threshold) float64 0.0 0.01 0.02 0.03 ... 0.97 0.98 0.99 1.0\n",
"Data variables:\n",
" POD (threshold) float64 1.0 1.0 1.0 1.0 ... 0.1125 0.0875 0.04063 0.0\n",
" POFD (threshold) float64 1.0 1.0 1.0 1.0 1.0 ... 0.0 0.0 0.0 0.0 0.0\n",
" AUC float64 0.8074