{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Consistent scores\n", "## Introduction \n", "\n", "A \"consistent scoring function\" is a scoring function where a forecaster's expected score will be optimised by following a forecast directive. An example of a forecast directive is to \"predict the mean (i.e., expected) rainfall amount\". Using a consistent scoring function ensures that a forecaster or forecast system developer isn't faced with the dilemma of whether to \n", "\n", "1. produce forecasts (or develop a forecast system) that optimises the expected verification score, or\n", "2. produce an honest\\* forecast (or develop an honest forecast system) that is optimal for the forecast directive. \n", "\n", "\\* note that by \"honest\" we mean issuing a forecast that corresponds to a forecaster's true belief and has not been \"hedged\" ([Murphy 1978](https://doi.org/10.1175/1520-0477(1978)059<0371:HATMOE>2.0.CO;2)) to try and improve the verification score by producing a forecast that does not correspond to their true belief.\n", "\n", "Using a consistent scoring function to evaluate forecast performance ensures that objectives both 1 and 2 can be met without any tensions. For example, the Mean Squared Error (MSE) is a consistent score for forecasting the mean (i.e., expected) rainfall amount, since forecasting the \"mean rainfall amount\" will minimise the expected MSE. However the Mean Absolute Error (MAE) is not consistent with forecasting the mean value and is instead consistent with forecasting the median value. Consistent scoring functions are formally defined in ([Gneiting et al., 2011a](https://doi.org/10.1198/jasa.2011.r10138)).\n", "\n", "\n", "You may already know that the MSE is a consistent score for forecasting the mean and the MAE is a consistent score for forecasting the median. But did you know that there are a whole family of scores that are consistent for predicting the mean, median, or quantiles? \n", "\n", "The consistent scoring module in `scores` provides access to these families of consistent scores that can be used to tailor emphasis of predictive performance based on decision thresholds of interest.\n", "\n", "