{ "cells": [ { "cell_type": "markdown", "id": "b2985301", "metadata": {}, "source": [ "# Consistencies between SCMs\n", "\n", "In the previous notebooks we have discussed at length abstraction and (interventional) consistency, especially in the framework of [Rischel2020]. We have defined the notion of abstraction and abstraction error (first notebook), examined properties of this definition (second notebook), compared abstractions to transformations from [Rubenstein2017] (third notebook), implemented automated code to compute the abstraction error (fourth notebook), and then reviewed the compositionality of abstraction error from [Rischel2021] (fifth notebook).\n", "\n", "An underlying idea behind these explorations was that the quality of an abstraction could be assessed through a quantitative evaluation of interventional consistency, that is, the requirement that the following two paths produce the same result: (i) applying a low-level mechanism under intervention and then abstracting; or (ii) abstracting to the intervened high-level model and then applying the high-level mechanism.\n", "\n", "In this notebook we take a closer look at interventional consistency, and we explore other forms of consistency. We instantiate a series of models and abstractions to show that observational, interventional and counterfactual consistency are not strictly related to one another.\n", "\n", "The notebook aims to show succinctly that there exist models and abstractions guaranteeing different forms of consistency. 
We will go through the following steps: \n", "- Reviewing the definition and the relevance of consistency (Section 2)\n", "- Presenting the approach used to express the consistency problem as a commutativity problem on BNs in $\\mathtt{FinStoch}$ (Section 3)\n", "- Listing forms of consistency in the observational domain (Section 4)\n", "- Listing forms of consistency in the interventional domain (Section 5)\n", "- Listing forms of consistency in the counterfactual domain (Section 6)\n", "- Running a series of case studies in which we have models and abstractions satisfying different forms of consistency (Section 7).\n", "\n", "DISCLAIMER 1: the notebook refers to ideas from *causality* and *category theory* for which only a quick definition is offered. Useful references for causality are [Pearl2009,Peters2017], and for category theory [Spivak2014,Fong2018].\n", "\n", "DISCLAIMER 2: mistakes are in all likelihood due to misunderstandings by the notebook author in reading [Rischel2020]. Feedback very welcome! :)" ] }, { "cell_type": "markdown", "id": "17f68091", "metadata": {}, "source": [ "## Importing libraries and defining parameters\n", "\n", "We start by importing basic libraries." ] }, { "cell_type": "code", "execution_count": 1, "id": "0a9f44a4", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import networkx as nx\n", "import scipy\n", "from tqdm.notebook import tqdm\n", "\n", "from pgmpy.models import BayesianNetwork as BN\n", "from pgmpy.factors.discrete import TabularCPD as cpd\n", "from pgmpy.inference import VariableElimination" ] }, { "cell_type": "markdown", "id": "d49ca936", "metadata": {}, "source": [ "For reproducibility, and for discussing our results in this notebook, we set the random seed to $0$."
] }, { "cell_type": "code", "execution_count": 2, "id": "55aa8bcb", "metadata": {}, "outputs": [], "source": [ "np.random.seed(0)" ] }, { "cell_type": "markdown", "id": "3570793f", "metadata": {}, "source": [ "We set a verbose parameter to control the display of probability distributions." ] }, { "cell_type": "code", "execution_count": 3, "id": "d51f1bbc", "metadata": {}, "outputs": [], "source": [ "verbose = False" ] }, { "cell_type": "markdown", "id": "2d1aa800", "metadata": {}, "source": [ "We also set the number of samples in our empirical simulations." ] }, { "cell_type": "code", "execution_count": 4, "id": "f5c1b448", "metadata": {}, "outputs": [], "source": [ "n_samples = 10**5" ] }, { "cell_type": "markdown", "id": "0bff83ec", "metadata": {}, "source": [ "In this notebook we write implementations of our models directly in *pgmpy*, but we do not rely on our own Abstraction objects from *src.SCMMappings*. For the sake of simplicity and illustration we will mainly work with abstractions that are trivial identities." ] }, { "cell_type": "markdown", "id": "9d2128f2", "metadata": {}, "source": [ "# A review of consistencies\n", "\n", "We are generically interested in working with forms of consistency. Following [Rischel2020] we will express consistency as a (category-theoretical) commuting diagram:\n", "\n", "$$\n", "\\begin{array}{ccc}\n", "A& \\overset{\\mu}{\\rightarrow} & B \\\\\n", "\\alpha_{X}{\\downarrow}& &{\\downarrow}\\alpha_{Y} \\\\\n", "X& \\overset{\\nu}{\\rightarrow} & Y\n", "\\end{array}\n", "$$\n", "\n", "where $A,B,X,Y$ are finite sets, and $\\mu,\\nu,\\alpha_X,\\alpha_Y$ are Markov kernels (column-stochastic matrices). In our abstraction setting $A,B,X,Y$ are outcomes, $\\mu,\\nu$ mechanisms, and $\\alpha_X,\\alpha_Y$ are abstractions with the added requirement of being binary matrices." 
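, "\n", "\n", "As a minimal worked illustration (toy numbers of our own choosing, not taken from [Rischel2020]): since all arrows are column-stochastic matrices, the diagram commutes exactly when the two composite matrices coincide, $\\alpha_Y \\, \\mu = \\nu \\, \\alpha_X$. Assume, hypothetically, $|A|=3$ and $|B|=|X|=|Y|=2$, with $\\alpha_X$ merging the first two outcomes of $A$ and $\\alpha_Y$ the identity:\n", "\n", "$$\n", "\\alpha_X = \\begin{pmatrix} 1 & 1 & 0 \\\\ 0 & 0 & 1 \\end{pmatrix}, \\quad\n", "\\mu = \\begin{pmatrix} 0.7 & 0.7 & 0.2 \\\\ 0.3 & 0.3 & 0.8 \\end{pmatrix}, \\quad\n", "\\nu = \\begin{pmatrix} 0.7 & 0.2 \\\\ 0.3 & 0.8 \\end{pmatrix}.\n", "$$\n", "\n", "Then\n", "\n", "$$\n", "\\nu \\, \\alpha_X = \\begin{pmatrix} 0.7 & 0.7 & 0.2 \\\\ 0.3 & 0.3 & 0.8 \\end{pmatrix} = \\alpha_Y \\, \\mu,\n", "$$\n", "\n", "so this toy diagram commutes. Changing the second column of $\\mu$ to $(0.5, 0.5)^T$ would break commutativity, because two outcomes merged by $\\alpha_X$ would then induce different distributions over $Y$."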
] }, { "cell_type": "markdown", "id": "0707018e", "metadata": {}, "source": [ "Consistency tells us that starting from an element $a \\in A$ we arrive at the same result whether:\n", "1. we follow the upper path, computing $\\alpha_Y \\circ \\mu (a)$; or\n", "2. we follow the lower path, computing $\\nu \\circ \\alpha_X (a)$." ] }, { "cell_type": "markdown", "id": "4f5eec38", "metadata": {}, "source": [ "**Meaning of consistency.** Why are we so interested in consistency? Consistency tells us something about the *alignment*, or about the *correspondence*, of the mechanisms $\\mu$ and $\\nu$ with respect to the abstraction. It says something about the *dynamics* of the models, and how their inputs and outputs are related.\n", "\n", "It is, however, a strongly *syntactical* property. It is concerned with the *replaceability* of one model with the other in the sense of maintaining an alignment between them, but it does not concern itself with the *quality* or *information* at different levels." ] }, { "cell_type": "markdown", "id": "7e5c11d6", "metadata": {}, "source": [ "**Taxonomy of consistencies.** Now, we can have different forms of consistency with respect to two different parameters used to generate the diagrams above:\n", "\n", "- *Originating graph*: whether the graph from which we derive the diagram is the observational model $\\mathcal{M}$, a post-interventional model $\\mathcal{M}_\\iota$, or a counterfactual model $\\mathcal{M}_{\\iota\\bar{\\iota}}$.\n", "\n", "- *Distributions*: relating to the choice of the sets $A,B,X,Y$ and consequently $\\mu,\\nu$." ] }, { "cell_type": "markdown", "id": "730f6238", "metadata": {}, "source": [ "**A note on the formalism.** Since $\\mu$ and $\\nu$ are matrices encoding distributions, in the following section we will use an explicit probability notation, instead of Greek letters (normally used to denote mechanisms) or capital letters (sometimes used for matrices). 
We will then use symbols such as $p_\\mathcal{M}(A)$ to refer to a matrix encoding this discrete distribution. Every probability distribution thus corresponds to a stochastic matrix. Moreover, the subscript will always make clear from which model the distribution is derived." ] }, { "cell_type": "markdown", "id": "8eebdbd5", "metadata": {}, "source": [ "**A note on computing distributions in the models.** In the following diagrams we will refer to different distributions such as $p_{\\mathcal{M'}}(X)$ or $p_{\\mathcal{M}}(B\\vert A)$. In the context of Markovian and semi-Markovian models, we assume that all these quantities are computable from the base SCMs $\\mathcal{M}$ and ${\\mathcal{M'}}$ and their joints." ] }, { "cell_type": "markdown", "id": "252dd933", "metadata": {}, "source": [ "# Approach to causal consistencies\n", "\n", "As suggested above we want to consider different forms of consistency, specifically observational, interventional and counterfactual.\n", "\n", "It is well-known that these three types of quantities live on three layers of a strict hierarchy; each layer is furthermore associated with a mathematical object that allows for the treatment of these quantities. Thus, Bayesian networks (BN) deal only with observational quantities; causal Bayesian networks (CBN) enable us to evaluate not only observational quantities, but also interventional quantities; and, finally, structural causal models (SCM) add the possibility of considering counterfactual quantities [Bareinboim2022].\n", "\n", "Our discussion of causal models started from SCMs [Pearl2009] (see first notebook), but through the simplification of our models and their embedding in $\\mathtt{FinStoch}$ following the approach of [Rischel2020] (see first notebook again) we dropped enough information that we have been left virtually with a BN or a CBN. 
If we were working with a CBN we would not be able to discuss counterfactual queries; with a BN we could not even assess an interventional query.\n", "\n", "The approach that we have implicitly followed, and which we will make explicit now, is that we are always given a completely defined SCM $\\mathcal{M}^*$ (here we use the star to identify this ground truth SCM and distinguish it from other CBNs or BNs). However, we do not work directly with the SCM $\\mathcal{M}^*$. For us, the SCM is a generator of BNs (or CBNs). For us, in a sense, an SCM is a more expressive object than a CBN or a BN because it is actually a set of CBNs or BNs; it is a set of rules for generating new models.\n", "\n", "So, if we want to consider consistency in the observational domain, we will take our SCM $\\mathcal{M}^*$ and extract from it the base BN $\\mathcal{M}$. If we want to consider consistency in the interventional domain, and thus work with an intervention $\\iota$, we will extract from $\\mathcal{M}^*$ the model $\\mathcal{M}_\\iota$ which, after edge removal and structural function replacement, is again a BN. If we want to consider consistency in the counterfactual domain, assuming we have performed an intervention $\\iota$ and now we want to observe the effects had we performed $\\bar{\\iota}$, we will extract from $\\mathcal{M}^*$ the model $\\mathcal{M}_{\\iota\\bar{\\iota}}$ which, after factual intervention, abduction on the exogenous variables and counterfactual intervention [Pearl2009], provides us with a BN.\n", "\n", "Every time we have to evaluate one form of consistency (marginal, joint, conditional) we will therefore evaluate it in the relevant BN, assuming it can be represented in $\\mathtt{FinStoch}$ and that commutativity is well-defined. 
" ] }, { "cell_type": "markdown", "id": "9b59db39", "metadata": {}, "source": [ "# Observational consistencies\n", "\n", "Observational consistencies are consistencies evaluated on the observational model $\\mathcal{M}$, to which no intervention has been applied. Both the base model $\\mathcal{M}$ and the abstracted model $\\mathcal{M'}$ are untouched." ] }, { "cell_type": "markdown", "id": "bf6acf9a", "metadata": {}, "source": [ "## Marginal observational consistency\n", "\n", "In *marginal observational consistency* we are interested in the alignment of a single-variable marginal in the abstracted model with respect to the (possibly multivariate) marginal in the base model.\n", "\n", "Here our focus is on: **$P_\\mathcal{M'}(X)$**" ] }, { "cell_type": "markdown", "id": "196b6fb0", "metadata": {}, "source": [ "Consider, for instance, the following generic scenario in which two variables $A,B$ from $\\mathcal{M}$ are mapped onto $X$ in $\\mathcal{M'}$:\n", "\n", "