---
name: causal-inference
description: >
  Production-grade Bayesian causal inference with PyMC, CausalPy, and DoWhy. Enforces DAG-first
  thinking, mandatory user checkpoints for assumptions, design-specific refutation, and defensible
  reporting with causal language guardrails. Trigger on: causal inference, causal effect estimation,
  treatment effects, counterfactuals, difference-in-differences (DiD), synthetic control, regression
  discontinuity (RDD), interrupted time series (ITS), instrumental variables (IV), propensity scores,
  DAGs, causal graphs, confounders, backdoor criterion, do-calculus, interventional distributions,
  pm.do(), pm.observe(), CausalPy, DoWhy, mediation analysis, refutation, sensitivity analysis,
  parallel trends, placebo tests, or any question of the form "does X cause Y" or "what is the
  effect of X on Y."
license: MIT
metadata:
  author: "[Alexandre Andorra](https://alexandorra.github.io/)"
  version: "1.0"
---

# Causal Inference

## Dependencies

This skill requires the **bayesian-workflow** skill for all PyMC modeling steps (priors, sampling,
diagnostics, calibration, reporting).

Detect it:

```bash
ls ~/.claude/skills/bayesian-workflow/SKILL.md 2>/dev/null || ls .claude/skills/bayesian-workflow/SKILL.md 2>/dev/null
```

If not found, install it:

```bash
git clone https://github.com/Learning-Bayesian-Statistics/baygent-skills.git /tmp/baygent-skills
mkdir -p ~/.claude/skills
cp -r /tmp/baygent-skills/bayesian-workflow ~/.claude/skills/
```

Once it is available, follow it for every PyMC modeling step.

## Workflow overview

Every causal analysis follows this sequence. Steps 1-4 are the thinking phase (no code). Steps 5-8
are the doing phase. Think before you do.

1. **Formulate the causal question** — Propose precise estimand (ATE, ATT, LATE, etc.). ⚠️ ASK USER TO CONFIRM.
2. **Draw the DAG** — Propose causal graph with nodes, edges, and explicit non-edges. ⚠️ ASK USER TO CONFIRM. See [references/dags-and-identification.md](references/dags-and-identification.md)
3. **Identify** — Determine identification strategy (backdoor, front-door, IV, RDD, DiD). ⚠️ ASK USER TO CONFIRM untestable assumptions. See [references/dags-and-identification.md](references/dags-and-identification.md)
4. **Choose design** — Match problem to method using table below. ⚠️ ASK USER TO CONFIRM. See [references/quasi-experiments.md](references/quasi-experiments.md) or [references/structural-models.md](references/structural-models.md)
5. **Estimate** — Build and fit the model. Delegate all PyMC mechanics to bayesian-workflow skill.
6. **Refute** — MANDATORY. Run design-specific robustness checks. See [references/refutation.md](references/refutation.md)
7. **Interpret** — Effect size + decision-relevant HDIs + probability of direction.
8. **Report** — Generate causal analysis report. See [references/reporting.md](references/reporting.md)
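
Step 2 can be made concrete before any modeling library is involved. The sketch below is a minimal, standard-library illustration of writing a DAG down with explicit non-edges and reading off one candidate adjustment set. The node names (`season`, `ad_spend`, `price`, `sales`) and the helper functions are hypothetical, and the parent-set rule it uses is only valid when every parent of the treatment is observed.

```python
# Hypothetical example DAG: all node names are illustrative only.
# Each key is a node; the value lists its direct causes (parents).
dag_parents = {
    "season": [],
    "ad_spend": ["season"],
    "price": ["season"],
    "sales": ["ad_spend", "price", "season"],
}

# Explicit non-edges: claims that two nodes are NOT directly linked.
non_edges = [("ad_spend", "price")]


def is_acyclic(parents):
    """Return True if the parent map contains no directed cycle."""
    state = {}  # node -> "visiting" | "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path: cycle
        state[node] = "visiting"
        ok = all(visit(p) for p in parents.get(node, []))
        state[node] = "done"
        return ok

    return all(visit(n) for n in parents)


def check_non_edges(parents, declared_non_edges):
    """Return the declared non-edges that the graph actually contains."""
    return [
        (a, b)
        for a, b in declared_non_edges
        if a in parents.get(b, []) or b in parents.get(a, [])
    ]


def parent_adjustment_set(parents, treatment):
    """Parents of the treatment: a valid backdoor set if all are observed."""
    return sorted(parents[treatment])


assert is_acyclic(dag_parents)
assert check_non_edges(dag_parents, non_edges) == []
adjustment_set = parent_adjustment_set(dag_parents, "ad_spend")
print(adjustment_set)  # ['season']
```

Showing the user both the edge list and the non-edge list makes the confirmation checkpoint easier, because each non-edge is an assumption they can accept or reject.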

## Design selection guide

| Design | Use when | Key assumption | Tool |
|---|---|---|---|
| DiD | Treatment at known time, control group available | Parallel trends | CausalPy |
| Staggered DiD | Treatment rolls out at different times | Parallel trends per cohort | CausalPy |
| Synthetic Control | Single treated unit, donor pool available | Weighted donors approximate counterfactual | CausalPy |
| ITS | Time series, intervention at known time, no control | No confounding event at treatment time | CausalPy |
| RDD | Treatment by threshold on running variable | No manipulation at threshold | CausalPy |
| IV | Endogenous treatment, valid instrument | Exclusion restriction, relevance | CausalPy |
| IPSW | Observational data, treatment modeled | No unmeasured confounders, positivity | CausalPy |
| Structural (do/observe) | Full causal theory, model mechanisms | Correct DAG specification | PyMC |
| Counterfactual | "What would Y have been if X differed?" | Correct structural model | PyMC |
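
To illustrate why the structural rows depend on a correct DAG, here is a small simulation sketch of the gap between observing and intervening. It uses plain NumPy rather than `pm.do()` or `pm.observe()`, and every coefficient, the variable names, and the `ols_slopes` helper are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000
true_effect = 2.0  # assumed ground truth for this simulation only

# Observational regime: Z confounds both treatment X and outcome Y.
z = rng.normal(size=n)
x_obs = 1.5 * z + rng.normal(size=n)
y_obs = true_effect * x_obs + 3.0 * z + rng.normal(size=n)


def ols_slopes(y, *regressors):
    """Least-squares slopes for y ~ 1 + regressors (intercept dropped)."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]


naive = ols_slopes(y_obs, x_obs)[0]        # ignores Z, so it is biased
adjusted = ols_slopes(y_obs, x_obs, z)[0]  # backdoor adjustment for Z

# Interventional regime: X is set independently of Z (the Z -> X edge is cut).
x_do = rng.normal(size=n)
y_do = true_effect * x_do + 3.0 * z + rng.normal(size=n)
interventional = ols_slopes(y_do, x_do)[0]

print(f"naive={naive:.2f} adjusted={adjusted:.2f} do={interventional:.2f}")
```

The naive slope absorbs the confounding path through `z`, while the adjusted and interventional slopes both land near the simulated effect. Which variables need adjusting is exactly what the DAG decides.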

## Critical rules

- **No estimation without a confirmed DAG.** A causal graph is not optional decoration — it makes
  assumptions explicit and determines the adjustment set. If the user resists, explain why the DAG
  is non-negotiable before proceeding.
- **No causal claims without refutation.** Every design has failure modes. Run at least one
  design-specific robustness check (placebo test, sensitivity analysis, falsification test) before
  reporting results. See [references/refutation.md](references/refutation.md).
- **State assumptions before results.** Lead with what must be true for the estimate to be causal,
  and present the estimate only after those assumptions. This is not optional politeness: it
  prevents misuse of results.
- **Adapt HDIs to the decision context.** The bayesian-workflow skill's 94% HDI is a sensible
  default; adapt it with explicit explanation when the decision stakes warrant it (e.g., 89% for
  exploratory, 97% for high-stakes policy). Report multiple intervals when the decision threshold
  matters.
- **Downgrade causal language when warranted.** If identification assumptions are unverifiable or
  refutation raises flags, soften claims: "consistent with a causal effect" not "causes", "estimated
  effect" not "true effect". Flag uncertainty loudly in the report.
- **Ask the user when domain knowledge is needed.** You cannot know whether an instrument is valid,
  whether parallel trends holds, or whether a confounder exists without domain expertise. Ask
  before assuming.
- **Delegate PyMC mechanics to bayesian-workflow.** This skill handles causal structure and design.
  The bayesian-workflow skill handles priors, sampling, diagnostics, calibration, and reporting
  format. Don't duplicate those rules here.
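
The HDI rule above can be sketched as follows: report several interval widths together with the probability of direction. This is a minimal NumPy illustration, not the bayesian-workflow implementation. The `hdi` helper is hypothetical and assumes a unimodal posterior, and `effect_draws` is a stand-in for real posterior samples.

```python
import numpy as np


def hdi(samples, prob):
    """Shortest interval containing `prob` of the samples (unimodal case)."""
    sorted_s = np.sort(np.asarray(samples, dtype=float))
    n = len(sorted_s)
    n_in = int(np.ceil(prob * n))  # number of samples inside the interval
    n_candidates = n - n_in + 1    # number of possible starting positions
    widths = sorted_s[n_in - 1:] - sorted_s[:n_candidates]
    start = int(np.argmin(widths))
    return float(sorted_s[start]), float(sorted_s[start + n_in - 1])


rng = np.random.default_rng(0)
# Stand-in for posterior draws of a treatment effect (hypothetical values).
effect_draws = rng.normal(loc=0.8, scale=0.5, size=20_000)

intervals = {p: hdi(effect_draws, p) for p in (0.89, 0.94, 0.97)}
prob_positive = float((effect_draws > 0).mean())  # probability of direction

for p, (lo, hi) in intervals.items():
    print(f"{int(p * 100)}% HDI: [{lo:.2f}, {hi:.2f}]")
print(f"P(effect > 0) = {prob_positive:.3f}")
```

In a real analysis the draws would come from the fitted model's posterior, and ArviZ's `az.hdi` would normally replace the hand-written helper. Reporting the three intervals side by side shows whether a decision threshold such as zero falls inside or outside each one.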

## Common gotchas

These are battle-tested lessons that save hours of debugging:

- **CausalPy formula syntax uses `C()` for categoricals.** A categorical column stored as integer
  codes is treated as a continuous predictor unless it is wrapped in `C()`, which silently produces
  the wrong coding. Always wrap categorical treatment and group variables:
  `"y ~ C(treatment) + C(group)"`.
- **DoWhy requires explicit `U` nodes for unobserved confounders.** Omitting them from the graph
  will make DoWhy treat your model as fully identified when it isn't. Add latent nodes explicitly
  and mark them as unobserved.
- **CausalPy's PyMC models don't auto-store log-likelihood.** Same issue as bayesian-workflow:
  nutpie silently drops it. Call `pm.compute_log_likelihood(idata, model=model)` after sampling if
  you need it for model comparison.
- **Parallel trends is untestable in the post-treatment period.** Pre-treatment trend tests are
  necessary but not sufficient — passing them doesn't prove the assumption holds after treatment.
  State this explicitly in every DiD report.
- **Synthetic control requires the treated unit to lie within the convex hull of donors.** If the
  treated unit is an outlier (highest GDP, largest city), no weighted combination of donors can
  approximate its counterfactual. Check this before running — if violated, the design is invalid.
- **DiD group variable must be dummy-coded (0/1).** CausalPy rejects string labels like "treatment"/"control". Use integers: 1 = treatment, 0 = control. Data also requires a `unit` column.
- **SyntheticControl expects wide-format data.** Index = time, columns = unit names, values = outcome. If your data is long format, pivot first: `df.pivot(index="date", columns="unit", values="outcome")`.
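
The data-shape gotchas above can be checked with a few lines of pandas before any CausalPy call. The sketch below uses a made-up toy frame; the column names (`date`, `unit`, `outcome`, `group`) and unit labels are assumptions. The range check at the end is only a necessary condition for the convex hull requirement, not a proof that it holds.

```python
import pandas as pd

# Toy long-format panel (all values hypothetical, pre-treatment periods only).
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"] * 3),
    "unit": ["treated_city"] * 3 + ["donor_a"] * 3 + ["donor_b"] * 3,
    "outcome": [10.0, 11.0, 12.0, 8.0, 9.0, 10.5, 12.0, 12.5, 13.0],
    "group": ["treatment"] * 3 + ["control"] * 6,
})

# DiD: recode string labels to an integer dummy (1 = treatment, 0 = control).
did_df = long_df.assign(group=(long_df["group"] == "treatment").astype(int))

# Synthetic control: pivot to wide format (index = time, columns = units).
wide_df = long_df.pivot(index="date", columns="unit", values="outcome")

# Range check: in every pre-treatment period the treated unit must lie between
# the donor minimum and maximum, otherwise no convex weights can match it.
donors = wide_df.drop(columns="treated_city")
treated = wide_df["treated_city"]
inside_range = (treated >= donors.min(axis=1)) & (treated <= donors.max(axis=1))

print(did_df["group"].unique())
print(wide_df)
print("Treated unit inside donor range in all periods:", bool(inside_range.all()))
```

If `inside_range` is false for many pre-treatment periods, that is a strong hint the treated unit is an outlier relative to the donor pool and the design should be reconsidered.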

## When things go wrong

| Symptom | Likely cause | Fix |
|---|---|---|
| Refutation fails | Assumption violated | Diagnose which assumption failed, then try an alternative design or report sensitivity bounds |
| DiD effect at placebo time | Parallel trends violated | Try synthetic control or add group-specific time trends |
| RDD: bunching at threshold | Manipulation of running variable | Design is invalid for this threshold — report and stop |
| SC: poor pre-treatment fit | Donors don't span treated unit | Expand the donor pool or reconsider the design |
| DoWhy says "not identifiable" | Insufficient adjustment set | Revise DAG, add measured variables, or change design |
| CausalPy formula error | Wrong formula syntax | Use `C()` for categoricals, check variable names match dataframe columns |
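
The "DiD effect at placebo time" row refers to a placebo-in-time test. A minimal sketch of the idea follows, using simulated group means and a plain 2x2 DiD rather than a CausalPy model. The series, the cutoffs, and the `did_estimate` helper are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n_periods, treat_start, true_effect = 20, 12, 3.0  # hypothetical settings
time = np.arange(n_periods)

# Both groups share a common trend; the treated group has a level offset.
control = 5.0 + 0.4 * time + rng.normal(0, 0.1, n_periods)
treated = 7.0 + 0.4 * time + rng.normal(0, 0.1, n_periods)
treated[time >= treat_start] += true_effect


def did_estimate(treated, control, time, cutoff):
    """2x2 DiD: change in treated mean minus change in control mean."""
    post = time >= cutoff
    delta_treated = treated[post].mean() - treated[~post].mean()
    delta_control = control[post].mean() - control[~post].mean()
    return delta_treated - delta_control


actual = did_estimate(treated, control, time, treat_start)

# Placebo-in-time: keep only pre-treatment data and pretend that treatment
# started earlier. Under parallel trends this estimate should be near zero.
pre = time < treat_start
placebo_cutoff = 6
placebo = did_estimate(treated[pre], control[pre], time[pre], placebo_cutoff)

print(f"actual DiD: {actual:.2f}, placebo DiD: {placebo:.2f}")
```

A placebo estimate that is clearly away from zero points to diverging pre-treatment trends, which is the situation the table row describes. A placebo near zero is supporting evidence only, since parallel trends remains untestable after treatment.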