{ "cells": [ { "cell_type": "markdown", "source": [ "# Example to demonstrate optimized backdoor variable search for Causal Identification\n", "\n", "This notebook compares the performance between causal identification using vanilla backdoor search and the optimized backdoor search and demonstrates the performance gains obtained by using the latter." ], "metadata": {} }, { "cell_type": "code", "execution_count": 1, "source": [ "import time\n", "import random\n", "from networkx.linalg.graphmatrix import adjacency_matrix\n", "import numpy as np\n", "import pandas as pd\n", "import networkx as nx\n", "\n", "import dowhy\n", "from dowhy import CausalModel\n", "from dowhy.utils import graph_operations\n", "import dowhy.datasets\n" ], "outputs": [], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Create Random Graph \n", "In this section, we create a random graph with the designated number of nodes (10 in this case)." ], "metadata": {} }, { "cell_type": "code", "execution_count": 2, "source": [ "n = 10\n", "p = 0.5\n", "\n", "graph = nx.generators.random_graphs.fast_gnp_random_graph(n, p, directed=True)\n", "nodes = []\n", "for i in graph.nodes:\n", "\tnodes.append(str(i))\n", "adjacency_matrix = np.asarray(nx.to_numpy_matrix(graph))\n", "graph_dot = graph_operations.adjacency_matrix_to_graph(adjacency_matrix, nodes)\n", "graph_dot = graph_operations.str_to_dot(graph_dot.source)\n", "print(\"Graph Generated.\")\n", "\n", "df = pd.DataFrame(columns=nodes)\n", "print(\"Dataframe Generated.\")" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Graph Generated.\n", "Dataframe Generated.\n" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "## Testing optimized backdoor search\n", "\n", "In this section, we compare the runtimes for causal identification using vanilla backdoor search and the optimized backdoor search." ], "metadata": {} }, { "cell_type": "code", "execution_count": 3, "source": [ "start = time.time()\n", "\n", "# I. Create a causal model from the data and given graph.\n", "model = CausalModel(data=df,treatment=str(random.randint(0,n-1)),outcome=str(random.randint(0,n-1)),graph=graph_dot)\n", "time1 = time.time()\n", "print(\"Time taken for initializing model =\", time1-start)\n", "\n", "# II. Identify causal effect and return target estimands\n", "identified_estimand = model.identify_effect()\n", "time2 = time.time()\n", "print(\"Time taken for vanilla identification =\", time2-time1)\n", "\n", "# III. Identify causal effect using the optimized backdoor implementation\n", "identified_estimand = model.identify_effect(optimize_backdoor=True)\n", "end = time.time()\n", "print(\"Time taken for optimized backdoor identification =\", end-time2)" ], "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Time taken for initializing model = 0.07566142082214355\n", "Time taken for vanilla identification = 6.404623508453369\n", "Time taken for optimized backdoor identification = 1.3513822555541992\n" ] } ], "metadata": {} }, { "cell_type": "markdown", "source": [ "It can be observed that the optimized backdoor search makes causal identification significantly faster as compared to the vanilla implementation." ], "metadata": {} }, { "cell_type": "markdown", "source": [], "metadata": {} } ], "metadata": { "orig_nbformat": 4, "language_info": { "name": "python", "version": "3.6.9", "mimetype": "text/x-python", "codemirror_mode": { "name": "ipython", "version": 3 }, "pygments_lexer": "ipython3", "nbconvert_exporter": "python", "file_extension": ".py" }, "kernelspec": { "name": "python3", "display_name": "Python 3.6.9 64-bit" }, "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } }, "nbformat": 4, "nbformat_minor": 2 }