--- title: "DAGs" subtitle: "EC 607, Set 07" author: "Edward Rubin" date: "Spring 2021" output: xaringan::moon_reader: css: ['default', 'metropolis', 'metropolis-fonts', 'my-css.css'] # self_contained: true nature: highlightStyle: github highlightLines: true countIncrementalSlides: false --- class: inverse, middle ```{r, setup, include = F} # devtools::install_github("dill/emoGG") library(pacman) p_load( broom, tidyverse, ggplot2, ggthemes, ggforce, ggridges, ggdag, dagitty, cowplot, patchwork, scales, latex2exp, viridis, extrafont, grid, gridExtra, plotly, ggformula, kableExtra, DT, data.table, dplyr, snakecase, janitor, lubridate, knitr, future, furrr, parallel, MASS, estimatr, FNN, parsnip, caret, glmnet, huxtable, here, magrittr ) # Define pink color red_pink <- "#e64173" turquoise <- "#20B2AA" orange <- "#FFA500" red <- "#fb6107" blue <- "#3b3b9a" green <- "#8bb174" grey_light <- "grey70" grey_mid <- "grey50" grey_dark <- "grey20" purple <- "#6A5ACD" slate <- "#314f4f" # Dark slate grey: #314f4f # Knitr options opts_chunk$set( comment = "#>", fig.align = "center", fig.height = 7, fig.width = 10.5, warning = F, message = F ) opts_chunk$set(dev = "svg") options(device = function(file, width, height) { svg(tempfile(), width = width, height = height) }) options(knitr.table.format = "html") theme_set(theme_gray(base_size = 20)) ``` ```{r, xaringan-extra, include = F, eval = F} xaringanExtra::use_scribble(pen_color = red_pink) ``` ```{css, echo = F, eval = T} @media print { .has-continuation { display: block !important; } } ``` $$ \begin{align} \def\ci{\perp\mkern-10mu\perp} \end{align} $$ # Prologue --- name: schedule # Schedule ## Last time (Bad) Controls ## Today Directed Acyclic Graphs (DAGs) ## Upcoming Matching --- class: inverse, middle # DAGs --- layout: true # DAGs --- name: different ## What's a DAG? .note[DAG] stands for .b[directed acyclic graph]. -- More helpful... 
A .note[DAG] graphically illustrates the causal relationships and non-causal associations within a network of random variables. --- name: dag-ex layout: false class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-setup, echo = F, include = F} # The full DAG dag_full = dagify( Y ~ D, Y ~ W, D ~ W, coords = tibble( name = c("Y", "D", "W"), x = c(1, 3, 2), y = c(2, 2, 1) ) ) # Convert to data.table dag_dt = dag_full %>% fortify() %>% setDT() # Add indicators for paths dag_dt[, `:=`( path1 = (name == "D" & to == "Y") | (name == "Y"), path2 = (name == "D" & to == "W") | (name == "W" & to == "Y") | (name == "Y") )] # Shorten segments mult = 0.15 dag_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, dag-ex-ovb, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = slate, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = slate, size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = slate, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) ``` A pretty standard DAG. 
--- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-nodes, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, color = red_pink ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = "grey80", size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = "white", fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) ``` .b.pink[Nodes] are random variables. --- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-edges, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = "grey80", shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = "grey80", fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) ``` .b.purple[Edges] depict causal links. Causality flows in the direction of the .b.purple[arrows]. -- - Connections matter! - Direction matters (for causality). - Non-connections also matter! 
.grey-light[(More on this topic soon.)] --- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-2, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = slate, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = slate, size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = slate, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) ``` Here we can see that .b.slate[Y] is affected by both .b.slate[D] and .b.slate[W]. .b.slate[W] also affects .b.slate[D]. -- .qa[Q] How does this graph exhibit OVB? --- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-3, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = slate, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb, color = (name == "D" & to == "Y")), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = slate, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) + scale_color_manual(values = c(slate, red_pink)) ``` There are two pathways from .b.slate[D] to .b.slate[Y]. -- .slate[1\.] 
The path from .b.slate[D] to .b.slate[Y] $\color{#e64173}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest. --- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-4, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( data = dag_dt[name == "D" & to == "Y"], color = orange, size = 0.8, linetype = "dashed", curvature = -1.02 ) + geom_point( size = 20, fill = "white", color = slate, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb, color = !(name == "D" & to == "Y")), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = slate, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) + scale_color_manual(values = c(slate, red_pink)) ``` There are two pathways from .b.slate[D] to .b.slate[Y]. .slate[1\.] The path from .b.slate[D] to .b.slate[Y] $\color{#314f4f}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest.
.slate[2\.] The path $\color{#e64173}{\left(\text{Y}\leftarrow\text{W}\rightarrow\text{D}\right)}$ creates a .orange[non-causal association] btn .b.slate[D] and .b.slate[Y]. --- class: clear .ex[Example] Omitted-variable bias in a DAG ```{r, dag-ex-ovb-6, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( data = dag_dt[name == "D" & to == "Y"], color = "grey80", size = 0.8, linetype = "dashed", curvature = -1.02 ) + geom_point( aes(color = name == "W", fill = name == "W"), size = 20, pch = 21 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c(slate, "white", slate), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) + scale_fill_manual(values = c("white", "grey80")) + scale_color_manual(values = c(slate, "grey80")) ``` There are two pathways from .b.slate[D] to .b.slate[Y]. .slate[1\.] The path from .b.slate[D] to .b.slate[Y] $\color{#314f4f}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest.
.slate[2\.] The path $\color{#314f4f}{\left(\text{Y}\leftarrow\text{W}\rightarrow\text{D}\right)}$ creates a .orange[non-causal association] btn .b.slate[D] and .b.slate[Y]. To shut down this pathway creating a non-causal association, we must .b.grey-light[condition on .b[W]]. Sound familiar? --- layout: true # Graphs --- class: inverse, middle name: graphs --- ## More formally In graph theory, a .pink.def[graph] is a collection of .purple.def[nodes] connected by .orange.def[edges]. -- ```{r, graph-ex-setup, include = F} # The full DAG graph_ex = dagify( B ~ A, C ~ A, C ~ B, D ~ C, coords = tibble( name = LETTERS[1:4], x = c(0, 1, 0, 1), y = c(1, 1, 0, 0) ) ) # Convert to data.table graph_dt = graph_ex %>% fortify() %>% setDT() # Shorten segments mult = 0.16 graph_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, graph-ex-undirected, echo = F, fig.height = 3, fig.width = 3} # Plot the full DAG ggplot( data = graph_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( curvature = 0, color = orange, size = 1.2, lineend = "round" ) + geom_point( size = 20, color = purple ) + geom_text( data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = "white", fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1), ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1) ) ``` -- - Nodes connected by an edge are called .def[adjacent]. -- - .def[Paths] run along adjacent nodes, .it[e.g.], $\text{A}-\text{B}-\text{C}$. -- - The graph above is .def[undirected], since the edges don't have direction. --- name: graphs-directed ## Directed .def.purple[Directed graphs] have edges with direction. 
```{r, graph-ex-directed, echo = F, fig.height = 3, fig.width = 3} # Plot the full DAG ggplot( data = graph_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( aes(x = xa, xend = xb, y = ya, yend = yb), arrow = arrow(length = unit(0.07, "npc")), curvature = 0, color = purple, size = 1.2, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1), ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1) ) ``` -- - .def[Directed paths] follow edges' directions, *e.g.*, $\text{A}\rightarrow\text{B}\rightarrow\text{C}$. -- - Nodes that precede a given node in a directed path are its .def[ancestors]. -- - The opposite: .def[descendants] come after the node, *e.g.*, $\text{D}=\text{de}(\text{C})$. --- name: graphs-cycles ## Cycles If a node is its own descendant (*e.g.*, $\text{de}(\text{D})=\text{D}$), your graph has a .pink.def[cycle]. 
```{r, cycle-setup, include = F} # The full DAG cycle_ex = dagify( B ~ A, C ~ A, C ~ B, D ~ C, B ~ D, coords = tibble( name = LETTERS[1:4], x = c(0, 1, 0, 1), y = c(1, 1, 0, 0) ) ) # Convert to data.table cycle_dt = cycle_ex %>% fortify() %>% setDT() # Add indicators for paths cycle_dt[, `:=`( cycle = (name == "B" & to == "C") | (name == "D" & to == "B") | (name == "C" & to == "D") )] # Shorten segments mult = 0.16 cycle_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, graph-ex-cycle, echo = F, fig.height = 3, fig.width = 3} # Plot the full DAG ggplot( data = cycle_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( aes(x = xa, xend = xb, y = ya, yend = yb, color = cycle), arrow = arrow(length = unit(0.07, "npc")), curvature = 0, size = 1.2, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( data = cycle_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = cycle_dt[,range(x)] + cycle_dt[,range(x) %>% diff()] * c(-0.1, 0.1), ylim = cycle_dt[,range(y)] + cycle_dt[,range(y) %>% diff()] * c(-0.1, 0.1) ) + scale_color_manual(values = c(purple, red_pink)) ``` -- If your directed graph does not have any cycles, then you have a
.def.orange[directed acyclic graph] (.def.orange[DAG]). --- layout: true # DAGs --- class: inverse, middle --- name: dag-origins ## The origin story Many developments in .it[causal graphical models] came from work in probabilistic graphical models—especially Bayesian networks. -- Recall what you know about joint probabilities: $$ \begin{align} &\color{#FFA500}{2} &P(x_1,x_2) &= P(x_1) P(x_2|x_1) \\[0.2em] &\color{#FFA500}{3} &P(x_1,x_2,x_3) &= P(x_1) P(x_2,x_3|x_1) = P(x_1) P(x_2|x_1) P(x_3|x_2,x_1) \\[0.2em] &\color{#FFA500}{\vdots} \\[0.2em] &\color{#FFA500}{n} &P(x_1,x_2,\dots,x_n) &= P(x_1)\prod_{i=2}^{n} P(x_i|x_{i-1},\ldots,x_1) \end{align} $$ -- This final product can include *a lot* of terms.
.ex[E.g.,] even when $x_i$ are binary, $P(x_4 | x_3,x_2,x_1)$ requires $2^3=8$ parameters. --- name: local-markov ## Thinking locally DAGs help us think through simplifying $P(x_k | x_{k-1},x_{k-2},\ldots,x_1)$. -- ```{r, graph-prob, echo = F, fig.height = 3, fig.width = 3} # Plot the full DAG ex_graph = ggplot( data = graph_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_curve( aes(x = xa, xend = xb, y = ya, yend = yb), arrow = arrow(length = unit(0.07, "npc")), curvature = 0, color = purple, size = 1.2, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1), ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1) ) ex_graph ``` Given a prob. dist. and a DAG, can we assume some independencies? --
Given $\color{#6A5ACD}{\text{C}}$, is it reasonable to assume $\color{#6A5ACD}{\text{D}}$ is independent of $\color{#6A5ACD}{\text{A}}$ and $\color{#6A5ACD}{\text{B}}$? --- ## Local Markov This intuitive approach *is* the .def.purple[Local Markov Assumption] > Given its parents in the DAG, a node $X$ is independent of all of its non-descendants. -- .col-left[ .ex[Ex.] Consider the DAG to the right: With the Local Markov Assumption,
$P(\text{D}|\text{A},\text{B},\text{C})$ simplifies to $P(\text{D}|\text{C})$. Conditional on its parent $(\text{C})$,
$\text{D}$ is independent of $\text{A}$ and $\text{B}$. ] .col-right[ ```{r, graph-prob-2, echo = F, fig.height = 3, fig.width = 3} ex_graph ``` ] --- name: dags-factor ## Local Markov and factorization The Local Markov Assumption is equiv. to .def.purple[Bayesian Network Factorization] > For prob. dist. $P$ and DAG $G$, $P$ factorizes according to $G$ if $$ \begin{align} P(x_1,\ldots,x_n) = \prod_{i} P(x_i|\text{pa}_i) \end{align} $$ where $\text{pa}_i$ refers to $x_i$'s parents in $G$. Bayesian network factorization is also called *the chain rule for Bayesian networks* and *Markov compatibility*. --- name: factorization ## Factorize! You can now (more easily) factorize the DAG/dist. below! .grey-vlight[(You're welcome.)] .col-left[ ```{r, graph-factorize, echo = F, fig.height = 3, fig.width = 3} ex_graph ``` ] -- .col-right[ .b.slate[Factorization via B.N. chain rule] $$ \begin{align} &P(\text{A},\text{B},\text{C},\text{D}) \\[0.4em] &\quad = \prod_{i} P(x_i|\text{pa}_i) \\[0.4em] &\quad = P(\text{A}) P(\text{B}|\text{A}) P(\text{C}|\text{A},\text{B}) P(\text{D}|\text{C}) \end{align} $$ ] --- ## Independence What have we learned so far? .grey-vlight[(Why should you care about this stuff?)] Local Markov and Bayesian Network Factorization tell us about .attn[independencies] within a probability distribution implied by the given DAG. You're now able to say something about which variables are .pink.it[independent]. -- .b[There's more:] Great start, but there's more to life than independence.
We also want to say something about .purple.it[dependence]. --- name: dags-dependence ## Dependence We need to strengthen our Local Markov assumption to be able to interpret adjacent nodes as dependent. .grey-vlight[(*I.e.*, add it to our small set of assumptions.)] -- The .def.purple[Minimality Assumption].pink.super[†] > 1. .def.purple[Local Markov] Given its parents in the DAG, a node $X$ is independent of all of its non-descendants. > 2. .it.grey-light[(NEW)] Adjacent nodes in the DAG are dependent. .footnote[ .pink[†] The name .grey-light.def[minimality] refers to the minimal set of independencies for $P$ and $G$: we cannot remove any more edges from the graph (while staying Markov compatible with $G$).] -- With the minimality assumption, we can learn both .pink[dependence] and .orange[independence] from connections (or non-connections) in a DAG. --- name: dags-causlity ## Causality We need one last assumption to move DAGs from .it[statistical] to .it[causal] models. -- .def.purple[Strict Causal Edges Assumption] > Every parent is a direct cause of each of its children. -- For $Y$, the set of .it[direct causes] is the set of variables to which $Y$ responds. -- This assumption actually strengthens the second part of .note[Minimality]: > 2\. Adjacent nodes in the DAG are dependent. --- ## Assumptions Thus, we only need two assumptions to turn DAGs into causal models: 1. .def.purple[Local Markov] Given its parents in the DAG, a node $X$ is independent of all of its non-descendants. 1. .def.purple[Strict Causal Edges] Every parent is a direct cause of each of its children. -- Not bad, right? --- ## Flows [Brady Neal](https://bradyneal.com) emphasizes the .note[flow(s) of association] and .note[causation] in DAGs,
and I find it to be a super helpful way to think about these models. .def.purple[Flow of association] refers to whether two nodes are associated (statistically dependent) or not (statistically independent). We will be interested in unconditional and conditional associations. --- name: building-blocks ## Building blocks We will run through a few simple .it[building blocks] (DAGs) that make up more complex DAGs. For each simple DAG, we want to ask a few questions: 1. Which nodes are unconditionally or conditionally .b.pink[independent]?.super.pink[†] 1. Which nodes are .b.orange[dependent]? 1. What is the .b.purple[intuition]? .footnote[ .pink[†] To prove $\text{A}$ and $\text{B}$ are conditionally independent, we can show $P(\text{A},\text{B}|\text{C})$ factorizes as $P(\text{A}|\text{C})P(\text{B}|\text{C})$.] --- layout: true class: clear --- .note[Building block 1:] .b.slate[Two unconnected nodes] ```{r, bb1-plot, echo = F, fig.height = 1, fig.width = 4} # Plot the DAG ggplot( data = data.table( x = 0:1, y = 0, name = LETTERS[1:2] ), aes(x = x, y = y) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-0.5, 1.5), ylim = c(-0.5, 0.5) ) ``` -- .b.purple[Intuition:] -- $\text{A}$ and $\text{B}$ appear independent—no link between the nodes. -- .b.pink[Proof:] -- By [Bayesian network factorization](#factorization), $$ \begin{align} P(\text{A},\text{B}) = P(\text{A}) P(\text{B}) \end{align} $$ (since neither node has parents). 
$\checkmark$ --- .note[Building block 2:] .b.slate[Two connected nodes] ```{r, bb2-data, include = F} # The DAG bb2_ex = dagify( B ~ A, coords = tibble( name = LETTERS[1:2], x = 0:1, y = c(0,0) ) ) # Convert to data.table bb2_dt = bb2_ex %>% fortify() %>% setDT() # Shorten segments mult = 0.2 bb2_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, bb2-plot, echo = F, fig.height = 1, fig.width = 4} # Plot the DAG ggplot( data = bb2_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.13, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-0.5, 1.5), ylim = c(-0.5, 0.5) ) ``` -- .b.purple[Intuition:] -- $\text{A}$ "is a cause of" $\text{B}$: there is clear (causal) dependence..super.pink[†] .footnote[ .pink[†] I'm not a huge fan of the "is a cause of" wording, but it appears to be (unfortunately) common in this literature. IMO, $``\text{A}$ causes (or affects) $\text{B"}$ would be clearer (and more grammatical), but no one asked me. One argument for "a cause of" (vs. "causes") is it emphasizes that events often have multiple causes.] -- .b.pink[Proof:] -- By the [Strict Causal Edges Assumption](#dags-causlity), every parent (here, $\text{A}$) is a direct cause of each of its children $\left(\text{B}\right)$. 
$\checkmark$ --- name: blocks-chains .note[Building block 3:] .b.slate[Chains] ```{r, bb3-data, include = F} # The DAG bb3_ex = dagify( B ~ A, C ~ B, coords = tibble( name = LETTERS[1:3], x = -1:1, y = 0 ) ) # Convert to data.table bb3_dt = bb3_ex %>% fortify() %>% setDT() # Shorten segments mult = 0.25 bb3_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, bb3-plot, echo = F, fig.height = 1.5, fig.width = 5} # Plot the DAG gg_chain = ggplot( data = bb3_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.09, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_chain ``` -- .b.purple[Intuition:] We already showed two connected nodes are dependent: - $\text{A}$ and $\text{B}$ are dependent. - $\text{B}$ and $\text{C}$ are dependent. The question is whether $\text{A}$ and $\text{C}$ are dependent:
Does association flow from $\text{A}$ to $\text{C}$ through $\text{B}$? --- count: false .note[Building block 3:] .b.slate[Chains] ```{r, bb3-plot-2, echo = F, fig.height = 1.5, fig.width = 5} # The curve dataset curve_dt = tibble( x = c(-1, 0, 1), y = c(0, -0.8, 0) ) %>% spline(n = 101) %>% as.data.table() # Plot the DAG gg_chain_association = ggplot( data = bb3_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.09, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_line( data = curve_dt, color = orange, linetype = "dashed", size = 0.8 ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) gg_chain_association ``` .b.purple[Intuition:] We already showed two connected nodes are dependent: - $\text{A}$ and $\text{B}$ are dependent. - $\text{B}$ and $\text{C}$ are dependent. The question is whether $\text{A}$ and $\text{C}$ are dependent:
Does association flow from $\text{A}$ to $\text{C}$ through $\text{B}$? The answer .it[generally].super.pink[†] is .orange["yes"]: changes in $\text{A}$ typically cause changes in $\text{C}$. .footnote[ .pink[†] Section 2.2 of [Pearl, Glymour, and Jewell](http://bayes.cs.ucla.edu/PRIMER/) provides a "pathological" example of "intransitive dependence". It's basically when $\text{A}$ induces variation in $\text{B}$ that is not relevant to $\text{C}$'s outcome.] --- .note[Building block 3:] .b.slate[Chains] ```{r, bb3-plot-3, echo = F, fig.height = 1.5, fig.width = 5} gg_chain_association ``` .b.pink[Proof:] Here's the unsatisfying part. Without more assumptions, we can't *prove* this association of $\text{A}$ and $\text{C}$. We'll think of this as a potential (even likely) association. --- .note[Building block 3:] .b.slate[Chains with conditions] ```{r, bb3-plot-4, echo = F, fig.height = 1.5, fig.width = 5} # The curve dataset curve_dt = tibble( x = c(-1, 0, 1), y = c(0, -0.8, 0) ) %>% spline(n = 101) %>% as.data.table() # Plot the DAG gg_chain_condition = ggplot( data = bb3_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.09, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_line( data = curve_dt[1:floor(.N/2)], color = orange, linetype = "dashed", size = 0.8 ) + geom_point( aes(x = x, y = y), data = curve_dt[median(1:.N)], shape = 15, size = 6, color = "grey80" ) + geom_point( aes(color = name == "B", fill = name == "B"), shape = 21, size = 20, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c(purple, "white", purple), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) + scale_color_manual(values = c(purple, "grey80")) + scale_fill_manual(values = c("white", "grey80")) gg_chain_condition ``` .qa[Q] How does conditioning on $\text{B}$ 
affect the association between $\text{A}$ and $\text{C}$? .b.purple[Intuition:] 1. $\text{A}$ affects $\text{C}$ by changing $\text{B}$. 2. When we hold $\text{B}$ constant, $\text{A}$ cannot "reach" $\text{C}$. We've .def.purple[blocked] the path of association between $\text{A}$ and $\text{C}$. Conditioning blocks the flow of association .b[in chains]. ("Good" control!) --- .note[Building block 3:] .b.slate[Chains with conditions] ```{r, bb3-plot-6, echo = F, fig.height = 1.5, fig.width = 5} # Plot the DAG gg_chain_condition ``` .b.pink[Proof:] We want to show $\text{A}$ and $\text{C}$ are independent conditional on $\text{B}$,
*i.e.*, $P(\text{A},\text{C}|\text{B})=P(\text{A}|\text{B})P(\text{C}|\text{B})$. -- Start with BN factorization: $P(\text{A},\text{B},\text{C})$ -- $= P(\text{A})P(\text{B}|\text{A})P(\text{C}|\text{B})$. -- Now apply Bayes' rule for the LHS of our goal: $P(\text{A},\text{C}|\text{B}) = \frac{P(\text{A},\text{B},\text{C})}{P(\text{B})}$. -- And substitute our factorization into the Bayes' rule expression: $P(\text{A},\text{C}|\text{B}) = \dfrac{P(\text{A})P(\text{B}|\text{A})\color{#e64173}{P(\text{C}|\text{B})}}{P(\text{B})}$ -- $=P(\text{A}|\text{B})\color{#e64173}{P(\text{C}|\text{B})}$ $\checkmark$ .grey-light[(Bayes' rule again: $P(\text{A})P(\text{B}|\text{A})/P(\text{B}) = P(\text{A}|\text{B})$)] --- .note[Building block 3:] .b.slate[Chains] ```{r, bb3-plot-7, echo = F, fig.height = 1.5, fig.width = 5} # Plot the DAG gg_chain_association ``` .note[Note] This .orange[association of] $\color{#FFA500}{\text{A}}$ .orange[and] $\color{#FFA500}{\text{C}}$ is not directional. (It is symmetric.) On the other hand, causation .b[is] directional (and asymmetric). As you've been warned for years: Associations are not necessarily causal. 
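---

.note[Building block 3:] .b.slate[Chains in a simulation]

A quick simulation can make both chain results concrete. The sketch below is only an illustration, not part of the formal argument: it assumes linear effects with standard normal noise, and the coefficient `0.8`, the seed, the sample size, and the variable names are all arbitrary choices. Because the simulated variables are jointly normal, a zero coefficient on $\text{A}$ after holding $\text{B}$ fixed corresponds to conditional independence in this setup.

```r
# Simulate the chain A -> B -> C
# (Assumed for illustration: linear effects of 0.8, standard normal noise)
set.seed(607)
n = 1e4
chain_a = rnorm(n)
chain_b = 0.8 * chain_a + rnorm(n)
chain_c = 0.8 * chain_b + rnorm(n)

# Unconditional association between A and C
# (the population correlation in this setup is about 0.45)
cor_ac = cor(chain_a, chain_c)

# Coefficient on A once we also condition on B
# (the population value in this setup is 0)
coef_a_given_b = coef(lm(chain_c ~ chain_a + chain_b))[["chain_a"]]

# Compare the two
c(cor_ac = cor_ac, coef_a_given_b = coef_a_given_b)
```

The first number should sit well away from zero (association flows through $\text{B}$), while the second should be close to zero (conditioning on $\text{B}$ blocks the path).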
--- name: blocks-forks .note[Building block 4:] .b.slate[Forks] ```{r, bb4-data, include = F} # The DAG bb4_ex = dagify( A ~ B, C ~ B, coords = tibble( name = LETTERS[1:3], x = -1:1, y = c(-0.7, 0, -0.7) ) ) # Convert to data.table bb4_dt = bb4_ex %>% fortify() %>% setDT() # Shorten segments mult = 0.2 bb4_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] # The curve dataset fork_curve = tibble( x = c(-1, 0, 1), y = c(-0.7, 0.3, -0.7) ) %>% spline(n = 101) %>% as.data.table() ``` ```{r, bb4-plot-1, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_fork = ggplot( data = bb4_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_fork ``` .def.purple[Forks] are another very common structure in DAGs: $\text{A}\leftarrow \text{B} \rightarrow \text{C}$. 
--- .note[Building block 4:] .b.slate[Forks] ```{r, bb4-plot-2, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_fork_association = ggplot( data = bb4_dt, aes(x = x, y = y) ) + geom_line( data = fork_curve, color = orange, linetype = "dashed", size = 0.8 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_fork_association ``` $\text{A}$ and $\text{C}$ are *usually* .orange[associated] in forks. .grey-light[(As with chains.)] This chain of association follows the path $\text{A}\leftarrow \text{B} \rightarrow \text{C}$. -- .b.purple[Intuition:] -- $\text{B}$ induces changes in $\text{A}$ and $\text{C}$. An observer will see $\text{A}$ change when $\text{C}$ also changes; they are associated due to their common cause. 
--- .note[Building block 4:] .b.slate[Forks] ```{r, bb4-data-ovb, include = F} # Copy the fork dataset and change names of variables fork_dt = copy(bb4_dt) fork_dt[, `:=`( name = fcase( name == "B", "W", name == "A", "Y", name == "C", "D", default = NA ), to = fcase( to == "B", "W", to == "A", "Y", to == "C", "D", default = NA ) )] ``` ```{r, bb4-plot-ovb, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG ggplot( data = fork_dt, aes(x = x, y = y) ) + geom_line( data = fork_curve, color = orange, linetype = "dashed", size = 0.8 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) ``` Another way to think about forks: OVB when a treatment $\text{D}$ does not affect the outcome $\text{Y}$. Without controlling for $\text{W}$, $\text{Y}$ and $\text{D}$ are (usually) .orange[non-causally associated]. 
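---
.note[Building block 4:] .b.slate[Forks]

.note[OVB in a regression:] A minimal sketch of the fork-as-OVB story, assuming a linear data-generating process in which $\text{W}$ causes both $\text{D}$ and $\text{Y}$ and $\text{D}$ has no effect on $\text{Y}$. The object names (`ovb_sim`, `b_naive`, `b_ctrl`) and the unit coefficients are illustrative assumptions.

```{r, bb4-ovb-sketch}
# Hypothetical OVB setup: W -> D and W -> Y; D has NO effect on Y
set.seed(123)
n_sim = 1e4
ovb_sim = data.frame(W = rnorm(n_sim))
ovb_sim$D = ovb_sim$W + rnorm(n_sim)
ovb_sim$Y = ovb_sim$W + rnorm(n_sim)
# Omitting W: the coefficient on D picks up the non-causal association
b_naive = coef(lm(Y ~ D, data = ovb_sim))[["D"]]
# Controlling for W: the coefficient on D is close to its true value (zero)
b_ctrl = coef(lm(Y ~ D + W, data = ovb_sim))[["D"]]
c(naive = b_naive, controlled = b_ctrl)
```

In this setup the naive slope converges to $\text{Cov}(\text{Y},\text{D})/\text{Var}(\text{D}) = 1/2$, while conditioning on $\text{W}$ blocks the fork and returns an estimate near zero.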
--- .note[Building block 4:] .b.slate[Forks] ```{r, bb4-plot-3, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_fork_association = ggplot( data = bb4_dt, aes(x = x, y = y) ) + geom_line( data = fork_curve, color = orange, linetype = "dashed", size = 0.8 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_fork_association ``` $\text{A}$ and $\text{C}$ are *usually* .orange[associated] in forks. .grey-light[(As with chains.)] This chain of association follows the path $\text{A}\leftarrow \text{B} \rightarrow \text{C}$. .b.pink[Proof:] -- Same problem as chains: We can't show $\text{A}$ and $\text{C}$ are independent, so we assume they're likely (potentially?) dependent. 
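---
.note[Building block 4:] .b.slate[Forks]

.note[A worked case:] To see why the association is "usual" rather than guaranteed, consider a linear fork. This is an illustrative assumption: the symbols $\alpha$, $\gamma$, $\varepsilon_{\text{A}}$, and $\varepsilon_{\text{C}}$ are not part of the original example, and $\text{B}$, $\varepsilon_{\text{A}}$, and $\varepsilon_{\text{C}}$ are taken to be mutually independent.

$$
\begin{align}
  \text{A} &= \alpha \text{B} + \varepsilon_{\text{A}}, \qquad \text{C} = \gamma \text{B} + \varepsilon_{\text{C}} \\
  \text{Cov}(\text{A},\text{C}) &= \alpha\gamma \text{Var}(\text{B}) + \alpha \text{Cov}(\text{B},\varepsilon_{\text{C}}) + \gamma \text{Cov}(\varepsilon_{\text{A}},\text{B}) + \text{Cov}(\varepsilon_{\text{A}},\varepsilon_{\text{C}}) \\
  &= \alpha\gamma \text{Var}(\text{B})
\end{align}
$$

The covariance is nonzero unless $\alpha = 0$, $\gamma = 0$, or $\text{B}$ does not vary. The DAG alone does not rule out these special cases, which is why we say "usually."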
--- .note[Building block 4:] .b.slate[Blocked forks] ```{r, bb4-plot-4, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_fork_block = ggplot( data = bb4_dt, aes(x = x, y = y) ) + geom_line( data = fork_curve[1:floor(.N/2)], color = "grey80", linetype = "dashed", size = 0.8 ) + geom_point( aes(x = x, y = y), data = fork_curve[median(1:.N)], shape = 15, size = 6, color = "grey80" ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( aes(color = name == "B", fill = name == "B"), size = 20, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c("white", "white", purple, purple), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) + scale_color_manual(values = c(purple, "grey80")) + scale_fill_manual(values = c("white", "grey80")) # Plot it gg_fork_block ``` Conditioning on $\text{B}$ makes $\text{A}$ and $\text{C}$ -- independent. .grey-light[(As with chains.)] .b.purple[Intuition:] -- $\text{A}$ and $\text{C}$ are only associated due to their common cause $\text{B}$. When we shutdown (hold constant) this common cause $(\text{B})$,
there is no way for $\text{A}$ and $\text{C}$ to associate.

--

.note[Also:] Think about Local Markov. Or think about OVB.

---
.note[Building block 4:] .b.slate[Blocked forks]

```{r, bb4-plot-5, echo = F, fig.height = 2.5, fig.width = 5}
# Plot it
gg_fork_block
```

.b.pink[Proof:] We want to show $P(\text{A},\text{C}|\text{B})=P(\text{A}|\text{B})P(\text{C}|\text{B})$.

--

.note[Step 1:] Bayesian net. factorization: $P(\text{A},\text{B},\text{C})=P(\text{B})\color{#e64173}{P(\text{A}|\text{B})P(\text{C}|\text{B})}$

--

.note[Step 2:] Definition of conditional probability: $P(\text{A},\text{C}|\text{B})=\frac{P(\text{A},\text{B},\text{C})}{P(\text{B})}$

--

.note[Step 3:] Combine .note[2] & .note[1]: $P(\text{A},\text{C}|\text{B})=\frac{P(\text{A},\text{B},\text{C})}{P(\text{B})} = \color{#e64173}{P(\text{A}|\text{B})P(\text{C}|\text{B})}$ $\checkmark$

---
.note[Building block 4:] .b.slate[Forks]

```{r, bb4-plot-6, echo = F, fig.height = 2.5, fig.width = 5}
# Plot it
gg_fork_association
```

Two more items to emphasize:

1. .b.orange[Association] need not follow paths' directions, *e.g.*, $\text{A}\leftarrow \text{B} \rightarrow \text{C}$.

2. .b.pink[Causation] follows directed paths.
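---
.note[Building block 4:] .b.slate[Blocked forks]

.note[An exact check:] The blocked-fork proof can be verified numerically for a binary fork. The sketch below builds the joint distribution from the Bayesian network factorization and compares joint tables to the product of their margins. All probabilities and names (`p_b`, `p_a1_b`, `p_c1_b`, `indep_gap`) are hypothetical values chosen for illustration.

```{r, bb4-factor-check}
# Hypothetical probabilities for a binary fork A <- B -> C
p_b    = c(0.5, 0.5)  # P(B = 0), P(B = 1)
p_a1_b = c(0.2, 0.8)  # P(A = 1 | B = 0), P(A = 1 | B = 1)
p_c1_b = c(0.3, 0.9)  # P(C = 1 | B = 0), P(C = 1 | B = 1)
# Build the joint P(A, B, C) = P(B) P(A|B) P(C|B); dimensions are (A, B, C)
joint = array(0, dim = c(2, 2, 2))
for (a in 0:1) for (b in 0:1) for (k in 0:1) {
  p_a = if (a == 1) p_a1_b[b + 1] else 1 - p_a1_b[b + 1]
  p_c = if (k == 1) p_c1_b[b + 1] else 1 - p_c1_b[b + 1]
  joint[a + 1, b + 1, k + 1] = p_b[b + 1] * p_a * p_c
}
# Largest gap between a joint table of (A, C) and the product of its margins
indep_gap = function(tab) max(abs(tab - outer(rowSums(tab), colSums(tab))))
# Conditional on B = 1: P(A, C | B = 1) = P(A, B = 1, C) / P(B = 1)
gap_given_b = indep_gap(joint[, 2, ] / sum(joint[, 2, ]))
# Unconditional: P(A, C) sums the joint over B
gap_marginal = indep_gap(apply(joint, c(1, 3), sum))
c(given_b = gap_given_b, marginal = gap_marginal)
```

Given $\text{B}$, the gap is zero (up to floating-point error), matching the proof. Without conditioning, the gap is 0.09 for these particular numbers, so $\text{A}$ and $\text{C}$ are marginally dependent.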
--- name: blocks-colliders .note[Building block 5:] .b.slate[Immoralities] ```{r, bb5-data, include = F} # The DAG bb5_ex = dagify( B ~ A, B ~ C, coords = tibble( name = LETTERS[1:3], x = -1:1, y = c(-0.7, 0, -0.7) ) ) # Convert to data.table bb5_dt = bb5_ex %>% fortify() %>% setDT() # Shorten segments mult = 0.2 bb5_dt[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] # The curve dataset collider_curve = tibble( x = c(-1, 0, 1), y = c(-0.7, 0.3, -0.7) ) %>% spline(n = 101) %>% as.data.table() ``` ```{r, bb5-plot-1, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_collider = ggplot( data = bb5_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_collider ``` An .def.purple[immorality] occurs when two nodes share a child without being otherwise connected..super.pink[†] $\text{A} \rightarrow \text{B} \leftarrow \text{C}$ .footnote[.pink[†] I'm not making this up.] -- The child (here: $\text{B}$) at the center of this immorality is called a .def.purple[collider]. -- .note[Notice:] An immorality is a fork with reversed directions of the edges. --- .note[Building block 5:] .b.slate[Immoralities] ```{r, bb5-plot-2, echo = F, fig.height = 2.5, fig.width = 5} gg_collider ``` .qa[Q] Are $\text{A}$ and $\text{C}$ independent? 
--- count: false .note[Building block 5:] .b.slate[Immoralities] ```{r, bb5-plot-3, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_collider_blocked = ggplot( data = bb5_dt, aes(x = x, y = y) ) + geom_line( data = collider_curve[1:floor(.N/2)], color = "grey80", linetype = "dashed", size = 0.8 ) + geom_point( aes(x = x, y = y), data = collider_curve[median(1:.N)], shape = 15, size = 6, color = "grey80" ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) # Plot it gg_collider_blocked ``` .qa[Q] Are $\text{A}$ and $\text{C}$ independent?
.qa[A] Yes. $\text{A} \ci \text{C}$. -- .b.purple[Intuition:] Causal effects flow from $\text{A}$ and $\text{C}$ and stop there. - Neither $\text{A}$ nor $\text{C}$ is a descendant of the other. - $\text{A}$ and $\text{C}$ do not share any common causes. --- .note[Building block 5:] .b.slate[Immoralities] ```{r, bb5-plot-4, echo = F, fig.height = 2.5, fig.width = 5} gg_collider_blocked ``` .b.pink[Proof:] Start with .it[marginalizing] dist. of $\text{A}$ and $\text{C}$. Then BNF. $P(\text{A},\text{C}) = \sum_{\text{B}} P(\text{A},\text{B}, \text{C})$ -- $\color{#FFFFFF}{P(\text{A},\text{C})} = \sum_{\text{B}} P(\text{A})P(\text{C})P(\text{B}|\text{A},\text{C})$ -- $\color{#FFFFFF}{P(\text{A},\text{C})} = P(\text{A})P(\text{C}) \color{#FFA500}{\left(\sum_{\text{B}} P(\text{B}|\text{A},\text{C}) = 1\right)}$ -- $\color{#FFFFFF}{P(\text{A},\text{C})} = P(\text{A})P(\text{C})$ $\quad\color{#e64173}{\checkmark}$ .pink[(] $\color{#e64173}{\text{A} \ci \text{C}}$ .pink[without conditioning)] --- .note[Building block 5:] .b.slate[Immoralities with conditions] ```{r, bb5-plot-5, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG ggplot( data = bb5_dt, aes(x = x, y = y) ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( aes(color = name == "B", fill = name == "B"), size = 20, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c(purple, purple, "white"), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) + scale_color_manual(values = c(purple, "grey80")) + scale_fill_manual(values = c("white", "grey80")) ``` .qa[Q] What happens when we condition on $\text{B}$? 
--- count: false .note[Building block 5:] .b.slate[Immoralities with conditions] ```{r, bb5-plot-6, echo = F, fig.height = 2.5, fig.width = 5} # Plot the DAG gg_collider_unblocked = ggplot( data = bb5_dt, aes(x = x, y = y) ) + geom_line( data = collider_curve, color = orange, linetype = "dashed", size = 0.8 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( aes(color = name == "B", fill = name == "B"), size = 20, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c(purple, purple, "white"), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) + scale_color_manual(values = c(purple, "grey80")) + scale_fill_manual(values = c("white", "grey80")) # Print figure gg_collider_unblocked ``` .qa[Q] What happens when we condition on $\text{B}$?
.qa[A] We .def.orange[unblock] (or .def.orange[open]) the previously blocked (closed) path.

While $\text{A}$ and $\text{C}$ are (unconditionally) independent, they are .orange[conditionally dependent] given $\text{B}$.

--

.attn[Important:] When you condition on a collider, you open up the path.

---
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-7, echo = F, fig.height = 2.5, fig.width = 5}
gg_collider_unblocked
```

.b.purple[Intuition:] $\text{B}$ is a combination of $\text{A}$ and $\text{C}$.

Conditioning on a value of $\text{B}$ jointly constrains $\text{A}$ and $\text{C}$; they can no longer move independently.

--

.ex[Example:] Let $\text{A}$ take on $\{0,1\}$ and $\text{C}$ take on $\{0,1\}$ (independently), and suppose $\text{B}=\text{A}+\text{C}$.

Conditional on $\text{B}=1$, $\text{A}$ and $\text{C}$ are perfectly negatively correlated.

---
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-8, echo = F, fig.height = 2.5, fig.width = 5}
ggplot(
  data = bb5_dt %>% dplyr::mutate(
    name = fcase(
      name == "A", "Y",
      name == "B", "X",
      name == "C", "D",
      default = NA
    ),
    to = fcase(
      to == "A", "Y",
      to == "B", "X",
      to == "C", "D",
      default = NA
    )
  ),
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "X", fill = name == "X"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, purple, "white"),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
```

In *MHE* vocabulary: The collider $\text{X}$ is a *bad control*.
$\text{X}$ is affected by both your treatment $\text{D}$ and outcome $\text{Y}$. -- .note[The result:] A spurious relationship between $\text{Y}$ and $\text{D}$
Remember: they're actually (unconditionally) independent. -- This spurious relationship is often called .def.purple[collider bias]. --- .ex[Example] Data from hospitalized patients: Mobility and respiratory health. -- .qa[Q] How does this example relate to collider bias? --
.qa[A] Write out the DAG (+ think about selection into your sample)! --- .ex[Example] Data from hospitalized patients: Mobility and respiratory health. Define $\text{M}$ as .it[mobility], -- $\text{R}$ as .it[respiratory health], -- and $\text{H}$ as .it[hospitalized]. -- Suppose for the moment respiratory health and mobility 1. are .b[independent of each other] 2. each .b[cause hospitalization] (when they are too low) -- ```{r, ex-collider-bias-1, echo = F, fig.height = 2.5, fig.width = 5} ggplot( data = bb5_dt %>% dplyr::mutate( name = fcase( name == "A", "M", name == "B", "H", name == "C", "R", default = NA ), to = fcase( to == "A", "M", to == "B", "H", to == "C", "R", default = NA ) ), aes(x = x, y = y) ) + # geom_line( # data = collider_curve[1:median(1:.N),], # color = "grey80", # linetype = "dashed", # size = 0.8 # ) + # geom_point( # aes(x = x, y = y), # data = collider_curve[median(1:.N)], # shape = 15, # size = 6, # color = "grey80" # ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, shape = 21, stroke = 0.6, fill = "white", color = purple ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) ``` The implied DAG. --- .ex[Example] Data from hospitalized patients: Mobility and respiratory health. Define $\text{M}$ as .it[mobility], $\text{R}$ as .it[respiratory health], and $\text{H}$ as .it[hospitalized]. Suppose for the moment respiratory health and mobility 1. are .b[independent of each other] 2. 
each .b[cause hospitalization] (when they are too low) ```{r, ex-collider-bias-1b, echo = F, fig.height = 2.5, fig.width = 5} ggplot( data = bb5_dt %>% dplyr::mutate( name = fcase( name == "A", "M", name == "B", "H", name == "C", "R", default = NA ), to = fcase( to == "A", "M", to == "B", "H", to == "C", "R", default = NA ) ), aes(x = x, y = y) ) + geom_line( data = collider_curve[1:median(1:.N),], color = "grey80", linetype = "dashed", size = 0.8 ) + geom_point( aes(x = x, y = y), data = collider_curve[median(1:.N)], shape = 15, size = 6, color = "grey80" ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( size = 20, shape = 21, stroke = 0.6, fill = "white", color = purple ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) ``` If we .b.pink[do not condition on hospitalization], $\text{M}\rightarrow \text{H} \leftarrow \text{R}$ is .b.grey[blocked]. --- .ex[Example] Data from hospitalized patients: Mobility and respiratory health. Define $\text{M}$ as .it[mobility], $\text{R}$ as .it[respiratory health], and $\text{H}$ as .it[hospitalized]. Suppose for the moment respiratory health and mobility 1. are .b[independent of each other] 2. 
each .b[cause hospitalization] (when they are too low) ```{r, ex-collider-bias-2, echo = F, fig.height = 2.5, fig.width = 5} ggplot( data = bb5_dt %>% dplyr::mutate( name = fcase( name == "A", "M", name == "B", "H", name == "C", "R", default = NA ), to = fcase( to == "A", "M", to == "B", "H", to == "C", "R", default = NA ) ), aes(x = x, y = y) ) + geom_line( data = collider_curve, color = orange, linetype = "dashed", size = 0.8 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.06, "npc")), color = purple, size = 0.9, lineend = "round" ) + geom_point( aes(color = name == "H", fill = name == "H"), size = 20, shape = 21, stroke = 0.6 ) + geom_text( aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = c(purple, purple, "white"), fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(-1.5, 1.5), ylim = c(-1, 0.5) ) + scale_color_manual(values = c(purple, "grey80")) + scale_fill_manual(values = c("white", "grey80")) ``` Our data .b.grey[conditions on hospitalization], which .b.orange[opens] $\text{M}\rightarrow \text{H} \leftarrow \text{R}$. --- class: clear, middle You can also see this example graphically... --- .ex[Example] Data from hospitalized patients: Mobility and respiratory health. 
```{r, ex-collider-gen, include = F}
# Set seed and sample size
set.seed(12345)
n = 100
# Generate M and R independently
cb_dt = data.table(
  M = runif(n),
  R = runif(n)
)
# Determine hospitalization
cb_dt[, `:=`(
  H = M + R < 1
)]
# Population relationship
p1 = ggplot(
  data = cb_dt,
  aes(x = M, y = R)
) +
geom_point(
  color = slate,
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "bottom",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Adding hospitalization
p2 = ggplot(
  data = cb_dt,
  aes(x = M, y = R)
) +
geom_point(
  aes(color = H),
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
scale_color_manual(
  "Hospitalized (H)",
  values = c(purple, orange),
  labels = c("No", "Yes")
) +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "bottom",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Conditioning on hospitalization
p3 = ggplot(
  data = cb_dt[H==T],
  aes(x = M, y = R)
) +
geom_point(
  color = orange,
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "none",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Align the plots
aligned = align_patches(p1, p2, p3)
```

Let $\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.
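--

.note[In numbers:] A rough numerical version of this setup, re-simulated with a larger sample than the figures use. The names `chk`, `n_chk`, `cor_all`, and `cor_hosp` are hypothetical; only the data-generating rule comes from the slide.

```{r, ex-collider-check}
# Hypothetical re-simulation of the slide's setup (the figures use n = 100)
set.seed(12345)
n_chk = 1e4
chk = data.frame(M = runif(n_chk), R = runif(n_chk))
# Hospitalization rule from the slide: H is TRUE when M + R < 1
chk$H = chk$M + chk$R < 1
# Correlation in the full population (no conditioning)
cor_all = cor(chk$M, chk$R)
# Correlation among hospitalized individuals only (conditioning on H)
cor_hosp = with(chk[chk$H, ], cor(M, R))
c(all = cor_all, hospitalized = cor_hosp)
```

The full-sample correlation should sit near zero, while the hospitalized-only correlation should sit near $-0.5$ (its population value when $(M, R)$ is uniform on the triangle $M + R < 1$).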
--

```{r, ex-collider-plot1, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[1]]
```

--

.note[Without conditioning:] No relationship between mobility and respiratory health.

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

$\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.

```{r, ex-collider-plot2, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[2]]
```

.note[Recall:] Our sample excludes non-hospitalized individuals.

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

$\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.

```{r, ex-collider-plot3, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[3]]
```

.note[Conditioning on] $\text{H}$.note[:] Mobility and respiratory health are associated.

---
class: clear, middle

I like this example because it reminds us that .b.it[conditioning] occurs both .b[explicitly] (*e.g.*, "controlling for") and .b[implicitly] (*e.g.*, sample inclusion).

This example of collider bias in hospitalization data comes from David L. Sackett's 1979 paper .it[[Bias in Analytic Research](https://www.jameslindlibrary.org/wp-data/uploads/2014/06/Sackett-1979-whole-article.pdf)]. Sackett called it .def.slate[admission rate bias].

.note[More generally:] You'll hear this called .def.slate[selection bias] or .def.slate[Berkson's paradox].

---
layout: false
# DAGs
## Blocked paths

Let's formally define a blocked path (blocking is important).

--

A path between $\text{X}$ and $\text{Y}$ is .def.purple[blocked] by conditioning on a set of variables $\text{Z}$ (possibly empty) if either of the following statements is true:

1. On the path, there is a .b.pink[chain] $\left(\dots \rightarrow \text{W} \rightarrow \dots\right)$ or a .b.pink[fork] $\left(\dots \leftarrow \text{W} \rightarrow \dots\right)$, and we condition on $\text{W}$ $\left(\text{W}\in \text{Z}\right)$.

1. On the path, there is a .b.orange[collider] $\left(\dots \rightarrow \text{W} \leftarrow \dots\right)$, and we .it[do not] condition on $\text{W}$ $\left(\text{W}\not\in \text{Z}\right)$ or on any of its .b.orange[descendants] $\left(\text{de}(\text{W})\cap \text{Z}=\emptyset\right)$.

--

Association flows along unblocked paths.

---
name: d-sep
# DAGs
## d-separation and d-connected(-ness)

Finally, we'll define whether nodes are .note[separated] or .note[connected] in DAGs.

--

.b.purple[Separation:] Nodes $\text{X}$ and $\text{Y}$ are .def.purple[d-separated] by a set of nodes $\text{Z}$ if .purple[all paths] between $\text{X}$ and $\text{Y}$ .purple[are blocked] by $\text{Z}$.

--

Notation for d-separation: $\color{#6A5ACD}{\text{X} \ci_{G} \text{Y} \vert \text{Z}}$

--

.b.pink[Connection:] If there is at least .pink[one path] between $\text{X}$ and $\text{Y}$ that is .pink[unblocked], then $\text{X}$ and $\text{Y}$ are .def.pink[d-connected].

---
# DAGs
## d-separation and causality

.purple[d-separation] tells us that two nodes are .purple[not associated].

--

To measure the .pink[causal effect] of $\text{X}$ on $\text{Y}$:
We must eliminate .pink[non-causal association]. -- Putting these ideas together, here is our .def.orange[criterion to isolate causal effects]: > If we remove all edges flowing .b[out] of $\text{X}$ (its .pink[causal effects]),
then $\text{X}$ and $\text{Y}$ should be .purple[d-separated]. -- This criterion ensures that we've closed the .def.slate[backdoor paths] that generate non-causal associations between $\text{X}$ and $\text{Y}$. --- layout: false class: inverse, middle name: ex # Examples --- layout: true class: clear --- .ex[Example 1:] .b[OVB]
.to-middle[ ```{r, ex-1, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = dag_dt, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05), ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1) ) ``` ] .qa[Q] OVB using DAG fundamentals: When can we isolate causal effects? --- .ex[Example 2:] .b[Mediation] Here $\text{M}$ is a .def.purple[mediator]: it .def.purple[mediates] the effect of $\text{D}$ on $\text{Y}$. ```{r, ex-2-setup, include = F} # The full DAG ex2 = dagify( Y ~ M, M ~ D, Y ~ W, D ~ W, coords = tibble( name = c("Y", "D", "W", "M"), x = c(1, 3, 2, 2), y = c(2, 2, 1, 2) ) ) # Convert to data.table ex2 %<>% fortify() %T>% setDT() # Shorten segments mult = 0.15 ex2[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, ex-2-fig, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = ex2, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = . 
%>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(ex2[,min(x)]*0.95, ex2[,max(x)]*1.05), ylim = c(ex2[,min(y)]*0.8, ex2[,max(y)]*1.1) ) ``` .qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and $\text{M}$? --- .ex[Example 3:] .b[Partial mediation]
```{r, ex-3-fig, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = ex2, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_curve( data = . %>% .[,.( name == "D", to == "Y", xa = sum(xa * (name == "D"), na.rm = T), ya = sum(ya * (name == "D"), na.rm = T) + 0.15, xb = sum(xb * (name == "M"), na.rm = T), yb = sum(yb * (name == "M"), na.rm = T) + 0.15 )], aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0.3, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_curve( data = . %>% .[!(name == "D" & to == "Y")], aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = c(ex2[,min(x)]*0.95, ex2[,max(x)]*1.05), ylim = c(ex2[,min(y)]*0.8, ex2[,max(y)]*1.3) ) ``` .qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and $\text{M}$? --- .ex[Example 4:] .b[Non-mediator descendants]
```{r, ex-4-setup, include = F} # The full DAG ex4 = dagify( Y ~ D, Y ~ W, Z ~ Y, D ~ W, Z ~ D, coords = tibble( name = c("Y", "D", "W", "Z"), x = c(-1, 1, 0, 0), y = c( 0, 0, -1, 1) ) ) # Convert to data.table ex4 %<>% fortify() %T>% setDT() # Shorten segments mult = 0.15 ex4[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, ex-4-fig, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = ex4, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = ex4[,range(x)] + ex4[,range(x) %>% diff()] * c(-0.08, 0.08), ylim = ex4[,range(y)] + ex4[,range(y) %>% diff()] * c(-0.08, 0.08) ) ``` .qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and/or $\text{Z}$? --- .ex[Example 5:] .b[M-Bias] Notice that $\text{C}$ here is .it[not] a result of treatment (could be "pre-treatment"). ```{r, ex-5-setup, include = F} # The full DAG ex5 = dagify( Y ~ D, Y ~ A, C ~ A, C ~ B, D ~ B, coords = tibble( name = c("Y", "D", "A", "B", "C"), x = c(0, 2, 0, 2, 1), y = c(0, 0, 2, 2, 1) ) ) # Convert to data.table ex5 %<>% fortify() %T>% setDT() # Shorten segments mult = 0.2 ex5[, `:=`( xa = x + (xend-x) * (mult), ya = y + (yend-y) * (mult), xb = x + (xend-x) * (1-mult), yb = y + (yend-y) * (1-mult) )] ``` ```{r, ex-5-fig, echo = F, fig.height = 3, fig.width = 6} # Plot the full DAG ggplot( data = ex5, aes(x = x, y = y, xend = xend, yend = yend) ) + geom_point( size = 20, fill = "white", color = purple, shape = 21, stroke = 0.6 ) + geom_curve( aes(x = xa, y = ya, xend = xb, yend = yb), curvature = 0, arrow = arrow(length = unit(0.07, "npc")), color = purple, size = 1.2, lineend = "round" ) + geom_text( data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(), aes(x = x, y = y, label = name), family = "Fira Sans Medium", size = 8, color = purple, fontface = "bold" ) + theme_void() + theme( legend.position = "none", ) + coord_cartesian( xlim = ex5[,range(x)] + ex5[,range(x) %>% diff()] * c(-0.08, 0.08), ylim = ex5[,range(y)] + ex5[,range(y) %>% diff()] * c(-0.08, 0.08) ) ``` .qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
.qa[Q.sub[2]] What happens if we condition on $\text{C}$?
.qa[Q.sub[3]] What happens if we condition on $\text{C}$ along with $\text{A}$ and/or $\text{B}$?

---
class: middle

.b[One more note:] DAGs are often drawn without "noise variables" (disturbances).

But they still exist; they're just "outside of the model."

---
layout: false
name: limits
# DAGs
## Limitations

So what can't DAGs do (well)?

--

- .def[Simultaneity:] We defined causality as unidirectional and prohibited cycles.

--

- .def[Dynamics:] You can sort of allow a variable to affect itself... $\text{Y}_{t=1}\rightarrow\text{Y}_{t=2}$.

--

- .def[Uncertainty:] DAGs are most useful when you can correctly draw them.

--

- .def[Make friends:] There's *a lot* of (angry/uncharitable) fighting about DAGs:

$$
\begin{align}
  \text{Philosophy}\rightarrow \text{DAGs/Epidemiology} \leftarrow \text{Economics}
\end{align}
$$

---
class: clear, middle

.b.slate[Some of Judea Pearl's thoughts] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> So, what is it about epidemiologists that drives them to seek the light of new tools, while economists (at least those in Imbens’s camp) seek comfort in partial blindness, while missing out on the causal revolution? Can economists do in their heads what epidemiologists observe in their graphs? Can they, for instance, identify the testable implications of their own assumptions? Can they decide whether the IV assumptions (i.e., exogeneity and exclusion) are satisfied in their own models of reality? Of course the can’t; such decisions are intractable to the graph-less mind. (I have challenged them repeatedly to these tasks, to the sound of a pin-drop silence)

---
class: clear, middle

.b.slate[Pearl, continued] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> Or, are problems in economics different from those in epidemiology?
I have examined the structure of typical problems in the two fields, the number of variables involved, the types of data available, and the nature of the research questions. The problems are strikingly similar.

> I have only one explanation for the difference: Culture.

> The arrow-phobic culture started twenty years ago, when Imbens and Rubin (1995) decided that graphs “can easily lull the researcher into a false sense of confidence in the resulting causal conclusions,” and Paul Rosenbaum (1995) echoed with “No basis is given for believing” […] “that a certain mathematical operation, namely this wiping out of equations and fixing of variables, predicts a certain physical reality”

---
class: clear, middle

.b.slate[Guido Imbens's response] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> ... Judea and others using graphical models have developed a very interesting set of tools that researchers in many areas have found useful for their research. Other researchers, including myself, have found the potential outcome framework for causality associated with the work by Rubin... more useful for their work. In my view that difference of opinion does not reflect “economists being scared of graphs”, or “educational deficiencies” as Judea claims, merely legitimate heterogeneity in views arising from differences in preferences and problems. The “educational deficiencies” claim, and similarly the comment about my “vow” to avoid causal graphs is particularly ironic given that in the past Judea has presented, at my request, his work on causal graphs to participants in a graduate seminar I taught at Harvard University.

---
class: clear, middle, center

.enormous[🤷]
.note[Suggestion:] Be nice to people and be intellectually honest.

---
name: sources
layout: false
# Sources

## Thanks

These notes rely heavily upon Brady Neal's [*Introduction to Causal Inference*](https://bradyneal.com/causal-inference-course).

I also borrow from Scott Cunningham's [*Causal Inference: The Mixtape*](https://www.scunning.com/mixtape.html).

I found the [Sackett (1979)](https://catalogofbias.org/biases/collider-bias/) example on the ["Catalog of Bias"](https://catalogofbias.org/biases/collider-bias/) website.

---
exclude: false
# Table of contents

.col-left[
.small[
#### Admin
- [Today and upcoming](#schedule)

#### Other
- [Sources](#sources)

#### DAGs
- [What's a DAG?](#different)
- [Example](#dag-ex)
- [Graphs](#graphs)
- [Definition/undirected](#graphs)
- [Directed](#graphs-directed)
- [Cycles](#graphs-cycles)
]
]

.col-right[
.small[
#### DAGs continued
- [Origins](#dag-origins)
- [Local Markov](#local-markov)
- [Bayesian net. factorization](#dags-factor)
- [Dependence](#dags-dependence)
- [Causality](#dags-causlity)
- [DAG building blocks](#building-blocks)
- [Chains](#blocks-chains)
- [Forks](#blocks-forks)
- [Immoralities/colliders](#blocks-colliders)
- [d-separation](#d-sep)
- [Examples](#ex)
- [Limitations](#limits)
]
]

---
exclude: true

```{r, generate pdfs, include = F, eval = F}
pagedown::chrome_print("12-ml.html", output = "12-ml.pdf")
pagedown::chrome_print("12-ml.html", output = "12-ml-nopause.pdf")
```