---
title: "DAGs"
subtitle: "EC 607, Set 07"
author: "Edward Rubin"
date: "Spring 2021"
output:
  xaringan::moon_reader:
    css: ['default', 'metropolis', 'metropolis-fonts', 'my-css.css']
    # self_contained: true
    nature:
      highlightStyle: github
      highlightLines: true
      countIncrementalSlides: false
---
class: inverse, middle

```{r, setup, include = F}
# devtools::install_github("dill/emoGG")
library(pacman)
p_load(
  broom, tidyverse,
  ggplot2, ggthemes, ggforce, ggridges, ggdag, dagitty, cowplot, patchwork, scales,
  latex2exp, viridis, extrafont, grid, gridExtra, plotly, ggformula,
  kableExtra, DT, 
  data.table, dplyr, snakecase, janitor,
  lubridate, knitr, future, furrr, parallel,
  MASS, estimatr, FNN, parsnip, caret, glmnet,
  huxtable, here, magrittr
)
# Define colors
red_pink <- "#e64173"
turquoise <- "#20B2AA"
orange <- "#FFA500"
red <- "#fb6107"
blue <- "#3b3b9a"
green <- "#8bb174"
grey_light <- "grey70"
grey_mid <- "grey50"
grey_dark <- "grey20"
purple <- "#6A5ACD"
slate <- "#314f4f"
# Dark slate grey: #314f4f
# Knitr options
opts_chunk$set(
  comment = "#>",
  fig.align = "center",
  fig.height = 7,
  fig.width = 10.5,
  warning = F,
  message = F
)
opts_chunk$set(dev = "svg")
options(device = function(file, width, height) {
  svg(tempfile(), width = width, height = height)
})
options(knitr.table.format = "html")
theme_set(theme_gray(base_size = 20))
```

```{r, xaringan-extra, include = F, eval = F}
xaringanExtra::use_scribble(pen_color = red_pink)
```

```{css, echo = F, eval = T}
@media print {
  .has-continuation {
    display: block !important;
  }
}
```

$$
\begin{align}
  \def\ci{\perp\mkern-10mu\perp}
\end{align}
$$


# Prologue

---
name: schedule

# Schedule

## Last time

(Bad) Controls

## Today

Directed Acyclic Graphs (DAGs)

## Upcoming

Matching

---
class: inverse, middle
# DAGs

---
layout: true
# DAGs

---
name: different
##  What's a DAG?

.note[DAG] stands for .b[directed acyclic graph].

--

More helpful...

A .note[DAG] graphically illustrates the causal relationships and non-causal associations within a network of random variables.

---
name: dag-ex
layout: false
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-setup, echo = F, include = F}
# The full DAG
dag_full = dagify(
  Y ~ D,
  Y ~ W,
  D ~ W,
  coords = tibble(
    name = c("Y", "D", "W"),
    x = c(1, 3, 2),
    y = c(2, 2, 1)
  )
)
# Convert to data.table
dag_dt = dag_full %>% fortify() %>% setDT()
# Add indicators for paths
dag_dt[, `:=`(
  path1 = (name == "D" & to == "Y") | (name == "Y"),
  path2 = (name == "D" & to == "W") | (name == "W" & to == "Y") | (name == "Y")
)]
# Shorten segments
mult = 0.15
dag_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, dag-ex-ovb, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = slate,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = slate,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = slate,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
)
```

A pretty standard DAG.

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-nodes, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  color = red_pink
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = "grey80",
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = "white",
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
)
```

.b.pink[Nodes] are random variables.

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-edges, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = "grey80",
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = "grey80",
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
)
```

.b.purple[Edges] depict causal links. Causality flows in the direction of the .b.purple[arrows].

--

- Connections matter!
- Direction matters (for causality).
- Non-connections also matter! .grey-light[(More on this topic soon.)]

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-2, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = slate,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = slate,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = slate,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
)
```

Here we can see that .b.slate[Y] is affected by both .b.slate[D] and .b.slate[W].

.b.slate[W] also affects .b.slate[D].

--

.qa[Q] How does this graph exhibit OVB?

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-3, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = slate,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb, color = (name == "D" & to == "Y")),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = slate,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
) +
scale_color_manual(values = c(slate, red_pink))
```

There are two pathways from .b.slate[D] to .b.slate[Y].

--

.slate[1\.] The path from .b.slate[D] to .b.slate[Y] $\color{#e64173}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest.

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-4, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  data = dag_dt[name == "D" & to == "Y"],
  color = orange,
  size = 0.8,
  linetype = "dashed",
  curvature = -1.02
) +
geom_point(
  size = 20,
  fill = "white",
  color = slate,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb, color = !(name == "D" & to == "Y")),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = slate,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
) +
scale_color_manual(values = c(slate, red_pink))
```

There are two pathways from .b.slate[D] to .b.slate[Y].

.slate[1\.] The path from .b.slate[D] to .b.slate[Y] $\color{#314f4f}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest.
<br>
.slate[2\.] The path $\color{#e64173}{\left(\text{Y}\leftarrow\text{W}\rightarrow\text{D}\right)}$ creates a .orange[non-causal association] between .b.slate[D] and .b.slate[Y].

---
class: clear

.ex[Example] Omitted-variable bias in a DAG 

```{r, dag-ex-ovb-6, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  data = dag_dt[name == "D" & to == "Y"],
  color = "grey80",
  size = 0.8,
  linetype = "dashed",
  curvature = -1.02
) +
geom_point(
  aes(color = name == "W", fill = name == "W"),
  size = 20,
  pch = 21
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(slate, "white", slate),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
) +
scale_fill_manual(values = c("white", "grey80")) +
scale_color_manual(values = c(slate, "grey80"))
```

There are two pathways from .b.slate[D] to .b.slate[Y].

.slate[1\.] The path from .b.slate[D] to .b.slate[Y] $\color{#314f4f}{\left(\text{D}\rightarrow\text{Y}\right)}$ is our causal relationship of interest.
<br>
.slate[2\.] The path $\color{#314f4f}{\left(\text{Y}\leftarrow\text{W}\rightarrow\text{D}\right)}$ creates a .orange[non-causal association] between .b.slate[D] and .b.slate[Y].

To shut down this pathway creating a non-causal association, we must .b.grey-light[condition on .b[W]]. Sound familiar?
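
---
class: clear

.ex[Example] Omitted-variable bias in a simulation

A minimal sketch of the same DAG with simulated data. The linear model and the coefficients (2 for $\text{D}\rightarrow\text{Y}$, 3 for $\text{W}\rightarrow\text{Y}$) are made up for illustration.

```{r, sketch-ovb}
# Simulate the DAG: W -> D, W -> Y, and D -> Y (all coefficients are made up)
set.seed(607)
n_ovb = 1e4
w_sim = rnorm(n_ovb)
d_sim = 1 * w_sim + rnorm(n_ovb)
y_sim = 2 * d_sim + 3 * w_sim + rnorm(n_ovb)
# Omitting W leaves the path D <- W -> Y open: the estimate mixes both paths
b_short = coef(lm(y_sim ~ d_sim))[["d_sim"]]
# Conditioning on W blocks the non-causal path: the estimate is close to 2
b_long = coef(lm(y_sim ~ d_sim + w_sim))[["d_sim"]]
c(omit_W = b_short, condition_on_W = b_long)
```

Omitting .b.slate[W] should push the estimate well away from 2; conditioning on .b.slate[W] should bring it back.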

---
layout: true
# Graphs

---
class: inverse, middle
name: graphs

---

## More formally

In graph theory, a .pink.def[graph] is a collection of .purple.def[nodes] connected by .orange.def[edges].

--

```{r, graph-ex-setup, include = F}
# The full DAG
graph_ex = dagify(
  B ~ A,
  C ~ A,
  C ~ B,
  D ~ C,
  coords = tibble(
    name = LETTERS[1:4],
    x = c(0, 1, 0, 1),
    y = c(1, 1, 0, 0)
  )
)
# Convert to data.table
graph_dt = graph_ex %>% fortify() %>% setDT()
# Shorten segments
mult = 0.16
graph_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, graph-ex-undirected, echo = F, fig.height = 3, fig.width = 3}
# Plot the full DAG
ggplot(
  data = graph_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  curvature = 0,
  color = orange,
  size = 1.2,
  lineend = "round"
) +
geom_point(
  size = 20,
  color = purple
) +
geom_text(
  data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = "white",
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1),
  ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1)
)
```

--

- Nodes connected by an edge are called .def[adjacent].
--

- .def[Paths] run along adjacent nodes, .it[e.g.], $\text{A}-\text{B}-\text{C}$.
--

- The graph above is .def[undirected], since the edges don't have direction. 

---
name: graphs-directed
## Directed

.def.purple[Directed graphs] have edges with direction.

```{r, graph-ex-directed, echo = F, fig.height = 3, fig.width = 3}
# Plot the full DAG
ggplot(
  data = graph_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  aes(x = xa, xend = xb, y = ya, yend = yb),
  arrow = arrow(length = unit(0.07, "npc")),
  curvature = 0,
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1),
  ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1)
)
```

--

- .def[Directed paths] follow edges' directions, *e.g.*, $\text{A}\rightarrow\text{B}\rightarrow\text{C}$.

--
- Nodes that precede a given node in a directed path are its .def[ancestors].

--
- The opposite: .def[descendants] come after the node, *e.g.*, $\text{D}\in\text{de}(\text{C})$.
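
---
## Directed: A sketch in R

One way to make these definitions concrete (a sketch, not the only approach): store the directed graph as an adjacency matrix and add up its powers to find every directed path. All object and function names below are illustrative.

```{r, sketch-reach}
# Adjacency matrix for the directed graph above: adj[i, j] = 1 means i -> j
nodes = c("A", "B", "C", "D")
adj = matrix(0, 4, 4, dimnames = list(nodes, nodes))
adj["A", c("B", "C")] = 1
adj["B", "C"] = 1
adj["C", "D"] = 1
# reach[i, j] > 0 when some directed path runs from i to j
reach = adj
step = adj
for (k in seq_len(length(nodes) - 1)) {
  step = step %*% adj
  reach = reach + step
}
# Descendants: nodes we can reach; ancestors: nodes that can reach us
descendants_of = function(node) nodes[reach[node, ] > 0]
ancestors_of = function(node) nodes[reach[, node] > 0]
descendants_of("C")
ancestors_of("C")
```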

---
name: graphs-cycles
## Cycles

If a node is its own descendant (*e.g.*, $\text{D}\in\text{de}(\text{D})$), your graph has a .pink.def[cycle].


```{r, cycle-setup, include = F}
# The full DAG
cycle_ex = dagify(
  B ~ A,
  C ~ A,
  C ~ B,
  D ~ C,
  B ~ D,
  coords = tibble(
    name = LETTERS[1:4],
    x = c(0, 1, 0, 1),
    y = c(1, 1, 0, 0)
  )
)
# Convert to data.table
cycle_dt = cycle_ex %>% fortify() %>% setDT()
# Add indicators for paths
cycle_dt[, `:=`(
  cycle = (name == "B" & to == "C") | (name == "D" & to == "B") | (name == "C" & to == "D")
)]
# Shorten segments
mult = 0.16
cycle_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, graph-ex-cycle, echo = F, fig.height = 3, fig.width = 3}
# Plot the full DAG
ggplot(
  data = cycle_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  aes(x = xa, xend = xb, y = ya, yend = yb, color = cycle),
  arrow = arrow(length = unit(0.07, "npc")),
  curvature = 0,
  size = 1.2,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  data = cycle_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = cycle_dt[,range(x)] + cycle_dt[,range(x) %>% diff()] * c(-0.1, 0.1),
  ylim = cycle_dt[,range(y)] + cycle_dt[,range(y) %>% diff()] * c(-0.1, 0.1)
) +
scale_color_manual(values = c(purple, red_pink))
```

--

If your directed graph does not have any cycles, then you have a <br> .def.orange[directed acyclic graph] (.def.orange[DAG]).
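
---
## Cycles: A sketch in R

A hedged sketch of the "own descendant" idea: a directed graph has a cycle when some node can reach itself along a directed walk. `has_cycle()` is a hypothetical helper written for this slide, not a function from any package.

```{r, sketch-cycle}
# A directed graph has a cycle if any node can reach itself
has_cycle = function(adj_mat) {
  n_nodes = nrow(adj_mat)
  walk = diag(n_nodes)
  for (k in seq_len(n_nodes)) {
    # walk[i, j] > 0 when a directed walk of length k runs from i to j
    walk = walk %*% adj_mat
    if (any(diag(walk) > 0)) return(TRUE)
  }
  FALSE
}
# The acyclic example: A -> B, A -> C, B -> C, C -> D
adj_dag = matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
adj_dag["A", c("B", "C")] = 1
adj_dag["B", "C"] = 1
adj_dag["C", "D"] = 1
# Adding D -> B closes the loop B -> C -> D -> B
adj_cycle = adj_dag
adj_cycle["D", "B"] = 1
c(dag = has_cycle(adj_dag), with_cycle = has_cycle(adj_cycle))
```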

---
layout: true
# DAGs

---
class: inverse, middle

---
name: dag-origins
## The origin story

Many developments in .it[causal graphical models] came from work in probabilistic graphical models—especially Bayesian networks.

--

Recall what you know about joint probabilities:

$$
\begin{align}
  &\color{#FFA500}{2} &P(x_1,x_2) &= P(x_1) P(x_2|x_1)
  \\[0.2em]
  &\color{#FFA500}{3} &P(x_1,x_2,x_3) &= P(x_1) P(x_2,x_3|x_1) = P(x_1) P(x_2|x_1) P(x_3|x_2,x_1)
  \\[0.2em]
  &\color{#FFA500}{\vdots} 
  \\[0.2em]
  &\color{#FFA500}{n} &P(x_1,x_2,\dots,x_n) &= P(x_1)\prod_{i=2}^{n} P(x_i|x_{i-1},\ldots,x_1)
\end{align}
$$

--

This final product can include *a lot* of terms.
<br>.ex[E.g.,] even when $x_i$ are binary, $P(x_4 | x_3,x_2,x_1)$ requires $2^3=8$ parameters.

---
name: local-markov
## Thinking locally

DAGs help us think through simplifying $P(x_k | x_{k-1},x_{k-2},\ldots,x_1)$.

--

```{r, graph-prob, echo = F, fig.height = 3, fig.width = 3}
# Plot the full DAG
ex_graph = ggplot(
  data = graph_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_curve(
  aes(x = xa, xend = xb, y = ya, yend = yb),
  arrow = arrow(length = unit(0.07, "npc")),
  curvature = 0,
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  data = graph_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = graph_dt[,range(x)] + graph_dt[,range(x) %>% diff()] * c(-0.1, 0.1),
  ylim = graph_dt[,range(y)] + graph_dt[,range(y) %>% diff()] * c(-0.1, 0.1)
)
ex_graph
```

Given a prob. dist. and a DAG, can we assume some independencies?
--
<br> Given $\color{#6A5ACD}{\text{C}}$, is it reasonable to assume $\color{#6A5ACD}{\text{D}}$ is independent of $\color{#6A5ACD}{\text{A}}$ and $\color{#6A5ACD}{\text{B}}$?


---
## Local Markov

This intuitive approach *is* the .def.purple[Local Markov Assumption]
> Given its parents in the DAG, a node $X$ is independent of all of its non-descendants.

--

.col-left[
.ex[Ex.] Consider the DAG to the right:

With the Local Markov Assumption,
<br>
$P(\text{D}|\text{A},\text{B},\text{C})$ simplifies to $P(\text{D}|\text{C})$.

Conditional on its parent $(\text{C})$,
<br>
$\text{D}$ is independent of $\text{A}$ and $\text{B}$.
]

.col-right[
```{r, graph-prob-2, echo = F, fig.height = 3, fig.width = 3}
ex_graph
```
]
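
---
## Local Markov in a simulation

A minimal sketch, assuming a linear model with normal errors (my assumption; the DAG itself does not impose a functional form). Once we include $\text{C}$, the coefficients on $\text{A}$ and $\text{B}$ should be close to zero.

```{r, sketch-local-markov}
# Simulate a linear version of the DAG: A -> B, A -> C, B -> C, C -> D
set.seed(607)
n_lm = 1e4
a_lm = rnorm(n_lm)
b_lm = a_lm + rnorm(n_lm)
c_lm = a_lm + b_lm + rnorm(n_lm)
d_lm = c_lm + rnorm(n_lm)
# Without C, D is associated with A and B
round(coef(lm(d_lm ~ a_lm + b_lm)), 2)
# Given its parent C, D should no longer depend on A or B
est_lm = coef(lm(d_lm ~ a_lm + b_lm + c_lm))
round(est_lm, 2)
```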

---
name: dags-factor
## Local Markov and factorization

The Local Markov Assumption is equiv. to .def.purple[Bayesian Network Factorization]
> For prob. dist. $P$ and DAG $G$, $P$ factorizes according to $G$ if 
$$
\begin{align}
  P(x_1,\ldots,x_n) = \prod_{i} P(x_i|\text{pa}_i)
\end{align}
$$
where $\text{pa}_i$ refers to $x_i$'s parents in $G$.

Bayesian network factorization is also called *the chain rule for Bayesian networks* and *Markov compatibility*.

---
name: factorization
## Factorize!

You can now (more easily) factorize the DAG/dist. below! .grey-vlight[(You're welcome.)]


.col-left[
```{r, graph-factorize, echo = F, fig.height = 3, fig.width = 3}
ex_graph
```
]

--


.col-right[
.b.slate[Factorization via B.N. chain rule]

$$
\begin{align}
  &P(\text{A},\text{B},\text{C},\text{D}) 
  \\[0.4em]
  &\quad = \prod_{i} P(x_i|\text{pa}_i)
  \\[0.4em]
  &\quad = P(\text{A}) P(\text{B}|\text{A}) P(\text{C}|\text{A},\text{B}) P(\text{D}|\text{C})
\end{align}
$$

]
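
---
## Counting parameters

Why bother factorizing? A quick sketch, assuming every variable is binary: each term $P(x_i|\text{pa}_i)$ needs one parameter per configuration of the parents, so fewer parents means fewer parameters.

```{r, sketch-param-count}
# Number of parents for each node in the DAG (A -> B, A -> C, B -> C, C -> D)
n_parents = c(A = 0, B = 1, C = 2, D = 1)
# With binary variables, P(x | parents) needs one parameter per parent configuration
params_dag = sum(2^n_parents)
# The unrestricted chain rule conditions each node on every preceding node
params_full = sum(2^(0:3))
c(dag = params_dag, full = params_full)
```

The DAG's factorization needs 9 parameters; the unrestricted chain rule needs 15.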


---
## Independence

What have we learned so far? .grey-vlight[(Why should you care about this stuff?)]

Local Markov and Bayesian Network Factorization tell us about .attn[independencies] within a probability distribution implied by the given DAG.

You're now able to say something about which variables are .pink.it[independent].

--

.b[There's more:] Great start, but there's more to life than independence.
<br>We also want to say something about .purple.it[dependence].

---
name: dags-dependence
## Dependence

We need to strengthen our Local Markov assumption to be able to interpret adjacent nodes as dependent. .grey-vlight[(*I.e.*, add it to our small set of assumptions.)]

--

The .def.purple[Minimality Assumption].pink.super[†]
> 1. .def.purple[Local Markov] Given its parents in the DAG, a node $X$ is independent of all of its non-descendants.
> 2. .it.grey-light[(NEW)] Adjacent nodes in the DAG are dependent.

.footnote[
.pink[†] The name .grey-light.def[minimality] refers to the minimal set of independencies for $P$ and $G$—we cannot remove any more edges from the graph (while staying Markov compatible with $G$).]

--

With the minimality assumption, we can learn both .pink[dependence] and .orange[independence] from connections (or non-connections) in a DAG. 

---
name: dags-causlity
## Causality

We need one last assumption to move DAGs from .it[statistical] to .it[causal] models.

--

.def.purple[Strict Causal Edges Assumption]
> Every parent is a direct cause of each of its children.

--

For $Y$, the set of .it[direct causes] is the set of variables to which $Y$ responds.

--

This assumption actually strengthens the second part of .note[Minimality]:
> 2\. Adjacent nodes in the DAG are dependent.

---
## Assumptions

Thus, we only need two assumptions to turn DAGs into causal models:

1. .def.purple[Local Markov] Given its parents in the DAG, a node $X$ is independent of all of its non-descendants.

1. .def.purple[Strict Causal Edges] Every parent is a direct cause of each of its children.

--

Not bad, right?

---
## Flows

[Brady Neal](https://bradyneal.com) emphasizes the .note[flow(s) of association] and .note[causation] in DAGs,
<br>and I find it to be a super helpful way to think about these models.

.def.purple[Flow of association] refers to whether two nodes are associated (statistically dependent) or not (statistically independent).

We will be interested in unconditional and conditional associations.

---
name: building-blocks
## Building blocks

We will run through a few simple .it[building blocks] (DAGs) that make up more complex DAGs.

For each simple DAG, we want to ask a few questions:

1. Which nodes are unconditionally or conditionally .b.pink[independent]?.super.pink[†]

1. Which nodes are .b.orange[dependent]?

1. What is the .b.purple[intuition]?

.footnote[
.pink[†] To prove $\text{A}$ and $\text{B}$ are conditionally independent, we can show $P(\text{A},\text{B}|\text{C})$ factorizes as $P(\text{A}|\text{C})P(\text{B}|\text{C})$.]

---
layout: true
class: clear

---
.note[Building block 1:] .b.slate[Two unconnected nodes]

```{r, bb1-plot, echo = F, fig.height = 1, fig.width = 4}
# Plot the DAG
ggplot(
  data = data.table(
    x = 0:1,
    y = 0,
    name = LETTERS[1:2]
  ),
  aes(x = x, y = y)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-0.5, 1.5),
  ylim = c(-0.5, 0.5)
)
```

--

.b.purple[Intuition:]
--
 $\text{A}$ and $\text{B}$ appear independent—no link between the nodes.

--

.b.pink[Proof:]
--
 By [Bayesian network factorization](#factorization),
$$
\begin{align}
  P(\text{A},\text{B}) = P(\text{A}) P(\text{B})
\end{align}
$$
(since neither node has parents). $\checkmark$

---
.note[Building block 2:] .b.slate[Two connected nodes]

```{r, bb2-data, include = F}
# The DAG
bb2_ex = dagify(
  B ~ A,
  coords = tibble(
    name = LETTERS[1:2],
    x = 0:1,
    y = c(0,0)
  )
)
# Convert to data.table
bb2_dt = bb2_ex %>% fortify() %>% setDT()
# Shorten segments
mult = 0.2
bb2_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, bb2-plot, echo = F, fig.height = 1, fig.width = 4}
# Plot the DAG
ggplot(
  data = bb2_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.13, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-0.5, 1.5),
  ylim = c(-0.5, 0.5)
)
```

--

.b.purple[Intuition:]
--
 $\text{A}$ "is a cause of" $\text{B}$: there is clear (causal) dependence..super.pink[†]

.footnote[
.pink[†] I'm not a huge fan of the "is a cause of" wording, but it appears to be (unfortunately) common in this literature. IMO, $``\text{A}$ causes (or affects) $\text{B"}$ would be clearer (and more grammatical), but no one asked me. One argument for "a cause of" (vs. "causes") is it emphasizes that events often have multiple causes.]

--

.b.pink[Proof:]
--
 By the [Strict Causal Edges Assumption](#dags-causlity), every parent (here, $\text{A}$) is a direct cause of each of its children $\left(\text{B}\right)$. $\checkmark$

---
name: blocks-chains
.note[Building block 3:] .b.slate[Chains]

```{r, bb3-data, include = F}
# The DAG
bb3_ex = dagify(
  B ~ A,
  C ~ B,
  coords = tibble(
    name = LETTERS[1:3],
    x = -1:1,
    y = 0
  )
)
# Convert to data.table
bb3_dt = bb3_ex %>% fortify() %>% setDT()
# Shorten segments
mult = 0.25
bb3_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, bb3-plot, echo = F, fig.height = 1.5, fig.width = 5}
# Plot the DAG
gg_chain = ggplot(
  data = bb3_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.09, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
# Plot it
gg_chain
```

--

.b.purple[Intuition:] We already showed two connected nodes are dependent:
- $\text{A}$ and $\text{B}$ are dependent.
- $\text{B}$ and $\text{C}$ are dependent.

The question is whether $\text{A}$ and $\text{C}$ are dependent:
<br>Does association flow from $\text{A}$ to $\text{C}$ through $\text{B}$?

---
count: false
.note[Building block 3:] .b.slate[Chains]

```{r, bb3-plot-2, echo = F, fig.height = 1.5, fig.width = 5}
# The curve dataset
curve_dt = tibble(
  x = c(-1, 0, 1),
  y = c(0, -0.8, 0)
) %>% spline(n = 101) %>% as.data.table()
# Plot the DAG
gg_chain_association = ggplot(
  data = bb3_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.09, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_line(
  data = curve_dt,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
gg_chain_association
```

.b.purple[Intuition:] We already showed two connected nodes are dependent:
- $\text{A}$ and $\text{B}$ are dependent.
- $\text{B}$ and $\text{C}$ are dependent.

The question is whether $\text{A}$ and $\text{C}$ are dependent:
<br>Does association flow from $\text{A}$ to $\text{C}$ through $\text{B}$? 

The answer .it[generally].super.pink[†] is .orange["yes"]: changes in $\text{A}$ typically cause changes in $\text{C}$. 

.footnote[
.pink[†] Section 2.2 of [Pearl, Glymour, and Jewell](http://bayes.cs.ucla.edu/PRIMER/) provides a "pathological" example of "intransitive dependence". It arises when $\text{A}$ induces variation in $\text{B}$ that is not relevant to $\text{C}$'s outcome.]

---
.note[Building block 3:] .b.slate[Chains]

```{r, bb3-plot-3, echo = F, fig.height = 1.5, fig.width = 5}
gg_chain_association
```

.b.pink[Proof:] Here's the unsatisfying part. 

Without more assumptions, we can't *prove* this association of $\text{A}$ and $\text{C}$.

We'll think of this as a potential (even likely) association.

---
.note[Building block 3:] .b.slate[Chains with conditions]

```{r, bb3-plot-4, echo = F, fig.height = 1.5, fig.width = 5}
# The curve dataset
curve_dt = tibble(
  x = c(-1, 0, 1),
  y = c(0, -0.8, 0)
) %>% spline(n = 101) %>% as.data.table()
# Plot the DAG
gg_chain_condition = ggplot(
  data = bb3_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.09, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_line(
  data = curve_dt[1:floor(.N/2)],
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_point(
  aes(x = x, y = y),
  data = curve_dt[median(1:.N)],
  shape = 15,
  size = 6,
  color = "grey80"
) +
geom_point(
  aes(color = name == "B", fill = name == "B"),
  shape = 21,
  size = 20,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, "white", purple),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80")) 
gg_chain_condition
```

.qa[Q] How does conditioning on $\text{B}$ affect the association between $\text{A}$ and $\text{C}$?

.b.purple[Intuition:] 

1. $\text{A}$ affects $\text{C}$ by changing $\text{B}$. 
2. When we hold $\text{B}$ constant, $\text{A}$ cannot "reach" $\text{C}$.

We've .def.purple[blocked] the path of association between $\text{A}$ and $\text{C}$.

Conditioning blocks the flow of association .b[in chains]. ("Good" control!)

---
.note[Building block 3:] .b.slate[Chains with conditions]

```{r, bb3-plot-6, echo = F, fig.height = 1.5, fig.width = 5}
# Plot the DAG
gg_chain_condition
```

.b.pink[Proof:] We want to show $\text{A}$ and $\text{C}$ are independent conditional on $\text{B}$,
<br>*i.e.*, $P(\text{A},\text{C}|\text{B})=P(\text{A}|\text{B})P(\text{C}|\text{B})$.

--

Start with BN factorization: $P(\text{A},\text{B},\text{C})$
--
 $= P(\text{A})P(\text{B}|\text{A})P(\text{C}|\text{B})$.

--

Now apply the definition of conditional probability to the LHS of our goal: $P(\text{A},\text{C}|\text{B}) = \frac{P(\text{A},\text{B},\text{C})}{P(\text{B})}$.

--

And substitute our factorization into the numerator:

$P(\text{A},\text{C}|\text{B}) = \dfrac{P(\text{A})P(\text{B}|\text{A})\color{#e64173}{P(\text{C}|\text{B})}}{P(\text{B})}$
--
 $=P(\text{A}|\text{B})\color{#e64173}{P(\text{C}|\text{B})}$ $\checkmark$ .grey-light[(by Bayes' rule)]
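
---
.note[Building block 3:] .b.slate[Chains with conditions]

.b.pink[Check:] The proof can also be checked numerically. A sketch with made-up probabilities for binary $\text{A}\rightarrow\text{B}\rightarrow\text{C}$ (none of these numbers come from data):

```{r, sketch-chain-exact}
# Made-up conditional probabilities (index 1 = value 0, index 2 = value 1)
p_a = c(0.3, 0.7)
p_b_a = rbind(c(0.9, 0.1), c(0.4, 0.6))   # Rows: A; columns: B
p_c_b = rbind(c(0.8, 0.2), c(0.25, 0.75)) # Rows: B; columns: C
# BN factorization: P(A, B, C) = P(A) P(B|A) P(C|B)
joint = array(0, dim = c(2, 2, 2))
for (i in 1:2) for (j in 1:2) for (k in 1:2) {
  joint[i, j, k] = p_a[i] * p_b_a[i, j] * p_c_b[j, k]
}
# Conditional on B: does P(A, C | B = b) equal P(A | B = b) P(C | B = b)?
chain_ok = sapply(1:2, function(j) {
  p_ac_b = joint[, j, ] / sum(joint[, j, ])
  isTRUE(all.equal(p_ac_b, outer(rowSums(p_ac_b), colSums(p_ac_b))))
})
# Unconditionally: largest gap between P(A, C) and P(A) P(C)
p_ac = apply(joint, c(1, 3), sum)
chain_gap = max(abs(p_ac - outer(rowSums(p_ac), colSums(p_ac))))
chain_ok
chain_gap
```

The conditional check should hold for both values of $\text{B}$, while the unconditional gap is not zero: $\text{A}$ and $\text{C}$ are associated until we condition on $\text{B}$.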

---
.note[Building block 3:] .b.slate[Chains]

```{r, bb3-plot-7, echo = F, fig.height = 1.5, fig.width = 5}
# Plot the DAG
gg_chain_association
```

.note[Note] This .orange[association of] $\color{#FFA500}{\text{A}}$ .orange[and] $\color{#FFA500}{\text{C}}$ is not directional. (It is symmetric.)

On the other hand, causation .b[is] directional (and asymmetric).

As you've been warned for years: Associations are not necessarily causal.

---
name: blocks-forks

.note[Building block 4:] .b.slate[Forks]

```{r, bb4-data, include = F}
# The DAG
bb4_ex = dagify(
  A ~ B,
  C ~ B,
  coords = tibble(
    name = LETTERS[1:3],
    x = -1:1,
    y = c(-0.7, 0, -0.7)
  )
)
# Convert to data.table
bb4_dt = bb4_ex %>% fortify() %>% setDT()
# Shorten segments
mult = 0.2
bb4_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
# The curve dataset
fork_curve = tibble(
  x = c(-1, 0, 1),
  y = c(-0.7, 0.3, -0.7)
) %>% spline(n = 101) %>% as.data.table()
```

```{r, bb4-plot-1, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_fork = ggplot(
  data = bb4_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
# Plot it
gg_fork
```

.def.purple[Forks] are another very common structure in DAGs: $\text{A}\leftarrow \text{B} \rightarrow \text{C}$.

---
.note[Building block 4:] .b.slate[Forks]

```{r, bb4-plot-2, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_fork_association = ggplot(
  data = bb4_dt,
  aes(x = x, y = y)
) +
geom_line(
  data = fork_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
# Plot it
gg_fork_association
```

$\text{A}$ and $\text{C}$ are *usually* .orange[associated] in forks. .grey-light[(As with chains.)]

This chain of association follows the path $\text{A}\leftarrow \text{B} \rightarrow \text{C}$.

--

.b.purple[Intuition:] 
--
 $\text{B}$ induces changes in both $\text{A}$ and $\text{C}$. An observer will see $\text{A}$ change when $\text{C}$ also changes: they are associated due to their common cause.

---
.note[Building block 4:] .b.slate[Forks]

```{r, bb4-data-ovb, include = F}
# Copy the fork dataset and change names of variables
fork_dt = copy(bb4_dt)
fork_dt[, `:=`(
  name = fcase(
    name == "B", "W",
    name == "A", "Y",
    name == "C", "D",
    default = NA
  ),
  to = fcase(
    to == "B", "W",
    to == "A", "Y",
    to == "C", "D",
    default = NA
  )
)]
```

```{r, bb4-plot-ovb, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
ggplot(
  data = fork_dt,
  aes(x = x, y = y)
) +
geom_line(
  data = fork_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
```

Another way to think about forks: 

OVB when a treatment $\text{D}$ does not affect the outcome $\text{Y}$.

Without controlling for $\text{W}$, $\text{Y}$ and $\text{D}$ are (usually) .orange[non-causally associated].
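
---
.note[Building block 4:] .b.slate[Forks]

.note[Simulation sketch:] A rough illustration of the fork-as-OVB idea, assuming a linear, Gaussian data-generating process (the coefficients, seed, and sample size are my own choices). Here $\text{D}$ has no effect on $\text{Y}$, so any coefficient on `d` in the short regression is non-causal association flowing through $\text{W}$.

```r
set.seed(607)
n <- 1e5
w <- rnorm(n)        # common cause W
d <- w + rnorm(n)    # treatment D depends on W
y <- w + rnorm(n)    # outcome Y depends on W but not on D

# Omitting W: the coefficient on D picks up the open path D <- W -> Y
b_short <- coef(lm(y ~ d))[["d"]]
# Conditioning on W blocks the fork, so the coefficient on D is near zero
b_long <- coef(lm(y ~ d + w))[["d"]]
c(short = b_short, long = b_long)
```

In this design the population value of the short coefficient is $\mathop{\text{Cov}}(\text{Y},\text{D})/\mathop{\text{Var}}(\text{D}) = 1/2$, even though the causal effect is zero.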

---
.note[Building block 4:] .b.slate[Forks]

```{r, bb4-plot-3, echo = F, fig.height = 2.5, fig.width = 5}
# Plot it (gg_fork_association is defined in the bb4-plot-2 chunk)
gg_fork_association
```

$\text{A}$ and $\text{C}$ are *usually* .orange[associated] in forks. .grey-light[(As with chains.)]

This chain of association follows the path $\text{A}\leftarrow \text{B} \rightarrow \text{C}$.

.b.pink[Proof:]
--
 Same problem as chains: We can't show $\text{A}$ and $\text{C}$ are independent, so we assume they're likely (potentially?) dependent.

---
.note[Building block 4:] .b.slate[Blocked forks]

```{r, bb4-plot-4, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_fork_block = ggplot(
  data = bb4_dt,
  aes(x = x, y = y)
) +
geom_line(
  data = fork_curve[1:floor(.N/2)],
  color = "grey80",
  linetype = "dashed",
  size = 0.8
) +
geom_point(
  aes(x = x, y = y),
  data = fork_curve[median(1:.N)],
  shape = 15,
  size = 6,
  color = "grey80"
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "B", fill = name == "B"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c("white", "white", purple, purple),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
# Plot it
gg_fork_block
```

Conditioning on $\text{B}$ makes $\text{A}$ and $\text{C}$
--
 independent. .grey-light[(As with chains.)]

.b.purple[Intuition:]
--
 $\text{A}$ and $\text{C}$ are only associated due to their common cause $\text{B}$.

When we shut down (hold constant) this common cause $(\text{B})$, 
<br>there is no way for $\text{A}$ and $\text{C}$ to associate.

--

.note[Also:] Think about Local Markov. Or think about OVB.

---
.note[Building block 4:] .b.slate[Blocked forks]

```{r, bb4-plot-5, echo = F, fig.height = 2.5, fig.width = 5}
# Plot it
gg_fork_block
```

.b.pink[Proof:] We want to show $P(\text{A},\text{C}|\text{B})=P(\text{A}|\text{B})P(\text{C}|\text{B})$.

--

.note[Step 1:] Bayesian net. factorization: $P(\text{A},\text{B},\text{C})=P(\text{B})\color{#e64173}{P(\text{A}|\text{B})P(\text{C}|\text{B})}$

--

.note[Step 2:] Bayes' rule: $P(\text{A},\text{C}|\text{B})=\frac{P(\text{A},\text{B},\text{C})}{P(\text{B})}$

--

.note[Step 3:] Combine .note[2] & .note[1]: $P(\text{A},\text{C}|\text{B})=\frac{P(\text{A},\text{B},\text{C})}{P(\text{B})} = \color{#e64173}{P(\text{A}|\text{B})P(\text{C}|\text{B})}$ $\checkmark$

---
.note[Building block 4:] .b.slate[Forks]

```{r, bb4-plot-6, echo = F, fig.height = 2.5, fig.width = 5}
# Plot it
gg_fork_association
```

Two more items to emphasize:

1. .b.orange[Association] need not follow paths' directions, *e.g.*, $\text{A}\leftarrow \text{B} \rightarrow \text{C}$.

2. .b.pink[Causation] follows directed paths.

---
name: blocks-colliders
.note[Building block 5:] .b.slate[Immoralities]

```{r, bb5-data, include = F}
# The DAG
bb5_ex = dagify(
  B ~ A,
  B ~ C,
  coords = tibble(
    name = LETTERS[1:3],
    x = -1:1,
    y = c(-0.7, 0, -0.7)
  )
)
# Convert to data.table
bb5_dt = bb5_ex %>% fortify() %>% setDT()
# Shorten segments
mult = 0.2
bb5_dt[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
# The curve dataset
collider_curve = tibble(
  x = c(-1, 0, 1),
  y = c(-0.7, 0.3, -0.7)
) %>% spline(n = 101) %>% as.data.table()
```

```{r, bb5-plot-1, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_collider = ggplot(
  data = bb5_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
# Plot it
gg_collider
```

 An .def.purple[immorality] occurs when two nodes share a child without being otherwise connected..super.pink[†] $\text{A} \rightarrow \text{B} \leftarrow \text{C}$

.footnote[.pink[†] I'm not making this up.]

--

The child (here: $\text{B}$) at the center of this immorality is called a .def.purple[collider].

--

.note[Notice:] An immorality is a fork with reversed directions of the edges.

---
.note[Building block 5:] .b.slate[Immoralities]

```{r, bb5-plot-2, echo = F, fig.height = 2.5, fig.width = 5}
gg_collider
```

.qa[Q] Are $\text{A}$ and $\text{C}$ independent?

---
count: false
.note[Building block 5:] .b.slate[Immoralities]

```{r, bb5-plot-3, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_collider_blocked = ggplot(
  data = bb5_dt,
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve[1:floor(.N/2)],
  color = "grey80",
  linetype = "dashed",
  size = 0.8
) +
geom_point(
  aes(x = x, y = y),
  data = collider_curve[median(1:.N)],
  shape = 15,
  size = 6,
  color = "grey80"
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
# Plot it
gg_collider_blocked
```

.qa[Q] Are $\text{A}$ and $\text{C}$ independent?
<br>
.qa[A] Yes. $\text{A} \ci \text{C}$.

--

.b.purple[Intuition:] Causal effects flow from $\text{A}$ and $\text{C}$ into $\text{B}$ and stop there.

- Neither $\text{A}$ nor $\text{C}$ is a descendant of the other.
- $\text{A}$ and $\text{C}$ do not share any common causes.

---
.note[Building block 5:] .b.slate[Immoralities]

```{r, bb5-plot-4, echo = F, fig.height = 2.5, fig.width = 5}
gg_collider_blocked
```

.b.pink[Proof:] Start with .it[marginalizing] dist. of $\text{A}$ and $\text{C}$. Then BNF.

$P(\text{A},\text{C}) = \sum_{\text{B}} P(\text{A},\text{B}, \text{C})$

--

$\color{#FFFFFF}{P(\text{A},\text{C})} = \sum_{\text{B}} P(\text{A})P(\text{C})P(\text{B}|\text{A},\text{C})$

--

$\color{#FFFFFF}{P(\text{A},\text{C})} = P(\text{A})P(\text{C}) \color{#FFA500}{\left(\sum_{\text{B}} P(\text{B}|\text{A},\text{C}) = 1\right)}$

--

$\color{#FFFFFF}{P(\text{A},\text{C})} = P(\text{A})P(\text{C})$ $\quad\color{#e64173}{\checkmark}$ .pink[(] $\color{#e64173}{\text{A} \ci \text{C}}$ .pink[without conditioning)] 
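
---
.note[Building block 5:] .b.slate[Immoralities]

.note[Numerical check:] A minimal sketch of the marginalization argument, assuming a made-up binary collider $\text{A} \rightarrow \text{B} \leftarrow \text{C}$ (all probabilities and object names are illustrative). It builds the joint from the BN factorization, sums over $\text{B}$, and compares the result with $P(\text{A})P(\text{C})$.

```r
# Hypothetical binary collider A -> B <- C (all numbers are illustrative)
p_a <- c(0.3, 0.7)                  # P(A = 0), P(A = 1)
p_c <- c(0.6, 0.4)                  # P(C = 0), P(C = 1)
# P(B = 1 | A, C): rows index A, columns index C
p_b1_ac <- matrix(c(0.1, 0.5,
                    0.5, 0.9), nrow = 2, byrow = TRUE)

# Joint from the BN factorization P(A) P(C) P(B | A, C); dimensions: A, B, C
joint <- array(0, dim = c(2, 2, 2))
for (a in 1:2) for (k in 1:2) {
  joint[a, 1, k] <- p_a[a] * p_c[k] * (1 - p_b1_ac[a, k])  # B = 0
  joint[a, 2, k] <- p_a[a] * p_c[k] * p_b1_ac[a, k]        # B = 1
}

# Marginalize over B and compare P(A, C) with P(A) P(C)
p_ac <- apply(joint, c(1, 3), sum)
gap_marginal <- max(abs(p_ac - outer(p_a, p_c)))
gap_marginal  # zero up to floating-point error
```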

---
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-5, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
ggplot(
  data = bb5_dt,
  aes(x = x, y = y)
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "B", fill = name == "B"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, purple, "white"),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
```

.qa[Q] What happens when we condition on $\text{B}$?

---
count: false
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-6, echo = F, fig.height = 2.5, fig.width = 5}
# Plot the DAG
gg_collider_unblocked = ggplot(
  data = bb5_dt,
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "B", fill = name == "B"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, purple, "white"),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
# Print figure
gg_collider_unblocked
```

.qa[Q] What happens when we condition on $\text{B}$?
<br>
.qa[A] We .def.orange[unblock] (or .def.orange[open]) the previously blocked (closed) path.

While $\text{A}$ and $\text{C}$ are (unconditionally) independent, they are .orange[conditionally dependent] given $\text{B}$.

--

.attn[Important:] When you condition on a collider, you open up the path.

---
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-7, echo = F, fig.height = 2.5, fig.width = 5}
gg_collider_unblocked
```

.b.purple[Intuition:] $\text{B}$ is a combination of $\text{A}$ and $\text{C}$. 

Conditioning on a value of $\text{B}$ jointly constrains $\text{A}$ and $\text{C}$—they can no longer move independently.

--

.ex[Example:] Let $\text{A}$ take on $\{0,1\}$ and $\text{C}$ take on $\{0,1\}$ (independently), and let $\text{B} = \text{A} + \text{C}$.

Conditional on $\text{B}=1$, $\text{A}$ and $\text{C}$ are perfectly negatively correlated.
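
---
.note[Building block 5:] .b.slate[Immoralities with conditions]

.note[Simulation sketch:] The binary example can be checked quickly, taking $\text{B} = \text{A} + \text{C}$ as the combination and fair coin flips for $\text{A}$ and $\text{C}$ (both are assumptions of this sketch, as are the seed and sample size).

```r
set.seed(607)
n <- 1e4
a  <- rbinom(n, size = 1, prob = 0.5)   # A in {0, 1}
cc <- rbinom(n, size = 1, prob = 0.5)   # C in {0, 1}, drawn independently of A
b  <- a + cc                            # collider: B combines A and C

# Without conditioning: A and C are (nearly) uncorrelated in the sample
cor_all <- cor(a, cc)

# Conditioning on B = 1 forces A = 1 - C, so the correlation is -1
keep <- b == 1
cor_b1 <- cor(a[keep], cc[keep])
c(unconditional = cor_all, given_b1 = cor_b1)
```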

---
.note[Building block 5:] .b.slate[Immoralities with conditions]

```{r, bb5-plot-8, echo = F, fig.height = 2.5, fig.width = 5}
ggplot(
  data = bb5_dt %>% dplyr::mutate(
    name = fcase(
      name == "A", "Y",
      name == "B", "X",
      name == "C", "D",
      default = NA
    ),
    to = fcase(
      to == "A", "Y",
      to == "B", "X",
      to == "C", "D",
      default = NA
    )
  ),
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "X", fill = name == "X"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, purple, "white"),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
```

In *MHE* vocabulary: The collider $\text{X}$ is a *bad control*. 

$\text{X}$ is affected by both your treatment $\text{D}$ and outcome $\text{Y}$.

--

.note[The result:] A spurious relationship between $\text{Y}$ and $\text{D}$.
<br>Remember: they're actually (unconditionally) independent.

--

This spurious relationship is often called .def.purple[collider bias].

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

--

.qa[Q] How does this example relate to collider bias?
--
<br>
.qa[A] Write out the DAG (+ think about selection into your sample)!

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

Define $\text{M}$ as .it[mobility],
--
 $\text{R}$ as .it[respiratory health],
--
 and $\text{H}$ as .it[hospitalized].

--

Suppose for the moment respiratory health and mobility 
1. are .b[independent of each other] 
2. each .b[cause hospitalization] (when they are too low)

--

```{r, ex-collider-bias-1, echo = F, fig.height = 2.5, fig.width = 5}
ggplot(
  data = bb5_dt %>% dplyr::mutate(
    name = fcase(
      name == "A", "M",
      name == "B", "H",
      name == "C", "R",
      default = NA
    ),
    to = fcase(
      to == "A", "M",
      to == "B", "H",
      to == "C", "R",
      default = NA
    )
  ),
  aes(x = x, y = y)
) +
# geom_line(
#   data = collider_curve[1:median(1:.N),],
#   color = "grey80",
#   linetype = "dashed",
#   size = 0.8
# ) +
# geom_point(
#   aes(x = x, y = y),
#   data = collider_curve[median(1:.N)],
#   shape = 15,
#   size = 6,
#   color = "grey80"
# ) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  shape = 21,
  stroke = 0.6,
  fill = "white",
  color = purple
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
```

The implied DAG.

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

Define $\text{M}$ as .it[mobility], $\text{R}$ as .it[respiratory health], and $\text{H}$ as .it[hospitalized].

Suppose for the moment respiratory health and mobility 
1. are .b[independent of each other] 
2. each .b[cause hospitalization] (when they are too low)

```{r, ex-collider-bias-1b, echo = F, fig.height = 2.5, fig.width = 5}
ggplot(
  data = bb5_dt %>% dplyr::mutate(
    name = fcase(
      name == "A", "M",
      name == "B", "H",
      name == "C", "R",
      default = NA
    ),
    to = fcase(
      to == "A", "M",
      to == "B", "H",
      to == "C", "R",
      default = NA
    )
  ),
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve[1:median(1:.N),],
  color = "grey80",
  linetype = "dashed",
  size = 0.8
) +
geom_point(
  aes(x = x, y = y),
  data = collider_curve[median(1:.N)],
  shape = 15,
  size = 6,
  color = "grey80"
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  size = 20,
  shape = 21,
  stroke = 0.6,
  fill = "white",
  color = purple
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
)
```

If we .b.pink[do not condition on hospitalization], $\text{M}\rightarrow \text{H} \leftarrow \text{R}$ is .b.grey[blocked].

---

.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

Define $\text{M}$ as .it[mobility], $\text{R}$ as .it[respiratory health], and $\text{H}$ as .it[hospitalized].

Suppose for the moment respiratory health and mobility 
1. are .b[independent of each other] 
2. each .b[cause hospitalization] (when they are too low)

```{r, ex-collider-bias-2, echo = F, fig.height = 2.5, fig.width = 5}
ggplot(
  data = bb5_dt %>% dplyr::mutate(
    name = fcase(
      name == "A", "M",
      name == "B", "H",
      name == "C", "R",
      default = NA
    ),
    to = fcase(
      to == "A", "M",
      to == "B", "H",
      to == "C", "R",
      default = NA
    )
  ),
  aes(x = x, y = y)
) +
geom_line(
  data = collider_curve,
  color = orange,
  linetype = "dashed",
  size = 0.8
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.06, "npc")),
  color = purple,
  size = 0.9,
  lineend = "round"
) +
geom_point(
  aes(color = name == "H", fill = name == "H"),
  size = 20,
  shape = 21,
  stroke = 0.6
) +
geom_text(
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = c(purple, purple, "white"),
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(-1.5, 1.5),
  ylim = c(-1, 0.5)
) +
scale_color_manual(values = c(purple, "grey80")) +
scale_fill_manual(values = c("white", "grey80"))
```

Our data .b.grey[conditions on hospitalization], which .b.orange[opens] $\text{M}\rightarrow \text{H} \leftarrow \text{R}$.

---
class: clear, middle

You can also see this example graphically...

---
.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

```{r, ex-collider-gen, include = F}
# Set seed and sample size
set.seed(12345)
n = 100
# Generate M and R independently
cb_dt = data.table(
  M = runif(n),
  R = runif(n)
)
# Determine hospitalization
cb_dt[, `:=`(
  H = M + R < 1
)]
# Population relationship
p1 = ggplot(
  data = cb_dt,
  aes(x = M, y = R)
) +
geom_point(
  color = slate,
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "bottom",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Adding hospitalization
p2 = ggplot(
  data = cb_dt,
  aes(x = M, y = R)
) +
geom_point(
  aes(color = H),
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
scale_color_manual(
  "Hospitalized (H)",
  values = c(purple, orange),
  labels = c("No", "Yes")
) +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "bottom",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Conditioning on hospitalization
p3 = ggplot(
  data = cb_dt[H==T],
  aes(x = M, y = R)
) +
geom_point(
  color = orange,
  size = 2.5,
  alpha = 0.8
) +
scale_y_continuous("Respiratory health (R)") +
scale_x_continuous("Mobility (M)") +
theme_minimal(
  base_family = "Fira Sans Book",
  base_size = 14
) +
theme(
  panel.grid = element_blank(),
  legend.position = "none",
  legend.margin=margin(t=0, r=0, b=-0.3, l=0, unit="cm")
) +
coord_cartesian(ylim = c(0,1), xlim = c(0,1))
# Align the plots
aligned = align_patches(p1, p2, p3)
```

Let $\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.

--

```{r, ex-collider-plot1, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[1]]
```

--

.note[Without conditioning:] No relationship between mobility and resp. health.

---
.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

$\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.

```{r, ex-collider-plot2, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[2]]
```

.note[Recall:] Our sample excludes non-hospitalized individuals.

---
.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

$\color{#314f4f}{M\sim \text{Uniform}(0,1)}$; $\color{#e64173}{R\sim \text{Uniform}(0,1)}$; $\color{#FFA500}{H=\mathbb{I}\{M+R<1\}}$.

```{r, ex-collider-plot3, echo = F, fig.height = 4.5, fig.width = 7.5}
aligned[[3]]
```

.note[Conditioning on] $\text{H}$.note[:] Mobility and respiratory health are associated.
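
---
.ex[Example] Data from hospitalized patients: Mobility and respiratory health.

.note[In numbers:] A short sketch that reuses the data-generating process above but with a larger sample than the figures (my choice, to reduce noise) and compares the correlation of $M$ and $R$ with and without conditioning on $H$.

```r
set.seed(12345)
n <- 1e5                       # larger than the n = 100 used for the figures
m <- runif(n)                  # mobility, Uniform(0, 1)
r <- runif(n)                  # respiratory health, Uniform(0, 1)
h <- (m + r) < 1               # hospitalized when M + R < 1

cor_pop  <- cor(m, r)          # full population: M and R are independent
cor_hosp <- cor(m[h], r[h])    # hospitalized only: conditioning on the collider
c(population = cor_pop, hospitalized = cor_hosp)
```

Among the hospitalized, $(M, R)$ is uniform on the triangle $M+R<1$, where the population correlation is $-1/2$.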


---
class: clear, middle

I like this example because it reminds us that .b.it[conditioning] occurs both .b[explicitly] (*e.g.*, "controlling for") and .b[implicitly] (*e.g.*, sample inclusion).

This example of collider bias in hospitalization data comes from David L. Sackett's 1979 paper .it[[Bias in Analytic Research](https://www.jameslindlibrary.org/wp-data/uploads/2014/06/Sackett-1979-whole-article.pdf)].

Sackett called it .def.slate[admission rate bias].

.note[More generally:] You'll hear this called .def.slate[selection bias] or .def.slate[Berkson's paradox].

---
layout: false
# DAGs
## Blocked paths

Let's formally define a blocked path (blocking is important).

--

A path between $\text{X}$ and $\text{Y}$ is .def.purple[blocked] by conditioning on a set of variables $\text{Z}$ (possibly empty) if either of the following statements is true:

1. On the path, there is a .b.pink[chain] $\left(\dots \rightarrow \text{W} \rightarrow \dots\right)$ or a .b.pink[fork] $\left(\dots \leftarrow \text{W} \rightarrow \dots\right)$, and we condition on $\text{W}$ $\left(\text{W}\in \text{Z}\right)$.

1. On the path, there is a .b.orange[collider] $\left(\dots \rightarrow \text{W} \leftarrow \dots\right)$, and we .it[do not] condition on $\text{W}$ $\left(\text{W}\not\in \text{Z}\right)$ or on any of its .b.orange[descendants] $\left(\text{de}(\text{W})\cap \text{Z} = \varnothing\right)$.

--

Association flows along unblocked paths.

---
name: d-sep
# DAGs
## d-separation and d-connected(-ness)

Finally, we'll define whether nodes are .note[separated] or .note[connected] in DAGs.

--

.b.purple[Separation:] Nodes $\text{X}$ and $\text{Y}$ are .def.purple[d-separated] by a set of nodes $\text{Z}$ if .purple[all paths] between $\text{X}$ and $\text{Y}$ .purple[are blocked] by $\text{Z}$.

--

Notation for d-separation: $\color{#6A5ACD}{\text{X} \ci_{G} \text{Y} \vert \text{Z}}$

--

.b.pink[Connection:] If there is at least .pink[one path] between $\text{X}$ and $\text{Y}$ that is .pink[unblocked], then $\text{X}$ and $\text{Y}$ are .def.pink[d-connected].
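
---
# DAGs
## d-separation in R

A small sketch of these definitions using the `dagitty` package (already loaded in the setup chunk). The graphs below are toy examples of my own, and the node $\text{E}$ is a hypothetical child of the collider.

```r
library(dagitty)

# Collider A -> B <- C, plus a hypothetical child E of the collider
g <- dagitty("dag {
  A -> B
  C -> B
  B -> E
}")

dseparated(g, "A", "C", c())    # no conditioning: the collider blocks the path
dseparated(g, "A", "C", "B")    # conditioning on the collider opens the path
dseparated(g, "A", "C", "E")    # so does conditioning on its descendant

# Chain X -> W -> Y: conditioning on the middle node blocks the path
chain <- dagitty("dag {
  X -> W
  W -> Y
}")
dconnected(chain, "X", "Y", c())
dseparated(chain, "X", "Y", "W")
```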

---
# DAGs
## d-separation and causality

.purple[d-separation] tells us that two nodes are .purple[not associated].

--

To measure the .pink[causal effect] of $\text{X}$ on $\text{Y}$:
<br>We must eliminate .pink[non-causal association].

--

Putting these ideas together, here is our .def.orange[criterion to isolate causal effects]:
> If we remove all edges flowing .b[out] of $\text{X}$ (its .pink[causal effects]),
<br>then $\text{X}$ and $\text{Y}$ should be .purple[d-separated].

--

This criterion ensures that we've closed the .def.slate[backdoor paths] that generate non-causal associations between $\text{X}$ and $\text{Y}$.
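
---
# DAGs
## d-separation and causality

One way to try the criterion, sketched with `dagitty` on a toy DAG of my own (a confounder $\text{W}$ of $\text{X}$ and $\text{Y}$). The edge leaving $\text{X}$ is removed by hand by writing the graph without it.

```r
library(dagitty)

# Toy DAG: W causes both X and Y, and X causes Y
g <- dagitty("dag {
  W -> X
  W -> Y
  X -> Y
}")

# The same graph with the edge flowing out of X removed
g_no_out <- dagitty("dag {
  W -> X
  W -> Y
}")

dseparated(g_no_out, "X", "Y", c())   # backdoor path X <- W -> Y is open
dseparated(g_no_out, "X", "Y", "W")   # conditioning on W closes it

# dagitty can also search for valid adjustment sets directly
sets <- adjustmentSets(g, exposure = "X", outcome = "Y")
sets
```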

---
layout: false
class: inverse, middle
name: ex
# Examples

---
layout: true
class: clear

---
.ex[Example 1:] .b[OVB]

<br>

.to-middle[
```{r, ex-1, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = dag_dt,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = dag_dt[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(dag_dt[,min(x)]*0.95, dag_dt[,max(x)]*1.05),
  ylim = c(dag_dt[,min(y)]*0.8, dag_dt[,max(y)]*1.1)
)
```
]

.qa[Q] OVB using DAG fundamentals: When can we isolate causal effects?

---
.ex[Example 2:] .b[Mediation]

Here $\text{M}$ is a .def.purple[mediator]: it .def.purple[mediates] the effect of $\text{D}$ on $\text{Y}$.

```{r, ex-2-setup, include = F}
# The full DAG
ex2 = dagify(
  Y ~ M,
  M ~ D,
  Y ~ W,
  D ~ W,
  coords = tibble(
    name = c("Y", "D", "W", "M"),
    x = c(1, 3, 2, 2),
    y = c(2, 2, 1, 2)
  )
)
# Convert to data.table
ex2 %<>% fortify() %T>% setDT()
# Shorten segments
mult = 0.15
ex2[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, ex-2-fig, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = ex2,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(ex2[,min(x)]*0.95, ex2[,max(x)]*1.05),
  ylim = c(ex2[,min(y)]*0.8, ex2[,max(y)]*1.1)
)
```

.qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
<br>
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and $\text{M}$?
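
---
.ex[Example 2:] .b[Mediation] .grey-light[(simulation sketch)]

A rough way to explore both questions, assuming a linear, Gaussian version of this DAG. The coefficients (and hence the total effect of $2 \times 1.5 = 3$), seed, and sample size are my own choices.

```r
set.seed(607)
n <- 1e5
w <- rnorm(n)
d <- w + rnorm(n)              # W -> D
m <- 2 * d + rnorm(n)          # D -> M
y <- 1.5 * m + w + rnorm(n)    # M -> Y and W -> Y; total effect of D is 3

# No controls: the backdoor path D <- W -> Y is open
b_none <- coef(lm(y ~ d))[["d"]]
# Conditioning on W closes the backdoor path and leaves D -> M -> Y open
b_w <- coef(lm(y ~ d + w))[["d"]]
# Conditioning on W and M also blocks the causal path through the mediator
b_wm <- coef(lm(y ~ d + w + m))[["d"]]
c(none = b_none, w = b_w, w_and_m = b_wm)
```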

---
.ex[Example 3:] .b[Partial mediation]

<br>

```{r, ex-3-fig, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = ex2,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  data = . %>% .[,.(
    name == "D", to == "Y",
    xa = sum(xa * (name == "D"), na.rm = T),
    ya = sum(ya * (name == "D"), na.rm = T) + 0.15,
    xb = sum(xb * (name == "M"), na.rm = T),
    yb = sum(yb * (name == "M"), na.rm = T) + 0.15
  )],
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0.3,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_curve(
  data = . %>% .[!(name == "D" & to == "Y")],
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = c(ex2[,min(x)]*0.95, ex2[,max(x)]*1.05),
  ylim = c(ex2[,min(y)]*0.8, ex2[,max(y)]*1.3)
)
```

.qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
<br>
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and $\text{M}$?

---
.ex[Example 4:] .b[Non-mediator descendants] 

<br>

```{r, ex-4-setup, include = F}
# The full DAG
ex4 = dagify(
  Y ~ D,
  Y ~ W,
  Z ~ Y,
  D ~ W,
  Z ~ D,
  coords = tibble(
    name = c("Y", "D", "W", "Z"),
    x = c(-1,  1,  0,  0),
    y = c( 0,  0, -1,  1)
  )
)
# Convert to data.table
ex4 %<>% fortify() %T>% setDT()
# Shorten segments
mult = 0.15
ex4[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, ex-4-fig, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = ex4,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none",
) +
coord_cartesian(
  xlim = ex4[,range(x)] + ex4[,range(x) %>% diff()] * c(-0.08, 0.08),
  ylim = ex4[,range(y)] + ex4[,range(y) %>% diff()] * c(-0.08, 0.08)
)
```

.qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
<br>
.qa[Q.sub[2]] What happens if we condition on $\text{W}$ and/or $\text{Z}$?

---
.ex[Example 5:] .b[M-Bias]

Notice that $\text{C}$ here is .it[not] a result of treatment (could be "pre-treatment").

```{r, ex-5-setup, include = F}
# The full DAG
ex5 = dagify(
  Y ~ D,
  Y ~ A,
  C ~ A,
  C ~ B,
  D ~ B,
  coords = tibble(
    name = c("Y", "D", "A", "B", "C"),
    x = c(0, 2, 0, 2, 1),
    y = c(0, 0, 2, 2, 1)
  )
)
# Convert to data.table
ex5 %<>% fortify() %T>% setDT()
# Shorten segments
mult = 0.2
ex5[, `:=`(
  xa = x + (xend-x) * (mult),
  ya = y + (yend-y) * (mult),
  xb = x + (xend-x) * (1-mult),
  yb = y + (yend-y) * (1-mult)
)]
```

```{r, ex-5-fig, echo = F, fig.height = 3, fig.width = 6}
# Plot the full DAG
ggplot(
  data = ex5,
  aes(x = x, y = y, xend = xend, yend = yend)
) +
geom_point(
  size = 20,
  fill = "white",
  color = purple,
  shape = 21,
  stroke = 0.6
) +
geom_curve(
  aes(x = xa, y = ya, xend = xb, yend = yb),
  curvature = 0,
  arrow = arrow(length = unit(0.07, "npc")),
  color = purple,
  size = 1.2,
  lineend = "round"
) +
geom_text(
  data = . %>% .[,.(name,x,y,xend=x,yend=y)] %>% unique(),
  aes(x = x, y = y, label = name),
  family = "Fira Sans Medium",
  size = 8,
  color = purple,
  fontface = "bold"
) +
theme_void() +
theme(
  legend.position = "none"
) +
coord_cartesian(
  xlim = ex5[,range(x)] + ex5[,range(x) %>% diff()] * c(-0.08, 0.08),
  ylim = ex5[,range(y)] + ex5[,range(y) %>% diff()] * c(-0.08, 0.08)
)
```

.qa[Q.sub[1]] What do we need to condition on to get the effect of $\text{D}$ on $\text{Y}$?
<br>
.qa[Q.sub[2]] What happens if we condition on $\text{C}$?
<br>
.qa[Q.sub[3]] What happens if we condition on $\text{C}$ along with $\text{A}$ and/or $\text{B}$?
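
---
.ex[Example 5:] Checking conditioning sets

The same questions can be put to `dagitty`. This is a sketch: the object names `mbias`, `candidates`, and `valid` are mine, and the graph string simply restates the arrows drawn in Example 5.

```r
library(dagitty)
# The M-bias DAG from Example 5
mbias = dagitty("dag { D -> Y; A -> Y; A -> C; B -> C; B -> D }")
# Minimal adjustment set(s) for the effect of D on Y
adjustmentSets(mbias, exposure = "D", outcome = "Y")
# Candidate conditioning sets that include the collider C
candidates = list("C", c("B", "C"), c("A", "C"), c("A", "B", "C"))
# TRUE when a set is a valid adjustment set for the effect of D on Y
valid = sapply(candidates, function(s) {
  isAdjustmentSet(mbias, s, exposure = "D", outcome = "Y")
})
names(valid) = sapply(candidates, paste, collapse = ", ")
valid
```

The only backdoor path, $\text{D} \leftarrow \text{B} \rightarrow \text{C} \leftarrow \text{A} \rightarrow \text{Y}$, is blocked at the collider $\text{C}$ until we condition on $\text{C}$. Adding $\text{A}$ and/or $\text{B}$ blocks it again.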

---
class: middle

.b[One more note:] 

DAGs are often drawn without "noise variables" (disturbances).

But they still exist—they're just "outside of the model."

---
layout: false
name: limits
# DAGs
## Limitations

So what can't DAGs do (well)?

--

- .def[Simultaneity:] We defined causality as unidirectional and prohibited cycles, so a DAG cannot directly show two variables causing each other.

--

- .def[Dynamics:] You can sort of allow a variable to affect itself by indexing it by time... $\text{Y}_{t=1}\rightarrow\text{Y}_{t=2}$.

--

- .def[Uncertainty:] DAGs are most useful when you can correctly draw them.

--

- .def[Make friends:] There's *a lot* of (angry/uncharitable) fighting about DAGs: 
$$
\begin{align}
  \text{Philosophy}\rightarrow \text{DAGs/Epidemiology} \leftarrow \text{Economics}
\end{align}
$$
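
---
# DAGs
## Limitations

The first two limitations are related: indexing variables by time can turn a feedback loop into an acyclic graph. A minimal base-R sketch, where `is_acyclic()` and the price/quantity nodes (`P`, `Q`) are hypothetical illustrations rather than anything from these notes:

```r
# TRUE if the directed graph in `edges` (character matrix with columns
# "from" and "to") contains no directed cycle
is_acyclic = function(edges) {
  while (nrow(edges) > 0) {
    # Nodes with outgoing arrows but no incoming arrows
    sources = setdiff(edges[, "from"], edges[, "to"])
    # Arrows remain but nothing can go first: there must be a cycle
    if (length(sources) == 0) return(FALSE)
    # Drop the arrows leaving those source nodes and repeat
    edges = edges[!(edges[, "from"] %in% sources), , drop = FALSE]
  }
  TRUE
}
cols = list(NULL, c("from", "to"))
# Simultaneity: P -> Q and Q -> P
simultaneous = matrix(
  c("P", "Q",  "Q", "P"),
  ncol = 2, byrow = TRUE, dimnames = cols
)
# Time-indexed version: P1 -> Q1 -> P2 -> Q2
unrolled = matrix(
  c("P1", "Q1",  "Q1", "P2",  "P2", "Q2"),
  ncol = 2, byrow = TRUE, dimnames = cols
)
c(simultaneous = is_acyclic(simultaneous), unrolled = is_acyclic(unrolled))
```

The time-indexed graph is a valid DAG, but only because we assumed a specific ordering within each period.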

---
class: clear, middle

.b.slate[Some of Judea Pearl's thoughts] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> So, what is it about epidemiologists that drives them to seek the light of new tools, while economists (at least those in Imbens’s camp) seek comfort in partial blindness, while missing out on the causal revolution? Can economists do in their heads what epidemiologists observe in their graphs? Can they, for instance, identify the testable implications of their own assumptions? Can they decide whether the IV assumptions (i.e., exogeneity and exclusion) are satisfied in their own models of reality? Of course they can’t; such decisions are intractable to the graph-less mind. (I have challenged them repeatedly to these tasks, to the sound of a pin-drop silence)

---
class: clear, middle

.b.slate[Pearl, continued] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> Or, are problems in economics different from those in epidemiology? I have examined the structure of typical problems in the two fields, the number of variables involved, the types of data available, and the nature of the research questions. The problems are strikingly similar.

> I have only one explanation for the difference: Culture.

> The arrow-phobic culture started twenty years ago, when Imbens and Rubin (1995) decided that graphs “can easily lull the researcher into a false sense of confidence in the resulting causal conclusions,” and Paul Rosenbaum (1995) echoed with “No basis is given for believing” […] “that a certain mathematical operation, namely this wiping out of equations and fixing of variables, predicts a certain physical reality”

---
class: clear, middle

.b.slate[Guido Imbens's response] ([source](http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/))

> ... Judea and others using graphical models have developed a very interesting set of tools that researchers in many areas have found useful for their research. Other researchers, including myself, have found the potential outcome framework for causality associated with the work by Rubin... more useful for their work. In my view that difference of opinion does not reflect “economists being scared of graphs”, or “educational deficiencies” as Judea claims, merely legitimate heterogeneity in views arising from differences in preferences and problems. The “educational deficiencies” claim, and similarly the comment about my “vow” to avoid causal graphs is particularly ironic given that in the past Judea has presented, at my request, his work on causal graphs to participants in a graduate seminar I taught at Harvard University.

---
class: clear, middle, center

.enormous[🤷]
<br>
.note[Suggestion:] Be nice to people and be intellectually honest.

---
name: sources
layout: false

# Sources

## Thanks

These notes rely heavily upon Brady Neal's [*Introduction to Causal Inference*](https://bradyneal.com/causal-inference-course). 

I also borrow from Scott Cunningham's [*Causal Inference: The Mixtape*](https://www.scunning.com/mixtape.html).

I found the [Sackett (1978)](https://catalogofbias.org/biases/collider-bias/) example on the ["Catalog of Bias"](https://catalogofbias.org/biases/collider-bias/) website.

---
exclude: false
# Table of contents

.col-left[
.small[
#### Admin
- [Today and upcoming](#schedule)

#### Other
- [Sources](#sources)

#### DAGs
- [What's a DAG?](#different)
- [Example](#dag-ex)
- [Graphs](#graphs)
  - [Definition/undirected](#graphs)
  - [Directed](#graphs-directed)
  - [Cycles](#graphs-cycles)

]
]
.col-right[
.small[
#### DAGs continued
- [Origins](#dag-origins)
  - [Local Markov](#local-markov)
  - [Bayesian net. factorization](#dags-factor)
  - [Dependence](#dags-dependence)
  - [Causality](#dags-causlity)
- [DAG building blocks](#building-blocks)
  - [Chains](#blocks-chains)
  - [Forks](#blocks-forks)
  - [Immoralities/colliders](#blocks-colliders)
- [d-separation](#d-sep)
- [Examples](#ex)
- [Limitations](#limits)
]
]

---
exclude: true

```{r, generate pdfs, include = F, eval = F}
pagedown::chrome_print("12-ml.html", output = "12-ml.pdf")
pagedown::chrome_print("12-ml.html", output = "12-ml-nopause.pdf")
```