class: center, inverse, middle

<style type="text/css">
.pull-left { float: left; width: 44%; }
.pull-right { float: right; width: 44%; }
.pull-right ~ p { clear: both; }
.pull-left-wide { float: left; width: 66%; }
.pull-right-wide { float: right; width: 66%; }
.pull-right-wide ~ p { clear: both; }
.pull-left-narrow { float: left; width: 30%; }
.pull-right-narrow { float: right; width: 30%; }
.tiny123 { font-size: 0.40em; }
.small123 { font-size: 0.80em; }
.large123 { font-size: 2em; }
.red { color: red }
.orange { color: orange }
.green { color: green }
</style>

# Statistics
## Testing relationships using quantitative data
### (Chapter 15)
### Christian Vedel,<br>Department of Economics<br>University of Southern Denmark
### Email: [christian-vs@sam.sdu.dk](mailto:christian-vs@sam.sdu.dk)
### Updated 2026-04-27

---
class: middle
# Today's lecture

.pull-left-wide[
**Extending hypothesis testing to comparisons between groups and across time**

- **Section 1:** Testing the difference between two mean values
- **Section 2:** Testing the difference between multiple mean values (ANOVA)
- **Section 3:** Testing the effect of a treatment
- **Section 4:** Testing the ratio between two variances
]

.pull-right-narrow[

]

---
class: inverse, middle, center
# Testing the difference between two mean values

---
# Motivation

.pull-left-wide[
We want to test whether the means (or variances) of two or more distributions differ:

- do women earn the same wage, on average, as men?
- is the health of people treated with a drug better than that of those untreated?
]

---
# Setup

.pull-left-wide[
Let `\(Y\)` indicate group membership: `\(Y=1\)` for group 1, `\(Y=2\)` for group 2. Let `\(X\)` be the characteristic of interest (e.g. wages).
We want to compare:

- `\(\mu_1\)` from `\(f(x \mid y=1)\)`
- `\(\mu_2\)` from `\(f(x \mid y=2)\)`

Suppose we have two **independent** simple random samples:

- `\(n_1\)` elements from group 1: `\((X_{1,1},\ldots,X_{n_1,1})\)`
- `\(n_2\)` elements from group 2: `\((X_{1,2},\ldots,X_{n_2,2})\)`
]

---
# The general case

.pull-left-wide[
Two-sided hypothesis test:

`$$\begin{align*} H_0 & : \mu_1 = \mu_2 \\ H_1 & : \mu_1 \not= \mu_2 \end{align*}$$`

Hypothesis measure `\(h(\mu_1,\mu_2) = \mu_1 - \mu_2 = 0\)` under `\(H_0\)`. Replace the unknown means with sample averages:

`$$\begin{align*} \bar{X}_1 & = \frac{1}{n_1}\sum_{i=1}^{n_1} X_{i,1} \\ \bar{X}_2 & = \frac{1}{n_2}\sum_{i=1}^{n_2} X_{i,2} \end{align*}$$`
]

---
# The test statistic

.pull-left-wide[
Since the two samples are independent, `\(Var\!\left(\bar{X}_1 - \bar{X}_2\right) = \sigma_1^2/n_1 + \sigma_2^2/n_2\)`. If the variances are unknown, use `\(S_1^2\)` and `\(S_2^2\)`:

`$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} \overset{a}{\sim} \mathcal{N}(0,1) \text{ under } H_0$$`

Reject `\(H_0\)` if `\(Z < z_{\alpha/2}\)` or `\(Z > z_{1-\alpha/2}\)`; `\(p\)`-value: `\(p = 2\Phi(-|z|)\)`.
]

---
# Equal variances

.pull-left-wide[
If `\(\sigma_1^2 = \sigma_2^2 = \sigma^2\)`, use the **pooled variance estimator** for a more efficient estimate:

`$$S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}, \qquad Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2\!\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}$$`

If `\(X\)` is also normally distributed, `\(Z \sim t(n_1+n_2-2)\)` exactly under `\(H_0\)`.
]

.pull-right-narrow[
.small123[
**Why pooled?** Combining both samples to estimate `\(\sigma^2\)` is more efficient than using `\(S_1^2\)` or `\(S_2^2\)` alone.
]
]

---
# Normal distribution cases

.pull-left-wide[
**Known variances `\(\sigma_1^2\)`, `\(\sigma_2^2\)`** (both groups normal):

`$$Z = \frac{\bar{X}_1-\bar{X}_2}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}} \sim \mathcal{N}(0,1) \text{ exactly under } H_0$$`

**Unknown and unequal variances** (both groups normal):

`$$Z = \frac{\bar{X}_1-\bar{X}_2}{\sqrt{S_1^2/n_1+S_2^2/n_2}} \overset{a}{\sim} t(\nu) \text{ under } H_0$$`

where the degrees of freedom `\(\nu\)` follow the Welch–Satterthwaite approximation:

`$$\nu = \frac{\left(S_1^2/n_1+S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1-1}+\dfrac{(S_2^2/n_2)^2}{n_2-1}}$$`
]

---
# Bernoulli distribution

.pull-left-wide[
If `\(X\)` follows a Bernoulli distribution, the variance is a function of the mean. Under `\(H_0\)` (`\(p_1=p_2\)`), use the **pooled proportion**:

`$$\bar{p}_p = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1+n_2}, \qquad S_p^2 = \bar{p}_p(1-\bar{p}_p)$$`

Test statistic:

`$$Z = \frac{\bar{p}_1-\bar{p}_2}{\sqrt{\bar{p}_p(1-\bar{p}_p)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}} \overset{a}{\sim} \mathcal{N}(0,1) \text{ under } H_0$$`
]

---
# One-sided tests

.pull-left-wide[
The same test statistic is used; only the decision rule changes:

**`\(H_1: \mu_1 > \mu_2\)`** — reject if `\(Z\)` is too far right:

- do not reject `\(H_0\)` if `\(Z < z_{1-\alpha}\)`
- reject `\(H_0\)` if `\(Z \geq z_{1-\alpha}\)`

**`\(H_1: \mu_1 < \mu_2\)`** — reject if `\(Z\)` is too far left:

- do not reject `\(H_0\)` if `\(Z > z_\alpha\)`
- reject `\(H_0\)` if `\(Z \leq z_\alpha\)`
]

---
# .red[Raise your hand 1: Two-sample tests]
.pull-left-wide[
**Q1.** Two independent samples: `\(n_1=36\)`, `\(n_2=49\)`, unknown variances assumed equal. The correct test statistic uses:

A: The pooled variance estimator `\(S_p^2\)`

B: Separate sample variances `\(S_1^2/n_1 + S_2^2/n_2\)` in the denominator

C: The sum `\(S_1^2 + S_2^2\)` divided by `\(n_1+n_2\)`

D: A `\(\chi^2\)` statistic to test `\(S_1^2/S_2^2\)`
]

--

.pull-left-wide[
**Q2.** For a one-sided test `\(H_1: \mu_1 < \mu_2\)`, the decision rule is:

A: Reject `\(H_0\)` if `\(Z \leq z_\alpha\)`

B: Reject `\(H_0\)` if `\(Z \geq z_{1-\alpha}\)`

C: Reject `\(H_0\)` if `\(|Z| \geq z_{1-\alpha/2}\)`

D: Reject `\(H_0\)` if `\(Z \geq z_{1-\alpha/2}\)`
]

---
# .red[Practice 1: Two-sample mean test]

.pull-left-wide[
Two independent samples of weekly earnings (€):

- Group 1 (men): `\(n_1=36\)`, `\(\bar{X}_1=850\)`, `\(S_1^2=3{,}600\)`
- Group 2 (women): `\(n_2=49\)`, `\(\bar{X}_2=800\)`, `\(S_2^2=2{,}500\)`

1. Test `\(H_0: \mu_1=\mu_2\)` vs `\(H_1: \mu_1 \not= \mu_2\)` at `\(\alpha=0.05\)`.
2. Compute the `\(p\)`-value.
]

---
class: inverse, middle, center
# Testing the difference between multiple mean values (ANOVA)

---
# Setup

.pull-left-wide[
Generalise to `\(K\)` groups. Test whether all group means are equal:

`$$\begin{align*} H_0 & : \mu_1 = \mu_2 = \cdots = \mu_K \\ H_1 & : \text{at least one mean value is different} \end{align*}$$`

With weights `\(w_k = n_k/n\)` (relative group size) and `\(\sigma_1^2=\cdots=\sigma_K^2=\sigma^2\)`, define:

- Group sample means: `\(\bar{X}_k = \frac{1}{n_k}\sum_{i=1}^{n_k}X_{i,k}\)`
- Overall sample mean: `\(\bar{X} = \sum_{k=1}^K w_k \bar{X}_k\)`
]

---
# Sums of squares

.pull-left-wide[
**Sum of squared treatments** (between-group variation):

`$$SSTR = \sum_{k=1}^K n_k(\bar{X}_k - \bar{X})^2$$`

**Sum of squared errors** (within-group variation):

`$$SSE = \sum_{k=1}^K\sum_{i=1}^{n_k}(X_{i,k}-\bar{X}_k)^2 = \sum_{k=1}^K(n_k-1)S_k^2$$`
]

.pull-right-narrow[
.small123[
`\(SSTR \approx 0\)` when all means are equal.
`\(SSE\)` estimates residual variance regardless of `\(H_0\)`.
]
]

---
# Test statistic

.pull-left-wide[
`$$F = \frac{SSTR/(K-1)}{SSE/(n-K)}$$`

Under `\(H_0\)`, `\(F \sim F(K-1, n-K)\)`. Decision rule:

- do not reject `\(H_0\)` if `\(F < F_{1-\alpha}(K-1, n-K)\)`
- reject `\(H_0\)` if `\(F \geq F_{1-\alpha}(K-1, n-K)\)`

> If `\(H_0\)` is true, the numerator should be close to zero. If `\(H_0\)` is false, the numerator grows with `\(n\)`. This is **analysis of variance** (ANOVA).
]

---
# .red[Raise your hand 2: ANOVA]
.pull-left-wide[
**Q1.** An ANOVA F-test compares `\(K=3\)` groups with total `\(n=60\)` observations. Under `\(H_0\)`, the test statistic follows:

A: `\(F(2,57)\)`

B: `\(F(3,57)\)`

C: `\(F(2,60)\)`

D: `\(\chi^2(2)\)`
]

--

.pull-left-wide[
**Q2.** An F-statistic is very large. This means:

A: Between-group variation is large relative to within-group variation

B: Within-group variation is large relative to between-group variation

C: All group means are significantly different from each other

D: The sample variances in each group are unequal
]

---
# .red[Practice 2: ANOVA by hand]

.pull-left-wide[
Three sales regions, each with `\(n_k=10\)` observations:

- Region A: `\(\bar{X}_1=120\)`
- Region B: `\(\bar{X}_2=130\)`
- Region C: `\(\bar{X}_3=140\)`

Overall mean `\(\bar{X}=130\)`; `\(SSE=2{,}700\)`.

1. Compute `\(SSTR\)` and the `\(F\)` statistic.
2. Test `\(H_0: \mu_1=\mu_2=\mu_3\)` at `\(\alpha=0.05\)`. (`\(F_{0.95}(2,27) \approx 3.35\)`)
]

---
class: inverse, middle, center
# Testing the effect of a treatment

---
# Setup

.pull-left-wide[
A **treatment** is any intervention that can potentially change the distribution of `\(X\)`:

- health: medical treatments
- wages: training programmes
- school grades: additional resources or smaller class size

Two groups:

- **treated group** = receives the treatment
- **control group** = does not receive the treatment
]

---
# Case 1: Treatment and control observed once

.pull-left-wide[
With two independent simple random samples (one per group), the treatment effect is:

`$$D = \mu_T - \mu_C$$`

Test:

`$$\begin{align*} H_0 & : \mu_T - \mu_C = 0 \\ H_1 & : \mu_T - \mu_C \not= 0 \end{align*}$$`

Test statistic:

`$$Z = \frac{\bar{X}_T - \bar{X}_C}{\sqrt{S_T^2/n_T + S_C^2/n_C}} \overset{a}{\sim} \mathcal{N}(0,1) \text{ under } H_0$$`
]

---
# Case 2: Treatment group observed twice (before/after)

.pull-left-wide[
Only the treated group is observed, at time 1 (before) and time 2 (after).
The per-element change: `\(D_i = X_{T,i,2} - X_{T,i,1}\)`

Test:

`$$\begin{align*} H_0 & : \mu_{T,2} - \mu_{T,1} = 0 \\ H_1 & : \mu_{T,2} - \mu_{T,1} \not= 0 \end{align*}$$`

Note: the two observations for the same element are **not** independent — this is **panel data**.
]

--

.pull-left-wide[
Variance estimator (accounts for dependence within elements):

`$$\widehat{Var}\!\left(\bar{X}_{T,2}-\bar{X}_{T,1}\right) = \frac{1}{n_T}\cdot\frac{1}{n_T-1}\sum_{i=1}^{n_T}\left[(X_{T,i,2}-X_{T,i,1})-(\bar{X}_{T,2}-\bar{X}_{T,1})\right]^2$$`

Test statistic: `\(Z = (\bar{X}_{T,2}-\bar{X}_{T,1})/\sqrt{\widehat{Var}(\bar{X}_{T,2}-\bar{X}_{T,1})} \overset{a}{\sim} \mathcal{N}(0,1)\)` under `\(H_0\)`
]

---
# Case 3: Both groups observed twice (DiD)

.pull-left-wide[
The issue with Cases 1 and 2: we cannot be sure the observed differences are caused by the treatment. If both groups are observed at time 1 and time 2, other factors affecting both groups cancel out:

`$$D = (\mu_{T,2}-\mu_{T,1}) - (\mu_{C,2}-\mu_{C,1})$$`

This is the **difference-in-differences** (DiD) estimate. Test:

`$$\begin{align*} H_0 & : (\mu_{T,2}-\mu_{T,1}) - (\mu_{C,2}-\mu_{C,1}) = 0 \\ H_1 & : (\mu_{T,2}-\mu_{T,1}) - (\mu_{C,2}-\mu_{C,1}) \not= 0 \end{align*}$$`
]

--

.pull-left-wide[
Variance estimator and test statistic:

.small123[
`$$\widehat{Var} = \frac{1}{n_T}\cdot\frac{1}{n_T-1}\sum_{i=1}^{n_T}\left[(X_{T,i,2}-X_{T,i,1})-(\bar{X}_{T,2}-\bar{X}_{T,1})\right]^2 + \frac{1}{n_C}\cdot\frac{1}{n_C-1}\sum_{i=1}^{n_C}\left[(X_{C,i,2}-X_{C,i,1})-(\bar{X}_{C,2}-\bar{X}_{C,1})\right]^2$$`
]

`$$Z = \frac{(\bar{X}_{T,2}-\bar{X}_{T,1})-(\bar{X}_{C,2}-\bar{X}_{C,1})}{\sqrt{\widehat{Var}}} \overset{a}{\sim} \mathcal{N}(0,1)$$`
]

---
# .red[Raise your hand 3: Difference-in-differences]
.pull-left-wide[
**Q1.** The DiD estimator for the effect of a treatment is:

A: `\((\bar{X}_{T,2}-\bar{X}_{T,1}) - (\bar{X}_{C,2}-\bar{X}_{C,1})\)`

B: `\(\bar{X}_{T,2} - \bar{X}_{C,2}\)`

C: `\(\bar{X}_{T,2} - \bar{X}_{T,1}\)`

D: `\((\bar{X}_{T,2}-\bar{X}_{C,2}) + (\bar{X}_{T,1}-\bar{X}_{C,1})\)`
]

--

.pull-left-wide[
**Q2.** The main advantage of DiD (Case 3) over the before/after comparison (Case 2) is:

A: It controls for common time trends affecting both groups

B: It eliminates the need for a control group

C: It provides a valid causal estimate without any additional assumptions

D: It doubles the effective sample size
]

---
# .red[Practice 3: DiD calculation]

.pull-left-wide[
A government subsidy is given to firms in region A (treated), not in region B (control). Average monthly profits (DKK 1,000s):

| | Before | After |
|---|:---:|:---:|
| Region A (treated) | 500 | 580 |
| Region B (control) | 480 | 500 |

1. Compute the DiD estimate of the subsidy effect.
2. Under what assumption is this a valid causal estimate?
]

---
class: inverse, middle, center
# Testing the ratio between two variances

---
# Setup

.pull-left-wide[
Two independent simple random samples, each normally distributed. Test:

`$$\begin{align*} H_0 & : \sigma_1^2 = \sigma_2^2 \\ H_1 & : \sigma_1^2 \not= \sigma_2^2 \end{align*}$$`

Use the hypothesis measure `\(\sigma_1^2/\sigma_2^2\)` and replace with sample variances:

`$$F = \frac{S_1^2}{S_2^2}$$`

Under `\(H_0\)`, `\(F \sim F(n_1-1, n_2-1)\)`. Decision rule: reject `\(H_0\)` if `\(F \geq F_{1-\alpha/2}(n_1-1, n_2-1)\)`.
]

.pull-right-narrow[
.small123[
**Convention:** Put the larger sample variance in the numerator so that `\(F \geq 1\)`, then use only the upper tail.
]
]

---
# .red[Raise your hand 4: Variance ratio test]
.pull-left-wide[
**Q1.** A two-sided test of `\(H_0: \sigma_1^2=\sigma_2^2\)` uses `\(F=S_1^2/S_2^2\)`. You reject `\(H_0\)` if:

A: `\(F \geq F_{1-\alpha/2}(n_1-1, n_2-1)\)`

B: `\(F \geq F_{1-\alpha}(n_1-1, n_2-1)\)`

C: `\(F \geq F_{1-\alpha/2}(n_1, n_2)\)`

D: `\(F \geq F_{1-\alpha}(n_1-1, n_2-1)\)` or `\(F \leq F_\alpha(n_1-1, n_2-1)\)`
]

--

.pull-left-wide[
**Q2.** If you accidentally compute `\(F=S_2^2/S_1^2\)` instead of `\(S_1^2/S_2^2\)`, the test statistic follows:

A: `\(F(n_2-1, n_1-1)\)` — degrees of freedom are reversed

B: The p-value doubles because you test from the wrong direction

C: An invalid distribution because `\(F\)` must always be `\(\geq 1\)`

D: The same distribution `\(F(n_1-1, n_2-1)\)`, since ratios are symmetric
]

---
# .red[Practice 4: Variance ratio test]

.pull-left-wide[
Two independent samples of test scores (assume normal distributions):

- Group 1: `\(n_1=21\)`, `\(S_1^2=144\)`
- Group 2: `\(n_2=16\)`, `\(S_2^2=64\)`

1. Test `\(H_0: \sigma_1^2=\sigma_2^2\)` vs `\(H_1: \sigma_1^2 \not= \sigma_2^2\)` at `\(\alpha=0.05\)`.
2. Compute the test statistic and state the decision. (`\(F_{0.975}(20,15) \approx 2.76\)`)
]

---
# Before next time

.pull-left[
- Read the assigned reading
- Next time: Testing relationships using qualitative data `\(\rightarrow\)` Chapter 16
]

.pull-right[

]
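---
# Appendix: Practice 1 in code

A minimal Python sketch (not part of the lecture material) of the two-sample test statistic with separate sample variances, applied to the Practice 1 numbers; the function name `two_sample_z` is illustrative.

```python
from math import erf, sqrt

def two_sample_z(mean1, mean2, var1, var2, n1, n2):
    """Two-sample Z with separate sample variances and
    a two-sided p-value p = 2 * Phi(-|z|)."""
    z = (mean1 - mean2) / sqrt(var1 / n1 + var2 / n2)
    phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))  # standard normal CDF
    return z, 2.0 * phi(-abs(z))

# Practice 1: men n1=36, mean 850, S1^2=3600; women n2=49, mean 800, S2^2=2500
z, p = two_sample_z(850, 800, 3600, 2500, 36, 49)
```

Here `\(Z \approx 4.07 > 1.96\)`, so `\(H_0: \mu_1 = \mu_2\)` is rejected at `\(\alpha = 0.05\)`.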
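---
# Appendix: Practice 2 in code

A minimal Python sketch (not part of the lecture material) computing `\(SSTR\)` and the ANOVA `\(F\)` statistic from group summary statistics, as defined on the "Sums of squares" and "Test statistic" slides; the function name `anova_f` is illustrative.

```python
def anova_f(group_ns, group_means, sse):
    """SSTR and F from group sizes, group means, and the within-group SSE."""
    n, k = sum(group_ns), len(group_ns)
    xbar = sum(nk * m for nk, m in zip(group_ns, group_means)) / n  # overall mean
    sstr = sum(nk * (m - xbar) ** 2 for nk, m in zip(group_ns, group_means))
    return sstr, (sstr / (k - 1)) / (sse / (n - k))

# Practice 2: three regions, n_k = 10 each, means 120/130/140, SSE = 2700
sstr, f_stat = anova_f([10, 10, 10], [120, 130, 140], 2700)
```

Here `\(SSTR = 2{,}000\)` and `\(F = 10 > F_{0.95}(2,27) \approx 3.35\)`, so `\(H_0\)` is rejected at `\(\alpha = 0.05\)`.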
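---
# Appendix: Practice 3 in code

A minimal Python sketch (not part of the lecture material) of the Case 3 difference-in-differences point estimate, applied to the Practice 3 table; the function name `did_estimate` is illustrative.

```python
def did_estimate(treat_before, treat_after, ctrl_before, ctrl_after):
    """DiD estimate: (change in treated group) minus (change in control group)."""
    return (treat_after - treat_before) - (ctrl_after - ctrl_before)

# Practice 3 profits (DKK 1,000s): region A (treated) 500 -> 580, region B (control) 480 -> 500
effect = did_estimate(500, 580, 480, 500)
```

The estimate is causal only under the parallel-trends assumption: absent the subsidy, profits in both regions would have followed the same trend.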
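---
# Appendix: Practice 4 in code

A minimal Python sketch (not part of the lecture material) of the variance-ratio statistic, following the slide convention of putting the larger sample variance in the numerator; the function name `variance_ratio` is illustrative.

```python
def variance_ratio(s2_a, s2_b):
    """F statistic with the larger sample variance in the numerator, so F >= 1.
    Degrees of freedom: (n - 1) of the larger-variance group comes first."""
    return max(s2_a, s2_b) / min(s2_a, s2_b)

# Practice 4: S1^2 = 144 (n1 = 21), S2^2 = 64 (n2 = 16)
f_ratio = variance_ratio(144, 64)
```

Here `\(F = 2.25 < F_{0.975}(20,15) \approx 2.76\)`, so `\(H_0: \sigma_1^2 = \sigma_2^2\)` is not rejected at `\(\alpha = 0.05\)`.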