class: center, inverse, middle

<style type="text/css">
.pull-left {
  float: left;
  width: 44%;
}
.pull-right {
  float: right;
  width: 44%;
}
.pull-right ~ p {
  clear: both;
}
.pull-left-wide {
  float: left;
  width: 66%;
}
.pull-right-wide {
  float: right;
  width: 66%;
}
.pull-right-wide ~ p {
  clear: both;
}
.pull-left-narrow {
  float: left;
  width: 30%;
}
.pull-right-narrow {
  float: right;
  width: 30%;
}
.tiny123 {
  font-size: 0.40em;
}
.small123 {
  font-size: 0.80em;
}
.large123 {
  font-size: 2em;
}
.red { color: red }
.orange { color: orange }
.green { color: green }
</style>

# Statistics
## Simple linear regression
### (Chapter 17)
### Christian Vedel,<br>Department of Economics<br>University of Southern Denmark
### Email: [christian-vs@sam.sdu.dk](mailto:christian-vs@sam.sdu.dk)
### Updated 2026-04-27

---
class: middle

# Today's lecture

.pull-left-wide[
**Modelling the conditional mean of one variable as a linear function of another, and testing whether the relationship is real**

- **Section 1:** The regression function
- **Section 2:** Simple linear regression
- **Section 3:** OLS estimation
- **Section 4:** Hypothesis testing and confidence intervals
]

.pull-right-narrow[

]

---
class: inverse, middle, center

# The regression function

---
# Motivation

.pull-left-wide[
The relationship between two random variables `\(X\)` and `\(Y\)` is specified by their joint probability distribution `\(f(x,y)\)`.

We can also study the relationship between a given value of `\(X\)` and the distribution of `\(Y\)`, which is given by the **conditional distribution** `\(f_{Y|X}(y \mid x)\)`.

If `\(X\)` and `\(Y\)` are not independent, knowing the value of `\(X\)` gives us information about the distribution of `\(Y\)`.
]

---
# Regression function

.pull-left-wide[
**Regression analysis** focuses on the conditional mean `\(E(Y \mid X = x)\)`:

- if `\(Y\)` is discrete: `\(\displaystyle E(Y \mid X = x) = \sum_i y_i \cdot f_{Y|X}(y_i \mid x)\)`
- if `\(Y\)` is continuous: `\(\displaystyle E(Y \mid X = x) = \int_{-\infty}^\infty y \cdot f_{Y|X}(y \mid x)\,dy\)`

In either case, `\(E(Y \mid X = x)\)` is a function of `\(x\)`, called the **regression function** of `\(Y\)` on `\(X\)`.
]

--

.pull-left-wide[
The two variables have different names:

- `\(Y\)` is the **dependent (explained)** variable
- `\(X\)` is the **independent (explanatory)** variable
]

---
# Error term

.pull-left-wide[
The regression function gives the **expected** value of `\(Y\)` for a given `\(X\)`. The actual value rarely equals this expectation.

The **error term** `\(U\)` is the difference:

`$$U = Y - E(Y \mid X = x)$$`

By construction, this has zero conditional mean:

`$$E(U \mid X = x) = 0$$`
]

--

.pull-left-wide[
> The regression function describes a **statistical** relationship between `\(Y\)` and `\(X\)`; it does not imply causality.
]

---
class: inverse, middle, center

# Simple linear regression

---
# Linear regression model

.pull-left-wide[
In practice, we often assume a **linear** relationship between `\(X\)` and `\(Y\)`:

> **Simple linear regression:** `\(E(Y \mid X = x) = \beta_0 + \beta_1 x\)`

The two coefficients:

- `\(\beta_0\)` is the **intercept**
- `\(\beta_1\)` is the **slope coefficient**

The slope `\(\beta_1\)` measures the change in `\(E(Y)\)` for a one-unit increase in `\(X\)`.
]
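
---
# The regression function by simulation

A minimal R sketch, not from the textbook: the data and the coefficients 2 and 0.5 are made up for illustration. Averaging `\(Y\)` within narrow bins of `\(X\)` approximates the regression function `\(E(Y \mid X = x)\)`, which here is linear by construction.

```r
# Simulate pairs (X, Y) whose true regression function is E(Y | X = x) = 2 + 0.5 x
set.seed(1)
n <- 10000
x <- runif(n, 0, 10)
u <- rnorm(n)                       # error term with E(U | X = x) = 0
y <- 2 + 0.5 * x + u

# Approximate E(Y | X = x) by the average of Y within narrow bins of X
bins <- cut(x, breaks = 0:10)
round(tapply(y, bins, mean), 2)     # close to 2 + 0.5 * (bin midpoint)
```

The binned averages trace out (approximately) the straight line `\(2 + 0.5x\)`, which is exactly the shape the simple linear regression model assumes for `\(E(Y \mid X = x)\)`.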

---
# Population coefficients

.pull-left-wide[
It can be shown that the two coefficients can be expressed as:

`$$\begin{align*} \beta_1 & = \frac{Cov(X,Y)}{Var(X)} \\ \beta_0 & = E(Y) - \beta_1 E(X) \end{align*}$$`

Implications:

- the **sign** of `\(\beta_1\)` equals the sign of the correlation between `\(X\)` and `\(Y\)`
- the **magnitude** of `\(\beta_1\)` gives the effect of a one-unit change in `\(X\)` on `\(E(Y)\)`
]

.pull-right-narrow[
.small123[
If `\(Cov(X,Y) = 0\)`, then `\(\beta_1 = 0\)` and knowing `\(X\)` does not help predict `\(E(Y)\)`.
]
]

---
class: inverse, middle, center

# OLS estimation

---
# Analogy principle

.pull-left-wide[
Given a simple random sample `\(((X_1,Y_1),\ldots,(X_n,Y_n))\)`, replace the population quantities with sample counterparts:

`$$\begin{align*} Cov(X,Y) & \;\to\; \widehat{Cov}(X,Y) = \frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})(Y_i-\bar{Y}) \\ Var(X) & \;\to\; S_X^2 = \frac{1}{n-1}\sum_{i=1}^n(X_i-\bar{X})^2 \\ E(Y) & \;\to\; \bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i \end{align*}$$`
]

---
# OLS estimators

.pull-left-wide[
Substituting gives the **ordinary least squares (OLS)** estimators:

`$$\begin{align*} \hat{\beta}_1 & = \frac{\widehat{Cov}(X,Y)}{S_X^2} = \frac{\sum_{i=1}^n(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^n(X_i-\bar{X})^2} \\ \hat{\beta}_0 & = \bar{Y} - \hat{\beta}_1\bar{X} \end{align*}$$`

These are the solution to minimising the **sum of squared residuals**:

`$$\sum_{i=1}^n\left[Y_i - (b_0 + b_1 X_i)\right]^2$$`

> OLS finds the "best-fitting" straight line through the sample points.
]
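
---
# OLS in R (sketch)

A minimal sketch, not from the textbook, using made-up simulated data: the OLS estimates computed directly from the sample counterparts `\(\widehat{Cov}(X,Y)\)`, `\(S_X^2\)`, `\(\bar{X}\)` and `\(\bar{Y}\)` coincide with the output of R's built-in `lm()`.

```r
# Made-up data for illustration
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + rnorm(200)

# OLS estimates from the analogy principle
beta1_hat <- cov(x, y) / var(x)               # sample Cov(X,Y) / sample Var(X)
beta0_hat <- mean(y) - beta1_hat * mean(x)    # Y-bar minus slope times X-bar
c(intercept = beta0_hat, slope = beta1_hat)

# The same numbers from R's least-squares routine
coef(lm(y ~ x))
```

Both routes minimise the same sum of squared residuals, so the two sets of estimates agree.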

---
# .red[Raise your hand 1: Regression function and OLS]

.pull-left-wide[
**Q1.** In `\(E(Y \mid X=x) = \beta_0 + \beta_1 x\)`, the slope `\(\beta_1 = 2.5\)` means:

A: When `\(X\)` increases by 1 unit, the expected value of `\(Y\)` increases by 2.5 units

B: When `\(X\)` increases by 1 unit, `\(Y\)` always increases by 2.5 units

C: The correlation between `\(X\)` and `\(Y\)` is 2.5

D: For every unit increase in `\(Y\)`, `\(X\)` increases by 2.5 units
]

--

.pull-left-wide[
**Q2.** The OLS estimator is `\(\hat{\beta}_1 = \widehat{Cov}(X,Y)/S_X^2\)`. The sign of `\(\hat{\beta}_1\)` is determined by:

A: The sign of `\(\widehat{Cov}(X,Y)\)`, since `\(S_X^2 > 0\)` always

B: The sign of `\(S_X^2\)`

C: The sign of `\(\bar{Y}\)`

D: Whether `\(n\)` is odd or even
]

---
# .red[Practice 1: Computing OLS estimates]

.pull-left-wide[
A sample of `\(n=5\)` observations gives:

`$$\bar{X}=3, \quad \bar{Y}=7, \quad \sum_{i=1}^5(X_i-\bar{X})^2=10, \quad \sum_{i=1}^5(X_i-\bar{X})(Y_i-\bar{Y})=15$$`

1. Compute `\(\hat{\beta}_1\)` and `\(\hat{\beta}_0\)`.
2. Interpret `\(\hat{\beta}_1\)`.
3. What is the fitted value `\(\hat{Y}\)` when `\(X=5\)`?
]

---
class: inverse, middle, center

# Hypothesis testing and confidence intervals

---
# Hypothesis test for `\(\beta_1\)`

.pull-left-wide[
To test whether `\(X\)` and `\(Y\)` are linearly related, test:

`$$\begin{align*} H_0 & : \beta_1 = 0 \\ H_1 & : \beta_1 \not= 0 \end{align*}$$`

More generally, test `\(H_0: \beta_1 = \beta_1^0\)` for a value `\(\beta_1^0\)` suggested by theory.

Since `\(\hat{\beta}_1\)` is an **unbiased estimator** of `\(\beta_1\)` (i.e. `\(E(\hat{\beta}_1) = \beta_1\)`), the test is a test of the mean of `\(\hat{\beta}_1\)`:

`$$Z = \frac{\hat{\beta}_1 - \beta_1^0}{\sqrt{S_{\hat{\beta}_1}^2}} \overset{a}{\sim} \mathcal{N}(0,1)$$`

Decision rule:

- do not reject `\(H_0\)` if `\(z_{\alpha/2} \leq Z \leq z_{1-\alpha/2}\)`
- reject `\(H_0\)` if `\(Z < z_{\alpha/2}\)` or `\(Z > z_{1-\alpha/2}\)`
]

.pull-right-narrow[
.small123[
Testing `\(\beta_0\)` uses the same approach with `\(\hat{\beta}_0\)` and its standard error.
]
]

---
# Confidence interval for `\(\beta_1\)`

.pull-left-wide[
A confidence interval at confidence level `\(1-\alpha\)`:

`$$\hat{I} = \left[\hat{\beta}_1 - z_{1-\alpha/2}\sqrt{S_{\hat{\beta}_1}^2},\;\; \hat{\beta}_1 + z_{1-\alpha/2}\sqrt{S_{\hat{\beta}_1}^2}\right]$$`

Interpretation: if we were to draw many samples, a fraction `\(1-\alpha\)` of the intervals constructed this way (e.g. 95\% when `\(\alpha = 0.05\)`) would contain the true `\(\beta_1\)`.

Confidence intervals for `\(\beta_0\)` are constructed analogously.
]
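
---
# Inference for `\(\beta_1\)` in R (sketch)

A minimal sketch, not from the textbook, using made-up simulated data. Note that R's `summary()` and `confint()` report `\(t\)`-based inference, which is close to the normal-based test and interval from the previous slides when `\(n\)` is reasonably large.

```r
# Made-up data for illustration
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 + 0.5 * x + rnorm(200)

fit <- lm(y ~ x)
est <- summary(fit)$coefficients
beta1_hat <- est["x", "Estimate"]
se_beta1  <- est["x", "Std. Error"]

# Test H0: beta1 = 0 against H1: beta1 != 0 at alpha = 0.05
z <- (beta1_hat - 0) / se_beta1
abs(z) > qnorm(0.975)                            # TRUE here: reject H0

# Approximate 95% confidence interval for beta1, then R's t-based interval
beta1_hat + c(-1, 1) * qnorm(0.975) * se_beta1
confint(fit, "x", level = 0.95)
```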

---
# .red[Raise your hand 2: Tests for regression coefficients]

.pull-left-wide[
**Q1.** To test whether `\(X\)` and `\(Y\)` are **linearly related**, we test:

A: `\(H_0: \beta_1=0\)` vs `\(H_1: \beta_1 \neq 0\)`

B: `\(H_0: \beta_0=0\)` vs `\(H_1: \beta_0 \neq 0\)`

C: `\(H_0: \mu_X=\mu_Y\)` vs `\(H_1: \mu_X \neq \mu_Y\)`

D: `\(H_0: Var(X)=Var(Y)\)` vs `\(H_1: Var(X)\neq Var(Y)\)`
]

--

.pull-left-wide[
**Q2.** The test statistic `\(Z = (\hat{\beta}_1 - \beta_1^0)/\sqrt{S_{\hat{\beta}_1}^2}\)` follows approximately `\(\mathcal{N}(0,1)\)` because:

A: `\(\hat{\beta}_1\)` is unbiased for `\(\beta_1\)`, so `\(\hat{\beta}_1 - \beta_1^0\)` is centred; the CLT gives approximate normality

B: The OLS estimator is exactly normally distributed for any sample size

C: The `\(t\)`-distribution always equals `\(\mathcal{N}(0,1)\)` regardless of sample size

D: Population parameters such as `\(\beta_1\)` are normally distributed by definition
]

---
# .red[Practice 2: Hypothesis test and CI for `\(\beta_1\)`]

.pull-left-wide[
Using the estimates from Practice 1 (`\(\hat{\beta}_1 = 1.5\)`, `\(\hat{\beta}_0 = 2.5\)`), suppose the estimated standard error is `\(\sqrt{S_{\hat{\beta}_1}^2} = 0.5\)`.

1. Test `\(H_0: \beta_1=0\)` vs `\(H_1: \beta_1 \neq 0\)` at `\(\alpha=0.05\)`.
2. Construct a 95\% confidence interval for `\(\beta_1\)`.
3. Does the CI contain 0? Is this consistent with your test decision?
]

---
# Before next time

.pull-left[
- Read the assigned reading
- Next time: Further topics in regression (Chapter 18)
]

.pull-right[

]