QTM 385 - Experimental Methods

Lecture 04 - Selection Bias, Variability, and Regression Analysis

Danilo Freire

Emory University

Hello, everyone! 👋
Nice to see you all again! 😉

Brief recap 📚

Last time, we saw that…

  • The potential outcomes framework helps us understand treatment effects
  • Fundamental problem of causal inference: we only see one potential outcome
  • Treatment effect is the difference between potential outcomes (\(\tau_{i} = Y_{1,i} - Y_{0,i}\))
  • Average causal effect is the goal, but simple means comparison is usually biased
  • Selection bias: treated/untreated groups may differ even before treatment
  • Bias does not vanish in large samples
  • Mathematical expectation \(E[Y_i]\) is the population average
  • Law of large numbers: sample average converges to population average
  • Conditional expectation \(E[Y_i|X_i]\) is average outcome given \(X_i\)
  • SUTVA (Stable Unit Treatment Value Assumption)
  • Violations of SUTVA: spillovers, contamination
  • External validity: generalising results to other settings
  • Heterogeneous treatment effects: effects vary across people
  • Compliance problem: people may not follow random assignment
  • Intent-to-Treat (ITT) and Treatment-on-the-Treated (TOT) analyses address compliance

Today, we will discuss…

  • A little more about potential outcomes
  • Measuring variability
  • Different types of selection bias, e.g.:
    • Inappropriate controls
    • Loss to follow-up
    • Attrition bias
    • Survivorship bias
    • Post-treatment bias
  • How to use R and DeclareDesign to estimate regression models
  • But first…

Source: xkcd (where else? 😄)

Funny correlation of the day!

Let’s get started! 🚀

Potential outcomes revisited

  • Remember from last time: we have two potential outcomes for each unit
  • We only observe one of them: \(Y_i = Y_{1,i} \cdot D_i + Y_{0,i} \cdot (1 - D_i)\)
    • \(D_i\) is the treatment indicator = 1 if treated, 0 if not
  • So we need to estimate the average treatment effect (ATE):
    • \(E[Y_{1,i} - Y_{0,i}] = E[Y_{1,i}] - E[Y_{0,i}]\)
  • The problem is, not all comparisons are valid
  • I mean, they can be, but only under the heroic assumption that the groups are similar on average before any adjustment
  • Otherwise, we will have selection bias
  • Let’s see how this works in practice! 🤓

Khuzdar and Maria

  • Selection bias occurs when the groups are systematically different, often before the treatment (but not exclusively!)
  • Using an example from Angrist and Pischke (2021), let’s say we have a student called Khuzdar from Kazakhstan, who is considering studying in the US and is worried about the cold weather
  • Should he get health insurance? 🤔
  • Let’s imagine that, without insurance, Khuzdar has a potential outcome of \(Y_{0,i} = 3\) and, with insurance, \(Y_{1,i} = 4\). So the treatment effect is \(\tau_i = 1\), that is, he gains 1 “health point” by getting insurance
  • Now, let’s imagine that Khuzdar has a Chilean colleague called Maria Moreno, who is also considering studying in the US
  • But since she comes from chilly Santiago, she is not worried about the cold weather
  • So, without insurance, Maria has a potential outcome of \(Y_{0,i} = 5\) and, with insurance, \(Y_{1,i} = 5\). So the treatment effect is \(\tau_i = 0\), that is, she gains no “health points” by getting insurance

Khuzdar and Maria

  • In fact, the comparison between frail Khuzdar and hearty Maria tells us little about the causal effects of their choices!
  • Why is that? Because they were different to begin with
  • Imagine that Khuzdar got insurance and Maria did not
  • Let’s do a little mathematical trick here: we will add and subtract \(Y_{0, Khuzdar}\) from the treatment effect (they cancel each other out, right?)
  • So we have the following:
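\[
\begin{aligned}
Y_{Khuzdar} - Y_{Maria} &= Y_{1, Khuzdar} - Y_{0, Maria} \\
&= \underbrace{\left(Y_{1, Khuzdar} - Y_{0, Khuzdar}\right)}_{\text{causal effect} \, = \, 1} + \underbrace{\left(Y_{0, Khuzdar} - Y_{0, Maria}\right)}_{\text{selection bias} \, = \, 3 - 5 \, = \, -2} = 4 - 5 = -1
\end{aligned}
\]

  • Generalising to group averages (the standard decomposition in Angrist and Pischke, 2021):

\[
\underbrace{Avg_n[Y_i|D_i = 1] - Avg_n[Y_i|D_i = 0]}_{\text{difference in means}} = \underbrace{Avg_n[Y_{1i}|D_i = 1] - Avg_n[Y_{0i}|D_i = 1]}_{\text{average treatment effect (on the treated)}} + \underbrace{Avg_n[Y_{0i}|D_i = 1] - Avg_n[Y_{0i}|D_i = 0]}_{\text{selection bias}}
\]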

Difference in means = average treatment effect + selection bias

  • So far, so good? 🤓
  • If we assume that the treatment has a constant effect (i.e. the treatment effect is the same for everyone), we can rewrite the equation as:
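\[
Y_{1i} = Y_{0i} + k \quad \Longrightarrow \quad k = Y_{1i} - Y_{0i} = Avg_n[Y_{1i}] - Avg_n[Y_{0i}]
\]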

  • Where \(k\) is both the individual and average causal effect of insurance on health
  • Using the constant-effects model to substitute for \(Avg_n[Y_{1i}|D_i = 1]\), we have
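\[
Avg_n[Y_i|D_i = 1] - Avg_n[Y_i|D_i = 0] = k + \underbrace{Avg_n[Y_{0i}|D_i = 1] - Avg_n[Y_{0i}|D_i = 0]}_{\text{selection bias}}
\]

  • That is, any gap between the two groups beyond \(k\) comes from differences in their untreated outcomes, i.e. selection bias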

How to check for selection bias?

Balance tests

  • We use balance tests (or randomisation checks) to check whether the groups are similar before treatment
  • The idea is to compare the means of the covariates for the treated and untreated groups
  • If the means are similar, it provides evidence that nothing systematic is driving the treatment effect
  • We are never 100% sure, but we trust the random assignment
  • Small differences in means are acceptable, as long as they are not systematic
  • Why? Because some variation can happen only by chance
  • 100% personal opinion: I think they aren’t very useful 😂
  • Mutz et al. (2018) make a good case for that
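  • As a quick illustration, here is a minimal sketch with simulated data (hypothetical variable names): a simple t.test() comparison of a pre-treatment covariate across arms is one common way to run the check
# Minimal balance-check sketch with simulated data: under random assignment,
# a pre-treatment covariate such as age should have roughly the same mean in both arms
set.seed(385)
d <- data.frame(treat = rbinom(1000, 1, 0.5),
                age   = rnorm(1000, mean = 30, sd = 5))
t.test(age ~ treat, data = d) # any small difference here is due to chance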

Angrist and Pischke (2021), p. 20 (selected parts)

Questions? 🤔

Different types of selection bias 🎯

Causal diagrams and associations

  • Causal diagrams are a great way to visualise the relationships between variables
    • They are also called directed acyclic graphs (DAGs)
    • They are “directed” because they show the direction of the relationships, and “acyclic” because they do not have loops (one cause cannot cause itself)
  • They were introduced by Judea Pearl and are widely used in causal inference (quick guide here)
  • They help us understand the associations between variables and identify potential sources of bias
  • And they are super easy to draw! 🤓
  • For instance, let’s illustrate a classic example of selection bias with a causal diagram (borrowed from Hernán et al., 2019)

  • Figure 1 shows how the common cause \(L\) (smoking) makes \(E\) (carrying matches) and \(D\) (lung cancer) associated with each other
  • Figure 2 shows how randomisation breaks the association between \(L\) and \(D\), so the treatment effect is unbiased
  • In this case, if you know what \(L\) is, you can also control for it in your analysis
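  • As an aside, DAGs like the one in Figure 1 are easy to write down in R; here is a minimal sketch, assuming the dagitty package is installed
# Minimal sketch of the Figure 1 DAG (assumes the dagitty package is installed):
# smoking (L) causes both carrying matches (E) and lung cancer (D)
library(dagitty)
g <- dagitty("dag { L -> E ; L -> D }")
plot(graphLayout(g)) # quick automatic layout of the graph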

Some sources of selection bias

Losses to follow-up

  • One problem that we sometimes have with experiments is individuals dropping out
  • That is, when individuals leave the study before the end of follow-up
  • Why is that an issue?
  • Because the treatment effect may be different for those who leave
  • Imagine a follow-up study of the effect of antiretroviral therapy (\(E\)) on AIDS (\(D\)) risk among HIV-infected patients
  • \(L\) is the presence of other symptoms of HIV infection
  • The greater the true level of immunosuppression (\(U\)), the greater the risk of AIDS. \(U\) is unmeasured
  • If a patient drops out from the study, his AIDS status cannot be assessed and we say that he is censored (\(C=1\))
  • Patients with greater values of \(U\) are more likely to be lost to follow up
  • The causal diagram shows the relationship between the variables

  • How can we correct for this bias?
  • One way is to use stratified analysis or inverse probability weighting
  • We will discuss these methods in further detail in the next lectures
  • But stratification is quite simple:
    • Assuming that \(L\) is measured, we can estimate the treatment effect within each stratum of \(L\)
    • Then we can average the stratum-specific estimates to get the overall treatment effect
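  • Here is a minimal base R sketch of that logic with simulated data (the variable names and numbers are made up for illustration)
# Stratification sketch: estimate the effect within each stratum of L, then
# take a weighted average of the stratum-specific estimates
set.seed(385)
n <- 1000
L <- rbinom(n, 1, 0.4)                # measured stratification variable
treat <- rbinom(n, 1, 0.5)
y <- 2 + 1 * treat + 3 * L + rnorm(n) # true treatment effect = 1 in both strata
d <- data.frame(y, treat, L)

strata  <- split(d, d$L)
effects <- sapply(strata, function(s) mean(s$y[s$treat == 1]) - mean(s$y[s$treat == 0]))
weights <- sapply(strata, nrow) / nrow(d)
sum(effects * weights)                # stratum-weighted overall estimate, close to 1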

Attrition bias

  • Attrition bias is a type of selection bias that occurs when individuals drop out of a study in a way that is related to the treatment or outcome
  • It is more systematic than losses to follow-up, which can happen by chance (e.g., disengagement, relocation, or other factors that lead to incomplete data)
  • For example, in a clinical trial, if participants experiencing side effects disproportionately drop out of the treatment group compared to the placebo group, this difference introduces attrition bias
  • In short, attrition bias specifically addresses the bias that arises when these departures are non-random
  • As with losses to follow-up, this is a type of selection bias that can be caused by the experiment itself, not only by pre-existing differences between groups

Survivorship bias

  • In my view, one of the coolest examples of selection bias
  • Survivorship bias is the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not
  • It is also closely related to attrition bias and follow-up bias (the latter is more often mentioned in observational studies)
    • Attrition bias emphasises the process of losing data, while survivorship bias highlights the exclusion of failures in analysis
  • For instance, imagine that you are studying the effect of education on income
  • If you only look at people who have completed college, you are ignoring those who dropped out
  • This can lead to an overestimation of the effect of education on income
  • Why? Because those who dropped out may have done so because they were not good students, which can affect their income
  • Have a look at this example (you may have seen it before!)

Survivorship bias in World War II aircraft

  • During World War II, the US military wanted to improve the armour on its aircraft
  • They analysed the bullet holes on returning aircraft and decided to add more armour to the areas with the most bullet holes
  • Considering the picture you see here, where would you add the armour?
  • More here
  • Another example: Imagine you are studying mutual fund performance, and you only look at funds that are still in operation. You see that they beat the S&P 500 by 2% per year
  • What’s the problem here?

Post-treatment bias

  • Post-treatment bias (or overcontrol) occurs when we control for variables that are affected by the treatment
  • Imagine a research study evaluating a job training programme for unemployed individuals
  • The treatment is the job training programme and the outcome is finding employment after programme completion
  • Imagine also that people who take the programme submit more job applications after training
  • This is a post-treatment variable, as it is affected by the treatment
  • If we control for the number of job applications in our analysis, we (usually) underestimate the programme’s effectiveness, because we are removing one of the mechanisms through which the treatment works
  • This bias, however, can go in either direction, depending on the context
  • Is there a statistical fix for it? Not really!
  • The best way to avoid post-treatment bias is to think carefully about the variables you include in your analysis
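  • For intuition, here is a simulated sketch (hypothetical numbers) of the job training example: conditioning on the mediator absorbs part of the effect we want to measure
# Post-treatment bias sketch: training raises applications, and applications raise
# employment, so controlling for applications removes part of the programme's effect
library(estimatr)
set.seed(385)
n <- 1000
treat        <- rbinom(n, 1, 0.5)
applications <- 5 + 3 * treat + rnorm(n)        # post-treatment variable (mediator)
employed     <- 0.2 * treat + 0.1 * applications + rnorm(n)
d <- data.frame(treat, applications, employed)

coef(lm_robust(employed ~ treat, data = d))["treat"]                 # ~0.5, the total effect
coef(lm_robust(employed ~ treat + applications, data = d))["treat"]  # ~0.2, understated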

Examples of post-treatment bias

Avoidable post-treatment bias

  • Causal effect of party ID on the vote
    • Do control for race
    • Do not control for voting intentions five minutes before voting
  • Causal effect of race on salary in a firm
    • Do control for qualifications
    • Don’t control for position in the firm
  • Causal effect of medicine on health
    • Do control for health prior to the treatment decision
    • Do not control for side effects or other medicines taken later

Unavoidable post-treatment bias

  • Causal effect of democratisation on civil war; do we control for GDP?
    • Yes, since GDP → democratisation we must control to avoid omitted variable bias
    • No, since democratisation → GDP, we would have post-treatment bias
  • Causal effect of legislative effectiveness on state failure; do we control for trade openness?
    • Yes, since trade openness → legislative effectiveness, we must control to avoid omitted variable bias
    • No, since legislative effectiveness → trade openness, we would have post-treatment bias

Now it’s your turn! 🤓

  • I will list some factors and you tell me if they are likely to be sources of selection bias (and why)
    • Telephone random sampling
    • Ascertainment: when the investigator responsible for assessing the outcome of interest is also aware of the group assignment
    • WEIRD samples: Western, Educated, Industrialised, Rich, and Democratic
    • Referral filter: when the study population is recruited through referrals, often from health care providers or NGOs
    • When you ask socially sensitive questions in a survey

Measuring variability 📏

Measuring variability

Variance and standard deviation

  • Apart from averages, we are also interested in variability
  • We usually look at average squared deviations from the mean (variance)
  • The sample variance of \(Y_i\) in a sample of size \(n\) is defined as:
    • \(S(Y_i)^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2\)
  • The population variance, which is a fixed parameter, is defined as:
    • \(Var(Y_i) = E[(Y_i - E[Y_i])^2]\)
    • Here we just replaced averages with expectations
  • Since the variance can be quite large, we often look at the standard deviation (square root of the variance), written as \(\sigma(Y_i)\)
  • These measures are descriptive, as they summarise the variability in your dataset
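  • In R, var() and sd() compute exactly these sample quantities (with the \(n-1\) denominator)
# Sample variance and standard deviation in base R
y <- c(2, 4, 4, 4, 5, 5, 7, 9)
var(y)                          # sample variance, uses the n - 1 denominator
sd(y)                           # sample standard deviation
all.equal(sd(y), sqrt(var(y)))  # TRUE: the SD is the square root of the variance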

Standard error

  • The standard deviation of an inferential statistic is called its standard error
  • Standard error measures how precise your sample mean is as an estimate of the true population mean (which we usually don’t know)
  • The standard error of the sample mean can be written as:
    • \(SE(\bar{Y}) = \frac{\sigma(Y_i)}{\sqrt{n}}\)
  • But since we usually don’t have the population standard deviation, we use the sample standard deviation instead
    • \(SE(\bar{Y}) = \frac{S(Y_i)}{\sqrt{n}}\)
  • The standard error is not the same as the standard deviation
  • The standard error is a measure of the precision of an estimate, while the standard deviation is a measure of the variability of the data
  • Confused? 🤔 Don’t worry, let’s see their differences in further detail 😉

Comparison between standard deviation and standard error

  • A high SD indicates more spread in the data points. It does not inherently change with sample size
  • The standard error decreases as the sample size increases, indicating that larger samples yield more reliable estimates of the mean
    • That’s the central limit theorem at work! 🤓
  • Confidence intervals in statistical models are often constructed using the SE, while descriptive statistics often report the SD
  • Does it sound clearer now? 🙂
  • Suppose you measure the IQ of 50 students:
    • \(\bar{IQ} = 100\)
    • \(SD = 15\)
    • \(SE = \frac{15}{\sqrt{50}} \approx 2.12\)
  • \(SD = 15\): Individual IQs vary by 15 points from the mean (e.g., most are between 85 and 115).
  • \(SE = 2.12\): If you repeated the study, the average IQ of your samples would typically vary by about 2 points around 100
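  • You can see the distinction by simulation: draw many samples of 50 "students" and look at how much the sample means themselves vary
# SD vs SE by simulation: the spread of sample means across repeated samples of
# n = 50 matches the analytical standard error
set.seed(385)
sample_means <- replicate(10000, mean(rnorm(50, mean = 100, sd = 15)))
sd(sample_means)   # close to the analytical SE below
15 / sqrt(50)      # ~2.12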

Standard error of a difference in means

  • We are usually interested not in one mean, but in the difference between two means
  • Let’s assume we have a treatment group mean of \(\bar{Y}_{1}\) and a control group mean of \(\bar{Y}_{0}\), with sample sizes \(n_1\) and \(n_0\), respectively
  • The total sample size is \(n = n_1 + n_0\)
  • Using the fact that the variance of a difference between two statistically independent variables is the sum of their variances, we have:
    • \(Var(\bar{Y}_{1} - \bar{Y}_{0}) = Var(\bar{Y}_{1}) + Var(\bar{Y}_{0})\)
    • \(Var(\bar{Y}_{1} - \bar{Y}_{0}) = \frac{\sigma(Y_{1})^2}{n_1} + \frac{\sigma(Y_{0})^2}{n_0}\)
  • The standard error of the difference in means is then:
    • \(SE(\bar{Y}_{1} - \bar{Y}_{0}) = \sqrt{\frac{\sigma(Y_{1})^2}{n_1} + \frac{\sigma(Y_{0})^2}{n_0}}\)
  • Again, since we don’t usually have the population standard deviation, we use the sample standard deviation instead
    • \(SE(\bar{Y}_{1} - \bar{Y}_{0}) = \sqrt{\frac{S(Y_{1})^2}{n_1} + \frac{S(Y_{0})^2}{n_0}}\)
  • This is the standard error of the difference in means, which is used to calculate confidence intervals and hypothesis tests: \(t = \frac{\text{Difference}}{\text{SE}_{\text{diff}}}\) and \(CI = \text{Difference} \pm t \times \text{SE}_{\text{diff}}\)
  • Do you have to worry about this? Not really, but it’s good to know what’s going on under the hood 😉
  • Your software will do the heavy lifting for you 🤖
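  • Still, it helps to compute it once by hand; here is a minimal sketch with two simulated groups
# SE of a difference in means "by hand", with simulated treatment and control groups
set.seed(385)
y1 <- rnorm(500, mean = 15, sd = 2)   # hypothetical treatment group
y0 <- rnorm(500, mean = 10, sd = 2)   # hypothetical control group
diff_means <- mean(y1) - mean(y0)
se_diff    <- sqrt(var(y1) / length(y1) + var(y0) / length(y0))
c(difference = diff_means, se = se_diff, t = diff_means / se_diff)
# t.test(y1, y0) reports the same (Welch) standard error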

Regression analysis of experiments 📊

Regression analysis of experiments

  • It is super easy to estimate the average treatment effect using regression analysis
  • Although some areas (e.g. medicine, psychology) often use \(t\)-tests for difference-in-means, regression is more flexible and gives you exactly the same results
  • The idea is to regress the outcome variable on the treatment variable and some covariates if necessary
  • But always report the ATE without covariates first, as it is the most interpretable
  • Let’s see an example using R! 🤓
  • We will use the fabricatr, estimatr, and randomizr packages, which are part of the DeclareDesign suite
  • More information about the packages is available here, here, and here
  • We will start slowly, don’t worry, and build up from there in the next lectures 😉

Simulating data in R

  • Imagine that we want to estimate the effect of a job training programme on the number of interviews that unemployed individuals get, similar to the example we discussed earlier
  • The treatment variable is treat, which takes the value 1 if the individual received the training and 0 otherwise
  • We will assign 1000 individuals to the treatment and control groups, with exactly 500 in each group (complete random assignment)
  • The outcome variable is interviews, which is the number of interviews the individual got after the training
  • The treatment effect is 5, that is, individuals who received the training got 5 more interviews than those who did not
# Load the required packages
library(fabricatr)
library(estimatr)
library(randomizr)

# Set the seed for reproducibility
set.seed(385)

# Simulate the data
data <- fabricate( # fabricatr
  N = 1000, # base
  treat = complete_ra(N, m = 500), # randomizr
  interviews = round(rnorm(N, mean = 10, sd = 2) + 5 * treat, digits = 0) # base
)

# Check the first few rows of the data
head(data)
    ID treat interviews
1 0001     0         11
2 0002     1         15
3 0003     1         15
4 0004     1         16
5 0005     0          9
6 0006     0         11

Step-by-step: using fabricatr to simulate data

  • We used the fabricate() function from the fabricatr package to simulate the data
  • The package is very flexible and was built for all types of data simulations, mainly those used by social scientists (e.g. experiments, surveys)
    • We can draw from Likert scales, create binary variables, ordinal variables, categorical variables, etc
    • More about how to create different types of variables here
  • The fabricate() function takes a series of arguments, which are the variables we want to simulate
  • It uses base R functions too, which makes it easy to use
  • For instance, N = 1000 is the number of individuals in the sample and we created it using just base R
  • To simulate the interviews variable, we used the rnorm() function to draw from a normal distribution with mean 10 and standard deviation 2
  • We add 5 * treat because 5 is the treatment effect (treated individuals get five extra interviews on average)
  • As we cannot have a fraction of an interview, we used the round() function to round the number of interviews to the nearest integer (digits = 0)
  • So far so good? 🤓

Randomising treatment assignment

  • We used the complete_ra() function from the randomizr package to randomise the treatment assignment
  • The function has a few arguments, such as
    • N: the number of individuals in the sample (required)
    • m: the number of individuals assigned to the treatment group (optional)
    • m_each: the number of individuals assigned to each treatment group (optional)
  • We could have used simple_ra() for simple random assignment, which is a simple coin flip. The arguments are:
    • N: the number of individuals in the sample (required)
    • prob: the probability of being assigned to the treatment group (optional)
    • prob_each: the probability of being assigned to each treatment group (optional)
  • We could also use block_ra() for block random assignment, which is when we divide the sample into blocks and then randomise within each block
  • cluster_ra() is used for cluster random assignment, which is when we randomise at the cluster level (e.g. schools, hospitals)
  • Finally, block_and_cluster_ra() is used for block and cluster random assignment, which is a combination of the two
  • We will discuss these functions in further detail in the next lectures 😉
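  • For reference, here is a minimal sketch of these functions with made-up blocks and clusters
# Other randomizr assignment functions (hypothetical inputs, for illustration)
library(randomizr)
set.seed(385)
simple_ra(N = 10, prob = 0.5)                   # coin-flip assignment; group sizes vary
block_ra(blocks = rep(c("A", "B"), each = 5))   # randomise within each block
cluster_ra(clusters = rep(1:5, each = 2))       # assign whole clusters together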

Estimating the treatment effect

  • Now that we have the data, we can estimate the treatment effect using the lm_robust() function from the estimatr package
  • The function is similar to the lm() function, but it reports robust standard errors, which are more appropriate for experiments
  • Why? Because robust standard errors correct for heteroskedasticity (and, via the clusters argument, for clustered assignment)
  • This is close to a free lunch: if your data are not heteroskedastic or clustered, the robust standard errors will be essentially the same as the regular ones
  • The function takes a formula as an argument, which is the outcome variable regressed on the treatment variable
model <- lm_robust(interviews ~ treat, data = data)
summary(model)

Call:
lm_robust(formula = interviews ~ treat, data = data)

Standard error type:  HC2 

Coefficients:
            Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper  DF
(Intercept)    10.18     0.0922  110.39  0.000e+00    9.997   10.359 998
treat           4.91     0.1305   37.63 1.205e-193    4.654    5.166 998

Multiple R-squared:  0.5866 ,   Adjusted R-squared:  0.5862 
F-statistic:  1416 on 1 and 998 DF,  p-value: < 2.2e-16

Difference in means

  • For those who would like to estimate the difference in means using the t.test() function, we’ve got you covered! 😉
  • estimatr also has a function that does that, called difference_in_means() (what a surprise! 😂)
  • It is actually more flexible than the t.test() function, as it can handle unit-randomised, cluster-randomised, block-randomised, matched-pair randomised, and matched-pair clustered designs. More here
  • Here is how to use it:
difference_in_means(interviews ~ treat, data = data)
Design:  Standard 
      Estimate Std. Error  t value      Pr(>|t|) CI Lower CI Upper       DF
treat     4.91  0.1304729 37.63232 1.204798e-193 4.653967 5.166033 997.9986
  • As you can see, the results are exactly the same as the regression model
  • Just the output looks a little different, but the interpretation is the same
  • Use whatever you feel more comfortable with! 😊
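  • And if you prefer base R, a quick t.test() (Welch's test by default) gives essentially the same answer, just reported as group means rather than a single difference
# Base R comparison: Welch's t-test on the same simulated data
t.test(interviews ~ treat, data = data)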

Interpreting the results

  • The output of the summary() function gives us the coefficient estimate of the treatment variable
  • The intercept is 10.18, which is the average number of interviews for the control group
  • The coefficient of the treatment variable is 4.91, which is the average treatment effect
  • The standard error of the treatment effect is 0.13, which is the standard error of the difference in means
  • Our results are statistically significant at the 5% level, as the \(p\)-value is less than 0.05
  • The \(R^2\) and its adjusted version are measures of the goodness of fit of the model
    • They are the proportion of the variance in the outcome variable explained by the model
  • The F statistic is a test of the overall significance of the model
    • The null hypothesis is that all coefficients are zero
    • But as we see, the treatment variable is significant

Including pre-treatment covariates in the model

  • We can also include covariates in the model to increase precision
  • In contrast with the treatment, the coefficients on the covariates have no causal interpretation; the covariates should be measured before treatment, so they cannot be affected by it
  • Even with randomisation, there can be small imbalances in covariates between treatment and control groups, especially in small samples
  • Including these covariates in the regression helps to adjust for any residual imbalances that may exist due to random chance, making the analysis more accurate
  • Including covariates can also help bolster the credibility of the results by demonstrating that the treatment effect persists even after controlling for relevant variables
  • But remember to include only pre-treatment covariates in the model, as post-treatment covariates can introduce bias
  • Let’s simulate a new dataset with two pre-treatment covariates: age and education
set.seed(385)

data2 <- fabricate(
  N = 1000,
  treat = complete_ra(N, m = 500),
  age = round(rnorm(N, mean = 30, sd = 5)),
  education = round(rnorm(N, mean = 12, sd = 2)),
  interviews = round(rnorm(N, mean = 10, sd = 2) + 5 * treat)
)

head(data2)
    ID treat age education interviews
1 0001     0  32        11          9
2 0002     1  30        13         16
3 0003     1  31        12         16
4 0004     1  33        13         15
5 0005     0  27        15          7
6 0006     0  33        12          9

Estimating the second model

  • We can add covariates to experimental models to increase precision
  • The same way we do in any regression:
    • \(Y_i = \alpha + \beta \text{T}_i + \gamma \text{X}_i + \epsilon_i\)
  • Freedman (2008) demonstrated that pre-treatment covariate adjustment may bias estimates of average treatment effects, mainly in small samples
  • Lin (2013) showed that centring the covariates and fully interacting them with the treatment variable avoids this small-sample concern and cannot hurt precision in large samples
  • This adjusts for covariates separately in each group, which is equivalent to including treatment-by-covariate interactions
    • \(Y_i = \alpha + \beta \text{T}_i + \gamma \text{W}_i + \delta \text{T}_i \times \text{W}_i + \epsilon_i\)
    • Where \(W_i = \text{X}_i - \bar{\text{X}}\) and \(\bar{\text{X}}\) is the mean of \(\text{X}_i\)
    • Centring the covariates to have mean zero keeps \(\beta\) directly interpretable as the average treatment effect
  • Anyway, worry not! 😅 Our friends at DeclareDesign have done all the work for us and created a function called lm_lin() that does everything automatically!

Estimating the second model

model2 <- lm_lin(interviews ~ treat,
                covariates = ~ age + education,
                data = data2)
summary(model2)

Call:
lm_lin(formula = interviews ~ treat, covariates = ~age + education, 
    data = data2)

Standard error type:  HC2 

Coefficients:
                  Estimate Std. Error  t value   Pr(>|t|)  CI Lower CI Upper
(Intercept)        9.88659    0.09213 107.3127  0.000e+00  9.705804 10.06738
treat              5.13084    0.12546  40.8956 3.281e-215  4.884642  5.37704
age_c              0.02845    0.01771   1.6062  1.085e-01 -0.006307  0.06320
education_c       -0.04550    0.04721  -0.9636  3.355e-01 -0.138150  0.04716
treat:age_c       -0.03324    0.02546  -1.3054  1.921e-01 -0.083206  0.01673
treat:education_c  0.04324    0.06140   0.7043  4.814e-01 -0.077248  0.16373
                   DF
(Intercept)       994
treat             994
age_c             994
education_c       994
treat:age_c       994
treat:education_c 994

Multiple R-squared:  0.6272 ,   Adjusted R-squared:  0.6254 
F-statistic: 344.4 on 5 and 994 DF,  p-value: < 2.2e-16
  • The lm_lin() function is similar to the lm_robust() function, but it includes the covariates argument
  • We pass a simple R formula to it
  • As you can see, the results are pretty similar to the previous model
  • The treatment effect is slightly higher, and the standard error is slightly lower
  • The _c part of the variable names indicates that the variables are centred, as recommended by Lin (2013).
  • Lin’s method addresses covariate imbalance by adjusting for covariates separately within each treatment group, which is equivalent to including all treatment-by-covariate interactions.
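  • If you want to see what lm_lin() does under the hood, here is a minimal sketch of the manual equivalent: centre the covariates yourself and interact them with the treatment (it should reproduce the estimates above)
# Manual version of the Lin estimator: centred covariates fully interacted with treatment
data2$age_c       <- data2$age - mean(data2$age)
data2$education_c <- data2$education - mean(data2$education)
lm_robust(interviews ~ treat * (age_c + education_c), data = data2)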

Sub-group analysis

  • We can also estimate the treatment effect for sub-groups of the population
  • This is useful when we suspect that the treatment effect may vary across known dimensions
  • For instance, we can estimate the treatment effect for people with high and low levels of education
  • We can do this by including an interaction term between the treatment variable and the covariate of interest
  • We will discuss this in further detail in the next lectures, but here is how to do it
# Create a binary subgroup (e.g., "high" vs. "low" education)
data2$high_edu <- ifelse(data2$education > median(data2$education), 1, 0)

# Fit an interaction model with covariates
model_interaction <- lm_robust(
  interviews ~ treat * high_edu + age + education,
  data = data2)

# Summarize results
summary(model_interaction)

Call:
lm_robust(formula = interviews ~ treat * high_edu + age + education, 
    data = data2)

Standard error type:  HC2 

Coefficients:
               Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper  DF
(Intercept)     9.39778    0.69048 13.6104  8.255e-39  8.04281 10.75275 994
treat           5.10068    0.16905 30.1726 1.682e-142  4.76894  5.43241 994
high_edu       -0.26179    0.24257 -1.0792  2.808e-01 -0.73780  0.21422 994
age             0.01141    0.01276  0.8938  3.716e-01 -0.01363  0.03644 994
education       0.02033    0.05026  0.4046  6.859e-01 -0.07829  0.11895 994
treat:high_edu  0.09721    0.25093  0.3874  6.985e-01 -0.39521  0.58962 994

Multiple R-squared:  0.6269 ,   Adjusted R-squared:  0.625 
F-statistic: 341.1 on 5 and 994 DF,  p-value: < 2.2e-16

Sub-group analysis: interpreting the results

  • The output of the summary() function gives us the coefficient estimate of the treatment variable for the high education group
    • It is in the last line, treat:high_edu
  • Interpretation: The treatment effect for the high-education subgroup (high_edu = 1) is 0.097 interviews larger than for the low-education subgroup
  • Total treatment effect for high_edu = 1: 5.101 + 0.097 = 5.198
  • Significance: The interaction is not significant (CI: -0.395 to 0.590), meaning there’s no evidence the treatment effect differs between education subgroups
  • This is expected, as we simulated the data to have the same treatment effect for all individuals
  • But it is good to know how to do it, right? 😉
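  • If you want the subgroup estimate and its standard error directly, here is a small sketch using the coefficient estimates and their variance-covariance matrix (coef() and vcov() work on lm_robust objects)
# Total treatment effect for the high-education group and its standard error
b <- coef(model_interaction)
V <- vcov(model_interaction)
est <- b["treat"] + b["treat:high_edu"]
se  <- sqrt(V["treat", "treat"] + V["treat:high_edu", "treat:high_edu"] +
            2 * V["treat", "treat:high_edu"])
c(estimate = unname(est), std_error = unname(se))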

Cool, isn’t it? 😎

Questions?

Summary

  • We discussed the potential outcomes framework in further detail, focusing on issues of selection bias
  • We talked about different types of selection bias, such as inappropriate controls, loss to follow-up, attrition bias, survivorship bias, and post-treatment bias
  • We also discussed how to measure variability using the standard deviation and standard error
  • Then, we talked about regression analysis of experiments, including how to simulate data, randomise treatment assignment, estimate the treatment effect, and include pre-treatment covariates in the model
  • We also introduced the fabricatr, estimatr, and randomizr packages, which are part of the DeclareDesign suite
  • Finally, we discussed sub-group analysis and how to estimate the treatment effect for different sub-groups of the population
  • I hope you enjoyed the lecture! 🤓
  • Next time, we will discuss more about hypothesis testing and confidence intervals, so stay tuned! 😉

Thank you and see you soon! 😃