What can go wrong?
How can we test that this is not happening?
Note:
- variable names don't match across datasets
- the same categories are coded with different values
- the same variable is recorded in different units of measurement
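
One way to check for the three problems above before combining data is sketched below in base R. The data frames `wave1` and `wave2` and their columns are hypothetical, made up for illustration only.

```r
# Hypothetical example data: two datasets we would like to combine
wave1 <- data.frame(
  id     = 1:3,
  sex    = c("male", "female", "female"),
  income = c(30000, 42000, 51000)   # assumed to be in single currency units
)
wave2 <- data.frame(
  ID     = 4:6,
  sex    = c("M", "F", "F"),
  income = c(30, 42, 51)            # assumed to be in thousands
)

# 1. Variable names that exist in only one of the datasets
names_only_in_wave1 <- setdiff(names(wave1), names(wave2))
names_only_in_wave2 <- setdiff(names(wave2), names(wave1))

# 2. Category codes that exist in only one of the datasets
codes_only_in_wave1 <- setdiff(unique(wave1$sex), unique(wave2$sex))

# 3. Units: very different ranges hint at different units of measurement
range_wave1 <- range(wave1$income)
range_wave2 <- range(wave2$income)

print(names_only_in_wave1)
print(names_only_in_wave2)
print(codes_only_in_wave1)
print(rbind(wave1 = range_wave1, wave2 = range_wave2))
```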
---
## Aggregating values by groups
- In R: `group_by` + `summarise`
- In Stata: `collapse`
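
A minimal sketch of the R approach, assuming the `dplyr` package is installed. The `households` data frame and its columns are hypothetical example data.

```r
library(dplyr)

# Hypothetical example data: one row per household
households <- data.frame(
  region = c("North", "North", "South", "South", "South"),
  income = c(30, 50, 20, 40, 90)
)

# One row per region: mean income and number of households
region_summary <- households %>%
  group_by(region) %>%
  summarise(
    mean_income  = mean(income),
    n_households = n()
  )

print(region_summary)
```

The result has one row per group instead of one row per household. A roughly equivalent Stata command for the mean would be `collapse (mean) income, by(region)`, which replaces the data in memory with the aggregated version.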