Econometrics

.title[
# Econometrics
]
.subtitle[
## Introduction
]
.author[
### Florian Oswald
]
.date[
### UniTo ESOMAS <br> 2025-10-09
]

---

# Welcome to Econometrics @ ESOMAS UniTo!

## Team

.pull-left[

* My name is Florian Oswald, and I'm a Professor at ESOMAS. Check out my [website](https://floswald.github.io)!

* I work on urban, macro and IO topics.
    
* Our TA this year is [Kacper Krasowski](https://kacperkrasowski.github.io), PhD student at Collegio Carlo Alberto.

]

.pull-right[
    
* I do a lot of computation (who doesn't). I like `R`, `python` and [`julia`](https://julialang.org) - I teach computational econ to our PhD students.

* I profited *a lot* from the open source software (OSS) community
* OSS is key to reproducible research.

👉 seeing that every day as [Data Editor](https://jpedataeditor.github.io)

* I try to use and teach my students tools which enable greater reproducibility.

]

---

# Welcome to Econometrics @ ESOMAS UniTo!

- In this course you will learn the core tools of ***econometrics***.
 
--

- You will also learn to use the `R` programming language!

## What is *econometrics*?

- A set of ***techniques and methods*** to answer (economic) questions with ***data***.

- Some examples!

---

# Answering Important Questions with Econometrics

[<ru-blockquote>
Does raising the minimum wage *reduce* employment for the low-skilled?
</ru-blockquote>](http://davidcard.berkeley.edu/papers/njmin-aer.pdf)

[<ru-blockquote>
Does mandating a 40% representation of each gender on the board of public limited liability companies increase the number of women in top jobs?
</ru-blockquote>](https://academic.oup.com/restud/article-abstract/86/1/191/5042274)

[<ru-blockquote>
Does the neighborhood you grew up in have an *impact* on your life outcomes?
</ru-blockquote>](https://academic.oup.com/qje/article/133/3/1107/4850660)

[<ru-blockquote>
Does giving a work permit to immigrants *cause* them to commit fewer crimes?
</ru-blockquote>](https://www.aeaweb.org/articles?id=10.1257/aer.20150355)

---

# Causality

* Notice that ***many other factors could have caused*** each of the outcomes mentioned.

* Often, we'll want to focus on the ***causal impact*** of just one of these factors (immigration, minimum wage, education, etc.).

* Econometrics is about spelling out ***conditions*** under which we can ***claim to measure causal relationships***.

* We will encounter the most basic of those conditions, and talk about some potential pitfalls.

* ["Credibility Revolution"](https://www.aeaweb.org/articles?id=10.1257/jep.24.2.3) in econometrics over the past 30 years ([2021 Economics Nobel](https://www.nobelprize.org/prizes/economic-sciences/2021/press-release/) awarded to some of the main protagonists of this "revolution").
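
A small simulation can make the first bullet concrete. The sketch below is purely illustrative: the variables `z`, `x` and `y` and all coefficients are made up, and the regression function `lm` is only used here as a black box.

```r
set.seed(123)                    # make the random draws reproducible
n <- 10000

z <- rnorm(n)                    # hypothetical "other factor" (a confounder)
x <- z + rnorm(n)                # factor of interest, correlated with z
y <- 1 * x + 2 * z + rnorm(n)    # by construction, the causal effect of x on y is 1

naive      <- coef(lm(y ~ x))["x"]      # ignores z
controlled <- coef(lm(y ~ x + z))["x"]  # holds z fixed

naive       # close to 2: mixes the effect of x with that of z
controlled  # close to 1: the effect we built into the data
```

In this made-up world we know the truth because we wrote it down. With real data we do not, which is why the conditions for a causal claim need to be spelled out.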

---

# Welcome to the Post-Truth Age - Bullshit Rules the World

.pull-left[
- "Alternative facts" [Kellyanne Conway, Meet the Press, January 22, 2017]

]

.pull-right[
![](../img/photos/alt-facts.png)

]

---

# Welcome to the Post-Truth Age - Bullshit Rules the World

.pull-left[
- "Alternative facts" [Kellyanne Conway, Meet the Press, January 22, 2017]

- "Trade wars are good and easy to win" [Donald Trump, 2018]

]

.pull-right[
![](../img/photos/trade-wars.png)

]

---

# Welcome to the Post-Truth Age - Bullshit Rules the World

.pull-left[
- "Alternative facts" [Kellyanne Conway, Meet the Press, January 22, 2017]

- "Trade wars are good and easy to win" [Donald Trump, 2018]

- "The concept of global warming was created by and for the Chinese" [Donald Trump, Twitter, 2012]

]

.pull-right[
![](../img/photos/chinese-hoax.png)

]

---

# Welcome to the Post-Truth Age - Bullshit Rules the World

.pull-left[
- "Alternative facts" [Kellyanne Conway, Meet the Press, January 22, 2017]

- "Trade wars are good and easy to win" [Donald Trump, 2018]

- "The concept of global warming was created by and for the Chinese" [Donald Trump, Twitter, 2012]

- Brexit: "We send the EU £350 million a week" [Vote Leave campaign bus, 2016]

]

.pull-right[
![](../img/photos/Vote_Leave_Brexit_Bus.jpg)

]

---

# Welcome to the Post-Truth Age - Bullshit Rules the World

.pull-left[
- "Alternative facts" [Kellyanne Conway, Meet the Press, January 22, 2017]

- "Trade wars are good and easy to win" [Donald Trump, 2018]

- "The concept of global warming was created by and for the Chinese" [Donald Trump, Twitter, 2012]

- Brexit: "We send the EU £350 million a week" [Vote Leave campaign bus, 2016]

]

.pull-right[
[<ru-blockquote>The law of Brandolini.</ru-blockquote>](https://en.wikipedia.org/wiki/Brandolini%27s_law)

]

---

# The Motto Of the USA (and This Course)

![](../img/photos/USD-1-2.jpg)

---

# The Motto Of the USA (and This Course)

![](../img/photos/USD-1.jpeg)

---

# The Motto Of the USA (and This Course)

## "In God We Trust, All Others Must Bring Data"
*—W. Edwards Deming (attribution uncertain, often credited to him)*

![:scale 60%](../img/photos/USD-1-small.jpeg)

---

# This Course

- Teach you the basics of ***linear regression***, ***statistical inference*** and ***impact evaluation***.

- Equip you with a framework to think more deeply about ***causality***.

- Introduce you to the `R` software environment.

- ⚠️ This is *not* a course about `R`.

**Grading. Two Options:**

.pull-left[
The **Good** Way:
* come to class
* take 5(?) quizzes on moodle during the semester (0% of grade).
* take closed book exam (100%) early December 2025.

]

.pull-right[
The **Other** Way:

* (come to class?)
* take closed book exam (100%) later.
* (do worse on the exam.)

]

---

# Communication: Slack

Questions like

* *I don't understand x*
* *y does not work for me*
* *when is the exam?*
* *can I come to office hours?*

will *only* be answered on Slack.

All other questions via email to

`florian.oswald@unito.it` or
`kacper.krasowski@carloalberto.org`

What is *Slack*??


---

# Your To-Do List for Tomorrow

1. Sign up on moodle: [https://elearning.unito.it/sme/course/view.php?id=8568](https://elearning.unito.it/sme/course/view.php?id=8568)

2. From moodle, sign up on slack

3. Update your laptop OS and install `R`: [https://cloud.r-project.org](https://cloud.r-project.org)

4. Install `RStudio` at [https://posit.co/download/rstudio-desktop/](https://posit.co/download/rstudio-desktop/)

---

# Class Conduct and My Expectations 🧐

1. Come to class: You will understand better.

2. Be on time, be polite, don't use your phone. `#respect #reciprocity`

3. Open/Close your laptop when I say so. (take notes on paper)

4. Ask questions ***any time*** by raising your hand.

5. Work in groups: You can/should work in groups of 2-3 on the quizzes.

6. Don't cheat in the exam. No phones. Penalties are severe.

---

# Notation in Slides

---

# Notation is Important

1. Simple Text

2. ***Important*** text is in italic red. **Very Important** text is in boldface red.

3. Maths looks like this: `$\int f(x) dx$`

4. `R` Code inline has pink background: `data(gapminder, package = "dslabs")`

---

# Notation is Important

1. Simple Text

2. ***Important*** text is in italic red. **Very Important** text is in boldface red.

3. Maths looks like this: `$\int f(x) dx$`

4. `R` Code inline has pink background: `data(gapminder, package = "dslabs")`

5. In-class tasks for you have a pink background. Do the tasks! 😉

---

# R

---

## What is `R`?

`R` is a __programming language__ with powerful statistical and graphic capabilities.

## Why are we using `R`?<sup>1</sup>

.footnote[
[1]: This list has been inspired by [Ed Rubin's](https://github.com/edrubin/EC421S19).  
<span style="visibility:hidden">[2]: Learning `R` definitely requires time and effort but it's worth it, trust me! .</span>
]

1. `R` is __free__ and __open source__—saving both you and the university 💰💵💰.

1. `R` is very __flexible and powerful__, adaptable to nearly any task (data cleaning, data visualization, econometrics, spatial data analysis, machine learning, web scraping, etc.).

1. `R` has a vibrant, [thriving online community](https://stackoverflow.com/questions/tagged/r) that will (almost) always have a solution to your problem.

1. If you put in the work<sup>2</sup>, you will come away with a __very valuable and useful__ tool.

.footnote[
<span style="visibility:hidden">[1]: This list has been inspired by [Ed Rubin's](https://github.com/edrubin/EC421S19).</span>  
[2]: Learning `R` definitely requires time and effort but it's worth it, trust me! 
]

???

* Single user Stata/SE annual license costs 485 USD for education.
* Student lab for 25 students costs 4135 USD per year.

---

# First Taste of R

---

# In Practice: Data Wrangling

* You will spend a lot of time preparing data for further analysis.

* The `gapminder` dataset contains data on life expectancy, GDP per capita and population by country between 1952 and 2007.

* Let's first discuss some basics, and then try to answer a simple question.

---

# Loading a Dataset

``` r
# load gapminder package
library(gapminder)
# load the dataset from the gapminder package
data(gapminder, package = "gapminder") 
# show first 10 lines of this dataframe
head(gapminder,n = 10)
```

```
## # A tibble: 10 × 6
##    country     continent  year lifeExp      pop gdpPercap
##    <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
##  1 Afghanistan Asia       1952    28.8  8425333      779.
##  2 Afghanistan Asia       1957    30.3  9240934      821.
##  3 Afghanistan Asia       1962    32.0 10267083      853.
##  4 Afghanistan Asia       1967    34.0 11537966      836.
##  5 Afghanistan Asia       1972    36.1 13079460      740.
##  6 Afghanistan Asia       1977    38.4 14880372      786.
##  7 Afghanistan Asia       1982    39.9 12881816      978.
##  8 Afghanistan Asia       1987    40.8 13867957      852.
##  9 Afghanistan Asia       1992    41.7 16317921      649.
## 10 Afghanistan Asia       1997    41.8 22227415      635.
```

---

# What is a *Dataset*?

## Cross Sectional Data

👉 **One index only:** `country`

.pull-left[
```
## # A tibble: 10 × 5
##    country      year lifeExp      pop gdpPercap
##    <fct>       <int>   <dbl>    <int>     <dbl>
##  1 Afghanistan  1952    28.8  8425333      779.
##  2 Albania      1952    55.2  1282697     1601.
##  3 Algeria      1952    43.1  9279525     2449.
##  4 Angola       1952    30.0  4232095     3521.
##  5 Argentina    1952    62.5 17876956     5911.
##  6 Australia    1952    69.1  8691212    10040.
##  7 Austria      1952    66.8  6927772     6137.
##  8 Bahrain      1952    50.9   120447     9867.
##  9 Bangladesh   1952    37.5 46886859      684.
## 10 Belgium      1952    68    8730405     8343.
```
]

.pull-right[
```
## # A tibble: 10 × 5
##    country      year lifeExp       pop gdpPercap
##    <fct>       <int>   <dbl>     <int>     <dbl>
##  1 Afghanistan  2007    43.8  31889923      975.
##  2 Albania      2007    76.4   3600523     5937.
##  3 Algeria      2007    72.3  33333216     6223.
##  4 Angola       2007    42.7  12420476     4797.
##  5 Argentina    2007    75.3  40301927    12779.
##  6 Australia    2007    81.2  20434176    34435.
##  7 Austria      2007    79.8   8199783    36126.
##  8 Bahrain      2007    75.6    708573    29796.
##  9 Bangladesh   2007    64.1 150448339     1391.
## 10 Belgium      2007    79.4  10392226    33693.
```
]

---

# What is a *Dataset*?

## Panel (or longitudinal) Data 👉 **two indices:** `country` and `year`

.pull-left[
```
## # A tibble: 18 × 6
##    country continent  year lifeExp        pop gdpPercap
##    <fct>   <fct>     <int>   <dbl>      <int>     <dbl>
##  1 India   Asia       1952    37.4  372000000      547.
##  2 India   Asia       1962    43.6  454000000      658.
##  3 India   Asia       1972    50.7  567000000      724.
##  4 India   Asia       1982    56.6  708000000      856.
##  5 India   Asia       1992    60.2  872000000     1164.
##  6 India   Asia       2002    62.9 1034172547     1747.
##  7 Italy   Europe     1952    65.9   47666000     4931.
##  8 Italy   Europe     1962    69.2   50843200     8244.
##  9 Italy   Europe     1972    72.2   54365564    12269.
## 10 Italy   Europe     1982    75.0   56535636    16537.
## 11 Italy   Europe     1992    77.4   56840847    22014.
## 12 Italy   Europe     2002    80.2   57926999    27968.
## 13 Poland  Europe     1952    61.3   25730551     4029.
## 14 Poland  Europe     1962    67.6   30329617     5339.
## 15 Poland  Europe     1972    70.8   33039545     8007.
## 16 Poland  Europe     1982    71.3   36227381     8452.
## 17 Poland  Europe     1992    71.0   38370697     7739.
## 18 Poland  Europe     2002    74.7   38625976    12002.
```

]

.pull-right[
* Can you tell me what the definition of an `index` is in this context?

* Why are `year` and `continent` together not a valid index?

* 🤔

]
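
One way to make the idea of an index concrete is to check whether a set of columns identifies every row uniquely. The sketch below assumes a hypothetical four-row mini-panel (values rounded from the table above) and a helper `is_valid_index` that is not part of any package.

```r
# Hypothetical mini-panel: two countries observed in two years
panel <- data.frame(
  country   = c("Italy", "Italy", "Poland", "Poland"),
  continent = c("Europe", "Europe", "Europe", "Europe"),
  year      = c(1952, 2002, 1952, 2002),
  lifeExp   = c(65.9, 80.2, 61.3, 74.7)
)

# TRUE if the chosen columns identify each row uniquely
is_valid_index <- function(df, cols) {
  !any(duplicated(df[, cols, drop = FALSE]))
}

is_valid_index(panel, c("country", "year"))    # TRUE
is_valid_index(panel, c("continent", "year"))  # FALSE
```

The second check fails because Italy and Poland share the same continent and year, so that pair of values points at two different rows.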

---

# What is a *Dataset*?

.pull-left[
### Italy

``` r
library(dplyr)  # provides %>% and filter()
gapminder %>% 
    filter(country == "Italy")
```

```
## # A tibble: 12 × 6
##    country continent  year lifeExp      pop gdpPercap
##    <fct>   <fct>     <int>   <dbl>    <int>     <dbl>
##  1 Italy   Europe     1952    65.9 47666000     4931.
##  2 Italy   Europe     1957    67.8 49182000     6249.
##  3 Italy   Europe     1962    69.2 50843200     8244.
##  4 Italy   Europe     1967    71.1 52667100    10022.
##  5 Italy   Europe     1972    72.2 54365564    12269.
##  6 Italy   Europe     1977    73.5 56059245    14256.
##  7 Italy   Europe     1982    75.0 56535636    16537.
##  8 Italy   Europe     1987    76.4 56729703    19207.
##  9 Italy   Europe     1992    77.4 56840847    22014.
## 10 Italy   Europe     1997    78.8 57479469    24675.
## 11 Italy   Europe     2002    80.2 57926999    27968.
## 12 Italy   Europe     2007    80.5 58147733    28570.
```
]

.pull-right[
``` r
gapminder %>% 
    filter(country == "Poland")
```

```
## # A tibble: 12 × 6
##    country continent  year lifeExp      pop gdpPercap
##    <fct>   <fct>     <int>   <dbl>    <int>     <dbl>
##  1 Poland  Europe     1952    61.3 25730551     4029.
##  2 Poland  Europe     1957    65.8 28235346     4734.
##  3 Poland  Europe     1962    67.6 30329617     5339.
##  4 Poland  Europe     1967    69.6 31785378     6557.
##  5 Poland  Europe     1972    70.8 33039545     8007.
##  6 Poland  Europe     1977    70.7 34621254     9508.
##  7 Poland  Europe     1982    71.3 36227381     8452.
##  8 Poland  Europe     1987    71.0 37740710     9082.
##  9 Poland  Europe     1992    71.0 38370697     7739.
## 10 Poland  Europe     1997    72.8 38654957    10160.
## 11 Poland  Europe     2002    74.7 38625976    12002.
## 12 Poland  Europe     2007    75.6 38518241    15390.
```
]

---

# In Practice: Data Wrangling

* Suppose we want to know the average life expectancy and average GDP per capita for each **continent** in each year.

* We need to group the data by continent *and* year, then compute the average life expectancy and average GDP per capita

* There are always several ways to achieve a goal. (As in life 😁)

* Here we will only focus on the `dplyr` way:

.pull-left[
``` r
# compute the required statistics
# average life exp and gdp per cap
gapminder_dplyr = gapminder %>% 
  group_by(continent, year) %>% 
  summarise(count = n(),
            mean_lifeexp = mean(lifeExp),
            mean_gdppercap = mean(gdpPercap))
```

]

.pull-right[
``` r
# show first 5 lines of this new dataset
head(gapminder_dplyr, n = 5)
```

```
## # A tibble: 5 × 5
## # Groups:   continent [1]
##   continent  year count mean_lifeexp mean_gdppercap
##   <fct>     <int> <int>        <dbl>          <dbl>
## 1 Africa     1952    52         39.1          1253.
## 2 Africa     1957    52         41.3          1385.
## 3 Africa     1962    52         43.3          1598.
## 4 Africa     1967    52         45.3          2050.
## 5 Africa     1972    52         47.5          2340.
```
]
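
To illustrate that there are several ways to reach the same goal, here is a sketch of the same "group, then average" logic in base R using `aggregate`. The four-row data frame `mini` is a hypothetical extract (1952 values rounded from the tables shown earlier), used only so the arithmetic can be checked by hand.

```r
# Hypothetical extract: four countries in 1952
mini <- data.frame(
  country   = c("Italy", "Poland", "India", "Afghanistan"),
  continent = c("Europe", "Europe", "Asia", "Asia"),
  year      = c(1952, 1952, 1952, 1952),
  lifeExp   = c(65.9, 61.3, 37.4, 28.8),
  gdpPercap = c(4931, 4029, 547, 779)
)

# Average both variables within each continent-year group
means <- aggregate(cbind(lifeExp, gdpPercap) ~ continent + year,
                   data = mini, FUN = mean)
means
```

By hand, the Asia average life expectancy is (37.4 + 28.8) / 2 = 33.1 and the Europe average GDP per capita is (4931 + 4029) / 2 = 4480, which is what `means` contains.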

---

# Visualisation

.pull-left[
* Now we could *look* at the result in `gapminder_dplyr`, or compute some statistics from it.

* Nothing beats a picture, though:

``` r
library(ggplot2)  # provides ggplot(), aes(), geom_point(), ...
ggplot(data = gapminder_dplyr, 
       mapping = aes(x = mean_lifeexp,
                     y = mean_gdppercap,
                     color = continent,
                     size = count)) +
  geom_point(alpha = 1/2) +
  labs(x = "Average life expectancy",
       y = "Average GDP per capita",
       color = "Continent",
       size = "Nb of countries") +
  theme_bw()
```
]

.pull-right[
<img src="chapter_intro_files/figure-html/gampminder_plot-1.svg" style="display: block; margin: auto;" />
]

???

* We map different features of the data to different ways of representing it
* color
* size of point
* different scales for each

---

# Animated Plotting 👌 <sup>1</sup>

.footnote[
[1]: This animation is taken from [Ed Rubin](https://raw.githack.com/edrubin/EC421S19/master/LectureNotes/01Intro/01_intro.html#40).
]

---

# R 101: Here Is Where You Start

---

# Start your `RStudio`!

## First Glossary of Terms

* `R`: a programming language.

* `RStudio`: an integrated development environment (IDE) to work with `R`.

* *command*: user input (text or numbers) that `R` *understands*.

* *script*: a list of commands collected in a text file, each separated by a new line, to be run one after the other.

* To run a script, you need to highlight the relevant code lines and hit `Ctrl`+`Enter` (Windows) or `Cmd`+`Enter` (Mac).

---

# `RStudio` Layout

---

# R as a Calculator

* You can use the `R` console like a calculator

* Just type an arithmetic operation after `>` and hit `Enter`!

* Some basic arithmetic first:

``` r
4 + 1
```

```
## [1] 5
```

``` r
8 / 2
```

```
## [1] 4
```

* Great! What about this?

``` r
2^3
```

```
## [1] 8
```

``` r
# by the way: this is a comment! R therefore disregards it
```

---

# Task 1

1. Create a new R script (File `$\rightarrow$` New File `$\rightarrow$` R Script). Save it somewhere as `lecture_intro.R`.

1. Type the following code in your script and run it. To run the code press `Ctrl` or `Cmd` + `Enter` (you can either highlight the code or just put your cursor at the end of the line)
    
    ``` r
    4 * 8
    ```

1. Type the following code in your script and run it. What happens if you only run the first line of the code?
    
    ``` r
    x = 5 # equivalently x <- 5
    x
    ```
Congratulations, you have created your first `R` "object"! Everything is an object in R! Objects are assigned using `=` or `<-`.

1. Create a new object named `x_3` to which you assign the cube of `x`. Note that to assign you need to use `=` or `<-`. Use code to compute the cube, not a calculator.

---

# Where to get Help?

``` r
?log       # put ? in front of a function name
help(lm)   # help() is equivalent
??plot     # search all help pages for the keyword "plot"
```

---

# Collaborate!

---

# R Packages

* `R` users contribute add-on data and functions as *packages*

* Installing packages is easy! Just use the `install.packages` function:
    
    ``` r
    install.packages("ggplot2")
    ```

* To *use* the contents of a package, we must load it from our library using `library`:
    
    ``` r
    library(ggplot2)
    ```
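
Installing is needed only once per computer, while loading is needed in every new `R` session. A common pattern, sketched here with a hypothetical helper called `load_or_install`, installs a package only if it is missing and then loads it.

```r
# Hypothetical helper: load a package, installing it first only if it is missing
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call never triggers an installation
load_or_install("stats")
```

Calling `load_or_install("ggplot2")` would work the same way, downloading the package on the first call only.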

---

# Data *Types*. What kinds of **Data** are there actually?

👉 Numbers, text, categories, images, ...

Unfortunately, your (mine, everybody's) computer only "speaks" `0` and `1`. That's why software **encodes** different kinds of data differently:

<br>

| Data    | R Type                           | Binary Encoding |
|---------|----------------------------------|-----------------------------|
| `42`      | double                   | `101010` (integer in base-2)|
| `"A"`     | character                  | `01000001` (ASCII)          |
| `TRUE`    | logical                 | `1`                         |
| `FALSE`   | logical                | `0`                         |
| `factor("Male")`   | integer   | `00000001` |
| `factor("Female")` | integer | `00000010` |
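
You can ask `R` directly how it stores a value. The sketch below uses only base functions. The helper `to_bits` is a hypothetical convenience that prints the lowest 8 bits of an integer, and the factor is built with an explicit level order so that the codes match the table.

```r
typeof(42)      # "double": plain numbers are stored as doubles
typeof(42L)     # "integer": the L suffix requests an integer
typeof("A")     # "character"
typeof(TRUE)    # "logical"

# A factor stores integer codes plus a table of labels (the levels)
sex <- factor(c("Male", "Female"), levels = c("Male", "Female"))
typeof(sex)      # "integer"
as.integer(sex)  # 1 2
levels(sex)      # "Male" "Female"

# Hypothetical helper: lowest 8 bits of an integer, most significant bit first
to_bits <- function(n) {
  paste(rev(as.integer(intToBits(n))[1:8]), collapse = "")
}
to_bits(42L)             # "00101010"
to_bits(utf8ToInt("A"))  # "01000001"
```

Without the `levels` argument, `R` orders the levels alphabetically, so `"Female"` would get code 1 and `"Male"` code 2.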

---

# Vectors

.pull-left[
* The `c` function creates vectors, i.e. *one-dimensional arrays*.
    
    ``` r
    c(1, 3, 5, 7, 8, 9)
    ```
    
    ```
    ## [1] 1 3 5 7 8 9
    ```
    
* Coercion to a single type (a vector can hold only one type):
    
    ``` r
    (v <- c(42, "Statistics", TRUE))
    ```
    
    ```
    ## [1] "42"         "Statistics" "TRUE"
    ```
]

.pull-right[
* Creating a *range*:
    
    ``` r
    1:10
    ```
    
    ```
    ##  [1]  1  2  3  4  5  6  7  8  9 10
    ```

* Get vector elements with the square bracket operator `[index]`:
    
    ``` r
    v[c(1,3)]
    ```
    
    ```
    ## [1] "42"   "TRUE"
    ```
]
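
A few more cases, as a sketch, show which type wins when values are mixed and how else the square bracket operator can be used. The vector `x` is a made-up example.

```r
# When types are mixed, R picks the most general one
typeof(c(TRUE, 2L))      # "integer"
typeof(c(2L, 3.5))       # "double"
typeof(c(3.5, "text"))   # "character"

x <- c(10, 20, 30, 40)   # hypothetical example vector
x[2]        # 20: R starts counting at 1
x[2:3]      # 20 30: a range of positions
x[-1]       # 20 30 40: a negative index drops that element
x[x > 15]   # 20 30 40: keep elements that satisfy a condition
```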

---

# `data.frame`s

`data.frame`s represent **tabular data**. Like spreadsheets.

``` r
example_data = data.frame(x = c(1, 3, 5, 7),
                          y = c(rep("Hello", 3), "Goodbye"),
                          z = c("one", 2, "three", 4))
example_data
```

```
##   x       y     z
## 1 1   Hello   one
## 2 3   Hello     2
## 3 5   Hello three
## 4 7 Goodbye     4
```

* A `data.frame` has 2 dimensions: *rows* and *columns*. Like a *matrix*. You can get elements with `[row_index,col_index]`.

* In practice, you will be importing files that contain the data into `R` rather than creating `data.frame`s by hand.
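
The `[row_index,col_index]` rule can be tried out on `example_data`. The sketch recreates the data frame so that it runs on its own. Setting `stringsAsFactors = FALSE` is an assumption added here so that text columns stay text in older versions of `R` as well.

```r
example_data <- data.frame(x = c(1, 3, 5, 7),
                           y = c(rep("Hello", 3), "Goodbye"),
                           z = c("one", 2, "three", 4),
                           stringsAsFactors = FALSE)

dim(example_data)       # 4 3: four rows, three columns
example_data[4, 2]      # "Goodbye": row 4, column 2
example_data[2, ]       # leaving the column index empty returns the whole row
example_data[, "x"]     # 1 3 5 7: a column can also be picked by name
class(example_data$z)   # "character": mixing text and numbers in c() gives text
```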

---

# Go to https://tinyurl.com/metrics-task2

### You need some real data for the next task.

---

# Task 2

1. Find out (using `help()` or google) how to import a `.csv` file. Do NOT use the "Import Dataset" button, nor install a package.

1. Import [gun_murders.csv](https://www.dropbox.com/scl/fi/uq8xlecjczy2t2vu50h7l/gun_murders.csv?rlkey=4zr1t5o7jsi9pgoey4tep467w&dl=1)<sup>1</sup> in a new object `murders`. This file contains data on gun murders by US state in 2010. (Hint: objects are created using `=` or `<-`).

1. Ensure that `murders` is a data.frame by running:
    
    ``` r
    class(murders) # check class
    ```

1. Find out what variables are contained in `murders` by running:
    
    ``` r
    names(murders) # obtain variable names
    ```

1. View the contents of `murders` by clicking on `murders` in your workspace. What does the `total` variable correspond to?

---

# `data.frame`s

Useful functions to describe a dataframe:

``` r
str(murders) # `str` describes structure of any R object
```

```
## 'data.frame':	51 obs. of  5 variables:
##  $ state     : chr  "Alabama" "Alaska" "Arizona" "Arkansas" ...
##  $ abb       : chr  "AL" "AK" "AZ" "AR" ...
##  $ region    : chr  "South" "West" "West" "South" ...
##  $ population: int  4779736 710231 6392017 2915918 37253956 5029196 3574097 897934 601723 19687653 ...
##  $ total     : int  135 19 232 93 1257 65 97 38 99 669 ...
```

``` r
names(murders) # column names
```

```
## [1] "state"      "abb"        "region"     "population" "total"
```

``` r
nrow(murders) # number of rows
```

```
## [1] 51
```

``` r
ncol(murders) # number of columns
```

```
## [1] 5
```

---

# Accessing `data.frame` Columns
    
* To extract one column **as a vector** we can use the `$` operator (as in `murders$state`), or the square bracket operator `[which_index]` with name or position index:
    
    ``` r
    first5 <- murders[1:5, ]  # take first 5 states only
    first5$state  # extract with $ operator
    ```
    
    ```
    ## [1] "Alabama"    "Alaska"     "Arizona"    "Arkansas"   "California"
    ```
    
    ``` r
    first5[ ,"state"]  # extract with column name
    ```
    
    ```
    ## [1] "Alabama"    "Alaska"     "Arizona"    "Arkansas"   "California"
    ```
    
    ``` r
    first5[ ,1] # get first column
    ```
    
    ```
    ## [1] "Alabama"    "Alaska"     "Arizona"    "Arkansas"   "California"
    ```

.pull-left[
* Check `class` of an object:
    
    ``` r
    class(murders)
    ```
    
    ```
    ## [1] "data.frame"
    ```
]

.pull-right[
* `typeof` gives the R-internal data type:
    
    ``` r
    typeof(murders)
    ```
    
    ```
    ## [1] "list"
    ```
]
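
The fact that `typeof` reports `"list"` explains how the extraction operators behave: a `data.frame` is a list of equally long columns. The sketch below rebuilds the first three rows of `murders` by hand (values copied from the output shown earlier) so that it runs without the file.

```r
# First three states, typed in by hand from the slides
first3 <- data.frame(state = c("Alabama", "Alaska", "Arizona"),
                     total = c(135L, 19L, 232L),
                     stringsAsFactors = FALSE)

typeof(first3)             # "list"
length(first3)             # 2: one list element per column
class(first3["state"])     # "data.frame": single brackets keep the table
class(first3[["state"]])   # "character": double brackets return the vector
identical(first3[["state"]], first3$state)  # TRUE
```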

---

# Subsetting `data.frame`s

* Subsetting a data.frame: `murders[row condition, column number]` or `murders[row condition, "column name"]`
    
    ``` r
    # Only keep states with over 500 gun murders and keep only the "state" and "total" variables
    murders[murders$total > 500, c("state", "total")]
    ```
    
    ```
    ##         state total
    ## 5  California  1257
    ## 10    Florida   669
    ## 33   New York   517
    ## 44      Texas   805
    ```
    
    ``` r
    # Only keep California and Texas and keep only the "state" and "total" variables
    murders[murders$state %in% c("California", "Texas"), c("state", "total")]
    ```
    
    ```
    ##         state total
    ## 5  California  1257
    ## 44      Texas   805
    ```
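
Row conditions can be combined with `&` (and) and `|` (or). The sketch below uses the first five states typed in by hand from the output shown earlier, so it runs without the file. The thresholds are arbitrary and chosen only for illustration.

```r
first5 <- data.frame(
  state      = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  population = c(4779736L, 710231L, 6392017L, 2915918L, 37253956L),
  total      = c(135L, 19L, 232L, 93L, 1257L),
  stringsAsFactors = FALSE
)

# Both conditions must hold: more than 100 murders AND fewer than 10 million people
both <- first5[first5$total > 100 & first5$population < 10e6, c("state", "total")]
both$state     # "Alabama" "Arizona"

# At least one condition must hold
either <- first5[first5$total > 1000 | first5$population < 1e6, "state"]
either         # "Alaska" "California"

# subset() is a base R shortcut that avoids repeating the data frame name
same <- subset(first5, total > 100 & population < 10e6, select = c(state, total))
same$state     # "Alabama" "Arizona"
```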

---

# Task 3

1. How many observations are there in `murders`?

1. How many variables? What are the data types of each variable?

1. Remember that the colon operator `1:10` is just short for *construct a sequence from `1` to `10`* (i.e. 1, 2, 3, etc). Create a new object `murders_2` containing the rows 10 to 25 of `murders`.

1. Create a new object `murders_3` which only contains the columns `state` and `total`. (Recall that `c` creates vectors.)

1. Create a `total_percap` variable equal to the number of murders per 10,000 inhabitants by running the following code.
    
    ``` r
    murders$total_percap = (murders$total / murders$population) * 10000
    ```

Congratulations, you've created your first variable! Click on the `murders` object to see the new variable.
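
Why bother with a per capita measure? A sketch on the first five states (typed in by hand from the output shown earlier, so it runs without the file) suggests that the ranking of states can change once population is taken into account.

```r
first5 <- data.frame(
  state      = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  population = c(4779736L, 710231L, 6392017L, 2915918L, 37253956L),
  total      = c(135L, 19L, 232L, 93L, 1257L),
  stringsAsFactors = FALSE
)

# Same formula as above: murders per 10,000 inhabitants
first5$total_percap <- (first5$total / first5$population) * 10000

# order() returns the row positions that sort a vector; the minus sign sorts descending
by_total  <- first5$state[order(-first5$total)]
by_percap <- first5$state[order(-first5$total_percap)]

by_total[1]    # "California": most murders in absolute terms
by_percap[1]   # "Arizona": most murders relative to population
```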

---

class: title-slide-final, middle
background-image: url(../img/logo/esomas.png)
background-size: 250px
background-position: 9% 19%

# That's it for this lesson!

|                                                                                                            |                                   |
| :--------------------------------------------------------------------------------------------------------- | :-------------------------------- |
| <a href="https://github.com/floswald/Econometrics-Slides">.ScPored[<i class="fa fa-link fa-fw"></i>]</a> | Slides |
| <a href="https://floswald.github.io">.ScPored[<i class="fa fa-link fa-fw"></i>]</a> | My Homepage |
| <a href="https://scpoecon.github.io/Econometrics/">.ScPored[<i class="fa fa-github fa-fw"></i>]</a> | Book |