In this problem set, we will use data from the unvotes package. It contains the voting history of all countries in the UN General Assembly.
Here is a description of the data set:
variable | description |
---|---|
rcid | The roll call ID; an identifier for each vote; used to join with un_votes and un_roll_call_issues |
country | Country name, by official English short name |
country_code | 2-character ISO country code |
vote | Vote result as a factor of yes/abstain/no |
session | Session number. The UN holds one session per year; these started in 1946 |
importantvote | Whether the vote was classified as important by the U.S. State Department report “Voting Practices in the United Nations”. These classifications began with session 39 |
date | Date of the vote, as a Date vector |
unres | Resolution code |
amend | Whether the vote was on an amendment; coded only until 1985 |
para | Whether the vote was only on a paragraph and not a resolution; coded only until 1985 |
short | Short description |
descr | Longer description |
short_name | Two-letter issue codes |
issue | Descriptive issue name, 6 issues |
These are panel data (also called time-series cross-sectional or longitudinal data). That is, we observe our observational units, in this case countries, over time.
For non-political scientists, here is a bit of a longer description of the data:
The UN General Assembly is the main deliberative, policy-making, and representative institution of the UN. Since 1946, the UN member states (almost every country in the world) have met every year in so-called “sessions”, which last a couple of months, to vote on recommendations on peace, economic development, disarmament, human rights, etc. Each country has one vote and can vote “yes” or “no”, or can “abstain” (roll call; see `rcid`). In each year (session-year), multiple issues are voted upon on several occasions (see the variable `date`). The broader issue category is measured by the variable `issue`.
Source: tidytuesday.
In a first step, we import the data. Thanks to the `rio` package, this is super easy. You don’t have to be afraid of missing anything by sticking to this comfort wrapper. It’s just data importing; you don’t need to memorize five or more packages when you can learn one in the beginning. I also use this package all the time.
Our data is stored as a `.parquet` file and is not exactly small, with almost 1 million observations. The `.csv` would have been about 200 MB, while the `.parquet` file is under 1 MB. The power of compression.
unvotes <- rio::import(here("Data", "unvotes.parquet"))
A note on the “here” package:
We use R projects to organize our analysis. This is nice, as clicking on the `.rproj` file initializes an R project, which sets the working directory to the project root (the folder where the `.rproj` file lives). We can then use relative file paths to point R to the files we want to import/use in our script.
However, when you work across different operating systems (e.g., Windows and Mac), relative file paths may not work properly. Moreover, `.Rmd` files such as this one won’t work with relative file paths when you knit the document (via the blue button above). This is because knitting sets the working directory to the folder where the `.Rmd` file is placed by default. You can reset this default via:
knitr::opts_knit$set(root.dir = normalizePath('../')) # for windows.
Because of all this, the `here` package exists. It makes relative file paths just work.
So instead of passing a relative file path to the file-importing function of your choice, you wrap it in the `here()` function. Inside the function (starting from the project root, see the example above), you simply pass the folder (and sub-folders if they exist) and the file name as strings separated by commas. Easy and robust!
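As a small illustration of the difference, the sketch below builds the same path twice: once as a plain relative path with base R's `file.path()`, and once with `here::here()`. The folder and file names are taken from the import call above; the `requireNamespace()` guard is only there so the snippet also runs when the `here` package is not installed, and the absolute path that `here()` returns depends on where your project lives.

```r
# Plain relative path: resolved against whatever the working directory is.
rel_path <- file.path("Data", "unvotes.parquet")
rel_path

# here::here() starts from the project root instead, so the result does
# not depend on the current working directory or the knitting directory.
if (requireNamespace("here", quietly = TRUE)) {
  print(here::here("Data", "unvotes.parquet"))
}
```

Both calls take the path components as separate strings, so you never have to type a path separator yourself.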
Familiarize yourself with the data. What’s each variable’s type/class?
str(unvotes)
## 'data.frame': 857878 obs. of 14 variables:
## $ rcid : int 6 6 6 6 6 6 6 6 6 6 ...
## $ country : chr "United States" "Canada" "Cuba" "Dominican Republic" ...
## $ country_code : chr "US" "CA" "CU" "DO" ...
## $ vote : chr "no" "no" "yes" "abstain" ...
## $ session : int 1 1 1 1 1 1 1 1 1 1 ...
## $ importantvote: int 0 0 0 0 0 0 0 0 0 0 ...
## $ date : Date, format: "1946-01-04" "1946-01-04" ...
## $ unres : chr "R/1/107" "R/1/107" "R/1/107" "R/1/107" ...
## $ amend : int 0 0 0 0 0 0 0 0 0 0 ...
## $ para : int 0 0 0 0 0 0 0 0 0 0 ...
## $ short : chr "DECLARATION OF HUMAN RIGHTS" "DECLARATION OF HUMAN RIGHTS" "DECLARATION OF HUMAN RIGHTS" "DECLARATION OF HUMAN RIGHTS" ...
## $ descr : chr "TO ADOPT A CUBAN PROPOSAL (A/3-C) THAT AN ITEM ON A DECLARATION OF THE RIGHTS AND DUTIES OF MAN BE TABLED." "TO ADOPT A CUBAN PROPOSAL (A/3-C) THAT AN ITEM ON A DECLARATION OF THE RIGHTS AND DUTIES OF MAN BE TABLED." "TO ADOPT A CUBAN PROPOSAL (A/3-C) THAT AN ITEM ON A DECLARATION OF THE RIGHTS AND DUTIES OF MAN BE TABLED." "TO ADOPT A CUBAN PROPOSAL (A/3-C) THAT AN ITEM ON A DECLARATION OF THE RIGHTS AND DUTIES OF MAN BE TABLED." ...
## $ short_name : chr "hr" "hr" "hr" "hr" ...
## $ issue : chr "Human rights" "Human rights" "Human rights" "Human rights" ...
glimpse(unvotes)
## Rows: 857,878
## Columns: 14
## $ rcid <int> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,~
## $ country <chr> "United States", "Canada", "Cuba", "Dominican Republic",~
## $ country_code <chr> "US", "CA", "CU", "DO", "MX", "GT", "HN", "SV", "NI", "P~
## $ vote <chr> "no", "no", "yes", "abstain", "yes", "no", "yes", "absta~
## $ session <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,~
## $ importantvote <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ date <date> 1946-01-04, 1946-01-04, 1946-01-04, 1946-01-04, 1946-01~
## $ unres <chr> "R/1/107", "R/1/107", "R/1/107", "R/1/107", "R/1/107", "~
## $ amend <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ para <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,~
## $ short <chr> "DECLARATION OF HUMAN RIGHTS", "DECLARATION OF HUMAN RIG~
## $ descr <chr> "TO ADOPT A CUBAN PROPOSAL (A/3-C) THAT AN ITEM ON A DEC~
## $ short_name <chr> "hr", "hr", "hr", "hr", "hr", "hr", "hr", "hr", "hr", "h~
## $ issue <chr> "Human rights", "Human rights", "Human rights", "Human r~
What is the level of observation of this data set?
Is this data set tidy?
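One way to approach the level-of-observation question is to test whether a candidate set of variables identifies each row uniquely. The sketch below does this on made-up toy data (not the real `unvotes` data); the repeated roll call mimics the US rows printed further down, where `rcid` 11 appears twice. Whether the same pattern holds in the full data is for you to check.

```r
library(dplyr)

# Toy data (made up): country A appears twice for roll call 11,
# once per issue the roll call is tagged with.
toy <- tibble(
  rcid    = c(6, 11, 11, 6),
  country = c("A", "A", "A", "B"),
  issue   = c("Human rights", "Colonialism", "Human rights", "Human rights")
)

# Candidate key 1: rcid + country. Any n > 1 means the key is not unique.
dup_rc <- toy %>%
  count(rcid, country) %>%
  filter(n > 1)

# Candidate key 2: rcid + country + issue.
dup_rci <- toy %>%
  count(rcid, country, issue) %>%
  filter(n > 1)

nrow(dup_rc)   # 1: rcid + country does not identify rows in the toy data
nrow(dup_rci)  # 0: adding issue makes every toy row unique
```

If a candidate key leaves no duplicates, its variables describe the level of observation.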
I generate some random data:
# generate 10000 random observations drawn from a normal distribution (mean = 0, sd = 1).
x <- rnorm(10000)
Re-write the following code with a pipe:
# 1:
# Kernel density plot made via Base R's plotting functions for your understanding.
# Base R's plotting functions are pretty mediocre, though. We will learn a better way soon: ggplot2.
plot(density((x)))
# 2:
round(log(sqrt(abs(exp(x)))))
# 3:
# Pipe out 1 instead of x this time.
round(x, digits = 1)
Your Answer:
# 1:
x %>%
density %>%
plot()
# 2:
x %>%
exp() %>%
abs() %>%
sqrt() %>%
log() %>%
round() %>%
head()
## [1] 0 1 0 1 0 0
# 3:
1 %>%
round(x, digits = .) %>%
head()
## [1] -0.6 2.5 -0.8 1.9 -0.8 -0.3
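The third answer relies on the dot placeholder of the magrittr pipe. As a minimal sketch on a small toy vector (the seed and the name `y` are arbitrary choices for illustration), the two calls below give the same result:

```r
library(magrittr)

set.seed(1)      # arbitrary seed, only so the toy vector is reproducible
y <- rnorm(5)

# By default, the left-hand side becomes the first argument:
a <- y %>% round(digits = 1)      # same as round(y, digits = 1)

# With the dot, the left-hand side goes wherever the dot is placed:
b <- 1 %>% round(y, digits = .)   # same as round(y, digits = 1)

identical(a, b)
```

Piping the data and leaving the options as ordinary arguments is usually easier to read; the dot is mainly useful when the data is not the first argument of a function.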
Filter the `unvotes` data frame such that you obtain observations only from the US. Bind the resulting object to a new name, `unvotes_us` (this would be the accurate way of phrasing assignment; for simplicity, you can also read this as “create a new object named `unvotes_us`”). Do this with AND without pipes!
Also, make the `data.frame` a `tibble` and print it to the console.
Note: If you need help, consult the slides or the help files via `?FUNCTIONNAME`. Check if you understood data masking.
unvotes_us <- unvotes %>%
filter(country == "United States") %>%
as_tibble()
unvotes_us <- as_tibble(filter(unvotes, country == "United States"))
unvotes_us
## # A tibble: 5,718 x 14
## rcid country country_code vote session importantvote date unres amend
## <int> <chr> <chr> <chr> <int> <int> <date> <chr> <int>
## 1 6 United~ US no 1 0 1946-01-04 R/1/~ 0
## 2 8 United~ US no 1 0 1946-01-05 R/1/~ 1
## 3 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 4 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 5 18 United~ US no 1 0 1946-02-03 R/1/~ 1
## 6 19 United~ US yes 1 0 1946-02-03 R/1/~ 0
## 7 24 United~ US yes 1 0 1946-12-05 R/1/~ 0
## 8 26 United~ US no 1 0 1946-12-06 R/1/~ 0
## 9 27 United~ US yes 1 0 1946-12-06 R/1/~ 0
## 10 28 United~ US yes 1 0 1946-12-06 R/1/~ 0
## # ... with 5,708 more rows, and 5 more variables: para <int>, short <chr>,
## # descr <chr>, short_name <chr>, issue <chr>
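To make the data masking point concrete, here is a minimal sketch on made-up toy data. Inside `filter()`, a bare name such as `country` is looked up in the data frame first, which is why no `toy$` prefix is needed. The object names (`toy`, `base_us`, `dplyr_us`, `masked`) are chosen for illustration only.

```r
library(dplyr)

# Toy data (made up)
toy <- tibble(
  country = c("United States", "Canada", "United States"),
  vote    = c("no", "yes", "yes")
)

# Base R: the column has to be reached through the data frame.
base_us <- toy[toy$country == "United States", ]

# dplyr: inside filter(), `country` refers to the column of `toy`.
dplyr_us <- filter(toy, country == "United States")

# If an object in your environment has the same name as a column,
# the column wins; .env$ reaches the environment object explicitly.
vote <- "yes"
masked <- filter(toy, vote == .env$vote)
```

Without `.env$`, the comparison `vote == vote` would compare the column with itself and keep every row.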
Using `unvotes_us`, “collapse” the data to a tibble showing the number of “yes” votes (pooled over all years). For a hint, scroll to the bottom of this document.
unvotes_us %>%
filter(vote == "yes") %>%
summarise(n_yes = n())
## # A tibble: 1 x 1
## n_yes
## <int>
## 1 1184
“Overwrite” `unvotes_us` and create a new variable `year` holding the year of the roll call.
unvotes_us <- unvotes_us %>%
mutate(year = lubridate::year(date))
Select the `year`, `rcid`, `descr`, and `issue` variables and sort the tibble by `issue`.
unvotes_us %>%
select(year, rcid, descr, issue) %>%
arrange(issue)
## # A tibble: 5,718 x 4
## year rcid descr issue
## <dbl> <int> <chr> <chr>
## 1 1948 85 TO ADOPT USSR DRAFT RESOL. (A/C.1/310) POSTPO~ Arms control and ~
## 2 1948 94 TO ADOPT PARAGRAPH 1 OF USSR DRAFT RESOL. (A/~ Arms control and ~
## 3 1948 95 TO ADOPT PARAGRAPH 2 OF USSR DRAFT RESOL. (A/~ Arms control and ~
## 4 1948 96 TO ADOPT PARAGRAPH 3 OF USSR DRAFT RESOL. (A/~ Arms control and ~
## 5 1948 97 TO ADOPT PARAGRAPH 4 OF USSR DRAFT RESOL. (A/~ Arms control and ~
## 6 1948 98 TO ADOPT PARAGRAPH 5 OF THE USSR DRAFT RESOL.~ Arms control and ~
## 7 1948 99 TO ADOPT PARAGRAPH 6 OF USSR DRAFT RESOL. (A/~ Arms control and ~
## 8 1948 100 TO ADOPT PARAGRAPH 7 OF THE USSR DRAFT RESOL.~ Arms control and ~
## 9 1948 101 TO ADOPT PARAGRAPH 8 OF THE USSR DRAFT RESOL.~ Arms control and ~
## 10 1949 198 TO ADOPT PARAGRAPH 1 OF THE USSR DRAFT RESOLU~ Arms control and ~
## # ... with 5,708 more rows
Select everything but the `year` variable (Hint: read the `select` help file; there is a very short way to do this!).
unvotes_us %>%
select(-year)
## # A tibble: 5,718 x 14
## rcid country country_code vote session importantvote date unres amend
## <int> <chr> <chr> <chr> <int> <int> <date> <chr> <int>
## 1 6 United~ US no 1 0 1946-01-04 R/1/~ 0
## 2 8 United~ US no 1 0 1946-01-05 R/1/~ 1
## 3 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 4 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 5 18 United~ US no 1 0 1946-02-03 R/1/~ 1
## 6 19 United~ US yes 1 0 1946-02-03 R/1/~ 0
## 7 24 United~ US yes 1 0 1946-12-05 R/1/~ 0
## 8 26 United~ US no 1 0 1946-12-06 R/1/~ 0
## 9 27 United~ US yes 1 0 1946-12-06 R/1/~ 0
## 10 28 United~ US yes 1 0 1946-12-06 R/1/~ 0
## # ... with 5,708 more rows, and 5 more variables: para <int>, short <chr>,
## # descr <chr>, short_name <chr>, issue <chr>
unvotes_us %>%
select(!year)
## # A tibble: 5,718 x 14
## rcid country country_code vote session importantvote date unres amend
## <int> <chr> <chr> <chr> <int> <int> <date> <chr> <int>
## 1 6 United~ US no 1 0 1946-01-04 R/1/~ 0
## 2 8 United~ US no 1 0 1946-01-05 R/1/~ 1
## 3 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 4 11 United~ US yes 1 0 1946-02-05 R/1/~ 0
## 5 18 United~ US no 1 0 1946-02-03 R/1/~ 1
## 6 19 United~ US yes 1 0 1946-02-03 R/1/~ 0
## 7 24 United~ US yes 1 0 1946-12-05 R/1/~ 0
## 8 26 United~ US no 1 0 1946-12-06 R/1/~ 0
## 9 27 United~ US yes 1 0 1946-12-06 R/1/~ 0
## 10 28 United~ US yes 1 0 1946-12-06 R/1/~ 0
## # ... with 5,708 more rows, and 5 more variables: para <int>, short <chr>,
## # descr <chr>, short_name <chr>, issue <chr>
Take the entire data frame and create a new variable that holds, for each country, the number of yes votes. Tip: to check if this worked fine, `arrange()` by country.
unvotes <- unvotes %>%
group_by(country) %>%
mutate(yes_votes = sum(vote == "yes", na.rm = TRUE)) %>%
arrange(country)
unvotes
## # A tibble: 857,878 x 15
## # Groups: country [200]
## rcid country country_code vote session importantvote date unres amend
## <int> <chr> <chr> <chr> <int> <int> <date> <chr> <int>
## 1 24 Afghan~ AF yes 1 0 1946-12-05 R/1/~ 0
## 2 35 Afghan~ AF yes 1 0 1946-12-07 R/1/~ 0
## 3 36 Afghan~ AF abst~ 1 0 1946-12-07 R/1/~ 0
## 4 37 Afghan~ AF abst~ 1 0 1946-12-07 R/1/~ 1
## 5 37 Afghan~ AF abst~ 1 0 1946-12-07 R/1/~ 1
## 6 38 Afghan~ AF abst~ 1 0 1946-12-07 R/1/~ 1
## 7 39 Afghan~ AF abst~ 1 0 1946-12-07 R/1/~ 0
## 8 41 Afghan~ AF no 1 0 1946-12-01 R/1/~ 1
## 9 41 Afghan~ AF no 1 0 1946-12-01 R/1/~ 1
## 10 42 Afghan~ AF yes 1 0 1946-12-01 R/1/~ 0
## # ... with 857,868 more rows, and 6 more variables: para <int>, short <chr>,
## # descr <chr>, short_name <chr>, issue <chr>, yes_votes <int>
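Note the `# Groups: country [200]` line in the output: the result of `group_by()` followed by `mutate()` stays grouped. The sketch below, on made-up toy data, illustrates what that means for later verbs and how `ungroup()` changes the result. It is an illustration of dplyr's grouping behaviour, not part of the original answer.

```r
library(dplyr)

# Toy data (made up)
toy <- tibble(
  country = c("A", "A", "B", "B", "B"),
  vote    = c("yes", "no", "yes", "yes", "abstain")
)

# group_by() + mutate() keeps every row and adds the group-level count ...
toy_g <- toy %>%
  group_by(country) %>%
  mutate(yes_votes = sum(vote == "yes"))

# ... but the result is still grouped, so later verbs work within country:
count(toy_g, vote)                    # one row per country-vote combination

# ungroup() drops the grouping, so the same call pools over countries:
toy_g %>% ungroup() %>% count(vote)   # one row per vote category
```

Adding `ungroup()` at the end of a grouped `mutate()` pipeline is a cheap way to avoid surprises in later steps.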
With
datasummary_skim(unvotes$issue, type = "categorical")
data | N | % |
---|---|---|
Arms control and disarmament | 170497 | 19.9 |
Colonialism | 129708 | 15.1 |
Economic development | 108759 | 12.7 |
Human rights | 156623 | 18.3 |
Nuclear weapons and nuclear material | 133635 | 15.6 |
Palestinian conflict | 158656 | 18.5 |
you get a quick summary for the categorical issue variable.
However, as each `rcid` occurs many times (for every country that voted), the issues are, of course, also repeated in our data.
What we want is a sorted table for the distribution of our issue variable over all roll calls.
Create such a table/tibble/data frame using only tidyverse verbs/functions (scroll to the bottom of this document if you need a hint).
# There are several different ways we can achieve this:
# 1. E.g. using count() which is a shorthand for group_by(x) %>% summarise(n = n())
# count() returns a tibble of the form
# > x n
# > <chr> <int>
# > 1 A 35
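# So we can just add a column to this computing the relative frequency.
# Note: unvotes is still grouped by country from the previous task, so distinct() and
# count() work within each country here (see the output below).
# Call ungroup() first to pool over countries.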
unvotes %>%
distinct(rcid, issue) %>%
count(issue) %>%
mutate(percent = round(100 * n / sum(n), 1)) %>% # sum() of the n vector/variable
arrange(desc(percent))
## # A tibble: 1,195 x 4
## # Groups: country [200]
## country issue n percent
## <chr> <chr> <int> <dbl>
## 1 Zanzibar Economic development 1 100
## 2 Taiwan Colonialism 260 40.1
## 3 South Sudan Human rights 117 26.8
## 4 St. Kitts & Nevis Arms control and disarmament 544 26.5
## 5 Tuvalu Human rights 333 25.7
## 6 Turkmenistan Arms control and disarmament 441 25.4
## 7 Palau Human rights 388 24.9
## 8 Nauru Human rights 298 24.8
## 9 Tajikistan Arms control and disarmament 530 24.3
## 10 Kiribati Arms control and disarmament 139 24.2
## # ... with 1,185 more rows
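# Another way:
# (group_by(issue) replaces the country grouping here, but distinct() still ran
# within country, so n counts country-level votes rather than distinct roll calls.)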
unvotes %>%
distinct(rcid, issue) %>%
group_by(issue) %>%
summarise(n = n()) %>%
mutate(percent = round(100 * n / sum(n), 1)) %>%
arrange(desc(percent))
## # A tibble: 6 x 3
## issue n percent
## <chr> <int> <dbl>
## 1 Arms control and disarmament 170497 19.9
## 2 Palestinian conflict 158656 18.5
## 3 Human rights 156623 18.3
## 4 Nuclear weapons and nuclear material 133635 15.6
## 5 Colonialism 129708 15.1
## 6 Economic development 108759 12.7
# Yet another way
unvotes %>%
distinct(rcid, issue) %>%
group_by(issue) %>%
transmute(n = n()) %>%
unique() %>%
ungroup() %>%
mutate(percent = round(100 * n / sum(n), 1)) %>%
arrange(desc(percent))
## # A tibble: 6 x 3
## issue n percent
## <chr> <int> <dbl>
## 1 Arms control and disarmament 170497 19.9
## 2 Palestinian conflict 158656 18.5
## 3 Human rights 156623 18.3
## 4 Nuclear weapons and nuclear material 133635 15.6
## 5 Colonialism 129708 15.1
## 6 Economic development 108759 12.7
Which issue category has the highest share of important votes?
unvotes %>%
distinct(rcid, issue, .keep_all = T) %>%
group_by(issue) %>%
summarise(votes = n(),
imp_votes = sum(importantvote, na.rm = TRUE),
share_imp = sum(importantvote, na.rm = TRUE) / n()) %>%
arrange(desc(share_imp))
## # A tibble: 6 x 4
## issue votes imp_votes share_imp
## <chr> <int> <int> <dbl>
## 1 Human rights 156623 28844 0.184
## 2 Economic development 108759 11309 0.104
## 3 Palestinian conflict 158656 14274 0.0900
## 4 Nuclear weapons and nuclear material 133635 7269 0.0544
## 5 Arms control and disarmament 170497 8107 0.0475
## 6 Colonialism 129708 5443 0.0420
Add variables that show, for each country, the number and share of “yes”, “no”, and “abstain” votes, pooled over all years. Additionally, output a tibble/data frame with one row for each country and these new variables.
# First, add the variables (this is a bit cumbersome; we will learn case_when in the next session):
unvotes <- unvotes %>%
group_by(country) %>%
mutate(yes_votes = sum(vote == "yes"),
no_votes = sum(vote == "no"),
abstain_votes = sum(vote == "abstain"),
pct_yes = sum(vote == "yes", na.rm = T)/n(),
pct_no = sum(vote == "no", na.rm = T)/n(),
pct_abs = sum(vote == "abstain", na.rm = T)/n())
# Next, compute a summary table
sum_tab_a <- unvotes %>% # or use mean()
distinct(country, .keep_all = TRUE) %>%
select(country, yes_votes, no_votes, abstain_votes, pct_yes, pct_no, pct_abs)
sum_tab_a
## # A tibble: 200 x 7
## # Groups: country [200]
## country yes_votes no_votes abstain_votes pct_yes pct_no pct_abs
## <chr> <int> <int> <int> <dbl> <dbl> <dbl>
## 1 Afghanistan 4815 185 289 0.910 0.0350 0.0546
## 2 Albania 2959 599 648 0.704 0.142 0.154
## 3 Algeria 4854 89 347 0.918 0.0168 0.0656
## 4 Andorra 1602 351 579 0.633 0.139 0.229
## 5 Angola 3685 36 242 0.930 0.00908 0.0611
## 6 Antigua & Barbuda 3305 20 265 0.921 0.00557 0.0738
## 7 Argentina 4536 143 1007 0.798 0.0251 0.177
## 8 Armenia 1886 69 638 0.727 0.0266 0.246
## 9 Australia 2950 1187 1600 0.514 0.207 0.279
## 10 Austria 3447 454 1613 0.625 0.0823 0.293
## # ... with 190 more rows
# OR
sum_tab_a_1 <- unvotes %>%
group_by(country) %>%
summarise(
pct_yes = sum(vote == "yes", na.rm = T) / n(),
pct_no = sum(vote == "no", na.rm = T) / n(),
pct_abs = sum(vote == "abstain", na.rm = T) / n()
)
sum_tab_a_1
## # A tibble: 200 x 4
## country pct_yes pct_no pct_abs
## <chr> <dbl> <dbl> <dbl>
## 1 Afghanistan 0.910 0.0350 0.0546
## 2 Albania 0.704 0.142 0.154
## 3 Algeria 0.918 0.0168 0.0656
## 4 Andorra 0.633 0.139 0.229
## 5 Angola 0.930 0.00908 0.0611
## 6 Antigua & Barbuda 0.921 0.00557 0.0738
## 7 Argentina 0.798 0.0251 0.177
## 8 Armenia 0.727 0.0266 0.246
## 9 Australia 0.514 0.207 0.279
## 10 Austria 0.625 0.0823 0.293
## # ... with 190 more rows
# OR, better to read (for single countries), but no "wide" format:
sum_tab_a_2 <- unvotes %>%
group_by(country, vote) %>%
count(country, vote) %>%
group_by(country) %>%
mutate(percent = n/sum(n))
sum_tab_a_2
## # A tibble: 598 x 4
## # Groups: country [200]
## country vote n percent
## <chr> <chr> <int> <dbl>
## 1 Afghanistan abstain 289 0.0546
## 2 Afghanistan no 185 0.0350
## 3 Afghanistan yes 4815 0.910
## 4 Albania abstain 648 0.154
## 5 Albania no 599 0.142
## 6 Albania yes 2959 0.704
## 7 Algeria abstain 347 0.0656
## 8 Algeria no 89 0.0168
## 9 Algeria yes 4854 0.918
## 10 Andorra abstain 579 0.229
## # ... with 588 more rows
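The comment “or use mean()” above points to a shortcut: the mean of a logical vector is the share of `TRUE` values, so `mean(vote == "yes")` gives the same number as `sum(vote == "yes") / n()`. A minimal sketch on made-up toy data:

```r
library(dplyr)

# Toy data (made up)
toy <- tibble(
  country = c("A", "A", "A", "A", "B", "B"),
  vote    = c("yes", "yes", "no", "abstain", "yes", "no")
)

shares <- toy %>%
  group_by(country) %>%
  summarise(
    pct_yes_sum  = sum(vote == "yes") / n(),  # count of "yes" over group size
    pct_yes_mean = mean(vote == "yes")        # mean of TRUE/FALSE is the share
  )

shares
```

The two versions only differ when `vote` contains missing values and `na.rm = TRUE` is set: `mean()` then drops the missing rows from the denominator, while `n()` still counts them.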
Calculate, for each country and issue, the number and share of “yes” votes, but only for “important votes” and for the permanent members of the UN Security Council. The output should have 30 rows.
str(unvotes$importantvote)
## int [1:857878] 0 0 0 0 0 0 0 0 0 0 ...
perm_members <- c("United States", "Russia", "France", "United Kingdom", "China")
imp_votes_tab <- unvotes %>%
group_by(country, issue) %>%
filter(importantvote == 1, country %in% perm_members) %>%
summarise(
n_votes = n(),
pct_yes = sum(vote == "yes", na.rm = T) / n()
)
imp_votes_tab
## # A tibble: 30 x 4
## # Groups: country [5]
## country issue n_votes pct_yes
## <chr> <chr> <int> <dbl>
## 1 China Arms control and disarmament 45 0.622
## 2 China Colonialism 31 0.903
## 3 China Economic development 67 0.925
## 4 China Human rights 167 0.461
## 5 China Nuclear weapons and nuclear material 40 0.575
## 6 China Palestinian conflict 87 0.989
## 7 France Arms control and disarmament 48 0.729
## 8 France Colonialism 33 0.242
## 9 France Economic development 67 0.701
## 10 France Human rights 171 0.637
## # ... with 20 more rows
Get the years with the highest and lowest share of “yes” votes for each country.
max_min_tab <- unvotes %>%
filter(country %in% perm_members) %>%
group_by(year = lubridate::year(date), country) %>% # lubridate is not loaded by default
summarise(
n_votes = n(),
pct_yes = sum(vote == "yes", na.rm = T) / n()
) %>%
ungroup() %>%
group_by(country) %>%
filter(pct_yes == max(pct_yes) | pct_yes == min(pct_yes)) %>%
arrange(country, desc(pct_yes))
max_min_tab
## # A tibble: 11 x 4
## # Groups: country [5]
## year country n_votes pct_yes
## <dbl> <chr> <int> <dbl>
## 1 1979 China 92 0.967
## 2 1971 China 33 0.636
## 3 1951 France 2 1
## 4 1968 France 26 0.0769
## 5 1989 Russia 125 0.992
## 6 1951 Russia 2 0
## 7 1951 United Kingdom 2 1
## 8 1952 United Kingdom 29 0.172
## 9 1951 United States 2 1
## 10 1956 United States 11 1
## 11 2004 United States 91 0.0659
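The table has 11 rows rather than 10 because the United States has two years tied at the maximum. The sketch below reproduces this behaviour on made-up toy numbers and contrasts it with `slice_max()`, which can return exactly one row per group. The toy values are invented for illustration and are not taken from the real data.

```r
library(dplyr)

# Toy data (made up): country A has two years tied at the maximum.
toy <- tibble(
  country = c("A", "A", "A", "B", "B"),
  year    = c(1951, 1956, 2004, 1971, 1979),
  pct_yes = c(1, 1, 0.1, 0.6, 0.9)
)

# filter() with max()/min() keeps all tied rows.
extremes <- toy %>%
  group_by(country) %>%
  filter(pct_yes == max(pct_yes) | pct_yes == min(pct_yes)) %>%
  ungroup()

# slice_max() with with_ties = FALSE keeps a single row per group.
one_max <- toy %>%
  group_by(country) %>%
  slice_max(pct_yes, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(extremes)  # 5: three rows for A (two tied maxima, one minimum), two for B
nrow(one_max)   # 2: one maximum per country
```

Which behaviour is preferable depends on the question; keeping ties is the more transparent default when several years share the same value.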
Task 2c: You also need `filter()` and `n()` for this.
Task 3a: `distinct()` may be helpful here!
Task 3b: Check out the arguments of `distinct()`!