International trade datasets

Peter Schott

NBER International Trade and Geography Data - Redding

WITS

COMTRADE

OECD

TCdata360 World Bank

Microdata WB

WTO

CEPII

Atlas of Economic Complexity

The Observatory of Economic Complexity

Comex Stat

Gravity Portal

db.nomics

World Input Output Database

Authors to follow

Adao, Rodrigo

Antràs, Pol

Arkolakis, Costas

Atkin, David

Bernard, Andrew

Caliendo, Lorenzo

Costinot, Arnaud

Dix-Carneiro, Rafael

Donaldson, Dave

Feenstra, Robert

Grossman, Gene

Helpman, Elhanan

Javorcik, Beata

Jensen, J. Bradford

Khandelwal, Amit

Kortum, Samuel

Melitz, Marc

Ornelas, Emanuel

Parro, Fernando

Pavcnik, Nina

Pessoa, João Paulo

Redding, Stephen

Rodríguez-Clare, Andrés

Schott, Peter

Trefler, Daniel

Downloading and looking into WIOD data

For the final course project we will use the WIOD table, (Timmer et al. 2015). This table contains bilateral trade flows and underlying data, covering 43 countries, and a model for the rest of the world for the period 2000-2014. Data for 56 sectors are classified according to the International Standard Industrial Classification revision 4 (ISIC Rev. 4).

Download the 2016 release of WIOD table from the link above. Let’s import the R version into a folder called input. Once you have downloaded and unzip the contents, you will have a file called WIOT2014_October16_ROW.RData in your input folder.

library(data.table)
library(fst)

load("input/WIOT2014_October16_ROW.RData")

setnames(
  wiot,
  c("IndustryCode", "IndustryDescription", "Country", "RNr"),
  c("ind_code", "ind_name", "out_country", "out_ind")
)
wiot[sample(5), .(ind_code, out_country, out_ind, AUS1, BRA4)]
##    ind_code out_country out_ind        AUS1         BRA4
## 1:      A02         AUS       2    83.02964  0.012315142
## 2:  C10-C12         AUS       5  1590.84059  0.070672046
## 3:      A01         AUS       1 12924.17969  0.018449419
## 4:        B         AUS       4   115.92985 24.198535555
## 5:      A03         AUS       3    19.14773  0.009592908
id_vars <- c("out_country", "out_ind")
wiot <- melt(wiot[, -c("ind_code", "ind_name", "Year")],
  id.vars = id_vars,
  variable.name = "in_country",
  variable.factor = FALSE
)
wiot[, `:=`(
  in_ind = sub("[[:alpha:]]+(\\d+)", "\\1", in_country),
  in_country = sub("([[:alpha:]]+)\\d+", "\\1", in_country)
)]
wiot[, in_ind := as.integer(in_ind)]
setcolorder(
  wiot,
  c("out_country", "out_ind", "in_country", "in_ind", "value")
)
wiot[sample(5), 1:5]
##    out_country out_ind in_country in_ind       value
## 1:         AUS       2        AUS      1    83.02964
## 2:         AUS       5        AUS      1  1590.84059
## 3:         AUS       3        AUS      1    19.14773
## 4:         AUS       4        AUS      1   115.92985
## 5:         AUS       1        AUS      1 12924.17969

Let’s check what are the countries included in the dataset. We can see from the code below we have 45 “coutries”, but only 43 of them are really countries. ROW stands for Rest Of the World, while TOT is a “Total” column that came with the raw data and we should drop it.

wiot[, unique(out_country)]
##  [1] "AUS" "AUT" "BEL" "BGR" "BRA" "CAN" "CHE" "CHN" "CYP" "CZE" "DEU" "DNK"
## [13] "ESP" "EST" "FIN" "FRA" "GBR" "GRC" "HRV" "HUN" "IDN" "IND" "IRL" "ITA"
## [25] "JPN" "KOR" "LTU" "LUX" "LVA" "MEX" "MLT" "NLD" "NOR" "POL" "PRT" "ROU"
## [37] "RUS" "SVK" "SVN" "SWE" "TUR" "TWN" "USA" "ROW" "TOT"

From previous data analysis (not shown here), we also know that industry #56 - Activities of extraterritorial organizations and bodies is comprised of all zeros and, all other industries with codes greater than 56 are related to the Total aggregation. We will drop all these observations.

#' Drop industry U (56) since most countries do not expend anything on this
wiot <- wiot[!(out_country %chin% c("TOT") |
  in_country %chin% c("TOT") |
  out_ind > 55 |
  in_ind > 55)]

This is the cleaned dataset we shall work through this series of classes, save it to a file for posterior loading.

write_fst(wiot, "output/wiot.fst")
write.csv(wiot, "output/wiot.csv") # For Python users

Now we must get socio-economic data from this link. This data is fully compatible with WIOD, but comes in Microsoft Excel format.

library(openxlsx)
wiod_sea <- as.data.table(
  read.xlsx("input/WIOD_SEA_Nov16.xlsx", sheet = "DATA")
)
#' Employed population in thousands
employed <- wiod_sea[variable == "EMP",
  by = "country",
  .(employed = sum(`2014`))
]
setnames(
  employed,
  c("country", "employed"),
  c("out_country", "employed_i")
)
#' ATTENTION: socio-economic data from WIOD does not include ROW! I will just
#' input the employed in ROW having the same value as the sum of all countries!
employed <- rbind(employed, list("ROW", employed[, sum(employed_i)]))
#' Write employed to fst file
write_fst(employed, "output/employed.fst")

What about Python?

import numpy as np
import pandas as pd

wiot = pd.read_csv("output/wiot.csv")

wiod_sea = pd.read_excel("input/WIOD_SEA_Nov16.xlsx", sheet_name = "DATA",
                         engine = "openpyxl")
employed = (wiod_sea
  .query("variable == 'EMP'")
  .groupby('country')[2014]
  .sum()
  .reset_index()
  .rename(columns={'country': 'out_country', 2014: 'employed_i'})
)
row = pd.DataFrame({'out_country': ['ROW'], 
                    'employed_i': [employed['employed_i'].sum()]})
employed = employed.append(row)

What about Matlab?

You are on your own!

Sorry 😣

References

Timmer, Marcel P., Erik Dietzenbacher, Bart Los, Robert Stehrer, and Gaaitzen J. de Vries. 2015. “An Illustrated User Guide to the World Input–Output Database: The Case of Global Automotive Production.” Review of International Economics 23 (3): 575–605. https://doi.org/https://doi.org/10.1111/roie.12178.