This is a cheat sheet for the seventh homework on affiliation. It works through each step in R.
First, you need the following packages.
library(bibliometrix)
library(igraph)
library(ggplot2)
library(dplyr)
library(tidyr)
First, we read the data. You can answer most of the questions by analyzing the culture.df data frame. You can also easily produce an overview using the biblioAnalysis function of Bibliometrix: calling summary and plot on its result will provide overviews to answer the introductory questions.
culture.df <- readRDS("data/culture_corpus.RDS")
cul.bib <- biblioAnalysis(culture.df, sep=';')
summary(cul.bib, k=10)
MAIN INFORMATION ABOUT DATA
Timespan 1965 : 2022
Sources (Journals, Books, etc) 6
Documents 996
Annual Growth Rate % 4.2
Document Average Age 18
Average citations per doc 93.35
Average citations per year per doc 5.462
References 53068
DOCUMENT TYPES
article 922
article; proceedings paper 74
DOCUMENT CONTENTS
Keywords Plus (ID) 1951
Author's Keywords (DE) 549
AUTHORS
Authors 1484
Author Appearances 1787
Authors of single-authored docs 430
AUTHORS COLLABORATION
Single-authored docs 481
Documents per Author 0.671
Co-Authors per Doc 1.79
International co-authorships % 9.739
Annual Scientific Production
Year Articles
1965 3
1966 2
1967 7
1968 7
1969 3
1970 3
1971 5
1972 6
1973 4
1974 5
1975 5
1976 10
1977 2
1978 7
1979 6
1980 5
1981 4
1982 7
1983 1
1984 4
1985 4
1986 4
1987 6
1988 6
1990 4
1991 7
1992 17
1993 17
1994 21
1995 19
1996 35
1997 28
1998 17
1999 25
2000 19
2001 17
2002 24
2003 15
2004 19
2005 29
2006 30
2007 38
2008 17
2009 28
2010 38
2011 28
2012 29
2013 29
2014 32
2015 41
2016 33
2017 42
2018 37
2019 39
2020 33
2021 43
2022 30
Annual Percentage Growth Rate 4.122326
Most Productive Authors
Authors Articles Authors Articles Fractionalized
1 GIBSON JL 9 GIBSON JL 7.83
2 COLE WM 7 COLE WM 6.00
3 SCHOFER E 7 RIVERA LA 4.50
4 FINE GA 5 FINE GA 4.00
5 GOLDBERG A 5 KALMIJN M 4.00
6 KALMIJN M 5 SCHOFER E 3.67
7 RIVERA LA 5 VAISEY S 3.50
8 VAISEY S 5 GOLDBERG A 3.03
9 WALDER AG 5 BLAU JR 3.00
10 BAUMANN S 4 CALARCO JM 3.00
Top manuscripts per citations
Paper DOI TC TCperYear NTC
1 SWIDLER A, 1986, AM SOCIOL REV 10.2307/2095521 4335 114.1 3.74
2 INGLEHART R, 2000, AM SOCIOL REV 10.2307/2657288 2761 115.0 12.63
3 ZUCKER LG, 1977, AM SOCIOL REV 10.2307/2094862 1206 25.7 1.98
4 RAO H, 2003, AM J SOCIOL 10.1086/367917 921 43.9 3.59
5 DIMAGGIO P, 1982, AM SOCIOL REV 10.2307/2094962 910 21.7 5.22
6 FLIGSTEIN N, 1996, AM SOCIOL REV 10.2307/2096398 863 30.8 6.02
7 SIMMONS BA, 2004, AM POLIT SCI REV 10.1017/S0003055404001078 847 42.4 4.69
8 WILSON J, 1997, AM SOCIOL REV 10.2307/2657355 759 28.1 4.85
9 HIRSCH PM, 1972, AM J SOCIOL 10.1086/225192 747 14.4 5.39
10 MACKENZIE D, 2003, AM J SOCIOL 10.1086/374404 737 35.1 2.87
Corresponding Author's Countries
Country Articles Freq SCP MCP MCP_Ratio
1 USA 786 0.84244 739 47 0.0598
2 CANADA 22 0.02358 14 8 0.3636
3 UNITED KINGDOM 22 0.02358 16 6 0.2727
4 NETHERLANDS 15 0.01608 13 2 0.1333
5 GERMANY 13 0.01393 7 6 0.4615
6 ISRAEL 12 0.01286 9 3 0.2500
7 JAPAN 9 0.00965 3 6 0.6667
8 DENMARK 8 0.00857 6 2 0.2500
9 CHINA 5 0.00536 1 4 0.8000
10 MEXICO 5 0.00536 4 1 0.2000
SCP: Single Country Publications
MCP: Multiple Country Publications
Total Citations per Country
Country Total Citations Average Article Citations
1 USA 77261 98.3
2 UNITED KINGDOM 2173 98.8
3 CANADA 1742 79.2
4 NETHERLANDS 1603 106.9
5 GERMANY 826 63.5
6 JAPAN 794 88.2
7 ISRAEL 753 62.8
8 DENMARK 643 80.4
9 AUSTRALIA 558 139.5
10 GEORGIA 294 98.0
Most Relevant Sources
Sources Articles
1 SOCIAL FORCES 293
2 AMERICAN SOCIOLOGICAL REVIEW 254
3 AMERICAN JOURNAL OF SOCIOLOGY 201
4 AMERICAN POLITICAL SCIENCE REVIEW 103
5 JOURNAL OF POLITICS 87
6 AMERICAN JOURNAL OF POLITICAL SCIENCE 58
Most Relevant Keywords
Author Keywords (DE) Articles Keywords-Plus (ID) Articles
1 CULTURE 36 CULTURE 144
2 GENDER 12 UNITED-STATES 105
3 SOCIAL MOVEMENTS 11 INEQUALITY 56
4 INEQUALITY 10 SOCIOLOGY 56
5 EDUCATION 9 POLITICS 55
6 PUBLIC OPINION 8 GENDER 54
7 ORGANIZATIONS 7 ATTITUDES 53
8 RELIGION 7 IDENTITY 53
9 SOCIAL NETWORKS 7 RACE 51
10 IDENTITY 6 DEMOCRACY 39
plot(x=cul.bib, k=5, pause=FALSE)
The bonus in this section can be constructed in ggplot. There are numerous ways to do this. I find the counts for each year and discipline by counting the rows using n() in the summarise function in dplyr, and then append them to a data frame holding just the counts by year using bind_rows.
dsum <- culture.df %>% group_by(WC, PY) %>% summarise(cnt = n())
tsum <- culture.df %>% group_by(PY) %>% summarise(cnt = n())
tsum$WC <- "TOTAL"
tsum <- bind_rows(tsum, dsum)
ggplot(data = tsum, aes(x=PY, y=cnt, group=WC, color=WC)) +
  geom_smooth(method = "loess") +
  theme_bw()
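The counting step is easier to check on a small example. The sketch below uses a made-up five-row data frame (toy, by_field, by_year and toy_counts are hypothetical names, not part of the homework data) and follows the same group_by, summarise and bind_rows pattern. The .groups = "drop" argument is an addition here that returns ungrouped results.

```r
library(dplyr)

# Hypothetical toy data standing in for culture.df (values are made up)
toy <- data.frame(
  PY = c(2000, 2000, 2000, 2001, 2001),
  WC = c("SOCIOLOGY", "SOCIOLOGY", "POLITICAL SCIENCE",
         "SOCIOLOGY", "POLITICAL SCIENCE")
)

# Counts per discipline and year
by_field <- toy %>%
  group_by(WC, PY) %>%
  summarise(cnt = n(), .groups = "drop")

# Counts per year across all disciplines, labelled "TOTAL"
by_year <- toy %>%
  group_by(PY) %>%
  summarise(cnt = n(), .groups = "drop")
by_year$WC <- "TOTAL"

# Stack the two tables; bind_rows matches columns by name
toy_counts <- bind_rows(by_year, by_field)
print(toy_counts)
```

Because bind_rows matches columns by name, it does not matter that WC is the last column in one table and the first in the other.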
# Keep only the largest connected component of the two-mode network
comps <- components(kw.2mode)
bigcomp <- which.max(comps$csize)
vert_ids <- V(kw.2mode)[comps$membership == bigcomp]
kw2.gcc <- induced_subgraph(kw.2mode, vert_ids)
# Look up the field (WC) of each vertex; vertices with no match are keywords
lbls <- data.frame(SR=V(kw2.gcc)$name)
flds <- culture.df %>% select(SR, WC)
lbls <- left_join(lbls, flds)
lbls$WC <- lbls$WC %>% replace_na("word")
# Assign a label color to each vertex type
clrs <- data.frame(WC=c("SOCIOLOGY", "POLITICAL SCIENCE", "word"), clr=c("purple", "gold", "tomato"))
lbls <- left_join(lbls, clrs)
V(kw2.gcc)$field <- lbls$WC
# Color vertices by Louvain community and size them by degree
lv <- cluster_louvain(kw2.gcc)
plot(kw2.gcc, layout=layout_with_kk, vertex.color=lv$membership, vertex.label.color=lbls$clr, vertex.label.cex=.5, vertex.size=degree(kw2.gcc)/5)
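The label colors above come from two joins: one that attaches a field to every vertex name, and one that maps each field to a color. A minimal sketch of that lookup, assuming made-up vertex names (nodes, docs and field_colours are hypothetical stand-ins for lbls, flds and clrs):

```r
library(dplyr)
library(tidyr)

# Hypothetical vertex names: two documents and two keywords
nodes <- data.frame(SR = c("DOC A, 2001", "DOC B, 2005", "CULTURE", "GENDER"))

# Hypothetical document metadata, one row per document
docs <- data.frame(
  SR = c("DOC A, 2001", "DOC B, 2005"),
  WC = c("SOCIOLOGY", "POLITICAL SCIENCE")
)

# Color lookup table, using the same values as the plot above
field_colours <- data.frame(
  WC  = c("SOCIOLOGY", "POLITICAL SCIENCE", "word"),
  clr = c("purple", "gold", "tomato")
)

# Keywords have no match in docs, so their WC is NA after the join
nodes <- left_join(nodes, docs, by = "SR")
nodes$WC <- replace_na(nodes$WC, "word")

# Attach a color to every vertex; left_join keeps the row order of nodes
nodes <- left_join(nodes, field_colours, by = "WC")
nodes$clr
```

The result lines up with the vertices only because left_join preserves the row order of its first argument and each SR appears once in the metadata. If the metadata had more than one row per SR, the join would add rows and the color vector would no longer match the vertex sequence.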
lab.keep <- which(degree(kw2.gcc) > (quantile(degree(kw2.gcc), .90)))
plot(kw2.gcc, vertex.label = ifelse(V(kw2.gcc) %in% lab.keep, V(kw2.gcc)$name, NA),
     layout=layout_with_kk, vertex.color=lv$membership, vertex.label.color=lbls$clr, vertex.label.cex=.5, vertex.size=degree(kw2.gcc)/2)
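The second plot labels only the vertices whose degree is above the 90th percentile. A small sketch of that filter on a hypothetical star graph (g, deg, cutoff, lab_keep and vertex_labels are illustrative names), where only the hub clears the cutoff:

```r
library(igraph)

# Hypothetical toy graph: one hub connected to nine leaves
g <- make_star(10, mode = "undirected")
V(g)$name <- paste0("v", 1:10)

deg <- degree(g)                 # hub has degree 9, each leaf has degree 1
cutoff <- quantile(deg, 0.90)    # 90th percentile of the degree distribution
lab_keep <- which(deg > cutoff)  # positions of the vertices above the cutoff

# Keep the name for high-degree vertices and blank out the rest with NA
vertex_labels <- ifelse(seq_len(vcount(g)) %in% lab_keep, V(g)$name, NA)
vertex_labels
```

Passing vertex_labels as the vertex.label argument of plot() draws names only for the retained vertices, since igraph skips labels that are NA.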
# Keep only the largest connected component of the one-mode network
comps <- components(kw.1)
bigcomp <- which.max(comps$csize)
vert_ids <- V(kw.1)[comps$membership == bigcomp]
kw1.gcc <- induced_subgraph(kw.1, vert_ids)
lv <- cluster_louvain(kw1.gcc)
# Label only the vertices above the 95th percentile of degree
lab.keep <- which(degree(kw1.gcc) > (quantile(degree(kw1.gcc), .95)))
plot(kw1.gcc, vertex.label = ifelse(V(kw1.gcc) %in% lab.keep, V(kw1.gcc)$name, NA), layout=layout_with_fr, vertex.color=lv$membership, vertex.label.cex=.5, vertex.size=degree(kw1.gcc)/5)
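Both network blocks start by keeping only the largest connected component. A minimal sketch of that step on a hypothetical six-vertex graph (g, biggest, keep and gcc are illustrative names), which makes it easy to confirm that the smaller component is dropped:

```r
library(igraph)

# Hypothetical toy graph: a four-vertex path plus a separate two-vertex pair
g <- graph_from_literal(A - B - C - D, E - F)

comps   <- components(g)              # membership and size of every component
biggest <- which.max(comps$csize)     # index of the largest component
keep    <- V(g)[comps$membership == biggest]
gcc     <- induced_subgraph(g, keep)  # subgraph on the retained vertices only

V(gcc)$name
```

Note that which.max returns the first maximum, so if two components tie for largest, only the first one is kept.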