HW7 Cheat Sheet

This is a cheat sheet for the seventh homework on affiliation. It works through each step in R.

First, you need the following packages.

Code
library(bibliometrix)
library(igraph)
library(ggplot2)
library(dplyr)
library(tidyr)

1. Introductory Inspection

First, we read the data. Next, you can answer most of the questions by analyzing the culture.df data frame. You can also easily provide an overview using the biblioAnalysis function of Bibliometrix. summary and plot will provide overviews to answer the introductory questions.

Code
culture.df <- readRDS("data/culture_corpus.RDS")

cul.bib <- biblioAnalysis(culture.df, sep=';')

summary(cul.bib, k=10)


MAIN INFORMATION ABOUT DATA

 Timespan                              1965 : 2022 
 Sources (Journals, Books, etc)        6 
 Documents                             996 
 Annual Growth Rate %                  4.2 
 Document Average Age                  18 
 Average citations per doc             93.35 
 Average citations per year per doc    5.462 
 References                            53068 
 
DOCUMENT TYPES                     
 article                         922 
 article; proceedings paper      74 
 
DOCUMENT CONTENTS
 Keywords Plus (ID)                    1951 
 Author's Keywords (DE)                549 
 
AUTHORS
 Authors                               1484 
 Author Appearances                    1787 
 Authors of single-authored docs       430 
 
AUTHORS COLLABORATION
 Single-authored docs                  481 
 Documents per Author                  0.671 
 Co-Authors per Doc                    1.79 
 International co-authorships %        9.739 
 

Annual Scientific Production

 Year    Articles
    1965        3
    1966        2
    1967        7
    1968        7
    1969        3
    1970        3
    1971        5
    1972        6
    1973        4
    1974        5
    1975        5
    1976       10
    1977        2
    1978        7
    1979        6
    1980        5
    1981        4
    1982        7
    1983        1
    1984        4
    1985        4
    1986        4
    1987        6
    1988        6
    1990        4
    1991        7
    1992       17
    1993       17
    1994       21
    1995       19
    1996       35
    1997       28
    1998       17
    1999       25
    2000       19
    2001       17
    2002       24
    2003       15
    2004       19
    2005       29
    2006       30
    2007       38
    2008       17
    2009       28
    2010       38
    2011       28
    2012       29
    2013       29
    2014       32
    2015       41
    2016       33
    2017       42
    2018       37
    2019       39
    2020       33
    2021       43
    2022       30

Annual Percentage Growth Rate 4.122326 


Most Productive Authors

   Authors        Articles Authors        Articles Fractionalized
1      GIBSON JL         9     GIBSON JL                     7.83
2      COLE WM           7     COLE WM                       6.00
3      SCHOFER E         7     RIVERA LA                     4.50
4      FINE GA           5     FINE GA                       4.00
5      GOLDBERG A        5     KALMIJN M                     4.00
6      KALMIJN M         5     SCHOFER E                     3.67
7      RIVERA LA         5     VAISEY S                      3.50
8      VAISEY S          5     GOLDBERG A                    3.03
9      WALDER AG         5     BLAU JR                       3.00
10     BAUMANN S         4     CALARCO JM                    3.00


Top manuscripts per citations

                       Paper                                DOI   TC TCperYear   NTC
1  SWIDLER A, 1986, AM SOCIOL REV     10.2307/2095521           4335     114.1  3.74
2  INGLEHART R, 2000, AM SOCIOL REV   10.2307/2657288           2761     115.0 12.63
3  ZUCKER LG, 1977, AM SOCIOL REV     10.2307/2094862           1206      25.7  1.98
4  RAO H, 2003, AM J SOCIOL           10.1086/367917             921      43.9  3.59
5  DIMAGGIO P, 1982, AM SOCIOL REV    10.2307/2094962            910      21.7  5.22
6  FLIGSTEIN N, 1996, AM SOCIOL REV   10.2307/2096398            863      30.8  6.02
7  SIMMONS BA, 2004, AM POLIT SCI REV 10.1017/S0003055404001078  847      42.4  4.69
8  WILSON J, 1997, AM SOCIOL REV      10.2307/2657355            759      28.1  4.85
9  HIRSCH PM, 1972, AM J SOCIOL       10.1086/225192             747      14.4  5.39
10 MACKENZIE D, 2003, AM J SOCIOL     10.1086/374404             737      35.1  2.87


Corresponding Author's Countries

          Country Articles    Freq SCP MCP MCP_Ratio
1  USA                 786 0.84244 739  47    0.0598
2  CANADA               22 0.02358  14   8    0.3636
3  UNITED KINGDOM       22 0.02358  16   6    0.2727
4  NETHERLANDS          15 0.01608  13   2    0.1333
5  GERMANY              13 0.01393   7   6    0.4615
6  ISRAEL               12 0.01286   9   3    0.2500
7  JAPAN                 9 0.00965   3   6    0.6667
8  DENMARK               8 0.00857   6   2    0.2500
9  CHINA                 5 0.00536   1   4    0.8000
10 MEXICO                5 0.00536   4   1    0.2000


SCP: Single Country Publications

MCP: Multiple Country Publications


Total Citations per Country

      Country      Total Citations Average Article Citations
1  USA                       77261                      98.3
2  UNITED KINGDOM             2173                      98.8
3  CANADA                     1742                      79.2
4  NETHERLANDS                1603                     106.9
5  GERMANY                     826                      63.5
6  JAPAN                       794                      88.2
7  ISRAEL                      753                      62.8
8  DENMARK                     643                      80.4
9  AUSTRALIA                   558                     139.5
10 GEORGIA                     294                      98.0


Most Relevant Sources

                         Sources        Articles
1 SOCIAL FORCES                              293
2 AMERICAN SOCIOLOGICAL REVIEW               254
3 AMERICAN JOURNAL OF SOCIOLOGY              201
4 AMERICAN POLITICAL SCIENCE REVIEW          103
5 JOURNAL OF POLITICS                         87
6 AMERICAN JOURNAL OF POLITICAL SCIENCE       58


Most Relevant Keywords

   Author Keywords (DE)      Articles Keywords-Plus (ID)     Articles
1           CULTURE                36          CULTURE            144
2           GENDER                 12          UNITED-STATES      105
3           SOCIAL MOVEMENTS       11          INEQUALITY          56
4           INEQUALITY             10          SOCIOLOGY           56
5           EDUCATION               9          POLITICS            55
6           PUBLIC OPINION          8          GENDER              54
7           ORGANIZATIONS           7          ATTITUDES           53
8           RELIGION                7          IDENTITY            53
9           SOCIAL NETWORKS         7          RACE                51
10          IDENTITY                6          DEMOCRACY           39
Code
plot(x=cul.bib, k=5, pause=FALSE)

The bonus in the section can be constructed in ggplot. There are numerous ways to do this. I find the counts for each year and discpline by counting the rows using n() in the summarise function in dplyr and append it to a dataframe with just the counts by year using bind_rows.

Code
dsum <- culture.df %>% group_by(WC, PY) %>% summarise(cnt = n())

tsum <- culture.df %>% group_by(PY) %>% summarise(cnt = n())

tsum$WC <- "TOTAL"

tsum <- bind_rows(tsum, dsum)

ggplot(data = tsum, aes(x=PY, y=cnt, group=WC, color=WC))  +  
      geom_smooth(method = "loess") +
      theme_bw()

2. Culture in the Social Sciences

For this section I will build three different kinds of networks, but you can certainly build more.

i.Co-authorship Network

First, I will build the co-authorship network. I’m going to build this network using the cocMatrix function.

Code
#Use Bibliometrix to make a two-matrix of Articles X Authors

coauth <- cocMatrix(culture.df, Field="AU")

#Make an Igraph object from this matrix

coauth.2mode <- graph_from_incidence_matrix(coauth)

#Make a one-mode projection of this matrix looking at columns (Author X Author)

coauth.1mode <- bipartite.projection(coauth.2mode, multiplicity=TRUE, which=TRUE)

# Plot the one mode

plot(coauth.1mode, vertex.label=NA, vertex.size=4)

Code
# Let's get rid of isolates

coauth.noiso <- delete.vertices(coauth.1mode, which(degree(coauth.1mode)==0))

# and plot

plot(coauth.noiso, layout=layout_with_fr, vertex.label=NA, vertex.size=4)

Code
#can examine the components here

table(components(coauth.noiso)$csize)

  2   3   4   5   6   7   8   9  11  14  16  19 
197  89  46  17   3   6   5   3   1   2   1   1 
Code
#centralization

centralization.degree(coauth.noiso)
$res
   [1]  3  2  7  2  7  2  1  3  5  4  5  6  2  6  3  3  6  6  3  9  1  4  2  2  1  5  3  2  3  3  1  3  1  2  4 18  2  4
  [39]  1  3  2  4  2  4  5  5  3  4  2  5  2  8  3  3  3  1  1  6  5  3  3  1  3  2  2  3  3  1  2  2  1  3  3  1  1  3
  [77]  2  3  1  4  3  7  1  2  1  1  1  2 15  1  3  7  1  2  2  2  2  3  2  3  1  1  3  2  4  3  2  4  3  6  3  1  2  1
 [115]  1  1  2  2  1  3  3  3  4  1  2  2  3  1  1  3  2  1  2  4  1  1 16  4  5  1  2  1  2  3  2  4  2  2  3  3  2  3
 [153]  1  1  4  1  7  3  1  2  2  1  1  3  1  3  2  2  1  2  2  3  2  3  1  4  1  5  7  4  1  1  5  3  2  1  1  3  3  1
 [191]  2  1  1  2  1  1  2  1  4  4  1  1  2  3  1  2  1  3  1  2  2  1  1  3  1 10  2  1  1  1  2  3  2  4  1  1  1  2
 [229]  1  1  4  1  2  2  2  1  1  1  4  2  2  1 13  1  1  1  1  4  1  5  2  3  3  4  2  1  1  2  1  3  1  2  2  2  1  1
 [267]  1  2  1  2  4  1  2  4  1  2  1  1  2  3  1  3  2  1  1  1  1  1  2  1  2  1  1  1  1  2  1  2  1  2  2  2  1  1
 [305]  1  2  2  2  2 13  1  2  2  2  2  1  3  2  2  1  2  1  1  1  2  2  2  2  1  1  1  1  2  3  3  1  1  1  1  4  1  1
 [343]  2  1  1  2  2  1  2  3  8  2  3  2  2  1  2  4  1  1  1  2  2  2 10  2  2  1  4  1  2  2  2  1  8  2  1  1  2  1
 [381]  2  1  1  1  2  1  1  2  2  2  4  3  1  1  1  2  1  4  1  2  1  2  1  1  2  1  1  2  2  2  2  2  1  1  1  3  1  1
 [419]  1  1  2  1  1  4  1 10  1  1  2  3  2  1  1  2  1  2  2  1  2  2  4  8  2  2  1  1  1  1 10  2  1  1  1  3  1  1
 [457]  4  1  2  1  1  1  1  8  1  1  2  1  1  1  3  2  1  2  2  2  1  2  1  2  1 13  1  3  1  1  1  1  1  1  2 13  1  1
 [495]  1  3  2  2  1  2  1  1  1  3  3  1  1  1  1  1  1  1  1  2  1  1  2  4  1  1  2  2  1  2  1  1  1  1  1 13  2  2
 [533]  2  1  2  1  2  3  4  1  1  1  2  1  1  2  1  1  2  2  2  1  1  1  1  3  1  3  1  1  3  1  8  1  1  1  2  2  1  2
 [571]  1  2  2  2  2  2  2  2  2  1  4  5  2  2  1  2  1  1  2  1  3  1  1  1  1  5  2  5  2  2  2  1  1  1  1  1  2  1
 [609]  1  1  2  1  2  3  1  1  1  2  1  2  1  2  4  1  1  1  2  3  2  3  1  4  1  1  2 10  1  2  1  1  1  1  1  1  1  1
 [647]  4  1  4  2  1  1  3  1  2  1  1  1  2  1  2  1  3  5  1  1  2  2  1  1  1  1  2  1  1  3  1  1  1  3  3  2  3  2
 [685]  5  3  2  3  1  1  1  1  2  2  1  2  2 13  1  3  1  2  4  1  3  1  4  2  1  3  4  5  3  1  2  4  3  1  1  1  1  2
 [723]  3  3  1  1  2  1  2  1  2  1  1  1  1  4  2  5 13  1  4  1  2  1  2  2  1  2  1  2  4  1  1  2  2  2  3  1  1  1
 [761]  1  1  2  1  2  2  2  1  3  2  2  4  4  4  1 13  2  1  3  1  1 10  1  2  1  1  8  3  3  3  1  2  1  5  1  1  1 10
 [799]  2  2  2  1  2  3  1  1  3  1  3  1  1  1  1  2  1  5  1  1  2  1  1  2  4  1  1  1  2  1  1  3  1  1  1  2  1  1
 [837]  1  1  1  2  1  3  1  1  4  2  1  2 10  3  3  2  1  1  2  1  1  2  4  1  2 10  3  3  2  1  2  4  2  1  2  1  1  3
 [875]  1  2  1  2  2  1  2  1  4  2  2  2  1  1  2  1  1  3  2  1  1  2  1 10  1  1  2  2  3  1  2  2  2  2  8  1  4  2
 [913]  4  2  3  3  1  2  1  2  2  1  1  1  2  2  2  3  3  1  2  2  3  1  1  3  1  3  2  2  2  2  2  1  2  3  1  5  5  2
 [951]  2  1  2  3  3  1  1  2  4  3  1  4  2 13  1  1  1  2  1  1  1  1  2  2  2  2  1  1  1  4  1  2  1  2  1  2  1  1
 [989]  2  5  1  3  1  2  1  1  3  1  1  2  1  1  1  2  2  2  3  3  2  1  2  1  1  3  2  1  1  4  3  2  2  2  1  1 10  1
[1027]  1  1  2  2  1  1  1  4  1  2  1  1  1  3  8  1  1  2  2  1  1  2  1  2  2  4  1  1  3  3  1  2  4  1  2  2  3  2
[1065]  2  2  2  3  2  1  2  2  1  1  3  2  1  1  2  2 13  1  2  1  1  2  4  4  1  1  8  1  1  1  1  3  2  2  3  1  1  1
[1103]  2  1  2  1  2  1  1  3  1  4  1  1  2  2  2  1  1  1  1  2  3  3  3  1 13  1  2  2  1

$centralization
[1] 0.01399028

$theoretical_max
[1] 1278030
Code
# density

edge_density(coauth.noiso)
[1] 0.001938922

ii.Keyword 2-mode Network

Next, we can look at the keyword network in two modes.

Code
kw.mat <- cocMatrix(culture.df, Field="DE")

kw.2mode <- graph_from_incidence_matrix(kw.mat)

# and plot

plot(kw.2mode, layout=layout_with_fr, vertex.label=NA, vertex.size=4)

Code
# Let's get rid of isolates - Presumambly papers without keywords

kw2.noiso <- delete.vertices(kw.2mode, which(degree(kw.2mode)==0))

# and plot

plot(kw2.noiso, layout=layout_with_fr, vertex.label=NA, vertex.size=4)

Code
#can examine the components here

table(components(kw2.noiso)$csize)

  5   6   7   8 598 
  5  11   1   1   1 
Code
#centralization

centralization.degree(kw2.noiso)
$res
  [1]  5  5  6  6  5  7  4  5  4  5  4  5  7  6  3  4  5  4  4  5  6  4  5  7  5  6  5  6  5  4  5  5  5  5  4  6  5  5
 [39]  5  4  5  5  4  5  5  5  5  7  7  5  5  6  4  4  4  6  4  5  5  5  5  5  5  5  5  6  5  4  7  5  5  5  5  5  6  5
 [77]  6  5  5  7  5  4  5  5  4  9  4  5  6  7  9  4  4  6  4  6  5  5  5  5  4  5  5  4  5  6  5 10  5  5  5  5  5  8
[115]  5  6  6  4  6  5  6  7  5  6  6  7  4  5  5  5  5  5  5  4  4  5  5  4  4  7  5  5  5  3  5  5  5  4  5  5  6 10
[153]  4  5  4 36 12 11 10  9  8  7  7  7  6  6  6  5  5  5  5  4  4  4  4  4  4  4  4  3  3  3  3  3  3  3  3  3  3  3
[191]  3  3  3  3  3  3  3  3  3  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2
[229]  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  2  1  1  1  1  1
[267]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[305]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[343]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[381]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[419]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[457]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[495]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[533]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[571]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[609]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[647]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
[685]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

$centralization
[1] 0.04796004

$theoretical_max
[1] 494912
Code
# density

edge_density(kw2.noiso)
[1] 0.003249062

iii. Keyword 1-mode Network

Now we can look at the keyword network in 1-mode

Code
kw.mat <- cocMatrix(culture.df, Field="DE")

kw.2mode <- graph_from_incidence_matrix(kw.mat)

kw.1 <- bipartite_projection(kw.2mode, multiplicity=TRUE, which="true")

plot(kw.1, vertex.label=NA, vertex.size=degree(kw.1)/10)

Code
#can examine the components here

table(components(kw.1)$csize)

  4   5   6   7 461 
  5  11   1   1   1 
Code
#centralization

centralization.degree(kw.1)
$res
  [1] 135  44  45  42  29  32  33  26  29  25  22  21  18  22  19  18  17  15  16  14  14  14  16  18  10  12  13   9
 [29]  11  14  12  12  11  11  12  15  10  14  10  10  18  12  11  11   7   8   7   7   7   7   8   5   9   7  10   8
 [57]   7   6   6   7   9   7   9   7  10   7   7   9   9   6   9   7   9   8   9  14  11   9   7   9  11   8   6   9
 [85]  10  11  10  10   7   8  12   9   9   9   7   9  10   6  10   9  10   8   8   7   7   9   4   4   4   3   3   4
[113]   8   4   4   4   9   4   3   5   4   6   5   5   3   3   3   6   3   4   6   4   4   3   3   3   6   3   4   5
[141]   4   4   9   4   4   7   5   3   6   4   4   4   4   3   3   5   4   6   4   4   9   4   8   5   6   7   4   4
[169]   4   5   4   4   6   4   4   4   4   5   4   4   6   3   4   3   4   4   4   8   4   6   5   4   5   6   4   3
[197]   6   4   3   6   4   5   4   4   3   6   4   4   6   6   3   8   6   4   4   4   4   5   8   4   3   4   4   4
[225]   4   3   3   8   4   4   3   9   5   4   4   9   4   4   4   5   4   4   5   4   5   6   9   5   3   4   5   4
[253]   4   4   4   4   8   4   6   3   3   4   4   4   4   4   4   6   6   6   4   5   4   5   9   4   4   6   4   4
[281]   4   5   4   5   5   3   4   9   3   4   8   4   4   4   5   4   3   4   5   4   4   4   9   4   5   4   3   3
[309]   4   5   3   6   6   6   3   5   4   5   9   6   4   4   8   5   6   5   4   4   4   4   3   6   4   5   4   4
[337]   4   4   5   3   3   8   4   4   6   7   5   5   4   2   4   4   6   5   4   5   5   4   3   5   5   4   5   4
[365]   4   5   4   5   4   4   4   4   7   4   5   4   4   4   4   4   4   4   4   6   4   4   4   6   4   4   6   4
[393]   4   6   3   3   4   5   4   5   3   3   2   5   4   4   4   4   3   4   5   5   4   5   3   5   3   2   5   4
[421]   4   6   5   3   4   3   4   6   4   4   4   6   4   4   5   4   4   5   3   6   4   4   3   4   6   4   5   6
[449]   5   4   4   4   6   4   3   3   4   4   4   3   6   5   3   5   4   5   4   6   4   3   5   4   3   6   4   4
[477]   4   5   3   8   4   5   4   4   3   5   4   4   5   4   5   4   4   6   6   5   3   8   4   4   4   4   3   6
[505]   3   3   5   7   5   4   3   3   9   6   4   9   9   3   4   4   3   4   4   5   7   4   4   4   5   4   4   3
[533]   5   5   4   5   4   4   4   4   4   6   4   4   6   6   9   9   3

$centralization
[1] 0.2348563

$theoretical_max
[1] 300852
Code
# density

edge_density(kw.1)
[1] 0.01149402

2. Variation in Political Science and Sociology

i. Variation in Coauthorship

This is a little bit of a challenge because we want to see if authors are political scientists or sociologists.

A simple and reasonable way to do it would be to pull out the giant connected component (or two given the kind of graph that we have) and inspect the authors in the component(s), perhaps by community.

Code
comps <- components(coauth.noiso)

bigcomp <- which.max(comps$csize)

vert_ids <- V(coauth.noiso)[comps$membership == bigcomp]

gcc <- induced_subgraph(coauth.noiso, vert_ids)

comms <- cluster_louvain(gcc)

plot(gcc, layout=layout_with_kk, vertex.label.cex=.7, vertex.size=degree(gcc), vertex.color=as.factor(comms$membership))

If you want to look at the second largest component, you can identify it by size using a simple peek at component sizes in table and selecting the component size using which instead of which.max.

Code
table(comps$csize)

  2   3   4   5   6   7   8   9  11  14  16  19 
197  89  46  17   3   6   5   3   1   2   1   1 
Code
secondcomp <- which(comps$csize==16)

vert_ids2 <- V(coauth.noiso)[comps$membership == secondcomp]

gcc2 <- induced_subgraph(coauth.noiso, vert_ids2)

comms2 <- cluster_louvain(gcc2)

plot(gcc2, layout=layout_with_kk, vertex.label.cex=.7, vertex.size=degree(gcc2), vertex.color=as.factor(comms2$membership))

Note that you can also program your way into connecting the field and the author data. It is a huge pain as you will see. You do not need to figure this out for this assignment. Lol.

Code
# Not good practice to introduce here, but this is part is extra in multiple ways.

library(tidyverse)
library(stringr)

aufields <- culture.df %>% select(AU, UT, WC)

mau <- max(str_count(aufields$AU, ";"))

ausplit <- as.data.frame(str_split_fixed(aufields$AU, ";", mau))

aufields$id <- 1:nrow(aufields)

ausplit$id <- 1:nrow(ausplit)

aufields <- left_join(aufields, ausplit)

aulong <- aufields %>% pivot_longer(V1:V13, names_to = "author_order", values_to = "author")

aulong <- aulong %>% filter(author!="")

auwc <- aulong %>% group_by(author, WC) %>% summarise(cnt = n())

ausoc <- auwc %>% filter(WC=="SOCIOLOGY")  
  
ausoc <- rename(ausoc, soccnt=cnt)

ausoc <- ausoc %>% select(author, soccnt)

aupol <- auwc %>% filter(WC=="POLITICAL SCIENCE")  
  
aupol <- rename(aupol, polcnt=cnt)

aupol <- aupol %>% select(author, polcnt)

autot <-aulong %>%  group_by(author) %>% summarise(cnt=n())

autot <- rename(autot, totcnt=cnt)

#put all data frames into list
df_list <- list(autot, aupol, ausoc)      

#merge all data frames together
aus <- df_list %>% reduce(full_join, by='author')

aus[is.na(aus)] <- 0

aus$disc <- ifelse(aus$soccnt>0, "sociology", "political")

aus$disc <- ifelse(aus$totcnt!=aus$soccnt & aus$totcnt!=aus$polcnt, "both", aus$disc)

vname <- data.frame(author=V(coauth.noiso)$name)

vname <- left_join(vname, aus)

V(coauth.noiso)$discipline <- vname$disc

plot(coauth.noiso, layout=layout_with_fr, vertex.label=NA, vertex.size=4, vertex.color=as.factor(V(coauth.noiso)$discipline))

Code
components <- components(coauth.noiso)

table(components$csize)

  2   3   4   5   6   7   8   9  11  14  16  19 
197  89  46  17   3   6   5   3   1   2   1   1 
Code
vert_ids <- V(coauth.noiso)[components$membership == 3]

comp3 <- induced_subgraph(coauth.noiso, vert_ids)

plot(comp3, layout=layout_with_kk, vertex.size=4, vertex.color=as.factor(V(comp3)$discipline))

Code
vert_ids <- V(coauth.noiso)[components$membership == 3]

comp3 <- induced_subgraph(coauth.noiso, vert_ids)

plot(comp3, layout=layout_with_kk, vertex.size=4, vertex.color=as.factor(V(comp3)$discipline))


vert_ids <- V(coauth.noiso)[components$membership == 3]

comp3 <- induced_subgraph(coauth.noiso, vert_ids)

plot(comp3, layout=layout_with_kk, vertex.size=4, vertex.color=as.factor(V(comp3)$discipline))


vert_ids <- V(coauth.noiso)[components$membership == 3]

comp3 <- induced_subgraph(coauth.noiso, vert_ids)

plot(comp3, layout=layout_with_kk, vertex.size=4, vertex.color=as.factor(V(comp3)$discipline))


vert_ids <- V(coauth.noiso)[components$membership == 30]

comp30 <- induced_subgraph(coauth.noiso, vert_ids)

plot(comp30, layout=layout_with_kk, vertex.size=4, vertex.color=as.factor(V(comp30)$discipline))

ii. Variation in two-mode

Code
comps <- components(kw.2mode)

bigcomp <- which.max(comps$csize)

vert_ids <- V(kw.2mode)[comps$membership == bigcomp]

kw2.gcc <- induced_subgraph(kw.2mode, vert_ids)

lbls <- data.frame(SR=V(kw2.gcc)$name)

flds <- culture.df %>% select(SR, WC)

lbls <- left_join(lbls, flds)

lbls$WC <- lbls$WC %>% replace_na("word")

clrs <- data.frame(WC=c("SOCIOLOGY", "POLITICAL SCIENCE", "word"), clr=c("purple", "gold", "tomato"))

lbls <- left_join(lbls, clrs)

V(kw2.gcc)$field <- lbls$WC

lv <- cluster_louvain(kw2.gcc)

plot(kw2.gcc, layout=layout_with_kk, vertex.color=lv$membership, vertex.label.color=lbls$clr, vertex.label.cex=.5, vertex.size=degree(kw2.gcc)/5)

Code
lab.keep <- which(degree(kw2.gcc) > (quantile(degree(kw2.gcc), .90)
))


plot(kw2.gcc, vertex.label = ifelse(V(kw2.gcc) %in% lab.keep, V(kw2.gcc)$name, NA),
     layout=layout_with_kk, vertex.color=lv$membership, vertex.label.color=lbls$clr, vertex.label.cex=.5, vertex.size=degree(kw2.gcc)/2)

ii. Variation in one-mode

Code
comps <- components(kw.1)

bigcomp <- which.max(comps$csize)

vert_ids <- V(kw.1)[comps$membership == bigcomp]

kw1.gcc <- induced_subgraph(kw.1, vert_ids)

lv <- cluster_louvain(kw1.gcc)

lab.keep <- which(degree(kw1.gcc) > (quantile(degree(kw1.gcc), .95)
))

plot(kw1.gcc, vertex.label = ifelse(V(kw1.gcc) %in% lab.keep, V(kw1.gcc)$name, NA), layout=layout_with_fr, vertex.color=lv$membership, vertex.label.cex=.5, vertex.size=degree(kw1.gcc)/5)