HW8 Cheat Sheet

This is a cheat sheet for the eighth homework on an introduction to network statistics. It works through each step in R.

First, you need the following packages.

Code
library(dplyr)
library(statnet)
library(igraph)
library(ggplot2)

1. Basic Description

First, we read the data. You will see that there are two networks for each of the networks in the Vickers class dataset. The objects that end with .g are igraph objects and the objects with .n are statnet objects.

Code
load("data/vickers_class.Rdata")

We can use igraph to construct random networks.

Code
gs <- vector('list', 1000)


for(i in 1:1000){
  gs[[i]] <- sample_gnm(n=gorder(best.g), m=gsize(best.g))
}


library(ggplot2)

gs.dist <- data.frame(mdist = unlist(lapply(gs, mean_distance)))

ggplot(gs.dist, aes(x=mdist))+
  geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
  geom_density(alpha=0.2, fill="tomato") +
  geom_vline(aes(xintercept=mean_distance(best.g)),
             color="blue", linetype="dashed", size=1) +
  labs(title="Average Path Length: Vickers Best Network",
       x ="Average Path Length", y = "Density") +
  theme_bw()

Code
#Then the clustering coefficient for the Best friends graph.

gs.transitiv <- data.frame(transitiv = unlist(lapply(gs, transitivity)))

ggplot(gs.transitiv, aes(x=transitiv))+
  geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
  geom_density(alpha=0.2, fill="tomato") +
  geom_vline(aes(xintercept=transitivity(best.g)),
             color="blue", linetype="dashed", size=1) +
  labs(title="Transitivity: Vickers Best Network",
       x ="Transitivity", y = "Density") +
  theme_bw()

We can

Code
gs <- vector('list', 1000)

for(i in 1:1000){
  gs[[i]] <- sample_gnm(n=gorder(work.g), m=gsize(work.g))
}


library(ggplot2)

gs.dist <- data.frame(mdist = unlist(lapply(gs, mean_distance)))

ggplot(gs.dist, aes(x=mdist))+
  geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
  geom_density(alpha=0.2, fill="tomato") +
  geom_vline(aes(xintercept=mean_distance(work.g)),
             color="blue", linetype="dashed", size=1) +
  labs(title="Average Path Length: Vickers Work Network",
       x ="Average Path Length", y = "Density") +
  theme_bw()

Code
#CLustering coefficient for work

gs.transitiv <- data.frame(transitiv = unlist(lapply(gs, transitivity)))

ggplot(gs.transitiv, aes(x=transitiv))+
  geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
  geom_density(alpha=0.2, fill="tomato") +
  geom_vline(aes(xintercept=transitivity(work.g)),
             color="blue", linetype="dashed", size=1) +
  labs(title="Transitivity: Vickers Work Network",
       x ="Transitivity", y = "Density") +
  theme_bw()

2. QAP Correlation

For this section I will build three different kinds of networks, but you can certainly build more.

Code
netcor <- qaptest(list(best.n, work.n), gcor, g1=1, g2=2, reps=1000)

summary(netcor)

QAP Test Results

Estimated p-values:
    p(f(perm) >= f(d)): 0 
    p(f(perm) <= f(d)): 1 

Test Diagnostics:
    Test Value (f(d)): 0.6006287 
    Replications: 1000 
    Distribution Summary:
        Min:     -0.1391789 
        1stQ:    -0.03523897 
        Med:     0.001445701 
        Mean:    0.000987143 
        3rdQ:    0.03813038 
        Max:     0.1848691 

3. QAP Logistic Regression

Code
nlog <-netlogit(work.n, list(best.n, gender.n),reps=1000)

summary(nlog)

Network Logit Model

Coefficients:
            Estimate   Exp(b)     Pr(<=b) Pr(>=b) Pr(>=|b|)
(intercept) -1.6459752  0.1928244 0.000   1.000   0.000    
x1           3.4032756 30.0624127 1.000   0.000   0.000    
x2           0.8200015  2.2705033 0.998   0.002   0.002    

Goodness of Fit Statistics:

Null deviance: 1125.671 on 812 degrees of freedom
Residual deviance: 746.9267 on 809 degrees of freedom
Chi-Squared test of fit improvement:
     378.7443 on 3 degrees of freedom, p-value 0 
AIC: 752.9267   BIC: 767.0252 
Pseudo-R^2 Measures:
    (Dn-Dr)/(Dn-Dr+dfn): 0.3180736 
    (Dn-Dr)/Dn: 0.3364609 
Contingency Table (predicted (rows) x actual (cols)):

         Actual
Predicted     0     1
        0   493   138
        1    15   166

    Total Fraction Correct: 0.8115764 
    Fraction Predicted 1s Correct: 0.9171271 
    Fraction Predicted 0s Correct: 0.7812995 
    False Negative Rate: 0.4539474 
    False Positive Rate: 0.02952756 

Test Diagnostics:

    Null Hypothesis: qap 
    Replications: 1000 
    Distribution Summary:

       (intercept)         x1         x2
Min     -10.310464  -4.781974  -4.036494
1stQ     -8.340742  -1.146125  -0.916402
Median   -7.766053  -0.111457  -0.083353
Mean     -7.779412  -0.070560  -0.002356
3rdQ     -7.249614   0.893377   0.825926
Max      -5.374531   6.277920   4.747512