Code
library(dplyr)
library(statnet)
library(igraph)
library(ggplot2)
This is a cheat sheet for the eighth homework on an introduction to network statistics. It works through each step in R.
First, you need the following packages.
library(dplyr)
library(statnet)
library(igraph)
library(ggplot2)
First, we read the data. You will see that there are two networks for each of the networks in the Vickers class dataset. The objects that end with .g are igraph objects and the objects with .n are statnet objects.
load("data/vickers_class.Rdata")
We can use igraph to construct random networks.
<- vector('list', 1000)
gs
for(i in 1:1000){
<- sample_gnm(n=gorder(best.g), m=gsize(best.g))
gs[[i]]
}
library(ggplot2)
<- data.frame(mdist = unlist(lapply(gs, mean_distance)))
gs.dist
ggplot(gs.dist, aes(x=mdist))+
geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
geom_density(alpha=0.2, fill="tomato") +
geom_vline(aes(xintercept=mean_distance(best.g)),
color="blue", linetype="dashed", size=1) +
labs(title="Average Path Length: Vickers Best Network",
x ="Average Path Length", y = "Density") +
theme_bw()
#Then the clustering coefficient for the Best friends graph.
<- data.frame(transitiv = unlist(lapply(gs, transitivity)))
gs.transitiv
ggplot(gs.transitiv, aes(x=transitiv))+
geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
geom_density(alpha=0.2, fill="tomato") +
geom_vline(aes(xintercept=transitivity(best.g)),
color="blue", linetype="dashed", size=1) +
labs(title="Transitivity: Vickers Best Network",
x ="Transitivity", y = "Density") +
theme_bw()
We can
<- vector('list', 1000)
gs
for(i in 1:1000){
<- sample_gnm(n=gorder(work.g), m=gsize(work.g))
gs[[i]]
}
library(ggplot2)
<- data.frame(mdist = unlist(lapply(gs, mean_distance)))
gs.dist
ggplot(gs.dist, aes(x=mdist))+
geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
geom_density(alpha=0.2, fill="tomato") +
geom_vline(aes(xintercept=mean_distance(work.g)),
color="blue", linetype="dashed", size=1) +
labs(title="Average Path Length: Vickers Work Network",
x ="Average Path Length", y = "Density") +
theme_bw()
#CLustering coefficient for work
<- data.frame(transitiv = unlist(lapply(gs, transitivity)))
gs.transitiv
ggplot(gs.transitiv, aes(x=transitiv))+
geom_histogram(bins=20, aes(y = ..density..), color="black", fill="gray") +
geom_density(alpha=0.2, fill="tomato") +
geom_vline(aes(xintercept=transitivity(work.g)),
color="blue", linetype="dashed", size=1) +
labs(title="Transitivity: Vickers Work Network",
x ="Transitivity", y = "Density") +
theme_bw()
For this section I will build three different kinds of networks, but you can certainly build more.
<- qaptest(list(best.n, work.n), gcor, g1=1, g2=2, reps=1000)
netcor
summary(netcor)
QAP Test Results
Estimated p-values:
p(f(perm) >= f(d)): 0
p(f(perm) <= f(d)): 1
Test Diagnostics:
Test Value (f(d)): 0.6006287
Replications: 1000
Distribution Summary:
Min: -0.1391789
1stQ: -0.03523897
Med: 0.001445701
Mean: 0.000987143
3rdQ: 0.03813038
Max: 0.1848691
<-netlogit(work.n, list(best.n, gender.n),reps=1000)
nlog
summary(nlog)
Network Logit Model
Coefficients:
Estimate Exp(b) Pr(<=b) Pr(>=b) Pr(>=|b|)
(intercept) -1.6459752 0.1928244 0.000 1.000 0.000
x1 3.4032756 30.0624127 1.000 0.000 0.000
x2 0.8200015 2.2705033 0.998 0.002 0.002
Goodness of Fit Statistics:
Null deviance: 1125.671 on 812 degrees of freedom
Residual deviance: 746.9267 on 809 degrees of freedom
Chi-Squared test of fit improvement:
378.7443 on 3 degrees of freedom, p-value 0
AIC: 752.9267 BIC: 767.0252
Pseudo-R^2 Measures:
(Dn-Dr)/(Dn-Dr+dfn): 0.3180736
(Dn-Dr)/Dn: 0.3364609
Contingency Table (predicted (rows) x actual (cols)):
Actual
Predicted 0 1
0 493 138
1 15 166
Total Fraction Correct: 0.8115764
Fraction Predicted 1s Correct: 0.9171271
Fraction Predicted 0s Correct: 0.7812995
False Negative Rate: 0.4539474
False Positive Rate: 0.02952756
Test Diagnostics:
Null Hypothesis: qap
Replications: 1000
Distribution Summary:
(intercept) x1 x2
Min -10.310464 -4.781974 -4.036494
1stQ -8.340742 -1.146125 -0.916402
Median -7.766053 -0.111457 -0.083353
Mean -7.779412 -0.070560 -0.002356
3rdQ -7.249614 0.893377 0.825926
Max -5.374531 6.277920 4.747512