---
title: "Simulation of diffusion networks: rdiffnet"
author: "Thomas W. Valente and George G. Vega Yon"
---
```{r setup, echo=FALSE, message=FALSE, warning=FALSE}
library(netdiffuseR)
knitr::opts_chunk$set(comment = "#")
```
# Introduction
Before we start, a review of the concepts we will be using here
1. Exposure: Proportion/number of neighbors that has adopted an innovation at each point in time.
2. Threshold: The proportion/number of your neighbors who had adopted at or one time period before ego (the focal individual) adopted.
3. Infectiousness: How much $i$'s adoption affects her alters.
4. Susceptibility: How much $i$'s alters' adoption affects her.
5. Structural equivalence: How similar are $i$ and $j$ in terms of position in the network.
# Simulating diffusion networks
We will simulate a diffusion network with the following parameters:
1. Will have 1,000 vertices,
2. Will span 20 time periods,
3. The initial adopters (seeds) will be selected random,
4. Seeds will be a 10\% of the network,
5. The graph (network) will be small-world,
6. Will use the WS algorithmwith $p=.2$ (probability of rewire).
7. Threshold levels will be uniformly distributed between [0.3, 0.7\]
To generate this diffusion network we can use the `rdiffnet` function included in the package:
```{r Generating the random graph}
# Setting the seed for the RNG
set.seed(1213)
# Generating a random diffusion network
net <- rdiffnet(
n = 1e3, # 1.
t = 20, # 2.
seed.nodes = "random", # 3.
seed.p.adopt = .1, # 4.
seed.graph = "small-world", # 5.
rgraph.args = list(p=.2), # 6.
threshold.dist = function(x) runif(1, .3, .7) # 7.
)
```
* The function `rdiffnet` generates random diffusion networks. Main features:
1. Simulating random graph or using your own,
2. Setting threshold levels per node,
3. Network rewiring throughout the simulation, and
4. Setting the seed nodes.
* The simulation algorithm is as follows:
1. If required, a baseline graph is created,
2. Set of initial adopters and threshold distribution are established,
3. The set of t networks is created (if required), and
4. Simulation starts at t=2, assigning adopters based on exposures and thresholds:
a. For each $i \in N$, if its exposure at $t-1$ is greater than its threshold, then
adopts, otherwise continue without change.
b. next $i$
# Rumor spreading
```{r sim-rumor}
library(netdiffuseR)
set.seed(09)
diffnet_rumor <- rdiffnet(
n = 5e2,
t = 5,
seed.graph = "small-world",
rgraph.args = list(k = 4, p = .3),
seed.nodes = "random",
seed.p.adopt = .05,
rewire = TRUE,
threshold.dist = function(i) 1L,
exposure.args = list(normalized = FALSE)
)
```
```{r summary-rumor}
summary(diffnet_rumor)
```
```{r plot-rumor, fig.align='center', cache=TRUE}
plot_diffnet(diffnet_rumor, slices = c(1, 3, 5))
# We want to use igraph to compute layout
igdf <- diffnet_to_igraph(diffnet_rumor, slices=c(1,2))[[1]]
pos <- igraph::layout_with_drl(igdf)
plot_diffnet2(diffnet_rumor, vertex.size = dgr(diffnet_rumor)[,1], layout=pos)
```
# Difussion
```{r sim-complex}
set.seed(09)
diffnet_complex <- rdiffnet(
seed.graph = diffnet_rumor$graph,
seed.nodes = which(diffnet_rumor$toa == 1),
rewire = FALSE,
threshold.dist = function(i) rbeta(1, 3, 10),
name = "Diffusion",
behavior = "Some social behavior"
)
```
```{r plot-complex-and-disease}
plot_adopters(diffnet_rumor, what = "cumadopt", include.legend = FALSE)
plot_adopters(diffnet_complex, bg="tomato", add=TRUE, what = "cumadopt")
legend("topleft", legend = c("Disease", "Complex"), col = c("lightblue", "tomato"),
bty = "n", pch=19)
```
# Mentor Matching
```{r mentor-match, cache = TRUE}
# Finding mentors
mentors <- mentor_matching(diffnet_rumor, 25, lead.ties.method = "random")
# Simulating diffusion with these mentors
set.seed(09)
diffnet_mentored <- rdiffnet(
seed.graph = diffnet_complex,
seed.nodes = which(mentors$`1`$isleader),
rewire = FALSE,
threshold.dist = diffnet_complex[["real_threshold"]],
name = "Diffusion using Mentors"
)
summary(diffnet_mentored)
```
```{r toa_mat-mentors}
cumulative_adopt_count(diffnet_complex)
cumulative_adopt_count(diffnet_mentored)
```
# Example by changing threshold
```{r sim-sim, cache = TRUE, collapse = TRUE}
# Simulating a scale-free homophilic network
set.seed(1231)
X <- rep(c(1,1,1,1,1,0,0,0,0,0), 50)
net <- rgraph_ba(t = 499, m=4, eta = X)
# Taking a look in igraph
ig <- igraph::graph_from_adjacency_matrix(net)
plot(ig, vertex.color = c("azure", "tomato")[X+1], vertex.label = NA,
vertex.size = sqrt(dgr(net)))
# Now, simulating a bunch of diffusion processes
nsim <- 500L
ans_1and2 <- vector("list", nsim)
set.seed(223)
for (i in 1:nsim) {
# We just want the cum adopt count
ans_1and2[[i]] <-
cumulative_adopt_count(
rdiffnet(
seed.graph = net,
t = 10,
threshold.dist = sample(1:2, 500L, TRUE),
seed.nodes = "random",
seed.p.adopt = .10,
exposure.args = list(outgoing = FALSE, normalized = FALSE),
rewire = FALSE
)
)
# Are we there yet?
if (!(i %% 50))
message("Simulation ", i," of ", nsim, " done.")
}
# Extracting prop
ans_1and2 <- do.call(rbind, lapply(ans_1and2, "[", i="prop", j=))
ans_2and3 <- vector("list", nsim)
set.seed(223)
for (i in 1:nsim) {
# We just want the cum adopt count
ans_2and3[[i]] <-
cumulative_adopt_count(
rdiffnet(
seed.graph = net,
t = 10,
threshold.dist = sample(2:3, 500L, TRUE),
seed.nodes = "random",
seed.p.adopt = .10,
exposure.args = list(outgoing = FALSE, normalized = FALSE),
rewire = FALSE
)
)
# Are we there yet?
if (!(i %% 50))
message("Simulation ", i," of ", nsim, " done.")
}
ans_2and3 <- do.call(rbind, lapply(ans_2and3, "[", i="prop", j=))
```
This can actually be simplified by using the function `rdiffnet_multiple`. The following lines of code accomplish the same as the previous code avoiding the for-loop (from the user's perspective). Besides of the usual parameters passed to `rdiffnet`, the `rdiffnet_multiple` function requires `R` (number of repetitions/simulations), and `statistic` (a function that returns the statistic of insterst). Optionally, the user may choose to specify the number of clusters to run it in parallel (multiple CPUs):
```{r rdiffnet-multiple}
ans_1and3 <- rdiffnet_multiple(
# Num of sim
R = nsim,
# Statistic
statistic = function(d) cumulative_adopt_count(d)["prop",],
seed.graph = net,
t = 10,
threshold.dist = sample(1:3, 500, TRUE),
seed.nodes = "random",
seed.p.adopt = .1,
rewire = FALSE,
exposure.args = list(outgoing=FALSE, normalized=FALSE),
# Running on 4 cores
ncpus = 4L
)
```
```{r sim-sim-results}
boxplot(ans_1and2, col="ivory", xlab = "Time", ylab = "Threshold")
boxplot(ans_2and3, col="tomato", add=TRUE)
boxplot(t(ans_1and3), col = "steelblue", add=TRUE)
legend(
"topleft",
fill = c("ivory", "tomato", "steelblue"),
legend = c("1/2", "2/3", "1/3"),
title = "Threshold range",
bty ="n"
)
```
* Example simulating a thousand networks by changing threshold levels.
The final prevalence, or hazard as a function of threshold levels.
# Problems
1. Given the following types of networks: Small-world, Scale-free, Bernoulli,
what set of $n$ initiators maximizes diffusion?
(solution script and solution plot)