---
title: "Introduction"
author: "Thomas W. Valente and George G. Vega Yon"
---
```{r setup, echo=FALSE, message=FALSE, warning=FALSE}
library(netdiffuseR)
knitr::opts_chunk$set(comment = "#")
```
# Network Diffusion of Innovations
## Diffusion networks
```{r valente1995, echo=FALSE, fig.align='center'}
knitr::include_graphics("valente_1995.jpg")
```
* Diffusion of innovations theory explains how new ideas and practices (innovations) spread within and between
communities.
* While many factors have been shown to influence diffusion (spatial,
economic, cultural, biological, etc.), social networks are a prominent one.
* The diffusion network model has many components, including network exposures, thresholds, infectiousness, susceptibility, hazard rates, diffusion rates (Bass model), and clustering (Moran's I).
## Thresholds
* One of the canonical concepts is the network threshold. Network thresholds (Valente, 1995; 1996), $\tau$, are defined as the proportion or number of neighbors who must have adopted before you adopt a particular behavior (innovation), $a=1$. In (very) general terms,\pause
$$
a_i = \left\{\begin{array}{ll}
1 &\mbox{if } \tau_i\leq E_i \\
0 & \mbox{Otherwise}
\end{array}\right. \qquad
E_i \equiv \frac{\sum_{j\neq i}\mathbf{X}_{ij}a_j}{\sum_{j\neq i}\mathbf{X}_{ij}}
$$
where $E_i$ is $i$'s exposure to the innovation and $\mathbf{X}$ is the adjacency matrix (the network).
* This can be generalized and extended to include covariates and other network weighting schemes (that's what __netdiffuseR__ is all about).
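
The threshold rule above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the 4-node adjacency matrix `X`, the adoption vector `a`, and the thresholds `tau` are made-up toy values, and the code does not use (or stand in for) the __netdiffuseR__ implementation.

```r
# Toy directed network (hypothetical data): X[i, j] = 1 if i is tied to j.
# The diagonal is zero, so the sums below run over j != i.
X <- matrix(c(
  0, 1, 1, 0,
  1, 0, 1, 1,
  0, 1, 0, 1,
  1, 0, 1, 0
), nrow = 4, byrow = TRUE)

a   <- c(1, 0, 0, 0)       # current adoption status: only node 1 has adopted
tau <- c(0, .5, .5, .25)   # hypothetical thresholds (as proportions)

# Exposure: share of i's neighbors who have adopted
E <- as.vector(X %*% a) / rowSums(X)

# Threshold rule: i adopts when tau_i <= E_i
a_new <- as.integer(tau <= E)

E      # 0, 1/3, 0, 1/2
a_new  # 1, 0, 0, 1
```

Node 4 adopts because half of its neighbors have adopted and its threshold is 0.25, while node 2, with an exposure of 1/3 and a threshold of 0.5, does not.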
# netdiffuseR
## Overview
__netdiffuseR__ is an R package that:
* Is designed for visualizing, analyzing, and simulating network diffusion data (in general).
* Depends on some pretty popular packages:
    * _RcppArmadillo_: so it's fast,
    * _Matrix_: so it's big,
    * _statnet_ and _igraph_: so it's not from scratch.
* Can handle big graphs, e.g., an adjacency matrix with more than 4 billion elements (PR for RcppArmadillo).
* Is already on CRAN, with ~6,000 downloads since its first version in Feb 2016.
* Has a lot of features that make it easy to read (dynamic) network data, making it a nice companion to other network packages.
## Datasets
- __netdiffuseR__ has the three classic Diffusion Network Datasets:
    - `medInnovationsDiffNet` Doctors and the innovation of Tetracycline (1955).
    - `brfarmersDiffNet` Brazilian farmers and the innovation of Hybrid Corn Seed (1966).
    - `kfamilyDiffNet` Korean women and Family Planning methods (1973).
```{r printing}
brfarmersDiffNet
medInnovationsDiffNet
kfamilyDiffNet
```
## Visualization methods
```{r viz, cache=TRUE, eval=TRUE}
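set.seed(12315)

# Simulate a diffusion process on a 400-node small-world network, 6 time periods
x <- rdiffnet(
  400, t = 6, rgraph.args = list(k=6, p=.3),
  seed.graph = "small-world",
  seed.nodes = "central", rewire = FALSE, threshold.dist = 1/4
  )

plot(x)                   # The network itself (a single time slice)
plot_diffnet(x)           # Network at several time points, highlighting adopters
plot_diffnet2(x)          # Single graph with vertices colored by time of adoption
plot_adopters(x)          # Adoption curve: cumulative and new adopters over time
plot_threshold(x)         # Threshold level versus time of adoption
plot_infectsuscep(x, K=2) # Infectiousness versus susceptibility
plot_hazard(x)            # Hazard rate over time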
```
# Problems
1. Using the diffnet object in [`intro.rda`](intro.rda), use the function `plot_threshold` specifying
shapes and colors according to the variables ItrustMyFriends and Age. Do you see any pattern?
(solution script and solution plot)
```{r datasim, echo=FALSE, eval=TRUE}
set.seed(1252)
# Simulated vertex attributes for 200 nodes
dat <- data.frame(
  ItrustMyFriends = sample(c(0,1), 200, TRUE),
  Age             = 10 + rpois(200, 4)
)

# Random (Erdos-Renyi) graph used as the seed network
net <- rgraph_er(200, p = .05)
# net <- diag_expand(list(net, net))
# net[cbind(1:20, 101:120)] <- 1
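# Generating the process. Exposure is not normalized, so thresholds are counts
# of adopting neighbors: 1 for those who trust their friends, 4 otherwise
diffnet <- rdiffnet(
  threshold.dist = 4 - dat$ItrustMyFriends*3,
  seed.graph     = net,
  t              = 6,
  seed.nodes     = 9:25,
  exposure.args  = list(normalized=FALSE),
  rewire         = FALSE)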
diffnet[["ItrustMyFriends"]] <- dat$ItrustMyFriends
diffnet[["Age"]] <- dat$Age
save(diffnet, file = "intro.rda")
```