---
title: "Introduction"
author: "Thomas W. Valente and George G. Vega Yon"
---

```{r setup, echo=FALSE, message=FALSE, warning=FALSE}
library(netdiffuseR)
knitr::opts_chunk$set(comment = "#")
```

# Network Diffusion of Innovations

## Diffusion networks

```{r valente1995, echo=FALSE, fig.align='center'}
knitr::include_graphics("valente_1995.jpg")
```

* Explains how new ideas and practices (innovations) spread within and between communities.

* While many factors have been shown to influence diffusion (spatial, economic, cultural, biological, etc.), social networks are a prominent one.

* There are many components in the diffusion network model, including network exposures, thresholds, infectiousness, susceptibility, hazard rates, diffusion rates (Bass model), clustering (Moran's I), and so on.

## Thresholds

* One of the canonical concepts is the network threshold. Network thresholds (Valente, 1995; 1996), $\tau$, are defined as the required proportion or number of neighbors that leads you to adopt a particular behavior (innovation), $a=1$. In (very) general terms\pause

    $$
    a_i = \left\{\begin{array}{ll}
    1 &\mbox{if } \tau_i\leq E_i \\
    0 & \mbox{Otherwise}
    \end{array}\right. \qquad
    E_i \equiv \frac{\sum_{j\neq i}\mathbf{X}_{ij}a_j}{\sum_{j\neq i}\mathbf{X}_{ij}}
    $$

    where $E_i$ is $i$'s exposure to the innovation and $\mathbf{X}$ is the adjacency matrix (the network).

* This can be generalized and extended to include covariates and other network weighting schemes (that's what __netdiffuseR__ is all about).

# netdiffuseR

## Overview

__netdiffuseR__ is an R package that:

* Is designed for visualizing, analyzing and simulating network diffusion data (in general).
* Depends on some pretty popular packages:

    * _RcppArmadillo_: So it's fast,
    * _Matrix_: So it's big,
    * _statnet_ and _igraph_: So it's not from scratch

* Can handle big graphs, e.g., an adjacency matrix with more than 4 billion elements (PR for RcppArmadillo)

* Already on CRAN with ~6,000 downloads since its first version, Feb 2016,

* A lot of features to make it easy to read network (dynamic) data, making it a nice companion of other net packages.

## Datasets

- __netdiffuseR__ has the three classic Diffusion Network Datasets:

    - `medInnovationsDiffNet` Doctors and the innovation of Tetracycline (1955).
    - `brfarmersDiffNet` Brazilian farmers and the innovation of Hybrid Corn Seed (1966).
    - `kfamilyDiffNet` Korean women and Family Planning methods (1973).

```{r printing}
brfarmersDiffNet
medInnovationsDiffNet
kfamilyDiffNet
```

## Visualization methods

```{r viz, cache=TRUE, eval=TRUE}
set.seed(12315)
x <- rdiffnet(
  400, t = 6, rgraph.args = list(k=6, p=.3),
  seed.graph = "small-world",
  seed.nodes = "central", rewire = FALSE, threshold.dist = 1/4
  )

plot(x)
plot_diffnet(x)
plot_diffnet2(x)
plot_adopters(x)
plot_threshold(x)
plot_infectsuscep(x, K=2)
plot_hazard(x)
```

# Problems

1. Using the diffnet object in [`intro.rda`](intro.rda), use the function `plot_threshold` specifying shapes and colors according to the variables ItrustMyFriends and Age. Do you see any pattern? (solution script and solution plot)

```{r datasim, echo=FALSE, eval=TRUE}
set.seed(1252)
dat <- data.frame(
  ItrustMyFriends = sample(c(0,1), 200, TRUE),
  Age = 10 + rpois(200, 4)
)

net <- rgraph_er(200, p = .05)
# net <- diag_expand(list(net, net))
# net[cbind(1:20, 101:120)] <- 1

# Generating the process
diffnet <- rdiffnet(
  threshold.dist = 4 - dat$ItrustMyFriends*3,
  seed.graph = net,
  t=6,
  seed.nodes = c(9:25),
  exposure.args = list(normalized=FALSE),
  rewire = FALSE)

diffnet[["ItrustMyFriends"]] <- dat$ItrustMyFriends
diffnet[["Age"]] <- dat$Age

save(diffnet, file = "intro.rda")
```
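
When reading a threshold plot, it may help to recall the threshold rule from the Thresholds section. The following is a minimal base-R sketch of how exposure $E_i$ and the adoption rule $\tau_i \leq E_i$ could be computed by hand on a toy network. The matrix `X`, the adoption vector `a`, and the thresholds `tau` are made-up illustration values, and keeping earlier adopters as adopters is an assumption of this sketch; it is not meant to show how __netdiffuseR__ computes exposure internally.

```r
# Toy directed network (illustrative values): X[i, j] = 1 means j is a neighbor of i.
# The diagonal is zero, so the sums below run over j != i.
X <- matrix(c(
  0, 1, 1, 0,
  1, 0, 1, 1,
  0, 1, 0, 1,
  1, 1, 1, 0
), nrow = 4, byrow = TRUE)

# Current adoption status a_j (1 = adopted): nodes 1 and 3 have adopted.
a <- c(1, 0, 1, 0)

# Hypothetical thresholds tau_i, expressed as proportions of neighbors.
tau <- c(0.5, 0.5, 0.75, 1.0)

# Exposure: share of i's neighbors that have already adopted,
# E_i = sum_j X_ij a_j / sum_j X_ij.
E <- as.vector(X %*% a) / rowSums(X)

# Threshold rule: i adopts when tau_i <= E_i.
# Assumption of this sketch: nodes that already adopted stay adopters.
a_next <- pmax(a, as.integer(tau <= E))

print(round(E, 3))
print(a_next)
```

In this toy case node 2 has two adopters among its three neighbors, so its exposure of 2/3 exceeds its threshold of 0.5 and it adopts, while node 4 has the same exposure but a threshold of 1 and does not.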