--- title: "Random Graphs, Social Networks and the Internet: Simulations" subtitle: "Mathematics Sin Fronteras, 2021" author: "Jose A. Sanchez & Mariana Olvera Cravioto" date: "09/23/2021" output: html_document: default pdf_document: default --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) ## Package used to save and visualize networks. library(igraph) ## Package used to visualize networks dynamically. library(networkD3) ## Package for visualizing adjacency matrices. library(plot.matrix) ## Not 100% sure of what this is for. if(!require(EnvStats)){ install.packages("EnvStats") library(EnvStats) } cbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7") set.seed(2) ``` In the following, we generate visualization of different random graph models through code on the programming language R. In particular, we visualize the Erdos-Renyi, inhomogeneous Erdos-Renyi, stochastic block model, intersection graph and preferential attachment models. For each random graph, we create static and interactive plots. ## Erdos Renyi Random Graph Model. The original Erdos-Renyi graph model has two input parameters: the total number of nodes $n$ and the probability of connection $p$. Basically, we include the edge $(i,j)$ in the graph with probability $p$ and the edges are all independent from each other. Equivalently, we have that for all $1\leq i red ---> black. "black")) colors = colfunc(dmax) V(NH.ErdosRenyi)$color <- colors[deg + 1] # CWe color vertices according to their # degree. lw <- layout_with_fr(NH.ErdosRenyi) plot(NH.ErdosRenyi, layout = lw, main = "Inhomogeneous Erdos-Renyi: Pareto weights") ######################################################### ## Paso 3: Dynamic plot. ######################################################### edges = which(adj == 1, arr.ind = TRUE) edges = as.data.frame(edges) p <- simpleNetwork(edges, height="100px", width="100px", Source = 1, # column number of source Target = 2, # column number of target linkDistance = 30, # distance between node. Increase this value to have more space between nodes charge = -100, # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value) fontSize = 1, # size of the node names fontFamily = "serif", # font og node names linkColour = "#666", # colour of edges, MUST be a common colour for the whole graph nodeColour = "#69b3a2", # colour of nodes, MUST be a common colour for the whole graph opacity = 0.9, # opacity of nodes. 0=transparent. 1=no transparency zoom = T # Can you zoom on the figure? ) p ######################################################### ## Paso 3: Play with parameters. ######################################################### ``` \newpage ## Stochastic block model For the stochastic block model, each vertex in the graph belong to one of $K$ communities. The chance that a vertex $i$ in class $A$ and a vertex $j$ in class $B$ are connected depends on a kernel function $\kappa:\{1,\ldots, K\}^2\to \mathbb{R}_{+}$. More specifically, the probability is $p_{A,B}= \frac{\kappa(A,B)}{n}$. ```{r, fig.width=9, fig.height=6, fig.align='center', warning=FALSE} ######################################################### ## Paso 1: Generate the adjacency matrix ######################################################### n1 = 100 # We model a netowrk with 3 even classes. n2 = 100 n3 = 100 n = n1 + n2 + n3 class = rep(c(1,2,3), c(n1,n2,n3)) cmat = matrix(c(15, 0.2, 0.2, # We generate the matrix of "average connections". 0.2, 15, 0.2, 0.2, 0.2, 15), ncol= 3, byrow = TRUE) pmat = cmat / n # Matrix of connection probabilities. adj = matrix(0, n, n) # We create the n x n adjacency matrix. # We start with no connections. for(.i in 2:n){ for(.j in 1:(.i-1)){ # For each pair of vertices (.i, .j): .p = pmat[class[.i], class[.j]] # a) We find the probability of connection with the weights. adj[.i, .j] = rbinom(1,1, .p) # b) We include the edge with probability p. } } adjsymm = adj + t(adj) ######################################################### ## Paso 2: Visualizing graph with igraph. ######################################################### SBM = graph_from_adjacency_matrix(adjsymm) # Create the graph object. V(SBM)$size <- 6 V(SBM)$label <- "" V(SBM)$frame.color <- "white" E(SBM)$arrow.mode <- 0 ## We color the vertices according to their community assignment. V(SBM)$color <- rep(cbPalette[c(2,3,4)], c(n1,n2,n3)) lw <- layout_with_fr(SBM) plot(SBM, layout = lw, main = "Stochastic Block Model: 3 communities") ######################################################### ## Paso 3: Dynamic plot ######################################################### edges = which(adj == 1, arr.ind = TRUE) edges = as.data.frame(edges) p <- simpleNetwork(edges, height="100px", width="100px", Source = 1, # column number of source Target = 2, # column number of target linkDistance = 10, # distance between node. Increase this value to have more space between nodes charge = -100, # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value) fontSize = 1, # size of the node names fontFamily = "serif", # font og node names linkColour = "#666", # colour of edges, MUST be a common colour for the whole graph nodeColour = "#69b3a2", # colour of nodes, MUST be a common colour for the whole graph opacity = 0.9, # opacity of nodes. 0=transparent. 1=no transparency zoom = T # Can you zoom on the figure? ) p ######################################################### ## Paso 3: Play with the parameters! ######################################################### ####### Suggestion of things to modify: ## You can modify the community kernel function. ## 1) What would happen if we increase the probability of connection across communities? ## 2) What would happen if you give different probabilities of connection to each community? ## Las comunidades. ## 3) What happens if the communities are not of the same size? ``` \newpage ## Intersection Model So far, we have visualized graphs with "long cycles." With this, we refer to the fact that cycles have a length $O(\log(n))$, where $n$ is the number of vertices. This means that as $n$ grows, the cycles become increasingly long and the presence of triangles becomes increasingly unlikely. In real social networks, this is unrealistic, since in real networks it is common for nodes to arrange in "groups of friends" in which people with common friends are friends themselves. This means social networks tend to have many triangles. The intersection model generates random graphs with a high presence of triangles. ```{r, fig.width= 10, fig.height=8, fig.align='center', warning=FALSE} ######################################################### ## Paso 1: generar la gráfica bipartita. ######################################################### n = 100 # Total vertex count (actores). beta = 0.5 gamma = 0.5 m = floor(beta*n) # Number of vertices on the bipartite opposite side (movies) # old/good. #b = 3 #alpha = 2.5 b = 5 # We simulate Pareto weights. alpha = 3.5 U = runif(n = n, min = 0, max = 1) w = b/( (1 - U)^(1/alpha) ) # Weights!! L = sum(w) # Sum of weights. bip = matrix(0, ncol = n2, nrow = n) for(.i in 1:n){ for(.j in 1:m){ # for each pair of (actor, movie): .pi = min(gamma*w[.i]/n, 1) # a) we find the connection probability. bip[.i, .j] = rbinom(n = 1, # b) We add a connection between them with probability p. size = 1, prob = .pi) } } ######################################################### ## Paso 2: Generate the adjacency matrix of actors. ######################################################### adj = matrix(0, n, n) count = 0 for(.i in 2:n){ for(.j in 1:(.i-1)){ # For each pair of actors. .iftrian = sum(bip[.i, ] * bip[.j, ]) # iftrian = "How many movies did i and j # collaborate on?" if(.iftrian > 0){ # If i and j have collaborated, connect them. adj[.i, .j] = 1 } } } adjsymm = adj + t(adj) ######################################################### ## Paso 3: Visualize the "movie cooperation" graph. ######################################################### IntGraph = graph_from_adjacency_matrix(adjsymm) V(IntGraph)$size <- 4 V(IntGraph)$label <- "" V(IntGraph)$frame.color <- "white" E(IntGraph)$arrow.mode <- 0 ## Coloring the vertices according to degree. { deg = apply(adjsymm, MARGIN = 1, sum) # a) We find the degrees. dmax = max(deg) + 1 # b) Find the maximum degree + 1 colfunc <- colorRampPalette(c("orange", # c) We generate a gradient: "red", # Orange ---> red ---> black. "black")) colors = colfunc(dmax) V(IntGraph)$color <- colors[deg + 1] # We color the vertices according to degree. } lw <- layout_with_fr(IntGraph) coeff = transitivity(IntGraph, type = "average") plot(IntGraph, layout = lw, main = paste("Intersection model: clustering = ", coeff) ) ######################################################### ## Paso 4: Dynamic visualization. ######################################################### edges = which(adj == 1, arr.ind = TRUE) edges = as.data.frame(edges) p <- simpleNetwork(edges, height="100px", width="100px", Source = 1, # column number of source Target = 2, # column number of target linkDistance = 30, # distance between node. Increase this value to have more space between nodes charge = -100, # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value) fontSize = 1, # size of the node names fontFamily = "serif", # font og node names linkColour = "#666", # colour of edges, MUST be a common colour for the whole graph nodeColour = "#69b3a2", # colour of nodes, MUST be a common colour for the whole graph opacity = 0.9, # opacity of nodes. 0=transparent. 1=no transparency zoom = T # Can you zoom on the figure? ) p ``` \newpage ## Preferential Attachment. The model of preferential attachment has vertices that arrive sequentially, and randomly attach with the present vertices according to their popularity. Therefore, older vertices have large degrees, and the new vertices tend to have small degrees. ```{r, fig.width= 10, fig.height=8, fig.align='center', warning=FALSE} ######################################################### ## Paso 1: Generate adjacency matrix sequentially. ######################################################### n = 100 deg = rep(c(2,1), c(1,n-1)) adj = diag(n) adjsymm = diag(n) for(.k in 2:n){ .ps = deg[1:.k] / sum(deg[1:.k]) adj[.k, .k] = 0 adjsymm[.k, .k] = 0 .j = sample(x = 1:.k, size = 1, prob = .ps) adj[.k, .j] = 1 adjsymm[.k, .j] = 1 adjsymm[.j, .k] = 1 deg = apply(adjsymm, MARGIN = 1, sum) } ######################################################### ## Paso 3: Visualize the graph. ######################################################### AlbBar = graph_from_adjacency_matrix(adjsymm) V(AlbBar)$size <- 4 V(AlbBar)$frame.color <- "white" E(AlbBar)$arrow.mode <- 0 lw <- layout_with_fr(AlbBar) coeff = transitivity(AlbBar, type = "average") plot(AlbBar, layout = lw, main = paste("Preferential Attachment")) ######################################################### ## Paso 4: Dynamic plot ######################################################### edges = which(adj == 1, arr.ind = TRUE) edges = as.data.frame(edges) p <- simpleNetwork(edges, height="100px", width="100px", Source = 1, # column number of source Target = 2, # column number of target linkDistance = 40, # distance between node. Increase this value to have more space between nodes charge = -50, # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value) fontSize = 15, # size of the node names fontFamily = "serif", # font og node names linkColour = "#666", # colour of edges, MUST be a common colour for the whole graph nodeColour = "#69b3a2", # colour of nodes, MUST be a common colour for the whole graph opacity = 0.9, # opacity of nodes. 0=transparent. 1=no transparency zoom = T # Can you zoom on the figure? ) p ```