---
title: "Random Graphs, Social Networks and the Internet: Simulations"
subtitle: "Mathematics Sin Fronteras, 2021"
author: "Jose A. Sanchez & Mariana Olvera Cravioto"
date: "09/23/2021"
output:
  html_document: default
  pdf_document: default
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)

## Package used to save and visualize networks.
library(igraph)

## Package used to visualize networks dynamically.
library(networkD3)

## Package for visualizing adjacency matrices.
library(plot.matrix)

## Not 100% sure of what this is for.
if(!require(EnvStats)){
  install.packages("EnvStats")
  library(EnvStats)
}

cbPalette <- c("#999999", "#E69F00", "#56B4E9", 
               "#009E73", "#F0E442", "#0072B2", 
               "#D55E00", "#CC79A7")
set.seed(2)

```


  In the following, we generate visualization of different random graph models through code on the programming language R. In particular, we visualize the Erdos-Renyi, inhomogeneous Erdos-Renyi, stochastic block model, intersection graph and preferential attachment models. For each random graph, we create static and interactive plots.






## Erdos Renyi Random Graph Model.

The original Erdos-Renyi graph model has two input parameters: the total number of nodes $n$ and the probability of connection $p$. Basically, we include the edge $(i,j)$ in the graph with probability $p$ and the edges are all independent from each other. Equivalently, we have that for all $1\leq i<j \leq n$,

$$ \mathbb{P}[(i,j)\in E] = p.$$

We visualize a simulation of the Erdos-Renyi random graph model in the following code chunk.

```{r, echo = TRUE, fig.height= 6, fig.width= 8, fig.align='center', warning=FALSE}

#########################################################
## Step 1: create the adjacency matrix.
#########################################################

n = 100   # n = total node count.
c = 3     # c = average degree.
p = c/n   # p = connection probability.

adj = matrix(0, n, n)     # We create an adjacency matrix.
                          # We start with no connections.

for(.i in 2:n){
  for(.j in 1:(.i-1)){    # For each pair of vertices (.i, .j):
    adj[.i, .j] = rbinom(1,1, p) # we add the edge with probability p“.
    
  }
}
adjsymm = adj + t(adj)

#########################################################
## Paso 2: Visualizing the graph with igraph package.
#########################################################

ErdosRenyi = graph_from_adjacency_matrix(adjsymm) # This command creates the graph object based on the adjacency matrix.


V(ErdosRenyi)$size <- 5                       # Specify visual parameters.
V(ErdosRenyi)$color <- rep("orange", n)       # for the graph.
V(ErdosRenyi)$label <- ""                     
V(ErdosRenyi)$frame.color <- "white"          
E(ErdosRenyi)$arrow.mode <- 0
lw <- layout_with_fr(ErdosRenyi)

title = paste0("Erdos-Renyi Random Graph ",   # Plot title.
               "(n = ", n, 
               ", p = ", p, ")")

plot(ErdosRenyi, layout = lw, main = title)


#########################################################
## Paso 3: Dynamic plot
#########################################################

edges = which(adj == 1, arr.ind = TRUE)
edges = as.data.frame(edges)

p <- simpleNetwork(edges, height="100px", width="100px",        
        Source = 1,                 # column number of source
        Target = 2,                 # column number of target
        linkDistance = 10,          # distance between node. Increase this value to have more space between nodes
        charge = -100,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
        fontSize = 1,               # size of the node names
        fontFamily = "serif",       # font og node names
        linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
        nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
        opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
        zoom = T                    # Can you zoom on the figure?
        )
p

#########################################################
## Paso 4: Play with the parameters!
#########################################################

###### n = 500, p = 2/n
## We expect a giant component.

###### n = 500, p = 1/n
## We expect a component of size close to 500^(2/3) = 62.

###### n = 500, p = 0.5/n
## We expect an almost fully disconnected graph.

```





\newpage


## Chung-Lu Random Graph Model.


For the Chung-Lu, each vertex $i$ has an associated weight $w_i$. The larger the weight of a vertex is, the more popular the vertex is. Therefore, vertices with high weights will be highly connected to other vertices. More formally, if $L= \sum_{i=1}^n w_i$, then,
$$ \mathbb{P}[(i,j)\in E] = \min\left\{\frac{w_iw_j}{L}, 1\right\}$$

In this case, the approximate degree of any vertex is $w_i$.

```{r, fig.width=8, fig.height=6, fig.align='center', warning=FALSE}

#########################################################
## Paso 1: generate adjacency matrix.
#########################################################

n = 100                     # n = total vertex count.


b = 1.5                       # Generating pareto weights.
alpha = 2
U = runif(n = n, min = 0, max =  1)
w = b/( (1 - U)^(1/alpha) ) # Weights!
L = sum(w)                  # Sum of all weights.


adj = matrix(0, n, n)       # We create an n x n adjacency matrix.
                            # We start with no connections.

for(.i in 2:n){                         
  for(.j in 1:(.i-1)){          # For each pair of vertices (.i, .j):
    .p = min(w[.i]*w[.j]/L, 1)    # a) We find the connection probability.
    adj[.i, .j] = rbinom(1,1, .p) # b) We include the edge in the network with probability p. 
  }
}
adjsymm = adj + t(adj)

#########################################################
## Paso 2: Visualizing the graph with igraph.
#########################################################


NH.ErdosRenyi = graph_from_adjacency_matrix(adjsymm)  # Create graph object.
V(NH.ErdosRenyi)$size <- 6
V(NH.ErdosRenyi)$label <- ""                       
V(NH.ErdosRenyi)$frame.color <- "white"
E(NH.ErdosRenyi)$arrow.mode <- 0


## We will color the vertices according to their degree.
deg = apply(adjsymm, MARGIN = 1, sum)         # a) We find the degrees.
dmax = max(deg) + 1                       # b) Find the maximum degree + 1

colfunc <- colorRampPalette(c("orange",   # c) We create a color gradient.
                              "red",      #     Orange ---> red  --->  black.
                              "black"))
colors = colfunc(dmax)                    

V(NH.ErdosRenyi)$color <- colors[deg + 1]     # CWe color vertices according to their
                                              # degree.

lw <- layout_with_fr(NH.ErdosRenyi)
plot(NH.ErdosRenyi, layout = lw, main = "Inhomogeneous Erdos-Renyi: Pareto weights")



#########################################################
## Paso 3: Dynamic plot.
#########################################################

edges = which(adj == 1, arr.ind = TRUE)
edges = as.data.frame(edges)

p <- simpleNetwork(edges, height="100px", width="100px",        
        Source = 1,                 # column number of source
        Target = 2,                 # column number of target
        linkDistance = 30,          # distance between node. Increase this value to have more space between nodes
        charge = -100,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
        fontSize = 1,               # size of the node names
        fontFamily = "serif",       # font og node names
        linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
        nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
        opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
        zoom = T                    # Can you zoom on the figure?
        )
p




#########################################################
## Paso 3: Play with parameters.
#########################################################


```





\newpage


## Stochastic block model


For the stochastic block model, each vertex in the graph belong to one of $K$ communities. The chance that a vertex $i$ in class $A$ and a vertex $j$ in class $B$ are connected depends on a kernel function $\kappa:\{1,\ldots, K\}^2\to \mathbb{R}_{+}$. More specifically, the probability is  $p_{A,B}= \frac{\kappa(A,B)}{n}$.


```{r, fig.width=9, fig.height=6, fig.align='center', warning=FALSE}

#########################################################
## Paso 1: Generate the adjacency matrix
#########################################################

n1 = 100                 # We model a netowrk with 3 even classes.
n2 = 100
n3 = 100
n = n1 + n2 + n3
class = rep(c(1,2,3),    
            c(n1,n2,n3))

cmat = matrix(c(15, 0.2, 0.2,  # We generate the matrix of "average connections".
                0.2, 15, 0.2,
                0.2, 0.2, 15), 
              ncol= 3, 
              byrow = TRUE)
pmat = cmat / n        # Matrix of connection probabilities.



adj = matrix(0, n, n)       # We create the n x n adjacency matrix.
                            # We start with no connections.
for(.i in 2:n){                         
  for(.j in 1:(.i-1)){            # For each pair of vertices (.i, .j):
    .p  = pmat[class[.i], class[.j]]  # a) We find the probability of connection with the weights.
    adj[.i, .j] = rbinom(1,1, .p)     # b) We include the edge with probability p. 
  }
}
adjsymm = adj + t(adj)


#########################################################
## Paso 2: Visualizing graph with igraph.
#########################################################

SBM = graph_from_adjacency_matrix(adjsymm)    # Create the graph object.
V(SBM)$size <- 6
V(SBM)$label <- "" 
V(SBM)$frame.color <- "white"
E(SBM)$arrow.mode <- 0

## We color the vertices according to their community assignment.
V(SBM)$color <- rep(cbPalette[c(2,3,4)], c(n1,n2,n3))


lw <- layout_with_fr(SBM)

plot(SBM, layout = lw, main = "Stochastic Block Model: 3 communities")



#########################################################
## Paso 3: Dynamic plot
#########################################################

edges = which(adj == 1, arr.ind = TRUE)
edges = as.data.frame(edges)

p <- simpleNetwork(edges, height="100px", width="100px",        
        Source = 1,                 # column number of source
        Target = 2,                 # column number of target
        linkDistance = 10,          # distance between node. Increase this value to have more space between nodes
        charge = -100,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
        fontSize = 1,               # size of the node names
        fontFamily = "serif",       # font og node names
        linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
        nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
        opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
        zoom = T                    # Can you zoom on the figure?
        )
p





#########################################################
## Paso 3: Play with the parameters!
#########################################################

####### Suggestion of things to modify:

## You can modify the community kernel function.
## 1) What would happen if we increase the probability of connection across communities?
## 2) What would happen if you give different probabilities of connection to each community?

## Las comunidades.
## 3) What happens if the communities are not of the same size?



```



\newpage





## Intersection Model

So far, we have visualized graphs with "long cycles." With this, we refer to the fact that cycles have a length $O(\log(n))$, where $n$ is the number of vertices. This means that as $n$ grows, the cycles become increasingly long and the presence of triangles becomes increasingly unlikely. 

In real social networks, this is unrealistic, since in real networks it is common for nodes to arrange in "groups of friends" in which people with common friends are friends themselves. This means social networks tend to have many triangles. 

The intersection model generates random graphs with a high presence of triangles.


```{r, fig.width= 10, fig.height=8, fig.align='center', warning=FALSE}

#########################################################
## Paso 1: generar la grįfica bipartita.
#########################################################

n = 100                     # Total vertex count (actores).
beta = 0.5
gamma = 0.5
m = floor(beta*n)          # Number of vertices on the bipartite opposite side (movies)

# old/good.
#b = 3                       
#alpha = 2.5

b = 5                       # We simulate Pareto weights.
alpha = 3.5
U = runif(n = n, min = 0, max =  1)
w = b/( (1 - U)^(1/alpha) ) # Weights!!
L = sum(w)                  # Sum of weights.
  

bip = matrix(0, ncol = n2, nrow = n)
for(.i in 1:n){             
  for(.j in 1:m){                  # for each pair of (actor, movie):
    .pi = min(gamma*w[.i]/n, 1)       # a) we find the connection probability.
    bip[.i, .j] = rbinom(n = 1,       # b) We add a connection between them with probability p.
                         size = 1, 
                         prob = .pi)
  }
}

#########################################################
## Paso 2: Generate the adjacency matrix of actors.
#########################################################


adj = matrix(0, n, n)
count = 0
for(.i in 2:n){
  for(.j in 1:(.i-1)){          # For each pair of actors.
    
    .iftrian = sum(bip[.i, ] * bip[.j, ]) # iftrian = "How many movies did i and j
                                          #             collaborate on?"
    
    if(.iftrian > 0){   #  If i and j have collaborated, connect them.
      adj[.i, .j] = 1
    } 
  } 
}
adjsymm = adj + t(adj)

#########################################################
## Paso 3: Visualize the "movie cooperation" graph.
#########################################################

IntGraph = graph_from_adjacency_matrix(adjsymm)
V(IntGraph)$size <- 4
V(IntGraph)$label <- "" 
V(IntGraph)$frame.color <- "white"
E(IntGraph)$arrow.mode <- 0

## Coloring the vertices according to degree.
{
  deg = apply(adjsymm, MARGIN = 1, sum)         # a) We find the degrees.
  dmax = max(deg) + 1                       # b) Find the maximum degree + 1
  
  colfunc <- colorRampPalette(c("orange",   # c) We generate a gradient:
                                "red",      #     Orange ---> red  --->  black.
                                "black"))
  colors = colfunc(dmax)                    
  
  V(IntGraph)$color <- colors[deg + 1]     # We color the vertices according to degree.
                                               
}

lw <- layout_with_fr(IntGraph)
coeff = transitivity(IntGraph, type = "average")
plot(IntGraph, layout = lw, 
     main = paste("Intersection model: clustering = ", coeff) )



#########################################################
## Paso 4: Dynamic visualization.
#########################################################

edges = which(adj == 1, arr.ind = TRUE)
edges = as.data.frame(edges)

p <- simpleNetwork(edges, height="100px", width="100px",        
        Source = 1,                 # column number of source
        Target = 2,                 # column number of target
        linkDistance = 30,          # distance between node. Increase this value to have more space between nodes
        charge = -100,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
        fontSize = 1,               # size of the node names
        fontFamily = "serif",       # font og node names
        linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
        nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
        opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
        zoom = T                    # Can you zoom on the figure?
        )
p



```



\newpage



## Preferential Attachment.


The model of preferential attachment has vertices that arrive sequentially, and randomly attach with the present vertices according to their popularity. Therefore, older vertices have large degrees, and the new vertices tend to have small degrees.

```{r, fig.width= 10, fig.height=8, fig.align='center', warning=FALSE}

#########################################################
## Paso 1: Generate adjacency matrix sequentially.
#########################################################

n = 100
deg = rep(c(2,1), c(1,n-1))

adj = diag(n)
adjsymm = diag(n)

for(.k in 2:n){
  .ps = deg[1:.k] / sum(deg[1:.k])
  adj[.k, .k] = 0
  adjsymm[.k, .k] = 0
  
  .j = sample(x = 1:.k, size = 1, prob = .ps) 
  
  adj[.k, .j] = 1
  adjsymm[.k, .j] = 1
  adjsymm[.j, .k] = 1
    
  deg = apply(adjsymm, MARGIN = 1, sum)
}


#########################################################
## Paso 3: Visualize the graph.
#########################################################

AlbBar = graph_from_adjacency_matrix(adjsymm)
V(AlbBar)$size <- 4 
V(AlbBar)$frame.color <- "white"
E(AlbBar)$arrow.mode <- 0

lw <- layout_with_fr(AlbBar)
coeff = transitivity(AlbBar, type = "average")
plot(AlbBar, layout = lw, 
     main = paste("Preferential Attachment"))



#########################################################
## Paso 4: Dynamic plot
#########################################################

edges = which(adj == 1, arr.ind = TRUE)
edges = as.data.frame(edges)

p <- simpleNetwork(edges, height="100px", width="100px",        
        Source = 1,                 # column number of source
        Target = 2,                 # column number of target
        linkDistance = 40,          # distance between node. Increase this value to have more space between nodes
        charge = -50,                # numeric value indicating either the strength of the node repulsion (negative value) or attraction (positive value)
        fontSize = 15,               # size of the node names
        fontFamily = "serif",       # font og node names
        linkColour = "#666",        # colour of edges, MUST be a common colour for the whole graph
        nodeColour = "#69b3a2",     # colour of nodes, MUST be a common colour for the whole graph
        opacity = 0.9,              # opacity of nodes. 0=transparent. 1=no transparency
        zoom = T                    # Can you zoom on the figure?
        )
p



```