Erdős–Rényi Random Graph

Notation: $G(n, p)$ or $ER(n, p)$
Model: Fixed $n$, undirected, $\mathbf{1}\{(i, j) \in E\} \sim Bernoulli(p)$
Basic properties
- $D_{i} \sim Binom(n - 1, p) \approx Poisson(n p)$
  - Also known as Poisson random graph model
- $|E| \sim Binom(\binom{n}{2}, p)$
- Local Clustering: $E C_{i} = p$
Phase transitions
- Edge existence: $t(n) = \frac{1}{n^{2}}$
- Connectivity: $t(n) = \frac{\log n}{n}$
- Giant component: $t(n) = \frac{1}{n}$
Diameter: $Θ(\ln n / \ln λ)$
- see Diameter
Sparse ER: $p_{n} = λ / n$
- $D_{i} \approx Poisson(λ)$
- Number of $k$-cycles: $Θ(1)$
- Number of $k$-trees: $Θ(n)$
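The model and the first two basic properties in the summary can be checked empirically. The following is a minimal sketch using only the Python standard library; the helper names `sample_gnp` and `degrees`, the seed, and the values of `n` and `p` are illustrative assumptions, not part of the original notes.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng):
    """Return the edge list of one G(n, p) sample on nodes 0..n-1."""
    # Each unordered pair becomes an edge independently with probability p.
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

def degrees(n, edges):
    """Degree of every node given an undirected edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
n, p = 300, 0.05
edges = sample_gnp(n, p, rng)
deg = degrees(n, edges)
mean_degree = sum(deg) / n

print(len(edges), n * (n - 1) / 2 * p)  # sampled |E| vs. its mean C(n, 2) p
print(mean_degree, (n - 1) * p)         # sampled mean degree vs. (n - 1) p
```

The pair loop costs $O(n^{2})$ time, which is fine for a small demonstration but not for large sparse graphs.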

Special Components

Analyzing the ER model requires us to calculate the probability of the existence of components of various shapes, or the expected number of them if they exist.

Singleton. Let $I_{i}^{(1)}$ be the indicator r.v. that $i$ is isolated and $I^{(1)} = \sum_{i} I_{i}^{(1)}$ . We have $E [I^{(1)}] = n (1 - p)^{n - 1}$ . The variance is calculated in the next section.

$k$-cycle. Choose $k$ nodes, fix a starting node, and arrange the remaining $k - 1$ in order; they form an isolated cycle if and only if each node is connected to the nodes before and after it and to no other node. Thus, the expected number of $k$-cycles is

$$E[I^{(k, cycle)}] = \binom{n}{k} \cdot (k - 1)! \cdot \frac{1}{2} \cdot p^{k} \cdot (1 - p)^{k (n - k)} \cdot (1 - p)^{k (k - 3) / 2}.$$

Note that there are two $1/2$ factors: the first appears because a clockwise and a counter-clockwise arrangement of the same nodes correspond to the same cycle, and the second, in the exponent $k (k - 3) / 2$, appears because the absent links among the $k$ nodes are double-counted by both ends. See Problem 5 for the calculation of the variance of $I^{(3, cycle)}$.
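For very small $n$, the expected number of $k$-cycle components can be compared against an exact sum over all $2^{\binom{n}{2}}$ graphs. The sketch below does this by brute force; the function names are hypothetical, and the enumeration is only feasible for tiny graphs (here $n = 5$).

```python
from itertools import combinations
from math import comb, factorial

def expected_cycle_components(n, k, p):
    """Closed form from the text (k >= 3): expected number of k-cycle components."""
    return (comb(n, k) * factorial(k - 1) / 2 * p**k
            * (1 - p)**(k * (n - k)) * (1 - p)**(k * (k - 3) // 2))

def components(n, edges):
    """Connected components as lists of nodes, via a small union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i, j in edges:
        parent[find(i)] = find(j)
    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

def brute_force_cycle_components(n, k, p):
    """Exact expectation by summing over every graph on n labeled nodes."""
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for mask in range(1 << len(pairs)):
        edges = [e for b, e in enumerate(pairs) if mask >> b & 1]
        weight = p**len(edges) * (1 - p)**(len(pairs) - len(edges))
        deg = [0] * n
        for i, j in edges:
            deg[i] += 1
            deg[j] += 1
        # A connected component of size k >= 3 is a cycle iff all its degrees are 2.
        count = sum(1 for c in components(n, edges)
                    if len(c) == k and all(deg[v] == 2 for v in c))
        total += weight * count
    return total

for k in (3, 4):
    print(k, expected_cycle_components(5, k, 0.3), brute_force_cycle_components(5, k, 0.3))
```

The two columns should agree up to floating-point error, which confirms the counting factor $(k - 1)!/2$ and the exponent $k (k - 3) / 2$ for these small cases.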

$k$-tree. Randomly choose $k$ nodes and label them. Each possible tree formed by them has a unique Prüfer sequence¹ of length $k - 2$. Thus, the number of possible trees is given by Cayley's formula $k^{k - 2}$, and the expected number of $k$-trees is

$$E[I^{(k, tree)}] = \binom{n}{k} \cdot k^{k - 2} \cdot p^{k - 1} \cdot (1 - p)^{k (n - k)} \cdot (1 - p)^{\binom{k}{2} - (k - 1)}. \tag{1}$$
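The count $k^{k - 2}$ can be verified for small $k$ by decoding every Prüfer sequence into a tree and counting the distinct results. This is a sketch of the standard decoding procedure with nodes labeled $0, \dots, k - 1$; the function names are illustrative assumptions.

```python
from itertools import product
from math import comb

def prufer_to_tree(seq, k):
    """Decode a Prüfer sequence over labels 0..k-1 into the tree's edge set."""
    degree = [1] * k
    for x in seq:
        degree[x] += 1  # a node appears (degree - 1) times in the sequence
    edges = set()
    for x in seq:
        leaf = min(v for v in range(k) if degree[v] == 1)  # smallest-label leaf
        edges.add(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(k) if degree[w] == 1]  # the two remaining nodes
    edges.add(frozenset((u, v)))
    return frozenset(edges)

def count_labeled_trees(k):
    """Number of distinct trees decoded from all sequences of length k - 2."""
    return len({prufer_to_tree(seq, k) for seq in product(range(k), repeat=k - 2)})

def expected_tree_components(n, k, p):
    """Closed form (1) from the text."""
    return (comb(n, k) * k**(k - 2) * p**(k - 1)
            * (1 - p)**(k * (n - k)) * (1 - p)**(comb(k, 2) - (k - 1)))

for k in range(2, 7):
    print(k, count_labeled_trees(k), k**(k - 2))
print(expected_tree_components(4, 2, 0.5))  # isolated edges in G(4, 1/2)
```

Since every sequence decodes to a different tree, the number of distinct trees equals the number of sequences, which is the content of Cayley's formula.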

Connectivity

Let $p_{n} = λ ln n / n$ . We show that the network is connected w.p.1 if $λ > 1$ and is disconnected w.p.1 if $λ < 1$ , as $n \to \infty$ .

When $λ < 1$ , we inspect the probability of the existence of an isolated node. Then $E [I^{(1)}] \to n^{1 - λ} \to \infty$ . The second moment of $I^{(1)}$ is

$$E[(I^{(1)})^{2}] = E[I^{(1)}] + (n^{2} - n) E[I_{i}^{(1)} I_{j}^{(1)}] \to n^{1 - λ} + (n^{2} - n) n^{- 2 λ}.$$

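Thus $Var(I^{(1)}) \to n^{1 - λ}$, which is of smaller order than $(E[I^{(1)}])^{2} \to n^{2 (1 - λ)}$. By the second moment method, $P(I^{(1)} = 0) \leq Var(I^{(1)}) / (E[I^{(1)}])^{2} \to 0$, so $P(I^{(1)} \geq 1) \to 1$, and thus the network is disconnected w.p.1.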
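The second moment argument can be evaluated numerically with the exact finite-$n$ moments, using $E[I_{i}^{(1)} I_{j}^{(1)}] = (1 - p)^{2 n - 3}$ and the Chebyshev-type bound $P(I^{(1)} = 0) \leq Var(I^{(1)}) / (E[I^{(1)}])^{2}$. The sketch below is illustrative; the function names and the choice $λ = 0.5$ are assumptions.

```python
from math import log

def isolated_node_moments(n, lam):
    """Mean and variance of the number of isolated nodes for p = lam * ln(n) / n."""
    p = lam * log(n) / n
    mean = n * (1 - p)**(n - 1)
    pair = (1 - p)**(2 * n - 3)         # P(i and j are both isolated), i != j
    second = mean + n * (n - 1) * pair  # E[(I^(1))^2]
    return mean, second - mean**2

def chebyshev_bound(n, lam):
    """Upper bound Var(I) / E[I]^2 on the probability of having no isolated node."""
    mean, var = isolated_node_moments(n, lam)
    return var / mean**2

for n in (10**2, 10**3, 10**4):
    mean, var = isolated_node_moments(n, 0.5)
    print(n, mean, n**0.5, var, chebyshev_bound(n, 0.5))
```

For $λ = 0.5$ the mean tracks $n^{1 - λ} = \sqrt{n}$ and the bound shrinks as $n$ grows, in line with the limit argument.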

When $λ > 1$, showing $P(I^{(1)} \geq 1) \to 0$ is not enough. A disconnected graph contains a connected component of size at most $n/2$, so we have to show that the probability that a connected component of some size $k \leq n/2$ exists goes to zero. Let $I^{(k)}$ be the number of connected components of size $k$. Note that $k$ nodes form a standalone connected component only if they contain a spanning tree. Thus, similar to $(1)$, we have

$$E[I^{(k)}] \leq \binom{n}{k} k^{k - 2} p^{k - 1} (1 - p)^{k (n - k)} \lesssim n^{1 - λ k} k^{k - 2} λ^{k - 1} \to 0, \quad \forall k \geq 1.$$

By the union bound and Markov's inequality (or the first moment method), we have

$$\begin{aligned}
P(\text{disconnected}) &\leq \sum_{k = 1}^{n/2} P(I^{(k)} \geq 1) \\
&\leq \sum_{k = 1}^{n/2} E[I^{(k)}] \\
&\leq \sum_{k = 1}^{n/2} n^{1 - λ k} n^{k - 2} λ^{k - 1} \\
&= n^{- 1} \sum_{k = 1}^{n/2} n^{- (λ - 1) k} λ^{k - 1} \\
&\leq n^{- 1} n^{- (λ - 1)} \frac{1}{1 - n^{- (λ - 1)} λ} \\
&\to 0.
\end{aligned}$$

A finer result captures the regime $p_{n} = \frac{\ln n + α}{n}$:

$$P(\text{connected}) \to e^{- e^{- α}}.$$
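The threshold at $λ = 1$ for $p_{n} = λ \ln n / n$ can be observed in a small Monte Carlo experiment. The sketch below samples graphs and tracks connectivity with a union-find structure; the values of `n`, `trials`, the seed, and the function names are arbitrary choices for illustration, and a finite $n$ only approximates the limit.

```python
import random
from math import log

def is_connected_gnp(n, p, rng):
    """Sample G(n, p) and report whether it is connected (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    n_components = n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    n_components -= 1
    return n_components == 1

def connected_fraction(n, lam, trials, rng):
    """Fraction of samples that are connected when p = lam * ln(n) / n."""
    p = lam * log(n) / n
    return sum(is_connected_gnp(n, p, rng) for _ in range(trials)) / trials

rng = random.Random(1)  # fixed seed for reproducibility
n, trials = 200, 20
frac_below = connected_fraction(n, 0.5, trials, rng)  # lambda < 1
frac_above = connected_fraction(n, 2.0, trials, rng)  # lambda > 1
print(frac_below, frac_above)
```

With $λ = 0.5$ almost every sample should contain isolated nodes, while with $λ = 2$ nearly all samples should be connected.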

Sparse ER

We call the ER model in the regime $p_{n} = λ / n$ the sparse ER model. Note that the degree distribution is approximately Poisson with mean $λ$. Thus, this regime is also known as the Poisson random graph model.

Subcritical regime $λ < 1$ . Let $C (i)$ be the connected component containing $i$ . Then,

$$E |C(i)| \to \frac{1}{1 - λ} \quad \text{and} \quad P\left(\max_{i} |C(i)| = Θ(\ln n)\right) \to 1.$$

That is, the typical size of a component is $\frac{1}{1 - λ}$ , and the size of the largest component is $Θ (ln n)$ .

Supercritical regime $λ > 1$ . Let $C_{1}$ be the largest component and $C_{2}$ be the second largest component. Let $η (λ)$ be the extinction probability of a Branching process with offspring distribution $Poisson (λ)$ . Then,

$$P(|C_{1}| = Θ((1 - η(λ)) n)) \to 1, \quad P(|C_{2}| = O(\ln n)) \to 1, \quad \text{and} \quad E |C(i)| \to \frac{1}{1 - η(λ) λ}.$$

That is, a giant component of size $Θ(n)$ emerges, the size of the second largest component is $O(\ln n)$, and the typical size of a small component (one containing a node outside the giant component) is $\frac{1}{1 - η(λ) λ}$.
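Both regimes can be illustrated with a simulation at moderate $n$. The sketch below pools a few samples and reports the mean size of the component containing a random node (dropping the largest component in the supercritical case) together with the fraction of nodes in the largest component. The function names, `n = 1000`, `trials = 4`, and the seed are assumptions for illustration; the reference values $1 - η(2) \approx 0.797$ and $1/(1 - η(2) \cdot 2) \approx 1.68$ come from solving $u = e^{2 (u - 1)}$ numerically, and finite-$n$ estimates are noisy.

```python
import random

def component_sizes(n, p, rng):
    """Sample G(n, p) and return component sizes in decreasing order."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                parent[find(i)] = find(j)
    sizes = {}
    for v in range(n):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sorted(sizes.values(), reverse=True)

def simulate(n, lam, trials, rng, drop_largest):
    """Return (mean component size seen by a random node, largest-component fraction)."""
    sum_sq, sum_nodes, largest = 0, 0, 0
    for _ in range(trials):
        sizes = component_sizes(n, lam / n, rng)
        largest += sizes[0]
        small = sizes[1:] if drop_largest else sizes
        sum_sq += sum(s * s for s in small)  # each node contributes its component size
        sum_nodes += sum(small)
    return sum_sq / sum_nodes, largest / (trials * n)

rng = random.Random(2)  # fixed seed for reproducibility
sub_mean, sub_largest = simulate(1000, 0.5, 4, rng, drop_largest=False)
sup_mean, sup_largest = simulate(1000, 2.0, 4, rng, drop_largest=True)
print(sub_mean, sub_largest)  # compare sub_mean with 1 / (1 - 0.5) = 2
print(sup_mean, sup_largest)  # compare with about 1.68 and about 0.797
```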

Proofs

Supercritical regime, giant component

Let $u$ be the probability that a random node $i$ does not belong to the giant component. Then $(1 - u) n$ is the expected size of the giant component. We have the recursion

$$u = (1 - p + p u)^{n - 1},$$

where for any other node $j$ , w.p. $1 - p$ , $i$ is not connected to $j$ , and if they are connected, w.p. approximately $u$ , $j$ does not belong to the giant component. Plugging $p = λ / n$ gives

$$u = \left(1 + \frac{λ (u - 1)}{n}\right)^{n - 1} \to \exp(λ (u - 1)) = g(u),$$

where $g$ is the PGF of $Poisson (λ)$ .
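The limiting equation $u = g(u)$ has no closed-form solution, but it can be solved numerically. The sketch below assumes plain fixed-point iteration started at $u = 0$ (a standard choice, not something stated in the text), which converges to the smallest root in $[0, 1]$, that is, the extinction probability $η(λ)$; the function name and iteration count are illustrative.

```python
from math import exp

def extinction_probability(lam, iterations=10_000):
    """Smallest fixed point of u = exp(lam * (u - 1)) in [0, 1], iterating from 0."""
    u = 0.0
    for _ in range(iterations):
        u = exp(lam * (u - 1.0))  # u_k = P(extinct within k generations)
    return u

for lam in (0.5, 1.5, 2.0, 3.0):
    eta = extinction_probability(lam)
    # Giant component fraction 1 - eta and limiting small-component size 1 / (1 - eta * lam)
    print(lam, eta, 1 - eta, 1 / (1 - eta * lam))
```

For $λ < 1$ the iteration converges to $u = 1$, so no giant component exists; for $λ = 2$ it gives $η \approx 0.203$. Convergence is slow near $λ = 1$, where the two roots merge.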

Supercritical regime, uniqueness of the giant component

Before proving that the second largest component in the supercritical regime is small, we can easily show that there cannot be two giant components. For any $v_{1}, v_{2} \in (0, 1]$, suppose there are two giant components of sizes at least $v_{1} n$ and $v_{2} n$. The probability that they are not connected is at most $(1 - p)^{v_{1} n \cdot v_{2} n} \to \exp(- λ v_{1} v_{2} n)$. Then, one can upper-bound the probability that there are two giant components using an equation similar to $(1)$.

Another, easier approach is called sprinkling. Consider generating the ER model in two steps: first generate links i.i.d. with probability $p$, and then sprinkle additional links with probability $n^{- ϵ} p$ for some small $ϵ \in (0, 1)$. Then, this model is equivalent, up to a negligible $n^{- ϵ} p^{2}$ term in the link probability, to $G(n, (1 + n^{- ϵ}) p) \to G(n, p)$.

Now suppose in the first step there are two giant components with sizes at least $v_{1} n$ and $v_{2} n$ . Then, the probability that they are not connected in the second step is at most $(1 - n^{- ϵ} p)^{v_{1} n \cdot v_{2} n} \to exp (- λ v_{1} v_{2} n^{1 - ϵ}) \to 0$ . Thus, w.p.1, $G (n, p)$ has only one giant component.
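To get a feel for how fast the sprinkling bound vanishes, the exact expression $(1 - n^{- ϵ} p)^{v_{1} n \cdot v_{2} n}$ can be compared with its limiting form $\exp(- λ v_{1} v_{2} n^{1 - ϵ})$. The sketch below evaluates both in log space; the parameter values $λ = 2$, $v_{1} = v_{2} = 0.1$, $ϵ = 0.5$ and the function names are arbitrary illustrations.

```python
from math import exp, log1p

def not_joined_bound(n, lam, v1, v2, eps):
    """(1 - n^-eps * p)^(v1 n * v2 n) with p = lam / n, computed via logarithms."""
    q = n**(-eps) * lam / n  # sprinkling probability per pair
    return exp(v1 * n * v2 * n * log1p(-q))

def limit_form(n, lam, v1, v2, eps):
    """The approximation exp(-lam * v1 * v2 * n^(1 - eps))."""
    return exp(-lam * v1 * v2 * n**(1 - eps))

for n in (10**3, 10**4, 10**5, 10**6):
    print(n, not_joined_bound(n, 2.0, 0.1, 0.1, 0.5), limit_form(n, 2.0, 0.1, 0.1, 0.5))
```

Because $\ln(1 - q) \leq - q$, the exact expression never exceeds the exponential form, and both decay to zero as $n$ grows.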

Supercritical regime, the second largest component

All regimes, typical size of a small component

We have proved that in all regimes, there is at most one giant component with size $Θ((1 - η(λ)) n)$, with $η(λ) = 1$ when $λ < 1$. For a random node $i$ not in the giant component, let $C(i)$ be the small component containing $i$. We have also proved that $|C(i)| = O(\ln n)$ almost surely and $C(i)$ is a tree almost surely. Thus, removing $i$ breaks $C(i)$ into $D_{i}$ subtrees, denoted by $\{C'(j_{k})\}_{k = 1}^{D_{i}}$. We have

$$|C(i)| = 1 + \sum_{k = 1}^{D_{i}} |C'(j_{k})|,$$

which gives

$$E |C(i)| = 1 + E\left[D_{i} \, E\left[|C'(j_{1})| \mid D_{i}\right]\right],$$

where we use the symmetry of the subtrees. In analogy to the Branching process, we have $E[|C'(j_{1})| \mid D_{i}] = E |C(i)|$. Additionally, note that the $D_{i}$ links can only connect to nodes not in the giant component. Thus, $E D_{i} = η n p \to η λ$. Together, $E |C(i)| = 1 + η λ \, E |C(i)|$ in the limit, and solving for $E |C(i)|$ gives

$$E |C(i)| \to \frac{1}{1 - η(λ) λ}.$$

See Local Branching for a more formal treatment.

Average size of small components

The above quantity is the expected size of a small component to which a random node belongs. It is not equivalent to the expected size of a randomly chosen small component. Since nodes in larger small components are more likely to be chosen, the above is larger than the average size of a randomly chosen small component.
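The difference between the two averages is a size-biasing effect, which a toy example makes concrete. In the sketch below, the list of component sizes is hypothetical and chosen only for illustration.

```python
def mean_over_components(sizes):
    """Average size of a uniformly chosen component."""
    return sum(sizes) / len(sizes)

def mean_over_nodes(sizes):
    """Average size of the component containing a uniformly chosen node."""
    # A component of size s is picked with probability proportional to s.
    return sum(s * s for s in sizes) / sum(sizes)

sizes = [1, 1, 1, 3]  # hypothetical component sizes
print(mean_over_components(sizes))  # 1.5
print(mean_over_nodes(sizes))       # 2.0
```

Three of the six nodes sit in the size-3 component, so the node-based average is pulled toward the larger component.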

¹ Start with an empty sequence; remove the leaf node with the smallest label and append the label of its neighbor to the sequence; repeat until only two nodes are left. The Prüfer sequence of a tree on $k$ labeled nodes has length $k - 2$, and the map from labeled trees to sequences in $\{1, \dots, k\}^{k - 2}$ is a bijection.
