Erdős–Rényi Random Graph

  • Notation: or
  • Model: Fixed , undirected,
  • Basic properties
      • Also known as Poisson random graph model
    • Local Clustering:
  • Phase transitions
    • Edge existence:
    • Connectivity:
    • Giant component:
  • Diameter:
  • Sparse ER:
    • Number of -cycles:
    • Number of -trees:

Special Components

Analyzing the ER model requires us to calculate the probability of the existence of components of various shapes, or the expected number of them if they exist.

Singleton. Let be the indicator r.v. that is isolated and . We have . The variance is calculated in the next section.

-cycle. Randomly choose nodes, fix a starting node, and randomly arrange the rest ; they form a cycle if and only if they are only connected to the nodes before and after them. Thus, the expected number of -cycles is

Note that we have two factors: the first one is because a clockwise and a counter-clockwise arrangement of the same nodes correspond to the same cycle, and the second one is because the links between the nodes are double-counted by both ends. See Problem 5 for the calculation of the variance of .

-tree. Randomly choose nodes and label them. Each possible tree formed by them has a unique Prüfer sequence1 of length . Thus, the number of possible trees is given by the Cayley’s formula , and the expected number of -trees is

Connectivity

Let . We show that the network is connected w.p.1 if and is disconnected w.p.1 if , as .

When , we inspect the probability of the existence of an isolated node. Then . The second moment of is

Thus . By the second Moment Method, we have , and thus the network is disconnected w.p.1.

When , showing is not enough. We have to show that for any , the probability of the existence of a connected component of size goes to zero. Note that nodes form a standalone connected component only if they contain a spanning tree. Thus, similar to , we have

By the union bound and Markov Inequality (or first Moment Method), we have

A finer result captures the regime :

Sparse ER

We call the ER model in the regime of the sparse ER model (^dense-sparse). Note that the degree distribution approximates a Poisson Distribution with mean . Thus, this regime is also known as the Poisson random graph model.

Subcritical regime . Let be the connected component containing . Then,

That is, the typical size of a component is , and the size of the largest component is .

Supercritical regime . Let be the largest component and be the second largest component. Let be the extinction probability of a Branching process with offspring distribution . Then,

That is, a giant component emerges with size , the size of the second largest component is , and the typical size of a small component is .

Proofs

Supercritical regime, giant component

Let be the probability that a random node does not belong to the giant component. Then is the expected size of the giant component. We have the recursion

where for any other node , w.p. , is not connected to , and if they are connected, w.p. approximately , does not belong to the giant component. Plugging gives

where is the PGF of .

Regimes|300
Regimes

Supercritical regime, uniqueness of the giant component

Before proving the second largest component in the supercritical regime is small, we can easily show that there cannot be two giant components. For any , suppose there are two giant components of sizes at least and . The probability that they are not connected is at most . Then, one can calculate the upper bound of the probability that there are two giant components using a similar equation in .

Another easier approach is called sprinkling. Consider the ER model is generated in two steps: first generate links i.i.d. with probability , and then sprinkle additional links with probability for some small . Then, this model is equivalent to .

Now suppose in the first step there are two giant components with sizes at least and . Then, the probability that they are not connected in the second step is at most . Thus, w.p.1, has only one giant component.

Supercritical regime, the second largest component

All regimes, typical size of a small component

We have proved in all regimes, there is at most one giant component with size , with when . For a random node not in the giant component, let be the small component containing . We have also proved that almost surely and is a tree almost surely. Thus, removing breaks into subtrees, denoted by . We have

which gives

where we use the symmetry of the subtrees. In analogy to the Branching process, we have . Additionally, note that links can only connect to nodes not in the giant component. Thus, . Together, we have

See Local Branching for a more formal treatment.

Average size of small components

The above quantity is the expected size of a small component to which a random node belongs. It is not equivalent to the expected size of a randomly chosen small component. Since nodes in larger small components are more likely to be chosen, the above is larger than the average size of a randomly chosen small component.

Footnotes

  1. Starts with an empty sequence, remove the leaf node with the smallest label and append the label to the sequence, and repeat until only two nodes are left. The Prüfer sequence of a -tree has a length and is a bijection.