Assortativity

Homophily

tendency of individuals to associate disproportionately with others who are similar (e.g. ethnicity, age, profession, religion, etc) to themselves.

  • Lazarsfeld, P.F. “Friendship as a social process: A substantive and methodological analysis.” Freedom and Control in Modern Society (1954)

The tendency of individuals to associate with others who are similar to themselves is known as homophily or assortative mixing. On the other hand, disassortative mixing, the tendency of individuals to associate with others who are unlike them, is also widely observed in information, technological, and biological networks [@newman2018Networks].

Modularity

To talk about assortativity, we need to specify the characteristic that defines the similarity between nodes. When such characteristics are unknown, the Unsupervised Learning task of grouping nodes into clusters is called community detection.

The idea of a community is that nodes in the same community have a higher probability of being connected than those in different communities. To capture this idea, modularity of a division is defined as

where is the set of edges within communities, and places all edges uniformly at random. A large modularity means a low entropy, which further suggests a strong community structure.

More formally, suppose returns the community index of node . Then,

If we break the network into stubs and reconnect the stubs uniformly at random, the expected number of edges within communities is

Thus,

Let . The matrix form of modularity is

Modularity Maximization

Modularity can serve as an objective function for community detection. The discrete optimization problem is NP-hard in general. Consider a partition into two communities with index and . Then,

Note that . Thus, for this binary partition, we have

recovering the ^spectral-partition with the graph Laplacian replaced by the modularity matrix . However, here we can directly let the relaxed real-valued be the leading eigenvector of the modularity matrix, without degenerating to the trivial case where all nodes are in the same community. This method is also referred to as the spectral modularity maximization.

Assortative Mixing

Now suppose nodes have a scalar characteristic. If nodes with similar values tend to be connected together more often than those with different values, then the network is considered assortatively mixed according to that characteristic.

To see if links are formed based on a scalar characteristic , we calculate the covariance between the characteristic values of the two endpoints of a randomly chosen edge:

where the marginal mean is defined as

Thus,

which recovers ^eq-mod by replacing the community indicator with the characteristic value. In matrix form, we have

Note that

We define the assortativity coefficient as the correlation:

means perfect assortative mixing, means no assortative mixing, and means perfect disassortative mixing.

Degree Correlation

When the characteristic is the degree, associative mixing gives core-periphery structures, where high-degree nodes tend to stick together to form a core surrounded by a periphery of low-degree nodes, commonly observed in social networks. Disassortative mixing by degree tend to give hub-and-spoke structures, where high-degree nodes are hubs that connect many low-degree nodes that connect to other hubs. Replacing with the degree vector gives the assortativity coefficient by degree.