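Given a set $S = \{s_1, s_2, \ldots, s_N\}$ of N elements, consider two partitions of S, namely $U = \{U_1, U_2, \ldots, U_R\}$ with R clusters, and $V = \{V_1, V_2, \ldots, V_C\}$ with C clusters. It is presumed here that the partitions are so-called hard clusterings; the clusters within each partition are pairwise disjoint:
$$U_i \cap U_j = \varnothing = V_i \cap V_j$$
for all $i \neq j$, and complete:
$$\bigcup_{i=1}^{R} U_i = \bigcup_{j=1}^{C} V_j = S$$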
The cluster overlap between U and V can be summarized in the form of an $R \times C$ contingency table $M = [n_{ij}]^{i=1 \ldots R}_{j=1 \ldots C}$, where $n_{ij}$ denotes the number of objects that are common to clusters $U_i$ and $V_j$. That is,
$$n_{ij} = \left| U_i \cap V_j \right|$$
Suppose an object is picked at random from S; the probability that the object falls into cluster $U_i$ is:
$$P_U(i) = \frac{|U_i|}{N}$$
The entropy associated with the partitioning U is:
$$H(U) = -\sum_{i=1}^{R} P_U(i) \log P_U(i)$$
H(U) is non-negative and takes the value 0 only when there is no uncertainty determining an object's cluster membership, i.e., when there is only one cluster. Similarly, the entropy of the clustering V can be calculated as:
$$H(V) = -\sum_{j=1}^{C} P_V(j) \log P_V(j)$$
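where $P_V(j) = |V_j| / N$. The mutual information (MI) between the two partitions is:
$$MI(U,V) = \sum_{i=1}^{R} \sum_{j=1}^{C} P_{UV}(i,j) \log \frac{P_{UV}(i,j)}{P_U(i) P_V(j)}$$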
where $P_{UV}(i,j)$ denotes the probability that a point belongs to both the cluster $U_i$ in U and the cluster $V_j$ in V:
$$P_{UV}(i,j) = \frac{|U_i \cap V_j|}{N}$$
MI is a non-negative quantity upper bounded by the entropies H(U) and H(V). It quantifies the information shared by the two clusterings and thus can be employed as a clustering similarity measure.
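The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes each partition is given as a list of cluster labels (one per element of S), uses the natural logarithm (the text does not fix a base), and the function names are hypothetical.

```python
from collections import Counter
from math import log


def contingency(labels_u, labels_v):
    """Count n_ij = |U_i ∩ V_j| for every pair of labels that co-occurs."""
    return Counter(zip(labels_u, labels_v))


def entropy(labels):
    """H = -sum_i P(i) log P(i), with P(i) = (size of cluster i) / N."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(labels_u, labels_v):
    """MI(U,V) = sum_ij P_UV(i,j) log(P_UV(i,j) / (P_U(i) P_V(j)))."""
    n = len(labels_u)
    a = Counter(labels_u)  # cluster sizes |U_i|
    b = Counter(labels_v)  # cluster sizes |V_j|
    mi = 0.0
    for (i, j), n_ij in contingency(labels_u, labels_v).items():
        # Empty cells never appear in the Counter, so 0 * log(0) terms are skipped.
        # (n_ij/N) / ((a_i/N) * (b_j/N)) simplifies to N * n_ij / (a_i * b_j).
        mi += (n_ij / n) * log(n * n_ij / (a[i] * b[j]))
    return mi


# Toy data (assumed for illustration): 6 elements, R = 2 and C = 3 clusters.
u = [0, 0, 0, 1, 1, 1]
v = [0, 0, 1, 1, 2, 2]

print("H(U)    =", entropy(u))
print("H(V)    =", entropy(v))
print("MI(U,V) =", mutual_information(u, v))
```

On this toy input the MI stays below both entropies, and comparing a partition with itself gives MI equal to its entropy, in line with the upper bound stated above.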
Like the Rand index, the baseline value of mutual information between two random clusterings does not take on a constant value, and tends to be larger when the two partitions have a larger number of clusters (with a fixed number of set elements N).
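The drift of the baseline can be seen in a small simulation. The sketch below is only illustrative and rests on assumptions not in the text: labels are drawn independently and uniformly at random (so cluster sizes are not fixed, unlike the hypergeometric model), N is set to 100, and the helper names, trial count, and seed are arbitrary choices.

```python
import random
from collections import Counter
from math import log


def mutual_information(labels_u, labels_v):
    """MI between two label lists, using the natural logarithm."""
    n = len(labels_u)
    a, b = Counter(labels_u), Counter(labels_v)
    joint = Counter(zip(labels_u, labels_v))
    return sum((n_ij / n) * log(n * n_ij / (a[i] * b[j]))
               for (i, j), n_ij in joint.items())


def mean_random_mi(n_points, n_clusters, trials=200, seed=0):
    """Average MI over pairs of independent, uniformly random labelings."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        u = [rng.randrange(n_clusters) for _ in range(n_points)]
        v = [rng.randrange(n_clusters) for _ in range(n_points)]
        total += mutual_information(u, v)
    return total / trials


# Same N throughout; only the number of clusters changes.
baseline = {k: mean_random_mi(100, k) for k in (2, 5, 10, 20)}
for k, mi in baseline.items():
    print(f"{k:2d} clusters: mean MI = {mi:.3f}")
```

Although the two labelings are unrelated in every trial, the average MI grows with the number of clusters, which is the effect an adjustment for chance is meant to remove.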
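By adopting a hypergeometric model of randomness, it can be shown that the expected mutual information between two random clusterings is:
$$E\{MI(U,V)\} = \sum_{i=1}^{R} \sum_{j=1}^{C} \sum_{n_{ij}=(a_i+b_j-N)^+}^{\min(a_i, b_j)} \frac{n_{ij}}{N} \log\left(\frac{N \cdot n_{ij}}{a_i b_j}\right) \times \frac{a_i!\, b_j!\, (N-a_i)!\, (N-b_j)!}{N!\, n_{ij}!\, (a_i-n_{ij})!\, (b_j-n_{ij})!\, (N-a_i-b_j+n_{ij})!}$$
where $(a_i+b_j-N)^+$ denotes $\max(1, a_i+b_j-N)$. The variables $a_i$ and $b_j$ are partial sums of the contingency table; that is,
$$a_i = \sum_{j=1}^{C} n_{ij}$$
and
$$b_j = \sum_{i=1}^{R} n_{ij}$$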
The adjusted measure[1] for the mutual information may then be defined to be:
$$AMI(U,V) = \frac{MI(U,V) - E\{MI(U,V)\}}{\max\{H(U), H(V)\} - E\{MI(U,V)\}}$$
The AMI takes a value of 1 when the two partitions are identical and 0 when the MI between two partitions equals the value expected due to chance alone.
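The expected mutual information and the resulting AMI can be sketched as follows. This is an illustrative implementation under stated assumptions: partitions are lists of cluster labels, logarithms are natural, the function names are hypothetical, factorials are evaluated in log space with `math.lgamma` to avoid overflow, and the value returned when the denominator vanishes (both partitions consist of a single cluster) is a convention chosen here rather than something the text specifies.

```python
from collections import Counter
from math import exp, lgamma, log


def log_factorial(k):
    """log(k!) computed via the log-gamma function."""
    return lgamma(k + 1)


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(labels_u, labels_v):
    n = len(labels_u)
    a, b = Counter(labels_u), Counter(labels_v)
    joint = Counter(zip(labels_u, labels_v))
    return sum((n_ij / n) * log(n * n_ij / (a[i] * b[j]))
               for (i, j), n_ij in joint.items())


def expected_mutual_information(labels_u, labels_v):
    """E{MI} under the hypergeometric model with fixed cluster sizes a_i, b_j."""
    n = len(labels_u)
    a = list(Counter(labels_u).values())  # row sums a_i
    b = list(Counter(labels_v).values())  # column sums b_j
    emi = 0.0
    for a_i in a:
        for b_j in b:
            # Start at max(1, a_i + b_j - N): a cell count of 0 adds nothing.
            for n_ij in range(max(1, a_i + b_j - n), min(a_i, b_j) + 1):
                # Log of the hypergeometric probability of observing n_ij.
                log_prob = (log_factorial(a_i) + log_factorial(b_j)
                            + log_factorial(n - a_i) + log_factorial(n - b_j)
                            - log_factorial(n) - log_factorial(n_ij)
                            - log_factorial(a_i - n_ij)
                            - log_factorial(b_j - n_ij)
                            - log_factorial(n - a_i - b_j + n_ij))
                emi += (n_ij / n) * log(n * n_ij / (a_i * b_j)) * exp(log_prob)
    return emi


def adjusted_mutual_information(labels_u, labels_v):
    """AMI = (MI - E{MI}) / (max{H(U), H(V)} - E{MI})."""
    mi = mutual_information(labels_u, labels_v)
    emi = expected_mutual_information(labels_u, labels_v)
    denom = max(entropy(labels_u), entropy(labels_v)) - emi
    if abs(denom) < 1e-15:
        # Assumed convention: two single-cluster partitions count as identical.
        return 1.0
    return (mi - emi) / denom


# Toy data (assumed for illustration).
u = [0, 0, 0, 1, 1, 1]
v = [0, 0, 1, 1, 2, 2]

print("E{MI}    =", expected_mutual_information(u, v))
print("AMI(U,V) =", adjusted_mutual_information(u, v))
print("AMI(U,U) =", adjusted_mutual_information(u, u))
```

The triple loop follows the formula term by term, so its cost grows with R, C, and the cluster sizes; it is meant to make the definition concrete rather than to be fast.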