Homophily is the principle whereby "similarity breeds connections". We give a
quantitative formulation of this principle within networks. We say that a
network is homophilic with respect to a given labeled partition of its
vertices when the classes of the partition induce subgraphs that are
significantly denser than expected under a random labeled partition whose
classes have the same cardinalities (the type of the partition). This is the recently
introduced \emph{random coloring model} for network homophily. In this
perspective, the vector whose entries are the sizes of the subgraphs induced by
the corresponding classes is viewed as the observed outcome of the random
vector obtained by picking a labeled partition uniformly at random among those with
the same type. Consequently, the input network is homophilic at the
significance level α whenever the one-sided tail probability of
observing an outcome at least as extreme as the observed one is smaller than
α. Clearly, α can also be thought of as a quantifier of homophily
on the scale [0,1]. Since, as we show, even approximating this tail
probability is an NP-hard problem, we resort to multidimensional extensions of
the classical Cantelli inequality to bound α from above. This upper bound
is the homophily index we propose. It requires the knowledge of the covariance
matrix of the random vector, which was not previously known within the random
coloring model. In this paper we close this gap by computing the covariance
matrix of subgraph sizes under the random coloring model. Interestingly, the
matrix depends on the input partition only through its type and on the network
only through its degrees. Furthermore, all the covariances have the same sign,
and this sign is a graph invariant. Plugging this structure into Cantelli's
bound yields a meaningful, easy-to-compute index for measuring network
homophily.
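
To make the random coloring model concrete, the following minimal Python sketch counts the edges induced by each class and estimates the one-sided tail probability by reshuffling the labels, which preserves the type of the partition. It is an illustration only and not the index proposed here: the function names `induced_sizes` and `estimate_tail` are hypothetical, "at least as extreme" is assumed to mean componentwise greater than or equal to, and plain Monte Carlo sampling stands in for the Cantelli-type bound.

```python
import random
from collections import Counter


def induced_sizes(edges, labels, classes):
    """Return the number of edges inside each class, in the order of `classes`."""
    counts = Counter()
    for u, v in edges:
        if labels[u] == labels[v]:
            counts[labels[u]] += 1
    return [counts[c] for c in classes]


def estimate_tail(edges, labels, trials=2000, seed=0):
    """Monte Carlo estimate of the tail probability under random relabeling.

    Assumption: an outcome is "at least as extreme" as the observed one when
    every class induces at least as many edges as it does in the input.
    """
    rng = random.Random(seed)
    vertices = sorted(labels)
    classes = sorted(set(labels.values()))
    observed = induced_sizes(edges, labels, classes)
    colors = [labels[v] for v in vertices]
    hits = 0
    for _ in range(trials):
        # Shuffling the labels keeps every class cardinality (the type) fixed.
        rng.shuffle(colors)
        sample = induced_sizes(edges, dict(zip(vertices, colors)), classes)
        if all(s >= o for s, o in zip(sample, observed)):
            hits += 1
    return observed, hits / trials


# Toy input: two triangles joined by one edge, one class per triangle.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
observed, p_hat = estimate_tail(edges, labels)
print(observed, p_hat)
```

On this toy input, only 2 of the 20 labeled partitions of type (3, 3) place both triangles inside the classes, so the estimate should be close to 0.1. Sampling of this kind gives no guarantee on general inputs, which is why an analytic upper bound is used instead.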