Metrics matter in community detection
We present a critical evaluation of normalized mutual information (NMI) as an
evaluation metric for community detection. NMI exaggerates the leximin method's
performance on weak communities: does leximin, in finding the trivial
singletons clustering, truly outperform eight other community detection
methods? Three improvements to NMI from the literature are AMI, rrNMI, and
cNMI. We show their equivalences under the relevant random models, and for
evaluating community detection we advise one-sided AMI under the random model
that draws uniformly from all partitions of the nodes. This work seeks (1) to
start a conversation on robust measurements, and (2) to advocate evaluations
that do not give a "free lunch".
An Exact No Free Lunch Theorem for Community Detection
A precondition for a No Free Lunch theorem is evaluation with a loss function
which does not assume a priori superiority of some outputs over others. A
previous result for community detection by Peel et al. (2017) relies on a
mismatch between the loss function and the problem domain. The loss function
computes an expectation over only a subset of the universe of possible outputs;
thus, it is only asymptotically appropriate with respect to the problem size.
By using the correct random model for the problem domain, we provide a
stronger, exact No Free Lunch theorem for community detection. The claim
generalizes to other set-partitioning tasks, including core/periphery
separation, k-clustering, and graph partitioning. Finally, we review the
literature of proposed evaluation functions and identify functions which
(perhaps with slight modifications) are compatible with an exact No Free Lunch
theorem.
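The random model in question ranges over the full universe of outputs, i.e., every partition of the node set. A small sketch (our illustration under that assumption, not the paper's code) enumerates this universe recursively; its size is the Bell number, and a loss averaged uniformly over it gives no method an a priori advantage:

```python
def set_partitions(items):
    """Yield every partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        # Place `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or open a new singleton block for it.
        yield part + [[first]]

universe = list(set_partitions([1, 2, 3, 4]))
print(len(universe))  # 15, the Bell number B(4)
```

Averaging a loss over only a subset of this universe, as in the earlier result, is what creates the mismatch the abstract criticizes; the exact theorem uses the uniform model over all of it.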