Crowdsourced, or human-computation-based, clustering algorithms usually rely
on relative distance comparisons, as these are easier to elicit from human
workers than absolute distance information. A relative distance comparison is a
statement of the form "item A is closer to item B than to item C". However,
many existing clustering algorithms that use relative distances are rather
complex. They are often based on a two-step approach, where the relative
distances are first used to learn either a distance matrix or an embedding of
the items, and then some standard clustering method is applied in a second
step. In this paper we argue that it should be possible to compute a clustering
directly from relative distance comparisons. Our ideas are built upon existing
work on correlation clustering, a well-known non-parametric approach to
clustering. The technical contribution of this work is twofold. We first define
a novel variant of correlation clustering that is based on relative distance
comparisons, and hence suitable for human computation. We go on to show that
our new problem is closely related to basic correlation clustering, and use
this property to design an approximation algorithm for our problem. As a second
contribution, we propose a more practical algorithm, which we empirically
compare against existing methods from the literature. Experiments with synthetic
data suggest that our approach can outperform more complex methods. Also, our
method efficiently finds good and intuitive clusterings from real relative
distance comparison data.

Comment: short version published at IEEE ICDM 201