Human-labeled datasets, along with their corresponding evaluation algorithms,
play an important role in boundary detection. Here we present a psychophysical
experiment that addresses the reliability of such benchmarks. To better
evaluate the performance of boundary detection algorithms, we propose a
computational framework that removes inappropriate human labels and estimates
the intrinsic properties of boundaries.

Comment: NIPS 2012 Workshop on Human Computation for Science and Computational
Sustainability
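
The abstract does not detail the proposed framework, so the snippet below is
only an illustrative sketch (in Python with NumPy) of the general idea: discard
annotators whose binary boundary maps disagree too strongly with the consensus
of all annotators, then average the retained maps as a rough estimate of the
underlying boundary. The function name filter_labels, the Jaccard agreement
score, and both 0.5 thresholds are hypothetical choices, not the authors'
actual method.

```python
import numpy as np

def filter_labels(label_maps, min_agreement=0.5):
    """Drop annotations that poorly match the consensus boundary map.

    label_maps    : list of binary (H, W) arrays, one per human annotator.
    min_agreement : hypothetical Jaccard threshold; not from the paper.
    """
    stack = np.stack(label_maps).astype(bool)    # (N, H, W)
    consensus = stack.mean(axis=0) >= 0.5        # pixels a majority marked
    kept = []
    for m in stack:
        union = np.logical_or(m, consensus).sum()
        inter = np.logical_and(m, consensus).sum()
        score = inter / union if union else 1.0  # Jaccard overlap with consensus
        if score >= min_agreement:               # keep "appropriate" labels
            kept.append(m)
    # Average of retained maps as a crude estimate of the intrinsic boundary.
    intrinsic = np.stack(kept).mean(axis=0) if kept else None
    return kept, intrinsic
```

Note that pixel-wise Jaccard is likely too strict for thin boundaries; a
BSDS-style evaluation would instead match boundary pixels under a small
distance tolerance before scoring agreement.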