The goal of multi-sound source localization is to localize each sound
source in a mixture individually. While recent multi-sound source
localization methods have shown improved performance, they face challenges due
to their reliance on prior information about the number of objects to be
separated. In this paper, to overcome this limitation, we present a novel
multi-sound source localization method that can perform localization without
prior knowledge of the number of sound sources. To achieve this goal, we
propose an iterative object identification (IOI) module, which can recognize
sound-making objects in an iterative manner. Given the regions of
sound-making objects, we devise an object similarity-aware clustering (OSC)
loss that guides the IOI module to combine regions of the same object while
distinguishing between different objects and backgrounds. This enables our
method to perform accurate localization of sound-making objects without any
prior knowledge. Extensive experiments on the MUSIC and VGGSound benchmarks
show that the proposed method significantly outperforms existing methods in
both single-source and multi-source settings. Our code is
available at: https://github.com/VisualAIKHU/NoPrior_MultiSSL
Comment: Accepted at CVPR 202