This paper proposes a universal sound separation (USS) method capable of
handling sampling frequencies (SFs) unseen during training. USS aims to separate
arbitrary sources of different types and can be a key technique for realizing a
source separator that serves as a universal preprocessor for any downstream
task. Such a universal source separator requires two essential properties:
universality with respect to source types and universality with respect to
recording conditions. The former property has been studied in the USS literature, which
has greatly increased the number of source types that can be handled by a
single neural network. However, the latter property (for example, handling of
different SFs) has received less attention despite its importance. Because the SF
varies widely depending on the downstream task, a universal source separator
must handle a wide variety of SFs. In this paper, to achieve both properties, we
propose an
SF-independent (SFI) extension of a computationally efficient USS network,
SuDoRM-RF. The proposed network uses our previously proposed SFI convolutional
layers, which can handle various SFs by generating convolutional kernels in
accordance with the input SF. Experiments show that signal resampling can
degrade USS performance and that the proposed method works more consistently
across various SFs than signal-resampling-based methods.

Comment: Submitted to ICASSP202
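
As a rough illustration of what "generating convolutional kernels in accordance with the input SF" can mean, the sketch below samples one fixed continuous-time impulse response at whatever SF the input has. This is a minimal sketch under stated assumptions, not the network or layer proposed in the paper: the modulated-Gaussian filter shape, the function names `sfi_kernel` and `peak_frequency`, and all parameter values are hypothetical choices made only for this example.

```python
import numpy as np


def sfi_kernel(fs, center_hz=1000.0, sigma_s=1e-3, half_width_s=4e-3):
    """Sample a hypothetical analog impulse response at sampling frequency fs.

    The analog filter (an assumption for this sketch) is a Gaussian-windowed
    cosine defined in seconds, so its shape does not depend on fs.
    """
    half_len = int(round(half_width_s * fs))   # kernel half-length in samples
    n = np.arange(-half_len, half_len + 1)     # symmetric sample indices
    t = n / fs                                 # sample times in seconds
    h = np.exp(-0.5 * (t / sigma_s) ** 2) * np.cos(2 * np.pi * center_hz * t)
    return h / fs  # 1/fs scaling keeps the filter gain comparable across SFs


def peak_frequency(kernel, fs, n_fft=65536):
    """Return the frequency in Hz at which the kernel's magnitude response peaks."""
    spectrum = np.abs(np.fft.rfft(kernel, n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs[np.argmax(spectrum)]


if __name__ == "__main__":
    for fs in (8000, 16000, 48000):
        k = sfi_kernel(fs)
        # Kernel length grows with fs, while the passband stays at the same
        # physical frequency.
        print(fs, k.size, peak_frequency(k, fs))
```

In this sketch the kernel length in samples changes with the SF, while the duration in seconds and the passband location in Hz stay fixed, so the same underlying filter can be applied to inputs at different SFs without resampling the signal.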