We present Self-Remixing, a novel self-supervised speech separation method,
which refines a pre-trained separation model in an unsupervised manner. The
proposed method consists of a shuffler module and a solver module, and they
grow together through separation and remixing processes. Specifically, the
shuffler first separates observed mixtures and makes pseudo-mixtures by
shuffling and remixing the separated signals. The solver then separates the
pseudo-mixtures and remixes the separated signals back to the observed
mixtures. The solver is trained using the observed mixtures as supervision,
while the shuffler's weights are updated as a moving average of the
solver's weights, so that it generates pseudo-mixtures with fewer distortions.
Our experiments demonstrate that Self-Remixing outperforms existing
remixing-based self-supervised methods at the same or lower training cost
in an unsupervised setup. Self-Remixing also outperforms baselines in
semi-supervised domain adaptation, showing effectiveness in multiple setups.

Comment: Accepted by ICASSP 2023, 5 pages, 2 figures, 2 tables
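
The separate, shuffle, remix, and restore loop described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array layout `(batch, sources, time)`, the mean-squared-error loss, and the exponential form of the moving average are all assumptions made for illustration. The sketch also assumes the solver's outputs are already in the same source order as the shuffler's, which a real system would have to resolve.

```python
import numpy as np

def shuffle_and_remix(separated, perms):
    """Build pseudo-mixtures from shuffler outputs (hypothetical helper).

    separated: (B, N, T) signals separated from B observed mixtures.
    perms:     (N, B) integer array; perms[n] permutes the batch for source slot n.
    Returns the pseudo-mixtures (B, T) and the shuffled sources (B, N, T).
    """
    n_src = separated.shape[1]
    shuffled = np.stack([separated[perms[n], n] for n in range(n_src)], axis=1)
    return shuffled.sum(axis=1), shuffled

def restore_and_remix(estimates, perms):
    """Undo the shuffle on solver outputs and remix them (hypothetical helper).

    estimates: (B, N, T) signals the solver separated from the pseudo-mixtures.
    Returns reconstructions of the observed mixtures, shape (B, T).
    """
    restored = np.empty_like(estimates)
    for n in range(estimates.shape[1]):
        # Source n of pseudo-mixture b originally came from mixture perms[n][b].
        restored[perms[n], n] = estimates[:, n]
    return restored.sum(axis=1)

def mse(estimate, target):
    """Stand-in training loss; the actual loss is not given in the abstract."""
    return float(np.mean((estimate - target) ** 2))

def ema_update(shuffler_w, solver_w, alpha=0.999):
    """Moving-average update of shuffler weights (exponential form assumed)."""
    return alpha * shuffler_w + (1.0 - alpha) * solver_w

# Toy check with ideal separators standing in for both modules.
rng = np.random.default_rng(0)
B, N, T = 4, 2, 8
true_sources = rng.standard_normal((B, N, T))
mixtures = true_sources.sum(axis=1)                       # observed mixtures
perms = np.stack([rng.permutation(B) for _ in range(N)])  # one shuffle per slot
pseudo_mix, shuffled = shuffle_and_remix(true_sources, perms)
remixed = restore_and_remix(shuffled, perms)              # ideal solver output
loss = mse(remixed, mixtures)
```

With ideal separation the remixed signals reproduce the observed mixtures, which is why the observed mixtures can serve as the supervision signal for the solver even though no clean sources are available.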