Psychoacoustically Motivated Audio Declipping Based on Weighted l1 Minimization
A novel method for audio declipping based on sparsity is presented. The
method incorporates psychoacoustic information by weighting the transform
coefficients in the minimization. Weighting improves the restoration
quality while keeping the algorithm's complexity low. Three possible
constructions of the weights are proposed, based on the absolute
threshold of hearing, on the global masking threshold, and on a quadratic curve.
Experiments compare the restoration quality according to the
signal-to-distortion ratio (SDR) and PEMO-Q objective difference grade (ODG)
and indicate that, with correctly chosen weights, the presented method is able
to compete with, or even outperform, the current state of the art.
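The weighted l1 idea can be illustrated with a toy ISTA-style declipper. The DCT as sparsifying transform, the quadratic weight curve, and all parameter values below are illustrative assumptions, not the paper's exact algorithm:

```python
# Toy sketch of declipping via weighted l1 minimization (illustrative only).
# Assumptions: DCT as the sparsifying transform; a simple quadratic weight
# curve standing in for one of the three weight constructions.
import numpy as np
from scipy.fft import dct, idct

def weighted_soft_threshold(c, tau, w):
    """Prox of tau * sum_k w_k |c_k|: shrink coefficient k by tau * w_k."""
    return np.sign(c) * np.maximum(np.abs(c) - tau * w, 0.0)

def declip_ista(y, clip_level, w, tau=0.01, n_iter=200):
    """Keep reliable (unclipped) samples exact, force clipped samples to lie
    beyond the clip level, and promote weighted sparsity in the DCT domain."""
    reliable = np.abs(y) < clip_level
    x = y.copy()
    for _ in range(n_iter):
        c = weighted_soft_threshold(dct(x, norm='ortho'), tau, w)
        x = idct(c, norm='ortho')
        x[reliable] = y[reliable]                 # known samples stay fixed
        pos = (~reliable) & (y > 0)               # clipped from above
        neg = (~reliable) & (y < 0)               # clipped from below
        x[pos] = np.maximum(x[pos], clip_level)
        x[neg] = np.minimum(x[neg], -clip_level)
    return x

# toy example: a signal sparse in the DCT domain, then hard-clipped
n = 256
c_true = np.zeros(n)
c_true[[3, 17, 40]] = [1.0, -0.8, 0.5]
x_true = idct(c_true, norm='ortho')
clip = 0.6 * np.max(np.abs(x_true))
y = np.clip(x_true, -clip, clip)
w = 1.0 + (np.arange(n) / n) ** 2   # hypothetical quadratic weight curve
x_hat = declip_ista(y, clip, w)
```

Smaller weights on perceptually important coefficients shrink them less, which is the mechanism the abstract exploits.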
Vision-Infused Deep Audio Inpainting
Multi-modality perception is essential to develop interactive intelligence.
In this work, we consider a new task of visual information-infused audio
inpainting, i.e., synthesizing missing audio segments that correspond to their
accompanying videos. We identify two key aspects for a successful inpainter:
(1) It is desirable to operate on spectrograms instead of raw audio. Recent
advances in deep semantic image inpainting could be leveraged to go beyond the
limitations of traditional audio inpainting. (2) To synthesize visually
indicated audio, a visual-audio joint feature space needs to be learned with
synchronization of audio and video. To facilitate a large-scale study, we
collect a new multi-modality instrument-playing dataset called MUSIC-Extra-Solo
(MUSICES) by enriching the MUSIC dataset. Extensive experiments demonstrate that
our framework is capable of inpainting realistic and varying audio segments
with or without visual contexts. More importantly, our synthesized audio
segments are coherent with their video counterparts, showing the effectiveness
of our proposed Vision-Infused Audio Inpainter (VIAI). Code, models, dataset
and video results are available at
https://hangz-nju-cuhk.github.io/projects/AudioInpainting
Comment: To appear in ICCV 2019.
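Aspect (1), treating the spectrogram as an image with a hole to be filled, can be illustrated with a minimal preprocessing sketch. The STFT parameters, mask convention, and function name are assumptions for illustration, not the VIAI code:

```python
# Illustrative preprocessing for spectrogram-domain audio inpainting:
# build the image-like input (log-magnitude spectrogram with a zeroed "hole")
# that a deep image-inpainting network could consume.
import numpy as np
from scipy.signal import stft

def masked_spectrogram(x, fs, gap_start_s, gap_end_s, nperseg=512):
    """Return the log-magnitude spectrogram with gap frames zeroed,
    plus the binary time-frame mask marking the region to inpaint."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg)
    S = np.log1p(np.abs(Z))                    # image-like representation
    mask = (t >= gap_start_s) & (t < gap_end_s)
    S_masked = S.copy()
    S_masked[:, mask] = 0.0                    # the "hole" to be inpainted
    return S_masked, mask

fs = 16000
x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s of a 440 Hz tone
S_masked, mask = masked_spectrogram(x, fs, 0.4, 0.6)
```

Working on this 2-D representation is what lets image-inpainting architectures be reused for audio, as the abstract argues.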
Audio Inpainting: Revisited and Reweighted
We deal with the problem of sparsity-based audio inpainting, i.e., filling in
missing segments of audio. A known drawback of approaches based on
mathematical optimization is insufficient signal amplitude in the
filled gaps. Remaining within the framework of sparsity and convex
optimization, we propose improvements to audio inpainting, aiming at
compensating for such an energy loss. The new ideas are based on different
types of weighting, both in the coefficient and the time domains. We show that
our propositions improve the inpainting performance in terms of both the SNR
and the ODG.
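One standard coefficient-domain weighting scheme, iteratively reweighted l1, gives large coefficients small weights so they are shrunk less on the next pass, counteracting the amplitude loss of plain l1. The sketch below uses a DCT dictionary and illustrative parameters; it is an assumed stand-in, not the paper's exact construction:

```python
# Sketch of sparsity-based inpainting with coefficient-domain reweighting
# (iteratively reweighted l1); dictionary and parameters are assumptions.
import numpy as np
from scipy.fft import dct, idct

def soft(c, t):
    """Elementwise soft-thresholding: the prox of the weighted l1 norm."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def inpaint(y, known, tau=0.02, n_outer=3, n_inner=100, eps=1e-2):
    """Fill samples where known == False by promoting DCT sparsity.
    After each inner pass, reweight: large coefficients get small weights,
    so their amplitude survives -- compensating the energy loss in the gap."""
    w = np.ones(len(y))
    x = np.where(known, y, 0.0)
    for _ in range(n_outer):
        for _ in range(n_inner):
            c = soft(dct(x, norm='ortho'), tau * w)
            x = idct(c, norm='ortho')
            x[known] = y[known]          # keep known samples exact
        w = 1.0 / (np.abs(dct(x, norm='ortho')) + eps)   # reweight
    return x

# toy example: DCT-sparse signal with a 20-sample gap
n = 256
c_true = np.zeros(n)
c_true[[5, 12, 30]] = [1.0, -0.7, 0.4]
x_true = idct(c_true, norm='ortho')
known = np.ones(n, dtype=bool)
known[100:120] = False
x_hat = inpaint(np.where(known, x_true, 0.0), known)
```

Time-domain weighting, the abstract's other ingredient, would additionally scale the signal around the gap; it is omitted here for brevity.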