Psychoacoustically Motivated Audio Declipping Based on Weighted l1 Minimization
A novel method for audio declipping based on sparsity is presented. The
method incorporates psychoacoustic information by weighting the transform
coefficients in the minimization. Weighting improves the restoration
quality while keeping the algorithm's complexity low. Three possible
constructions of the weights are proposed, based on the absolute
threshold of hearing, on the global masking threshold, and on a quadratic curve.
Experiments compare the restoration quality according to the
signal-to-distortion ratio (SDR) and PEMO-Q objective difference grade (ODG)
and indicate that, with correctly chosen weights, the presented method is able
to compete with, or even outperform, the current state of the art.
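The weighted l1 idea can be illustrated with a toy ISTA-style declipper. The DCT as sparsifying transform, the quadratic weight curve, and all parameter values below are illustrative assumptions, not the paper's exact algorithm:

```python
# Toy sketch of declipping via weighted l1 minimization (illustrative only).
# Assumptions: DCT as the sparsifying transform; a simple quadratic weight
# curve standing in for one of the three weight constructions.
import numpy as np
from scipy.fft import dct, idct

def weighted_soft_threshold(c, tau, w):
    """Prox of tau * sum_k w_k |c_k|: shrink coefficient k by tau * w_k."""
    return np.sign(c) * np.maximum(np.abs(c) - tau * w, 0.0)

def declip_ista(y, clip_level, w, tau=0.01, n_iter=200):
    """Keep reliable (unclipped) samples exact, force clipped samples to lie
    beyond the clip level, and promote weighted sparsity in the DCT domain."""
    reliable = np.abs(y) < clip_level
    x = y.copy()
    for _ in range(n_iter):
        c = weighted_soft_threshold(dct(x, norm='ortho'), tau, w)
        x = idct(c, norm='ortho')
        x[reliable] = y[reliable]                 # known samples stay fixed
        pos = (~reliable) & (y > 0)               # clipped from above
        neg = (~reliable) & (y < 0)               # clipped from below
        x[pos] = np.maximum(x[pos], clip_level)
        x[neg] = np.minimum(x[neg], -clip_level)
    return x

# toy example: a signal sparse in the DCT domain, then hard-clipped
n = 256
c_true = np.zeros(n)
c_true[[3, 17, 40]] = [1.0, -0.8, 0.5]
x_true = idct(c_true, norm='ortho')
clip = 0.6 * np.max(np.abs(x_true))
y = np.clip(x_true, -clip, clip)
w = 1.0 + (np.arange(n) / n) ** 2   # hypothetical quadratic weight curve
x_hat = declip_ista(y, clip, w)
```

Smaller weights on perceptually important coefficients shrink them less, which is the mechanism the abstract exploits.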
Vision-Infused Deep Audio Inpainting
Multi-modality perception is essential to develop interactive intelligence.
In this work, we consider a new task of visual information-infused audio
inpainting, i.e., synthesizing missing audio segments that correspond to their
accompanying videos. We identify two key aspects for a successful inpainter:
(1) It is desirable to operate on spectrograms instead of raw audio. Recent
advances in deep semantic image inpainting could be leveraged to go beyond the
limitations of traditional audio inpainting. (2) To synthesize visually
indicated audio, a visual-audio joint feature space needs to be learned with
synchronization of audio and video. To facilitate a large-scale study, we
collect a new multi-modality instrument-playing dataset called MUSIC-Extra-Solo
(MUSICES) by enriching the MUSIC dataset. Extensive experiments demonstrate that
our framework is capable of inpainting realistic and varying audio segments
with or without visual contexts. More importantly, our synthesized audio
segments are coherent with their video counterparts, showing the effectiveness
of our proposed Vision-Infused Audio Inpainter (VIAI). Code, models, dataset
and video results are available at
https://hangz-nju-cuhk.github.io/projects/AudioInpainting
Comment: To appear in ICCV 2019.
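Aspect (1), treating the spectrogram as an image with a hole to be filled, can be illustrated with a minimal preprocessing sketch. The STFT parameters, mask convention, and function name are assumptions for illustration, not the VIAI code:

```python
# Illustrative preprocessing for spectrogram-domain audio inpainting:
# build the image-like input (log-magnitude spectrogram with a zeroed "hole")
# that a deep image-inpainting network could consume.
import numpy as np
from scipy.signal import stft

def masked_spectrogram(x, fs, gap_start_s, gap_end_s, nperseg=512):
    """Return the log-magnitude spectrogram with gap frames zeroed,
    plus the binary time-frame mask marking the region to inpaint."""
    f, t, Z = stft(x, fs=fs, nperseg=nperseg)
    S = np.log1p(np.abs(Z))                    # image-like representation
    mask = (t >= gap_start_s) & (t < gap_end_s)
    S_masked = S.copy()
    S_masked[:, mask] = 0.0                    # the "hole" to be inpainted
    return S_masked, mask

fs = 16000
x = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s of a 440 Hz tone
S_masked, mask = masked_spectrogram(x, fs, 0.4, 0.6)
```

Working on this 2-D representation is what lets image-inpainting architectures be reused for audio, as the abstract argues.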
Audio Inpainting: Revisited and Reweighted
We deal with the problem of sparsity-based audio inpainting, i.e., filling in
missing segments of audio. A known drawback of approaches based on
mathematical optimization is insufficient signal amplitude in the
filled gaps. Remaining within the framework of sparsity and convex
optimization, we propose improvements to audio inpainting, aiming at
compensating for such an energy loss. The new ideas are based on different
types of weighting, both in the coefficient and the time domains. We show that
our propositions improve the inpainting performance in terms of both the SNR
and the ODG.
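One standard coefficient-domain weighting scheme, iteratively reweighted l1, gives large coefficients small weights so they are shrunk less on the next pass, counteracting the amplitude loss of plain l1. The sketch below uses a DCT dictionary and illustrative parameters; it is an assumed stand-in, not the paper's exact construction:

```python
# Sketch of sparsity-based inpainting with coefficient-domain reweighting
# (iteratively reweighted l1); dictionary and parameters are assumptions.
import numpy as np
from scipy.fft import dct, idct

def soft(c, t):
    """Elementwise soft-thresholding: the prox of the weighted l1 norm."""
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def inpaint(y, known, tau=0.02, n_outer=3, n_inner=100, eps=1e-2):
    """Fill samples where known == False by promoting DCT sparsity.
    After each inner pass, reweight: large coefficients get small weights,
    so their amplitude survives -- compensating the energy loss in the gap."""
    w = np.ones(len(y))
    x = np.where(known, y, 0.0)
    for _ in range(n_outer):
        for _ in range(n_inner):
            c = soft(dct(x, norm='ortho'), tau * w)
            x = idct(c, norm='ortho')
            x[known] = y[known]          # keep known samples exact
        w = 1.0 / (np.abs(dct(x, norm='ortho')) + eps)   # reweight
    return x

# toy example: DCT-sparse signal with a 20-sample gap
n = 256
c_true = np.zeros(n)
c_true[[5, 12, 30]] = [1.0, -0.7, 0.4]
x_true = idct(c_true, norm='ortho')
known = np.ones(n, dtype=bool)
known[100:120] = False
x_hat = inpaint(np.where(known, x_true, 0.0), known)
```

Time-domain weighting, the abstract's other ingredient, would additionally scale the signal around the gap; it is omitted here for brevity.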