
    Audio Inpainting

    Published version: IEEE Transactions on Audio, Speech and Language Processing 20(3): 922-932, Mar 2012. DOI: 10.1109/TASL.2011.2168211

    Introducing SPAIN (SParse Audio INpainter)

    A novel sparsity-based algorithm for audio inpainting is proposed. It is an adaptation of the SPADE algorithm by Kitić et al., originally developed for audio declipping, to the task of audio inpainting. The new SPAIN (SParse Audio INpainter) comes in synthesis and analysis variants. Experiments show that both A-SPAIN and S-SPAIN outperform other sparsity-based inpainting algorithms. Moreover, A-SPAIN performs on a par with the state-of-the-art method based on linear prediction in terms of SNR and, for larger gaps, SPAIN is even slightly better in terms of the PEMO-Q psychoacoustic criterion.
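
    For intuition, here is a minimal sketch of the synthesis-side idea behind SPAIN-style inpainting: alternate hard thresholding in a transform domain with a projection onto the reliable samples, gradually relaxing the sparsity level. A plain FFT stands in for the windowed, overlapping frame processing of the published algorithm; all names and parameters are illustrative, not the authors' reference implementation.

```python
import numpy as np

def spain_like_inpaint(x, reliable, n_iter=200):
    """x: frame with e.g. zeros in the gap; reliable: boolean mask of known samples."""
    y = x.copy()
    for k in range(1, n_iter + 1):          # sparsity level grows each iteration
        c = np.fft.fft(y)
        small = np.argsort(np.abs(c))[:-k]  # indices of all but the k largest
        c[small] = 0.0                      # hard threshold
        y = np.real(np.fft.ifft(c))
        y[reliable] = x[reliable]           # project onto the known samples
    return y
```

    The published S-SPAIN and A-SPAIN variants run this kind of alternation per overlapping frame and differ in whether sparsity is imposed on synthesis or analysis coefficients.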

    Audio inpainting algorithms

    The thesis deals with the audio inpainting problem and sparse-representation approaches to it. It surveys recent algorithms that fill in missing audio data using sparse representations, and proposes a method that combines a sparse representation of the signal with a low-rank structure in its spectrogram. The thesis also describes a Matlab implementation and an evaluation of the proposed method.
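
    To illustrate the low-rank ingredient mentioned above, a rough sketch: complete a spectrogram with missing entries by alternating a truncated SVD with a projection onto the observed entries. The rank, iteration count, and names are assumptions for illustration; the sparse-representation half of the proposed method is omitted.

```python
import numpy as np

def low_rank_complete(S, observed, rank=8, n_iter=100):
    """S: spectrogram with zeros at missing entries; observed: boolean mask."""
    X = S.copy()
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s[rank:] = 0.0                # keep only the rank-r approximation
        X = (U * s) @ Vt
        X[observed] = S[observed]     # re-impose the known entries
    return X
```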

    Quality Enhancement of Highly Degraded Music Using Deep Learning-Based Prediction Models for Lost Frequencies

    Audio quality degradation can have many causes. In musical applications, such fragmentation may lead to highly unpleasant listening experiences. Restoration algorithms may be employed to reconstruct missing parts of the audio, much as in image reconstruction, in an approach called audio inpainting. Current state-of-the-art methods for audio inpainting cover limited scenarios, with well-defined gap windows and little variety of musical genres. In this work, we propose a deep-learning-based (DL-based) method for audio inpainting, accompanied by a dataset with random fragmentation conditions that approximate real impairment situations. The dataset was collected using tracks from different music genres to provide good signal variability. Our best model improved the quality of all musical genres, obtaining an average PSNR of 12.9 dB, although it worked better for musical genres in which acoustic instruments are predominant.
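
    For reference, the PSNR figure quoted above can be computed as follows; the peak level (full-scale audio in [-1, 1]) is an assumption, since the paper's exact normalization is not restated here.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between a reference and a reconstructed signal."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```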

    Inpainting of long audio segments with similarity graphs

    We present a novel method for the compensation of long-duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A suitable candidate segment for the substitution of the lost content is proposed by an intuitive optimization scheme and smoothly inserted into the gap, i.e., the lost or distorted signal region. Extensive listening tests show that the proposed algorithm provides highly promising results when applied to a variety of real-world music signals.
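
    A rough sketch of the candidate-selection step described above: score segments of the intact signal by how well their borders match the frames surrounding the gap, here using cosine similarity over log-magnitude STFT frames. The features, scoring rule, and names are illustrative assumptions; the paper's graph construction and optimization scheme are more elaborate.

```python
import numpy as np

def best_candidate(features, left_ctx, right_ctx, gap_frames):
    """features: (n_frames, n_bins) log-magnitude STFT frames of the intact signal.
    Returns the start index of the gap_frames-long segment whose first and last
    frames best match the frames bordering the gap."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    best, best_score = 0, -np.inf
    for s in range(len(features) - gap_frames + 1):
        score = cos(features[s], left_ctx) + cos(features[s + gap_frames - 1], right_ctx)
        if score > best_score:
            best, best_score = s, score
    return best
```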

    Diffusion-Based Audio Inpainting

    Audio inpainting aims to reconstruct missing segments in corrupted recordings. Previous methods produce plausible reconstructions when the gap length is shorter than about 100 ms, but the quality decreases for longer gaps. This paper explores recent advancements in deep learning, particularly diffusion models, for the task of audio inpainting. The proposed method uses an unconditionally trained generative model, which can be conditioned in a zero-shot fashion for audio inpainting, offering high flexibility to regenerate gaps of arbitrary length. An improved deep neural network architecture based on the constant-Q transform, which allows the model to exploit pitch-equivariant symmetries in audio, is also presented. The performance of the proposed algorithm is evaluated through objective and subjective metrics for the task of reconstructing short to mid-sized gaps. The results of a formal listening test show that the proposed method delivers performance comparable to the state of the art for short gaps, while retaining good audio quality and outperforming the baselines for the longest gap lengths tested, 150 ms and 200 ms. This work helps improve the restoration of sound recordings with fairly long local disturbances or dropouts.
    Comment: Submitted for publication to the Journal of the Audio Engineering Society on January 30th, 202
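
    The zero-shot conditioning can be sketched as an inpainting-guided reverse diffusion loop in the spirit of the description above: at each step, the model's clean estimate is overwritten with the observed samples in the known region and then re-noised to the next level. The `denoise` stub, the sigma schedule, and the sampler are hypothetical placeholders, not the paper's CQT-based network or exact sampler.

```python
import numpy as np

def denoise(x_t, sigma):
    """Placeholder for a pretrained unconditional denoiser network."""
    return x_t

def diffusion_inpaint(y, known, sigmas, seed=0):
    """y: observed signal (values inside the gap are ignored); known: boolean mask;
    sigmas: decreasing noise levels of the reverse process."""
    rng = np.random.default_rng(seed)
    x = sigmas[0] * rng.standard_normal(y.shape)
    for i, sigma in enumerate(sigmas):
        x0 = denoise(x, sigma)       # model's estimate of the clean signal
        x0 = np.where(known, y, x0)  # impose the observed samples
        sigma_next = sigmas[i + 1] if i + 1 < len(sigmas) else 0.0
        x = x0 + sigma_next * rng.standard_normal(y.shape)  # re-noise
    return x
```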

    A Reproducible Research Framework for Audio Inpainting

    We introduce a unified framework for the restoration of distorted audio data, leveraging the image inpainting concept and covering existing audio applications. In this framework, termed audio inpainting, the distorted data are considered missing and their location is assumed to be known. We further introduce baseline approaches based on sparse representations. For this new audio inpainting concept, we provide reproducible-research tools, including: the handling of audio inpainting tasks as inverse problems, embedded in a frame-based scheme similar to patch-based image processing; several experimental settings; speech and music material; and OMP-like algorithms, with two dictionaries, for general audio inpainting or specifically enhanced for declipping.
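
    As a hedged illustration of the OMP-like baseline flavor, the sketch below runs orthogonal matching pursuit on the reliable samples of a single frame with a DCT dictionary, then synthesizes the full frame to fill the gap. The dictionary choice, stopping rule, and function names are simplifications, not the published toolbox.

```python
import numpy as np
from scipy.fft import idct

def omp_inpaint_frame(x, reliable, n_atoms=32):
    """x: frame with a gap; reliable: boolean mask of known samples."""
    n = len(x)
    D = idct(np.eye(n), norm="ortho", axis=0)  # orthonormal DCT atoms as columns
    Dr = D[reliable]                           # dictionary rows at known samples
    r, support = x[reliable].copy(), []
    for _ in range(n_atoms):
        idx = int(np.argmax(np.abs(Dr.T @ r)))  # atom most correlated with residual
        if idx in support:
            break
        support.append(idx)
        coef, *_ = np.linalg.lstsq(Dr[:, support], x[reliable], rcond=None)
        r = x[reliable] - Dr[:, support] @ coef  # update residual
    y = D[:, support] @ coef                     # synthesize the full frame
    y[reliable] = x[reliable]                    # keep the known samples verbatim
    return y
```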