The Audio Degradation Toolbox and its Application to Robustness Evaluation
We introduce the Audio Degradation Toolbox (ADT) for the controlled degradation of audio signals, and propose its usage as a means of evaluating and comparing the robustness of audio processing algorithms. Music recordings encountered in practical applications are subject to varied, sometimes unpredictable degradation. For example, audio is degraded by low-quality microphones, noisy recording environments, MP3 compression, dynamic compression in broadcasting, or vinyl decay. In spite of this, no standard software for the degradation of audio exists, and music processing methods are usually evaluated against clean data. The ADT fills this gap by providing Matlab scripts that emulate a wide range of degradation types. We describe 14 degradation units, and how they can be chained to create more complex, 'real-world' degradations. The ADT also provides functionality to adjust existing ground truth, correcting for temporal distortions introduced by degradation. Using four different music informatics tasks, we show that performance strongly depends on the combination of method and degradation applied. We demonstrate that specific degradations can reduce or even reverse the performance difference between two competing methods. ADT source code, sounds, impulse responses and definitions are freely available for download.
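The idea of chaining degradation units can be illustrated with a minimal sketch. This is not the ADT's Matlab code or its API: the language is Python, and the names `add_noise`, `hard_clip`, and `chain`, as well as the parameter values, are hypothetical and chosen only for illustration. The sketch assumes a mono signal held as a list of floats in the range [-1, 1].

```python
import math
import random
from typing import Callable, List, Sequence

Signal = List[float]
Degradation = Callable[[Signal], Signal]


def add_noise(snr_db: float, seed: int = 0) -> Degradation:
    """Hypothetical unit: add white Gaussian noise at a target SNR in dB."""
    def unit(x: Signal) -> Signal:
        rng = random.Random(seed)  # fixed seed keeps the degradation repeatable
        signal_power = sum(v * v for v in x) / len(x)
        noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
        return [v + rng.gauss(0.0, noise_std) for v in x]
    return unit


def hard_clip(limit: float) -> Degradation:
    """Hypothetical unit: clip every sample to the range [-limit, limit]."""
    def unit(x: Signal) -> Signal:
        return [max(-limit, min(limit, v)) for v in x]
    return unit


def chain(units: Sequence[Degradation]) -> Degradation:
    """Compose units so that each one receives the previous unit's output."""
    def chained(x: Signal) -> Signal:
        for unit in units:
            x = unit(x)
        return x
    return chained


# One second of a 440 Hz tone at an assumed sample rate of 8 kHz.
sample_rate = 8000
tone = [0.8 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

# A more complex degradation built from two simple units.
degrade = chain([add_noise(snr_db=20.0), hard_clip(limit=0.5)])
degraded = degrade(tone)
```

Because every unit shares one signature, the same chain can be applied unchanged to each recording in a test set. Neither unit here alters timing, so the sketch does not cover the ground-truth adjustment the abstract mentions.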
Database of audio records
Diploma thesis with a practical part
ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio
Adversarial machine learning research has recently demonstrated the
feasibility of confusing automatic speech recognition (ASR) models by introducing
acoustically imperceptible perturbations to audio samples. To help researchers
and practitioners gain better understanding of the impact of such attacks, and
to provide them with tools to help them more easily evaluate and craft strong
defenses for their models, we present ADAGIO, the first tool designed to allow
interactive experimentation with adversarial attacks and defenses on an ASR
model in real time, both visually and aurally. ADAGIO incorporates AMR and MP3
audio compression techniques as defenses, which users can interactively apply
to attacked audio samples. We show that these techniques, which are based on
psychoacoustic principles, effectively eliminate targeted attacks, reducing the
attack success rate from 92.5% to 0%. We will demonstrate ADAGIO and invite the
audience to try it on the Mozilla Common Voice dataset.
Comment: Demo paper; for supplementary video, see https://youtu.be/0W2BKMwSfV
Detection and localization of double compression in MP3 audio tracks
In this work, by exploiting the traces left by double compression in the statistics of quantized modified discrete cosine transform coefficients, a single measure has been derived that makes it possible to decide whether an MP3 file is singly or doubly compressed and, in the latter case, also to determine the bit rate of the first compression. Moreover, the proposed method as well as two state-of-the-art methods have been applied to analyze short temporal windows of the track, allowing the localization of possibly tampered portions in the MP3 file under analysis. Experiments confirm the good performance of the proposed scheme and demonstrate that current detection methods are useful for tampering localization, thus offering a new tool for the forensic analysis of MP3 audio tracks.