757 research outputs found
The Audio Degradation Toolbox and its Application to Robustness Evaluation
We introduce the Audio Degradation Toolbox (ADT) for the controlled degradation of audio signals, and propose its usage as a means of evaluating and comparing the robustness of audio processing algorithms. Music recordings encountered in practical applications are subject to varied, sometimes unpredictable degradation. For example, audio is degraded by low-quality microphones, noisy recording environments, MP3 compression, dynamic compression in broadcasting or vinyl decay. In spite of this, no standard software for the degradation of audio exists, and music processing methods are usually evaluated against clean data. The ADT fills this gap by providing Matlab scripts that emulate a wide range of degradation types. We describe 14 degradation units, and how they can be chained to create more complex, 'real-world' degradations. The ADT also provides functionality to adjust existing ground-truth, correcting for temporal distortions introduced by degradation. Using four different music informatics tasks, we show that performance strongly depends on the combination of method and degradation applied. We demonstrate that specific degradations can reduce or even reverse the performance difference between two competing methods. ADT source code, sounds, impulse responses and definitions are freely available for download.
Drum Transcription via Classification of Bar-level Rhythmic Patterns
Matthias Mauch is supported by a Royal Academy of Engineering Research Fellowship.
Souvenir Book of Bar Harbor, Me.
Colored, postcard-like images in a souvenir booklet showing scenes around Bar Harbor, Maine, published between 1915 and 1925. Images include the Village Green, Thunder Cave, Duck Brook, Frenchman's Bay, Newport Mountain from Gorge Road, Bar Harbor High School, Balance Rock, Main Street, Shore Path, Bar Harbor from Rodick Island, the Post Office, Jesup Memorial Library and Y.W.C.A., Kebo Street, Profile Rock and a horse-drawn buggy driving past Otter Cliff.
Impact of the assimilation of conventional data on the quantitative precipitation forecasts in the Eastern Mediterranean
This study is devoted to the evaluation of the role of assimilation of conventional data on the quantitative precipitation forecasts at regional scale. The conventional data included surface station reports as well as upper air observations. The analysis was based on the simulation of 15 cases of heavy precipitation that occurred in the Eastern Mediterranean. The verification procedure revealed that the ingestion of conventional data by objective analysis in the initial conditions of the BOLAM limited-area model does not result in a statistically significant improvement of the quantitative precipitation forecasts.
A Comparative Analysis Of Latent Regressor Losses For Singing Voice Conversion
Previous research has shown that established techniques for spoken voice conversion (VC) do not perform as well when applied to singing voice conversion (SVC). We propose an alternative loss component in a loss function that is otherwise well-established among VC tasks, which has been shown to improve our model’s SVC performance. We first trained a singer identity embedding (SIE) network on mel-spectrograms of singer recordings to produce singer-specific variance encodings using contrastive learning. We subsequently trained a well-known autoencoder framework (AutoVC) conditioned on these SIEs, and measured differences in SVC performance when using different latent regressor loss components. We found that using this loss w.r.t. SIEs leads to better performance than w.r.t. bottleneck embeddings, where converted audio is more natural and specific towards target singers. The inclusion of this loss component has the advantage of explicitly forcing the network to reconstruct with timbral similarity, and also negates the effect of poor disentanglement in AutoVC’s bottleneck embeddings. We demonstrate peculiar diversity between computational and human evaluations on singer converted audio clips, which highlights the necessity of both. We also propose a pitch-matching mechanism between source and target singers to ensure these evaluations are not influenced by differences in pitch register.
Mammy's Lullaby / music by Lee S. Roberts; words by Will Callahan
Cover: drawing of sunset over a river; description reads "a dreamy southern waltz" (see 431); Publisher: Forster Music Publisher (Chicago)
Seeing Sounds, Hearing Shapes: a gamified study to evaluate sound-sketches
Sound-shape associations, a subset of cross-modal associations between the auditory and visual domain, have been studied mainly in the context of matching a set of purposefully crafted shapes to sounds. Recent studies have explored how humans represent sound through free-form sketching and how a graphical sketch input could be used for sound production. In this paper, the potential of communicating sound characteristics through these free-form sketches is investigated in a gamified study that was conducted with eighty-two participants at two online exhibition events. The results show that participants managed to recognise sounds at a higher rate than the random baseline would suggest; however, it appeared difficult to visually encode nuanced timbral differences.
Sketching sounds: an exploratory study on sound-shape associations
Sound synthesiser controls typically correspond to technical parameters of signal processing algorithms rather than intuitive sound descriptors that relate to human perception of sound. This makes it difficult to realise sound ideas in a straightforward way. Cross-modal mappings, for example between gestures and sound, have been suggested as a more intuitive control mechanism. A large body of research shows consistency in human associations between sounds and shapes. However, the use of drawings to drive sound synthesis has not been explored to its full extent. This paper presents an exploratory study that asked participants to sketch visual imagery of sounds with a monochromatic digital drawing interface, with the aim of identifying different representational approaches and determining whether timbral sound characteristics can be communicated reliably through visual sketches. Results imply that the development of a synthesiser exploiting sound-shape associations is feasible, but a larger and more focused dataset is needed in follow-up studies.