Pitch ability as an aptitude for tone learning
Tone languages such as Mandarin use voice pitch to signal lexical contrasts, presenting a challenge for second/foreign language (L2) learners whose native languages do not use pitch in this manner. The present study examined components of an aptitude for mastering L2 lexical tone. Native English speakers with no previous tone language experience completed a Mandarin word learning task, as well as tests of pitch ability, musicality, L2 aptitude, and general cognitive ability. Pitch ability measures improved predictions of learning performance beyond musicality, L2 aptitude, and general cognitive ability, and also predicted transfer of learning to new talkers. In sum, although certain nontonal measures help predict successful tone learning, the central components of tonal aptitude are pitch-specific perceptual measures.
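The study's incremental-validity logic (does pitch ability predict learning beyond the baseline measures?) can be illustrated with a toy hierarchical regression. The data below are synthetic and the variable names are placeholders, not the study's materials:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic illustration: tone-learning scores driven partly by a
# pitch-specific ability on top of baseline aptitude measures.
musicality = rng.normal(size=n)
l2_aptitude = rng.normal(size=n)
pitch_ability = rng.normal(size=n)
learning = (0.3 * musicality + 0.3 * l2_aptitude + 0.6 * pitch_ability
            + rng.normal(scale=0.5, size=n))

def r_squared(X, y):
    """R^2 of an ordinary-least-squares fit with an intercept column."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

baseline = r_squared(np.column_stack([musicality, l2_aptitude]), learning)
full = r_squared(np.column_stack([musicality, l2_aptitude, pitch_ability]),
                 learning)
print(f"baseline R^2 = {baseline:.2f}, with pitch ability R^2 = {full:.2f}")
```

The gain in R² from adding the pitch predictor is the quantity of interest; the actual study's statistical models may differ.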
Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future
Clark offers a powerful description of the brain as a prediction machine, an account that makes progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, hierarchical prediction offers progress on a concrete descriptive level, for testing and constraining the conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
A computational framework for sound segregation in music signals
Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 200
OK Computer Analysis: An Audio Corpus Study of Radiohead
The application of music information retrieval techniques in popular music studies has great promise. In the present work, a corpus of Radiohead songs spanning their career from 1992 to 2017 is subjected to automated audio analysis. We examine findings at a number of granularities and perspectives, including within-song and between-song examination of both timbral-rhythmic and harmonic features. Chronological changes include possible career-spanning effects for a band's releases, such as slowing tempi and reduced brightness, and the timbral markers of Radiohead's expanding approach to instrumental resources most identified with the Kid A and Amnesiac era. We conclude with a discussion highlighting some challenges for this approach, and the potential for a field of audio-file-based career analysis.
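One of the chronological markers mentioned above, brightness, is commonly operationalised as the spectral centroid. A minimal sketch of that feature on synthetic signals (not the study's corpus or toolchain) might look like:

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Brightness proxy: magnitude-weighted mean frequency of the spectrum."""
    mag = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

sr = 22050
t = np.arange(sr) / sr  # one second of audio

# A tone with added high-frequency content should score "brighter".
bright = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 4400 * t)
dark = np.sin(2 * np.pi * 440 * t)

print(spectral_centroid(bright, sr), spectral_centroid(dark, sr))
```

A corpus study would compute this per track (typically frame-wise over short windows rather than over the whole file) and track its trend over release dates.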
DrumGAN: Synthesis of Drum Sounds With Timbral Feature Conditioning Using Generative Adversarial Networks
Synthetic creation of drum sounds (e.g., in drum machines) is commonly performed using analog or digital synthesis, allowing a musician to sculpt the desired timbre by modifying various parameters. Typically, such parameters control low-level features of the sound and often have no musical meaning or perceptual correspondence. With the rise of deep learning, data-driven processing of audio emerges as an alternative to traditional signal processing. This new paradigm allows controlling the synthesis process through learned high-level features or by conditioning a model on musically relevant information. In this paper, we apply a Generative Adversarial Network to the task of audio synthesis of drum sounds. By conditioning the model on perceptual features computed with a publicly available feature extractor, intuitive control is gained over the generation process. The experiments are carried out on a large collection of kick, snare, and cymbal sounds. We show that, compared to a specific prior work based on a U-Net architecture, our approach considerably improves the quality of the generated drum samples, and that the conditional input indeed shapes the perceptual characteristics of the sounds. Also, we provide audio examples and release the code used in our experiments.

Comment: 8 pages, 1 figure, 3 tables, accepted in Proc. of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020).
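The conditioning mechanism described above (combining a noise vector with perceptual features before decoding to audio) can be sketched with randomly initialised dense layers. The shapes, layer count, and the placeholder feature values below are assumptions for illustration, not DrumGAN's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes, not the paper's: a 64-d noise vector, a 7-d
# perceptual conditioning vector, and a short output waveform.
NOISE_DIM, COND_DIM, HIDDEN, N_SAMPLES = 64, 7, 128, 1024

# Randomly initialised weights standing in for a trained generator.
W1 = rng.normal(scale=0.1, size=(NOISE_DIM + COND_DIM, HIDDEN))
W2 = rng.normal(scale=0.1, size=(HIDDEN, N_SAMPLES))

def generate(noise, perceptual_features):
    """Concatenate noise with the conditioning vector, then decode."""
    x = np.concatenate([noise, perceptual_features])
    h = np.tanh(x @ W1)     # hidden activation
    return np.tanh(h @ W2)  # waveform clipped to [-1, 1] by tanh

z = rng.normal(size=NOISE_DIM)
features = np.array([0.9, 0.1, 0.5, 0.2, 0.7, 0.3, 0.8])  # placeholder values
audio = generate(z, features)
```

The point of the design is that the same noise vector with different conditioning values yields different audio, so the perceptual features act as user-facing control knobs during generation.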
A latent rhythm complexity model for attribute-controlled drum pattern generation
Most music listeners have an intuitive understanding of the notion of rhythm complexity. Musicologists and scientists, however, have long sought objective ways to measure and model such a distinctively perceptual attribute of music. Whereas previous research has mainly focused on monophonic patterns, this article presents a novel perceptually-informed rhythm complexity measure specifically designed for polyphonic rhythms, i.e., patterns in which multiple simultaneous voices cooperate toward creating a coherent musical phrase. We focus on drum rhythms relating to the Western musical tradition and validate the proposed measure through a perceptual test where users were asked to rate the complexity of real-life drumming performances. We then propose a latent vector model for rhythm complexity based on a recurrent variational autoencoder tasked with learning the complexity of input samples and embedding it along one latent dimension. Aided by an auxiliary adversarial loss term promoting disentanglement, this effectively regularizes the latent space, thus enabling explicit control over the complexity of newly generated patterns. Trained on a large corpus of MIDI files of polyphonic drum recordings, the proposed method proved capable of generating coherent and realistic samples at the desired complexity value. In our experiments, output and target complexities show a high correlation, and the latent space appears interpretable and continuously navigable. On the one hand, this model can readily contribute to a wide range of creative applications, including, for instance, assisted music composition and automatic music generation. On the other hand, it brings us one step closer toward achieving the ambitious goal of equipping machines with a human-like understanding of perceptual features of music.
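For intuition about perceptually informed rhythm complexity, here is a toy measure (not the authors' model) that charges onsets on metrically weak positions of a polyphonic 16-step drum grid more than onsets on strong beats; the salience weights are illustrative:

```python
import numpy as np

# Illustrative metrical-salience profile for a 16-step bar:
# downbeat strongest, offbeat sixteenths weakest.
SALIENCE = np.array([4, 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0])

def complexity(pattern):
    """Toy polyphonic complexity: onsets on weak positions count more.

    `pattern` is a (voices x 16) binary grid, one row per drum voice.
    """
    pattern = np.asarray(pattern)
    weights = SALIENCE.max() - SALIENCE       # weak position -> high cost
    per_voice = (pattern * weights).sum(axis=1)
    density = pattern.sum()
    return float(per_voice.sum() / max(density, 1))

four_floor = [[1, 0, 0, 0] * 4,               # kick on every beat
              [0, 0, 0, 0, 1, 0, 0, 0] * 2]   # snare on beats 2 and 4
syncopated = [[1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0],
              [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]]

print(complexity(four_floor), complexity(syncopated))
```

The proposed model goes further: rather than hand-crafting such a score, it learns to embed complexity along one latent dimension of a recurrent variational autoencoder, so the attribute can be dialed in when generating new patterns.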