Multichannel high resolution NMF for modelling convolutive mixtures of non-stationary signals in the time-frequency domain

Author: Badeau Roland
Plumbley Mark
Publication date: 01/01/2013

Several probabilistic models involving latent components have been proposed for modeling time-frequency (TF) representations of audio signals such as spectrograms, notably in the nonnegative matrix factorization (NMF) literature. Among them, the recent high-resolution NMF (HR-NMF) model is able to take both phases and local correlations in each frequency band into account, and its potential has been illustrated in applications such as source separation and audio inpainting. In this paper, HR-NMF is extended to multichannel signals and to convolutive mixtures. The new model can represent a variety of stationary and non-stationary signals, including autoregressive moving average (ARMA) processes and mixtures of damped sinusoids. A fast variational expectation-maximization (EM) algorithm is proposed to estimate the enhanced model. This algorithm is applied to piano signals, and proves capable of accurately modeling reverberation, restoring missing observations, and separating pure tones with close frequencies.

Crossref

Queen Mary Research Online

Surrey Research Insight

The simulation of piano string vibration: From physical models to finite difference schemes and digital waveguides

Author: Bensa Julien
Bilbao Stefan
Kronland Martinet Richard
Smith Julius
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2003

Crossref

Edinburgh Research Explorer

Action-based effects on music perception

Author: Leman Marc
Maes Pieter-Jan
Palmer Caroline
Wanderley Marcelo M
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2014

The classical, disembodied approach to music cognition conceptualizes action and perception as separate, peripheral processes. In contrast, embodied accounts of music cognition emphasize the central role of the close coupling of action and perception. It is a commonly established fact that perception spurs action tendencies. We present a theoretical framework that captures the ways in which the human motor system and its actions can reciprocally influence the perception of music. The cornerstone of this framework is the common coding theory, postulating a representational overlap in the brain between the planning, the execution, and the perception of movement. The integration of action and perception in so-called internal models is explained as a result of associative learning processes. Characteristic of internal models is that they allow intended or perceived sensory states to be transferred into corresponding motor commands (inverse modeling), and vice versa, to predict the sensory outcomes of planned actions (forward modeling). Embodied accounts typically refer to inverse modeling to explain action effects on music perception (Leman, 2007). We extend this account by pinpointing forward modeling as an alternative mechanism by which action can modulate perception. We provide an extensive overview of recent empirical evidence in support of this idea. Additionally, we demonstrate that motor dysfunctions can cause perceptual disabilities, supporting the main idea of the paper that the human motor system plays a functional role in auditory perception. The finding that music perception is shaped by the human motor system and its actions suggests that the musical mind is highly embodied. However, we advocate for a more radical approach to embodied (music) cognition in the sense that it needs to be considered as a dynamical process, in which aspects of action, perception, introspection, and social interaction are of crucial importance.

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

From the physics of piano strings to digital waveguides

Author: Bensa Julien
Bilbao Stefan
Kronland Martinet Richard
Smith Julius
Publication date: 01/01/2002

University of Michigan Library Repository

Edinburgh Research Explorer

Speech Development by Imitation

Author: Balkenius Christian
Breidegard Bjorn
Publication venue: Lund University Cognitive Studies
Publication date: 01/01/2003

The Double Cone Model (DCM) is a model of how the brain transforms sensory input to motor commands through successive stages of data compression and expansion. We have tested a subset of the DCM on speech recognition, production and imitation. The experiments show that the DCM is a good candidate for an artificial speech processing system that can develop autonomously. We show that the DCM can learn a repertoire of speech sounds by listening to speech input. It is also able to link the individual elements of speech to sequences that can be recognized or reproduced, thus allowing the system to imitate spoken language.

CiteSeerX

CogPrints Cognitive Sciences Eprint Archive

Real-time emulation of the Clavinet, an electromechanical keyboard instrument

Author: Bilbao Stefan
Gabrieli Leonardo
Valimaki Vesa
Publication date: 01/09/2010

Edinburgh Research Explorer

A Two-Process Model for Control of Legato Articulation Across a Wide Range of Tempos During Piano Performance

Author: Bullock Daniel
Jacobs J. Pieter
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/10/1997

Prior reports indicated a non-linear increase in key overlap times (KOTs) as tempo slows for scales/arpeggios performed at internote intervals (INIs) of 100-1000 ms. Simulations illustrate that this function can be explained by a two-process model. An oscillating neural network, based on the dynamics of the vector-integration-to-endpoint model for central generation of voluntary actions, allows performers to compute an estimate of the time remaining before the oscillator's next cycle onset. At fixed successive threshold values of this estimate they first launch keystroke n+1 and then lift keystroke n. As tempo slows, the time required to pass between threshold crossings elongates, and KOT increases. If only this process prevailed, performers would produce longer than observed KOTs at the slowest tempo. The full data set is explicable if subjects lift keystroke n whenever they cross the second threshold or receive sensory feedback from stroke n+1, whichever comes earlier. Funding: Fulbright grant; Office of Naval Research (N00014-92-J-1309, N0014-95-1-0409).

Boston University Institutional Repository (OpenBU)

Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network

Author: Cerezuela Escudero Elena
Domínguez Morales Manuel Jesús
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Paz Vicente Rafael
Publication venue: IEEE Computer Society
Publication date: 01/01/2015

In this paper, we explore the capabilities of a sound classification system that combines both a novel FPGA cochlear model implementation and a bio-inspired technique based on a trained convolutional spiking network. The neuromorphic auditory system that is used in this work produces a form of representation that is analogous to the spike outputs of the biological cochlea. The auditory system has been developed using a set of spike-based processing building blocks in the frequency domain. They form a set of band-pass filters in the spike domain that splits the audio information into 128 frequency channels, 64 for each of two audio sources. Address Event Representation (AER) is used to connect the auditory system to the convolutional spiking network. A layer of convolutional spiking network is developed and trained on a computer with the ability to detect two kinds of sound: artificial pure tones in the presence of white noise and electronic musical notes. After the training process, the presented system is able to distinguish the different sounds in real time, even in the presence of white noise. Funding: Ministerio de Economía y Competitividad TEC2012-37868-C04-0

idUS. Depósito de Investigación Universidad de Sevilla