Audio Classification from Time-Frequency Texture

Author: Slotine Jean-Jacques
Yu Guoshen
Publication date: 25/09/2008

Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme that is particularly efficient at classifying textures. Although based solely on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.
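To make the idea of a spectrogram as an image concrete, here is a minimal sketch that turns a signal into a 2D grid of magnitudes (time frames by frequency bins). It is not the paper's algorithm: the function name `spectrogram`, the Hann window, and the frame and hop sizes are assumptions chosen for illustration, and the texture classifier itself is not shown.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame lists magnitudes for bins 0..frame_len//2."""
    # A Hann window (an assumption here) reduces spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive discrete Fourier transform of one windowed frame.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]

image = spectrogram(tone)  # rows = time frames, columns = frequency bins
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

The resulting `image` can be handed to any routine that expects a grayscale picture, which is the sense in which the abstract speaks of treating spectrograms as textures.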

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription

Author: Emmanouil Benetos
Simon Dixon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

In this paper, a method for automatic transcription of music signals based on joint multiple-F0 estimation is proposed. As a time-frequency representation, the constant-Q resonator time-frequency image is employed, while a novel noise suppression technique based on a pink noise assumption is applied in a preprocessing step. In the multiple-F0 estimation stage, the optimal tuning and inharmonicity parameters are computed and a salience function is proposed in order to select pitch candidates. For each pitch candidate combination, an overlapping partial treatment procedure is used, which is based on a novel spectral envelope estimation procedure for the log-frequency domain, in order to compute the harmonic envelope of candidate pitches. In order to select the optimal pitch combination for each time frame, a score function is proposed which combines spectral and temporal characteristics of the candidate pitches and also aims to suppress harmonic errors. For postprocessing, hidden Markov models (HMMs) and conditional random fields (CRFs) trained on MIDI data are employed, in order to boost transcription accuracy. The system was trained on isolated piano sounds from the MAPS database and was tested on classical and jazz recordings from the RWC database, as well as on recordings from a Disklavier piano. A comparison with several state-of-the-art systems is provided using a variety of error metrics, where encouraging results are indicated.
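The abstract does not define its salience function, so the following is only a generic harmonic-sum sketch of how a salience score can rank pitch candidates. The function name `harmonic_salience`, the geometric `decay` weighting, and the toy spectrum are all assumptions, not the authors' method.

```python
def harmonic_salience(spectrum, f0_bin, n_harmonics=5, decay=0.8):
    """Sum the magnitudes at integer multiples of f0_bin, with decaying weights."""
    total = 0.0
    for h in range(1, n_harmonics + 1):
        k = f0_bin * h
        if k >= len(spectrum):
            break  # harmonic lies above the analysed frequency range
        total += (decay ** (h - 1)) * spectrum[k]
    return total


# Hypothetical magnitude spectrum: one tone with partials at bins 10, 20, 30, 40.
spectrum = [0.0] * 64
for h, amp in enumerate([1.0, 0.6, 0.4, 0.3], start=1):
    spectrum[10 * h] = amp

candidates = range(5, 21)
best = max(candidates, key=lambda k: harmonic_salience(spectrum, k))
```

In this toy case the sub-harmonic candidate at bin 5 also collects energy from bins 10 and 20, which illustrates the kind of harmonic error the abstract says its score function aims to suppress.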

CiteSeerX

City Research Online

Crossref

Action-based effects on music perception

Author: Leman Marc
Maes Pieter-Jan
Palmer Caroline
Wanderley Marcelo M
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2014

The classical, disembodied approach to music cognition conceptualizes action and perception as separate, peripheral processes. In contrast, embodied accounts of music cognition emphasize the central role of the close coupling of action and perception. It is a commonly established fact that perception spurs action tendencies. We present a theoretical framework that captures the ways in which the human motor system and its actions can reciprocally influence the perception of music. The cornerstone of this framework is the common coding theory, postulating a representational overlap in the brain between the planning, the execution, and the perception of movement. The integration of action and perception in so-called internal models is explained as a result of associative learning processes. Characteristic of internal models is that they allow intended or perceived sensory states to be transferred into corresponding motor commands (inverse modeling), and vice versa, to predict the sensory outcomes of planned actions (forward modeling). Embodied accounts typically refer to inverse modeling to explain action effects on music perception (Leman, 2007). We extend this account by pinpointing forward modeling as an alternative mechanism by which action can modulate perception. We provide an extensive overview of recent empirical evidence in support of this idea. Additionally, we demonstrate that motor dysfunctions can cause perceptual disabilities, supporting the main idea of the paper that the human motor system plays a functional role in auditory perception. The finding that music perception is shaped by the human motor system and its actions suggests that the musical mind is highly embodied. However, we advocate for a more radical approach to embodied (music) cognition in the sense that it needs to be considered as a dynamical process, in which aspects of action, perception, introspection, and social interaction are of crucial importance.

Ghent University Academic Bibliography

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

Evaluating new approaches to teaching of sight-reading skills to advanced pianists

Author: Zhukov Katie
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2014

This paper evaluates three teaching approaches to improving sight-reading skills against a control in a large-scale study of advanced pianists. One hundred pianists in four equal groups participated in newly developed training programmes (accompanying, rhythm, musical style and control), with pre- and post-sight-reading tests analysed using custom-made software that provided four scores on accuracy: two on pitch and two on rhythm. Mixed-design ANCOVAs were used to analyse the performance data, showing improvement in one rhythm and one pitch variable for each of the training groups and progress in pitch for the control group. The results suggest that training does develop various aspects of sight-reading and that additional sight-reading activities enhance post-test performance. The curriculum combining all three teaching strategies is currently being trialled. Future research needs to focus on instruments other than piano to formulate generic approaches to the teaching of sight-reading skills.

University of Queensland eSpace

Behavioural and electrophysiological indices of error processing and perception in joint musical action

Author: Paas Anita
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2020

Errors are rare, important events. During joint action, agency of errors helps initiate corrective, adaptive responses. If agency becomes ambiguous, errors may become more difficult to process. Models of joint action posit that internal models are developed for own and others’ actions during joint action. If agency is ambiguous, own and other internal models may overlap, causing confusion as to who produced what action. Further, claiming agency over past actions, whether performed, observed, or neither, may affect neural indices of past performance and past errors. The first experiment investigated how agency ambiguity affects behavioural and neural responses to errors and the role of agency in models of joint action. Results from this experiment showed no effects of the agency manipulation, but revealed differences in how distinct types of errors are processed. Uncorrected errors showed a pattern of performance breakdown and were processed at initiation of the erroneous motor command. Corrected errors showed a pattern of pre-response conflict and were processed after the error onset. The second experiment investigated the role of belief of agency on behavioural and neural indices of the perception of errors in previously performed actions. Results showed participants struggled to identify the performer and errors in past performance, but neural activity suggested familiarity may have aided in processing of previously performed actions. Overall findings suggest the brain supports fluent interpersonal coordination in real-time joint action by employing distinct neural mechanisms to manage different types of errors and that a sense of familiarity may play a role in the processing of previously performed actions.

Western Sydney ResearchDirect

Deep Learning Techniques for Music Generation -- A Survey

Author: Briot Jean-Pierre
Hadjeres Gaëtan
Pachet François-David
Publication date: 23/03/2019

This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis:

Objective
- What musical content is to be generated? Examples are: melody, polyphony, accompaniment or counterpoint.
- For what destination and for what use? To be performed by a human(s) (in the case of a musical score), or by a machine (in the case of an audio file).

Representation
- What are the concepts to be manipulated? Examples are: waveform, spectrogram, note, chord, meter and beat.
- What format is to be used? Examples are: MIDI, piano roll or text.
- How will the representation be encoded? Examples are: scalar, one-hot or many-hot.

Architecture
- What type(s) of deep neural network is (are) to be used? Examples are: feedforward network, recurrent network, autoencoder or generative adversarial networks.

Challenge
- What are the limitations and open challenges? Examples are: variability, interactivity and creativity.

Strategy
- How do we model and control the process of generation? Examples are: single-step feedforward, iterative feedforward, sampling or input manipulation.

For each dimension, we conduct a comparative analysis of various models and techniques and we propose some tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning based systems for music generation selected from the relevant literature. These systems are described and are used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and some prospects.

Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201
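The encoding options named under Representation (one-hot and many-hot) can be illustrated with a short sketch. The helper names `one_hot` and `many_hot`, the one-octave MIDI pitch range 60 to 72, and the example chord are assumptions for illustration and are not taken from the survey.

```python
LOW, HIGH = 60, 72  # assumed pitch range: MIDI note numbers, middle C up one octave


def one_hot(pitch, low=LOW, high=HIGH):
    """Encode a single pitch: exactly one position is set to 1."""
    vec = [0] * (high - low + 1)
    vec[pitch - low] = 1
    return vec


def many_hot(pitches, low=LOW, high=HIGH):
    """Encode simultaneous pitches (a chord): several positions may be 1."""
    vec = [0] * (high - low + 1)
    for pitch in pitches:
        vec[pitch - low] = 1
    return vec


melody_step = one_hot(64)            # a single E above middle C
chord_step = many_hot([60, 64, 67])  # a C major triad

# A piano roll is then a sequence of such vectors, one per time step.
piano_roll = [many_hot([60, 64, 67]), many_hot([60, 65, 69]), many_hot([62, 67, 71])]
```

One-hot vectors suit monophonic melodies, where exactly one pitch sounds per step, while many-hot vectors are needed as soon as notes overlap.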

arXiv.org e-Print Archive

A Convolutional Approach to Melody Line Identification in Symbolic Scores

Author: C. Eduardo Cancino Chac&#243
F. Simonetta
G. Widmer
S. Ntalampiras
Publication venue: ISMIR
Publication date: 01/01/2019

In many musical traditions, the melody line is of primary significance in a piece. Human listeners can readily distinguish melodies from accompaniment; however, making this distinction given only the written score -- i.e. without listening to the music performed -- can be a difficult task. Solving this task is of great importance for both Music Information Retrieval and musicological applications. In this paper, we propose an automated approach to identifying the most salient melody line in a symbolic score. The backbone of the method consists of a convolutional neural network (CNN) estimating the probability that each note in the score (more precisely: each pixel in a piano roll encoding of the score) belongs to the melody line. We train and evaluate the method on various datasets, using manual annotations where available and solo instrument parts where not. We also propose a method to inspect the CNN and to analyze the influence exerted by notes on the prediction of other notes; this method can be applied whenever the output of a neural network has the same size as the input.
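The abstract states that the network outputs one probability per piano roll pixel but does not say how these are turned into per-note decisions. The sketch below shows one plausible way, averaging the pixel probabilities a note covers and thresholding the result. The function name `note_probabilities`, the averaging rule, the 0.5 threshold, and the sample values are all assumptions, and the CNN itself is replaced by hand-written numbers.

```python
def note_probabilities(notes, pixel_probs):
    """Average the per-pixel melody probabilities covered by each note.

    notes: list of (pitch, onset_step, duration_steps) tuples.
    pixel_probs: dict mapping pitch -> list of probabilities, one per time step.
    """
    result = []
    for pitch, onset, duration in notes:
        cells = [pixel_probs[pitch][t] for t in range(onset, onset + duration)]
        result.append(sum(cells) / len(cells))
    return result


# Hypothetical score: two upper notes over a sustained bass note.
notes = [(72, 0, 2), (60, 0, 4), (74, 2, 2)]

# Stand-in for a network output with the same shape as the piano roll.
pixel_probs = {
    72: [0.9, 0.8, 0.0, 0.0],
    60: [0.1, 0.2, 0.1, 0.2],
    74: [0.0, 0.0, 0.7, 0.9],
}

probs = note_probabilities(notes, pixel_probs)
melody = [note for note, p in zip(notes, probs) if p >= 0.5]
```

Because the output grid has the same shape as the input piano roll, each note can be looked up directly by its pitch row and time span.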

AIR Universita degli studi di Milano