    Melodic Phrase Segmentation By Deep Neural Networks

    Automated melodic phrase detection and segmentation is a classical task in content-based music information retrieval and a key step towards automated music structure analysis. However, traditional methods still cannot satisfy practical requirements. In this paper, we explore and adapt various neural network architectures to see whether they can be generalized to work with the symbolic representation of music and produce satisfactory melodic phrase segmentation. The main issue in applying deep-learning methods to phrase detection is the sparse labeling of training sets. We propose two tailored label-engineering schemes, with corresponding training techniques, that allow different neural networks to make decisions at the sequence level. Experimental results show that the CNN-CRF architecture performs best, offering finer segmentation and faster training, while CNN, Bi-LSTM-CNN and Bi-LSTM-CRF are acceptable alternatives.
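
    To make the CNN-CRF idea concrete, below is a minimal sketch, not the authors' implementation: a 1-D CNN scores each symbolic note event, and a CRF-style transition matrix with Viterbi decoding assigns a phrase-boundary tag per event. The feature dimension, tag set and layer sizes are illustrative assumptions.

```python
# Hedged sketch of a CNN-CRF sequence tagger for phrase segmentation.
# Assumptions: each note event is an 8-dimensional feature vector and the
# tag set is {0: inside-phrase, 1: phrase-boundary}.
import torch
import torch.nn as nn

class CNNCRFSegmenter(nn.Module):
    def __init__(self, n_features=8, n_tags=2, hidden=64):
        super().__init__()
        # 1-D convolutions over the event sequence yield per-step emission scores.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden, n_tags, kernel_size=5, padding=2),
        )
        # CRF transition scores: transitions[i, j] = score of moving from tag i to tag j.
        self.transitions = nn.Parameter(torch.zeros(n_tags, n_tags))

    def emissions(self, x):
        # x: (batch, seq_len, n_features) -> (batch, seq_len, n_tags)
        return self.cnn(x.transpose(1, 2)).transpose(1, 2)

    def viterbi_decode(self, emission):
        # emission: (seq_len, n_tags); returns the best tag per note event.
        score = emission[0]
        backpointers = []
        for step in emission[1:]:
            # total[i, j]: best score ending in tag i, then moving to tag j.
            total = score.unsqueeze(1) + self.transitions + step
            score, idx = total.max(dim=0)
            backpointers.append(idx)
        best = [int(score.argmax())]
        for idx in reversed(backpointers):
            best.append(int(idx[best[-1]]))
        return list(reversed(best))

model = CNNCRFSegmenter()
events = torch.randn(1, 32, 8)               # one melody of 32 note events
tags = model.viterbi_decode(model.emissions(events)[0])
print(tags)                                   # 0/1 boundary tag per event
```

    Training such a model would minimise the CRF negative log-likelihood over labelled sequences; only the decoding path is sketched here.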

    Maths, Computation and Flamenco: overview and challenges

    Flamenco is a rich, performance-oriented art music genre from Southern Spain which attracts a growing community of aficionados around the globe. Due to its improvisational and expressive nature, its unique musical characteristics, and the fact that the genre is largely undocumented, flamenco poses a number of interesting mathematical and computational challenges. Most existing approaches in Music Information Retrieval (MIR) were developed in the context of popular or classical music and often do not generalize well to non-Western music traditions, in particular when the underlying music-theoretical assumptions do not hold for these genres. Over the past decade, a number of computational problems related to the automatic analysis of flamenco music have been defined, and several methods addressing a variety of musical aspects have been proposed. This paper provides an overview of the challenges which arise in the computational analysis of flamenco music and surveys existing approaches.

    Investigating brain responses to section endings in tonal, classical and rhythmic music: an fMRI study

    Our overall aim was to examine brain responses to different experiences of time in music, with a particular focus on how we experience large-scale musical form. The present experiment investigated the neural correlates of experiencing section endings in teleological (goal-directed) music as well as in rhythmic (groove-based) music. We used functional magnetic resonance imaging (fMRI) on 14 human participants. Comparing transition points to continuous sections of the music, we found more neural activity at the transition points in both musical genres. Additionally, we observed stronger blood-oxygen-level-dependent (BOLD) fMRI activations at transition points in the rhythmic piece than in the classical piece. We performed four region-of-interest (ROI) analyses, based on a priori expectations about the likely involvement of different brain areas in our task: the ventrolateral prefrontal cortex (VLPFC), the posterior temporal cortex (PTC), the dorsolateral prefrontal cortex (DLPFC) and the posterior parietal cortex (PPC). PTC was the only region that showed activations strong enough to survive the correction for multiple comparisons.
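
    As a rough illustration of the ROI comparison described above, the sketch below contrasts per-subject BOLD estimates for transition versus continuous events with a paired t-test, Bonferroni-corrected across the four ROIs. It is not the study's analysis pipeline; the data arrays are placeholders.

```python
# Hedged sketch of an ROI-level paired comparison with Bonferroni correction.
# The BOLD estimates below are random placeholders, not study data.
import numpy as np
from scipy.stats import ttest_rel

rois = ["VLPFC", "PTC", "DLPFC", "PPC"]
n_subjects = 14
rng = np.random.default_rng(0)

# Per-subject mean BOLD amplitude for each ROI under the two conditions.
transition = rng.normal(0.5, 1.0, size=(len(rois), n_subjects))
continuous = rng.normal(0.0, 1.0, size=(len(rois), n_subjects))

alpha = 0.05 / len(rois)  # Bonferroni correction across the four ROI tests
for roi, trans, cont in zip(rois, transition, continuous):
    t, p = ttest_rel(trans, cont)  # within-subject (paired) comparison
    flag = "significant" if p < alpha else "n.s."
    print(f"{roi}: t={t:.2f}, p={p:.4f} ({flag} at corrected alpha={alpha:.4f})")
```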

    Treatment of non-fluent aphasia through melody, rhythm and formulaic language

    Left-hemisphere stroke patients often suffer a profound loss of spontaneous speech — known as non-fluent aphasia. Yet, many patients are still able to sing entire pieces of text fluently. This striking finding has inspired two main research questions. If the experimental design focuses on one point in time (cross-section), one may ask whether singing facilitates speech production in aphasic patients. If the design focuses on changes over several points in time (longitudinal section), one may ask whether singing qualifies as a therapy to aid recovery from aphasia. The present work addresses both questions in two separate experiments.

    A cross-sectional experiment investigated the relative effects of melody, rhythm, and lyric type on speech production in seventeen patients with non-fluent aphasia. The experiment controlled for vocal frequency variability, pitch accuracy, rhythmicity, syllable duration, phonetic complexity and other influences, such as learning effects and the acoustic setting. Contrary to earlier reports, the cross-sectional results suggest that singing may not benefit speech production in non-fluent aphasic patients over and above rhythmic speech. Previous divergent findings could very likely be due to effects of the acoustic setting, insufficient control for syllable duration, and language-specific stress patterns. However, the data reported here indicate that rhythmic pacing may be crucial, particularly for patients with lesions including the basal ganglia. Overall, basal ganglia lesions accounted for more than fifty percent of the variance related to rhythmicity. The findings suggest that benefits typically attributed to singing in the past may actually have their roots in rhythm. Moreover, the results demonstrate that lyric type may have a profound impact on speech production in non-fluent aphasic patients. Among the studied patients, lyric familiarity and formulaic language appeared to strongly mediate speech production, regardless of whether patients were singing or speaking rhythmically. Lyric familiarity and formulaic language may therefore help to explain effects that have, up until now, been presumed to result from singing.

    A longitudinal experiment investigated the relative long-term effects of melody and rhythm on the recovery of formulaic and non-formulaic speech. Fifteen patients with chronic non-fluent aphasia underwent singing therapy, rhythmic therapy, or standard speech therapy. The experiment controlled for vocal frequency variability, phonatory quality, pitch accuracy, syllable duration, phonetic complexity and other influences, such as the acoustic setting and learning effects induced by the testing itself. The longitudinal results suggest that singing and rhythmic speech may be similarly effective in the treatment of non-fluent aphasia. Both singing and rhythmic therapy patients made good progress in the production of common, formulaic phrases — known to be supported by right corticostriatal brain areas. This progress occurred at an early stage of both therapies and was stable over time. Moreover, relatives of the patients reported that the patients were successfully using a fixed number of formulaic phrases in communicative contexts. Independent of whether they had received singing or rhythmic therapy, patients were able to switch easily between singing and rhythmic speech at any time. Conversely, patients receiving standard speech therapy made less progress in the production of formulaic phrases. They did, however, improve their production of unrehearsed, non-formulaic utterances, in contrast to singing and rhythmic therapy patients, who did not. In light of these results, it may be worth considering the combined use of standard speech therapy and the training of formulaic phrases, whether sung or rhythmically spoken; this combination may yield better results for speech recovery than either therapy alone. Overall, treatment and lyric type accounted for about ninety percent of the variance related to speech recovery in the data reported here.

    The present work delivers three main results. First, it may not be singing itself that aids speech production and speech recovery in non-fluent aphasic patients, but rhythm and lyric type. Second, the findings may challenge the view that singing causes a transfer of language function from the left to the right hemisphere. Moving beyond this left-right hemisphere dichotomy, the current results are consistent with the idea that rhythmic pacing may partly bypass corticostriatal damage. Third, the data support the claim that non-formulaic utterances and formulaic phrases rely on different neural mechanisms, suggesting a two-path model of speech recovery. Standard speech therapy focusing on non-formulaic, propositional utterances may engage, in particular, left perilesional brain regions, while training of formulaic phrases may open new ways of tapping into right-hemisphere language resources — even without singing.
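
    The variance figures above can be read as eta-squared statistics. The short sketch below shows how such a proportion of explained variance is computed for a grouping factor; the patient scores are placeholders, not the study's data.

```python
# Hedged sketch: eta-squared (share of total variance explained by group
# membership), illustrating claims like "lesions accounted for more than
# fifty percent of the variance". All values are invented placeholders.
import numpy as np

def eta_squared(groups):
    """Proportion of total variance explained by group membership."""
    all_vals = np.concatenate(groups)
    grand = all_vals.mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_total = ((all_vals - grand) ** 2).sum()
    return ss_between / ss_total

rng = np.random.default_rng(1)
# Placeholder rhythmicity-benefit scores for patients with and without
# basal ganglia lesions.
bg_lesion = rng.normal(2.0, 1.0, 9)
no_bg_lesion = rng.normal(0.0, 1.0, 8)
print(f"eta^2 = {eta_squared([bg_lesion, no_bg_lesion]):.2f}")
```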

    Motivic Pattern Classification of Music Audio Signals Combining Residual and LSTM Networks

    Motivic pattern classification from music audio recordings is a challenging task, all the more so in the case of a cappella flamenco cantes, which are characterized by complex melodic variations, pitch instability, timbre changes, extreme vibrato oscillations, microtonal ornamentations, and noisy recording conditions. Convolutional Neural Networks (CNNs) have proven to be very effective in image classification, and recent work in large-scale audio classification has shown that CNN architectures originally developed for image problems can be applied successfully to audio event recognition and classification with little or no modification. In this paper, CNN architectures are tested on a more nuanced problem: flamenco cantes intra-style classification using small motivic patterns. A new architecture is proposed that uses residual CNNs as feature extractors and a bidirectional LSTM layer to exploit the sequential nature of musical audio data. We present a full end-to-end pipeline for audio music classification that includes a sequential pattern mining technique and a contour simplification method to extract relevant motifs from audio recordings. Mel-spectrograms of the extracted motifs are then used as the input to the different architectures tested. We investigate the usefulness of motivic patterns for the automatic classification of music recordings and the effect of audio length and corpus size on overall classification accuracy. Results show a relative accuracy improvement of up to 20.4% when CNN architectures are trained using acoustic representations of motivic patterns.
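
    The sketch below illustrates the general residual-CNN-plus-BiLSTM shape described above. It is an assumption-laden toy model, not the paper's architecture: the layer sizes, number of mel bands, and number of classes are invented for illustration.

```python
# Hedged sketch: residual convolutional blocks extract features from a
# mel-spectrogram; a bidirectional LSTM models the time axis before
# classification. All hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)          # skip connection

class MotifClassifier(nn.Module):
    def __init__(self, n_mels=128, n_classes=4, channels=32):
        super().__init__()
        self.stem = nn.Conv2d(1, channels, 3, padding=1)
        self.res = nn.Sequential(ResidualBlock(channels), ResidualBlock(channels))
        self.pool = nn.MaxPool2d((4, 1))    # shrink the mel axis, keep time
        self.lstm = nn.LSTM(channels * (n_mels // 4), 64,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * 64, n_classes)

    def forward(self, spec):                # spec: (batch, 1, n_mels, time)
        x = self.pool(self.res(self.stem(spec)))
        b, c, m, t = x.shape
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * m)  # (batch, time, feats)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])        # classify from the last timestep

model = MotifClassifier()
spec = torch.randn(2, 1, 128, 64)   # batch of 2 mel-spectrogram excerpts
print(model(spec).shape)            # -> torch.Size([2, 4]) class logits
```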

    Music as complex emergent behaviour: an approach to interactive music systems

    This thesis suggests a new model of human-machine interaction in the domain of non-idiomatic musical improvisation. Musical results are viewed as emergent phenomena issuing from complex internal system behaviour in relation to input from a single human performer. We investigate the prospect of rewarding interaction whereby a system modifies itself in coherent though non-trivial ways as a result of exposure to a human interactor. In addition, we explore whether such interactions can be sustained over extended time spans. These objectives translate into four criteria for evaluation: maximisation of human influence; blending of human and machine influence in the creation of machine responses; maintenance of independent machine motivations in order to support machine autonomy; and a combination of global emergent behaviour and variable behaviour in the long run.

    Our implementation is heavily inspired by ideas and engineering approaches from the discipline of Artificial Life. However, we also address a collection of representative existing systems from the field of interactive composing, some of which are implemented using techniques of conventional Artificial Intelligence. All of these systems serve as a contextual background and comparative framework for assessing the work reported here. This thesis advocates a networked model incorporating functionality for listening, playing and the synthesis of machine motivations. The latter incorporate dynamic relationships instructing the machine either to integrate with a musical context suggested by the human performer or, in contrast, to perform as an individual musical character irrespective of context. Techniques of evolutionary computing are used to optimise system components over time. Evolution proceeds based on an implicit fitness measure: the melodic distance between consecutive musical statements made by human and machine, in relation to the currently prevailing machine motivation.

    A substantial number of systematic experiments reveal complex emergent behaviour inside and between the various system modules. Music scores document how global system behaviour is rendered into actual musical output. The concluding chapter offers evidence of how the research criteria were met and proposes recommendations for future research.
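
    To make the implicit fitness measure concrete, here is a minimal sketch under stated assumptions, not the thesis implementation: candidate pitch sequences evolve toward or away from the human's last phrase depending on the prevailing motivation, with melodic distance reduced to a mean absolute pitch difference.

```python
# Hedged sketch of motivation-dependent evolutionary response generation.
# The distance measure, MIDI pitch range, and operators are illustrative
# simplifications, not the thesis's actual design.
import random

def melodic_distance(a, b):
    """Mean absolute difference between two equal-length pitch sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def fitness(candidate, human_phrase, motivation):
    d = melodic_distance(candidate, human_phrase)
    # "integrate": prefer small distance; "contrast": prefer large distance.
    return -d if motivation == "integrate" else d

def evolve(human_phrase, motivation, pop_size=30, generations=50):
    length = len(human_phrase)
    pop = [[random.randint(48, 72) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, human_phrase, motivation),
                 reverse=True)
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        for parent in survivors:
            child = parent[:]
            i = random.randrange(length)          # one-note mutation
            child[i] = max(48, min(72, child[i] + random.randint(-3, 3)))
            children.append(child)
        pop = survivors + children
    return pop[0]                                  # best machine response

# Example: respond to a human phrase while motivated to integrate with it.
phrase = [60, 62, 64, 65, 67]
print(evolve(phrase, "integrate"))
```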