16 research outputs found

    Automatic Transcription of Drum Strokes in Carnatic Music

    The mridangam is a double-headed percussion instrument that plays a key role in Carnatic music concerts. This paper presents a novel automatic transcription algorithm to classify the strokes played on the mridangam. Onset detection is first performed to segment the audio signal into individual strokes, and feature vectors consisting of the DFT magnitude spectrum of the segmented signal are generated. A multi-layer feedforward neural network is trained using the feature vectors as inputs and the manual transcriptions as targets. Since the mridangam is a tonal instrument tuned to a given tonic, tonic invariance is an important feature of the classifier. Tonic invariance is achieved by augmenting the dataset with pitch-shifted copies of the audio. This algorithm consistently yields over 83% accuracy on a held-out test dataset. Comment: 7 pages, 9 figures
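
The first two stages described in this abstract (onset-based segmentation, then DFT magnitude features) can be illustrated with a minimal sketch. This is not the paper's implementation: the energy-threshold onset detector, the frame length, the threshold value, and the synthetic decaying-sinusoid "strokes" are all assumptions chosen for illustration, and the neural network classifier and pitch-shift augmentation are omitted.

```python
import cmath
import math


def dft_magnitude(frame):
    """Magnitude of the DFT of a real-valued frame (naive O(n^2) version)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):  # keep only the non-negative frequency bins
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def detect_onsets(signal, frame_len, threshold):
    """Hypothetical detector: an onset is the start of the first frame whose
    mean energy exceeds the threshold after a frame that was below it."""
    onsets = []
    was_active = False
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        active = energy > threshold
        if active and not was_active:
            onsets.append(start)
        was_active = active
    return onsets


def stroke_features(signal, onsets, seg_len):
    """One DFT magnitude vector per detected stroke segment."""
    return [dft_magnitude(signal[o:o + seg_len])
            for o in onsets if o + seg_len <= len(signal)]


def synth_stroke(freq_bin, length):
    """Synthetic stand-in for a drum stroke: a decaying sinusoid."""
    return [math.exp(-3.0 * t / length) *
            math.sin(2 * math.pi * freq_bin * t / length)
            for t in range(length)]


SEG = 64
silence = [0.0] * SEG
signal = silence + synth_stroke(4, SEG) + silence + synth_stroke(8, SEG)

onsets = detect_onsets(signal, frame_len=16, threshold=1e-4)
features = stroke_features(signal, onsets, SEG)
# Index of the strongest frequency bin in each stroke's feature vector.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in features]
```

In a full pipeline, each vector in `features` would be paired with a manually transcribed stroke label and passed to a classifier; here `peak_bins` only confirms that the two synthetic strokes produce distinguishable spectra.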

    Vocal Source Separation for Carnatic Music

    Carnatic Music is a classical music form that originates from the South of India and differs markedly from Western genres. Music Information Retrieval (MIR) has predominantly been used to tackle problems in Western musical genres and cannot be adapted to non-Western musical styles like Carnatic Music because of fundamental differences in melody, rhythm, instrumentation, and the nature of compositions and improvisations. These conceptual differences have given rise to MIR tasks specific to Carnatic Music. Researchers have consistently used domain knowledge and technology-driven ideas to tackle tasks like melodic analysis, rhythmic analysis and structural segmentation. Melodic analysis of Carnatic Music has been a cornerstone of MIR research and relies heavily on the singing voice because the singer carries the main melody. The problem is that the singing voice is not isolated and is accompanied by melodic, percussion and drone instruments. Separating the singing voice from the accompanying instruments usually comes with issues like bleeding of the accompanying instruments and loss of melodic information, which in turn has an adverse effect on the melodic analysis. The datasets used for Carnatic MIR are concert recordings of different artistes with accompanying instruments, and there is a lack of clean, isolated singing voice tracks. Existing source separation models are trained extensively on multi-track audio of the rock and pop genres and do not generalize well to Carnatic music. How do we improve singing voice source separation for Carnatic Music given the above constraints? In this work, the possible contributions to mitigate the existing issue are: 1) creating a dataset of isolated Carnatic music stems; 2) reusing multi-track audio with bleeding from the Saraga dataset; 3) retraining and fine-tuning existing state-of-the-art source separation models. We hope that this effort to improve source separation for Carnatic Music can help overcome existing shortcomings, generalize well to Carnatic music datasets in the literature, and in turn improve melodic analysis of this music culture.

    Learning General Audio Representations with Large-Scale Training of Patchout Audio Transformers

    The success of supervised deep learning methods is largely due to their ability to learn relevant features from raw data. Deep Neural Networks (DNNs) trained on large-scale datasets are capable of capturing a diverse set of features and learning a representation that can generalize to unseen tasks and datasets from the same domain. Hence, these models can be used as powerful feature extractors, in combination with shallower models as classifiers, for smaller tasks and datasets where the amount of training data is insufficient for learning an end-to-end model from scratch. In recent years, Convolutional Neural Networks (CNNs) have largely been the method of choice for audio processing. However, attention-based transformer models have recently demonstrated great potential in supervised settings, outperforming CNNs. In this work, we investigate the use of audio transformers trained on large-scale datasets to learn general-purpose representations. We study how the different setups in these audio transformers affect the quality of their embeddings. We experiment with the models' time resolution, extracted embedding level, and receptive fields in order to see how they affect performance on a variety of tasks and datasets, following the HEAR 2021 NeurIPS challenge evaluation setup. Our results show that representations extracted by audio transformers outperform CNN representations. Furthermore, we show that transformers trained on AudioSet can be extremely effective representation extractors for a wide range of downstream tasks. Comment: will appear in HEAR: Holistic Evaluation of Audio Representations, Proceedings of Machine Learning Research, PMLR 166. Source code: https://github.com/kkoutini/passt_hear2

    Devotional music in Mysore

    I examine the motivations and rewards concerned with participation in performances of devotional music in Mysore, India. Whilst the vachana and devaranama traditions are considered, I concentrate upon the sangita, or 'art', form of devotional music. The motivations and rewards are expressed in terms of indigenously-conceived correspondences between musical experiences and experiences in sadhana. I examine the set of religio-philosophical ideas and meditational practices involved in sadhana and conclude that all forms can be conceived in terms of three procedures, viz. 'purification' (i.e. experience of the 'real self'), 'participation' (i.e. mergence with 'alter') and 'transcendence' (of the conceptual frameworks employed in thought). Indigenous aesthetic theory conceives of three separate components of musical experience, viz. sahitya bhava (i.e. the 'emotional' effect of the literary text), raga bhava (i.e. the 'emotional' effect of the raga) and ganam (i.e. the 'aesthetic' effect of musical form). I examine each component and conclude that the bhava components amount to 'participation' whilst the ganam component amounts to 'purification' and 'transcendence'. I conclude that the mechanisms by means of which the music produces its effects are those of 'absolute' and 'referential' musical meaning and that 'referential' meaning may be conceptualised in terms of 'image processes', 'moods' and 'connotations'. I conclude that the participant's motivations and rewards may be defined in terms of the capacity of the music to promote the maintenance of his emotional and mental equilibrium and that the social significance of the institution of devotional music derives from its capacity to adapt the participant to cope more successfully with his/her reactions to life in society.

    Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners

    In this work, we propose a Multi-Window Masked Autoencoder (MW-MAE) fitted with a novel Multi-Window Multi-Head Attention (MW-MHA) module that facilitates the modelling of local-global interactions in every decoder transformer block through attention heads of several distinct local and global windows. Empirical results on ten downstream audio tasks show that MW-MAEs consistently outperform standard MAEs in overall performance and learn better general-purpose audio representations, along with demonstrating considerably better scaling characteristics. Investigating attention distances and entropies reveals that MW-MAE encoders learn heads with broader local and global attention. Analyzing attention head feature representations through Projection Weighted Canonical Correlation Analysis (PWCCA) shows that attention heads with the same window sizes across the decoder layers of the MW-MAE learn correlated feature representations, which enables each block to independently capture local and global information, leading to a decoupled decoder feature hierarchy. Code for feature extraction and downstream experiments, along with pre-trained models, will be released publicly.
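
The idea of giving each attention head its own local or global window can be sketched as follows. This is only an illustration of windowed multi-head attention in general, not the MW-MHA module from the paper: the symmetric window definition, the window sizes, and the omission of per-head learned projections (every head reuses the same queries, keys and values) are simplifying assumptions.

```python
import math


def window_mask(seq_len, window):
    """mask[i][j] is True when query i may attend to key j.
    window=None means a global head (no restriction)."""
    if window is None:
        return [[True] * seq_len for _ in range(seq_len)]
    half = window // 2
    return [[abs(i - j) <= half for j in range(seq_len)]
            for i in range(seq_len)]


def masked_attention(q, k, v, mask):
    """Scaled dot-product attention for one head under a boolean mask."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                  if mask[i][j] else None
                  for j, kj in enumerate(k)]
        top = max(s for s in scores if s is not None)
        # Masked positions get zero weight; the rest go through a stable softmax.
        weights = [math.exp(s - top) if s is not None else 0.0 for s in scores]
        total = sum(weights)
        out.append([sum(w / total * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


def multi_window_attention(q, k, v, windows):
    """Run one head per window size and concatenate the head outputs."""
    heads = [masked_attention(q, k, v, window_mask(len(q), w))
             for w in windows]
    return [sum((h[i] for h in heads), []) for i in range(len(q))]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Three heads: window 1 (self only), window 3 (neighbours), and global.
out = multi_window_attention(tokens, tokens, tokens, windows=[1, 3, None])
```

Each row of `out` concatenates the three head outputs, so a single block sees the same position through a narrow, a medium and a global context at once.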

    Two Lecture Recitals of Percussion Music: Notation for Stroke-Type Analyses and Tabla and Indian Musical Concepts in Modern Drum Set Performance

    Two lecture recitals were presented in lieu of a written dissertation. This document combines two formal paper presentations, each of which corresponds to a single lecture recital. Sunday, May 2, 2010, 4:00 p.m., Mannoni Performing Arts Center. Program: Two Mexican Dances for Marimba by Gordon Stout; Marimba Spiritual by Minoru Miki. “Notation for Stroke-Type Analyses” presents a method for notating stroke types in four-mallet keyboard percussion mechanics. A discussion of various applications of stroke-type notation follows, including the pedagogical value of engaging in stroke-type analyses as a supplement to traditional harmonic and formal analyses, the effect of stroke-type analyses on performance preparation, and the effect of stroke-type analyses on the performance itself. Thursday, February 24, 2011, 6:00 p.m., Mannoni Performing Arts Center. Program: Three Classical Compositions for Tabla transmitted by Pandit Arup Chattopadhyay, trans. David Whitman; Palta by Bob Becker. “Tabla and Indian Musical Concepts in Modern Drum Set Performance” deals with the adaptation of tabla compositions for Western drum set. Fixed classical compositions for solo tabla from the Farrukhabad gharana of the central Uttar Pradesh region of India are transcribed for drum set. North Indian musical concepts such as tihai, tal, and jhati are also discussed and applied to a performance of Palta, a composition for either solo tabla or solo drum set and percussion ensemble by Bob Becker. The author’s studies with Calcutta-based master tabla artist Pandit Arup Chattopadhyay form the basis of much of the material presented in the second chapter.

    Character and Expressivity in the Sacred Naphat Music of Thailand

    The wai khru ceremony (in which homage is paid to Hindu deities as divine teachers) is considered a mysterious and exclusive space of Thai tradition which sits at the heart of many forms of artistic practice. The soundscape created by the piphat ensemble is at the centre of this ritual, where naphat music is treated as a sacred device suggesting the imagined appearance of Hindu deities: sacrifice is offered, and blessing and good fortune are asked for. This thesis scrutinises the details of this music from the perspective of a cultural insider, focusing on the manifestations of musical expressivity associated with divine qualities, and how Thai underlying conceptual models are made manifest in art. The melodic core of the repertoire, played by the khong wong yai, is analysed in terms of grammar and affective syntax found within Thai musical practice by reference to examples from forty individual pieces of music. Affective expressivities of the music are associated with two particular modes, the demonic and the deva, that are governed by the concept of thang. This process involves a complex cooperation between expressivity and how that may be constituted in musical terms, with a cultural insider’s perspective reliant on known patterns of musical cognition against which an ambiguous expressivity is created through a process of distortion. Through the examination of this process, a quality of the divine in this music can be identified, involving a projection of the fantastical in relation to ambiguous expression; the music thus emerges as a uniquely subtle use of compositional methods founded within Thai musical theory.

    Non-Isochronous Meter: A Study of Cross cultural practice, analytic technique, and implications for jazz pedagogy

    This dissertation examines the use of non-isochronous (NI) meters in jazz compositional and performative practices (meters comprising cycles of a prime number [e.g., 5, 7, 11] or uneven divisions of non-prime cycles [e.g., 9 divided as 2+2+2+3]). The explorative meter practices of jazz, while playing a central role in the construction of its own identity, remain curiously absent from jazz scholarship. The conjunct research broadly examines NI meters and the various processes, strategies and systems utilized in historical and current jazz composition and performance practices. While a considerable number of NI meter composers have deliberately drawn from the metric practices of non-Western music traditions, the potential for utilizing insights gleaned from contemporary music-theoretical discussions of meter has yet to fully emerge as a complementary and/or organizational schema within jazz pedagogy and discourse. This paper seeks to address this divide, but not before an accurate picture of historical meter practice is assessed, largely as a means for contextualizing developments within historical and contemporary practice and discourse. The dissertation presents a chronology of explorative meter developments in jazz, firstly by tracing compositional output, and secondly by establishing the relevant sources within conjunct periods of development, i.e., scholarly works, relative academic developments, and tractable world music sources. Bridging the gap between world music meter sources and theoretical musicology (primarily, the underlying perceptual and cognitive model which represents a topology of the structural premises of meter), the research acts to direct and inform a compositional process which directly accounts for an isomorphic link between structurally similar meters.
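
The definition of non-isochronous meter given in this abstract can be made concrete with a small sketch. It assumes, as an illustration only and not as the dissertation's method, that a cycle of pulses is divided into beat groups of 2 or 3 pulses, and that a grouping is isochronous when all of its groups have the same length.

```python
def groupings(cycle):
    """All ordered ways to split `cycle` pulses into groups of 2 and 3
    (the restriction to 2s and 3s is an assumption of this sketch)."""
    if cycle == 0:
        return [[]]
    result = []
    for group in (2, 3):
        if cycle >= group:
            result.extend([group] + rest for rest in groupings(cycle - group))
    return result


def is_isochronous(grouping):
    """True when every beat group spans the same number of pulses."""
    return len(set(grouping)) <= 1


five = groupings(5)    # prime cycle: only uneven groupings exist
seven = groupings(7)   # prime cycle: only uneven groupings exist
nine = groupings(9)    # non-prime cycle: even and uneven groupings coexist
```

Under these assumptions, every grouping of 5 or 7 pulses is uneven, while 9 pulses admit both the isochronous 3+3+3 and the non-isochronous 2+2+2+3 mentioned in the abstract.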

    Make It New

    As jazz enters its second century it is reasserting itself as dynamic and relevant. Boston Globe jazz writer and Emerson College professor Bill Beuttler reveals new ways in which jazz is engaging with society through the vivid biographies and music of Jason Moran, Vijay Iyer, Rudresh Mahanthappa, The Bad Plus, Miguel Zenón, Anat Cohen, Robert Glasper, and Esperanza Spalding. These musicians are freely incorporating other genres of music into jazz, from classical (both western and Indian) to popular (hip-hop, R&B, rock, bluegrass, klezmer, Brazilian choro), as well as other art forms (literature, film, photography, and other visual arts). This new generation of jazz is increasingly international and is becoming more open to women as instrumentalists and bandleaders. Contemporary jazz is reasserting itself as a force for social change, prompted by developments such as the Black Lives Matter and #MeToo movements and the election of Donald Trump.

    Drumming Auspiciousness: The Pakhāvaj of Nathdwara and the Cult of the King-God

    The pakhāvaj occupies a unique position in the classical music scene of contemporary India. Identified with the ancient mṛdaṅga and associated with kings and gods, played in the Hindustani tradition of the court dhrupad and in the temple music of various sects, it is the most respected of the Indian drums by musicians as well as the most authoritative, according to textual sources; it is an auspicious drum and multiple origin myths explain its creation; its repertoire includes compositions which musicians connect to literature in Sanskrit or vernacular languages and to prayer. Notwithstanding its relevance in Indian music, there are no specific studies on the pakhāvaj and above all about its language, repertoire, and its unique position connecting sacred and secular music. This dissertation fills the gap with a study of the pakhāvaj of Nathdwara, its history, aesthetics and repertoire. Furthermore, joining ethnographic, historical, religious and iconographic perspectives, it provides a multifaceted interpretation of the role and function of the pakhāvaj in royal courts, temples and contemporary stages, and the first analysis of the visual and narrative contents of its repertoire. It also contributes to the understanding of the language, idea and role of drums and drumming in Indian court and temple music, and their relationship over the last two millennia.