37 research outputs found

    Melodic track identification in MIDI files considering the imbalanced context

    In this paper, the problem of identifying the melodic track of a MIDI file in imbalanced scenarios is addressed. A polyphonic MIDI file is a digital score consisting of a set of tracks, where usually only one contains the melody and the remaining tracks hold the accompaniment. This leads to a two-class imbalance problem that, unlike in previous work, is managed by over-sampling the melody class (the minority one) or by under-sampling the accompaniment class (the majority one) until both classes are the same size. Experimental results over three different music genres show that learning from balanced training sets clearly provides better results than the standard classification process.
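    As a rough illustration of the balancing step described above, the sketch below equalises the two classes by random over- or under-sampling. The feature matrix X, the labels y, and the helper name balance_tracks are hypothetical stand-ins; the paper's exact resampling procedure is not reproduced here.

        import numpy as np

        def balance_tracks(X, y, strategy="oversample", seed=0):
            """Equalise a two-class track set: y == 1 melody (minority), y == 0 accompaniment."""
            rng = np.random.default_rng(seed)
            minority = np.flatnonzero(y == 1)
            majority = np.flatnonzero(y == 0)
            if strategy == "oversample":
                # Duplicate random melody rows until both classes match in size.
                extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
                idx = np.concatenate([majority, minority, extra])
            else:
                # Keep only a random subset of accompaniment rows, of minority size.
                kept = rng.choice(majority, size=len(minority), replace=False)
                idx = np.concatenate([kept, minority])
            rng.shuffle(idx)
            return X[idx], y[idx]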

    On the suitability of combining feature selection and resampling to manage data complexity

    The effectiveness of a learning task depends on data complexity (class overlap, class imbalance, irrelevant features, etc.). When more than one complexity factor appears, two or more preprocessing techniques should be applied. Nevertheless, little effort has been devoted to investigating the importance of the order in which they are applied. This paper focuses on the joint use of feature reduction and balancing techniques, and studies which application order leads to the best classification results. This analysis was made on a specific problem whose aim was to identify the melodic track of a MIDI file. Several experiments were performed on different imbalanced 38-dimensional training sets with many more accompaniment tracks than melodic tracks, and where features were aggregated without any correlation study. Results showed that the most effective combination was the ordered use of resampling followed by feature reduction techniques.
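    A hedged sketch of the two orderings compared in the paper is given below. The selector, the classifier, and the simple random over-sampler are illustrative stand-ins rather than the exact techniques the study evaluated.

        import numpy as np
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.neighbors import KNeighborsClassifier

        def oversample(X, y, seed=0):
            """Random over-sampling of the minority class (label 1) to parity."""
            rng = np.random.default_rng(seed)
            mino, majo = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
            extra = rng.choice(mino, size=len(majo) - len(mino), replace=True)
            idx = np.concatenate([majo, mino, extra])
            return X[idx], y[idx]

        def evaluate(order, X_tr, y_tr, X_te, y_te, k=10):
            """Train with one preprocessing order and score on the test split."""
            sel = SelectKBest(f_classif, k=k)
            if order == "resample_then_select":   # the ordering the study found best
                X_tr, y_tr = oversample(X_tr, y_tr)
                X_tr = sel.fit_transform(X_tr, y_tr)
            else:                                  # feature reduction first
                X_tr = sel.fit_transform(X_tr, y_tr)
                X_tr, y_tr = oversample(X_tr, y_tr)
            clf = KNeighborsClassifier().fit(X_tr, y_tr)
            return clf.score(sel.transform(X_te), y_te)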

    Understanding Optical Music Recognition

    For over 50 years, researchers have been trying to teach computers to read music notation, a task referred to as Optical Music Recognition (OMR). However, the field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available and, furthermore, the field has struggled with defining itself and building a shared terminology. In this work, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, most notably including a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.

    Towards a Better Understanding of Emotion Communication in Music: An Interactive Production Approach.

    It has been well established that composers and performers are able to encode certain emotional expressions in music, which in turn are decoded by listeners and, in general, successfully recognised. There is still much to discover, however, as to how musical cues combine to shape different emotions in the music, since previous literature has tended to focus on a limited number of cues and emotional expressions. The work in this thesis aims to investigate how combinations of tempo, articulation, pitch, dynamics, brightness, mode, and later, instrumentation, are used to shape sadness, joy, calmness, anger, fear, power, and surprise in Western tonal music. In addition, new tools for music and emotion research are presented with the aim of providing an efficient production approach to explore a large cue-emotion space in a relatively short time. To this end, a new interactive interface called EmoteControl was created, which allows users to alter musical pieces in real time through the available cues. Moreover, musical pieces were specifically composed to be used as stimuli. Empirical experiments were then carried out with the interface to determine how participants shaped different emotions in the pieces using the available cues. Specific cue combinations for the different emotions were produced. Findings revealed that, overall, mode and tempo were the strongest contributors to the conveyed emotion, whilst brightness was the least effective cue. However, the importance of the cues varied depending on the intended emotion. Finally, a comparative evaluation of production and traditional approaches was carried out, which showed that similar results may be obtained with both. However, the production approach allowed a larger cue-emotion space to be navigated in a shorter time. In sum, the production approach allowed participants to directly show us how they think emotional expressions should sound, and how they are shaped in music.

    Automated manipulation of musical grammars to support episodic interactive experiences

    Music is used to enhance the experience of participants and visitors in a range of settings, including theatre, film, video games, installations and theme parks. These experiences may be interactive, contrastingly episodic and of variable duration. Hence, the musical accompaniment needs to be dynamic and to transition between contrasting music passages. In these contexts, computer generation of music may be necessary for practical reasons, including distribution and cost. Automated and dynamic composition algorithms exist but are not well suited to a highly interactive episodic context owing to transition-related problems, including discontinuity, abruptness, extended repetitiveness, and lack of musical granularity and musical form. Addressing these problems requires algorithms capable of reacting to participant behaviour and episodic change in order to generate formic music that is continuous and coherent during transitions. This thesis presents the Form-Aware Transitioning and Recovering Algorithm (FATRA) for real-time, adaptive, form-aware music generation that provides continuous musical accompaniment in episodic contexts. FATRA combines stochastic grammar adaptation and grammar merging in real time. The Form-Aware Transition Engine (FATE), an implementation of FATRA, estimates the time of occurrence of upcoming narrative transitions and generates a harmonic sequence as narrative accompaniment, with a focus on coherent, form-aware transitioning between music passages of contrasting character. Using FATE, FATRA has been evaluated in three perceptual user studies: an audio-augmented real museum experience, a computer-simulated museum experience and a music-focused online study detached from narrative. Music transitions of FATRA were benchmarked against common approaches of the video game industry, i.e. crossfading and direct transitions. Participants were overall content with the music of FATE during their experience. Transitions of FATE were significantly favoured over the crossfading benchmark and competitive against the direct-transition benchmark, without statistical significance for the latter comparison. In addition, technical evaluation demonstrated capabilities of FATRA including form generation, repetitiveness avoidance and style/form recovery in the case of falsely predicted narrative transitions. The technical results, along with perceptual preference and competitiveness against the benchmark approaches, are deemed positive, and the structural advantages of FATRA, including form-aware transitioning, carry considerable potential for future research.
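    FATRA itself is not reproduced here, but the toy sketch below illustrates the general idea of merging two weighted (stochastic) grammars during a transition. The chord productions and the linear blending weight alpha are invented for illustration and do not reflect the thesis's actual grammar formalism.

        import random

        # Toy stochastic grammars: nonterminal -> list of (production, weight).
        GRAMMAR_A = {"S": [(["I", "IV", "V", "I"], 0.7), (["I", "vi", "IV", "V"], 0.3)]}
        GRAMMAR_B = {"S": [(["i", "VI", "III", "VII"], 1.0)]}

        def merge(g1, g2, alpha):
            """Blend two grammars; alpha in [0, 1] shifts probability mass from g1 to g2."""
            merged = {}
            for nt in set(g1) | set(g2):
                rules = [(p, w * (1 - alpha)) for p, w in g1.get(nt, [])]
                rules += [(p, w * alpha) for p, w in g2.get(nt, [])]
                merged[nt] = rules
            return merged

        def sample(grammar, nt="S"):
            """Draw one production for nt according to the rule weights."""
            prods, weights = zip(*grammar[nt])
            return random.choices(prods, weights=weights, k=1)[0]

        # Gradually move from grammar A's character to grammar B's over a transition.
        for alpha in (0.0, 0.5, 1.0):
            print(alpha, sample(merge(GRAMMAR_A, GRAMMAR_B, alpha)))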

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Folio of compositions and productions with critical commentary

    This practical portfolio PhD presents a collection of my pop music compositions and productions, with an accompanying exegesis that examines my process whilst creating these tracks. The exegesis analyses the methods and techniques I use, with the aim of achieving a better understanding of my productions. This will in turn reveal my idiolect, which is the distinctive way an individual creates that produces their particular style (Middleton, 1990: 136, 174; Moore, 2012: 120, 166-7). This research also includes analyses of interesting, innovative and important moments or processes that draw on perspectives taken from theories of creativity and technology, as well as the social aspects of creation and innovation. These theories are the systems approach to creativity (Csikszentmihalyi, 1997, 1999) and the Social Construction of Technology [SCOT] (Bijker, Hughes & Pinch, 1987, 2012). The focus of this thesis is on understanding the detail of my creative practice. The exegesis consists of critical commentaries on four of the fifteen portfolio tracks, each in its own chapter. The intention is that concentrating on just four tracks will enable a better understanding of the processes in question. Other portfolio tracks serve as further examples of the points made, or demonstrate alternatives where relevant. An autoethnographical approach is used as the methodology for the commentary chapters, in order to achieve a comprehension of my personal experience (Bochner and Ellis in Denzin, 2000: 736-9; Muncey, 2010: 26-33). The reader could then potentially ‘resonate with’ this experience and ‘reflect on it’ (Bochner and Ellis in Denzin, 2000: 753). Chapter 1 outlines the various roles I play when creating a pop music track, which include composer, producer, engineer, performer and collaborator. This chapter additionally explores my position as listener, as well as the significance of the opinions of the audience. Chapter 2 focuses on composition and melody, including a discussion of intuition and invention. The vocal production techniques used in the creation of the featured track are explored, as well as my approach to mixing. Chapter 3 centres on how my drum performances and the lead vocal parts were created and edited for the portfolio track in question. It also outlines several models of collaboration that refer to the creation of this track. Chapter 4 discusses the frame of mind one draws upon whilst creating pop music. It examines how I chose equipment for the production, the production techniques used, and the creation of timbre and texture. Chapter 5 examines the application of reverbs and effects, and also explores my approaches and attitudes, some philosophical, towards the perspective of the listener. The composition collaboration situation differs in each commentary chapter: sole composition in Chapter 2; equally shared composition between two collaborators in Chapter 3; three collaborators in Chapter 4, of which I have the least compositional input; and a two-way collaboration in Chapter 5, where I have no compositional input. As a result of this study, a better understanding of my creative practice has been achieved, which includes further comprehension of my idiolect. This research has not only informed my recent composing and producing, but is also likely to influence my future productions. Furthermore, it is hoped that the insights presented in this thesis could serve as useful knowledge for others, with the prospect that they can approach their own productions with this knowledge in mind.

    Proceedings of the 19th Sound and Music Computing Conference

    Proceedings of the SMC2022 - 19th Sound and Music Computing Conference, June 5-12, 2022, Saint-Étienne (France). https://smc22.grame.fr