26,826 research outputs found

    PLXTRM : prediction-led eXtended-guitar tool for real-time music applications and live performance

    Get PDF
    peer reviewedThis article presents PLXTRM, a system tracking picking-hand micro-gestures for real-time music applications and live performance. PLXTRM taps into the existing gesture vocabulary of the guitar player. On the first level, PLXTRM provides a continuous controller that doesn’t require the musician to learn and integrate extrinsic gestures, avoiding additional cognitive load. Beyond the possible musical applications using this continuous control, the second aim is to harness PLXTRM’s predictive power. Using a reservoir network, string onsets are predicted within a certain time frame, based on the spatial trajectory of the guitar pick. In this time frame, manipulations to the audio signal can be introduced, prior to the string actually sounding, ’prefacing’ note onsets. Thirdly, PLXTRM facilitates the distinction of playing features such as up-strokes vs. down-strokes, string selections and the continuous velocity of gestures, and thereby explores new expressive possibilities

    Mandarin Singing Voice Synthesis Based on Harmonic Plus Noise Model and Singing Expression Analysis

    Full text link
    The purpose of this study is to investigate how humans interpret musical scores expressively, and then design machines that sing like humans. We consider six factors that have a strong influence on the expression of human singing. The factors are related to the acoustic, phonetic, and musical features of a real singing signal. Given real singing voices recorded following the MIDI scores and lyrics, our analysis module can extract the expression parameters from the real singing signals semi-automatically. The expression parameters are used to control the singing voice synthesis (SVS) system for Mandarin Chinese, which is based on the harmonic plus noise model (HNM). The results of perceptual experiments show that integrating the expression factors into the SVS system yields a notable improvement in perceptual naturalness, clearness, and expressiveness. By one-to-one mapping of the real singing signal and expression controls to the synthesizer, our SVS system can simulate the interpretation of a real singer with the timbre of a speaker.Comment: 8 pages, technical repor

    Music Information Retrieval in Live Coding: A Theoretical Framework

    Get PDF
    The work presented in this article has been partly conducted while the first author was at Georgia Tech from 2015–2017 with the support of the School of Music, the Center for Music Technology and Women in Music Tech at Georgia Tech. Another part of this research has been conducted while the first author was at Queen Mary University of London from 2017–2019 with the support of the AudioCommons project, funded by the European Commission through the Horizon 2020 programme, research and innovation grant 688382. The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Music information retrieval (MIR) has a great potential in musical live coding because it can help the musician–programmer to make musical decisions based on audio content analysis and explore new sonorities by means of MIR techniques. The use of real-time MIR techniques can be computationally demanding and thus they have been rarely used in live coding; when they have been used, it has been with a focus on low-level feature extraction. This article surveys and discusses the potential of MIR applied to live coding at a higher musical level. We propose a conceptual framework of three categories: (1) audio repurposing, (2) audio rewiring, and (3) audio remixing. We explored the three categories in live performance through an application programming interface library written in SuperCollider, MIRLC. We found that it is still a technical challenge to use high-level features in real time, yet using rhythmic and tonal properties (midlevel features) in combination with text-based information (e.g., tags) helps to achieve a closer perceptual level centered on pitch and rhythm when using MIR in live coding. We discuss challenges and future directions of utilizing MIR approaches in the computer music field

    Probabilistic Modeling Paradigms for Audio Source Separation

    Get PDF
    This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They also,conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems

    Computers in Support of Musical Expression

    Get PDF

    Audiovisual integration of emotional signals from others' social interactions

    Get PDF
    Audiovisual perception of emotions has been typically examined using displays of a solitary character (e.g., the face-voice and/or body-sound of one actor). However, in real life humans often face more complex multisensory social situations, involving more than one person. Here we ask if the audiovisual facilitation in emotion recognition previously found in simpler social situations extends to more complex and ecological situations. Stimuli consisting of the biological motion and voice of two interacting agents were used in two experiments. In Experiment 1, participants were presented with visual, auditory, auditory filtered/noisy, and audiovisual congruent and incongruent clips. We asked participants to judge whether the two agents were interacting happily or angrily. In Experiment 2, another group of participants repeated the same task, as in Experiment 1, while trying to ignore either the visual or the auditory information. The findings from both experiments indicate that when the reliability of the auditory cue was decreased participants weighted more the visual cue in their emotional judgments. This in turn translated in increased emotion recognition accuracy for the multisensory condition. Our findings thus point to a common mechanism of multisensory integration of emotional signals irrespective of social stimulus complexity

    Audio Features Affected by Music Expressiveness

    Full text link
    Within a Music Information Retrieval perspective, the goal of the study presented here is to investigate the impact on sound features of the musician's affective intention, namely when trying to intentionally convey emotional contents via expressiveness. A preliminary experiment has been performed involving 1010 tuba players. The recordings have been analysed by extracting a variety of features, which have been subsequently evaluated by combining both classic and machine learning statistical techniques. Results are reported and discussed.Comment: Submitted to ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), Pisa, Italy, July 17-21, 201

    Rhythm detection for speech-music discrimination in MPEG compressed domain

    Get PDF
    A novel approach to speech-music discrimination based on rhythm (or beat) detection is introduced. Rhythmic pulses are detected by applying a long-term autocorrelation method on band-passed signals. This approach is combined with another, in which the features describe the energy peaks of the signal. The discriminator uses just three features that are computed from data directly taken from an MPEG-1 bitstream. The discriminator was tested on more than 3 hours of audio data. Average recognition rate is 97.7%

    Synchronizing Sequencing Software to a Live Drummer

    Get PDF
    Copyright 2013 Massachusetts Institute of Technology. MIT allows authors to archive published versions of their articles after an embargo period. The article is available at
    corecore