2,821 research outputs found
Multimodal music information processing and retrieval: survey and future challenges
Towards improving the performance in various music information processing
tasks, recent studies exploit different modalities able to capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion and gestural data, video recordings,
editorial or cultural tags, lyrics, and album cover art. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application it addresses. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that the Music Information
Retrieval and Sound and Music Computing research communities should focus on in
the coming years
Proceedings of the 6th International Workshop on Folk Music Analysis, 15-17 June, 2016
The Folk Music Analysis Workshop brings together computational music analysis and ethnomusicology. Both symbolic and audio representations of music are considered, with a broad range of scientific approaches being applied (signal processing, graph theory, deep learning). The workshop features a range of interesting talks from international researchers in areas such as Indian classical music, Iranian singing, Ottoman-Turkish Makam music scores, Flamenco singing, Irish traditional music, Georgian traditional music and Dutch folk songs. Invited guest speakers were Anja Volk, Utrecht University and Peter Browne, Technological University Dublin
Pan European Voice Conference - PEVOC 11
The Pan European VOice Conference (PEVOC) was founded in 1995, so in 2015 it celebrates the 20th anniversary of its establishment: an important milestone that clearly expresses the strength of the scientific community's interest in the topics of this conference. The most significant themes of PEVOC are singing pedagogy and art, but also occupational voice disorders, neurology, rehabilitation, and image and video analysis. PEVOC takes place in a different European city every two years (www.pevoc.org). The PEVOC 11 conference includes a symposium of the Collegium Medicorum Theatri (www.comet collegium.com
Wagner Ring Dataset: A Complex Opera Scenario for Music Processing and Computational Musicology
This paper introduces the Wagner Ring Dataset (WRD), a multi-modal and multi-version resource on the large-scale opera cycle Der Ring des Nibelungen by Richard Wagner. The Ring comprises four music dramas organized into eleven acts and 21 939 measures in total. Concerning sheet music, we processed a publicly available piano reduction (822 pages) of the full score with optical music recognition followed by extensive manual corrections to create a high-quality, machine-readable symbolic score. Concerning audio data, our corpus covers 16 recorded performances of the full Ring (three of which are publicly available thanks to copyright expiry), each lasting about 14–15 hours. To musically synchronize these versions among each other, we manually annotated all measure positions for three performances, which we transferred to the remaining performances via automated synchronization techniques. The dataset further comprises annotations of key and time signatures, scenes, and singing voice regions (libretto). Moreover, we provide note event annotations for all performances derived from the piano score. The WRD thus constitutes a comprehensive resource for developing algorithms for various music information retrieval tasks, complementing existing datasets with a complex opera scenario. For computational musicology, the WRD serves as a structured dataset that allows for studying the composition and performances of the Ring
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis
Effects of training and lung volume levels on voice onset control and cortical activation in singers
Singers need to counteract respiratory elastic recoil at high and low lung volume levels (LVLs) to maintain consistent airflow and pressure while singing. Professionally trained singers modify their vocal and respiratory systems creating a physiologically stable and perceptually pleasing voice quality at varying LVLs. In manuscript 1, we compared non-singers and singers on the initiation of a voiceless plosive followed by a vowel at low (30% vital capacity, VC), intermediate (50%VC), and high (80%VC) LVLs. In manuscript 2, we examined how vocal students (singers in manuscript 1) learn to control their voice onset at varying LVLs before and after a semester of voice training within a university program. Also examined were the effects of training level and LVLs on cortical activation patterns between non-singers and singers (manuscript 1), and within vocal students before and after training (manuscript 2) using fNIRS. Results revealed decreased control of voice onset initially in singers prior to training as compared to non-singers, but significant improvements in initial voice onset control after training, although task difficulty continued to alter voice physiology throughout. Cortical activation patterns did not change with training but continued to show increased activation during the most difficult tasks, which was more pronounced after training. Professionally trained techniques for consistent, coordinated voice initiation were shown to alter voice onset following plosive consonants with training. However, in non-singers and, as performance improved in singers after training, cortical activation remained greatest during the tasks at low LVLs when difficulty was highest
Computational Modelling and Analysis of Vibrato and Portamento in Expressive Music Performance
PhD, 148pp
Vibrato and portamento constitute two expressive devices involving continuous
pitch modulation and are widely employed in string, voice, and wind instrument
performance. Automatic extraction and analysis of such expressive features
form some of the most important aspects of music performance research and
represent an under-explored area in music information retrieval. This thesis
aims to provide computational and scalable solutions for the automatic extraction
and analysis of performed vibratos and portamenti. Applications of the
technologies include music learning, musicological analysis, music information
retrieval (summarisation, similarity assessment), and music expression synthesis.
To automatically detect vibratos and estimate their parameters, we propose
a novel method based on the Filter Diagonalisation Method (FDM). The FDM
remains robust over short time frames, allowing frame sizes to be set at values
small enough to accurately identify local vibrato characteristics and pinpoint
vibrato boundaries. To determine vibrato presence, we test two alternative
decision mechanisms: the Decision Tree and Bayes’ Rule. The FDM
systems are compared to state-of-the-art techniques and obtain the best results.
The FDM’s vibrato rate accuracies are above 92.5%, and the vibrato
extent accuracies are about 85%.
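To make the two reported parameters concrete, the following is a minimal sketch of what vibrato rate (cycles per second) and vibrato extent (half the peak-to-trough pitch deviation) mean on a pitch contour. It does not implement the Filter Diagonalisation Method; it swaps in a much simpler zero-crossing estimate purely for illustration. The function name, the use of semitone (MIDI-style) pitch values, and the synthetic 6 Hz test contour are all assumptions, not taken from the thesis.

```python
import math


def vibrato_rate_and_extent(pitch, frame_rate):
    """Estimate vibrato rate (Hz) and extent (same unit as `pitch`).

    Illustrative zero-crossing estimate, NOT the FDM-based method.
    `pitch` is a list of pitch values sampled at `frame_rate` frames/s.
    Returns None if fewer than two upward zero crossings are found.
    """
    mean = sum(pitch) / len(pitch)
    centred = [p - mean for p in pitch]

    # Times (in seconds) of upward zero crossings, linearly interpolated.
    crossings = []
    for i in range(1, len(centred)):
        a, b = centred[i - 1], centred[i]
        if a < 0.0 <= b:
            frac = -a / (b - a)
            crossings.append((i - 1 + frac) / frame_rate)

    if len(crossings) < 2:
        return None

    # Each pair of consecutive upward crossings spans one vibrato cycle.
    rate = (len(crossings) - 1) / (crossings[-1] - crossings[0])
    # Extent: half the peak-to-trough deviation around the mean pitch.
    extent = (max(centred) - min(centred)) / 2.0
    return rate, extent


# Hypothetical example: 1 s of a 6 Hz vibrato, 0.5 semitone extent, around A4.
FRAME_RATE = 100
contour = [69.0 + 0.5 * math.sin(2 * math.pi * 6.0 * i / FRAME_RATE + 0.3)
           for i in range(FRAME_RATE)]
estimated_rate, estimated_extent = vibrato_rate_and_extent(contour, FRAME_RATE)
```

An estimate of this kind needs several full cycles to be stable, which is the limitation the short-frame robustness of the FDM is meant to address.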
We use the Hidden Markov Model (HMM) with Gaussian Mixture Model
(GMM) to detect portamento existence. Upon extracting the portamenti, we
propose a Logistic Model for describing portamento parameters. The Logistic
Model has the lowest root mean squared error and the highest adjusted R-squared
value compared to regression models employing Polynomial and Gaussian
functions, and the Fourier Series.
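As a rough illustration of fitting a logistic curve to a portamento and scoring it with the two criteria named above, here is a minimal sketch. The four-parameter logistic form, the coarse grid search, the parameter names, and the synthetic two-semitone slide are assumptions made for this example; the thesis may parameterise and fit its Logistic Model differently.

```python
import math


def logistic(t, lower, upper, growth, midpoint):
    """Logistic transition from pitch `lower` to pitch `upper` (assumed form)."""
    return lower + (upper - lower) / (1.0 + math.exp(-growth * (t - midpoint)))


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def adjusted_r_squared(observed, predicted, n_params):
    """R-squared penalised for the number of fitted parameters."""
    n = len(observed)
    mean = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def fit_logistic(times, pitches, growth_grid, midpoint_grid):
    """Coarse grid search; asymptotes are taken from the first and last samples."""
    lower, upper = pitches[0], pitches[-1]
    best = None
    for growth in growth_grid:
        for midpoint in midpoint_grid:
            predicted = [logistic(t, lower, upper, growth, midpoint) for t in times]
            error = rmse(pitches, predicted)
            if best is None or error < best[0]:
                best = (error, growth, midpoint)
    return lower, upper, best[1], best[2]


# Hypothetical portamento: a slide from MIDI pitch 60 to 62 over one second.
times = [i / 20.0 for i in range(21)]
pitches = [logistic(t, 60.0, 62.0, 20.0, 0.5) for t in times]

params = fit_logistic(times, pitches, [5.0, 10.0, 20.0, 40.0],
                      [0.3, 0.4, 0.5, 0.6, 0.7])
fitted = [logistic(t, *params) for t in times]
fit_rmse = rmse(pitches, fitted)
fit_adj_r2 = adjusted_r_squared(pitches, fitted, n_params=4)
```

The same `rmse` and `adjusted_r_squared` helpers could be applied to polynomial, Gaussian, or Fourier-series fits of the same contour to reproduce the kind of comparison described; a real implementation would normally use a nonlinear least-squares solver rather than a grid.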
The vibrato and portamento detection and analysis methods are implemented
in AVA, an interactive tool for automated detection, analysis, and visualisation
of vibrato and portamento. Using the system, we perform cross-cultural
analyses of vibrato and portamento differences between erhu and violin
performance styles, and between typical male or female roles in Beijing opera
singing
Case study of a performance-active changing trans* male singing voice
A professional classical singer of more than 25 years (AZ) in his early 50s requested this voice researcher’s consultation and assistance in early 2014. He was about to start living full time as a trans* man. Despite his intention to be included in the low start/gradual increase testosterone option of the Trans* Male (previously, “FTM”) Singing Voice Program, the request contained a rather unconventional aspect: AZ would continue to sing while his voice was changing. The above request was integral with his singing history. After the introduction of safeguards and his informed consent, AZ was accepted onto the Program. Due to the highly individual circumstances, his participation was recorded as a case study. The study has aimed to replicate the particulars of the slow hormonal changes and continuing singing ability found in certain cisgender male adolescent voices. Despite dealing with an adult trans* male individual, the progress has been comparable. This has been achieved by carefully monitoring AZ’s low start/gradual increase testosterone administration in communication with the medical practitioner. The participant’s vocal health remained safeguarded and promoted by carefully individualized vocal tuition. This article will discuss the collective results of the case study, including the recordings and the data analysis