CCOM-HuQin: an Annotated Multimodal Chinese Fiddle Performance Dataset
HuQin is a family of traditional Chinese bowed string instruments. Playing
techniques (PTs) embodied in various playing styles add rich emotional
coloring and aesthetic nuance to HuQin performance. These complex
techniques make HuQin music a challenging source for fundamental MIR tasks such
as pitch analysis, transcription and score-audio alignment. In this paper, we
present a multimodal performance dataset of HuQin music that contains
audio-visual recordings of 11,992 single-PT clips and 57 annotated
excerpts of classical pieces. We systematically describe the HuQin PT taxonomy
based on musicological theory and practical use cases. Then we introduce the
dataset creation methodology and highlight the annotation principles featuring
PTs. We analyze statistics of the dataset from several aspects to demonstrate the variety
of PTs played across HuQin subcategories, and perform preliminary experiments to
show the potential applications of the dataset in various MIR tasks and
cross-cultural music studies. Finally, we propose future work extending
the dataset.
Comment: 15 pages, 11 figures
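The pitch-analysis task mentioned above can be illustrated with a classical baseline: a frame-wise autocorrelation fundamental-frequency (F0) estimator. This is a generic sketch, not the dataset's baseline system, and all function names here are illustrative:

```python
import numpy as np

def estimate_f0_autocorr(frame, sr, fmin=80.0, fmax=1000.0):
    """Estimate the F0 of one audio frame by autocorrelation:
    the strongest peak within the allowed lag range gives the period."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)                  # shortest period considered
    lag_max = min(int(sr / fmin), len(ac) - 1)  # longest period considered
    if lag_max <= lag_min:
        return 0.0
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return sr / lag

# Sanity check: a synthetic 220 Hz sine should be recovered close to 220 Hz.
sr = 16000
t = np.arange(2048) / sr
f0 = estimate_f0_autocorr(np.sin(2 * np.pi * 220.0 * t), sr)
print(f0)
```

Real HuQin recordings with bowing noise and ornamental PTs would require a more robust estimator, but the lag-search structure is the same.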
Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and changing environments, whereas Computational Intelligence (CI) offers solutions to complicated problems, including inverse problems. The main feature of CI is adaptability, spanning machine learning and computational neuroscience. CI also comprises biologically inspired techniques, such as swarm intelligence within evolutionary computation, and reaches into wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI for solving a variety of applications optimally, demonstrating its wide reach and relevance. Combining optimization methods with data mining strategies yields a strong and reliable prediction tool for handling real-life applications.
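As a minimal illustration of the swarm-intelligence methods the book covers, here is a bare-bones particle swarm optimisation (PSO) sketch. It is illustrative only; the parameter values are conventional defaults, not taken from the book:

```python
import random

def pso(f, dim=2, n_particles=20, iters=200, bounds=(-5.0, 5.0)):
    """Minimal PSO: each particle is pulled toward its own best-known
    position and the swarm's best-known position."""
    random.seed(0)  # fixed seed so the sketch is reproducible
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social weights
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Minimise the sphere function; the swarm should converge near the origin.
best, best_val = pso(lambda p: sum(x * x for x in p))
print(best_val)
```

The same pull-toward-the-best update structure underlies most swarm variants; only the velocity rule and topology change.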
Modelling Professional Singers: A Bayesian Machine Learning Approach with Enhanced Real-time Pitch Contour Extraction and Onset Processing from an Extended Dataset.
Singing signals are among the inputs that computer systems need to analyse, and singing is part of every culture in the world. Although audio signal processing has been studied for over three decades, it remains an active research area because most algorithms in the literature still require improvement owing to the complexity of audio and music signals. Analysing sound and music in a real-time environment demands additional effort, since the algorithms can use only past data, whereas an offline system has all the required data available. The complexity increases further when the audio comes from singing, because the unique features of singing signals (such as the vocal system, vibration, pitch drift, and tuning approach) make them different from, and more complicated than, signals from an instrument.
This thesis focuses on analysing singing signals and on better understanding how trained professional singers produce the pitch frequency and duration of notes according to their position in a piece of music and the singing technique applied. It was found that incorporating singing-related features, such as gender and BPM, allows a real-time pitch detection algorithm to estimate fundamental frequencies with fewer errors. In addition, two novel algorithms were proposed, one for smoothing pitch contours and another for estimating onsets, offsets, and the transitions between notes; both showed better results than several state-of-the-art algorithms. Moreover, a new vocal dataset with annotations for 2688 singing files was published. Finally, the thesis presents two models for calculating the pitch and duration of notes according to their positions in a piece of music. In conclusion, optimizing results for pitch-oriented Music Information Retrieval (MIR) algorithms necessitates adapting or selecting them based on the unique characteristics of the signals. Achieving a universal algorithm that performs exceptionally well on all data types remains a formidable challenge given the current state of technology.
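The real-time constraint described above (algorithms may use only past data) can be illustrated with a causal smoothing step: a median filter over a trailing window of pitch estimates. This is a generic stand-in for pitch-contour smoothing, not the thesis's proposed algorithm:

```python
import numpy as np

def causal_median_smooth(pitches, window=5):
    """Smooth a pitch contour frame by frame using only past frames
    (a causal median filter), as a real-time system would have to."""
    out = np.empty(len(pitches), dtype=float)
    for i in range(len(pitches)):
        start = max(0, i - window + 1)  # trailing window: no future frames
        out[i] = np.median(pitches[start:i + 1])
    return out

# A spurious one-frame octave jump (440 -> 880 Hz) is suppressed.
contour = np.array([440.0, 441.0, 880.0, 439.0, 440.0, 441.0])
smooth = causal_median_smooth(contour)
print(smooth)
```

An offline smoother could centre the window on each frame; the causal version trades a little accuracy for the ability to run on a live stream.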