17,934 research outputs found
Multimodal music information processing and retrieval: survey and future challenges
Towards improving the performance in various music information processing
tasks, recent studies exploit different modalities able to capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion, and gestural data, video recordings,
editorial or cultural tags, lyrics and album cover arts. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application they address. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that Music Information
Retrieval and Sound and Music Computing research communities should focus in
the next years
Recommended from our members
MUSCLE movie-database: a multimodal corpus with rich annotation for dialogue and saliency detection
TRECVID 2008 - goals, tasks, data, evaluation mechanisms and metrics
The TREC Video Retrieval Evaluation (TRECVID) 2008 is a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in content-based exploitation of digital video via open, metrics-based evaluation. Over the last 7 years this effort has yielded a
better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. In 2008, 77 teams (see Table 1) from various research organizations --- 24 from
Asia, 39 from Europe, 13 from North America, and 1 from Australia --- participated in one or more of five tasks: high-level feature extraction, search (fully automatic, manually assisted, or interactive), pre-production video (rushes) summarization, copy detection, or surveillance event detection. The copy detection and surveillance event detection tasks are being run for the first time in TRECVID.
This paper presents an overview of TRECVid in 2008
Recommended from our members
A Systemic Approach to Music Performance Learning with Multimodal Technology
Models and Analysis of Vocal Emissions for Biomedical Applications
The MAVEBA Workshop proceedings, held on a biannual basis, collect the scientific papers presented both as oral and poster contributions, during the conference. The main subjects are: development of theoretical and mechanical models as an aid to the study of main phonatory dysfunctions, as well as the biomedical engineering methods for the analysis of voice signals and images, as a support to clinical diagnosis and classification of vocal pathologies
Effort in gestural interactions with imaginary objects in Hindustani Dhrupad vocal music
Physical effort has often been regarded as a key factor of expressivity in music performance. Nevertheless, systematic experimental approaches to the subject have been rare. In North Indian classical (Hindustani) vocal music, singers often engage with melodic ideas during improvisation by manipulating intangible, imaginary objects with their hands, such as through stretching, pulling, pushing, throwing etc. The above observation suggests that some patterns of change in acoustic features allude to interactions that real objects through their physical properties can afford. The present study reports on the exploration of the relationships between movement and sound by accounting for the physical effort that such interactions require in the Dhrupad genre of Hindustani vocal improvisation.
The work follows a mixed methodological approach, combining qualitative and quantitative methods to analyse interviews, audio-visual material and movement data. Findings indicate that despite the flexibility in the way a Dhrupad vocalist might use his/her hands while singing, there is a certain degree of consistency by which performers associate effort levels with melody and types of gestural interactions with imaginary objects. However, different schemes of cross-modal associations are revealed for the vocalists analysed, that depend on the pitch space organisation of each particular melodic mode (rāga), the mechanical requirements of voice production, the macro-structure of the ālāp improvisation and morphological cross-domain analogies. Results further suggest that a good part of the variance in both physical effort and gesture type can be explained through a small set of sound and movement features. Based on the findings, I argue that gesturing in Dhrupad singing is guided by: the know-how of humans in interacting with and exerting effort on real objects of the environment, the movement–sound relationships transmitted from teacher to student in the oral music training context and the mechanical demands of vocalisation
- …