Target Type Identification for Entity-Bearing Queries
Identifying the target types of entity-bearing queries can help improve
retrieval performance as well as the overall search experience. In this work,
we address the problem of automatically detecting the target types of a query
with respect to a type taxonomy. We propose a supervised learning approach with
a rich variety of features. Using a purpose-built test collection, we show that
our approach outperforms existing methods by a remarkable margin. This is an
extended version of the article published with the same title in the
Proceedings of SIGIR'17.
Comment: Extended version of SIGIR'17 short paper, 5 pages.
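As a rough illustration of this setup (not the authors' actual features, model, or taxonomy), target type detection can be framed as scoring (query, type) pairs with a supervised classifier; the toy features and training examples below are entirely hypothetical:

```python
# Hypothetical sketch: target type detection as supervised (query, type) scoring.
# The features here (term counts and query/type-label overlap) are illustrative
# placeholders; the paper's feature set is far richer and is not reproduced.
from sklearn.ensemble import RandomForestClassifier

def extract_features(query: str, type_label: str) -> list[float]:
    """Toy features for one (query, candidate type) pair."""
    q_terms = set(query.lower().split())
    t_terms = set(type_label.lower().split())
    overlap = len(q_terms & t_terms)
    return [len(q_terms), len(t_terms), overlap, overlap / max(len(t_terms), 1)]

# Invented training triples: (query, candidate type, is-target-type label).
train = [
    ("albert einstein birth place", "location", 1),
    ("albert einstein birth place", "person", 0),
    ("films directed by kubrick", "film", 1),
    ("films directed by kubrick", "location", 0),
]
X = [extract_features(q, t) for q, t, _ in train]
y = [label for _, _, label in train]
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Rank candidate types for an unseen query by predicted probability.
query = "universities in boston"
candidates = ["organization", "location", "person"]
scores = {t: clf.predict_proba([extract_features(query, t)])[0][1] for t in candidates}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```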
The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use
The GTZAN dataset appears in at least 100 published works, and is the
most-used public dataset for evaluation in machine listening research for music
genre recognition (MGR). Our recent work, however, shows GTZAN has several
faults (repetitions, mislabelings, and distortions), which challenge the
interpretability of any result derived using it. In this article, we disprove
the claims that all MGR systems are affected in the same ways by these faults,
and that the performances of MGR systems in GTZAN are still meaningfully
comparable since they all face the same faults. We identify and analyze the
contents of GTZAN, and provide a catalog of its faults. We review how GTZAN has
been used in MGR research, and find few indications that its faults have been
known and considered. Finally, we rigorously study the effects of its faults on
evaluating five different MGR systems. The lesson is not to banish GTZAN, but
to use it with consideration of its contents.
Comment: 29 pages, 7 figures, 6 tables, 128 references.
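For a concrete sense of what "using GTZAN with consideration of its contents" can mean in practice, the sketch below excludes excerpts listed in a fault catalog before evaluation; the identifiers shown are placeholders, not entries from the paper's actual catalog:

```python
# Hedged sketch: drop known-faulty GTZAN excerpts before evaluating an MGR system.
# These identifiers are invented placeholders; consult the paper's fault catalog
# for the real lists of repetitions, mislabelings, and distortions.
FAULTY_EXCERPTS = {
    "blues.00001",   # placeholder: an exact repetition of another excerpt
    "metal.00055",   # placeholder: a mislabeled track
}

def filter_gtzan(excerpt_ids):
    """Return only the excerpts not listed in the fault catalog."""
    return [e for e in excerpt_ids if e not in FAULTY_EXCERPTS]

all_excerpts = ["blues.00001", "blues.00002", "metal.00055", "rock.00010"]
print(filter_gtzan(all_excerpts))  # ['blues.00002', 'rock.00010']
```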
Affective Music Information Retrieval
Much of the appeal of music lies in its power to convey emotions/moods and to
evoke them in listeners. Consequently, the past decade has witnessed growing
interest in modeling emotions from musical signals in the music information
retrieval (MIR) community. In this article, we present a novel generative
approach to music emotion modeling, with a specific focus on the
valence-arousal (VA) dimension model of emotion. The presented generative
model, called \emph{acoustic emotion Gaussians} (AEG), better accounts for the
subjectivity of emotion perception by the use of probability distributions.
Specifically, it learns, from the emotion annotations of multiple subjects, a
Gaussian mixture model in the VA space, with prior constraints on the
corresponding acoustic features of the training music pieces. Such a
computational framework is technically sound, capable of learning in an online
fashion, and thus applicable to a variety of applications, including
user-independent (general) and user-dependent (personalized) emotion
recognition and emotion-based music retrieval. We report evaluations of the
aforementioned applications of AEG on a large-scale emotion-annotated corpus,
AMG1608, to demonstrate the effectiveness of AEG and to showcase how
evaluations are conducted for research on emotion-based MIR. Directions of
future work are also discussed.
Comment: 40 pages, 18 figures, 5 tables, author version.
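To make the core modeling idea concrete, the sketch below fits a Gaussian mixture to hypothetical valence-arousal annotations from multiple subjects. This is only the annotation-side half of the picture; the coupling of mixture components to acoustic features, which is what makes AEG a generative model, is omitted:

```python
# Sketch of the GMM-in-VA-space idea behind AEG, using scikit-learn.
# The annotations are invented; AEG's conditioning on acoustic features is omitted.
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical (valence, arousal) ratings from five subjects for one clip,
# each in [-1, 1] x [-1, 1].
va_annotations = np.array([
    [0.6, 0.7], [0.5, 0.8], [0.7, 0.6],   # most subjects: happy/excited
    [-0.2, 0.4], [-0.1, 0.5],             # a minority: tense
])

# Two components preserve the disagreement as two emotion "modes" instead of
# collapsing it into a single mean rating, which is the point of modeling
# subjectivity with probability distributions.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(va_annotations)

print("component means (valence, arousal):\n", gmm.means_)
print("component weights:", gmm.weights_)
```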
Music Information Retrieval in Live Coding: A Theoretical Framework
The work presented in this article was conducted in part while the first author was at Georgia Tech from 2015 to 2017, with the support of the School of Music, the Center for Music Technology, and Women in Music Tech at Georgia Tech.
Another part of this research was conducted while the first author was at Queen Mary University of London from 2017 to 2019, with the support of the AudioCommons project, funded by the European Commission through the Horizon 2020 programme (research and innovation grant 688382).
Music information retrieval (MIR) has great potential in musical live coding because it can help the musician–programmer to make musical decisions based on audio content analysis and to explore new sonorities by means of MIR techniques. Real-time MIR techniques can be computationally demanding, so they have rarely been used in live coding; when they have been, the focus has been on low-level feature extraction. This article surveys and discusses the potential of MIR applied to live coding at a higher musical level. We propose a conceptual framework of three categories: (1) audio repurposing, (2) audio rewiring, and (3) audio remixing. We explored the three categories in live performance through MIRLC, an application programming interface library written in SuperCollider. We found that using high-level features in real time remains a technical challenge, yet combining rhythmic and tonal properties (midlevel features) with text-based information (e.g., tags) helps to achieve a closer perceptual level centered on pitch and rhythm when using MIR in live coding. We discuss challenges and future directions for MIR approaches in the computer music field.
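MIRLC itself is a SuperCollider library that runs in real time, so the offline Python sketch below (using librosa) is only an analogous illustration of the midlevel rhythmic and tonal properties the article highlights, not MIRLC's API:

```python
# Offline stand-in for the midlevel analysis discussed above: a tempo estimate
# (rhythmic property) and a dominant pitch class (tonal property).
# librosa is used here for convenience; this is not how MIRLC works internally.
import librosa
import numpy as np

y, sr = librosa.load(librosa.ex("trumpet"))  # bundled librosa example clip

# Rhythmic property: global tempo estimate in beats per minute.
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

# Tonal property: dominant pitch class from a time-averaged chromagram.
chroma = librosa.feature.chroma_cqt(y=y, sr=sr)
pitch_classes = "C C# D D# E F F# G G# A A# B".split()
dominant = pitch_classes[int(np.argmax(chroma.mean(axis=1)))]

print(f"tempo ~ {float(tempo):.1f} BPM, dominant pitch class: {dominant}")
```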