27,080 research outputs found
Audio Content-Based Music Retrieval
The rapidly growing corpus of digital audio material requires novel
retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. In the case that such textual descriptions are not available, one requires content-based retrieval strategies which only utilize the raw audio material. In this contribution, we discuss content-based retrieval strategies that
follow the query-by-example paradigm: given an audio query, the task is to retrieve all documents that are somehow similar or related to the query from a music collection. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version
identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrate the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system, which combines the various retrieval strategies in a unified framework
Music information retrieval: conceptuel framework, annotation and user behaviour
Understanding music is a process both based on and influenced by the knowledge and experience of the listener. Although content-based music retrieval has been given increasing attention in recent years, much of the research still focuses on bottom-up retrieval techniques. In order to make a music information retrieval system appealing and useful to the user, more effort should be spent on constructing systems that both operate directly on the encoding of the physical energy of music and are flexible with respect to usersâ experiences.
This thesis is based on a user-centred approach, taking into account the mutual relationship between music as an acoustic phenomenon and as an expressive phenomenon. The issues it addresses are: the lack of a conceptual framework, the shortage of annotated musical audio databases, the lack of understanding of the behaviour of system users and shortage of user-dependent knowledge with respect to high-level features of music.
In the theoretical part of this thesis, a conceptual framework for content-based music information retrieval is defined. The proposed conceptual framework - the first of its kind - is conceived as a coordinating structure between the automatic description of low-level music content, and the description of high-level content by the system users. A general framework for the manual annotation of musical audio is outlined as well. A new methodology for the manual annotation of musical audio is introduced and tested in case studies. The results from these studies show that manually annotated music files can be of great help in the development of accurate analysis tools for music information retrieval.
Empirical investigation is the foundation on which the aforementioned theoretical framework is built. Two elaborate studies involving different experimental issues are presented. In the first study, elements of signification related to spontaneous user behaviour are clarified. In the second study, a global profile of music information retrieval system users is given and their description of high-level content is discussed. This study has uncovered relationships between the usersâ demographical background and their perception of expressive and structural features of music. Such a multi-level approach is exceptional as it included a large sample of the population of real users of interactive music systems. Tests have shown that the findings of this study are representative of the targeted population.
Finally, the multi-purpose material provided by the theoretical background and the results from empirical investigations are put into practice in three music information retrieval applications: a prototype of a user interface based on a taxonomy, an annotated database of experimental findings and a prototype semantic user recommender system.
Results are presented and discussed for all methods used. They show that, if reliably generated, the use of knowledge on users can significantly improve the quality of music content analysis. This thesis demonstrates that an informed knowledge of human approaches to music information retrieval provides valuable insights, which may be of particular assistance in the development of user-friendly, content-based access to digital music collections
A scalable Peer-to-Peer System for Music Content and Information Retrieval
Currently a large percentage of internet traffice consists of music files, typically stored in MP3 compressed audio format, shared and exchanged over Peer-to-Peer (P2P) networks. Searching for music is performed by specifying keywords and naive string matching techniques. In the past years the emerging research area of Music Information Retrieval (MIR) has produced a variety of new ways of looking at the problem of music search. Such MIR techniques can significantly enhance the ways users search for music over P2P networks. In order for that to happen there are two main challenges that need to be addressed: 1) scalability to large collections and number of peers 2) richer set of search semantics that can support MIR especially when retrieval is content-based. In this paper, we describe a scalable P2P system that uses Rendezvouz Points (RPs) for music metadata registration and query resolution, that supports atribute-value search semantics as well as content-based retrieval. The performance of the system has been evaluated in large scale usage scenarios using "real" automatically calculated musical content descriptors
Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system
Managing complexity in a distributed digital library
As the capabilities of distributed digital libraries increase, managing organizational and software complexity becomes a key issue. How can collections and indexes be updated without impacting queries currently in progress? How can the system handle several user-interface clients for the same collections? Computer science professors and lectors from the University of Waikato have developed a software structure that successfully manages this complexity in the New Zealand Digital Library. This digital library has been a success in managing organizational and software complexity. The researchers' primary goal has been to minimize the effort required to keep the system operational and yet continue to expand its offerings
Design and Evaluation of a Probabilistic Music Projection Interface
We describe the design and evaluation of a probabilistic
interface for music exploration and casual playlist generation.
Predicted subjective features, such as mood and
genre, inferred from low-level audio features create a 34-
dimensional feature space. We use a nonlinear dimensionality
reduction algorithm to create 2D music maps of
tracks, and augment these with visualisations of probabilistic
mappings of selected features and their uncertainty.
We evaluated the system in a longitudinal trial in usersâ
homes over several weeks. Users said they had fun with the
interface and liked the casual nature of the playlist generation.
Users preferred to generate playlists from a local
neighbourhood of the map, rather than from a trajectory,
using neighbourhood selection more than three times more
often than path selection. Probabilistic highlighting of subjective
features led to more focused exploration in mouse
activity logs, and 6 of 8 users said they preferred the probabilistic
highlighting mode
Tracking the Evolution of a Band's Live Performances over Decades
International Society for Music Information Retrieval Conference (ISMIR 2022) , Bengaluru, India, December 4-8, 2022Evolutionary studies have become a dominant thread in the analysis of large audio collections. Such corpora usually consist of musical pieces by various composers or bands and the studies usually focus on identifying general historical trends in harmonic content or music production techniques. In this paper we present a comparable study that examines the music of a single band whose publicly available live recordings span three decades. We first discuss the opportunities and challenges faced when working with single-artist and live-music datasets and introduce solutions for audio feature validation and outlier detection. We then investigate how individual songs vary over time and identify general performance trends using a new approach based on relative feature values, which improves accuracy for features with a large variance. Finally, we validate our findings by juxtaposing them with descriptions posted in online forums by experienced listeners of the band's large following
Large scale evaluations of multimedia information retrieval: the TRECVid experience
Information Retrieval is a supporting technique which underpins a broad range of content-based applications including retrieval, filtering, summarisation, browsing, classification, clustering, automatic linking, and others. Multimedia information retrieval (MMIR) represents those applications when applied to multimedia information such as image, video, music, etc. In this presentation and extended abstract we are primarily concerned with MMIR as applied to information in digital video format. We begin with a brief overview of large scale evaluations of IR tasks in areas such as text, image and music, just to illustrate that this phenomenon is not just restricted to MMIR on video. The main contribution, however, is a set of pointers and a summarisation of the work done as part of TRECVid, the annual benchmarking exercise for video retrieval tasks
- âŠ