40,522 research outputs found
Recommended from our members
On machine learning and knowledge organisation in Multimedia Information Retrieval
Recent technological developments have increased the use of machine learning to solve many problems, including many in information retrieval (IR). Deployment of machine-learning techniques is widespread in text search, notability web search engines (Dai et al., 2011). Multimedia information retrieval as a problem however still represents a significant challenge to machine learning as a technological solution, but some problems in IR can still be addressed by using appropriate AI techniques. In this paper we review the technological developments, and provide a perspective on the use of machine-learning techniques in conjunction with knowledge organisation techniques to address multimedia IR needs. We take the perspective from the MacFarlane (2016) position paper, that there are some problems in multimedia IR that AI and machine learning cannot currently solve. The semantic gap in multimedia IR (Enser, 2008) remains a significant problem in the field, and solutions to them are many years off. However, there are occasions where the new technological developments allow the use of knowledge organisation and machine learning in multimedia search systems and services. Specifically we argue that the improvement of detection of some classes of low level features in images (Karpathy and Li, 2015), music (Byrd and Crawford, 2002) and video (Hu et al., 2011) can be used in conjunction with knowledge organisation to tag or label multimedia content for better retrieval performance. We advocate the use of supervised learning techniques. We provide an overview of the use of knowledge organisation schemes in machine learning, and make recommendations to information professionals on the use of this technology with knowledge organisation techniques to solve multimedia IR problems
Music Similarity Estimation
Music is a complicated form of communication, where creators and culture communicate and expose their individuality. After music digitalization took place, recommendation systems and other online services have become indispensable in the field of Music Information Retrieval (MIR). To build these systems and recommend the right choice of song to the user, classification of songs is required. In this paper, we propose an approach for finding similarity between music based on mid-level attributes like pitch, midi value corresponding to pitch, interval, contour and duration and applying text based classification techniques. Our system predicts jazz, metal and ragtime for western music. The experiment to predict the genre of music is conducted based on 450 music files and maximum accuracy achieved is 95.8% across different n-grams. We have also analyzed the Indian classical Carnatic music and are classifying them based on its raga. Our system predicts Sankarabharam, Mohanam and Sindhubhairavi ragas. The experiment to predict the raga of the song is conducted based on 95 music files and the maximum accuracy achieved is 90.3% across different n-grams. Performance evaluation is done by using the accuracy score of scikit-learn
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece ” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 this chapter. In modern society, video is ver
Characterizing the Landscape of Musical Data on the Web: State of the Art and Challenges
Musical data can be analysed, combined, transformed and exploited for diverse purposes. However, despite the proliferation of digital libraries and repositories for music, infrastructures and tools, such uses of musical data remain scarce. As an initial step to help fill this gap, we present a survey of the landscape of musical data on the Web, available as a Linked Open Dataset: the musoW dataset of catalogued musical resources. We present the dataset and the methodology and criteria for its creation and assessment. We map the identified dimensions and parameters to existing Linked Data vocabularies, present insights gained from SPARQL queries, and identify significant relations between resource features. We present a thematic analysis of the original research questions associated with surveyed resources and identify the extent to which the collected resources are Linked Data-ready
Methodological considerations concerning manual annotation of musical audio in function of algorithm development
In research on musical audio-mining, annotated music databases are needed which allow the development of computational tools that extract from the musical audiostream the kind of high-level content that users can deal with in Music Information Retrieval (MIR) contexts. The notion of musical content, and therefore the notion of annotation, is ill-defined, however, both in the syntactic and semantic sense. As a consequence, annotation has been approached from a variety of perspectives (but mainly linguistic-symbolic oriented), and a general methodology is lacking. This paper is a step towards the definition of a general framework for manual annotation of musical audio in function of a computational approach to musical audio-mining that is based on algorithms that learn from annotated data. 1
Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system
Performance Following: Real-Time Prediction of Musical Sequences Without a Score
(c)2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works
Integrating musicology's heterogeneous data sources for better exploration
Musicologists have to consult an extraordinarily heterogeneous body of primary and secondary sources during all stages of their research. Many of these sources are now available online, but the historical dispersal of material across libraries and archives has now been replaced by segregation of data and metadata into a plethora of online repositories. This segregation hinders the intelligent manipulation of metadata, and means that extracting large tranches of basic factual information or running multi-part search queries is still enormously and needlessly time consuming. To counter this barrier to research, the “musicSpace” project is experimenting with integrating access to many of musicology’s leading data sources via a modern faceted browsing interface that utilises Semantic Web and Web2.0 technologies such as RDF and AJAX. This will make previously intractable search queries tractable, enable musicologists to use their time more efficiently, and aid the discovery of potentially significant information that users did not think to look for. This paper outlines our work to date
- …