33 research outputs found

    Announcements

    Calls for Papers, Conferences, Podcast

    Contrastive audio-language learning for music

    As one of the most intuitive interfaces known to humans, natural language has the potential to mediate many tasks that involve human-computer interaction, especially in application-focused fields like Music Information Retrieval. In this work, we explore cross-modal learning in an attempt to bridge audio and language in the music domain. To this end, we propose MusCALL, a framework for Music Contrastive Audio-Language Learning. Our approach consists of a dual-encoder architecture that learns the alignment between pairs of music audio and descriptive sentences, producing multimodal embeddings that can be used for text-to-audio and audio-to-text retrieval out of the box. Thanks to this property, MusCALL can be transferred to virtually any task that can be cast as text-based retrieval. Our experiments show that our method performs significantly better than the baselines at retrieving audio that matches a textual description and, conversely, text that matches an audio query. We also demonstrate that the multimodal alignment capability of our model can be successfully extended to the zero-shot transfer scenario for genre classification and auto-tagging on two public datasets.
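    The retrieval step described in this abstract can be illustrated with a minimal sketch, assuming text and audio have already been mapped into one shared embedding space. The embeddings, clip names, and helper functions below are hypothetical toy values for illustration only; they are not MusCALL outputs or its API.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_audio(text_emb, audio_embs):
    """Return audio clip names sorted from most to least similar to the text query."""
    scores = {name: cosine(text_emb, emb) for name, emb in audio_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 3-dimensional embeddings (assumed, not produced by any real model).
audio_embs = {
    "clip_jazz": [0.9, 0.1, 0.0],
    "clip_rock": [0.1, 0.9, 0.2],
    "clip_ambient": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stands in for an embedded text description

ranking = rank_audio(query, audio_embs)
print(ranking)
```

    Under the same assumption, zero-shot classification can be cast as retrieval in the other direction: each class label is embedded as text, and an audio clip is assigned the label whose embedding is most similar.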

    Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines

    The emerging field of Music Information Retrieval (MIR) has been influenced by neighboring domains in signal processing and machine learning, including automatic speech recognition, image processing and text information retrieval. In this contribution, we start with concrete examples of methodology transfer between speech and music processing, organized around the building blocks of pattern recognition: preprocessing, feature extraction, and classification/decoding. We then assume a higher-level viewpoint when describing sources of mutual inspiration derived from text and image information retrieval. We conclude that dealing with the peculiarities of music in MIR research has contributed to advancing the state of the art in other fields, and that many future challenges in MIR are strikingly similar to those that other research areas have been facing.

    Theoretical and applied issues on the impact of information on musical creativity: an information seeking behaviour perspective.

    This century is an era of information and knowledge intensification. Novel information systems and services are developing through modern online information technologies. The rapid changes in the online information environment have greatly affected the way in which individuals search for music information and engage with musical creativity, within different music domains and for different purposes which involve composition, performance and improvisation, analysis and listening. The aim of this book chapter is to investigate the theoretical and practical issues relating to the impact of music information on musical creativity from an information seeking behavior perspective. Musical creativity is perceived as an intentional process which acts as a motivator for information seeking, leading to the utilization of different information resources and to the development of specific information seeking preferences. The chapter highlights the implications for research in this area and presents a research agenda for the interrelation between music information seeking and musical creativity.

    Automatic Classification of Digital Music by Genre

    Presented at the Grace Hopper Celebration of Women in Computing (GHC’12) Research Poster, Baltimore, MD, USA and also presented at the Women in Machine Learning Workshop (WiML ’12), Research Poster, Lake Tahoe, Nevada, USA. Over the past two decades, advances in the digital music industry have resulted in an exponential growth in music data sets. This exponential growth has in turn spurred great interest in music information retrieval (MIR) problems, organizing large music collections, and content-based search methods for digital music libraries. Equally important are the related problems in music classification such as genre classification, music mood analysis, and artist identification. Music genre classification is a well-studied problem in the music information retrieval community and has a wide range of applications. In this project we address the problem of genre classification by representing the MFCC feature vectors in an extended semantic space. We combine this audio representation with machine learning techniques to perform genre classification with the goal of obtaining higher classification accuracy.

    Browse-to-search

    This demonstration presents a novel interactive online shopping application based on visual search technologies. When users want to buy something on a shopping site, they usually need to look for related information on other web sites. Users therefore have to switch between the web page being browsed and other websites that provide search results. The proposed application enables users to search naturally for products of interest while they browse a web page, so that even a casual purchase intent is easily satisfied. The interactive shopping experience is characterized by: 1) in session - it allows users to specify the purchase intent in the browsing session, instead of leaving the current page and navigating to other websites; 2) in context - the browsed web page provides implicit context information which helps infer user purchase preferences; 3) in focus - users easily specify their search interest using gestures on touch devices and do not need to formulate queries in a search box; 4) natural - gesture inputs and visual search provide users with a natural shopping experience. The system is evaluated against a data set consisting of several million commercial product images. © 2012 Authors

    Discovering real-world usage scenarios for a multimodal math search interface

    To use math expressions in search, current search engines require knowing expression names or using a structure editor or string encoding (e.g., LaTeX) to enter expressions. This is unfortunate for people who are not math experts, as this can lead to an intention gap between the math query they wish to express, and what the interface will allow. min is a search interface that supports drawing expressions on a canvas using a mouse/touch, keyboard and images. We designed a user study to examine how the multimodal interface of min changes search behavior for mathematical non-experts, and discover real-world usage scenarios. Participants demonstrated increased use of math expressions in queries when using min. There was little difference in task success reported by participants using min vs. text-based search, but the majority of participants appreciated the multimodal input, and identified real-world scenarios in which they would like to use systems like min.

    Content-Based Music Genre Classification Using Sparse Approximation Techniques

    Presented at the Drexel IEEE Graduate Forum’s Fifth Annual Research Symposium. In this study we evaluated the performance of genre classification systems using various feature vectors and learning methods. Using a fixed classifier, i.e., Gaussian mixture models, we were able to create a suboptimal feature vector to characterize the audio signals in a low-dimensional feature space. We then utilized this modified feature representation to solve the problem of music genre classification. We evaluated the performance of the recent sparsity-eager support vector machines classifier using the proposed feature vector and compared the results to the classic support vector machines and Gaussian mixture models as the baseline classifiers.

    Recommender Systems based on Linked Data

    Background: The increase in the amount of structured data published using the principles of Linked Data means that it is now more likely to find resources in the Web of Data that describe real-life concepts. However, discovering resources related to any given resource is still an open research area. This thesis studies Recommender Systems (RS) that use Linked Data as a source for generating recommendations, exploiting the large amount of available resources and the relationships among them. Aims: The main objective of this study was to propose a recommendation technique for resources considering semantic relationships between concepts from Linked Data. The specific objectives were: (i) Define semantic relationships derived from resources, taking into account the knowledge found in Linked Data datasets. (ii) Determine semantic similarity measures based on the semantic relationships derived from resources. (iii) Propose an algorithm to dynamically generate automatic rankings of resources according to defined similarity measures. Methodology: It was based on the recommendations of the Project Management Institute and the Integral Model for Engineering Professionals (Universidad del Cauca), the first for managing the project and the second for developing the experimental prototype. Accordingly, the main phases were: (i) Conceptual base generation, for identifying the main problems, objectives and the project scope. A Systematic Literature Review was conducted for this phase, which highlighted the relationships and similarity measures among resources in Linked Data, and the main issues, features, and types of RS based on Linked Data. (ii) Solution development, for designing and developing the experimental prototype for testing the algorithms studied in this thesis. Results: The main results obtained were: (i) The first Systematic Literature Review on RS based on Linked Data. (ii) A framework to execute and analyze recommendation algorithms based on Linked Data. (iii) A dynamic algorithm for resource recommendation based on the knowledge of Linked Data relationships. (iv) A comparative study of algorithms for RS based on Linked Data. (v) Two implementations of the proposed framework, one with graph-based algorithms and the other with machine learning algorithms. (vi) The application of the framework to various scenarios to demonstrate its feasibility within the context of real applications. Conclusions: (i) The proposed framework proved useful for developing and evaluating different configurations of algorithms to create novel RS based on Linked Data suited to users’ requirements, applications, domains and contexts. (ii) The layered architecture of the proposed framework also supports the reproducibility of the results for the research community. (iii) Linked Data based RS are useful for presenting explanations of the recommendations, because of the graph structure of the datasets. (iv) Graph-based algorithms take advantage of intrinsic relationships among resources from Linked Data. Nevertheless, their execution time is still an open issue. Machine learning algorithms are also suitable: they provide functions for dealing with large amounts of data, so they can help to improve the performance (execution time) of the RS. However, most of them need a training phase that requires knowing the application domain a priori in order to obtain reliable results. (v) A logical evolution of RS based on Linked Data is the combination of graph-based and machine learning algorithms to obtain accurate results while keeping execution times low. However, research and experimentation are still needed to explore more techniques from the vast number of machine learning algorithms and determine the most suitable ones for dealing with Linked Data.
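    The idea of ranking resources by a similarity measure derived from their links can be sketched as follows. This is only an illustration under stated assumptions: it uses Jaccard similarity over each resource's set of linked neighbors, which is a generic graph-based measure and not necessarily the one proposed in the thesis. The resource names, link sets, and the `recommend` helper are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, links, top_k=2):
    """Rank other resources by how many linked neighbors they share with the target."""
    scored = [
        (jaccard(links[target], neighbors), name)
        for name, neighbors in links.items()
        if name != target
    ]
    # Highest score first; ties broken alphabetically for a stable result.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


# Hypothetical toy graph: each resource maps to the set of resources it links to.
links = {
    "album_a": {"genre_jazz", "instrument_trumpet", "label_x"},
    "album_b": {"genre_jazz", "instrument_saxophone", "label_y"},
    "album_c": {"genre_jazz", "instrument_trumpet", "label_z"},
    "album_d": {"genre_metal", "instrument_guitar"},
}

recs = recommend("album_a", links)
print(recs)
```

    Because the score comes directly from shared links, each recommendation can be explained by listing the neighbors the two resources have in common, which matches the abstract's point about the graph structure supporting explanations.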