10,296 research outputs found

    Extended pipeline for content-based feature engineering in music genre recognition

    We present a feature engineering pipeline for the construction of musical signal characteristics, to be used for the design of a supervised model for musical genre identification. The key idea is to extend the traditional two-step process of extraction and classification with additive stand-alone phases which are no longer organized in a waterfall scheme. The whole system is realized by traversing backtrack arrows and cycles between various stages. In order to give a compact and effective representation of the features, the standard early temporal integration is combined with other selection and extraction phases: on the one hand, the selection of the most meaningful characteristics based on information gain, and on the other hand, the inclusion of the nonlinear correlation between this subset of features, determined by an autoencoder. The results of the experiments conducted on the GTZAN dataset reveal a noticeable contribution of this methodology towards the model's performance in the classification task. Comment: ICASSP 201
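    The information-gain selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the binarized toy values, and the helper `select_top_k` are all assumptions made for illustration, and the autoencoder stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction in `labels` after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(features, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy, made-up data (hypothetical): two binarized features for four clips.
labels = ["rock", "rock", "jazz", "jazz"]
features = {
    "high_zero_crossing": [1, 1, 0, 0],  # separates the two genres perfectly
    "loud": [1, 0, 1, 0],                # carries no genre information
}
best = select_top_k(features, labels, k=1)
```

    Continuous audio features would need to be discretized (or scored with a continuous estimator) before a calculation like this applies.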

    Context-aware, ontology-based, service discovery

    Service discovery is a process of locating, or discovering, one or more documents that describe a particular service. Most of the current service discovery approaches perform syntactic matching, that is, they retrieve service descriptions that contain particular keywords from the user’s query. This often leads to poor discovery results, because the keywords in the query can be semantically similar but syntactically different, or syntactically similar but semantically different from the terms in a service description. Another drawback of the existing service discovery mechanisms is that the query-service matching score is calculated taking into account only the keywords from the user’s query and the terms in the service descriptions. Thus, regardless of the context of the service user and the context of the service providers, the same list of results is returned in response to a particular query. This paper presents a novel approach for service discovery that uses ontologies to capture the semantics of the user’s query, of the services and of the contextual information that is considered relevant in the matching process.
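    The gap between syntactic and semantic matching described above can be shown with a small sketch. The `CONCEPTS` dictionary is a hypothetical stand-in for an ontology (a flat term-to-concept map, far simpler than what the paper proposes), and the query and description strings are invented; context-aware ranking is not covered.

```python
# Hypothetical mini-ontology: each known term maps to a shared concept label.
CONCEPTS = {
    "cab": "taxi_service",
    "taxi": "taxi_service",
    "hotel": "lodging",
    "accommodation": "lodging",
}


def syntactic_match(query, description):
    """True if the query and description share at least one literal keyword."""
    return bool(set(query.lower().split()) & set(description.lower().split()))


def to_concepts(text):
    """Map each word to its concept label, keeping unknown words unchanged."""
    return {CONCEPTS.get(word, word) for word in text.lower().split()}


def semantic_match(query, description):
    """True if the query and description share at least one concept."""
    return bool(to_concepts(query) & to_concepts(description))


# Semantically similar but syntactically different terms (made-up example).
query = "cab booking"
description = "taxi reservation service"
keyword_hit = syntactic_match(query, description)
concept_hit = semantic_match(query, description)
```

    Here the keyword comparison misses the service because "cab" and "taxi" differ as strings, while the concept comparison finds it.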

    A Semantic Similarity Method for Products and Processes

    Toyohashi University of Technology

    Enabling Embodied Analogies in Intelligent Music Systems

    The present methodology is aimed at cross-modal machine learning and uses multidisciplinary tools and methods drawn from a broad range of areas and disciplines, including music, systematic musicology, dance, motion capture, human-computer interaction, computational linguistics and audio signal processing. Main tasks include: (1) adapting wisdom-of-the-crowd approaches to embodiment in music and dance performance to create a dataset of music and music lyrics that covers a variety of emotions, (2) applying audio/language-informed machine learning techniques to that dataset to identify automatically the emotional content of the music and the lyrics, and (3) integrating motion capture data from a Vicon system and dancers performing to that music. Comment: 4 page

    Retrieving Ambiguous Sounds Using Perceptual Timbral Attributes in Audio Production Environments

    For over a decade, one of the well-identified problems within audio production environments has been the effective retrieval and management of sound libraries. Most self-recorded and commercially produced sound libraries are well structured in terms of metadata and textual descriptions, which allows traditional text-based retrieval approaches to obtain satisfactory results. However, traditional information retrieval techniques are limited in retrieving ambiguous sound collections (i.e., sounds with no identifiable origin, foley sounds, synthesized sound effects, abstract sounds) due to the difficulty of describing them textually and the complex psychoacoustic nature of the sound. Early psychoacoustical studies propose perceptual acoustical qualities as an effective way of describing this category of sounds [1]. In Music Information Retrieval (MIR) studies, this problem has mostly been studied in the context of content-based audio retrieval. However, we observed that most commercially available systems integrate neither advanced content-based sound descriptions nor the visualization and interface design approaches that have evolved in recent years. Our research had two main aims: (1) to develop an audio retrieval system incorporating high-level timbral features as search parameters, and (2) to investigate a user-centered approach to integrating these features into audio production pipelines using expert-user studies. In this project, we present a prototype which is similar to traditional sound browsers (list-based browsing), with the added functionality of filtering and ranking sounds by perceptual timbral features such as brightness, depth, roughness and hardness. Our main focus was on the retrieval process by timbral features.
Inspired by the recent focus on user-centered systems ([2], [3]) in the MIR community, we conducted in-depth interviews and a qualitative evaluation of the system with expert users in order to identify the underlying problems. Our studies revealed potential applications of high-level perceptual timbral features in audio production pipelines using a probe system and expert-user studies. We also outline future guidelines and possible improvements to the system based on the outcomes of this research.
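    The filter-and-rank behaviour this abstract describes could look roughly like the sketch below. Everything in it is an assumption for illustration: the `Sound` record, the 0-to-1 score scale, the sample library, and the `filter_and_rank` helper are hypothetical and do not reflect the prototype's actual design or how its timbral scores are computed.

```python
from dataclasses import dataclass


@dataclass
class Sound:
    """A library entry with hypothetical timbral scores normalized to 0..1."""
    name: str
    brightness: float
    depth: float
    roughness: float
    hardness: float


def filter_and_rank(sounds, rank_by, minimums=None):
    """Keep sounds meeting every minimum score, then sort descending by `rank_by`."""
    minimums = minimums or {}
    kept = [
        s for s in sounds
        if all(getattr(s, attr) >= floor for attr, floor in minimums.items())
    ]
    return sorted(kept, key=lambda s: getattr(s, rank_by), reverse=True)


# Made-up library entries for illustration only.
library = [
    Sound("metal_scrape", brightness=0.9, depth=0.2, roughness=0.8, hardness=0.7),
    Sound("sub_drone", brightness=0.1, depth=0.9, roughness=0.3, hardness=0.2),
    Sound("glass_ping", brightness=0.8, depth=0.1, roughness=0.2, hardness=0.9),
]

# Bright sounds only, roughest first.
results = filter_and_rank(library, rank_by="roughness", minimums={"brightness": 0.5})
names = [s.name for s in results]
```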

    Evidence Summary: Music Information Seeking Behaviour Poses Unique Challenges for the Design of Information Retrieval Systems

    Objective – To better understand music information seeking behaviour in a real-life situation and to create a taxonomy relating to this behaviour to facilitate better comparison of music information retrieval studies in the future. Design – Content analysis of natural language queries. Setting – Google Answers, a fee-based online service. Subjects – 1,705 queries and their related answers and comments posted in the music category of the Google Answers website before April 27, 2005. Methods – A total of 2,208 queries were retrieved from the music category on the Google Answers service. Google Answers was a fee-based service in which users posted questions and indicated what they were willing to pay to have them answered. The queries selected for this study were posted prior to April 27, 2005, over a year before the service was discontinued completely. Of the 2,208 queries taken from the site, only 1,705 were classified as relevant to the question of music information seeking by the researcher. The off-topic queries were not included in the study. Each of the 1,705 queries was coded according to the needs expressed by the user and the information provided to assist researchers in answering the question. The initial coding framework used by the researcher was informed by previous studies of music information retrieval to facilitate comparison, but was expanded and revised to reflect the evidence itself. Only the questions themselves were subjected to this iterative coding process. The answers provided by the Google Answer researchers and online comments posted by other users were examined by the author, but not coded for inclusion in the study. User needs in the questions were coded for their form and topic. Each question was assigned at least one form and one topic. 
Form refers to the type of question being asked and consisted of the following 10 categories: identification, location, verification, recommendation, evaluation, ready reference, reproduction, description, research, and other. Reproduction in this context is defined as “questions asking for text” and referred most often to questions looking for song lyrics, while evaluation typically meant the user was seeking reviews of works (p. 1029). Sixteen question topics were outlined in the coding framework. They included lyrics, translation, meaning (i.e., of lyrics), score, work, version, recording (e.g., where is an album available for purchase), related work, genre, artist, publisher, instrument, statistics, background (e.g. definitions), resource (i.e. sources of music information) and other. The questions were also coded for their features or the information provided by the user. The final coding framework outlined 57 features, some of which were further subdivided by additional attributes. For example, a feature with attributes was title. The researcher further clarified the attribute of title by indicating whether the user mentioned the title of a musical work, recording, printed material or related work in their question. More than one feature could appear in a user query. Main Results – Overall, the most common questions posted on the Google Answers service relating to music involved identifying works or artists, finding recordings, or retrieving lyrics. The most popular query forms were identification (43.8%), location (33.3%), and reproduction (10.9%). The most common topics were work (49.1%), artist (36.4%), recording (16.7%), and lyrics (10.4%). The most common features provided by users in their posted questions were person name (53%), title (50.9%), date (45.6%), genre (37.2%), role (33.8%), and lyric (27.6%). The person name usually referred to an artist’s name (in 95.6% of cases) and title most often referred to the title of a musical work. 
Another feature that appeared in 25.6% of queries was place reference, almost half of which referred to the place where the user encountered the music they were enquiring about. While the coding framework eventually encompassed 57 different features, a small number of features dominated, with seven features used in over 25% of the queries posted and 33 features appearing in less than 10%. The seven most common features were person name, title, date, genre, role, lyric, and place reference. Lee categorized most of the queries as “known-item searches,” even though at times users provided incorrect information and many were looking for information about the musical item but not the item itself (p. 1035). Other interesting features identified by the author were the presence of “dormant searches,” long standing questions a user had about a musical item, sometimes for years, which were reawakened by hearing the song again or other events (p. 1037). Multiple versions of musical works and the provision of information gleaned third hand by users were also identified as complicating factors in correctly meeting musical information needs. Conclusion – While certain types of questions dominated among music queries posted on the Google Answers service, there were a wide variety of music information needs expressed by users. In some cases, the features provided by the user as clues to answering the query were very personal, and related to the context in which they encountered the work or the mood a particular work or artist evoked. Such circumstances are not currently or adequately covered by existing bibliographic record standards, which focus on qualities inherent in the music itself. The author suggests that user context should play a greater role in the testing and development of music information retrieval systems, although the instability and variability of this type of information is acknowledged. 
In some cases this context could apply to other works (film, television, etc.) in which a musical work is featured. Another potential implication for music information retrieval system development is a need to re-evaluate the terminology employed in testing to ensure that it is the language most often employed by users. For example, the 128 different terms used in this study to describe how a musical item made the user feel did not significantly overlap with terms employed in a previous music information retrieval task involving mood classification conducted through MIREX, the Music Information Retrieval Evaluation Exchange, in 2007. The author also argues that while most current music information retrieval testing is task-specific (e.g., how can a user search for a particular work by humming a few bars, or search for a work based on its genre), in real life users come to their search with information that is not neatly parsed into separate tasks. The study affirms a need for systems that can combine tasks and/or consolidate the results of separate tasks for users.