27 research outputs found

    Audio Event Detection using Weakly Labeled Data

    Acoustic event detection is essential for content analysis and description of multimedia recordings. The majority of current literature on the topic learns the detectors through fully-supervised techniques employing strongly labeled data. However, the labels available for the majority of multimedia data are generally weak and do not provide sufficient detail for such methods to be employed. In this paper we propose a framework for learning acoustic event detectors using only weakly labeled data. We first show that audio event detection using weak labels can be formulated as a Multiple Instance Learning problem. We then suggest two frameworks for solving multiple-instance learning, one based on support vector machines, and the other on neural networks. The proposed methods can help in removing the time-consuming and expensive process of manually annotating data to facilitate fully supervised learning. Moreover, they can not only detect events in a recording but also provide the temporal locations of those events. This helps in obtaining a complete description of the recording and is notable since temporal information was never known in the first place in weakly labeled data. Comment: ACM Multimedia 201
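    The multiple-instance view described in this abstract can be illustrated with a minimal sketch. It assumes a recording (the "bag") has already been split into fixed-length segments (the "instances") and that some classifier has assigned each segment a score in [0, 1]. The function names, the max-pooling rule, and the 0.5 threshold are illustrative assumptions, not details taken from the paper.

```python
def bag_score(instance_scores):
    """Bag-level score under the standard MIL assumption:
    a recording contains the event if at least one segment does,
    so the bag score is the maximum instance score."""
    return max(instance_scores)


def localize(instance_scores, segment_seconds, threshold=0.5):
    """Return (start_s, end_s) for each maximal run of consecutive
    segments whose score is at least `threshold`."""
    events = []
    run_start = None
    for i, score in enumerate(instance_scores):
        if score >= threshold and run_start is None:
            run_start = i  # a new run of positive segments begins
        elif score < threshold and run_start is not None:
            events.append((run_start * segment_seconds, i * segment_seconds))
            run_start = None
    if run_start is not None:  # run extends to the end of the recording
        events.append((run_start * segment_seconds,
                       len(instance_scores) * segment_seconds))
    return events


scores = [0.1, 0.2, 0.9, 0.8, 0.1]   # hypothetical per-segment scores
recording_score = bag_score(scores)
event_times = localize(scores, segment_seconds=1.0)
```

    Here the weak label only needs to agree with `recording_score`, while `event_times` shows how temporal locations can fall out of the instance scores even though they were never annotated.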

    The design and evaluation of EKE, a semi-automated email knowledge extraction tool

    This is a post-peer-review, pre-copyedit version of an article published in Knowledge Management Research and Practice. The definitive publisher-authenticated version, TEDMORI, S. and JACKSON, T., 2012. The design and evaluation of EKE, a semi-automated email knowledge extraction tool. Knowledge Management Research and Practice, 10 (1), pp. 79 - 88 is available online at: http://dx.doi.org/10.1057/kmrp.2011.40 This paper presents an approach to locating experts within organisations through the use of the indispensable communication medium and source of information, email. The approach was realised through the email expert locator architecture developed by the authors, which uses email content in the modelling of individuals' expertise profiles. The approach has been applied to a real-world application, EKE, and evaluated using focus group sessions and system trials. In this work, the authors report the findings obtained from the focus group sessions. The aim of the sessions was to obtain information about the participants' perceptions, opinions, underlying attitudes, and recommendations with regard to the notion of exploiting email content for expertise profiling. The paper provides a review of the various approaches to expertise location that have been developed and highlights the end-users' perspectives on the usability and functionality of EKE and the socio-ethical challenges raised by its adoption from an industrial perspective. © 2012 Operational Research Society. All rights reserved

    Recognition of isolated musical patterns using Hidden Markov models

    This paper presents an efficient method for recognizing isolated musical patterns in a monophonic environment, using Discrete Observation Hidden Markov Models. Each musical pattern is converted into a sequence of music intervals by means of a fundamental frequency tracking algorithm followed by a quantizer. The resulting sequence of music intervals is presented to the input of a set of Discrete Observation Hidden Markov Models, each of which has been trained to recognize a specific type of musical pattern. Our methodology has been tested in the context of Greek Traditional Music, which exhibits certain characteristics that make the classification task harder than in the Western musical tradition. A recognition rate higher than 95% was achieved. To our knowledge, it is the first time that the problem of isolated musical pattern recognition has been treated using Hidden Markov Models. © 2002 Springer-Verlag Berlin Heidelberg
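    The pipeline outlined in this abstract (fundamental frequency track, quantized intervals, one discrete HMM per pattern type) can be sketched roughly as follows. The semitone quantizer, the probability floor for unseen symbols, and the toy one-state models are assumptions made for illustration; the paper's actual quantizer and model topology are not given here.

```python
import math


def quantize_intervals(f0_hz):
    """Map successive fundamental-frequency estimates (Hz) to
    integer semitone intervals (an assumed quantizer)."""
    return [round(12 * math.log2(b / a)) for a, b in zip(f0_hz, f0_hz[1:])]


def forward_log_likelihood(obs, start, trans, emit, floor=1e-6):
    """Scaled forward algorithm for a discrete-observation HMM.
    start[i], trans[i][j] are probabilities; emit[i] maps symbol -> probability.
    Symbols missing from emit[i] receive the small probability `floor`."""
    n = len(start)
    alpha = [start[i] * emit[i].get(obs[0], floor) for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n))
                * emit[j].get(obs[t], floor)
                for j in range(n)
            ]
        scale = sum(alpha)            # rescale to avoid numerical underflow
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def classify(obs, models):
    """Pick the model name with the highest log-likelihood for `obs`."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))


# Hypothetical one-state models: (start, trans, emit)
models = {
    "ascending": ([1.0], [[1.0]], [{2: 0.8, -2: 0.2}]),
    "descending": ([1.0], [[1.0]], [{2: 0.2, -2: 0.8}]),
}
intervals = quantize_intervals([440.0, 493.88, 440.0, 392.0])
label = classify(intervals, models)
```

    In a real system each model would be trained on examples of its pattern type; the hand-set probabilities above only show how the scoring and the final decision fit together.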

    Self-similarity analysis applied on tempo induction from music recordings

    This paper presents a self-similarity analysis approach to tempo induction, assuming that tempo remains approximately constant throughout the music recording. The proposed method is based on the observation that rhythmic characteristics of the music signal manifest themselves as inherent periodicities that can be extracted by processing the diagonals of the self-similarity matrix. Such periodicities can then be processed in pairs to yield a pair of tempo candidates. The method was submitted to the MIREX 2006 Tempo Extraction contest, where it was investigated whether the returned tempi are related to perceived tempi extracted from ground truth data. In this paper results are also reported for a music corpus assembled by the authors. © 2007 Taylor & Francis
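    The idea of reading periodicities off the diagonals of a self-similarity matrix can be shown with a small sketch. It assumes one scalar feature per frame and an absolute-difference distance, and it returns a single dominant lag; the pairing of periodicities into two tempo candidates mentioned in the abstract is not reproduced. All function names and parameters are hypothetical.

```python
def diagonal_means(features, max_lag):
    """Mean distance along each diagonal of the self-similarity matrix.
    Diagonal `lag` holds the distances between frames i and i + lag."""
    n = len(features)
    means = {}
    for lag in range(1, max_lag + 1):
        dists = [abs(features[i] - features[i + lag]) for i in range(n - lag)]
        means[lag] = sum(dists) / len(dists)
    return means


def dominant_lag(features, min_lag, max_lag):
    """Lag (in frames) whose diagonal has the lowest mean distance,
    i.e. the shift at which the signal looks most like itself."""
    means = diagonal_means(features, max_lag)
    return min(range(min_lag, max_lag + 1), key=lambda k: means[k])


def lag_to_bpm(lag, frame_rate_hz):
    """Convert a period of `lag` frames to beats per minute."""
    return 60.0 * frame_rate_hz / lag


# Toy feature sequence with an accent every 4 frames (assumed 10 frames/s)
features = [1.0, 0.0, 0.0, 0.0] * 8
lag = dominant_lag(features, min_lag=2, max_lag=6)
tempo = lag_to_bpm(lag, frame_rate_hz=10.0)
```

    Restricting the search to a lag range matters because integer multiples of the true period produce equally low diagonals.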

    A speech/music discriminator of radio recordings based on dynamic programming and Bayesian networks

    This paper presents a multistage system for speech/music discrimination which is based on a three-step procedure. The first step is a computationally efficient scheme consisting of a region growing technique and operates on a 1-D feature sequence, which is extracted from the raw audio stream. This scheme is used as a preprocessing stage and yields segments with high music and speech precision at the expense of leaving certain parts of the audio recording unclassified. The unclassified parts of the audio stream are then fed as input to a more computationally demanding scheme. The latter treats speech/music discrimination of radio recordings as a probabilistic segmentation task, where the solution is obtained by means of dynamic programming. The proposed scheme seeks the sequence of segments and respective class labels (i.e., speech/music) that maximize the product of posterior class probabilities, given the data that form the segments. To this end, a Bayesian Network combiner is embedded as a posterior probability estimator. At a final stage, an algorithm that performs boundary correction is applied to remove possible errors at the boundaries of the segments (speech or music) that have been previously generated. The proposed system has been tested on radio recordings from various sources. The overall system accuracy is approximately 96%. Performance results are also reported on a musical genre basis and a comparison with existing methods is given. © 2008 IEEE
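    The dynamic programming stage described in this abstract, which searches for the segmentation and labels maximizing the product of segment posteriors, might look like the sketch below. The Bayesian Network combiner is replaced by a deliberately simple stand-in (the mean of per-frame speech probabilities), and the length limits are arbitrary; both are assumptions for illustration only.

```python
import math


def segment_posterior(frame_probs, start, end, label):
    """Stand-in posterior estimator (NOT the paper's Bayesian Network):
    mean per-frame P(speech) over frames [start, end)."""
    mean_speech = sum(frame_probs[start:end]) / (end - start)
    return mean_speech if label == "speech" else 1.0 - mean_speech


def best_segmentation(frame_probs, min_len=2, max_len=50):
    """Segments (start, end, label) maximizing the product of segment
    posteriors, found by dynamic programming over segment end points."""
    n = len(frame_probs)
    best = [-math.inf] * (n + 1)   # best[e]: best log-score covering frames [0, e)
    back = [None] * (n + 1)        # back[e]: (start, label) of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end - min_len + 1):
            if best[start] == -math.inf:
                continue
            for label in ("speech", "music"):
                p = max(segment_posterior(frame_probs, start, end, label), 1e-12)
                score = best[start] + math.log(p)  # log turns product into sum
                if score > best[end]:
                    best[end] = score
                    back[end] = (start, label)
    if back[n] is None:
        raise ValueError("no valid segmentation for these length limits")
    segments = []
    end = n
    while end > 0:                 # follow back-pointers from the last frame
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    return segments[::-1]


frame_probs = [0.9] * 4 + [0.1] * 4   # hypothetical per-frame P(speech)
segments = best_segmentation(frame_probs)
```

    Working in log-probabilities keeps the product numerically stable, and the minimum segment length prevents the search from splitting the stream into single frames.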

    An overview of speech/music discrimination techniques in the context of audio recordings

    Speech/music discrimination of audio recordings refers to the problem of segmenting an audio stream and labeling each segment as either speech or music. This chapter provides an overview of methods that have been proposed in the field during the past decade and also presents in more detail a methodology that treats the problem as a posterior probability maximization task. Given that feature extraction is of primary importance to all methods, a study of feature extraction schemes is first provided. The existing methods are then broadly classified into categories depending on the underlying design philosophy. Finally, a performance study is given by presenting the datasets and accompanying assumptions that each method has adopted. © 2008 Springer-Verlag Berlin Heidelberg

    Classification of musical patterns using variable duration hidden Markov models

    This paper presents a new extension to the variable duration hidden Markov model (HMM), capable of classifying musical patterns that have been extracted from raw audio data into a set of predefined classes. Each musical pattern is converted into a sequence of music intervals by means of a fundamental frequency tracking procedure. This sequence is subsequently presented as input to a set of variable-duration HMMs. Each one of these models has been trained to recognize patterns of a corresponding predefined class. Classification is determined based on the highest recognition probability. The new type of variable-duration hidden Markov modeling proposed in this paper results in enhanced performance because 1) it deals effectively with errors that commonly originate during the feature extraction stage, and 2) it accounts for variations due to the individual expressive performance of different instrument players. To demonstrate its effectiveness, the novel classification scheme has been employed in the context of Greek traditional music, to monophonic musical patterns of a popular instrument, the Greek traditional clarinet. Although the method is also appropriate for western-style music, Greek traditional music poses extra difficulties and makes music pattern recognition a harder task. The classification results demonstrate that the new approach outperforms previous work based on conventional HMMs. © 2006 IEEE

    MEMOIR - an open framework for enhanced navigation of distributed information.

    In large companies, whose business is critically dependent on the effectiveness of their R&D function, the provision of effective means to access and share all forms of technical information is an acute problem. It is often easier to repeat an activity than it is to determine whether work has been carried out before. In this paper we present experiences in implementing and evaluating the MEMOIR system. MEMOIR is an open framework, i.e., it is extensible and adaptable to an organization's infrastructure and applications, and it provides its user interface via standard Web browsers. It uses trails, open hypermedia link services and a set of software agents to assist users in accessing and navigating vast amounts of information in Intranet environments. Additionally, MEMOIR exploits trail data to support users in finding colleagues with similar interests. The MEMOIR system has been installed and evaluated by two end-user organizations. This paper describes the results obtained in this evaluation.