Automated speech and audio analysis for semantic access to multimedia
The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content and, as a consequence, improve the effectiveness of conceptual access tools. This paper gives an overview of the various ways in which automatic speech and audio analysis can contribute to increased granularity of automatically extracted metadata. A number of techniques are presented, including the alignment of speech and text resources, large-vocabulary speech recognition, keyword spotting and speaker classification. The applicability of the techniques is discussed from a media-crossing perspective. Their added value and potential contribution to the content value chain are illustrated by the description of two complementary demonstrators for browsing broadcast news archives.
A Probabilistic Multimedia Retrieval Model and its Evaluation
We present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with models for visual search. The textual model is based on the language-modelling approach to text retrieval, and the visual information is modelled as a mixture of Gaussian densities. Both models have proved successful on various standard retrieval tasks. We evaluate the multimodal model on the search task of TREC's video track. We found that the disclosure of video material based on visual information only is still too difficult: even with purely visual information needs, text-based retrieval still outperforms visual approaches. The probabilistic model is useful for text, visual, and multimedia retrieval. Unfortunately, simplifying assumptions that reduce its computational complexity degrade retrieval effectiveness. Regarding the question of whether the model can effectively combine information from different modalities, we conclude that whenever both modalities yield reasonable scores, a combined run outperforms the individual runs.
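The combination the abstract describes can be sketched as a log-linear fusion of a text language-model score and a Gaussian-mixture score over visual features. This is a minimal illustration, not the paper's actual model: the smoothing weight `lam`, the fusion weight `alpha`, and the one-dimensional visual features are illustrative assumptions.

```python
import math
from collections import Counter

def lm_score(query_terms, doc_terms, collection_terms, lam=0.85):
    """Query-likelihood score for the text modality, with
    Jelinek-Mercer smoothing: P(t|d) = lam*P_ml(t|d) + (1-lam)*P(t|C)."""
    doc = Counter(doc_terms)
    coll = Counter(collection_terms)
    dlen, clen = len(doc_terms), len(collection_terms)
    score = 0.0
    for t in query_terms:
        p_doc = doc[t] / dlen if dlen else 0.0
        p_coll = coll[t] / clen if clen else 0.0
        score += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return score

def gaussian_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def visual_score(features, components):
    """Log-likelihood of visual feature values under a 1-D Gaussian mixture,
    a stand-in for the paper's mixture-of-Gaussians visual model.
    components: list of (weight, mean, variance) triples."""
    total = 0.0
    for x in features:
        mix = sum(w * math.exp(gaussian_logpdf(x, m, v)) for w, m, v in components)
        total += math.log(mix + 1e-12)
    return total

def combined_score(text_s, visual_s, alpha=0.5):
    """Log-linear combination of the two modality scores."""
    return alpha * text_s + (1 - alpha) * visual_s
```

With `alpha=1.0` the combined run degenerates to the text-only run, which matches the abstract's observation that text-based retrieval dominates when the visual side contributes little.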
Affective Music Information Retrieval
Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. In consequence, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimension model of emotion. The presented generative model, called "acoustic emotion Gaussians" (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of the aforementioned applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions of future work are also discussed.
Comment: 40 pages, 18 figures, 5 tables, author version
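As a much-reduced illustration of modeling annotation subjectivity with probability distributions, the sketch below fits a single 2-D Gaussian to the (valence, arousal) ratings that several subjects give one clip. AEG itself learns a full Gaussian mixture tied to acoustic features, which is not reproduced here; this only shows the per-clip distributional view of emotion labels.

```python
def va_gaussian(annotations):
    """Fit one 2-D Gaussian (mean vector and covariance matrix) to the
    (valence, arousal) annotations several subjects gave a single clip.
    annotations: list of (valence, arousal) pairs."""
    n = len(annotations)
    mv = sum(v for v, a in annotations) / n
    ma = sum(a for v, a in annotations) / n
    # Maximum-likelihood (biased) covariance estimates.
    cvv = sum((v - mv) ** 2 for v, a in annotations) / n
    caa = sum((a - ma) ** 2 for v, a in annotations) / n
    cva = sum((v - mv) * (a - ma) for v, a in annotations) / n
    return (mv, ma), ((cvv, cva), (cva, caa))
```

The spread of the fitted covariance quantifies how much the subjects disagree, which a single point estimate of valence and arousal would discard.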
Inference and Evaluation of the Multinomial Mixture Model for Text Clustering
In this article, we investigate the use of a probabilistic model for
unsupervised clustering in text collections. Unsupervised clustering has become
a basic module for many intelligent text processing applications, such as
information retrieval, text classification or information extraction. The model
considered in this contribution consists of a mixture of multinomial
distributions over the word counts, each component corresponding to a different
theme. We present and contrast various estimation procedures, which apply both
in supervised and unsupervised contexts. In supervised learning, this work
suggests a criterion for evaluating the posterior odds of new documents which
is more statistically sound than the "naive Bayes" approach. In an unsupervised
context, we propose measures to set up a systematic evaluation framework and
start with examining the Expectation-Maximization (EM) algorithm as the basic
tool for inference. We discuss the importance of initialization and the
influence of other features such as the smoothing strategy or the size of the
vocabulary, thereby illustrating the difficulties incurred by the high
dimensionality of the parameter space. We also propose a heuristic algorithm
based on iterative EM with vocabulary reduction to solve this problem. Using
the fact that the latent variables can be analytically integrated out, we
finally show that Gibbs sampling algorithm is tractable and compares favorably
to the basic expectation maximization approach
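A minimal sketch of EM for such a multinomial mixture, assuming a toy bag-of-words representation. The random initialization and the smoothing constant are illustrative choices, not the strategies studied in the article:

```python
import math
import random

def em_multinomial_mixture(docs, k, vocab, iters=30, seed=0):
    """EM for a mixture of k multinomials over word counts.
    docs: list of {word: count} dicts; returns (priors, per-component
    word-probability rows)."""
    rng = random.Random(seed)
    V = len(vocab)
    widx = {w: j for j, w in enumerate(vocab)}
    # Random initialization of the component word distributions.
    theta = [[rng.random() + 0.1 for _ in range(V)] for _ in range(k)]
    for row in theta:
        s = sum(row)
        for j in range(V):
            row[j] /= s
    pi = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each doc.
        resp = []
        for doc in docs:
            logp = [math.log(pi[z])
                    + sum(c * math.log(theta[z][widx[w]] + 1e-12)
                          for w, c in doc.items())
                    for z in range(k)]
            m = max(logp)
            p = [math.exp(l - m) for l in logp]
            s = sum(p)
            resp.append([x / s for x in p])
        # M-step: re-estimate priors and (smoothed) word probabilities.
        pi = [sum(r[z] for r in resp) / len(docs) for z in range(k)]
        for z in range(k):
            counts = [1e-3] * V  # small additive smoothing
            for doc, r in zip(docs, resp):
                for w, c in doc.items():
                    counts[widx[w]] += r[z] * c
            s = sum(counts)
            theta[z] = [c / s for c in counts]
    return pi, theta
```

The sensitivity to the random seed in this sketch is exactly the initialization issue the abstract highlights.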
TOS: A Text Organizing System
This paper reports research undertaken to conceptualize, design and implement a system for the automatic indexing, classification and repositing of text items, which may be any aggregates of information in the English language on a computer-readable medium, in a standard format.
The ultimate goal of the research reported here is to devise fully automatic processes that would read text items and then index, classify and reposit them for subsequent search and retrieval. Only portions of the path to this goal have been made fully automatic. These portions consist of the following automatic processes:
1. Scanning the text items and assigning candidate index terms (words or phrases) to the items.
2. Discriminating and rejecting candidate index terms determined to be ineffective in forming a classification automatically.
3. Generating a classification system and repositing the text items in accordance with this system.
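The three steps above might be sketched as follows, with a toy stoplist and document-frequency thresholds standing in as assumptions for whatever term-discrimination criteria TOS actually uses:

```python
import re
from collections import defaultdict

STOPWORDS = {"the", "a", "of", "and", "in", "to", "is"}

def candidate_terms(text):
    """Step 1: scan a text item and propose candidate index terms (words)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def effective_terms(term_lists, lo=2, hi=None):
    """Step 2: reject candidates judged ineffective for classification --
    here, terms appearing in fewer than `lo` items or in (nearly) all of them."""
    hi = hi if hi is not None else len(term_lists) - 1
    df = defaultdict(int)
    for terms in term_lists:
        for t in set(terms):
            df[t] += 1
    return {t for t, n in df.items() if lo <= n <= hi}

def classify(items):
    """Step 3: generate a classification and reposit each item under its
    retained index terms."""
    term_lists = [candidate_terms(t) for t in items]
    keep = effective_terms(term_lists)
    classes = defaultdict(list)
    for i, terms in enumerate(term_lists):
        for t in set(terms) & keep:
            classes[t].append(i)
    return dict(classes)
```

Terms appearing in a single item or in every item discriminate nothing, which is why step 2 filters on document frequency in this sketch.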
Rules and fuzzy rules in text: concept, extraction and usage
Several concepts and techniques have been imported from other disciplines such as
Machine Learning and Artificial Intelligence to the field of textual data. In this paper,
we focus on the concept of rule and the management of uncertainty in text applications.
The different structures considered for the construction of the rules, the extraction of the
knowledge base and the applications and usage of these rules are detailed. We include a
review of the most relevant works of the different types of rules based on their representation
and their application to most of the common tasks of Information Retrieval
such as categorization, indexing and classification
Large scale biomedical texts classification: a kNN and an ESA-based approaches
With the large and increasing volume of textual data, automated methods for identifying significant topics with which to classify textual documents have received growing interest. While many efforts have been made in this direction, it remains a real challenge. Moreover, the issue is even more complex because full texts are not always freely available; using only partial information to annotate these documents is therefore promising, but remains a very ambitious task.
Methods: We propose two classification methods: a k-nearest neighbours (kNN)-based approach and an explicit semantic analysis (ESA)-based approach. Although the kNN-based approach is widely used in text classification, it needs to be improved to perform well in this specific classification problem, which deals with partial information. Compared to existing kNN-based methods, our method uses classical Machine Learning (ML) algorithms for ranking the labels. Additional features are also investigated in order to improve the classifiers' performance, and several learning algorithms are combined with various techniques for fixing the number of relevant topics. ESA, on the other hand, seems promising for this classification task, as it has yielded interesting results in related problems, such as computing semantic relatedness between texts and text classification. Unlike existing works, which use ESA to enrich the bag-of-words approach with additional knowledge-based features, our ESA-based method builds a standalone classifier. Furthermore, we investigate whether the results of this method could be useful as a complementary feature of our kNN-based approach.
Results: Experimental evaluations performed on large standard annotated datasets, provided by the BioASQ organizers, show that the kNN-based method with the Random Forest learning algorithm achieves good performance compared with the current state-of-the-art methods, reaching a competitive f-measure of 0.55, while the ESA-based approach surprisingly yielded only modest results.
Conclusions: We have proposed simple classification methods suitable for annotating textual documents using only partial information. They are therefore adequate for large multi-label classification, particularly in the biomedical domain. Our work thus contributes to the extraction of relevant information from unstructured documents in order to facilitate their automated processing; consequently, it could be used for various purposes, including document indexing and information retrieval.
Comment: Journal of Biomedical Semantics, BioMed Central, 201
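A sketch of the kNN side of such a multi-label pipeline: the nearest training documents vote for labels with similarity weights, and the number of relevant topics is fixed by a simple cutoff. The paper instead ranks labels with ML algorithms such as Random Forest; the cosine weighting and the fixed cutoff here are illustrative stand-ins.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors (dicts)."""
    num = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return num / (na * nb) if na and nb else 0.0

def knn_label_ranking(doc, train, k=3):
    """Rank candidate labels by similarity-weighted votes from the k
    nearest training documents.
    train: list of (bag-of-words dict, label list) pairs."""
    nearest = sorted(((cosine(doc, d), labels) for d, labels in train),
                     reverse=True)[:k]
    scores = Counter()
    for sim, labels in nearest:
        for lab in labels:
            scores[lab] += sim
    return [lab for lab, _ in scores.most_common()]

def annotate(doc, train, k=3, n_topics=2):
    """Fix the number of relevant topics with a simple top-n cutoff."""
    return knn_label_ranking(doc, train, k)[:n_topics]
```

Choosing `n_topics` per document, rather than globally, is one of the "techniques for fixing the number of relevant topics" the abstract alludes to.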