17,612 research outputs found
Latent Semantic Analysis for Text Segmentation
This paper describes a method for linear text segmentation that is more accurate or at least as accurate as state-of-the-art methods (Utiyama and Isahara, 200
Feature extraction for document image segmentation by pLSA model
In this paper, we propose a method for document image segmentation based on the pLSA (probabilistic latent semantic analysis) model. The pLSA model was originally developed for topic discovery in text analysis using a "bag-of-words" document representation. The model is also useful for image analysis with a "bag-of-visual words" image representation. The performance of the method depends on the visual vocabulary generated by feature extraction from the document image. We compare several feature extraction and description methods, and examine their relation to segmentation performance. Through experiments, we show that accurate content-based document segmentation is made possible by the pLSA-based method. Article. The Eighth IAPR Workshop on Document Analysis Systems. conference pape
Learning Behavioural Context
The original publication is available at www.springerlink.co
TV News Story Segmentation Based on Semantic Coherence and Content Similarity
In this paper, we introduce and evaluate two novel approaches for segmenting TV news into stories, one using the video stream and the other using the closed-caption text stream. The video stream is segmented into stories by detecting anchor person shots, and the text stream is segmented into stories using a Latent Dirichlet Allocation (LDA)-based approach. The benefit of the proposed LDA-based approach is that, along with the story segmentation, it also provides the topic distribution associated with each segment. We evaluated our techniques on the TRECVid 2003 benchmark database and found that, although the individual systems give comparable results, combining the outputs of the two systems gives a significant improvement over the performance of either system alone.
On the Application of Generic Summarization Algorithms to Music
Several generic summarization algorithms were developed in the past and
successfully applied in fields such as text and speech summarization. In this
paper, we review and apply these algorithms to music. To evaluate this
summarization's performance, we adopt an extrinsic approach: we compare a Fado
Genre Classifier's performance using truncated contiguous clips against the
summaries extracted with those algorithms on 2 different datasets. We show that
Maximal Marginal Relevance (MMR), LexRank and Latent Semantic Analysis (LSA)
all improve classification performance in both datasets used for testing. Comment: 12 pages, 1 table; Submitted to IEEE Signal Processing Letter