siEDM: an efficient string index and search algorithm for edit distance with moves
Although several self-indexes for highly repetitive text collections exist,
developing an index and search algorithm with editing operations remains a
challenge. Edit distance with moves (EDM) is a string-to-string distance
measure that includes substring moves in addition to ordinal editing operations
to turn one string into another. Although the problem of computing EDM is
intractable, it has a wide range of potential applications, especially in
approximate string retrieval. Despite the importance of computing EDM, there
has been no efficient method for indexing and searching large text collections
based on the EDM measure. We propose the first algorithm, named string index
for edit distance with moves (siEDM), for indexing and searching strings with
EDM. The siEDM algorithm builds an index structure by leveraging the idea
behind the edit sensitive parsing (ESP), an efficient algorithm enabling
approximately computing EDM with guarantees of upper and lower bounds for the
exact EDM. siEDM efficiently prunes the space for searching query strings by
the proposed method, which enables fast query searches with the same guarantee
as ESP. We experimentally tested the ability of siEDM to index and search
strings on benchmark datasets, and we showed siEDM's efficiency. Comment: 23 page
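For orientation, the ordinary editing operations mentioned in the abstract (insertion, deletion, substitution) can be illustrated with the classic dynamic-programming edit distance. This is only a baseline sketch: it is not EDM and not the siEDM or ESP algorithm, and the function name `levenshtein` is chosen here for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions only.

    Substring moves are NOT supported, so this is a baseline, not EDM.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

Under this measure, turning "abcdef" into "defabc" costs several character edits, whereas a distance that allows substring moves could account for it as a single move of "abc".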
Automatic detection and extraction of artificial text in video
A significant challenge in large multimedia databases is the provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and thus is an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion algorithm [1][2], is enhanced using smoothing and multiple binarisation. The output text is input to an "off-the-shelf" noncommercial OCR. Detection, localisation and recognition results for a 20-minute MPEG-1 encoded television programme are presented.
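The abstract does not define its horizontal difference magnitude measure, so the following is only one plausible reading, offered as a sketch: for each row of a gray-scale frame, sum the absolute differences between horizontally adjacent pixels, and treat rows with a high sum as text candidates. The function names, the list-of-rows image representation and the threshold are all assumptions made for illustration; the morphological processing step is not shown.

```python
def horizontal_difference_magnitude(image):
    """Per row, sum the absolute differences between horizontal neighbours.

    `image` is assumed to be a list of rows of gray-scale intensities.
    """
    return [
        sum(abs(row[x + 1] - row[x]) for x in range(len(row) - 1))
        for row in image
    ]


def candidate_text_rows(image, threshold):
    """Return indices of rows whose magnitude reaches the (assumed) threshold."""
    magnitudes = horizontal_difference_magnitude(image)
    return [y for y, m in enumerate(magnitudes) if m >= threshold]
```

Rows crossing overlaid characters tend to alternate sharply between dark and bright pixels, which is why a measure of this kind responds strongly to them.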
Super Imposed Method for Text Extraction in a Sports Video
Video is an important medium for presenting information. It comprises a sequence of images together with audio and text. Text present in video carries useful information for automatic annotation, structuring, mining, indexing and retrieval. In particular, mechanically added (superimposed) text in video sequences provides supplemental but important information about their contents for video indexing and retrieval. A large number of techniques have been proposed to address this problem. This paper presents a novel method for detecting video text regions containing player information and scores in sports videos, and proposes an improved algorithm for the automatic extraction of superimposed text in sports video. First, we identify key frames using the Color Histogram technique to reduce the number of video frames to process. The key frames are then converted to gray-scale images for efficient text detection. Because superimposed text is generally displayed in the bottom part of the image in sports video, we crop the region of the gray-scale image that contains the text. We then apply the Canny edge detection algorithm to detect text edges. We ran our experiment on ESPN cricket video data and extracted the superimposed text regions. Using an OCR tool, each text region image was converted to ASCII text and the result was verified.
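The key-frame step can be sketched as follows, assuming each frame is a flat list of (r, g, b) pixel tuples and that a frame becomes a key frame when its color histogram differs enough from the last key frame. The bin count, the normalised L1 difference and the threshold are illustrative assumptions, not the paper's settings; the cropping, Canny and OCR steps are not shown.

```python
def color_histogram(frame, bins=4):
    """Concatenated per-channel histogram of a frame of (r, g, b) tuples."""
    hist = [0] * (3 * bins)
    width = 256 // bins
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1
    return hist


def select_key_frames(frames, threshold):
    """Return indices of frames whose histogram differs from the last key frame.

    The difference is the L1 distance divided by its maximum (6 * pixel count),
    so it lies in [0, 1] when all frames have the same size.
    """
    if not frames:
        return []
    keys = [0]
    last = color_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = color_histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(hist, last)) / (6 * len(frames[i]))
        if diff >= threshold:
            keys.append(i)
            last = hist
    return keys
```

Comparing against the last key frame, rather than the immediately preceding frame, keeps slow gradual changes from being missed entirely.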
An Approximate Algorithm for Maximum Inner Product Search over Streaming Sparse Vectors
Maximum Inner Product Search or top-k retrieval on sparse vectors is
well-understood in information retrieval, with a number of mature algorithms
that solve it exactly. However, all existing algorithms are tailored to text
and frequency-based similarity measures. To achieve optimal memory footprint
and query latency, they rely on the near stationarity of documents and on laws
governing natural languages. We consider, instead, a setup in which collections
are streaming -- necessitating dynamic indexing -- and where indexing and
retrieval must work with arbitrarily distributed real-valued vectors. As we
show, existing algorithms are no longer competitive in this setup, even against
naive solutions. We investigate this gap and present a novel approximate
solution, called Sinnamon, that can efficiently retrieve the top-k results for
sparse real valued vectors drawn from arbitrary distributions. Notably,
Sinnamon offers levers to trade-off memory consumption, latency, and accuracy,
making the algorithm suitable for constrained applications and systems. We give
theoretical results on the error introduced by the approximate nature of the
algorithm, and present an empirical evaluation of its performance on two
hardware platforms and synthetic and real-valued datasets. We conclude by
laying out concrete directions for future research on this general top-k
retrieval problem over sparse vectors.
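To make the problem setting concrete, here is a minimal exact baseline for top-k inner product retrieval over sparse real-valued vectors with dynamic insertion, built on a plain inverted index. It is not Sinnamon and makes no claim about that algorithm's internals; the class name `SparseIndex` and the dict-based vector representation are assumptions for illustration.

```python
import heapq
from collections import defaultdict


class SparseIndex:
    """Inverted index: coordinate -> {doc_id: value}. Supports streaming inserts."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def insert(self, doc_id, vector):
        """Add a sparse vector given as {coordinate: nonzero value}."""
        for coord, value in vector.items():
            self.postings[coord][doc_id] = value

    def top_k(self, query, k):
        """Exact inner products, restricted to documents that share at least
        one nonzero coordinate with the query (others score 0 and are skipped)."""
        scores = defaultdict(float)
        for coord, q_value in query.items():
            for doc_id, d_value in self.postings.get(coord, {}).items():
                scores[doc_id] += q_value * d_value
        return heapq.nlargest(k, scores.items(), key=lambda item: item[1])
```

Because values may be negative, a document with no overlap (score 0) can outrank an overlapping document with a negative score; a complete exact solution would need to account for that case.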
Query by Example of Speaker Audio Signals using Power Spectrum and MFCCs
Search engine is the popular term for an information retrieval (IR) system. Typically, a search engine is based on full-text indexing. Moving from text to multimedia data types, such as retrieving images or sounds from large databases, makes the information retrieval process more complex. This paper introduces the use of language- and text-independent speech as input queries to a large sound database by means of a speaker identification algorithm. The method consists of two main processing steps: first, we separate vocal from non-vocal segments; then, the vocal segments are used for speaker identification, enabling audio query by speaker voice. For speaker identification and audio querying, we estimate the similarity between the example signal and the samples in the queried database by calculating the Euclidean distance between their acoustic features, namely the Mel frequency cepstral coefficients (MFCC) and the energy spectrum. The simulations show good performance at a sustainable computational cost, with an average accuracy rate of more than 90%.
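The matching step can be sketched as a nearest-neighbour search under Euclidean distance. The sketch assumes each recording has already been reduced to a fixed-length feature vector (for example, averaged MFCCs concatenated with spectrum features); feature extraction itself is not shown, and the names `euclidean` and `query_by_example` are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def query_by_example(query_features, database, top_n=1):
    """Rank database entries ({name: feature vector}) by distance to the query."""
    ranked = sorted(database.items(),
                    key=lambda item: euclidean(query_features, item[1]))
    return [name for name, _ in ranked[:top_n]]
```

A smaller distance is read as higher similarity, so the first entry returned is the closest match to the example signal.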
Document expansion for image retrieval
Successful information retrieval requires effective matching
between the user's search request and the contents of relevant
documents. Often the request entered by a user may
not use the same topic-relevant terms as the authors of the
documents. One potential approach to address problems
of query-document term mismatch is document expansion
to include additional topically relevant indexing terms in a
document which may encourage its retrieval when relevant
to queries which do not match its original contents well. We
propose and evaluate a new document expansion method
using external resources. While results of previous research
have been inconclusive in determining the impact of document
expansion on retrieval effectiveness, our method is
shown to work effectively for text-based image retrieval of
short image annotation documents. Our approach uses the
Okapi query expansion algorithm as a method for document
expansion. We further show improved performance can be
achieved by using a "document reduction" approach to include
only the significant terms in a document in the expansion
process. Our experiments on the WikipediaMM task at
ImageCLEF 2008 show an increase of 16.5% in mean average
precision (MAP) compared to a variation of Okapi BM25 retrieval
model. To compare document expansion with query
expansion, we also test query expansion from an external resource
which leads to an improvement of 9.84% in MAP over
our baseline. Our conclusion is that the document expansion
with document reduction and in combination with query expansion
produces the overall best retrieval results for short-length
document retrieval. For this image retrieval task, we
also concluded that query expansion from external resource
does not outperform the document expansion method.
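As background for the baseline named in the abstract, the following sketch scores tokenised documents with a common formulation of Okapi BM25. It is not the authors' exact variant or their expansion method; the smoothed IDF form and the defaults k1 = 1.2 and b = 0.75 are conventional choices assumed here, and `documents` is assumed to be a non-empty list of token lists.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Return one BM25 score per document for the given query terms."""
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5)
                           / (doc_freq[term] + 0.5) + 1.0)
            # Length normalisation penalises longer-than-average documents.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

With very short annotation documents, most query terms have a term frequency of zero, which is the mismatch that adding expansion terms to the document is meant to reduce.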
Combining textual and visual information processing for interactive video retrieval: SCHEMA's participation in TRECVID 2004
In this paper, the two different applications based on the Schema Reference System that were developed by the SCHEMA NoE for participation in the search task of TRECVID 2004 are illustrated. The first application, named "Schema-Text", is an interactive retrieval application that employs only textual information, while the second one, named "Schema-XM", is an extension of the former, employing algorithms and methods for combining textual, visual and higher level information. Two runs for each application were submitted: I A 2 SCHEMA-Text 3 and I A 2 SCHEMA-Text 4 for Schema-Text, and I A 2 SCHEMA-XM 1 and I A 2 SCHEMA-XM 2 for Schema-XM. The comparison of these two applications in terms of retrieval efficiency revealed that combining information from different data sources can provide higher efficiency for retrieval systems. Experimental testing additionally revealed that initially performing a text-based query and subsequently proceeding with a visual similarity search, using one of the returned relevant keyframes as an example image, is a good scheme for combining visual and textual information.