Search CORE

31,161 research outputs found

siEDM: an efficient string index and search algorithm for edit distance with moves

Author: Kuboyama Tetsuji
Nakashima Kenta
Sakamoto Hiroshi
Tabei Yasuo
Takabatake Yoshimasa
Publication date: 01/04/2016

Although several self-indexes for highly repetitive text collections exist, developing an index and search algorithm with editing operations remains a challenge. Edit distance with moves (EDM) is a string-to-string distance measure that includes substring moves in addition to ordinal editing operations to turn one string into another. Although the problem of computing EDM is intractable, it has a wide range of potential applications, especially in approximate string retrieval. Despite the importance of computing EDM, there has been no efficient method for indexing and searching large text collections based on the EDM measure. We propose the first algorithm, named string index for edit distance with moves (siEDM), for indexing and searching strings with EDM. The siEDM algorithm builds an index structure by leveraging the idea behind the edit sensitive parsing (ESP), an efficient algorithm enabling approximately computing EDM with guarantees of upper and lower bounds for the exact EDM. siEDM efficiently prunes the space for searching query strings by the proposed method, which enables fast query searches with the same guarantee as ESP. We experimentally tested the ability of siEDM to index and search strings on benchmark datasets, and we showed siEDM's efficiency.

Comment: 23 pages

arXiv.org e-Print Archive

Directory of Open Access Journals
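
For orientation on the siEDM entry above: EDM extends the usual edit operations (insert, delete, substitute) with substring moves. The sketch below computes only the classic Levenshtein distance, that is, the move-free baseline. It is not EDM, ESP, or siEDM, and the function name is an illustrative assumption rather than anything taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute); no substring moves."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

A pair such as "abcdef" and "defabc" shows why moves matter: a measure that allows substring moves can relate the two strings with a single move, whereas the baseline above has to pay for character-level edits.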

Automatic detection and extraction of artificial text in video

Author: Malobabić Jovanka
Marlow Seán
Murphy Noel
O'Connor Noel E.
Publication date: 01/04/2004

A significant challenge in large multimedia databases is the provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and thus is an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion algorithm [1][2], is enhanced using smoothing and multiple binarisation. The output text is input to an “off-the-shelf” noncommercial OCR. Detection, localisation and recognition results for a 20-minute MPEG-1 encoded television programme are presented.
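
To make the phrase "horizontal difference magnitude measure" more concrete, here is a minimal sketch under stated assumptions: the abstract does not define the measure, so the per-row sum of absolute differences between horizontally adjacent pixels is used as a stand-in. The function names and the threshold are hypothetical, and the morphological processing step is not covered.

```python
from typing import List

GrayImage = List[List[int]]  # rows of 0..255 intensities (assumed representation)


def horizontal_difference_magnitude(gray: GrayImage) -> List[int]:
    """Per-row sum of absolute differences between horizontally adjacent pixels."""
    return [
        sum(abs(row[x + 1] - row[x]) for x in range(len(row) - 1))
        for row in gray
    ]


def candidate_text_rows(gray: GrayImage, threshold: int) -> List[int]:
    """Indices of rows whose magnitude exceeds a hypothetical, hand-picked threshold."""
    magnitudes = horizontal_difference_magnitude(gray)
    return [y for y, m in enumerate(magnitudes) if m > threshold]


if __name__ == "__main__":
    frame = [
        [10, 10, 10, 10],   # flat background
        [0, 255, 0, 255],   # strong horizontal contrast, as in rendered text
        [10, 12, 10, 12],   # mild texture
    ]
    print(horizontal_difference_magnitude(frame))
    print(candidate_text_rows(frame, threshold=100))
```

Rows with many sharp horizontal transitions score high, which is the intuition behind using such a measure to localise overlaid captions.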

Super Imposed Method for Text Extraction in a Sports Video

Author: Tabish Sayyed, Dinesh Barai, Snehal Kande
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/03/2018

Video is a rich source of information: it combines a sequence of images with audio and text. Text present in video is useful for automatic annotation, structuring, mining, indexing and retrieval. In particular, mechanically added (superimposed) text in video sequences provides supplemental but important information about their contents for video indexing and retrieval. A large number of techniques have been proposed to address this problem. This paper presents a novel method for detecting video text regions containing player information and scores in sports videos. It also proposes an improved algorithm for the automatic extraction of superimposed text in sports video. First, we identify key frames using the Color Histogram technique to reduce the number of video frames. The key frames are then converted to grayscale images for efficient text detection. In sports video, superimposed text is generally displayed in the bottom part of the image, so we crop the region of the grayscale image that contains the text. We then apply the Canny edge detection algorithm to detect text edges. We ran our experiment on ESPN cricket video data and extracted the superimposed text regions. Using an OCR tool, the text region image was converted to ASCII text and the result was verified.
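
The key-frame step described in this abstract can be sketched roughly as follows. This is an illustration under assumptions, not the authors' procedure: frames are flat lists of single-channel 0..255 values, histograms are compared with an L1 distance, and the bin count, the threshold, and all function names are hypothetical. A colour version could concatenate one histogram per channel.

```python
from typing import List, Optional


def histogram(pixels: List[int], bins: int = 8) -> List[float]:
    """Normalised intensity histogram of a non-empty list of 0..255 values."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def select_key_frames(frames: List[List[int]],
                      threshold: float = 0.5,
                      bins: int = 8) -> List[int]:
    """Keep a frame when its histogram differs enough from the last kept frame."""
    key_frames: List[int] = []
    last: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        h = histogram(frame, bins)
        # L1 distance between normalised histograms lies in [0, 2].
        if last is None or sum(abs(a - b) for a, b in zip(h, last)) > threshold:
            key_frames.append(index)
            last = h
    return key_frames


if __name__ == "__main__":
    frames = [[0] * 4, [0] * 4, [255] * 4, [250] * 4]
    print(select_key_frames(frames))
```

Only frames that change the intensity distribution noticeably survive, which reduces the number of images passed on to cropping, edge detection and OCR.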

An Approximate Algorithm for Maximum Inner Product Search over Streaming Sparse Vectors

Author: Bruch Sebastian
Ingber Amir
Liberty Edo
Nardini Franco Maria
Publication date: 25/01/2023

Maximum Inner Product Search or top-k retrieval on sparse vectors is well-understood in information retrieval, with a number of mature algorithms that solve it exactly. However, all existing algorithms are tailored to text and frequency-based similarity measures. To achieve optimal memory footprint and query latency, they rely on the near stationarity of documents and on laws governing natural languages. We consider, instead, a setup in which collections are streaming -- necessitating dynamic indexing -- and where indexing and retrieval must work with arbitrarily distributed real-valued vectors. As we show, existing algorithms are no longer competitive in this setup, even against naive solutions. We investigate this gap and present a novel approximate solution, called Sinnamon, that can efficiently retrieve the top-k results for sparse real-valued vectors drawn from arbitrary distributions. Notably, Sinnamon offers levers to trade off memory consumption, latency, and accuracy, making the algorithm suitable for constrained applications and systems. We give theoretical results on the error introduced by the approximate nature of the algorithm, and present an empirical evaluation of its performance on two hardware platforms and synthetic and real-valued datasets. We conclude by laying out concrete directions for future research on this general top-k retrieval problem over sparse vectors.

arXiv.org e-Print Archive
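
As a point of reference for the Sinnamon entry above, the following sketch shows what an exact, brute-force top-k inner product search over sparse real-valued vectors could look like. It is one possible form of the kind of "naive solution" the abstract alludes to, not Sinnamon itself; the dictionary representation and all names are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

SparseVector = Dict[int, float]  # coordinate index -> non-zero value (assumed layout)


def inner_product(u: SparseVector, v: SparseVector) -> float:
    """Inner product of two sparse vectors, iterating over the shorter one."""
    if len(u) > len(v):
        u, v = v, u
    return sum(value * v.get(index, 0.0) for index, value in u.items())


def top_k(query: SparseVector,
          collection: Dict[str, SparseVector],
          k: int) -> List[Tuple[str, float]]:
    """Exact top-k by scoring every stored vector against the query."""
    scored = ((inner_product(query, vec), doc_id)
              for doc_id, vec in collection.items())
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    collection: Dict[str, SparseVector] = {}
    # Streaming insertion is a plain dictionary update in this sketch.
    collection["a"] = {0: 1.0, 2: 2.0}
    collection["b"] = {1: 3.0}
    collection["c"] = {0: -1.0, 2: 0.5}
    print(top_k({0: 2.0, 2: 1.0}, collection, k=2))
```

Every query touches every stored vector, so latency grows linearly with collection size. That cost is what approximate methods aim to reduce in exchange for some error.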

Query by Example of Speaker Audio Signals using Power Spectrum and MFCCs

Author: Doungpaisan Pafan
Mingkhwan Anirach
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/12/2017

Search engine is the popular term for an information retrieval (IR) system. Typically, a search engine is based on full-text indexing. Moving from text data to multimedia data types, such as retrieving images or sounds from large databases, makes the information retrieval process more complex. This paper introduces the use of language- and text-independent speech as input queries to a large sound database by means of a speaker identification algorithm. The method consists of two main processing steps: first, we separate vocal from non-vocal segments; then the vocal segments are used for speaker identification, enabling audio query by speaker voice. For speaker identification and audio query, we estimate the similarity between the example signal and the samples in the queried database by calculating the Euclidean distance between their Mel frequency cepstral coefficients (MFCC) and energy spectrum acoustic features. The simulations show good performance at a sustainable computational cost, with an average accuracy rate of more than 90%.

ZENODO

Institute of Advanced Engineering and Science
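
The matching step in the query-by-example entry above, Euclidean distance between acoustic feature vectors, can be illustrated with a short sketch. It assumes the MFCC and energy-spectrum features have already been extracted and flattened into fixed-length lists. Feature extraction and the vocal/non-vocal separation are not shown, and the names and toy values are hypothetical.

```python
import math
from typing import Dict, List


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query: List[float],
                       database: Dict[str, List[float]]) -> List[str]:
    """Database keys ordered from most to least similar to the query."""
    return sorted(database, key=lambda key: euclidean(query, database[key]))


if __name__ == "__main__":
    database = {
        "speaker_1": [0.0, 0.0],  # toy 2-D features; real vectors would be longer
        "speaker_2": [3.0, 4.0],
    }
    print(rank_by_similarity([2.5, 4.0], database))
```

The first key in the ranking is the nearest stored sample, which a system of this kind could report as the identified speaker.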

Document expansion for image retrieval

Author: Jones Gareth J.F.
Leveling Johannes
Min Jinming
Zhou Dong
Publication date: 01/04/2010

Successful information retrieval requires effective matching between the user's search request and the contents of relevant documents. Often the request entered by a user may not use the same topic-relevant terms as the authors of the documents. One potential approach to address problems of query-document term mismatch is document expansion: adding topically relevant indexing terms to a document, which may encourage its retrieval when it is relevant to queries that do not match its original contents well. We propose and evaluate a new document expansion method using external resources. While results of previous research have been inconclusive in determining the impact of document expansion on retrieval effectiveness, our method is shown to work effectively for text-based image retrieval of short image annotation documents. Our approach uses the Okapi query expansion algorithm as a method for document expansion. We further show that improved performance can be achieved by using a "document reduction" approach to include only the significant terms in a document in the expansion process. Our experiments on the WikipediaMM task at ImageCLEF 2008 show an increase of 16.5% in mean average precision (MAP) compared to a variation of the Okapi BM25 retrieval model. To compare document expansion with query expansion, we also test query expansion from an external resource, which leads to an improvement of 9.84% in MAP over our baseline. Our conclusion is that document expansion with document reduction, in combination with query expansion, produces the overall best retrieval results for short-length document retrieval. For this image retrieval task, we also conclude that query expansion from an external resource does not outperform the document expansion method.
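
Since the results in this abstract are reported in mean average precision (MAP), a small sketch of how that metric is commonly computed may help when reading the figures. It uses the standard textbook definition, not the ImageCLEF evaluation tooling, and the document identifiers are made up for illustration.

```python
from typing import List, Set, Tuple


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@rank taken at each rank where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Average of per-query average precision over a non-empty list of queries."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    ranked = ["d1", "d2", "d3", "d4"]
    relevant = {"d1", "d3"}
    print(average_precision(ranked, relevant))
```

In the toy ranking, the relevant items sit at ranks 1 and 3, so the score averages a precision of 1/1 and 2/3.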

Combining textual and visual information processing for interactive video retrieval: SCHEMA's participation in TRECVID 2004

Author: Doulaverakis Haralambos
Herrmann Stephan
Lehane Bart
Mezaris Vasileios
O'Connor Noel E.
Publication venue: 'University of Aden - Faculty of Economics and Administration'
Publication date: 01/11/2004

In this paper, the two different applications based on the Schema Reference System that were developed by the SCHEMA NoE for participation in the search task of TRECVID 2004 are illustrated. The first application, named “Schema-Text”, is an interactive retrieval application that employs only textual information, while the second one, named “Schema-XM”, is an extension of the former, employing algorithms and methods for combining textual, visual and higher level information. Two runs for each application were submitted, I A 2 SCHEMA-Text 3, I A 2 SCHEMA-Text 4 for Schema-Text and I A 2 SCHEMA-XM 1, I A 2 SCHEMA-XM 2 for Schema-XM. The comparison of these two applications in terms of retrieval efficiency revealed that the combination of information from different data sources can provide higher efficiency for retrieval systems. Experimental testing additionally revealed that initially performing a text-based query and subsequently proceeding with visual similarity search using one of the returned relevant keyframes as an example image is a good scheme for combining visual and textual information.
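
The combined scheme described at the end of this abstract, a text query first and then a visual similarity search seeded by a returned keyframe, might be sketched as below. Everything here is a hypothetical stand-in: the shot records, the keyword-overlap text match and the feature vectors are assumptions for illustration and do not reflect the Schema Reference System.

```python
import math
from typing import Dict, List

# Assumed record layout: each shot has transcript text and a keyframe feature vector.
Shot = Dict[str, object]


def text_search(query: str, shots: Dict[str, Shot]) -> List[str]:
    """Stage 1: shot ids whose transcript shares at least one term with the query."""
    terms = set(query.lower().split())
    return [
        shot_id for shot_id, shot in shots.items()
        if terms & set(str(shot["transcript"]).lower().split())
    ]


def visual_neighbours(example_id: str, shots: Dict[str, Shot], k: int) -> List[str]:
    """Stage 2: the k shots whose keyframe features lie closest to the example."""
    reference = shots[example_id]["features"]
    distances = [
        (math.dist(reference, shot["features"]), shot_id)
        for shot_id, shot in shots.items() if shot_id != example_id
    ]
    return [shot_id for _, shot_id in sorted(distances)[:k]]


if __name__ == "__main__":
    shots: Dict[str, Shot] = {
        "s1": {"transcript": "goal scored", "features": [0.0, 0.0]},
        "s2": {"transcript": "weather report", "features": [0.1, 0.0]},
        "s3": {"transcript": "studio interview", "features": [5.0, 5.0]},
    }
    hits = text_search("goal", shots)
    # In an interactive system the user would pick the relevant keyframe.
    print(hits, visual_neighbours(hits[0], shots, k=1))
```

The second stage can surface shots, such as "s2" in the toy data, that the text query alone would miss because their transcripts share no terms with it.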