A new visual signature for content-based indexing of low resolution documents
This paper proposes a new visual signature for content-based indexing of low resolution documents. Camera-Based Document Analysis and Recognition (CBDAR) is an established field that deals with textual information in scene images taken by low-cost hand-held devices such as digital cameras and cell phones. Many applications, such as text translation, reading text for visually impaired and blind persons, information retrieval from media documents, and e-learning, can be built using techniques developed in the CBDAR domain. The proposed approach to extracting textual information is composed of three steps: image segmentation, text localization and extraction, and Optical Character Recognition (OCR). First, as pre-processing, the resolution of each image is checked and the image is re-sampled to a common resolution format (720 × 540). The resulting image is then converted to grayscale and binarized using Otsu's segmentation method for further processing. In addition, the mean horizontal run length of both black and white pixels is inspected to verify proper segmentation of foreground objects. In the post-processing step, a text localizer validates the candidate text regions proposed by the text detector; we employ a connected component approach for text localization. The extracted text is then recognized using ABBYY FineReader for OCR. Beyond OCR, we create novel feature vectors from the textual information for Content-Based Image Retrieval (CBIR).
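The binarization and segmentation-check steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an 8-bit grayscale image given as a flat list of pixel values, and both helper names (`otsu_threshold`, `mean_run_length`) are invented for the example.

```python
def otsu_threshold(gray):
    """Otsu's method: pick the threshold that maximizes
    between-class variance over an 8-bit histogram."""
    hist = [0] * 256
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_b = 0.0          # cumulative intensity mass below the threshold
    w_b = 0              # cumulative pixel count below the threshold
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_b += hist[t]
        if w_b == 0:
            continue
        w_f = total - w_b
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                  # background mean
        m_f = (sum_all - sum_b) / w_f      # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def mean_run_length(row, value):
    """Mean length of consecutive runs of `value` in one binarized row,
    usable for the paper's horizontal run-length segmentation check."""
    runs, current = [], 0
    for v in row:
        if v == value:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return sum(runs) / len(runs) if runs else 0.0
```

A segmentation check along these lines would compare the mean run lengths of black and white pixels per row against expected ranges for text strokes and inter-character gaps.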
Semantic Retrieval and Automatic Annotation: Linear Transformations, Correlation and Semantic Spaces
This paper proposes a new technique for auto-annotation and semantic retrieval based upon the idea of linearly mapping an image feature space to a keyword space. The new technique is compared to several related techniques, and a number of salient points about each of the techniques are discussed and contrasted. The paper also discusses how these techniques might actually scale to a real-world retrieval problem, and demonstrates this through a case study of a semantic retrieval technique being used on a real-world dataset (with a mix of annotated and unannotated images) from a picture library.
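The core idea of a linear map from feature space to keyword space can be sketched with least squares. This is a toy illustration under assumptions not taken from the paper: the data is synthetic and noise-free, and the names (`X`, `Y`, `W`) are hypothetical.

```python
import numpy as np

# Hypothetical toy data: rows are images.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))       # 50 training images in an 8-dim feature space
true_W = rng.normal(size=(8, 4))   # unknown mapping into a 4-keyword space
Y = X @ true_W                     # keyword-space coordinates (noise-free toy)

# Learn the linear map by least squares: W = argmin ||X W - Y||^2
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Auto-annotate an unseen image: project its features into keyword
# space and take the strongest keyword dimension.
x_new = rng.normal(size=(1, 8))
scores = x_new @ W
predicted_keyword = int(scores.argmax())
```

In a realistic setting `Y` would be built from the annotated portion of the collection, and the same projection would serve both annotation (features to keywords) and semantic retrieval (query keywords matched against projected features).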
Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services
Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for these services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology that addresses these problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of the aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies, as well as scenarios for the individual and integrative use of the services, to balance theory and practice. This methodology implies that other advanced services can be continuously integrated into our extended framework as they are identified. The theoretical definitions and case studies we present may inform future development efforts by a wide range of digital library researchers, designers, and developers.
Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval
This paper presents a new state-of-the-art for document image classification
and retrieval, using features learned by deep convolutional neural networks
(CNNs). In object and scene analysis, deep neural nets are capable of learning
a hierarchical chain of abstraction from pixel inputs to concise and
descriptive representations. The current work explores this capacity in the
realm of document analysis, and confirms that this representation strategy is
superior to a variety of popular hand-crafted alternatives. Experiments also
show that (i) features extracted from CNNs are robust to compression, (ii) CNNs
trained on non-document images transfer well to document analysis tasks, and
(iii) enforcing region-specific feature-learning is unnecessary given
sufficient training data. This work also makes available a new labelled subset
of the IIT-CDIP collection, containing 400,000 document images across 16
categories, useful for training new CNNs for document analysis.
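The retrieval side of such a system reduces to nearest-neighbour search over the learned representations. The sketch below assumes the CNN feature vectors have already been extracted (the arrays are placeholders), and uses cosine similarity as one common choice of metric; neither detail is prescribed by the paper.

```python
import numpy as np

def retrieve(query_feat, index_feats, k=3):
    """Rank indexed documents by cosine similarity between the query's
    feature vector and each document's (e.g. CNN-derived) feature vector."""
    q = query_feat / np.linalg.norm(query_feat)
    F = index_feats / np.linalg.norm(index_feats, axis=1, keepdims=True)
    sims = F @ q                     # cosine similarity per indexed document
    return np.argsort(-sims)[:k]     # indices of the top-k most similar docs
```

With features taken from a CNN's penultimate layer, as the paper evaluates, the same routine serves both classification-by-neighbour-vote and image retrieval.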