43,901 research outputs found
Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images
We propose a graphical user interface based groundtruth generation tool in
this paper. Here, annotation of an input document image is done based on the
foreground pixels. Foreground pixels are grouped together with user interaction
to form labeling units. These units are then labeled by the user with the user
defined labels. The output produced by the tool is an image with an XML file
containing its metadata information. This annotated data can be further used in
different applications of document image analysis.Comment: Accepted in DAR 201
A semantic-based system for querying personal digital libraries
This is the author's accepted manuscript. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-540-28640-0_4. Copyright @ Springer 2004.The decreasing cost and the increasing availability of new technologies is enabling people to create their own digital libraries. One of the main topic in personal digital libraries is allowing people to select interesting information among all the different digital formats available today (pdf, html, tiff, etc.). Moreover the increasing availability of these on-line libraries, as well as the advent of the so called Semantic Web [1], is raising the demand for converting paper documents into digital, possibly semantically annotated, documents. These motivations drove us to design a new system which could enable the user to interact and query documents independently from the digital formats in which they are represented. In order to achieve this independence from the format we consider all the digital documents contained in a digital library as images. Our system tries to automatically detect the layout of the digital documents and recognize the geometric regions of interest. All the extracted information is then encoded with respect to a reference ontology, so that the user can query his digital library by typing free text or browsing the ontology
Semantics-Based Content Extraction in Typewritten Historical Documents
This paper presents a flexible approach to extracting content from scanned historical documents using semantic information. The final electronic document is the result of a "digital historical document lifecycle" process, where the expert knowledge of the historian/archivist user is incorporated at different stages. Results show that such a conversion strategy aided by (expert) user-specified semantic information and which enables the processing of individual parts of the document in a specialised way, produces superior (in a variety of significant ways) results than document analysis and understanding techniques devised for contemporary documents
Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification
We present an exhaustive investigation of recent Deep Learning architectures,
algorithms, and strategies for the task of document image classification to
finally reduce the error by more than half. Existing approaches, such as the
DeepDocClassifier, apply standard Convolutional Network architectures with
transfer learning from the object recognition domain. The contribution of the
paper is threefold: First, it investigates recently introduced very deep neural
network architectures (GoogLeNet, VGG, ResNet) using transfer learning (from
real images). Second, it proposes transfer learning from a huge set of document
images, i.e. 400,000 documents. Third, it analyzes the impact of the amount of
training data (document images) and other parameters to the classification
abilities. We use two datasets, the Tobacco-3482 and the large-scale RVL-CDIP
dataset. We achieve an accuracy of 91.13% for the Tobacco-3482 dataset while
earlier approaches reach only 77.6%. Thus, a relative error reduction of more
than 60% is achieved. For the large dataset RVL-CDIP, an accuracy of 90.97% is
achieved, corresponding to a relative error reduction of 11.5%
Image Enhancement with Statistical Estimation
Contrast enhancement is an important area of research for the image analysis.
Over the decade, the researcher worked on this domain to develop an efficient
and adequate algorithm. The proposed method will enhance the contrast of image
using Binarization method with the help of Maximum Likelihood Estimation (MLE).
The paper aims to enhance the image contrast of bimodal and multi-modal images.
The proposed methodology use to collect mathematical information retrieves from
the image. In this paper, we are using binarization method that generates the
desired histogram by separating image nodes. It generates the enhanced image
using histogram specification with binarization method. The proposed method has
showed an improvement in the image contrast enhancement compare with the other
image.Comment: 9 pages,6 figures; ISSN:0975-5578 (Online); 0975-5934 (Print
Applying semantic web technologies to knowledge sharing in aerospace engineering
This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy docu-ments via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale
Building multi-layer social knowledge maps with google maps API
Google Maps is an intuitive online-map service which changes people's way of navigation on Geo-maps. People can explore the maps in a multi-layer fashion in order to avoid information overloading. This paper reports an innovative approach to extend the "power" of Google Maps to adaptive learning. We have designed and implemented a navigator for multi-layer social knowledge maps, namely ProgressiveZoom, with Google Maps API. In our demonstration, the knowledge maps are built from the Interactive System Design (ISD) course at the School of Information Science, University of Pittsburgh. Students can read the textbooks and reflect their individual and social learning progress in a context of pedagogical hierarchical structure
- …