2,300 research outputs found
Text Localization in Video Using Multiscale Weber's Local Descriptor
In this paper, we propose a novel approach for detecting the text present in
videos and scene images based on the Multiscale Weber's Local Descriptor
(MWLD). Given an input video, the shots are identified and the key frames are
extracted based on their spatio-temporal relationship. From each key frame, we
detect the local region information using WLD with different radius and
neighborhood relationship of pixel values and hence obtained intensity enhanced
key frames at multiple scales. These multiscale WLD key frames are merged
together and then the horizontal gradients are computed using morphological
operations. The obtained results are then binarized and the false positives are
eliminated based on geometrical properties. Finally, we employ connected
component analysis and morphological dilation operation to determine the text
regions that aids in text localization. The experimental results obtained on
publicly available standard Hua, Horizontal-1 and Horizontal-2 video dataset
illustrate that the proposed method can accurately detect and localize texts of
various sizes, fonts and colors in videos.Comment: IEEE SPICES, 201
ICDAR2003 Page Segmentation Competition
There is a significant need to objectively evaluate layout analysis (page segmentation and region classification) methods. This paper describes the Page Segmentation Competition (modus operandi, dataset and evaluation criteria) held in the context of ICDAR2003 and presents the results of the evaluation of the candidate methods. The main objective of the competition was to evaluate such methods using scanned documents from commonly-occurring publications. The results indicate that although methods seem to be maturing, there is still a considerable need to develop robust methods that deal with everyday documents
Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images
In recent years, a large number of binarization methods have been developed,
with varying performance generalization and strength against different
benchmarks. In this work, to leverage on these methods, an ensemble of experts
(EoE) framework is introduced, to efficiently combine the outputs of various
methods. The proposed framework offers a new selection process of the
binarization methods, which are actually the experts in the ensemble, by
introducing three concepts: confidentness, endorsement and schools of experts.
The framework, which is highly objective, is built based on two general
principles: (i) consolidation of saturated opinions and (ii) identification of
schools of experts. After building the endorsement graph of the ensemble for an
input document image based on the confidentness of the experts, the saturated
opinions are consolidated, and then the schools of experts are identified by
thresholding the consolidated endorsement graph. A variation of the framework,
in which no selection is made, is also introduced that combines the outputs of
all experts using endorsement-dependent weights. The EoE framework is evaluated
on the set of participating methods in the H-DIBCO'12 contest and also on an
ensemble generated from various instances of grid-based Sauvola method with
promising performance.Comment: 6-page version, Accepted to be presented in ICDAR'1
Statistics Oriented Preprocessing of Document Image
Old printed documents represent an important part of our cultural heritage. Their digitalization plays an important role in creating data and metadata. The paper proposed an algorithm for estimation of the global text skew. First, document image is binarized reducing the impact of noise and uneven illumination. The binary image is statistically analyzed and processed. Accordingly, redundant data have been excluded. Furthermore, the convex hulls are established encircling each text object. They are joined establishing connected components. Then, the connected components in complementary image are enlarged with morphological dilation. At the end, the biggest connected component is extracted. Its orientation is similar to the global orientation of text document which is calculated by the moments. Efficiency and correctness of the algorithm are verified by testing on a custom dataset
Automated Algorithm for the Identification of Artifacts in Mottled and Noisy Images
We describe a method for automatically classifying image-quality defects on printed documents. The proposed approach accepts a scanned image where the defect has been localized a priori and performs several appropriate image processing steps to reveal the region of interest. A mask is then created from the exposed region to identify bright outliers. Morphological reconstruction techniques are then applied to emphasize relevant local attributes. The classification of the defects is accomplished via a customized tree classifier that utilizes size or shape attributes at corresponding nodes to yield appropriate binary decisions. Applications of this process include automated/assisted diagnosis and repair of printers/copiers in the field in a timely fashion. The proposed technique was tested on a database of 276 images of synthetic and real-life defects with 94.95% accuracy
- …