    Text Localization in Video Using Multiscale Weber's Local Descriptor

    In this paper, we propose a novel approach for detecting text in videos and scene images based on the Multiscale Weber's Local Descriptor (MWLD). Given an input video, the shots are identified and the key frames are extracted based on their spatio-temporal relationship. From each key frame, we detect local region information using WLD with different radii and neighborhood relationships of pixel values, and thereby obtain intensity-enhanced key frames at multiple scales. These multiscale WLD key frames are merged, and the horizontal gradients are then computed using morphological operations. The results are binarized, and false positives are eliminated based on geometrical properties. Finally, we employ connected component analysis and a morphological dilation operation to determine the text regions, which aids text localization. The experimental results obtained on the publicly available standard Hua, Horizontal-1 and Horizontal-2 video datasets illustrate that the proposed method can accurately detect and localize text of various sizes, fonts and colors in videos. Comment: IEEE SPICES, 201
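As a rough illustration of the descriptor the abstract builds on, the sketch below computes the Weber differential excitation at a single pixel, arctan of the summed relative intensity differences between a pixel and its neighbors. It is a minimal sketch, not the paper's implementation: the function name, the use of a square ring of neighbors at a given radius, and the small `eps` guard against division by zero are all assumptions made here for illustration.

```python
import math

def differential_excitation(image, row, col, radius=1, eps=1e-6):
    """Weber differential excitation at one pixel.

    image is a list of rows of grayscale intensities. The caller must make
    sure the (2 * radius + 1) square window around (row, col) lies inside
    the image. The ring-shaped neighborhood and eps are assumptions.
    """
    center = image[row][col]
    total = 0.0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            # Keep only the pixels on the square ring at this radius.
            if max(abs(dr), abs(dc)) != radius:
                continue
            total += image[row + dr][col + dc] - center
    # Relative change with respect to the center intensity (Weber's law).
    return math.atan(total / (center + eps))
```

Calling the function with several values of `radius` gives one response per scale, which is the sense in which a multiscale variant could be assembled; how the paper merges the scales is not specified in the abstract.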

    ICDAR2003 Page Segmentation Competition

    There is a significant need to objectively evaluate layout analysis (page segmentation and region classification) methods. This paper describes the Page Segmentation Competition (modus operandi, dataset and evaluation criteria) held in the context of ICDAR2003 and presents the results of the evaluation of the candidate methods. The main objective of the competition was to evaluate such methods using scanned documents from commonly occurring publications. The results indicate that although methods seem to be maturing, there is still a considerable need to develop robust methods that deal with everyday documents.

    Preprocessing Techniques in Character Recognition

    Unsupervised ensemble of experts (EoE) framework for automatic binarization of document images

    In recent years, a large number of binarization methods have been developed, with varying performance, generalization and strength against different benchmarks. In this work, to leverage these methods, an ensemble of experts (EoE) framework is introduced to efficiently combine the outputs of various methods. The proposed framework offers a new selection process for the binarization methods, which are the experts in the ensemble, by introducing three concepts: confidentness, endorsement and schools of experts. The framework, which is highly objective, is built on two general principles: (i) consolidation of saturated opinions and (ii) identification of schools of experts. After building the endorsement graph of the ensemble for an input document image based on the confidentness of the experts, the saturated opinions are consolidated, and then the schools of experts are identified by thresholding the consolidated endorsement graph. A variation of the framework, in which no selection is made, is also introduced; it combines the outputs of all experts using endorsement-dependent weights. The EoE framework is evaluated on the set of participating methods in the H-DIBCO'12 contest and also on an ensemble generated from various instances of the grid-based Sauvola method, with promising performance. Comment: 6-page version, Accepted to be presented in ICDAR'1
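For readers unfamiliar with the Sauvola method mentioned above, the following is a minimal sketch of its threshold rule applied to one window (or grid cell) of pixels: T = m * (1 + k * (s / R - 1)), with m the local mean and s the local standard deviation. The function names and the flat-list input are assumptions for illustration; k = 0.5 and R = 128 are the commonly quoted defaults for 8-bit images, not values taken from this abstract. The sketch does not attempt the endorsement graph or expert selection described in the paper.

```python
from statistics import mean, pstdev

def sauvola_threshold(pixels, k=0.5, dynamic_range=128.0):
    """Sauvola threshold for one window of grayscale pixels (flat list)."""
    m = mean(pixels)
    s = pstdev(pixels)  # population standard deviation of the window
    return m * (1.0 + k * (s / dynamic_range - 1.0))

def binarize(pixels, k=0.5, dynamic_range=128.0):
    """Label each pixel 1 (background) if above the threshold, else 0 (ink)."""
    t = sauvola_threshold(pixels, k, dynamic_range)
    return [1 if p > t else 0 for p in pixels]
```

Varying `k` or the window size yields different instances of the method, which is one plausible way an ensemble of Sauvola variants could be generated.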

    Statistics Oriented Preprocessing of Document Image

    Old printed documents represent an important part of our cultural heritage. Their digitization plays an important role in creating data and metadata. This paper proposes an algorithm for estimating the global text skew. First, the document image is binarized, reducing the impact of noise and uneven illumination. The binary image is then statistically analyzed and processed so that redundant data are excluded. Next, convex hulls are established around each text object and joined to form connected components. The connected components in the complementary image are then enlarged with morphological dilation. Finally, the biggest connected component is extracted. Its orientation, which is calculated from its moments, is similar to the global orientation of the text document. The efficiency and correctness of the algorithm are verified by testing on a custom dataset.
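The final step, reading an orientation off the moments of a component, can be sketched with the standard second-order central moment formula, theta = 0.5 * atan2(2 * mu11, mu20 - mu02). This is a minimal sketch under stated assumptions: the function name and the representation of the component as a list of (x, y) pixel coordinates are hypothetical, and the abstract does not say which moment formula the authors use.

```python
import math

def orientation_from_moments(points):
    """Orientation (radians) of a point set from second-order central moments.

    points is a non-empty list of (x, y) coordinates of foreground pixels.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n  # centroid x
    cy = sum(y for _, y in points) / n  # centroid y
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
```

In image coordinates, where y grows downward, the sign of the returned angle is flipped relative to the usual mathematical convention.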

    Automated Algorithm for the Identification of Artifacts in Mottled and Noisy Images

    We describe a method for automatically classifying image-quality defects on printed documents. The proposed approach accepts a scanned image in which the defect has been localized a priori and performs several image processing steps to reveal the region of interest. A mask is then created from the exposed region to identify bright outliers. Morphological reconstruction techniques are then applied to emphasize relevant local attributes. The defects are classified by a customized tree classifier that uses size or shape attributes at its nodes to make binary decisions. Applications of this process include timely automated or assisted diagnosis and repair of printers and copiers in the field. The proposed technique was tested on a database of 276 images of synthetic and real-life defects, with 94.95% accuracy.