30,919 research outputs found

    Utilisation de la couleur pour l'extraction de tableaux dans des images de documents (Using Color to Extract Tables from Document Images)

    Tables are complex elements that can disrupt the automatic analysis of the structure of a document image. In this article, we present a method that uses the alternation of line colors to extract colored tables whose borders are not materialized by physical rulings. Experimental results, obtained on a dataset of document images with varied layouts, validate the value of this approach. Keywords: document image analysis, table extraction, dominant color detection, image segmentation, region growing.
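    The row-color alternation cue described in this abstract can be illustrated with a small sketch. It assumes each text line has already been reduced to one quantized dominant color label, which the abstract does not detail; the function name `alternating_runs` and the `min_rows` threshold are hypothetical and are not taken from the paper.

```python
def alternating_runs(row_colors, min_rows=3):
    """Return (start, end) pairs (end exclusive) of runs of rows whose
    colors alternate between two distinct values, e.g. A, B, A, B."""
    runs = []
    n = len(row_colors)
    start = 0
    while start < n - 1:
        # A run needs two different colors on consecutive rows to begin.
        if row_colors[start] == row_colors[start + 1]:
            start += 1
            continue
        end = start + 2
        # Extend while each row repeats the color seen two rows earlier.
        while end < n and row_colors[end] == row_colors[end - 2]:
            end += 1
        if end - start >= min_rows:
            runs.append((start, end))
            start = end
        else:
            start += 1
    return runs


rows = ["white", "white", "grey", "blue", "grey", "blue", "grey", "white", "white"]
candidate_tables = alternating_runs(rows)
```

    Each returned pair marks a candidate table body; a real pipeline would still need the dominant-color detection and region-growing stages named in the keywords.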

    Colour Text Segmentation in Web Images Based on Human Perception

    There is a significant need to extract and analyse the text in images on Web documents, for effective indexing, semantic analysis and even presentation by non-visual means (e.g., audio). This paper argues that the challenging segmentation stage for such images benefits from a human perspective of colour perception in preference to RGB colour space analysis. The proposed approach enables the segmentation of text in complex situations such as in the presence of varying colour and texture (characters and background). More precisely, characters are segmented as distinct regions with separate chromaticity and/or lightness by performing a layer decomposition of the image. The method described here is a result of the authors’ systematic approach to approximating human colour perception characteristics for the identification of character regions. In this instance, the image is decomposed by performing histogram analysis of Hue and Lightness in the HLS colour space, and the resulting layers are merged using information on human discrimination of wavelength and luminance.
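    As a rough illustration of the histogram step mentioned in this abstract, the sketch below bins pixels by Hue and by Lightness after converting RGB to HLS with Python's standard `colorsys` module. The bin counts and the saturation cut-off are assumptions made for the example, and the perceptual merging stage is not reproduced.

```python
import colorsys


def hue_lightness_histograms(pixels, hue_bins=12, light_bins=8, min_sat=0.1):
    """pixels: iterable of (r, g, b) tuples in 0..255.
    Returns (hue_hist, light_hist) as lists of counts."""
    hue_hist = [0] * hue_bins
    light_hist = [0] * light_bins
    for r, g, b in pixels:
        # colorsys returns hue, lightness, saturation, each in [0, 1].
        h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        light_hist[min(int(l * light_bins), light_bins - 1)] += 1
        # Hue is unreliable for near-grey pixels, so count it only
        # when saturation exceeds the (assumed) threshold.
        if s > min_sat:
            hue_hist[min(int(h * hue_bins), hue_bins - 1)] += 1
    return hue_hist, light_hist


sample = [(255, 0, 0)] * 3 + [(0, 100, 255)] * 2 + [(255, 255, 255)]
hue_hist, light_hist = hue_lightness_histograms(sample)
```

    Peaks in the two histograms would then suggest candidate layers; how those layers are merged using human discrimination data is specific to the paper and is left out here.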

    Two Approaches for Text Segmentation in Web Images

    There is a significant need to recognise the text in images on web pages, both for effective indexing and for presentation by non-visual means (e.g., audio). This paper presents and compares two novel methods for the segmentation of characters for subsequent extraction and recognition. The novelty of both approaches is the combination of topological features of characters (different in each case) with an anthropocentric perspective of colour perception, in preference to RGB space analysis. Both approaches enable the extraction of text in complex situations such as in the presence of varying colour and texture (characters and background).

    Text Extraction from Web Images Based on A Split-and-Merge Segmentation Method Using Color Perception

    This paper describes a complete approach to the segmentation and extraction of text from Web images for subsequent recognition, to ultimately achieve both effective indexing and presentation by non-visual means (e.g., audio). The method described here (the first in the authors’ systematic approach to exploit human colour perception) enables the extraction of text in complex situations such as in the presence of varying colour (characters and background). More precisely, in addition to using structural features, the segmentation follows a split-and-merge strategy based on the Hue-Lightness-Saturation (HLS) representation of colour as a first approximation of an anthropocentric expression of the differences in chromaticity and lightness. Character-like components are then extracted as forming text lines in a number of orientations and along curves.
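    The general split-and-merge strategy named in this abstract can be sketched as follows. This is a generic quadtree version operating on a grid of lightness values only; the tolerance `tol`, the use of block means, and the function name are assumptions for illustration and do not reproduce the paper's HLS-based criteria or its structural features.

```python
def split_and_merge(grid, tol=0.1):
    """grid: list of rows of lightness values in [0, 1].
    Returns a grid of region labels of the same shape."""
    height, width = len(grid), len(grid[0])
    leaves = []  # each leaf: (x, y, w, h, mean lightness)

    def split(x, y, w, h):
        vals = [grid[j][i] for j in range(y, y + h) for i in range(x, x + w)]
        # Stop when the block is homogeneous or cannot be divided further.
        if max(vals) - min(vals) <= tol or (w == 1 and h == 1):
            leaves.append((x, y, w, h, sum(vals) / len(vals)))
            return
        hw, hh = max(w // 2, 1), max(h // 2, 1)
        for sx, sy, sw, sh in ((x, y, hw, hh), (x + hw, y, w - hw, hh),
                               (x, y + hh, hw, h - hh),
                               (x + hw, y + hh, w - hw, h - hh)):
            if sw > 0 and sh > 0:
                split(sx, sy, sw, sh)

    split(0, 0, width, height)

    parent = list(range(len(leaves)))  # union-find over leaf blocks

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def touch(a, b):
        ax, ay, aw, ah = a[:4]
        bx, by, bw, bh = b[:4]
        side = (ax + aw == bx or bx + bw == ax) and ay < by + bh and by < ay + ah
        top = (ay + ah == by or by + bh == ay) and ax < bx + bw and bx < ax + aw
        return side or top

    # Merge edge-adjacent blocks whose mean lightness is similar.
    for i in range(len(leaves)):
        for j in range(i + 1, len(leaves)):
            if touch(leaves[i], leaves[j]) and abs(leaves[i][4] - leaves[j][4]) <= tol:
                parent[find(i)] = find(j)

    labels = [[0] * width for _ in range(height)]
    for idx, (x, y, w, h, _) in enumerate(leaves):
        root = find(idx)
        for j in range(y, y + h):
            for i in range(x, x + w):
                labels[j][i] = root
    return labels


demo = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
demo_labels = split_and_merge(demo, tol=0.1)
```

    Because merging compares pairs of neighbouring blocks, a long chain of gradual changes can end up in one region; a perceptually motivated criterion such as the one the authors describe would need a stricter test.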

    Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images

    In this paper, we propose a groundtruth generation tool with a graphical user interface. An input document image is annotated on the basis of its foreground pixels: the pixels are grouped through user interaction into labeling units, and the user then assigns user-defined labels to these units. The tool outputs an image together with an XML file containing its metadata. This annotated data can then be used in different applications of document image analysis. Comment: Accepted in DAR 201