13 research outputs found

    Illustrations Segmentation in Digitized Documents Using Local Correlation Features

    In this paper, we propose an approach to document layout analysis based on local correlation features. We identify and extract illustrations in digitized documents by learning the discriminative patterns of textual and pictorial regions. The approach has been shown to be effective on historical datasets and to outperform the state of the art on challenging documents with a large variety of pictorial elements.

    Utilisation de la couleur pour l'extraction de tableaux dans des images de documents

    Tables are complex elements that can disrupt the automatic analysis of the structure of a document image. In this article, we present a method based on the alternation of line colors to extract colored tables whose borders are not marked by physical rulings. Experimental results, obtained on a dataset of document images with varied layouts, validate the value of this approach. Keywords: document image analysis, table extraction, dominant color detection, image segmentation, region growing.
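
The idea of using alternating line colors to locate a ruling-free table can be illustrated with a minimal sketch. It is not the authors' method: it assumes the dominant color of each pixel row has already been computed upstream, and the function names, the `tol` threshold, and the `min_rows` parameter are hypothetical.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]


def color_distance(a: Color, b: Color) -> int:
    """Manhattan distance between two RGB colors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_alternating_band(
    row_colors: List[Color], tol: int = 30, min_rows: int = 4
) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest run of rows whose dominant colors
    alternate between two distinct colors, or None if no run reaches min_rows.
    `tol` and `min_rows` are illustrative values, not taken from the paper."""
    best: Optional[Tuple[int, int]] = None
    n = len(row_colors)
    start = 0
    while start < n - 1:
        end = start + 1
        # The first two rows must differ to define the two alternating colors.
        if color_distance(row_colors[start], row_colors[end]) <= tol:
            start += 1
            continue
        # Extend the run while each new row matches the row two positions above.
        while end + 1 < n and color_distance(row_colors[end + 1], row_colors[end - 1]) <= tol:
            end += 1
        length = end - start + 1
        if length >= min_rows and (best is None or length > best[1] - best[0] + 1):
            best = (start, end)
        start = end  # the last row of this run may open the next one
    return best
```

A real pipeline would still need to estimate the dominant colors and to delimit the table horizontally, for instance with the region growing mentioned in the keywords.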

    Document image and zone classification through incremental learning

    User-driven Page Layout Analysis of historical printed Books

    In this paper, based on a study of the specific characteristics of historical printed books, we first explain the main sources of error in classical page layout analysis methods. We show that bottom-up and top-down methods each provide different types of useful information that should not be ignored if we want both a generic method and good segmentation results. Next, we propose a hybrid segmentation algorithm that builds two maps: a shape map, which focuses on connected components, and a background map, which provides information about the white areas corresponding to block separations in the page. Using this first segmentation, the extracted blocks can be classified according to scenarios produced by the user. These scenarios are defined very simply during an interactive stage. The user can build processing sequences adapted to the different kinds of images likely to be encountered and to the user's needs. The proposed “user-driven approach” efficiently segments and labels the high-level concepts required by the user and achieved above 93% accuracy across the data sets tested. User feedback and experimental results demonstrate the effectiveness and usability of our framework, mainly because the extraction rules can be defined without difficulty and the parameters are not sensitive to page layout variation.
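
The two maps described above can be approximated in a few lines. The sketch below is only an illustration under simplifying assumptions, not the authors' algorithm: the page is a small binary grid, the "shape map" is reduced to bounding boxes of 4-connected components, and the "background map" is reduced to horizontal bands of fully white rows. All names and the `min_height` parameter are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Grid = List[List[int]]  # 1 = ink pixel, 0 = background
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def connected_components(page: Grid) -> List[Box]:
    """Bounding boxes of 4-connected ink regions (a crude shape map)."""
    h, w = len(page), len(page[0])
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for r in range(h):
        for c in range(w):
            if page[r][c] == 0 or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one component
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and page[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def blank_row_bands(page: Grid, min_height: int = 2) -> List[Tuple[int, int]]:
    """Runs of fully white rows at least min_height tall (a crude background map)."""
    bands: List[Tuple[int, int]] = []
    start = None
    for r, row in enumerate(page + [[1]]):  # sentinel ink row closes a trailing band
        if not any(row):
            if start is None:
                start = r
        elif start is not None:
            if r - start >= min_height:
                bands.append((start, r - 1))
            start = None
    return bands
```

Combining both outputs, for example by grouping component boxes that lie between two consecutive white bands, gives candidate blocks that a user-defined scenario could then label.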

    DOCUMENT IMAGE AND ZONE CLASSIFICATION THROUGH INCREMENTAL LEARNING

    We present an incremental learning method for document image and zone classification. We consider an industrial context where the system faces a large variability of digitized administrative documents that become available progressively over time. Each new incoming document is segmented into physical regions (zones), which are classified according to a zone-model. We represent the document by means of its classified zones and we classify the document according to a document-model. The classification relies on a reject utility in order to reject ambiguous zones or documents. Models are updated by incrementally learning each new document and its extracted zones. We validate the method on real administrative document images and we achieve a recognition rate of more than 92%.
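
To make the combination of incremental updates and rejection concrete, here is a minimal sketch. It swaps in a nearest-centroid classifier with a relative-margin reject rule, which is an assumption chosen for brevity and not the model described in the abstract; the class name, the `reject_margin` parameter, and the labels in the usage note are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


class IncrementalCentroidClassifier:
    """Nearest-centroid classifier updated one sample at a time, with rejection."""

    def __init__(self, reject_margin: float = 0.2) -> None:
        self.reject_margin = reject_margin  # illustrative value
        self.centroids: Dict[str, List[float]] = {}
        self.counts: Dict[str, int] = {}

    def learn(self, features: Sequence[float], label: str) -> None:
        """Fold one labelled sample into the running mean of its class."""
        if label not in self.centroids:
            self.centroids[label] = list(features)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        for i, x in enumerate(features):
            centroid[i] += (x - centroid[i]) / n  # incremental mean update

    def predict(self, features: Sequence[float]) -> Optional[str]:
        """Return the nearest class, or None when the decision is ambiguous."""
        if not self.centroids:
            return None
        ranked = sorted(
            (math.dist(features, c), label) for label, c in self.centroids.items()
        )
        if len(ranked) == 1:
            return ranked[0][1]
        best, second = ranked[0][0], ranked[1][0]
        # Reject when the two nearest classes are almost equally close.
        if second - best <= self.reject_margin * second:
            return None
        return ranked[0][1]
```

In use, each incoming zone or document would first go through `predict`; rejected items (a `None` result) could be routed to an operator, and once a label is confirmed it is passed to `learn` so that the model follows the data as it arrives.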

    Segmentation et classification des zones d'une page de document

    This paper proposes a methodology for segmenting complex documents based on textual content and shape. The textual content corresponds to printed text and is verified by word-level analysis using a dictionary and regular expressions adapted to noise. This makes it possible to locate expressions of interest (address, phone number, etc.). The non-textual content is segmented into zones by considering the size of connected components and the distance between them, in order to classify zones such as logos, signatures, and tables. To do so, features such as run length and bi-level co-occurrence are extracted. The classification is based on a modified boosting method and decision trees. The modification concerns the calculation of the probability of drawing training data. Compared with OCR systems, which are able to classify text, tables, and pictures, our methodology increases performance and allows the detection of other zones such as handwritten text, logos, signatures, tables, and stamps.
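
As an illustration of what a regular expression "adapted to noise" might look like, the sketch below matches a ten-digit phone number written in pairs while tolerating a few common OCR confusions. The confusion table, the number format, and the function name are assumptions made for this example and do not come from the paper.

```python
import re
from typing import List, Tuple

# Characters that OCR may output in place of digits (assumed confusion table).
OCR_DIGIT = r"[0-9OoIlSB]"
# Five pairs of digit-like characters, optionally separated by space, dot or hyphen.
PHONE = re.compile(
    rf"(?<!\w){OCR_DIGIT}{{2}}(?:[ .-]?{OCR_DIGIT}{{2}}){{4}}(?!\w)"
)
TO_DIGIT = str.maketrans(
    {"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"}
)


def find_phone_numbers(text: str) -> List[Tuple[int, str]]:
    """Return (offset, normalized digits) for each phone-like expression."""
    results = []
    for match in PHONE.finditer(text):
        digits = re.sub(r"[ .-]", "", match.group()).translate(TO_DIGIT)
        results.append((match.start(), digits))
    return results
```

Because the pattern also accepts letters, it can fire on ordinary words made only of the confusable characters; a dictionary check of the kind the abstract mentions would be one way to filter such false positives.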

    Analyse multicouche de la structure et de la forme des journaux

    Understanding newspaper structure and design remains a challenging task because pages are composed of many visual and textual elements. Current approaches have focused on simple design types and analysed only broad classes of page components. In this paper, we propose an approach to obtain a comprehensive understanding of a newspaper page through a multi-layered analysis of structure and design. Taking images of newspaper front pages as input, our approach uses a combination of computer vision techniques to segment newspapers with complex layouts into meaningful blocks of varying degrees of granularity, and a convolutional neural network (CNN) to classify each block. The final output is a visualization of the various layers of design elements present in the newspaper. Compared to previous approaches, our method introduces a much larger set of design-related labels (23 labels versus fewer than 10 previously), resulting in a very fine description of the pages, with high accuracy (83%). As a whole, this automated analysis has potential applications such as cross-medium content adaptation, digital archiving, and UX design.