6 research outputs found

    Automated Ground Truth Data Generation for Newspaper Document Images

    In document image understanding, public ground-truthed datasets are an important part of scientific work. They are not only helpful for developing new methods, but also provide a common reference point that allows the performance of methods to be compared without the need to reimplement them. Several datasets exist for document image understanding, each with its own pros and cons. Generating these datasets is time-consuming and costly work, and therefore each existing and new dataset is valuable. In this paper we propose a way to generate a ground-truthed dataset for newspapers. The ground truth in focus is layout analysis ground truth. The proposed two-step approach consists of a layout generating module and an image matching module that matches the ground truth information from the synthetic data to the scanned version. In the first step, the “MyNews” system generates newspaper layouts from a news corpus. The output consists of a digital newspaper (PDF file) and an XML file containing geometric and logical layout information. In the second step, the PDF files are printed and scanned. The scanned document image is then aligned with the synthetic image obtained by rendering the PDF. Finally, the geometric and logical layout ground truth is mapped onto the scanned image.
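The final mapping step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say which transformation model the alignment produces, so the sketch assumes a 2D affine transform from synthetic-image coordinates to scan coordinates. The names `apply_affine`, `map_box`, `scan_from_synth` and the region boxes are hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
# Affine parameters (a, b, tx, c, d, ty): x' = a*x + b*y + tx, y' = c*x + d*y + ty
Affine = Tuple[float, float, float, float, float, float]


def apply_affine(m: Affine, p: Point) -> Point:
    """Map one point from synthetic-image to scanned-image coordinates."""
    a, b, tx, c, d, ty = m
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def map_box(m: Affine, box: Box) -> Box:
    """Transform all four corners and return their axis-aligned bounding box."""
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    mapped = [apply_affine(m, corner) for corner in corners]
    xs = [p[0] for p in mapped]
    ys = [p[1] for p in mapped]
    return (min(xs), min(ys), max(xs), max(ys))


# Assumed alignment result: the scan is the rendering scaled by 2 and shifted by (10, 5).
scan_from_synth: Affine = (2.0, 0.0, 10.0, 0.0, 2.0, 5.0)

# Hypothetical layout regions, as they might be read from the XML ground truth.
regions: Dict[str, Box] = {
    "headline": (0, 0, 100, 20),
    "body": (0, 30, 100, 200),
}

mapped = {name: map_box(scan_from_synth, box) for name, box in regions.items()}
print(mapped)
```

Transforming all four corners, rather than only two, keeps the resulting box valid when the assumed transform includes rotation or skew introduced by scanning.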

    Anveshak - A Groundtruth Generation Tool for Foreground Regions of Document Images

    In this paper we propose a groundtruth generation tool based on a graphical user interface. An input document image is annotated on the basis of its foreground pixels. Foreground pixels are grouped together with user interaction to form labeling units. The user then labels these units with user-defined labels. The tool outputs an image together with an XML file containing its metadata. This annotated data can be further used in different applications of document image analysis. Comment: Accepted in DAR 201
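The grouping of foreground pixels into labeling units can be illustrated with a small sketch. The abstract does not say how the tool forms its groups beyond user interaction, so this example assumes 4-connected component labeling as an automatic starting point. The function name `connected_components` and the sample `page` grid are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, col)


def connected_components(grid: List[List[int]]) -> List[List[Pixel]]:
    """Group 4-connected foreground pixels (value 1) into candidate units."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components: List[List[Pixel]] = []

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search collects every pixel reachable from (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            component: List[Pixel] = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


# Tiny binarized page: 1 = foreground, 0 = background.
page = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]

units = connected_components(page)
print(len(units), [len(u) for u in units])
```

In an interactive tool, a user could then merge or split such candidate units before assigning labels to them.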

    Development of a tool for the construction of "ground truth" for complex color images with text

    This report summarizes the final-year project for the Computer Engineering degree (Enginyeria Superior d'Informàtica). It explains the main reasons that motivated the project and gives examples that illustrate the resulting application. The software addresses the current need for Ground Truth data for text segmentation algorithms on complex color images. Each stage is explained in its own chapter, from the problem definition, planning, requirements and design through to a complete illustration of the program's results and the resulting Ground Truth data.

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, with special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational complexity and the threshold selection points of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.