650 research outputs found

    Analyse d’images de documents patrimoniaux : une approche structurelle à base de texture

    Over the last few years, there has been tremendous growth in digitizing collections of cultural heritage documents. Thus, many challenges and open issues have been raised, such as information retrieval in digital libraries or analyzing page content of historical books. Recently, an important need has emerged which consists in designing a computer-aided characterization and categorization tool, able to index or group historical digitized book pages according to several criteria, mainly the layout structure and/or typographic/graphical characteristics of the historical document image content. Thus, the work conducted in this thesis presents an automatic approach for characterization and categorization of historical book pages. The proposed approach is applicable to a large variety of ancient books. In addition, it does not assume a priori knowledge regarding document image layout and content. It is based on the use of texture and graph algorithms to provide a rich and holistic description of the layout and content of the analyzed book pages to characterize and categorize historical book pages. The categorization is based on the characterization of the digitized page content by texture, shape, geometric and topological descriptors. This characterization is represented by a structural signature. More precisely, the signature-based characterization approach consists of two main stages. The first stage is extracting homogeneous regions. Then, the second one is proposing a graph-based page signature which is based on the extracted homogeneous regions, reflecting its layout and content. Afterwards, by comparing the different obtained graph-based signatures using a graph-matching paradigm, the similarities of digitized historical book page layout and/or content can be deduced. Subsequently, book pages with similar layout and/or content can be categorized and grouped, and a table of contents/summary of the analyzed digitized historical book can be provided automatically. 
As a consequence, numerous signature-based applications (e.g. information retrieval in digital libraries according to several criteria, page categorization) can be implemented for effectively managing a corpus or collections of books. To illustrate the effectiveness of the proposed page signature, a detailed experimental evaluation has been conducted in this work to assess two possible categorization applications: unsupervised page classification and page stream segmentation. In addition, the different steps of the proposed approach have been evaluated on a large variety of historical document images.
Recent progress in digitizing collections of heritage documents has raised new challenges for ensuring long-term preservation and providing wider access to ancient documents. Alongside information retrieval in digital libraries and the analysis of digitized page content in ancient books, the characterization and categorization of pages of ancient books has recently seen renewed interest. Efforts focus on developing fast, automatic tools for characterizing and categorizing the pages of ancient books, able to classify the pages of a digitized book according to several criteria, notably the layout structure and/or the typographic/graphical characteristics of the page content. Thus, in this thesis, we propose an approach for the automatic characterization and categorization of the pages of an ancient book. The proposed approach is intended to be independent of the structure and content of the analyzed book. The main advantage of this work is that the approach requires no prior knowledge about either the content of the document or its structure.
It is based on an analysis of texture descriptors and a structural graph representation, in order to provide a rich description that enables categorization from the graphical content (captured by texture) and the layouts (represented by graphs). This categorization relies on characterizing the content of the digitized page through an analysis of texture, shape, geometric and topological descriptors. This characterization is defined by means of a structural representation. In detail, the categorization approach breaks down into two main successive stages. The first extracts homogeneous regions. The second proposes a texture-based structural signature, in the form of a graph, built from the extracted homogeneous regions and reflecting the structure of the analyzed page. This signature enables numerous applications for effectively managing a corpus or collections of heritage books (for example, information retrieval in digital libraries according to several criteria, or categorization of the pages of a single book). By comparing the structural signatures by means of the graph edit distance, the similarities between the pages of a single book in terms of layout and/or content can be deduced. Pages with similar layouts and/or content can then be categorized, and a summary/table of contents of the analyzed book can be generated automatically. To illustrate the effectiveness of the proposed signature, a detailed experimental study was conducted in this work to evaluate two possible applications of categorizing the pages of a single book: unsupervised page classification and page stream segmentation.
In addition, the different stages of the proposed approach were evaluated through experiments on a large corpus of heritage documents.
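
As a rough illustration of comparing graph-based page signatures with an edit distance, the sketch below computes an exact unit-cost graph edit distance by brute force over node mappings. It is only a toy under stated assumptions: the region labels ("text", "image"), the unit costs and the function name `graph_edit_distance` are hypothetical and are not taken from the thesis, and the brute-force search is practical only for very small graphs.

```python
from itertools import permutations

def graph_edit_distance(labels_a, edges_a, labels_b, edges_b):
    """Exact unit-cost edit distance for tiny labeled undirected graphs.

    Nodes are indexed 0..n-1; labels_* lists the label of each node and
    edges_* lists (u, v) index pairs. Every edit (node relabel, node
    insert/delete, edge insert/delete) is assumed to cost 1.
    """
    n = max(len(labels_a), len(labels_b))
    # Pad the smaller graph with dummy nodes (label None) so sizes match.
    a = list(labels_a) + [None] * (n - len(labels_a))
    b = list(labels_b) + [None] * (n - len(labels_b))
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    best = None
    # perm[i] is the node of graph B matched to node i of graph A.
    for perm in permutations(range(n)):
        # Node cost: one edit whenever the matched labels differ
        # (real vs. dummy counts as an insertion or deletion).
        cost = sum(1 for i in range(n) if a[i] != b[perm[i]])
        # Edge cost: edges present in only one graph after mapping.
        mapped = {frozenset(perm[u] for u in e) for e in ea}
        cost += len(mapped ^ eb)
        if best is None or cost < best:
            best = cost
    return best

# Hypothetical page signatures: nodes are homogeneous regions,
# edges link neighbouring regions.
page_1 = (["text", "text", "image"], [(0, 1), (1, 2)])
page_2 = (["text", "image"], [(0, 1)])
distance = graph_edit_distance(*page_1, *page_2)
```

Pages whose signatures lie at a small distance from each other would be candidates for the same category; a real system would need an approximate matcher, since exact graph edit distance grows factorially with graph size.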

    Matching Islamic patterns in Kufic images

    In this study, we address the problem of matching patterns in Kufic calligraphy images. Because they are used as decorative elements, Kufic images are designed in a way that makes them difficult for non-experts to read. Therefore, available methods for handwriting recognition are not easily applicable to the recognition of Kufic patterns. We propose two new methods for Kufic pattern matching. The first method approximates the contours of connected components as lines and then uses a chain code representation. Sequence matching techniques with a penalty for gaps handle the variations between different instances of sub-patterns. In the second method, skeletons of connected components are represented as a graph whose nodes are junction and end points. Graph isomorphism techniques are then relaxed for partial graph matching. The methods are evaluated on a collection of 270 square Kufic images with 8,941 sub-patterns. Experimental results indicate that, besides retrieval and indexing of known patterns, our method also allows the discovery of new patterns. © 2015, Springer-Verlag London
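
The chain code and gapped sequence matching idea can be sketched as follows. This is a minimal illustration, assuming 8-direction Freeman codes and a Needleman-Wunsch style global alignment; the scoring values (`match=2`, `gap=-1`) and all function names are hypothetical choices, not the paper's implementation.

```python
# 8-connected Freeman directions, indexed counter-clockwise from east.
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}

def chain_code(points):
    """Convert a path of 8-connected (x, y) points into Freeman codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes

def direction_cost(a, b):
    """Circular difference between two direction codes, in 0..4."""
    d = abs(a - b) % 8
    return min(d, 8 - d)

def alignment_score(s, t, match=2, gap=-1):
    """Global alignment score of two chain codes with a linear gap penalty.

    Aligned directions score `match` minus their circular difference,
    so near-identical directions are rewarded and opposite ones penalized.
    """
    rows, cols = len(s) + 1, len(t) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match - direction_cost(s[i - 1], t[j - 1])
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align both
                              score[i - 1][j] + gap,       # gap in t
                              score[i][j - 1] + gap)       # gap in s
    return score[-1][-1]

contour = chain_code([(0, 0), (1, 0), (2, 1), (2, 2)])
same = alignment_score(contour, contour)
shorter = alignment_score(contour, [0, 2])
```

The gap penalty lets one instance of a sub-pattern skip a segment that another instance contains, at a cost, instead of forcing every element to be paired.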

    Arabic Manuscripts Analysis and Retrieval

    The Optimisation of Elementary and Integrative Content-Based Image Retrieval Techniques

    Image retrieval plays a major role in many image processing applications. However, a number of factors (e.g. rotation, non-uniform illumination, noise and lack of spatial information) can disrupt the outputs of image retrieval systems such that they cannot produce the desired results. In recent years, many researchers have introduced different approaches to overcome this problem. Colour-based CBIR (content-based image retrieval) and shape-based CBIR were the most commonly used techniques for obtaining image signatures. Although the colour histogram and shape descriptor have produced satisfactory results for certain applications, they still suffer from many theoretical and practical problems. A prominent one among them is the well-known "curse of dimensionality". In this research, a new Fuzzy Fusion-based Colour and Shape Signature (FFCSS) approach for integrating colour-only and shape-only features has been investigated to produce an effective image feature vector for database retrieval. The proposed technique is based on an optimised fuzzy colour scheme and robust shape descriptors. Experimental tests were carried out to check the behaviour of the FFCSS-based system, including the sensitivity and robustness of the proposed signature of the sampled images, especially under varied conditions of rotation, scaling, noise and light intensity. To further improve the retrieval efficiency of the devised signature model, the target image repositories were clustered into several groups using the k-means clustering algorithm at system runtime, so that the search begins at the centre of each cluster. The FFCSS-based approach has proven superior to the other benchmarked classic CBIR methods; hence this research makes a substantial contribution on the corresponding theoretical and practical fronts.
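
The cluster-first search described above can be sketched in a few lines. This is a minimal illustration, assuming plain Lloyd's k-means over generic feature vectors with Euclidean distance; the seeding with the first `k` vectors, the toy 2-D features and the function names are assumptions for the example and do not reproduce the FFCSS signature itself.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means; centres are seeded with the first k vectors."""
    centres = [tuple(v) for v in vectors[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for idx, v in enumerate(vectors):
            nearest = min(range(k), key=lambda c: dist(v, centres[c]))
            clusters[nearest].append(idx)
        # Update step: move each centre to the mean of its members.
        new_centres = [mean([vectors[i] for i in members]) if members else centres[c]
                       for c, members in enumerate(clusters)]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters

def retrieve(query, vectors, centres, clusters, top_n=3):
    """Pick the nearest cluster centre first, then rank only that cluster."""
    best = min(range(len(centres)), key=lambda c: dist(query, centres[c]))
    ranked = sorted(clusters[best], key=lambda i: dist(query, vectors[i]))
    return ranked[:top_n]

# Toy 2-D "signatures" forming two obvious groups.
features = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
            (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centres, clusters = kmeans(features, k=2)
hits = retrieve((9.0, 9.0), features, centres, clusters, top_n=2)
```

Comparing the query against the cluster centres first means only one cluster is ranked in full, which trades a small risk of missing a near neighbour in an adjacent cluster for fewer distance computations.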

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we turn our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, with a special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot metal typeset prints, a theoretically optimal solution for the document binarization problem from both a computational complexity and a threshold selection point of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.

    Advances in Character Recognition

    This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field and for all those interested in the subject.

    Adaptive Algorithms for Automated Processing of Document Images

    Large-scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles or letters, and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance-transform-based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach lies in its determination of the best approximation to the clutter-content boundary in the presence of text-like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts, which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that considers components' separation features combined with Docstrum-based [O'Gorman1993] angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and to recognize characters for any complex syllabic or non-syllabic script, using font-models. 
This concept is based on the fact that font files contain all the information necessary to render text and thus a model for how to decompose them. Instead of script-specific routines, this work is a step towards a generic character and recognition scheme for both Latin and non-Latin scripts

    Using contour information and segmentation for object registration, modeling and retrieval

    This thesis considers different aspects of the utilization of contour information and syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts and shape registration. The thesis also explores the feasibility of contour-based syntactic features for improving the correspondence between the output of bottom-up segmentation and the semantic objects present in the scene, and discusses the feasibility of different strategies for image analysis utilizing contour information, e.g. segmentation driven by visual features versus segmentation driven by shape models, or semi-automatic segmentation in selected application scenarios. There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed through extensive, rigorous experimentation using test collections that are as large as possible.

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ~7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
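
How recursive Bayesian filtering can integrate per-frame detections and lower the categorization entropy can be shown with a small discrete example. This is a sketch under stated assumptions: the class names, the detector scores used as likelihoods and the function names are hypothetical, and the paper's actual observation model is not reproduced here.

```python
import math

def bayes_update(prior, likelihood):
    """One recursive Bayesian step: posterior is proportional to likelihood * prior.

    Assumes at least one class has a non-zero likelihood.
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    return {c: p / total for c, p in unnormalized.items()}

def entropy(belief):
    """Shannon entropy (in bits) of a discrete class belief."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

# Start from a uniform belief over three hypothetical object classes.
belief = {"chair": 1 / 3, "table": 1 / 3, "sofa": 1 / 3}
initial_entropy = entropy(belief)

# Hypothetical detector scores for the same tracked object in three frames.
observations = [
    {"chair": 0.7, "table": 0.2, "sofa": 0.1},
    {"chair": 0.6, "table": 0.3, "sofa": 0.1},
    {"chair": 0.8, "table": 0.1, "sofa": 0.1},
]
for scores in observations:
    belief = bayes_update(belief, scores)

final_entropy = entropy(belief)
best_class = max(belief, key=belief.get)
```

Each consistent observation sharpens the belief, so the entropy after the three frames is lower than the entropy of the uniform starting belief.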