9 research outputs found

    Thinning-free Polygonal Approximation of Thick Digital Curves Using Cellular Envelope

    Since the inception of successful rasterization of curves and objects in the digital space, several algorithms have been proposed for approximating a given digital curve. All these algorithms, however, resort to thinning as preprocessing before approximating a digital curve with changing thickness. Described in this paper is a novel thinning-free algorithm for polygonal approximation of an arbitrarily thick digital curve, using the concept of "cellular envelope", which is newly introduced in this paper. The cellular envelope, defined as the smallest set of cells containing the given curve, and hence bounded by the two tightest (inner and outer) isothetic polygons, is constructed using a combinatorial technique. This envelope, in turn, is analyzed to determine a polygonal approximation of the curve as a sequence of cells using certain attributes of digital straightness. Since a real-world curve or curve-shaped object with varying thickness, unexpected disconnectedness, noisy information, etc., is unsuitable for the existing algorithms on polygonal approximation, the curve is encapsulated by the cellular envelope to enable the polygonal approximation. Owing to the implicit Euclidean-free metrics and combinatorial properties prevailing in the cellular plane, implementation of the proposed algorithm involves primitive integer operations only, leading to fast execution of the algorithm. Experimental results are presented to demonstrate the elegance and efficacy of the proposed algorithm; they include output polygons for different values of the approximation parameter for several real-world digital curves, a couple of measures of the quality of approximation, comparisons with two other well-cited algorithms, and CPU times.
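
The abstract defines the cellular envelope as the smallest set of cells containing the curve. The following is a minimal sketch of that definition only, assuming the curve is given as integer pixel coordinates and the cells are axis-aligned squares of side `cell_size`. The function name and parameters are hypothetical, and the paper's combinatorial construction and its straightness analysis are not reproduced here.

```python
def cellular_envelope(pixels, cell_size):
    """Return the set of grid cells (col, row) that contain at least one curve pixel.

    pixels    -- iterable of (x, y) integer pixel coordinates (assumed input format)
    cell_size -- side length of a square cell, in pixels (assumed parameter)
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be a positive integer")
    # Integer floor division only: no floating-point or Euclidean distance is needed.
    return {(x // cell_size, y // cell_size) for x, y in pixels}


# A short, thick diagonal stroke: two pixels wide at every step.
stroke = [(i, i) for i in range(8)] + [(i + 1, i) for i in range(8)]
cells = cellular_envelope(stroke, cell_size=4)
```

A larger `cell_size` gives a coarser envelope with fewer cells, which is one plausible reading of how an approximation parameter could trade detail for simplicity.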

    Adaptive Methods for Robust Document Image Understanding

    A vast amount of digital document material is continuously being produced as part of major digitization efforts around the world. In this context, generic and efficient automatic solutions for document image understanding are a pressing necessity. We propose a generic framework for document image understanding systems, usable for practically any document type available in digital form. Following the introduced workflow, we shift our attention to each of the following processing stages in turn: quality assurance, image enhancement, color reduction and binarization, skew and orientation detection, page segmentation and logical layout analysis. We review the state of the art in each area, identify current deficiencies, point out promising directions and give specific guidelines for future investigation. We address some of the identified issues by means of novel algorithmic solutions, with a special focus on generality, computational efficiency and the exploitation of all available sources of information. More specifically, we introduce the following original methods: a fully automatic detection of color reference targets in digitized material, accurate foreground extraction from color historical documents, font enhancement for hot-metal typeset prints, a theoretically optimal solution for the document binarization problem from both the computational-complexity and the threshold-selection points of view, a layout-independent skew and orientation detection, a robust and versatile page segmentation method, a semi-automatic front page detection algorithm and a complete framework for article segmentation in periodical publications. The proposed methods are experimentally evaluated on large datasets consisting of real-life heterogeneous document scans. The obtained results show that a document understanding system combining these modules is able to robustly process a wide variety of documents with good overall accuracy.

    Courbure discrète : théorie et applications

    The present volume contains the proceedings of the 2013 Meeting on discrete curvature, held at CIRM, Luminy, France. The aim of this meeting was to bring together researchers from various backgrounds, ranging from mathematics to computer science, with a focus on both theory and applications. With 27 invited talks and 8 posters, the conference attracted 70 researchers from all over the world. The challenge of finding a common ground on the topic of discrete curvature was met with success, and these proceedings are a testimony of this work.

    Collection of abstracts of the 24th European Workshop on Computational Geometry

    The 24th European Workshop on Computational Geometry (EuroCG'08) was held at INRIA Nancy - Grand Est & LORIA on March 18-20, 2008. The present collection of abstracts contains the 63 scientific contributions as well as three invited talks presented at the workshop.

    End-Shape Analysis for Automatic Segmentation of Arabic Handwritten Texts

    Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several word segmentation approaches have been proposed for Latin-based languages, while only a few have been introduced for Arabic texts. The fact that Arabic writing is cursive by nature and unconstrained, with no clear boundaries between the words, makes the processing of Arabic handwritten text a more challenging problem. In this thesis, the design and implementation of an End-Shape Letter (ESL) based segmentation system for Arabic handwritten text is presented. This incorporates four novel aspects: (i) removal of secondary components, (ii) baseline estimation, (iii) ESL recognition, and (iv) the creation of a new off-line CENPARMI ESL database. Arabic texts include small connected components, also called secondary components. Removing these components can improve the performance of several systems such as baseline estimation. Thus, a robust method to remove secondary components that takes into consideration the challenges in Arabic handwriting is introduced. The method reconstructs the image based on some criteria. The results of this method were subsequently compared with those of two other methods that used the same database. The results show that the proposed method is effective. Baseline estimation is a challenging task for Arabic texts since they include ligatures, overlapping, and secondary components. Therefore, we propose a learning-based approach that addresses these challenges. Our method analyzes the image and extracts baseline-dependent features. Then, the baseline is estimated using a classifier. Algorithms dealing with text segmentation usually analyze the gaps between connected components. These algorithms are based on metric calculation, threshold finding, and/or gap classification. We use two well-known metrics, bounding box and convex hull, to test the metric-based method on Arabic handwritten texts and to include this technique in our approach. To determine the threshold, an unsupervised learning approach, known as the Gaussian Mixture Model, is used. Our ESL-based segmentation approach extracts the final letter of a word using a rule-based technique and recognizes these letters using the implemented ESL classifier. To demonstrate the benefit of text segmentation, a holistic word spotting system is implemented. For this system, a word recognition system is implemented. A series of experiments with different sets of features are conducted. The system shows promising results.
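
The gap-based segmentation idea in this abstract can be illustrated with a small sketch of the bounding-box metric. It assumes each connected component is reduced to its horizontal extent `(x_min, x_max)` and that the threshold is supplied by the caller; the thesis instead derives the threshold with a Gaussian Mixture Model, which is not reproduced here. The function names are hypothetical.

```python
def bounding_box_gaps(boxes):
    """Horizontal gaps between consecutive component bounding boxes.

    boxes -- list of (x_min, x_max) extents, in any order (assumed input format)
    Overlapping boxes yield a gap of 0.
    """
    ordered = sorted(boxes)
    if not ordered:
        return []
    gaps = []
    right = ordered[0][1]  # right-most edge seen so far
    for x_min, x_max in ordered[1:]:
        gaps.append(max(0, x_min - right))
        right = max(right, x_max)
    return gaps


def split_words(boxes, threshold):
    """Group components into words: a gap larger than `threshold` starts a new word."""
    ordered = sorted(boxes)
    if not ordered:
        return []
    words = [[ordered[0]]]
    right = ordered[0][1]
    for box in ordered[1:]:
        gap = max(0, box[0] - right)
        if gap > threshold:
            words.append([box])    # wide gap: treat as a word boundary
        else:
            words[-1].append(box)  # narrow gap: same word
        right = max(right, box[1])
    return words


components = [(35, 50), (0, 10), (12, 20)]
words = split_words(components, threshold=8)
```

Because only gap widths are compared, the sketch is independent of reading direction; a convex-hull metric would replace the extent difference with the distance between hulls.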

    Contribution to structural parameters computation: volume models and methods

    Bio-CAD and in-silico experimentation are attracting growing interest in biomedical applications, where scientific data coming from real samples are used to compute structural parameters that allow physical properties to be evaluated. Non-invasive imaging acquisition technologies such as CT, mCT or MRI, plus the constant growth of computer capabilities, allow the acquisition, processing and visualization of scientific data with an increasing degree of complexity. Structural parameters computation is based on the existence of two phases (or spaces) in the sample: the solid, which may correspond to the bone or material, and the empty or porous phase. Such samples are therefore represented as binary volumes. The most common representation model for these datasets is the voxel model, which is the natural extension to 3D of 2D bitmaps. In this thesis, the Extreme Vertices Model (EVM) and a new proposed model, the Compact Union of Disjoint Boxes (CUDB), are used to represent binary volumes in a much more compact way. EVM stores only a sorted subset of vertices of the object's boundary, whereas CUDB keeps a compact list of boxes. In this thesis, methods to compute the following structural parameters are proposed: pore-size distribution, connectivity, orientation, sphericity and roundness. The pore-size distribution helps to interpret the characteristics of porous samples by allowing users to observe the most common pore diameter ranges as peaks in a graph. Connectivity is a topological property related to the genus of the solid space; it measures the level of interconnectivity among elements and is an indicator of the biomechanical characteristics of bone or other materials. The orientation of a shape can be defined by rotation angles around a set of orthogonal axes. Sphericity is a measure of how spherical a particle is, whereas roundness is the measure of the sharpness of a particle's edges and corners.
The study of these parameters requires dealing with real samples scanned at high resolution, which usually generate huge datasets that require a lot of memory and long processing times to analyze. For this reason, a new method to simplify binary volumes in a progressive and lossless way is presented. This method generates a level-of-detail sequence of objects, where each object is a bounding volume of the previous objects. Besides being used as support in the structural parameter computation, this method can be practical for tasks such as progressive transmission, collision detection and volume-of-interest computation. As part of multidisciplinary research, two practical applications have been developed to compute structural parameters of real samples: a software tool for automatic detection of characteristic viscosity points of basalt rock and glass samples, and another to compute sphericity and roundness of complex forms in a silica dataset.
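
The abstract describes sphericity as a measure of how spherical a particle is but does not state the formula used in the thesis. As an illustration only, the sketch below assumes Wadell's classical definition, the ratio of the surface area of a sphere with the same volume to the particle's actual surface area: Ψ = π^(1/3) · (6V)^(2/3) / A. The function name is hypothetical.

```python
import math


def wadell_sphericity(volume, surface_area):
    """Wadell sphericity: 1.0 for a perfect sphere, smaller for less spherical shapes.

    Assumes `volume` and `surface_area` are given in consistent units
    (for example, voxel counts converted to mm^3 and mm^2).
    """
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface_area must be positive")
    return (math.pi ** (1.0 / 3.0)) * (6.0 * volume) ** (2.0 / 3.0) / surface_area


# Unit sphere: V = 4/3 * pi, A = 4 * pi.
sphere = wadell_sphericity(4.0 / 3.0 * math.pi, 4.0 * math.pi)

# Unit cube: V = 1, A = 6.
cube = wadell_sphericity(1.0, 6.0)
```

For a binary volume, the volume and surface area would first have to be estimated from the chosen representation (voxels, EVM or CUDB); that step is outside this sketch.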

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    LIPIcs, Volume 274, ESA 2023, Complete Volume