
    Historical Document Enhancement Using LUT Classification

    The fast evolution of scanning and computing technologies in recent years has led to the creation of large collections of scanned historical documents. These scanned documents almost always suffer from some form of degradation, which makes them hard to read and substantially deteriorates the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models; when the degradation is large, global models do not perform well. In contrast, we propose to learn local degradation models and use them to enhance degraded document images. Using a semi-automated enhancement system, we labeled a subset of the Frieder diaries collection (The diaries of Rabbi Dr. Avraham Abba Frieder. http://ir.iit.edu/collections/). This labeled subset was then used to train classifiers based on lookup tables in conjunction with an approximate nearest-neighbor algorithm. The resulting algorithm is highly efficient and effective. Experimental evaluation results are provided using the same collection. © Springer-Verlag 2009
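    The abstract pairs lookup-table classifiers with an approximate nearest-neighbor fallback for unseen patterns. Below is a minimal sketch of that general idea, assuming 3x3 binary neighborhoods as the LUT key; the function names are illustrative, and the brute-force Hamming search merely stands in for a real ANN index:

```python
import numpy as np

def train_lut(patches, labels):
    """Map each observed 3x3 binary neighborhood to its majority label.
    patches: iterable of length-9 binary sequences; labels: clean pixel values."""
    lut = {}
    for patch, label in zip(patches, labels):
        key = int("".join(map(str, patch)), 2)   # encode pattern as an integer
        lut.setdefault(key, []).append(label)
    return {k: int(round(np.mean(v))) for k, v in lut.items()}

def enhance_pixel(patch, lut):
    """Look the pattern up; fall back to the nearest stored pattern."""
    key = int("".join(map(str, patch)), 2)
    if key in lut:
        return lut[key]
    # Brute-force nearest neighbour by Hamming distance -- a stand-in
    # for the approximate nearest-neighbor index described in the paper.
    best = min(lut, key=lambda k: bin(k ^ key).count("1"))
    return lut[best]
```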

    Character-based Automated Human Perception Quality Assessment In Document Images

    Large degradations in document images impede their readability and deteriorate the performance of automated document processing systems. Document image quality (IQ) metrics have been defined through optical character recognition (OCR) accuracy. Such metrics, however, do not always correlate with human perception of IQ. When enhancing document images with the goal of improving readability, e.g., in historical documents where OCR performance is low and/or where it is necessary to preserve the original context, it is important to understand human perception of quality. The goal of this paper is to design a system that enables the learning and estimation of human perception of document IQ. Such a metric can be used to compare existing document enhancement methods and guide automated document enhancement. Moreover, the proposed methodology is designed as a general framework that can be applied in a wide range of applications. © 2012 IEEE

    Advanced correlation-based character recognition applied to the Archimedes Palimpsest

    The Archimedes Palimpsest is a manuscript containing the partial text of seven treatises by Archimedes that were copied onto parchment and bound in the tenth century AD. This work is aimed at providing tools that allow scholars of ancient Greek mathematics to retrieve as much information as possible from images of the remaining degraded text. A correlation pattern recognition (CPR) system has been developed to recognize distorted versions of Greek characters in problematic regions of the palimpsest imagery, which have been obscured by damage from mold and fire, overtext, and natural aging. Feature vectors for each class of characters are constructed using a series of spatial correlation algorithms and corresponding performance metrics. Principal components analysis (PCA) is employed prior to classification to remove features corresponding to filtering schemes that performed poorly for the spatial characteristics of the selected region of interest (ROI). A probability is then assigned to each class, forming a character probability distribution based on the relative distances from the class feature vectors to the ROI feature vector in principal component (PC) space. However, the current CPR system does not produce a single classification decision, as is common in most target detection problems; instead, it has been designed to provide intermediate results that allow the user to apply his or her own decisions (or evidence) to arrive at a conclusion. To achieve this, a probabilistic network has been incorporated into the recognition system. A probabilistic network is a method for modeling the uncertainty in a system, and for this application it allows information from the existing partial transcription and contextual knowledge from the user to be an integral part of the decision-making process. The CPR system was designed to provide a framework for future research in the area of spatial pattern recognition by accommodating a broad range of applications and the development of new filtering methods. For example, during preliminary testing, the CPR system was used to confirm the publication date of a fifteenth-century Hebrew colophon, and demonstrated success in the detection of registration markers in three-dimensional MRI breast imaging. In addition, a new correlation algorithm that exploits the benefits of linear discriminant analysis (LDA) and the inherent shift invariance of spatial correlation has been derived, implemented, and tested. Results show that this composite filtering method provides a high level of class discrimination while maintaining tolerance to within-class distortions. With the integration of this algorithm into the existing filter library, this work completes each stage of a cyclic workflow using the developed CPR system and provides the necessary tools for continued experimentation.
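    The distance-to-probability step can be illustrated with a short sketch. The abstract only states that class probabilities are based on relative distances in PC space, so the softmax over negative distances below is one plausible reading, not the author's exact formula:

```python
import numpy as np

def class_probabilities(roi_vec, class_vecs, components):
    """Project feature vectors into PC space and turn distances into a
    character probability distribution (softmax over negative distance).
    roi_vec: (d,) ROI feature vector; class_vecs: (C, d) class feature
    vectors; components: (d, p) PCA basis kept after discarding the
    features of poorly performing filtering schemes."""
    roi_pc = roi_vec @ components            # (p,) ROI in PC space
    classes_pc = class_vecs @ components     # (C, p) classes in PC space
    dists = np.linalg.norm(classes_pc - roi_pc, axis=1)
    scores = np.exp(-dists)                  # closer class -> higher score
    return scores / scores.sum()             # normalize to a distribution
```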

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection in indoor robotic environments. Concretely, it exploits knowledge of the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, used to propose regions of interest where objects may be found, and recursive Bayesian filtering, used to integrate observations over time. The proposal is evaluated on six virtual indoor environments, covering the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction (58.8%) of the object categorization entropy when compared to a two-stage video object detection method used as baseline, at the cost of a small time overhead (120 ms) and a slight precision loss (0.92).
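    The two core ingredients, homography-based propagation of detections and recursive Bayesian fusion of class scores, can be sketched as follows. This is a simplified illustration of the concepts named in the abstract, not the paper's implementation; the box format and the element-wise Bayes update are assumptions:

```python
import numpy as np
import cv2

def propagate_box(box, H):
    """Warp a detection's corners into the next frame with the
    frame-to-frame planar homography H (3x3), giving a region proposal.
    box: (x1, y1, x2, y2) in the previous frame."""
    x1, y1, x2, y2 = box
    corners = np.float32([[x1, y1], [x2, y1], [x2, y2], [x1, y2]]).reshape(-1, 1, 2)
    warped = cv2.perspectiveTransform(corners, H).reshape(-1, 2)
    x, y = warped[:, 0], warped[:, 1]
    return (x.min(), y.min(), x.max(), y.max())   # axis-aligned hull

def bayes_update(prior, likelihood):
    """Recursive Bayesian filtering over class probabilities: fuse the
    propagated belief with the detector's per-class scores for the frame."""
    posterior = prior * likelihood
    return posterior / posterior.sum()
```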

    Forest attributes mapping with SAR data in the Romanian South-Eastern Carpathians: requirements and outcomes

    This doctoral thesis focuses on the estimation of forest variables in the South-Eastern Romanian Carpathians from synthetic aperture radar (SAR) imagery. The research covers part of the image preprocessing chain, mosaicking methods, and the extraction of forest cover, forest subtypes, and biomass. The thesis was carried out at the Marin Dracea National Institute for Research and Development in Forestry (INCDS) and the University of Alcalá (UAH) within several projects: the INCDS project EO-ROFORMON (Prototyping an Earth-Observation based monitoring and forecasting system for the Romanian forests), funded by the Romanian National Authority for Scientific Research and the European Regional Development Fund, and the UAH project EMAFOR (Synthetic Aperture Radar (SAR) enabled Analysis Ready Data (ARD) cubes for efficient monitoring of agricultural and forested landscapes), funded by the Autonomous Community of Madrid (Spain). The objective of this thesis is to develop algorithms for extracting general-purpose forest variables, such as forest cover, forest type, and biomass, from SAR imagery. To this end, potential sources of systematic bias in mountainous areas (e.g., topographic normalization, mosaicking) were analysed, and machine learning techniques were applied to classification and regression tasks. The thesis contains eight sections: an introduction, five publications in indexed journals or conference proceedings, one chapter pending publication (the fifth), and the conclusions. The introduction contextualizes the importance of forests, how information on their state is collected (e.g., forest inventories), and the initiatives and legislative frameworks that require such information. It then describes how remote sensing can complement forest inventory information, detailing the historical context of the different technologies, how they work, and how they can be applied to extract forest information. Finally, it describes forest monitoring and its challenges in Romania, and states the objective and structure of the thesis.
    The first chapter analyses the influence of the digital elevation model (DEM) on the quality of topographic normalization, comparing three global DEMs (SRTM, AW3D and TanDEM-X DEM) and one national DEM (PNOA-LiDAR). The experiments compare orbits against each other and against a reference DEM, and measure how classification accuracy varies with the DEM used for normalization. The results show smaller differences between orbits when a higher-resolution DEM is used (e.g., TanDEM-X, PNOA-LiDAR), especially in areas with steep slopes or complex landforms such as valleys. In high mountain areas, SAR images suffer frequent geometric distortions; since these distortions depend on the acquisition geometry, images acquired from several orbits can be combined to make coverage as complete as possible. The second chapter evaluates two methodologies for land-use classification using Sentinel-1 data acquired from several orbits: the first produces one classification per orbit and combines them, while the second builds a multi-orbit mosaic and classifies it. The accuracy obtained by combining classifications is slightly higher, whereas the mosaic classification shows substantial omission of forested areas due to problems in the topographic normalization and to directional effects. The third chapter focuses on separating forest cover from other land covers (urban, low vegetation, water), analysing the usefulness of variables based on interferometric coherence. Three support vector machine classifications are carried out, each based on a specific set of variables: the first set contains annual backscatter statistics (annual mean and standard deviation); the second adds long-term coherence (temporal baselines longer than one year); the third adds short-term coherence statistics (minimum temporal baseline). Using coherence-based variables increases classification accuracy by up to 5% and reduces omission errors for forest cover.
    The fourth chapter assesses the feasibility of detecting selective logging using Sentinel-1 and Sentinel-2 data. Its results show that detection is very difficult due to sensor saturation and to the confusion introduced by phenology. The fifth chapter addresses forest-type classification based on a Sentinel-1 time series. It builds a set of models describing the relationship between backscatter and the local incidence angle for a given forest type and date. For each pixel, the residual with respect to each forest type's model is computed and accumulated along the time series, and the pixel is then assigned to the forest type with the smallest accumulated residual. The results are promising, showing that broadleaved and coniferous forests behave distinctively and can be separated with a high degree of accuracy. The sixth chapter is devoted to biomass estimation using Sentinel-1 data, ALOS PALSAR data, and Random Forest regression. Similar errors are obtained for both sensors despite their different bands (C- vs. L-band), with little error reduction when both bands are used together. However, fitting an estimator adapted to the local conditions of Romania did reduce the error compared with global biomass estimates.
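    The fifth chapter's residual-accumulation scheme is concrete enough to sketch. The code below assumes linear per-date models of backscatter versus local incidence angle, which the summary does not specify; it only illustrates the accumulate-and-argmin idea:

```python
import numpy as np

def classify_forest_type(backscatter, lia, models):
    """Accumulate, per pixel, the residual against each forest type's
    backscatter-vs-incidence-angle model over the time series, then
    assign each pixel the type with the smallest accumulated residual.
    backscatter: (T, H, W) time series; lia: (H, W) local incidence angle;
    models: dict type -> list of T (slope, intercept) pairs, one per date."""
    types = list(models)
    residuals = np.zeros((len(types),) + lia.shape)
    for i, t in enumerate(types):
        for date, (slope, intercept) in enumerate(models[t]):
            predicted = slope * lia + intercept        # modelled backscatter
            residuals[i] += np.abs(backscatter[date] - predicted)
    return np.take(types, residuals.argmin(axis=0))    # (H, W) type map
```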

    Global Agro-ecological Assessment for Agriculture in the 21st Century: Methodology and Results

    Over the past 20 years, the term "agro-ecological zones methodology," or AEZ, has become widely used. However, it has been associated with a wide range of activities that are often related yet quite different in scope and objectives. FAO and IIASA distinguish the following activities within the AEZ methodology: First, AEZ provides a standardized framework for the characterization of climate, soil, and terrain conditions relevant to agricultural production. In this context, the concepts of "length of growing period" and of latitudinal thermal climates have been applied in mapping activities focusing on zoning at various scales, from the subnational to the global level. Second, AEZ matching procedures are used to identify crop-specific limitations of prevailing climate, soil, and terrain resources under assumed levels of inputs and management conditions. This part of the AEZ methodology provides estimates of the maximum potential and agronomically attainable crop yields for basic land resources units. Third, AEZ provides the frame for various applications. The previous two sets of activities result in very large databases, and the information contained in these data sets forms the basis for a number of AEZ applications, such as quantification of land productivity, extents of land with rain-fed or irrigated cultivation potential, estimation of the land's population-supporting capacity, and multi-criteria optimization of the use and development of land resources. The AEZ methodology uses a land resources inventory to assess, for specified management conditions and levels of inputs, all feasible agricultural land-use options and to quantify the anticipated production of cropping activities relevant in the specific agro-ecological context. The characterization of land resources includes components of climate, soils, and landform. The recent availability of digital global databases of climatic parameters, topography, soil and terrain, and land cover has allowed for revisions and improvements in calculation procedures. It has also allowed the expansion of AEZ crop suitability and land productivity assessments to temperate and boreal environments, effectively enabling global coverage for assessments of agricultural potentials. The AEZ methodologies and procedures have been extended and newly implemented to make use of these digital geographical databases and to cope with the specific characteristics of seasonal temperate and boreal climates. This report describes the methodological adaptations necessary for the global assessment and illustrates a wide range of applications with numerous results.
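    As a rough illustration of the "length of growing period" concept, the sketch below counts days that pass a simplified moisture-and-temperature rule (precipitation exceeding half the potential evapotranspiration, temperature above a base value). The exact FAO/IIASA criteria are more elaborate; the rule and thresholds here are assumptions for illustration only:

```python
import numpy as np

def growing_period_days(precip, pet, temp, t_base=5.0):
    """Count days satisfying a simplified growing-period rule:
    moisture available (P > 0.5 * PET) and temperature above a base.
    precip, pet, temp: daily arrays for one year at one location."""
    moist = precip > 0.5 * pet
    warm = temp > t_base
    return int(np.count_nonzero(moist & warm))
```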

    Learning to compress and search visual data in large-scale systems

    The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models, whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs, are carefully optimized. These models are then extended to multiple layers, namely the RRQ and the ML-STC frameworks, where the latter is further evolved into a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. Three important applications are developed for these algorithms. First, the problem of large-scale similarity search in retrieval systems is addressed, where a double-stage solution is proposed, leading to faster query times and more compact database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems; in particular, the problems of image denoising and compressive sensing are addressed with promising results.
    Comment: PhD thesis dissertation
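    The multi-layer discrete representations can be illustrated with plain residual quantization, the unregularized ancestor of the RRQ framework named above. The sketch below is a generic residual quantizer, not the thesis's regularized variant; layer and codebook sizes are arbitrary:

```python
import numpy as np
from sklearn.cluster import KMeans

def train_rq(X, n_layers=4, n_codes=256):
    """Train a plain residual quantizer: each layer clusters the residual
    left by the previous layers, so the description length is
    n_layers * log2(n_codes) bits per vector. X: (N, d) training vectors."""
    codebooks, residual = [], X.copy()
    for _ in range(n_layers):
        km = KMeans(n_clusters=n_codes, n_init=4).fit(residual)
        codebooks.append(km.cluster_centers_)
        residual = residual - km.cluster_centers_[km.labels_]
    return codebooks

def encode(x, codebooks):
    """Greedy encoding: pick the nearest codeword at each layer."""
    codes, residual = [], x.copy()
    for C in codebooks:
        idx = int(np.argmin(np.linalg.norm(C - residual, axis=1)))
        codes.append(idx)
        residual = residual - C[idx]
    return codes
```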

    Vision Sensors and Edge Detection

    This book reflects a selection of recent developments within the area of vision sensors and edge detection. There are two sections. The first presents vision sensors, with applications to panoramic vision sensors, wireless vision sensors, and automated vision sensor inspection; the second covers image processing techniques such as image measurements, image transformations, filtering, and parallel computing.
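    As a taste of the edge-detection material the second section covers, a classic Sobel detector can be written in a few lines; the threshold rule here is an arbitrary choice for illustration:

```python
import numpy as np
from scipy.ndimage import convolve

def sobel_edges(img, threshold=0.25):
    """Classic Sobel edge detector: horizontal and vertical gradient
    filtering followed by a threshold on the gradient magnitude.
    img: 2-D float grayscale array; returns a boolean edge map."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx = convolve(img, kx)        # horizontal gradient
    gy = convolve(img, kx.T)      # vertical gradient
    mag = np.hypot(gx, gy)        # gradient magnitude
    return mag > threshold * mag.max()
```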