Hyperspectral image analysis for questioned historical documents.
This thesis describes the application of spectroscopy and hyperspectral image
processing to examine historical manuscripts and text. Major activities
in palaeographic and manuscript studies include the recovery of illegible or
deleted text, the minute analyses of scribal hands, the identification of inks
and the segmentation and dating of text. This thesis describes how Hyperspectral
Imaging (HSI), applied in a novel manner, can be used to perform
quality text recovery, segmentation and dating of historical documents. The
non-destructive optical imaging process of spectroscopy is described in detail,
together with how it can be used to assist historians and document experts in
the examination of aged manuscripts. This non-destructive optical method
of analysis can distinguish subtle differences in the reflectance properties of
the materials under study. Many historically significant documents from
libraries such as the Royal Irish Academy and the Russell Library at the
National University of Ireland, Maynooth, have been selected for study
using the hyperspectral imaging technique. Processing techniques are
described for application to the study of manuscripts in a poor state
of conservation. The research provides a comprehensive overview of Hyperspectral
Imaging (HSI) and associated statistical and analytical methods,
and also an in-depth investigation of the practical implementation of such
methods to aid document analysts. Specifically, we provide results from employing
statistical analytical methods including principal component analysis
(PCA), independent component analysis (ICA) and both supervised and automatic
clustering methods to historically significant manuscripts and text
such as Leabhar na hUidhre, a 12th century Irish text which was subject to
part-erasure and rewriting, a 16th Century pastedown cover, and a multi-ink
example typical of that found in, for example, late medieval administrative
texts such as Göttingen’s kundige bok. The purpose of this work is to achieve
an overall greater insight into the historical context of the document, which
includes the recovery or enhancement of faded or illegible text or text lost
through fading, staining, overwriting or other forms of erasure. In addition,
we demonstrate the prospect of distinguishing different ink-types and of furnishing
details of the manuscript’s composition, all of which are refinements
that can be used to answer questions about date and provenance. This process
marks a new departure for the study of manuscripts and may provide
answers to many long-standing questions posed by palaeographers and by scholars
in a variety of disciplines. Furthermore, through text retrieval, it holds
out the prospect of adding considerably to the existing corpus of texts and
to providing very many new research opportunities for coming generations
of scholars.
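As an illustration of the PCA step named in the abstract above, the following is a minimal sketch, not the thesis's implementation. It assumes the hyperspectral capture is available as a NumPy array of shape (height, width, bands); the function name `pca_bands` and its parameters are hypothetical.

```python
import numpy as np

def pca_bands(cube, n_components=3):
    """Project each pixel spectrum onto the leading principal components.

    cube: array of shape (height, width, bands); this layout is an assumption.
    Returns (component images, fraction of variance explained by each).
    """
    h, w, b = cube.shape
    # Treat every pixel as one observation with `b` spectral features.
    pixels = cube.reshape(-1, b).astype(np.float64)
    pixels -= pixels.mean(axis=0)  # centre each band
    # SVD of the centred data: rows of vt are principal axes in spectral space.
    _, s, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores.reshape(h, w, n_components), explained[:n_components]
```

Each returned component image can be viewed as a grey-scale picture; faint or erased writing sometimes shows more contrast in a lower-order component than in any single band.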
Advances in Image Processing, Analysis and Recognition Technology
For many decades, researchers have been trying to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. In fact, the range of practical applications of image processing algorithms is particularly wide. Moreover, the rapid growth of computational complexity and computer efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues still remain, resulting in the need for the development of novel approaches.
Analysis of Diagnostic Images of Artworks and Feature Extraction: Design of a Methodology
Digital images represent the primary tool for diagnostics and documentation of the state of preservation of artifacts. Today the interpretive filters that allow one to characterize information and communicate it are extremely subjective. Our research goal is to study a quantitative analysis methodology to facilitate and semi-automate the recognition and polygonization of areas corresponding to the characteristics searched. To this end, several algorithms have been tested that allow for separating the characteristics and creating binary masks to be statistically analyzed and polygonized. Since our methodology aims to offer a conservator-restorer model to obtain useful graphic documentation in a short time that is usable for design and statistical purposes, this process has been implemented in a single Geographic Information Systems (GIS) application.
Amura, Annamaria; Aldini, Alessandro; Pagnotta, Stefano; Salerno, Emanuele; Tonazzini, Anna; Triolo, Paolo
Analytical and mathematical methods for revealing hidden details in ancient manuscripts and paintings: A review
In this work, a critical review is presented of the current nondestructive probing and image analysis approaches for revealing otherwise invisible or hardly discernible details in manuscripts and paintings relevant to cultural heritage and archaeology. Multispectral imaging, X-ray fluorescence, Laser-Induced Breakdown Spectroscopy, Raman spectroscopy and Thermography are considered as techniques for acquiring images and spectral image sets; statistical methods for the analysis of these images are then discussed, including blind separation and false colour techniques. Several case studies are presented, with particular attention dedicated to the approaches that appear most promising for future applications. Some of the techniques described herein are likely to replace, in the near future, classical digital photography in the study of ancient manuscripts and paintings.
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Thirteen Modalities
For a long time, due to the high heterogeneity in structure and semantics
among various spatiotemporal modal data, the joint interpretation of multimodal
spatiotemporal data has been an extremely challenging problem. The primary
challenge resides in striking a trade-off between the cohesion and autonomy of
diverse modalities, and this trade-off exhibits a progressively nonlinear
nature as the number of modalities expands. We introduce the Language as
Reference Framework (LaRF), a fundamental principle for constructing a
multimodal unified model, aiming to strike a trade-off between the cohesion and
autonomy among different modalities. We propose a multimodal spatiotemporal
general artificial intelligence model, called AllSpark. Our model integrates
thirteen different modalities into a unified framework, including 1D (text,
code), 2D (RGB, infrared, SAR, multispectral, hyperspectral, tables, graphs,
trajectory, oblique photography), and 3D (point clouds, videos) modalities. To
achieve modal cohesion, AllSpark uniformly maps diverse modal features to the
language modality. In addition, we design modality-specific prompts to guide
multi-modal large language models in accurately perceiving multimodal data. To
maintain modality autonomy, AllSpark introduces modality-specific encoders to
extract the tokens of various spatiotemporal modalities. A modal bridge is
employed to achieve dimensional projection from each modality to the language
modality. Finally, observing a gap between the model's interpretation and
downstream tasks, we designed task heads to enhance the model's generalization
capability on specific downstream tasks. Experiments indicate that AllSpark
achieves competitive accuracy in modalities such as RGB and trajectory compared
to state-of-the-art models.
Comment: 49 pages, 16 tables, 3 figures
Visual image processing in various representation spaces for documentary preservation
This thesis establishes an advanced image processing framework for the enhancement and restoration of historical document images (HDI) in both intensity (gray-scale or color) and multispectral (MS) representation spaces. It provides three major contributions: 1) the binarization of gray-scale HDI; 2) the visual quality restoration of MS HDI; and 3) automatic reference data (RD) estimation for HDI binarization. HDI binarization is one of the enhancement techniques that produces bi-level information which is easy to handle using methods of analysis (OCR, for instance) and is less computationally costly to process than 256 levels of grey or color images. Restoring the visual quality of HDI in an MS representation space enhances their legibility, which is not possible with conventional intensity-based restoration methods, and HDI legibility is the main concern of historians and librarians wishing to transfer knowledge and revive ancient cultural heritage. The use of MS imaging systems is a new and attractive research trend in the field of numerical processing of cultural heritage documents. In this thesis, these systems are also used for automatically estimating more accurate RD to be used for the evaluation of HDI binarization algorithms in order to track the level of human performance.
Our first contribution, a new adaptive method of intensity-based binarization, is defined at the outset. Since degradation is present over document images, binarization methods must be adapted to handle degradation phenomena locally. Unfortunately, existing adaptive methods are not effective, as they are not able to capture weak text strokes, which results in a deterioration of the performance of character recognition engines. The proposed approach first detects a subset of the most probable text pixels, which are used to locally estimate the parameters of the two classes of pixels (text and background), and then performs a simple maximum likelihood (ML) classification to locally assign the remaining pixels to a class. To the best of our knowledge, this is the first time local parameter estimation and classification in an ML framework has been introduced for HDI binarization with promising results. A limitation of this method, as with intensity-based enhancement methods in general, is that it is not effective in dealing with severely degraded HDI. Developing more advanced methods based on MS information would be a promising alternative avenue of research.
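The two-stage idea in the paragraph above (seed pixels, local class parameters, ML classification) can be sketched as follows. This is an illustrative sketch only, not the thesis's method: the quantile-based seed rule, the fixed square tiles, the Gaussian class models, and every name and default value here are assumptions.

```python
import numpy as np

def gaussian_loglik(x, mean, var):
    """Log-density of a 1-D Gaussian, evaluated element-wise."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def ml_binarize(gray, tile=32, seed_quantile=0.1, min_var=1e-3):
    """Return a boolean mask that is True where a pixel is classified as text.

    gray: 2-D array of intensities with dark text on a light background.
    """
    gray = gray.astype(np.float64)
    out = np.zeros(gray.shape, dtype=bool)
    # Stage 1 (assumed rule): the darkest pixels are the most probable text.
    seeds = gray <= np.quantile(gray, seed_quantile)
    for r in range(0, gray.shape[0], tile):
        for c in range(0, gray.shape[1], tile):
            block = gray[r:r + tile, c:c + tile]
            seed_block = seeds[r:r + tile, c:c + tile]
            text, back = block[seed_block], block[~seed_block]
            if text.size < 2 or back.size < 2:
                # Too few samples for a local estimate: keep the seeds as-is.
                out[r:r + tile, c:c + tile] = seed_block
                continue
            # Stage 2: local Gaussian parameters for each class, then ML decision.
            ll_text = gaussian_loglik(block, text.mean(), max(text.var(), min_var))
            ll_back = gaussian_loglik(block, back.mean(), max(back.var(), min_var))
            out[r:r + tile, c:c + tile] = ll_text > ll_back
    return out
```

A tile that contains no real text still receives seed pixels from the global quantile, so this simple seed rule can produce false positives there; a stricter seed detector would be needed in practice.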
In the second contribution, a novel approach to the visual restoration of HDI is defined. The approach is aimed at providing end users (historians, librarians, etc.) with better HDI visualization. Specifically, it aims to remove degradations while keeping the original appearance of the HDI intact. In practice, this problem cannot be solved by conventional intensity-based restoration methods. To cope with these limitations, MS imaging is used to produce additional spectral images in the invisible light (infrared and ultraviolet) range, which gives greater contrast to objects in the documents. The inpainting-based variational framework proposed here for HDI restoration involves isolating the degradation phenomena in the infrared spectral images, and then inpainting them in the visible spectral images. The final color image to visualize is therefore reconstructed from the restored visible spectral images. To the best of our knowledge, this is the first time the inpainting technique has been introduced for MS HDI. The experimental results are promising, and our objective, in collaboration with the BAnQ (Bibliothèque et Archives nationales de Québec), is to push heritage documents into the public domain and build an intelligent engine for accessing them. It is useful to note that the proposed model can be extended to other MS-based image processing tasks.
Our third contribution addresses a new problem, the estimation of reference data (RD), in order to show the importance of working with MS images rather than gray-scale or color images. RD are mandatory for comparing different binarization algorithms, and they are usually generated by an expert. However, an expert’s RD is always subject to mislabeling and judgment errors, especially in the case of degraded data in restricted representation spaces (gray-scale or color images). In the proposed method, multiple RD generated by several experts are used in combination with MS HDI to estimate new, more accurate RD. The idea is to include the agreement of experts about labels and the multivariate data fidelity in a single Bayesian classification framework to estimate the a posteriori probability of new labels forming the final estimated RD. Our experiments show that estimated RD are more accurate than an expert’s RD. To the best of our knowledge, no similar work to combine binary data and multivariate data for the estimation of RD has been conducted.
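One way to picture the combination of expert agreement and multispectral data fidelity described above is the sketch below. It is an assumption-laden illustration, not the thesis's model: the prior is taken as the fraction of experts voting "text", each class is modelled as a diagonal Gaussian fitted on pixels where all experts agree, and the function name `fuse_reference_data` is hypothetical.

```python
import numpy as np

def fuse_reference_data(expert_masks, ms_cube, eps=1e-6):
    """Estimate a text mask from several expert masks and a multispectral cube.

    expert_masks: list of (H, W) boolean arrays, True where an expert marked text.
    ms_cube: (H, W, bands) array of multispectral values.
    """
    stack = np.stack(expert_masks).astype(bool)
    h, w, b = ms_cube.shape
    x = ms_cube.reshape(-1, b).astype(np.float64)

    # Prior from expert agreement: fraction of experts voting "text".
    prior_text = np.clip(stack.mean(axis=0).ravel(), eps, 1 - eps)

    # Class models are fitted only where the experts are unanimous.
    sure_text = x[stack.all(axis=0).ravel()]
    sure_back = x[~stack.any(axis=0).ravel()]
    if len(sure_text) < 2 or len(sure_back) < 2:
        raise ValueError("need unanimous pixels in both classes")

    def loglik(samples):
        # Diagonal-Gaussian log-likelihood of every pixel under one class.
        mean = samples.mean(axis=0)
        var = samples.var(axis=0) + eps
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

    # Posterior is proportional to prior times likelihood; compare in log space.
    log_post_text = np.log(prior_text) + loglik(sure_text)
    log_post_back = np.log1p(-prior_text) + loglik(sure_back)
    return (log_post_text > log_post_back).reshape(h, w)
```

Because the comparison uses both terms, a pixel that most experts mislabelled can still be corrected when its spectrum clearly matches the other class.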
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven to be an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we encourage remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.
Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazine