What's wrong with the murals at the Mogao Grottoes : a near-infrared hyperspectral imaging method
Although a significant amount of work has been performed to preserve the ancient murals in the Mogao Grottoes by Dunhuang Cultural Research, non-contact methods need to be developed to effectively evaluate the degree of flaking of the murals. In this study, we propose to evaluate the flaking by automatically analyzing hyperspectral images that were scanned at the site. Murals with various degrees of flaking were scanned in the 126th cave using a near-infrared (NIR) hyperspectral camera with a spectral range of approximately 900 to 1700 nm. The regions of interest (ROIs) of the murals were manually labeled and grouped into four levels: normal, slight, moderate, and severe. The average spectral data from each ROI and its group label were used to train our classification model. To predict the degree of flaking, we adopted four algorithms: deep belief networks (DBNs), partial least squares regression (PLSR), principal component analysis with a support vector machine (PCA + SVM) and principal component analysis with an artificial neural network (PCA + ANN). The experimental results show the effectiveness of our method. In particular, better results are obtained using DBNs when the training data contain a significant amount of striping noise.
Data-driven clustering : new methods and applications
Abstract:
This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
Summary:
This research concerns the development and application of so-called unsupervised learning methods. The target applications are the analysis of forensic data and the classification of hyperspectral remote sensing images. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed. This measure is obtained by optimizing a cost function related to the preservation of a point's neighborhood structure between the space of the original variables and the space of principal components. The method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on the joint optimization of variable selection and classification is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes the technique applicable to very high resolution images. The results show that such a technique can classify very large databases without difficulty and gives results that compare favorably with existing methods.
Mixed spectral-structural classification of very high resolution images with summation kernels
In this paper, mixed spectral-structural kernel machines are proposed
for the classification of very-high resolution images. The simultaneous
use of multispectral and structural features (computed using morphological
filters) allows a significant increase in classification accuracy
of remote sensing images. Subsequently, weighted summation kernel
support vector machines are proposed and applied in order to take
into account the multiscale nature of the scene considered. Such
classifiers use the Mercer property of kernel matrices to compute
a new kernel matrix accounting simultaneously for two scale parameters.
Tests on a Zurich QuickBird image show the relevance of the proposed
method: using the mixed spectral-structural features, the classification
accuracy increases by about 5%, achieving a Kappa index of 0.97.
The proposed multikernel approach provides an overall accuracy of
98.90% with a corresponding Kappa index of 0.985.
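The weighted summation kernel described in this abstract can be illustrated with a minimal sketch. It relies on the standard fact that a convex combination of Mercer kernels is again a Mercer kernel. The choice of two Gaussian RBF kernels, the function names, and the parameters `sigma_small`, `sigma_large` and `mu` are assumptions made for illustration; the abstract does not specify the base kernels or how the weight is chosen.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel between two feature vectors of equal length."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def summation_kernel(x, y, sigma_small, sigma_large, mu):
    """Weighted sum of two RBF kernels with different scale parameters.

    mu in [0, 1] is a hypothetical weight balancing the two scales; a convex
    combination of valid kernels is itself a valid kernel.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return (mu * rbf_kernel(x, y, sigma_small)
            + (1.0 - mu) * rbf_kernel(x, y, sigma_large))


def gram_matrix(samples, sigma_small, sigma_large, mu):
    """Kernel (Gram) matrix of the summation kernel over a list of samples."""
    return [[summation_kernel(a, b, sigma_small, sigma_large, mu)
             for b in samples]
            for a in samples]
```

A matrix built this way could be passed to any SVM implementation that accepts a precomputed kernel; the weight and the two scales would typically be tuned by model selection.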
Ensemble methods for environmental data modelling with support vector regression
This paper investigates the use of ensembles of predictors in order
to improve the performance of spatial prediction methods. Support
vector regression (SVR), a popular method from the field of statistical
machine learning, is used. Several instances of SVR are combined
using different data sampling schemes (bagging and boosting). Bagging
shows good performance, and proves to be more computationally efficient
than training a single SVR model while reducing error. Boosting,
however, does not improve results on this specific problem.
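As a rough illustration of the bagging scheme mentioned in this abstract, the sketch below trains several predictors on bootstrap resamples and averages their outputs. To stay dependency-free it uses a one-dimensional nearest-neighbour regressor as a stand-in base learner, which is not SVR; all function names, `n_models` and `seed` are hypothetical and not taken from the paper.

```python
import random


def fit_nearest_neighbour(xs, ys):
    """Stand-in base learner (NOT SVR): return the target of the closest input."""
    pairs = list(zip(xs, ys))

    def predict(x):
        return min(pairs, key=lambda p: abs(p[0] - x))[1]

    return predict


def bagging_fit(xs, ys, fit, n_models=25, seed=0):
    """Train n_models base learners on bootstrap resamples and average them."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        # Ensemble prediction is the mean of the individual predictions.
        return sum(m(x) for m in models) / len(models)

    return predict
```

Replacing `fit_nearest_neighbour` with a function that fits an SVR model and returns its prediction function would give the kind of ensemble the abstract refers to.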
Deep learning via semi-supervised embedding
We show how nonlinear embedding algorithms popular for use with shallow
semi-supervised learning techniques such as kernel methods can be
applied to deep multilayer architectures, either as a regularizer
at the output layer, or on each layer of the architecture. This provides
a simple alternative to existing approaches to deep learning whilst
yielding competitive error rates compared to those methods, and existing
shallow semi-supervised techniques.
Active Learning of Very-High Resolution Optical Imagery with SVM: Entropy vs Margin Sampling
An active learning method is proposed for the semi-automatic selection
of training sets in remote sensing image classification. The method
iteratively adds to the current training set the unlabeled pixels
for which the prediction of an ensemble of classifiers based on bagged
training sets shows maximum entropy. This way, the algorithm selects
the pixels that are the most uncertain and that will improve the
model if added to the training set. The user is asked to label such
pixels at each iteration. Experiments using support vector machines
(SVM) on an 8-class QuickBird image show the excellent performance
of the method, which equals the accuracy of both a model trained with
ten times more pixels and a model whose training set has been built
using a state-of-the-art SVM-specific active learning method.
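The selection step described in this abstract can be sketched as follows, assuming each unlabeled pixel has already been classified by every member of the bagged ensemble. The entropy is computed over the distribution of predicted labels; the data layout (a dictionary from pixel identifier to a list of votes) and the names `vote_entropy`, `select_queries` and `batch_size` are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (natural log) of the labels predicted by the ensemble."""
    counts = Counter(votes)
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_queries(committee_votes, batch_size):
    """Return the ids of the batch_size pixels with the highest vote entropy.

    committee_votes maps a pixel id to the list of labels predicted for that
    pixel by each classifier of the ensemble.
    """
    ranked = sorted(committee_votes,
                    key=lambda pid: vote_entropy(committee_votes[pid]),
                    reverse=True)
    return ranked[:batch_size]
```

The returned pixels would then be shown to the user for labeling, added to the training set, and the ensemble retrained before the next iteration.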
Multi-source composite kernels for urban image classification
This letter presents advanced classification methods for very high
resolution images. Efficient multisource information, both spectral
and spatial, is exploited through the use of composite kernels in
support vector machines. Weighted summations of kernels accounting
for separate sources of spectral and spatial information are analyzed
and compared to classical approaches such as pure spectral classification
or stacked approaches using all the features in a single vector.
Model selection problems are addressed, as well as the importance
of the different kernels in the weighted summation.