Mapping and monitoring forest remnants : a multiscale analysis of spatio-temporal data
KEYWORDS: Landsat, time series, machine learning, semideciduous Atlantic forest, Brazil, wavelet transforms, classification, change detection
Forests play a major role in important global matters such as the carbon cycle, climate change, and biodiversity. Forests also influence soil and water dynamics, with major consequences for ecological relations and decision-making. One basic requirement to quantify and model these processes is the availability of accurate maps of forest cover. Data acquisition and analysis at appropriate scales is the keystone to achieving the mapping accuracy needed for the development and reliable use of ecological models. The current and upcoming production of high-resolution data sets, plus the ever-increasing time series that have been collected since the 1970s, must be effectively explored. Missing values and distortions further complicate the analysis of these data sets. Thus, integration and proper analysis are of utmost importance for environmental research. New conceptual models in environmental sciences, like the perception of multiple scales, require the development of effective implementation techniques. This thesis presents new methodologies to map and monitor forests over large, highly fragmented areas with complex land use patterns. The use of temporal information is extensively explored to distinguish natural forests from other land cover types that are spectrally similar. In chapter 4, novel schemes based on multiscale wavelet analysis are introduced, which enabled effective preprocessing of long time series of Landsat data and improved their applicability to environmental assessment. In chapter 5, the produced time series as well as other information on spectral and spatial characteristics were used to classify forested areas in an experiment relating a number of combinations of attribute features. Feature sets were defined based on expert knowledge and on data mining techniques to be input to traditional and machine learning algorithms for pattern recognition, viz. maximum likelihood, univariate and multivariate decision trees, and neural networks. The results showed that maximum likelihood classification using temporal texture descriptors extracted with wavelet transforms was the most accurate at classifying the semideciduous Atlantic forest in the study area. In chapter 6, a multiscale approach to digital change detection was developed to deal with multisensor and noisy remotely sensed images. Changes were extracted according to size classes, minimising the effects of geometric and radiometric misregistration. Finally, in chapter 7, an automated procedure for GIS updating based on feature extraction, segmentation and classification was developed to monitor the remnants of semideciduous Atlantic forest. The procedure showed significant improvements over post-classification comparison and direct multidate classification based on artificial neural networks.
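To make the idea of multiscale wavelet preprocessing of a time series concrete, here is a minimal sketch using the Haar wavelet. The abstract does not say which wavelet family or scheme the thesis uses, so the Haar choice, the function names, and the toy signal are all assumptions made for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def haar_decompose(signal, levels):
    """Return (coarsest approximation, details); details[0] is the finest scale."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

def haar_reconstruct(approx, details):
    """Rebuild the signal from the coarsest approximation and all detail levels."""
    for detail in reversed(details):
        approx = haar_inverse_step(approx, detail)
    return approx

# Hypothetical per-pixel time series (e.g. a vegetation index), not real data.
series = [2.0, 4.0, 6.0, 8.0, 5.0, 3.0, 9.0, 1.0]
coarse, details = haar_decompose(series, levels=3)
# Zeroing the finest-scale details gives a crude smoothing of the series.
smoothed = haar_reconstruct(coarse, [[0.0] * len(details[0])] + details[1:])
```

Dropping or shrinking detail coefficients at selected scales is one common way to suppress noise in a series while keeping its coarser temporal structure; a production workflow would more likely rely on an established wavelet library than on this hand-written transform.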
Interpretable Categorization of Heterogeneous Time Series Data
Understanding heterogeneous multivariate time series data is important in
many applications ranging from smart homes to aviation. Learning models of
heterogeneous multivariate time series that are also human-interpretable is
challenging and not adequately addressed by the existing literature. We propose
grammar-based decision trees (GBDTs) and an algorithm for learning them. GBDTs
extend decision trees with a grammar framework. Logical expressions derived
from a context-free grammar are used for branching in place of simple
thresholds on attributes. The added expressivity enables support for a wide
range of data types while retaining the interpretability of decision trees. In
particular, when a grammar based on temporal logic is used, we show that GBDTs
can be used for the interpretable classification of high-dimensional and
heterogeneous time series data. Furthermore, we show how GBDTs can also be used
for categorization, which is a combination of clustering and generating
interpretable explanations for each cluster. We apply GBDTs to analyze the
classic Australian Sign Language dataset as well as data on near mid-air
collisions (NMACs). The NMAC data comes from aircraft simulations used in the
development of the next-generation Airborne Collision Avoidance System (ACAS
X).
Comment: 9 pages, 5 figures, 2 tables, SIAM International Conference on Data
Mining (SDM) 201
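The branching idea described in this abstract, logical expressions in place of attribute thresholds, can be sketched as follows. This is not the authors' implementation: the tree is hand-built rather than learned, the two temporal operators are a tiny stand-in for a full grammar, and the variable name `separation_m` and its threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

Series = Dict[str, List[float]]  # one multivariate time series: variable -> samples

def series_length(series: Series) -> int:
    return len(next(iter(series.values())))

def less_than(name: str, threshold: float) -> Callable[[Series, int], bool]:
    """Atomic predicate evaluated at a single time step."""
    return lambda series, t: series[name][t] < threshold

def always(pred) -> Callable[[Series], bool]:
    """Temporal operator G: the predicate holds at every time step."""
    return lambda series: all(pred(series, t) for t in range(series_length(series)))

def eventually(pred) -> Callable[[Series], bool]:
    """Temporal operator F: the predicate holds at some time step."""
    return lambda series: any(pred(series, t) for t in range(series_length(series)))

@dataclass
class Leaf:
    label: str

@dataclass
class Node:
    rule: Callable[[Series], bool]   # logical expression used for branching
    description: str                 # human-readable form of the rule
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]

def classify(tree: Union[Node, Leaf], series: Series) -> Tuple[str, List[Tuple[str, bool]]]:
    """Return the predicted label and the readable path of rules that led to it."""
    path = []
    while isinstance(tree, Node):
        outcome = tree.rule(series)
        path.append((tree.description, outcome))
        tree = tree.if_true if outcome else tree.if_false
    return tree.label, path

# Hypothetical one-node tree over an assumed separation variable.
tree = Node(
    rule=eventually(less_than("separation_m", 150.0)),
    description="F(separation_m < 150)",
    if_true=Leaf("near miss"),
    if_false=Leaf("safe"),
)
label, path = classify(tree, {"separation_m": [900.0, 400.0, 120.0, 600.0]})
```

The returned path doubles as the explanation for the prediction, which is the interpretability property the abstract emphasizes.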
Interpretable Aircraft Engine Diagnostic via Expert Indicator Aggregation
Detecting early signs of failures (anomalies) in complex systems is one of
the main goals of preventive maintenance. In particular, it makes it possible to avoid
actual failures by (re)scheduling maintenance operations in a way that
optimizes maintenance costs. Aircraft engine health monitoring is one
representative example of a field in which anomaly detection is crucial.
Manufacturers collect large amounts of engine-related data during flights, which
are used, among other applications, to detect anomalies. This article
introduces and studies a generic methodology for building automatic detectors of
early signs of anomalies in a way that builds upon human expertise and
that remains understandable by human operators who make the final maintenance
decision. The main idea of the method is to generate a very large number of
binary indicators based on parametric anomaly scores designed by experts,
complemented by simple aggregations of those scores. A feature selection method
is used to keep only the most discriminant indicators, which are used as inputs
to a Naive Bayes classifier. This gives an interpretable classifier based on
interpretable anomaly detectors whose parameters have been optimized indirectly
by the selection process. The proposed methodology is evaluated on simulated
data designed to reproduce some of the anomaly types observed in real-world
engines.
Comment: arXiv admin note: substantial text overlap with arXiv:1408.6214,
arXiv:1409.4747, arXiv:1407.088
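The pipeline described above, in which expert anomaly scores are turned into binary indicators that feed a Naive Bayes classifier, can be sketched roughly as follows. This is an illustrative reduction, not the article's method: the scores, thresholds, and labels are invented, the feature selection step is omitted, and the classifier is a plain Bernoulli Naive Bayes with Laplace smoothing.

```python
import math

def binarize(scores, thresholds):
    """Turn each anomaly score into one binary indicator per threshold."""
    return [int(s > t) for s in scores for t in thresholds]

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log priors and smoothed per-indicator firing probabilities."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        probs = [(sum(col) + alpha) / (len(rows) + 2 * alpha) for col in zip(*rows)]
        model[label] = (math.log(len(rows) / len(y)), probs)
    return model

def predict(model, x):
    """Return the label with the highest log posterior for indicator vector x."""
    def log_posterior(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if xi else 1.0 - p) for xi, p in zip(x, probs)
        )
    return max(model, key=log_posterior)

# Hypothetical data: two expert scores per flight, two thresholds per score.
thresholds = [0.5, 0.8]
train_scores = [[0.1, 0.2], [0.3, 0.6], [0.9, 0.7], [0.95, 0.9]]
train_labels = ["normal", "normal", "anomalous", "anomalous"]
X = [binarize(s, thresholds) for s in train_scores]
model = fit_bernoulli_nb(X, train_labels)
decision = predict(model, binarize([0.85, 0.2], thresholds))
```

Because each input is a named indicator (a given score exceeding a given threshold), the per-class probabilities in `model` can be read directly by an operator, which is the kind of interpretability the abstract refers to.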
Improved γ/hadron separation for the detection of faint gamma-ray sources using boosted decision trees
Imaging atmospheric Cherenkov telescopes record an enormous number of
cosmic-ray background events. Suppressing these background events while
retaining γ-rays is key to achieving good sensitivity to faint
γ-ray sources. The differentiation between signal and background events
can be accomplished using machine learning algorithms, which are already used
in various fields of physics. Multivariate analyses combine several variables
into a single variable that indicates the degree to which an event is
γ-ray-like or cosmic-ray-like. In this paper we will focus on the use of
boosted decision trees for γ/hadron separation. We apply the method to
data from the Very Energetic Radiation Imaging Telescope Array System
(VERITAS), and demonstrate an improved sensitivity compared to the VERITAS
standard analysis.
Comment: accepted for publication in Astroparticle Physic
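As a rough illustration of how boosting combines several variables into a single γ-likeness score, the sketch below implements AdaBoost over depth-one decision trees (stumps). It is not the VERITAS analysis: the abstract does not specify the boosting variant or the input variables, so the algorithm choice, the two image features, and the toy events are all assumptions.

```python
import math

def stump_predict(x, feature, threshold, sign):
    """Depth-one tree: return sign if x[feature] > threshold, else -sign."""
    return sign if x[feature] > threshold else -sign

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(
                    wi for x, yi, wi in zip(X, y, w)
                    if stump_predict(x, j, thr, sign) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def fit_adaboost(X, y, rounds=10):
    """Train weighted stumps; labels are +1 (gamma-like) and -1 (hadron-like)."""
    n = len(y)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, j, thr, sign = best_stump(X, y, w)
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, j, thr, sign))
        # Increase the weight of misclassified events, then renormalize.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(x, j, thr, sign))
            for x, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def score(ensemble, x):
    """Single combined variable: positive is gamma-like, negative hadron-like."""
    return sum(a * stump_predict(x, j, thr, s) for a, j, thr, s in ensemble)

# Hypothetical events described by two assumed image features.
X = [[0.10, 0.30], [0.15, 0.35], [0.40, 0.20], [0.50, 0.60]]
y = [1, 1, -1, -1]
bdt = fit_adaboost(X, y, rounds=5)
gamma_likeness = score(bdt, [0.12, 0.30])
```

A cut on the combined score then plays the role of the single selection variable; real analyses typically use deeper trees and a dedicated multivariate analysis toolkit.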
Modeling Binary Time Series Using Gaussian Processes with Application to Predicting Sleep States
Motivated by the problem of predicting sleep states, we develop a mixed
effects model for binary time series with a stochastic component represented by
a Gaussian process. The fixed component captures the effects of covariates on
the binary-valued response. The Gaussian process captures the residual
variations in the binary response that are not explained by covariates and past
realizations. We develop a frequentist modeling framework that provides
efficient inference and more accurate predictions. Results demonstrate
improved prediction rates over existing approaches such as logistic
regression, generalized additive mixed models, models for ordinal data,
gradient boosting, decision trees, and random forests. Using our proposed model,
we show that previous sleep state and heart rates are significant predictors
for future sleep states. Simulation studies also show that our proposed method
is promising and robust. To handle computational complexity, we utilize Laplace
approximation, golden section search and successive parabolic interpolation.
With this paper, we also submit an R-package (HIBITS) that implements the
proposed procedure.
Comment: Journal of Classification (2018
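Of the numerical tools named in this abstract, golden section search is the simplest to show in isolation. The sketch below minimizes a one-dimensional unimodal function; the quadratic objective is a stand-in chosen for illustration, since the abstract does not state which quantity the authors optimize with it, and the code is not taken from the HIBITS package.

```python
import math

INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0  # reciprocal of the golden ratio, about 0.618

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Approximate the minimizer of a unimodal function f on [lo, hi]."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)  # left interior point
    d = a + INV_PHI * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new right point.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new left point.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2.0

# Toy objective (an assumption), with its minimum at x = 2.
x_min = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation is needed per step, which matters when the objective is expensive to compute.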