18,695 research outputs found
Combining Parametric and Non-parametric Algorithms for a Partially Unsupervised Classification of Multitemporal Remote-Sensing Images
In this paper, we propose a classification system based on a multiple-classifier architecture, which is aimed at updating land-cover maps by using multisensor and/or multisource remote-sensing images. The proposed system is composed of an ensemble of classifiers that, once trained in a supervised way on a specific image of a given area, can be retrained in an unsupervised way to classify a new image of the considered site. In this context, two techniques are presented for the unsupervised updating of the parameters of a maximum-likelihood (ML) classifier and a radial basis function (RBF) neural-network classifier, on the basis of the distribution of the new image to be classified. Experimental results carried out on a multitemporal and multisource remote-sensing data set confirm the effectiveness of the proposed system
A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set and as such to purify the data for processing. The original outlier detection methods were arbitrary but now, principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review
Security Evaluation of Support Vector Machines in Adversarial Environments
Support Vector Machines (SVMs) are among the most popular classification
techniques adopted in security applications like malware detection, intrusion
detection, and spam filtering. However, if SVMs are to be incorporated in
real-world security systems, they must be able to cope with attack patterns
that can either mislead the learning algorithm (poisoning), evade detection
(evasion), or gain information about their internal parameters (privacy
breaches). The main contributions of this chapter are twofold. First, we
introduce a formal general framework for the empirical evaluation of the
security of machine-learning systems. Second, according to our framework, we
demonstrate the feasibility of evasion, poisoning and privacy attacks against
SVMs in real-world security problems. For each attack technique, we evaluate
its impact and discuss whether (and how) it can be countered through an
adversary-aware design of SVMs. Our experiments are easily reproducible thanks
to open-source code that we have made available, together with all the employed
datasets, on a public repository.Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector
Machine Applications
Pseudo-Marginal Bayesian Inference for Gaussian Processes
The main challenges that arise when adopting Gaussian Process priors in
probabilistic modeling are how to carry out exact Bayesian inference and how to
account for uncertainty on model parameters when making model-based predictions
on out-of-sample data. Using probit regression as an illustrative working
example, this paper presents a general and effective methodology based on the
pseudo-marginal approach to Markov chain Monte Carlo that efficiently addresses
both of these issues. The results presented in this paper show improvements
over existing sampling methods to simulate from the posterior distribution over
the parameters defining the covariance function of the Gaussian Process prior.
This is particularly important as it offers a powerful tool to carry out full
Bayesian inference of Gaussian Process based hierarchic statistical models in
general. The results also demonstrate that Monte Carlo based integration of all
model parameters is actually feasible in this class of models providing a
superior quantification of uncertainty in predictions. Extensive comparisons
with respect to state-of-the-art probabilistic classifiers confirm this
assertion.Comment: 14 pages double colum
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
A Multiple Cascade-Classifier System for a Robust and Partially Unsupervised Updating of Land-Cover Maps
A system for a regular updating of land-cover maps is proposed that is based on the use of multitemporal remote-sensing images. Such a system is able to face the updating problem under the realistic but critical constraint that, for the image to be classified (i.e., the most recent of the considered multitemporal data set), no ground truth information is available. The system is composed of an ensemble of partially unsupervised classifiers integrated in a multiple classifier architecture. Each classifier of the ensemble exhibits the following novel peculiarities: i) it is developed in the framework of the cascade-classification approach to exploit the temporal correlation existing between images acquired at different times in the considered area; ii) it is based on a partially unsupervised methodology capable to accomplish the classification process under the aforementioned critical constraint. Both a parametric maximum-likelihood classification approach and a non-parametric radial basis function (RBF) neural-network classification approach are used as basic methods for the development of partially unsupervised cascade classifiers. In addition, in order to generate an effective ensemble of classification algorithms, hybrid maximum-likelihood and RBF neural network cascade classifiers are defined by exploiting the peculiarities of the cascade-classification methodology. The results yielded by the different classifiers are combined by using standard unsupervised combination strategies. This allows the definition of a robust and accurate partially unsupervised classification system capable of analyzing a wide typology of remote-sensing data (e.g., images acquired by passive sensors, SAR images, multisensor and multisource data). Experimental results obtained on a real multitemporal and multisource data set confirm the effectiveness of the proposed system
Non-Parametric Calibration of Probabilistic Regression
The task of calibration is to retrospectively adjust the outputs from a
machine learning model to provide better probability estimates on the target
variable. While calibration has been investigated thoroughly in classification,
it has not yet been well-established for regression tasks. This paper considers
the problem of calibrating a probabilistic regression model to improve the
estimated probability densities over the real-valued targets. We propose to
calibrate a regression model through the cumulative probability density, which
can be derived from calibrating a multi-class classifier. We provide three
non-parametric approaches to solve the problem, two of which provide empirical
estimates and the third providing smooth density estimates. The proposed
approaches are experimentally evaluated to show their ability to improve the
performance of regression models on the predictive likelihood
- …