
    Combining Parametric and Non-parametric Algorithms for a Partially Unsupervised Classification of Multitemporal Remote-Sensing Images

    In this paper, we propose a classification system based on a multiple-classifier architecture, which is aimed at updating land-cover maps by using multisensor and/or multisource remote-sensing images. The proposed system is composed of an ensemble of classifiers that, once trained in a supervised way on a specific image of a given area, can be retrained in an unsupervised way to classify a new image of the considered site. In this context, two techniques are presented for the unsupervised updating of the parameters of a maximum-likelihood (ML) classifier and a radial basis function (RBF) neural-network classifier, on the basis of the distribution of the new image to be classified. Experimental results obtained on a multitemporal and multisource remote-sensing data set confirm the effectiveness of the proposed system.
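The abstract does not say how the ML classifier's parameters are re-estimated, so the following is only a minimal sketch of one plausible approach, assuming one-dimensional Gaussian class models and expectation-maximization (EM) started from the parameters learned on the old image. The function names (`em_retrain`, `classify`) and the pixel values are hypothetical and chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_retrain(samples, priors, means, variances, n_iter=50):
    """Re-estimate class priors, means and variances from unlabeled samples.

    Assumption: EM is used as the unsupervised update; the initial values are
    the parameters estimated with supervision on the earlier image.
    """
    priors, means, variances = list(priors), list(means), list(variances)
    n_classes = len(means)
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each sample.
        resp = []
        for x in samples:
            joint = [priors[c] * gaussian_pdf(x, means[c], variances[c])
                     for c in range(n_classes)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: update the parameters with the posteriors as soft labels.
        for c in range(n_classes):
            weight = sum(r[c] for r in resp)
            priors[c] = weight / len(samples)
            means[c] = sum(r[c] * x for r, x in zip(resp, samples)) / weight
            spread = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, samples))
            variances[c] = max(spread / weight, 1e-6)  # floor avoids collapse
    return priors, means, variances


def classify(x, priors, means, variances):
    """Maximum-likelihood (Bayes) decision: index of the best-scoring class."""
    scores = [p * gaussian_pdf(x, m, v) for p, m, v in zip(priors, means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical parameters learned on the old image (two land-cover classes).
old_priors, old_means, old_vars = [0.5, 0.5], [0.0, 5.0], [1.0, 1.0]
# Hypothetical unlabeled pixel values from the new image; both classes drifted.
new_pixels = [0.6, 0.8, 1.0, 1.2, 1.4, 6.1, 6.3, 6.5, 6.7, 6.9]
new_priors, new_means, new_vars = em_retrain(new_pixels, old_priors, old_means, old_vars)
```

In this toy setting the re-estimated means follow the drifted clusters, so a pixel value such as 3.0 changes class between the old and the updated model.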

    A survey of outlier detection methodologies

    Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

    Security Evaluation of Support Vector Machines in Adversarial Environments

    Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated in real-world security systems, they must be able to cope with attack patterns that can either mislead the learning algorithm (poisoning), evade detection (evasion), or gain information about their internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, according to our framework, we demonstrate the feasibility of evasion, poisoning and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, on a public repository. Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector Machine Applications

    Pseudo-Marginal Bayesian Inference for Gaussian Processes

    The main challenges that arise when adopting Gaussian Process priors in probabilistic modeling are how to carry out exact Bayesian inference and how to account for uncertainty in model parameters when making model-based predictions on out-of-sample data. Using probit regression as an illustrative working example, this paper presents a general and effective methodology based on the pseudo-marginal approach to Markov chain Monte Carlo that efficiently addresses both of these issues. The results presented in this paper show improvements over existing sampling methods for simulating from the posterior distribution over the parameters defining the covariance function of the Gaussian Process prior. This is particularly important as it offers a powerful tool to carry out full Bayesian inference of Gaussian Process-based hierarchical statistical models in general. The results also demonstrate that Monte Carlo-based integration of all model parameters is actually feasible in this class of models, providing a superior quantification of uncertainty in predictions. Extensive comparisons with state-of-the-art probabilistic classifiers confirm this assertion. Comment: 14 pages double colum

    One-Class Classification: Taxonomy of Study and Review of Techniques

    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This situation makes it harder to learn efficient classifiers, because the class boundary must be defined using only knowledge of the positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and present our vision for future research. Comment: 24 pages + 11 pages of references, 8 figure
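To make the setting concrete, here is a minimal sketch of a one-class classifier that sees only positive examples. It assumes a one-dimensional Gaussian model of the positive class and a distance threshold taken from a quantile of the training data. This is one simple density-based approach, not a method taken from the review, and all names and values are hypothetical.

```python
import statistics


def fit_one_class(positives, quantile=0.95):
    """Fit a boundary around the positive class using positive samples only.

    Returns (mean, std, threshold), where threshold is the standardized
    distance from the mean below which a sample is accepted as positive.
    """
    mean = statistics.mean(positives)
    std = statistics.pstdev(positives)
    distances = sorted(abs(x - mean) / std for x in positives)
    # Index of the requested quantile, clamped to the last training sample.
    index = min(len(distances) - 1, int(quantile * len(distances)))
    return mean, std, distances[index]


def is_positive(x, model):
    """Accept x if it lies within the learned boundary."""
    mean, std, threshold = model
    return abs(x - mean) / std <= threshold


# Hypothetical positive-class measurements; no negative examples are available.
positives = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
model = fit_one_class(positives)
```

With these values, a sample near 10 is accepted and a sample far from the training data, such as 14.0, is rejected, even though no negative example was ever seen.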

    A Multiple Cascade-Classifier System for a Robust and Partially Unsupervised Updating of Land-Cover Maps

    A system for the regular updating of land-cover maps is proposed that is based on the use of multitemporal remote-sensing images. Such a system is able to face the updating problem under the realistic but critical constraint that, for the image to be classified (i.e., the most recent of the considered multitemporal data set), no ground truth information is available. The system is composed of an ensemble of partially unsupervised classifiers integrated in a multiple classifier architecture. Each classifier of the ensemble exhibits the following novel characteristics: i) it is developed in the framework of the cascade-classification approach to exploit the temporal correlation existing between images acquired at different times in the considered area; ii) it is based on a partially unsupervised methodology capable of accomplishing the classification process under the aforementioned critical constraint. Both a parametric maximum-likelihood classification approach and a non-parametric radial basis function (RBF) neural-network classification approach are used as basic methods for the development of partially unsupervised cascade classifiers. In addition, in order to generate an effective ensemble of classification algorithms, hybrid maximum-likelihood and RBF neural network cascade classifiers are defined by exploiting the characteristics of the cascade-classification methodology. The results yielded by the different classifiers are combined by using standard unsupervised combination strategies. This allows the definition of a robust and accurate partially unsupervised classification system capable of analyzing a wide variety of remote-sensing data (e.g., images acquired by passive sensors, SAR images, multisensor and multisource data). Experimental results obtained on a real multitemporal and multisource data set confirm the effectiveness of the proposed system.
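The abstract does not name the combination strategies it uses. As an illustration only, the sketch below assumes per-pixel majority voting, a common combination rule that needs no labeled data. The function names and the tiny label maps are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label.

    Ties go to the label that appears first, i.e. the earliest classifier.
    """
    return Counter(labels).most_common(1)[0][0]


def combine_maps(classifier_outputs):
    """Combine several label maps (one flat list per classifier) pixel by pixel."""
    return [majority_vote(pixel_labels) for pixel_labels in zip(*classifier_outputs)]


# Hypothetical outputs of three classifiers for the same three pixels.
ml_map = ["water", "forest", "urban"]
rbf_map = ["water", "urban", "urban"]
hybrid_map = ["forest", "forest", "urban"]
combined = combine_maps([ml_map, rbf_map, hybrid_map])
```

Each pixel of the combined map takes the label that most classifiers agree on, so an error made by a single classifier is outvoted by the other two.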

    Non-Parametric Calibration of Probabilistic Regression

    The task of calibration is to retrospectively adjust the outputs from a machine learning model to provide better probability estimates on the target variable. While calibration has been investigated thoroughly in classification, it has not yet been well established for regression tasks. This paper considers the problem of calibrating a probabilistic regression model to improve the estimated probability densities over the real-valued targets. We propose to calibrate a regression model through the cumulative probability density, which can be derived from calibrating a multi-class classifier. We provide three non-parametric approaches to solve the problem, two of which provide empirical estimates, while the third provides smooth density estimates. The proposed approaches are experimentally evaluated to show their ability to improve the performance of regression models on the predictive likelihood.
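As a rough illustration of calibrating through the cumulative distribution, the sketch below recalibrates a Gaussian predictive CDF with the empirical distribution of its values on a held-out calibration set. This is a generic empirical recalibration and is not claimed to be any of the three approaches of the paper. The predictive model, residuals and function names are all hypothetical.

```python
import bisect
import math


def gaussian_cdf(y, mean, std):
    """Predictive CDF of a Gaussian regression model at target value y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


def fit_recalibration(cdf_values):
    """Build a map from predicted to calibrated cumulative probability.

    cdf_values holds the model's predictive CDF evaluated at the observed
    targets of a calibration set. The returned function gives the fraction
    of those values that are less than or equal to p.
    """
    sorted_values = sorted(cdf_values)
    n = len(sorted_values)

    def recalibrate(p):
        return bisect.bisect_right(sorted_values, p) / n

    return recalibrate


# Hypothetical calibration set: the model always predicts mean 0 and std 1,
# but the observed residuals are more spread out (the model is overconfident).
residuals = [-3.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
cdf_values = [gaussian_cdf(r, 0.0, 1.0) for r in residuals]
recalibrate = fit_recalibration(cdf_values)

raw = gaussian_cdf(1.0, 0.0, 1.0)  # model's claimed P(y <= 1), about 0.84
calibrated = recalibrate(raw)      # share of calibration targets <= 1
```

In this toy case the model claims that about 84% of targets fall below 1.0, while only 7 of the 9 calibration targets do, so the recalibrated probability is lower.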