Online Nonparametric Anomaly Detection based on Geometric Entropy Minimization
We consider the online and nonparametric detection of abrupt and persistent
anomalies, such as a change in the regular system dynamics at a time instance
due to an anomalous event (e.g., a failure, a malicious activity). Combining
the simplicity of the nonparametric Geometric Entropy Minimization (GEM) method
with the timely detection capability of the Cumulative Sum (CUSUM) algorithm, we
propose a computationally efficient online anomaly detection method that is
applicable to high-dimensional datasets and at the same time achieves a
near-optimum average detection delay for a given false alarm
constraint. We provide new insights into both GEM and CUSUM, including a new
asymptotic analysis for GEM, which enables soft decisions for outlier
detection, and a novel interpretation of CUSUM in terms of discrepancy
theory, which helps us generalize it to the nonparametric GEM statistic. We
numerically show, using both simulated and real datasets, that the proposed
nonparametric algorithm performs close to the clairvoyant
parametric CUSUM test.
Comment: to appear in IEEE International Symposium on Information Theory (ISIT) 201
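For orientation, the classical parametric CUSUM test that the abstract uses as its baseline can be sketched as below. This is a minimal illustration, not the paper's nonparametric GEM-based method: it assumes a Gaussian mean shift with known pre-change mean, post-change mean, and standard deviation, and the function name `cusum_alarm_time` and its parameters are hypothetical.

```python
def cusum_alarm_time(samples, mu0, mu1, sigma, threshold):
    """Return the index of the first CUSUM alarm, or None if no alarm is raised.

    Assumes (for illustration only) i.i.d. Gaussian samples whose mean shifts
    from mu0 to mu1 at an unknown time, with known standard deviation sigma.
    """
    stat = 0.0
    for t, x in enumerate(samples):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) at x.
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, resetting at zero.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None


# Toy sequence: the mean shifts from 0 to 1 at index 4.
samples = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
alarm = cusum_alarm_time(samples, mu0=0.0, mu1=1.0, sigma=1.0, threshold=2.0)
```

A larger `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.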
Contamination Estimation via Convex Relaxations
Identifying anomalies and contamination in datasets is important in a wide
variety of settings. In this paper, we describe a new technique for estimating
contamination in large, discrete-valued datasets. Our approach considers the
normal condition of the data to be specified by a model consisting of a set of
distributions. Our key contribution is in our approach to contamination
estimation. Specifically, we develop a technique that identifies the minimum
number of data points that must be discarded (i.e., the level of contamination)
from an empirical data set in order to match the model to within a specified
goodness-of-fit, controlled by a p-value. Appealing to results from large
deviations theory, we show that a lower bound on the level of contamination is
obtained by solving a series of convex programs. Theoretical results guarantee
the bound converges at a rate of , where p is the size of
the empirical data set.
Comment: To appear, ISIT 201
Learning to classify with possible sensor failures
In this paper, we propose an efficient algorithm to train a robust large-margin classifier when corrupt measurements caused by sensor failure might be present in the training set. By incorporating a non-parametric prior based on the empirical distribution of the training data, we propose a Geometric-Entropy-Minimization regularized Maximum Entropy Discrimination (GEM-MED) method to perform classification and anomaly detection in a joint manner. We demonstrate that our proposed method can yield improved performance over previous robust classification methods in terms of both classification accuracy and anomaly detection rate using simulated data and real footstep data.
Index Terms: corrupt measurements, robust large-margin training, anomaly detection, maximum entropy discrimination