Distributed Nonparametric Sequential Spectrum Sensing under Electromagnetic Interference
A nonparametric distributed sequential algorithm for quick detection of
spectral holes in a Cognitive Radio setup is proposed. Two or more local nodes
make decisions and inform the fusion centre (FC) over a reporting Multiple
Access Channel (MAC), which then makes the final decision. The local nodes use
energy detection and the FC uses mean detection in the presence of fading,
heavy-tailed electromagnetic interference (EMI) and outliers. The statistics of
the primary signal, the channel gain and the EMI are not known. Different
nonparametric sequential algorithms are compared to choose appropriate
algorithms to be used at the local nodes and the FC. A modification of a recently
developed random walk test is selected for the local nodes for energy detection
as well as at the fusion centre for mean detection. It is shown via simulations
and analysis that the nonparametric distributed algorithm developed performs
well in the presence of fading and EMI and is robust to outliers. The algorithm is
iterative in nature, making the computation and storage requirements minimal.
Comment: 8 pages; 6 figures; Version 2 has the proofs for the theorems.
Version 3 contains a new section on approximation analysis.
Online Nonparametric Anomaly Detection based on Geometric Entropy Minimization
We consider the online and nonparametric detection of abrupt and persistent
anomalies, such as a change in the regular system dynamics at a time instance
due to an anomalous event (e.g., a failure, a malicious activity). Combining
the simplicity of the nonparametric Geometric Entropy Minimization (GEM) method
with the timely detection capability of the Cumulative Sum (CUSUM) algorithm, we
propose a computationally efficient online anomaly detection method that is
applicable to high-dimensional datasets and at the same time achieves a
near-optimum average detection delay performance for a given false alarm
constraint. We provide new insights into both GEM and CUSUM, including a new
asymptotic analysis for GEM, which enables soft decisions for outlier
detection, and a novel interpretation of CUSUM in terms of discrepancy
theory, which helps us generalize it to the nonparametric GEM statistic. We
numerically show, using both simulated and real datasets, that the proposed
nonparametric algorithm attains performance close to that of the clairvoyant
parametric CUSUM test.
Comment: to appear in IEEE International Symposium on Information Theory
(ISIT) 201
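For readers unfamiliar with the CUSUM recursion mentioned in the abstract above, the following is a minimal sketch of the classical parametric CUSUM test for a shift in the mean of Gaussian observations. It is not the GEM-based nonparametric method of the paper; the function name `cusum_alarm_time` and the parameters `mu0`, `mu1`, `sigma` and `threshold` are illustrative assumptions.

```python
def cusum_alarm_time(samples, mu0, mu1, sigma, threshold):
    """Return the 1-based index of the first CUSUM alarm, or None if no alarm.

    Assumes (for illustration only) i.i.d. Gaussian samples with known
    pre-change mean mu0, post-change mean mu1 and common standard deviation sigma.
    """
    stat = 0.0
    for t, x in enumerate(samples, start=1):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2) for one sample.
        llr = (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, resetting at zero.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None
```

A larger `threshold` lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.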
Sequential Quantiles via Hermite Series Density Estimation
Sequential quantile estimation refers to incorporating observations into
quantile estimates in an incremental fashion, thus furnishing an online estimate
of one or more quantiles at any given point in time. Sequential quantile
estimation is also known as online quantile estimation. This area is relevant
to the analysis of data streams and to the one-pass analysis of massive data
sets. Applications include network traffic and latency analysis, real-time
fraud detection and high-frequency trading. We introduce new techniques for
online quantile estimation based on Hermite series estimators in the settings
of static quantile estimation and dynamic quantile estimation. In the static
quantile estimation setting we apply the existing Gauss-Hermite expansion in a
novel manner. In particular, we exploit the fact that Gauss-Hermite
coefficients can be updated in a sequential manner. To treat dynamic quantile
estimation we introduce a novel expansion with an exponentially weighted
estimator for the Gauss-Hermite coefficients which we term the Exponentially
Weighted Gauss-Hermite (EWGH) expansion. These algorithms go beyond existing
sequential quantile estimation algorithms in that they allow arbitrary
quantiles (as opposed to pre-specified quantiles) to be estimated at any point
in time. In doing so we provide a solution to online distribution function and
online quantile function estimation on data streams. In particular we derive an
analytical expression for the CDF and prove consistency results for the CDF
under certain conditions. In addition we analyse the associated quantile
estimator. Simulation studies and tests on real data reveal the Gauss-Hermite
based algorithms to be competitive with a leading existing algorithm.
Comment: 43 pages, 9 figures. Improved version incorporating referee comments,
as appears in Electronic Journal of Statistics
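The sequential coefficient update described in the abstract above can be illustrated with a minimal sketch. It assumes the orthonormal (physicists') Hermite functions and estimates each coefficient as a running mean of h_k(x), or as an exponentially weighted mean when a weight is supplied. The paper's exact normalization, its CDF expression and its quantile step are not reproduced here; the names `hermite_functions`, `SequentialHermiteDensity` and `weight` are hypothetical.

```python
import math


def hermite_functions(x, order):
    """Orthonormal Hermite functions h_0(x), ..., h_order(x) via the three-term recurrence."""
    h = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if order >= 1:
        h.append(math.sqrt(2.0) * x * h[0])
    for k in range(1, order):
        h.append(math.sqrt(2.0 / (k + 1)) * x * h[k]
                 - math.sqrt(k / (k + 1)) * h[k - 1])
    return h


class SequentialHermiteDensity:
    """One-pass Hermite series density estimate (illustrative sketch)."""

    def __init__(self, order, weight=None):
        self.order = order
        self.weight = weight  # None: running mean; otherwise exponential weight in (0, 1]
        self.count = 0
        self.coeffs = [0.0] * (order + 1)

    def update(self, x):
        self.count += 1
        # The first observation initializes the coefficients in both modes.
        if self.weight is None or self.count == 1:
            step = 1.0 / self.count
        else:
            step = self.weight
        for k, h_k in enumerate(hermite_functions(x, self.order)):
            self.coeffs[k] += step * (h_k - self.coeffs[k])

    def density(self, x):
        """Truncated series estimate: sum over k of a_k * h_k(x)."""
        basis = hermite_functions(x, self.order)
        return sum(a * h for a, h in zip(self.coeffs, basis))
```

Only `order + 1` numbers are stored regardless of how many observations have been seen, which is what makes the estimate suitable for data streams.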
A comparative study of nonparametric methods for pattern recognition
The applied research discussed in this report determines and compares the correct classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and the K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is shown graphically for differences in modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even though it may not be. The K-class classifier has the advantage over the Bayes classifier in that it works well with non-Gaussian data without having to determine the probability density function of the data. It should be noted that the data in this experiment were always unimodal.
Detection and localization of change-points in high-dimensional network traffic data
We propose a novel and efficient method, which we shall call TopRank in what
follows, for detecting change-points in high-dimensional data. This
issue is of growing concern to the network security community since network
anomalies such as Denial of Service (DoS) attacks lead to changes in Internet
traffic. Our method consists of a data reduction stage based on record
filtering, followed by a nonparametric change-point detection test based on
U-statistics. Using this approach, we can address massive data streams and
perform anomaly detection and localization on the fly. We show how it applies
to some real Internet traffic provided by France-Télécom (a French Internet
service provider) in the framework of the ANR-RNRT OSCAR project. This approach
is very attractive since it benefits from a low computational load and is able
to detect and localize several types of network anomalies. We also assess the
performance of the TopRank algorithm using synthetic data and compare it with
alternative approaches based on random aggregation.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS232 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org