54 research outputs found

    Detection and localization of change-points in high-dimensional network traffic data

    We propose a novel and efficient method, which we call TopRank, for detecting change-points in high-dimensional data. This issue is of growing concern to the network security community since network anomalies such as Denial of Service (DoS) attacks lead to changes in Internet traffic. Our method consists of a data reduction stage based on record filtering, followed by a nonparametric change-point detection test based on U-statistics. Using this approach, we can address massive data streams and perform anomaly detection and localization on the fly. We show how it applies to some real Internet traffic provided by France-Télécom (a French Internet service provider) in the framework of the ANR-RNRT OSCAR project. This approach is very attractive since it benefits from a low computational load and is able to detect and localize several types of network anomalies. We also assess the performance of the TopRank algorithm using synthetic data and compare it with alternative approaches based on random aggregation.
    Comment: Published at http://dx.doi.org/10.1214/08-AOAS232 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
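    As a rough illustration of what a nonparametric, rank-based change-point test can look like on a single series, the sketch below scans every split point and computes a normalized Mann-Whitney type U-statistic. It is not the TopRank algorithm: the function name, the normalization (which assumes no ties), and the toy data are assumptions made for this example, and the record-filtering stage is not shown.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def rank_changepoint(x):
    """Scan all split points of x with a Mann-Whitney type U-statistic.

    Returns (t, stat), where t is the number of observations before the
    most likely change and stat is the normalized |U_t| at that split.
    Hypothetical helper for illustration only.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    best_t, best_stat = 0, -1.0
    u = 0  # U_t = sum over i < t <= j of sign(x[j] - x[i])
    for t in range(1, n):
        moved = x[t - 1]  # this point moves from the right to the left segment
        u -= sum(sign(moved - x[i]) for i in range(t - 1))
        u += sum(sign(x[j] - moved) for j in range(t, n))
        # Null variance of U_t without ties: t * (n - t) * (n + 1) / 3
        stat = abs(u) / math.sqrt(t * (n - t) * (n + 1) / 3.0)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat


# Toy series with a level shift after the fifth observation.
series = [0.1, 0.3, 0.2, 0.0, 0.4, 5.1, 5.3, 5.0, 5.2, 5.4]
split, statistic = rank_changepoint(series)
print(split, round(statistic, 3))
```

    Because the statistic depends only on the ordering of the observations, it makes no assumption about the distribution of the traffic counts. In practice the maximum would be compared with a critical value before a change is declared.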

    OMP-type Algorithm with Structured Sparsity Patterns for Multipath Radar Signals

    A transmitted, unknown radar signal is observed at the receiver through more than one path in additive noise. The aim is to recover the waveform of the intercepted signal and to simultaneously estimate the direction of arrival (DOA). We propose an approach that exploits the parsimonious time-frequency representation of the signal by applying a new OMP-type algorithm for structured sparsity patterns. An important issue is the scalability of the proposed algorithm, since high-dimensional models must be used for radar signals. Monte-Carlo simulations for modulated signals illustrate the good performance of the method even for low signal-to-noise ratios and a gain of 20 dB for the DOA estimation compared to some elementary method.
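    For readers unfamiliar with orthogonal matching pursuit (OMP), the following is a minimal sketch of the plain, unstructured algorithm on a tiny real-valued dictionary. It does not implement the structured sparsity patterns or the DOA estimation of the paper; the function names, the unit-norm assumption on the atoms, and the toy dictionary are assumptions made for this example.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def solve_linear(a, b):
    """Solve the small square system a z = b by Gaussian elimination
    with partial pivoting (a is assumed non-singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z


def omp(atoms, y, n_atoms):
    """Plain OMP: greedily select n_atoms atoms (assumed unit norm and
    linearly independent) and refit all coefficients by least squares
    after each selection. Returns {atom index: coefficient}."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(n_atoms):
        # Select the unused atom most correlated with the residual.
        scores = [abs(dot(atom, residual)) for atom in atoms]
        for k in support:
            scores[k] = -1.0
        support.append(max(range(len(atoms)), key=scores.__getitem__))
        # Least-squares refit on the support via the normal equations.
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], y) for i in support]
        coeffs = solve_linear(gram, rhs)
        residual = [
            y[t] - sum(c * atoms[k][t] for c, k in zip(coeffs, support))
            for t in range(len(y))
        ]
    return dict(zip(support, coeffs))


# Toy dictionary: three canonical atoms plus one diagonal atom.
r = 2 ** -0.5
dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [r, r, 0.0]]
signal = [2.0, 0.0, 3.0]  # equals 2 * atom 0 + 3 * atom 2
print(omp(dictionary, signal, n_atoms=2))
```

    The refit step is what distinguishes OMP from basic matching pursuit: after each selection the residual is orthogonal to every atom already chosen.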

    Adaptive tests for periodic signal detection with applications to laser vibrometry

    Initially motivated by a practical issue in target detection via laser vibrometry, we are interested in the problem of periodic signal detection in a Gaussian fixed design regression framework. Assuming that the signal belongs to some periodic Sobolev ball and that the variance of the noise is known, we first consider the problem from a minimax point of view: we evaluate the so-called minimax separation rate, which corresponds to the minimal l2-distance between the signal and zero such that detection is possible with prescribed probabilities of error. Then, we propose a testing procedure which is available when the variance of the noise is unknown and which does not use any prior information about the smoothness degree or the period of the signal. We prove that it is adaptive in the sense that it achieves, up to a possible logarithmic factor, the minimax separation rate over various periodic Sobolev balls simultaneously. The originality of our approach as compared to related works on the topic of signal detection is that our testing procedure is sensitive to the periodicity assumption on the signal. A simulation study is performed in order to evaluate the effect of this prior assumption on the power of the test. We do observe the gains that we could expect from the theory. Finally, we turn to the application to target detection by laser vibrometry that we had in view.

    A novel approach for estimating functions in the multivariate setting based on an adaptive knot selection for B-splines with an application to a chemical system used in geoscience

    In this paper, we outline a novel data-driven method for estimating functions in a multivariate nonparametric regression model based on an adaptive knot selection for B-splines. The underlying idea of our approach for selecting knots is to apply the generalized lasso, since the knots of the B-spline basis can be seen as changes in the derivatives of the function to be estimated. This method is then extended to functions depending on several variables by processing each dimension independently, thus reducing the problem to a univariate setting. The regularization parameters are chosen by means of a criterion based on EBIC. The nonparametric estimator is obtained using a multivariate B-spline regression with the corresponding selected knots. Our procedure was validated through numerical experiments by varying the number of observations and the level of noise to investigate its robustness. The influence of observation sampling was also assessed and our method was applied to a chemical system commonly used in geoscience. For each framework considered in this paper, our approach performed better than state-of-the-art methods. Our completely data-driven method is implemented in the glober R package which will soon be available on the Comprehensive R Archive Network (CRAN).
    Comment: 29 pages, 29 figures

    A variable selection approach for highly correlated predictors in high-dimensional genomic data

    In genomic studies, identifying biomarkers associated with a variable of interest is a major concern in biomedical research. Regularized approaches are classically used to perform variable selection in high-dimensional linear models. However, these methods can fail in highly correlated settings. We propose a novel variable selection approach called WLasso, which takes these correlations into account. It consists of rewriting the initial high-dimensional linear model to remove the correlation between the biomarkers (predictors) and applying the generalized Lasso criterion. The performance of WLasso is assessed using synthetic data in several scenarios and compared with recent alternative approaches. The results show that when the biomarkers are highly correlated, WLasso outperforms the other approaches in sparse high-dimensional frameworks. The method is also successfully illustrated on publicly available gene expression data in breast cancer. Our method is implemented in the WLasso R package which is available from the Comprehensive R Archive Network.