    Minimax Structured Normal Means Inference

    We provide a unified treatment of a broad class of noisy structure recovery problems, known as structured normal means problems. In this setting, the goal is to identify, from a finite collection of Gaussian distributions with different means, the distribution that produced some observed data. Recent work has studied several special cases, including sparse vectors, biclusters, and graph-based structures. We establish nearly matching upper and lower bounds on the minimax probability of error for any structured normal means problem, and we derive an optimality certificate for the maximum likelihood estimator, which can be applied to many instantiations. We also consider an experimental design setting, where we generalize our minimax bounds and derive an algorithm for computing a design strategy with a certain optimality property. We show that our results give tight minimax bounds for many structure recovery problems and consider some consequences for interactive sampling.
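
    As a rough illustration of the maximum likelihood estimator mentioned in this abstract, the sketch below assumes the simplest version of the setting: a single observation with identity covariance, so that maximizing the likelihood over the finite collection reduces to picking the candidate mean nearest to the data. The function name and the identity-covariance assumption are illustrative and are not taken from the paper.

```python
import numpy as np


def mle_structured_normal_means(y, candidate_means):
    """Return the index of the candidate mean closest to y in Euclidean norm.

    Assumption (not from the abstract): y ~ N(mu_j, I) for some unknown j,
    so maximizing the Gaussian likelihood over j is the same as minimizing
    the squared distance ||y - mu_j||^2.
    """
    y = np.asarray(y, dtype=float)
    means = np.asarray(candidate_means, dtype=float)
    # Squared Euclidean distance from y to every candidate mean (one per row).
    sq_dists = np.sum((means - y) ** 2, axis=1)
    return int(np.argmin(sq_dists))
```

    For example, with one-sparse candidate means, calling the function on an observation near the second candidate should return index 1.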

    Detection of an anomalous cluster in a network

    We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors that exhibits an "unusual behavior." Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough. Comment: Published at http://dx.doi.org/10.1214/10-AOS839 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
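
    To make the scan statistic concrete, here is a minimal sketch under a strong simplifying assumption: the nodes lie on a line and the candidate clusters are contiguous intervals. The abstract covers much more general cluster classes; the interval class, the function name, and the `min_len` parameter are chosen here only for illustration. The statistic is the largest normalized sum over candidate clusters, which would be compared against a threshold to decide between the null and the alternative.

```python
import numpy as np


def interval_scan_statistic(x, min_len=1):
    """Scan over all contiguous intervals of length >= min_len.

    Returns (max_stat, (start, end)) where the interval is x[start:end] and
    the statistic is sum(x[start:end]) / sqrt(end - start). Under the null
    of i.i.d. standard normals, each normalized sum is itself standard normal.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # prefix[k] holds x[0] + ... + x[k-1], so interval sums are differences.
    prefix = np.concatenate(([0.0], np.cumsum(x)))
    best_stat, best_interval = -np.inf, None
    for start in range(n):
        for end in range(start + min_len, n + 1):
            stat = (prefix[end] - prefix[start]) / np.sqrt(end - start)
            if stat > best_stat:
                best_stat, best_interval = stat, (start, end)
    return best_stat, best_interval
```

    This brute-force loop costs O(n^2) evaluations, which is fine for a sketch but would need a smarter search for large networks or richer cluster classes.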

    Optimal change point detection and localization in sparse dynamic networks

    We study the problem of change point localization in dynamic network models. We assume that we observe a sequence of independent adjacency matrices of the same size, each corresponding to a realization of an unknown inhomogeneous Bernoulli model. The underlying distributions of the adjacency matrices are piecewise constant and may change over a subset of the time points, called change points. We are concerned with recovering the unknown number and positions of the change points. In our model setting, we allow all the model parameters to change with the total number of time points, including the network size, the minimal spacing between consecutive change points, the magnitude of the smallest change, and the degree of sparsity of the networks. We first identify a region of impossibility in the space of the model parameters such that no change point estimator is provably consistent if the data are generated according to parameters falling in that region. We propose a computationally simple algorithm for network change point localization, called network binary segmentation, that relies on weighted averages of the adjacency matrices. We show that network binary segmentation is consistent over a range of the model parameters that nearly covers the complement of the impossibility region, thus demonstrating the existence of a phase transition for the problem at hand. Next, we devise a more sophisticated algorithm based on singular value thresholding, called local refinement, that delivers more accurate estimates of the change point locations. Under appropriate conditions, local refinement guarantees a minimax optimal rate for network change point localization while remaining computationally feasible.
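
    The "weighted averages of the adjacency matrices" can be illustrated with a standard CUSUM-type statistic. The sketch below is a simplified single-change-point search, not the network binary segmentation procedure itself: it assumes at most one change, scores each candidate time by the Frobenius norm of the CUSUM matrix, and omits the recursion and thresholding a full method would need. The function name and the choice of norm are assumptions made for illustration.

```python
import numpy as np


def cusum_change_point(adj_seq):
    """Locate a single candidate change point in a sequence of matrices.

    adj_seq has shape (T, n, n). For each split t in 1..T-1, form the CUSUM
    matrix sqrt((T-t)/(T*t)) * sum(A[:t]) - sqrt(t/(T*(T-t))) * sum(A[t:])
    and score it by its Frobenius norm. Returns (best_t, best_stat), where
    best_t is the number of matrices placed before the estimated change.
    """
    A = np.asarray(adj_seq, dtype=float)
    T = A.shape[0]
    total = A.sum(axis=0)
    running = np.zeros_like(A[0])
    best_t, best_stat = None, -np.inf
    for t in range(1, T):
        running = running + A[t - 1]  # sum of the first t matrices
        left_w = np.sqrt((T - t) / (T * t))
        right_w = np.sqrt(t / (T * (T - t)))
        cusum = left_w * running - right_w * (total - running)
        stat = np.linalg.norm(cusum)  # Frobenius norm for a 2-D array
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat
```

    In a binary segmentation scheme, a search like this would be applied recursively to the segments on either side of each detected change, stopping when the maximal statistic falls below a chosen threshold.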