128 research outputs found

    Foundational principles for large scale inference: Illustrations through correlation mining

    Full text link
    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number nn of acquired samples (statistical replicates) is far fewer than the number pp of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data." Sample complexity however has received relatively less attention, especially in the setting when the sample size nn is fixed, and the dimension pp grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks

    Distributed Detection in Sensor Networks with Limited Range Sensors

    Full text link
    We consider a multi-object detection problem over a sensor network (SNET) with limited range sensors. This problem complements the widely considered decentralized detection problem where all sensors observe the same object. While the necessity for global collaboration is clear in the decentralized detection problem, the benefits of collaboration with limited range sensors is unclear and has not been widely explored. In this paper we develop a distributed detection approach based on recent development of the false discovery rate (FDR). We first extend the FDR procedure and develop a transformation that exploits complete or partial knowledge of either the observed distributions at each sensor or the ensemble (mixture) distribution across all sensors. We then show that this transformation applies to multi-dimensional observations, thus extending FDR to multi-dimensional settings. We also extend FDR theory to cases where distributions under both null and positive hypotheses are uncertain. We then propose a robust distributed algorithm to perform detection. We further demonstrate scalability to large SNETs by showing that the upper bound on the communication complexity scales linearly with the number of sensors that are in the vicinity of objects and is independent of the total number of sensors. Finally, we deal with situations where the sensing model may be uncertain and establish robustness of our techniques to such uncertainties.Comment: Submitted to IEEE Transactions on Signal Processin

    An Algorithm for Automatic Target Recognition Using Passive Radar and an EKF for Estimating Aircraft Orientation

    Get PDF
    Rather than emitting pulses, passive radar systems rely on illuminators of opportunity, such as TV and FM radio, to illuminate potential targets. These systems are attractive since they allow receivers to operate without emitting energy, rendering them covert. Until recently, most of the research regarding passive radar has focused on detecting and tracking targets. This dissertation focuses on extending the capabilities of passive radar systems to include automatic target recognition. The target recognition algorithm described in this dissertation uses the radar cross section (RCS) of potential targets, collected over a short period of time, as the key information for target recognition. To make the simulated RCS as accurate as possible, the received signal model accounts for aircraft position and orientation, propagation losses, and antenna gain patterns. An extended Kalman filter (EKF) estimates the target's orientation (and uncertainty in the estimate) from velocity measurements obtained from the passive radar tracker. Coupling the aircraft orientation and state with the known antenna locations permits computation of the incident and observed azimuth and elevation angles. The Fast Illinois Solver Code (FISC) simulates the RCS of potential target classes as a function of these angles. Thus, the approximated incident and observed angles allow the appropriate RCS to be extracted from a database of FISC results. Using this process, the RCS of each aircraft in the target class is simulated as though each is executing the same maneuver as the target detected by the system. Two additional scaling processes are required to transform the RCS into a power profile (magnitude only) simulating the signal in the receiver. First, the RCS is scaled by the Advanced Refractive Effects Prediction System (AREPS) code to account for propagation losses that occur as functions of altitude and range. Then, the Numerical Electromagnetic Code (NEC2) computes the antenna gain pattern, further scaling the RCS. A Rician likelihood model compares the scaled RCS of the illuminated aircraft with those of the potential targets. To improve the robustness of the result, the algorithm jointly optimizes over feasible orientation profiles and target types via dynamic programming.Ph.D.Committee Chair: Lanterman, Aaron; Committee Member: McLaughlin, Steve; Committee Member: Richards, Mark; Committee Member: Serban, Nicoleta; Committee Member: Verriest, Eri

    Decentralized Riemannian Particle Filtering with Applications to Multi-Agent Localization

    Get PDF
    The primary focus of this research is to develop consistent nonlinear decentralized particle filtering approaches to the problem of multiple agent localization. A key aspect in our development is the use of Riemannian geometry to exploit the inherently non-Euclidean characteristics that are typical when considering multiple agent localization scenarios. A decentralized formulation is considered due to the practical advantages it provides over centralized fusion architectures. Inspiration is taken from the relatively new field of information geometry and the more established research field of computer vision. Differential geometric tools such as manifolds, geodesics, tangent spaces, exponential, and logarithmic mappings are used extensively to describe probabilistic quantities. Numerous probabilistic parameterizations were identified, settling on the efficient square-root probability density function parameterization. The square-root parameterization has the benefit of allowing filter calculations to be carried out on the well studied Riemannian unit hypersphere. A key advantage for selecting the unit hypersphere is that it permits closed-form calculations, a characteristic that is not shared by current solution approaches. Through the use of the Riemannian geometry of the unit hypersphere, we are able to demonstrate the ability to produce estimates that are not overly optimistic. Results are presented that clearly show the ability of the proposed approaches to outperform current state-of-the-art decentralized particle filtering methods. In particular, results are presented that emphasize the achievable improvement in estimation error, estimator consistency, and required computational burden

    Revisiting Chernoff Information with Likelihood Ratio Exponential Families

    Full text link
    The Chernoff information between two probability measures is a statistical divergence measuring their deviation defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence found many other applications due to its empirical robustness property found in applications ranging from information fusion to quantum information. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback--Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: The so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or get a closed-form formula using symbolic computing, (ii) report a closed-form formula of the Chernoff information of centered Gaussians with scaled covariance matrices and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions.Comment: 41 page
    • …
    corecore