1,456 research outputs found

    Fusing Censored Dependent Data for Distributed Detection

    In this paper, we consider a distributed detection problem for a censoring sensor network, in which each sensor's communication rate is significantly reduced by transmitting only "informative" observations to the Fusion Center (FC) and censoring those deemed "uninformative". While previous research has often assumed independence of the data from censoring sensors, we explore spatial dependence among observations. Our focus is on designing a fusion rule under the Neyman-Pearson (NP) framework that takes this spatial dependence into account. Two transmission scenarios are considered: one where uncensored observations are transmitted directly to the FC, and another where they are first quantized and then transmitted to further improve transmission efficiency. Copula-based generalized likelihood ratio tests (GLRTs) for censored data are proposed for both the continuous and the discrete messages received at the FC under the two transmission strategies. We address the computational issues of the copula-based GLRTs, which involve multidimensional integrals, by presenting more efficient fusion rules based on the key idea of injecting controlled noise at the FC before fusion. Although introducing controlled noise at the receiver reduces the signal-to-noise ratio (SNR), simulation results demonstrate that the resulting noise-aided fusion approach performs very closely to the exact copula-based GLRTs. By exploiting the spatial dependence, the copula-based GLRTs and their noise-aided counterparts greatly improve detection performance compared with the fusion rule derived under the independence assumption.
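    The copula-based fusion idea can be sketched in miniature. This is an illustrative sketch only, not the paper's actual rule: it assumes two sensors with unit-variance Gaussian marginals (mean 0 under H0, a hypothetical shift `theta` under H1) and a bivariate Gaussian copula with a hypothetical parameter `rho` standing in for the spatial dependence.

```python
# Illustrative sketch (NOT the paper's exact rule): a copula-based
# log-likelihood ratio for two sensors. Assumptions: N(0,1) marginals
# under H0, N(theta,1) under H1, and a Gaussian copula with hypothetical
# dependence parameter rho under H1.
from math import log
from statistics import NormalDist

std = NormalDist()

def gaussian_copula_logpdf(u, v, rho):
    """Log-density of the bivariate Gaussian copula at (u, v)."""
    x, y = std.inv_cdf(u), std.inv_cdf(v)
    r2 = rho * rho
    return (-0.5 * log(1.0 - r2)
            - (r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2)))

def log_lr(z1, z2, theta=1.0, rho=0.5):
    """Copula-based log-likelihood ratio for one pair of sensor samples."""
    # marginal log-likelihood ratios of N(theta,1) vs N(0,1)
    marg = (theta * z1 - 0.5 * theta ** 2) + (theta * z2 - 0.5 * theta ** 2)
    # dependence correction: copula log-density at the H1 marginal CDFs
    u1 = NormalDist(mu=theta).cdf(z1)
    u2 = NormalDist(mu=theta).cdf(z2)
    return marg + gaussian_copula_logpdf(u1, u2, rho)
```

    Summing `log_lr` over sample pairs and comparing to a threshold gives an NP-style test; under the independence assumption the copula term would simply be dropped.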

    Measuring reproducibility of high-throughput experiments

    Reproducibility is essential to reliable scientific discovery in high-throughput experiments. In this work we propose a unified approach to measure the reproducibility of findings identified from replicate experiments and to identify putative discoveries using reproducibility. Unlike the usual scalar measures of reproducibility, our approach creates a curve, which quantitatively assesses when the findings are no longer consistent across replicates. The curve is fitted by a copula mixture model, from which we derive a quantitative reproducibility score, the "irreproducible discovery rate" (IDR), analogous to the FDR. This score can be computed at each set of paired replicate ranks and permits the principled setting of thresholds both for assessing reproducibility and for combining replicates. Since our approach permits an arbitrary scale for each replicate, it provides useful descriptive measures in a wide variety of situations. We study the performance of the algorithm using simulations and give a heuristic analysis of its theoretical properties. We demonstrate the effectiveness of our method in a ChIP-seq experiment. Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics: http://dx.doi.org/10.1214/11-AOAS466
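    A minimal sketch of the rank-correspondence idea behind the reproducibility curve (the copula-mixture fit and the IDR computation themselves are omitted): for each rank cutoff t, compute the fraction of findings that both replicates place in their top t.

```python
# Toy sketch of "correspondence at the top": for each rank cutoff t,
# the fraction of findings that both replicates rank in their top t.
# (The actual method fits a copula mixture to paired ranks and derives
# the IDR from it; that step is omitted here.)
def correspondence_curve(scores1, scores2):
    n = len(scores1)
    order1 = sorted(range(n), key=lambda i: -scores1[i])
    order2 = sorted(range(n), key=lambda i: -scores2[i])
    curve = []
    for t in range(1, n + 1):
        shared = len(set(order1[:t]) & set(order2[:t]))
        curve.append(shared / t)
    return curve
```

    A curve that stays near 1 indicates consistent rankings; the cutoff at which it drops signals where findings stop being reproducible.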

    The Copula Approach to Sample Selection Modelling: An Application to the Recreational Value of Forests

    The sample selection model is based upon a bivariate or a multivariate structure, and distributional assumptions are in this context more severe than in univariate settings, due to the limited availability of tractable multivariate distributions. While the standard FIML estimation of the selectivity model assumes normality of the joint distribution, alternative approaches require less stringent distributional hypotheses. As shown by Smith (2003), copulas also allow great flexibility in FIML models. The copula model is very useful in situations where the applied researcher has a prior on the distributional form of the margins, since it allows separating their modelling from that of the dependence structure. In the present paper the copula approach to sample selection is first compared to the semiparametric approach and to the standard FIML bivariate normal model in an illustrative application on female work data. Its performance is then analysed more thoroughly in an application to Contingent Valuation data on the recreational value of forests.
    Keywords: Contingent valuation, Selectivity bias, Bivariate models, Copulas

    Decision-Making with Heterogeneous Sensors - A Copula Based Approach

    Statistical decision making has wide-ranging applications, from communications and signal processing to econometrics and finance. In contrast to the classical one source-one receiver paradigm, several applications have been identified in the recent past that require acquiring data from multiple sources or sensors. Information from the multiple sensors is transmitted to a remotely located receiver known as the fusion center, which makes a global decision. Past work has largely focused on fusion of information from homogeneous sensors. This dissertation extends the formulation to the case when the local sensors may possess disparate sensing modalities. Both the theoretical and practical aspects of multimodal signal processing are considered. The first and foremost challenge is to 'adequately' model the joint statistics of such heterogeneous sensors. We propose the use of copula theory for this purpose. Copula models are general descriptors of dependence. They provide a way to characterize the nonlinear functional relationships between the multiple modalities, which are otherwise difficult to formalize. The important problem of selecting the 'best' copula function from a given set of valid copula densities is addressed, especially in the context of binary hypothesis testing problems. Both the training-testing paradigm, where a training set is assumed to be available for learning the copula models prior to system deployment, and a generalized likelihood ratio test (GLRT) based fusion rule for the online selection and estimation of copula parameters are considered. The developed theory is corroborated with extensive computer simulations as well as results on real-world data. Sensor observations (or features extracted thereof) are most often quantized before their transmission to the fusion center for bandwidth and power conservation. A detection scheme is proposed for this problem assuming uniform scalar quantizers at each sensor.
    The designed rule is applicable for both binary and multibit local sensor decisions. An alternative suboptimal but computationally efficient fusion rule is also designed, which involves injecting a deliberate disturbance into the local sensor decisions before fusion. The rule is based on Widrow's statistical theory of quantization. The addition of controlled noise helps to 'linearize' the highly nonlinear quantization process, thus resulting in computational savings. It is shown that although the introduction of external noise does cause a reduction in the received signal-to-noise ratio, the proposed approach can be highly accurate when the input signals have bandlimited characteristic functions and the number of quantization levels is large. The problem of quantifying neural synchrony using copula functions is also investigated. It has been widely accepted that multiple simultaneously recorded electroencephalographic signals exhibit nonlinear and non-Gaussian statistics. While existing and popular measures such as the correlation coefficient, the correntropy coefficient, coh-entropy and mutual information are limited to being bivariate, and hence applicable only to pairs of channels, measures such as Granger causality, even though multivariate, fail to account for any nonlinear inter-channel dependence. The application of copula theory helps alleviate both these limitations. The problem of distinguishing patients with mild cognitive impairment from age-matched control subjects is also considered. Results show that the copula-derived synchrony measures, when used in conjunction with other synchrony measures, improve the detection of Alzheimer's disease onset.
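    The noise-aided quantization idea can be illustrated with a tiny sketch (assumptions: a uniform mid-tread quantizer and uniform dither of one step width; this is the textbook dithering effect, not the dissertation's full fusion rule): averaging the dithered quantizer output recovers the input almost exactly, effectively 'linearizing' the quantizer.

```python
# Dithering sketch (Widrow-style): adding uniform noise of one
# quantization-step width before a uniform quantizer makes the AVERAGE
# quantizer output an (approximately) unbiased, linear function of the
# input, at the cost of extra output variance.
import random

def quantize(x, step=1.0):
    """Uniform mid-tread quantizer."""
    return step * round(x / step)

def dithered_mean(x, step=1.0, trials=20000, seed=0):
    """Average quantizer output when uniform dither is added to x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += quantize(x + rng.uniform(-step / 2, step / 2), step)
    return total / trials
```

    Without dither, `quantize(0.3)` is stuck at 0.0; with dither, the averaged output comes out close to 0.3.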

    Relation between higher order comoments and dependence structure of equity portfolio

    We study the relation between higher order comoments and the dependence structure of equity portfolios in the US and UK by relying on a simple portfolio approach where equity portfolios are sorted on the higher order comoments. We find that beta and coskewness are positively related to the copula correlation, whereas cokurtosis is negatively related to it. We also find that beta is positively associated with asymmetric tail dependence, whilst coskewness is negatively associated with it. Furthermore, two extreme equity portfolios sorted on the higher order comoments are closely correlated, and their dependence structure is strongly time-varying and nonlinear. Backtesting results for value-at-risk and expected shortfall demonstrate the importance of dynamic modelling of asymmetric tail dependence in the risk management of extreme events.
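    For concreteness, one common standardization of the higher order comoments used for such sorts (the paper's exact convention may differ) computes coskewness as E[z_i z_m^2] and cokurtosis as E[z_i z_m^3] from demeaned, unit-variance returns:

```python
# Hedged sketch of standardized higher-order comoments (one common
# convention; the paper's exact standardization may differ):
# coskewness = E[z_i * z_m^2], cokurtosis = E[z_i * z_m^3], where z_i
# and z_m are demeaned, unit-variance asset and market returns.
from statistics import mean, pstdev

def comoments(asset, market):
    ma, mm = mean(asset), mean(market)
    sa, sm = pstdev(asset), pstdev(market)
    za = [(x - ma) / sa for x in asset]
    zm = [(x - mm) / sm for x in market]
    n = len(za)
    coskew = sum(a * m * m for a, m in zip(za, zm)) / n
    cokurt = sum(a * m ** 3 for a, m in zip(za, zm)) / n
    return coskew, cokurt
```

    With `asset == market` these reduce to the ordinary skewness and (non-excess) kurtosis, which gives a quick sanity check.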

    Fusing Heterogeneous Data for Detection Under Non-stationary Dependence

    In this paper, we consider the problem of detection for dependent, non-stationary signals where the non-stationarity is encoded in the dependence structure. We employ copula theory, which allows for a general parametric characterization of the joint distribution of sensor observations and, hence, for a more general description of inter-sensor dependence. We design a copula-based detector using the Neyman-Pearson framework. Our approach involves a sample-wise copula selection scheme which, for a simple hypothesis test, is proved to perform better than previously used single-copula selection schemes. We demonstrate the utility of our copula-based approach on simulated data, as well as on outdoor sensor data collected by the Army Research Laboratory at the US southwest border.
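    The copula selection step can be caricatured as follows. This is a toy sketch: the candidate set contains only the independence copula and the one-parameter Farlie-Gumbel-Morgenstern (FGM) copula at two fixed hypothetical parameter values, stand-ins for the richer families used in practice, and the inputs are assumed to be pseudo-observations already mapped into (0, 1).

```python
# Toy likelihood-based copula selection. The candidate set below
# (independence plus the Farlie-Gumbel-Morgenstern copula at two fixed
# hypothetical parameter values) is a stand-in for the richer copula
# families used in practice.
from math import log

def fgm_pdf(u, v, theta):
    """Density of the FGM copula, valid for -1 <= theta <= 1."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)

CANDIDATES = {
    "independence": lambda u, v: 1.0,
    "fgm(+1)": lambda u, v: fgm_pdf(u, v, 1.0),
    "fgm(-1)": lambda u, v: fgm_pdf(u, v, -1.0),
}

def select_copula(pairs):
    """Pick the candidate maximizing the log-likelihood of (u, v) pairs."""
    scores = {name: sum(log(c(u, v)) for u, v in pairs)
              for name, c in CANDIDATES.items()}
    return max(scores, key=scores.get)
```

    A sample-wise variant would run this selection per sample rather than once over the whole block.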

    Hypothesis Testing and Model Estimation with Dependent Observations in Heterogeneous Sensor Networks

    Advances in microelectronics, communication and signal processing have enabled the development of inexpensive sensors that can be networked to collect vital information from their environment, to be used in decision-making and inference. The sensors transmit their data to a central processor which integrates the information from the sensors using a so-called fusion algorithm. Many applications of sensor networks (SNs) involve hypothesis testing or the detection of a phenomenon. Many approaches to data fusion for hypothesis testing assume that, given each hypothesis, the sensors' measurements are conditionally independent. However, since the sensors are densely deployed in practice, their fields of view overlap and consequently their measurements are dependent. Moreover, a sensor's measurement samples may be correlated over time. Another assumption often used in data fusion algorithms is that the underlying statistical model of the sensors' observations is completely known. However, in practice these statistics may not be available prior to deployment and may change over the lifetime of the network due to hardware changes, aging, and environmental conditions. In this dissertation, we consider the problem of data fusion in heterogeneous SNs (SNs in which the sensors are not identical) collecting dependent data. We develop an expectation-maximization (EM) algorithm for hypothesis testing and model estimation. Copula distributions are used to model the correlation in the data. Moreover, it is assumed that the distribution of the sensors' measurements is not completely known; we consider both parametric and non-parametric model estimation. The proposed approach is developed for both batch and online processing. In batch processing, fusion can only be performed after a block of data samples is received from each sensor, while in online processing, fusion is performed upon arrival of each data sample.
    Online processing is of great interest since, for many applications, the long delay required for the accumulation of data in batch processing is not acceptable. To evaluate the proposed algorithms, both simulated data and real-world datasets are used. The detection performance of the proposed algorithms is compared with well-known supervised and unsupervised learning methods, as well as with similar EM-based methods which either partially or entirely ignore the dependence in the data.
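    A greatly simplified EM sketch (one-dimensional, two equal-weight Gaussian components with unit variance, no copula dependence, homogeneous samples; all of these are simplifications relative to the dissertation's algorithm) illustrates the E-step/M-step alternation used for unsupervised model estimation:

```python
# Greatly simplified EM sketch: two equal-weight 1-D Gaussian components
# with unit variance. The dissertation's algorithm additionally handles
# copula dependence, heterogeneous sensors, and online updates; only the
# E-step / M-step alternation is illustrated here.
from math import exp

def em_two_means(data, m0=0.0, m1=1.0, iters=50):
    for _ in range(iters):
        # E-step: posterior probability that each sample belongs to component 1
        resp = []
        for x in data:
            l0 = exp(-0.5 * (x - m0) ** 2)
            l1 = exp(-0.5 * (x - m1) ** 2)
            resp.append(l1 / (l0 + l1))
        # M-step: responsibility-weighted mean updates
        w1 = sum(resp)
        m1 = sum(r * x for r, x in zip(resp, data)) / w1
        m0 = sum((1 - r) * x for r, x in zip(resp, data)) / (len(data) - w1)
    return m0, m1
```

    On well-separated data the estimates converge to the two cluster means within a few iterations.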

    Hypothesis Testing Using Spatially Dependent Heavy-Tailed Multisensor Data

    The detection of spatially dependent heavy-tailed signals is considered in this dissertation. While the central limit theorem, and its implication of asymptotic normality of interacting random processes, is generally useful for the theoretical characterization of a wide variety of natural and man-made signals, sensor data from many different applications are, in fact, characterized by non-Gaussian distributions. A common characteristic observed in non-Gaussian data is the presence of heavy tails, or fat tails: for such data, the probability density function (p.d.f.) of extreme values decays at a slower-than-exponential rate, implying that extreme events occur with greater probability. When these events are observed simultaneously by several sensors, their observations are also spatially dependent. In this dissertation, we develop the theory of detection for such data, obtained through heterogeneous sensors. In order to validate our theoretical results and proposed algorithms, we collect and analyze indoor footstep data using a linear array of seismic sensors. We characterize the inter-sensor dependence using copula theory. Copulas are parametric functions which bind univariate p.d.f.s to generate a valid joint p.d.f. We model the heavy-tailed data using the class of alpha-stable distributions. We consider a two-sided test in the Neyman-Pearson framework and present an asymptotic analysis of the generalized likelihood ratio test (GLRT). Both nested and non-nested models are considered in the analysis. We also use a likelihood maximization-based copula selection scheme as an integral part of the detection process. Since many types of copula functions are available in the literature, selecting the appropriate copula becomes an important component of the detection problem. The performance of the proposed scheme is evaluated numerically on simulated data, as well as on indoor seismic data.
    With appropriately selected models, our results demonstrate that a high probability of detection can be achieved for false alarm probabilities of the order of 10^-4. These results, using dependent alpha-stable signals, are presented for a two-sensor case. We identify the computational challenges associated with dependent alpha-stable modeling and propose alternative schemes to extend the detector design to a multisensor (multivariate) setting. We use a hierarchical tree-based approach, called vines, to model the multivariate copulas, i.e., to model the spatial dependence between multiple sensors. The performance of the proposed detectors under the vine-based scheme is evaluated on the indoor footstep data, and significant improvement is observed compared with the case when only two sensors are deployed. Some open research issues are also identified and discussed.
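    Heavy-tailed alpha-stable samples can be generated with the standard Chambers-Mallows-Stuck construction. This sketch covers only the symmetric case (beta = 0) and is meant to illustrate the fat tails, not the dissertation's detector:

```python
# Chambers-Mallows-Stuck sketch for SYMMETRIC alpha-stable samples
# (beta = 0, 0 < alpha <= 2). alpha = 2 recovers a Gaussian shape;
# smaller alpha produces progressively heavier tails.
import math
import random

def stable_sample(alpha, rng):
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)  # the Cauchy special case
    t = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    return t * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha)

def tail_fraction(alpha, thresh=10.0, n=5000, seed=0):
    """Empirical P(|X| > thresh) from n samples."""
    rng = random.Random(seed)
    return sum(abs(stable_sample(alpha, rng)) > thresh for _ in range(n)) / n
```

    For example, `tail_fraction(1.2)` is visibly larger than `tail_fraction(2.0)`, which is essentially zero: with Gaussian-like tails a ten-sigma excursion almost never occurs, while the stable law produces them routinely.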

    Distribution and Dynamics of Central-European Exchange Rates: Evidence from Intraday Data

    This paper investigates the behavior of the EUR/CZK, EUR/HUF and EUR/PLN spot exchange rates in the period 2002–2008, using 5-minute intraday data. The authors find that daily returns on the corresponding exchange rates, scaled by model-free estimates of daily realized volatility, are approximately normally distributed and independent over time. On the other hand, daily realized variances exhibit substantial positive skewness and a very persistent, long-memory type of dynamics. The authors estimate a simple three-equation model for daily returns, realized variance and the time-varying volatility of realized variance. The model captures all salient features of the data very well and can be successfully employed for constructing point as well as density forecasts of future volatility. The authors also discuss some issues associated with measuring volatility from noisy high-frequency data and employ a simple correction that accounts for the distortions present in the dataset.
    Keywords: intraday data, realized variance, return and volatility distributions, heterogeneous autoregressive model
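    The basic realized-variance construction the paper relies on is simple to sketch (assuming clean, equally spaced intraday prices; the paper's microstructure-noise correction is omitted):

```python
# Realized-variance sketch: daily variance approximated by the sum of
# squared intraday log returns (clean, equally spaced prices assumed;
# the paper's microstructure-noise correction is omitted).
import math

def realized_variance(prices):
    """Sum of squared intraday log returns over one day."""
    return sum(math.log(p1 / p0) ** 2 for p0, p1 in zip(prices, prices[1:]))

def scaled_return(prices):
    """Daily log return scaled by realized volatility (approx. N(0,1))."""
    r = math.log(prices[-1] / prices[0])
    return r / math.sqrt(realized_variance(prices))
```

    The paper's empirical finding is that `scaled_return`-style quantities are close to standard normal, while the realized variances themselves are skewed and highly persistent.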