99 research outputs found

    Almost Perfect Privacy for Additive Gaussian Privacy Filters

    We study the maximal mutual information about a random variable $Y$ (representing non-private information) displayed through an additive Gaussian channel when guaranteeing that only $\epsilon$ bits of information is leaked about a random variable $X$ (representing private information) that is correlated with $Y$. Denoting this quantity by $g_\epsilon(X,Y)$, we show that for perfect privacy, i.e., $\epsilon=0$, one has $g_0(X,Y)=0$ for any pair of absolutely continuous random variables $(X,Y)$ and then derive a second-order approximation for $g_\epsilon(X,Y)$ for small $\epsilon$. This approximation is shown to be related to the strong data processing inequality for mutual information under suitable conditions on the joint distribution $P_{XY}$. Next, motivated by an operational interpretation of data privacy, we formulate the privacy-utility tradeoff in the same setup using estimation-theoretic quantities and obtain explicit bounds for this tradeoff when $\epsilon$ is sufficiently small using the approximation formula derived for $g_\epsilon(X,Y)$. Comment: 20 pages. To appear in Springer-Verla
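The quantity described in this abstract can be written as a constrained optimization. The following is only a sketch: the channel parametrization $Z_\gamma = \sqrt{\gamma}\,Y + N$, the gain $\gamma \ge 0$, and the standard Gaussian noise $N$ independent of $(X,Y)$ are assumptions introduced here for illustration, since the abstract does not spell out the filter.

```latex
% Assumed additive Gaussian filter (notation not taken from the abstract):
%   Z_\gamma = \sqrt{\gamma}\, Y + N, \quad N \sim \mathcal{N}(0,1), \quad N \perp (X,Y)
\[
  g_\epsilon(X,Y) \;=\; \sup_{\gamma \ge 0 \,:\, I(X; Z_\gamma) \le \epsilon} I(Y; Z_\gamma)
\]
```

Under this reading, the perfect-privacy statement $g_0(X,Y)=0$ says that no positive gain $\gamma$ reveals information about $Y$ without also leaking some information about $X$.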

    A mixed effect model for bivariate meta-analysis of diagnostic test accuracy studies using a copula representation of the random effects distribution

    Diagnostic test accuracy studies typically report the number of true positives, false positives, true negatives and false negatives. There usually exists a negative association between the number of true positives and true negatives, because studies that adopt a less stringent criterion for declaring a test positive yield higher sensitivities and lower specificities. A generalized linear mixed model (GLMM) is currently recommended to synthesize diagnostic test accuracy studies. We propose a copula mixed model for bivariate meta-analysis of diagnostic test accuracy studies. Our general model includes the GLMM as a special case and can also operate on the original scale of sensitivity and specificity. Summary receiver operating characteristic curves are deduced for the proposed model through quantile regression techniques and different characterizations of the bivariate random effects distribution. Our general methodology is demonstrated with an extensive simulation study and illustrated by re-analysing the data of two published meta-analyses. Our study suggests that there can be an improvement on GLMM in fit to data and makes the argument for moving to copula random effects models. Our modelling framework is implemented in the package CopulaREMADA within the open source statistical environment R.
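For readers unfamiliar with the two quantities being pooled, the sketch below computes sensitivity and specificity from the four counts a single study reports. It uses the standard definitions only and is not part of CopulaREMADA; the function name `sensitivity_specificity` and the example counts are hypothetical.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) for one study's 2x2 counts.

    sensitivity = TP / (TP + FN): proportion of diseased subjects testing positive.
    specificity = TN / (TN + FP): proportion of non-diseased subjects testing negative.
    """
    diseased = tp + fn
    non_diseased = tn + fp
    if diseased == 0 or non_diseased == 0:
        raise ValueError("each group needs at least one subject")
    return tp / diseased, tn / non_diseased


# Hypothetical counts for a single study (illustration only).
sens, spec = sensitivity_specificity(tp=90, fp=20, tn=80, fn=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A bivariate meta-analysis then models the pair (sensitivity, specificity) jointly across studies rather than pooling each one separately.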

    Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work

    One of the most critical problems we face in the study of biological systems is building accurate statistical descriptions of them. This problem has been particularly challenging because biological systems typically contain large numbers of interacting elements, which precludes the use of standard brute force approaches. Recently, though, several groups have reported that there may be an alternate strategy. The reports show that reliable statistical models can be built without knowledge of all the interactions in a system; instead, pairwise interactions can suffice. These findings, however, are based on the analysis of small subsystems. Here we ask whether the observations will generalize to systems of realistic size, that is, whether pairwise models will provide reliable descriptions of true biological systems. Our results show that, in most cases, they will not. The reason is that there is a crossover in the predictive power of pairwise models: If the size of the subsystem is below the crossover point, then the results have no predictive power for large systems. If the size is above the crossover point, the results do have predictive power. This work thus provides a general framework for determining the extent to which pairwise models can be used to predict the behavior of whole biological systems. Applied to neural data, the size of most systems studied so far is below the crossover point.
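As a rough illustration of what a pairwise model looks like, the sketch below builds the distribution P(x) proportional to exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j) over binary states by brute-force enumeration. The function name, the 0/1 state convention, and the parameter values are assumptions chosen for illustration; they are not taken from the paper.

```python
import itertools
import math


def pairwise_maxent_distribution(h, J):
    """Return {state: probability} for x in {0,1}^n under a pairwise model.

    h[i] is the bias on element i; J[i][j] (used for i < j only) is the
    coupling between elements i and j. Enumerates all 2**n states.
    """
    n = len(h)
    weights = {}
    for x in itertools.product((0, 1), repeat=n):
        exponent = sum(h[i] * x[i] for i in range(n))
        exponent += sum(
            J[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n)
        )
        weights[x] = math.exp(exponent)
    z = sum(weights.values())  # normalizing constant (partition function)
    return {x: w / z for x, w in weights.items()}


# Hypothetical parameters for a 3-element system (illustration only).
h = [0.1, -0.2, 0.3]
J = [[0.0, 0.5, 0.0],
     [0.0, 0.0, -0.4],
     [0.0, 0.0, 0.0]]
p = pairwise_maxent_distribution(h, J)
print(len(p), "states; total probability =", round(sum(p.values()), 6))
```

The enumeration visits 2**n states, which is the scaling that makes brute-force approaches impractical for systems with many interacting elements.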

    Applications of a generalized combinatorial problem of Smirnov
