Almost Perfect Privacy for Additive Gaussian Privacy Filters
We study the maximal mutual information about a random variable $Y$
(representing non-private information) displayed through an additive Gaussian
channel when guaranteeing that only $\epsilon$ bits of information are leaked
about a random variable $X$ (representing private information) that is
correlated with $Y$. Denoting this quantity by $g_\epsilon(X,Y)$, we show that
for perfect privacy, i.e., $\epsilon=0$, one has $g_0(X,Y)=0$ for any pair of
absolutely continuous random variables $(X,Y)$ and then derive a second-order
approximation for $g_\epsilon(X,Y)$ for small $\epsilon$. This approximation is
shown to be related to the strong data processing inequality for mutual
information under suitable conditions on the joint distribution $P_{XY}$. Next,
motivated by an operational interpretation of data privacy, we formulate the
privacy-utility tradeoff in the same setup using estimation-theoretic
quantities and obtain explicit bounds for this tradeoff when $\epsilon$ is
sufficiently small using the approximation formula derived for
$g_\epsilon(X,Y)$.
Comment: 20 pages. To appear in Springer-Verla
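To make the trade-off in this abstract concrete, here is a minimal sketch for one toy case only. It assumes, and the abstract does not state this, that the private variable X and the non-private variable Y are jointly Gaussian with unit variance and correlation `rho`, and that the filter output is Z = sqrt(gamma) * Y + N with N standard Gaussian noise independent of (X, Y). Under those assumptions both mutual informations have closed forms, so the largest utility I(Y; Z) compatible with a leakage budget I(X; Z) <= eps can be computed directly. The function name and the use of nats rather than bits are choices made for this illustration.

```python
import math


def gaussian_utility(eps_nats, rho):
    """Largest I(Y; Z) in nats subject to I(X; Z) <= eps_nats.

    Toy model (an assumption of this sketch): X and Y are unit-variance,
    jointly Gaussian with correlation rho, and Z = sqrt(gamma) * Y + N.
    Then I(Y; Z) = 0.5 * ln(1 + gamma) and
    I(X; Z) = -0.5 * ln(1 - rho**2 * gamma / (1 + gamma)).
    """
    if eps_nats < 0:
        raise ValueError("leakage budget must be non-negative")
    if not 0 < abs(rho) <= 1:
        raise ValueError("rho must be non-zero and at most 1 in magnitude")
    # t = gamma / (1 + gamma) at the point where the leakage constraint is tight.
    t = -math.expm1(-2.0 * eps_nats) / rho**2
    if t >= 1.0:
        # The budget is never reached, so gamma (and the utility) is unbounded.
        return math.inf
    return -0.5 * math.log1p(-t)


for eps in (0.0, 1e-4, 1e-2):
    print(eps, gaussian_utility(eps, rho=0.5))
```

In this toy case a zero leakage budget gives zero utility, and for small budgets the utility grows roughly like `eps / rho**2`, so a weaker correlation between X and Y allows more utility per unit of leakage.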
A mixed effect model for bivariate meta-analysis of diagnostic test accuracy studies using a copula representation of the random effects distribution
Diagnostic test accuracy studies typically report the number of true positives, false positives, true negatives and false negatives. There usually exists a negative association between the number of true positives and true negatives, because studies that adopt a less stringent criterion for declaring a test positive yield higher sensitivities and lower specificities. A generalized linear mixed model (GLMM) is currently recommended to synthesize diagnostic test accuracy studies. We propose a copula mixed model for bivariate meta-analysis of diagnostic test accuracy studies. Our general model includes the GLMM as a special case and can also operate on the original scale of sensitivity and specificity. Summary receiver operating characteristic curves are deduced for the proposed model through quantile regression techniques and different characterizations of the bivariate random effects distribution. Our general methodology is demonstrated with an extensive simulation study and illustrated by re-analysing the data of two published meta-analyses. Our study suggests that there can be an improvement on GLMM in fit to data and makes the argument for moving to copula random effects models. Our modelling framework is implemented in the package CopulaREMADA within the open source statistical environment R.
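The threshold effect described in this abstract can be illustrated with a small sketch. The scores below are invented for the example and do not come from any study; the function name `sens_spec` is likewise hypothetical. The sketch computes sensitivity and specificity from the four cell counts at a strict and a lenient positivity threshold.

```python
def sens_spec(diseased_scores, healthy_scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive."""
    tp = sum(score >= threshold for score in diseased_scores)  # true positives
    fn = len(diseased_scores) - tp                             # false negatives
    tn = sum(score < threshold for score in healthy_scores)    # true negatives
    fp = len(healthy_scores) - tn                              # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Made-up test scores for five diseased and five healthy subjects.
diseased = [0.9, 0.8, 0.7, 0.55, 0.4]
healthy = [0.6, 0.45, 0.3, 0.2, 0.1]

strict = sens_spec(diseased, healthy, threshold=0.65)
lenient = sens_spec(diseased, healthy, threshold=0.35)
print("strict threshold: ", strict)
print("lenient threshold:", lenient)
```

Lowering the threshold raises sensitivity and lowers specificity on the same data, which is the source of the negative association across studies that a bivariate model has to account for.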
Pairwise maximum entropy models for studying large biological systems: when they can and when they can't work
One of the most critical problems we face in the study of biological systems
is building accurate statistical descriptions of them. This problem has been
particularly challenging because biological systems typically contain large
numbers of interacting elements, which precludes the use of standard brute
force approaches. Recently, though, several groups have reported that there may
be an alternate strategy. The reports show that reliable statistical models can
be built without knowledge of all the interactions in a system; instead,
pairwise interactions can suffice. These findings, however, are based on the
analysis of small subsystems. Here we ask whether the observations will
generalize to systems of realistic size, that is, whether pairwise models will
provide reliable descriptions of true biological systems. Our results show
that, in most cases, they will not. The reason is that there is a crossover in
the predictive power of pairwise models: If the size of the subsystem is below
the crossover point, then the results have no predictive power for large
systems. If the size is above the crossover point, the results do have
predictive power. This work thus provides a general framework for determining
the extent to which pairwise models can be used to predict the behavior of
whole biological systems. Applied to neural data, the size of most systems
studied so far is below the crossover point.
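As a rough illustration of what a pairwise maximum entropy model is, the sketch below fits one to a three-variable binary system by exact enumeration. Everything in it is an assumption made for the example: the "true" distribution is invented and deliberately contains a three-way interaction, the parameter names `h` and `J` follow common usage rather than this paper, and plain gradient ascent on the moment mismatch is just one simple fitting method. It says nothing about the crossover behaviour the abstract describes, which concerns much larger systems.

```python
import itertools
import math

N = 3
STATES = list(itertools.product([0, 1], repeat=N))
PAIRS = list(itertools.combinations(range(N), 2))


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def moments(probs):
    """First moments E[x_i] and pairwise moments E[x_i x_j] of a distribution."""
    means = [sum(p * x[i] for p, x in zip(probs, STATES)) for i in range(N)]
    pair_means = {
        (i, j): sum(p * x[i] * x[j] for p, x in zip(probs, STATES))
        for (i, j) in PAIRS
    }
    return means, pair_means


def pairwise_probs(h, J):
    """P(x) proportional to exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j)."""
    weights = [
        math.exp(
            sum(h[i] * x[i] for i in range(N))
            + sum(J[i, j] * x[i] * x[j] for (i, j) in PAIRS)
        )
        for x in STATES
    ]
    return normalize(weights)


def fit_pairwise(target_means, target_pairs, steps=5000, lr=0.5):
    """Gradient ascent on the log-likelihood: move each parameter by the moment gap."""
    h = [0.0] * N
    J = {pair: 0.0 for pair in PAIRS}
    for _ in range(steps):
        means, pair_means = moments(pairwise_probs(h, J))
        for i in range(N):
            h[i] += lr * (target_means[i] - means[i])
        for pair in PAIRS:
            J[pair] += lr * (target_pairs[pair] - pair_means[pair])
    return h, J


# Invented "true" system with a three-way term that no pairwise model can express.
true_probs = normalize([
    math.exp(0.3 * x[0] - 0.2 * x[1] + 0.1 * x[2]
             + 0.5 * x[0] * x[1] + 0.8 * x[0] * x[1] * x[2])
    for x in STATES
])
target_means, target_pairs = moments(true_probs)

h, J = fit_pairwise(target_means, target_pairs)
model_probs = pairwise_probs(h, J)
fit_means, fit_pairs = moments(model_probs)

moment_error = max(
    [abs(a - b) for a, b in zip(fit_means, target_means)]
    + [abs(fit_pairs[pair] - target_pairs[pair]) for pair in PAIRS]
)
kl = sum(p * math.log(p / q) for p, q in zip(true_probs, model_probs))
print("largest moment mismatch:", moment_error)
print("KL(true || pairwise model) in nats:", kl)
```

The fitted model reproduces every mean and pairwise moment of the invented system, yet its distribution still differs from the true one because of the three-way term. How large that gap becomes as systems grow is the kind of question the abstract addresses.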