Graphical-model Based Multiple Testing under Dependence, with Applications to Genome-wide Association Studies
Large-scale multiple testing tasks often exhibit dependence, and leveraging
the dependence between individual tests remains a challenging and important
problem in statistics. With recent advances in graphical models, it is feasible
to use them to perform multiple testing under dependence. We propose a multiple
testing procedure which is based on a Markov-random-field-coupled mixture
model. The ground truth of hypotheses is represented by a latent binary Markov
random field, and the observed test statistics appear as the coupled mixture
variables. The parameters in our model can be automatically learned by a novel
EM algorithm. We use an MCMC algorithm to infer the posterior probability that
each hypothesis is null (termed local index of significance), and the false
discovery rate can be controlled accordingly. Simulations show that the
numerical performance of multiple testing can be improved substantially by
using our procedure. We apply the procedure to a real-world genome-wide
association study on breast cancer, and we identify several SNPs with strong
association evidence.
Comment: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty
in Artificial Intelligence (UAI2012).
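The final step described in the abstract, controlling the false discovery rate from the local index of significance (LIS), can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the LIS values have already been estimated (for example by MCMC), and it assumes the commonly used step-up rule that rejects the k most significant hypotheses for the largest k whose average LIS does not exceed the target level. The function name `lis_step_up` and the example values are hypothetical.

```python
import numpy as np

def lis_step_up(lis, alpha=0.1):
    """Return a boolean rejection mask from posterior null probabilities.

    Assumed rule: sort the LIS values in ascending order and reject the
    first k hypotheses, where k is the largest index such that the mean
    of the k smallest LIS values is at most alpha.
    """
    lis = np.asarray(lis, dtype=float)
    order = np.argsort(lis)  # most significant (smallest LIS) first
    running_mean = np.cumsum(lis[order]) / np.arange(1, lis.size + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(lis.size, dtype=bool)
    if passing.size > 0:
        k = passing[-1] + 1  # number of hypotheses to reject
        reject[order[:k]] = True
    return reject

# Hypothetical posterior null probabilities for six hypotheses.
example_lis = [0.01, 0.80, 0.03, 0.40, 0.02, 0.95]
rejections = lis_step_up(example_lis, alpha=0.1)
print(rejections)
```

The running mean of the rejected LIS values estimates the expected proportion of false discoveries among the rejections, which is why thresholding it at alpha targets FDR control under this assumed rule.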
MLE-induced Likelihood for Markov Random Fields
Due to the intractable partition function, the exact likelihood function for
a Markov random field (MRF), in many situations, can only be approximated.
Major approximation approaches include pseudolikelihood and Laplace
approximation. In this paper, we propose a novel way of approximating the
likelihood function through first approximating the marginal likelihood
functions of individual parameters and then reconstructing the joint likelihood
function from these marginal likelihood functions. For approximating the
marginal likelihood functions, we derive a particular likelihood function from
a modified scenario of coin tossing which is useful for capturing how one
parameter interacts with the remaining parameters in the likelihood function.
For reconstructing the joint likelihood function, we use an appropriate copula
to link up these marginal likelihood functions. Numerical investigation
suggests the superior performance of our approach. Especially as the size of
the MRF increases, both the numerical performance and the computational cost of
our approach remain consistently satisfactory, whereas Laplace approximation
deteriorates and pseudolikelihood becomes computationally unbearable.
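To illustrate why the exact likelihood is intractable and what the pseudolikelihood baseline computes, here is a minimal sketch. It does not implement the copula-based method proposed in the abstract. It assumes a simple Ising-type MRF with states in {-1, +1} and a single coupling parameter `theta`; the model form and all function names are assumptions made for illustration only.

```python
import itertools
import numpy as np

def unnormalized_log_prob(x, edges, theta):
    """theta * sum over edges of x_i * x_j, with each x_i in {-1, +1}."""
    return theta * sum(x[i] * x[j] for i, j in edges)

def exact_log_likelihood(x, edges, theta):
    """Exact log-likelihood; enumerates all 2**n states for the partition function."""
    n = len(x)
    log_terms = [unnormalized_log_prob(s, edges, theta)
                 for s in itertools.product([-1, 1], repeat=n)]
    log_z = np.logaddexp.reduce(log_terms)  # log of the partition function
    return unnormalized_log_prob(x, edges, theta) - log_z

def log_pseudolikelihood(x, edges, theta):
    """Sum of log P(x_i | neighbors of i); needs no partition function."""
    n = len(x)
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    total = 0.0
    for i in range(n):
        field = theta * sum(x[j] for j in neighbors[i])
        # P(x_i | rest) = exp(x_i * field) / (exp(field) + exp(-field))
        total += x[i] * field - np.logaddexp(field, -field)
    return total

# Hypothetical 4-node cycle with one observed configuration.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
observed = (1, 1, -1, 1)
print(exact_log_likelihood(observed, cycle_edges, 0.5))
print(log_pseudolikelihood(observed, cycle_edges, 0.5))
```

The exact computation enumerates 2**n configurations and is feasible only for very small graphs, whereas the pseudolikelihood replaces the joint distribution with a product of per-node conditionals.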