441 research outputs found
Bayesian Structure Learning for Markov Random Fields with a Spike and Slab Prior
In recent years a number of methods have been developed for automatically
learning the (sparse) connectivity structure of Markov Random Fields. These
methods are mostly based on L1-regularized optimization which has a number of
disadvantages such as the inability to assess model uncertainty and expensive
cross-validation to find the optimal regularization parameter. Moreover, the
model's predictive performance may degrade dramatically with a suboptimal value
of the regularization parameter (which is sometimes desirable to induce
sparseness). We propose a fully Bayesian approach based on a "spike and slab"
prior (similar to L0 regularization) that does not suffer from these
shortcomings. We develop an approximate MCMC method combining Langevin dynamics
and reversible jump MCMC to conduct inference in this model. Experiments show
that the proposed model learns a good combination of the structure and
parameter values without the need for separate hyper-parameter tuning.
Moreover, the model's predictive performance is much more robust than L1-based
methods with hyper-parameter settings that induce highly sparse model
structures.Comment: Accepted in the Conference on Uncertainty in Artificial Intelligence
(UAI), 201
Recommended from our members
Weibull regression with Bayesian variable selection to identify prognostic tumour markers of breast cancer survival.
As data-rich medical datasets are becoming routinely collected, there is a growing demand for regression methodology that facilitates variable selection over a large number of predictors. Bayesian variable selection algorithms offer an attractive solution, whereby a sparsity inducing prior allows inclusion of sets of predictors simultaneously, leading to adjusted effect estimates and inference of which covariates are most important. We present a new implementation of Bayesian variable selection, based on a Reversible Jump MCMC algorithm, for survival analysis under the Weibull regression model. A realistic simulation study is presented comparing against an alternative LASSO-based variable selection strategy in datasets of up to 20,000 covariates. Across half the scenarios, our new method achieved identical sensitivity and specificity to the LASSO strategy, and a marginal improvement otherwise. Runtimes were comparable for both approaches, taking approximately a day for 20,000 covariates. Subsequently, we present a real data application in which 119 protein-based markers are explored for association with breast cancer survival in a case cohort of 2287 patients with oestrogen receptor-positive disease. Evidence was found for three independent prognostic tumour markers of survival, one of which is novel. Our new approach demonstrated the best specificity.PJN and SR were funded by the Medical Research Council. PJN also acknowledges partial support from the NIHR Cambridge Biomedical Research Centre.This is the accepted manuscript. The final version is available from SAGE at http://dx.doi.org/10.1177/096228021454874
- …