151 research outputs found
Bayesian inference for multiple Gaussian graphical models with application to metabolic association networks
We investigate the effect of cadmium (a toxic environmental pollutant) on the correlation structure of a number of urinary metabolites using Gaussian graphical models (GGMs). The inferred metabolic associations can provide important information on the physiological state of a metabolic system and insights on complex metabolic relationships. Using the fitted GGMs, we construct differential networks, which highlight significant changes in metabolite interactions under different experimental conditions. The analysis of such metabolic association networks can reveal differences in the underlying biological reactions caused by cadmium exposure. We consider Bayesian inference and propose using the multiplicative (or Chung–Lu random graph) model as a prior on the graphical space. In the multiplicative model, each edge is chosen independently with probability equal to the product of the connectivities of the end nodes. This class of prior is parsimonious yet highly flexible; it can be used to encourage sparsity or graphs with a pre-specified degree distribution when such prior knowledge is available. We extend the multiplicative model to multiple GGMs linking the probability of edge inclusion through logistic regression and demonstrate how this leads to joint inference for multiple GGMs. A sequential Monte Carlo (SMC) algorithm is developed for estimating the posterior distribution of the graphs
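The multiplicative prior described in this abstract can be illustrated with a minimal sketch. It assumes each node has a connectivity strictly between 0 and 1 and that the graph is undirected. The function names `sample_graph` and `log_prior` are hypothetical and not taken from the paper. The extension to multiple GGMs and the SMC sampler are not shown.

```python
import math
import random
from itertools import combinations


def _check(connectivity):
    # Assumption: connectivities lie strictly in (0, 1), so every edge
    # probability (a product of two of them) is also in (0, 1).
    if any(not 0.0 < c < 1.0 for c in connectivity):
        raise ValueError("each connectivity must lie strictly between 0 and 1")


def sample_graph(connectivity, seed=None):
    """Draw an undirected graph: edge (i, j), i < j, is included
    independently with probability connectivity[i] * connectivity[j]."""
    _check(connectivity)
    rng = random.Random(seed)
    return {
        (i, j)
        for i, j in combinations(range(len(connectivity)), 2)
        if rng.random() < connectivity[i] * connectivity[j]
    }


def log_prior(edges, connectivity):
    """Log prior probability of a graph given as a set of (i, j) pairs, i < j."""
    _check(connectivity)
    total = 0.0
    for i, j in combinations(range(len(connectivity)), 2):
        p = connectivity[i] * connectivity[j]
        total += math.log(p) if (i, j) in edges else math.log(1.0 - p)
    return total
```

Lowering all connectivities favours sparse graphs, while raising the connectivity of a few nodes favours hubs, which is one way to read the abstract's remark about encouraging sparsity or a chosen degree distribution.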
Bayesian solutions to the label switching problem
The label switching problem, the unidentifiability of the permutation of clusters or more generally latent variables, makes interpretation of results computed with MCMC sampling difficult. We introduce a fully Bayesian treatment of the permutations which performs better than alternatives. The method can be used to compute summaries of the posterior samples even for nonparametric Bayesian methods, for which no good solutions exist so far. Although approximate in this case, the results are very promising. The summaries are intuitively appealing: a summarized cluster is defined as a set of points for which the likelihood of being in the same cluster is maximized
An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration
While statisticians are well-accustomed to performing exploratory analysis in
the modeling stage of an analysis, the notion of conducting preliminary
general-purpose exploratory analysis in the Monte Carlo stage (or more
generally, the model-fitting stage) of an analysis is an area which we feel
deserves much further attention. Towards this aim, this paper proposes a
general-purpose algorithm for automatic density exploration. The proposed
exploration algorithm combines and expands upon components from various
adaptive Markov chain Monte Carlo methods, with the Wang-Landau algorithm at
its heart. Additionally, the algorithm is run on interacting parallel chains --
a feature which both decreases computational cost as well as stabilizes the
algorithm, improving its ability to explore the density. Performance is studied
in several applications. Through a Bayesian variable selection example, the
authors demonstrate the convergence gains obtained with interacting chains. The
ability of the algorithm's adaptive proposal to induce mode-jumping is
illustrated through a trimodal density and a Bayesian mixture modeling
application. Lastly, through a 2D Ising model, the authors demonstrate the
ability of the algorithm to overcome the high correlations encountered in
spatial models. Comment: 33 pages, 20 figures (the supplementary materials are included as
appendices)
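For orientation, the basic Wang-Landau idea at the core of this abstract can be sketched as follows. This is a plain single-chain version, not the adaptive interacting algorithm the paper proposes. The state space is split into bins, a running log-penalty is kept per bin, and the chain is pushed out of bins it has visited often. The toy target, the binning `x // 2`, and the step-size schedule `10 / (10 + t)` are all assumptions chosen for illustration.

```python
import math
import random


def toy_log_density(x):
    # Hypothetical unnormalised bimodal target on the integers 0..19.
    return math.log(math.exp(-0.5 * (x - 3) ** 2) + math.exp(-0.5 * (x - 16) ** 2))


def wang_landau(log_density, n_states, bin_of, n_bins, n_iter=20000, seed=0):
    """Single-chain Wang-Landau sketch with a deterministic decaying step size."""
    rng = random.Random(seed)
    log_theta = [0.0] * n_bins  # running log-penalties; only differences matter
    visits = [0] * n_bins       # how often the chain was found in each bin
    x = rng.randrange(n_states)
    for t in range(1, n_iter + 1):
        # Symmetric random-walk proposal; moves off the grid are rejected.
        y = x + rng.choice((-1, 1))
        if 0 <= y < n_states:
            # Metropolis ratio for the penalised target pi(x) / theta[bin(x)].
            log_ratio = (log_density(y) - log_theta[bin_of(y)]) - (
                log_density(x) - log_theta[bin_of(x)]
            )
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                x = y
        b = bin_of(x)
        log_theta[b] += 10.0 / (10.0 + t)  # penalise the bin just visited
        visits[b] += 1
    return log_theta, visits


log_theta, visits = wang_landau(toy_log_density, 20, lambda x: x // 2, 10)
```

The penalty is constant within a bin, so it only flattens the chain's occupancy across bins. How the bins are chosen therefore matters, which is part of what adaptive variants aim to automate.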
Some discussions of D. Fearnhead and D. Prangle's Read Paper "Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation"
This report is a collection of comments on the Read Paper of Fearnhead and Prangle (2011), to appear in the Journal of the Royal Statistical Society Series B, along with a reply from the authors
Analysis of ChIP-seq data via Bayesian finite mixture models with a non-parametric component
In large discrete data sets that require classification into signal and noise components, the distribution of the signal is often very bumpy and does not follow a standard distribution. The signal distribution is therefore further modelled as a mixture of component distributions. However, when the signal component is modelled as a mixture of distributions, we face the challenges of justifying the number of components and the label switching problem (caused by multimodality of the likelihood function). To circumvent these challenges, we propose a non-parametric structure for the signal component. This new method yields more precise estimates and better classifications. We demonstrate the efficacy of the methodology using a ChIP-sequencing data set
Dynamic Mixture-of-Experts Models for Longitudinal and Discrete-Time Survival Data
We propose a general class of flexible models for longitudinal data with special emphasis on discrete-time survival data. The model is a finite mixture model where the subjects are allowed to move between components through time. The time-varying probability of component memberships is modeled as a function of subject-specific time-varying covariates. This allows for interesting within-subject dynamics and manageable computations even with a large number of subjects. Each parameter in the component densities and in the mixing function is connected to its own set of covariates through a link function. The models are estimated using a Bayesian approach via a highly efficient Markov Chain Monte Carlo (MCMC) algorithm with tailored proposals and variable selection in all sets of covariates. The focus of the paper is on models for discrete-time survival data with an application to bankruptcy prediction for Swedish firms, using both exponential and Weibull mixture components. The dynamic mixture-of-experts models are shown to have an interesting interpretation and to dramatically improve the out-of-sample predictive density forecasts compared to models with time-invariant mixture probabilities
From adaptive to generative learning in small and medium enterprises-a network perspective
Organizational learning plays an important role in an organization's competitive advantage. Small and medium enterprises (SMEs), which manage learning and change in a unique context, can benefit from network alliances. This paper draws attention to the progression from adaptive learning to generative learning in an SME within an asymmetric learning relationship. A qualitative study was conducted on a towing company in Taiwan, with 14 in-depth interviews with members of its strategic alliances. The study discusses an asymmetric learning relationship in which a large enterprise occupies the central place in the network, decides the learning policies and practices, and guides both adaptive and generative learning. The SME in this case undertakes adaptive learning to ensure the development of network capability and adopts generative learning through communication channels and resources provided by the central firm. The outcomes of generative learning are the enhancement of absorptive capacity, the transfer of knowledge, shared identities, and shared contextual understanding in the towing industry. Although developing generative learning gives the SME a competitive advantage, its owner chooses to stay small and remain a business owner. This situation meets the psychological needs of the Chinese people
A Novel Test for Gene-Ancestry Interactions in Genome-Wide Association Data
Genome-wide association study (GWAS) data on a disease are increasingly available from multiple related populations. In this scenario, meta-analyses can improve power to detect homogeneous genetic associations, but if there exist ancestry-specific effects, via interactions on genetic background or with a causal effect that co-varies with genetic background, then these will typically be obscured. To address this issue, we have developed a robust statistical method for detecting susceptibility gene-ancestry interactions in multi-cohort GWAS based on closely-related populations. We use the leading principal components of the empirical genotype matrix to cluster individuals into “ancestry groups” and then look for evidence of heterogeneous genetic associations with disease or other trait across these clusters. Robustness is improved when there are multiple cohorts, as the signal from true gene-ancestry interactions can then be distinguished from gene-collection artefacts by comparing the observed interaction effect sizes in collection groups relative to ancestry groups. When applied to colorectal cancer, we identified a missense polymorphism in iron-absorption gene CYBRD1 that associated with disease in individuals of English, but not Scottish, ancestry. The association replicated in two additional, independently-collected data sets. Our method can be used to detect associations between genetic variants and disease that have been obscured by population genetic heterogeneity. It can be readily extended to the identification of genetic interactions on other covariates such as measured environmental exposures. We envisage our methodology being of particular interest to researchers with existing GWAS data, as ancestry groups can be easily defined and thus tested for interactions
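The step of looking for heterogeneous associations across ancestry groups can be illustrated with a standard stand-in, Cochran's Q statistic. This is not the test developed in the paper. The sketch assumes a per-group effect estimate (for example a log odds ratio) and its standard error are already available for one variant. The function name `cochran_q` and the input layout are hypothetical.

```python
def cochran_q(effects, std_errors):
    """Cochran's Q for heterogeneity of one variant's effect across groups.

    effects    : per-group effect estimates (e.g. log odds ratios)
    std_errors : matching standard errors, all positive
    Returns (Q, degrees_of_freedom, pooled_effect). Under homogeneity,
    Q is approximately chi-squared with k - 1 degrees of freedom.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching inputs for at least two groups")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return q, len(effects) - 1, pooled
```

A large Q relative to its reference distribution suggests the effect differs between groups, which is the kind of signal a pooled meta-analysis tends to average away.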