27,784 research outputs found

Bayesian model-based approaches with MCMC computation to some bioinformatics problems

Author: Bae Kyounghwa
Publication venue: Texas A&M University
Publication date: 29/08/2005

Bioinformatics applications can address the transfer of information at several stages of the central dogma of molecular biology, including transcription and translation. This dissertation focuses on using Bayesian models to interpret biological data in bioinformatics, with Markov chain Monte Carlo (MCMC) as the inference method. First, we use our approach to interpret data at the transcription level. We propose a two-level hierarchical Bayesian model for variable selection on cDNA microarray data. A cDNA microarray quantifies the mRNA levels of thousands of genes simultaneously in one sample. By observing the expression patterns of genes under various treatment conditions, important clues about gene function can be obtained. We consider a multivariate Bayesian regression model and assign priors that favor sparseness in terms of the number of variables (genes) used. We introduce the use of different priors to promote different degrees of sparseness using a unified two-level hierarchical Bayesian model. Second, we apply our method to a problem related to the translation level. We develop hidden Markov models to model linker/non-linker sequence regions in a protein sequence. We use a linker index to exploit differences in amino acid composition between regions from sequence information alone. A goal of protein structure prediction is to take an amino acid sequence (represented as a sequence of letters) and predict its tertiary structure. The identification of linker regions in a protein sequence is valuable in predicting the three-dimensional structure. Because of the complexity of both models encountered in practice, we employ the Markov chain Monte Carlo (MCMC) method, particularly Gibbs sampling (Gelfand and Smith, 1990), for parameter estimation.
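To make the Gibbs sampling step mentioned in this abstract concrete, here is a minimal sketch on a toy problem. It is not the dissertation's hierarchical model: the normal likelihood, the flat prior on the mean, the 1/sigma2 prior on the variance, and the function name `gibbs_normal` are all assumptions chosen so that both full conditionals have closed forms.

```python
import numpy as np

def gibbs_normal(y, n_iter=2000, burn_in=500, seed=0):
    """Gibbs sampler for y_i ~ N(mu, sigma2).

    Assumed priors (illustrative only): flat on mu, p(sigma2) proportional to 1/sigma2.
    Returns an array of post-burn-in draws with columns (mu, sigma2).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    mu, sigma2 = ybar, y.var()  # start at the sample moments
    draws = []
    for t in range(n_iter):
        # Full conditional: mu | sigma2, y ~ N(ybar, sigma2 / n)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))
        # Full conditional: sigma2 | mu, y ~ Inverse-Gamma(n / 2, sum((y - mu)^2) / 2)
        rate = 0.5 * np.sum((y - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(shape=n / 2.0, scale=1.0 / rate)
        if t >= burn_in:
            draws.append((mu, sigma2))
    return np.array(draws)
```

Each iteration draws one parameter from its distribution conditional on the current value of the other, which is the defining feature of Gibbs sampling. Averaging the retained draws column by column approximates the posterior means.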

Texas A&M Repository

A hierarchical Bayesian model for inference of copy number variants and their association to gene expression

Author: Cassese Alberto
Falciani Francesco
Guindani Michele
Tadesse Mahlet G.
Vannucci Marina
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2014

A number of statistical models have been successfully developed for the analysis of high-throughput data from a single source, but few methods are available for integrating data from different sources. Here we focus on integrating gene expression levels with comparative genomic hybridization (CGH) array measurements collected on the same subjects. We specify a measurement error model that relates the gene expression levels to latent copy number states which, in turn, are related to the observed surrogate CGH measurements via a hidden Markov model. We employ selection priors that exploit the dependencies across adjacent copy number states and investigate MCMC stochastic search techniques for posterior inference. Our approach results in a unified modeling framework for simultaneously inferring copy number variants (CNV) and identifying their significant associations with mRNA transcript abundance. We show performance on simulated data and illustrate an application to data from a genomic study on human cancer cell lines. Comment: Published at http://dx.doi.org/10.1214/13-AOAS705 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
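The hidden Markov layer described here, in which latent copy number states generate noisy CGH measurements, can be illustrated with a short decoding sketch. This is not the paper's MCMC procedure: it swaps in plain Viterbi decoding with fixed parameters, and the Gaussian emissions with a shared standard deviation, the state means, and the function name `viterbi_gaussian` are assumptions made for illustration.

```python
import numpy as np

def viterbi_gaussian(obs, means, sd, trans, init):
    """Most likely hidden-state path for an HMM with Gaussian emissions.

    obs: observed values (e.g. CGH log-ratios); means: one emission mean per state;
    sd: shared emission standard deviation (assumed); trans: state transition matrix;
    init: initial state probabilities.
    """
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    n, k = len(obs), len(means)
    # Log emission density up to an additive constant shared by all states
    log_emit = -0.5 * ((obs[:, None] - means[None, :]) / sd) ** 2
    log_trans = np.log(trans)
    score = np.log(init) + log_emit[0]
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # cand[i, j]: best log score of a path ending in state i, then moving i -> j
        cand = score[:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(k)] + log_emit[t]
    # Trace the best path backwards from the best final state
    path = np.zeros(n, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(n - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With three hypothetical states (loss, neutral, gain) and a transition matrix that favors staying in the current state, the decoder groups adjacent probes into segments, which mirrors the dependence across adjacent copy number states that the abstract exploits.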

arXiv.org e-Print Archive

Maastricht University Research Portal

Florence Research

PubMed Central

eScholarship - University of California

DSpace at Rice University

Bayesian Gene Set Analysis

Author: Plevritis Sylvia K.
Shachaf Catherine M.
Shahbaba Babak
Tibshirani Robert
Publication date: 01/01/2010

Gene expression microarray technologies provide simultaneous measurements of a large number of genes. Typical analyses of such data focus on the individual genes, but recent work has demonstrated that evaluating changes in expression across predefined sets of genes often increases statistical power and produces more robust results. We introduce a new methodology for identifying gene sets that are differentially expressed under varying experimental conditions. Our approach uses a hierarchical Bayesian framework where a hyperparameter measures the significance of each gene set. Using simulated data, we compare our proposed method to alternative approaches, such as Gene Set Enrichment Analysis (GSEA) and Gene Set Analysis (GSA). Our approach provides the best overall performance. We also discuss the application of our method to experimental data based on p53 mutation status.

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Inferring gene regulatory networks using ensembles of feature selection techniques

Author: Demeester Piet
Dhaene Tom
Geurts Pierre
Huynh-Thu Vân Anh
Ruyssinck Joeri
Saeys Yvan
Publication date: 01/01/2012

Ghent University Academic Bibliography

Bayesian Tobit quantile regression using g-prior distribution with ridge parameter

Author: Bilias Y
George EI
Keming Yu
Martin A
Rahim Alhamzawi
Zellner A
Publication venue: Informa UK Limited
Publication date: 07/08/2014

A Bayesian approach is proposed for coefficient estimation in the Tobit quantile regression model. The proposed approach is based on placing on the regression coefficients a g-prior distribution that depends on the quantile level. The prior is generalized by introducing a ridge parameter to address important challenges that may arise with censored data, such as multicollinearity and overfitting. Then, a stochastic search variable selection approach based on the g-prior is proposed for the Tobit quantile regression model. An expression for the hyperparameter g is proposed to calibrate the modified g-prior with a ridge parameter to the corresponding g-prior. Some possible extensions of the proposed approach are discussed, including continuous and binary responses in quantile regression. The methods are illustrated using several simulation studies and a microarray study, both of which indicate that the proposed approach performs well.
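The abstract gives no formulas, so the following sketch only illustrates the two ingredients that the term "Tobit quantile regression" combines: the quantile check loss and left-censoring of the response at zero. The objective shown is the standard censored-quantile criterion, assumed here for illustration; it is not the paper's Bayesian g-prior procedure, and the function names are hypothetical.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def tobit_quantile_objective(beta, X, y, tau):
    """Sum of check losses between y and the censored fit max(0, X @ beta).

    Assumes the response is left-censored at zero, as in the standard Tobit setup.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    fitted = np.maximum(0.0, X @ np.asarray(beta, dtype=float))
    return float(np.sum(check_loss(y - fitted, tau)))
```

Minimizing this objective over `beta` gives a frequentist point estimate at quantile level `tau`. A Bayesian treatment such as the one described above would instead combine a likelihood with a prior on the coefficients and sample from the posterior.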

Crossref

Brunel University Research Archive