
Texas A&M Repository

Genomic applications of statistical signal processing

Author: Zhao Wentao
Publication date: 15/05/2009

Biological phenomena in cells can be explained in terms of the interactions among biological macromolecules, e.g., DNA, RNA and proteins. These interactions can be modeled by genetic regulatory networks (GRNs). This dissertation proposes to reverse engineer GRNs from heterogeneous biological data sets, including time-series and time-independent gene expression data, Chromatin ImmunoPrecipitation (ChIP) data, gene sequences and motifs, and other possible sources of knowledge. The objective of this research is to propose novel computational methods that keep pace with fast-evolving biological databases. Signal processing techniques are exploited to develop computationally efficient, accurate and robust algorithms that deal individually or collectively with various data sets. Methods of power spectral density estimation are discussed to identify genes participating in various biological processes. Information theoretic methods are applied for non-parametric inference. Bayesian methods are adopted to incorporate several sources with prior knowledge. This work aims to construct an inference system that takes into account different sources of information, such that the absence of some components will not interfere with the rest of the system. It has been verified, on artificial time-series and steady-state microarray measurements, that the proposed algorithms achieve better inference accuracy and higher computational efficiency than other state-of-the-art schemes, e.g., REVEAL, ARACNE, Bayesian networks and relevance networks. The proposed algorithms are especially appealing when the sample size is small. In addition, they are able to integrate multiple heterogeneous data sources, e.g., ChIP and sequence data, so that a unified GRN can be inferred. The analysis of biological literature and in silico experiments on real data sets for fruit fly, yeast and human have corroborated part of the inferred GRN. The research has also produced a set of potential control targets for designing gene therapy strategies.

Texas A&M Repository

Analysis of Gene Coexpression by B-Spline Based CoD Estimation

Author: Li Huai
Sun Yu
Zhan Ming
Publication venue: BioMed Central
Publication date: 01/01/2007

The gene coexpression study has emerged as a novel holistic approach for microarray data analysis. Different indices have been used to explore coexpression relationships, but each is associated with certain pitfalls. Pearson's correlation coefficient, for example, cannot uncover nonlinear patterns or the directionality of coexpression. Mutual information can detect nonlinearity but fails to show directionality. The coefficient of determination (CoD) is unique in exploring different patterns of gene coexpression, but it has so far been applied only to discrete data, and converting continuous microarray data to a discrete format can lead to information loss. Here, we propose an effective algorithm, CoexPro, for gene coexpression analysis. The new algorithm is based on B-spline approximation of the coexpression between a pair of genes, followed by CoD estimation. The algorithm was justified by simulation studies and by functional semantic similarity analysis. The proposed algorithm is capable of uncovering both linear and a specific class of nonlinear relationships from continuous microarray data. It can also suggest a possible directionality of coexpression to researchers. The new algorithm presents a novel model for gene coexpression and will be a valuable tool for a variety of gene expression and network studies. The application of the algorithm was demonstrated by an analysis of ligand-receptor coexpression in cancerous and noncancerous cells. The software implementing the algorithm is available upon request from the authors.
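
The contrast between Pearson's correlation and the CoD can be illustrated with a small sketch. This is not CoexPro: a cubic polynomial fit (`numpy.polyfit`) stands in for the B-spline approximation, the function name `cod` is hypothetical, and the data are synthetic. It only shows the general idea that a CoD of the form (e0 - e) / e0 is asymmetric in its two arguments, which is what allows it to hint at directionality.

```python
import numpy as np

def cod(x, y, degree=3):
    """Coefficient of determination of y predicted from x.

    CoD = (e0 - e) / e0, where e0 is the squared error of the best
    constant predictor (the mean of y) and e is the squared error of a
    fitted predictor. ASSUMPTION: a cubic polynomial stands in for the
    B-spline approximation described in the abstract.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    e0 = np.sum((y - y.mean()) ** 2)
    if e0 == 0.0:
        return 0.0  # a constant target cannot be improved upon
    coeffs = np.polyfit(x, y, degree)
    e = np.sum((y - np.polyval(coeffs, x)) ** 2)
    return max(0.0, (e0 - e) / e0)

# Synthetic "expression" values: y depends on x through a parabola.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = x ** 2 + rng.normal(0.0, 0.05, 200)

pearson = np.corrcoef(x, y)[0, 1]  # close to zero for this symmetric pattern
cod_x_to_y = cod(x, y)             # high: x predicts y well
cod_y_to_x = cod(y, x)             # low: y does not determine the sign of x
```

In this toy case the linear correlation misses the relationship, while the two CoD values differ sharply depending on which variable is treated as the predictor.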

Aquila Digital Community (University of Southern Mississippi, USM)

Comparison of Probabilistic Boolean Network and Dynamic Bayesian Network Approaches for Inferring Gene Regulatory Networks

Author: Deng Youping
Gong Ping
Li Peng
Perkins Edward J.
Zhang Chaoyang
Publication venue: The Aquila Digital Community
Publication date: 01/11/2007

Background: The regulation of gene expression is achieved through gene regulatory networks (GRNs), in which collections of genes interact with one another and with other substances in a cell. To understand the underlying function of organisms, it is necessary to study the behavior of genes in a gene regulatory network context. Several computational approaches are available for modeling gene regulatory networks with different datasets. To optimize GRN modeling, these approaches must be compared and evaluated in terms of accuracy and efficiency. Results: In this paper, two important computational approaches for modeling gene regulatory networks, probabilistic Boolean network (PBN) methods and dynamic Bayesian network (DBN) methods, are compared using a biological time-series dataset from the Drosophila Interaction Database to construct a Drosophila gene network. A subset of time points and gene samples from the whole dataset is used to evaluate the performance of the two approaches. Conclusions: The comparison indicates that both approaches performed well in modeling the gene regulatory networks. Accuracy, in terms of recall and precision, can be improved if a smaller subset of genes is selected for inferring GRNs. The accuracy of both approaches depends on the number of selected genes and the number of time points of the gene samples. In all tested cases, DBN identified more gene interactions and gave better recall than PBN.
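
As a rough illustration of how recall and precision can be computed for an inferred network, the sketch below compares a set of inferred directed edges against a reference set. The function name `precision_recall` and the gene labels are hypothetical; the abstract does not say how the authors scored edges (for example, whether direction was taken into account), so treating edges as ordered pairs is an assumption.

```python
def precision_recall(inferred, reference):
    """Precision and recall of inferred edges against a reference network.

    ASSUMPTION: edges are directed (regulator, target) pairs, so a
    reversed edge counts as a miss.
    """
    inferred, reference = set(inferred), set(reference)
    true_positives = len(inferred & reference)
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical gene labels, for illustration only.
reference = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")}
inferred = {("A", "B"), ("B", "C"), ("D", "A")}

precision, recall = precision_recall(inferred, reference)
```

Here two of the three inferred edges appear in the reference, and two of the four reference edges are recovered.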

Uncovering Gene Regulatory Networks from Time-Series Microarray Data with Variational Bayesian Structural Expectation Maximization

Author: Huang Yufei
Luna Isabel Tienda
Padillo Diego P Ruiz
Perez M Carmen Carrion
Yin Yufang
Publication venue: BioMed Central
Publication date: 01/01/2007

In this paper, we investigate the reverse engineering of gene regulatory networks from time-series microarray data. We apply dynamic Bayesian networks (DBNs) to model cell cycle regulation. In developing a network inference algorithm, we focus on soft solutions that can provide the a posteriori probability (APP) of the network topology. In particular, we propose a variational Bayesian structural expectation maximization (VBSEM) algorithm that can learn the posterior distribution of the network model parameters and topology jointly. We also show how the obtained APPs of the network topology can be used in a Bayesian data integration strategy to integrate two different microarray data sets. The proposed VBSEM algorithm has been tested on yeast cell cycle data sets. To evaluate the confidence of the inferred networks, we apply a moving block bootstrap method. The inferred network is validated by comparing it to the KEGG pathway map.
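
The moving block bootstrap mentioned above can be sketched in a few lines. This is a generic textbook version, not the authors' implementation: the function name `moving_block_bootstrap`, the block length, and the toy series are all assumptions. The idea is to resample contiguous blocks of time points, so that short-range temporal dependence is preserved within each block.

```python
import numpy as np

def moving_block_bootstrap(series, block_len, rng):
    """Draw one bootstrap replicate by concatenating random contiguous blocks."""
    series = np.asarray(series)
    n = series.shape[0]
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    n_blocks = -(-n // block_len)  # ceiling division
    # Each block may start at any position that leaves room for block_len points.
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_len) for s in starts])[:n]
    return series[idx]

rng = np.random.default_rng(1)
expression = np.arange(12)  # toy time series; values equal their time index
replicate = moving_block_bootstrap(expression, block_len=4, rng=rng)
```

Repeating the resampling many times and re-running network inference on each replicate gives an empirical frequency for each edge, which is one common way to attach a confidence to it.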

Modeling and control of genetic regulatory networks

Author: Pal Ranadip
Publication date: 15/05/2009

Texas A&M Repository

Closed Likelihood Ratio Testing Procedures to Assess Similarity of Covariance Matrices

Author: Akaike H.
Anderson T. W.
Antonio Punzo
Bagnato L.
Bensmail H.
Biernacki C.
Boente G.
Bozdogan H.
Bretz F.
Campbell N. A.
Cavanaugh J. E.
Celeux G.
Christensen R.
Emerson S.
Fisher R. A.
Flury B. N.
Francesca Greselin
Giancristofaro Arboretti R.
Greselin F.
Hallin M.
Hochberg Y.
Holm S.
Jolicoeur P.
Manly B. F. J.
Marcus R.
R Development Core Team
Rencher A. C.
Schmidt-Nielsen K.
Schwarz G.
Westfall P.
Westfall P. H.
Publication venue: Informa UK Limited

Parallel mutual information estimation for inferring gene regulatory networks on GPUs

Author: AJ Butte
AM Fraser
Bertil Schmidt
CO Daub
E Lindholm
Haixiang Shi
I Arsic
J Schäfer
J Wilson
J Zola
JPW Pluim
M Tebmann
N Friedman
P D'Haeseleer
SA Manavski
W Liu
Weiguo Liu
Wolfgang Müller-Wittig
X Chen
X Zhou
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Mutual information is a measure of similarity between two variables. It has been widely used in various application domains, including computational biology, machine learning, statistics, image processing, and financial computing. The simple histogram-based mutual information estimators used previously are less precise than kernel-based methods. The recently introduced B-spline function based mutual information estimation method is competitive with the kernel-based methods in terms of quality, but at a lower computational complexity. Results: We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82-fold using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. Comparisons with existing methods, including ARACNE and TINGe, show that CUDA-MI produces GRNs of higher quality in less time. Conclusions: CUDA-MI is publicly available open-source software, written in the CUDA and C++ programming languages. It obtains a significant speedup over a sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs.
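
For orientation, the simple histogram-based estimator that the abstract uses as its baseline can be sketched as follows. This is not CUDA-MI and does not use B-spline binning or a GPU; it is a plain NumPy plug-in estimator, with the function name `mutual_information`, the bin count, and the synthetic data all chosen here for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=10):
    """Histogram (plug-in) estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # joint probabilities per cell
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nonzero = pxy > 0                     # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log2(pxy[nonzero] / (px @ py)[nonzero])))

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y_dependent = x + 0.1 * rng.normal(size=2000)   # strongly tied to x
y_independent = rng.normal(size=2000)           # unrelated to x

mi_dependent = mutual_information(x, y_dependent)
mi_independent = mutual_information(x, y_independent)
```

Each sample falls into exactly one bin here. A B-spline estimator instead spreads each sample across neighboring bins with fractional weights, which is the refinement the abstract credits with better quality.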