1,566 research outputs found
The EM Algorithm and the Rise of Computational Biology
In the past decade computational biology has grown from a cottage industry
with a handful of researchers to an attractive interdisciplinary field,
catching the attention and imagination of many quantitatively-minded
scientists. Of interest to us is the key role played by the EM algorithm during
this transformation. We survey the use of the EM algorithm in a few important
computational biology problems surrounding the "central dogma"; of molecular
biology: from DNA to RNA and then to proteins. Topics of this article include
sequence motif discovery, protein sequence alignment, population genetics,
evolutionary models and mRNA expression microarray data analysis.Comment: Published in at http://dx.doi.org/10.1214/09-STS312 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Gene Regulatory Network Reconstruction Using Dynamic Bayesian Networks
High-content technologies such as DNA microarrays can provide a system-scale overview of how genes interact with each other in a network context. Various mathematical methods and computational approaches have been proposed to reconstruct GRNs, including Boolean networks, information theory, differential equations and Bayesian networks. GRN reconstruction faces huge intrinsic challenges on both experimental and theoretical fronts, because the inputs and outputs of the molecular processes are unclear and the underlying principles are unknown or too complex.
In this work, we focused on improving the accuracy and speed of GRN reconstruction with Dynamic Bayesian based method. A commonly used structure-learning algorithm is based on REVEAL (Reverse Engineering Algorithm). However, this method has some limitations when it is used for reconstructing GRNs. For instance, the two-stage temporal Bayes network (2TBN) cannot be well recovered by application of REVEAL; it has low accuracy and speed for high dimensionality networks that has above a hundred nodes; and it even cannot accomplish the task of reconstructing a network with 400 nodes. We implemented an algorithm for DBN structure learning with Friedman\u27s score function to replace REVEAL, and tested it on reconstruction of both synthetic networks and real yeast networks and compared it with REVEAL in the absence or presence of preprocessed network generated by Zou and Conzen\u27s algorithm. The new score metric improved the precision and recall of GRN reconstruction. Networks of gene interactions were reconstructed using a Dynamic Bayesian Network (DBN) approach and were analyzed to identify the mechanism of chemical-induced reversible neurotoxicity through reconstruction of gene regulatory networks in earthworms with tools curating relevant genes from non-model organism\u27s pathway to model organism pathway
Kernel methods in genomics and computational biology
Support vector machines and kernel methods are increasingly popular in
genomics and computational biology, due to their good performance in real-world
applications and strong modularity that makes them suitable to a wide range of
problems, from the classification of tumors to the automatic annotation of
proteins. Their ability to work in high dimension, to process non-vectorial
data, and the natural framework they provide to integrate heterogeneous data
are particularly relevant to various problems arising in computational biology.
In this chapter we survey some of the most prominent applications published so
far, highlighting the particular developments in kernel methods triggered by
problems in biology, and mention a few promising research directions likely to
expand in the future
Integrate qualitative biological knowledge for gene regulatory network reconstruction with dynamic Bayesian networks
Reconstructing gene regulatory networks, especially the dynamic gene networks that reveal the temporal program of gene expression from microarray expression data, is essential in systems biology. To overcome the challenges posed by the noisy and under-sampled microarray data, developing data fusion methods to integrate legacy biological knowledge for gene network reconstruction is a promising direction. However, large amount of qualitative biological knowledge accumulated by previous research, albeit very valuable, has received less attention for reconstructing dynamic gene networks due to its incompatibility with the quantitative computational models.;In this dissertation, I introduce a novel method to fuse qualitative gene interaction information with quantitative microarray data under the Dynamic Bayesian Networks framework. This method extends the previous data integration methods by its capabilities of both utilizing qualitative biological knowledge by using Bayesian Networks without the involvement of human experts, and taking time-series data to produce dynamic gene networks. The experimental study shows that when compared with standard Dynamic Bayesian Networks method which only uses microarray data, our method excels by both accuracy and consistency
- …