
An optimization model for metabolic pathways

Author: Arita
Arita
Beasley
Croes
Croes
de Figueiredo
F. J. Planes
Ihmels
J. E. Beasley
Jeong
Keseler
Kharchenko
Klamt
Küffner
Ma
Meléndez-Hevia
Meléndez-Hevia
Meléndez-Hevia
Nelson
Palsson
Planes
Planes
Rahman
Reed
Schuster
Schuster
Schuster
Urbanczik
Wagner
Publication venue: 'Oxford University Press (OUP)'
Publication date: 20/07/2009

This article is available open access through the publisher’s website. Copyright © The Author 2009. Motivation: Different mathematical methods have emerged in the post-genomic era to determine metabolic pathways. These methods can be divided into stoichiometric methods and path finding methods. In this paper we detail a novel optimization model, based upon integer linear programming, to determine metabolic pathways. Our model links reaction stoichiometry with path finding in a single approach. We test the ability of our model to determine 40 annotated Escherichia coli metabolic pathways. We show that our model is able to determine 36 of these 40 pathways in a computationally effective manner. Supplementary information: Supplementary data are available at Bioinformatics online (http://bioinformatics.oxfordjournals.org/cgi/content/full/btp441/DC1).

Brunel University Research Archive

Discovering transcriptional modules by Bayesian data integration

Author: Antoniak
Bar-Joseph
Bernard J. de la Cruz
Bähler
Cho
Dahl
Datta
David L. Wild
Eisen
Falcon
Ferguson
Fritsch
Gasch
Gerber
Geweke
Harbison
Ideker
Ihmels
Jim E. Griffin
Kundaje
Lee
Liu
Liu
Medvedovic
Medvedovic
Qin
Rasmussen
Rasmussen
Reid
Richard S. Savage
Savage
Segal
Segal
Teh
Teh
Wild
Yao
Yeung
Zoubin Ghahramani
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2010

Motivation: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. Results: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs.

Warwick Research Archives Portal Repository

Kent Academic Repository

CUED - Cambridge University Engineering Department

Characterization and Comparison of the Tissue-Related Modules in Human and Mouse

Author: AI Su
Art F. Y. Poon
B Zhang
Bing Su
BY Liao
BY Liao
D Deutscher
D Segre
D Smedley
DT Odom
E Hubbell
E Ravasz
E Segal
EA Glazov
EI Boyle
G Bejerano
G Chartrand
GM Rubin
H Ge
H Kitano
H Ramsay
HB Fraser
HB Fraser
J Ihmels
J Ihmels
J Ihmels
J Ihmels
J Yang
JD Thompson
M Ashburner
MA Pujana
MB Eisen
MD Wilson
ME Glasner
OR Bininda-Emonds
P Khaitovich
P Pamilo
P Tamayo
P Tsaparas
PM Kim
R Nielsen
Ruolin Yang
S Bergmann
S Bergmann
S Tavazoie
SA Rifkin
TC Freeman
W Enard
WS Cleveland
XJ Yu
Y Guan
YQ Wang
Z Wang
Z Wu
Z Yang
Z Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 22/07/2010

BACKGROUND: Due to advances in high-throughput technology and data-collection approaches, we are now in an unprecedented position to understand the evolution of organisms. Great efforts have characterized many individual genes responsible for interspecies divergence, yet little is known about genome-wide divergence at a higher level. Modules, serving as the building blocks and operational units of biological systems, provide more information than individual genes. Hence, comparative analysis between species at the module level would shed more light on the mechanisms underlying the evolution of organisms than traditional comparative genomics approaches. RESULTS: We systematically identified the tissue-related modules using the iterative signature algorithm (ISA), and we detected 52 and 65 modules in the human and mouse genomes, respectively. The gene expression patterns indicate that all of these predicted modules are highly likely to be real biological modules. In addition, we defined a novel quantity, "total constraint intensity," a proxy of multiple constraints (of co-regulated genes and tissues where the co-regulation occurs) on the evolution of genes in module context. We demonstrate that the evolutionary rate of a gene is negatively correlated with its total constraint intensity. Furthermore, there are modules coding the same essential biological processes whose gene contents have diverged extensively between human and mouse. CONCLUSIONS: Our results suggest that, unlike the composition of modules, which differs greatly between human and mouse, the functional organization of the corresponding modules may evolve in a more conservative manner. Most importantly, our findings imply that similar biological processes can be carried out by different sets of genes in human and mouse; therefore, functional data on individual genes from mouse may not apply to human on certain occasions.

Growth landscape formed by perception and import of glucose in yeast

Author: A Kaniak
A Maier
A Mallavarapu
A Zaslaver
Alexander van Oudenaarden
DA Fell
DB Kell
DR Lorenz
E Airoldi
E Boles
E Dekel
E Levine
E Reifenberger
E Reifenberger
G Stephanopoulos
GM Santangelo
H Moriya
Hyun Youk
I Famili
J Ihmels
J Ihmels
J Monod
J Nielsen
J Stelling
JH Kim
JH Kim
JI Castrillo
JM Gancedo
JR Dickinson
KA Reijenga
LF Bisson
M Sheff
MA Savageau
MA Savageau
MC Walsh
MR Bennett
P Daran-Lapujade
P van Hoek
R Wieczorke
S Goyal
S Klumpp
S Krishna
S Levy
S Ostergaard
S Ozcan
SS Pao
U Alon
VM Boer
Y Jiang
Z Yin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2009

An important challenge in systems biology is to quantitatively describe microbial growth using a few measurable parameters that capture the essence of this complex phenomenon. Two key events at the cell membrane, extracellular glucose sensing and uptake, initiate the budding yeast’s growth on glucose. However, conventional growth models focus almost exclusively on glucose uptake. Here we present results from growth-rate experiments that cannot be explained by focusing on glucose uptake alone. By imposing a glucose uptake rate independent of the sensed extracellular glucose level, we show that despite increasing both the sensed glucose concentration and uptake rate, the cell’s growth rate can decrease or even approach zero. We resolve this puzzle by showing that the interaction between glucose perception and import, not their individual actions, determines the central features of growth, and characterize this interaction using a quantitative model. Disrupting this interaction by knocking out two key glucose sensors significantly changes the cell’s growth rate, yet uptake rates are unchanged. This is due to a decrease in the burden that glucose perception places on the cells. Our work shows that glucose perception and import are separate and pivotal modules of yeast growth, the interaction of which can be precisely tuned and measured. Funding: National Institutes of Health (U.S.), Pioneer Award; Natural Sciences and Engineering Research Council of Canada (NSERC), Graduate Fellowship.

DSpace@MIT

Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics

Author: A Su
B Frey
C Nutt
C Rasmussen
C Rasmussen
D Arango
D Jiang
D Singh
David R. J. Snead
E Cooke
Ferdinando Di Cunto
G Brock
J Ihmels
J Yao
K Yeung
Korsuk Sirinukunwattana
L Hubert
L McQuitty
LF Wu
M De Souto
M Eisen
M Shipp
Muhammad F. Bari
Nasir M. Rajpoot
P D'haeseleer
P Laiho
R Neal
R Savage
R Sokal
Richard S. Savage
S Armstrong
S Datta
S Eschrich
S Falcon
S Matsui
S Pomeroy
S Ramaswamy
S Varambally
T Golub
Y Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Clustering analysis is an important tool in studying gene expression data. The Bayesian hierarchical clustering (BHC) algorithm can automatically infer the number of clusters and uses Bayesian model selection to improve clustering quality. In this paper, we present an extension of the BHC algorithm. Our Gaussian BHC (GBHC) algorithm represents data as a mixture of Gaussian distributions. It uses a normal-gamma distribution as a conjugate prior on the mean and precision of each of the Gaussian components. We tested GBHC on 11 cancer and 3 synthetic datasets. The results on cancer datasets show that in sample clustering, GBHC on average produces a clustering partition that is more concordant with the ground truth than those obtained from other commonly used algorithms. Furthermore, GBHC frequently infers a number of clusters that is close to the ground truth. In gene clustering, GBHC also produces a clustering partition that is more biologically plausible than several other state-of-the-art methods. This suggests GBHC as an alternative tool for studying gene expression data. The implementation of GBHC is available at https://sites.google.com/site/gaussianbhc
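To illustrate the conjugate prior mentioned in the abstract above, here is a minimal sketch of the standard normal-gamma posterior update for one-dimensional Gaussian data with unknown mean and precision. It is not taken from the GBHC implementation: the class name `NormalGamma`, the function `posterior`, and the parameter names (`mu`, `kappa`, `alpha`, `beta`) are illustrative assumptions that follow the usual textbook parameterization.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NormalGamma:
    """Hypothetical container for normal-gamma hyperparameters (textbook form)."""
    mu: float     # prior location of the Gaussian mean
    kappa: float  # pseudo-count expressing confidence in mu
    alpha: float  # shape of the Gamma prior on the precision
    beta: float   # rate of the Gamma prior on the precision


def posterior(prior: NormalGamma, data: Sequence[float]) -> NormalGamma:
    """Return the normal-gamma posterior after observing `data`."""
    n = len(data)
    if n == 0:
        return prior  # nothing observed, posterior equals prior
    xbar = sum(data) / n
    # Sum of squared deviations from the sample mean.
    ss = sum((x - xbar) ** 2 for x in data)
    kappa_n = prior.kappa + n
    mu_n = (prior.kappa * prior.mu + n * xbar) / kappa_n
    alpha_n = prior.alpha + n / 2
    beta_n = (prior.beta + 0.5 * ss
              + prior.kappa * n * (xbar - prior.mu) ** 2 / (2 * kappa_n))
    return NormalGamma(mu_n, kappa_n, alpha_n, beta_n)


if __name__ == "__main__":
    post = posterior(NormalGamma(0.0, 1.0, 1.0, 1.0), [1.0, 2.0, 3.0])
    print(post)
```

Because the prior is conjugate, the posterior has the same functional form as the prior, which is what makes closed-form marginal likelihoods (and hence Bayesian model selection between merges) tractable in this kind of setting.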

CiteSeerX

Qatar University Institutional Repository

Warwick Research Archives Portal Repository

Using Pre-existing Microarray Datasets to Increase Experimental Power: Application to Insulin Resistance

Although they have become a widely used experimental technique for identifying differentially expressed (DE) genes, DNA microarrays are notorious for generating noisy data. A common strategy for mitigating the effects of noise is to perform many experimental replicates. This approach is often costly and sometimes impossible given limited resources; thus, analytical methods are needed which increase accuracy at no additional cost. One inexpensive source of microarray replicates comes from prior work: to date, data from hundreds of thousands of microarray experiments are in the public domain. Although these data assay a wide range of conditions, they cannot be used directly to inform any particular experiment and are thus ignored by most DE gene methods. We present the SVD Augmented Gene expression Analysis Tool (SAGAT), a mathematically principled, data-driven approach for identifying DE genes. SAGAT increases the power of a microarray experiment by using observed coexpression relationships from publicly available microarray datasets to reduce uncertainty in individual genes' expression measurements. We tested the method on three well-replicated human microarray datasets and demonstrate that use of SAGAT increased effective sample sizes by as many as 2.72 arrays. We applied SAGAT to unpublished data from a microarray study investigating transcriptional responses to insulin resistance, resulting in a 50% increase in the number of significant genes detected. We evaluated 11 (58%) of these genes experimentally using qPCR, confirming the directions of expression change for all 11 and statistical significance for three. Use of SAGAT revealed coherent biological changes in three pathways: inflammation, differentiation, and fatty acid synthesis, furthering our molecular understanding of a type 2 diabetes risk factor. 
We envision SAGAT as a means to maximize the potential for biological discovery from subtle transcriptional responses, and we provide it as a freely available software package that is immediately applicable to any human microarray study.

Repression of Mitochondrial Translation, Respiration and a Metabolic Cycle-Regulated Gene, SLF1, by the Yeast Pumilio-Family Protein Puf3p

Author: A Breitkreutz
A Kudlicki
A Russo
AC Goldstrohm
AP Gerber
AR Albig
BC Foat
BP Tu
C Stark
E Eliyahu
F Devaux
G Lelandais
Gerald S. Shadel
GP Cereghino
I Hagen
J Cotney
J Ihmels
J Ihmels
Janine Santos
K Zarnack
LA Grivell
LC Lai
LJ Garcia-Rodriguez
M Carlson
MA Lebedeva
Marc Chatenay-Lapointe
MS Rodeheffer
MS Rodeheffer
ND Bonawitz
ND Bonawitz
R Mehta
RE Kellems
SG Sobel
SI Lee
SL Forsburg
SW Ho
T Quenault
TE Shutt
W Olivas
Y Deng
Y Pan
Y Saint-Georges
Z Liu
Publication venue: Public Library of Science
Publication date: 31/05/2011

Synthesis and assembly of the mitochondrial oxidative phosphorylation (OXPHOS) system requires genes located both in the nuclear and mitochondrial genomes, but how gene expression is coordinated between these two compartments is not fully understood. One level of control is through regulated expression of mitochondrial ribosomal proteins and other factors required for mitochondrial translation and OXPHOS assembly, which are all products of nuclear genes that are subsequently imported into mitochondria. Interestingly, this cadre of genes in budding yeast has in common a 3′-UTR element that is bound by the Pumilio family protein, Puf3p, and is coordinately regulated under many conditions, including during the yeast metabolic cycle. Multiple functions have been assigned to Puf3p, including promoting mRNA degradation, localizing nucleus-encoded mitochondrial transcripts to the outer mitochondrial membrane, and facilitating mitochondria-cytoskeletal interactions and motility. Here we show that Puf3p has a general repressive effect on mitochondrial OXPHOS abundance, translation, and respiration that does not involve changes in overall mitochondrial biogenesis and is largely independent of TORC1-mitochondrial signaling. We also identified the cytoplasmic translation factor Slf1p as a yeast metabolic cycle-regulated gene that is repressed by Puf3p at the post-transcriptional level and promotes respiration and extension of yeast chronological life span when over-expressed. Altogether, these results should facilitate future studies on which of the many functions of Puf3p is most relevant for regulating mitochondrial gene expression and the role of nuclear-mitochondrial communication in aging and longevity.

A classification-based framework for predicting and analyzing gene regulatory response

Author: AJ Hartemink
Anshul Kundaje
AP Gasch
AP Gasch
Chris H Wiggins
Christina Leslie
CI Holmberg
D Pe'er
D Pe'er
D Pollard
DC Raitt
E Ramil
E Segal
E Segal
ER Gansner
HJ Bussemaker
I Ota
I Pedruzzi
J Ihmels
JD Hughes
JT Lin
M Middendorf
M Middendorf
M Middendorf
MA Beer
Manuel Middendorf
Mihir Shah
P Zarzov
RE Schapire
TI Lee
VK Vyas
W Hoeffding
Y Pilpel
Yoav Freund
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem — predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. 
Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a more shallow and interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from
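As a rough illustration of the boosting idea described in the abstract above (predicting +1/-1 labels from the presence of features), the following is a minimal sketch of plain AdaBoost with one-feature decision stumps. It is a simplification under stated assumptions: it does not build an alternating decision tree, does not use abstaining rules or the stabilized boosting variant, and is not the GeneClass code. The names `train_adaboost` and `predict` and the toy data are hypothetical.

```python
import math
from typing import List, Tuple

Stump = Tuple[float, int, int]  # (alpha, feature index, sign)


def stump_output(x: List[int], j: int, sign: int) -> int:
    """Predict `sign` if binary feature j is present, otherwise `-sign`."""
    return sign if x[j] else -sign


def train_adaboost(X: List[List[int]], y: List[int], rounds: int = 5) -> List[Stump]:
    """Plain AdaBoost over one-feature stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform example weights
    model: List[Stump] = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for j in range(len(X[0])):
            for sign in (1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump_output(xi, j, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, sign)
        err, j, sign = best
        # Clamp the error so the log below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, sign))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_output(xi, j, sign))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model: List[Stump], x: List[int]) -> int:
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_output(x, j, sign) for alpha, j, sign in model)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Toy data: feature 0 (say, a motif being present) determines the label.
    X = [[1, 0], [1, 1], [0, 1], [0, 0]]
    y = [1, 1, -1, -1]
    model = train_adaboost(X, y, rounds=3)
    print([predict(model, x) for x in X])
```

In this toy setting each binary feature stands in for something like "motif present in the regulatory region"; the framework described in the abstract additionally combines motifs with regulator expression levels, which this sketch does not attempt.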

Springer - Publisher Connector

Columbia University Academic Commons

Extracting expression modules from perturbational gene expression compendia

Author: A Joshi
A Prelić
A Tanay
A Tanay
AL Barabási
AW Rives
C Stark
CE Horak
CT Harbison
D Pe'er
DJ Reiss
Dk Lee
E Ragni
E Ravasz
E Segal
E Segal
G Getz
G Lesage
GD Bader
GK Smyth
H Kitano
I Laloux
I Laloux
J Ihmels
J Ihmels
J Supper
JA Ubersax
JDJ Han
L Lazzeroni
LA Amaral
LF Wu
LH Hartwell
M Ashburner
M Gaisne
M Halkidi
M Schmid
Martin Kuiper
MB Eisen
MG Walker
MZ Bao
N Bolshakova
N Metropolis
P D'haeseleer
Patrick Van Dijck
Q Sheng
R Albert
R Shamir
R Tanaka
S Barkow
S Bergmann
S Bergmann
S Erdman
S Hohmann
S Kirkpatrick
S Maere
SC Madeira
SK Kim
Steven Maere
T Ideker
T Michoel
TR Hughes
W Zhang
X Cui
Y Benjamini
Y Cheng
Y Kluger
Z Bar-Joseph
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Compendia of gene expression profiles under chemical and genetic perturbations constitute an invaluable resource from a systems biology perspective. However, the perturbational nature of such data imposes specific challenges on the computational methods used to analyze them. In particular, traditional clustering algorithms have difficulties in handling one of the prominent features of perturbational compendia, namely partial coexpression relationships between genes. Biclustering methods on the other hand are specifically designed to capture such partial coexpression patterns, but they show a variety of other drawbacks. For instance, some biclustering methods are less suited to identify overlapping biclusters, while others generate highly redundant biclusters. Also, none of the existing biclustering tools takes advantage of the staple of perturbational expression data analysis: the identification of differentially expressed genes. Results: We introduce a novel method, called ENIGMA, that addresses some of these issues. ENIGMA leverages differential expression analysis results to extract expression modules from perturbational gene expression data. The core parameters of the ENIGMA clustering procedure are automatically optimized to reduce the redundancy between modules. In contrast to the biclusters produced by most other methods, ENIGMA modules may show internal substructure, i.e. subsets of genes with distinct but significantly related expression patterns. The grouping of these (often functionally) related patterns in one module greatly aids in the biological interpretation of the data. We show that ENIGMA outperforms other methods on artificial datasets, using a quality criterion that, unlike other criteria, can be used for algorithms that generate overlapping clusters and that can be modified to take redundancy between clusters into account. 
Finally, we apply ENIGMA to the Rosetta compendium of expression profiles for Saccharomyces cerevisiae and we analyze one pheromone response-related module in more detail, demonstrating the potential of ENIGMA to generate detailed predictions. Conclusion: It is increasingly recognized that perturbational expression compendia are essential to identify the gene networks underlying cellular function, and efforts to build these for different organisms are currently underway. We show that ENIGMA constitutes a valuable addition to the repertoire of methods to analyze such data.

Springer - Publisher Connector

Ghent University Academic Bibliography