Thermodynamic efficiency of information and heat flow
A basic task of information processing is information transfer (flow). Here
we study a pair of Brownian particles, each coupled to a thermal bath at
temperatures $T_1$ and $T_2$, respectively. The information flow in such a
system is defined via the time-shifted mutual information. The information flow
vanishes at equilibrium, and its efficiency is defined as the ratio of the flow
to the total entropy production in the system. For a stationary state the
information flows from higher to lower temperatures, and its efficiency is
bounded from above by a factor imposed by the second law, which quantifies the
thermodynamic cost of information flow in the present class of systems. This
bound can be reached in the
adiabatic situation, where the particles have widely different characteristic
times. The efficiency of heat flow (defined as the heat flow over the total
amount of dissipated heat) is limited from above by the same factor. There is a
complementarity between heat flow and information flow: the setup which is most
efficient for the former is the least efficient for the latter, and vice versa.
The above bound on the efficiency can be transiently overcome in certain
non-stationary situations, but the efficiency is still bounded from above. We
also study another measure of information processing, the transfer entropy,
proposed in the literature. Though this measure does not require any
thermodynamic cost, the information flow and the transfer entropy are shown to
be intimately related for stationary states.
Comment: 19 pages, 1 figure
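As a rough numerical illustration of the time-shifted mutual information that defines the information flow, here is a minimal sketch assuming two linearly coupled overdamped particles with unit friction, a Gaussian estimate of the mutual information, and illustrative parameter values; none of the constants or the coupling form are taken from the paper:

import numpy as np

# Illustrative sketch (not from the paper): two linearly coupled overdamped
# Brownian particles at bath temperatures T1 and T2, simulated with
# Euler-Maruyama; the time-shifted mutual information I(x(t); y(t+tau)) is
# estimated with the Gaussian formula I = -0.5 * ln(1 - rho^2).
rng = np.random.default_rng(0)
T1, T2 = 2.0, 1.0            # assumed bath temperatures
k, c = 1.0, 0.5              # assumed trap stiffness and coupling strength
dt, n_steps = 1e-3, 100_000

x = np.zeros(n_steps)
y = np.zeros(n_steps)
for i in range(1, n_steps):
    x[i] = x[i-1] + (-k * x[i-1] + c * (y[i-1] - x[i-1])) * dt \
        + np.sqrt(2 * T1 * dt) * rng.standard_normal()
    y[i] = y[i-1] + (-k * y[i-1] + c * (x[i-1] - y[i-1])) * dt \
        + np.sqrt(2 * T2 * dt) * rng.standard_normal()

def shifted_mi(x, y, lag):
    """Gaussian estimate of I(x(t); y(t + lag)) in nats, for lag >= 0 steps."""
    rho = np.corrcoef(x[:len(x) - lag or None], y[lag:])[0, 1]
    return -0.5 * np.log(1.0 - rho ** 2)

lag = 100   # 0.1 time units
flow_proxy = (shifted_mi(x, y, lag) - shifted_mi(x, y, 0)) / (lag * dt)
print(f"I(x_t; y_t)     = {shifted_mi(x, y, 0):.4f} nats")
print(f"I(x_t; y_t+tau) = {shifted_mi(x, y, lag):.4f} nats")
print(f"finite-difference flow proxy = {flow_proxy:.4f} nats per unit time")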
Gauge transformations and symmetries of integrable systems
We analyze several integrable systems in zero-curvature form within the
framework of an invariant gauge theory. In the Drinfeld-Sokolov gauge
we derive a two-parameter family of nonlinear evolution equations which
includes the Korteweg-de Vries (KdV) and Harry Dym equations as special cases.
We find residual gauge transformations which lead to infinitesimal symmetries of
this family of equations. For the KdV and Harry Dym equations we find an infinite
hierarchy of such symmetry transformations, and we investigate their relation
with local conservation laws, constants of the motion and the bi-Hamiltonian
structure of the equations. Applying successive gauge transformations of Miura
type, we obtain a sequence of gauge-equivalent integrable systems, among them
the modified KdV and Calogero-KdV equations.
Comment: 18 pages, no figures; journal version
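For reference, the best-known example of a Miura-type map, written here in standard textbook conventions rather than the paper's notation, relates the modified KdV and KdV equations:

% standard textbook conventions, not the paper's notation
\begin{align}
  u_t - 6\,u\,u_x + u_{xxx} &= 0 \qquad \text{(KdV)}\\
  v_t - 6\,v^2 v_x + v_{xxx} &= 0 \qquad \text{(mKdV)}\\
  u &= v^2 + v_x \qquad \text{(Miura map)}
\end{align}
Substituting the Miura map into the KdV equation gives the factorization
\begin{equation}
  u_t - 6\,u\,u_x + u_{xxx} = \left(2v + \partial_x\right)\left(v_t - 6\,v^2 v_x + v_{xxx}\right),
\end{equation}
so every solution of mKdV is carried to a solution of KdV.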
Eigengene networks for studying the relationships between co-expression modules
Background. There is evidence that genes and their protein products are organized into functional modules according to cellular processes and pathways. Gene co-expression networks have been used to describe the relationships between gene transcripts. Ample literature exists on how to detect biologically meaningful modules in networks, but there is a need for methods that allow one to study the relationships between modules.
Results. We show that network methods can also be used to describe the relationships between co-expression modules and present the following methodology. First, we describe several methods for detecting modules that are shared by two or more networks (referred to as consensus modules). We represent the gene expression profiles of each module by an eigengene. Second, we propose a method for constructing an eigengene network, where the edges are undirected but maintain information on the sign of the co-expression relationship. Third, we propose methods for differential eigengene network analysis that allow one to assess the preservation of network properties across different data sets. We illustrate the value of eigengene networks in studying the relationships between consensus modules in human and chimpanzee brains; the relationships between consensus modules in brain, muscle, liver, and adipose mouse tissues; and the relationships between male-female mouse consensus modules and clinical traits. In some applications, we find that module eigengenes can be organized into higher-level clusters, which we refer to as meta-modules.
Conclusion. Eigengene networks can be effective and biologically meaningful tools for studying the relationships between modules of a gene co-expression network. The proposed methods may reveal a higher-order organization of the transcriptome. R software tutorials, the data, and supplementary material can be found at the following webpage: http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/EigengeneNetwork.
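To make the "eigengene" and "eigengene network" notions concrete, here is a minimal sketch following the usual definition of a module eigengene as the first principal component of the module's expression matrix; the module assignments and data below are simulated, not taken from the paper:

import numpy as np

# Hypothetical sketch of module eigengenes and an eigengene network.
rng = np.random.default_rng(1)
n_samples, n_modules, genes_per_module = 40, 5, 30

def module_eigengene(expr):
    """First principal component of a (samples x genes) expression matrix."""
    centered = expr - expr.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigengene = u[:, 0]
    # orient the eigengene so it correlates positively with average expression
    if np.corrcoef(eigengene, centered.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

# simulate modules: each module = shared latent signal + gene-level noise
eigengenes = []
for _ in range(n_modules):
    signal = rng.standard_normal(n_samples)
    expr = signal[:, None] + 0.5 * rng.standard_normal((n_samples, genes_per_module))
    eigengenes.append(module_eigengene(expr))
E = np.column_stack(eigengenes)                # samples x modules

# eigengene network: signed correlations between module eigengenes
eigengene_corr = np.corrcoef(E, rowvar=False)  # modules x modules
print(np.round(eigengene_corr, 2))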
Biological Process Linkage Networks
BACKGROUND. The traditional approach to studying complex biological networks is based on the identification of interactions between internal components of signaling or metabolic pathways. By comparison, little is known about interactions between higher-order biological systems, such as biological pathways and processes. We propose a methodology for gleaning patterns of interactions between biological processes by analyzing protein-protein interactions, transcriptional co-expression and genetic interactions. At the heart of the methodology are the concept of Linked Processes and the resultant network of biological processes, the Process Linkage Network (PLN). RESULTS. We construct, catalogue, and analyze different types of PLNs derived from different data sources and different species. When applied to the Gene Ontology, many of the resulting links connect processes that are distant from each other in the hierarchy, even though the connection makes eminent sense biologically. Some others, however, carry an element of surprise and may reflect mechanisms that are unique to the organism under investigation. In this respect our method complements the link structure between processes inherent in the Gene Ontology, which by its very nature is species-independent. As a practical application of the linkage of processes, we demonstrate that it can be effectively used in protein function prediction, having the power to increase both the coverage and the accuracy of predictions when carefully integrated into prediction methods. CONCLUSIONS. Our approach constitutes a promising new direction towards understanding the higher levels of organization of the cell as a system, which should help current efforts to re-engineer ontologies and improve our ability to predict which proteins are involved in specific biological processes.
Funding: Lynn and William Frankel Center for Computer Science; the Paul Ivanier Center for Robotics Research and Production; National Science Foundation (ITR-048715); National Human Genome Research Institute (1R33HG002850-01A1, R01 HG003367-01A1); National Institutes of Health (U54 LM008748).
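One simple way to score a candidate link between two processes, sketched here with a hypergeometric test on the number of protein-protein interactions spanning the two processes' gene sets, is shown below; this is an illustrative statistic on toy data, not the paper's exact PLN score:

from scipy.stats import hypergeom

def process_link_pvalue(genes_a, genes_b, interactions, all_genes):
    """Hypergeometric p-value for enrichment of interactions spanning two
    processes. genes_a/genes_b: gene sets annotated to each process;
    interactions: set of frozenset gene pairs; all_genes: background set.
    Illustrative only, not the PLN score from the paper."""
    a, b = set(genes_a) & set(all_genes), set(genes_b) & set(all_genes)
    n_background_pairs = len(all_genes) * (len(all_genes) - 1) // 2
    n_edges = len(interactions)                       # interacting pairs overall
    cross_pairs = {frozenset((x, y)) for x in a for y in b if x != y}
    n_cross = len(cross_pairs)                        # pairs spanning the two sets
    k = len(cross_pairs & interactions)               # observed spanning edges
    # P(X >= k) when drawing n_cross pairs from n_background_pairs,
    # of which n_edges are interacting
    return hypergeom.sf(k - 1, n_background_pairs, n_edges, n_cross)

# toy example with hypothetical gene names
all_genes = {f"g{i}" for i in range(1, 21)}
ppi = {frozenset(p) for p in [("g1", "g11"), ("g2", "g12"), ("g3", "g13"),
                              ("g4", "g14"), ("g5", "g15"), ("g6", "g7")]}
print(process_link_pvalue({"g1", "g2", "g3", "g4", "g5"},
                          {"g11", "g12", "g13", "g14", "g15"}, ppi, all_genes))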
Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals that are associated with quantitative
phenotypic traits, including disease states, are usually found in non-coding
genomic regions. These genetic variants are often also associated with differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and software tools to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2 figures
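The most basic eQTL test reviewed in such work is a per-variant linear regression of expression on genotype dosage; here is a minimal sketch on simulated data, not tied to any specific software covered by the review:

import numpy as np
from scipy import stats

# Minimal eQTL association sketch: regress expression on genotype dosage
# (0/1/2 copies of the alternate allele). All data are simulated.
rng = np.random.default_rng(42)
n_individuals, n_snps = 200, 50

genotypes = rng.integers(0, 3, size=(n_individuals, n_snps)).astype(float)
# make SNP 7 a true eQTL for the simulated gene
expression = 0.8 * genotypes[:, 7] + rng.standard_normal(n_individuals)

results = []
for snp in range(n_snps):
    slope, intercept, r, p_value, stderr = stats.linregress(genotypes[:, snp],
                                                            expression)
    results.append((snp, slope, p_value))

# Bonferroni-corrected hits
hits = [(snp, slope, p) for snp, slope, p in results if p < 0.05 / n_snps]
print(hits)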
An Introspective Comparison of Random Forest-Based Classifiers for the Analysis of Cluster-Correlated Data by Way of RF++
Many mass spectrometry-based studies, as well as other biological experiments, produce cluster-correlated data. Failure to account for correlation among observations may result in a classification algorithm overfitting the training data and producing overoptimistic estimated error rates, and may make subsequent classifications unreliable. Current common practice for dealing with replicated data is to average each subject's replicate sample set, reducing the dataset size and incurring a loss of information. In this manuscript we compare three approaches to dealing with cluster-correlated data: unmodified Breiman's Random Forest (URF), a forest grown using subject-level averages (SLA), and RF++ with subject-level bootstrapping (SLB). RF++, a novel Random Forest-based algorithm implemented in C++, handles cluster-correlated data through a modification of the original resampling algorithm and accommodates subject-level classification. Subject-level bootstrapping is an alternative sampling method that obviates the need to average or otherwise reduce each set of replicates to a single independent sample. Our experiments show nearly identical median classification and variable selection accuracy for SLB forests and URF forests when applied to both simulated and real datasets. However, the run-time estimated error rate was severely underestimated for URF forests. Predictably, SLA forests were found to be more severely affected by the reduction in sample size, which led to poorer classification and variable selection accuracy. Perhaps most importantly, our results suggest that it is reasonable to utilize URF for the analysis of cluster-correlated data. Two caveats should be noted: first, correct classification error rates must be obtained using a separate test dataset, and second, an additional post-processing step is required to obtain subject-level classifications. RF++ is shown to be an effective alternative for classifying both clustered and non-clustered data. Source code and stand-alone compiled versions of command-line and easy-to-use graphical user interface (GUI) versions of RF++ for Windows and Linux, as well as a user manual (Supplementary File S2), are available for download at http://sourceforge.org/projects/rfpp/ under the GNU public license.
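RF++ itself is a C++ implementation; purely to illustrate the subject-level bootstrapping idea, the following Python sketch resamples whole subjects rather than individual replicates when growing each tree and aggregates predictions into subject-level calls by majority vote. All parameters and data are illustrative, not RF++'s actual defaults:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Illustrative subject-level bootstrap forest (not RF++ itself): each tree is
# trained on a bootstrap sample of SUBJECTS, so all replicates of a subject
# enter (or leave) a tree together, respecting the cluster structure.
rng = np.random.default_rng(0)

def fit_subject_bootstrap_forest(X, y, subjects, n_trees=100):
    unique_subjects = np.unique(subjects)
    trees = []
    for _ in range(n_trees):
        boot_subjects = rng.choice(unique_subjects, size=len(unique_subjects),
                                   replace=True)
        idx = np.concatenate([np.flatnonzero(subjects == s) for s in boot_subjects])
        tree = DecisionTreeClassifier(max_features="sqrt",
                                      random_state=int(rng.integers(1 << 31)))
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def predict_subject_level(trees, X, subjects):
    """Majority vote over trees, then over each subject's replicates."""
    votes = np.mean([t.predict(X) for t in trees], axis=0)  # per-replicate vote fraction
    return {s: int(votes[subjects == s].mean() > 0.5) for s in np.unique(subjects)}

# toy data: 30 subjects x 4 replicates, 20 features, subject-level labels
subjects = np.repeat(np.arange(30), 4)
labels = np.repeat(rng.integers(0, 2, 30), 4)
X = rng.standard_normal((len(subjects), 20)) + labels[:, None] * 0.7

forest = fit_subject_bootstrap_forest(X, labels, subjects)
print(predict_subject_level(forest, X, subjects))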
Identifying high-impact variants and genes in exomes of Ashkenazi Jewish inflammatory bowel disease patients
Inflammatory bowel disease (IBD) is a group of chronic inflammatory conditions of the digestive tract whose genetic etiology is still poorly understood. The incidence of IBD is particularly high among Ashkenazi Jews. Here, we identify 8 novel and plausible IBD-causing genes from the exomes of 4453 genetically identified Ashkenazi Jewish individuals (1734 IBD cases and 2719 controls). Various biological pathway analyses are performed, along with bulk and single-cell RNA sequencing, to demonstrate the likely physiological relatedness of the novel genes to IBD. Importantly, we demonstrate that the rare, high-impact genetic architecture of Ashkenazi Jewish adult IBD displays significant overlap with very-early-onset IBD genetics. Moreover, by performing biobank phenome-wide analyses, we find that IBD genes have pleiotropic effects that involve other immune responses. Finally, we show that polygenic risk score analyses based on genome-wide high-impact variants have high power to predict IBD susceptibility.
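The polygenic risk score analysis mentioned at the end reduces, in its simplest form, to a weighted sum of allele dosages; here is a generic sketch with simulated dosages and effect sizes, not the study's variants or weights:

import numpy as np

# Generic polygenic risk score (PRS): weighted sum of allele dosages, with
# weights taken from per-variant effect sizes. Everything here is simulated.
rng = np.random.default_rng(7)
n_individuals, n_variants = 1000, 500

dosages = rng.integers(0, 3, size=(n_individuals, n_variants)).astype(float)
effect_sizes = rng.normal(0.0, 0.05, size=n_variants)   # e.g. log odds ratios

prs = dosages @ effect_sizes
# standardize so scores are comparable across individuals
prs_z = (prs - prs.mean()) / prs.std()

# flag the top decile as "high genetic risk" for downstream comparison
high_risk = prs_z > np.quantile(prs_z, 0.9)
print(f"mean PRS (z): {prs_z.mean():.3f}, high-risk individuals: {high_risk.sum()}")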
Probabilistic Inference of Transcription Factor Binding from Multiple Data Sources
An important problem in molecular biology is to build a complete understanding of transcriptional regulatory processes in the cell. We have developed a flexible, probabilistic framework to predict TF binding from multiple data sources that differs from the standard hypothesis testing (scanning) methods in several ways. Our probabilistic modeling framework estimates the probability of binding and, thus, naturally reflects our degree of belief in binding. Probabilistic modeling also allows for easy and systematic integration of our binding predictions into other probabilistic modeling methods, such as expression-based gene network inference. The method answers the question of whether the whole analyzed promoter has a binding site, but can also be extended to estimate the binding probability at each nucleotide position. Further, we introduce an extension to model combinatorial regulation by several TFs. Most importantly, the proposed methods can make principled probabilistic inference from multiple evidence sources, such as multiple statistical models (motifs) of the TFs, evolutionary conservation, regulatory potential, CpG islands, nucleosome positioning, DNase hypersensitive sites, ChIP-chip binding segments and other (prior) sequence-based biological knowledge. We developed both a likelihood and a Bayesian method, where the latter is implemented with a Markov chain Monte Carlo algorithm. Results on a carefully constructed test set from the mouse genome demonstrate that principled data fusion can significantly improve the performance of TF binding prediction methods. We also applied the probabilistic modeling framework to all promoters in the mouse genome, and the results indicate a sparse connectivity between transcriptional regulators and their target promoters. To facilitate analysis of other sequences and additional data, we have developed an on-line web tool, ProbTF, which implements our probabilistic TF binding prediction method using multiple data sources. The test data set, web tool, source code and supplementary data are available at: http://www.probtf.org.
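A toy version of combining multiple evidence sources, using a naive-Bayes-style sum of log-likelihood ratios, is sketched below; this is a generic illustration, not ProbTF's actual model or parameter values:

import numpy as np

# Toy evidence fusion for TF binding: combine per-source likelihood ratios in
# log space under an independence assumption (generic sketch, not ProbTF).
def binding_posterior(log_likelihood_ratios, prior=0.01):
    """Posterior P(bound | evidence) from per-source log LR(bound vs. not bound)."""
    log_odds = np.log(prior / (1 - prior)) + np.sum(log_likelihood_ratios)
    return 1.0 / (1.0 + np.exp(-log_odds))

# hypothetical evidence for one promoter: motif score, conservation, DNase
evidence = {
    "motif_match":  np.log(30.0),   # strong motif hit: likelihood ratio of 30
    "conservation": np.log(4.0),    # moderately conserved region
    "dnase_site":   np.log(2.5),    # open chromatin
}
print(f"P(bound) = {binding_posterior(list(evidence.values())):.3f}")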
Prioritization of gene regulatory interactions from large-scale modules in yeast
Background. The identification of groups of co-regulated genes and their transcription factors, called transcriptional modules, has been a focus of many studies about biological systems. While methods have been developed to derive numerous modules from genome-wide data, individual links between regulatory proteins and target genes still need experimental verification. In this work, we aim to prioritize regulator-target links within transcriptional modules based on three types of large-scale data sources.
Results. Starting with putative transcriptional modules from ChIP-chip data, we first derive modules in which target genes show both expression and function coherence. The most reliable regulatory links between transcription factors and target genes are established by identifying the intersection of target genes in coherent modules for each enriched functional category. Using a combination of genome-wide yeast data in normal growth conditions and two different reference datasets, we show that our method predicts regulatory interactions with significantly higher predictive power than ChIP-chip binding data alone. A comparison with results from other studies highlights that our approach provides a reliable and complementary set of regulatory interactions. Based on our results, we can also identify functionally interacting target genes, for instance a group of co-regulated proteins related to cell wall synthesis. Furthermore, we report novel conserved binding sites of a glycoprotein-encoding gene, CIS3, regulated by the Swi6-Swi4 and Ndd1-Fkh2-Mcm1 complexes.
Conclusion. We provide a simple method to prioritize individual TF-gene interactions from large-scale transcriptional modules. In comparison with other published works, we predict a complementary set of regulatory interactions which yields a similar or higher prediction accuracy at the expense of sensitivity. Therefore, our method can serve as an alternative approach to prioritization for further experimental studies.
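The core prioritization step, keeping a TF-target link only when the target lies both in the TF's ChIP-chip module and in a coherent gene set for an enriched functional category, can be sketched as plain set intersections; the gene, TF, and category names below are placeholders, not data from the paper:

# Sketch of link prioritization by set intersection (placeholder names):
# a TF-target link is kept when the target gene is in the TF's ChIP-chip
# module AND in the coherent gene set of an enriched functional category.
chip_modules = {
    "TF_A": {"gene1", "gene2", "gene3", "gene4"},
    "TF_B": {"gene3", "gene5", "gene6"},
}
coherent_sets = {  # per enriched category: co-expressed genes with shared function
    "cell_wall_synthesis": {"gene2", "gene3", "gene6", "gene7"},
    "cell_cycle":          {"gene1", "gene5"},
}

prioritized_links = {
    (tf, gene, category)
    for tf, targets in chip_modules.items()
    for category, coherent in coherent_sets.items()
    for gene in targets & coherent
}
for link in sorted(prioritized_links):
    print(link)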