12,861 research outputs found

Methods for protein complex prediction and their contributions towards understanding the organization, function and dynamics of complexes

Author: Patil Ashwini
Srihari Sriganesh
Wong Limsoon
Yong Chern Han
Publication venue: 'Elsevier BV'
Publication date: 01/01/2015

Complexes of physically interacting proteins constitute fundamental functional units responsible for driving biological processes within cells. A faithful reconstruction of the entire set of complexes is therefore essential to understand the functional organization of cells. In this review, we discuss the key contributions of computational methods developed to date (approximately between 2003 and 2015) for identifying complexes from the network of interacting proteins (PPI network). We evaluate in depth the performance of these methods on PPI datasets from yeast, and highlight challenges faced by these methods, in particular the detection of sparse and small complexes or sub-complexes and the discerning of overlapping complexes. We describe methods for integrating diverse information, including expression profiles and 3D structures of proteins, with PPI networks to understand the dynamics of complex formation, for instance, the time-based assembly of complex subunits and the formation of fuzzy complexes from intrinsically disordered proteins. Finally, we discuss methods for identifying dysfunctional complexes in human diseases, an application that is proving invaluable to understand disease mechanisms and to discover novel therapeutic targets. We hope this review aptly commemorates a decade of research on computational prediction of complexes and constitutes a valuable reference for further advancements in this exciting area. Comment: 1 Tabl

arXiv.org e-Print Archive

Elsevier - Publisher Connector

University of Queensland eSpace

A classification-based framework for predicting and analyzing gene regulatory response

Author: AJ Hartemink
Anshul Kundaje
AP Gasch
Chris H Wiggins
Christina Leslie
CI Holmberg
D Pe'er
D Pollard
DC Raitt
E Ramil
E Segal
ER Gansner
HJ Bussemaker
I Ota
I Pedruzzi
J Ihmels
JD Hughes
JT Lin
M Middendorf
MA Beer
Manuel Middendorf
Mihir Shah
P Zarzov
RE Schapire
TI Lee
VK Vyas
W Hoeffding
Y Pilpel
Yoav Freund
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We have recently introduced a predictive framework for studying gene transcriptional regulation in simpler organisms using a novel supervised learning algorithm called GeneClass. GeneClass is motivated by the hypothesis that in model organisms such as Saccharomyces cerevisiae, we can learn a decision rule for predicting whether a gene is up- or down-regulated in a particular microarray experiment based on the presence of binding site subsequences ("motifs") in the gene's regulatory region and the expression levels of regulators such as transcription factors in the experiment ("parents"). GeneClass formulates the learning task as a classification problem — predicting +1 and -1 labels corresponding to up- and down-regulation beyond the levels of biological and measurement noise in microarray measurements. Using the Adaboost algorithm, GeneClass learns a prediction function in the form of an alternating decision tree, a margin-based generalization of a decision tree. METHODS: In the current work, we introduce a new, robust version of the GeneClass algorithm that increases stability and computational efficiency, yielding a more scalable and reliable predictive model. The improved stability of the prediction tree enables us to introduce a detailed post-processing framework for biological interpretation, including individual and group target gene analysis to reveal condition-specific regulation programs and to suggest signaling pathways. Robust GeneClass uses a novel stabilized variant of boosting that allows a set of correlated features, rather than single features, to be included at nodes of the tree; in this way, biologically important features that are correlated with the single best feature are retained rather than decorrelated and lost in the next round of boosting. 
Other computational developments include fast matrix computation of the loss function for all features, allowing scalability to large datasets, and the use of abstaining weak rules, which results in a shallower and more interpretable tree. We also show how to incorporate genome-wide protein-DNA binding data from ChIP-chip experiments into the GeneClass algorithm, and we use an improved noise model for gene expression data. RESULTS: Using the improved scalability of Robust GeneClass, we present larger scale experiments on a yeast environmental stress dataset, training and testing on all genes and using a comprehensive set of potential regulators. We demonstrate the improved stability of the features in the learned prediction tree, and we show the utility of the post-processing framework by analyzing two groups of genes in yeast (the protein chaperones and a set of putative targets of the Nrg1 and Nrg2 transcription factors) and suggesting novel hypotheses about their transcriptional and post-transcriptional regulation. Detailed results and Robust GeneClass source code are available for download from
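
The classification setup described in this abstract (predicting +1/-1 labels from motif presence combined with regulator expression, using boosting) can be illustrated with a minimal sketch. This is plain AdaBoost over single-feature decision stumps, not the alternating decision trees or the stabilized boosting variant of Robust GeneClass, and it is not the authors' code. The function names, the toy data, and the encoding of each feature as "motif present AND regulator up" are assumptions made for illustration only.

```python
import math


def stump(x, j, polarity):
    """Weak rule: vote `polarity` if binary feature j is on, else the opposite."""
    return polarity if x[j] else -polarity


def train_adaboost(X, y, rounds=5):
    """Plain AdaBoost with decision stumps.

    X: list of binary feature vectors (hypothetical encoding: feature j is 1
       when a given motif is in the promoter AND a given regulator is up).
    y: list of +1 (up-regulated) / -1 (down-regulated) labels.
    Returns a list of (feature index, polarity, alpha) weak rules.
    """
    n, d = len(X), len(X[0])
    w = [1.0 / n] * n  # uniform example weights to start
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for j in range(d):
            for polarity in (+1, -1):
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump(xi, j, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((j, polarity, alpha))
        # Up-weight misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump(xi, j, polarity))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Sign of the alpha-weighted vote of all weak rules."""
    score = sum(alpha * stump(x, j, polarity) for j, polarity, alpha in model)
    return 1 if score >= 0 else -1


# Toy data (made up): feature 0 determines the label, feature 1 is noise.
X_toy = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_toy = [1, 1, -1, -1]
toy_model = train_adaboost(X_toy, y_toy)
toy_predictions = [predict(toy_model, x) for x in X_toy]
```

On this toy set the first selected rule uses feature 0, since it separates the labels without error; real expression data would be noisy and would need the thresholding of up/down calls that the abstract mentions.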

Crossref

Springer - Publisher Connector

Columbia University Academic Commons

PubMed Central

Bayesian variable selection and data integration for biological regulatory networks

Author: Chen Guang
Jensen Shane T.
Stoeckert Jr, Christian J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2006

A substantial focus of research in molecular biology is gene regulatory networks: the set of transcription factors and target genes which control the involvement of different biological processes in living cells. Previous statistical approaches for identifying gene regulatory networks have used gene expression data, ChIP binding data or promoter sequence data, but each of these resources provides only partial information. We present a Bayesian hierarchical model that integrates all three data types in a principled variable selection framework. The gene expression data are modeled as a function of the unknown gene regulatory network, which has an informed prior distribution based upon both ChIP binding and promoter sequence data. We also present a variable weighting methodology for the principled balancing of multiple sources of prior information. We apply our procedure to the discovery of gene regulatory relationships in Saccharomyces cerevisiae (yeast), for which we can use several external sources of information to validate our results. Our inferred relationships show greater biological relevance on the external validation measures than previous data integration methods. Our model also estimates synergistic and antagonistic interactions between transcription factors, many of which are validated by previous studies. We also evaluate the results of our procedure for weighting multiple sources of prior information. Finally, we discuss our methodology in the context of previous approaches to data integration and Bayesian variable selection. Comment: Published at http://dx.doi.org/10.1214/07-AOAS130 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Functional Characterization of Transcription Factor Motifs Using Cross-species Comparison across Large Evolutionary Distances

Author: A Louvi
A Rogulja-Ortmann
A Stark
AC Edwards
AE Kel
AI Su
AS Adler
AV Morozov
B van Steensel
BD McCabe
BP Berman
Brian James
C Kwong
C Rushlow
C van Waveren
CM Bergman
CT Harbison
CW Whitfield
D Karolchik
D Porcelli
D Vlieghe
DE Newburger
E Kurant
E Segal
EJ Ward
Evgeny M. Zdobnov
F Casares
GD Stormo
Gene E. Robinson
HG Roider
HM Berman
Hugh M. Robertson
J DeZazzo
J Pinnell
J Wang
J Zeitlinger
J Zhu
JA Lynch
Jaebum Kim
JB Warner
JD Gibson
JD Storey
JG Gindhart Jr
JH Werren
John H. Werren
Joshua D. Gibson
JR Desjarlais
JZ Parrish
KM Bhat
LA Pennacchio
LD Ward
LF Sempere
LW Chang
M Ashburner
M Blanchette
M Boden
M Delorenzi
M Kanehisa
M Kellis
MA Crosby
MB Noyes
MC Frith
ME Fortini
MS Halfon
N Rajewsky
Oliver Niehuis
P Kheradpour
PK Sorger
R Garesse
R Gordân
RC Scarpulla
RD Finn
Ryan Cunningham
S Grossmann
S Robin
S Roy
S Sinha
SA Ramsey
Saurabh Sinha
Stefan Wyder
T Berleth
TE Creighton
TL Bailey
U Keich
V Matys
W Huang da
WC Xiong
WJ Nelson
WW Wasserman
Wyeth W. Wasserman
X Xie
X Zhou
XY Li
Y Haraguchi
Z Huang
Publication venue: Public Library of Science
Publication date: 01/01/2010

We address the problem of finding statistically significant associations between cis-regulatory motifs and functional gene sets, in order to understand the biological roles of transcription factors. We develop a computational framework for this task, whose features include a new statistical score for motif scanning, the use of different scores for predicting targets of different motifs, and new ways to deal with redundancies among significant motif–function associations. This framework is applied to the recently sequenced genome of the jewel wasp, Nasonia vitripennis, making use of the existing knowledge of motifs and gene annotations in another insect genome, that of the fruitfly. The framework uses cross-species comparison to improve the specificity of its predictions, and does so without relying upon non-coding sequence alignment. It is therefore well suited for comparative genomics across large evolutionary divergences, where existing alignment-based methods are not applicable. We also apply the framework to find motifs associated with socially regulated gene sets in the honeybee, Apis mellifera, using comparisons with Nasonia, a solitary species, to identify honeybee-specific associations
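
As a rough illustration of what a motif-function association test can look like, the sketch below computes a one-sided hypergeometric enrichment p-value for the overlap between a motif's predicted targets and a functional gene set. This is a commonly used baseline and is swapped in here for illustration; the abstract does not say that the framework uses this test, and its own statistical score is described as new. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, set_size, n_targets, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total_genes: number of genes in the genome (background)
    set_size:    number of genes in the functional gene set
    n_targets:   number of genes predicted as targets of the motif
    overlap:     number of predicted targets that fall in the gene set
    """
    upper = min(set_size, n_targets)
    tail = sum(
        comb(set_size, k) * comb(total_genes - set_size, n_targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, n_targets)


# Made-up counts: 10 genes, a gene set of 5, 5 predicted targets, all 5 in the set.
p_full_overlap = hypergeom_enrichment_p(10, 5, 5, 5)
# Requiring an overlap of at least 0 is always satisfied.
p_no_requirement = hypergeom_enrichment_p(10, 5, 5, 0)
```

When many motifs and gene sets are tested, the resulting p-values would need a multiple-testing correction, and redundant associations would still have to be handled separately, as the abstract notes.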

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

Archive ouverte UNIGE

A computational analysis of transcription factor interactions and binding guided by epigenetics

Author: Essebier Alexandra
Publication venue: 'University of Queensland Library'
Publication date: 22/06/2020

University of Queensland eSpace

SYSTEMS BIOLOGY: GLOBAL REGULATION OF GENE EXPRESSION

Author: Bernstein B.
Bulyk M.
Bussemaker H.
Walhout M.
Publication venue
Publication date: 01/03/2010

Cold Spring Harbor Laboratory Institutional Repository

A Predictive Model of the Oxygen and Heme Regulatory Network in Yeast

Author: A Kundaje
A Smith
A Tanay
AJ Hartemink
AJ Kastaniotis
AM Erkine
Anshul Kundaje
AP Gasch
AV Grishin
BM Bolstad
C Dagsgaard
CH Yeang
Changgui Lan
Christina Leslie
CT Harbison
CV Lowry
D Pe'er
E Segal
FM Ausubel
FP Roth
Herbert M. Sauro
HF Bunn
HJ Bussemaker
J Ernst
J Ihmels
J Olesen
JC Schneider
JD Hughes
JJ ter Linde
JY Choi
K Pfeifer
KA Morano
KD MacIsaac
KE Kwast
KV Shianna
L Guarente
L Zhang
L-C Lai
Li Zhang
M Kaern
M Middendorf
MA Beer
MD Piper
Mei Zhou
MJ Vasconcelles
MK Yeung
MR Grably
N Abramova
N Rachidi
NE Abramova
O Sertil
PV Burke
R Schapire
RA Irizarry
RE Schapire
RS Zitomer
S Kuge
S Labbé
S Tavazoie
SL Tai
Steve Lianoglou
T Hon
T Hoppe
T Keng
T Prezant
TI Lee
TS Gardner
VV Svetlov
Xiantong Xin
Y Benjamini
Y Freund
Y Jiang
Y Pilpel
Y Tu
Z Bar-Joseph
Publication venue: Public Library of Science
Publication date: 01/11/2008

Deciphering gene regulatory mechanisms through the analysis of high-throughput expression data is a challenging computational problem. Previous computational studies have used large expression datasets in order to resolve fine patterns of coexpression, producing clusters or modules of potentially coregulated genes. These methods typically examine promoter sequence information, such as DNA motifs or transcription factor occupancy data, in a separate step after clustering. We needed an alternative and more integrative approach to study the oxygen regulatory network in Saccharomyces cerevisiae using a small dataset of perturbation experiments. Mechanisms of oxygen sensing and regulation underlie many physiological and pathological processes, and only a handful of oxygen regulators have been identified in previous studies. We used a new machine learning algorithm called MEDUSA to uncover detailed information about the oxygen regulatory network using genome-wide expression changes in response to perturbations in the levels of oxygen, heme, Hap1, and Co2+. MEDUSA integrates mRNA expression, promoter sequence, and ChIP-chip occupancy data to learn a model that accurately predicts the differential expression of target genes in held-out data. We used a novel margin-based score to extract significant condition-specific regulators and assemble a global map of the oxygen sensing and regulatory network. This network includes both known oxygen and heme regulators, such as Hap1, Mga2, Hap4, and Upc2, as well as many new candidate regulators. MEDUSA also identified many DNA motifs that are consistent with previous experimentally identified transcription factor binding sites. Because MEDUSA's regulatory program associates regulators to target genes through their promoter sequences, we directly tested the predicted regulators for OLE1, a gene specifically induced under hypoxia, by experimental analysis of the activity of its promoter. 
In each case, deletion of the candidate regulator resulted in the predicted effect on promoter activity, confirming that several novel regulators identified by MEDUSA are indeed involved in oxygen regulation. MEDUSA can reveal important information from a small dataset and generate testable hypotheses for further experimental analysis. Supplemental data are included

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Structural Prediction of Protein–Protein Interactions by Docking: Application to Biomedical Problems

Author: Barradas-Bautista Didier
Fernández-Recio Juan
Pallara Chiara
Rosell Mireia
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

A huge amount of genetic information is available thanks to recent advances in sequencing technologies and greater computational capabilities, but the interpretation of such genetic data at the phenotypic level remains elusive. One of the reasons is that proteins do not act alone, but interact specifically with other proteins and biomolecules, forming intricate interaction networks that are essential for the majority of cell processes and pathological conditions. Thus, characterizing such interaction networks is an important step in understanding how information flows from gene to phenotype. Indeed, structural characterization of protein–protein interactions at atomic resolution has many applications in biomedicine, from diagnosis and vaccine design to drug discovery. However, despite the advances in experimental structure determination, the number of interactions for which structural data are available is still very small. In this context, a complementary approach is computational modeling of protein interactions by docking, which usually consists of two major phases: (i) sampling of the possible binding modes between the interacting molecules and (ii) scoring for the identification of the correct orientations. In addition, prediction of interface and hot-spot residues is very useful in order to guide and interpret mutagenesis experiments, as well as to understand functional and mechanistic aspects of the interaction. Computational docking is already being applied to specific biomedical problems within the context of personalized medicine, for instance, helping to interpret pathological mutations involved in protein–protein interactions, or providing modeled structural data for drug discovery targeting protein–protein interactions.
Spanish Ministry of Economy grant number BIO2016-79960-R; D.B.B. is supported by a predoctoral fellowship from CONACyT; M.R. is supported by an FPI fellowship from the Severo Ochoa program. We are grateful to the Joint BSC-CRG-IRB Programme in Computational Biology. Peer Reviewed. Postprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Application of ChIP-Seq data analysis softwares in studies of gene regulation

Author: Talwar J.
Publication venue: Ludwig-Maximilians-Universität
Publication date: 01/01/2015

MPG.PuRe