30 research outputs found

Complementary Sources of Protein Functional Information: The Far Side of GO.

Author: A Chang
A Mitchell
AG McDonald
CF Schaefer
CL Smith
D Croft
DA Lima Morais de
E Akiva
G Bindea
G Yu
GL Holliday
H Ramos
I Sillitoe
JS Amberger
L Plessis du
M Kanehisa
ME Oates
N Furnham
R Caspi
RD Finn
S Kerrien
SA Rahman
SR Maetschke
Y Qi
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

The GO captures many aspects of functional annotation, but there are other complementary sources of protein function information. For example, enzyme functional annotations are described in a range of resources, from the Enzyme Commission (E.C.) hierarchical classification to the Kyoto Encyclopedia of Genes and Genomes (KEGG) to the Catalytic Site Atlas, among many others. This chapter describes some of the main resources available and how they can be used in conjunction with GO.

Crossref

LSHTM Research Online

Springer - Publisher Connector

TRaCE+: Ensemble inference of gene regulatory networks from transcriptional expression profiles of gene knock-out experiments

Author: A Bjorklund
A Pinna
D Marbach
D Marbach
F Crick
F Emmert-Streib
F Markowetz
G Stolovitzky
G Stolovitzky
GK Ackers
M Bansal
PB Madhamshettiwar
RJ Prill
Rudiyanto Gunawan
S Klamt
S.M. Minhaz Ud-Dean
Sandra Heise
SM Ud-Dean
SMM Ud-Dean
SR Maetschke
Steffen Klamt
T Schaffter
TD Consortium
TS Gardner
Publication venue: Springer Science and Business Media LLC

Crossref

A comprehensive assessment of N-terminal signal peptides prediction methods

Author: A Bairoch
A Sidhu
AA Elling
AM Popowicz
B Jagla
D Plewczynski
DQ Liu
EW Klee
G Schneider
G von Heijne
G von Heijne
H Bannai
H Nielsen
H Nielsen
H Nielsen
H Viklund
HB Shen
HF Clark
I Ladunga
I Small
J Hawkins
JD Bendtsen
JD Bendtsen
JD Bendtsen
JJ Sun
JP Vert
JR Bradford
K Frank
KC Chou
KH Choo
KH Choo
Khar Heng Choo
KM Menne
L Kall
L Liu
M Ashburner
M Boden
M Gomi
M Reczko
M Spiess
MA Marra
N Mukherjee
O Emanuelsson
O Emanuelsson
P Fariselli
P Rice
PG Bagos
R Kanagasabai
S Maetschke
S Pascarella
SB Needleman
SF Altschul
SH Nagaraj
Shoba Ranganathan
SM Reynolds
SR Eddy
Tin Wee Tan
W Li
Y Chen
YD Cai
Z Zhang
Z Zhang
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Amino-terminal signal peptides (SPs) are short regions that guide the targeting of secretory proteins to the correct subcellular compartments in the cell. They are cleaved off once the passenger protein reaches its destination. The explosive growth in sequencing technologies has led to the deposition of vast numbers of protein sequences, necessitating rapid functional annotation techniques, with subcellular localization being a key feature. Of the myriad software tools developed to automate the task of assigning the SP cleavage site of these new sequences, we review here the performance and reliability of commonly used SP prediction tools. Results: The available signal peptide data has been manually curated and organized into three datasets representing eukaryotes, Gram-positive and Gram-negative bacteria. These datasets are used to evaluate thirteen publicly available prediction tools. SignalP (both the HMM and ANN versions) maintains consistency and achieves the best overall accuracy in all three benchmarking experiments, ranging from 0.872 to 0.914, although other prediction tools are narrowing the performance gap. Conclusion: The majority of the tools evaluated in this study encounter no difficulty in discriminating between secretory and non-secretory proteins. The challenge clearly remains with pinpointing the correct SP cleavage site. The composite scoring schemes employed by SignalP may help to explain its accuracy. The prediction task is divided into a number of separate steps, thus allowing each score to tackle a particular aspect of the prediction.

Crossref

Springer - Publisher Connector

PubMed Central

Macquarie University ResearchOnline

ScholarBank@NUS

Data-driven reverse engineering of signaling pathways using ensembles of dynamic models

Author: A Hindmarsh
A MacNamara
A Margolin
AF Villaverde
AF Villaverde
AF Villaverde
Alejandro F. Villaverde
B Kholodenko
C Guziolowski
C Huang
C Tebaldi
C Terfve
CE Shannon
D Henriques
D Hurley
D Marbach
D Türei
David Henriques
DG Hurley
DM Wittmann
F Markowetz
G Altay
G Jia
G Johnson
H De Jong
H Xing
HM Kaltenbach
IS Jang
J Krumsiek
J Saez-Rodriguez
J Saez-Rodriguez
J Schaber
JA Egea
JA Egea
JJ Faith
JP Faria
JR Banga
Julio R. Banga
Julio Saez-Rodriguez
K Sachs
Kai Tan
KP Burnham
L Breiman
L Breiman
L Dagum
L Geris
L Kuepfer
L Mišković
M Banf
M Bansal
M De La Maza
M Re
M Sunnaker
Miguel Rocha
MJ Song
N Soranzo
P Domingos
P Yang
P Zoppoli
PE Meyer
PE Meyer
R Bonneau
R De Smet
R Hagedorn
R Schapire
R Steuer
RJ Prill
S Bandara
S Kauffman
SM Hill
SMM Ud-Dean
SR Maetschke
T Gneiting
TG Dietterich
VA Huynh-Thu
VA Huynh-Thu
W Luo
WW Chen
Y Lee
Y Tan
YH Chang
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2017

Signaling pathways play a key role in complex diseases such as cancer, for which the development of novel therapies is a difficult, expensive and laborious task. Computational models that can predict the effect of a new combination of drugs without having to test it experimentally can help accelerate this process. In particular, network-based dynamic models of these pathways hold promise for both understanding and predicting the effect of therapeutics. However, their use is currently hampered by limitations in our knowledge of the underlying biochemistry, as well as in the experimental and computational technologies used for calibrating the models. Thus, the results from such models need to be carefully interpreted and used in order to avoid biased predictions. Here we present a procedure that deals with this uncertainty by using experimental data to build an ensemble of dynamic models. The method incorporates steps to reduce overfitting and maximize predictive capability. We find that by combining the outputs of individual models in an ensemble it is possible to obtain a more robust prediction. We report results obtained with this method, which we call SELDOM (enSEmbLe of Dynamic lOgic-based Models), showing that it improves the predictions previously reported for several challenging problems.

JRB and DH acknowledge funding from the EU FP7 project NICHE (ITN Grant number 289384). JRB acknowledges funding from the Spanish MINECO project SYNBIOFACTORY (grant number DPI2014-55276-C5-2-R). AFV acknowledges funding from the Galician government (Xunta de Galiza) through the I2C postdoctoral fellowship ED481B2014/133-0. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
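The SELDOM abstract above describes combining the outputs of individual models into one ensemble prediction. A minimal sketch of that general idea follows. The function name `ensemble_predict` and the choice of a per-time-point median are assumptions made for illustration only; the listing does not say which aggregation rule SELDOM actually uses.

```python
from statistics import median


def ensemble_predict(model_predictions):
    """Combine time-course predictions from several models.

    model_predictions: list of equal-length lists, one list per model,
    each holding that model's predicted value at every time point.
    Returns one list with the median across models at each time point.
    The median is an illustrative choice, not the published method.
    """
    if not model_predictions:
        raise ValueError("at least one model prediction is required")
    n_points = len(model_predictions[0])
    if any(len(p) != n_points for p in model_predictions):
        raise ValueError("all models must predict the same time points")
    # zip(*...) regroups the values so each tuple holds one time point
    # as seen by every model in the ensemble.
    return [median(values) for values in zip(*model_predictions)]


if __name__ == "__main__":
    # Three hypothetical models, two time points each.
    predictions = [[0.0, 1.0], [0.2, 0.8], [1.0, 0.9]]
    print(ensemble_predict(predictions))
```

A median is less sensitive than a mean to a single badly calibrated model, which is one plausible reason to prefer it in a sketch like this.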

Public Library of Science (PLOS)

Universidade do Minho: RepositoriUM

Crossref

Directory of Open Access Journals

PubMed Central

Publikationsserver der RWTH Aachen University

Digital.CSIC

FigShare

Inferring causal molecular networks: empirical assessment through a community-based effort.

Author: A de la Fuente
AA Margolin
Adrian Bivol
Alexander J Bisberg
Alexander V Favorov
Amina A Qutub
Artem Sokolov
Bahman Afsari
BT Hennessy
Byron L Long
C Olsen
Chenyue W Hu
Chris K Wong
CM Chresta
D Freedman
D Husmeier
D Marbach
D Marbach
Dane Taylor
Daniel E Carlin
David P Noren
EG Cerami
EJ Molinelli
Elana J Fertig
Evan O Paull
F Eduati
F Eduati
F Markowetz
Fan Zhu
G Stolovitzky
G Stolovitzky
Gordon B Mills
Gustavo Stolovitzky
H Wang
Haizhou Wang
Heinz Koeppl
I Cantone
J Barretina
J Saez-Rodriguez
JC Costello
JMJ Derry
Joe W Gray
Joshua M Stuart
Julio Saez-Rodriguez
K Sachs
Kiley Graim
Laura M Heiser
Ludmila V Danilova
M Bansal
M Hecker
MH Maathuis
Michael Kellen
Michael Unger
Mingzhou Song
MJ Garnett
N Friedman
Nicole K Nesser
O Guitart-Pla
P Mertins
P Meyer
P Shannon
Paul T Spellman
R Akbani
R De Smet
R Tibes
RJ Prill
RJ Prill
RM Neve
Sach Mukherjee
SM Hill
SR Maetschke
Stephen Friend
Steven M Hill
T Cokelaer
T Ideker
Thea Norman
Thomas Cokelaer
Wai Shing Lee
WW Chen
Y Benjamini
Yang Zhang
Yuanfang Guan
Publication venue: Nat Methods
Publication date: 01/01/2015

It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

TUbiblio

Crossref

PubMed Central

eScholarship - University of California

Warwick Research Archives Portal Repository

Apollo (Cambridge)

DSpace at Rice University

A non-negative matrix factorization framework for identifying modular patterns in metagenomic profile data

Author: A Montano
A Montano
AC McHardy
C Alzate
C Desnues
C Quince
CL Hemme
D Willner
D Willner
DD Lee
DH Huson
DH Parks
EA Dinsdale
EB Hollister
F Meyer
H Kim
J Peterson
JL Morgan
Jonathan Dushoff
Joshua S. Weitz
JP Brunet
K Devarajan
P Saez
PJ Turnbaugh
PJ Turnbaugh
PM Kim
R Gaujoux
S Zhang
SA Levin
SC Madeira
SR Maetschke
TA Gianoulis
Xingpeng Jiang
Y Kluger
Publication venue: Springer Science and Business Media LLC

Crossref

Alignment-free inference of hierarchical and reticulate phylogenomic relationships

Author: Bernard G
Chan CX
Chan Y-B
Chua X-Y
Cong Y
Hogan JM
Maetschke SR
Ragan MA
Publication venue: Oxford University Press (OUP)
Publication date: 01/03/2019

We are amidst an ongoing flood of sequence data arising from the application of high-throughput technologies, and a concomitant fundamental revision in our understanding of how genomes evolve individually and within the biosphere. Workflows for phylogenomic inference must accommodate data that are not only much larger than before, but often more error prone and perhaps misassembled, or not assembled in the first place. Moreover, genomes of microbes, viruses and plasmids evolve not only by tree-like descent with modification but also by incorporating stretches of exogenous DNA. Thus, next-generation phylogenomics must address computational scalability while rethinking the nature of orthogroups, the alignment of multiple sequences and the inference and comparison of trees. New phylogenomic workflows have begun to take shape based on so-called alignment-free (AF) approaches. Here, we review the conceptual foundations of AF phylogenetics for the hierarchical (vertical) and reticulate (lateral) components of genome evolution, focusing on methods based on k-mers. We reflect on what seems to be successful, and on where further development is needed.
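To make the k-mer-based alignment-free idea in the abstract above concrete, here is a minimal sketch that compares sequences by their k-mer count profiles without aligning them. The function names, the default `k=3`, and the use of cosine distance are assumptions chosen for illustration; the review covers a family of AF measures and this is not presented as any one of them.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """1 - cosine similarity between two k-mer count profiles."""
    # Only k-mers present in both profiles contribute to the dot product.
    dot = sum(a[kmer] * b[kmer] for kmer in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # a sequence shorter than k shares nothing
    return 1.0 - dot / (norm_a * norm_b)


def distance_matrix(seqs, k=3):
    """Pairwise distances for a dict of name -> sequence."""
    profiles = {name: kmer_counts(s, k) for name, s in seqs.items()}
    names = list(profiles)
    return {
        (x, y): cosine_distance(profiles[x], profiles[y])
        for x in names
        for y in names
    }


if __name__ == "__main__":
    # Toy sequences, not real data.
    toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTTC", "s3": "GGGGCCCCGG"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 3))
```

A matrix like this could then be passed to a distance-based tree or network method; that downstream step is outside the scope of the sketch.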

Queensland University of Technology ePrints Archive

University of Melbourne Institutional Repository

Feature Induction and Network Mining with Clustering Tree Ensembles

Author: C Mering Von
CJ Burges
D Kocev
D Kocev
D Stojanova
F Moosmann
GR Lanckriet
H Blockeel
J Shawe-Taylor
JJ Faith
JM Cherry
JP Vert
K Bleakley
KD MacIsaac
L Breiman
L Hubert
L Maaten Van Der
L Maaten Van Der
M Schrynemackers
M Zhang
P Geurts
P Geurts
S Yan
SR Maetschke
Y Yamanishi
Y Yamanishi
Publication venue: Springer Science and Business Media LLC

Crossref

Gene Ontology Enrichment Improves Performances of Functional Similarity of Genes

Author: A Schlicker
A Tversky
C Pesquita
C Shi
DA Williams
F Wilcoxon
FM Couto
G Yu
GK Mazandu
GK Mazandu
H Yang
J Chabalier
J Gillis
JH Steiger.
JZ Wang
L Cheng
L Salwinski
MT Longnecker
P Resnik
PW Lord
Q Zou
R Ehsani
S Jain
SJ Bien
SR Maetschke
T Gene
X Guo
X Zeng
Y Moreau
Z Teng
Publication venue: Springer Science and Business Media LLC

Crossref

Ranking genome-wide correlation measurements improves microarray and RNA-seq based global and targeted co-expression networks

Author: A Clauset
A Fuente De La
A Kauffmann
B Winkel-Shirley
C Elejalde-Palmett
C Guerin
F Censi
FA Feltus
FM Giorgi
G Csardi
G Sales
H Wei
I Hwang
J Gillis
J Lisso
L Jiang
L López-Kleine
L Song
LE Chai
M Kanehisa
M Mutwil
M Tsuchiya
M Zdarska
MF Blasi
MI Love
MÁ Ruiz-Sola
R Patro
S Ballouz
S Ballouz
S Besseau
S Huang
S Oliver
S Siqueira Santos de
S Uygun
SR Maetschke
T Obayashi
Y Li
Y Zhang
Z Du
Publication venue: Springer Science and Business Media LLC
Publication date: 01/12/2018

Co-expression networks are essential tools to infer biological associations between gene products and predict gene annotation. Global networks can be analyzed at the transcriptome-wide scale or after querying them with a set of guide genes to capture the transcriptional landscape of a given pathway in a process named Pathway-Level Coexpression (PLC). A critical step in network construction remains the definition of gene co-expression. In the present work, we compared how Pearson Correlation Coefficient (PCC), Spearman Correlation Coefficient (SCC), their respective ranked values (Highest Reciprocal Rank (HRR)), Mutual Information (MI) and Partial Correlations (PC) performed on global networks and PLCs. This evaluation was conducted on the model plant Arabidopsis thaliana using microarray and differently pre-processed RNA-seq datasets. We particularly evaluated how dataset × distance measurement combinations performed in 5 PLCs corresponding to 4 well-described plant metabolic pathways (phenylpropanoid, carbohydrate, fatty acid and terpene metabolism) and the cytokinin signaling pathway. Our present work highlights how PCC ranked with HRR is better suited for global network construction and PLC with microarray and RNA-seq data than other distance methods, especially to cluster genes in partitions similar to biological subpathways.

Constructing global gene co-expression networks is a popular approach to highlight transcriptional relationships (edges) between genes (vertices). The 'Guilt-by-Association' (GBA) principle supposes that genes sharing similar functions are preferentially connected. It aims at predicting new functions for proteins by determining how their respective encoding genes are co-expressed with others, using a reference dataset containing known gene functions such as the Gene Ontology (GO) [1]. Defining edges connecting genes remains a critical step in global co-expression network construction.

Expression data (microarray or RNA-seq) are used to construct expression matrices (genes × samples) and to calculate a distance or a similarity for each possible gene pair. The resulting pairwise distance matrix is then thresholded to obtain an adjacency matrix that discriminates relevant edges. Only edges with a distance below (or a similarity above) the set threshold are considered significant and retained for network construction. The procedure is expected to remove biologically irrelevant gene associations while retaining the relevant ones, and can be assessed with any reference dataset. Alternatively, guide gene sets may be used to extract more human-readable information from large networks in a process named Pathway-Level Coexpression (PLC) [2–7]. This approach aims at capturing the best transcriptional associations of a gene set and at highlighting functional gene groups such as known subpathways in this set. There are two types of approaches to determine transcriptional associations of genes: supervised and unsupervised. Supervised approaches, such as regression and machine learning based methods, require prior knowledge, which is used as a training dataset to recover biologically relevant gene associations. They are used to infer regulatory networks, i.e. to uncover preferential and sequential interactions of a gene over the others. The superiority o
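The network-construction procedure described above (pairwise similarity, ranking, thresholding into edges) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the expression values are toy numbers, and HRR is taken here as the larger of the two mutual PCC ranks of a gene pair, which is how the measure is commonly defined.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant expression profile has no defined correlation; use 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def hrr_network(expr, max_rank):
    """Return the edges whose HRR is at most max_rank.

    expr: dict mapping gene name -> list of expression values
    (one row of a genes x samples matrix).
    """
    genes = list(expr)
    # Similarity of every gene to every other gene.
    pcc = {g: {h: pearson(expr[g], expr[h]) for h in genes if h != g}
           for g in genes}
    # Rank partners of each gene by decreasing PCC (rank 1 = best).
    rank = {}
    for g in genes:
        ordered = sorted(pcc[g], key=lambda h: pcc[g][h], reverse=True)
        rank[g] = {h: i + 1 for i, h in enumerate(ordered)}
    # Threshold on HRR: a lower value means stronger mutual co-expression.
    edges = set()
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if max(rank[g][h], rank[h][g]) <= max_rank:
                edges.add((g, h))
    return edges


if __name__ == "__main__":
    toy = {
        "A": [1, 2, 3, 4],
        "B": [1, 2, 3, 5],
        "C": [4, 3, 2, 1],
        "D": [1, 3, 2, 4],
    }
    print(sorted(hrr_network(toy, max_rank=2)))
```

Because the threshold is applied to ranks rather than raw correlations, the same cut-off can be reused across datasets whose PCC distributions differ, which is one reason rank-based measures are attractive for this step.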

Crossref

HAL Université de Tours