359 research outputs found

CORUM: the comprehensive resource of mammalian protein complexes

Author: A. Ruepp
Alberts
B. Brauner
B. Waegele
Bader
C. Montrone
Fraser
G. Frishman
Gavin
Guldener
H. W. Mewes
Hart
Hermjakob
I. Dunger-Kaltenbach
Jensen
Kim
Krogan
Lage
M. Stransky
Mewes
Mishra
O. N. Doudieu
Ruepp
T. Schmidt
V. Stumpflen
Yu
Publication venue: Oxford University Press
Publication date: 01/01/2008

Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by expert annotators through critical reading of the scientific literature. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue, which makes it possible to organize the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes.

Crossref

PubMed Central

PuSH

Rare coding SNP in DZIP1 gene associated with late-onset sporadic Parkinson's disease

Author: A Lerner
A Ruepp
AA Merchant
AM Glazer
B Bakir-Gungor
B Dass
C Wolff
CB Do
CH Hawkes
D Subramaniam
DAD Monte
DI Chasman
E Eskin
FL Moore
H-C Fung
HR Kim
J Simón-Sánchez
K Lai
K Roeder
K Sekimizu
K Tsuboi
L Lum
LA Hindorff
LM Bekris
M Bak
M Plaisant
M Saad
M-X Li
N Miao
O Bragina
P Mill
P Whitton
PA Beachy
PW Ingham
RE Lamont
S Purcell
SM Chambers
SY Tay
TA Manolio
TH Hamza
TL Edwards
TS Keshava Prasad
V Palma
VF Rafuse
W Satake
Y Katoh
Publication date: 11/09/2011

We present the first application of the hypothesis-rich mathematical theory to genome-wide association data. The Hamza et al. late-onset sporadic Parkinson's disease genome-wide association study dataset was analyzed. We found a rare, coding, non-synonymous SNP variant in the gene DZIP1 that confers increased susceptibility to Parkinson's disease. The association of DZIP1 with Parkinson's disease is consistent with a Parkinson's disease stem-cell ageing theory.

arXiv.org e-Print Archive

Crossref

PubMed Central

Estudo Geral

iRefR: an R package to manipulate the iRefIndex consolidated protein interaction database

Author: A Ceol
A Clauset
A Ruepp
A Stojmirovic
AL Turinsky
Antonio Mora
B Aranda
B Turner
C Alfarano
C Stark
G Csardi
GD Bader
I Xenarios
Ian M Donaldson
J Yu
KR Brown
P Braun
P Pagel
RM Ewing
S Kerrien
S Razick
TS Keshava Prasad
U Guldener
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The iRefIndex addresses the need to consolidate protein interaction data into a single uniform data resource. iRefR provides the user with access to this data source from an R environment. Results: The iRefR package includes tools for selecting specific subsets of interest from the iRefIndex by criteria such as organism, source database, experimental method, protein accessions and publication identifier. Data may be converted between three representations (MITAB, edgeList and graph) for use with other R packages such as igraph, graph and RBGL. The user may choose between different methods for resolving redundancies in interaction data and how n-ary data is represented. In addition, we describe a function to identify binary interaction records that possibly represent protein complexes. We show that the user choice of data selection, redundancy resolution and n-ary data representation all have an impact on graphical analysis. Conclusions: The package allows the user to control how these issues are dealt with and communicate them via an R-script written using the iRefR package; this will facilitate communication of methods, reproducibility of network analyses and further modification and comparison of methods by researchers.
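The effect of the n-ary representation choice can be illustrated with two common ways of expanding one complex into binary edges: a spoke model (one bait linked to each other member) and a matrix model (every pair linked). The sketch below is written in Python and does not use iRefR or reproduce its functions; the function names and protein labels are hypothetical.

```python
from itertools import combinations

def spoke_expansion(bait, members):
    """Spoke model: link the bait to every other member of the complex."""
    return sorted(tuple(sorted((bait, m))) for m in set(members) if m != bait)

def matrix_expansion(members):
    """Matrix model: link every pair of complex members."""
    return sorted(combinations(sorted(set(members)), 2))

# Hypothetical four-member complex with P1 as the bait.
complex_members = ["P1", "P2", "P3", "P4"]
spoke_edges = spoke_expansion("P1", complex_members)
matrix_edges = matrix_expansion(complex_members)

print(len(spoke_edges), "spoke edges;", len(matrix_edges), "matrix edges")
```

The same complex yields three edges under one model and six under the other, which is why downstream graph statistics depend on the representation chosen.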

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

NORA - Norwegian Open Research Archives

Extending CATH: increasing coverage of the protein structure universe and linking structure with function

Author: A. B. Clegg
A. L. Cuff
Ashburner
Bairoch
Buchan
C. A. Orengo
Chandonia
Cuff
D. Jones
Dessailly
Grabowski
Hendrickson
I. Sillitoe
J. Thornton
Kanehisa
M. Pellegrini-Calace
N. Furnham
Neumann
Orengo
R. Rentzsch
Rahman
Redfern
Ruepp
T. Lewis
Taylor
Todd
Publication venue: Oxford University Press
Publication date: 19/11/2010

CATH version 3.3 (class, architecture, topology, homology) contains 128 688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version 3.4 we have significantly improved the presentation of sequence information and associated functional information for CATH superfamilies. The CATH superfamily pages now reflect both the functional and structural diversity within the superfamily and include structural alignments of close and distant relatives within the superfamily, annotated with functional information and details of conserved residues. A significantly more efficient search function for CATH has been established by implementing the search server Solr (http://lucene.apache.org/solr/). The CATH v3.4 webpages have been built using the Catalyst web framework.

Crossref

LSHTM Research Online

PubMed Central

An iterative approach of protein function prediction

Author: A Ruepp
B Schwikowski
C Brun
D Lin
E Zeng
G Chen
G Pandey
H Chua
H Hishigaki
J Jiang
Jingyu Hou
M Ashburner
M Deng
M Samanta
M Wang
P Resnik
S Dwight
T Misteli
W Ching
X Chi
Xiaoxiao Chi
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Current approaches to predicting protein functions from a protein-protein interaction (PPI) dataset are based on the assumption that the available functions of some proteins (a.k.a. annotated proteins) determine the functions of the proteins whose functions are not yet known (a.k.a. un-annotated proteins). Protein function prediction is therefore a mono-directed, one-off procedure, i.e. from annotated proteins to un-annotated proteins. However, the interactions between proteins are mutual rather than static and mono-directed, even though the functions of some proteins are currently unknown. This means that when we use a similarity-based approach to predict the functions of un-annotated proteins, those proteins, once their functions are predicted, will affect the similarities between proteins, which in turn will affect the prediction results. In other words, function prediction is a dynamic and mutual procedure. This dynamic feature of protein interactions, however, was not considered in the existing prediction algorithms. Results: In this paper, we propose a new prediction approach that predicts protein functions iteratively. This iterative approach incorporates the dynamic and mutual features of PPI interactions, as well as the local and global semantic influence of protein functions, into the prediction. To guarantee predicting functions iteratively, we propose a new protein similarity from protein functions. We adapt new evaluation metrics to evaluate the prediction quality of our algorithm and other similar algorithms. Experiments on real PPI datasets were conducted to evaluate the effectiveness of the proposed approach in predicting unknown protein functions. Conclusions: The iterative approach is more likely to reflect the real biological nature between proteins when predicting functions. A proper definition of protein similarity from protein functions is the key to predicting functions iteratively. The evaluation results demonstrated that in most cases, the iterative approach outperformed non-iterative ones with higher prediction quality in terms of prediction precision, recall and F-value.
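For readers unfamiliar with the metrics named at the end of the abstract, the sketch below computes the textbook set-based precision, recall and F-value for one protein's predicted function labels. The abstract says the authors adapted their own evaluation metrics, so this is only the standard definition and may differ from theirs; the function name and labels are hypothetical.

```python
def precision_recall_f(predicted, actual):
    """Textbook precision, recall and F-value for two sets of function labels."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value

# Hypothetical labels: three predicted functions, four true functions, two shared.
precision, recall, f_value = precision_recall_f(
    {"F1", "F2", "F3"}, {"F2", "F3", "F4", "F5"}
)
print(round(precision, 3), round(recall, 3), round(f_value, 3))
```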

Deakin Research Online

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Impact of smoking status on the relative efficacy of the EGFR TKI/angiogenesis inhibitor combination therapy in advanced NSCLC-a systematic review and meta-analysis.

Author: Bernabé R
Carcereny E
Coate L
Cuffe S
Dafni U
De Castro J
Früh M
Han J-Y
Hashemi S M S
Nadal E
Peters S
Provencio M
Roschitzki-Voser H
Rosell R
Ruepp B
Sala M A
Soo R A
Stahel R A
Tsourti Z
Vervita K
Zygoura P
Publication venue: Elsevier BV
Publication date: 01/04/2022

BACKGROUND: The ETOP 10-16 BOOSTER trial failed to demonstrate a progression-free survival (PFS) benefit for adding bevacizumab to osimertinib in second line. An exploratory subgroup analysis, however, suggested a PFS benefit of the combination in patients with a smoking history and prompted us to do this study. METHODS: A systematic review and meta-analysis to evaluate the differential effect of smoking status on the benefit of adding an angiogenesis inhibitor to epidermal growth factor receptor (EGFR)-tyrosine kinase inhibitor therapy was carried out. All relevant randomized controlled trials appearing in main oncology congresses or in PubMed as of 1 November 2021 were used according to the Preferred Reporting Items for Systematic Review and Meta-Analyses statement. Primarily PFS according to smoking status, and secondarily overall survival (OS) were of interest. Pooled and interaction hazard ratios (HRs) were estimated by fixed or random effects models, depending on the detected degree of heterogeneity. Bias was assessed using the revised Cochrane tool for randomized controlled trials (RoB 2). RESULTS: Information by smoking was available for 1291 patients for PFS (seven studies) and 678 patients for OS (four studies). The risk of bias was low for all studies. Combination treatment significantly prolonged PFS for smokers [n = 502, HR = 0.55, 95% confidence interval (CI): 0.44-0.69] but not for nonsmokers (n = 789, HR = 0.92, 95% CI: 0.66-1.27; treatment-by-smoking interaction P = 0.02). Similarly, a significant OS benefit was found for smokers (n = 271, HR = 0.66, 95% CI: 0.47-0.93) but not for nonsmokers (n = 407, HR = 1.07, 95% CI: 0.82-1.42; treatment-by-smoking interaction P = 0.03). CONCLUSION: In advanced EGFR-non-small-cell lung cancer patients, the addition of an angiogenesis inhibitor to EGFR-tyrosine kinase inhibitor therapy provides a statistically significant PFS and OS benefit in smokers, but not in non-smokers. The biological basis for this observation should be pursued and could determine whether this might be due to a specific co-mutational pattern produced by tobacco exposure.
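The fixed-effect pooling of hazard ratios mentioned in the methods is commonly done by inverse-variance weighting on the log scale. The sketch below shows that generic calculation; it is not the authors' analysis code, the per-study values are made up for illustration, and the standard errors are recovered from 95% confidence intervals under a normal approximation.

```python
import math

def log_hr_and_se(hr, ci_lower, ci_upper, z=1.96):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return math.log(hr), se

def pool_fixed_effect(studies, z=1.96):
    """Inverse-variance fixed-effect pooling of (hr, ci_lower, ci_upper) tuples."""
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, ci_lower, ci_upper in studies:
        log_hr, se = log_hr_and_se(hr, ci_lower, ci_upper, z)
        weight = 1.0 / se ** 2          # more precise studies weigh more
        total_weight += weight
        weighted_sum += weight * log_hr
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - z * pooled_se),
        math.exp(pooled_log_hr + z * pooled_se),
    )

# Hypothetical per-study hazard ratios; NOT taken from the trials in the review.
hypothetical_studies = [(0.60, 0.40, 0.90), (0.50, 0.35, 0.71), (0.70, 0.45, 1.09)]
pooled_hr, lower, upper = pool_fixed_effect(hypothetical_studies)
print(f"pooled HR = {pooled_hr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A random-effects model would add a between-study variance term to each weight; that step is omitted here.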

Serveur académique lausannois

PubMed Central

Bern Open Repository and Information System (BORIS)

Fondo Bibliográfico Digital Institucional

Fungal Virulence and Development Is Regulated by Alternative Pre-mRNA 3′End Processing in Magnaporthe oryzae

Author: A Marnef
A Sesma
AFJ Ram
AG Hunt
Ane Sesma
AS Evitt
B Valent
BM Lunde
C Lutz
C Maris
CH Khang
CL Kielkopf
Concepción Gómez-Mena
CR Mandel
DK Morrison
E Beaudoing
Emilio Bueno
FQ Liu
G Mosquera
GK Smyth
Grant Calder
Hiten D. Madhani
J Lippincott-Schwartz
JL Liu
JM Perez-Canadillas
K Venkataraman
KSK Guisbert
M Besi
M Gowda
M Mangone
MA Keniry
Marina Franceschetti
MD Ruepp
MR Fabian
MW Pfaffl
N Kadotani
NJ Talbot
P Anderson
P Kankanala
Q Yang
R Ihaka
R Stefl
R Zoncu
RA Wilson
Richard A. Wilson
S Kishore
S Komili
S Millevoi
Sara L. Tucker
SL Tucker
T Murata
V Zinzalla
WM Michael
Y Benjamini
YS Shi
YTS Mao
Publication venue: Public Library of Science
Publication date: 01/01/2011

RNA-binding proteins play a central role in post-transcriptional mechanisms that control gene expression. Identification of novel RNA-binding proteins in fungi is essential to unravel post-transcriptional networks and cellular processes that confer identity to the fungal kingdom. Here, we carried out the functional characterisation of the filamentous fungus-specific RNA-binding protein RBP35 required for full virulence and development in the rice blast fungus. RBP35 contains an N-terminal RNA recognition motif (RRM) and six Arg-Gly-Gly tripeptide repeats. Immunoblots identified two RBP35 protein isoforms that show a steady-state nuclear localisation and bind RNA in vitro. RBP35 coimmunoprecipitates in vivo with Cleavage Factor I (CFI) 25 kDa, a highly conserved protein involved in polyA site recognition and cleavage of pre-mRNAs. Several targets of RBP35 have been identified using transcriptomics including 14-3-3 pre-mRNA, an important integrator of environmental signals. In Magnaporthe oryzae, RBP35 is not essential for viability but regulates the length of 3′UTRs of transcripts with developmental and virulence-associated functions. The Δrbp35 mutant is affected in the TOR (target of rapamycin) signaling pathway showing significant changes in nitrogen metabolism and protein secretion. The lack of clear RBP35 orthologues in yeast, plants and animals indicates that RBP35 is a novel auxiliary protein of the polyadenylation machinery of filamentous fungi. Our data demonstrate that RBP35 is the fungal equivalent of metazoan CFI 68 kDa and suggest the existence of 3′end processing mechanisms exclusive to the fungal kingdom.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

RiuNet

University of East Anglia digital repository

SCPS: a fast implementation of a spectral method for detecting protein families on a genome-wide scale

Author: A Biegert
A Murzin
A Paccanaro
A Ruepp
AJ Enright
Alberto Paccanaro
AY Ng
B Everitt
D Arthur
D Ballard
EL Hong
G Wang
JJ Forman
JM Chandonia
K Verkhedkar
LJ Jensen
M Ashburner
M Meilă
M Newman
N Kannan
O Krishnadev
P Pipenbacher
P Shannon
Rajkumar Sasidharan
RB Lehoucq
S van Dongen
SF Altschul
T Fruchterman
Tamás Nepusz
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed for this task which are "local" in the sense that they assign a protein to a cluster based only on the distances between that protein and the other proteins in the set. It was shown recently that global methods such as spectral clustering have better performance on a wide variety of datasets. However, currently available implementations of spectral clustering methods mostly consist of a few loosely coupled Matlab scripts that assume a fair amount of familiarity with Matlab programming and hence they are inaccessible for large parts of the research community. Results: SCPS (Spectral Clustering of Protein Sequences) is an efficient and user-friendly implementation of a spectral method for inferring protein families. The method uses only pairwise sequence similarities, and is therefore practical when only sequence information is available. SCPS was tested on difficult sets of proteins whose relationships were extracted from the SCOP database, and its results were extensively compared with those obtained using other popular protein clustering algorithms such as TribeMCL, hierarchical clustering and connected component analysis. We show that SCPS is able to identify many of the family/superfamily relationships correctly and that the quality of the obtained clusters as indicated by their F-scores is consistently better than all the other methods we compared it with. We also demonstrate the scalability of SCPS by clustering the entire SCOP database (14,183 sequences) and the complete genome of the yeast Saccharomyces cerevisiae (6,690 sequences). Conclusions: Besides the spectral method, SCPS also implements connected component analysis and hierarchical clustering, integrates TribeMCL, provides different cluster quality tools, can extract human-readable protein descriptions using GI numbers from NCBI, interfaces with external tools such as BLAST and Cytoscape, and can produce publication-quality graphical representations of the clusters obtained, thus constituting a comprehensive and effective tool for practical research in computational biology. Source code and precompiled executables for Windows, Linux and Mac OS X are freely available at http://www.paccanarolab.org/software/scps.
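Connected component analysis, one of the baseline methods named in the abstract, is the simplest of the approaches to illustrate: proteins are linked whenever their pairwise similarity passes a cutoff, and each connected group becomes a cluster. The sketch below is a generic union-find version, not SCPS code; the similarity scores, threshold and protein labels are hypothetical.

```python
def connected_components(similarities, threshold):
    """Cluster items by linking every pair whose similarity is >= threshold.

    similarities: dict mapping (item_a, item_b) -> score.
    Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)       # registers both items
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for item in list(parent):
        clusters.setdefault(find(item), set()).add(item)
    return sorted(sorted(cluster) for cluster in clusters.values())

# Hypothetical pairwise similarity scores between five proteins.
scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.1, ("D", "E"): 0.7}
families = connected_components(scores, threshold=0.5)
print(families)  # [['A', 'B', 'C'], ['D', 'E']]
```

A single spurious high-scoring pair is enough to merge two families under this scheme, which is one reason the abstract's global methods such as spectral clustering are attractive.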

Royal Holloway Research Online

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Enrichment of homologs in insignificant BLAST hits by co-complex network alignment

Author: A Bateman
A Ruepp
B Snel
Berend Snel
EV Koonin
HW Mewes
I Wapinski
J Boekhorst
J Espadaler
J Soding
JB Pereira-Leal
Jos Boekhorst
KP Byrne
L Fokkens
L Li
L Matthews
Like Fokkens
M Ashburner
M Boube
M Campillos
M Kroiss
M Remm
P Smits
R Singh
R Szklarczyk
RA Notebaart
S Bandyopadhyay
Sandra MC Botelho
SF Altschul
T Gabaldon
T Hubbard
Y Chen
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Homology is a crucial concept in comparative genomics. The algorithm probably most widely used for homology detection in comparative genomics is BLAST. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits are discarded that are in fact homologous. Results: Analogous to the use of the genomics context in genome alignments, we test whether conserved functional context can be used to select candidate homologs from insignificant BLAST hits. We make a co-complex network alignment between complex subunits in yeast and human and find that proteins with an insignificant BLAST hit that are part of homologous complexes are likely to be homologous themselves. Further analysis of the distant homologs we recovered using the co-complex network alignment shows that a large majority of these distant homologs are in fact ancient paralogs. Conclusions: Our results show that, even though evolution takes place at the sequence and genome level, co-complex networks can be used as circumstantial evidence to improve confidence in the homology of distantly related sequences.
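The selection idea in the results can be sketched as a simple filter: keep a below-cutoff BLAST hit only when the two proteins sit in complexes that already share confidently assigned homologs. The code below is an illustrative simplification, not the authors' alignment procedure; the rule for calling two complexes homologous (`min_shared` confident pairs), the function names and all identifiers are assumptions.

```python
def homologous_complex_pairs(complexes_a, complexes_b, confident_pairs, min_shared=1):
    """Pairs of complexes (one per species) sharing >= min_shared confident homolog pairs."""
    pairs = set()
    for id_a, members_a in complexes_a.items():
        for id_b, members_b in complexes_b.items():
            shared = sum(1 for a, b in confident_pairs
                         if a in members_a and b in members_b)
            if shared >= min_shared:
                pairs.add((id_a, id_b))
    return pairs

def rescue_weak_hits(weak_hits, complexes_a, complexes_b, confident_pairs, min_shared=1):
    """Keep weak hits whose two proteins belong to a homologous complex pair."""
    complex_pairs = homologous_complex_pairs(
        complexes_a, complexes_b, confident_pairs, min_shared)
    return [(a, b) for a, b in weak_hits
            if any(a in complexes_a[x] and b in complexes_b[y]
                   for x, y in complex_pairs)]

# Hypothetical complexes, confident homologs and insignificant BLAST hits.
yeast = {"Y_cx1": {"y1", "y2", "y3"}, "Y_cx2": {"y4", "y5"}}
human = {"H_cx1": {"h1", "h2", "h3"}, "H_cx2": {"h6", "h7"}}
confident = {("y1", "h1"), ("y2", "h2")}
weak = [("y3", "h3"), ("y4", "h6"), ("y5", "h1")]

rescued = rescue_weak_hits(weak, yeast, human, confident)
print(rescued)  # [('y3', 'h3')]
```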

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A probabilistic framework to predict protein function from interaction data integrated with semantic knowledge

Author: A Ruepp
A Valencia
A Vazquez
Aidong Zhang
B Schwikowski
B-J Breitkreutz
DS Goldberg
E Nabieva
E Sprinzak
EM Marcotte
H Hishigaki
H Lee
HN Chua
HW Mewes
I Friedberg
International Human Genome Sequencing Consortium
JBL Bard
JR Parrish
JZ Wang
L Salwinski
Lei Shi
M Deng
M Kirac
M Pellegrini
MB Eisen
Murali Ramanathan
P Resnik
PW Lord
R Aebersold
R Overbeek
SF Altschul
The Gene Ontology Consortium
U Karaoz
WR Pearson
X Guo
X Wu
Y Tao
Y-R Cho
Young-Rae Cho
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The functional characterization of newly discovered proteins has been a challenge in the post-genomic era. Protein-protein interactions provide insights into the functional analysis because the function of unknown proteins can be postulated on the basis of their interaction evidence with known proteins. The protein-protein interaction data sets have been enriched by high-throughput experimental methods. However, functional analysis using the interaction data is limited in accuracy because of experimentally generated false positives and interactions that lack functional linkage. Results: Protein-protein interaction data can be integrated with the functional knowledge existing in the Gene Ontology (GO) database. We apply similarity measures to assess the functional similarity between interacting proteins. We present a probabilistic framework for predicting functions of unknown proteins based on the functional similarity. We use leave-one-out cross validation to compare the performance. The experimental results demonstrate that our algorithm performs better than other competing methods in terms of prediction accuracy. In particular, it handles the high false positive rates of current interaction data well. Conclusion: Experimentally determined protein-protein interactions are too error-prone on their own to uncover the functional associations among proteins. The performance of function prediction for uncharacterized proteins can be enhanced by the integration of multiple data sources available.
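The leave-one-out evaluation described in the results can be sketched with a much simpler predictor than the paper's probabilistic framework: a weighted neighbour vote, where each edge weight stands in for a reliability score such as one derived from functional similarity. Everything below (the predictor, weights, protein names and function labels) is a hypothetical illustration, not the authors' algorithm.

```python
from collections import defaultdict

def predict_function(protein, edges, annotations):
    """Return the function with the highest summed edge weight among neighbours."""
    scores = defaultdict(float)
    for (a, b), weight in edges.items():
        if protein in (a, b):
            neighbour = b if a == protein else a
            for function in annotations.get(neighbour, ()):
                scores[function] += weight
    if not scores:
        return None
    return max(sorted(scores), key=scores.get)

def leave_one_out_accuracy(edges, annotations):
    """Hide each protein's annotation in turn and check the top prediction."""
    correct = 0
    for protein, true_functions in annotations.items():
        hidden = {p: f for p, f in annotations.items() if p != protein}
        if predict_function(protein, edges, hidden) in true_functions:
            correct += 1
    return correct / len(annotations)

# Hypothetical network; the C-D edge plays the role of a false positive
# interaction that has been given a low reliability weight.
annotations = {"A": {"kinase"}, "B": {"kinase"}, "C": {"kinase"},
               "D": {"transport"}, "E": {"transport"}}
edges = {("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.7,
         ("C", "D"): 0.2, ("D", "E"): 0.9}

accuracy = leave_one_out_accuracy(edges, annotations)
print(accuracy)  # 1.0
```

If the spurious C-D edge were instead trusted more than the D-E edge, protein D would be mispredicted, which illustrates why down-weighting unreliable interactions can matter.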

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central