509 research outputs found

InterProScan: protein domains identifier

Author: Apweiler R.
Harte N.
Lopez R.
Mulder N.
Pillai S.
Quevillon E.
Silventoinen V.
Publication venue: Oxford University Press
Publication date: 27/06/2005

InterProScan [E. M. Zdobnov and R. Apweiler (2001) Bioinformatics, 17, 847–848] is a tool that combines different protein signature recognition methods from the InterPro [N. J. Mulder, R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bradley, P. Bork, P. Bucher, L. Cerutti et al. (2005) Nucleic Acids Res., 33, D201–D205] consortium member databases into one resource. At the time of writing there are 10 distinct publicly available databases in the application. Protein as well as DNA sequences can be analysed. A web-based version is accessible for academic and commercial organizations from the EBI. In addition, a standalone Perl version and a SOAP Web Service [J. Snell, D. Tidwell and P. Kulchenko (2001) Programming Web Services with SOAP, 1st edn. O'Reilly Publishers, Sebastopol, CA] are also available to the users. Various output formats are supported and include text tables, XML documents, as well as various graphs to help interpret the results.

Crossref

PubMed Central

FFPred: an integrated feature-based function prediction server for vertebrate proteomes

Author: A. E. Lobley
Apweiler
Ashburner
C. A. Orengo
Camon
Churchill
D. T. Jones
Fernandez
Jensen
Keerthi
Ofran
Rost
T. Nugent
Publication venue: Oxford University Press
Publication date: 08/05/2008

One of the challenges of the post-genomic era is to provide accurate function annotations for large volumes of data resulting from genome sequencing projects. Most function prediction servers utilize methods that transfer existing database annotations between orthologous sequences. In contrast, there are few methods that are independent of homology and can annotate distant and orphan protein sequences. The FFPred server adopts a machine-learning approach to perform function prediction in protein feature space using feature characteristics predicted from amino acid sequence. The features are scanned against a library of support vector machines representing over 300 Gene Ontology (GO) classes, and probabilistic confidence scores are returned for each annotation term. The GO term library has been modelled on human protein annotations; however, benchmark performance testing showed robust performance across higher eukaryotes. FFPred offers important advantages over traditional function prediction servers in its ability to annotate distant homologues and orphan protein sequences, and achieves greater coverage and classification accuracy than other feature-based prediction servers. A user may upload an amino acid sequence and receive annotation predictions via email. Feature information is provided as easy-to-interpret graphics displayed on the sequence of interest, allowing for back-interpretation of the associations between features and function classes.

Crossref

PubMed Central

UCL Discovery

The GOA database in 2009—an integrated Gene Ontology Annotation resource

Author: C. O'Donovan
Camon
D. Barrell
D. Binns
E. Dimmer
Gattiker
Kersey
Lomax
Lovering
Mulder
R. Apweiler
R. P. Huntley
Thomas
Yon Rhee
Publication venue: Oxford University Press
Publication date: 29/10/2008

The Gene Ontology Annotation (GOA) project at the EBI (http://www.ebi.ac.uk/goa) provides high-quality electronic and manual associations (annotations) of Gene Ontology (GO) terms to UniProt Knowledgebase (UniProtKB) entries. Annotations created by the project are collated with annotations from external databases to provide an extensive, publicly available GO annotation resource. Currently covering over 160 000 taxa, with greater than 32 million annotations, GOA remains the largest and most comprehensive open-source contributor to the GO Consortium (GOC) project. Over the last five years, the group has augmented the number and coverage of their electronic pipelines, and a number of new manual annotation projects and collaborations now further enhance this resource. A range of files facilitate the download of annotations for particular species, and GO term information and associated annotations can also be viewed and downloaded from the newly developed GOA QuickGO tool (http://www.ebi.ac.uk/QuickGO), which allows users to precisely tailor their annotation set.

Crossref

PubMed Central

UCL Discovery

Where differences resemble: sequence-feature analysis in curated databases of intrinsically disordered proteins

Author: Apweiler
Bardou
Brown
Damiano Piovesan
Das
Daughdrill
Dinkel
Dunker
Dyson
Fichó
Fu
Fukuchi
Gunasekaran
Holehouse
Lee
Marco Necci
Miskei
Mészáros
Necci
Peng
Piovesan
Radivojac
Receveur-Bréchot
Schad
Silvio C E Tosatto
The Gene Ontology Consortium
Tompa
Uversky
Van Roey
Vucetic
Ward
Wootton
Wright
Xue
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2018

Crossref

Archivio istituzionale della ricerca - Università di Padova

TopoGSA: network topological gene set analysis

Author: A. Baudot
A. Valencia
Abatangelo
Apweiler
Ashburner
Bader
E. Glaab
Futreal
Glaab
Hermjakob
Jenssen
Kanehisa
N. Krasnogor
Peri
Snel
Stark
Vogelstein
Watts
Xenarios
Publication venue: Oxford University Press
Publication date: 01/01/2010

Summary: TopoGSA (Topology-based Gene Set Analysis) is a web application dedicated to the computation and visualization of network topological properties for gene and protein sets in molecular interaction networks. Different topological characteristics, such as the centrality of nodes in the network or their tendency to form clusters, can be computed and compared with those of known cellular pathways and processes.
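As a rough illustration of the two kinds of topological property named in this summary, the following Python sketch computes degree centrality and the local clustering coefficient for a small gene set in a toy network. It is not TopoGSA's implementation: the edge list, the gene names and the helper functions (`build_graph`, `degree_centrality`, `clustering_coefficient`) are all hypothetical, and degree centrality is only one of several possible centrality measures.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours of `node` that are themselves connected."""
    nbrs = graph[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Hypothetical toy interaction network and gene set (illustration only).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
graph = build_graph(edges)
gene_set = ["A", "C"]

centrality = degree_centrality(graph)
mean_centrality = sum(centrality[g] for g in gene_set) / len(gene_set)
mean_clustering = sum(clustering_coefficient(graph, g) for g in gene_set) / len(gene_set)

print(f"mean degree centrality: {mean_centrality:.3f}")
print(f"mean clustering coefficient: {mean_clustering:.3f}")
```

Averaging each property over the gene set gives one summary value per property, which could then be compared with the same averages computed for reference pathways.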

Nottingham ePrints

Nottingham eTheses

Crossref

Repository@Nottingham

PubMed Central

Open Repository and Bibliography - Luxembourg

TOPDB: topology data bank of transmembrane proteins

Author: Altschul
Apweiler
Arora
Bairoch
Benoit
Berman
G. E. Tusnady
I. Simon
Ikeda
Jones
Krogh
L. Kalmar
Lomize
Moller
Tusnady
van Geest
Wallin
Publication venue: Oxford University Press

The Topology Data Bank of Transmembrane Proteins (TOPDB) is the most complete and comprehensive collection of transmembrane protein datasets containing experimentally derived topology information currently available. It contains information gathered from the literature and from public databases available on the internet for more than a thousand transmembrane proteins. TOPDB collects details of various experiments that were carried out to learn about the topology of particular transmembrane proteins. In addition to experimental data from the literature, an extensive collection of structural data was also compiled from PDB and from PDBTM. Because topology information is often incomplete, for each protein in the database the most probable topology that is consistent with the collected experimental constraints was also calculated using the HMMTOP transmembrane topology prediction algorithm. Each record in TOPDB also contains information on the given protein sequence, name, organism and cross references to various other databases. The web interface of TOPDB includes tools for searching, relational querying and data browsing as well as for visualization. TOPDB is designed to bridge the gap between the number of transmembrane proteins available in sequence databases and the publicly accessible topology information of experimentally or computationally studied transmembrane proteins. TOPDB is available at http://topdb.enzim.hu

Crossref

PubMed Central

FLORA: a novel method to predict protein function from structure in diverse superfamilies

Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method makes no prior prediction of functional sites, nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures by purely using patterns of structural conservation of all residues.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

Representing kidney development using the Gene Ontology

Author: Alam-Faruque Yasmin
Apweiler Rolf
Attrill Helen
Blake Judith A
Davidson Duncan
Dimmer Emily C
Foulger Rebecca E
Harris Midori A
Hill David P
Howe Douglas G
Huntley Rachael P
Mungall Christopher J
O'Donovan Claire
Thomas Stephen Randall
Tweedie Susan
Woolf Adrian S
Publication venue: PLoS One
Publication date: 01/01/2014

Gene Ontology (GO) provides dynamic controlled vocabularies to aid in the description of the functional biological attributes and subcellular locations of gene products from all taxonomic groups (www.geneontology.org). Here we describe a collaboration between the renal biomedical research community and the GO Consortium to improve the quality and quantity of GO terms describing renal development. In the associated annotation activity, the new and revised terms were associated with gene products involved in renal development and function. This project resulted in a total of 522 GO terms being added to the ontology and the creation of approximately 9,600 kidney-related GO term associations to 940 UniProt Knowledgebase (UniProtKB) entries, covering 66 taxonomic groups. We demonstrate the impact of these improvements on the interpretation of GO term analyses performed on genes differentially expressed in kidney glomeruli affected by diabetic nephropathy. In summary, we have produced a resource that can be utilized in the interpretation of data from small- and large-scale experiments investigating molecular mechanisms of kidney function and development and thereby help towards alleviating renal disease.

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository

Apollo (Cambridge)

The Francis Crick Institute

Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation

Author: Apweiler
Benson
Birney
Collins
Dennis
Deyou Zheng
Harrison
Hubbard
John E. Karro
Kent
Khelifi
Liu
Mark Gerstein
Nadkarni
Nicholas Carriero
Ohshima
Paul Harrison
Philip Cayting
Torrents
Wang
Yangpan Yan
Zhang
Zhaolei Zhang
Zheng
Publication venue: Oxford University Press
Publication date: 11/11/2006

The Pseudogene.org knowledgebase serves as a comprehensive repository for pseudogene annotation. The definition of a pseudogene varies within the literature, resulting in significantly different approaches to the problem of identification. Consequently, it is difficult to maintain a consistent collection of pseudogenes in the detail necessary for their effective use. Our database is designed to address this issue. It integrates a variety of heterogeneous resources and supports a subset structure that highlights specific groups of pseudogenes that are of interest to the research community. Tools are provided for the comparison of sets and the creation of layered set unions, enabling researchers to derive a current ‘consensus’ set of pseudogenes. Additional features include versatile search, the capacity for robust interaction with other databases, the ability to reconstruct older versions of the database (accounting for changing genome builds) and an underlying object-oriented interface designed for researchers with a minimal knowledge of programming. At the present time, the database contains more than 100 000 pseudogenes spanning 64 prokaryote and 11 eukaryote genomes, including a collection of human annotations compiled from 16 sources.
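The set comparison and consensus idea mentioned in this abstract can be sketched in Python as follows. This is only one plausible reading, not the Pseudogene.org interface: the source names, the identifiers, and the functions `consensus` and `layered_union` are hypothetical. It also assumes that pseudogenes from different sources already share identifiers, whereas real annotations would more likely be matched by genomic overlap.

```python
from collections import Counter


def consensus(annotation_sets, min_support):
    """Return the identifiers reported by at least `min_support` sources."""
    counts = Counter()
    for ids in annotation_sets.values():
        counts.update(set(ids))
    return {pid for pid, n in counts.items() if n >= min_support}


def layered_union(annotation_sets, order):
    """Union the sets in priority order, recording which source first
    contributed each identifier (one interpretation of a 'layered' union)."""
    origin = {}
    for source in order:
        for pid in sorted(annotation_sets[source]):
            origin.setdefault(pid, source)
    return origin


# Hypothetical annotation sets from three sources (illustration only).
annotation_sets = {
    "source_a": {"pg1", "pg2", "pg3"},
    "source_b": {"pg2", "pg3", "pg4"},
    "source_c": {"pg3", "pg5"},
}

majority = consensus(annotation_sets, min_support=2)
unanimous = consensus(annotation_sets, min_support=3)
layers = layered_union(annotation_sets, ["source_a", "source_b", "source_c"])

print(sorted(majority))
print(sorted(unanimous))
print(layers)
```

Raising `min_support` trades coverage for agreement between sources, while the layered union keeps every identifier and records where it came from.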

CiteSeerX

Crossref

PubMed Central

IntAct—open source resource for molecular interaction data

Author: Alam-Faruque Y.
Apweiler R.
Aranda B.
Bancarz I.
Bridge A.
Derow C.
Dimmer E.
Feuermann M.
Friedrichsen A.
Hermjakob H.
Huntley R.
Kerrien S.
Khadake J.
Kohler C.
Leroy C.
Liban A.
Lieftink C.
Montecchi-Palazzi L.
Orchard S.
Risse J.
Robbe K.
Roechert B.
Thorneycroft D.
Zhang Y.
Publication venue: Oxford University Press
Publication date: 01/12/2006

IntAct is an open source database and software suite for modeling, storing and analyzing molecular interaction data. The data available in the database originates entirely from published literature and is manually annotated by expert biologists to a high level of detail, including experimental methods, conditions and interacting domains. The database features over 126 000 binary interactions extracted from over 2100 scientific publications and makes extensive use of controlled vocabularies. The web site provides tools allowing users to search, visualize and download data from the repository. IntAct supports and encourages local installations as well as direct data submission and curation collaborations. IntAct source code and data are freely available from