1,059 research outputs found

The PRINTS database: a fine-grained protein sequence annotation and analysis resource—its status in 2012

Author: A. Coletta
A. L. Mitchell
A. Pavlopoulou
A. Theodosiou
Altschul
Apweiler
Attwood
C. Roma-Mateo
Chen
G. Muirhead
Gilks
Henikoff
Huang
I. Popov
Kawamura
Nordle
P. B. Philippou
Roma-Mateo
Schnoes
Scordis
Sonnhammer
T. K. Attwood
Vaughan
Wong
Wright
Publication venue: Oxford University Press
Publication date: 01/01/2012
The PRINTS database, now in its 21st year, houses a collection of diagnostic protein family ‘fingerprints’. Fingerprints are groups of conserved motifs, evident in multiple sequence alignments, whose unique inter-relationships provide distinctive signatures for particular protein families and structural/functional domains. As such, they may be used to assign uncharacterized sequences to known families, and hence to infer tentative functional, structural and/or evolutionary relationships. The February 2012 release (version 42.0) includes 2156 fingerprints, encoding 12 444 individual motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. Here, we report the current status of the database, and introduce a number of recent developments that help both to render a variety of our annotation and analysis tools easier to use and to make them more widely available
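
The fingerprint idea in this abstract can be illustrated with a small sketch. It treats a fingerprint as an ordered list of motifs and reports a match only when every motif is found, in order, along the sequence. The motif patterns, the regular-expression representation and the function name are assumptions made for illustration; PRINTS derives its motifs from multiple alignments and scores them with its own tools, which this does not reproduce.

```python
import re
from typing import List, Optional

# Hypothetical three-motif fingerprint; "." stands for any residue.
# Fixed-length patterns are assumed, so the leftmost hit is also the earliest-ending hit.
TOY_FINGERPRINT = ["GN..V", "DRY", "NP..Y"]


def match_fingerprint(sequence: str, motifs: List[str]) -> Optional[List[int]]:
    """Return the start position of each motif if all occur in order, else None."""
    positions = []
    cursor = 0  # the next motif must start at or after this index
    for pattern in motifs:
        hit = re.compile(pattern).search(sequence, cursor)
        if hit is None:
            return None  # a motif is missing or out of order
        positions.append(hit.start())
        cursor = hit.end()
    return positions


if __name__ == "__main__":
    print(match_fingerprint("AAGNLLVAAADRYAAANPLIYAA", TOY_FINGERPRINT))
```

Requiring every motif is a deliberately strict choice for the sketch; a real fingerprint search would presumably also report partial matches.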

Crossref

PubMed Central

The University of Manchester - Institutional Repository

Dokuz Eylul University Research Information System

PRECIS: Protein reports engineered from concise information in SWISS-PROT

Author: Attwood T.K.
Mitchell A.L.
Reich J.R.
Publication date: 02/08/2017
Motivation: There have been several endeavours to address the problem of annotating sequence data computationally, but the task is non-trivial and few tools have emerged that gather useful information on a given sequence, or set of sequences, in a simple and convenient manner. As more genome projects bear fruit, the mass of uncharacterized sequence data accumulating in public repositories grows ever larger. There is thus a pressing need for tools to support the process of automatic analysis and annotation of newly determined sequences. With this in mind, we have developed PRECIS, which automatically creates protein reports from sets of SWISS-PROT entries, collating results into structured reports, detailing known biological and medical information, literature and database cross-references, and relevant keywords. Availability: The software is accessible online at: http://www.bioinf.man.ac.uk/cgi-bin/dbbrowser/precis/blast_precis.cg

RERO DOC Digital Library

PCAS – a precomputed proteome annotation database resource

Author: Chen Yunjia
Gao Ge
Jiang Ying
Luo Jingchu
Yin Yanbin
Yu Peng
Zhang Yong
Publication venue: BioMed Central
Publication date: 01/01/2003
BACKGROUND: Many model proteomes or "complete" sets of proteins of given organisms are now publicly available. Much effort has been invested in computational annotation of those "draft" proteomes. Motif- or domain-based algorithms play a pivotal role in functional classification of proteins. Employing most available computational algorithms, mainly motif or domain recognition algorithms, we set out to develop an online proteome annotation system with integrated proteome annotation data to complement existing resources. RESULTS: We report here the development of PCAS (ProteinCentric Annotation System) as an online resource of pre-computed proteome annotation data. We applied most available motif or domain databases and their analysis methods, including hmmpfam search of HMMs in Pfam, SMART and TIGRFAM, RPS-PSIBLAST search of PSSMs in CDD, pfscan of PROSITE patterns and profiles, as well as PSI-BLAST search of SUPERFAMILY PSSMs. In addition, signal peptides and transmembrane (TM) helices are predicted using SignalP and TMHMM, respectively. We mapped SUPERFAMILY and COGs to InterPro, so the motif or domain databases are integrated through InterPro. PCAS displays table summaries of pre-computed data and a graphical presentation of motifs or domains relative to the protein. As of now, PCAS contains the human IPI, mouse IPI, rat IPI, A. thaliana, C. elegans, D. melanogaster, S. cerevisiae, and S. pombe proteomes. PCAS is available online. CONCLUSION: PCAS gives better annotation coverage for model proteomes by employing a wider collection of available algorithms. Besides presenting the most confident annotation data, PCAS also allows customized queries so users can inspect statistically less significant boundary information as well. Therefore, besides providing general annotation information, PCAS could be used as a discovery platform. We plan to update PCAS twice a year and will upgrade it when new proteome annotation algorithms are identified.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Molecule Pages database

Author: Attwood
B. Riley
B. Saunders
Bader
Benson
E. Chenette
Finn
Gilman
Letunic
Li
M. Day
Mishra
Mulder
S. Lyon
S. Subramaniam
Scordis
Sonnhammer
Stark
Subramaniam
Publication venue: Oxford University Press
The UCSD-Nature Signaling Gateway Molecule Pages (http://www.signaling-gateway.org/molecule) provides essential information on more than 3800 mammalian proteins involved in cellular signaling. The Molecule Pages contain expert-authored and peer-reviewed information based on the published literature, complemented by regularly updated information derived from public data source references and sequence analysis. The expert-authored data includes both a full-text review about the molecule, with citations, and highly structured data for bioinformatics interrogation, including information on protein interactions and states, transitions between states and protein function. The expert-authored pages are anonymously peer reviewed by the Nature Publishing Group. The Molecule Pages data is present in an object-relational database format and is freely accessible to the authors, the reviewers and the public from a web browser that serves as a presentation layer. The Molecule Pages are supported by several applications that along with the database and the interfaces form a multi-tier architecture. The Molecule Pages and the Signaling Gateway are routinely accessed by a very large research community

Crossref

PubMed Central

On the hierarchical classification of G Protein-Coupled Receptors

Author: A. A. Freitas
A. Secker
Attwood
Bhasin
Bissantz
Cardoso
Christopoulos
D. R. Flower
Das
Davies
Flower
Foord
Gether
Gloriam
Guo
Horn
Hébert
J. Timmis
Karchin
Keerthi
Klabunde
Kolakowski
Lapinsh
M. Mendao
M. N. Davies
Milligan
Papasaikas
Prabhu
Sandberg
Schiöth
Publication venue: Oxford University Press (OUP)
Publication date: 22/10/2007
Motivation: G protein-coupled receptors (GPCRs) play an important role in many physiological systems by transducing an extracellular signal into an intracellular response. Over 50% of all marketed drugs are targeted towards a GPCR. There is considerable interest in developing an algorithm that could effectively predict the function of a GPCR from its primary sequence. Such an algorithm is useful not only in identifying novel GPCR sequences but in characterizing the interrelationships between known GPCRs. Results: An alignment-free approach to GPCR classification has been developed using techniques drawn from data mining and proteochemometrics. A dataset of over 8000 sequences was constructed to train the algorithm. This represents one of the largest GPCR datasets currently available. A predictive algorithm was developed based upon the simplest reasonable numerical representation of the protein's physicochemical properties. A selective top-down approach was developed, which used a hierarchical classifier to assign sequences to subdivisions within the GPCR hierarchy. The predictive performance of the algorithm was assessed against several standard data mining classifiers and further validated against Support Vector Machine-based GPCR prediction servers. The selective top-down approach achieves significantly higher accuracy than standard data mining methods in almost all cases
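
The top-down hierarchical classification mentioned in this abstract can be sketched in a few lines. Each node of the hierarchy carries its own local classifier, and a sequence's feature vector is routed from the root down to a leaf. The node labels, the threshold rules and the names `Node` and `predict_path` are all hypothetical; the paper's physicochemical descriptors and its per-node classifier selection are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence

Features = Sequence[float]


@dataclass
class Node:
    label: str
    # Local classifier: maps a feature vector to the label of one child node.
    classify: Optional[Callable[[Features], str]] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict_path(root: Node, x: Features) -> List[str]:
    """Walk from the root to a leaf, letting each node's classifier pick the branch."""
    path = [root.label]
    node = root
    while node.children:
        node = node.children[node.classify(x)]
        path.append(node.label)
    return path


# Toy hierarchy with made-up threshold rules standing in for trained classifiers.
class_a = Node(
    "Class A",
    classify=lambda x: "Amine" if x[1] > 0 else "Peptide",
    children={"Amine": Node("Amine"), "Peptide": Node("Peptide")},
)
root = Node(
    "GPCR",
    classify=lambda x: "Class A" if x[0] > 0 else "Class B",
    children={"Class A": class_a, "Class B": Node("Class B")},
)

if __name__ == "__main__":
    print(predict_path(root, [1.0, -2.0]))
```

In this arrangement an error made near the root cannot be corrected lower down, which is one reason the choice of classifier at each node matters.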

CiteSeerX

Crossref

Aberystwyth Research Portal

Kent Academic Repository

A potentially novel overlapping gene in the genomes of Israeli acute paralysis virus and its relatives

Author: Graur Dan
Price Nicholas
Sabath Niv
Publication venue: BioMed Central
Publication date: 01/01/2009
The Israeli acute paralysis virus (IAPV) is a honeybee-infecting virus that was found to be associated with colony collapse disorder. The IAPV genome contains two genes encoding a structural and a nonstructural polyprotein. We applied a recently developed method for the estimation of selection in overlapping genes to detect purifying selection and, hence, functionality. We provide evolutionary evidence for the existence of a functional overlapping gene, which is translated in the +1 reading frame of the structural polyprotein gene. Conserved orthologs of this putative gene, which we provisionally call pog (predicted overlapping gene), were also found in the genomes of a monophyletic clade of dicistroviruses that includes IAPV, acute bee paralysis virus, Kashmir bee virus, and Solenopsis invicta (red imported fire ant) virus 1
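
To make the "+1 reading frame" in this abstract concrete, the sketch below splits the same nucleotide string into codons at offset 0 and at offset 1. The function name and the toy sequence are assumptions for illustration; this shows only how an overlapping frame reads different codons from the same bases, not the selection-estimation method the authors applied.

```python
from typing import List


def codons(seq: str, frame: int = 0) -> List[str]:
    """Split a nucleotide sequence into complete codons, starting at offset `frame`."""
    usable = seq[frame:]
    # Drop any trailing bases that do not fill a whole codon.
    end = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, end, 3)]


if __name__ == "__main__":
    toy = "ATGGCCTAA"  # made-up sequence, not taken from IAPV
    print(codons(toy, 0))
    print(codons(toy, 1))
```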

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Linking Fold, Function and Phylogeny: A Comparative Genomics View on Protein (Domain) Evolution

Author: Bagowski Christoph P
te Velthuis Aartjan J.W
Publication venue: Bentham Science Publishers Ltd.
Publication date: 01/01/2008
Domains are the building blocks of all globular proteins and present one of the most useful levels at which protein function can be understood. Through recombination and duplication of a limited set of domains, proteomes evolved and the collection of protein superfamilies in an organism formed. As such, the presence of a shared domain can be regarded as an indicator of similar function and evolutionary history, but it does not necessarily imply it since convergent evolution may give rise to similar gene functions as well as architectures

Crossref

PubMed Central

Oxford University Research Archive

A two-entropies analysis to identify functional positions in the transmembrane region of class A G protein-coupled receptors

Author: Adriaan P. IJzerman
Attwood
Ballesteros
Blaise
Chiu
Dalpiaz
Drews
Eric-Wubbo M. Lameijer
Gether
Godfraind
Guex
Hopkins
Horn
Imai
Innis
Jiang
Kai Ye
Kim
Klabunde
Kokkola
Koradi
Kremer
Kristiansen
Kuipers
Latronico
Li
Liapakis
Lichtarge
Lu
Madabushi
Man
Margot W. Beukers
Min
Mirny
Mirzadegan
Oliveira
Palczewski
Pauwels
Pierce
Pritchard
Provost
Rost
Schoneberg
Shackelford
Shi
Stitham
Swets
Takeda
Tao
Townsend-Nicholson
Tucker
Visiers
Xie
Zeng
Zhu
Publication venue: Wiley

Crossref

SoyDB: a knowledge database of soybean transcription factors

Author: Cheng Jianlin
Joint Genome Institute (JGI)
Joshi Trupti
Libault Marc
Nguyen Henry T
Stacey Gary
Valliyodan Babu
Wang Zheng
Xu Dong
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Transcription factors play the crucial role of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description: The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB, a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions: A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.

Crossref

DigitalCommons@University of Nebraska

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami