
256 research outputs found

A series of PDB related databases for everyday needs

Author: Babaei
Berman
Bernstein
C. Sander
E. Krieger
Etzold
Flint
G. Vriend
Hekkelman
HOBOHM
Hooft
Joosten
Kabsch
Krieger
M. L. Hekkelman
Matthews
Murshudov
Murzin
Noguchi
Orengo
Parkinson
Pirovano
R. P. Joosten
R. Schneider
R. W. W. Hooft
Roe
Sander
Schäfer
T. A. H. te Beek
Teeter
Vriend
Wang
Winn
Publication venue: Oxford University Press
Publication date: 01/01/2010

The Protein Data Bank (PDB) is the worldwide repository of macromolecular structure information. We present a series of databases that run parallel to the PDB. Each database holds one entry, if possible, for each PDB entry. DSSP holds the secondary structure of the proteins. PDBREPORT holds reports on the structure quality and lists errors. HSSP holds a multiple sequence alignment for all proteins. The PDBFINDER holds easy-to-parse summaries of the PDB file content, augmented with essentials from the other systems. PDB_REDO holds re-refined, and often improved, copies of all structures solved by X-ray. WHY_NOT summarizes why certain files could not be produced. All these systems are updated weekly. The data sets can be used for the analysis of properties of protein structures in areas ranging from structural genomics to cancer biology and protein design.

Crossref

PubMed Central

Radboud Repository

Open Repository and Bibliography - Luxembourg

Mining protein database using machine learning techniques

Author: Camargo Renata
Niranjan Mahesan
Publication venue: Walter de Gruyter GmbH
Publication date: 01/06/2008

With a large amount of information relating to proteins accumulating in databases widely available online, it is of interest to apply machine learning techniques that, by extracting underlying statistical regularities in the data, make predictions about the functional and evolutionary characteristics of unseen proteins. Such predictions can help reduce the space over which experiment designers need to search in order to improve our understanding of the biochemical properties. Previously it has been suggested that an integration of features computable by comparing a pair of proteins can be achieved by an artificial neural network, hence predicting the degree to which they may be evolutionarily related and homologous. We compiled two datasets of pairs of proteins, each pair being characterised by seven distinct features. We performed an exhaustive search through all possible combinations of features for the problem of separating remote homologous from analogous pairs, and we note that a significant performance gain was obtained by the inclusion of sequence and structure information. We find that a linear classifier was enough to discriminate a protein pair at the family level. However, at the superfamily level, detecting remote homologous pairs was a relatively harder problem, and we find that nonlinear classifiers achieve significantly higher accuracies. In this paper, we compare three different pattern classification methods on two problems formulated as detecting evolutionary and functional relationships between pairs of proteins, and, from extensive cross-validation and feature-selection studies, quantify the average limits and uncertainties with which such predictions may be made. Feature selection points to a "knowledge gap" in currently available functional annotations. We demonstrate how the scheme may be employed in a framework to associate an individual protein with an existing family of evolutionarily related proteins.
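
The exhaustive search over feature combinations mentioned in this abstract can be illustrated with a short sketch. The abstract does not name the seven features or the scoring procedure, so the feature names and the `score` callable below are hypothetical placeholders; only the enumeration itself is shown.

```python
from itertools import combinations

# Seven placeholder feature names (hypothetical; the abstract does not list them).
FEATURES = [f"feature_{i}" for i in range(1, 8)]


def all_feature_subsets(features):
    """Yield every non-empty combination of the given features."""
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            yield subset


def best_subset(features, score):
    """Return the subset with the highest value of a caller-supplied score.

    `score` stands in for whatever evaluation is used, for example a
    cross-validated classifier accuracy; it is an assumption here.
    """
    return max(all_feature_subsets(features), key=score)


subsets = list(all_feature_subsets(FEATURES))
print(len(subsets))  # 2**7 - 1 = 127 non-empty subsets
```

With only seven features the full enumeration stays small (127 subsets), which is what makes an exhaustive search practical in this setting.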

Southampton (e-Prints Soton)

Crossref

Preliminary analysis of a process for identifying and aligning homologous sequences for proteins with solved structure.

Author: HIGA R. H.
KUSER P. R.
MANCINI A. L.
NESHICH G.
YAMAGISHI M. E. B.
Publication venue: Campinas: Embrapa Informática Agropecuária, 2003.
Publication date: 10/04/2011

The objective of this work is to present and make a preliminary evaluation of an alternative process, named Sequences Homologue to the Query (Structure-having) Sequence (SH2Q), for building multiple alignments similar to those reported in HSSP. The process presented here is based on public-domain programs for searching sequence databases, Blast (Altschul et al., 1990, 1997), and for multiple sequence alignment, ClustalW (Thompson et al., 1994). The criterion for evaluating it is the degree of similarity between the Relative Entropy measures when compared with the same values reported by HSSP. bitstream/CNPTIA/10041/1/comtec48.pdf. Accessed: 30 May 2008.
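
As a rough illustration of the kind of per-column measure being compared, the sketch below computes a relative entropy for one alignment column. It assumes that "relative entropy" means the Shannon entropy of the residue frequencies normalised by its maximum (ln 20) and scaled to 0-100; that reading, the gap handling, and the function name are assumptions, not details taken from the abstract.

```python
import math
from collections import Counter


def relative_entropy(column, alphabet_size=20):
    """Normalised Shannon entropy (0-100) of one alignment column.

    `column` is a string of one-letter residue codes, one per aligned
    sequence; gap characters ("-") are ignored (an assumption).
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = sum(
        (n / total) * math.log(total / n)  # p * ln(1/p)
        for n in Counter(residues).values()
    )
    return 100.0 * entropy / math.log(alphabet_size)


print(relative_entropy("AAAA"))  # fully conserved column
print(relative_entropy("AC-C"))  # mixed column, gap ignored
```

Two sets of alignments could then be compared column by column on this value, which is the style of comparison the abstract describes.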

Infoteca-e

STING Millennium: a web-based suite of programs for comprehensive and simultaneous analysis of protein structure and sequence.

Author: ALMEIDA C. L. de
BAUDET C.
CAMPOS T. F. e
COSTA I. C.
DOMINIQUINI F.
FALCAO P. R. K.
FERREIRA L. L.
FREITAS E. M. de
GOMES G. B.
HIGA R. H.
HORITA L. G.
INOUE M. K.
LIMA C. S.
LUNA F. M.
MANCINI A. L.
MATTIUZ A. R.
MIURA R. T.
NESHICH G.
OGAWA F. O.
OLIVEIRA A. G.
PALANDRANI J. F.
PAPPAS JUNIOR G.
SANTOS G. F. dos
SOUZA D. F. de
SOUZA S.
TOGAWA R. C.
TORRES W. V.
YAMAGISHI M. E. B.
ÁLVARO A.
Publication venue: Oxford University Press (OUP)
Publication date: 15/04/2019

Sting Millennium suite intrinsics. SMS organization. Sting Millennium Modes. Sting Millennium Modules. Millennium Features. Example of Sting Millennium Application. In the publication: Paula R. Kuser

Repository Open Access to Scientific Information from Embrapa

A structural classification of protein-protein interactions for detection of convergently evolved motifs and for prediction of protein binding sites on sequence level

Author: Henschel Andreas
Publication venue: Technische Universität Dresden
Publication date: 17/10/2008

BACKGROUND: A long-standing challenge in the post-genomic era of Bioinformatics is the prediction of protein-protein interactions, and ultimately the prediction of protein functions. The problem is intrinsically harder when only amino acid sequences are available, but a solution is more universally applicable. So far, the problem of uncovering protein-protein interactions has been addressed in a variety of ways, both experimentally and computationally. MOTIVATION: The central problem is: How can protein complexes with solved three-dimensional structure be utilized to identify and classify protein binding sites, and how can knowledge be inferred from this classification such that protein interactions can be predicted for proteins without solved structure? The underlying hypothesis is that protein binding sites are often restricted to a small number of residues, which additionally are often well conserved in order to maintain an interaction. Therefore, the signal-to-noise ratio in binding sites is expected to be higher than in other parts of the surface. This enables binding site detection in unknown proteins when homology-based annotation transfer fails. APPROACH: The problem is addressed by first investigating how geometrical aspects of domain-domain associations can lead to a rigorous structural classification of the multitude of protein interface types. The interface types are explored with respect to two aspects: First, how do interface types with one-sided homology reveal convergently evolved motifs? Second, how can sequential descriptors for local structural features be derived from the interface type classification? Then, the use of sequential representations for binding sites in order to predict protein interactions is investigated. The underlying algorithms are based on machine learning techniques, in particular Hidden Markov Models. RESULTS: This work includes a novel approach to a comprehensive geometrical classification of domain interfaces. Alternative structural domain associations are found for 40% of all family-family interactions. Evaluation of the classification algorithm on a hand-curated set of interfaces yielded a precision of 83% and a recall of 95%. For the first time, a systematic screen of convergently evolved motifs in 102,000 protein-protein interactions with structural information is derived. With respect to this dataset, all cases related to viral mimicry of human interface bindings are identified. Finally, a library of 740 motif descriptors for binding site recognition, encoded as Hidden Markov Models, is generated and cross-validated. Tests for the significance of motifs are provided. The usefulness of descriptors for protein-ligand binding sites is demonstrated for the case of "ATP-binding", where a precision of 89% is achieved, thus outperforming comparable motifs from PROSITE. In particular, a novel descriptor for a P-loop variant has been used to identify ATP-binding sites in 60 protein sequences that have not been annotated before by existing motif databases.

Technische Universität Dresden: Qucosa

Characterization of Protein Residue Surface Accessibility Using Sequence Homology

Author: Mishra Radhika Pallavi
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2010

Residues present on the surface of the proteins are involved in a number of functions, especially in ligand-protein interactions, that are important for drug design. The residues present in the core of the protein provide stability to the protein and help in maintaining protein structure. Hence, there is a need for a binary characterization of protein residues based on their surface accessibility (surface accessible or buried). Such a classification can aid in the directed study of either residue type. A number of methods for the prediction of surface accessible protein residues have been proposed in the past. However, most of these methods are computationally complex and time consuming. In this thesis, we propose a simple method based on protein sequence homology parameters for the binary classification of protein residues as surface accessible or “buried”. To aid in the classification of protein residues, we chose three highly conservative homology-based parameter filter thresholds. The filter thresholds predicted and evaluated are: residue sequence entropy ≥ 0.15, fraction of strongly hydrophobic residues < 0.5 and fraction of small residues < 0.15. The application of these filter thresholds to the residues is expected to predict the “buried residues” with a better percentage accuracy than that of the surface accessible residues. These filter thresholds were selected from the frequency distributions and the aggregate correlation plots of the various homology-based parameters. An analysis of the plots suggests the presence of a strongly hydrophobic core between packing density 14–22, where the presence of strongly hydrophobic residues is maximum and the presence of small and non-strongly hydrophobic residues is minimum. However, the densest portion of the protein (density 26–35) is indicated to be occupied by a combination of small and non-strongly hydrophobic residues with a negligible presence of strongly hydrophobic residues.
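
The three filter thresholds quoted in this abstract can be written down as a simple check. The sketch below reads the thresholds as 0.15, 0.5 and 0.15 and assumes the three homology-based parameters are already computed per residue; the class and field names are hypothetical. The abstract does not spell out which class a residue meeting all three filters is assigned to, so the function only reports whether the filters are met.

```python
from dataclasses import dataclass


@dataclass
class ResidueProfile:
    """Per-residue homology parameters (field names are hypothetical)."""
    sequence_entropy: float
    frac_strongly_hydrophobic: float
    frac_small: float


def passes_filters(profile):
    """True when all three thresholds from the abstract are satisfied."""
    return (
        profile.sequence_entropy >= 0.15
        and profile.frac_strongly_hydrophobic < 0.5
        and profile.frac_small < 0.15
    )


variable_site = ResidueProfile(0.40, 0.10, 0.05)
conserved_hydrophobic_site = ResidueProfile(0.05, 0.80, 0.02)
print(passes_filters(variable_site), passes_filters(conserved_hydrophobic_site))
```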

SJSU ScholarWorks

ProQuest OAI Repository

Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces

Author: Hekkelman Maarten L
Kuipers Remko KP
te Beek Tim AH
Venselaar Hanka
Vriend Gert
Publication venue: BioMed Central
Publication date: 01/01/2010

BACKGROUND: Many newly detected point mutations are located in protein-coding regions of the human genome. Knowledge of their effects on the protein's 3D structure provides insight into the protein's mechanism, can aid the design of further experiments, and eventually can lead to the development of new medicines and diagnostic tools. RESULTS: In this article we describe HOPE, a fully automatic program that analyzes the structural and functional effects of point mutations. HOPE collects information from a wide range of information sources including calculations on the 3D coordinates of the protein by using WHAT IF Web services, sequence annotations from the UniProt database, and predictions by DAS services. Homology models are built with YASARA. Data is stored in a database and used in a decision scheme to identify the effects of a mutation on the protein's 3D structure and function. HOPE builds a report with text, figures, and animations that is easy to use and understandable for (bio)medical researchers. CONCLUSIONS: We tested HOPE by comparing its output to the results of manually performed projects. In all straightforward cases HOPE performed similarly to a trained bioinformatician. The use of 3D structures helps optimize the results in terms of reliability and details. HOPE's results are easy to understand and are presented in a way that is attractive for researchers without an extensive bioinformatics background.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Radboud Repository

PDBselect 1992–2009 and PDBfilter-select

Author: Abagyan
Chandonia
Hobohm
Hooft
Huang
Laskowski
Sander
Smith
Sven Griep
Uwe Hobohm
Zachariah
Publication venue: Oxford University Press
Publication date: 01/01/2010

PDBselect (http://bioinfo.tg.fh-giessen.de/pdbselect/) is a list of representative protein chains with low mutual sequence identity selected from the Protein Data Bank (PDB) to enable unbiased statistics. The list increased from 155 chains in 1992 to more than 4500 chains in 2009. PDBfilter-select is an online service to generate user-defined selections.
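
To make the idea of a representative list with low mutual sequence identity concrete, here is a minimal greedy sketch. It is not the PDBselect procedure: the identity measure (position-by-position comparison without alignment), the 25% cutoff, and the function names are all assumptions chosen for illustration.

```python
def sequence_identity(a, b):
    """Fraction of identical positions over the shorter sequence.

    A crude stand-in for a real alignment-based identity (assumption).
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def select_representatives(chains, max_identity=0.25):
    """Greedily keep chains whose identity to every kept chain is low.

    `chains` maps a chain identifier to its sequence; chains are visited
    in insertion order, so earlier entries take priority.
    """
    selected = {}
    for chain_id, seq in chains.items():
        if all(sequence_identity(seq, kept) <= max_identity
               for kept in selected.values()):
            selected[chain_id] = seq
    return list(selected)


chains = {
    "A": "ACDEFGHIKL",
    "B": "ACDEFGHIKV",  # 90% identical to A, so it is skipped
    "C": "WWWWWWWWWW",  # unrelated to A, so it is kept
}
print(select_representatives(chains))  # ['A', 'C']
```

Because the selection is greedy, the result depends on the order in which chains are visited; a real selection would typically order chains by some quality criterion first.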

CiteSeerX

Crossref

PubMed Central