Analysis of Three-Dimensional Protein Images
A fundamental goal of research in molecular biology is to understand protein
structure. Protein crystallography is currently the most successful method for
determining the three-dimensional (3D) conformation of a protein, yet it
remains labor intensive and relies on an expert's ability to derive and
evaluate a protein scene model. In this paper, the problem of protein structure
determination is formulated as an exercise in scene analysis. A computational
methodology is presented in which a 3D image of a protein is segmented into a
graph of critical points. Bayesian and certainty factor approaches are
described and used to analyze critical point graphs and identify meaningful
substructures, such as alpha-helices and beta-sheets. Results of applying the
methodologies to protein images at low and medium resolution are reported. The
research is related to approaches to representation, segmentation and
classification in vision, as well as to top-down approaches to protein
structure prediction. Comment: See http://www.jair.org/ for any accompanying files.
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology producing large amounts of data and innovative bioinformatics approaches, designed to cope with this data flood, has led to new interesting results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article gives a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, and their practical limitations are explained and compared.
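As a rough illustration of what a full-text index structure does, the sketch below builds a suffix array and uses binary search to locate a pattern. The suffix array is chosen here only as a familiar example; the abstract does not say which structures the article covers in detail. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive construction is for illustration rather than genome-scale use.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction that compares whole suffixes; production tools use
    far more memory- and time-efficient algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return the sorted start positions of `pattern` in `text`.

    Two binary searches over the suffix array delimit the block of suffixes
    that begin with `pattern`.
    """
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    genome = "ACGTACGTAC"  # toy sequence, not real data
    index = build_suffix_array(genome)
    print(find_occurrences(genome, index, "ACG"))
```

The index is built once and then reused for many queries, which is the memory-for-time trade-off the abstract refers to: each lookup costs a logarithmic number of string comparisons instead of a scan of the whole text.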
Jeeva: Enterprise Grid-enabled Web Portal for Protein Secondary Structure Prediction
This paper presents a Grid portal for protein secondary structure prediction
developed by using services of Aneka, a .NET-based enterprise Grid technology.
The portal is used by research scientists to discover new prediction structures
in a parallel manner. An SVM (Support Vector Machine)-based prediction
algorithm is used with 64 sample protein sequences as a case study to
demonstrate the potential of enterprise Grids. Comment: 7 pages.
A tractable genotype-phenotype map for the self-assembly of protein quaternary structure
The mapping between biological genotypes and phenotypes is central to the
study of biological evolution. Here we introduce a rich, intuitive, and
biologically realistic genotype-phenotype (GP) map, that serves as a model of
self-assembling biological structures, such as protein complexes, and remains
computationally and analytically tractable. Our GP map arises naturally from
the self-assembly of polyomino structures on a 2D lattice and exhibits a number
of properties: genotypes vastly outnumber phenotypes, genotypic redundancy
varies greatly between phenotypes, phenotypes consist of disconnected
mutational networks, and most phenotypes can be reached in a small number of
mutations. We also show that
the mutational robustness of phenotypes scales very roughly logarithmically
with phenotype redundancy and is positively correlated with phenotypic
evolvability. Although our GP map describes the assembly of disconnected
objects, it shares many properties with other popular GP maps for connected
units, such as models for RNA secondary structure or the HP lattice model for
protein tertiary structure. The remarkable fact that these important properties
similarly emerge from such different models suggests the possibility that
universal features underlie a much wider class of biologically realistic GP
maps. Comment: 12 pages, 6 figures.
PDBFlex: exploring flexibility in protein structures.
The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. PDBFlex contains tools and viewers enabling in-depth examination of structural variability, including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates, and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software.
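The RMSD distances mentioned above can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and that atoms are paired one to one by position in the list; how PDBFlex itself aligns and compares structures is not described in the abstract, and the function name `rmsd` and the coordinates below are made up for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two paired coordinate sets.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    # Hypothetical coordinates for two conformations of a two-atom fragment.
    conf_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    conf_b = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
    print(rmsd(conf_a, conf_b))
```

Computing this value for every pair of depositions of the same protein gives the kind of distance matrix that a 2D-scaling view or a clustering step could start from.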
The RCSB Protein Data Bank: views of structural biology for basic and applied research and education.
The RCSB Protein Data Bank (RCSB PDB, http://www.rcsb.org) provides access to 3D structures of biological macromolecules and is one of the leading resources in biology and biomedicine worldwide. Our efforts over the past 2 years focused on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. Herein, we describe recently introduced data annotations including integration with external biological resources, such as gene and drug databases, new visualization tools and improved support for the mobile web. We also describe access to data files, web services and open access software components to enable software developers to more effectively mine the PDB archive and related annotations. Our efforts are aimed at expanding the role of 3D structure in understanding biology and medicine.
BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction
A novel discrete mathematical approach is proposed as an additional tool for molecular systematics which does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms generating mathematical representations directly from DNA/RNA or protein sequences, followed by the output of numerical (scalar or vector) and visual characteristics (graphs). The binary encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (or ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to get an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called Boolean analysis, or BOOL-AN.
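The last step described above, going from descriptor vectors to a distance matrix and then to a tree, can be sketched as follows. This is not the BOOL-AN software: the ICF computation is not shown, the descriptor vectors are invented placeholders, Euclidean distance is only one of the possible distance choices, and the function names `distance_matrix` and `upgma` are hypothetical. The tree is returned as nested tuples without branch lengths.

```python
import math
from typing import Dict, List, Sequence, Tuple


def distance_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise Euclidean distances between descriptor vectors."""
    return [
        [math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v))) for v in vectors]
        for u in vectors
    ]


def upgma(names: Sequence[str], dist: Sequence[Sequence[float]]):
    """Cluster taxa by UPGMA and return the tree topology as nested tuples."""
    n = len(names)
    if n == 0:
        raise ValueError("at least one taxon is required")
    # Each cluster id maps to (subtree, number of leaves in it).
    clusters: Dict[int, Tuple[object, int]] = {
        i: (name, 1) for i, name in enumerate(names)
    }
    # Distances between live clusters, keyed by (smaller id, larger id).
    d: Dict[Tuple[int, int], float] = {
        (i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)
    }
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        del d[(a, b)]
        new = next_id
        next_id += 1
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps the distance an unweighted mean
            # over all leaf pairs, which is what defines UPGMA.
            d[(c, new)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        clusters[new] = ((tree_a, tree_b), size_a + size_b)
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    # Placeholder descriptor vectors for three hypothetical sequences.
    taxa = ["A", "B", "C"]
    vectors = [(0.0, 0.0), (0.0, 2.0), (6.0, 1.0)]
    print(upgma(taxa, distance_matrix(vectors)))
```

UPGMA assumes a roughly constant rate of change along all lineages; neighbor-joining, the other tree method named in the abstract, does not make that assumption and would need a separate implementation.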