Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS
GROMACS is a widely used package for biomolecular simulation, and over the
last two decades it has evolved from small-scale efficiency to advanced
heterogeneous acceleration and multi-level parallelism targeting some of the
largest supercomputers in the world. Here, we describe some of the ways we have
been able to realize this through the use of parallelization on all levels,
combined with a constant focus on absolute performance. Release 4.6 of GROMACS
uses SIMD acceleration on a wide range of architectures, GPU offloading
acceleration, and both OpenMP and MPI parallelism within and between nodes,
respectively. The recent work on acceleration made it necessary to revisit the
fundamental algorithms of molecular simulation, including the concept of
neighborsearching, and we discuss the present and future challenges we see for
exascale simulation - in particular a very fine-grained task parallelism. We
also discuss the software management, code peer review and continuous
integration testing required for a project of this complexity. Comment: EASC 2014 conference proceedin
Empirical Potential Function for Simplified Protein Models: Combining Contact and Local Sequence-Structure Descriptors
An effective potential function is critical for protein structure prediction
and folding simulation. Simplified protein models such as those requiring only
Cα or backbone atoms are attractive because they enable efficient
search of the conformational space. We show residue specific reduced discrete
state models can represent the backbone conformations of proteins with small
RMSD values. However, no potential functions exist that are designed for such
simplified protein models. In this study, we develop optimal potential
functions by combining contact interaction descriptors and local
sequence-structure descriptors. The form of the potential function is a
weighted linear sum of all descriptors, and the optimal weight coefficients are
obtained through optimization using both native and decoy structures. The
performance of the potential function in tests discriminating native protein
structures from decoys is evaluated using several benchmark decoy sets. Our
potential function, requiring only backbone atoms or Cα atoms, has
comparable or better performance than several residue-based potential functions
that require additional coordinates of side chain centers or coordinates of all
side chain atoms. By reducing the residue alphabets down to size 5 for local
structure-sequence relationship, the performance of the potential function can
be further improved. Our results also suggest that local sequence-structure
correlation may play an important role in reducing the entropic cost of protein
folding. Comment: 20 pages, 5 figures, 4 tables. In press, Protein
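The abstract above describes the potential as a weighted linear sum of descriptors, evaluated by how well it separates native structures from decoys. A minimal sketch of that idea follows. The descriptor values, the weights, and the function names `potential` and `native_rank` are hypothetical and chosen only for illustration; they are not taken from the paper, and the weight optimization step is not shown.

```python
def potential(weights, descriptors):
    """Weighted linear sum of descriptor values: E = sum_i w_i * d_i."""
    if len(weights) != len(descriptors):
        raise ValueError("weights and descriptors must have the same length")
    return sum(w * d for w, d in zip(weights, descriptors))


def native_rank(weights, native, decoys):
    """Rank of the native structure by energy (1 means lowest energy)."""
    e_native = potential(weights, native)
    lower = sum(1 for d in decoys if potential(weights, d) < e_native)
    return 1 + lower


# Hypothetical toy data: three descriptors per structure
# (for example two contact counts and one local sequence-structure term).
weights = [-1.0, -0.5, 0.8]
native = [10.0, 6.0, 1.0]
decoys = [
    [7.0, 5.0, 2.0],
    [9.0, 3.0, 4.0],
    [4.0, 8.0, 0.0],
]

rank = native_rank(weights, native, decoys)
print("native energy:", potential(weights, native))
print("native rank among decoys:", rank)
```

With these toy numbers the native structure has the lowest energy, so it is ranked first; a real evaluation would use benchmark decoy sets such as those mentioned in the abstract.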
Three-Dimensional Graph Matching to Identify Secondary Structure Correspondence of Medium-Resolution Cryo-EM Density Maps
Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are limited to medium resolution (~4–10 Å) in some cases. In this resolution range, a cryo-EM density map can hardly be used to directly determine protein structure at atomic resolution, or even at the level of the amino acid residue backbone. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSE correspondence in three-dimensional (3D) space. Initially, the target sequence is modeled and highly reliable features are extracted from the generated 3D model and the map, so that the SSE matching problem can be formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSE correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.
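One way to picture the step from 3D vectors to a graph is to treat each SSE as a line segment and attach geometric attributes to every pair of segments, so that a candidate pairing between model and map can be checked for consistency. The sketch below is only an illustration under stated assumptions: the choice of edge features (midpoint distance and unsigned inter-axis angle), the tolerances, and all function names are hypothetical, and it does not reproduce the voting algorithm or the PLC step of the paper.

```python
import math


def midpoint(segment):
    """Midpoint of a segment given as (start, end) 3D points."""
    start, end = segment
    return tuple((a + b) / 2.0 for a, b in zip(start, end))


def direction(segment):
    """Unit vector along the segment axis."""
    start, end = segment
    delta = [b - a for a, b in zip(start, end)]
    norm = math.sqrt(sum(c * c for c in delta))
    return [c / norm for c in delta]


def edge_features(seg1, seg2):
    """Distance between midpoints and unsigned angle (degrees) between axes.

    The angle is unsigned because this sketch assumes the direction of a
    segment detected in a density map is not known.
    """
    dist = math.dist(midpoint(seg1), midpoint(seg2))
    cos = abs(sum(u * v for u, v in zip(direction(seg1), direction(seg2))))
    angle = math.degrees(math.acos(min(1.0, cos)))
    return dist, angle


def compatible(model_pair, map_pair, dist_tol=2.0, angle_tol=20.0):
    """True if two SSE pairs have similar edge features (hypothetical tolerances)."""
    d_model, a_model = edge_features(*model_pair)
    d_map, a_map = edge_features(*map_pair)
    return abs(d_model - d_map) <= dist_tol and abs(a_model - a_map) <= angle_tol


# Hypothetical toy segments (coordinates in angstroms).
model_pair = (((0, 0, 0), (0, 0, 10)), ((5, 0, 0), (5, 10, 0)))
shifted_pair = (((1, 1, 1), (1, 1, 11)), ((6, 1, 1), (6, 11, 1)))
parallel_pair = (((0, 0, 0), (0, 0, 10)), ((5, 0, 0), (5, 0, 10)))

print(compatible(model_pair, shifted_pair))   # same geometry, translated
print(compatible(model_pair, parallel_pair))  # different relative orientation
```

Because the features depend only on relative geometry, a rigid translation of both segments leaves them unchanged, which is why the shifted pair is accepted and the parallel pair is not.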
Analysis of Three-Dimensional Protein Images
A fundamental goal of research in molecular biology is to understand protein
structure. Protein crystallography is currently the most successful method for
determining the three-dimensional (3D) conformation of a protein, yet it
remains labor intensive and relies on an expert's ability to derive and
evaluate a protein scene model. In this paper, the problem of protein structure
determination is formulated as an exercise in scene analysis. A computational
methodology is presented in which a 3D image of a protein is segmented into a
graph of critical points. Bayesian and certainty factor approaches are
described and used to analyze critical point graphs and identify meaningful
substructures, such as alpha-helices and beta-sheets. Results of applying the
methodologies to protein images at low and medium resolution are reported. The
research is related to approaches to representation, segmentation and
classification in vision, as well as to top-down approaches to protein
structure prediction. Comment: See http://www.jair.org/ for any accompanying file
Ab Initio Protein Structure Prediction Using Evolutionary Approach: A Survey
The Protein Structure Prediction (PSP) problem is to determine the three-dimensional structure of a protein from its primary structure alone. Misfolding of a protein causes human diseases. Thus, knowledge of the structure and function of proteins, combined with the prediction of their structure, is a complex problem and a challenge for the area of computational biology. Metaheuristic optimization algorithms are naturally applicable to solving NP-hard problems. These algorithms are bio-inspired, since they were designed based on procedures found in nature, such as the successful evolutionary behavior of natural systems. In this paper, we present a survey of methods that approach ab initio protein structure prediction with evolutionary computing algorithms, considering both single and multi-objective optimization. An overview of the works is presented, with some details about which characteristics of the problem are considered, as well as specific points of the algorithms used. A comparison between the approaches is presented and some directions of the research field are pointed out.
MESSM: a framework for protein threading by neural networks and support vector machines
Protein threading, which is also referred to as fold recognition, aligns a probe amino acid sequence onto a library of representative folds of known structure to identify a structural similarity. Following the threading technique of the structural profile approach, this research focused on developing and evaluating a new framework, Mixed Environment Specific Substitution Mapping (MESSM), for protein threading by artificial neural networks (ANNs) and support vector machines (SVMs). The MESSM presents a new process to develop an efficient tool for protein fold recognition. It achieved better efficiency while retaining effectiveness in protein prediction. The MESSM has three key components, each of which is a step in the protein threading framework. First, building the fold profile library: given a protein structure with a residue-level environmental description, neural networks are used to generate an environment-specific amino acid substitution (3D-1D) mapping. Second, mixed substitution mapping: a mixed environment-specific substitution mapping is developed by combining the structure-derived substitution score with the sequence profile from well-developed amino acid substitution matrices. Third, confidence evaluation: a support vector machine is employed to measure the significance of the sequence-structure alignment. Four computational experiments are carried out to verify the performance of the MESSM, using the Fischer, ProSup, Lindahl and Wallner benchmarks. Tested on the Fischer, Lindahl and Wallner benchmarks, MESSM achieved fold recognition performance comparable to that of energy-potential-based threading models. For the Fischer benchmark, MESSM correctly recognises 56 out of 68 pairs, which is the same performance as that of COBLATH and SPARKS. The computational experiments show that MESSM is a fast program. It could make an alignment between a probe sequence (150 amino acids) and a profile of 4775 template proteins in 30 seconds on a Pentium IV PC with 1G memory.
Also, tested on ProSup benchmark, the MESSM achieved alignment accuracy of 59.7%, which is better than current models. The research work was extended to develop a threading score following the threading technique of the contact potential approach. A TES (Threading with Environment-specific Score) model is constructed by neural networks
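The mixed substitution mapping described in the abstract above combines a structure-derived score with a score from a sequence substitution matrix. The abstract does not say how the two are combined, so the sketch below assumes a simple convex combination with a hypothetical weight `alpha`, applied to an ungapped alignment. The lookup tables, their values, and the function names are illustrative assumptions and are not part of MESSM.

```python
def mixed_score(structural, sequence, alpha=0.5):
    """Convex combination of a structural and a sequence-based score.

    alpha is a hypothetical mixing weight between 0 and 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * structural + (1.0 - alpha) * sequence


def ungapped_alignment_score(probe, template_seq, template_envs,
                             struct_table, seq_matrix, alpha=0.5):
    """Sum of mixed scores over aligned positions (no gaps, equal lengths)."""
    if not (len(probe) == len(template_seq) == len(template_envs)):
        raise ValueError("probe, template sequence and environments must align")
    total = 0.0
    for aa_probe, aa_template, env in zip(probe, template_seq, template_envs):
        structural = struct_table[(env, aa_probe)]     # 3D-1D style score
        sequence = seq_matrix[(aa_template, aa_probe)]  # substitution matrix score
        total += mixed_score(structural, sequence, alpha)
    return total


# Hypothetical toy tables for a two-residue alignment.
struct_table = {("buried", "A"): 2.0, ("exposed", "L"): -1.0}
seq_matrix = {("A", "A"): 4.0, ("V", "L"): 1.0}

score = ungapped_alignment_score(
    probe="AL",
    template_seq="AV",
    template_envs=["buried", "exposed"],
    struct_table=struct_table,
    seq_matrix=seq_matrix,
    alpha=0.5,
)
print("mixed alignment score:", score)
```

Setting `alpha` to 1 or 0 recovers a purely structural or purely sequence-based score, which makes it easy to see what each term contributes in this simplified picture.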
- …