183 research outputs found

Evolutionary decision rules for predicting protein contact maps

Author: Aguilar Ruiz Jesús Salvador
Asencio Cortés Gualberto
Divina Federico
Márquez Chamorro Alfonso Eduardo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Protein structure prediction is currently one of the main open challenges in bioinformatics. The protein contact map is a useful and commonly used representation of protein 3D structure: it records binary proximities (contact or non-contact) between each pair of amino acids of a protein. In this work, we propose a multi-objective evolutionary approach for contact map prediction based on physico-chemical properties of amino acids. The evolutionary algorithm produces a set of decision rules that identify contacts between amino acids. The rules obtained by the algorithm impose a set of conditions on amino acid properties to predict contacts. We present results obtained by our approach on four different protein data sets. A statistical study was also performed to extract valid conclusions from the set of prediction rules generated by our algorithm. The results obtained confirm the validity of our proposal.
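As an illustration of the contact map representation described in this abstract, the following minimal sketch builds a binary map from 3D coordinates. The 8 Å cutoff, the use of one coordinate per residue, and the toy coordinates are assumptions made for the example; the abstract does not state which contact definition the authors use.

```python
import math

def contact_map(coords, threshold=8.0):
    """Return a symmetric binary matrix with 1 where two residues are in contact.

    coords: one (x, y, z) tuple per residue, in angstroms.
    threshold: distance cutoff in angstroms (assumed value, not from the source).
    The diagonal is left at 0, since a residue is not paired with itself.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

# Hypothetical toy coordinates for four residues placed along the x axis.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
for row in toy_map:
    print(row)
```

The first three residues lie within the cutoff of one another, while the fourth is too far from all of them, so its row and column contain only zeros.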

idUS. Depósito de Investigación Universidad de Sevilla

Predicting Residue-Residue Contacts and Helix-Helix Interactions in Transmembrane Proteins Using an Integrative Feature-Based Random Forest Approach

Author: Chuan Wang
Jiangning Song
Ren-Xiang Yan
Ruben Claudio Aguilar
Xiao-Feng Wang
Zhen Chen
Ziding Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Integral membrane proteins constitute 25–30% of genomes and play crucial roles in many biological processes. However, less than 1% of membrane protein structures are in the Protein Data Bank. In this context, it is important to develop reliable computational methods for predicting the structures of membrane proteins. Here, we present the first application of random forest (RF) to residue-residue contact prediction in transmembrane proteins, which we term TMhhcp. Rigorous cross-validation tests indicate that the RF models provide better prediction performance than two state-of-the-art methods, TMHcon and MEMPACK. Using a strict leave-one-protein-out jackknifing procedure, the models reached top L/5 prediction accuracies of 49.5% and 48.8% for two different residue contact definitions, respectively. The predicted residue contacts were further employed to predict interacting helical pairs, achieving Matthews correlation coefficients of 0.430 and 0.424 for the two residue contact definitions, respectively. For the benefit of the academic community, the TMhhcp server has been made freely accessible at http://protein.cau.edu.cn/tmhhcp.
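The two metrics quoted in this abstract can be sketched as follows. This is a generic illustration, not the TMhhcp evaluation code: the function names, the input layout, and the toy values are assumptions, and "top L/5" is taken here in its usual sense of the L/5 highest-scoring predicted pairs for a protein of length L.

```python
import math

def top_l_over_5_accuracy(scored_pairs, true_contacts, length):
    """Fraction of the (length // 5) highest-scoring pairs that are true contacts.

    scored_pairs: list of ((i, j), score) tuples (hypothetical layout).
    true_contacts: set of (i, j) pairs observed in the native structure.
    """
    k = max(1, length // 5)
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:k]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _ in ranked if pair in true_contacts)
    return hits / len(ranked)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical predictions for a protein of length 10, so L/5 = 2.
toy_scores = [((0, 5), 0.9), ((1, 7), 0.8), ((2, 9), 0.3)]
toy_truth = {(0, 5), (2, 9)}
toy_accuracy = top_l_over_5_accuracy(toy_scores, toy_truth, length=10)
```

Returning 0.0 when the MCC denominator is zero is a common convention for degenerate confusion matrices; other conventions exist.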

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Monash University Research Portal

New evolutionary approaches to protein structure prediction

Author: Márquez Chamorro Alfonso Eduardo
Publication date: 01/01/2013

Doctoral Programme in Biotechnology and Chemical Technology. Protein Structure Prediction (PSP) is one of the principal topics in bioinformatics, and multiple approaches have been developed to predict the structure of a protein. Determining the three-dimensional structure of proteins is necessary to understand protein function at the molecular level. A useful and commonly used representation of protein 3D structure is the protein contact map, which records binary proximities (contact or non-contact) between each pair of amino acids of a protein. This thesis includes a compilation of soft computing techniques for the protein structure prediction problem (secondary and tertiary structures). A novel evolutionary secondary structure predictor is also described in detail, and the results obtained confirm the validity of the proposal. Furthermore, we propose a multi-objective evolutionary approach for contact map prediction based on physico-chemical properties of amino acids. The evolutionary algorithm produces a set of decision rules that identify contacts between amino acids. The rules obtained by the algorithm impose a set of conditions on amino acid properties in order to predict contacts. Results obtained by our approach on four different protein data sets are also presented. Finally, a statistical study was performed to extract valid conclusions from the set of prediction rules generated by our algorithm. Universidad Pablo de Olavide. Centro de Estudios de Postgrad
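To make the idea of a decision rule over amino acid properties concrete, here is a minimal sketch of how one such rule might be evaluated. The rule format, the interval bounds, and the choice of hydropathy as the only property are hypothetical; the abstract does not specify how the thesis encodes its rules. The hydropathy values are the standard Kyte-Doolittle values for a small subset of amino acids.

```python
# Kyte-Doolittle hydropathy values for a few amino acids (subset for brevity).
HYDROPATHY = {"A": 1.8, "I": 4.5, "L": 3.8, "V": 4.2, "K": -3.9, "D": -3.5, "G": -0.4}

# Hypothetical rule: predict a contact when both residues are strongly hydrophobic.
# Each entry is an inclusive (low, high) interval on the property value.
EXAMPLE_RULE = {"i_range": (3.0, 5.0), "j_range": (3.0, 5.0)}

def rule_predicts_contact(sequence, i, j, rule):
    """Return True if residues i and j of the sequence satisfy both rule intervals."""
    low_i, high_i = rule["i_range"]
    low_j, high_j = rule["j_range"]
    value_i = HYDROPATHY[sequence[i]]
    value_j = HYDROPATHY[sequence[j]]
    return low_i <= value_i <= high_i and low_j <= value_j <= high_j

toy_sequence = "IGKDAL"
hydrophobic_pair = rule_predicts_contact(toy_sequence, 0, 5, EXAMPLE_RULE)  # I and L
mixed_pair = rule_predicts_contact(toy_sequence, 0, 2, EXAMPLE_RULE)        # I and K
```

In an evolutionary setting, the interval bounds would be the quantities evolved by the algorithm; here they are fixed by hand purely for illustration.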

Repositorio Institucional Olavide

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Optimization of BCL::Fold for Protein Folding de novo and with Cryo-EM Restraints

Author: Fooksa Michaela Sever
Publication venue: VANDERBILT

Statistical approaches to the study of protein folding and energetics

Author: Burkoff Nikolas S.

The determination of protein structure and the exploration of protein folding landscapes are two of the key problems in computational biology. In order to address these challenges, both a protein model that accurately captures the physics of interest and an efficient sampling algorithm are required. The first part of this thesis documents the continued development of CRANKITE, a coarse-grained protein model, and its energy landscape exploration using nested sampling, a Bayesian sampling algorithm. We extend CRANKITE and optimize its parameters using a maximum likelihood approach. The efficiency of our procedure, using the contrastive divergence approximation, allows a large training set to be used, producing a model which is transferable to proteins not included in the training set. We develop an empirical Bayes model for the prediction of protein β-contacts, which are required inputs for CRANKITE. Our approach couples the constraints and prior knowledge associated with β-contacts to a maximum entropy-based statistic which predicts evolutionarily-related contacts. Nested sampling (NS) is a Bayesian algorithm shown to be efficient at sampling systems which exhibit a first-order phase transition. In this work we parallelize the algorithm and, for the first time, apply it to a biophysical system: small globular proteins modelled using CRANKITE. We generate energy landscape charts, which give a large-scale visualization of the protein folding landscape, and we compare the efficiency of NS to an alternative sampling technique, parallel tempering, when calculating the heat capacity of a short peptide. In the final part of the thesis we adapt the NS algorithm for use within a molecular dynamics framework and demonstrate the application of the algorithm by calculating the thermodynamics of all-atom models of a small peptide, comparing results to the standard replica exchange approach. This adaptation will allow NS to be used with more realistic force fields in the future.
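Parallel tempering (replica exchange), the baseline that this abstract compares nested sampling against, relies on a swap move between replicas at different temperatures. The sketch below shows the standard Metropolis acceptance probability for such a swap; it is a textbook form, not code from the thesis, and it assumes energies and inverse temperatures are expressed in consistent units (Boltzmann constant folded into beta).

```python
import math

def swap_acceptance(beta_i, beta_j, energy_i, energy_j):
    """Probability of accepting a configuration swap between replicas i and j.

    Standard criterion: min(1, exp((beta_i - beta_j) * (energy_i - energy_j))),
    where energy_i is the energy of the configuration currently held by replica i.
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # Branch before exponentiating so large positive deltas cannot overflow.
    return 1.0 if delta >= 0 else math.exp(delta)

# Cold replica (beta = 1.0) holding the higher-energy configuration: always swap.
favourable = swap_acceptance(1.0, 0.5, 5.0, 2.0)
# Cold replica already holding the lower-energy configuration: swap only sometimes.
unfavourable = swap_acceptance(1.0, 0.5, 2.0, 5.0)
```

The criterion follows from detailed balance on the joint Boltzmann distribution of the two replicas, which is why a swap that moves the lower-energy configuration to the colder replica is always accepted.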

Warwick Research Archives Portal Repository

Predicción de estructuras de proteínas basada en vecinos más cercanos (Protein structure prediction based on nearest neighbors)

Author: Asencio Cortés Gualberto
Publication date: 01/01/2013

Doctoral Programme in Biotechnology and Chemical Technology. Proteins are the biomolecules with the greatest structural diversity, and they perform a multitude of important functions in all living organisms. However, anomalies that arise during protein formation cause or facilitate the development of serious diseases such as cancer or Alzheimer's, which makes it vitally important to design drugs that can prevent their disastrous consequences. Such drug design requires structural models of proteins whose sequence is known but whose structure, in most cases, is still unknown. Predicting the structure of a protein from its amino acid sequence is therefore key to curing this type of disease. This thesis analyzes in depth the state of the art in predicting the tertiary and quaternary structure of a protein, presenting various aspects and viewpoints of the most current and relevant methods in the literature. In addition, it proposes a new method for predicting distance maps that represent protein structures, using a nearest-neighbor scheme that takes physico-chemical properties of amino acids as input. Exhaustive experiments were carried out, and the results were analyzed from several points of view, highlighting various aspects of interest. Finally, the proposed methodology was applied to two groups of proteins of biological interest, viral proteins and mitochondrial proteins, with very promising results in both cases. Universidad Pablo de Olavide. Centro de Estudios de Postgrad
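The nearest-neighbor scheme mentioned in this abstract can be illustrated with a generic k-nearest-neighbor regressor that predicts an inter-residue distance from a feature vector. This is only a sketch of the general technique: the two-dimensional feature vectors, the Euclidean metric, the averaging step, and the training values are all hypothetical and are not taken from the thesis.

```python
import math

def predict_distance(query, training, k=1):
    """Average the observed distances of the k training examples nearest to the query.

    query: feature vector for a residue pair (tuple of floats).
    training: list of (feature_vector, observed_distance_in_angstroms) tuples.
    """
    if not training:
        raise ValueError("training set must not be empty")
    nearest = sorted(training, key=lambda example: math.dist(example[0], query))[:k]
    return sum(distance for _, distance in nearest) / len(nearest)

# Hypothetical training set: features could be, for example, one property value
# per residue of the pair; the distances are invented for the illustration.
TRAINING = [((4.5, 3.8), 5.2), ((-3.9, -3.5), 14.0), ((1.8, -0.4), 9.1)]

one_neighbor = predict_distance((4.2, 4.5), TRAINING, k=1)
two_neighbors = predict_distance((4.2, 4.5), TRAINING, k=2)
```

Applying such a predictor to every residue pair of a sequence would yield a full distance map, which is the kind of output the abstract describes.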

Repositorio Institucional Olavide

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Proceedings, MSVSCC 2014

Author: Old Dominion University Department of Modeling, Simulation & Visualization Engineering
Old Dominion University Virginia Modeling, Analysis & Simulation Center
Publication venue: ODU Digital Commons
Publication date: 11/04/2013

Proceedings of the 8th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 17, 2014 at VMASC in Suffolk, Virginia

Old Dominion University

Deep Model for Improved Operator Function State Assessment

Author: Li Feng
Li Jiang
Schnell Tom
Wen Jonathan
Xu Roger
Zhang Guangfan
Publication venue: ODU Digital Commons
Publication date: 01/01/2014

A deep learning framework is presented for engagement assessment using EEG signals. Deep learning is a recently developed machine learning technique that has been applied in many domains. In this paper, we propose a deep learning strategy for operator function state (OFS) assessment. Fifteen pilots participated in a flight simulation from Seattle to Chicago. During the four-hour simulation, EEG signals were recorded for each pilot. We labeled 20-minute segments of data as engaged or disengaged to fine-tune the deep network, and used the remaining large amount of unlabeled data to initialize the network. The trained deep network was then used to assess whether a pilot was engaged during the four-hour simulation.

Old Dominion University

A Features Analysis Tool For Assessing And Improving Computational Models In Structural Biology

Author: O'Meara Matthew James
Publication venue: University of North Carolina at Chapel Hill
Publication date: 01/01/2013

The protein-folding problem is to predict, from a protein's amino acid sequence, its folded 3D conformation. State-of-the-art computational models are complex, collaboratively maintained prediction software. Like other complex software, they become brittle without support for testing and refactoring. Features analysis, a language of 'scientific unit testing', is the visual and quantitative comparison of distributions of features (local geometric measures) sampled from ensembles of native and predicted conformations. To support features analysis, I develop a features analysis tool: a modular database framework for extracting and managing sampled feature instances, and an exploratory data analysis framework for rapidly comparing feature distributions. In supporting features analysis, the tool supports the creation, tuning, and assessment of computational models, improving protein prediction and design. I demonstrate the features analysis tool through six case studies with the Rosetta molecular modeling suite. The first three demonstrate the mechanics of using the tool by constructing and checking models. The first evaluates bond angle restraint models when used with the Backrub local sampling heuristic. The second identifies and resolves energy function derivative discontinuities that frustrate gradient-based minimization. The third constructs a model for disulfide bonds. The last three demonstrate using the tool to evaluate and improve how models represent molecular structure. I focus on modeling H-bonds because their geometric specificity and environmental dependence lead to complex feature distributions. The fourth case study develops a novel functional form for Sp2 acceptor H-bonds. The fifth fits parameters for a refined H-bond model. The sixth combines the refined model with an electrostatics model and harmonizes them with the rest of the energy function.
Next, to facilitate assessing model improvements, I develop recovery tests that measure predictive accuracy by asking models to recover native conformations that have been partially randomized. Finally, to demonstrate that the features analysis and recovery test tools support improving protein prediction and design, I evaluated the refined H-bond model and electrostatics model with additional corrections from the Rosetta community. Based on positive results, I recommend a new standard energy function, which has been accepted by the Rosetta community as the largest systematic improvement in nearly a decade. Doctor of Philosoph

Carolina Digital Repository