2,607 research outputs found
Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized
Understanding protein structure is of crucial importance in science, medicine
and biotechnology. For about two decades, knowledge based potentials based on
pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been
center stage in the prediction and design of protein structure and the
simulation of protein folding. However, the validity, scope and limitations of
these potentials are still vigorously debated and disputed, and the optimal
choice of the reference state -- a necessary component of these potentials --
is an unsolved problem. PMFs are loosely justified by analogy to the reversible
work theorem in statistical physics, or by a statistical argument based on a
likelihood function. Both justifications are insightful but leave many
questions unanswered. Here, we show for the first time that PMFs can be seen as
approximations to quantities that do have a rigorous probabilistic
justification: they naturally arise when probability distributions over
different features of proteins need to be combined. We call these quantities
reference ratio distributions deriving from the application of the reference
ratio method. This new view is not only of theoretical relevance, but leads to
many insights that are of direct practical use: the reference state is uniquely
defined and does not require external physical insights; the approach can be
generalized beyond pairwise distances to arbitrary features of protein
structure; and it becomes clear for which purposes the use of these quantities
is justified. We illustrate these insights with two applications, involving the
radius of gyration and hydrogen bonding. In the latter case, we also show how
the reference ratio method can be iteratively applied to sculpt an energy
funnel. Our results considerably increase the understanding and scope of energy
functions derived from known biomolecular structures.
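The abstract above does not spell out the reference ratio formula, so the following is only a minimal sketch under an assumed form: a fine-grained distribution q(x) is reweighted by p(y)/q(y), where y = f(x) is a coarse feature and q(y) is the marginal that q(x) itself implies for that feature. The discrete toy state space, the parity feature, and the function names `marginal` and `reference_ratio` are all hypothetical illustrations, not the authors' implementation.

```python
from collections import defaultdict


def marginal(p_x, feature):
    """Distribution of feature(x) when x is drawn from p_x."""
    p_y = defaultdict(float)
    for x, p in p_x.items():
        p_y[feature(x)] += p
    return dict(p_y)


def reference_ratio(q_x, feature, p_y):
    """Reweight q_x so that its feature marginal becomes p_y.

    Assumed form: p(x) = q(x) * p_y(f(x)) / q_y(f(x)), where the
    reference q_y is computed from q_x rather than supplied externally.
    """
    q_y = marginal(q_x, feature)
    return {
        x: q * p_y.get(feature(x), 0.0) / q_y[feature(x)]
        for x, q in q_x.items()
        if q > 0.0
    }


def parity(x):
    """Toy coarse feature: 0 for even states, 1 for odd states."""
    return x % 2


# Toy example: six equally likely states, and a target that strongly
# prefers even states.
uniform = {x: 1.0 / 6.0 for x in range(6)}
target_feature = {0: 0.9, 1: 0.1}
combined = reference_ratio(uniform, parity, target_feature)
```

In this toy setting the feature marginal of `combined` matches `target_feature`, while the relative weights of states sharing the same feature value are inherited from `uniform`. The reference distribution is computed from the fine-grained model itself, which is one way to read the abstract's claim that the reference state is uniquely defined.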
Collective estimation of multiple bivariate density functions with application to angular-sampling-based protein loop modeling
This article develops a method for simultaneous estimation of density functions for a collection of populations of protein backbone angle pairs using a data-driven, shared basis that is constructed by bivariate spline functions defined on a triangulation of the bivariate domain. The circular nature of angular data is taken into account by imposing appropriate smoothness constraints across boundaries of the triangles. Maximum penalized likelihood is used to fit the model and an alternating blockwise Newton-type algorithm is developed for computation. A simulation study shows that the collective estimation approach is statistically more efficient than estimating the densities individually. The proposed method was used to estimate neighbor-dependent distributions of protein backbone dihedral angles (i.e., Ramachandran distributions). The estimated distributions were applied to protein loop modeling, one of the most challenging open problems in protein structure prediction, by feeding them into an angular-sampling-based loop structure prediction framework. Our estimated distributions compared favorably to the Ramachandran distributions estimated by fitting a hierarchical Dirichlet process model. In particular, our distributions showed significant improvements on the hard cases where existing methods do not work well.
Protein Structure Determination Using Chemical Shifts
In this PhD thesis, a novel method to determine protein structures using
chemical shifts is presented.
Comment: Univ Copenhagen PhD thesis (2014) in Biochemistry
Monte Carlo Protein Folding: Simulations of Met-Enkephalin with Solvent-Accessible Area Parameterizations
Realistically treating the ambient water is one of the main difficulties in
applying Monte Carlo methods to protein folding. The solvent-accessible area
method, a popular method for treating water implicitly, is investigated by
means of Metropolis simulations of the brain peptide Met-Enkephalin. For the
phenomenological energy function ECEPP/2, nine atomic solvation parameter (ASP)
sets are studied that had been proposed by previous authors. The simulations
are compared with each other, with simulations with a distance-dependent
electrostatic permittivity, and with vacuum simulations.
Parallel tempering and a recently proposed biased Metropolis
technique are employed and their performances are evaluated. The measured
observables include energy and dihedral probability densities (pds), integrated
autocorrelation times, and acceptance rates. Two of the ASP sets turn out to be
unsuitable for these simulations. For all other sets, selected configurations
are minimized in search of the global energy minima. Unique minima are found
for the vacuum and the distance-dependent permittivity system, but for none of the ASP models.
Other observables show a remarkable dependence on the ASPs. In particular,
autocorrelation times vary dramatically with the ASP parameters. Three ASP sets
have much smaller autocorrelations at 300 K than the vacuum simulations,
opening the possibility that simulations can be sped up vastly by
judiciously choosing details of the force field.
Comment: 10 pages; published in "NIC Symposium 2004", eds. D. Wolf et al.
(NIC, Juelich, 2004)
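As a rough illustration of the two sampling ingredients named in this abstract, the sketch below runs Metropolis updates at several inverse temperatures and attempts parallel tempering swaps between neighbouring replicas, while recording acceptance rates. It is a toy under stated assumptions: the one-dimensional double-well `energy` function stands in for a real force field such as ECEPP/2, the step size and temperature ladder are arbitrary, and the biased Metropolis technique mentioned in the abstract is not implemented.

```python
import math
import random


def energy(x):
    """Hypothetical toy double-well potential (not ECEPP/2)."""
    return (x * x - 1.0) ** 2


def metropolis_step(x, beta, rng, step=0.5):
    """One Metropolis update at inverse temperature beta."""
    x_new = x + rng.uniform(-step, step)
    delta = energy(x_new) - energy(x)
    if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
        return x_new, True
    return x, False


def parallel_tempering(betas, n_sweeps, seed=0):
    """Run one replica per beta and try one neighbour swap per sweep.

    Requires at least two inverse temperatures in betas.
    """
    rng = random.Random(seed)
    xs = [1.0 for _ in betas]
    accepted = [0] * len(betas)
    swaps = 0
    for _ in range(n_sweeps):
        # Local Metropolis update for every replica.
        for i, beta in enumerate(betas):
            xs[i], ok = metropolis_step(xs[i], beta, rng)
            accepted[i] += ok
        # Swap attempt between a random pair of neighbouring replicas,
        # accepted with probability min(1, exp((b_i - b_j) * (E_i - E_j))).
        i = rng.randrange(len(betas) - 1)
        log_ratio = (betas[i] - betas[i + 1]) * (energy(xs[i]) - energy(xs[i + 1]))
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
            swaps += 1
    move_rates = [a / n_sweeps for a in accepted]
    return xs, move_rates, swaps / n_sweeps


# Example run with an arbitrary ladder of four inverse temperatures.
final_xs, move_rates, swap_rate = parallel_tempering([0.5, 1.0, 2.0, 4.0], 2000)
```

The returned `move_rates` and `swap_rate` correspond loosely to the acceptance rates listed among the measured observables; integrated autocorrelation times would need the stored time series, which this sketch does not keep.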
Skewed Factor Models Using Selection Mechanisms
Traditional factor models explicitly or implicitly assume that the factors follow a multivariate normal distribution; that is, only moments up to order two are involved. However, it may happen in real data problems that the first two moments cannot explain the factors. Based on this motivation, here we devise three new skewed factor models, the skew-normal, the skew-t, and the generalized skew-normal factor models depending on a selection mechanism on the factors. The ECME algorithms are adopted to estimate related parameters for statistical inference. Monte Carlo simulations validate our new models and we demonstrate the need for skewed factor models using the classic open/closed book exam scores dataset.
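The abstract does not describe its selection mechanism in detail. As a hedged illustration of the general idea, the sketch below uses the standard selection representation of a univariate skew-normal variable: draw a correlated standard normal pair (u0, u1) with correlation delta and keep u1 only when u0 is positive. The function name `sample_skew_normal` and the univariate setting are assumptions made for illustration; the paper's factor models and its ECME estimation are not reproduced here.

```python
import math
import random


def sample_skew_normal(delta, n, seed=0):
    """Draw n skew-normal samples by selection; requires |delta| < 1.

    (u0, u1) is standard bivariate normal with correlation delta, and
    u1 is kept only when the latent variable u0 is positive. The result
    has shape parameter delta / sqrt(1 - delta**2).
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - delta * delta)
    samples = []
    while len(samples) < n:
        u0 = rng.gauss(0.0, 1.0)
        u1 = delta * u0 + scale * rng.gauss(0.0, 1.0)
        if u0 > 0.0:  # selection step: discard draws with u0 <= 0
            samples.append(u1)
    return samples


# Example: a strongly right-skewed sample.
draws = sample_skew_normal(0.8, 20000)
sample_mean = sum(draws) / len(draws)
```

Under this representation the theoretical mean is delta * sqrt(2 / pi), so a nonzero `sample_mean` is a quick sign that the selection step has introduced skewness that a purely normal model would miss.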