Quantum mechanical calculation of the effects of stiff and rigid constraints in the conformational equilibrium of the Alanine dipeptide
If constraints are imposed on a macromolecule, two inequivalent classical
models may be used: the stiff and the rigid one. This work studies the effects
of such constraints on the Conformational Equilibrium Distribution (CED) of the
model dipeptide HCO-L-Ala-NH2 without any simplifying assumption. We use ab
initio Quantum Mechanics calculations including electron correlation at the MP2
level to describe the system, and we measure the conformational dependence of
all the correcting terms to the naive CED based on the Potential Energy Surface
(PES) that appear when the constraints are considered. These terms are related
to mass-metric tensor determinants and also occur in Fixman's compensating
potential. We show that some of the corrections are non-negligible if one is
interested in the whole Ramachandran space. On the other hand, if only the
energetically lower region, containing the principal secondary structure
elements, is assumed to be relevant, then all correcting terms may be
neglected up to peptides of considerable length. This is the first time, as far
as we know, that the analysis of the conformational dependence of these
correcting terms is performed in a relevant biomolecule with a realistic
potential energy function.
Comment: 37 pages, 4 figures, LaTeX, BibTeX, AMSTe
Explicit factorization of external coordinates in constrained Statistical Mechanics models
If a macromolecule is described by curvilinear coordinates or rigid
constraints are imposed, the equilibrium probability density that must be
sampled in Monte Carlo simulations includes the determinants of different
mass-metric tensors. In this work, we explicitly write the determinant of the
mass-metric tensor G and of the reduced mass-metric tensor g, for any molecule,
general internal coordinates and arbitrary constraints, as a product of two
functions; one depending only on the external coordinates that describe the
overall translation and rotation of the system, and the other only on the
internal coordinates. This work extends previous results in the literature,
proving with full generality that one may integrate out the external
coordinates and perform Monte Carlo simulations in the internal conformational
space of macromolecules. In addition, we give a general mathematical argument
showing that the factorization is a consequence of the symmetries of the metric
tensors involved. Finally, the determinant of the mass-metric tensor G is
computed explicitly in a set of curvilinear coordinates especially well suited
for general branched molecules.
Comment: 22 pages, 2 figures, LaTeX, AMSTeX. v2: Introduction slightly
extended. Version in arXiv is slightly larger than the published on
Efficient model chemistries for peptides. I. Split-valence Gaussian basis sets and the heterolevel approximation in RHF and MP2
We present an exhaustive study of more than 250 ab initio potential energy
surfaces (PESs) of the model dipeptide HCO-L-Ala-NH2. The model chemistries
(MCs) used are constructed as homo- and heterolevels involving possibly
different RHF and MP2 calculations for the geometry and the energy. The basis
sets used belong to a sample of 39 selected representatives from Pople's
split-valence families, ranging from the small 3-21G to the large
6-311++G(2df,2pd). The reference PES to which the rest are compared is the
MP2/6-311++G(2df,2pd) homolevel, which, as far as we are aware, is the most
accurate PES of a dipeptide in the literature. The aim of the study presented
is twofold: On the one hand, the evaluation of the influence of polarization
and diffuse functions in the basis set, distinguishing between those placed at
1st-row atoms and those placed at hydrogens, as well as the effect of different
contraction and valence splitting schemes. On the other hand, the investigation
of the heterolevel assumption, which is defined here to be that which states
that heterolevel MCs are more efficient than homolevel MCs. The heterolevel
approximation is very commonly used in the literature, but it is seldom
checked. As far as we know, the only tests for peptides or related systems
have been performed using a small number of conformers, and this is the first
time that this potentially very economical approximation is tested in full
PESs. In order to achieve these goals, all data sets have been compared and
analyzed in a way which captures the nearness concept in the space of MCs.
Comment: 54 pages, 16 figures, LaTeX, AMSTeX, Submitted to J. Comp. Che
Introduction to protein folding for physicists
The prediction of the three-dimensional native structure of proteins from the
knowledge of their amino acid sequence, known as the protein folding problem,
is one of the most important yet unsolved issues of modern science. Since the
conformational behaviour of flexible molecules is nothing more than a complex
physical problem, more and more physicists are moving into the study of
protein systems, bringing with them powerful mathematical and computational
tools, as well as the sharp intuition and deep images inherent to the physics
discipline. This work attempts to facilitate the first steps of such a
transition. In order to achieve this goal, we provide an exhaustive account of
the reasons underlying the protein folding problem's enormous relevance and
summarize the present-day status of the methods aimed at solving it. We also
provide an introduction to the particular structure of these biological
heteropolymers, and we physically define the problem stating the assumptions
behind this (commonly implicit) definition. Finally, we review the 'special
flavor' of statistical mechanics that is typically used to study the
astronomically large phase spaces of macromolecules. Throughout the whole work,
much material that is found scattered in the literature has been put together
here to improve comprehension and to serve as a handy reference.
Comment: 53 pages, 18 figures, the figures are at a low resolution due to
arXiv restrictions, for high-res figures, go to http://www.pabloechenique.co
ORGANIZING FORCES AND CONFORMATIONAL ACCESSIBILITY IN THE UNFOLDED STATE OF PROTEINS
For over fifty years, the unfolded state of proteins had been thought to be
featureless and random. Experiments by Tanford and Flory confirmed that unfolded
proteins possessed the same dimensions as those predicted of a random flight chain in
good solvent. In the late eighties and early nineties, however, researchers began to notice
structural trends in unfolded proteins. Some experiments showed that the unfolded state
was very similar to the native state, while others indicated a conformational preference
for the polyproline II helix in unfolded proteins. As a result, a paradox developed. How
can unfolded proteins be both random and nonrandom at the same time?
Current experiments and most theoretical simulations cannot characterize the
unfolded state in high detail, so we have used the simplified hard sphere model of
Richards to address this question. By modeling proteins as hard spheres, we can not only
determine what interactions are important in the unfolded state of proteins, but we can
address the paradox directly by investigating whether nonrandom behavior is in conflict
with random coil statistics.
Our simulations identify hundreds of disfavored conformations in short peptides,
each of which proves that unfolded proteins are not at all random. Some interactions are
important for the folded state of proteins as well. For example, we find that an α-helix
cannot be followed directly by a β-strand because of steric considerations. The
interactions outlined here limit the conformational possibilities of an unfolded protein far
beyond what would be expected for a random coil. For a 100-residue protein, we find
that approximately 9 orders of magnitude of conformational freedom are lost because of
local chain organization alone. Furthermore, we show that the existence of this
organization is compatible with random coil statistics.
Although our simulations cannot settle the controversy surrounding the unfolded
state, we can conclude that new methods of characterizing the unfolded state are needed.
Since unfolded proteins are not random coils, the methods developed for describing
random coils cannot adequately describe the complexities of this diverse structural
ensemble.
An exact expression to calculate the derivatives of position-dependent observables in molecular simulations with flexible constraints
In this work, we introduce an algorithm to compute the derivatives of
physical observables along the constrained subspace when flexible constraints
are imposed on the system (i.e., constraints in which the hard coordinates are
fixed to configuration-dependent values). The presented scheme is exact, does
not contain any tunable parameter, and only requires the calculation
and inversion of a sub-block of the Hessian matrix of second derivatives of the
function through which the constraints are defined. We also present a practical
application to the case in which the sought observables are the Euclidean
coordinates of complex molecular systems, and the function whose minimization
defines the constraints is the potential energy. Finally, and in order to
validate the method, which, as far as we are aware, is the first of its kind in
the literature, we compare it to the natural and straightforward
finite-differences approach in three molecules of biological relevance:
methanol, N-methyl-acetamide and a tri-glycine peptide.
Comment: 13 pages, 8 figures, published versio
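As a rough illustration of the idea in the abstract above (not the authors' algorithm), consider a toy system with one soft coordinate q and one hard coordinate d, where the flexible constraint fixes d to the value d*(q) that minimizes a potential V(q, d). Differentiating the stationarity condition dV/dd = 0 with respect to q gives dd*/dq = -(d2V/dd2)^(-1) d2V/(dd dq), which only needs a sub-block of the Hessian. The sketch below compares that expression with a central finite difference. The potential, the function names and the step size are all assumptions made for this example.

```python
# Toy potential (an assumption for illustration only):
#   V(q, d) = 0.5*(d - 1)**2 + 0.25*q*d**4
# q is the soft coordinate, d the hard coordinate.

def grad_d(q, d):
    """First derivative dV/dd."""
    return (d - 1.0) + q * d**3

def hess_dd(q, d):
    """Hard-hard Hessian block d2V/dd2."""
    return 1.0 + 3.0 * q * d**2

def hess_dq(q, d):
    """Hard-soft Hessian block d2V/(dd dq)."""
    return d**3

def constrained_d(q, d0=1.0, iters=50):
    """Flexible constraint: the d that minimizes V at fixed q (Newton iteration)."""
    d = d0
    for _ in range(iters):
        d -= grad_d(q, d) / hess_dd(q, d)
    return d

def exact_derivative(q):
    """dd*/dq from the implicit-function expression (Hessian sub-blocks only)."""
    d = constrained_d(q)
    return -hess_dq(q, d) / hess_dd(q, d)

def finite_difference_derivative(q, h=1e-5):
    """Central finite difference; needs two extra constrained minimizations."""
    return (constrained_d(q + h) - constrained_d(q - h)) / (2.0 * h)

q = 0.5
exact = exact_derivative(q)
numeric = finite_difference_derivative(q)
print(exact, numeric)
```

In this toy case the Hessian sub-block is a single number, so its inversion is a division. With several hard coordinates it would become a small linear solve, while the finite-difference route would need repeated minimizations and a choice of step size.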
Computational studies of protein-ligand molecular recognition
Structure-based drug design is made possible by our understanding of molecular recognition.
The utility of this approach was apparent in the development of the clinically effective HIV-1
PR inhibitors, where crystal structures of complexes of HIV-1 protease and inhibitors gave
pivotal information. Computational methods drawing upon structural data are of increasing
relevance to the drug design process. Nonetheless, these methods are quite rudimentary and
significant improvements are needed. The aim of this thesis was to investigate techniques
which may lead to improved modelling of molecular recognition and a better ability to make
predictions about the binding affinity of ligands. The two main themes were the modelling
of acid-base titration behaviour of ligand and receptor, and the application of the simulation
technique of configurational bias Monte Carlo (CBMC). The studies were performed with
HIV-1 PR and its inhibitors as a model system.
Biological processes are influenced by the pH of the medium in which they take place.
Ligand-receptor binding equilibria are often thermodynamically linked to protonation changes
in ligand and/or receptor, as seen in the binding of a number of HIV-1 PR inhibitors.
In Chapter 2, a series of sixteen continuum electrostatics pKa calculations of HIV-1 PR
inhibitor complexes was done, in order to characterize the nature and size of these linkages.
The most important effects concern changes in the pKa of the enzyme active site aspartate
dyad. Large pKa shifts were predicted in all cases, and at least one of the two dyad pKas
became more basic on binding. At physiologically relevant pH, different ligands induced
different protonation states, with different tautomeric forms favoured. The fully deprotonated
form of the dyad was not significantly populated for any of the complexes. For about a third
of the complexes, both singly and doubly protonated forms were predicted to be populated.
The predicted predominant protonation states of MVT-101 and VX-478 were consistent with
previous theoretical studies. The size of the predicted pKa shifts for MVT-101 and XK263
differed from a previous study using similar methods. The paucity and ambiguity of available
experimental data makes it difficult to evaluate the results fully; however, the tendency to
exaggerate shifts, as observed in other studies, appears to be present.
Scoring is the prediction of binding affinity from the structure of the ligand-receptor
complex, according to an empirical scheme. Scoring studies usually neglect or grossly simplify
the contribution of protonation equilibria to affinity, so in Chapter 3 proton linkage data was
included in a regression analysis of the HIV-1 PR complexes from Chapter 2. Parameters
previously shown to correlate with binding, namely electrostatic free energy changes and
buried surface areas, were the basis for the analysis, and terms describing proton linkage,
in the form of a correction for assay pH and an indicator variable for predicted dyad pKa
shift on binding, were also considered. The complex with MVT-101 was an outlier in the
analysis and was excluded. Further analysis demonstrated that the correction for assay pH
made a significant contribution to the regression equation. Amendment of the parameters
for XK263 according to the available experimental data led to an improved regression in
which the term for calculated pKa shifts also made a significant contribution. The regression
equations obtained had the same form and similar coefficients to scoring functions of the
master equation type, and fit the experimental data with comparable accuracy.
More physically realistic simulations of ligand-receptor binding using the techniques of
molecular dynamics (MD) or Monte Carlo (MC) are potentially more accurate than scoring
function approaches. These methods are slow, so the alternative of CBMC, which has been
shown to give faster convergence for polymer simulations, was implemented for CHARMM 22,
an all-atom protein force field (Chapter 4). The correctness of the implementation was
demonstrated by comparison with exact and stochastic dynamics (SD) results for individual
terms in the force field. The algorithm is more complex than those typically used with alkane
force fields, and this has possible consequences for the efficiency. CBMC was used to generate
a Ramachandran plot for the alanine dipeptide, and the results were found to be in agreement
with those generated by an SD simulation. Analysis of statistical errors suggests that CBMC
should be competitive with umbrella sampling for simulating conformational equilibria,
particularly when the cost of non-bonded energy evaluations dominates the simulation.
CBMC can be applied to ligand-receptor binding, as demonstrated in grand canonical
simulations of alkane adsorption in zeolites. The more limited problem of finding the
predominant bound conformation of a flexible ligand given a rigid protein receptor (i.e.
docking) was treated in Chapter 5, using the example of a tripeptide inhibitor which binds to
HIV-1 PR. Attempts to perform the docking using the Metropolis MC/simulated annealing
and Lamarckian genetic algorithm methods implemented in the program AutoDock failed
to reproduce the native configuration (with runs on the order of two days execution time).
Docking using CBMC, combined with parallel tempering to further improve sampling, was
successful in finding the native binding mode, although this success was dependent on ad hoc
adjustments to the force field, and a priori knowledge of the ligand protonation state and
binding site. The efficiency of the method was considerably lower than hoped, with problems
due to the force field- and model-dependent coupling between terms in the potential energy
function, and the greedy nature of the CBMC algorithm.
Various conclusions can be drawn from these studies. Chapters 2 and 3 provide evidence
of the importance of protonation equilibria in ligand-protein molecular recognition, and
underline the sizable contribution of electrostatic interactions to binding energies. In the face of
this finding, neglect of electrostatic terms, as often seen in past studies, appears to be
counterproductive. The scoring study also shows how experimental data can be used more
effectively if factors such as assay conditions are carefully taken into account. Implementation
of CBMC for a widely-used protein force field and application of the algorithm to docking
(Chapters 4 and 5) represents a proof of concept for a broadly useful simulation technique.
Further work will be required to find the right niche for CBMC and fully explore the potential
of this and related techniques. A final point is the demonstrated utility of the HIV-1 PR test
system which formed the focus of the studies. Abundant structural data has enabled many new
approaches to be tested, and further insights are expected from the analysis of unusual cases,
such as the anomalous results for MVT-101. As well as the question of scoring, studies of
mutation and resistance are likely to attract considerable interest in the future.
Exact and efficient calculation of Lagrange multipliers in constrained biological polymers: Proteins and nucleic acids as example cases
In order to accelerate molecular dynamics simulations, it is very common to
impose holonomic constraints on their hardest degrees of freedom. In this way,
the time step used to integrate the equations of motion can be increased, thus
allowing, in principle, longer total simulation times to be reached. The imposition
of such constraints results in an additional set of Nc equations (the equations
of constraint) and unknowns (their associated Lagrange multipliers), that must
be solved in one way or another at each time step of the dynamics. In this work
it is shown that, due to the essentially linear structure of typical biological
polymers, such as nucleic acids or proteins, the algebraic equations that need
to be solved involve a matrix which is banded if the constraints are indexed in
a clever way. This allows the Lagrange multipliers to be obtained through a
non-iterative procedure, which can be considered exact up to machine precision,
and which takes O(Nc) operations, instead of the usual O(Nc^3) for generic
molecular systems. We develop the formalism, and describe the appropriate
indexing for a number of model molecules and also for alkanes, proteins and
DNA. Finally, we provide a numerical example of the technique in a series of
polyalanine peptides of different lengths using the AMBER molecular dynamics
package.
Comment: 29 pages, 10 figures, 1 tabl
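To illustrate why a banded structure matters, here is a minimal sketch of a non-iterative O(N) solve for the simplest banded case, a tridiagonal system (the Thomas algorithm). It is not the paper's formalism, and the matrix below is an arbitrary diagonally dominant example rather than one derived from real molecular constraints. In the setting of the abstract, the unknowns would play the role of the Lagrange multipliers.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve A x = rhs for a tridiagonal A in O(n) operations (Thomas algorithm).

    lower[i] is A[i][i-1] (lower[0] is unused), diag[i] is A[i][i], and
    upper[i] is A[i][i+1] (upper[-1] is unused). No pivoting is performed,
    so this sketch assumes a well-behaved (e.g. diagonally dominant) matrix.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: one pass over the rows.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: one pass in reverse.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def tridiagonal_matvec(lower, diag, upper, x):
    """Multiply the tridiagonal matrix by x (used only to check the solution)."""
    n = len(diag)
    y = []
    for i in range(n):
        s = diag[i] * x[i]
        if i > 0:
            s += lower[i] * x[i - 1]
        if i < n - 1:
            s += upper[i] * x[i + 1]
        y.append(s)
    return y

# Hypothetical 5x5 system standing in for the constraint equations.
n = 5
lower = [0.0] + [-1.0] * (n - 1)
diag = [4.0] * n
upper = [-1.0] * (n - 1) + [0.0]
rhs = [1.0, 2.0, 3.0, 4.0, 5.0]

multipliers = solve_tridiagonal(lower, diag, upper, rhs)
residual = max(abs(a - b) for a, b in
               zip(tridiagonal_matvec(lower, diag, upper, multipliers), rhs))
print(multipliers, residual)
```

Both passes visit each row once, so the cost grows linearly with the number of unknowns, whereas a dense solver that ignores the band structure scales cubically. Wider bands follow the same pattern at a cost proportional to the bandwidth squared per row.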