120 research outputs found
Kinematic Flexibility Analysis: Hydrogen Bonding Patterns Impart a Spatial Hierarchy of Protein Motion
Elastic network models (ENM) and constraint-based, topological rigidity
analysis are two distinct, coarse-grained approaches to study conformational
flexibility of macromolecules. In the two decades since their introduction,
both have contributed significantly to insights into protein molecular
mechanisms and function. However, despite a shared purpose of these approaches,
the topological nature of rigidity analysis, and thereby the absence of motion
modes, has impeded a direct comparison. Here, we present an alternative,
kinematic approach to rigidity analysis, which circumvents these drawbacks. We
introduce a novel protein hydrogen bond network spectral decomposition, which
provides an orthonormal basis for collective motions modulated by non-covalent
interactions, analogous to the eigenspectrum of normal modes, and decomposes
proteins into rigid clusters identical to those from topological rigidity. Our
kinematic flexibility analysis bridges topological rigidity theory and ENM, and
enables a detailed analysis of motion modes obtained from both approaches. Our
analysis reveals that collectivity of protein motions, reported by the Shannon
entropy, is significantly lower for rigidity theory versus normal mode
approaches. Strikingly, kinematic flexibility analysis suggests that the
hydrogen bonding network encodes a protein-fold specific, spatial hierarchy of
motions, which goes nearly undetected in ENM. This hierarchy reveals distinct
motion regimes that rationalize protein stiffness changes observed from
experiment and molecular dynamics simulations. A formal expression for changes
in free energy derived from the spectral decomposition indicates that motions
across nearly 40% of modes obey enthalpy-entropy compensation. Taken together,
our analysis suggests that hydrogen bond networks have evolved to modulate
protein structure and dynamics
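The abstract above reports the collectivity of motion modes through the Shannon entropy but does not give the formula. The sketch below assumes one commonly used definition: the exponential of the Shannon entropy of the normalized squared atomic displacements, divided by the number of atoms. The function name `collectivity` and the input layout (one displacement vector per atom) are illustrative assumptions, not the authors' implementation.

```python
import math


def collectivity(displacements):
    """Collectivity of a single motion mode (assumed definition, see lead-in).

    displacements: one displacement vector (sequence of floats) per atom.
    Returns a value in (0, 1]: 1/N when a single atom moves, close to 1 when
    all N atoms move by the same amount.
    """
    # Squared displacement magnitude of each atom.
    squared = [sum(c * c for c in vec) for vec in displacements]
    total = sum(squared)
    if total == 0:
        raise ValueError("mode has no displacement")
    # Normalize so the per-atom weights form a probability distribution.
    weights = [s / total for s in squared]
    # Shannon entropy; atoms with zero weight contribute nothing.
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return math.exp(entropy) / len(weights)


if __name__ == "__main__":
    uniform = [(1.0, 0.0, 0.0)] * 4                # every atom moves equally
    localized = [(1.0, 0.0, 0.0)] + [(0.0, 0.0, 0.0)] * 3  # one atom moves
    print(collectivity(uniform), collectivity(localized))
```

Under this definition a mode confined to a few atoms scores low and a global mode scores near 1, which is the sense in which the abstract compares rigidity-theory modes with normal modes.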
Characterizing RNA ensembles from NMR data with kinematic models
Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem-loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention
Atomic resolution experimental phase information reveals extensive disorder and bound 2-methyl-2,4-pentanediol in Ca2+-calmodulin
Calmodulin (CaM) is the primary calcium signaling protein in eukaryotes and has been extensively studied using various biophysical techniques. Prior crystal structures have noted the presence of ambiguous electron density in both hydrophobic binding pockets of Ca2+-CaM, but no assignment of these features has been made. In addition, Ca2+-CaM samples many conformational substates in the crystal and accurately modeling the full range of this functionally important disorder is challenging. In order to characterize these features in a minimally biased manner, a 1.0 Å resolution single-wavelength anomalous diffraction data set was measured for selenomethionine-substituted Ca2+-CaM. Density-modified electron-density maps enabled the accurate assignment of Ca2+-CaM main-chain and side-chain disorder. These experimental maps also substantiate complex disorder models that were automatically built using low-contour features of model-phased electron density. Furthermore, experimental electron-density maps reveal that 2-methyl-2,4-pentanediol (MPD) is present in the C-terminal domain, mediates a lattice contact between N-terminal domains and may occupy the N-terminal binding pocket. The majority of the crystal structures of target-free Ca2+-CaM have been derived from crystals grown using MPD as a precipitant, and thus MPD is likely to be bound in functionally critical regions of Ca2+-CaM in most of these structures. The adventitious binding of MPD helps to explain differences between the Ca2+-CaM crystal and solution structures and is likely to favor more open conformations of the EF-hands in the crystal
An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries
Virtual, make-on-demand chemical libraries have transformed early-stage drug
discovery by unlocking vast, synthetically accessible regions of chemical
space. Recent years have witnessed rapid growth in these libraries from
millions to trillions of compounds, hiding undiscovered, potent hits for a
variety of therapeutic targets. However, they are quickly approaching a size
beyond that which permits explicit enumeration, presenting new challenges for
virtual screening. To overcome these challenges, we propose the Combinatorial
Synthesis Library Variational Auto-Encoder (CSLVAE). The proposed generative
model represents such libraries as a differentiable, hierarchically-organized
database. Given a compound from the library, the molecular encoder constructs a
query for retrieval, which is utilized by the molecular decoder to reconstruct
the compound by first decoding its chemical reaction and subsequently decoding
its reactants. Our design minimizes autoregression in the decoder, facilitating
the generation of large, valid molecular graphs. Our method performs fast and
parallel batch inference for ultra-large synthesis libraries, enabling a number
of important applications in early-stage drug discovery. Compounds proposed by
our method are guaranteed to be in the library, and thus synthetically and
cost-effectively accessible. Importantly, CSLVAE can encode out-of-library
compounds and search for in-library analogues. In experiments, we demonstrate
the capabilities of the proposed method in the navigation of massive
combinatorial synthesis libraries. Comment: 36th Conference on Neural Information
Processing Systems (NeurIPS 2022)
Distributed structure determination at the JCSG
The software suite Xsolve semi-exhaustively explores key parameters of the X-ray structure-determination process to compute multiple three-dimensional protein structures independently and in parallel from a set of diffraction images. An optimal consensus model for subsequent manual refinement is computed from these structures
Reproducibility of protein x-ray diffuse scattering and potential utility for modeling atomic displacement parameters
Protein structure and dynamics can be probed using x-ray crystallography. Whereas the Bragg peaks are only sensitive to the average unit-cell electron density, the signal between the Bragg peaks (diffuse scattering) is sensitive to spatial correlations in electron-density variations. Although diffuse scattering contains valuable information about protein dynamics, the diffuse signal is more difficult to isolate from the background compared to the Bragg signal, and the reproducibility of diffuse signal is not yet well understood. We present a systematic study of the reproducibility of diffuse scattering from isocyanide hydratase in three different protein forms. Both replicate diffuse datasets and datasets obtained from different mutants were similar in pairwise comparisons (Pearson correlation coefficient ≥ 0.8). The data were processed in a manner inspired by previously published methods using custom software with modular design, enabling us to perform an analysis of various data processing choices to determine how to obtain the highest quality data as assessed using unbiased measures of symmetry and reproducibility. The diffuse data were then used to characterize atomic mobility using a liquid-like motions (LLM) model. This characterization was able to discriminate between distinct anisotropic atomic displacement parameter (ADP) models arising from different anisotropic scaling choices that agreed comparably with the Bragg data. Our results emphasize the importance of data reproducibility as a model-free measure of diffuse data quality, illustrate the ability of LLM analysis of diffuse scattering to select among alternative ADP models, and offer insights into the design of successful diffuse scattering experiments
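The pairwise reproducibility comparison described in the abstract above rests on the Pearson correlation coefficient between two diffuse datasets. As a minimal sketch, assuming each dataset is a flat list of intensities on a shared grid with `None` marking unmeasured points (this data layout and the name `pearson_cc` are illustrative assumptions, not the authors' software):

```python
import math


def pearson_cc(a, b):
    """Pearson correlation between two equally long intensity lists.

    Grid points where either dataset holds None are skipped, so only
    points measured in both datasets contribute.
    """
    if len(a) != len(b):
        raise ValueError("datasets must cover the same grid")
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two shared points")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant dataset has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    replicate_1 = [1.0, None, 3.0, 5.0]
    replicate_2 = [2.0, 7.0, 6.0, 10.0]
    cc = pearson_cc(replicate_1, replicate_2)
    # 0.8 is the similarity level quoted in the abstract.
    print(cc, cc >= 0.8)
```

Skipping unmeasured points rather than treating them as zero keeps detector gaps and masked regions from inflating or deflating the correlation.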
Structure of the first representative of Pfam family PF04016 (DUF364) reveals enolase and Rossmann-like folds that combine to form a unique active site with a possible role in heavy-metal chelation.
The crystal structure of Dhaf4260 from Desulfitobacterium hafniense DCB-2 was determined by single-wavelength anomalous diffraction (SAD) to a resolution of 2.01 Å using the semi-automated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG) as part of the NIGMS Protein Structure Initiative (PSI). This protein structure is the first representative of the PF04016 (DUF364) Pfam family and reveals a novel combination of two well known domains (an enolase N-terminal-like fold followed by a Rossmann-like domain). Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The genome context of Dhaf4260 and homologs additionally supports a role in heavy-metal chelation
- …