52 research outputs found
Capturing conformational states in proteins using sparse paramagnetic NMR data
Capturing conformational changes in proteins or protein-protein complexes is a challenge for both experimentalists and computational biologists. Solution nuclear magnetic resonance (NMR) is unique in that it permits structural studies of proteins under greatly varying conditions, and thus allows us to monitor induced structural changes. Paramagnetic effects are increasingly used to study protein structures as they give ready access to rich structural information of orientation and long-range distance restraints from the NMR signals of backbone amides, and reliable methods have become available to tag proteins with paramagnetic metal ions site-specifically and at multiple sites. In this study, we show how sparse pseudocontact shift (PCS) data can be used to computationally model conformational states in a protein system, by first identifying core structural elements that are not affected by the environmental change, and then computationally completing the remaining structure based on experimental restraints from PCS. The approach is demonstrated on a 27 kDa two-domain NS2B-NS3 protease system of the dengue virus serotype 2, for which distinct closed and open conformational states have been observed in crystal structures. By changing the input PCS data, the observed conformational states in the dengue virus protease are reproduced without modifying the computational procedure. This data-driven Rosetta protocol enables identification of conformational states of a protein system, which are otherwise difficult to obtain either experimentally or computationally. This study was supported by the Australian Research Council (DP120100561, DP150100383), which the authors gratefully acknowledge
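For readers unfamiliar with PCS restraints, the following minimal sketch evaluates the standard pseudocontact shift equation for a nuclear spin in the principal axis frame of the magnetic susceptibility anisotropy (Δχ) tensor. It is an illustration only, not the Rosetta implementation used in the study; the function name `pcs_ppm` and the tensor values in the usage lines are assumptions chosen for the example and are not taken from the paper.

```python
import math


def pcs_ppm(x, y, z, dchi_ax, dchi_rh):
    """Pseudocontact shift in ppm (hypothetical helper, not a Rosetta call).

    x, y, z  : nuclear spin coordinates in metres, expressed in the
               principal axis frame of the delta-chi tensor with the
               paramagnetic metal ion at the origin.
    dchi_ax  : axial component of the delta-chi tensor in m^3.
    dchi_rh  : rhombic component of the delta-chi tensor in m^3.
    """
    r2 = x * x + y * y + z * z
    r = math.sqrt(r2)
    # (3 cos^2(theta) - 1) written in Cartesian form: (2z^2 - x^2 - y^2) / r^2
    axial = dchi_ax * (2 * z * z - x * x - y * y) / r2
    # sin^2(theta) cos(2 phi) written in Cartesian form: (x^2 - y^2) / r^2
    rhombic = 1.5 * dchi_rh * (x * x - y * y) / r2
    # The 1/(12 pi r^3) prefactor gives the long-range distance dependence;
    # the factor 1e6 converts the dimensionless shift to ppm.
    return 1e6 * (axial + rhombic) / (12 * math.pi * r ** 3)


# Illustrative (assumed) tensor: 2e-32 m^3 axial, no rhombic component.
# A spin 10 angstrom (1e-9 m) from the metal along the tensor z axis:
shift_on_axis = pcs_ppm(0.0, 0.0, 1e-9, 2e-32, 0.0)
# The same distance along the x axis gives a shift of opposite sign:
shift_in_plane = pcs_ppm(1e-9, 0.0, 0.0, 2e-32, 0.0)
```

Because the shift depends on both the distance and the orientation of the spin relative to the tensor frame, even a small number of PCS values from several tagging sites can discriminate between alternative domain arrangements, which is the property the abstract relies on.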
Mapping genetic variations to three-dimensional protein structures to enhance variant interpretation: a proposed framework
The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods
Recent Advances in NMR Protein Structure Prediction with ROSETTA
Nuclear magnetic resonance (NMR) spectroscopy is a powerful method for studying the structure and dynamics of proteins in their native state. For high-resolution NMR structure determination, the collection of a rich restraint dataset is necessary. This can be difficult to achieve for proteins with high molecular weight or a complex architecture. Computational modeling techniques can complement sparse NMR datasets (<1 restraint per residue) with additional structural information to elucidate protein structures in these difficult cases. The Rosetta software for protein structure modeling and design is used by structural biologists for structure determination tasks in which limited experimental data is available. This review gives an overview of the computational protocols available in the Rosetta framework for modeling protein structures from NMR data. We explain the computational algorithms used for the integration of different NMR data types in Rosetta. We also highlight new developments, including modeling tools for data from paramagnetic NMR and hydrogen–deuterium exchange, as well as chemical shifts in CS-Rosetta. Furthermore, strategies are discussed to complement and improve structure predictions made by the current state-of-the-art AlphaFold2 program using NMR-guided Rosetta modeling
A Novel Domain Assembly Routine for Creating Full-Length Models of Membrane Proteins from Known Domain Structures
Membrane proteins composed of soluble and membrane domains are often studied one domain at a time. However, to understand the biological function of entire protein systems and their interactions with each other and drugs, knowledge of full-length structures or models is required. Although a few computational methods exist that could potentially be used to model full-length constructs of membrane proteins, none of these methods are perfectly suited for the problem at hand. Existing methods require an interface or knowledge of the relative orientations of the domains or are not designed for domain assembly, and none of them are developed for membrane proteins. Here we describe the first domain assembly protocol specifically designed for membrane proteins that assembles intra- and extracellular soluble domains and the transmembrane domain into models of the full-length membrane protein. Our protocol does not require an interface between the domains and samples possible domain orientations based on backbone dihedrals in the flexible linker regions, created via fragment insertion, while keeping the transmembrane domain fixed in the membrane. For five examples tested, our method, mp_domain_assembly, implemented in RosettaMP, samples domain orientations close to the known structure and is best used in conjunction with experimental data to reduce the conformational search space
Docking cholesterol to integral membrane proteins with Rosetta
Lipid molecules such as cholesterol interact with the surface of integral membrane proteins (IMP) in a mode different from drug-like molecules in a protein binding pocket. These differences are due to the lipid molecule's shape, the membrane's hydrophobic environment, and the lipid's orientation in the membrane. We can use the recent increase in experimental structures in complex with cholesterol to understand protein-cholesterol interactions. We developed the RosettaCholesterol protocol consisting of (1) a prediction phase using an energy grid to sample and score native-like binding poses and (2) a specificity filter to calculate the likelihood that a cholesterol interaction site may be specific. We used a multi-pronged benchmark (self-dock, flip-dock, cross-dock, and global-dock) of protein-cholesterol complexes to validate our method. RosettaCholesterol improved sampling and scoring of native poses over the standard RosettaLigand baseline method in 91% of cases and performs better regardless of benchmark complexity. On the β2AR, our method found one likely-specific site, which is described in the literature. The RosettaCholesterol protocol quantifies cholesterol binding site specificity. Our approach provides a starting point for high-throughput modeling and prediction of cholesterol binding sites for further experimental validation
- …