
    High-Throughput 3D Homology Detection via NMR Resonance Assignment

    One goal of the structural genomics initiative is the identification of new protein folds. Sequence-based structural homology prediction methods are an important means for prioritizing unknown proteins for structure determination. However, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure, called HD, for detecting 3D structural homologies from sparse, unassigned protein NMR data. Our method identifies 3D models in a protein structural database whose geometries best fit the unassigned experimental NMR data. HD does not use sequence homology and is thus not limited by it. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or homology modelling. The algorithm runs in $O(pn^{5/2} \log(cn) + p \log p)$ time, where $p$ is the number of proteins in the database, $n$ is the number of residues in the target protein, and $c$ is the maximum edge weight in an integer-weighted bipartite graph. Our experiments on real NMR data from 3 different proteins against a database of 4,500 representative folds demonstrate that the method identifies closely related protein folds, including sub-domains of larger proteins, with as little as 10-30% sequence homology between the target protein (or sub-domain) and the computed model. In particular, we report no false negatives or false positives despite significant percentages of missing experimental data.

    3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings

    Recognition of a protein's fold provides valuable information about its function. While many sequence-based homology prediction methods exist, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure for detecting 3D-structural homologies from sparse, unassigned protein NMR data. Our method identifies the 3D-structural models in a protein structural database whose geometries best fit the unassigned experimental NMR data. It does not use sequence information and is thus not limited by sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or sequence homology. The algorithm runs in $O(pnk^3)$ time, where $p$ is the number of proteins in the database, $n$ is the number of residues in the target protein, and $k$ is the resolution of a rotation search. The method requires only uniform 15N-labelling of the protein and processes unassigned 1H-15N residual dipolar couplings, which can be acquired in a couple of hours. Our experiments on NMR data from 5 different proteins demonstrate that the method identifies closely related protein folds, despite low sequence homology between the target protein and the computed model.
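The abstract above fits database geometries to unassigned 1H-15N residual dipolar couplings but does not spell out the underlying equation. As a minimal sketch, assuming the standard textbook form of the RDC in the principal axis frame of the alignment tensor, the coupling for a bond vector can be back-computed as below. The names `rdc`, `rdc_from_vector`, `da` (axial component) and `rhombicity` are illustrative assumptions, not taken from the paper.

```python
import math


def rdc(theta, phi, da, rhombicity):
    """Back-compute an RDC from polar angles in the tensor's principal axis frame.

    Assumed textbook form:
    D = Da * [(3 cos^2(theta) - 1) + 1.5 * R * sin^2(theta) * cos(2 phi)]
    """
    axial = 3.0 * math.cos(theta) ** 2 - 1.0
    rhombic = 1.5 * rhombicity * math.sin(theta) ** 2 * math.cos(2.0 * phi)
    return da * (axial + rhombic)


def rdc_from_vector(vector, da, rhombicity):
    """Same calculation for a bond vector (x, y, z); the vector need not be normalised."""
    x, y, z = vector
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("bond vector must be non-zero")
    theta = math.acos(z / norm)   # polar angle from the tensor z axis
    phi = math.atan2(y, x)        # azimuth in the tensor x-y plane
    return rdc(theta, phi, da, rhombicity)


if __name__ == "__main__":
    # Hypothetical tensor parameters, for illustration only.
    for v in [(0, 0, 1), (1, 0, 0), (0, 1, 0)]:
        print(v, round(rdc_from_vector(v, da=10.0, rhombicity=0.3), 3))
```

A rotation search of the kind the abstract mentions would re-orient a candidate model's bond vectors, back-compute couplings in this way, and compare them with the measured values; that comparison step is not shown here.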

    High-Throughput Inference of Protein-Protein Interaction Sites from Unassigned NMR Data by Analyzing Arrangements Induced By Quadratic Forms on 3-Manifolds

    We cast the problem of identifying protein-protein interfaces, using only unassigned NMR spectra, into a geometric clustering problem. Identifying protein-protein interfaces is critical to understanding inter- and intra-cellular communication, and NMR allows the study of protein interaction in solution. However, NMR studies of a protein complex are often very time-consuming, mainly due to the bottleneck in assigning the chemical shifts, even if the apo structures of the constituent proteins are known. We study whether it is possible, in a high-throughput manner, to identify the interface region of a protein complex using only unassigned chemical shift and residual dipolar coupling (RDC) data. We introduce a geometric optimization problem where we must cluster the cells in an arrangement on the boundary of a 3-manifold. The arrangement is induced by a spherical quadratic form, which in turn is parameterized by $SO(3) \times \mathbb{R}^2$. We show that this formalism derives directly from the physics of RDCs. We present an optimal algorithm for this problem that runs in $O(n^3 \log n)$ time for an $n$-residue protein. We then use this clustering algorithm as a subroutine in a practical algorithm for identifying the interface region of a protein complex from unassigned NMR data. We present the results of our algorithm on NMR data for 7 proteins from 5 protein complexes and show that our approach is useful for high-throughput applications in which we seek to rapidly identify the interface region of a protein complex.

    NMR analysis of the dynamic exchange of the NS2B cofactor between open and closed conformations of the West Nile Virus NS2B-NS3 protease

    BACKGROUND The two-component NS2B-NS3 proteases of West Nile and dengue viruses are essential for viral replication and established targets for drug development. In all crystal structures of the proteases to date, the NS2B cofactor is located far from the substrate binding site (open conformation) in the absence of inhibitor and lining the substrate binding site (closed conformation) in the presence of an inhibitor. METHODS In this work, nuclear magnetic resonance (NMR) spectroscopy of isotope and spin-labeled samples of the West Nile virus protease was used to investigate the occurrence of equilibria between open and closed conformations in solution. FINDINGS In solution, the closed form of the West Nile virus protease is the predominant conformation irrespective of the presence or absence of inhibitors. Nonetheless, dissociation of the C-terminal part of the NS2B cofactor from the NS3 protease (open conformation) occurs in both the presence and the absence of inhibitors. Low-molecular-weight inhibitors can shift the conformational exchange equilibria so that over 90% of the West Nile virus protease molecules assume the closed conformation. The West Nile virus protease differs from the dengue virus protease, where the open conformation is the predominant form in the absence of inhibitors. CONCLUSION Partial dissociation of NS2B from NS3 has implications for the way in which the NS3 protease can be positioned with respect to the host cell membrane when NS2B is membrane associated via N- and C-terminal segments present in the polyprotein. In the case of the West Nile virus protease, discovery of low-molecular-weight inhibitors that act by breaking the association of the NS2B cofactor with the NS3 protease is impeded by the natural affinity of the cofactor for the NS3 protease. The same strategy can be more successful in the case of the dengue virus NS2B-NS3 protease. The project was funded by the Australian Research Council (http://www.arc.gov.au), grant DP0877540.

    An Improved Nuclear Vector Replacement Algorithm for Nuclear Magnetic Resonance Assignment

    We report an improvement to the Nuclear Vector Replacement (NVR) algorithm for high-throughput Nuclear Magnetic Resonance (NMR) resonance assignment. The new algorithm improves upon our earlier result in terms of accuracy and computational complexity. In particular, the new NVR algorithm assigns backbone resonances without error (100% accuracy) on the same test suite examined in [Langmead and Donald J. Biomol. NMR 2004], and runs in $O(n^{5/2} \log(cn))$ time, where $n$ is the number of amino acids in the primary sequence of the protein and $c$ is the maximum edge weight in an integer-weighted bipartite graph.
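The abstract above phrases assignment in terms of an integer-weighted bipartite graph. As a rough illustration of that idea, the sketch below finds a maximum-weight perfect matching between resonances (rows) and residues (columns) with the classic Hungarian algorithm. This is a swapped-in $O(n^3)$ method chosen for brevity, not the $O(n^{5/2} \log(cn))$ algorithm the abstract reports, and the function name and the score matrix are hypothetical.

```python
def max_weight_assignment(weights):
    """Maximum-weight perfect matching on a square weight matrix.

    Returns (total, match) where match[i] is the column assigned to row i.
    Uses the O(n^3) Hungarian algorithm with potentials (1-based internally).
    """
    n = len(weights)
    inf = float("inf")
    # Maximisation is turned into minimisation by negating the weights.
    cost = [[-w for w in row] for row in weights]
    u = [0] * (n + 1)      # row potentials
    v = [0] * (n + 1)      # column potentials
    p = [0] * (n + 1)      # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)    # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    total = sum(weights[i][match[i]] for i in range(n))
    return total, match


if __name__ == "__main__":
    # Hypothetical integer compatibility scores: rows = resonances, columns = residues.
    scores = [[7, 5, 1], [2, 9, 4], [3, 6, 8]]
    print(max_weight_assignment(scores))
```

In this toy setting each entry would be an integer score for how well a resonance fits a residue; how such scores are derived from NMR data is outside the scope of the sketch.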

    Characterisation of a newly identified family of lipid transfer proteins at membrane contact sites

    Non-vesicular intracellular lipid traffic is mediated by lipid transfer proteins (LTPs), which contain domains with an internal cavity that can solubilise and transfer lipids. One of the most widespread LTP folds is the Steroidogenic Acute Regulatory Transfer (StART) domain, which forms a hydrophobic pocket, and appears in proteins with different localisations and lipid specificities. The aim of this study was to characterise a new StART-like domain family, which we identified by a bioinformatics approach. I studied aspects of the localisations, functions and structural properties of six StART-like proteins in S. cerevisiae. The yeast StART-like proteins were endoplasmic reticulum (ER)-integral membrane proteins with transmembrane domains, and they localised at membrane contact sites: Lam1p/Lam3p and Lam2p/Lam4p at junctions between the ER and the plasma membrane (PM); Lam5p/Lam6p at junctions between the ER and the vacuolar membrane, at the nucleus-vacuole junction (NVJ) and at ER-mitochondria contacts. To study their functions, I purified the second StART-like domain of Lam4p, and I identified sterol as its lipid ligand from in vitro binding assays and in a spectroscopy approach with fluorescent ergosterol. We named the whole family LAM for Lipid transfer proteins Anchored at Membrane contact sites. The sterol binding property of the domains was related to a phenotype shared by LAM1, LAM2 and LAM3 deletion strains, which showed an increased sensitivity to the sterol-sequestering polyene antifungal drug Amphotericin B (AmB). The two most sensitive strains (lam1∆ and lam3∆) displayed low sphingolipid levels, which is as yet unexplained. All AmB phenotypes were rescued by StART-like domains from the human LAMa, Lam2/4p and Lam5/6p, suggesting that these domains bind sterol. Simultaneous deletion of LAM1, LAM2, and LAM3 significantly reduced the extent of cortical ER-PM contacts, implying that they create the structure of the particularly punctate contact site they target.
    Finally, I started structural analysis of Lam4S2 to study the mechanism of sterol binding and to confirm our structural model.

    Fully automated high-quality NMR structure determination of small 2H-enriched proteins

    Determination of high-quality small protein structures by nuclear magnetic resonance (NMR) methods generally requires acquisition and analysis of an extensive set of structural constraints. The process generally demands extensive backbone and sidechain resonance assignments, and weeks or even months of data collection and interpretation. Here we demonstrate rapid and high-quality protein NMR structure generation using CS-Rosetta with a perdeuterated protein sample made at a significantly reduced cost using new bacterial culture condensation methods. Our strategy provides the basis for a high-throughput approach for routine, rapid, high-quality structure determination of small proteins. As an example, we demonstrate the determination of a high-quality 3D structure of a small 8 kDa protein, E. coli cold shock protein A (CspA), using <4 days of data collection and fully automated data analysis methods together with CS-Rosetta. The resulting CspA structure is highly converged and in excellent agreement with the published crystal structure, with a backbone RMSD value of 0.5 Å, an all-atom RMSD value of 1.2 Å to the crystal structure for well-defined regions, and an RMSD value of 1.1 Å to the crystal structure for core, non-solvent-exposed sidechain atoms. Cross validation of the structure with 15N- and 13C-edited NOESY data obtained with a perdeuterated 15N, 13C-enriched 13CH3 methyl protonated CspA sample confirms that essentially all of these independently-interpreted NOE-based constraints are already satisfied in each of the 10 CS-Rosetta structures. By these criteria, the CS-Rosetta structure generated by fully automated analysis of data for a perdeuterated sample provides an accurate structure of CspA. This represents a general approach for rapid, automated structure determination of small proteins by NMR.
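The agreement figures quoted above are root-mean-square deviations (RMSD) between matched atoms. As a minimal sketch of that metric, assuming the two coordinate sets are already optimally superimposed and listed in the same atom order (the superposition step itself is not shown), the RMSD can be computed as below. The function name and the toy coordinates are illustrative assumptions, not data from the study.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two equally ordered, pre-superimposed atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for atom_a, atom_b in zip(coords_a, coords_b):
        # Sum of squared differences over the x, y, z components of one atom pair.
        squared += sum((a - b) ** 2 for a, b in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Two hypothetical atoms, each displaced by 0.5 along z (units: angstroms).
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.5), (1.0, 0.0, -0.5)]
    print(rmsd(model, reference))
```

Restricting the atom lists to backbone atoms, all atoms in well-defined regions, or core sidechain atoms would correspond to the three kinds of RMSD the abstract reports.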

    1H, 13C, 15N backbone resonance assignment of apo and ADP-ribose bound forms of the macro domain of Hepatitis E virus through solution NMR spectroscopy

    The genome of Hepatitis E virus (HEV) is 7.2 kilobases long and has three open reading frames. The largest one is ORF1, encoding a non-structural protein involved in the replication process, and whose processing is ill-defined. The ORF1 protein is a multi-modular protein which includes a macro domain (MD). MDs are evolutionarily conserved structures throughout all kingdoms of life. MDs participate in the recognition and removal of ADP-ribosylation, and viral MDs specifically have been identified as erasers of ADP-ribose moieties, which marks them as important players in escaping the early stages of the host immune response. A detailed structural analysis of the apo and ADP-ribose-bound states of the native HEV MD would provide the structural information to understand how the HEV MD is implicated in virus-host interplay and how it interacts with its intracellular partner during viral replication. In the present study we present the high-yield expression of the native macro domain of HEV and its analysis by solution NMR spectroscopy. The HEV MD is folded in solution and we present a nearly complete backbone and sidechain assignment for the apo and bound states. In addition, a secondary structure prediction by TALOS+ analysis was performed. The results indicated that the HEV MD has an α/β/α topology very similar to that of most viral macro domains.