108 research outputs found
High-Throughput Inference of Protein-Protein Interaction Sites from Unassigned NMR Data by Analyzing Arrangements Induced By Quadratic Forms on 3-Manifolds
We cast the problem of identifying protein-protein interfaces, using only unassigned NMR spectra, into a geometric clustering problem. Identifying protein-protein interfaces is critical to understanding inter- and intra-cellular communication, and NMR allows the study of protein interaction in solution. However, NMR studies of a protein complex are often very time-consuming, mainly due to the bottleneck in assigning the chemical shifts, even if the apo structures of the constituent proteins are known. We study whether it is possible, in a high-throughput manner, to identify the interface region of a protein complex using only unassigned chemical shift and residual dipolar coupling (RDC) data. We introduce a geometric optimization problem where we must cluster the cells in an arrangement on the boundary of a 3-manifold. The arrangement is induced by a spherical quadratic form, which in turn is parameterized by SO(3) × R^2. We show that this formalism derives directly from the physics of RDCs. We present an optimal algorithm for this problem that runs in O(n^3 log n) time for an n-residue protein. We then use this clustering algorithm as a subroutine in a practical algorithm for identifying the interface region of a protein complex from unassigned NMR data. We present the results of our algorithm on NMR data for 7 proteins from 5 protein complexes and show that our approach is useful for high-throughput applications in which we seek to rapidly identify the interface region of a protein complex.
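For orientation, the spherical quadratic form and its SO(3) × R^2 parameterization can be written using the standard RDC equation. The notation below (D_max, v, S, R, λ1, λ2) is assumed here for illustration and is not taken from the abstract.

```latex
% Standard RDC equation; all symbol names are assumptions, not the paper's notation.
% v: unit internuclear vector, S: symmetric, traceless Saupe alignment tensor.
D = D_{\max}\, \mathbf{v}^{\mathsf{T}} S\, \mathbf{v}, \qquad \|\mathbf{v}\| = 1,
\qquad
S = R\, \operatorname{diag}(\lambda_1,\ \lambda_2,\ -\lambda_1 - \lambda_2)\, R^{\mathsf{T}},
\quad R \in SO(3),\ (\lambda_1, \lambda_2) \in \mathbb{R}^2 .
```

Because S is symmetric and traceless, a rotation R and two independent eigenvalues determine it, which is one way to read the SO(3) × R^2 parameter space named in the abstract.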
Dissecting BMI1 Protein-Protein Interactions Through Chemical Biology.
BMI1 has emerged as a key oncogenic factor in many cancers, associated with unregulated cellular proliferation, tumor metastasis and cancer-initiating cell self-renewal. BMI1 is best characterized as a component of the canonical vertebrate polycomb repression complex 1 (PRC1), which negatively regulates transcription of hundreds of genes through ubiquitination of histone H2A. Previous work suggested that BMI1 has multiple protein binding partners within the PRC1 complex, and we were motivated by the prospect of targeting these protein-protein interactions (PPIs) with small molecule inhibitors. This dissertation describes a multi-pronged campaign to: 1) characterize BMI1 PPIs at the molecular level and 2) develop novel chemical tools to explore BMI1 function in both normal and cancer biology.
Using X-ray crystallography and solution NMR approaches, we solved the 3D structure of BMI1 in complex with its PRC1 binding partner protein PHC2. Supporting biochemical and biophysical characterization of the BMI1 PPI domain demonstrated a novel mode of self-association of this domain. Mutagenic disruption of both BMI1-PHC2 and BMI1-BMI1 interactions blocks cellular proliferation, demonstrating that multiple PPIs are critical for BMI1 function.
To identify small molecule inhibitors of BMI1, we designed two biochemical assays to quantify the BMI1-PHC2 interaction, and these assays were used as a platform for high-throughput screening. Through this screen we identified three classes of small molecule inhibitors that bind directly to BMI1 to disrupt the BMI1-PHC2 interaction, representing three different strategies for BMI1 inhibitor development.
As a complementary approach to inhibit BMI1, we developed a specific inhibitor of Ring1B/BMI1-mediated H2A ubiquitination with potent inhibitory activity both in vitro and in cells. Mechanistic characterization demonstrates that Ring1B/BMI1 inhibitors induce significant protein conformational change and that the inhibitor-bound conformation is incompatible with nucleosome binding by Ring1B. These molecules represent the first direct-binding inhibitors of Ring1B/BMI1 and have a novel mechanism of action to block direct protein-nucleosome interaction.
Overall, this work contributes to the understanding of BMI1 function through characterization of its multiple PPIs and demonstrates that these interactions can be inhibited by small molecules representing novel strategies to target this protein for development of new chemical tools or potential therapeutics for cancer.
PhD, Chemical Biology, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113363/1/flvgray_1.pd
Graph algorithms for NMR resonance assignment and cross-link experiment planning
The study of three-dimensional protein structures produces insights into protein function at the molecular level. Graphs provide a natural representation of protein structures and associated experimental data, and enable the development of graph algorithms to analyze the structures and data. This thesis develops such graph representations and algorithms for two novel applications: structure-based NMR resonance assignment and disulfide cross-link experiment planning for protein fold determination. The first application seeks to identify correspondences between spectral peaks in NMR data and backbone atoms in a structure (from X-ray crystallography or homology modeling), by computing correspondences between a contact graph representing the structure and an analogous but very noisy and ambiguous graph representing the data. The assignment then supports further NMR studies of protein dynamics and protein-ligand interactions. A hierarchical grow-and-match algorithm was developed for smaller assignment problems, ensuring completeness of assignment, while a random graph approach was developed for larger problems, provably determining unique matches in polynomial time with high probability. Test results show that our algorithms are robust to typical levels of structural variation, noise, and missing data, and achieve very good overall assignment accuracy. The second application aims to rapidly determine the overall organization of secondary structure elements of a target protein by probing it with a set of planned disulfide cross-links. A set of informative pairs of secondary structure elements is selected from graphs representing topologies of predicted structure models. For each pair in this "fingerprint", a set of informative disulfide probes is selected from graphs representing residue proximity in the models.
Information-theoretic planning algorithms were developed to maximize information gain while minimizing experimental complexity, and Bayes error plan assessment frameworks were developed to characterize the probability of making correct decisions given experimental data. Evaluation of the approach on a number of structure prediction case studies shows that the optimized plans have low risk of error while testing only a very small portion of the quadratic number of possible cross-link candidates.
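The idea of choosing probes by information gain can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes noise-free binary probe outcomes, a known prior over candidate models, and a simple greedy selection rule, and every name in it (`entropy`, `plan_information`, `greedy_plan`, the probe labels) is hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def plan_information(prior, probes, chosen):
    """Bits of information about the model carried by the chosen probes.

    Assumes noise-free probes, so the information equals the entropy of
    the joint outcome signature across the candidate models.
    prior:  list of model probabilities (sums to 1)
    probes: dict mapping probe name -> list of 0/1 predictions, one per model
    """
    mass = {}
    for i, p in enumerate(prior):
        signature = tuple(probes[name][i] for name in chosen)
        mass[signature] = mass.get(signature, 0.0) + p
    return entropy(mass.values())


def greedy_plan(prior, probes, budget):
    """Greedily pick up to `budget` probes, stopping when nothing is gained."""
    chosen = []
    for _ in range(budget):
        remaining = [name for name in probes if name not in chosen]
        if not remaining:
            break
        best = max(
            remaining,
            key=lambda name: plan_information(prior, probes, chosen + [name]),
        )
        current = plan_information(prior, probes, chosen)
        if plan_information(prior, probes, chosen + [best]) <= current + 1e-12:
            break  # no remaining probe distinguishes the models further
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Four equally likely hypothetical models and three hypothetical probes.
    prior = [0.25, 0.25, 0.25, 0.25]
    probes = {"a": [1, 1, 0, 0], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]}
    print(greedy_plan(prior, probes, budget=2))
```

The budget stands in for experimental complexity. A fuller treatment would model noisy outcomes and assess the plan's error probability, as the abstract describes.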
Contact replacement for NMR resonance assignment
Motivation: Complementing its traditional role in structural studies of proteins, nuclear magnetic resonance (NMR) spectroscopy is playing an increasingly important role in functional studies. NMR dynamics experiments characterize motions involved in target recognition, ligand binding, etc., while NMR chemical shift perturbation experiments identify and localize protein–protein and protein–ligand interactions. The key bottleneck in these studies is to determine the backbone resonance assignment, which allows spectral peaks to be mapped to specific atoms. This article develops a novel approach to address that bottleneck, exploiting an available X-ray structure or homology model to assign the entire backbone from a set of relatively fast and cheap NMR experiments.
Structural Studies of the Anti-HIV Human Protein APOBEC3G Catalytic Domain: A Dissertation
HIV/AIDS is a disease of grave global importance, with over 33 million people infected worldwide and nearly 2 million deaths each year. The rapid emergence of drug resistance, due to viral mutation, renders anti-retroviral drug candidates ineffective with alarming speed and regularity. Instead of targeting mutation-prone viral proteins, an alternative approach is to target host proteins that interact with viral proteins and are critical for the HIV life-cycle. APOBEC3G is a host anti-HIV restriction factor that can exert tremendous negative pressure by hypermutating the viral genome and has the potential to be a promising candidate for anti-retroviral therapeutic research.
The work presented in this thesis is focused on investigating the A3G catalytic domain structure and implications of various observed structural features for biological function. High-resolution crystal structures of the A3G catalytic domain were solved using data from macromolecular X-ray crystallographic experiments, revealing a novel intermolecular zinc coordinating motif unique to A3G. Major intermolecular interfaces observed in the crystal structure were investigated for relevance to biochemical activity and biological function.
Co-crystallization with a small-molecule A3G inhibitor, discovered using high-throughput screening assays, revealed a cysteine residue near the active site that is critical for inhibition of catalytic activity by catechol moieties. The serendipitous discovery of covalent interactions between this inhibitor and a surface cysteine residue led to further biochemical experiments that revealed the other cysteine, near the active site, to be critical for inhibition.
Computational modeling was used to propose a steric-hindrance-based mechanism of action that was supported by mutational experiments. Structures of other human APOBEC3 homologs were modeled using in silico methods and examined for similarities to and differences from the A3G catalytic domain crystal structures. Comparisons based on these homology models suggest putative structural features that may endow substrate specificity and other characteristics to the APOBEC3 family members.
Novel approaches for bond order assignment and NMR shift prediction
Molecular modelling is one of the cornerstones of modern biological and pharmaceutical research. Accurate modelling approaches easily become computationally overwhelming, and thus different levels of approximation are typically employed. In this work, we develop such approximation approaches for problems arising in structural bioinformatics. A fundamental approximation of molecular physics is the classification of chemical bonds, usually in the form of integer bond orders. Many input data sets lack this information, and several problems render an automated bond order assignment highly challenging. For this task, we develop the BOA Constructor method, which accounts for the non-uniqueness of solutions and allows simple extensibility. Testing our method on large evaluation sets, we demonstrate how it improves on the state of the art. Besides traditional applications, bond orders yield valuable input for the approximation of molecular quantities by statistical means. One such problem is the prediction of NMR chemical shifts of protein atoms. We present our pipeline NightShift for automated model generation, use it to create a new prediction model called Spinster, and demonstrate that it outperforms established, manually developed approaches. Combining Spinster and BOA Constructor, we create the Liops model, which for the first time allows the influence of non-protein atoms to be included efficiently. Finally, we describe our work on manual modelling techniques, including molecular visualization and novel input paradigms.
Investigations into RNA-binding proteins involved in eukaryotic gene regulation
The flood of RNA-related research in recent decades has revealed RNA to be a structurally and functionally diverse class of molecule, one that generates an intricate network of regulation that has been pivotal to the evolution of complex lifeforms. In order to elucidate how RNA achieves biological function through the formation of ribonucleoprotein (RNP) complexes, characterisation of RNA recognition by RNA-binding proteins (RBPs) is an essential step. The rules governing the interaction of RNA and RBPs have proved difficult to define, and in many instances it is not understood how specificity is achieved. Knowledge of these rules is crucial to our understanding of RNA-related functions and their role in disease, and requires further in-depth characterisation of a wide variety of RNP complexes. The research in this thesis details the RNA-binding behaviour of two reported RBPs. Firstly, the RNA-binding behaviour of the Drosophila transcription factor bicoid is investigated. For many years it has been believed that the bicoid homeodomain binds the 3′-UTR of the caudal mRNA transcript, yet no binding site or specificity determinants have been reported. The work here attempts to characterise this interaction. Further, other domains in the protein are examined with a view to understanding how biological specificity might be achieved. Secondly, characterisation of the RNA-binding behaviour of the heterodimeric pair of transcription elongation factors, Spt4 and Spt5, is reported. This heterodimer is known to be an important player in transcription, and yet remarkably little is known about its function. In the present work, the AA-repeat RNA-binding properties of these proteins are investigated, and complex binding behaviour is reported. Overall, it is shown that the elucidation of RNA-binding activity by proteins is often not straightforward, requiring the application of multiple and increasingly sophisticated techniques if we are to grasp the underlying biology.
Bioinformatic and Experimental Approaches for Deeper Metaproteomic Characterization of Complex Environmental Samples
The coupling of high-performance multidimensional liquid chromatography and tandem mass spectrometry for characterization of microbial proteins from complex environmental samples has paved the way for a new era in scientific discovery. The field of metaproteomics, which is the study of the protein suite of all the organisms in a biological system, has taken a tremendous leap with the introduction of high-throughput proteomics. However, with the corresponding increase in sample complexity, novel challenges have been raised with respect to efficient peptide separation via chromatography and bioinformatic analysis of the resulting high-throughput data. In this dissertation, various aspects of metaproteomic characterization, including experimental and computational approaches, have been systematically evaluated. In this study, robust separation protocols employing strong cation exchange and reverse phase have been designed for efficient peptide separation, thus offering excellent orthogonality and ease of automation. These findings will be useful to the proteomics community for obtaining deeper non-redundant peptide identifications, which in turn will improve the overall depth of semi-quantitative proteomics.
Secondly, computational bottlenecks associated with screening the vast amount of raw mass spectra generated in these proteomic measurements have been addressed. Computational matching of tandem mass spectra via conventional database search strategies leads to only modest numbers of peptide/protein identifications. This seriously restricts the amount of information retrieved from these complex samples, mainly because of the high complexity and heterogeneity of samples containing hundreds of proteins shared between different microbial species, often with a high level of homology. Hence, the challenges associated with metaproteomic data analysis have been addressed by utilizing multiple iterative search engines coupled with de novo sequencing algorithms for a comprehensive and in-depth characterization of complex environmental samples.
The work presented here will utilize various sample types ranging from isolates and mock microbial mixtures prepared in the laboratory to complex community samples extracted from industrial waste water, acid-mine drainage and methane seep sediments. In a broad perspective, this dissertation aims to provide tools for gaining deeper insights into proteome characterization in complex environmental ecosystems.