214 research outputs found

    Optimization Algorithms for Site-directed Protein Recombination Experiment Planning

    Site-directed protein recombination produces improved and novel protein variants by recombining sequence fragments from parent proteins. The resulting hybrids accumulate multiple mutations that have been evolutionarily accepted together. Subsequent screening or selection identifies hybrids with desirable characteristics. In order to increase the hit rate of good variants, this thesis develops experiment planning algorithms to optimize protein recombination experiments. First, to improve the frequency of generating novel hybrids, a metric is developed to assess the diversity among hybrids and parent proteins. Dynamic programming algorithms are then created to optimize the selection of breakpoint locations according to this metric. Second, the trade-off between diversity and stability in recombination experiment planning is studied, recognizing that diversity requires changes from parent proteins, which may also disrupt important residue interactions necessary for protein stability. Accordingly, methods based on dynamic programming are developed to provide combined optimization of diversity and stability, finding optimal breakpoints such that no other experiment plan has better performance in both aspects simultaneously. Third, in order to support protein recombination with heterogeneous structures and focus on functionally important regions, a general framework for protein fragment swapping is developed. Differentiating source and target parents, and swappable regions within them, fragment swapping enables asymmetric, selective site-directed recombination. Two applications of protein fragment swapping are studied. In order to generate hybrids inheriting functionalities from both source and target proteins by fragment swapping, a method based on integer programming selects optimal swapping fragments to maximize the predicted stability and activity of hybrids in the resulting library. 
In another application, human source protein fragments are swapped into a therapeutic exogenous target protein to minimize the occurrence of peptides that trigger an immune response. A dynamic programming method is developed to optimize fragment selection for both humanness and functionality, resulting in therapeutically active variants with decreased immunogenicity.
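The combined diversity/stability criterion in the abstract above ("no other experiment plan has better performance in both aspects simultaneously") can be illustrated with a minimal Pareto-filter sketch. The plan names and scores here are hypothetical, and the sketch uses the standard dominance definition (at least as good in both objectives, strictly better in one), which may differ slightly from the thesis's own formulation. The thesis derives its plans by dynamic programming over breakpoints, which is not reproduced here.

```python
from typing import List, Tuple

# A plan is (name, diversity, stability); both scores are assumed
# to be "higher is better". These fields are illustrative only.
Plan = Tuple[str, float, float]


def pareto_front(plans: List[Plan]) -> List[str]:
    """Return the names of plans that no other plan dominates."""
    front = []
    for name, div, stab in plans:
        dominated = any(
            d2 >= div and s2 >= stab and (d2 > div or s2 > stab)
            for _, d2, s2 in plans
        )
        if not dominated:
            front.append(name)
    return front
```

For example, given the hypothetical plans `[("A", 0.9, 0.2), ("B", 0.5, 0.5), ("C", 0.4, 0.4), ("D", 0.1, 0.9)]`, plan C is dominated by B, while A, B, and D each represent a different diversity/stability trade-off.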

    Algorithms for optimizing cross-overs in DNA shuffling

    DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. Results: This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing “runs” of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and b (15%), and beta-lactamases of varying identity (26-47%).
Conclusions: Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.
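To make the idea of codon selection by dynamic programming concrete, here is a minimal sketch for two aligned, equal-length parent proteins. It is not the CODNS algorithm itself: the scoring function (matched nucleotides plus a bonus for each adjacent pair of matched positions, as a crude stand-in for rewarding "runs") and the names `choose_codons` and `within_score` are assumptions made for illustration, and conservative substitutions are not modeled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for i, triplet in enumerate(product(BASES, repeat=3)):
    CODONS.setdefault(AMINO[i], []).append("".join(triplet))


def within_score(ca, cb):
    """Matched nucleotides in a codon pair, plus adjacent matched pairs."""
    m = [x == y for x, y in zip(ca, cb)]
    return sum(m) + (m[0] and m[1]) + (m[1] and m[2])


def choose_codons(prot_a, prot_b):
    """Pick codons for two aligned proteins to maximize the toy score.

    Returns (score, dna_a, dna_b). The DP state is the codon pair chosen
    at the current position, since only its last nucleotide affects the
    bonus across the boundary to the next codon.
    """
    if len(prot_a) != len(prot_b):
        raise ValueError("proteins must be aligned to equal length")
    if not prot_a:
        return (0, "", "")
    best = {}  # (codon_a, codon_b) -> (score, dna_a, dna_b)
    for aa_a, aa_b in zip(prot_a, prot_b):
        new_best = {}
        for ca, cb in product(CODONS[aa_a], CODONS[aa_b]):
            base = within_score(ca, cb)
            if not best:  # first position: no predecessor
                new_best[(ca, cb)] = (base, ca, cb)
                continue
            new_best[(ca, cb)] = max(
                # Boundary bonus: previous codons end in a match and
                # the new codons start with a match.
                (s + base + (pa[-1] == pb[-1] and ca[0] == cb[0]),
                 pa + ca, pb + cb)
                for s, pa, pb in best.values()
            )
        best = new_best
    return max(best.values())
```

Each position considers at most 6 × 6 codon pairs against at most 36 predecessor states, so the running time grows linearly with protein length. Calling `choose_codons("L", "F")` selects a leucine codon that shares two adjacent nucleotides with a phenylalanine codon.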

    Experiment Planning for Protein Structure Elucidation and Site-Directed Protein Recombination

    In order to most effectively investigate protein structure and improve protein function, it is necessary to carefully plan appropriate experiments. The combinatorial number of possible experiment plans demands effective criteria and efficient algorithms to choose the one that is in some sense optimal. This thesis addresses experiment planning challenges in two significant applications. The first part of this thesis develops an integrated computational-experimental approach for rapid discrimination of predicted protein structure models by quantifying their consistency with relatively cheap and easy experiments (cross-linking and site-directed mutagenesis followed by stability measurement). In order to obtain the most information from noisy and sparse experimental data, rigorous Bayesian frameworks have been developed to analyze the information content. Efficient algorithms have been developed to choose the most informative, least expensive, and most robust experiments. The effectiveness of this approach has been demonstrated using existing experimental data as well as simulations, and it has been applied to discriminate predicted structure models of the pTfa chaperone protein from bacteriophage lambda. The second part of this thesis seeks to choose optimal breakpoint locations for protein engineering by site-directed recombination. In order to increase the possibility of obtaining folded and functional hybrids in protein recombination, it is necessary to retain the evolutionary relationships among amino acids that determine protein stability and functionality. A probabilistic hypergraph model has been developed to model these relationships, with edge weights representing their statistical significance derived from database and a protein family. The effectiveness of this model has been validated by showing its ability to distinguish functional hybrids from non-functional ones in existing experimental data. 
Choosing the optimal breakpoint locations for recombination that minimize the total perturbation to these relationships has been proved to be NP-hard in general, but exact and approximate algorithms have been developed for a number of important cases.

    Engineering stabilized enzymes via computational design and immobilization

    Includes bibliographical references. 2016 Fall. The realm of biocatalysis has significantly matured beyond ancient fermentation techniques to accommodate the demand for modern day products. Enzymatically produced goods already influence our daily lives, from sweeteners and laundry detergent to blood pressure medication and antibiotics. Protein engineering has been a major driving force behind this biorevolution, yielding catalysts that can transform non-native substrates and withstand harsh industrial conditions. Although successful in many regards, computational design efforts are still limited by the crude approximations employed in searching a complex energy landscape. Advancements in protein engineering methods will be necessary to develop our understanding of biomolecules and accelerate the next generation of biotechnology applications. Our work employs a combination of computational design and simulation to achieve improved enzyme stability. In the first example, an enzyme used in the production of cellulosic biofuels was redesigned to remain active at high temperature. An initial approach involving consensus sequence analysis, predicted point mutation energy, and combinatorial optimization resulted in a sequence with reduced stability and activity. However, by using recombination methods and molecular dynamics simulations, we were able to identify specific mutations that had a stabilizing or destabilizing effect, and we successfully isolated mutations that benefited enzyme stability. Our iterative approach demonstrated how common design failures could be overcome by careful interpretation and suggested methods for improving future computational design efforts. In the second example, a cellulase was designed to have a high net charge via selected surface mutagenesis. “Supercharged” cellulases were experimentally characterized in various ionic liquids to assess the effect of high ion concentration on enzyme stability and activity.
The designed enzymes also provided an opportunity to systematically probe the protein-solvent interface. Molecular dynamics simulations showed how ions influenced protein behavior by inducing minor unfolding events or by physically blocking the active site. Contrary to previous reports, charged mutations only appeared to alter the affinity of anions and did not significantly change the binding of cations at the protein surface. Understanding the different modes of enzyme inactivation could motivate targeted design strategies for engineering protein resilience in ionic solvents. In addition to the discussed computational design methods, immobilization strategies were identified for capturing enzymes within porous protein crystals. Immobilization offers a generic approach for improving enzyme stability and activity. Our preliminary studies involving horseradish peroxidase and other enzymes suggested protein scaffolds could be employed as an effective immobilization material. Co-immobilizing multiple enzymes within the porous material led to improved product yield via exclusion of off-pathway reactions. Although future studies will be required to assess the potential capabilities of this immobilization strategy in comparison to other materials, preliminary results suggest protein crystals offer a favorable, controlled environment for immobilizing enzymes. The diversity of approaches presented in this thesis emphasizes that there are many options for engineering enzyme stability. Extending the lessons learned from our cellulase engineering to the greater field of rational protein design promotes the concept of biomolecules as designable entities. By establishing the shortcomings of our designs and suggesting routes for improvement, we anticipate our design methods and immobilization strategies will attract continued interest from the biotechnology community.
The toolsets we developed for cellulases can be directly transferred to other enzymes and have the potential to impact a range of protein engineering applications.

    Bayesian methods for small molecule identification

    Confident identification of small molecules remains a major challenge in untargeted metabolomics, natural product research and related fields. Liquid chromatography-tandem mass spectrometry is a predominant technique for the high-throughput analysis of small molecules and can detect thousands of different compounds in a biological sample. The automated interpretation of the resulting tandem mass spectra is highly non-trivial, and many studies are limited to re-discovering known compounds by searching mass spectra in spectral reference libraries. But these libraries are vastly incomplete and a large portion of measured compounds remains unidentified. This constitutes a major bottleneck in the comprehensive, high-throughput analysis of metabolomics data. In this thesis, we present two computational methods that address different steps in the identification process of small molecules from tandem mass spectra. ZODIAC is a novel method for de novo, that is, database-independent, molecular formula annotation in complete datasets. It exploits similarities of compounds co-occurring in a sample to find the most likely molecular formula for each individual compound. ZODIAC improves on the currently best-performing method, SIRIUS, on one dataset by 16.5-fold. We show that de novo molecular formula annotation is not just a theoretical advantage: we discover multiple novel molecular formulas absent from PubChem, one of the biggest structure databases. Furthermore, we introduce a novel scoring for CSI:FingerID, a state-of-the-art method for searching tandem mass spectra in a structure database. This scoring models dependencies between different molecular properties in a predicted molecular fingerprint via Bayesian networks. This problem has the unusual property that the marginal probabilities differ for each predicted query fingerprint. Thus, we need to apply Bayesian networks in a novel, non-standard fashion. Modeling dependencies improves on the currently best scoring.

    Directed evolution and structural analysis of an OB-fold domain towards a specifc binding reagent

    Interactions between proteins are a central concept in biology, and understanding and manipulation of these interactions is key to advancing biological science. Research into antibodies as customised binding molecules provided the foundation for development of the field of protein “scaffolds” for molecular recognition, where functional residues are mounted on to a stable protein platform. Consequently, the immunoglobulin domain has been described as “nature’s paradigm” for a scaffold, and has been widely researched to make engineered antibodies better tools for specific applications. However, limitations in their use have led to a number of non-immunoglobulin domains being investigated as customisable scaffolds, to replace or complement antibodies. To be considered a scaffold, a protein domain must show an evolutionarily conserved hydrophobic core in diverse functional contexts. The study presented here investigated the oligosaccharide/oligonucleotide-binding (OB) fold as a scaffold, which is a 5-stranded β-barrel seen in diverse organisms with no sequence conservation. The term “Obody” was coined to describe engineered OB-folds. This thesis examined a previously engineered Obody with affinity for lysozyme (KD = 40 μM) in complex with its ligand by x-ray crystallography (resolution 2.75 Å) which revealed the atomic details of binding. Affinity maturation for lysozyme was undertaken by phage display directed evolution. Gene libraries were constructed by combinatorial PCR incorporating site-specific randomised codons identified by examination of the structure in complex with lysozyme, or by random generation of point mutations by error-prone PCR. Overall a 100-fold improvement in affinity was achieved (KD = 600 nM). To investigate the structural basis of the affinity maturation, two further Obody-lysozyme complexes were solved by x-ray crystallography, one at a KD of 5 μM (resolution 1.96 Å), one at 600 nM (resolution 1.86 Å).
Analysis of the structures revealed changes in individual residue arrangements, as well as rigid-body changes in the relative orientation of the Obody and lysozyme molecules in complex. Directed evolution of Obodies as protein binding reagents remains a challenge, but this study demonstrates their potential. The structures presented here will contribute invaluable insights for the future design of improved Obodies.

    Improved approaches to ligand growing through fragment docking and fragment-based library design

    Fragment-based drug discovery (FBDD) has steadily gained popularity over the past two decades and has become a dominant tool for exploring new chemical molecules as potential bioactive modulators. FBDD is closely tied to fragment-elaboration approaches such as fragment growing, merging, and linking. These approaches can be accelerated with computer programs or semi-automated de novo drug design processes. Although computers can effortlessly generate millions of suggestions, this often comes at the cost of uncertain synthetic feasibility of the compounds, which can lead to a dead end in the optimization process. This manuscript describes the development of two computational tools, PINGUI and SCUBIDOO, intended to advance the FBDD elaboration cycle. PINGUI is a semi-automated workflow for fragment growing that is based on the protein structure and takes synthetic feasibility into account. SCUBIDOO is a freely accessible database currently containing 21 million virtual products, created by combining commercially available building blocks with established organic reactions. Each generated virtual product therefore comes with a synthesis route. The key functions of PINGUI, such as generating derived libraries or applying organic reactions, were subsequently integrated into the SCUBIDOO website. Both PINGUI and SCUBIDOO were further applied to fragment-based ligand discovery with the β-2 adrenergic receptor (β-2-AR) and the PIM1 kinase as targets.
In a first study on the β-2-AR, PINGUI was used to predict (selected?) eight different elaborations for various fragment hits. All eight compounds were successfully synthesized, and four of the eight products showed increased affinity for the target compared with the starting fragments. A second study applied SCUBIDOO to rapidly identify fragments and their possible elaborations with potential binding activity toward the PIM-1 kinase. This yielded one fragment hit with a corresponding crystal structure. Further follow-up products are currently being synthesized. Finally, SCUBIDOO was coupled to automated robotic synthesis, allowing hundreds of compounds to be synthesized efficiently in parallel. Of the 240 predicted products, 127 (53%) have already been synthesized with the aim of binding the β-2-AR and will soon undergo further testing. The two computational tools presented could be used to improve early-stage fragment-based drug discovery projects, particularly with regard to fragment-elaboration strategies. PINGUI, for example, generates fragment-elaboration suggestions that are highly likely to bind the target structure, making it a useful and creative tool for investigating structure-activity relationships (SAR). With a synthesis success rate of 53% so far, SCUBIDOO proved amenable to integration with efficient automated robotic synthesis. Each future synthesis adds new knowledge to the database and will thus gradually raise the synthesis success rate.
Furthermore, all synthesized products are novel compounds, which further underscores the potential impact of SCUBIDOO on the discovery of new chemical structures.

    Eight Biennial Report : April 2005 – March 2007
