Optimization Algorithms for Site-directed Protein Recombination Experiment Planning

Author: Zheng Wei
Publication venue: Dartmouth Digital Commons
Publication date: 01/06/2010

Site-directed protein recombination produces improved and novel protein variants by recombining sequence fragments from parent proteins. The resulting hybrids accumulate multiple mutations that have been evolutionarily accepted together. Subsequent screening or selection identifies hybrids with desirable characteristics. In order to increase the hit rate of good variants, this thesis develops experiment planning algorithms to optimize protein recombination experiments. First, to improve the frequency of generating novel hybrids, a metric is developed to assess the diversity among hybrids and parent proteins. Dynamic programming algorithms are then created to optimize the selection of breakpoint locations according to this metric. Second, the trade-off between diversity and stability in recombination experiment planning is studied, recognizing that diversity requires changes from parent proteins, which may also disrupt important residue interactions necessary for protein stability. Accordingly, methods based on dynamic programming are developed to provide combined optimization of diversity and stability, finding optimal breakpoints such that no other experiment plan has better performance in both aspects simultaneously. Third, in order to support protein recombination with heterogeneous structures and focus on functionally important regions, a general framework for protein fragment swapping is developed. Differentiating source and target parents, and swappable regions within them, fragment swapping enables asymmetric, selective site-directed recombination. Two applications of protein fragment swapping are studied. In order to generate hybrids inheriting functionalities from both source and target proteins by fragment swapping, a method based on integer programming selects optimal swapping fragments to maximize the predicted stability and activity of hybrids in the resulting library. 
In another application, human source protein fragments are swapped into a therapeutic exogenous target protein to minimize the occurrence of peptides that trigger an immune response. A dynamic programming method is developed to optimize fragment selection for both humanity and functionality, resulting in therapeutically active variants with decreased immunogenicity.
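The abstract above describes dynamic programming over breakpoint locations. The following is only a minimal sketch of that general idea, not the thesis's algorithm: it assumes the objective decomposes into a sum of per-fragment scores, and both the function name `best_breakpoints` and the `score(i, j)` callback are hypothetical stand-ins for whatever diversity or stability metric is actually used.

```python
from typing import Callable, List, Tuple


def best_breakpoints(
    n: int, k: int, score: Callable[[int, int], float]
) -> Tuple[float, List[int]]:
    """Pick k breakpoints in a sequence of n residues to maximize the
    summed score of the resulting k + 1 fragments.

    score(i, j) is a hypothetical per-fragment score for residues [i, j).
    """
    if n < k + 1:
        raise ValueError("need at least one residue per fragment")
    neg_inf = float("-inf")
    parts = k + 1
    # best[m][j]: best total when the prefix [0, j) is cut into m fragments
    best = [[neg_inf] * (n + 1) for _ in range(parts + 1)]
    back = [[-1] * (n + 1) for _ in range(parts + 1)]
    best[0][0] = 0.0
    for m in range(1, parts + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                if best[m - 1][i] == neg_inf:
                    continue
                cand = best[m - 1][i] + score(i, j)
                if cand > best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i
    # Walk the back-pointers from the full sequence to recover the cuts.
    cuts: List[int] = []
    j = n
    for m in range(parts, 0, -1):
        i = back[m][j]
        if m > 1:
            cuts.append(i)
        j = i
    cuts.reverse()
    return best[parts][n], cuts


if __name__ == "__main__":
    # Toy score that penalizes long fragments, so even spacing wins.
    total, cuts = best_breakpoints(9, 2, lambda i, j: -float((j - i) ** 2))
    print(total, cuts)
```

The table has (k + 1) x (n + 1) entries and each is filled by scanning earlier cut positions, so the sketch runs in O(k n^2) calls to `score`. A real planner would replace the toy score with a metric computed from the parent sequences.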

Dartmouth Digital Commons (Dartmouth College)

Algorithms for optimizing cross-overs in DNA shuffling

Authors: Bailey-Kellogg Chris, Friedman Alan M, He Lu
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: DNA shuffling generates combinatorial libraries of chimeric genes by stochastically recombining parent genes. The resulting libraries are subjected to large-scale genetic selection or screening to identify those chimeras with favorable properties (e.g., enhanced stability or enzymatic activity). While DNA shuffling has been applied quite successfully, it is limited by its homology-dependent, stochastic nature. Consequently, it is used only with parents of sufficient overall sequence identity, and provides no control over the resulting chimeric library. Results: This paper presents efficient methods to extend the scope of DNA shuffling to handle significantly more diverse parents and to generate more predictable, optimized libraries. Our CODNS (cross-over optimization for DNA shuffling) approach employs polynomial-time dynamic programming algorithms to select codons for the parental amino acids, allowing for zero or a fixed number of conservative substitutions. We first present efficient algorithms to optimize the local sequence identity or the nearest-neighbor approximation of the change in free energy upon annealing, objectives that were previously optimized by computationally expensive integer programming methods. We then present efficient algorithms for more powerful objectives that seek to localize and enhance the frequency of recombination by producing “runs” of common nucleotides either overall or according to the sequence diversity of the resulting chimeras. We demonstrate the effectiveness of CODNS in choosing codons and allocating substitutions to promote recombination between parents targeted in earlier studies: two GAR transformylases (41% amino acid sequence identity), two very distantly related DNA polymerases, Pol X and β (15%), and beta-lactamases of varying identity (26-47%).
Conclusions: Our methods provide the protein engineer with a new approach to DNA shuffling that supports substantially more diverse parents, is more deterministic, and generates more predictable and more diverse chimeric libraries.

Springer - Publisher Connector

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Experiment Planning for Protein Structure Elucidation and Site-Directed Protein Recombination

Author: Ye Xiaoduan
Publication venue: Dartmouth Digital Commons
Publication date: 01/05/2007

In order to most effectively investigate protein structure and improve protein function, it is necessary to carefully plan appropriate experiments. The combinatorial number of possible experiment plans demands effective criteria and efficient algorithms to choose the one that is in some sense optimal. This thesis addresses experiment planning challenges in two significant applications. The first part of this thesis develops an integrated computational-experimental approach for rapid discrimination of predicted protein structure models by quantifying their consistency with relatively cheap and easy experiments (cross-linking and site-directed mutagenesis followed by stability measurement). In order to obtain the most information from noisy and sparse experimental data, rigorous Bayesian frameworks have been developed to analyze the information content. Efficient algorithms have been developed to choose the most informative, least expensive, and most robust experiments. The effectiveness of this approach has been demonstrated using existing experimental data as well as simulations, and it has been applied to discriminate predicted structure models of the pTfa chaperone protein from bacteriophage lambda. The second part of this thesis seeks to choose optimal breakpoint locations for protein engineering by site-directed recombination. In order to increase the possibility of obtaining folded and functional hybrids in protein recombination, it is necessary to retain the evolutionary relationships among amino acids that determine protein stability and functionality. A probabilistic hypergraph model has been developed to model these relationships, with edge weights representing their statistical significance derived from database and a protein family. The effectiveness of this model has been validated by showing its ability to distinguish functional hybrids from non-functional ones in existing experimental data. 
Choosing the optimal breakpoint locations for recombination that minimize the total perturbation to these relationships has been proved NP-hard in general, but exact and approximate algorithms have been developed for a number of important cases.

Dartmouth Digital Commons (Dartmouth College)

Computational Developability Assessment of Antibody Therapeutics

Author: Khetan Rahul
Publication date: 31/12/2023

The University of Manchester - Institutional Repository

Exploring protein fitness landscapes with new high-throughput technologies

Author: Zurek Paul Jannis
Publication venue: University of Cambridge
Publication date: 17/03/2021

The concept of a protein’s fitness landscape – an abstract space in which related sequences are close together and matched with their fitness – is a useful tool to visualize core principles of protein evolution. Acquiring a new function, for example the laboratory evolution of an enzyme to convert an industrially relevant substrate, can be understood as a stepwise climb through a fitness landscape, reaching higher fitness (or activity) with each step (or mutation). The valleys of such a space relate to the starting points of protein engineering campaigns. Understanding this area could enlighten principles of how proteins quickly adapt in nature and help to identify starting points with a high potential for evolution, a high ‘evolvability’, speeding up protein engineering. In this study, high-throughput technologies will be developed that enable the read-out of directed evolution on a large scale, tracking the exploration of the valley of a fitness landscape: the conversion of an amino acid- to amine dehydrogenase will be investigated as a model of enzyme evolvability with a drastic change of substrate specificity. A sensitive high-throughput screening assay as well as a comprehensive sequencing read-out will be required to establish the identity of selected variants during evolution. I will first generate and characterize three different but related starting points and test their initial evolvability. Stabilizing the starting point results in increased mutational robustness, broadening the range of accepted mutations. However, increased initial stability does not necessarily correlate to higher functional improvement, hinting at a nuanced view of evolvability. A sensitive high-throughput assay is necessary to verify the full potential of the starting points and study the early steps of evolution comprehensively. 
Broadly applicable ultrahigh-throughput assays of enzyme function, such as absorbance-activated droplet sorting, currently lack the sensitivity of more specific fluorescence-based or low-throughput counterparts. A universal approach to increase detectability in single cell-lysate microfluidic enzyme assays is established by amplifying the enzyme content per droplet more than 10-fold via homogeneous clonal cell growth. Clonal amplification enables the sensitive and precise detection of newly introduced amine dehydrogenase activities, a feat restricted in conventional assays by low initial activity and stability. To generate a truly complete view of directed evolution in a fitness landscape, however, an equally powerful sequencing read-out is necessary to identify all selected variants. Here, unique molecular identifiers are used to increase the accuracy of nanopore sequencing to levels that can reliably distinguish point mutations. I establish an inexpensive and straightforward long read amplicon sequencing workflow which is then applied to map the trajectories of two comparative long-term directed evolution campaigns. In the parallel evolution campaigns, initial beneficial mutations are exclusive to each starting point and lead to incompatible trajectories. Beneficial mutations are scarce and large improvements are unavailable until recombination occurs and a jump through the fitness landscape is realized. The recombined variant holds high evolvability and quickly evolves to take over the population and form the most successful lineages, indicating the power of recombination as a means to innovation in protein evolution. The tools established in this thesis can help protein engineers explore fitness landscapes more economically and comprehensively. 
Their application to mapping full trajectories of early adaptation uncovers differences in the evolvability of homologs, potentially aiding the identification of evolvable starting points as well as strategies to increase evolvability for efficient protein engineering in the future.
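The abstract mentions using unique molecular identifiers (UMIs) to raise the accuracy of nanopore reads. As a rough illustration of the underlying idea only, and not the workflow developed in the thesis, the sketch below groups reads by UMI and takes a per-position majority vote. It assumes reads are already aligned and of equal length; the function name `umi_consensus` and the `min_reads` threshold are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def umi_consensus(
    reads: Iterable[Tuple[str, str]], min_reads: int = 3
) -> Dict[str, str]:
    """Build one consensus sequence per UMI by per-position majority vote.

    reads yields (umi, sequence) pairs. Assumption: sequences sharing a UMI
    are pre-aligned and equal in length; groups that are not are skipped.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus: Dict[str, str] = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few reads to out-vote sequencing errors
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue  # toy sketch: no realignment step
        # zip(*seqs) yields one column of bases per position
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


if __name__ == "__main__":
    demo = [
        ("AAAC", "ACGT"),
        ("AAAC", "ACGT"),
        ("AAAC", "ACTT"),  # one read carries an error at position 2
        ("GGGT", "TTTT"),  # singleton UMI, dropped by min_reads
    ]
    print(umi_consensus(demo))
```

Because random errors rarely hit the same position in most reads of a molecule, the vote suppresses them, which is what lets true point mutations stand out. Real pipelines also have to handle insertions, deletions, and errors within the UMI itself, which this sketch ignores.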

Apollo (Cambridge)

Engineering stabilized enzymes via computational design and immobilization

Author: Johnson Lucas B.
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2016

The realm of biocatalysis has significantly matured beyond ancient fermentation techniques to accommodate the demand for modern day products. Enzymatically produced goods already influence our daily lives, from sweeteners and laundry detergent to blood pressure medication and antibiotics. Protein engineering has been a major driving force behind this biorevolution, yielding catalysts that can transform non-native substrates and withstand harsh industrial conditions. Although successful in many regards, computational design efforts are still limited by the crude approximations employed in searching a complex energy landscape. Advancements in protein engineering methods will be necessary to develop our understanding of biomolecules and accelerate the next generation of biotechnology applications. Our work employs a combination of computational design and simulation to achieve improved enzyme stability. In the first example, an enzyme used in the production of cellulosic biofuels was redesigned to remain active at high temperature. An initial approach involving consensus sequence analysis, predicted point mutation energy, and combinatorial optimization resulted in a sequence with reduced stability and activity. However, by using recombination methods and molecular dynamics simulations, we were able to identify specific mutations that had a stabilizing or destabilizing effect, and we successfully isolated mutations that benefited enzyme stability. Our iterative approach demonstrated how common design failures could be overcome by careful interpretation and suggested methods for improving future computational design efforts. In the second example, a cellulase was designed to have a high net charge via selected surface mutagenesis. “Supercharged” cellulases were experimentally characterized in various ionic liquids to assess the effect of high ion concentration on enzyme stability and activity.
The designed enzymes also provided an opportunity to systematically probe the protein-solvent interface. Molecular dynamics simulations showed how ions influenced protein behavior by inducing minor unfolding events or by physically blocking the active site. Contrary to previous reports, charged mutations only appeared to alter the affinity of anions and did not significantly change the binding of cations at the protein surface. Understanding the different modes of enzyme inactivation could motivate targeted design strategies for engineering protein resilience in ionic solvents. In addition to the discussed computational design methods, immobilization strategies were identified for capturing enzymes within porous protein crystals. Immobilization offers a generic approach for improving enzyme stability and activity. Our preliminary studies involving horseradish peroxidase and other enzymes suggested protein scaffolds could be employed as an effective immobilization material. Co-immobilizing multiple enzymes within the porous material led to improved product yield via exclusion of off-pathway reactions. Although future studies will be required to assess the potential capabilities of this immobilization strategy in comparison to other materials, preliminary results suggest protein crystals offer a favorable, controlled environment for immobilizing enzymes. The diversity of approaches presented in this thesis emphasizes that there are many options for engineering enzyme stability. Extending the lessons learned from our cellulase engineering to the greater field of rational protein design promotes the concept of biomolecules as designable entities. By establishing the shortcomings of our designs and suggesting routes for improvement, we anticipate our design methods and immobilization strategies will procure continued interest from the biotechnology community.
The toolsets we developed for cellulases can be directly transferred to other enzymes and have the potential to impact a range of protein engineering applications.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Bayesian methods for small molecule identification

Author: Ludwig Marcus
Publication date: 01/01/2020

Confident identification of small molecules remains a major challenge in untargeted metabolomics, natural product research and related fields. Liquid chromatography-tandem mass spectrometry is a predominant technique for the high-throughput analysis of small molecules and can detect thousands of different compounds in a biological sample. The automated interpretation of the resulting tandem mass spectra is highly non-trivial and many studies are limited to re-discovering known compounds by searching mass spectra in spectral reference libraries. But these libraries are vastly incomplete and a large portion of measured compounds remains unidentified. This constitutes a major bottleneck in the comprehensive, high-throughput analysis of metabolomics data. In this thesis, we present two computational methods that address different steps in the identification process of small molecules from tandem mass spectra. ZODIAC is a novel method for de novo (that is, database-independent) molecular formula annotation in complete datasets. It exploits similarities of compounds co-occurring in a sample to find the most likely molecular formula for each individual compound. ZODIAC improves on the currently best-performing method, SIRIUS, on one dataset by 16.5-fold. We show that de novo molecular formula annotation is not just a theoretical advantage: we discover multiple novel molecular formulas absent from PubChem, one of the biggest structure databases. Furthermore, we introduce a novel scoring for CSI:FingerID, a state-of-the-art method for searching tandem mass spectra in a structure database. This scoring models dependencies between different molecular properties in a predicted molecular fingerprint via Bayesian networks. This problem has the unusual property that the marginal probabilities differ for each predicted query fingerprint. Thus, we need to apply Bayesian networks in a novel, non-standard fashion. Modeling dependencies improves on the currently best scoring

Digitale Bibliothek Thüringen

Directed evolution and structural analysis of an OB-fold domain towards a specific binding reagent

Author: Steemson John Durand
Publication venue: University of Waikato
Publication date: 10/08/2011

Interactions between proteins are a central concept in biology, and understanding and manipulation of these interactions is key to advancing biological science. Research into antibodies as customised binding molecules provided the foundation for development of the field of protein “scaffolds” for molecular recognition, where functional residues are mounted on to a stable protein platform. Consequently, the immunoglobulin domain has been described as “nature’s paradigm” for a scaffold, and has been widely researched to make engineered antibodies better tools for specific applications. However, limitations in their use have led to a number of non-immunoglobulin domains being investigated as customisable scaffolds, to replace or complement antibodies. To be considered a scaffold, a protein domain must show an evolutionarily conserved hydrophobic core in diverse functional contexts. The study presented here investigated the oligosaccharide/oligonucleotide-binding (OB) fold as a scaffold, which is a 5-stranded β-barrel seen in diverse organisms with no sequence conservation. The term “Obody” was coined to describe engineered OB-folds. This thesis examined a previously engineered Obody with affinity for lysozyme (KD = 40 μM) in complex with its ligand by x-ray crystallography (resolution 2.75 Å) which revealed the atomic details of binding. Affinity maturation for lysozyme was undertaken by phage display directed evolution. Gene libraries were constructed by combinatorial PCR incorporating site-specific randomised codons identified by examination of the structure in complex with lysozyme, or by random generation of point mutations by error-prone PCR. Overall a 100-fold improvement in affinity was achieved (KD = 600 nM). To investigate the structural basis of the affinity maturation, two further Obody-lysozyme complexes were solved by x-ray crystallography, one at a KD of 5 μM (resolution 1.96 Å), one at 600 nM (resolution 1.86 Å).
Analysis of the structures revealed changes in individual residue arrangements, as well as rigid-body changes in the relative orientation of the Obody and lysozyme molecules in complex. Directed evolution of Obodies as protein binding reagents remains a challenge, but this study demonstrates their potential. The structures presented here will contribute invaluable insights for the future design of improved Obodies.

Research Commons@Waikato

Improved approaches to ligand growing through fragment docking and fragment-based library design

Author: Chevillard Florent
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2016

Fragment-based drug discovery (FBDD) has steadily gained popularity over the past two decades and has become a dominant tool for exploring new chemical molecules as potential bioactive modulators. FBDD is closely linked to fragment elaboration approaches such as fragment growing, merging, and linking. These approaches can be accelerated with computer programs or semi-automated de novo drug design workflows. Although computers can effortlessly generate millions of suggestions, this often comes at the cost of uncertain synthetic feasibility of the compounds, potentially leading to a dead end in the optimization process. This manuscript describes the development of two computational tools, PINGUI and SCUBIDOO, aimed at advancing the FBDD elaboration cycle. PINGUI is a semi-automated workflow for fragment elaboration based on the protein structure that takes synthetic feasibility into account. SCUBIDOO is a freely accessible database currently containing 21 million virtual products, built by combining commercially available building blocks with established organic reactions. A synthesis route is therefore supplied for every virtual product generated. The key functions of PINGUI, such as generating derived libraries or applying organic reactions, were subsequently integrated into the SCUBIDOO website. Both PINGUI and SCUBIDOO were further applied to fragment-based ligand discovery with the β-2 adrenergic receptor (β-2-AR) and the PIM1 kinase as targets.
In a first study on β-2-AR, PINGUI was used to predict eight different elaborations for various fragment hits. All eight compounds were successfully synthesized, and four of the eight products showed increased affinity for the target compared with the starting fragments. A second study applied SCUBIDOO to rapidly identify fragments and their possible elaborations with potential binding activity toward PIM-1 kinase. This yielded one fragment hit with an accompanying crystal structure. Further follow-up products are currently being synthesized. Finally, SCUBIDOO was coupled to automated robotic synthesis, allowing hundreds of compounds to be synthesized efficiently in parallel. 127 of the 240 predicted products (53%), designed to bind β-2-AR, have already been synthesized and will soon be tested further. The two tools presented could be used to improve early-stage fragment-based drug discovery projects, particularly with respect to fragment elaboration strategies. PINGUI, for example, generates fragment elaboration suggestions that are highly likely to bind the target structure, making it a useful and creative tool for investigating structure-activity relationships (SAR). With a synthesis success rate of 53% so far, SCUBIDOO proved amenable to integration with efficient automated robotic synthesis. Each future synthesis adds new knowledge to the database and will thus gradually raise the synthesis success rate. Moreover, all synthesized products are novel compounds, which further underlines SCUBIDOO's potential impact on the discovery of new chemical structures.

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Eighth Biennial Report : April 2005 – March 2007

Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2007

MPG.PuRe