
    Ciliate Gene Unscrambling with Fewer Templates

    One of the theoretical models proposed for the mechanism of gene unscrambling in some species of ciliates is the template-guided recombination (TGR) system by Prescott, Ehrenfeucht and Rozenberg, which has been generalized by Daley and McQuillan from a formal language theory perspective. In this paper, we propose a refinement of this model that generates regular languages using the iterated TGR system with a finite initial language and a finite set of templates, using fewer templates and a smaller alphabet than the Daley-McQuillan model. To achieve Turing completeness using only finite components, i.e., a finite initial language and a finite set of templates, we also propose an extension of the contextual template-guided recombination system (CTGR system) by Daley and McQuillan, adding an extra control called permitting contexts on the usage of templates. Comment: In Proceedings DCFS 2010, arXiv:1008.127

    Two Refinements of the Template-Guided DNA Recombination Model of Ciliate Computing

    To solve the mystery of the intricate gene unscrambling mechanism in ciliates, various theoretical models for this process have been proposed from the point of view of computation. Two main models are the reversible guided recombination system by Kari and Landweber and the template-guided recombination (TGR) system by Prescott, Ehrenfeucht and Rozenberg, based on two categories of DNA recombination: the pointer guided and the template directed recombination respectively. The latter model has been generalized by Daley and McQuillan. In this thesis, we propose a new approach to generate regular languages using the iterated TGR system with a finite initial language and a finite set of templates, that reduces the size of the template language and the alphabet compared to that of the Daley-McQuillan model. To achieve computational completeness using only finite components we also propose an extension of the contextual template-guided recombination system (CTGR system) by Daley and McQuillan, by adding an extra control called permitting contexts on the usage of templates. Then we prove that our proposed system, the CTGR system using permitting contexts, has the capability to characterize the family of recursively enumerable languages using a finite initial language and a finite set of templates. Lastly, we present a comparison and analysis of the computational power of the reversible guided recombination system and the TGR system. Keywords: ciliates, gene unscrambling, in vivo computing, DNA computing, cellular computing, reversible guided recombination, template-guided recombination

    Formal Model and Simulation of the Gene Assembly Process in Ciliates

    The construction process of the functional macronucleus in certain types of ciliates is known as the ciliate gene assembly process. It consists of a massive amount of DNA excision from the micronucleus and the rearrangement of the rest of the DNA sequences (in the case of stichotrichous ciliates). While several computational models have tried to represent certain parts of the gene assembly process, the real process remains not completely understood. In this research, a new formal model called the Computational 2JLP model is introduced based on the recent biological 2JLP model. For justifying the formal model, a simulation is created and tested with real data. Several parameters are introduced in the model that are used to test ambiguities or edge cases of the biological model. Parameters are systematically tested from the simulation to try to find their optimal values. Interestingly, a negative correlation is found between a parameter (which is used to filter out scnRNAs that are similar to IES specific sequences from the macronucleus) and the outcome of the simulation. It indicates that if a scnRNA consists of both an MDS and IES, then from the perspective of maximizing the outcome of the simulation, it is desirable to filter out this scnRNA. The simulator successfully performs the gene assembly process whether the inputs are scrambled or unscrambled DNA sequences. It is desirable for this model to serve as a foundation for future computational and mathematical study, and to help inform and refine the biological model

    Optimization of Recombination Methods and Expanding the Utility of Penicillin G Acylase

    Protein engineering can be performed by combinatorial techniques (directed evolution) and data-driven methods using machine-learning algorithms. The main characteristic of directed evolution (DE) is the application of an effective and efficient screen or selection on a diverse mutant library. As it is important to have a diverse mutant library for the success of DE, we compared the performance of DNA-shuffling and recombination PCR on fluorescent proteins using sequence information as well as statistical methods. We found that the diversity of the libraries generated by DNA-shuffling and recombination PCR was dependent on the type of skew primers used and sensitive to nucleotide identity levels between genes. DNA-shuffling and recombination PCR produced libraries with different crossover tendencies, suggesting that the two protocols could be used in combination to produce better libraries. Data-driven protein engineering uses sequence, structure and function data along with analyzed empirical activity information to guide library design. Boolean Learning Support Vector Machines (BLSVM) were used to identify interacting residues in fluorescent proteins, and the gene templates were modified to preserve interactions post recombination. By site-directed mutagenesis, recombination and expression experiments, we validated that BLSVM can be used to identify interacting residues and increase the fraction of active proteins in the library. As an extension to the above experiments, DE was applied to monomeric Red Fluorescent Proteins to improve their spectral characteristics, and structure-guided protein engineering was performed on penicillin G acylase (PGA), an industrially relevant catalyst, to change its substrate specificity. Ph.D. Committee Chair: Bommarius, Andreas; Committee Member: Hu, Wei-Shou; Committee Member: Lee, Jay; Committee Member: Lutz, Stefan; Committee Member: Prausnitz, Mar

    Biotechnological platforms for aryl-alcohol oxidases by directed evolution

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Biología Molecular. Date of defense: 03-10-2019. Full-text access to this thesis is embargoed until 03-04-2021. The aryl-alcohol oxidase (AAO) is a fungal flavoenzyme that supplies H2O2 to the ligninolytic consortium during natural wood decay. Being active on a wide array of aromatic alcohols, this GMC oxidase presents a highly enantioselective mechanism of great interest in organic synthesis processes. The most powerful strategy for the AAO to meet industrial standards is the engineering of its properties by directed evolution. In the present Doctoral Thesis, an evolutionary platform for the AAO from Pleurotus eryngii was developed in order to: (i) obtain functional expression in yeasts, (ii) design a secondary benzyl-alcohol oxidase, and (iii) explore the enzymatic conversion of furfural derivatives. To achieve functional expression in Saccharomyces cerevisiae, the AAO gene was fused to different signal peptides including chimeric versions of the mating-α factor and the killer K1 toxin preprosequences. The platform for in vitro evolution was completed with a dual high-throughput screening assay to detect H2O2 that included a method based on the Fenton reaction. To enhance secretion, several libraries were created combining classical evolution (i.e. mutagenic PCR and DNA shuffling) with structure-guided evolution by MORPHING. The final secretion variant FX9 carried four mutations in the signal peptide and two substitutions in the mature protein, including the consensus/ancestral H91N. FX9 improved secretion up to 4.5 mg/L and presented high stability and kinetic values similar to the native enzyme. FX9 was cloned and expressed in Pichia pastoris, maintaining expression levels and main biochemical properties. When the production was scaled up in a 5 L fermenter, AAO production increased to 25.5 mg/L.
FX9 was further evolved to selectively oxidize secondary benzyl alcohols. The residual activity on chiral molecules was unlocked by modulating the catalytic pocket through combinatorial saturation mutagenesis. After four generations, which included a site-directed recombination step to polish mutations, the LanDo variant harbored five new substitutions, increasing the catalytic efficiency with 1-(p-methoxyphenyl)-ethanol by 3 orders of magnitude with a 99% ee. Exploring the transformation of 5-hydroxymethylfurfural (HMF) into furan-2,5-dicarboxylic acid (FDCA), FX9 acquired mutation F501W, which improved catalytic efficiency on HMF 3-fold and showed for the first time the performance of three consecutive oxidations for the AAO.
Funding that enabled these doctoral studies: the European projects “Optimized oxidoreductases for medium and large scale industrial biotransformations (INDOX FP7-KBBE-2013-7-613549)” and “New enzymatic oxidation/oxyfunctionalization technologies for added value bio-based products. (ENZOX2 H2020-BBI-PPP-2015-2-720297)”, and the national projects “Evolución dirigida de oxidoreductasas ligninolíticas modernas y ancestrales para el diseño de una levadura de podredumbre blanca (DEWRY BIO2013-43407-R)”, “Evolución dirigida y computacional de ligninasas (LIGNOLUTION BIO2016-79106-R)” and “Química sintética mediante enzimas quiméricas de fusión diseñadas por evolución dirigida y computacional (EVOCHIMERA Y2018/BIO-4738)”

    CRISPR/Cas9-induced (CTG⋅CAG)n repeat instability in the myotonic dystrophy type 1 locus: implications for therapeutic genome editing

    Myotonic dystrophy type 1 (DM1) is caused by (CTG⋅CAG)n-repeat expansion within the DMPK gene and thought to be mediated by a toxic RNA gain of function. Current attempts to develop therapy for this disease mainly aim at destroying or blocking abnormal properties of mutant DMPK (CUG)n RNA. Here, we explored a DNA-directed strategy and demonstrate that single clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-cleavage in either its 5′ or 3′ unique flank promotes uncontrollable deletion of large segments from the expanded trinucleotide repeat, rather than formation of short indels usually seen after double-strand break repair. Complete and precise excision of the repeat tract from normal and large expanded DMPK alleles in myoblasts from unaffected individuals, DM1 patients, and a DM1 mouse model could be achieved at high frequency by dual CRISPR/Cas9-cleavage at either side of the (CTG⋅CAG)n sequence. Importantly, removal of the repeat appeared to have no detrimental effects on the expression of genes in the DM1 locus. Moreover, myogenic capacity, nucleocytoplasmic distribution, and abnormal RNP-binding behavior of transcripts from the edited DMPK gene were normalized. Dual sgRNA-guided excision of the (CTG⋅CAG)n tract by CRISPR/Cas9 technology is applicable for developing isogenic cell lines for research and may provide new therapeutic opportunities for patients with DM1

    Machine Learning Guided Exploration of an Empirical Ribozyme Fitness Landscape

    Okinawa Institute of Science and Technology Graduate University. Doctor of Philosophy. The fitness landscape of a biomolecule is a representation of its activity as a function of its sequence. Properties of a fitness landscape determine how evolution proceeds. Therefore, the distribution of functional variants and, more importantly, the connectivity of these variants within the sequence space are important scientific questions. Exploration of these spaces, however, is impeded by the combinatorial explosion of the sequence space. High-throughput experimental methods have recently reduced this impediment, but only modestly. Better computational methods are needed to fully utilize the rich information from these experimental data to better understand the properties of the fitness landscape. In this work, I seek to improve this exploration process by combining data from a massively parallel experimental assay with smart library design using advanced computational techniques. I focus on an artificial RNA enzyme, or ribozyme, that can catalyze a ligation reaction between two RNA fragments. This chemistry is analogous to that of modern RNA polymerase enzymes and therefore represents an important reaction in the origin of life. In the first chapter, I discuss the background to this work in the context of the evolutionary theory of fitness landscapes and its implications in biotechnology. In chapter 2, I explore the use of processes borrowed from the field of evolutionary computation to solve optimization problems using real experimental sequence-activity data. In chapter 3, I investigate the use of supervised machine learning models to extract information on epistatic interactions from the dataset collected during multiple rounds of directed evolution. I investigate and experimentally validate the extent to which a deep learning model can be used to guide a completely computational evolutionary algorithm towards distant regions of the fitness landscape.
In the final chapter, I perform a comprehensive experimental assay of the combinatorial region explored by the deep learning-guided evolutionary algorithm. Using this dataset, I analyze higher-order epistasis and attempt to explain the increased predictability of the region sampled by the algorithm. Finally, I provide the first experimental evidence of a large RNA ‘neutral network’. Altogether, this work represents the most comprehensive experimental and computational study of the RNA ligase ribozyme fitness landscape to date, providing important insights into the evolutionary search space possibly explored during the earliest stages of life. doctoral thesi

    Directed evolution of ancestral and modern enzymes

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Biología Molecular. Date of defense: 18-10-2019. Full-text access to this thesis is embargoed until 18-04-2021. Ancestral sequence reconstruction (ASR) and resurrection (i.e. functional expression in a heterologous host) allow enzymes with different properties to be disclosed, while their combination with directed evolution may lead to the development of a new generation of biocatalysts. In this Doctoral Thesis we have explored the combination of such powerful methods using as blueprints two different enzyme systems, Rubisco and laccase. In the first chapter of this Thesis we reconstructed and resurrected (in Escherichia coli) Precambrian Rubisco nodes, which were evolved in parallel with the extant Rubisco counterpart. An in vitro dual high-throughput screening (HTS) method was set up to identify thermostable and functional variants after applying a palette of directed evolution strategies. The stronger tolerance to mutational loads, the improved expression and the different kinetic behavior were some of the traits that stood out in the Precambrian enzyme. In particular, the evolved ancestral Clone B2 stood out as a case study of this elusive protein due to its alternative performance in the classical equilibrium of Rubisco kinetic constants. In the second chapter we focused ASR and directed evolution on basidiomycete laccases. Firstly, ancestral nodes from the Paleozoic were reconstructed and resurrected in Saccharomyces cerevisiae. The resurrected enzymes showed a higher heterologous expression and a broader pH stability profile than the modern, laboratory-evolved counterpart. The most promising ancestral node was subjected to structure-guided evolution for the oxidation of β-diketones, an unusual type of redox mediator capable of initiating the polymerization of vinyl monomers.
The final chapter of the Thesis deals with consensus design, a long-standing protein engineering method to increase stability without compromising activity. We applied an in-house consensus method to stabilize a laboratory-evolved high-redox potential laccase. Multiple sequence alignments were carried out and computationally refined by applying relative entropy and mutual information thresholds. Through this approach, an ensemble of 20 consensus mutations was identified, 18 of which were consensus ancestral mutations. After analyzing potential epistasis by site-directed recombination in vivo, the best mutant was characterized, displaying dramatically improved thermostability, kinetic values and secretion levels.
This Doctoral Thesis was carried out thanks to funding received through a research personnel training (FPI) fellowship from the Ministerio de Economía y Competitividad (BES-2014-068887) within the national projects DEWRY (BIO2013-43407-R) and LIGNOLUTION (BIO2016-79106-R)