660 research outputs found

    Evolving artificial cell signaling networks: perspectives and methods

    Nature is a source of inspiration for computational techniques that have been successfully applied to a wide variety of complex application domains. In keeping with this, we examine Cell Signaling Networks (CSNs), chemical networks responsible for coordinating cell activities within their environment. Through evolution they have become highly efficient at governing critical control processes such as immunological responses, cell cycle control and homeostasis. Realising (and evolving) Artificial Cell Signaling Networks (ACSNs) may provide new computational paradigms for a variety of application areas. In this paper we introduce an abstraction of Cell Signaling Networks focusing on four characteristic properties: Computation, Evolution, Crosstalk and Robustness. These properties are also desirable for potential applications in control systems, computation and signal processing. They are used as a guide for the development of an ACSN evolutionary simulation platform. We then describe a novel class of Artificial Chemistry named Molecular Classifier Systems (MCS) to simulate ACSNs. The MCS can be regarded as a special-purpose derivation of Holland's Learning Classifier System (LCS). We propose an instance of the MCS, called the MCS.b, that extends the precursor of the LCS: the broadcast language. We believe the MCS.b can offer a general-purpose tool that can assist in the study of real CSNs in silico. Our current research is part of the multidisciplinary European-funded project ESIGNET, whose central aim is to study the computational properties of CSNs by evolving them using methods from evolutionary computation, and to re-apply this understanding in developing new ways to model and predict real CSNs.

    A Synthetic Genetic System to Investigate Brain Connectivity and Genetically Manipulate Interacting Cells

    The underlying goal of neuroscience research is to understand how the nervous system functions to bring about behavior. A detailed map of neural circuits is required for scientists to tackle this question. To this end, we developed a synthetic, genetically encoded system, TRanscellular ACtivation of Transcription (TRACT), to monitor cell-cell contact. Upon ligand-receptor interaction at sites of cell-cell contact, the transmembrane domain of an engineered Notch receptor is cleaved by intramembrane proteolysis and releases a fragment that regulates transcription in the receptor-expressing cell. We demonstrate that in cultured cells, the synthetic receptor can be activated to drive reporter gene expression by co-incubation with ligand-expressing cells or by growth on ligand-coated surfaces. We further show that TRACT can detect interactions between neurons and glia in the Drosophila brain; expressing the ligand in spatially restricted subsets of neurons leads to transcription of a reporter in the glial cells that interact with those neurons. To optimize TRACT for neural tracing, we attempted to target the synthetic receptor to post-synaptic sites by fusing it with the intracellular domain of Drosophila neuroligin2. However, this modification only caused the receptor to localize homogeneously throughout the neurites. Induction data show that the modified receptor is more sensitive than the original receptor, but the ligand-receptor interaction still occurred at non-synaptic sites of membrane contact. To target the ligand to pre-synaptic sites, we fused the ligand to different pre-synaptic markers. We found that the ligand fused with synaptobrevin is likely located at axon terminals, but is only able to trigger moderate induction. Therefore, further experiments are required to characterize the capability of this ligand.
In summary, TRACT is useful for monitoring cell-cell interactions in animals and could also be used to genetically manipulate cells based on contact. Moreover, we believe that proper targeting of the ligand to synaptic sites will improve the specificity of TRACT for synaptic connections in the future.

    Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique

    BACKGROUND: Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulation using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. In reality, however, instantaneous and time-delayed biological regulations occur simultaneously. A framework that can detect and model both types of interaction simultaneously would represent gene regulatory networks more accurately. RESULTS: In this paper, we introduce a framework based on the BN formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. We also propose a novel scoring metric with firm mathematical underpinnings that, unlike other recent methods, can score both kinds of interaction concurrently and takes into account that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, we present a gene regulatory network (GRN) inference method that employs an evolutionary search using the framework and the scoring metric. CONCLUSION: By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real-life gene expression data from Saccharomyces cerevisiae, E. coli and cyanobacteria. The results show the effectiveness of our approach.
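The abstract does not spell out its scoring metric. As a rough illustration of what scoring an instantaneous and a time-delayed dependency on the same data can look like, the sketch below uses plain empirical mutual information as a stand-in. This is an assumption for illustration, not the paper's metric, and the function names, gene names and toy series are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def lagged_mi(regulator, target, lag):
    """MI between the regulator at time t - lag and the target at time t (lag=0 is instantaneous)."""
    if lag == 0:
        return mutual_information(regulator, target)
    return mutual_information(regulator[:-lag], target[lag:])


# Hypothetical discretised expression series: gene_b copies gene_a one time step later.
gene_a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0]
gene_b = [0] + gene_a[:-1]

instantaneous = lagged_mi(gene_a, gene_b, lag=0)
delayed = lagged_mi(gene_a, gene_b, lag=1)
print(f"instantaneous: {instantaneous:.3f} bits, delayed: {delayed:.3f} bits")
```

On this toy data the delayed score is much higher than the instantaneous one, which is the kind of distinction a combined framework needs to capture. A pairwise score like this ignores joint regulation by several genes, which the abstract names as a limitation of pair-wise methods.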

    Chemometric Approaches for Systems Biology

    This Ph.D. thesis is devoted to studying, developing and applying approaches commonly used in chemometrics to the emerging field of systems biology. Existing procedures and new methods are applied to solve research and industrial questions in different multidisciplinary teams. The methodologies developed in this document enrich the range of procedures employed within omic sciences to understand biological organisms, and improve processes in biotechnological industries by integrating biological knowledge at different levels and exploiting the software packages derived from the thesis. This dissertation is structured in four parts. The first block describes the framework on which the contributions presented here are based. The objectives of the two research projects related to this thesis are highlighted, and the specific topics addressed in this document via conference presentations and research articles are introduced. This part gives a comprehensive description of omic sciences and their relationships within the systems biology paradigm, together with a review of the most widely applied multivariate methods in chemometrics, on which the novel approaches proposed here are founded. The second part addresses problems of data understanding within metabolomics, fluxomics, proteomics and genomics. Different alternatives are proposed in this block to understand flux data in steady-state conditions. Some are based on multivariate methods previously applied in other areas of chemometrics. Others are novel approaches based on a bilinear decomposition using elemental metabolic pathways, for which a GNU-licensed toolbox is made freely available to the scientific community. In addition, a framework for understanding metabolic data is proposed for non-steady-state data, using the same bilinear decomposition proposed for steady-state data but modelling the dynamics of the experiments with novel two- and three-way data analysis procedures.
The relationships between different omic levels are also assessed in this part by integrating different sources of information on plant viruses in data fusion models. Finally, an example of interaction between organisms, oranges and fungi, is studied via multivariate image analysis techniques, with future application in the food industry. The third block of this thesis is a thorough study of different missing data problems related to chemometrics, systems biology and industrial bioprocesses. In the theoretical chapters of this part, new algorithms are proposed to obtain multivariate exploratory and regression models in the presence of missing data; these also serve as preprocessing steps for any other methodology used by practitioners. Regarding applications, this block explores the reconstruction of networks in omic sciences when missing and faulty measurements appear in databases, and how calibration models can be transferred between near-infrared instruments, avoiding costly and time-consuming full recalibrations in bioindustries and research laboratories. Finally, another software package, including a graphical user interface, is made freely available for missing data imputation. The last part discusses the relevance of this dissertation for research and biotechnology, including proposals deserving future research.
Folch Fortuny, A. (2016). Chemometric Approaches for Systems Biology [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/77148

    Cloning, Reconstruction and Heterologous Expression of Secondary Metabolite Gene Clusters from Fusarium


    MACHINE LEARNING APPROACHES FOR BIOMARKER IDENTIFICATION AND SUBGROUP DISCOVERY FOR POST-TRAUMATIC STRESS DISORDER

    Post-traumatic stress disorder (PTSD) is a psychiatric disorder caused by environmental and genetic factors resulting from alterations in genetic variation, epigenetic changes and neuroimaging characteristics. There is a pressing need to identify reliable molecular and physiological biomarkers for accurate diagnosis, prognosis, and treatment, as well as to deepen the understanding of PTSD pathophysiology. Machine learning methods are widely used to infer patterns from biological data, identify biomarkers, and make predictions. The objective of this research is to apply machine learning methods for the accurate classification of human diseases from genome-scale datasets, focusing primarily on PTSD. The DoD-funded Systems Biology of PTSD Consortium has recruited combat veterans with and without PTSD for measurement of molecular and physiological data from blood or urine samples, with the goal of identifying accurate and specific PTSD biomarkers. As a member of the Consortium with access to these PTSD multi-omics datasets, we first completed a project titled Clinical Subgroup-Specific PTSD Classification and Biomarker Discovery. We applied machine learning approaches to these data to build classification models consisting of molecular and clinical features to predict PTSD status. We also identified candidate biomarkers for diagnosis, which improves our understanding of PTSD pathogenesis. In a second project, entitled Multi-Omic PTSD Subgroup Identification and Clinical Characterization, we applied methods for integrating multiple omics datasets to investigate the complex, multivariate nature of the biological systems underlying PTSD. Using two different machine learning approaches on 82 PTSD-positive samples, we found that two PTSD subgroups was the optimal number, and that the subgroups exhibited different remitting behavior, as inferred from subjects recalled at a later time point.
The results from our association, differential expression, and classification analyses demonstrated the distinct clinical and molecular features characterizing these subgroups. Taken together, our work has advanced our understanding of PTSD biomarkers and subgroups through the use of machine learning approaches. Results from our work should strongly contribute to the precise diagnosis and eventual treatment of PTSD, as well as other diseases. Future work will involve continuing to leverage these results to enable precision medicine for PTSD.

    Neuronal Activity at Synapse Resolution: Reporters and Effectors for Synaptic Neuroscience

    The development of methods for the activity-dependent tagging of neurons enabled a new way to tackle the problem of engram identification at the cellular level, giving rise to groundbreaking findings in the field of memory studies. However, the resolution of activity-dependent tagging remains limited to the whole-cell level. Notably, events taking place at the synapse level play a critical role in the establishment of new memories, and strong experimental evidence shows that learning and synaptic plasticity are tightly linked. Here, we provide a comprehensive review of the currently available techniques that make it possible to identify and track neuronal activity with synaptic spatial resolution. We also present recent technologies that allow selective interference with specific subsets of synapses. Lastly, we discuss how these technologies can be applied to the study of learning and memory.

    Changing the ligand-binding specificity of E. coli periplasmic binding protein RbsB by rational design and screening

    Periplasmic binding proteins (PBPs) form a superfamily of bacterial proteins with a conserved bilobal structure, which are involved in substrate scavenging for bacterial cells. A wide variety of natural ligand-binding domains has evolved. PBPs are composed of two domains connected by a hinge region, which form a binding pocket between the two domains. They can be found in two stable conformations: in the absence of ligand, the PBP adopts an open conformation in which the binding pocket is exposed; in the presence of ligand, the protein changes to the closed conformation, in which the ligand is buried in the middle of the protein. This project focused on the ribose-binding protein of Escherichia coli (RbsB). Ribose binding to RbsB stabilizes the closed state. RbsB-bound ribose is presented to a cytoplasmic transport channel (RbsAC), from where it is imported into the cell, or RbsB interacts with membrane receptors (e.g., Trg) and can elicit a chemotactic signal. Due to their unique ligand-binding characteristics and wide variety of natural binding pockets, PBPs have been of interest for the development of biosensors and bioreporter systems. PBP bioreporters were initiated over 20 years ago by the group of Hazelbauer, who fused the C-terminal part of the E. coli EnvZ osmoregulation histidine kinase to the N-terminal part of the Trg methyl-accepting chemotaxis receptor protein, creating a hybrid receptor, Trz1. Ligand-bound galactose-binding protein (GBP) and ribose-bound RbsB interact with Trz1, which eventually leads to phosphorylation of the response regulator OmpR, activating transcription from the ompC promoter (and any reporter gene fused to it). In 2003, Hellinga’s group proposed that, based on crystal structure information of ligand-bound PBPs, variants with new ligand recognition specificities could be designed by computational approaches. Notably, they claimed the design of an RbsB variant with nM affinity for 2,4,6-trinitrotoluene (TNT).
This idea inspired the scientific community, because it could easily extend PBP binding to a tremendous variety of compounds, including non-natural molecules, and would thus permit a wide variety of biosensor and bioreporter systems based on RbsB/GBP and Trz1. Unfortunately, independent engineering of some of the most promising published mutants failed to reproduce the reported in vivo and in vitro results. These studies further concluded that the published variants were actually misfolded proteins and/or impaired in stability as a result of the introduced ligand-pocket mutations. This fact was largely ignored in Hellinga’s publications. Still inspired by the concept and trying to understand the reason for such limited success, our group hypothesized that changing from ribose to TNT in a single step was likely unfeasible, but that, given the wide range of naturally evolved PBP ligand-binding pockets, a step-by-step change of ribose binding to a non-natural analogue should be possible. To test this, we selected compounds that differ distinctly from ribose but are still chemically similar to it: 1,3-cyclohexanediol (13CHD) and cyclohexanol (CH). Mutant ligand-binding pockets that might accommodate 13CHD and/or CH were computationally simulated using Rosetta, from which a list of critical amino acid residues to mutate in RbsB was selected. These were then synthesized and cloned into E. coli, resulting in a set of 2 million mutants containing one of five possible substitutions at each of 9 selected critical amino acid positions. The library was introduced into an E. coli bioreporter strain, which carries the Trz1 hybrid signaling pathway coupled to GFP production when the (new) ligand binds the (mutant) RbsB. The main goals of this work were to screen and characterize mutants from this first library, and potentially to improve mutants for binding of the new ligand in further rounds of mutagenesis.
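The library size quoted above can be checked with a short calculation. This minimal sketch assumes the nine positions vary independently with five options each, which is how the abstract's wording reads; the variable names are illustrative.

```python
# Combinatorial size of a library with 5 possible residues at each of 9 positions,
# assuming the positions are varied independently of one another.
options_per_position = 5
positions = 9

library_size = options_per_position ** positions
print(f"{library_size:,} variants")
```

Under that assumption the count is 5^9 = 1,953,125, consistent with the rounded figure of 2 million mutants.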
In the first part of this work, a precise and user-friendly high-throughput strategy to screen the mutant library was developed. Clones were grown as individual microcolonies in alginate beads to reduce single-cell variability in GFP expression, and the beads were screened by fluorescence-activated cell sorting (FACS) for gain-of-function GFP expression in the presence of 13CHD. Six mutants with modest (1.5-fold) but consistent induction with 1 mM 13CHD were isolated. Moreover, these mutants had completely lost the capacity to react to ribose. The RbsB mutants were characterized in terms of periplasmic space abundance, stability, secondary structure and ligand affinity. Isothermal microcalorimetry confirmed 13CHD binding, although only two mutants were sufficiently stable upon purification. Circular dichroism and quantification of periplasmic space abundance suggested that the mutants are prone to misfolding and/or defects in translocation. In the second part of this work, we used random and semi-random mutagenesis to improve the affinity and/or stability of the six isolated mutants with 13CHD-binding capacity. Several mutant libraries were produced and screened with the previously described strategy. Variants displaying higher GFP expression in the presence of 13CHD were collected by FACS and used as the starting point for the next round of evolution. This mutagenesis and rigorous screening strategy allowed us to isolate 7 mutants with improved (3.2-fold) GFP induction in the presence of 13CHD, in a concentration-dependent manner. Several variants were observed that displayed open and closed conformations simultaneously, suggesting they were impaired in transition dynamics. Moreover, our screening strategy largely ignores potential variants with improved binding and closed-conformation stability that are unable to interact with the Trz1 receptor (i.e., to trigger the signaling cascade).
Finally in the third part of this work, we developed and tested an in vivo system to characterize the quality of the translocation process and receptor interactions. Wild-type- and mutant-RbsB proteins were fused to mCherry reporter protein to study protein abundance and subcellular localization. Whereas RbsB-mCherry proteins clearly localized to the periplasmic space and centered in polar regions depending on chemoreceptor availability, mutant-RbsB-mCherry expression resulted in high proportions of cells devoid of clear foci and low proportions of cells with multiple fluorescent foci, suggesting poorer translocation and mislocalisation. In addition, polar foci of mutants were less fluorescent, suggesting poorer chemoreceptor binding. By spiking further derivative mutant libraries generated by error-prone PCR without or with different proportions of E. coli expressing wild-type RbsB-mCherry we could estimate the potential improvement and deterioration of mutants with wild- type-like periplasmic localisation. The in vivo translocation system may thus be used to detect mutants with better signal transduction capacity. In conclusion, we firmly showed that design of PBP receptor proteins with new binding capacities for non-natural compounds is feasible, but still largely a matter of trial and error. The combination of computational simulations, random mutagenesis and rigorous screening allowed us to isolate variants with new recognition for 13CHD and loss of ribose binding. However, our results also showed that most predicted ligand-binding pocket mutations lead to poorly folding and functioning proteins, and it is likely that the dynamic transition needed between open and closed conformations of (here) RbsB is insufficiently understood and currently predictable to allow rational expansion to a wide range of new ligands. -- Les protéines de liaison périplasmiques (PLP) constituent une superfamille de protéines bactériennes avec une structure bilobée. 
Elles sont impliquées dans la captation de substrats pour les cellules bactériennes, et montrent grande diversité de domaines de liaison à des composés naturels. Les PLP sont composées de deux domaines connectés par une région charnière, ce qui forme une poche de liaison au substrat entre les deux domaines. Les PLP montrent deux états stables : ouverte en l’absence de ligand, conformation dans laquelle la poche de liaison est exposée, et fermée quand le ligand est séquestré dans la poche de liaison. Ce projet a porté sur l’étude de la PLP RbsB liant le ribose chez Escherichia coli. La liaison du ribose stabilise l’état fermé de RbsB et permet l’interaction avec le transporteur cytoplasmique RbsAC et son passage dans le cytoplasme de la cellule, ou son interaction avec des récepteurs membranaires tels que Trg permettant en une réponse chimiotactique. Étant données leurs caractéristiques uniques de liaison aux ligands et la grande variété de poches de liaison naturellement observée chez les PLP, elles présentent un grand intérêt pour le développement de biosenseurs et de systèmes biorapporteurs. Les premiers biorapporteurs basés sur des PLP ont été développés 20 ans auparavant par le groupe de Hazelbauer. Cette équipe a fusionné la partie C-terminale de la protéine kinase à histidine impliquée dans l’osmorégulation (EnvZ) et l’extrémité N-terminale du récepteur chimiotactique accepteur de groupement méthyle (Trg), pour créer le récepteur hybride Trz1. Les PLP liant le galactose (GBP) et le ribose (RbsB) interagissent avec Trz1, ce qui entraine la phosphorylation du régulateur réponse OmpR qui lui-même va activer la transcription à partir du promoteur du gène ompC (ou n’importe quel système rapporteur placé en aval de ce promoteur). 
In 2003, the Hellinga group proposed that, on the basis of the crystal structures of different periplasmic binding proteins (PBPs) bound to their ligands, variants recognizing new ligands could be generated by a computational approach. In particular, that team claimed to have generated an RbsB variant that binds 2,4,6-trinitrotoluene (TNT) with nanomolar affinity. This idea inspired the scientific community because the approach could extend to an enormous diversity of natural and non-natural compounds, which would allow the development of a wide range of biosensors and bioreporters based on this system. Unfortunately, independent teams that constructed the most promising mutants were unable to detect activity in vivo and/or in vitro. This has been ignored in the publications of the Hellinga group. Inspired by this concept and wanting to understand the reasons for this rather limited success, our group hypothesized that changing the specificity of RbsB from ribose to TNT in a single step was probably infeasible but that, given the great diversity of binding pockets naturally observed among PBPs, a stepwise change from ribose to a non-natural analogous compound should be possible. To test this, we selected compounds that are distinct from ribose but still share similarities with it: 1,3-cyclohexanediol (13CHD) and cyclohexanol (CH). Mutants that could accommodate 13CHD and/or CH were generated by computer simulation using the Rosetta program, which provided a list of critical amino acids to mutate. A mutant library was synthesized containing 2 million RbsB variants, with 1 substitution among 5 possible at each of 9 positions selected for their critical role in substrate recognition. The library was introduced into and screened in an E. coli reporter strain containing the hybrid Trz1 signaling chain coupled to production of green fluorescent protein (GFP) when the (new) ligand binds to the RbsB protein (wild-type or mutant). The main goal of this work was to characterize this mutant library and, where possible, to improve the ability of these mutants to bind another compound through additional rounds of mutagenesis. In the first part of this work, a simple and efficient strategy for screening the mutant library was developed. The different clones/variants were grown individually as microcolonies in alginate beads in order to reduce the variability of the GFP signal observed at the single-cell level. The beads were analyzed by fluorescence-activated cell sorting (FACS) to detect mutants with increased GFP activity in the presence of 13CHD. Six mutants were isolated for their modest but significant induction (1.5-fold) in the presence of 1 mM 13CHD. In addition, these mutants had completely lost their ability to respond to ribose. The RbsB mutants were characterized in more detail for their periplasmic localization, stability, abundance and ligand affinity. Isothermal microcalorimetry confirmed that these mutants bind 13CHD, although only 2 of the mutant proteins proved sufficiently stable after purification. Circular dichroism analysis and quantification of protein abundance in the periplasmic space suggest that the mutant proteins are prone to misfolding and/or to a defect in translocation from the cytoplasm to the periplasm. In the second part, we subjected the six previously isolated mutants to random or semi-random mutagenesis in order to improve their affinity for 13CHD and/or their stability. Several mutant libraries were produced and analyzed using the method described above.
Variants showing stronger expression of the GFP reporter system in the presence of 13CHD were isolated by FACS and used as the starting point for the next round of evolution. This mutagenesis and the rigorous analysis of the libraries allowed us to isolate 7 mutants with a 3.2-fold, dose-dependent increase in GFP signal in the presence of 13CHD. Several variants were found to adopt both the open and the closed conformation within the bacterial population. This observation suggests that these mutants are impaired in their ability to switch from one conformation to the other. Moreover, our screening strategy does not account for variants that might show increased binding and good stability of the closed conformation but would be unable to interact with the Trz1 receptor (and therefore unable to trigger the reporter signaling cascade). Finally, in the third part of this work, we developed and tested an in vivo system for characterizing the quality of translocation into the periplasmic space and of the interaction with receptors. Wild-type and mutant RbsB proteins were fused to the red fluorescent protein mCherry in order to visualize the abundance and subcellular localization of the proteins at the single-cell level, using epifluorescence microscopy and image processing. Whereas the wild-type RbsB fusion protein shows a periplasmic localization centered at the cell poles, depending on the availability of chemoreceptors, the fusions with RbsB variants showed a high proportion of cells without foci and a low proportion of cells with multiple foci, suggesting weaker binding to the chemoreceptors.
By analyzing in more detail mutant libraries generated by mutagenic PCR, mixed or not with cells containing the wild-type RbsB fusion protein, we were able to estimate the potential improvement or deterioration of the RbsB mutants relative to the wild type in terms of periplasmic localization. This in vivo translocation system could be used to detect mutants that allow better signal transduction. In conclusion, we have shown that designing periplasmic binding protein (PBP) receptors with new binding capacities for non-natural compounds is feasible, but still relies on a trial-and-error strategy. The combination of computer simulations, random mutagenesis and rigorous screening allowed us to isolate RbsB variants able to recognize 13CHD while no longer binding ribose. Nevertheless, our results also showed that most of the mutations predicted in the binding pocket led to misfolding or malfunction of the proteins. It is very likely that the dynamics of the transition between the open and closed conformations (of RbsB in this study) are not yet understood well enough, and therefore cannot currently be predicted, to allow a wide variety of new ligands to be tested.
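The library size quoted in the abstract (about 2 million RbsB variants) can be checked with a short calculation. The sketch below is only a sanity check under one assumption that the abstract does not state explicitly: that each of the 9 targeted positions independently takes one of 5 possible residues. The function name `library_size` is hypothetical and chosen for illustration.

```python
# Hypothetical sanity check, not part of the original study.
# Assumption: each of the 9 selected positions independently takes
# one of 5 possible residues, so the counts multiply.

N_POSITIONS = 9           # positions selected in the binding pocket
OPTIONS_PER_POSITION = 5  # possible residues per position (assumed independent)


def library_size(n_positions: int, options: int) -> int:
    """Number of distinct variants when every position varies independently."""
    return options ** n_positions


size = library_size(N_POSITIONS, OPTIONS_PER_POSITION)
print(size)  # 1953125
```

Under that assumption the theoretical library contains 5^9 = 1,953,125 variants, which matches the "2 million" figure after rounding.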