
    Computational Design and Experimental Validation of Functional Ribonucleic Acid Nanostructures

    In living cells, two major classes of ribonucleic acid (RNA) molecules can be found. The first class, messenger RNA (mRNA), carries the genetic information that the ribosome reads and translates into proteins. The second class, non-coding RNA (ncRNA), does not code for proteins and is involved in key cellular processes such as gene expression regulation, splicing, differentiation, and development. ncRNAs fold into an ensemble of thermodynamically stable secondary structures, which eventually lead the molecule to fold into a specific 3D structure. It is widely known that ncRNAs carry out their functions via their 3D structures as well as their molecular composition. The secondary structure of ncRNAs is composed of different types of structural elements (motifs) such as stacking base pairs, internal loops, hairpin loops and pseudoknots. Pseudoknots are particularly difficult to model, are abundant in nature, and are known to stabilize the functional form of the molecule. Due to the diverse range of functions of ncRNAs, their computational design and analysis have numerous applications in nanotechnology, therapeutics, synthetic biology, and materials engineering. The RNA design problem is to find novel RNA sequences that are predicted to fold into one or more target structures while satisfying specific qualitative characteristics and constraints. RNA design can be modeled as a combinatorial optimization problem (COP) and is known to be computationally challenging or, more precisely, NP-hard. Numerous algorithms for the RNA design problem have been developed over the past two decades; however, most of them ignore pseudoknots and are therefore applicable to only a slice of real-world modeling and design problems. Moreover, the few existing pseudoknot design methods, which were developed only recently, provide no evidence that their design methodology is applicable in biological contexts.
The two objectives of this thesis address these two shortcomings. First, we are interested in developing an efficient computational method for the design of RNA secondary structures, including pseudoknots, that shows significantly better in-silico quality characteristics than the state of the art. Second, we are interested in showing the real-world worthiness of the proposed method by validating it experimentally. More precisely, our aim is to design instances of certain types of RNA enzymes (i.e. ribozymes) and demonstrate that they are functionally active. This would likely only happen if their predicted folding matched their actual folding in the in-vitro experiments. In this thesis, we present four contributions. First, we propose a novel adaptive defect weighted sampling algorithm to efficiently solve the RNA secondary structure design problem where pseudoknots are included. We compare the performance of our design algorithm with the state of the art and show that our method generates molecules that are thermodynamically more stable and less defective than those generated by state-of-the-art methods. Moreover, we show that when the effect of fitness evaluation is decoupled from the search and optimization process, our optimization method converges faster than the non-dominated sorting genetic algorithm (NSGA-II) and the ant colony optimization (ACO) algorithm. Second, we use our algorithmic development to implement an RNA design pipeline called Enzymer and make it available as an open source package useful for wet lab practitioners and RNA bioinformaticians. Enzymer uses multiple sequence alignment (MSA) data to generate initial design templates for further optimization. Our design pipeline can then be used to re-engineer naturally occurring functional RNAs such as ribozymes and riboswitches. Our first and second contributions are published in the RNA section of the journal Frontiers in Genetics.
Third, we use Enzymer to re-engineer three different pseudoknotted ribozymes: a hammerhead ribozyme from the mouse gut metagenome, a hammerhead ribozyme from Yarrowia lipolytica and a glmS ribozyme from Thermoanaerobacter tengcongensis. We designed a total of 18 ribozyme sequences and showed that 16 of them were active in vitro. Our experimental results have been submitted to the RNA journal and strongly suggest that Enzymer is a reliable tool for designing pseudoknotted ncRNAs with a desired secondary structure. Finally, we propose a novel architecture for a new ribozyme-based gene regulatory network in which a hammerhead ribozyme modulates expression of a reporter gene when an external stimulus, IPTG, is present. Our in-vivo experiments gave the expected results in 7 out of 12 cases.

    Contributions of Graph Theory and Algorithms to Animal Behaviour and Neuroscience

    Graph theory and algorithms offer valuable toolboxes for modelling and analysing numerous phenomena in the natural sciences. Here a review of the contemporary literature is presented, divided into four main chapters, giving some indications of how the concepts of these two disciplines can be used to study animal behaviour and neuroscience. As an exception, the first part of the first chapter provides a short discussion of the applications of graph theory in molecular biology. This choice was made in order to make this work more complete and to give readers from various backgrounds as broad a view as possible of the potential usefulness of such interdisciplinary approaches. The remaining two sections of the first chapter deal with brain networks and with central concepts of graph theory, such as centrality, in their study. The second chapter introduces some concepts of animal sociality and refers to studies of cooperation in the animal kingdom, focusing on evolutionary graph theory and game theory. Moreover, the last section of this chapter discusses the collective motion of animal groups, providing, among other things, an introduction to basic terms for the subsequent third chapter. Interdisciplinary research, aiming to unite methods from different fields, is widely used to answer biological questions. However, as presented below, research in algorithms and research in biology can each contribute to the development of the other.
Hence, the third chapter provides information about algorithms whose design has been inspired by the (collective) behaviour of animals in nature. Finally, the fourth chapter again departs from the central focus of the previous chapters and gives a short introduction to the important, but also controversial, computational nature of cognition and, by extension, of behaviour. Overall, one can observe that the cooperation between the above-mentioned fields is extensive, while the research accomplished so far opens new questions that can be studied only in the light of such interdisciplinary collaborations.

    Key role and diversity of EcR/USP and other nuclear receptors in selected Arthropoda species

    The nuclear receptors (NRs), an important protein superfamily of transcription factors found in all animals, regulate the expression of genes in a large array of biological processes. Their involvement in moulting and metamorphosis, embryonic development, cell differentiation and reproduction of arthropods is well documented. In particular, EcR and USP, the two NRs that form the functional ecdysteroid receptor and sit at the base of the ecdysteroid signalling cascade regulating moulting and metamorphosis, have been researched intensively. During evolution, gene duplication and gene loss events have created a broad diversity of these NRs between different groups in the animal kingdom. However, in 2008, at the time this PhD research started, the information available on arthropod NRs was mainly restricted to holometabolic insects. Complete sets of NRs for other groups, including Crustacea, Chelicerata and more basal insects, were unavailable. Over the last few years, the number of genome sequencing projects carried out for Arthropoda has increased rapidly. This gave us the opportunity to investigate the NRs in a number of other arthropods and to compare the sets of NRs between some of these groups. We chose three species, the hemimetabolic pea aphid Acyrthosiphon pisum, the holometabolic buff-tailed bumblebee Bombus terrestris and the chelicerate two-spotted spider mite Tetranychus urticae, as representatives of their respective clades. The main research questions addressed in the PhD thesis were: (1) Do holometabolic, hemimetabolic and non-insect arthropods exhibit important differences in their sets of nuclear receptors? (2) What are the consequences for the ecdysteroid signalling cascade, and can any differences be found there? (3) Can RNAi be used to add functional information to the fundamental data that these genome annotation analyses have delivered?

    Understanding the relationship between structure and function of Vitellogenin in the honey bee (Forståelse av forholdet mellom struktur og funksjon til Vitellogenin i honningbia)

    This thesis focuses on the structure and molecular function of Vitellogenin (Vg) from honey bees (Apis mellifera). Vg is an ancient protein found in many animals. Most biological processes depend on proteins' activities, and the structural shape of a protein determines what it can do and how it works. It is important to understand the shape and associated functional properties of honey bee Vg, as honey bees are important pollinators in our natural environment and agricultural food system. A yolk protein that transports nutrients like lipids and zinc, Vg is necessary for honey bee reproduction, and the protein also regulates social behavior and has immune-related functions. Paper I presents a full-length protein structure for honey bee Vg, generated using computational structure prediction. For the first time, we describe the complete structural fold of the protein, revealing previously unknown structural features. In Paper II, I use structural and sequence data analysis to identify seven potential zinc-binding sites in different protein regions. Element analysis of purified Vg shows that, on average, three zinc sites are occupied per molecule, a ratio not reported before. Paper III explores the Vg structure from the perspective of allelic variation in the honey bee vg gene. We used amplicon Nanopore sequencing with barcoded primers to identify 121 Vg variants. With these data, I found that the domains and subdomains of Vg are characterized by different levels of variation. While some of these patterns were expected, my results also provide new insights into possible structure-function relationships. I use findings from Papers I, II, and III in Paper IV to develop a novel explanatory model for how Vg holds its lipid load.
In sum, this thesis presents a detailed structural study that contributes toward understanding the multifunctional role of honey bee Vg.
Norges forskningsråd ; BioCa

    Development of a novel platform for high-throughput gene design and artificial gene synthesis to produce large libraries of recombinant venom peptides for drug discovery

    Doctoral thesis in Veterinary Sciences, specialty of Biological and Biomedical Sciences. Animal venoms are complex mixtures of biologically active molecules that, while presenting low immunogenicity, target a variety of membrane receptors with high selectivity and efficacy. It is believed that animal venoms comprise a natural library of more than 40 million different natural compounds that have been continuously fine-tuned during the evolutionary process to disturb cellular function. Within animal venoms, reticulated peptides are the most attractive class of molecules for drug discovery. However, the use of animal venoms to develop novel pharmacological compounds is still hampered by difficulties in obtaining these low molecular mass cysteine-rich polypeptides in sufficient amounts. Here, a high-throughput gene synthesis platform was developed to produce synthetic genes encoding venom peptides. The final goal of this project is the production of large libraries of recombinant venom peptides that can be screened for drug discovery. A robust and efficient Polymerase Chain Reaction (PCR) methodology was refined to assemble overlapping oligonucleotides into small artificial genes (< 500 bp) with high fidelity. In addition, two bioinformatics tools were constructed to design multiple optimized genes (ATGenium) and overlapping oligonucleotides (NZYOligo designer), in order to allow automation of the high-throughput gene synthesis platform. The platform can assemble 96 synthetic genes encoding venom peptides simultaneously, with an error rate of 1.1 mutations per kb. To decrease the error rate associated with artificial gene synthesis, an error removal step using phage T7 endonuclease I was designed and integrated into the gene synthesis methodology.
T7 endonuclease I was shown to be highly effective at specifically recognizing and cleaving DNA mismatches, allowing a dramatic reduction of the error frequency in large synthetic genes, from 3.45 to 0.43 errors per kb. Combining the knowledge acquired in the initial stages of the work, a comprehensive study was performed to investigate the influence of gene design, presence of fusion tags, cellular localization of expression, and usage of Tobacco Etch Virus (TEV) protease for tag removal on the recombinant expression of disulfide-rich venom peptides in Escherichia coli. Codon usage dramatically affected the levels of recombinant expression in E. coli. In addition, a significant pressure on the usage of the two cysteine codons suggests that both need to be present at equivalent levels in genes designed de novo to ensure high levels of expression. This study also revealed that DsbC was the best fusion tag for recombinant expression of disulfide-rich peptides, in particular when expression of the fusion peptide was directed to the bacterial periplasm. TEV protease was highly effective for tag removal, and its recognition site can tolerate any residue at its C-terminus, with the exception of proline, confirming that no extra residues need to be incorporated at the N-terminus of recombinant venom peptides. This study revealed that E. coli is a convenient heterologous host for the expression of soluble and potentially functional venom peptides. Finally, this novel high-throughput gene synthesis platform was used to produce ~5,000 synthetic genes with a low error rate. This genetic library supported the production of the largest library of recombinant venom peptides constructed to date.
The library contains 2736 animal venom peptides and is presently being screened for the discovery of novel drug leads related to different diseases.

    Bioinformatics Applications Based On Machine Learning

    The great advances in information technology (IT) have implications for many sectors, such as bioinformatics, and have considerably increased their possibilities. This book presents a collection of 11 original research papers, all of them related to the application of IT-related techniques within the bioinformatics sector: from new applications created by adapting and applying existing techniques to the creation of new methodologies to solve existing problems.