370 research outputs found

    Evolution of mammalian genome architecture through retrotransposition

    Retrotransposons, mobile DNA elements that replicate via a copy and paste mechanism, are a major component of mammalian genome architecture. They account for at least one-third of the human genome and are major drivers of lineage-specific gain and loss of DNA. While there are many examples of how specific retrotransposons have impacted evolution, their interaction with large-scale genome architecture remains poorly characterised. Throughout my thesis I investigated two fundamental questions regarding genome evolution and retrotransposons. Firstly, how does genome architecture shape retrotransposon accumulation? Secondly, how does retrotransposon accumulation in turn impact on genome architecture? The current model of retrotransposon accumulation largely relies on local sequence composition. However, this model fails to account for genome-wide chromatin structure, an important factor that regulates DNA accessibility to insertion machinery. By analysing retrotransposon accumulation at open chromatin sites I showed that genome structure strongly associates with retrotransposon accumulation patterns. In addition, by mapping retrotransposon accumulation patterns of non-human mammals back to human, I was able to observe large-scale positional conservation of lineage-specific retrotransposons. These findings suggest that through conservation of synteny, gene regulation and nuclear organisation, retrotransposon accumulation in mammalian genomes follows similar evolutionary trajectories. Beneath the conserved structural framework of mammalian genomes there exists a high degree of lineage-specific turnover of DNA. Outside of whole genome duplication, retrotransposons are the largest contributing factor to genome growth. In contrast to this, accumulation of retrotransposons can also increase the probability of unequal crossing over causing DNA loss through large deletion events. 
Using multiple pairwise alignments I calculated regional levels of lineage-specific DNA gain and loss in the human and mouse genomes. I found that while lineage-specific DNA loss overlapped with open chromatin regions in both genomes, different sources of lineage-specific DNA gain drove divergence in genome architecture. These findings reveal the turbulent nature of lineage-specific evolution of large-scale genome architecture, ultimately questioning the evolutionary stability of structural chromosomal domains. In addition to analysing large-scale genome architecture I performed two separate analyses on retrotransposons in the bovine genome. Due to the presence of BovB retrotransposons, the bovine retrotransposon landscape is clearly distinct from that of other placental mammals. For the first analysis, I identified bovine-specific retrotransposon-associated gene coexpression networks. Following the genomic distribution of bovine retrotransposons, my results show that gene expression strongly associates with genome architecture. For the second analysis, I characterised retrotransposons surrounding tandem duplicate copies of the bovine NK-lysin gene. My results were consistent with retrotransposon accumulation causing genomic rearrangements via non-allelic homologous recombination. Altogether, my thesis reveals hidden interactions between retrotransposon accumulation and mammalian genome structure and function. By re-purposing publicly available datasets I have characterised various aspects of the complex co-evolutionary relationships between retrotransposons and the genomes in which they reside. Thesis (Ph.D.) -- University of Adelaide, School of Biological Sciences, 201

    A highly condensed genome without heterochromatin : orchestration of gene expression and epigenomics in Paramecium tetraurelia

    Epigenetic regulation in unicellular ciliates can be as complex as in metazoans and is well described with regard to small RNA (sRNA) mediated effects. The ciliate Paramecium harbors several copies of sRNA-biogenesis related proteins involved in genome rearrangements that result in chromatin alterations. However, the global chromatin organization is poorly understood, and unusual characteristics of the somatic nucleus, such as high polyploidy, high genome coding density, and absence of heterochromatin, should call for complex regulation to orchestrate gene expression. The present study characterized the nucleosomal organization required for gene regulation and proper Polymerase II activity. Histone marks reveal broad domains in gene bodies, whereas intergenic regions are nucleosome free. Low occupancy in silent genes suggests that gene inactivation does not involve nucleosome recruitment. Thus, Paramecium gene regulation runs counter to the current understanding of chromatin biology. Apart from global nucleosome studies, two sRNA binding proteins (Ptiwis) classically associated with transposon silencing were investigated in the background of transgene-induced silencing. Surprisingly, both Ptiwis also load sRNAs from endogenous loci in vegetative growth, revealing a broad diversity of Ptiwi functions. Together, the studies shed light on epigenetic mechanisms that regulate gene expression in a condensed genome, with Ptiwis contributing to transcriptome and chromatin dynamics.

    Grand Celebration: 10th Anniversary of the Human Genome Project

    In 1990, scientists began working together on one of the largest biological research projects ever proposed. The project aimed to sequence the three billion nucleotides in the human genome. The Human Genome Project took 13 years and was completed in April 2003, at a cost of approximately three billion dollars. It was a major scientific achievement that forever changed the understanding of our own nature. The sequencing of the human genome was in many ways a triumph for technology as much as it was for science. From the Human Genome Project, powerful technologies have been developed (e.g., microarrays and next generation sequencing) and new branches of science have emerged (e.g., functional genomics and pharmacogenomics), paving new ways for advancing genomic research and medical applications of genomics in the 21st century. The investigations have provided new tests and drug targets, as well as insights into the basis of human development and the diagnosis and treatment of cancer and several mysterious human diseases. This genomic revolution is prompting a new era in medicine, which brings both challenges and opportunities. Parallel to the promising advances over the last decade, the study of the human genome has also revealed how complicated human biology is, and how much remains to be understood. The legacy of the understanding of our genome has just begun. To celebrate the 10th anniversary of the essential completion of the Human Genome Project, in April 2013 Genes launched this Special Issue, which highlights the recent scientific breakthroughs in human genomics, with a collection of papers written by authors who are leading experts in the field

    In silico investigation of Glossina morsitans promoters

    Philosophiae Doctor - PhD. Tsetse flies (Glossina spp.) are the biological vectors for Trypanosomes, the causative agents of Human African Trypanosomiasis (HAT). HAT is a debilitating disease that continues to present a major public health problem and is a key factor limiting rural development in vast regions of tropical Africa. To augment vector control efforts, the International Glossina Genome Initiative (IGGI) was established in 2004 with the ultimate goal of generating a fully annotated whole genome sequence for Glossina morsitans. A working draft genome of Glossina morsitans was made available in 2011. In this thesis, transcriptional regulatory features in Glossina morsitans were analysed using the draft genome. A method for TSS identification in the newly sequenced Glossina morsitans genome was developed using TSS-seq tags sampled from two developmental stages of Glossina morsitans. High throughput next generation sequencing reads obtained from Glossina morsitans larvae and pupae were used to locate transcription start sites (TSS) in the Glossina morsitans genome. TSS-seq tag clusters, defined as a minimum number of reads at the 5’ predicted UTR or first coding exon, were used to define transcription start sites. A total of 3134 tag clusters were identified on the Glossina genome. Approximately 45.4% (1424) of the tag clusters mapped to the first coding exons or their proximal predicted 5’UTR regions, including 31 tag clusters that mapped to transposons. A total of 1101 (35.1%) tag clusters mapped outside the genic region and/or to scaffolds without gene predictions and may correspond to previously un-annotated transcripts or noncoding RNA TSS. The core promoter regions were classified as narrow or broad based on the number of TSS positions within a TSS-seq cluster. The majority (95%) of the core promoters analysed in this study were of the broad type while only 5% were of the narrow type. 
Comparison of canonical core promoter motif occurrences between random and bona fide core promoters showed that, generally, the number of motifs in biologically functional genomic windows in the true dataset exceeded that in the random dataset (p <= 0.00164, 0.00135, 0.00185 for the narrow, broad with peak and broad without peak categories respectively). The frequency of motif co-occurrence in core promoters was found to be fundamentally different across the various initiation patterns. Narrow core promoters recorded a higher frequency of the TATA-box and INR motifs, and two-way motif co-occurrence showed that the TATA-box-INR pair is over-represented in the narrow category. Broad core promoters showed a higher frequency of the BREd and MTE motifs, and two-way motif co-occurrence showed that the MTE-DPE pair is over-represented in broad core promoters. TATA-less promoters account for 77% of the core promoters in this analysis. TATA-less core promoters showed a higher frequency of the MTE and INR motifs, in contrast to observations in Drosophila, where the DPE motif has been reported to occur frequently in TATA-less promoters. These motif combinations suggest their equal importance to transcription in their corresponding promoter classes in Glossina morsitans
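
The tag-cluster proportions and the narrow/broad split reported in this abstract can be illustrated with a minimal Python sketch. The counts (3134, 1424, 1101) come from the abstract; the function names and the cutoff `max_narrow_positions` are hypothetical, since the abstract does not state the threshold the thesis used to separate narrow from broad promoters.

```python
TOTAL_CLUSTERS = 3134  # total tag clusters reported in the abstract


def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


def classify_cluster(tss_positions, max_narrow_positions=1):
    """Label a cluster 'narrow' or 'broad' by its number of distinct TSS positions.

    max_narrow_positions is an assumed cutoff for illustration only;
    the abstract does not give the value used in the thesis.
    """
    distinct = len(set(tss_positions))
    return "narrow" if distinct <= max_narrow_positions else "broad"


# Clusters at first coding exons or proximal predicted 5' UTRs
print(percent(1424, TOTAL_CLUSTERS))  # 45.4
# Clusters outside genic regions or on scaffolds without gene predictions
print(percent(1101, TOTAL_CLUSTERS))  # 35.1
```

With the assumed cutoff of one position, a cluster whose reads all start at the same coordinate is labelled narrow, and any cluster spanning several start coordinates is labelled broad.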

    Computational epigenetics : bioinformatic methods for epigenome prediction, DNA methylation mapping and cancer epigenetics

    Epigenetic research aims to understand heritable gene regulation that is not directly encoded in the DNA sequence. Epigenetic mechanisms such as DNA methylation and histone modifications modulate the packaging of the DNA in the nucleus and thereby influence gene expression. Patterns of epigenetic information are faithfully propagated over multiple cell divisions, which makes epigenetic gene regulation a key mechanism for cellular differentiation and cell fate decisions. In addition, incomplete erasure of epigenetic information can lead to complex patterns of non-Mendelian inheritance. Stochastic and environment-induced epigenetic defects are known to play a major role in cancer and ageing, and they may also contribute to mental disorders and autoimmune diseases. Recent technical advances — such as the development of the ChIP-on-chip and ChIP-seq protocols for genome-wide mapping of epigenetic information — have started to convert epigenetic research into a high-throughput endeavor, to which bioinformatics is expected to make significant contributions. This thesis describes computational work at the intersection of epigenetics and genome research, aiming to address the bioinformatic challenges posed by the human epigenome. While its methods are carried over and adapted from bioinformatics and related fields (including data mining, machine learning, statistics, algorithms, optimization, software engineering and databases), its overarching goal is to contribute to epigenetic research, both directly through analyzing and modeling of epigenetic information, and indirectly through the development of practically useful methods and software toolkits. This thesis is broadly structured into four parts. The first part gives a brief introduction into epigenetic regulation and inheritance, and reviews the emerging field of computational epigenetics. The second part addresses the question of genome-epigenome interactions using machine learning methods. 
It is shown that accurate predictions of DNA methylation and other epigenetic modifications can be derived from the genomic DNA sequence. Based on this finding, the EpiGRAPH web service for epigenome analysis and prediction is described, and methods for refined annotation of CpG islands in the human genome are proposed. The third part is dedicated to large-scale analysis of DNA methylation, which is the best-known epigenetic phenomenon. The BiQ Analyzer software toolkit is presented, together with a bioinformatic analysis of the "National Methylome Project for Chromosome 21" dataset, for which BiQ Analyzer had played an enabling role. This part concludes with statistical modeling of DNA methylation variation and an analysis of its implications for DNA methylation mapping in a large number of human individuals. The fourth part describes two pilot projects applying the bioinformatic concepts of this thesis to cancer epigenetics. First, genome-scale datasets are probed for evidence of a link between DNA methylation and Polycomb binding, which is believed to play a role in epigenetic deregulation of cancer cells. Second, a biomarker that tests for cancer-specific DNA methylation is optimized and validated for use in clinical settings. Arguably the most interesting result of this thesis is the unexpectedly high correlation between genome and epigenome that was found by several methods and based on multiple epigenome datasets. This finding suggests that the role of the genome for epigenetic regulation has been underappreciated, and it underlines the importance of integrated analysis of genome and epigenome. With the EpiGRAPH web service for (epi-) genome analysis and prediction, a research tool is provided to facilitate further investigation of this striking interaction.

    "Omics" in traumatic brain injury: novel approaches to a complex disease

    Background: To date, there is neither any pharmacological treatment with efficacy in traumatic brain injury (TBI) nor any method to halt the disease progress. This is due to an incomplete understanding of the vast complexity of the biological cascades and a failure to appreciate the diversity of secondary injury mechanisms in TBI. In recent years, techniques for high-throughput characterization and quantification of biological molecules, including genomics, proteomics, and metabolomics, have evolved and are referred to as omics. Methods: In this narrative review, we highlight how omics technology can be applied to potentiate diagnostics and prognostication as well as to advance our understanding of injury mechanisms in TBI. Results: The omics platforms provide possibilities to study function, dynamics, and alterations of molecular pathways in normal and TBI disease states. Through advanced bioinformatics, large datasets of molecular information from small biological samples can be analyzed in detail and provide valuable knowledge of pathophysiological mechanisms, which can be included in prognostic modeling when connected to clinically relevant data. In such a complex disease as TBI, omics enables broad categories of studies, from gene compositions associated with susceptibility to secondary injury or poor outcome to potential alterations in metabolites following TBI. Conclusion: The field of omics in TBI research is rapidly evolving. The recent data and novel methods reviewed herein may form the basis for improved precision medicine approaches, development of pharmacological approaches, and individualization of therapeutic efforts by implementing mathematical "big data" predictive modeling in the near future.