78 research outputs found

    Constraints on genes shape long-term conservation of macro-synteny in metazoan genomes

    Background: Many metazoan genomes conserve chromosome-scale gene linkage relationships (“macro-synteny”) from the common ancestor of multicellular animal life [1-4], but the biological explanation for this conservation is still unknown. Double cut and join (DCJ) is a simple, well-studied model of neutral genome evolution amenable to both simulation and mathematical analysis [5], but as we show here, it is not sufficient to explain long-term macro-synteny conservation. Results: We examine a family of simple (one-parameter) extensions of DCJ to identify models and choices of parameters consistent with the levels of macro- and micro-synteny conservation observed among animal genomes. Our software implements a flexible strategy for incorporating various types of genomic context into the DCJ model (“DCJ-[C]”), and is available as open source software from http://github.com/putnamlab/dcj-c. Conclusions: A simple model of genome evolution, in which DCJ moves are allowed only if they maintain chromosomal linkage among a set of constrained genes, can simultaneously account for the level of macro-synteny conservation and for correlated conservation among multiple pairs of species. Simulations under this model indicate that a constraint on approximately 7% of metazoan genes is sufficient to constrain genome rearrangement to an average rate of 25 inversions and 1.7 translocations per million years.
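
The constrained-move idea in the conclusions above can be illustrated with a minimal sketch. This is not the authors' DCJ-[C] software: it models only reciprocal translocations (one of several DCJ move types), represents a genome as a list of unsigned gene lists, and assumes that "maintaining linkage" means constrained genes that share a chromosome before a move must still share one afterwards. All function names here are hypothetical.

```python
import random

def linkage(genome, constrained):
    """Map each constrained gene to the index of the chromosome carrying it."""
    return {g: i for i, chrom in enumerate(genome) for g in chrom if g in constrained}

def keeps_linkage(before, after, constrained):
    """True if constrained genes that shared a chromosome before still do after."""
    old, new = linkage(before, constrained), linkage(after, constrained)
    groups = {}
    for gene, chrom_index in old.items():
        groups.setdefault(chrom_index, []).append(gene)
    return all(len({new[g] for g in genes}) == 1 for genes in groups.values())

def translocate(genome, a, b, cut_a, cut_b):
    """Reciprocal translocation: swap the tails of chromosomes a and b."""
    out = [list(chrom) for chrom in genome]
    out[a] = genome[a][:cut_a] + genome[b][cut_b:]
    out[b] = genome[b][:cut_b] + genome[a][cut_a:]
    return out

def constrained_step(genome, constrained, rng):
    """Propose a random translocation; accept it only if linkage is maintained."""
    a, b = rng.sample(range(len(genome)), 2)
    cut_a = rng.randint(0, len(genome[a]))
    cut_b = rng.randint(0, len(genome[b]))
    proposal = translocate(genome, a, b, cut_a, cut_b)
    return proposal if keeps_linkage(genome, proposal, constrained) else genome

if __name__ == "__main__":
    # Toy genome: two chromosomes; genes 1 and 4 are constrained to stay linked.
    genome = [[1, 2, 3, 4], [5, 6, 7, 8]]
    rng = random.Random(0)
    for _ in range(100):
        genome = constrained_step(genome, {1, 4, 5}, rng)
    print(genome)
```

Rejecting a proposal rather than repairing it keeps the sketch close to the abstract's wording ("moves are allowed only if"); inversions are omitted because, in this representation, they never move a gene between chromosomes.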

    Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

    Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of "living fossils." As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes, and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Here we use a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers and 5,775 candidate conserved protein coding genes. Comparison to other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications (WGDs) ~300 MYA, followed by extensive chromosome fusion.

    Ancient gene linkages support ctenophores as sister to other animals

    A central question in evolutionary biology is whether sponges or ctenophores (comb jellies) are the sister group to all other animals. These alternative phylogenetic hypotheses imply different scenarios for the evolution of complex neural systems and other animal-specific traits [1-6]. Conventional phylogenetic approaches based on morphological characters and increasingly extensive gene sequence collections have not been able to definitively answer this question [7-11]. Here we develop chromosome-scale gene linkage, also known as synteny, as a phylogenetic character for resolving this question [12]. We report new chromosome-scale genomes for a ctenophore and two marine sponges, and for three unicellular relatives of animals (a choanoflagellate, a filasterean amoeba and an ichthyosporean) that serve as outgroups for phylogenetic analysis. We find ancient syntenies that are conserved between animals and their close unicellular relatives. Ctenophores and unicellular eukaryotes share ancestral metazoan patterns, whereas sponges, bilaterians, and cnidarians share derived chromosomal rearrangements. Conserved syntenic characters unite sponges with bilaterians, cnidarians, and placozoans in a monophyletic clade to the exclusion of ctenophores, placing ctenophores as the sister group to all other animals. The patterns of synteny shared by sponges, bilaterians, and cnidarians are the result of rare and irreversible chromosome fusion-and-mixing events that provide robust and unambiguous phylogenetic support for the ctenophore-sister hypothesis. These findings provide a new framework for resolving deep, recalcitrant phylogenetic problems and have implications for our understanding of animal evolution.

    The little skate genome and the evolutionary emergence of wing-like fins

    Skates are cartilaginous fish whose body plan features enlarged wing-like pectoral fins, enabling them to thrive in benthic environments [1,2]. However, the molecular underpinnings of this unique trait remain unclear. Here we investigate the origin of this phenotypic innovation by developing the little skate Leucoraja erinacea as a genomically enabled model. Analysis of a high-quality chromosome-scale genome sequence for the little skate shows that it preserves many ancestral jawed vertebrate features compared with other sequenced genomes, including numerous ancient microchromosomes. Combining genome comparisons with extensive regulatory datasets in developing fins (including gene expression, chromatin occupancy and three-dimensional conformation), we find skate-specific genomic rearrangements that alter the three-dimensional regulatory landscape of genes that are involved in the planar cell polarity pathway. Functional inhibition of planar cell polarity signalling resulted in a reduction in anterior fin size, confirming that this pathway is a major contributor to batoid fin morphology. We also identified a fin-specific enhancer that interacts with several hoxa genes, consistent with the redeployment of hox gene expression in anterior pectoral fins, and confirmed its potential to activate transcription in the anterior fin using zebrafish reporter assays. Our findings underscore the central role of genome reorganization and regulatory variation in the evolution of phenotypes, shedding light on the molecular origin of an enigmatic trait.

    Cross-genome Comparison of Global Oikopleura dioica Populations

    Larvaceans represent the second most abundant zooplankton in all the world’s oceans, with key roles in marine food chains and global carbon flux. Oikopleura dioica is a free-swimming planktonic tunicate from the group and possesses the smallest animal genome with extremely dynamic organization: multiple genomic features such as transposon diversity, intron repertoire, gene content and order are altered in Oikopleura compared with other metazoans. Intriguingly, such genome reorganization has not affected the preservation of their ancestral morphology, since O. dioica maintains a chordate-like body plan throughout its life. O. dioica can be easily distinguished from other larvaceans mainly based on separate sexes and the presence of two subchordal cells on its tail. My research is focused on the cross-genome comparison of three O. dioica populations sampled from the Northern hemisphere: one from the North Atlantic (Barcelona/Bergen) and two from the Pacific (Osaka/Aomori and Okinawa/Kume). For each population, I generated high-quality genome assemblies using a combination of short- and long-read sequencing technologies, as well as chromatin conformation data, confirming preservation of three chromosome pairs. A pairwise comparison of populations revealed a striking degree of genome reshuffling that involves a vast number of synteny breaks and rearrangements. My research also shows that rearrangements mostly happen within individual chromosomes and generally preserve protein-coding features, such as genes and their constituent exons, although the gene order has been effectively randomized. O. dioica populations exhibit differences in repeats and gene content that affect even evolutionarily conserved clusters, such as Hox genes. Consistent with an increased evolutionary rate, the accumulation of rearrangements in O. dioica appears to have happened much faster than in other animals and resulted in the divergence of multiple lineages of dioecious Oikopleura.
The fact that their morphology stayed virtually identical makes O. dioica a perfect model to study genotype-phenotype correlation and the possible existence of unknown regulatory mechanisms. Overall, my thesis contributes new insights into the evolution of chordate genomes and, thus, may be interesting beyond the field of Oikopleura research. Okinawa Institute of Science and Technology Graduate University

    Genome-wide gene expression surveys and a transcriptome map in chicken

    The chicken (Gallus gallus) is an important model organism in genetics, developmental biology, immunology, evolutionary research, and agricultural science. The completeness of the draft chicken genome sequence provided new possibilities to study genomic changes during evolution by comparing the chicken genome to that of other species. The development of long oligonucleotide microarrays based on the genome sequence made it possible to survey genome-wide gene expression in chicken. This thesis describes two gene expression surveys across a range of healthy chicken tissues in both adult and embryonic stages. Specifically, we focus on the mechanisms of regulation of gene transcription and their evolution in the vertebrate genome. Chapter 1 provides a brief history of the chicken as a model organism in biological and genomics research. In particular, a brief overview of expression profiling experiments is presented, followed by an introduction to gene transcription regulation in general. Finally, the aim and outline of this thesis are presented. An important aim of this thesis is to generate surveys of genome-wide gene expression data in chicken using microarrays. In chapter 2, we introduce microarray data normalization including background correction, within-array normalization and between-array normalization. Based on these results an analysis approach is recommended for the analysis of two-color microarray data as performed in the experiments described in this thesis. We also briefly explain the relevant methodology for the identification of differentially expressed genes and how to translate resulting gene lists into biological knowledge. Finally, specific issues related to updating microarray probe annotation in farm animals are discussed. For the analysis of the microarray data in this thesis, re-annotation of the probes on the chicken 20K oligoarray was done using the oligoRAP analysis pipeline.
The vast amount of data generated from a single transcriptomics study makes it impossible to extract meaningful biological knowledge by manually going through individual genes from a list with hundreds or thousands of differentially expressed genes. In chapter 3, we present a practical approach using a collection of R/Bioconductor packages to extract biological knowledge from a microarray experiment in farm animals. Furthermore, a locally adaptive statistical procedure (LAP) analysis approach is used to identify differentially expressed chromosomal regions in a microarray experiment. Chapter 4 presents a genome-wide gene expression survey across eight different tissues (brain, bursa of Fabricius, kidney, liver, lung, small intestine, spleen, and thymus from 10-week old chickens) in adult birds using a chicken 20K microarray. To a certain extent, most genes show some tissue-specific pattern of expression. Housekeeping and tissue-specific genes are identified based on gene expression patterns across the eight different tissues. The results show that housekeeping genes are more compact, i.e. they are smaller, with shorter coding sequences, shorter introns, and shorter intergenic regions. This observed compactness of housekeeping genes may be a result of selection on economy of transcription during evolution. Furthermore, a comparative analysis of gene expression among mouse, chicken, and frog showed that the expression patterns of orthologous genes are conserved during evolution between mammals, birds, and amphibians. The chicken embryo has been a very popular model for developmental biology. To study the overall gene expression pattern in whole chicken embryos at different developmental stages and/or embryonic tissues, a genome-wide gene expression survey across different developmental and embryonic stages was performed (chapter 5).
The study included four different developmental stages (HH stage 3, 10, 15, 22) and eight different embryonic tissues (brain, bursa of Fabricius, heart, kidney, liver, lung, small intestine, and spleen from HH stage 36). We were able to identify several embryonic stage- and tissue-specific genes in our analysis. Genomic features of genes widely expressed under these 12 conditions suggest that widely expressed genes are more compact than tissue-specific genes, confirming the findings described in chapter 4. The analysis of the differentially expressed genes during the different developmental stages of whole embryo indicates a gradual change in gene expression during embryo development. A comparison of the gene expression profiles between the same organs of adults and embryos reveals both striking similarities and differences. The overall goal of this thesis was to improve our understanding of the mechanisms of transcriptional regulation in the chicken. In chapter 6, a transcriptome map for all chicken chromosomes is presented based on the expression data described in chapter 4. The results reveal the presence of two distinct types of chromosomal regions characterized by clusters of highly or lowly expressed genes respectively. Furthermore, these regions show a high correlation with a number of genome characteristics, like gene density, gene length, intron length, and GC content. A comparative analysis between the chicken and human transcriptome maps suggests that the regions with clusters of highly expressed genes are relatively conserved between the two genomes. Our results revealed the presence of a higher order organization of the chicken genome that affects gene expression, confirming similar observations in other species. Finally, in chapter 7 I summarize the main findings and discuss some of the limitations of the analyses described in this thesis.
I also discuss the different merits and shortcomings of studying gene expression using either microarrays or next-generation sequencing technology and propose directions for future research. The rapid developments in new-generation sequencing technology will facilitate better coverage and depth of the chicken genome. This will provide a better genome assembly and an improved genome annotation. The sequence-based approaches for studying gene expression will reduce noise levels compared to hybridization-based approaches. Overall, next-generation sequencing is already providing greatly enhanced tools to further improve our understanding of the chicken transcriptome and its regulation.

    Comparative genomics of Dothideomycete fungi

    Fungi are a diverse group of eukaryotic micro-organisms particularly suited for comparative genomics analyses. Fungi are important to industry and fundamental science, and many of them are notorious pathogens of crops, thereby endangering global food supply. Dozens of fungi have been sequenced in the last decade and, with the advances of next-generation sequencing, thousands of new genome sequences will become available in coming years. In this thesis I have used bioinformatics tools to study different biological and evolutionary processes in various genomes with a focus on the genomes of the Dothideomycete fungi Cladosporium fulvum, Dothistroma septosporum and Zymoseptoria tritici. Chapter 1 introduces the scientific disciplines of mycology and bioinformatics from a historical perspective. It exemplifies a typical whole-genome sequence analysis of a fungal genome, and focusses in particular on structural gene annotation and detection of transposable elements. In addition it briefly reviews the microRNA pathway as known in animals and plants in the context of the putative existence of similar yet subtly different small RNA pathways in other branches of the eukaryotic tree of life. Chapter 2 addresses the newly sequenced genomes of the closely related Dothideomycete plant pathogenic fungi Cladosporium fulvum and Dothistroma septosporum. Remarkably, it revealed a surprisingly high similarity at the protein level combined with striking differences at the DNA level, in gene repertoire and in gene expression. Most noticeably, the genome of C. fulvum appears to be at least twice as large, which is solely attributable to a much larger content of repetitive sequences. Chapter 3 describes a novel alignment-based fungal gene prediction method (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It shows excellent performance benchmarked on a dataset of 7,000 unigene-supported gene models from ten different fungi.
Applicability of the method was shown by revisiting the annotations of C. fulvum and D. septosporum and of various other fungal genomes from the first-generation sequencing era. Thousands of gene models were revised in each of the gene catalogues, indeed revealing a correlation to the quality of the genome assembly, and to sequencing strategies used in the sequencing centres, highlighting different types of errors in different annotation pipelines. Chapter 4 focusses on the unexpectedly high number of gene models that were identified by ABFGP that align nicely to informant genes, but only upon toleration of frame shifts and in-frame stop-codons. These discordances could represent sequence errors (SEs) and/or disruptive mutations (DMs) that caused these truncated and erroneous gene models. We revisited the same fungal gene catalogues as in chapter 3, confirmed SEs by resequencing and successively removed those, yielding a high-confidence and large dataset of nearly 1,000 pseudogenes caused by DMs. This dataset of fungal pseudogenes, containing genes listed as bona fide genes in current gene catalogues, does not correspond to various observations previously made on fungal pseudogenes. Moreover, the degree of pseudogenization, which shows up to ten-fold variation between the least and the most affected species, is generally higher in species that reproduce asexually compared to those that in addition reproduce sexually. Chapter 5 describes explorative genomics and comparative genomics analyses revealing the presence of introner-like elements (ILEs) in various Dothideomycete fungi including Zymoseptoria tritici, in which they had not been identified yet, although its genome sequence has been publicly available for several years. ILEs combine hallmark intron properties with the apparent capability of multiplying themselves as repetitive sequence. ILEs strongly associate with events of intron gain, thereby delivering in silico proof of their mobility.
Phylogenetic analyses at the intra- and inter-species level showed that most ILEs are related and likely share common ancestry. Chapter 6 provides additional evidence that ILE multiplication strongly dominates over other types of intron duplication in fungi. The observed high rate of ILE multiplication followed by rapid sequence degeneration led us to hypothesize that multiplication of ILEs has been the major cause and mechanism of intron gain in fungi, and we speculate that this could be generalized to all eukaryotes. Chapter 7 describes a new strategy for miRNA hairpin prediction using statistical distributions of observed biological variation of properties (descriptors) of known miRNA hairpins. We show that the method outperforms miRNA prediction by previous, conventional methods that usually apply threshold filtering. Using this method, several novel candidate miRNAs were assigned in the genomes of Caenorhabditis elegans and two human viruses. Although the method in this chapter is not applied to fungi, the study does provide a flexible way to find evidence for the existence of a putative miRNA-like pathway in fungi. Chapter 8 provides a general discussion on the advent of bioinformatics in mycological research and its implications. It highlights the necessity of a priori planning and integration of functional analysis and bioinformatics in order to achieve scientific excellence, and describes possible scenarios for the near future of fungal (comparative) genomics research. Moreover, it discusses the intrinsic error rate in large-scale, automatically inferred datasets and the implications of using and comparing those.
