305 research outputs found

    Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure

    13 pages, 6 figures. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, that these genes have a major influence in population genetics, and that transposable elements play a key role in pan-genome evolution. The work conducted by the US DOE Joint Genome Institute is supported by the Office of Science of the US Department of Energy under Contract no. DE-AC02-05CH11231. D.P.W. and R.A. were funded in part by the National Science Foundation (grant no. IOS–1258126), and the Great Lakes Bioenergy Research Center (Department of Energy Biological and Environmental Research Office of Science grant no. DE–FCO2–07ER64494). TEJ and DLDM were supported by NSF PGRP grant IOS-0922457. P.C. and B.C.M. were funded by Spanish MINECO (CGL2012-39953-C02-01 and CGL2016-79790-P). B.C.M. was partially funded by DGA-Obra Social La Caixa (grant number GA-LC-059-2011) and Spanish MINECO (AGL2013-48756-R, CSIC13-4E-2490). PC was partially funded by Spanish Aragon Government-European Social Fund (Bioflora). Peer reviewed.
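
    The core versus differentially present distinction drawn in this abstract can be illustrated with simple set operations. The sketch below is only a toy illustration, not the authors' pipeline: the function name, the line labels and the gene identifiers are all hypothetical, and real pan-genome construction depends on assembly, annotation and gene clustering steps that are not shown.

```python
from typing import Dict, Set, Tuple


def classify_pan_genome(
    gene_sets: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split a pan-genome into core and dispensable genes.

    gene_sets maps a line name to the set of gene identifiers annotated in it.
    Returns (pan, core, dispensable):
      pan         - genes present in at least one line
      core        - genes present in every line
      dispensable - genes present in only some lines
    """
    if not gene_sets:
        raise ValueError("at least one line is required")
    per_line = list(gene_sets.values())
    pan = set().union(*per_line)
    core = set.intersection(*per_line)
    return pan, core, pan - core


# Hypothetical toy data: three lines, five gene families.
toy = {
    "line_A": {"g1", "g2", "g3"},
    "line_B": {"g1", "g2", "g4"},
    "line_C": {"g1", "g3", "g5"},
}
pan, core, dispensable = classify_pan_genome(toy)
print(len(pan), sorted(core), sorted(dispensable))
# prints: 5 ['g1'] ['g2', 'g3', 'g4', 'g5']
```

    In this toy case the pan-genome (5 genes) is larger than any single line (3 genes), which is the pattern the abstract reports at genome scale.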

    Genome-wide repeat dynamics reflect phylogenetic distance in closely related allotetraploid Nicotiana (Solanaceae)

    Nicotiana sect. Repandae is a group of four allotetraploid species originating from a single allopolyploidisation event approximately 5 million years ago. Previous phylogenetic analyses support the hypothesis of N. nudicaulis as sister to the other three species. This is concordant with changes in genome size, separating those with genome downsizing (N. nudicaulis) from those with genome upsizing (N. repanda, N. nesophila, N. stocktonii). However, a recent analysis reflecting genome dynamics of different transposable element families reconstructed greater similarity between N. nudicaulis and the Revillagigedo Island taxa (N. nesophila and N. stocktonii), thereby placing N. repanda as sister to the rest of the group. This could reflect a different phylogenetic hypothesis or the unique evolutionary history of these particular elements. Here we re-examine relationships in this group and investigate genome-wide patterns in repetitive DNA, utilising high-throughput sequencing and a genome skimming approach. Repetitive DNA clusters provide support for N. nudicaulis as sister to the rest of the section, with N. repanda sister to the two Revillagigedo Island species. Clade-specific patterns in the occurrence and abundance of particular repeats confirm the original (N. nudicaulis (N. repanda (N. nesophila + N. stocktonii))) hypothesis. Furthermore, overall repeat dynamics in the island species N. nesophila and N. stocktonii confirm their similarity to N. repanda and the distinctive patterns between these three species and N. nudicaulis. Together these results suggest that broad-scale repeat dynamics do in fact reflect evolutionary history and could be predicted based on phylogenetic distance.

    2R and remodeling of vertebrate signal transduction engine

    Background: Whole genome duplication (WGD) is a special case of gene duplication, observed rarely in animals, whereby all genes duplicate simultaneously through polyploidisation. Two rounds of WGD (2R-WGD) occurred at the base of vertebrates, giving rise to an enormous wave of genetic novelty, but a systematic analysis of functional consequences of this event has not yet been performed.
    Results: We show that 2R-WGD affected an overwhelming majority (74%) of signalling genes, in particular developmental pathways involving receptor tyrosine kinases, Wnt and transforming growth factor-β ligands, G protein-coupled receptors and the apoptosis pathway. 2R-retained genes, in contrast to tandem duplicates, were enriched in protein interaction domains and multifunctional signalling modules of Ras and mitogen-activated protein kinase cascades. 2R-WGD had a fundamental impact on the cell-cycle machinery, redefined molecular building blocks of the neuronal synapse, and was formative for vertebrate brains. We investigated 2R-associated nodes in the context of the human signalling network, as well as in an inferred ancestral pre-2R (AP2R) network, and found that hubs (particularly involving negative regulation) were preferentially retained, with high connectivity driving retention. Finally, microarrays and proteomics demonstrated a trend for gradual paralog expression divergence independent of the duplication mechanism, but inferred ancestral expression states suggested preferential subfunctionalisation among 2R-ohnologs (2ROs).
    Conclusions: The 2R event left an indelible imprint on vertebrate signalling and the cell cycle. We show that 2R-WGD preferentially retained genes are associated with higher organismal complexity (for example, locomotion, nervous system, morphogenesis), while genes associated with basic cellular functions (for example, translation, replication, splicing, recombination; with the notable exception of cell cycle) tended to be excluded. 2R-WGD set the stage for the emergence of key vertebrate functional novelties (such as complex brains, circulatory system, heart, bone, cartilage, musculature and adipose tissue). A full explanation of the impact of 2R on evolution, function and the flow of information in vertebrate signalling networks is likely to have practical consequences for regenerative medicine, stem cell therapies and cancer treatment.

    Screening synteny blocks in pairwise genome comparisons through integer programming

    Background: It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events.
    Results: We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons).
    Conclusions: The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
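
    The quota-based screening described above can be sketched on a toy instance. This is not the QUOTA-ALIGN code: the `Block` type, the `depth_a`/`depth_b` parameters and the toy coordinates are assumptions made for illustration, and exhaustive enumeration of subsets stands in for the integer programming solver the abstract mentions (so it only scales to a handful of blocks). Each block is a binary choice, the objective is the summed score, and the constraints cap how many selected blocks may overlap any position on each genome.

```python
from itertools import combinations
from typing import List, NamedTuple, Sequence, Tuple


class Block(NamedTuple):
    """A hypothetical synteny block with inclusive spans on genomes A and B."""
    name: str
    a_span: Tuple[int, int]
    b_span: Tuple[int, int]
    score: float


def max_depth(spans: Sequence[Tuple[int, int]]) -> int:
    """Largest number of spans covering a single position.

    For closed intervals the maximum is always reached at some span start,
    so only those positions are checked.
    """
    return max(
        (sum(1 for lo, hi in spans if lo <= start <= hi) for start, _ in spans),
        default=0,
    )


def screen_blocks(
    blocks: Sequence[Block], depth_a: int, depth_b: int
) -> Tuple[List[Block], float]:
    """Pick the subset of blocks with the highest total score such that no
    position on genome A is covered more than depth_a times and no position
    on genome B more than depth_b times (brute force over all subsets)."""
    best: Tuple[Block, ...] = ()
    best_score = 0.0
    for k in range(1, len(blocks) + 1):
        for subset in combinations(blocks, k):
            if max_depth([b.a_span for b in subset]) > depth_a:
                continue
            if max_depth([b.b_span for b in subset]) > depth_b:
                continue
            total = sum(b.score for b in subset)
            if total > best_score:
                best, best_score = subset, total
    return list(best), best_score


# Toy instance: genome B had its own WGD, so each A region may match two
# B regions (depth_a=2) while each B region matches one A region (depth_b=1).
toy_blocks = [
    Block("b1", (1, 100), (1, 100), 90.0),
    Block("b2", (1, 100), (201, 300), 80.0),
    Block("b3", (1, 100), (401, 500), 30.0),  # weak block, e.g. an older duplication
    Block("b4", (150, 250), (50, 120), 40.0),  # overlaps b1 on genome B
]
kept, total = screen_blocks(toy_blocks, depth_a=2, depth_b=1)
print([b.name for b in kept], total)
```

    With these toy values the two strongest blocks over the same A region are kept and the weak third copy is screened out, which mirrors the 1:2 example in the abstract.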

    Deconstruction of the (Paleo)Polyploid Grapevine Genome Based on the Analysis of Transposition Events Involving NBS Resistance Genes

    Plants have followed a reticulate type of evolution and taxa have frequently merged via allopolyploidization. A polyploid structure of sequenced genomes has often been proposed, but the chromosomes belonging to putative component genomes are difficult to identify. The 19 grapevine chromosomes are evolutionarily stable structures: their homologous triplets have strongly conserved gene order, interrupted by rare translocations. The aim of this study is to examine how the grapevine nucleotide-binding site (NBS)-encoding resistance (NBS-R) genes have evolved in the genomic context and to understand the mechanisms of genome evolution. We show that, in grapevine, i) helitrons have significantly contributed to transposition of NBS-R genes, and ii) NBS-R gene cluster similarity indicates the existence of two groups of chromosomes (named Va and Vc) that may have evolved independently. Chromosome triplets consist of two Va chromosomes and one Vc chromosome, as expected from the tetraploid and diploid conditions of the two component genomes. The hexaploid state could have been derived from either allopolyploidy or the separation of the Va and Vc component genomes in the same nucleus before fusion, as known for Rosaceae species. Time estimation indicates that grapevine component genomes may have fused about 60 mya, having had at least 40–60 mya to evolve independently. Chromosome number variation in the Vitaceae and related families, and the gap between the time of eudicot radiation and the age of Vitaceae fossils, are accounted for by our hypothesis.

    A Position Effect on the Heritability of Epigenetic Silencing

    In animals and yeast, position effects have been well documented. In animals, the best example of this process is Position Effect Variegation (PEV) in Drosophila melanogaster. In PEV, when genes are moved into close proximity to constitutive heterochromatin, their expression can become unstable, resulting in variegated patches of gene expression. This process is regulated by a variety of proteins implicated in both chromatin remodeling and RNAi-based silencing. A similar phenomenon is observed when transgenes are inserted into heterochromatic regions in fission yeast. In contrast, there are few examples of position effects in plants, and there are no documented examples in either plants or animals for positions that are associated with the reversal of previously established silenced states. MuDR transposons in maize can be heritably silenced by a naturally occurring rearranged version of MuDR. This element, Muk, produces a long hairpin RNA molecule that can trigger DNA methylation and heritable silencing of one or many MuDR elements. In most cases, MuDR elements remain inactive even after Muk segregates away. Thus, Muk-induced silencing involves a directed and heritable change in gene activity in the absence of changes in DNA sequence. Using classical genetic analysis, we have identified an exceptional position at which MuDR element silencing is unstable. Muk effectively silences the MuDR element at this position. However, after Muk is segregated away, element activity is restored. This restoration is accompanied by a reversal of DNA methylation. To our knowledge, this is the first documented example of a position effect that is associated with the reversal of epigenetic silencing. This observation suggests that there are cis-acting sequences that alter the propensity of an epigenetically silenced gene to remain inactive. 
This raises the interesting possibility that an important feature of local chromatin environments may be the capacity to erase previously established epigenetic marks.

    An Autotetraploid Linkage Map of Rose (Rosa hybrida) Validated Using the Strawberry (Fragaria vesca) Genome Sequence

    Polyploidy is a pivotal process in plant evolution, as it increases gene redundancy and morphological intricacy, but due to the complexity of polysomic inheritance only a few genetic maps of autopolyploid organisms are available. A robust mapping framework is particularly important in polyploid crop species, rose included (2n = 4x = 28), where the objective is to study multiallelic interactions that control traits of value for plant breeding. From a cross between the garden, peach red and fragrant cultivar Fragrant Cloud (FC) and a cut-rose yellow cultivar Golden Gate (GG), we generated an autotetraploid GGFC mapping population consisting of 132 individuals. For the map we used 128 sequence-based markers, 141 AFLP, 86 SSR and three morphological markers. Seven linkage groups were resolved for FC (total 632 cM) and GG (616 cM), which were validated by markers that segregated in both parents as well as by the diploid integrated consensus map.

    Using Genomic Sequencing for Classical Genetics in E. coli K12

    Here we develop computational methods to facilitate use of 454 whole genome shotgun sequencing to identify mutations in Escherichia coli K12. We had Roche sequence eight related strains derived as spontaneous mutants in a background without a whole genome sequence. They provided difference tables based on assembling each genome to reference strain E. coli MG1655 (NC_000913). Due to the evolutionary distance to MG1655, these contained a large number of both false negatives and false positives. By manual analysis of the dataset, we detected all the known mutations (24 at nine locations) and identified and genetically confirmed new mutations necessary and sufficient for the phenotypes we had selected in four strains. We then had Roche assemble contigs de novo, which we further assembled to full-length pseudomolecules based on synteny with MG1655. This hybrid method facilitated detection of insertion mutations and allowed annotation from MG1655. After removing one genome with less than the optimal 20- to 30-fold sequence coverage, we identified 544 putative polymorphisms that included all of the known and selected mutations apart from insertions. Finally, we detected seven new mutations in a total of only 41 candidates by comparing single genomes to composite data for the remaining six and using a ranking system to penalize homopolymer sequencing and misassembly errors. An additional benefit of the analysis is a table of differences between MG1655 and a physiologically robust E. coli wild-type strain NCM3722. Both projects were greatly facilitated by use of comparative genomics tools in the CoGe software package (http://genomevolution.org/).
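
    The abstract does not specify how its ranking system works, so the following is only a hypothetical sketch of one ingredient such a ranking could use: measuring the homopolymer run around each candidate position and listing candidates in long runs last, since homopolymer length errors are a well-known weakness of 454 pyrosequencing. The function names, the toy sequence and the candidate positions are all assumptions.

```python
from typing import List


def homopolymer_run_length(seq: str, pos: int) -> int:
    """Length of the run of identical bases that contains seq[pos] (0-based)."""
    base = seq[pos]
    left = pos
    while left > 0 and seq[left - 1] == base:
        left -= 1
    right = pos
    while right < len(seq) - 1 and seq[right + 1] == base:
        right += 1
    return right - left + 1


def rank_candidates(seq: str, positions: List[int]) -> List[int]:
    """Order candidate positions so that those inside short homopolymer runs
    come first; ties are broken by position."""
    return sorted(positions, key=lambda p: (homopolymer_run_length(seq, p), p))


# Hypothetical reference fragment and candidate polymorphism positions.
reference = "ACGTTTTTGCAAC"
candidates = [5, 1, 10]
print(rank_candidates(reference, candidates))
# prints: [1, 10, 5]
```

    Position 5 sits in a run of five T bases and is therefore ranked last; a fuller ranking would presumably combine this with coverage and assembly evidence, which the sketch does not attempt.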