11 research outputs found

    BIOINFORMATICS

    Transposable element annotation of the rice genome

    The evolutionary fate of MULE-mediated duplications of host gene fragments in rice

    DNA transposons are known to frequently capture duplicated fragments of host genes. The evolutionary impact of this phenomenon depends on how frequently the fragments retain protein-coding function as opposed to becoming pseudogenes. Gene fragment duplication by Mutator-like elements (MULEs) has previously been documented in maize, Arabidopsis, and rice. Here we present a rigorous genome-wide analysis of MULEs in the model plant Oryza sativa (domesticated rice). We identify 8274 MULEs with intact termini and target-site duplications (TSDs) and show that 1337 of them contain duplicated host gene fragments. Through a detailed examination of the 5% of duplicated gene fragments that are transcribed, we demonstrate that virtually all cases contain pseudogenic features such as fragmented conserved protein domains, frameshifts, and premature stop codons. In addition, we show that the distribution of the ratio of nonsynonymous to synonymous amino acid substitution rates for the duplications agrees with the expected distribution for pseudogenes. We conclude that MULE-mediated host gene duplication results in the formation of pseudogenes, not novel functional protein-coding genes; however, the transcribed duplications possess characteristics consistent with a potential role in the regulation of host gene expression.

    A Gene Family Derived from Transposable Elements during Early Angiosperm Evolution Has Reproductive Fitness Benefits in <em>Arabidopsis thaliana</em>

    The benefits of ever-growing numbers of sequenced eukaryotic genomes will not be fully realized until we learn to decipher vast stretches of noncoding DNA, largely composed of transposable elements. Transposable elements persist through self-replication, but some genes once encoded by transposable elements have, through a process called molecular domestication, evolved new functions that increase fitness. Although they have conferred numerous adaptations, the number of such domesticated transposable element genes remains unknown, so their evolutionary and functional impact cannot be fully assessed. Systematic searches that exploit genomic signatures of natural selection have been employed to identify potential domesticated genes, but their predictions have yet to be experimentally verified. To this end, we investigated a family of domesticated genes called <em>MUSTANG</em> (<em>MUG</em>), identified in a previous bioinformatic search of plant genomes. We show that <em>MUG</em> genes are functional. Mutants of <em>Arabidopsis thaliana MUG</em> genes yield phenotypes with severely reduced plant fitness through decreased plant size, delayed flowering, abnormal development of floral organs, and markedly reduced fertility. <em>MUG</em> genes are present in all flowering plants, but not in any non-flowering plant lineages, such as gymnosperms, suggesting that the molecular domestication of <em>MUG</em> may have been an integral part of early angiosperm evolution. This study shows that systematic searches can be successful at identifying functional genetic elements in noncoding regions and demonstrates how to combine systematic searches with reverse genetics in a fruitful way to decipher eukaryotic genomes.

    <i>MUSTANG</i> phylogeny and gene structure in <i>A. thaliana</i>.

    (A) <i>MUG</i> phylogeny in nine angiosperm species. At, <i>Arabidopsis thaliana</i>; Bd, <i>Brachypodium distachyon</i>; Cp, <i>Carica papaya</i>; Mg, <i>Mimulus guttatus</i>; Mt, <i>Medicago truncatula</i>; Os, <i>Oryza sativa</i>; Sb, <i>Sorghum bicolor</i>; Vv, <i>Vitis vinifera</i>; Zm, <i>Zea mays</i>. See <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002931#pgen.1002931.s001" target="_blank">Figure S1</a> for sequences and <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002931#pgen.1002931.s008" target="_blank">Table S5</a> for locus IDs. All bootstrap values are >70% (not shown). Cp7 is truncated; its position is approximate. (B) Graphical representation of <i>At-MUG1</i>, <i>At-MUG2</i>, <i>At-MUG7</i>, and <i>At-MUG8</i> gene transcripts. Bold horizontal lines represent transcripts, dips introns, and rectangles coding sequences.

    Detection and genomic analysis of BRAF fusions in Juvenile Pilocytic Astrocytoma through the combination and integration of multi-omic data

    Background: Juvenile Pilocytic Astrocytomas (JPAs) are one of the most common pediatric brain tumors, and they are driven by aberrant activation of the mitogen-activated protein kinase (MAPK) signaling pathway. RAF fusions are the most common genetic alterations identified in JPAs, with the prototypical KIAA1549-BRAF fusion leading to loss of BRAF’s auto-inhibitory domain and subsequent constitutive kinase activation. JPAs are highly vascular and show pervasive immune infiltration, which can lead to low tumor cell purity in clinical samples. As a result, gene fusions can be difficult to detect with conventional omics approaches, including RNA-Seq. Methods: To address this, we applied RNA-Seq together with linked-read whole-genome sequencing and in situ Hi-C as new approaches to detect and characterize low-frequency gene fusions at the genomic, transcriptomic and spatial level. Results: Integration of these datasets allowed the identification and detailed characterization of two novel BRAF fusion partners, PTPRZ1 and TOP2B, in addition to the canonical fusion with partner KIAA1549. Additionally, our Hi-C datasets enabled investigations of 3D genome architecture in JPAs, which showed a high level of correlation in 3D compartment annotations between JPAs compared to other pediatric tumors, and high similarity to normal adult astrocytes. We detected interactions between BRAF and its fusion partners exclusively in tumor samples containing BRAF fusions. Conclusions: We demonstrate the power of integrating multi-omic datasets to identify low-frequency fusions and characterize the JPA genome at high resolution. We suggest that linked reads and Hi-C could be used in the clinic for the detection and characterization of JPAs.

    Sequencing strategies and characterization of 721 vervet monkey genomes for future genetic analyses of medically relevant traits

    BACKGROUND: We report here the first genome-wide high-resolution polymorphism resource for non-human primate (NHP) association and linkage studies, constructed for the Caribbean-origin vervet monkey, or African green monkey (Chlorocebus aethiops sabaeus), one of the most widely used NHPs in biomedical research. We generated this resource by whole genome sequencing (WGS) of monkeys from the Vervet Research Colony (VRC), an NIH-supported research resource for which extensive phenotypic data are available. RESULTS: We identified genome-wide single nucleotide polymorphisms (SNPs) by WGS of 721 members of an extended pedigree from the VRC. From high-depth WGS data we identified more than 4 million polymorphic unequivocal segregating sites; by pruning these SNPs based on heterozygosity, quality control filters, and the degree of linkage disequilibrium (LD) between SNPs, we constructed genome-wide panels suitable for genetic association (about 500,000 SNPs) and linkage analysis (about 150,000 SNPs). To further enhance the utility of these resources for linkage analysis, we used a further pruned subset of the linkage panel to generate multipoint identity by descent matrices. CONCLUSIONS: The genetic and phenotypic resources now available for the VRC and other Caribbean-origin vervets enable their use for genetic investigation of traits relevant to human diseases. BMC Biol 2015 Jun 20; 13:41

    The genome of the vervet (Chlorocebus aethiops sabaeus)

    We describe a genome reference of the African green monkey or vervet (Chlorocebus aethiops). This member of the Old World monkey (OWM) superfamily is uniquely valuable for genetic investigations of simian immunodeficiency virus (SIV), for which it is the most abundant natural host species, and of a wide range of health-related phenotypes assessed in Caribbean vervets (C. a. sabaeus), whose numbers have expanded dramatically since Europeans introduced small numbers of their ancestors from West Africa during the colonial era. We use the reference to characterize the genomic relationship between vervets and other primates, the intra-generic phylogeny of vervet subspecies, and genome-wide structural variations of a pedigreed C. a. sabaeus population. Through comparative analyses with human and rhesus macaque, we characterize at high resolution the unique chromosomal fission events that differentiate the vervets and their close relatives from most other catarrhine primates, in whom karyotype is highly conserved. We also provide a summary of transposable elements and contrast these with the rhesus macaque and human. Analysis of sequenced genomes representing each of the main vervet subspecies supports previously hypothesized relationships between these populations, which range across most of sub-Saharan Africa, while uncovering high levels of genetic diversity within each. Sequence-based analyses of major histocompatibility complex (MHC) polymorphisms reveal extremely low diversity in Caribbean C. a. sabaeus vervets, compared to vervets from putatively ancestral West African regions. In the C. a. sabaeus research population, we discover the first structural variations that are, in some cases, predicted to have a deleterious effect; future studies will determine the phenotypic impact of these variations. Funding to R.K.W. was provided by NIH-NHGRI grant 5U54HG00307907. Support for the Vervet Research Colony was provided by NIH grant RR019963/OD010965 to J.R.K. Funding to N.B.F. was provided by NIH grants R01RR016300 and R01OD010980. The French National Agency for Research on AIDS and Viral Hepatitis (ANRS) provided funding to M.C.M.-T. Funding to M.R. and R.S. was provided by the Ministero della Universita’ e della Ricerca. Funding to K.D. was provided by Genome Canada and Genome Quebec. B.A. and R.N. acknowledge support from the Wellcome Trust (grant number WT095908) and the European Molecular Biology Laboratory.