105 research outputs found

    Considerations and complications of mapping small RNA high-throughput data to transposable elements

    BACKGROUND High-throughput sequencing (HTS) has revolutionized the way in which epigenetic research is conducted. When coupled with fully-sequenced genomes, millions of small RNA (sRNA) reads are mapped to regions of interest and the results scrutinized for clues about epigenetic mechanisms. However, this approach requires careful consideration with regard to experimental design, especially when one investigates repetitive parts of genomes such as transposable elements (TEs), or when such genomes are large, as is often the case in plants. RESULTS Here, in an attempt to shed light on complications of mapping sRNAs to TEs, we focus on the 2,300 Mb maize genome, 85% of which is derived from TEs, and scrutinize methodological strategies that are commonly employed in TE studies. These include choices for the reference dataset, the normalization of multiply mapping sRNAs, and the selection among sRNA metrics. We further examine how these choices influence the relationship between sRNAs and the critical feature of TE age, and contrast their effect on low copy genomic regions and other popular HTS data. CONCLUSIONS Based on our analyses, we share a series of take-home messages that may help with the design, implementation, and interpretation of high-throughput TE epigenetic studies specifically, but our conclusions may also apply to any work that involves analysis of HTS data.

    Highly conserved motifs in non-coding regions of Sirevirus retrotransposons: the key for their pattern of distribution within and across plants?

    BACKGROUND Retrotransposons are key players in the evolution of eukaryotic genomes. Moreover, it is now known that some retrotransposon classes, like the abundant and plant-specific Sireviruses, have intriguingly distinctive host preferences. Yet, it is largely unknown if this bias is supported by different genome structures. RESULTS We performed sensitive comparative analysis of the genomes of a large set of Ty1/copia retrotransposons. We discovered that Sireviruses are unique among Pseudoviridae in that they constitute an ancient genus characterized by vastly divergent members, which however contain highly conserved motifs in key non-coding regions: multiple polypurine tract (PPT) copies cluster upstream of the 3' long terminal repeat (3'LTR), of which the terminal PPT tethers to a distinctive attachment site and is flanked by a precisely positioned inverted repeat. Their LTRs possess a novel type of repeated motif (RM) defined by its exceptionally high copy number, symmetry and core CGG-CCG signature. These RM boxes form CpG islands and lie a short distance upstream of a conserved promoter region thus hinting towards regulatory functions. Intriguingly, in the envelope-containing Sireviruses additional boxes cluster at the 5' vicinity of the envelope. The 5'LTR/internal domain junction and a polyC-rich integrase signal are also highly conserved domains of the Sirevirus genome. CONCLUSIONS Our comparative analysis of retrotransposon genomes using advanced in silico methods highlighted the unique genome organization of Sireviruses. Their structure may dictate a life cycle with different regulation and transmission strategy compared to other Pseudoviridae, which may contribute towards their pattern of distribution within and across plants.

    Multiple evidence for the role of an Ovate-like gene in determining fruit shape in pepper

    BACKGROUND Grafting is a widely used technique contributing to sustainable and ecological production of many vegetables, but important fruit quality characters such as taste, aroma, texture and shape have long been known to be affected by grafting in important vegetable species including pepper. Of all the characters affected, fruit shape is the most easily observed and measured. From research in tomato, fruit shape is known to be controlled by many QTLs, but only a few of them have a large effect on fruit shape variance. In this study we used pepper cultivars with different fruit shape to study the role of a pepper Ovate-like gene, CaOvate, which encodes a negative regulator protein that brings significant changes in tomato fruit shape. RESULTS We successfully cloned and characterized Ovate-like genes (designated as CaOvate) from two pepper cultivars of different fruit shape, cv. "Mytilini Round" and cv. "Piperaki Long", hereafter referred to as cv. "Round" and cv. "Long" after the shape of their mature fruits. The CaOvate consensus contains a 1008-bp ORF, encodes a 335 amino-acid polypeptide, shares 63% identity with the tomato OVATE protein and exhibits high similarity with OVATE sequences from other Solanaceae species, all placed in the same protein subfamily as outlined by expert sequence analysis. No significant structural differences were detected between the CaOvate genes obtained from the two cultivars. However, relative quantitative expression analysis showed that the expression of CaOvate followed a different developmental profile between the two cultivars, being higher in cv. "Round". Furthermore, down-regulation of CaOvate through VIGS in cv. "Round" changes its fruit to a more oblong form indicating that CaOvate is indeed involved in determining fruit shape in pepper, perhaps by negatively affecting the expression of its target gene, CaGA20ox1, also studied in this work. CONCLUSIONS Herein, we clone, characterize and study CaOvate and CaGA20ox1 genes, very likely involved in shaping pepper fruit. The oblong phenotype of the fruits in a plant of cv. "Round", where we observed a significant reduction in the expression levels of CaOvate, resembled the change in shape that takes place by grafting the round-fruited cultivar cv. "Round" onto the long-fruited pepper cultivar cv. "Long". Understanding the role of CaOvate and CaGA20ox1, as well as of other genes like Sun also involved in controlling fruit shape in Solanaceae plants like tomato, paves the way to better understand the molecular mechanisms involved in controlling fruit shape in Solanaceae plants in general, and pepper in particular, as well as the changes in fruit quality induced after grafting and perhaps the ways to mitigate them.

    A role for palindromic structures in the cis-region of maize Sirevirus LTRs in transposable element evolution and host epigenetic response

    Transposable elements (TEs) proliferate within the genome of their host, which responds by silencing them epigenetically. Much is known about the mechanisms of silencing in plants, particularly the role of siRNAs in guiding DNA methylation. In contrast, little is known about siRNA targeting patterns along the length of TEs, yet this information may provide crucial insights into the dynamics between hosts and TEs. By focusing on 6456 carefully annotated, full-length Sirevirus LTR retrotransposons in maize, we show that their silencing associates with underlying characteristics of the TE sequence and also uncover three features of the host–TE interaction. First, siRNA mapping varies among families and among elements, but particularly along the length of elements. Within the cis-regulatory portion of the LTRs, a complex palindrome-rich region acts as a hotspot of both siRNA matching and sequence evolution. These patterns are consistent across leaf, tassel, and immature ear libraries, but particularly emphasized for floral tissues and 21- to 22-nt siRNAs. Second, this region has the ability to form hairpins, making it a potential template for the production of miRNA-like, hairpin-derived small RNAs. Third, Sireviruses are targeted by siRNAs as a decreasing function of their age, but the oldest elements remain highly targeted, partially by siRNAs that cross-map to the youngest elements. We show that the targeting of older Sireviruses reflects their conserved palindromes. Altogether, we hypothesize that the palindromes aid the silencing of active elements and influence transposition potential, siRNA targeting levels, and ultimately the fate of an element within the genome.

    Expansion of the BioCyc collection of pathway/genome databases to 160 genomes

    The BioCyc database collection is a set of 160 pathway/genome databases (PGDBs) for most eukaryotic and prokaryotic species whose genomes have been completely sequenced to date. Each PGDB in the BioCyc collection describes the genome and predicted metabolic network of a single organism, inferred from the MetaCyc database, which is a reference source on metabolic pathways from multiple organisms. In addition, each bacterial PGDB includes predicted operons for the corresponding species. The BioCyc collection provides a unique resource for computational systems biology, namely global and comparative analyses of genomes and metabolic networks, and a supplement to the BioCyc resource of curated PGDBs. The Omics viewer available through the BioCyc website allows scientists to visualize combinations of gene expression, proteomics and metabolomics data on the metabolic maps of these organisms. This paper discusses the computational methodology by which the BioCyc collection has been expanded, and presents an aggregate analysis of the collection that includes the range of number of pathways present in these organisms, and the most frequently observed pathways. We seek scientists to adopt and curate individual PGDBs within the BioCyc collection. Only by harnessing the expertise of many scientists can we hope to produce biological databases that accurately reflect the depth and breadth of knowledge that the biomedical research community is producing.

    ARResT/AssignSubsets: a novel application for robust subclassification of chronic lymphocytic leukemia based on B cell receptor IG stereotypy.

    Motivation: An ever-increasing body of evidence supports the importance of B cell receptor immunoglobulin (BcR IG) sequence restriction, alias stereotypy, in chronic lymphocytic leukemia (CLL). This phenomenon accounts for ∼30% of studied cases, one in eight of which belong to major subsets, and extends beyond restricted sequence patterns to shared biologic and clinical characteristics and, generally, outcome. Thus, the robust assignment of new cases to major CLL subsets is a critical, and yet unmet, requirement. Results: We introduce a novel application, ARResT/AssignSubsets, which enables the robust assignment of BcR IG sequences from CLL patients to major stereotyped subsets. ARResT/AssignSubsets uniquely combines expert immunogenetic sequence annotation from IMGT/V-QUEST with curation to safeguard quality, statistical modeling of sequence features from more than 7500 CLL patients, and results from multiple perspectives to allow for both objective and subjective assessment. We validated our approach on the learning set, and evaluated its real-world applicability on a new representative dataset comprising 459 sequences from a single institution. Availability and implementation: ARResT/AssignSubsets is freely available on the web at http://bat.infspire.org/arrest/assignsubsets/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.