6 research outputs found

    Tools and databases for solving problems in detection and identification of repetitive DNA sequences

    Genome compartments known to carry out very important biological functions (e.g. centromeres and telomeres) consist mostly of repetitive sequences. At the same time, genome regions enriched in repetitive sequences have always presented great technical challenges for sequence alignment and genome assembly. Fast-evolving sequencing technologies and the increasing accessibility of genomic datasets have opened the opportunity to gain new insights into poorly explored genome fractions built of repetitive DNA. Comprehensive and accurate annotation and characterization of these sequences is therefore an important contribution to the understanding of genomic architecture and function as a whole. To meet the emerging needs in repeat analysis and characterization, many bioinformatics tools, databases and pipelines have been developed. This review is intended to draw attention to the problems encountered in genomic studies of repetitive sequences and to provide an overview of the most prominent bioinformatics tools used for gaining better insight into these important genomic components. Some of the described tools detect a wide range of repeats, while others target a specific type of repetitive DNA sequence and were developed in response to specific research interests and needs of the scientific community.

    Genome sequence and evolution of Betula platyphylla

    Betula L. (birch) is a pioneer hardwood tree species with ecological, economic, and evolutionary importance in the Northern Hemisphere. We sequenced the Betula platyphylla genome and assembled the sequences into 14 chromosomes. The Betula genome lacks evidence of recent whole-genome duplication and has the same paleoploidy level as Vitis vinifera and Prunus mume. Phylogenetic analysis of lignin pathway genes coupled with tissue-specific expression patterns provided clues for understanding the formation of higher ratios of syringyl to guaiacyl lignin observed in Betula species. Our transcriptome analysis of leaf tissues under a time-series cold stress experiment revealed the presence of the MEKK1–MKK2–MPK4 cascade and six additional mitogen-activated protein kinases that can be linked to a gene regulatory network involving many transcription factors and cold tolerance genes. Our genomic and transcriptome analyses provide insight into the structures, features, and evolution of the B. platyphylla genome. The chromosome-level genome and gene resources of B. platyphylla obtained in this study will facilitate the identification of essential genes governing important traits of trees and the genetic improvement of B. platyphylla.

    Bioinformatics Tools and Genomic Resources Available in Understanding the Structure and Function of Gossypium

    Cotton is an economically and evolutionarily important crop, valued for its fiber. To improve fiber quality and yield, and to exploit the natural genetic potential inherent in genotypes, it is important to understand the genome structure and function of cultivated cotton. This in turn requires a working knowledge of bioinformatics resources such as databases, software solutions, and analysis tools. Currently, however, there are very few unified reports on bioinformatics tools and even fewer repositories for accessing cotton genomic information. Resourceful developers and bioinformatics scientists actively addressing complex genomic challenges in cotton genomes are also much needed. The primary goal of this chapter is to provide a review of such tools and resources for analyzing the structure and function of the cotton genome, with preferential emphasis on this complex and economically important plant species. This discourse begins with a description of concurrent advances in high‐throughput genome sequencing and bioinformatics analyses and focuses on four major sections covering bioinformatics tools and resources for analysis of: (1) genomes; (2) transcriptomes; (3) small RNAs; and (4) epigenomes. Each section discusses recent advances in cotton and outlines cotton genome sequencing and annotation efforts. This review discusses the availability of genome information for both diploid and tetraploid species, which has impelled cotton genome research into the post‐genomics era, opening new avenues for exploring regulatory mechanisms associated with fine‐tuning of the expression of fiber‐related genes. Finally, the potential impacts of these rapid advances, especially the challenges in handling and analyzing the large datasets, are discussed.

    Conservative route to genome compaction in a miniature annelid

    The causes and consequences of genome reduction in animals are unclear because our understanding of this process mostly relies on lineages with often exceptionally high rates of evolution. Here, we decode the compact 73.8-megabase genome of Dimorphilus gyrociliatus, a meiobenthic segmented worm. The D. gyrociliatus genome retains traits classically associated with larger and slower-evolving genomes, such as an ordered, intact Hox cluster, a generally conserved developmental toolkit and traces of ancestral bilaterian linkage. Unlike some other animals with small genomes, the analysis of the D. gyrociliatus epigenome revealed canonical features of genome regulation, excluding the presence of operons and trans-splicing. Instead, the gene-dense D. gyrociliatus genome presents a divergent Myc pathway, a key physiological regulator of growth, proliferation and genome stability in animals. Altogether, our results uncover a conservative route to genome compaction in annelids, reminiscent of that observed in the vertebrate Takifugu rubripes.

    MITE Digger, an efficient and accurate algorithm for genome wide discovery of miniature inverted repeat transposable elements

    Background: Miniature inverted repeat transposable elements (MITEs) are abundant non-autonomous elements that play important roles in shaping gene and genome evolution. Their characteristic structural features are suitable for automated identification by computational approaches; however, de novo MITE discovery at the genomic level is still resource-expensive, so efficient and accurate computational tools are desirable. Existing algorithms process every member of a MITE family, so a major portion of the computing task is redundant. Results: In this study, redundant computing steps were analyzed, and a novel algorithm emphasizing the reduction of such redundant computing was implemented in MITE Digger. It completed processing the whole rice genome sequence database in ~15 hours and produced 332 MITE candidates with low false positive (1.8%) and false negative (0.9%) rates. MITE Digger was also tested for genome-wide MITE discovery with four other genomes. Conclusions: MITE Digger is efficient and accurate for genome-wide retrieval of MITEs. Its user-friendly interface further facilitates genome-wide analyses of MITEs on a routine basis. The MITE Digger program is available at: http://labs.csb.utoronto.ca/yang/MITEDigger
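    The "characteristic structural features" mentioned in the abstract above are not spelled out there. One well-known feature of MITEs is a pair of terminal inverted repeats (TIRs), where the end of the element is the reverse complement of its start. The sketch below is a toy check for that property only. It is not MITE Digger's algorithm, and the function names, the default `tir_len` of 10, and the `max_mismatches` parameter are all illustrative assumptions.

```python
# Toy illustration only: this is NOT the MITE Digger algorithm.
# All names and default values here are hypothetical.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_terminal_inverted_repeat(
    seq: str, tir_len: int = 10, max_mismatches: int = 0
) -> bool:
    """Check whether the first tir_len bases of seq match the reverse
    complement of its last tir_len bases, allowing max_mismatches."""
    if tir_len <= 0 or len(seq) < 2 * tir_len:
        return False
    left = seq[:tir_len].upper()
    right_rc = reverse_complement(seq[-tir_len:]).upper()
    mismatches = sum(a != b for a, b in zip(left, right_rc))
    return mismatches <= max_mismatches


if __name__ == "__main__":
    # A made-up candidate: a 10-bp arm, a short spacer, and the arm's
    # reverse complement at the other end.
    arm = "GGCCAGTCAC"
    candidate = arm + "TTTTTTTT" + reverse_complement(arm)
    print(has_terminal_inverted_repeat(candidate))
    print(has_terminal_inverted_repeat("A" * 30))
```

    A real genome-wide search would also need to handle candidate boundaries, target site duplications, and grouping of family members, none of which this sketch attempts.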