97 research outputs found

    Identification of novel non-coding RNAs using profiles of short sequence reads from next generation sequencing data

    Background: The increasing interest in small non-coding RNAs (ncRNAs) such as microRNAs (miRNAs), small interfering RNAs (siRNAs) and Piwi-interacting RNAs (piRNAs) and recent advances in sequencing technology have yielded large numbers of short (18-32 nt) RNA sequences from different organisms, some of which are derived from small nucleolar RNAs (snoRNAs) and transfer RNAs (tRNAs). We observed that these short ncRNAs frequently cover the entire length of annotated snoRNAs or tRNAs, which suggests that other loci specifying similar ncRNAs can be identified by clusters of short RNA sequences. Results: We combined publicly available datasets of tens of millions of short RNA sequence tags from Drosophila melanogaster, and mapped them to the Drosophila genome. Approximately 6 million perfectly mapping sequence tags were then assembled into 521,302 tag-contigs (TCs) based on tag overlap. Most transposon-derived sequences, exons and annotated miRNAs, tRNAs and snoRNAs are detected by TCs, which show distinct patterns of length and tag-depth for different categories. The typical length and tag-depth of snoRNA-derived TCs was used to predict 7 previously unrecognized box H/ACA and 26 box C/D snoRNA candidates. We also identified one snRNA candidate and 86 loci with a high number of tags that are yet to be annotated, 7 of which have a particular 18mer motif and are located in introns of genes involved in development. A subset of new snoRNA candidates and putative ncRNA candidates was verified by Northern blot. Conclusions: In this study, we have introduced a new approach to identify new members of known classes of ncRNAs based on the features of TCs corresponding to known ncRNAs. A large number of the identified TCs are yet to be examined experimentally, suggesting that many more novel ncRNAs remain to be discovered.

    Diversity of immunoglobulin light chain genes in non-teleost ray-finned fish uncovers IgL subdivision into five ancient isotypes

    The aim of this study was to fill important gaps in the evolutionary history of immunoglobulins by examining the structure and diversity of IgL genes in non-teleost ray-finned fish. First, based on the bioinformatic analysis of recent transcriptomic and genomic resources, we experimentally characterized the IgL genes in the chondrostean fish, Acipenser ruthenus (sterlet). We show that this species has three loci encoding IgL kappa-like chains with a translocon-type gene organization and a single VJC cluster, encoding homogeneous lambda-like light chain. In addition, sterlet possesses sigma-like VL and J-CL genes, which are transcribed separately and both encode protein products with cleavable leader peptides. The Acipenseriformes IgL dataset was extended by the sequences mined in the databases of species belonging to other non-teleost lineages of ray-finned fish: Holostei and Polypteriformes. Inclusion of these new data into phylogenetic analysis showed a clear subdivision of IgL chains into five groups. The isotype described previously as the teleostean IgL lambda turned out to be a kappa and lambda chain paralog that emerged before the radiation of ray-finned fish. We designate this isotype as lambda-2. The phylogeny also showed that sigma-2 IgL chains initially regarded as specific for cartilaginous fish are present in holosteans, polypterids, and even in turtles. We conclude that there were five ancient IgL isotypes, which evolved differentially in various lineages of jawed vertebrates.

    Bridging the gap between vertebrate cytogenetics and genomics with single-chromosome sequencing (ChromSeq)

    The study of vertebrate genome evolution is currently facing a revolution, brought about by next generation sequencing technologies that allow researchers to produce nearly complete and error-free genome assemblies. Novel approaches however do not always provide a direct link with information on vertebrate genome evolution gained from cytogenetic approaches. It is useful to preserve and link cytogenetic data with novel genomic discoveries. Sequencing of DNA from single isolated chromosomes (ChromSeq) is an elegant approach to determine the chromosome content and assign genome assemblies to chromosomes, thus bridging the gap between cytogenetics and genomics. The aim of this paper is to describe how ChromSeq can support the study of vertebrate genome evolution and how it can help link cytogenetic and genomic data. We show key examples of ChromSeq application in the refinement of vertebrate genome assemblies and in the study of vertebrate chromosome and karyotype evolution. We also provide a general overview of the approach and a concrete example of genome refinement using this method in the species Anolis carolinensis

    Paucity and preferential suppression of transgenes in late replication domains of the D. melanogaster genome

    Background: Eukaryotic genomes are organized in extended domains with distinct features intimately linking genome structure, replication pattern and chromatin state. Recently we identified a set of long late replicating euchromatic regions that are underreplicated in salivary gland polytene chromosomes of D. melanogaster. Results: Here we demonstrate that these underreplicated regions (URs) have a low density of P-element and piggyBac insertions compared to the genome average or neighboring regions. In contrast, Minos-based transposons show no paucity in URs but have a strong bias to testis-specific genes. We estimated the suppression level in 2,852 stocks carrying a single P-element by analysis of eye color determined by the mini-white marker gene and demonstrate that the proportion of suppressed transgenes in URs is more than three times higher than in the flanking regions or the genomic average. The suppressed transgenes reside in intergenic, genic or promoter regions of the annotated genes. We speculate that the low insertion frequency of P-elements and piggyBacs in URs partially results from suppression of transgenes that potentially could prevent identification of transgenes due to complete suppression of the marker gene. In a similar manner, the proportion of suppressed transgenes is higher in loci replicating late or very late in Kc cells and these loci have a lower density of P-elements and piggyBac insertions. In transgenes with two marker genes suppression of mini-white gene in eye coincides with suppression of yellow gene in bristles. Conclusions: Our results suggest that the late replication domains have a high inactivation potential apparently linked to the silenced or closed chromatin state in these regions, and that such inactivation potential is largely maintained in different tissues.

    Contrasting origin of B chromosomes in two cervids (Siberian roe deer and grey brocket deer) unravelled by chromosome-specific DNA sequencing.

    BACKGROUND: B chromosomes are dispensable and variable karyotypic elements found in some species of animals, plants and fungi. They often originate from duplications and translocations of host genomic regions or result from hybridization. In most species, little is known about their DNA content. Here we perform high-throughput sequencing and analysis of B chromosomes of roe deer and brocket deer, the only representatives of Cetartiodactyla known to have B chromosomes. RESULTS: In this study we developed an approach to identify genomic regions present on chromosomes by high-throughput sequencing of DNA generated from flow-sorted chromosomes using degenerate-oligonucleotide-primed PCR. Application of this method on small cattle autosomes revealed a previously described KIT gene region translocation associated with colour sidedness. Implementing this approach to B chromosomes from two cervid species, Siberian roe deer (Capreolus pygargus) and grey brocket deer (Mazama gouazoubira), revealed dramatically different genetic content: roe deer B chromosomes consisted of two duplicated genomic regions (a total of 1.42-1.98 Mbp) involving three genes, while grey brocket deer B chromosomes contained 26 duplicated regions (a total of 8.28-9.31 Mbp) with 34 complete and 21 partial genes, including KIT and RET protooncogenes, previously found on supernumerary chromosomes in canids. Sequence variation analysis of roe deer B chromosomes revealed a high frequency of mutations and increased heterozygosity due to either amplification within B chromosomes or divergence between different Bs. In contrast, grey brocket deer B chromosomes were found to be more homogeneous and resembled autosomes in patterns of sequence variation. Similar tendencies were observed in repetitive DNA composition. CONCLUSIONS: Our data demonstrate independent origins of B chromosomes in the grey brocket and roe deer. We hypothesize that the B chromosomes of these two cervid species represent different stages of B chromosome sequences evolution: probably nascent and similar to autosomal copies in brocket deer, highly derived in roe deer. Based on the presence of the same orthologous protooncogenes in canids and brocket deer Bs we argue that genomic regions involved in B chromosome formation are not random. In addition, our approach is also applicable to the characterization of other evolutionary and clinical rearrangements

    X Chromosome Evolution in Cetartiodactyla

    The mammalian X chromosome is characterized by a high level of conservation. In contrast, the cetartiodactyl X chromosome displays variation in morphology and G-banding pattern. It is hypothesized that the X chromosome has undergone multiple rearrangements during Cetartiodactyla speciation. To investigate the evolution of this sex chromosome we have selected 26 BAC clones from the cattle CHORI-240 library evenly distributed along the cattle X chromosome. High-resolution maps were obtained by fluorescence in situ hybridisation in a representative range of cetartiodactyl species from different families: pig (Suidae), gray whale (Eschrichtiidae), pilot whale (Delphinidae), hippopotamus (Hippopotamidae), Java mouse deer (Tragulidae), pronghorn (Antilocapridae), Siberian musk deer (Moschidae), giraffe (Giraffidae). To trace the X chromosome evolution during fast radiation in speciose families, we mapped more than one species in Cervidae (moose, Siberian roe deer, fallow deer and Pere David’s deer) and Bovidae (musk ox, goat, sheep, sable antelope, nilgau, gaur, saola, and cattle). We have identified three major conserved synteny blocks and, based on these data, reconstructed the structure of the putative ancestral cetartiodactyl X chromosome. We demonstrate that intrachromosomal rearrangements such as inversions and centromere repositioning are the main drivers of cetartiodactyl X chromosome evolution

    A High-Resolution SNP Array-Based Linkage Map Anchors a New Domestic Cat Draft Genome Assembly and Provides Detailed Patterns of Recombination

    High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands of shorter scaffolds in earlier assemblies, which instead relied heavily on comparative mapping with related species. A high-resolution map would aid in validating and ordering chromosome scaffolds from existing and new genome assemblies. Here, we describe a high-resolution genetic linkage map of the domestic cat genome based on genotyping 453 domestic cats from several multi-generational pedigrees on the Illumina 63K SNP array. The final maps include 58,055 SNP markers placed relative to 6637 markers with unique positions, distributed across all autosomes and the X chromosome. Our final sex-averaged maps span a total autosomal length of 4464 cM, the longest described linkage map for any mammal, confirming length estimates from a previous microsatellite-based map. The linkage map was used to order and orient the scaffolds from a substantially more contiguous domestic cat genome assembly (Felis catus v8.0), which incorporated ~20× coverage of Illumina fragment reads. The new genome assembly shows substantial improvements in contiguity, with a nearly fourfold increase in N50 scaffold size to 18 Mb. We use this map to report probable structural errors in previous maps and assemblies, and to describe features of the recombination landscape, including a massive (~50 Mb) recombination desert (of virtually zero recombination) on the X chromosome that parallels a similar desert on the porcine X chromosome in both size and physical location

    The sterlet sturgeon genome sequence and the mechanisms of segmental rediploidization.

    Sturgeons seem to be frozen in time. The archaic characteristics of this ancient fish lineage place it in a key phylogenetic position at the base of the ~30,000 modern teleost fish species. Moreover, sturgeons are notoriously polyploid, providing unique opportunities to investigate the evolution of polyploid genomes. We assembled a high-quality chromosome-level reference genome for the sterlet, Acipenser ruthenus. Our analysis revealed a very low protein evolution rate that is at least as slow as in other deep branches of the vertebrate tree, such as that of the coelacanth. We uncovered a whole-genome duplication that occurred in the Jurassic, early in the evolution of the entire sturgeon lineage. Following this polyploidization, the rediploidization of the genome included the loss of whole chromosomes in a segmental deduplication process. While known adaptive processes helped conserve a high degree of structural and functional tetraploidy over more than 180 million years, the reduction of redundancy of the polyploid genome seems to have been remarkably random

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics