
    Special features of RAD Sequencing data: implications for genotyping

    Restriction site-associated DNA Sequencing (RAD-Seq) is an economical and efficient method for SNP discovery and genotyping. As with other sequencing-by-synthesis methods, RAD-Seq produces stochastic count data and requires sensitive analysis to develop or genotype markers accurately. We show that there are several sources of bias specific to RAD-Seq that are not explicitly addressed by current genotyping tools, namely restriction fragment bias, restriction site heterozygosity and PCR GC content bias. We explore the performance of existing analysis tools given these biases and discuss approaches to limiting or handling biases in RAD-Seq data. While these biases need to be taken seriously, we believe RAD loci affected by them can be excluded or processed with relative ease in most cases and that most RAD loci will be accurately genotyped by existing tools

    Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing

    BACKGROUND: Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. RESULTS: Here, we systematically assessed the utility of paired-end and mate-pair (MP) next-generation sequencing libraries with insert sizes ranging from 170 bp to 25 kb, for genome coverage and for improving scaffolding of a mammalian genome (Rattus norvegicus). Despite a lower library complexity, large insert MP libraries (20 or 25 kb) provided very high physical genome coverage and were found to efficiently span repeat elements in the genome. Medium-sized (5, 8 or 15 kb) MP libraries were much more efficient for genome structure analysis than the more commonly used shorter insert paired-end and 3 kb MP libraries. Furthermore, the combination of medium- and large insert libraries resulted in a 3-fold increase in N50 in scaffolding processes. Finally, we show that our data can be used to evaluate and improve contig order and orientation in the current rat reference genome assembly. CONCLUSIONS: We conclude that applying combinations of mate-pair libraries with insert sizes that match the distributions of repetitive elements improves contig scaffolding and can contribute to the finishing of draft genomes
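The abstract above reports its scaffolding gain as a fold increase in N50. As a reminder of what that metric measures, here is a minimal sketch: N50 is the length L such that scaffolds of length L or longer together cover at least half of the total assembly. The function name `n50` and the example lengths are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the largest length L such that sequences of length >= L
    together account for at least half of the summed length.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
example_lengths = [2, 3, 4, 5, 6]
print(n50(example_lengths))
```

Because N50 depends only on the length distribution, joining contigs into longer scaffolds raises it even when no new sequence is added, which is why it is typically read alongside other assembly quality measures.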

    Deconvoluting simulated metagenomes: The performance of hard- and soft-clustering algorithms applied to metagenomic chromosome conformation capture (3C)

    © 2016 DeMaere and Darling. Background. Chromosome conformation capture, coupled with high-throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods. We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results. When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion. Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios; however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development

    Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species provide insights into adoption of a parasitic lifestyle and prospects for disease control

    BACKGROUND: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism. RESULTS: We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio. CONCLUSIONS: Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle

    A Genomic Investigation of Divergence Between Tuna Species

    Effective management and conservation of marine pelagic fishes is heavily dependent on a robust understanding of their population structure, their evolutionary history, and the delineation of appropriate management units. The Yellowfin tuna (Thunnus albacares) and the Blackfin tuna (Thunnus atlanticus) are two exploited epipelagic marine species with overlapping ranges in the tropical and sub-tropical Atlantic Ocean. This work analyzed genome-wide genetic variation of both species in the Atlantic basin to investigate the occurrence of population subdivision and adaptive variation. A de novo assembly of the Blackfin tuna genome was generated using Illumina paired-end sequencing data and applied as a reference for population genomic analysis of specimens from 9 localities spanning most of the Blackfin tuna range. Analysis suggested the presence of four weakly differentiated units corresponding to the northwestern Atlantic Ocean, Gulf of Mexico, Caribbean Sea, and southwestern Atlantic Ocean, respectively. Significant spatial autocorrelation of genotypes was observed for specimens collected within 800 km of each other. A high-quality genome assembly generated for the Yellowfin tuna using PacBio and Illumina sequences was scaffolded by a linkage map developed through analysis of the segregation of genome-wide Single Nucleotide Polymorphisms in 164 larval offspring from a single pair produced by controlled breeding. The genome assembly was used as a reference for population genomic analysis of juvenile specimens from the 4 main nursery areas hypothesized in the Atlantic Ocean basin. Analyses corroborated previously reported population subdivision between the east and west Atlantic Ocean, but also suggested subdivision associated with individual nursery areas within the east and west regions. Draft reference assemblies were generated for Albacore, Bigeye and Longtail tunas and used in combination with the Yellowfin and Blackfin tuna genomes obtained in this work and existing assemblies for bluefin tunas in preliminary analyses of genome-wide variation between species of the Thunnus genus. Whole-genome derived SNP-based phylogenetic analysis of the Thunnus genus suggests phylogenetic relationships may be more complex than suggested in earlier work based on Restriction-site Associated DNA sequencing or muscle transcriptome sequencing, and prompts further analysis of the genus using a more comprehensive sampling of taxa in each oceanic basin

    Genomic characterizations of Xanthomonas cucurbitae and using comparative genomics to predict novel microbe-associated molecular patterns in Xanthomonas

    Bacterial spot is a major plant disease caused by many plant-pathogenic members of the genus Xanthomonas. While each species is narrow in host range, bacterial spot Xanthomonads infect a large variety of plant hosts, leading to large economic losses for farmers around the world. Although Xanthomonas utilizes a wide array of virulence and pathogenicity factors to infect their hosts, plants have a range of methods to recognize invaders and prevent infection. Understanding the genomic and molecular interactions between Xanthomonas and their hosts is an important part of developing effective crop protection strategies and breeding plants for resistance. While X. cucurbitae has been identified as the causal agent of bacterial spot on cucurbits, no genomic-level analyses have been carried out regarding the pathogen. Using the first reference quality X. cucurbitae genome assembly, an RNA-seq analysis was carried out to assess virulence characteristics of the pathogen. By analyzing the X. cucurbitae transcriptome, we observed behavioral changes between nutrient-sufficient and host-mimicking conditions, as well as the upregulation of genes related to virulence and pathogenicity. We also identified virulence genes likely to be essential in successful bacterial spot infection. In addition, a RAD-seq analysis was performed to characterize population clusters of X. cucurbitae isolated throughout the Midwestern United States. We revealed multiple populations of X. cucurbitae present throughout the region and demonstrated clear genetic differences between these populations using population genetics analyses. These studies demonstrate clear value in future genomic studies regarding X. cucurbitae. X. euvesicatoria and X. perforans are two bacterial spot Xanthomonads affecting tomatoes and peppers. We conducted a comparative genomics study in X. euvesicatoria and X. perforans populations to identify genes under selection pressure, and to characterize potential genes involved in plant-pathogen interactions. By calculating the test statistic Tajima’s D, we found evidence of purifying selection throughout the genomes of both bacterial spot Xanthomonads. In addition, Tajima’s D successfully detected known microbe-associated molecular patterns (MAMPs), and we were able to characterize the recognition of these MAMPs between species in luminol-based reactive oxygen species (ROS) assays. While this study was not yet able to identify novel MAMPs, we show that Tajima’s D is a powerful tool in detecting genes that are important to plant-pathogen interactions
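The abstract above relies on Tajima's D to flag genes under selection. For readers unfamiliar with the statistic, the following is a minimal sketch of the standard formulation: D compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic sum, divided by an estimate of the standard deviation of that difference. The function name `tajimas_d`, the requirement of gap-free aligned sequences, and the toy alignments are all assumptions for illustration; nothing here is taken from the thesis itself.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences.

    Returns None when there are no segregating sites. Requires at least
    four sequences, because the variance term is zero for smaller samples.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")

    # S: number of alignment columns carrying more than one allele.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return None

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * s_sites + e2 * s_sites * (s_sites - 1)
    return (pi - s_sites / a1) / sqrt(variance)


# Toy alignments: an excess of rare variants gives D < 0, whereas
# variants at intermediate frequency give D > 0.
rare_variants = ["TAAA", "ATAA", "AATA", "AAAT"]
balanced_variants = ["AA", "AA", "TT", "TT"]
print(tajimas_d(rare_variants), tajimas_d(balanced_variants))
```

Negative values reflect an excess of rare variants relative to neutral expectation, which is the pattern associated with purifying selection (or a recent population expansion), so in practice demography must be considered before attributing a negative D to selection.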

    Prevalence and relationship of endosymbiotic Wolbachia in the butterfly genus Erebia

    Wolbachia is an endosymbiont common to most invertebrates, which can have significant evolutionary implications for its host species by acting as a barrier to gene flow. Despite the importance of Wolbachia, little is still known about its prevalence and diversification pattern among closely related host species. Wolbachia strains may phylogenetically coevolve with their hosts, unless horizontal host-switches are particularly common. We address these issues in the genus Erebia, one of the most diverse Palearctic butterfly genera. We sequenced the Wolbachia genome from a strain infecting Erebia cassioides and showed that it belongs to the Wolbachia supergroup B, capable of infecting arthropods from different taxonomic orders. The prevalence of Wolbachia across 13 closely related Erebia host species based on extensive population-level genetic data revealed that multiple Wolbachia strains jointly infect all investigated taxa, but with varying prevalence. Finally, the phylogenetic relationships of Wolbachia strains are in some cases significantly associated with those of their hosts, especially among the most closely related Erebia species, demonstrating mixed evidence for phylogenetic coevolution. Closely related host species can be infected by closely related Wolbachia strains, evidencing some phylogenetic coevolution, but the actual pattern of infection more often reflects historical or contemporary geographic proximity among host species. Multiple processes, including survival in distinct glacial refugia, recent host shifts in sympatry, and a loss of Wolbachia during postglacial range expansion seem to have jointly shaped the complex interactions between Wolbachia evolution and the diversification of its host among our studied Erebia species

    Initial sequencing and analysis of the human genome

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/62798/1/409860a0.pd