34 research outputs found

    Fully automated sequence alignment methods are comparable to, and much faster than, traditional methods in large data sets: an example with hepatitis B virus

    Aligning sequences for phylogenetic analysis (multiple sequence alignment; MSA) is an important, but increasingly computationally expensive step with the recent surge in DNA sequence data. Much of this sequence data is publicly available, but can be extremely fragmentary (i.e., a combination of full genomes and genomic fragments), which can compound the computational issues related to MSA. Traditionally, alignments are produced with automated algorithms and then checked and/or corrected “by eye” prior to phylogenetic inference. However, this manual curation is inefficient at the data scales required of modern phylogenetics and results in alignments that are not reproducible. Recently, methods have been developed for fully automating alignments of large data sets, but it is unclear if these methods produce alignments that result in compatible phylogenies when compared to more traditional alignment approaches that combined automated and manual methods. Here we use approximately 33,000 publicly available sequences from the hepatitis B virus (HBV), a globally distributed and rapidly evolving virus, to compare different alignment approaches. Using one data set composed exclusively of whole genomes and a second that also included sequence fragments, we compared three MSA methods: (1) a purely automated approach using traditional software, (2) an automated approach including by-eye manual editing, and (3) more recent fully automated approaches. To understand how these methods affect phylogenetic results, we compared resulting tree topologies based on these different alignment methods using multiple metrics. We further determined if the monophyly of existing HBV genotypes was supported in phylogenies estimated from each alignment type and under different statistical support thresholds. Traditional and fully automated alignments produced similar HBV phylogenies.
    Although there was variability between branch support thresholds, allowing lower support thresholds tended to result in more differences among trees. Therefore, differences between the trees could be best explained by phylogenetic uncertainty unrelated to the MSA method used. Nevertheless, automated alignment approaches did not require human intervention and were therefore considerably less time-intensive than traditional approaches. Because of this, we conclude that fully automated algorithms for MSA are fully compatible with older methods even in extremely difficult-to-align data sets. Additionally, we found that most HBV diagnostic genotypes did not correspond to evolutionarily sound groups, regardless of alignment type and support threshold. This suggests there may be errors in genotype classification in the database or that HBV genotypes may need a revision.

    A manually annotated Actinidia chinensis var. chinensis (kiwifruit) genome highlights the challenges associated with draft genomes and gene prediction in plants

    Most published genome sequences are drafts, and most are dominated by computational gene prediction. Draft genomes typically incorporate considerable sequence data that are not assigned to chromosomes, and predicted genes without quality confidence measures. The current Actinidia chinensis (kiwifruit) 'Hongyang' draft genome has 164 Mb of sequences unassigned to pseudo-chromosomes, and omissions have been identified in the gene models.

    Characterising genetic loci associated with loss of apomixis in Hieracium

    Most plant species strictly utilise sexual reproduction to generate genetically diverse seed to ensure adaptation of their descendants to the changing demands of their environment. Some species, however, have largely dispensed with sexual reproduction, opting instead to propagate clonally via apomixis, and maintain genotypes that are presumably already sufficiently adapted. Researchers of apomixis have long been attracted to the phenomenon as a biological curiosity, but more significant investigative attention is now being paid to it due to its ability to fix heterosis and therefore enable the economic production of high yielding hybrid varieties of the world's major crops. Despite the strong motivation to integrate apomixis into seed-production systems, previous attempts to introgress the trait from wild apomictic relatives, or to induce it via mutagenesis, have yet to produce commercial apomictic varieties. It now appears likely that the successful transfer of apomixis into sexual crops will first require the elucidation of the molecular mechanisms, employed by native apomicts, that enable the avoidance of key components of sexual reproduction that otherwise serve to generate genetic diversity. The Apomixis Programme at Crop & Food Research, Lincoln, aims to elucidate the genetics and molecular mechanisms of apomixis in Hieracium subgenus Pilosella. Two major deviations from sexual reproduction are required: the avoidance of meiosis, or apomeiosis, and the avoidance of fertilisation, or parthenogenesis. Segregating populations demonstrate independent segregation of apomeiosis and parthenogenesis. However, conventional mapping approaches towards determinants of apomixis in other species have often encountered significant difficulties posed by suppressed recombination at their loci. Alternative genetic resources of Hieracium were therefore generated using T-DNA and transposon mutagenesis, and deletion mutagenesis.
    The present research focused on identifying and generating molecular maps of apomixis loci by screening deletion mutant panels of two genotypes, H. glaciale and H. caespitosum, with secondary digest amplified fragment length polymorphism (SDAFLP). Identified loci were verified by their associations with apomixis in segregating populations, and SDAFLP markers were sequenced and converted into sequence characterised amplified regions (SCARs). The utility of the SCARs for the future isolation of BAC clones was determined by their presence or absence in key mutants. The identification and characterisation of three loci whose loss was associated with loss of parthenogenesis in H. glaciale are described in Chapter 3. One locus transmitted to hybrid progeny as a determining locus and the other two transmitted as modifying loci. A T-DNA mutant of the H. glaciale background, which was included in the mutant panel, was found to carry a deletion at the determining locus. Findings that indicate that T-DNA insertions are not linked to the deletion are set out in Chapter 4, and somaclonal variation is suggested as an alternative cause of the deletion. Chapter 5 describes the use of deletion mutagenesis to identify two loci in H. caespitosum: one is associated with loss of apomeiosis (LOA) and the other with loss of parthenogenesis (LOP). Key mutants were screened with SDAFLP to obtain high densities of markers at LOA and LOP, and markers that were predicted to be nearest the determinants were sequenced and converted into SCARs. One sequenced marker at LOP is likely to partially code for a regulatory gene. LOA and LOP segregated independently among hybrid progeny in strong association with apomeiosis and parthenogenesis respectively. Segregation distortion was characteristic of both loci, while recombination did not appear to be suppressed.
    Chapter 6 discusses how the findings of this research may be used to investigate the evolution of apomixis and to isolate its genetic determinants. It also discusses some challenges that might be encountered in the future during the engineering of apomixis in commercial crop species.

    Comparative transcriptome analysis of the wild-type model apomict Hieracium praealtum and its loss of parthenogenesis (lop) mutant

    Background: Asexual seed formation (apomixis) has been observed in diverse plant families but is rare in crop plants. The generation of apomictic crops would revolutionize agriculture, as clonal seed production provides a low cost and efficient way to produce hybrid seed. Hieracium (Asteraceae) is a model system for studying the molecular components of gametophytic apomixis (asexual seed reproduction). Results: In this study, a reference transcriptome was produced from apomictic Hieracium undergoing the key apomictic events of apomeiosis, parthenogenesis and autonomous endosperm development. In addition, transcriptome sequences from pre-pollination and post-pollination stages were generated from a loss of parthenogenesis (lop) mutant accession that exhibits loss of parthenogenesis and autonomous endosperm development. The transcriptome is composed of 147,632 contigs, 50% of which were annotated with orthologous genes and their probable function. The transcriptome was used to identify transcripts differentially expressed during apomictic and pollination-dependent (lop) seed development. Gene Ontology enrichment analysis of differentially expressed transcripts showed that an important difference between apomictic and pollination-dependent seed development was the expression of genes relating to epigenetic gene regulation. Genes that mark key developmental stages, i.e. aposporous embryo sac development and seed development, were also identified through their enhanced expression at those stages. Conclusion: The production of a comprehensive floral reference transcriptome for Hieracium provides a valuable resource for research into the molecular basis of apomixis and the identification of the genes underlying the LOP locus.

    Composition and distribution of lice (Insecta: Phthiraptera) on Colombian and Peruvian birds: New data on louse-host association in the Neotropics

    The diversity of permanent ectoparasites is likely underestimated due to the difficulty of collecting samples. Lice (Insecta: Phthiraptera) are permanent ectoparasites of birds and mammals; there are approximately 5,000 species described and many more undescribed, particularly in the Neotropics. We document the louse genera collected from birds sampled in Peru (2006–2007) and Colombia (2009–2016), from 22 localities across a variety of ecosystems, ranging from lowland tropical forest and Llanos to high elevation cloud forest. We identified 35 louse genera from a total of 210 bird species belonging to 37 avian families and 13 orders. These genera belong to two suborders and three families of lice: Amblycera, families Menoponidae (present on 131 bird species) and Ricinidae (39 bird species); and Ischnocera, family Philopteridae (119 bird species). We compared our bird-louse associations with data in Price et al. (2003) and recently published Neotropical studies. The majority of bird-louse associations (51.9%) were new, with most of these coming from Passeriformes, the most diverse avian order, with the most poorly known louse fauna. Finally, we found geographical variation in louse infestation and prevalence rates. With this study, we report the first comprehensive documentation of bird-louse associations for Colombia and substantially increase the known associations documented for Peru.

    Total_alignment_trees.zip

    Trees estimated from total alignments (genomes + fragmentary sequences) of hepatitis B viruses. Includes trees estimated from manual and UPP alignments.

    Genome_trees.zip

    Tree files estimated from sequence alignments of hepatitis B virus genomes. Trees are the best maximum likelihood (ML) trees with bootstrap support values. Includes trees based on MUSCLE, manual, and PASTA genome alignments.