29 research outputs found

    MPBoot: fast phylogenetic maximum parsimony tree inference and bootstrap approximation

    Background: The nonparametric bootstrap is widely used to measure the branch support of phylogenetic trees. However, bootstrapping is computationally expensive and remains a bottleneck in phylogenetic analyses. Recently, an ultrafast bootstrap approximation (UFBoot) approach was proposed for maximum likelihood analyses. However, such an approach is still missing for maximum parsimony. Results: To close this gap we present MPBoot, an adaptation and extension of UFBoot to compute branch supports under the maximum parsimony principle. MPBoot works for both uniform and non-uniform cost matrices. Our analyses on biological DNA and protein data showed that under uniform cost matrices, MPBoot runs on average 4.7 (DNA) to 7 times (protein data) (range: 1.2–20.7) faster than the standard parsimony bootstrap implemented in PAUP*, but 1.6 (DNA) to 4.1 times (protein data) slower than the standard bootstrap with a fast search routine in TNT (fast-TNT). However, for non-uniform cost matrices MPBoot is 5 (DNA) to 13 times (protein data) (range: 0.3–63.9) faster than fast-TNT. We note that MPBoot achieves better scores more frequently than PAUP* and fast-TNT. However, this effect is less pronounced if an intensive but slower search in TNT is invoked. Moreover, experiments on large-scale simulated data show that while both PAUP* and TNT bootstrap estimates are too conservative, MPBoot bootstrap estimates appear more unbiased. Conclusions: MPBoot provides an efficient alternative to the standard maximum parsimony bootstrap procedure. It shows favorable performance in terms of run time, the capability of finding a maximum parsimony tree, and high bootstrap accuracy on simulated as well as empirical data sets. MPBoot is easy to use, open-source and available at http://www.cibiv.at/software/mpboo
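
    To make the terms in this abstract concrete, the sketch below scores a fixed tree under a uniform cost matrix with the textbook Fitch algorithm and draws one standard nonparametric bootstrap replicate by resampling alignment columns with replacement. It illustrates the standard procedure that MPBoot approximates, not MPBoot's own algorithm; the nested-tuple tree encoding and all function names are assumptions made for this example.

```python
import random


def fitch_site_score(tree, states):
    """Minimum number of changes for one alignment column (uniform costs).

    tree   -- binary tree as nested 2-tuples of leaf names (hypothetical encoding)
    states -- dict mapping each leaf name to its character at this site
    """
    def visit(node):
        if isinstance(node, str):            # leaf: its state set is the observed character
            return {states[node]}, 0
        left, right = node
        lset, lcost = visit(left)
        rset, rcost = visit(right)
        common = lset & rset
        if common:                           # children agree: no extra change needed
            return common, lcost + rcost
        return lset | rset, lcost + rcost + 1  # disagreement costs one change

    return visit(tree)[1]


def parsimony_score(tree, alignment):
    """Sum of Fitch scores over all columns of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_site_score(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(n_sites)
    )


def bootstrap_alignment(alignment, rng):
    """One bootstrap replicate: sample columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


if __name__ == "__main__":
    tree = (("A", "B"), ("C", "D"))
    aln = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}
    print(parsimony_score(tree, aln))
    replicate = bootstrap_alignment(aln, random.Random(0))
    print(parsimony_score(tree, replicate))
```

    In the standard procedure, a full tree search is repeated on every replicate, which is the cost the abstract describes as the bottleneck.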

    Limited contribution of non-intensive chicken farming to ESBL-producing Escherichia coli colonization in humans in Vietnam: an epidemiological and genomic analysis.

    OBJECTIVES: To investigate the risk of colonization with ESBL-producing Escherichia coli (ESBL-Ec) in humans in Vietnam associated with non-intensive chicken farming. METHODS: Faecal samples were collected from 204 randomly selected farmers and their chickens, and from 306 age- and sex-matched community-based individuals who did not raise poultry. Antimicrobial usage in chickens and humans was assessed by medicine cabinet surveys. WGS was employed to obtain a high-resolution genomic comparison between ESBL-Ec isolated from humans and chickens. RESULTS: The adjusted prevalence of ESBL-Ec colonization was 20.0% (95% CI 10.8%-29.1%) and 35.2% (95% CI 30.4%-40.1%) in chicken farms and humans in Vietnam, respectively. Colonization with ESBL-Ec in humans was associated with antimicrobial usage (OR = 2.52, 95% CI = 1.08-5.87) but not with involvement in chicken farming. blaCTX-M-55 was the most common ESBL-encoding gene in strains isolated from chickens (74.4%) compared with blaCTX-M-27 in human strains (47.0%). In 3 of 204 farms (1.5%), identical ESBL genes were detected in ESBL-Ec isolated from farmers and their chickens. Genomic similarity indicating recent sharing of ESBL-Ec between chickens and farmers was found in only one of these farms. CONCLUSIONS: The integration of epidemiological and genomic data in this study has demonstrated a limited contribution of non-intensive chicken farming to ESBL-Ec colonization in humans in Vietnam and further emphasizes the importance of reducing antimicrobial usage in both human and animal host reservoirs

    Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    BACKGROUND: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. RESULTS: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. 
CONCLUSION: Using the most treelike distance matrices, as judged by their δ values, distance methods are able to recover all major plant lineages, and are more in accordance with Apicomplexa organelles being derived from "green" plastids than from plastids of the "red" type. GBDP-like methods can be used to reliably infer phylogenies from different kinds of genomic data. A framework is established to further develop and improve such methods. δ values are a topology-independent tool of general use for the development and assessment of distance methods for phylogenetic inference

    progressiveMauve: Multiple Genome Alignment with Gene Gain, Loss and Rearrangement

    Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms. We describe a new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss (flux). We demonstrate that the new method can accurately align regions conserved in some, but not all, of the genomes, an important case not handled by our previous work. The method uses a novel alignment objective score called a sum-of-pairs breakpoint score, which facilitates accurate detection of rearrangement breakpoints when genomes have unequal gene content. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The new genome alignment algorithm demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss. We apply the new algorithm to a set of 23 genomes from the genera Escherichia, Shigella, and Salmonella. Analysis of whole-genome multiple alignments allows us to extend the previously defined concepts of core- and pan-genomes to include not only annotated genes, but also non-coding regions with potential regulatory roles. The 23 enterobacteria have an estimated core-genome of 2.46 Mbp conserved among all taxa and a pan-genome of 15.2 Mbp. We document substantial population-level variability among these organisms driven by segmental gain and loss. Interestingly, much variability lies in intergenic regions, suggesting that the Enterobacteriaceae may exhibit regulatory divergence. The multiple genome alignments generated by our software provide a platform for comparative genomic and population genomic studies. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve

    Treating Severe Malaria in Pregnancy: A Review of the Evidence


    Colonization of Enteroaggregative Escherichia coli and Shiga toxin-producing Escherichia coli in chickens and humans in southern Vietnam

    BACKGROUND: Enteroaggregative (EAEC) and Shiga toxin-producing Escherichia coli (STEC) are a major cause of diarrhea worldwide. E. coli carrying virulence factors characteristic of both EAEC and STEC and producing extended-spectrum beta-lactamase caused severe and protracted disease during an outbreak of E. coli O104:H4 in Europe in 2011. We assessed the opportunities for E. coli carrying the aggR and stx genes to emerge in 'backyard' farms in south-east Asia. RESULTS: Faecal samples collected from 204 chicken farms, 204 farmers and 306 age- and gender-matched individuals not exposed to poultry farming were plated on MacConkey agar plates with and without supplemented antimicrobials. Sweep samples obtained from MacConkey agar plates without supplemented antimicrobials were screened by multiplex PCR for the detection of the stx1, stx2 and aggR genes. One chicken farm sample each (0.5%) contained the stx1 and the aggR gene. Eleven (2.4%) human faecal samples contained the stx1 gene, 2 samples (0.4%) contained the stx2 gene, and 31 (6.8%) contained the aggR gene. From 46 PCR-positive samples, 205 E. coli isolates were tested for the presence of the stx1, stx2, aggR, wzx O104 and fliC H4 genes. None of the isolates simultaneously contained the four genetic markers associated with the E. coli O104:H4 epidemic strain (aggR, stx2, wzx O104 and fliC H4). Of 34 EAEC, 64.7% were resistant to 3rd-generation cephalosporins. CONCLUSION: These results indicate that in southern Vietnam, the human population is a more likely reservoir of aggR and stx gene carrying E. coli than the chicken population. However, conditions for transmission of isolates and/or genes between human and animal reservoirs resulting in the emergence of highly virulent E. coli strains are still favorable, given the nature of 'backyard' farms in Vietnam

    Evidence for cervical cancer mortality with screening program in Taiwan, 1981–2010: age-period-cohort model

    Background: Cervical cancer is the most common cancer experienced by women worldwide; however, screening techniques are very effective for reducing the risk of death. The national cervical cancer screening program was implemented in Taiwan in 1995. The objective of this study was to examine and provide evidence of the cervical cancer mortality trends for the periods before and after the screening program was implemented. Methods: Cause-of-death registration data from 1981 to 2010 were obtained from the Department of Health, Taiwan. Age-standardized mortality rates, age-specific rates, and age-period-cohort models that employed the sequential method were used to assess temporal changes that occurred between 1981 and 2010, with 1995 used as the separating year. Results: The results showed that for both time periods of 1981 to 1995 and 1996 to 2010, age and period had significant effects, whereas the birth cohort effects were insignificant. For patients between 80 and 84 years of age, the mortality rate for 1981 to 1995 and 1996 to 2010 was 48.34 and 68.08, respectively. The cervical cancer mortality rate for 1996 to 2010 was 1.0 for patients between 75 and 79 years of age and 1.4 for patients between 80 and 84 years of age compared to that for 1981 to 1995. Regarding the period effect, the mortality trend decreased 2-fold from 1996 to 2010. Conclusions: The results of this study indicate a decline in cervical cancer mortality trends after the screening program involving Papanicolaou tests was implemented in 1995. However, the positive effects of the screening program were not observed in elderly women because of treatment delays during the initial implementation of the screening program.

    Numerical Optimization Techniques in Maximum Likelihood Tree Inference

    In this chapter, we present recent computational and algorithmic advances for improving the inference of phylogenetic trees from the analysis of homologous genetic sequences under the maximum likelihood criterion. In particular, we detail how the use of matrix algebra at the core of Felsenstein’s pruning algorithm, combined with the architecture of modern-day computer processors, leads to efficient techniques for optimizing edge lengths. We also discuss some properties of the likelihood function when considering the optimization of the parameters of mixture models that are used to describe the variation of rates across sites
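
    As an illustration of the matrix algebra at the core of Felsenstein's pruning algorithm, the sketch below computes the likelihood of a single alignment site on a fixed tree. It assumes the Jukes-Cantor (JC69) substitution model and a hypothetical tree encoding in which an internal node is a pair of (child, branch length) tuples; neither choice comes from the chapter, and the edge-length optimization it discusses is not shown.

```python
import math

NUCLEOTIDES = "ACGT"


def jc69_matrix(t):
    """JC69 transition probabilities P(t) for branch length t (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay
    diff = 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def matvec(matrix, vector):
    """Product of a 4x4 matrix and a length-4 vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]


def partial_likelihood(node, site):
    """Vector of partial likelihoods at `node`, one entry per nucleotide.

    node -- a leaf name (str), or ((left, t_left), (right, t_right))
    site -- dict mapping each leaf name to its observed nucleotide
    """
    if isinstance(node, str):
        return [1.0 if site[node] == n else 0.0 for n in NUCLEOTIDES]
    (left, t_left), (right, t_right) = node
    # Each child contributes P(t) times its partial vector; the node combines
    # the two contributions by element-wise multiplication.
    from_left = matvec(jc69_matrix(t_left), partial_likelihood(left, site))
    from_right = matvec(jc69_matrix(t_right), partial_likelihood(right, site))
    return [a * b for a, b in zip(from_left, from_right)]


def site_likelihood(tree, site):
    """Likelihood of one site: root partials weighted by the uniform JC69 frequencies."""
    return sum(0.25 * value for value in partial_likelihood(tree, site))


if __name__ == "__main__":
    tree = ((("A", 0.1), ("B", 0.2)), 0.05), ("C", 0.3)
    print(site_likelihood(tree, {"A": "A", "B": "A", "C": "G"}))
```

    Because every internal node reduces to two matrix-vector products and one element-wise product, the same computation vectorizes naturally across sites.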