98 research outputs found

    A comparison of common programming languages used in bioinformatics

    Background: The performance of different programming languages has previously been benchmarked using abstract mathematical algorithms, but not using standard bioinformatics algorithms. We compared the memory usage and speed of execution for three standard bioinformatics methods, implemented in programs using one of six different programming languages. Programs for the Sellers algorithm, the Neighbor-Joining tree construction algorithm and an algorithm for parsing BLAST file outputs were implemented in C, C++, C#, Java, Perl and Python. Results: Implementations in C and C++ were fastest and used the least memory. Programs in these languages generally contained more lines of code. Java and C# appeared to be a compromise between the flexibility of Perl and Python and the fast performance of C and C++. The relative performance of the tested languages did not change from Windows to Linux and no clear evidence of a faster operating system was found. Source code and additional information are available from http://www.bioinformatics.org/benchmark/. Conclusion: This benchmark provides a comparison of six commonly used programming languages under two different operating systems. The overall comparison shows that a developer should choose an appropriate language carefully, taking into account the performance expected and the library availability for each language.

    Rapid Detection and Subtyping of Human Influenza A Viruses and Reassortants by Pyrosequencing

    Background: Given the continuing co-circulation of the 2009 H1N1 pandemic influenza A viruses with seasonal H3N2 viruses, rapid and reliable detection of newly emerging influenza reassortant viruses is important to enhance our influenza surveillance. Methodology/Principal Findings: A novel pyrosequencing assay was developed for the rapid identification and subtyping of potential human influenza A virus reassortants based on all eight gene segments of the virus. Except for HA and NA genes, one universal set of primers was used to amplify and subtype each of the six internal genes. With this method, all eight gene segments of 57 laboratory isolates and 17 original specimens of seasonal H1N1, H3N2 and 2009 H1N1 pandemic viruses were correctly matched with their corresponding subtypes. In addition, this method was shown to be capable of detecting reassortant viruses by correctly identifying the source of all 8 gene segments from three vaccine production reassortant viruses and three H1N2 viruses. Conclusions/Significance: In summary, this pyrosequencing assay is a sensitive and specific procedure for screening large numbers of viruses for reassortment events amongst the commonly circulating human influenza A viruses, which is mor

    SuperCAT: a supertree database for combined and integrative multilocus sequence typing analysis of the Bacillus cereus group of bacteria (including B. cereus, B. anthracis and B. thuringiensis)

    The Bacillus cereus group of bacteria is an important group including mammalian and insect pathogens, such as B. anthracis, the anthrax bacterium, B. thuringiensis, used as a biological pesticide and B. cereus, often involved in food poisoning incidents. To characterize the population structure and epidemiology of these bacteria, five separate multilocus sequence typing (MLST) schemes have been developed, which makes results difficult to compare. Therefore, we have developed a database that compiles and integrates MLST data from all five schemes for the B. cereus group, accessible at http://mlstoslo.uio.no/. Supertree techniques were used to combine the phylogenetic information from analysis of all schemes and datasets, in order to produce an integrated view of the B. cereus group population. The database currently contains strain information and sequence data for 1029 isolates and 26 housekeeping gene fragments, which can be searched by keywords, MLST scheme, or sequence similarity. Supertrees can be browsed according to various criteria such as species, isolate source, or genetic distance, and subtrees containing strains of interest can be extracted. Besides analysis of the available data, the user has the possibility to enter her/his own sequences and compare them to the database and/or include them into the supertree reconstructions

    The contrasting phylodynamics of human influenza B viruses

    A complex interplay of viral, host, and ecological factors shapes the spatio-temporal incidence and evolution of human influenza viruses. Although considerable attention has been paid to influenza A viruses, a lack of equivalent data means that an integrated evolutionary and epidemiological framework has until now not been available for influenza B viruses, despite their significant disease burden. Through the analysis of over 900 full genomes from an epidemiological collection of more than 26,000 strains from Australia and New Zealand, we reveal fundamental differences in the phylodynamics of the two co-circulating lineages of influenza B virus (Victoria and Yamagata), showing that their individual dynamics are determined by a complex relationship between virus transmission, age of infection, and receptor binding preference. In sum, this work identifies new factors that are important determinants of influenza B evolution and epidemiology.
    Dhanasekaran Vijaykrishna, Edward C Holmes, Udayan Joseph, Mathieu Fourment, Yvonne CF Su, Rebecca Halpin, Raphael TC Lee, Yi-Mo Deng, Vithiagaran Gunalan, Xudong Lin, Timothy B Stockwell, Nadia B Fedorova, Bin Zhou, Natalie Spirason, Denise Kühnert, Veronika Bošková, Tanja Stadler, Anna-Maria Costa, Dominic E Dwyer, Q Sue Huang, Lance C Jennings, William Rawlinson, Sheena G Sullivan, Aeron C Hurt, Sebastian Maurer-Stroh, David E Wentworth, Gavin JD Smith, Ian G Bar

    Morphometric Relationship, Phylogenetic Correlation, and Character Evolution in the Species-Rich Genus Aphis (Hemiptera: Aphididae)

    The species-rich genus Aphis consists of more than 500 species, many of them host-specific on a wide range of plants, yet very similar in general appearance due to convergence toward particular morphological types. Most species have been historically clustered into four main phenotypic groups (gossypii, craccivora, fabae, and spiraecola groups). To confirm the morphological hypotheses between these groups and to examine the characteristics that determine them, multivariate morphometric analyses were performed using 28 characters measured/counted from 40 species. To infer whether the morphological relationships are correlated with the genetic relationships, we compared the morphometric dataset with a phylogeny reconstructed from the combined dataset of three mtDNA and one nuclear DNA regions. Based on a comparison of morphological and molecular datasets, we confirmed morphological reduction or regression in the gossypii group, unlike in related groups. Most morphological characteristics of the gossypii group were less variable than for the other groups. As a result, the gossypii group could be morphologically well separated from the craccivora, fabae, and spiraecola groups. In addition, the correlation of the rates of evolution between morphological and DNA datasets was highly significant in their diversification. The morphological separation between the gossypii group and the other species-groups is congruent with their phylogenetic relationships. Analysis of trait evolution revealed that the morphological traits found to be significant based on the morphometric analyses were confidently correlated with the phylogeny. The dominant patterns of trait evolution resulting in increased rates of short branches and temporally later evolution are likely suitable for the modality of Aphis speciation because they have adapted species-specifically, rapidly, and more recently on many different host plants

    Bayesian molecular clock dating of species divergences in the genomics era

    It has been five decades since the proposal of the molecular clock hypothesis, which states that the rate of evolution at the molecular level is constant through time and among species. This hypothesis has become a powerful tool in evolutionary biology, making it possible to use molecular sequences to estimate the geological ages of species divergence events. With recent advances in Bayesian clock dating methodology and the explosive accumulation of genetic sequence data, molecular clock dating has found widespread applications, from tracking virus pandemics, to studying the macroevolutionary process of speciation and extinction, to estimating a timescale for Life on Earth
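
The arithmetic behind the strict-clock idea can be sketched as follows. This is a minimal illustration only, not the Bayesian dating methodology the abstract refers to: it assumes a single constant rate, and the function name and the example numbers are hypothetical. Two lineages that split `t` time units ago each accumulate changes at rate `r`, so their pairwise distance is roughly `d = 2 * r * t`.

```python
def divergence_time(distance, rate):
    """Point estimate of divergence time under a strict molecular clock.

    distance: pairwise genetic distance (substitutions per site).
    rate: substitution rate per lineage (substitutions per site per time unit).
    Both lineages evolve independently after the split, hence the factor 2.
    Hypothetical helper for illustration; real analyses model rate variation
    and calibration uncertainty.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return distance / (2.0 * rate)


# Made-up numbers: a distance of 0.02 substitutions/site at a rate of
# 0.01 substitutions/site per million years per lineage.
t = divergence_time(0.02, 0.01)
print(t)  # time in the same unit as the rate (here, million years)
```

Bayesian clock dating replaces this single point estimate with a posterior distribution over times, given priors on rates and fossil calibrations.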

    Systematic and Evolutionary Insights Derived from mtDNA COI Barcode Diversity in the Decapoda (Crustacea: Malacostraca)

    Background: Decapods are the most recognizable of all crustaceans and comprise a dominant group of benthic invertebrates of the continental shelf and slope, including many species of economic importance. Of the 17635 morphologically described Decapoda species, only 5.4% are represented by COI barcode region sequences. It therefore remains a challenge to compile regional databases that identify and analyse the extent and patterns of decapod diversity throughout the world. Methodology/Principal Findings: We contributed 101 decapod species from the North East Atlantic, the Gulf of Cadiz and the Mediterranean Sea, of which 81 species represent novel COI records. Within the newly-generated dataset, 3.6% of the species barcodes conflicted with the assigned morphological taxonomic identification, highlighting both the apparent taxonomic ambiguity among certain groups, and the need for an accelerated and independent taxonomic approach. Using the combined COI barcode projects from the Barcode of Life Database, we provide the most comprehensive COI data set so far examined for the Order (1572 sequences of 528 species, 213 genera, and 67 families). Patterns within families show a general predicted molecular hierarchy, but the scale of divergence at each taxonomic level appears to vary extensively between families. The range values of mean K2P distance observed were: within species 0.285% to 1.375%, within genus 6.376% to 20.924% and within family 11.392% to 25.617%. Nucleotide composition varied greatly across decapods, ranging from 30.8% to 49.4% GC content. Conclusions/Significance: Decapod biological diversity was quantified by identifying putative cryptic species allowing a rapid assessment of taxon diversity in groups that have until now received limited morphological and systematic examination. We highlight taxonomic groups or species with unusual nucleotide composition or evolutionary rates. Such data are relevant to strategies for conservation of existing decapod biodiversity, as well as elucidating the mechanisms and constraints shaping the patterns observed.
    FCT - SFRH/BD/25568/2006; EC FP6 - GOCE-CT-2005-511234 HERMES; FCT - PTDC/MAR/69892/2006 LusomarBo
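
The two summary statistics quoted in this abstract, GC content and Kimura two-parameter (K2P) distance, can be computed for a pair of aligned sequences roughly as below. This is a sketch under stated assumptions, not the authors' pipeline: the function names are hypothetical, the inputs are assumed to be pre-aligned DNA strings, and sites with gaps or ambiguity codes are simply skipped. The K2P formula used is the standard one, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}  # A<->G and C<->T changes are transitions


def gc_content(seq):
    """Fraction of G or C among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    return sum(b in "GC" for b in bases) / len(bases)


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are ignored.
    math.log raises ValueError if the sequences are too divergent for the
    formula to be defined (saturation).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy sequences (made up): one A->G transition in ten sites.
print(gc_content("ATGC"))
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

The abstract reports K2P distances as percentages, so a returned value of 0.01 corresponds to 1%.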