130 research outputs found

    Bioinformatics on the Cloud Computing Platform Azure

    We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walkthrough in Appendix S1 to demonstrate explicitly how Azure can be used to perform these analyses, and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers, who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. © 2014 Shanahan et al.

    Bacterial microevolution and the Pangenome

    The comparison of multiple genome sequences sampled from a bacterial population reveals considerable diversity in both the core and the accessory parts of the pangenome. This diversity can be analysed in terms of microevolutionary events that took place since the genomes shared a common ancestor, especially deletion, duplication, and recombination. We review the basic modelling ingredients used implicitly or explicitly when performing such a pangenome analysis. In particular, we describe a basic neutral phylogenetic framework of bacterial pangenome microevolution, which is not incompatible with evaluating the role of natural selection. We survey the different ways in which pangenome data is summarised in order to be included in microevolutionary models, as well as the main methodological approaches that have been proposed to reconstruct pangenome microevolutionary history.

    Processing and analyzing multiple genomes alignments with MafFilter

    As the number of available genome sequences from both closely related species and individuals within species increased, theoretical and methodological convergences between the fields of phylogenomics and population genomics emerged. Population genomics typically focuses on the analysis of variants, while phylogenomics heavily relies on genome alignments. However, these are playing an increasingly important role in studies at the population level. Multiple genome alignments of individuals are used when structural variation is of primary interest and when genome architecture permits de novo assembly of genome sequences. Here I describe MafFilter, a command-line-driven program for processing genome alignments in the Multiple Alignment Format (MAF). Using concrete examples based on publicly available datasets, I demonstrate how MafFilter can be used to develop efficient and reproducible pipelines with quality assurance for downstream analyses. I further show how MafFilter can be used to perform both basic and advanced population genomic analyses in order to infer the patterns of nucleotide diversity along genomes.
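
    To illustrate what "patterns of nucleotide diversity along genomes" can mean in practice, here is a minimal Python sketch that computes average pairwise nucleotide diversity in sliding windows over a small alignment. It is not MafFilter's implementation and does not read MAF files; the function names, the handling of gaps (skipped per pair), and the window parameters are all assumptions made for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences.

    Assumption: a site is compared for a pair only if both sequences
    carry A, C, G or T there; gaps and ambiguous bases are skipped.
    """
    valid = set("ACGT")
    per_pair = []
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a.upper(), b.upper()):
            if x in valid and y in valid:
                compared += 1
                if x != y:
                    diffs += 1
        if compared:
            per_pair.append(diffs / compared)
    # No comparable pair (e.g. a single sequence): diversity is undefined.
    return sum(per_pair) / len(per_pair) if per_pair else float("nan")


def windowed_diversity(seqs, window, step):
    """Return (window_start, diversity) for each full window along the alignment."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy alignment of three individuals.
    alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
    for start, pi in windowed_diversity(alignment, window=4, step=4):
        print(start, round(pi, 4))
```

    Averaging per-pair proportions keeps the sketch short; a real pipeline would also need to decide how to weight windows with many missing sites.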

    Short Term Evolution of a Highly Transmissible Methicillin-Resistant Staphylococcus aureus Clone (ST228) in a Tertiary Care Hospital

    Staphylococcus aureus is recognized as one of the major human pathogens and is by far one of the most common nosocomial organisms. The genetic basis for the emergence of highly epidemic strains remains mysterious. Studying the microevolution of the different clones of S. aureus is essential for identifying the forces driving pathogen emergence and spread. The aim of the present study was to determine the genetic changes characterizing a lineage belonging to the South German clone (ST228) that spread over ten years in a tertiary care hospital in Switzerland. To this end, we compared the whole genomes of eight isolates recovered between 2001 and 2008 at the Lausanne hospital. The genetic comparison of these isolates revealed that their genomes are extremely closely related. Yet, a few more important genetic changes, such as the replacement of a plasmid, the loss of large fragments of DNA, or the insertion of transposases, were observed. These transfers of mobile genetic elements shaped the evolution of the ST228 lineage that spread within the Lausanne hospital. Nevertheless, although the strains analyzed differed in their dynamics, we have not been able to link a particular genetic element with spreading success. Finally, the present study showed that new sequencing technologies considerably improve the quality and quantity of information obtained for a single strain, but this information is still difficult to interpret, and important investments are required for the technology to become accessible for routine investigations.

    Whole genome sequencing to investigate the emergence of clonal complex 23 Neisseria meningitidis serogroup Y disease in the United States

    In the United States, serogroup Y, ST-23 clonal complex Neisseria meningitidis was responsible for an increase in meningococcal disease incidence during the 1990s. This increase was accompanied by antigenic shift of three outer membrane proteins, with a decrease in the population that predominated in the early 1990s as a different population emerged later in that decade. To understand factors that may have been responsible for the emergence of serogroup Y disease, we used whole genome pyrosequencing to investigate genetic differences between isolates from early and late N. meningitidis populations, obtained from meningococcal disease cases in Maryland in the 1990s. The genomes of isolates from the early and late populations were highly similar, with 1231 of 1776 shared genes exhibiting 100% amino acid identity and an average πN = 0.0033 and average πS = 0.0216. However, differences were found in predicted proteins that affect pilin structure and antigen profile and in predicted proteins involved in iron acquisition and uptake. The observed changes are consistent with acquisition of new alleles through horizontal gene transfer. Changes in antigen profile due to the genetic differences found in this study likely allowed the late population to emerge due to escape from population immunity. These findings may predict which antigenic factors are important in the cyclic epidemiology of meningococcal disease.