37 research outputs found

    Global optimal eBURST analysis of multilocus typing data using a graphic matroid approach

    Background: Multilocus Sequence Typing (MLST) is a frequently used typing method for the analysis of the clonal relationships among strains of several clinically relevant microbial species. MLST is based on the sequence of housekeeping genes that result in each strain having a distinct numerical allelic profile, which is abbreviated to a unique identifier: the sequence type (ST). The relatedness between two strains can then be inferred by the differences between allelic profiles. For a more comprehensive analysis of the possible patterns of evolutionary descent, a set of rules was proposed and implemented in the eBURST algorithm. These rules allow the division of a data set into several clusters of related strains, dubbed clonal complexes, by implementing a simple model of clonal expansion and diversification. Within each clonal complex, the rules identify which links between STs correspond to the most probable pattern of descent. However, the eBURST algorithm is not globally optimized, which can result in links within the clonal complexes that violate the proposed rules. Results: Here, we present a globally optimized implementation of the eBURST algorithm, goeBURST. The search for a globally optimal solution led to the formalization of the problem as a graphic matroid, for which greedy algorithms that provide an optimal solution exist. Several public data sets of MLST data were tested and differences between the two implementations were found and are discussed for five bacterial species: Enterococcus faecium, Streptococcus pneumoniae, Burkholderia pseudomallei, Campylobacter jejuni and Neisseria spp. A novel feature implemented in goeBURST is the representation of the level of tiebreak rule reached before deciding if a link should be drawn, which can be used to visually evaluate the reliability of the represented hypothetical pattern of descent. Conclusion: goeBURST is a globally optimized implementation of the eBURST algorithm that identifies alternative patterns of descent for several bacterial species. Furthermore, the algorithm can be applied to any multilocus typing data based on the number of differences between numeric profiles. A software implementation is available at http://goeBURST.phyloviz.net.
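As a rough illustration of the greedy idea described in this abstract, the sketch below links sequence types with a Kruskal-style selection, the classic greedy algorithm on a graphic matroid. It is not the goeBURST implementation: the function names, the `max_diff` parameter, and the tiebreak by ST id are assumptions made for the example, standing in for the eBURST tiebreak rules.

```python
from itertools import combinations


def hamming(a, b):
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_links(profiles, max_diff=1):
    """Kruskal-style greedy selection of links between sequence types.

    profiles: dict mapping ST id -> tuple of allele numbers.
    max_diff: largest number of differing loci allowed for a link
              (hypothetical parameter; 1 keeps single-locus variants only).
    Equal-distance links are ordered by ST id, an illustrative stand-in
    for the real tiebreak rules.
    Returns (links, complexes).
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Union-find lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    candidates = []
    for u, v in combinations(sorted(profiles), 2):
        d = hamming(profiles[u], profiles[v])
        if d <= max_diff:
            candidates.append((d, u, v))
    candidates.sort()  # smallest distance first, then lowest ST ids

    links = []
    for d, u, v in candidates:
        ru, rv = find(u), find(v)
        if ru != rv:  # accept a link only if it does not close a cycle
            parent[ru] = rv
            links.append((u, v, d))

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), []).append(st)
    complexes = sorted(sorted(g) for g in groups.values())
    return links, complexes
```

Calling `greedy_links` on a small dictionary of seven-locus profiles returns the accepted links and the groups of connected STs, which play the role of clonal complexes in this simplified setting.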

    Acolhimento de refugiados: alimentação e necessidades nutricionais em situações de emergência (Reception of refugees: food and nutritional needs in emergency situations)

    Europe is now a major destination of an intense migratory flow caused by armed conflicts in the Middle East and Africa. In response, the European Commission (EC) agreed to distribute some of these people, who are in clear need of international protection, among the various Member States. The populations in transit and the characteristics of their reception have specificities that can compromise access to adequate food and basic health care. They can also influence morbidity and mortality in the affected groups. This manual aims to establish a framework for food and nutrition intervention for refugees arriving in Portugal. It is aimed at all those who provide support, either individually or institutionally, and who are responsible for any aspect related to the health and nutrition of these populations, facilitating the implementation of care and providing tools for decision making. The manual is organized into 3 parts. It begins with the assessment of the nutritional status of the population to be received. It then presents several strategies to support the design of food and nutrition interventions based on the estimated nutritional needs of these population groups. The final part addresses the importance of ensuring food safety and hygiene when providing this food assistance. The manual also offers some considerations on basic psychological care principles, intended to help the teams in the field.

    Beta-hemolytic Streptococcus dysgalactiae strains isolated from horses are a genetically distinct population within the Streptococcus dysgalactiae taxon

    The pathogenic role of beta-hemolytic Streptococcus dysgalactiae in the equine host is increasingly recognized. A collection of 108 Lancefield group C (n=96) or L (n=12) horse isolates recovered in the United States and in three European countries presented multilocus sequence typing (MLST) alleles, sequence types and emm types (only 56% of the isolates could be emm typed) that were, with few exceptions, distinct from those previously found in human Streptococcus dysgalactiae subsp. equisimilis. Characterization of a subset of horse isolates by multilocus sequence analysis (MLSA) and 16S rRNA gene sequence showed that most equine isolates could also be differentiated from S. dysgalactiae strains from other animal species, supporting the existence of a horse specific genomovar. Draft genome information confirms the distinctiveness of the horse genomovar and indicates the presence of potentially horse-specific virulence factors. While this genomovar represents most of the isolates recovered from horses, a smaller MLST and MLSA defined sub-population seems to be able to cause infections in horses, other animals and humans, indicating that transmission between hosts of strains belonging to this group may occur

    TypOn: the microbial typing ontology

    Bacterial identification and characterization at subspecies level is commonly known as Microbial Typing. Currently, these methodologies are fundamental tools in Clinical Microbiology and bacterial population genetics studies to track outbreaks and to study the dissemination and evolution of virulence or pathogenicity factors and antimicrobial resistance. Due to advances in DNA sequencing technology, these methods have evolved to become focused on sequence-based methodologies. The need to have a common understanding of the concepts described and the ability to share results within the community at a global level are increasingly important requisites for the continued development of portable and accurate sequence-based typing methods, especially with the recent introduction of Next Generation Sequencing (NGS) technologies. In this paper, we present an ontology designed for the sequence-based microbial typing field, capable of describing any of the sequence-based typing methodologies currently in use and being developed, including novel NGS-based methods. This is a fundamental step to accurately describe, analyze, curate, and manage information for microbial typing based on sequence-based typing methods.

    Evaluation of Jackknife and Bootstrap for Defining Confidence Intervals for Pairwise Agreement Measures

    Several research fields frequently deal with the analysis of diverse classification results of the same entities. This should imply an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed and the importance of obtaining confidence intervals for the point estimate in the comparison of these measures has been highlighted. A broad range of methods can be used for the estimation of confidence intervals. However, evidence is lacking about which methods are appropriate for the calculation of confidence intervals for most clustering agreement measures. Here we evaluate the resampling techniques of bootstrap and jackknife for the calculation of the confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution.
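To make the jackknife idea concrete, here is a minimal sketch that computes a leave-one-out confidence interval for one pairwise agreement measure, the Rand index. The choice of the Rand index, the normal approximation with `z = 1.96`, and the function names are assumptions for illustration and may differ from the procedure evaluated in the paper.

```python
import math
from itertools import combinations


def rand_index(a, b):
    """Fraction of entity pairs on which two partitions agree.

    a, b: equal-length lists of cluster labels for the same entities.
    A pair counts as agreement when it is together in both partitions
    or separated in both.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def jackknife_ci(a, b, measure=rand_index, z=1.96):
    """Leave-one-out jackknife interval for an agreement measure.

    Needs at least 3 entities so every leave-one-out sample has a pair.
    Returns (estimate, lower, upper), clipped to [0, 1].
    """
    n = len(a)
    full = measure(a, b)
    # Recompute the measure n times, each time dropping one entity.
    loo = [measure(a[:i] + a[i + 1:], b[:i] + b[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    # Standard jackknife estimate of the standard error.
    se = math.sqrt((n - 1) / n * sum((t - mean_loo) ** 2 for t in loo))
    return full, max(0.0, full - z * se), min(1.0, full + z * se)
```

Any other pairwise measure with the same two-argument signature can be passed as `measure`, which keeps the resampling logic separate from the agreement coefficient.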

    Ranked Adjusted Rand: integrating distance and partition information in a measure of clustering agreement

    BACKGROUND: Biological information is commonly used to cluster or classify entities of interest such as genes, conditions, species or samples. However, different sources of data can be used to classify the same set of entities and methods allowing the comparison of the performance of two data sources or the determination of how well a given classification agrees with another are frequently needed, especially in the absence of a universally accepted "gold standard" classification. RESULTS: Here, we describe a novel measure – the Ranked Adjusted Rand (RAR) index. RAR differs from existing methods by evaluating the extent of agreement between any two groupings, taking into account the intercluster distances. This characteristic is relevant to evaluate cases of pairs of entities grouped in the same cluster by one method and separated by another. The latter method may assign them to close neighbour clusters or, on the contrary, to clusters that are far apart from each other. RAR is applicable even when intercluster distance information is absent for both or one of the groupings. In the first case, RAR is equal to its predecessor, Adjusted Rand (HA) index. Artificially designed clusterings were used to demonstrate situations in which only RAR was able to detect differences in the grouping patterns. A study with larger simulated clusterings ensured that in realistic conditions, RAR is effectively integrating distance and partition information. The new method was applied to biological examples to compare 1) two microbial typing methods, 2) two gene regulatory network distances and 3) microarray gene expression data with pathway information. In the first application, one of the methods does not provide intercluster distances while the other originated a hierarchical clustering. RAR proved to be more sensitive than HA in the choice of a threshold for defining clusters in the hierarchical method that maximizes agreement between the results of both methods. 
CONCLUSION: RAR has its major advantage in combining cluster distance and partition information, while the previously available methods used only the latter. RAR should be used in the research problems where HA was previously used, because in the absence of intercluster distance effects it is an equally effective measure, and in the presence of distance effects it is a more complete one.
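For reference, the Adjusted Rand (HA) index that the abstract names as the predecessor of RAR can be computed from the contingency counts of two partitions. The sketch below implements only that standard Hubert-Arabie index; it does not implement RAR, whose distance weighting is not specified in this abstract. The function name and the handling of the degenerate case are assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand(a, b):
    """Adjusted Rand (Hubert-Arabie) index between two partitions.

    a, b: equal-length sequences of cluster labels for the same entities.
    """
    n = len(a)
    # Pairs placed together in both partitions (contingency table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    # Pairs placed together within each partition separately.
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. both partitions are one single cluster);
        # returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

Identical groupings score 1 regardless of how the labels are named, and values near or below 0 indicate agreement no better than chance.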

    Critical steps in clinical shotgun metagenomics for the concomitant detection and typing of microbial pathogens

    High throughput sequencing has been proposed as a one-stop solution for diagnostics and molecular typing directly from patient samples, allowing timely and appropriate implementation of measures for treatment, infection prevention and control. However, it is unclear how the variety of available methods impacts the end results. We applied shotgun metagenomics on diverse types of patient samples using three different methods to deplete human DNA prior to DNA extraction. Libraries were prepared and sequenced with Illumina chemistry. Data was analyzed using methods likely to be available in clinical microbiology laboratories using genomics. The results of microbial identification were compared to standard culture-based microbiological methods. On average, 75% of the reads corresponded to human DNA, being a major determinant in the analysis outcome. None of the kits was clearly superior suggesting that the initial ratio between host and microbial DNA or other sample characteristics were the major determinants of the proportion of microbial reads. Most pathogens identified by culture were also identified through metagenomics, but substantial differences were noted between the taxonomic classification tools. In two cases the high number of human reads resulted in insufficient sequencing depth of bacterial DNA for identification. In three samples, we could infer the probable multilocus sequence type of the most abundant species. The tools and databases used for taxonomic classification and antimicrobial resistance identification had a key impact on the results, recommending that efforts need to be aimed at standardization of the analysis methods if metagenomics is to be used routinely in clinical microbiology

    Fast phylogenetic inference from typing data

    Background: Microbial typing methods are commonly used to study the relatedness of bacterial strains. Sequence-based typing methods are a gold standard for epidemiological surveillance due to the inherent portability of sequence and allelic profile data, fast analysis times and their capacity to create common nomenclatures for strains or clones. This led to the development of several novel methods and to several databases being made available for many microbial species. With the mainstream use of High Throughput Sequencing, the amount of data being accumulated in these databases is huge, storing thousands of different profiles. On the other hand, computing genetic evolutionary distances among a set of typing profiles or taxa dominates the running time of many phylogenetic inference methods. It is also important to note that most genetic evolutionary distance definitions rely, even if indirectly, on computing the pairwise Hamming distance among sequences or profiles. Results: We propose here an average-case linear-time algorithm to compute pairwise Hamming distances among a set of taxa under a given Hamming distance threshold. This article includes both a theoretical analysis and extensive experimental results concerning the proposed algorithm. We further show how this algorithm can be successfully integrated into a well-known phylogenetic inference method, and how it can be used to speed up querying local phylogenetic patterns over large typing databases.
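The thresholded problem in this abstract can be illustrated with a simple filter that avoids comparing every pair. The sketch below is not the paper's algorithm; it swaps in a plain pigeonhole block filter: if two profiles differ at no more than k positions, then after splitting the profile into k + 1 blocks at least one block must be identical, so only profiles sharing a block need a full comparison. All names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def pairs_within(profiles, k):
    """Return {(i, j): distance} for index pairs i < j with distance <= k.

    profiles: list of equal-length tuples of allele numbers.
    """
    n = len(profiles)
    length = len(profiles[0]) if n else 0
    if k >= length:
        # Threshold is not selective: every pair qualifies for checking.
        candidates = set(combinations(range(n), 2))
    else:
        candidates = set()
        blocks = k + 1
        for b in range(blocks):
            lo, hi = b * length // blocks, (b + 1) * length // blocks
            buckets = defaultdict(list)
            for idx, profile in enumerate(profiles):
                buckets[tuple(profile[lo:hi])].append(idx)
            # Profiles sharing this block exactly become candidate pairs.
            for members in buckets.values():
                candidates.update(combinations(members, 2))
    result = {}
    for i, j in candidates:
        d = hamming(profiles[i], profiles[j])  # verify each candidate
        if d <= k:
            result[(i, j)] = d
    return result
```

The filter helps most when the threshold is small relative to the profile length, since few unrelated profiles then share a whole block; in the worst case it still degrades to comparing all pairs.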