29 research outputs found

    Bacterial whole genome-based phylogeny: construction of a new benchmarking dataset and assessment of some existing methods

    Background: Whole genome sequencing (WGS) is increasingly used in diagnostics and surveillance of infectious diseases. A major application of WGS is identifying outbreak clusters, and there is therefore a need for methods that can accurately and efficiently infer phylogenies from sequencing reads. In the present study we describe a new dataset that we created for benchmarking such WGS-based methods on epidemiological data, and present an analysis in which we use the data to compare the performance of some current methods. Results: Our aim was to create a benchmark dataset that mimics sequencing data of the sort that might be collected during an outbreak of an infectious disease. This was achieved by letting an E. coli hypermutator strain grow in the lab for 8 consecutive days, each day splitting the culture in two while also collecting samples for sequencing. The result is a dataset consisting of 101 whole genome sequences with known phylogenetic relationships. Among the sequenced samples, 51 correspond to internal nodes in the phylogeny because they are ancestral, while the remaining 50 correspond to leaves. We also used the newly created dataset to compare three methods, available online, that infer phylogenies from whole-genome sequencing reads: NDtree, CSI Phylogeny and REALPHY. One complication when comparing the output of these methods with the known phylogeny is that phylogenetic methods typically build trees in which all observed sequences are placed as leaves, even though some of them are in fact ancestral. We therefore devised a method for post-processing the inferred trees by collapsing short branches (thus relocating some leaves to internal nodes), and also present two new measures of tree similarity that take into account the identity of both internal and leaf nodes. Conclusions: Based on this analysis we find that, among the investigated methods, CSI Phylogeny had the best performance, correctly identifying 73% of all branches in the tree and 71% of all clades. We have made all data from this experiment (raw sequencing reads, consensus whole-genome sequences, and descriptions of the known phylogeny in a variety of formats) publicly available, in the hope that other groups may find the data useful for benchmarking and exploring the performance of epidemiological methods. All data is freely available at: https://cge.cbs.dtu.dk/services/evolution_data.php
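The post-processing idea mentioned in the abstract above (collapsing short branches so that some leaves are relocated to internal nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `collapse_short_leaves` function, and the `threshold` parameter are hypothetical names introduced here, and the rule used (an unlabelled internal node absorbs the first child leaf whose branch is shorter than the threshold) is an assumption about how such a collapse might work.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a sample name (None if unlabelled),
    the branch length to its parent, and its child nodes."""
    name: Optional[str] = None
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def collapse_short_leaves(node: Node, threshold: float) -> Node:
    """Relocate a leaf to its parent when the leaf's branch is shorter
    than `threshold` (assumed rule, for illustration only)."""
    # Process the subtrees first so collapses propagate bottom-up.
    for child in node.children:
        collapse_short_leaves(child, threshold)
    # An unlabelled internal node may absorb at most one short leaf.
    if node.name is None:
        for child in node.children:
            if not child.children and child.length < threshold:
                node.name = child.name
                # Remove exactly this leaf (identity, not equality).
                node.children = [c for c in node.children if c is not child]
                break
    return node
```

For example, a root with a zero-length leaf "A" and a longer internal branch leading to leaves "B" and "C" would, with a small threshold, end up with "A" labelling the root itself while "B" and "C" remain leaves.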

    Computational Analysis Reveals the Temporal Acquisition of Pathway Alterations during the Evolution of Cancer

    Cancer metastasis is the lethal developmental step in cancer, responsible for the majority of cancer deaths. To metastasise, cancer cells must acquire the ability to disseminate systemically and to escape an activated immune response. Here, we investigated whether metastatic dissemination reflects the acquisition of genomic traits that are selected for. We obtained mutation and copy number data for 8332 tumours representing 19 cancer types from The Cancer Genome Atlas and the Hartwig Medical Foundation. A total of 827,344 non-synonymous mutations across these tumour samples were timed as early or late relative to copy number alterations, and potential driver events were annotated. We found that metastatic cancers had a significantly higher proportion of clonal mutations and a general enrichment of early mutations in the p53 and RTK/KRAS pathways. However, while individual pathways demonstrated a clear time-separated preference for specific events, the relative timing did not vary between primary and metastatic cancers. These results indicate that the selective pressure that drives cancer development does not change dramatically between primary and metastatic cancer on a genomic level, and is mainly focused on alterations that increase proliferation.

    Classifying cGAS-STING Activity Links Chromosomal Instability with Immunotherapy Response in Metastatic Bladder Cancer

    The cGAS-STING pathway serves a critical role in anticancer therapy. In particular, response to immunotherapy is likely driven both by active cGAS-STING signaling that attracts immune cells and by the presence of cancer neoantigens that present as targets for cytotoxic T cells. Chromosomal instability (CIN) is a hallmark of cancer, but also leads to an accumulation of cytosolic DNA that in turn results in increased cGAS-STING signaling. To avoid triggering the cGAS-STING pathway, cancer cells commonly disrupt it, either through mutations in the pathway or through transcriptional silencing. Given its effect on the immune system, determining the cGAS-STING activation status prior to treatment initiation is likely of clinical relevance. Here, we used combined expression data from 2,307 tumors from five cancer types from The Cancer Genome Atlas to define a novel cGAS-STING activity score based on eight genes with a known role in the pathway. Using unsupervised clustering, four distinct categories of cGAS-STING activation were identified. In multivariate models, the cGAS-STING active tumors showed improved prognosis. Importantly, in an independent bladder cancer immunotherapy-treated cohort, patients with low cGAS-STING expression showed limited response to treatment, while patients with high expression showed improved response and prognosis, particularly among patients with high CIN and more neoantigens. In a multivariate model, a significant interaction was observed between CIN, neoantigens, and cGAS-STING activation. Together, this suggests a potential role of cGAS-STING activity as a predictive biomarker for the application of immunotherapy. Significance: The cGAS-STING pathway is induced by CIN, triggers inflammation and is often deficient in cancer. We provide a tool to evaluate cGAS-STING activity and demonstrate clinical significance in immunotherapy response.

    A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance

    Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle to using this technology is the limited availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools, we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline automatically identifies the bacterial species and, if applicable, assembles the genome and identifies the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report is provided for each sample, and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reports give a rapid overview of the major results, and comparison with the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and the intention is to continue expanding it with new features as these become available.

    Development of a web tool for Escherichia coli subtyping based on fimH alleles (running title: Development of E. coli fimH sub-typing web-tool)

    The aim of this study was to construct a valid publicly available method for in silico fimH subtyping of Escherichia coli particularly suitable for differentiation of fine-resolution subgroups within clonal groups defined by standard multilocus sequence typing (MLST). FimTyper was constructed as a FASTA database containing all currently known fimH alleles. The software source code is publicly available at https://bitbucket.org/genomicepidemiology/fimtyper, the database is freely available at https://bitbucket.org/genomicepidemiology/fimtyper_db, and a service implementing the software is available at https://cge.cbs.dtu.dk/services/FimTyper. FimTyper was validated on three data sets: one containing Sanger sequences of fimH alleles of 42 E. coli isolates generated prior to the current study (data set 1), one containing whole-genome sequence (WGS) data of 243 third-generation-cephalosporin-resistant E. coli isolates (data set 2), and one containing a randomly chosen subset of 40 E. coli isolates from data set 2 that were subjected to conventional fimH subtyping (data set 3). The combination of the three data sets enabled an evaluation and comparison of FimTyper on both Sanger sequences and WGS data. FimTyper correctly predicted all 42 fimH subtypes from the Sanger sequences from data set 1 and successfully analyzed all 243 draft genomes from data set 2. FimTyper subtyping of the Sanger sequences and WGS data from data set 3 were in complete agreement. Additionally, fimH subtyping was evaluated on a phylogenetic network of 122 sequence type 131 (ST131) E. coli isolates. There was perfect concordance between the typology and fimH-based subclones within ST131, with accurate identification of the pandemic multidrug-resistant clonal subgroup ST131-H30. FimTyper provides a standardized tool, as a rapid alternative to conventional fimH subtyping, highly suitable for surveillance and outbreak detection.

    Norwegian patients and retail chicken meat share cephalosporin-resistant Escherichia coli and IncK/blaCMY-2 resistance plasmids

    Objectives: In 2012 and 2014 the Norwegian monitoring programme for antimicrobial resistance in the veterinary and food production sectors (NORM-VET) showed that 124 of a total of 406 samples (31%) of Norwegian retail chicken meat were contaminated with extended-spectrum cephalosporin-resistant Escherichia coli. The aim of this study was to compare selected cephalosporin-resistant E. coli from humans and poultry to determine their genetic relatedness based on whole genome sequencing (WGS). Methods: Escherichia coli representing three prevalent cephalosporin-resistant multi-locus sequence types (STs) isolated from poultry (n = 17) were selected from the NORM-VET strain collections. All strains carried an IncK plasmid with a blaCMY-2 gene. Clinical E. coli isolates (n = 284) with AmpC-mediated resistance were collected at Norwegian microbiology laboratories from 2010 to 2014. PCR screening showed that 29 of the clinical isolates harboured both IncK and blaCMY-2. All IncK/blaCMY-2-positive isolates were analysed with WGS-based bioinformatics tools. Results: Analysis of single nucleotide polymorphisms (SNPs) in 2.5 Mbp of shared genome sequences showed a close relationship, with fewer than 15 SNP differences between five clinical isolates from urinary tract infections (UTIs) and the ST38 isolates from poultry. Furthermore, all of the 29 clinical isolates harboured IncK/blaCMY-2 plasmid variants highly similar to the IncK/blaCMY-2 plasmid present in the poultry isolates. Conclusions: Our results provide support for the hypothesis that clonal transfer of cephalosporin-resistant E. coli from chicken meat to humans may occur, and may cause difficult-to-treat infections. Furthermore, these E. coli can be a source of AmpC-resistance plasmids for opportunistic pathogens in the human microbiota.

    KIDNEY-PAGER: analysis of circulating tumor DNA as a biomarker in renal cancer - an observational trial. Study protocol

    BACKGROUND: Management of localized renal cell carcinoma (RCC) is challenged by inaccurate methods for assessing the risk of recurrence and by delayed detection of relapse and residual disease after radical or partial nephrectomy. Circulating tumor DNA (ctDNA) has been proposed as a potential biomarker in RCC. PURPOSE: To conduct an observational study evaluating the validity of ctDNA as a biomarker of the risk of recurrence and of subclinical residual disease, with the aim of improving postoperative surveillance. MATERIAL AND METHODS: Urine and blood will be prospectively collected from up to 500 patients before and after surgery of the primary tumor and during follow-up of up to 5 years. ctDNA analysis will be performed using shallow whole genome sequencing and cell-free methylated DNA immunoprecipitation sequencing. ctDNA levels in plasma and urine will be correlated with oncological outcomes. Residual blood and urine as well as tissue biopsies will be biobanked for future research. INTERPRETATION: Results will pave the way for future ctDNA-guided clinical trials aiming to improve RCC management.