63 research outputs found

    Combination of whole genome sequencing and supervised machine learning provides unambiguous identification of eae-positive Shiga toxin-producing Escherichia coli

    Introduction: The objective of this study was to develop, using a genome-wide machine learning approach, an unambiguous model to predict the presence of highly pathogenic STEC in E. coli read assemblies derived from complex samples potentially containing multiple E. coli strains. Our approach takes into account the high genomic plasticity of E. coli and uses the stratification of STEC and E. coli pathogroups, based on serotype and virulence factors, to identify specific combinations of biomarkers for improved characterization of eae-positive STEC (also named EHEC, for enterohemorrhagic E. coli), which are associated with bloody diarrhea and hemolytic uremic syndrome (HUS) in humans. Methods: The machine learning (ML) approach was applied to a large curated dataset composed of 1,493 E. coli genome sequences and 1,178 coding sequences (CDS). Feature selection was performed using eight classification algorithms, reducing the number of CDS to six. From this reduced dataset, the eight ML models were trained with hyper-parameter tuning and cross-validation steps. Results and discussion: Remarkably, using only these six genes, EHEC can be clearly identified in E. coli read assemblies obtained from in silico mixtures and from complex samples such as milk metagenomes. These combinations of discriminative biomarkers can be implemented as novel marker genes for the unambiguous characterization of EHEC in mixtures of different E. coli strains as well as in raw milk metagenomes.

    Evaluation of high molecular weight DNA extraction methods for long-read sequencing of Shiga toxin-producing Escherichia coli.

    Next-generation sequencing has become essential for pathogen characterization and typing. The most popular second-generation sequencing technique produces high-quality data with very low error rates and high depths. One major drawback of this technique is its short reads. Indeed, short-read sequencing data of Shiga toxin-producing Escherichia coli (STEC) are difficult to assemble because of the presence of numerous mobile genetic elements (MGEs), which contain repeated elements. The resulting draft assemblies are often highly fragmented, which results in a loss of information, especially concerning MGEs or large structural variations. Long-read sequencing can circumvent these problems and produce complete or nearly complete genomes. The ONT MinION is particularly popular because of its small size and minimal investment requirements. The ultra-long reads generated with the MinION can easily span prophages and repeat regions. Taking full advantage of this technology requires high molecular weight (HMW) DNA of high quality and in high quantity. In this study, we tested three different extraction methods (bead-based, solid-phase and salting-out) and evaluated their impact on STEC DNA yield, quality and integrity as well as on performance in MinION long-read sequencing. Both the bead-based and salting-out methods allowed the recovery of large quantities of HMW STEC DNA suitable for MinION library preparation. The DNA extracted using the salting-out method consistently produced longer reads in the subsequent MinION runs than the DNA extracted using the bead-based method. While both methods performed similarly in subsequent STEC genome assembly, salting-out appeared to be the best overall method for producing large quantities of pure HMW STEC DNA for MinION sequencing.


    Features of Mycobacterium bovis Complete Genomes Belonging to 5 Different Lineages

    The raw data are deposited in a public domain server at the NCBI SRA database, under BioProject accession number PRJNA832544. Mammalian tuberculosis (TB) is a zoonotic disease mainly due to Mycobacterium bovis (M. bovis). A current challenge for its eradication is understanding its transmission within multi-host systems. Improvements in long-read sequencing technologies have made it possible to obtain complete bacterial genomes that provide a comprehensive view of species-specific genomic features. In the context of TB, new genomic references based on complete genomes genetically close to field strains are also essential to perform precise field molecular epidemiological studies. A total of 10 M. bovis strains representing each genetic lineage identified in France and in other countries were selected for complete assembly of their genomes. Pangenome analysis revealed a “closed” pangenome composed of 3900 core genes and only 96 accessory genes. Whole-genome alignment using progressive Mauve showed remarkable conservation of genomic synteny, except that the genomes have a variable number of copies of IS6110. Characteristic genomic traits of each lineage were identified through the discovery of specific indels. Altogether, these results provide new genetic features that improve the description of M. bovis lineages. The availability of new complete representative genomes of M. bovis will be useful for epidemiological studies and for better understanding the transmission of this clonally evolving pathogen.

    New reference genomes of Mycobacterium bovis adapted to French genotype diversity

    Bovine tuberculosis outbreaks in cattle in France are present in specific regions and circulate in a multi-host system that includes not only domestic but also wild animals. The transmission link between infected animals remains difficult to establish given that they are locally infected by M. bovis strains with identical genotypes (spoligotype, MIRU-VNTR). Whole-genome SNP (single nucleotide polymorphism) analysis against appropriate reference genomes can precisely differentiate strains. However, the previous reference genome (AF2122/97) belongs to the European 1 clonal complex, which is mainly found in the British Isles but less so in France or in other European countries (3). New reference genomes genetically close to French field strains, such as Mb3601, which allowed us to describe the Eu3 clonal complex, are required to perform precise field molecular epidemiological studies.


    Captive Psittacines with Chlamydia avium Infection

    Avian chlamydiosis is an infection caused by obligate intracellular, gram-negative bacteria belonging to the Chlamydiaceae family. Birds can be hosts of several Chlamydia species, including Chlamydia avium, which has only been detected in pigeons and psittacine birds. In this study, depression, respiratory distress, and mortality were noted among psittacines belonging to a large aviary with 35 different avian species. On the basis of immunohistochemistry and PCR testing, chlamydiosis was diagnosed in affected birds. Gross and histopathologic lesions were mainly observed in the spleen and gastrointestinal tract. Chlamydia avium was detected by PCR in four psittacines: two dead birds and two individuals exhibiting respiratory distress. Increased aspartate aminotransferase and lactate dehydrogenase values and anemia were consistently identified in affected birds. Administration of doxycycline, combined with hepatoprotectors and vitamins, was effective in stopping mortality and bacterial shedding.

    New Mycobacterium bovis complete genomes of different clonal complexes to improve molecular epidemiology of French field strains

    Bovine tuberculosis (bTB) is a zoonotic disease due to Mycobacterium bovis (M. bovis). France has a bTB-free status, but the disease has not yet been eradicated, and a worryingly steady increase in bTB outbreaks has been observed in some regions (1). This could be explained by the detection of bTB in wildlife, which spills the infection back to livestock in the same territories. The transmission link within these multi-host systems remains difficult to establish given that the hosts share the same M. bovis genotypes (2). Obtaining new reference genomes for each of these genotypes could improve the knowledge of clonal groups and refine molecular field epidemiology studies based on whole genome sequencing.