29 research outputs found

    Resources for the analysis of bacterial and microbial genomic data with a focus on antibiotic resistance

    Antibiotics are drugs that inhibit the growth of bacterial cells. Their discovery was one of the most significant achievements in medicine: it allowed the development of successful treatment options for severe bacterial infections, which has helped to increase life expectancy significantly. However, bacteria can adapt to changing environmental conditions through genetic modifications and can therefore become resistant to an antibiotic. Extensive use of antibiotics promotes the development of antibiotic resistance and, since some genetic factors can be exchanged between cells, the emergence of new resistance mechanisms and their spread have become a serious global problem. Countermeasures have been initiated, focusing on the different factors contributing to the antibiotic resistance crisis. These include the study of bacterial isolates and complete microbial communities using whole-genome sequencing (WGS) data. In both cases, there are specific challenges and requirements for different analytical approaches. The goal of the present thesis was to implement multiple resources that facilitate further microbiological studies, with a focus on bacteria and antibiotic resistance. The main project, GEAR-base, included an analysis of WGS and resistance data of around eleven thousand bacterial clinical isolates covering the main human pathogens and antibiotics from different drug classes. The dataset consisted of WGS data, antibiotic susceptibility profiles and meta-information, along with additional taxonomic characterization of a sample subset. The analysis of this isolate collection made it possible to identify bacterial species with increasing resistance rates, to construct species pan-genomes from the de novo assembled genomes, and to link gene presence or absence to the available antibiotic resistance profiles. The generated data and results were made available through the online resource GEAR-base. 
This resource provides access to the resistance information and genomic data, and implements functionality to compare submitted genes or genomes to the data included in the resource. In microbial community studies, the metagenome obtained through WGS is analyzed to determine its taxonomic composition. For this task, genomic sequences are clustered, or binned, to represent sequences belonging to specific organisms or groups of closely related organisms. BusyBee Web was developed to provide an automatic binning pipeline using frequencies of k-mers (subsequences of length k) and bootstrapped supervised clustering. It also includes further data annotation, such as taxonomic classification of the input sequences, presence of known resistance factors, and bin quality. Plasmids, extra-chromosomal DNA molecules found in some bacteria, play an important role in the spread of antibiotic resistance. Classifying sequences from WGS data as chromosomal or plasmid-derived is challenging, as demonstrated by an evaluation of four tools implementing three different approaches; a reference dataset for detecting already known plasmids is therefore desirable. To this end, an online resource for complete bacterial plasmids (PLSDB) was implemented. In summary, the online resources described here represent valuable datasets and/or tools for the analysis of microbial genomic data and, especially, bacterial pathogens and antibiotic resistance.
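
The thesis abstract describes binning by k-mer frequencies. As a minimal sketch of that feature only (not the BusyBee Web implementation; the function name `kmer_profile` and the default `k = 4` are assumptions made for illustration), a k-mer frequency profile of one sequence could be computed like this:

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every A/C/G/T k-mer in `sequence`.

    Hypothetical helper for illustration; k-mers containing ambiguous
    bases such as N are ignored.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate all 4**k unambiguous k-mers so every profile has the same keys.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in all_kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}
```

Because every profile has the same 4^k keys, profiles from different contigs can be stacked into a matrix and passed to a clustering step. Real pipelines usually also merge each k-mer with its reverse complement, which this sketch omits.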

    PLSDB: a resource of complete bacterial plasmids

    The study of bacterial isolates or communities requires the analysis of the plasmids they contain in order to provide an extensive characterization of the organisms. Plasmids harboring resistance and virulence factors are of particular interest as they contribute to the dissemination of antibiotic resistance. As the number of newly sequenced bacterial genomes is growing, a comprehensive resource is required that allows users to browse and filter the available plasmids and to perform sequence analyses. Here, we present PLSDB, a resource containing 13 789 plasmid records collected from the NCBI nucleotide database. The web server provides an interactive view of all obtained plasmids with additional meta-information such as sequence characteristics, sample-related information and taxonomy. Moreover, nucleotide sequence data can be uploaded to search for short nucleotide sequences (e.g. specific genes) in the plasmids, to compare a given plasmid to the records in the collection, or to determine whether a sample contains one or more of the known plasmids (containment analysis). The resource is freely accessible at https://ccbmicrobe.cs.uni-saarland.de/plsdb/
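
The containment analysis mentioned in the PLSDB abstract asks how much of a known plasmid is present in a sample. A minimal sketch of the underlying idea follows; it assumes exact k-mer sets, whereas tools for large collections typically estimate the same quantity from sketches of the data. The function names and the default `k = 21` are illustrative assumptions, not the PLSDB implementation.

```python
def kmer_set(sequence: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in `sequence` (forward strand only)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def containment(reference: str, sample: str, k: int = 21) -> float:
    """Fraction of the reference's distinct k-mers that also occur in the sample.

    A value near 1.0 suggests the reference (e.g. a known plasmid) is
    contained in the sample; a value near 0.0 suggests it is absent.
    """
    ref_kmers = kmer_set(reference, k)
    if not ref_kmers:
        return 0.0  # reference shorter than k: nothing to compare
    return len(ref_kmers & kmer_set(sample, k)) / len(ref_kmers)
```

Unlike a symmetric similarity such as the Jaccard index, containment is normalized by the reference alone, so a small plasmid can score highly against a much larger sample.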

    Prospect and challenge of detecting dynamic gene copy number increases in stem cells by whole genome sequencing

    Gene amplification is an evolutionarily well-conserved and highly efficient mechanism to increase the amount of specific proteins. In humans, gene amplification is a hallmark of cancer and has recently been found during stem cell differentiation. Amplifications in stem cells are restricted to specific tissue areas and time windows, rendering their detection difficult. Here, we report on the performance of deep WGS (average 82-fold depth of coverage) on the BGISEQ with nanoball technology to detect amplifications in human mesenchymal and neural stem cells. As reference technologies, we applied array-based comparative genomic hybridization (aCGH), fluorescence in situ hybridization (FISH), and qPCR. Using different in silico strategies, we analyzed the potential of WGS for amplification detection. Our results provide evidence that WGS accurately identifies changes of the copy number profiles in human stem cell differentiation. However, the identified changes are not in all cases consistent between WGS and aCGH. The results of WGS and of the validation by qPCR were concordant in 83.3% of the 36 cases tested. In sum, both genome-wide techniques, aCGH and WGS, have unique advantages and specific challenges, calling for locus-specific confirmation by the low-throughput approaches qPCR or FISH.

    MicroRNA signature in spermatozoa and seminal plasma of proven fertile men and in testicular tissue of men with obstructive azoospermia

    MicroRNAs (miRNAs) have recently received a significant amount of attention due to their remarkable influence on post-transcriptional gene regulation. In this study, we aim to provide a catalogue of miRNAs present in spermatozoa, seminal plasma and testicular tissue. Expression profiles of miRNAs in spermatozoa and seminal plasma of 16 proven fertile men and in testicular tissue of eight men with morphologically and/or histologically confirmed obstructive azoospermia were determined by microarray and RT-qPCR in combination with bioinformatics analyses. A total of 123, 156 and 133 miRNAs were consistently detected in spermatozoa, seminal plasma and testicular tissue, respectively. Sixty-four miRNAs were shared across all sample types. Based on the miRNA expression levels in each group, correlation analysis showed moderate-to-strong correlations within the spermatozoa and seminal plasma samples and a wider range of correlations within the testicular tissue samples. The target genes of known miRNAs appeared to be involved in a wide range of biological processes related to reproduction, development and differentiation of germ cells. Our results suggest that there is a certain similarity between spermatozoa and seminal plasma for the relative miRNA expression changes with respect to testicular tissue and provide an overview of the miRNAs present in each sample type.

    Gene amplification in mesenchymal stem cells and during differentiation towards adipocytes or osteoblasts

    Gene amplifications are an attribute of tumor cells and have long been overlooked in normal cells. A growing number of investigations describe gene amplifications in normal mammalian cells during development and differentiation. Possibly, tumor cells have rescued the gene amplification mechanism as a physiological attribute of stem cells. Here, we investigated human mesenchymal stem cells (hMSCs) for gene amplification using array-CGH, single-cell fluorescence in situ hybridization and qPCR. Gene amplifications were detected in undifferentiated hMSCs and in hMSCs during differentiation towards adipocytes and osteoblasts. Undifferentiated hMSCs harbor 12 amplified chromosomal regions, hMSCs that differentiated towards adipocytes harbor 18 amplified regions, and hMSCs that differentiated towards osteoblasts harbor 19 amplified regions. Specifically, hMSCs that differentiated towards adipocytes or osteoblasts harbor CDK4 and MDM2 amplifications, both of which frequently occur in osteosarcoma and liposarcoma, which share the same cell origin. Besides the amplifications, we identified 36 under-replicated regions in undifferentiated and in differentiating hMSCs.

    Clinical Resistome Screening of 1,110 Escherichia coli Isolates Efficiently Recovers Diagnostically Relevant Antibiotic Resistance Biomarkers and Potential Novel Resistance Mechanisms

    Multidrug-resistant pathogens represent one of the biggest global healthcare challenges. Molecular diagnostics can guide effective antibiotic therapy but relies on validated, predictive biomarkers. Here we present a novel, universally applicable workflow for rapid identification of antimicrobial resistance (AMR) biomarkers from clinical Escherichia coli isolates and quantitatively evaluate the potential to recover causal biomarkers for observed resistance phenotypes. For this, a metagenomic plasmid library from 1,110 clinical E. coli isolates was created and used for high-throughput screening to identify biomarker candidates against Tobramycin (TOB), Ciprofloxacin (CIP), and Trimethoprim-Sulfamethoxazole (TMP-SMX). Identified candidates were further validated in vitro and also evaluated in silico for their diagnostic performance based on matched genotype-phenotype data. AMR biomarkers recovered by the metagenomic screening approach mechanistically explained 77% of observed resistance phenotypes for Tobramycin, 76% for Trimethoprim-Sulfamethoxazole, and 20% for Ciprofloxacin. Sensitivity for Ciprofloxacin resistance detection could be improved to 97% by complementing results with AMR biomarkers that are undiscoverable due to intrinsic limitations of the workflow. Additionally, when combined in a multiplex diagnostic in silico panel, the identified AMR biomarkers reached promising positive and negative predictive values of up to 97% and 99%, respectively. Finally, we demonstrate that the developed workflow can be used to identify potential novel resistance mechanisms.
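
The diagnostic performance figures in the abstract above (sensitivity, positive and negative predictive value) all derive from a 2x2 comparison of predicted genotype against observed phenotype. The following sketch shows the standard definitions; the class and function names are assumptions for illustration, and the counts in the usage note are invented, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing biomarker calls with measured resistance."""
    tp: int  # biomarker present, isolate resistant
    fp: int  # biomarker present, isolate susceptible
    tn: int  # biomarker absent, isolate susceptible
    fn: int  # biomarker absent, isolate resistant


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV (NaN if undefined)."""
    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }
```

For example, with the made-up counts `ConfusionCounts(tp=40, fp=10, tn=45, fn=5)`, the PPV is 40/50 and the NPV is 45/50. Note that PPV and NPV depend on how common resistance is in the tested collection, whereas sensitivity and specificity do not.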

    The composition of the pulmonary microbiota in sarcoidosis - an observational study

    Background: Sarcoidosis is a systemic disease of unknown etiology. The disease mechanisms are largely speculative and may include a role of microbial patterns that initiate and drive an underlying immune process. The aim of this study was to characterize the microbiota of the lung of patients with sarcoidosis and compare its composition and diversity with the results from patients with other interstitial lung diseases (ILD) and historic healthy controls. Methods: Patients (sarcoidosis, n = 31; interstitial lung disease, n = 19) were recruited within the PULMOHOM study, a prospective cohort study to characterize inflammatory processes in pulmonary diseases. Bronchoscopy of the middle lobe or the lingula was performed and the recovered fluid was immediately sent for analysis of the pulmonary microbiota by 16S rRNA gene sequencing. Subsequent bioinformatic analysis was performed to compare the groups. Results: There were no significant differences between patients with sarcoidosis and those with other ILDs with regard to microbiome composition and diversity. In addition, the abundances of the genera Atopobium, Fusobacterium, Mycobacterium and Propionibacterium did not differ between the two groups. There were no gross differences compared with historical healthy controls. Conclusion: The analysis of the pulmonary microbiota based on 16S rRNA gene sequencing did not show a significant dysbiosis in patients with sarcoidosis as compared to other ILD patients. These data do not exclude a microbiological component in the pathogenesis of sarcoidosis.

    An estimate of the total number of true human miRNAs

    While the number of human miRNA candidates continuously increases, only a few of them are completely characterized and experimentally validated. Toward determining the total number of true miRNAs, we employed a combined in silico high- and experimental low-throughput validation strategy. We collected 28 866 human small RNA sequencing data sets containing 363.7 billion sequencing reads and excluded falsely annotated and low-quality data. Our high-throughput analysis identified 65% of 24 127 mature miRNA candidates as likely false positives. Using northern blotting, we experimentally validated miRBase entries and novel miRNA candidates. By exogenous overexpression of 108 precursors that encode 205 mature miRNAs, we confirmed 68.5% of the miRBase entries, with the confirmation rate going up to 94.4% for the high-confidence entries, and 18.3% of the novel miRNA candidates. Analyzing endogenous miRNAs, we verified the expression of 8 miRNAs in 12 different human cell lines. In total, we extrapolated 2300 true human mature miRNAs, 1115 of which are currently annotated in miRBase V22. The experimentally validated miRNAs will contribute to revising targetomes hypothesized by utilizing falsely annotated miRNAs.

    Comparison of initial oral microbiomes of young adults with and without cavitated dentin caries lesions using an in situ biofilm model

    Dental caries is caused by acids released from bacterial biofilms. However, the in vivo formation of initial biofilms in relation to caries remains largely unexplored. The aim of this study was to compare the oral microbiome during the initial phase of bacterial colonization for individuals with (CC) and without (NC) cavitated dentin caries lesions. Bovine enamel slabs on acrylic splints were worn by the volunteers (CC: 14, NC: 13) for in situ biofilm formation (2 h, 4 h, 8 h, 1 ml saliva as reference). Sequencing of the V1/V2 regions of the 16S rRNA gene was performed (MiSeq). The relative abundances of individual operational taxonomic units (OTUs) were compared between samples from the CC group and the NC group. Random forest models were furthermore trained to separate the groups. While the overall heterogeneity did not differ substantially between CC and NC individuals, several individual OTUs were found to have significantly different relative abundances. For the 8 h samples, most of the significant OTUs showed higher relative abundances in the CC group, while the majority of significant OTUs in the saliva samples were more abundant in the NC group. Furthermore, using OTU signatures enabled a separation between both groups, with area-under-the-curve (AUC) values of ~0.8. In summary, the results suggest that initial oral biofilms provide the potential to differentiate between CC and NC individuals.
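
The AUC reported in the abstract above can be read as the probability that a classifier scores a randomly chosen CC sample higher than a randomly chosen NC sample. A minimal sketch of that pairwise definition is given below; the function name `auc` is an assumption, and the scores in the usage note are invented, not the study's random forest output.

```python
from collections.abc import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve from classifier scores of two groups.

    Counts, over all (positive, negative) pairs, how often the positive
    sample receives the higher score; ties count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

With the invented scores `auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])`, eight of the nine pairs are ranked correctly, giving 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value of about 0.8 indicates a useful but imperfect classifier.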

    The sncRNA Zoo: a repository for circulating small noncoding RNAs in animals

    The repertoire of small noncoding RNAs (sncRNAs), particularly miRNAs, in animals is considered to be evolutionarily conserved. Studies on sncRNAs are often largely based on homology-based information, relying on genomic sequence similarity and excluding actual expression data. To obtain information on sncRNA expression (including miRNAs, snoRNAs, YRNAs and tRNAs), we performed low-input-volume next-generation sequencing of 500 pg of RNA from 21 animals at two German zoological gardens. Notably, none of the species under investigation were previously annotated in any miRNA reference database. Sequencing was performed on blood cells as they are amongst the most accessible, stable and abundant sources of the different sncRNA classes. We evaluated and compared the composition and nature of sncRNAs across the different species by computational approaches. While the distribution of sncRNAs in the different RNA classes varied significantly, general evolutionary patterns were maintained. In particular, miRNA sequences and expression were found to be even more conserved than previously assumed. To make the results available for other researchers, all data, including expression profiles at the species and family levels, and different tools for viewing, filtering and searching the data are freely available in the online resource ASRA (Animal sncRNA Atlas) at https://www.ccb.uni-saarland.de/asra/