Resources for the analysis of bacterial and microbial genomic data with a focus on antibiotic resistance
Antibiotics are drugs which inhibit the growth of bacterial cells. Their
discovery was one of the most significant achievements in medicine:
it allowed the development of successful treatment options for severe
bacterial infections, which has helped to significantly increase our life
expectancy. However, bacteria have the ability to adapt to changing
environmental conditions through genetic modifications, and can,
therefore, become resistant to an antibiotic. Extensive use of antibiotics
promotes the development of antibiotic resistance and, since
some genetic factors can be exchanged between the cells, emergence
of new resistance mechanisms and their spread have become a serious
global problem.
Counteractive measures have been initiated, focusing on the different
factors contributing to the antibiotic resistance crisis. These
include the study of bacterial isolates and complete microbial communities
using whole-genome sequencing (WGS) data. In both cases,
there are specific challenges and requirements for different analytical
approaches. The goal of the present thesis was the implementation
of multiple resources which should facilitate further microbiological
studies, with a focus on bacteria and antibiotic resistance. The main
project, GEAR-base, included an analysis of WGS and resistance data
of around eleven thousand bacterial clinical isolates covering the main
human pathogens and antibiotics from different drug classes. The
dataset consisted of WGS data, antibiotic susceptibility profiles and
meta-information, along with additional taxonomic characterization
of a sample subset. The analysis of this isolate collection made it
possible to identify bacterial species demonstrating increasing
resistance rates, to construct species pan-genomes from the de novo
assembled genomes, and to link gene presence or absence to the
available antibiotic resistance profiles. The generated data and results
were made available through the online resource GEAR-base. This
resource provides access to the resistance information and genomic
data, and implements functionality to compare submitted genes or
genomes to the data included in the resource.
In microbial community studies, the metagenome obtained through
WGS is analyzed to determine its taxonomic composition. For this
task, genomic sequences are clustered, or binned, to represent sequences
belonging to specific organisms or closely-related organism
groups. BusyBee Web was developed to provide an automatic binning
pipeline using frequencies of k-mers (subsequences of length k)
and bootstrapped supervised clustering. It also includes further data
annotation, such as taxonomic classification of the input sequences,
presence of known resistance factors, and bin quality.
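The k-mer frequency profiles mentioned above can be illustrated with a minimal sketch. It is not BusyBee Web's implementation: the function name, the default k = 4, and the choice to skip windows containing ambiguous bases are assumptions made for illustration, and reverse complements are not merged.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every possible DNA k-mer in seq."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows with N or other symbols
    )
    total = sum(counts.values())
    # Report all 4**k k-mers so that every sequence yields a vector of equal length.
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


# Hypothetical toy sequence; real contigs are thousands of bases long.
profile = kmer_frequencies("ACGTACGTAC", k=2)
```

Each sequence is thereby mapped to a fixed-length numeric vector, which is the kind of representation a clustering step can work on.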
Plasmids, extra-chromosomal DNA molecules found in some bacteria,
play an important role in the spread of antibiotic resistance.
Classifying sequences from WGS data as chromosomal or
plasmid-derived is challenging, as demonstrated by an evaluation of
four tools implementing three different approaches. A reference dataset
for detecting plasmids that are already known is therefore desirable.
To this end, an online resource for complete bacterial plasmids
(PLSDB) was implemented.
In summary, the herein described online resources represent valuable
datasets and/or tools for the analysis of microbial genomic data
and, especially, bacterial pathogens and antibiotic resistance.
Antibiotics are drugs that inhibit the growth of bacterial cells. Their
discovery was one of the most significant achievements in medicine:
it enabled the development of successful treatment options for
severe bacterial infections, which has helped to increase our life
expectancy. However, bacteria are able to adapt to changing
environmental conditions and can thereby become resistant to an
antibiotic. The extensive use of antibiotics promotes the development
of antibiotic resistance and, since some genetic factors can be
exchanged between cells, the emergence of new resistance
mechanisms and their spread have become a serious global problem.
Countermeasures have been taken that focus on the various
factors contributing to the antibiotic resistance crisis.
These include studies of bacterial isolates and entire
microbial communities using whole-genome sequencing
(WGS). In both cases, there are specific challenges and
requirements for different analytical methods. The goal of this
dissertation was the implementation of several resources intended to
facilitate further microbial studies, with a focus on
bacteria and antibiotic resistance. The main project, GEAR-base,
comprised an analysis of WGS and resistance data from
approximately eleven thousand clinical bacterial isolates and covered
the major human pathogens and antibiotics from different
drug classes. In addition to WGS data, antibiotic susceptibility
profiles and meta-information, the dataset included
additional taxonomic characterization of a subset
of the samples. The analysis of this isolate collection enabled
the identification of species with increasing resistance rates, the
construction of species pan-genomes from the de novo assembled
genomes, and the linking of gene presence or
absence to the antibiotic resistance profiles. The generated
data and results were made available through the online resource
GEAR-base. This resource provides access to the resistance
information and the collected genomic data, and
implements functions for comparing uploaded genes
or genomes to the data contained in the resource.
In studies of microbial communities, the metagenome obtained
through WGS is analyzed to determine its taxonomic
composition. To this end, the genomic sequences
are grouped into so-called bins (binning), which represent the
assignment of sequences to specific organisms or to groups of
closely related organisms. BusyBee Web was developed
to offer an automatic binning pipeline that uses
frequency profiles of k-mers (subsequences of length k) and a
bootstrap-based method for grouping
the sequences. In addition, the data are annotated with,
for example, the taxonomic classification of the uploaded
sequences, the presence of known resistance factors,
and the quality of the bins.
Plasmids, DNA molecules that are present in some bacteria in
addition to the chromosome, play an important role in the
spread of antibiotic resistance. Classifying sequences
from WGS as chromosome- or plasmid-derived
is challenging, as demonstrated in an evaluation of four
tools implementing three different approaches.
A reference dataset for detecting already known plasmids
is therefore highly desirable.
For this purpose, an online resource of complete
bacterial plasmids (PLSDB) was implemented.
The online resources described here represent useful datasets
and/or tools that can be used for the analysis of microbial
genomic data, in particular of bacterial pathogens and
antibiotic resistance.
PLSDB: a resource of complete bacterial plasmids
The study of bacterial isolates or communities requires the analysis of the plasmids they contain in order to provide an extensive characterization of the organisms. Plasmids harboring resistance and virulence factors are of particular interest as they contribute to the dissemination of antibiotic resistance. As the number of newly sequenced bacterial genomes is growing, a comprehensive resource is required that allows users to browse and filter the available plasmids and to perform sequence analyses. Here, we present PLSDB, a resource containing 13 789 plasmid records collected from the NCBI nucleotide database. The web server provides an interactive view of all obtained plasmids with additional meta information such as sequence characteristics, sample-related information and taxonomy. Moreover, nucleotide sequence data can be uploaded to search for short nucleotide sequences (e.g. specific genes) in the plasmids, to compare a given plasmid to the records in the collection, or to determine whether a sample contains one or more of the known plasmids (containment analysis).
The resource is freely accessible under https://ccbmicrobe.cs.uni-saarland.de/plsdb/
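The idea behind a containment analysis can be sketched with exact k-mer sets. The abstract does not state how PLSDB computes containment, so the approach, the function names, and the toy sequences below are assumptions for illustration only; a production tool would typically use sketching to scale to large collections.

```python
def kmer_set(seq: str, k: int) -> set[str]:
    """Collect all distinct k-mers of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(plasmid: str, sample: str, k: int = 5) -> float:
    """Fraction of the plasmid's k-mers that also occur in the sample."""
    plasmid_kmers = kmer_set(plasmid, k)
    if not plasmid_kmers:
        return 0.0
    return len(plasmid_kmers & kmer_set(sample, k)) / len(plasmid_kmers)


# Hypothetical toy data: the first sample embeds the whole plasmid sequence.
plasmid = "ACGTTGCA"
sample_with_plasmid = "GGG" + plasmid + "CCC"
sample_without_plasmid = "TTTTTTTT"

score_present = containment(plasmid, sample_with_plasmid)
score_absent = containment(plasmid, sample_without_plasmid)
```

Unlike a symmetric similarity, the score is normalized by the plasmid alone, so a small plasmid can be fully contained in a much larger sample.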
Prospect and challenge of detecting dynamic gene copy number increases in stem cells by whole genome sequencing
Gene amplification is an evolutionarily well-conserved and highly efficient mechanism to increase the amount of specific
proteins. In humans, gene amplification is a hallmark of cancer and has recently been found during stem cell differentiation.
Amplifications in stem cells are restricted to specific tissue areas and time windows, rendering their detection difficult. Here, we
report on the performance of deep WGS (average 82-fold depth of coverage) on the BGISEQ with nanoball
technology to detect amplifications in human mesenchymal and neural stem cells. As reference technologies, we applied array-based comparative genomic hybridization (aCGH), fluorescence in situ hybridization (FISH), and qPCR. Using different in silico
strategies, we analyzed the potential of WGS for amplification detection. Our results provide evidence
that WGS accurately identifies changes of the copy number profiles in human stem cell differentiation. However, the identified
changes are not in all cases consistent between WGS and aCGH. The results of WGS and the validation by qPCR were
concordant in 83.3% of the 36 tested cases. In sum, both genome-wide techniques, aCGH and WGS, have unique advantages and
specific challenges, calling for locus-specific confirmation by the low-throughput approaches qPCR or FISH.
MicroRNA signature in spermatozoa and seminal plasma of proven fertile men and in testicular tissue of men with obstructive azoospermia
MicroRNAs (miRNAs) have recently received a significant amount of attention due to their remarkable influence on post-transcriptional gene regulation. In this study, we aim to provide a catalogue of miRNAs present in spermatozoa, seminal plasma and testicular tissue. Expression profiles of miRNAs in spermatozoa and seminal plasma of 16 proven fertile men and in testicular tissue of eight men with morphologically and/or histologically confirmed obstructive azoospermia were determined by microarray and RT-qPCR in combination with bioinformatics analyses. A total of 123, 156 and 133 miRNAs were consistently detected in spermatozoa, seminal plasma and testicular tissue, respectively. Sixty-four miRNAs were shared across all sample types. Based on the miRNA expression levels in each group, correlation analysis showed moderate-to-strong correlations within the spermatozoa and seminal plasma samples and a wider range of correlations within the testicular tissue samples. The target genes of known miRNAs appeared to be involved in a wide range of biological processes related to reproduction, development and differentiation of germ cells. Our results suggest that there is a certain similarity between spermatozoa and seminal plasma for the relative miRNA expression changes with respect to testicular tissue and provide an overview of the miRNAs present in each sample type.
Gene amplification in mesenchymal stem cells and during differentiation towards adipocytes or osteoblasts
Gene amplifications are an attribute of tumor cells and have long been overlooked in normal cells. A growing number of investigations describe gene amplifications in normal mammalian cells during development and differentiation. Possibly, tumor cells have rescued the gene amplification mechanism as a physiological attribute of stem cells. Here, we investigated human mesenchymal stem cells (hMSCs) for gene amplification using array-CGH, single cell fluorescence in situ hybridization and qPCR. Gene amplifications were detected in mesenchymal stem cells and in mesenchymal stem cells during differentiation towards adipocytes and osteoblasts. Undifferentiated hMSCs harbor 12 amplified chromosomal regions, hMSCs that differentiated towards adipocytes 18 amplified chromosomal regions, and hMSCs that differentiated towards osteoblasts 19 amplified chromosomal regions. Specifically, hMSCs that differentiated towards adipocytes or osteoblasts harbor CDK4 and MDM2 amplifications, both of which frequently occur in osteosarcoma and liposarcoma, which are of the same cell origin. Besides the amplifications, we identified 36 under-replicated regions in undifferentiated and in differentiating hMSCs.
Clinical Resistome Screening of 1,110 Escherichia coli Isolates Efficiently Recovers Diagnostically Relevant Antibiotic Resistance Biomarkers and Potential Novel Resistance Mechanisms
Multidrug-resistant pathogens represent one of the biggest global healthcare challenges.
Molecular diagnostics can guide effective antibiotic therapy but relies on validated,
predictive biomarkers. Here we present a novel, universally applicable workflow for rapid
identification of antimicrobial resistance (AMR) biomarkers from clinical Escherichia coli
isolates and quantitatively evaluate the potential to recover causal biomarkers for observed
resistance phenotypes. For this, a metagenomic plasmid library from 1,110 clinical E. coli
isolates was created and used for high-throughput screening to identify biomarker
candidates against Tobramycin (TOB), Ciprofloxacin (CIP), and Trimethoprim-Sulfamethoxazole (TMP-SMX). Identified candidates were further validated in vitro and
also evaluated in silico for their diagnostic performance based on matched genotype-phenotype data. AMR biomarkers recovered by the metagenomics screening approach
mechanistically explained 77% of observed resistance phenotypes for Tobramycin, 76%
for Trimethoprim-Sulfamethoxazole, and 20% for Ciprofloxacin. Sensitivity for Ciprofloxacin
resistance detection could be improved to 97% by complementing results with AMR
biomarkers that are undiscoverable due to intrinsic limitations of the workflow. Additionally,
when combined in a multiplex diagnostic in silico panel, the identified AMR biomarkers
reached promising positive and negative predictive values of up to 97 and 99%, respectively.
Finally, we demonstrate that the developed workflow can be used to identify potential
novel resistance mechanisms.
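The diagnostic measures reported above (sensitivity, positive and negative predictive value) follow directly from a confusion matrix of predicted versus observed resistance. The sketch below only illustrates the definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute sensitivity, PPV and NPV from confusion-matrix counts.

    tp: resistant isolates with a biomarker hit
    fp: susceptible isolates with a biomarker hit
    tn: susceptible isolates without a hit
    fn: resistant isolates without a hit
    """
    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a category is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of resistant isolates detected
        "ppv": ratio(tp, tp + fp),          # share of hits that are truly resistant
        "npv": ratio(tn, tn + fn),          # share of non-hits that are truly susceptible
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=10, tn=80, fn=20)
```

Note that PPV and NPV depend on how common resistance is in the isolate collection, so they do not transfer unchanged to a cohort with a different resistance rate.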
The composition of the pulmonary microbiota in sarcoidosis - an observational study
Background: Sarcoidosis is a systemic disease of unknown etiology. The disease mechanisms are largely
speculative and may include the role of microbial patterns that initiate and drive an underlying immune process. The
aim of this study was to characterize the microbiota of the lung of patients with sarcoidosis and compare its
composition and diversity with the results from patients with other interstitial lung diseases (ILD) and historic
healthy controls.
Methods: Patients (sarcoidosis, n = 31; interstitial lung disease, n = 19) were recruited within the PULMOHOM study,
a prospective cohort study to characterize inflammatory processes in pulmonary diseases. Bronchoscopy of the
middle lobe or the lingula was performed and the recovered fluid was immediately sent for analysis of the
pulmonary microbiota by 16S rRNA gene sequencing. Subsequent bioinformatic analysis was performed to compare
the groups.
Results: There were no significant differences between patients with sarcoidosis or other ILDs with regard to
microbiome composition and diversity. In addition, the abundances of the genera Atopobium, Fusobacterium,
Mycobacterium and Propionibacterium did not differ between the two groups. There were no gross differences
to historical healthy controls.
Conclusion: The analysis of the pulmonary microbiota based on 16S rRNA gene sequencing did not show a
significant dysbiosis in patients with sarcoidosis as compared to other ILD patients. These data do not exclude a
microbiological component in the pathogenesis of sarcoidosis.
An estimate of the total number of true human miRNAs
While the number of human miRNA candidates continuously increases, only a few of them are completely characterized and experimentally validated. Toward determining the total number of true miRNAs, we employed a combined in silico high- and experimental low-throughput validation strategy. We collected 28 866 human small RNA sequencing data sets containing 363.7 billion sequencing reads and excluded falsely annotated and low quality data. Our high-throughput analysis identified 65% of 24 127 mature miRNA candidates as likely false-positives. Using northern blotting, we experimentally validated miRBase entries and novel miRNA candidates. By exogenous overexpression of 108 precursors that encode 205 mature miRNAs, we confirmed 68.5% of the miRBase entries with the confirmation rate going up to 94.4% for the high-confidence entries and 18.3% of the novel miRNA candidates. Analyzing endogenous miRNAs, we verified the expression of 8 miRNAs in 12 different human cell lines. In total, we extrapolated 2300 true human mature miRNAs, 1115 of which are currently annotated in miRBase V22. The experimentally validated miRNAs will contribute to revising targetomes hypothesized by utilizing falsely annotated miRNAs.
Comparison of initial oral microbiomes of young adults with and without cavitated dentin caries lesions using an in situ biofilm model
Dental caries is caused by acids released from bacterial biofilms. However, the in vivo formation of initial biofilms in relation to caries remains largely unexplored. The aim of this study was to compare the oral microbiome during the initial phase of bacterial colonization for individuals with (CC) and without (NC) cavitated dentin caries lesions. Bovine enamel slabs on acrylic splints were worn by the volunteers (CC: 14, NC: 13) for in situ biofilm formation (2 h, 4 h, 8 h, 1 ml saliva as reference). Sequencing of the V1/V2 regions of the 16S rRNA gene was performed (MiSeq). The relative abundances of individual operational taxonomic units (OTUs) were compared between samples from the CC group and the NC group. Random forest models were furthermore trained to separate the groups. While the overall heterogeneity did not differ substantially between CC and NC individuals, several individual OTUs were found to have significantly different relative abundances. For the 8 h samples, most of the significant OTUs showed higher relative abundances in the CC group, while the majority of significant OTUs in the saliva samples were more abundant in the NC group. Furthermore, using OTU signatures enabled a separation between both groups, with area-under-the-curve (AUC) values of ~0.8. In summary, the results suggest that initial oral biofilms provide the potential to differentiate between CC and NC individuals.
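As a reading aid for the reported AUC values, the sketch below computes the AUC as the probability that a randomly chosen positive sample receives a higher classifier score than a randomly chosen negative one. It is not the study's evaluation code: the function name and the scores are hypothetical, and the random forest models themselves are not reproduced here.

```python
def auc_from_scores(positive_scores: list[float], negative_scores: list[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores for CC (positive) and NC (negative) individuals.
cc_scores = [0.9, 0.8, 0.4]
nc_scores = [0.7, 0.3, 0.2]
auc = auc_from_scores(cc_scores, nc_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values around 0.8 indicate a clear but imperfect separation of the two groups.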
The sncRNA Zoo: a repository for circulating small noncoding RNAs in animals
The repertoire of small noncoding RNAs (sncRNAs), particularly miRNAs, in animals is considered to be evolutionarily conserved. Studies on sncRNAs are often largely based on homology-based information, relying on genomic sequence similarity and excluding actual expression data. To obtain information on sncRNA expression (including miRNAs, snoRNAs, YRNAs and tRNAs), we performed low-input-volume next-generation sequencing of 500 pg of RNA from 21 animals at two German zoological gardens. Notably, none of the species under investigation were previously annotated in any miRNA reference database. Sequencing was performed on blood cells as they are amongst the most accessible, stable and abundant sources of the different sncRNA classes. We evaluated and compared the composition and nature of sncRNAs across the different species by computational approaches. While the distribution of sncRNAs in the different RNA classes varied significantly, general evolutionary patterns were maintained. In particular, miRNA sequences and expression were found to be even more conserved than previously assumed. To make the results available for other researchers, all data, including expression profiles at the species and family levels, and different tools for viewing, filtering and searching the data are freely available in the online resource ASRA (Animal sncRNA Atlas) at https://www.ccb.uni-saarland.de/asra/