MetaGeniE: Characterizing Human Clinical Samples Using Deep Metagenomic Sequencing

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)
Publication date: 01/11/2014

With the decreasing cost of next-generation sequencing, deep sequencing of clinical samples provides unique opportunities to understand host-associated microbial communities. Among the primary challenges of clinical metagenomic sequencing is the rapid filtering of human reads to survey for pathogens with high specificity and sensitivity. Metagenomes are inherently variable due to different microbes in the samples and their relative abundance, the size and architecture of genomes, and factors such as target DNA amounts in tissue samples (i.e. human DNA versus pathogen DNA concentration). This variation in metagenomes typically manifests in sequencing datasets as low pathogen abundance, a high number of host reads, and the presence of close relatives and complex microbial communities. In addition to these challenges posed by the composition of metagenomes, high numbers of reads generated from high-throughput deep sequencing pose immense computational challenges. Accurate identification of pathogens is confounded by individual reads mapping to multiple different reference genomes due to gene similarity in different taxa present in the community or close relatives in the reference database. Available global and local sequence aligners also vary in sensitivity, specificity, and speed of detection. The efficiency of detection of pathogens in clinical samples is largely dependent on the desired taxonomic resolution of the organisms. We have developed an efficient strategy that identifies “all against all” relationships between sequencing reads and reference genomes. Our approach allows for scaling to large reference databases and then genome reconstruction by aggregating global and local alignments, thus allowing genetic characterization of pathogens at higher taxonomic resolution. These results were consistent with strain level SNP genotyping and bacterial identification from laboratory culture.

Public Library of Science (PLOS)

Crossref

OpenKnowledge@NAU

Directory of Open Access Journals

PubMed Central

FigShare

Bacterial infection detected by MetaGeniE confirmed with the laboratory culture media.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

MRSA: Methicillin-resistant Staphylococcus aureus; ENCL: Enterobacter cloacae; PSAR: Pseudomonas aeruginosa; MSSA: Methicillin-sensitive S. aureus; ECOL: Escherichia coli; ENSP: Enterococcus sp.; HAEM: Haemophilus influenzae.

FigShare

Effect of human filtration on percent genome coverage and read recall percentage of pathogen detection.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

The figure legend labels are prefixed with the number of reads (0.1K = 100; 1K = 1000; 10K = 10000; 100K = 100000; 1M = 1000000), followed by the filtration setting: mg_bw2 for only the fast alignment feature of human read reduction; mg_dc for all features of human read reduction except data compression; mgall_bw2 for all features of the human read reduction module.

FigShare

Sequential reduction of the metagenome reads for 4 clinical samples from cystic fibrosis patients.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

Data points represent the reads remaining after each processing step of the pipeline. The first six data points (Initial, Quality Filter, BWT Alignment, Data Compression, Sensitive Alignment, Human Repeat Alignment) represent Human Read Reduction; BWA Bacteria and BLAT Bacteria represent Pathogen Detection against the bacterial database.

FigShare

Detection of genomes in complex community.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

Relationship between genome size and genome coverage with increasing sequencing reads. Effect of detection on E. coli APEC O1 in simple and complex community.

FigShare

Comparison of detection of close relative in co-infection versus single infection.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

A. Comparison of percent genome coverage of true detection in co-infection versus false detection of S. aureus Newman. B. Comparison of percent genome coverage of S. aureus TCH1516 in co-infection versus simple infection.

FigShare

Benchmarking the human read reduction module of the pipeline.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

A. Total numbers of reads remaining after human read reduction with different filtration parameters. B. Runtime for human read filtration with different aligner and filtration parameters (in minutes).

FigShare

Relationship between percent genome coverage and read recall percentage with incremental divergence (i.e. error).

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)


FigShare

Description of different steps of human filtration of pipeline utilized to compare sensitivity/specificity of detection and performance of runtime and computational resources of the simulated reads.

Author: Arun Rawat (185306)
David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
Jeffrey T. Foster (164257)
Paul Keim (69765)

A dash (-) indicates that the option was not used.

FigShare

Annotation details of lost and acquired genes in the evolution of A. baumannii.

Author: David M. Engelthaler (117020)
Elizabeth M. Driebe (190628)
James M. Schupp (193214)
Jason W. Sahl (276694)
John D. Gillece (193203)
Paul Keim (69765)
Victor G. Waddell (276695)

*Acb = Acinetobacter calcoaceticus-baumannii, b = baumannii, n = nosocomialis.

FigShare