143 research outputs found

    Application of Two-Part Statistics for Comparison of Sequence Variant Counts

    Investigation of microbial communities, particularly human-associated communities, is significantly enhanced by the vast amounts of sequence data produced by high-throughput sequencing technologies. However, these data form high-dimensional, complex data sets that consist of a large proportion of zeros, non-negative skewed counts, and, frequently, a limited number of samples. These features distinguish sequence data from other forms of high-dimensional data and are not adequately addressed by statistical approaches in common use. Ultimately, medical studies may identify targeted interventions or treatments, but the lack of analytic tools for feature selection and for identifying the taxa responsible for differences between groups is hindering advancement. The objective of this paper is to examine the application of a two-part statistic to identify taxa that differ between two groups. The advantages of the two-part statistic over common statistical tests applied to sequence count datasets are discussed. Results from the t-test, the Wilcoxon test, and the two-part test are compared using sequence counts from microbial ecology studies in cystic fibrosis and from cenote samples. We show superior performance of the two-part statistic for analysis of sequence data. The improved performance in microbial ecology studies was independent of study type and sequence technology used.

    XplorSeq: A software environment for integrated management and phylogenetic analysis of metagenomic sequence data

    Background: Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, a need has arisen for software capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. Results: XplorSeq is a software package that facilitates the compilation, management, and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool [1, 2, 3]) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which can then be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. Conclusion: XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

    Directed Evaluation of Enterotoxigenic Escherichia coli Autotransporter Proteins as Putative Vaccine Candidates

    Diarrheal diseases are responsible for more than 1.5 million deaths annually in developing countries. Enterotoxigenic E. coli (ETEC) are among the most common bacterial causes of diarrhea, accounting for an estimated 300,000–500,000 deaths each year, mostly in young children. Unfortunately, there is not yet a vaccine that can offer sustained, broad-based protection against ETEC. While most vaccine development effort has focused on plasmid-encoded, finger-like ETEC adhesin structures known as colonization factors, additional effort is needed to identify conserved target antigens. Epidemiologic studies suggest that immune responses to uncharacterized, chromosomally encoded antigens could contribute to protection resulting from repeated infections. Earlier studies of immune responses to ETEC infection had identified a class of surface-expressed molecules known as autotransporters (AT). Therefore, available ETEC genome sequences were examined to identify conserved ETEC autotransporters not shared by the commensal E. coli HS strain, followed by studies of the immune response to these antigens and tests of their utility as vaccine components. Two chromosomally encoded ATs, identified in ETEC but not in HS, were found to be immunogenic and protective in an animal model, suggesting that conserved AT molecules contribute to protective immune responses that follow natural ETEC infection and offering new potential targets for vaccines.

    Comprehensive molecular, genomic and phenotypic analysis of a major clone of Enterococcus faecalis MLST ST40


    The bile salt glycocholate induces global changes in gene and protein expression and activates virulence in enterotoxigenic Escherichia coli

    Pathogenic bacteria use specific host factors to modulate virulence and stress responses during infection. We found previously that the host factor bile and the bile component glyco-conjugated cholate (NaGCH, sodium glycocholate) upregulate the colonization factor CS5 in enterotoxigenic Escherichia coli (ETEC). To further understand the global regulatory effects of bile and NaGCH, we performed Illumina RNA-Seq and found that crude bile and NaGCH altered the expression of 61 genes in CS5 + CS6 ETEC isolates. The most striking finding was high induction of the CS5 operon (csfA-F), its putative transcription factor csvR, and the putative ETEC virulence factor cexE. iTRAQ-coupled LC-MS/MS proteomic analyses verified induction of the plasmid-borne virulence proteins CS5 and CexE and also showed that NaGCH affected the expression of bacterial membrane proteins. Furthermore, NaGCH induced bacteria to aggregate, increased their adherence to epithelial cells, and reduced their motility. Our results indicate that CS5 + CS6 ETEC use NaGCH present in the small intestine as a signal to initiate colonization of the epithelium.