202 research outputs found

    A statistical toolbox for metagenomics: assessing functional diversity in microbial communities

    Background: The 99% of bacteria in the environment that are recalcitrant to culturing have spurred the development of metagenomics, a culture-independent approach to sample and characterize microbial genomes. Massive datasets of metagenomic sequences have been accumulated, but analysis of these sequences has focused primarily on the descriptive comparison of the relative abundance of proteins that belong to specific functional categories. More robust statistical methods are needed to make inferences from metagenomic data. In this study, we developed and applied a suite of tools to describe and compare the richness, membership, and structure of microbial communities using peptide fragment sequences extracted from metagenomic sequence data. Results: Application of these tools to acid mine drainage, soil, and whale fall metagenomic sequence collections revealed groups of peptide fragments with a relatively high abundance and no known function. When combined with analysis of 16S rRNA gene fragments from the same communities these tools enabled us to demonstrate that although there was no overlap in the types of 16S rRNA gene sequence observed, there was a core collection of operational protein families that was shared among the three environments. Conclusion: The results of comparisons between the three habitats were surprising considering the relatively low overlap of membership and the distinctively different characteristics of the three habitats. These tools will facilitate the use of metagenomics to pursue statistically sound genome-based ecological analyses.

    De novo identification of viral pathogens from cell culture hologenomes

    Background: Fast, specific identification and surveillance of pathogens is the cornerstone of any outbreak response system, especially in the case of emerging infectious diseases and viral epidemics. This process is generally tedious and time-consuming, which makes it ineffective in traditional settings. An added complexity in these situations is that pure isolates of pathogens are not available, as the pathogens are present as mixed genomes or hologenomes. Next-generation sequencing approaches offer an attractive solution in this scenario, as they provide adequate sequencing depth quickly and affordably and make it possible to decipher complex interactions between genomes at a scale that was not possible before. The widespread application of next-generation sequencing in this field has been limited by the lack of an efficient computational pipeline to systematically analyze data and delineate pathogen genomes from a mixed population of genomes or hologenomes. Findings: We applied next-generation sequencing to a sample containing a mixed population of genomes from an epidemic, with appropriate processing and enrichment. The data were analyzed using an extensive computational pipeline involving mapping to reference genome sets and de novo assembly. In-depth analysis of the data revealed the presence of sequences corresponding to Japanese encephalitis virus. The genome of the virus was also independently assembled de novo. The presence of the virus was additionally verified using standard molecular biology techniques. Conclusions: Our approach can accurately identify causative pathogens from cell culture hologenome samples containing a mixed population of genomes and in principle can be applied to patient hologenome samples without any background information. This methodology could be widely applied to identify and isolate pathogen genomes and understand their genomic variability during outbreaks.

    Finding the Needles in the Metagenome Haystack

    In the collective genomes (the metagenome) of the microorganisms inhabiting the Earth’s diverse environments is written the history of life on this planet. New molecular tools developed and used for the past 15 years by microbial ecologists are facilitating the extraction, cloning, screening, and sequencing of these genomes. This approach allows microbial ecologists to access and study the full range of microbial diversity, regardless of our ability to culture organisms, and provides an unprecedented access to the breadth of natural products that these genomes encode. However, there is no way that the mere collection of sequences, no matter how expansive, can provide full coverage of the complex world of microbial metagenomes within the foreseeable future. Furthermore, although it is possible to fish out highly informative and useful genes from the sea of gene diversity in the environment, this can be a highly tedious and inefficient procedure. Microbial ecologists must be clever in their pursuit of ecologically relevant, valuable, and niche-defining genomic information within the vast haystack of microbial diversity. In this report, we seek to describe advances and prospects that will help microbial ecologists glean more knowledge from investigations into metagenomes. These include technological advances in sequencing and cloning methodologies, as well as improvements in annotation and comparative sequence analysis. More significant, however, will be ways to focus in on various subsets of the metagenome that may be of particular relevance, either by limiting the target community under study or improving the focus or speed of screening procedures. 
    Lastly, given the cost and infrastructure necessary for large metagenome projects, and the almost inexhaustible amount of data they can produce, trends toward broader use of metagenome data across the research community coupled with the needed investment in bioinformatics infrastructure devoted to metagenomics will no doubt further increase the value of metagenomic studies in various environments.

    PhoR/PhoP two component regulatory system affects biocontrol capability of Bacillus subtilis NCD-2

    The Bacillus subtilis strain NCD-2 is an important biocontrol agent against cotton verticillium wilt and cotton sore shin in the field, which are caused by Verticillium dahliae Kleb and Rhizoctonia solani Kuhn, respectively. A mutant of strain NCD-2, designated M216, with decreased antagonism to V. dahliae and R. solani, was selected by mini-Tn10 mutagenesis and in vitro virulence screening. The inserted gene in the mutant was cloned and identified as the phoR gene, which encodes a sensor kinase in the PhoP/PhoR two-component system. Compared to the wild-type strain, the APase activity of the mutant was significantly lower when cultured in low-phosphate medium, but no obvious difference was observed when cultured in high-phosphate medium. The mutant also grew more slowly on organic phosphate agar and lost its phosphatidylcholine-solubilizing ability. Suppression of cotton seedling damping-off in vivo and colonization of the cotton rhizosphere were also reduced in the mutant strain compared with the wild-type strain. All of these characteristics could be partially restored by complementation of the phoR gene in the M216 mutant.

    Metatranscriptomics and Pyrosequencing Facilitate Discovery of Potential Viral Natural Enemies of the Invasive Caribbean Crazy Ant, Nylanderia pubens

    BACKGROUND: Nylanderia pubens (Forel) is an invasive ant species that in recent years has developed into a serious nuisance problem in the Caribbean and United States. A rapidly expanding range, explosive localized population growth, and control difficulties have elevated this ant to pest status. Professional entomologists and the pest control industry in the United States are urgently trying to understand its biology and develop effective control methods. Currently, no biologically based control agents are known to be available for use in controlling N. pubens. METHODOLOGY AND PRINCIPAL FINDINGS: Metagenomics and pyrosequencing techniques were employed to examine the transcriptome of field-collected N. pubens colonies in an effort to identify virus infections with potential to serve as control agents against this pest ant. Pyrosequencing (454-platform) of a non-normalized N. pubens expression library generated 1,306,177 raw sequence reads comprising 450 Mbp. Assembly resulted in generation of 59,017 non-redundant sequences, including 27,348 contigs and 31,669 singlets. BLAST analysis of these non-redundant sequences identified 51 of potential viral origin. Additional analyses winnowed this list of potential viruses to three that appear to replicate in N. pubens. CONCLUSIONS: Pyrosequencing the transcriptome of field-collected samples of N. pubens has identified at least three sequences that are likely of viral origin and for which N. pubens serves as host. In addition, the N. pubens transcriptome provides a genetic resource for the scientific community, which is especially important at this early stage of developing a knowledge base for this new pest.

    Gene prediction in metagenomic fragments: A large scale machine learning approach

    Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. The short length of the fragments (Sanger sequencing, for example, yields fragments of 700 bp on average) and the unknown phylogenetic origin of most fragments require approaches to gene prediction that differ from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
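The two-stage idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only a monocodon feature (dicodon usage and translation initiation sites are omitted), and every function name, weight, and scaling constant below is a hypothetical placeholder with untrained toy values.

```python
import math
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a DNA string."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def codon_frequencies(orf):
    """Relative frequency of each in-frame codon (monocodon usage)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    total = len(codons)
    return {c: n / total for c, n in Counter(codons).items()} if total else {}


def linear_score(freqs, weights, bias=0.0):
    """Stage 1: a linear discriminant, i.e. a weighted sum of features."""
    return bias + sum(weights.get(c, 0.0) * f for c, f in freqs.items())


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coding_probability(stage1_score, orf_length, gc, hidden, output):
    """Stage 2: a one-hidden-layer network over score, ORF length and GC content."""
    # Dividing the length by 1000 is an arbitrary scaling chosen for this sketch.
    inputs = [stage1_score, orf_length / 1000.0, gc]
    activations = [
        math.tanh(bias + sum(w * x for w, x in zip(ws, inputs)))
        for ws, bias in hidden
    ]
    out_weights, out_bias = output
    return sigmoid(out_bias + sum(w * a for w, a in zip(out_weights, activations)))


# Toy, untrained parameters purely for illustration (assumptions, not published values).
toy_weights = {"ATG": 1.5, "GCC": 0.8, "TAA": -0.5}
toy_hidden = [([0.9, 0.4, -0.2], 0.1), ([-0.3, 0.7, 0.5], 0.0)]
toy_output = ([1.2, -0.6], 0.05)

orf = "ATGGCCGCCTAA"
score = linear_score(codon_frequencies(orf), toy_weights)
prob = coding_probability(score, len(orf), gc_content(orf), toy_hidden, toy_output)
print(f"stage 1 score = {score:.3f}, coding probability = {prob:.3f}")
```

In a real pipeline the weights would be learned from annotated genomes, and each stage 1 discriminant would contribute its own input to the network.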

    Effects of therapy with [177Lu-DOTA0,Tyr3]octreotate on endocrine function

    Purpose: Peptide receptor radionuclide therapy (PRRT) with radiolabelled somatostatin analogues is a novel therapy for patients with somatostatin receptor-positive tumours. We determined the effects of PRRT with [177Lu-DOTA0,Tyr3]octreotate (177Lu-octreotate) on glucose homeostasis and the pituitary-gonadal, pituitary-thyroid and pituitary-adrenal axes. Methods: Hormone levels were measured and adrenal function assessed at baseline and up to 24 months of follow-up. Results: In 35 men, mean serum inhibin B levels were decreased at 3 months post-therapy (205 ± 16 to 25 ± 4 ng/l, p 550 nmol/l, n = 18). Five patients developed elevated HbA1c levels (> 6.5%). Conclusion: In men, 177Lu-octreotate therapy induced transient inhibitory effects on spermatogenesis, but non-SHBG-bound T levels remained unaffected. In the long term, gonadotropin levels decreased significantly in postmenopausal women. Only a few patients developed hypothyroidism or elevated levels of HbA1c. Therefore, PRRT with 177Lu-octreotate can be regarded as a safe treatment modality with respect to short- and long-term endocrine function.

    Premature Decline of Serum Total Testosterone in HIV-Infected Men in the HAART-Era

    Background: Testosterone (T) deficiency remains a poorly understood issue in men with Human Immunodeficiency Virus (HIV). We investigated the gonadal status in HIV-infected men in order to characterize T deficiency and to identify predictive factors for low serum T. Methodology/Principal Findings: We performed a cross-sectional, observational study on 1325 consecutive male HIV outpatients, most of them having lipodystrophy. Serum total T < 300 ng/dL was used as the threshold for biochemical T deficiency. Morning serum total T, luteinizing hormone (LH), estradiol, HIV parameters, and body composition parameters by CT scan and dual-energy X-ray absorptiometry were measured in each case. Sexual behavior was evaluated in a subset of 247 patients. T deficiency was found in 212 subjects, especially in the age range 40–59, but was frequent even in younger patients. T deficiency occurred mainly in association with low/normal serum LH. Adiposity was higher in subjects with T deficiency (p<0.0001), and both visceral adipose tissue and body mass index were the main negative predictors of serum total T. Osteoporosis and erectile dysfunction were present in a similar percentage in men with or without T deficiency. Conclusions/Significance: Premature decline of serum T is common (16%) among young/middle-aged HIV-infected men and is associated with inappropriately low/normal LH and increased visceral fat. T deficiency occurs at a young age and may be considered an element of the process of premature or accelerated aging known to be associated with HIV infection. The role of HIV and/or HIV infection treatments, as well as the role of the general health state on the gonadal axis, remains to be elucidated. Due to the low specificity of signs and symptoms of hypogonadism in the context of HIV, caution is needed in the diagnosis of hypogonadism in HIV-infected men with biochemically low serum T levels.

    Assessment of Metagenomic Assembly Using Simulated Next Generation Sequencing Data

    Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes), all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes), Illumina produced the best assemblies and more closely resembled the expected functional composition. For the most complex community (400 genomes), there was very little assembly of reads from any sequencing technology. However, due to their longer read length, the Sanger reads still represented the overall functional composition reasonably well. We further examined the effect of scaffolding of contigs using paired-end Illumina reads. It dramatically increased contig lengths for the simple community and yielded minor improvements for the more complex communities. Although the increase in contig length was accompanied by increased chimericity, it resulted in more complete genes and a better characterization of the functional repertoire. The metagenomic simulators developed for this research are freely available.
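To make the notion of a read simulator with a base-error model concrete, here is a minimal sketch. It is not the authors' simulator: it assumes a single uniform substitution rate per base, whereas the abstract describes platform-specific error models, and the function name and parameters are hypothetical.

```python
import random


def simulate_reads(genome, read_length, n_reads, error_rate, seed=0):
    """Sample fixed-length reads from a genome and add substitution errors.

    Simplifying assumption: every base is substituted independently with the
    same probability `error_rate`; indels and position-dependent error
    profiles are not modelled.
    """
    if read_length > len(genome):
        raise ValueError("read_length exceeds genome length")
    rng = random.Random(seed)  # seeded for reproducible output
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_length + 1)
        bases = list(genome[start:start + read_length])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with one of the other three nucleotides.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads


toy_genome = "ACGTTGCAAGCTTAGGCTAACGTTAGC" * 4  # made-up sequence for illustration
for read in simulate_reads(toy_genome, read_length=20, n_reads=3, error_rate=0.02):
    print(read)
```

A simulated metagenome of a given complexity could then be built by calling such a function on each member genome and pooling the reads, weighted by the assumed relative abundances.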