245 research outputs found

    Scaling metagenome sequence assembly with probabilistic de Bruijn graphs

    Deep sequencing has enabled the investigation of a wide range of environmental microbial ecosystems, but the high memory requirements for de novo assembly of short-read shotgun sequencing data from these complex populations are an increasingly large practical barrier. Here we introduce a memory-efficient graph representation with which we can analyze the k-mer connectivity of metagenomic samples. The graph representation is based on a probabilistic data structure, a Bloom filter, that allows us to efficiently store assembly graphs in as little as 4 bits per k-mer, albeit inexactly. We show that this data structure accurately represents DNA assembly graphs in low memory. We apply this data structure to the problem of partitioning assembly graphs into components as a prelude to assembly, and show that this reduces the overall memory requirements for de novo assembly of metagenomes. On one soil metagenome assembly, this approach achieves a nearly 40-fold decrease in the maximum memory requirements for assembly. This probabilistic graph representation is a significant theoretical advance in storing assembly graphs and also yields immediate leverage on metagenomic assembly.
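
The idea of storing k-mers in a Bloom filter and deriving graph edges from membership queries can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `KmerBloomFilter`, the SHA-256-based hashing, the helper names, and the sizing (4 bits per inserted k-mer, 2 hash functions) are all assumptions chosen for illustration, and reverse complements are ignored for brevity.

```python
import hashlib


class KmerBloomFilter:
    """Toy Bloom filter over k-mers (hypothetical helper, not the paper's code).

    Membership queries never give false negatives but may give false positives,
    which is why the resulting graph is stored "inexactly".
    """

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive one bit position per hash function by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer)
        )


def kmers(seq, k):
    """All overlapping k-mers of a sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def successors(bloom, kmer):
    """Candidate outgoing edges: k-mers overlapping by k-1 that the filter reports."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


read = "ACGTACGGA"  # made-up example read
k = 4
read_kmers = kmers(read, k)

# Assumed sizing: 4 bits per inserted k-mer, echoing the figure in the abstract.
bloom = KmerBloomFilter(num_bits=4 * len(read_kmers), num_hashes=2)
for km in read_kmers:
    bloom.add(km)

print(successors(bloom, "ACGT"))
```

Edges are never stored explicitly: the graph is traversed by asking the filter whether each of the four possible extensions of a k-mer is present. A false positive shows up as a spurious edge, so the list printed above always contains `CGTA` but may contain extra entries at this very small filter size.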

    Children's body mass index, participation in school meals, and observed energy intake at school meals

    Background: Data from a dietary-reporting validation study with fourth-grade children were analyzed to investigate a possible relationship of body mass index (BMI) with daily participation in school meals and observed energy intake at school meals, and whether the relationships differed by breakfast location (classroom; cafeteria). Methods: Data were collected in 17, 17, and 8 schools during three school years. For the three years, six, six, and seven of the schools had breakfast in the classroom; all other schools had breakfast in the cafeteria. Information about 180 days of school breakfast and school lunch participation during fourth grade for each of 1,571 children (90% Black; 53% girls) was available in electronic administrative records from the school district. Children were weighed and measured, and BMI was calculated. Each of a subset of 465 children (95% Black; 49% girls) was observed eating school breakfast and school lunch on the same day. Mixed-effects regression was conducted with BMI as the dependent variable and school as the random effect; independent variables were breakfast participation, lunch participation, combined participation (breakfast and lunch on the same day), average observed energy intake for breakfast, average observed energy intake for lunch, sex, age, breakfast location, and school year. Analyses were repeated for BMI category (underweight/healthy weight; overweight; obese; severely obese) using pooled ordered logistic regression models that excluded sex and age. Results: Breakfast participation, lunch participation, and combined participation were not significantly associated with BMI or BMI category irrespective of whether the model included observed energy intake at school meals. Observed energy intake at school meals was significantly and positively associated with BMI and BMI category. For the total sample and subset, breakfast location was significantly associated with BMI; average BMI was larger for children with breakfast in the classroom than in the cafeteria. Significantly more kilocalories were observed eaten at breakfast in the classroom than in the cafeteria. Conclusions: For fourth-grade children, results provide evidence of a positive relationship between BMI and observed energy intake at school meals, and between BMI and school breakfast in the classroom; however, BMI and participation in school meals were not significantly associated.

    Microbiome Composition and Function Drives Wound-Healing Impairment in the Female Genital Tract

    The mechanism(s) by which bacterial communities impact susceptibility to infectious diseases, such as HIV, and maintain female genital tract (FGT) health are poorly understood. Evaluation of FGT bacteria has predominantly been limited to studies of species abundance, but not bacterial function. We therefore sought to examine the relationship of bacterial community composition and function with mucosal epithelial barrier health in the context of bacterial vaginosis (BV) using metaproteomic, metagenomic, and in vitro approaches. We found highly diverse bacterial communities dominated by Gardnerella vaginalis associated with host epithelial barrier disruption and enhanced immune activation, and low diversity communities dominated by Lactobacillus species that associated with lower Nugent scores, reduced pH, and expression of host mucosal proteins important for maintaining epithelial integrity. Importantly, proteomic signatures of disrupted epithelial integrity associated with G. vaginalis-dominated communities in the absence of clinical BV diagnosis. Because traditional clinical assessments did not capture this, it likely represents a larger underrepresented phenomenon in populations with high prevalence of G. vaginalis. We finally demonstrated that soluble products derived from G. vaginalis inhibited wound healing, while those derived from L. iners did not, providing insight into functional mechanisms by which FGT bacterial communities affect epithelial barrier integrity.

    Digital PCR provides sensitive and absolute calibration for high throughput sequencing

    Background: Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample (typically micrograms) are needed for library preparation, thereby limiting the scope of samples which can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing. Results: We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth. Conclusion: The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.

    Guiding Ethical Principles in Engineering Biology Research

    Engineering biology is being applied toward solving or mitigating some of the greatest challenges facing society. As with many other rapidly advancing technologies, the development of these powerful tools must be considered in the context of ethical uses for personal, societal, and/or environmental advancement. Researchers have a responsibility to consider the diverse outcomes that may result from the knowledge and innovation they contribute to the field. Together, we developed a Statement of Ethics in Engineering Biology Research to guide researchers as they incorporate the consideration of long-term ethical implications of their work into every phase of the research lifecycle. Herein, we present and contextualize this Statement of Ethics and its six guiding principles. Our goal is to facilitate ongoing reflection and collaboration among technical researchers, social scientists, policy makers, and other stakeholders to support best outcomes in engineering biology innovation and development.

    Empirical Distributions of F-ST from Large-Scale Human Polymorphism Data

    Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright’s FST, which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high FST may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical FST analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations and an even smaller fraction (1%) is found between intra-continental populations. Second, the global FST distribution closely follows an exponential distribution. Third, although the overall FST distribution is similarly shaped (inverse J), FST distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean FST of these groups is linear in allele frequency. These results suggest that investigating the extremes of the FST distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
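
To make the "standardized variance in allele frequencies" concrete, the sketch below computes the textbook single-locus form of Wright's FST, (H_T - H_S) / H_T, for one biallelic SNP. It is an illustration only: the function name `fst_biallelic`, the equal default weights, and the example frequencies are assumptions, and the study's hierarchical analysis of HapMap data may use a different estimator.

```python
def fst_biallelic(freqs, weights=None):
    """Textbook F_ST for one biallelic locus: (H_T - H_S) / H_T.

    freqs   -- allele frequency of the same allele in each subpopulation
    weights -- relative subpopulation weights summing to 1 (equal if omitted)
    """
    n = len(freqs)
    if weights is None:
        weights = [1.0 / n] * n

    # Pooled allele frequency and expected heterozygosity of the total population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_total = 2 * p_bar * (1 - p_bar)

    # Weighted mean expected heterozygosity within subpopulations.
    h_within = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))

    if h_total == 0:
        return 0.0  # locus is monomorphic overall; no differentiation to apportion
    return (h_total - h_within) / h_total


# Made-up frequencies for illustration (not HapMap values).
print(fst_biallelic([0.5, 0.5]))  # identical frequencies: no differentiation
print(fst_biallelic([0.2, 0.8]))  # divergent frequencies: substantial differentiation
print(fst_biallelic([0.0, 1.0]))  # fixed differences: complete differentiation
```

With equal weights this quantity equals the variance of the subpopulation frequencies divided by p̄(1 − p̄), which is the standardized-variance reading used in the abstract.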

    The rise and fall of methanotrophy following a deepwater oil-well blowout

    The blowout of the Macondo oil well in the Gulf of Mexico in April 2010 injected up to 500,000 tonnes of natural gas, mainly methane, into the deep sea [1]. Most of the methane released was thought to have been consumed by marine microbes between July and August 2010 [2, 3]. Here, we report spatially extensive measurements of methane concentrations and oxidation rates in the nine months following the spill. We show that although gas-rich deepwater plumes were a short-lived feature, water column concentrations of methane remained above background levels throughout the rest of the year. Rates of microbial methane oxidation peaked in the deepwater plumes in May and early June, coincident with a rapid rise in the abundance of known and new methane-oxidizing microbes. At this time, rates of methane oxidation reached up to 5,900 nmol l⁻¹ d⁻¹, the highest rates documented in the global pelagic ocean before the blowout [4]. Rates of methane oxidation fell to less than 50 nmol l⁻¹ d⁻¹ in late June, and continued to decline throughout the remainder of the year. We suggest the precipitous drop in methane consumption in late June, despite the persistence of methane in the water column, underscores the important role that physiological and environmental factors play in constraining the activity of methane-oxidizing bacteria in the Gulf of Mexico.

    Allelic polymorphism in the T cell receptor and its impact on immune responses

    In comparison to human leukocyte antigen (HLA) polymorphism, the impact of allelic sequence variation within T cell receptor (TCR) loci is much less understood. Particular TCR loci have been associated with autoimmunity, but the molecular basis for this phenomenon is undefined. We examined the T cell response to an HLA-B*3501-restricted epitope (HPVGEADYFEY) from Epstein-Barr virus (EBV), which is frequently dominated by a TRBV9*01 public TCR (TK3). However, the common allelic variant TRBV9*02, which differs by a single amino acid near the CDR2β loop (Gln55→His55), was never used in this response. The structure of the TK3 TCR, its allelic variant, and a nonnaturally occurring mutant (Gln55→Ala55) in complex with HLA-B*3501 revealed that the Gln55→His55 polymorphism affected the charge complementarity at the TCR-peptide-MHC interface, resulting in reduced functional recognition of the cognate and naturally occurring variants of this EBV peptide. Thus, polymorphism in the TCR loci may contribute toward variability in immune responses and the outcome of infection.

    Tundra microbial community taxa and traits predict decomposition parameters of stable, old soil organic carbon.

    The susceptibility of soil organic carbon (SOC) in tundra to microbial decomposition under warmer climate scenarios potentially threatens a massive positive feedback to climate change, but the underlying mechanisms of stable SOC decomposition remain elusive. Herein, Alaskan tundra soils from three depths (a fibric O horizon with litter and coarse roots, an O horizon with decomposing litter and roots, and a mineral-organic mix, lying just above the permafrost) were incubated. Resulting respiration data were assimilated into a 3-pool model to derive decomposition kinetic parameters for fast, slow, and passive SOC pools. Bacterial, archaeal, and fungal taxa and microbial functional genes were profiled throughout the 3-year incubation. Correlation analyses and a Random Forest approach revealed associations between model parameters and microbial community profiles, taxa, and traits. There were more associations between the microbial community data and the SOC decomposition parameters of slow and passive SOC pools than those of the fast SOC pool. Also, microbial community profiles were better predictors of model parameters in deeper soils, which had higher mineral contents and relatively greater quantities of old SOC than in surface soils. Overall, our analyses revealed the functional potential of microbial communities to decompose tundra SOC through a suite of specialized genes and taxa. These results portray divergent strategies by which microbial communities access SOC pools across varying depths, lending mechanistic insights into the vulnerability of what is considered stable SOC in tundra regions.

    Disparate Associations of HLA Class I Markers with HIV-1 Acquisition and Control of Viremia in an African Population

    BACKGROUND: Acquisition of human immunodeficiency virus type 1 (HIV-1) infection is mediated by a combination of characteristics of the infectious and the susceptible member of a transmission pair, including human behavioral and genetic factors, as well as viral fitness and tropism. Here we report on the impact of established and potential new HLA class I determinants of heterosexual HIV-1 acquisition in the HIV-1-exposed seronegative (HESN) partners of serodiscordant Zambian couples. METHODOLOGY/PRINCIPAL FINDINGS: We assessed the relationships of behavioral and clinically documented risk factors, index partner viral load, and host genetic markers to HIV-1 transmission among 568 cohabiting couples followed for at least nine months. We genotyped subjects for three classical HLA class I genes known to influence immune control of HIV-1 infection. From 1995 to December 2006, 240 HESNs seroconverted and 328 remained seronegative. In Cox proportional hazards models, HLA-A*68:02 and the B*42-C*17 haplotype in HESN partners were significantly and independently associated with faster HIV-1 acquisition (relative hazards = 1.57 and 1.55; p = 0.007 and 0.013, respectively) after controlling for other previously established contributing factors in the index partner (viral load and specific class I alleles), in the HESN partner (age, gender), or in the couple (behavioral and clinical risk score). Few if any previously implicated class I markers were associated here with the rate of acquiring infection. CONCLUSIONS/SIGNIFICANCE: A few HLA class I markers showed modest effects on acquisition of HIV-1 subtype C infection in HESN partners of discordant Zambian couples. However, the striking disparity between those few markers and the more numerous, different markers found to determine HIV-1 disease course makes it highly unlikely that, whatever the influence of class I variation on the rate of infection, the mechanism mediating that phenomenon is identical to that involved in disease control.