3,218 research outputs found

    Childhood infections and asthma: at the crossroads of the hygiene and Barker hypotheses

    The hygiene hypothesis states that childhood asthma develops as a result of decreased exposure to infectious agents during infancy and early childhood. This results in the persistence of the neonatal T helper lymphocyte 2 immunophenotype, thereby predisposing the child to atopic disease. While multiple studies support the hygiene hypothesis in asthma ontogeny, the evidence remains inconclusive; multiple other environmental exposures in early childhood also alter predisposition to asthma. Moreover, the current paradigm for asthma development extends far beyond simple childhood environmental exposures to include fetal development, genetic predisposition, and interactions of the developmental state and genetics with the environment.

    On the Origins and Control of Community Types in the Human Microbiome

    Microbiome-based stratification of healthy individuals into compositional categories, referred to as "community types", holds promise for drastically improving personalized medicine. Despite this potential, the existence of community types and the degree of their distinctness have been highly debated. Here we adopted a dynamic systems approach and found that heterogeneity in the interspecific interactions or the presence of strongly interacting species is sufficient to explain community types, independent of the topology of the underlying ecological network. By controlling the presence or absence of these strongly interacting species we can steer the microbial ecosystem to any desired community type. This open-loop control strategy still holds even when the community types are not distinct but appear as dense regions within a continuous gradient. This finding can be used to develop viable therapeutic strategies for shifting the microbial composition to a healthy configuration. Comment: Main Text, Figures, Methods, Supplementary Figures, and Supplementary Text

    Screening and Replication using the Same Data Set: Testing Strategies for Family-Based Studies in which All Probands Are Affected

    For genome-wide association studies in family-based designs, we propose a powerful two-stage testing strategy that can be applied in situations in which parent-offspring trio data are available and all offspring are affected with the trait or disease under study. In the first step of the testing strategy, we construct estimators of genetic effect size in the completely ascertained sample of affected offspring and their parents that are statistically independent of the family-based association/transmission disequilibrium tests (FBATs/TDTs) that are calculated in the second step of the testing strategy. For each marker, the genetic effect is estimated (without requiring an estimate of the SNP allele frequency) and the conditional power of the corresponding FBAT/TDT is computed. Based on the power estimates, a weighted Bonferroni procedure assigns an individually adjusted significance level to each SNP. In the second stage, the SNPs are tested with the FBAT/TDT statistic at the individually adjusted significance levels. Using simulation studies for scenarios with up to 1,000,000 SNPs, varying allele frequencies and genetic effect sizes, the power of the strategy is compared with standard methodology (e.g., FBATs/TDTs with Bonferroni correction). In all considered situations, the proposed testing strategy demonstrates substantial power increases over the standard approach, even when the true genetic model is unknown and must be selected based on the conditional power estimates. The practical relevance of our methodology is illustrated by an application to a genome-wide association study for childhood asthma, in which we detect two markers meeting genome-wide significance that would not have been detected using standard methodology.
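
The weighted Bonferroni step in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible weighting, in which each marker's significance level is proportional to its first-stage power estimate; this is a generic illustration and not necessarily the authors' exact weighting scheme. The function names, power estimates, and p-values are all hypothetical.

```python
from typing import List, Sequence


def weighted_bonferroni_levels(power: Sequence[float], alpha: float = 0.05) -> List[float]:
    """Split an overall level alpha across markers in proportion to estimated power.

    Assumption: weights are simply power / sum(power). The power estimates must be
    independent of the second-stage test statistics for the error bound to hold.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("at least one power estimate must be positive")
    # The weights sum to 1, so the per-marker levels sum to alpha and the
    # family-wise error rate stays bounded by alpha (union bound).
    return [alpha * p / total for p in power]


def second_stage_hits(p_values: Sequence[float], levels: Sequence[float]) -> List[int]:
    """Return indices of markers whose p-value meets their individual level."""
    return [i for i, (p, a) in enumerate(zip(p_values, levels)) if p <= a]


# Hypothetical first-stage power estimates and second-stage p-values.
power_estimates = [0.80, 0.10, 0.05, 0.05]
p_values = [0.03, 0.03, 0.001, 0.5]

levels = weighted_bonferroni_levels(power_estimates, alpha=0.05)
hits = second_stage_hits(p_values, levels)
print(levels, hits)
```

With these made-up numbers, the first marker receives a level well above the uniform Bonferroni level of 0.05 / 4 = 0.0125, so its p-value of 0.03 is declared significant, whereas the second marker, with the same p-value but a low power estimate, is not.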

    Graph Convolutional Network-based Feature Selection for High-dimensional and Low-sample Size Data

    Feature selection is a powerful dimension reduction technique which selects a subset of relevant features for model construction. Numerous feature selection methods have been proposed, but most of them fail under the high-dimensional and low-sample size (HDLSS) setting due to the challenge of overfitting. In this paper, we present a deep learning-based method - GRAph Convolutional nEtwork feature Selector (GRACES) - to select important features for HDLSS data. We demonstrate empirical evidence that GRACES outperforms other feature selection methods on both synthetic and real-world datasets. Comment: 24 pages, 4 figures, 4 tables

    CGBayesNets: Conditional Gaussian Bayesian Network Learning and Inference with Mixed Discrete and Continuous Data

    Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBN) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com

    Lack of reproducibility of linkage results in serially measured blood pressure data

    BACKGROUND: Using the longitudinal Framingham Heart Study data on blood pressure, we analyzed the reproducibility of linkage measures from serial cross-sectional surveys of a defined population by performing genome-wide model-free linkage analyses to systolic blood pressure (SBP) and history of hypertension (HTN) measured at five separate time points. RESULTS: The heritability of SBP was relatively stable over time, ranging from 11.6 to 23.5% (coefficient of variation = 25.7%). However, the variability in linkage results was much greater. The average correlation in LOD scores at any pair of time points was 0.46 for HTN (NPL All LOD) and 0.17 for SBP (Variance Components LOD). No evidence of reproducible linkage results was found, with a mean κ of 0.02 for linkage to HTN and -0.03 for SBP linkage. At loci with potential evidence for linkage (LOD > 1.0 at one or more time points), the correlation was even lower. The coefficient of variation at loci with potential evidence of linkage was 126% for HTN and 135% for SBP. None of 15 chromosomal regions for HTN and only one of 28 regions for SBP with potential evidence for linkage had a LOD > 1.0 at more than two of the five time points. CONCLUSION: These data suggest that, although heritability estimates at different time points are relatively robust, the reproducibility of linkage results in serial cross-sectional samples of a geographically defined population at successive time points is poor. This may explain in part the difficulty encountered in replicating linkage studies of complex phenotypes.

    Using Canonical Correlation Analysis to Discover Genetic Regulatory Variants

    Background: Discovering genetic associations between genetic markers and gene expression levels can provide insight into gene regulation and, potentially, mechanisms of disease. Such analyses typically involve a linkage or association analysis in which expression data are used as phenotypes. This approach leads to a large number of multiple comparisons and may therefore lack power. We assess the potential of applying canonical correlation analysis to partitioned genomewide data as a method for discovering regulatory variants. Methodology/Principal Findings: Simulations suggest that canonical correlation analysis has higher power than standard pairwise univariate regression to detect single nucleotide polymorphisms when the expression trait has low heritability. The increase in power is even greater under the recessive model. We demonstrate this approach using the Childhood Asthma Management Program data. Conclusions/Significance: Our approach reduces multiple comparisons and may provide insight into the complex relationships between genotype and gene expression.