10 research outputs found

    Variance Components Models for Analysis of Big Family Data of Health Outcomes in the Lifelines Cohort Study

    Large multigenerational cohort studies offer powerful ways to study the hereditary effects on various health outcomes. However, accounting for complex kinship relations in big data structures can be methodologically challenging. The traditional kinship model is computationally infeasible when considering thousands of individuals. In this article, we propose a computationally efficient alternative that employs fractional relatedness of family members through a series of founding members. The primary goal of this study is to investigate whether the effect of determinants on health outcome variables differs with and without accounting for family structure. We compare a fixed-effects model without familial effects with several variance components models that account for heritability and shared environment structure. Our secondary goal is to apply the fractional relatedness model in a realistic setting. Lifelines is a three-generation cohort study investigating the biological, behavioral, and environmental determinants of healthy aging. We analyzed a sample of 89,353 participants from 32,452 reconstructed families. Our primary conclusion is that the effect of determinants on health outcome variables does not differ with and without accounting for family structure. However, accounting for family structure through fractional relatedness allows for estimating heritability in a computationally efficient way, showing some interesting differences between physical and mental quality of life heritability. We have shown through simulations that the proposed fractional relatedness model performs better than the standard kinship model, not only in terms of computational time and convenience of fitting using standard functions in R, but also in terms of bias of heritability estimates and coverage.

    Statistical inference in variance components models for biomedical applications

    Confidence intervals are an essential research topic in statistics. Based on confidence intervals, we can draw conclusions about the uncertainty of estimates. Confidence intervals are not simple to construct for complex functions, such as functions of variance components. Functions of variance components, like intraclass correlation coefficients (ICCs), are used in the biomedical area as measures of agreement, heritability, and heterogeneity. Agreement assesses the closeness in judgements among physicians on measurements. Heritability measures the contribution of genetics to phenotypic variance. Heterogeneity is mostly used in the context of meta-analysis, when variation between studies is of interest. These three measures played a dominant role in our research. Methods for the construction of confidence intervals for these complex functions of variance components, and their performance in terms of coverage probabilities, were studied in non-trivial epidemiological applications. These applications consist of (1) an agreement study of radiologists measuring the volume of glands in the neck and head, (2) a meta-analysis of nonlinear dose-response models for the effect of antipsychotic medications on the occupancy of the dopamine in the brain, (3) a meta-analysis of test-negative case-control studies to estimate influenza vaccine effectiveness, and (4) a three-generation family study (LifeLines) to investigate the effect of body mass index on mental and physical component scores and to determine the variance contributions due to heredity and shared environment. The latter study contributes to research in healthy ageing. This thesis also explores causal inference in plant genetics. For all applications, improved or new statistical methodology was developed.

    Confidence intervals for intraclass correlation coefficients in variance components models

    Confidence intervals for intraclass correlation coefficients in agreement studies with continuous outcomes are model-specific and no generic approach exists. This paper provides two generic approaches for intraclass correlation coefficients of the form $\sum_{q=1}^{Q}\sigma_q^2 \big/ \left(\sum_{q=1}^{Q}\sigma_q^2 + \sum_{p=Q+1}^{P}\sigma_p^2\right)$. The first approach uses Satterthwaite's approximation and an F-distribution. The second approach uses the first and second moments of the intraclass correlation coefficient estimate in combination with a Beta distribution. Both approaches are based on the restricted maximum likelihood estimates for the variance components involved. Simulation studies are conducted to examine the coverage probabilities of the confidence intervals for agreement studies with a mix of small sample sizes. Two different three-way variance components models and balanced and unbalanced one-way random effects models are investigated. The proposed approaches are compared with other approaches developed for these specific models. The approach based on the F-distribution provides acceptable coverage probabilities, but the approach based on the Beta distribution results in accurate coverages for most settings in both balanced and unbalanced designs. A real agreement study is provided to illustrate the approaches.

    Probability genotype imputation method and integrated weighted lasso for QTL identification

    Background: Many QTL studies have two common features: (1) often there is missing marker information, (2) among many markers involved in the biological process only a few are causal. In statistics, the second issue falls under the headings “sparsity” and “causal inference”. The goal of this work is to develop a two-step statistical methodology for QTL mapping for markers with binary genotypes. The first step introduces a novel imputation method for missing genotypes. The outcomes of the proposed imputation method are probabilities, which serve as weights in the second step, a weighted lasso. The sparse phenotype inference is employed to select a set of predictive markers for the trait of interest. Results: Simulation studies validate the proposed methodology under a wide range of realistic settings. Furthermore, the methodology outperforms alternative imputation and variable selection methods in such studies. The methodology was applied to an Arabidopsis experiment containing 69 markers for 165 recombinant inbred lines of an F8 generation. The results confirm previously identified regions; however, several new markers are also found. On the basis of the inferred ROC behavior, these markers show good potential for being real, especially for the germination trait Gmax. Conclusions: Our imputation method shows higher accuracy in terms of sensitivity and specificity compared to alternative imputation methods. Also, the proposed weighted lasso outperforms commonly practiced multiple regression as well as the traditional lasso and adaptive lasso with three weighting schemes. This means that under realistic missing data settings this methodology can be used for QTL identification.

    Zinc Single Atom Confinement Effects on Catalysis in 1T-Phase Molybdenum Disulfide

    Active sites are atomic sites within catalysts that drive reactions and are essential for catalysis. Spatially confining guest metals within active site microenvironments has been predicted to improve catalytic activity by altering the electronic states of active sites. Using the hydrogen evolution reaction (HER) as a model reaction, we show that intercalating zinc single atoms between layers of 1T-MoS2 (Zn SAs/1T-MoS2) enhances HER performance by decreasing the overpotential, charge transfer resistance, and kinetic barrier. The confined Zn atoms tetrahedrally coordinate to basal sulfur (S) atoms and expand the interlayer spacing of 1T-MoS2 by ∼3.4%. Under confinement, the Zn SAs donate electrons to coordinated S atoms, which lowers the free energy barrier of H* adsorption-desorption and enhances HER kinetics. In this work, which is applicable to all types of catalytic reactions and layered materials, HER performance is enhanced by controlling the coordination geometry and electronic states of transition metals confined within active-site microenvironments.