11 research outputs found

    A variance components factor model for genetic association studies: a Bayesian analysis.

    Studies of gene-trait associations for complex diseases often involve multiple traits that may vary by genotype groups or patterns. Such traits are usually manifestations of lower-dimensional latent factors or disease syndromes. We illustrate the use of a variance components factor (VCF) model to model the association between multiple traits and genotype groups as well as any other existing patient-level covariates. This model characterizes the correlations between traits as underlying latent factors that can be used in clinical decision-making. We apply it within the Bayesian framework and provide a straightforward implementation using the WinBUGS software. The VCF model is illustrated with simulated data and an example that comprises changes in plasma lipid measurements of patients who were treated with statins to lower low-density lipoprotein cholesterol, and polymorphisms from the apolipoprotein-E gene. The simulation shows that this model clearly characterizes existing multiple trait manifestations across genotype groups where individuals' group assignments are fully observed or can be deduced from the observed data. It also allows one to investigate covariate by genotype group interactions that may explain the variability in the traits. The flexibility to characterize such multiple trait manifestations makes the VCF model more desirable than the univariate variance components model, which is applied to each trait separately. The Bayesian framework offers a flexible approach that allows one to incorporate prior information

    Application of two machine learning algorithms to genetic association studies in the presence of covariates

    BACKGROUND: Population-based investigations aimed at uncovering genotype-trait associations often involve high-dimensional genetic polymorphism data as well as information on multiple environmental and clinical parameters. Machine learning (ML) algorithms offer a straightforward analytic approach for selecting subsets of these inputs that are most predictive of a pre-defined trait. The performance of these algorithms, however, in the presence of covariates is not well characterized. METHODS AND RESULTS: In this manuscript, we investigate two approaches: Random Forests (RFs) and Multivariate Adaptive Regression Splines (MARS). Through multiple simulation studies, the performance under several underlying models is evaluated. An application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies is also provided. CONCLUSION: Consistent with more traditional regression modeling theory, our findings highlight the importance of considering the nature of underlying gene-covariate-trait relationships before applying ML algorithms, particularly when there is potential confounding or effect mediation

    Impact of a multi-strategy community intervention to reduce maternal and child health inequalities in India : A qualitative study in Haryana

    A multi-strategy community intervention, known as the National Rural Health Mission (NRHM), was implemented in India from 2005 to 2012. Its aim was to reduce maternal and child health (MCH) inequalities by improving the availability of and access to better-quality healthcare. This study was planned to explore the perceptions and beliefs of stakeholders about the extent of implementation and the effectiveness of NRHM's health sector plans in improving MCH status and reducing inequalities. A total of 33 in-depth interviews with program managers, community representatives, and mothers, and 8 focus group discussions (n = 42) with health service providers, were conducted from September to December 2013 in Haryana, post NRHM. Using NVivo software (version 9), an inductive applied thematic analysis was done based upon grounded theory, program theory of change and a framework approach. Almost all the participants reported that there was an improvement in overall health infrastructure through an increased availability of accredited social health activists, free ambulance services, and free treatment facilities in rural areas. This had increased the demand for and utilization of MCH services, especially those related to institutional delivery, even by poor families. Service providers felt that an acute shortage of human resources was a major health system level barrier. District-specific individual, community, and socio-political level barriers were also observed. Overall, program managers, service providers and community representatives believed that NRHM had a role in improving MCH outcomes and in reducing geographical and socioeconomic inequalities, through improvement in accessibility, availability and affordability of the MCH services in the rural areas and for the poor. Any reduction in gender-based inequalities, however, was linked to the adoption of small family sizes and an increase in educational levels

    Multiple Imputation and Random Forests (MIRF) for Unobservable, High-Dimensional Data

    Understanding the genetic underpinnings of complex diseases requires consideration of sophisticated analytical methods designed to uncover intricate associations across multiple predictor variables. At the same time, knowledge of whether single nucleotide polymorphisms within a gene are on the same (in cis) or on different (in trans) chromosomal copies may provide crucial information about measures of disease progression. In association studies of unrelated individuals, allelic phase is generally unobservable, generating an additional analytical challenge. In this manuscript, we describe a novel approach that combines multiple imputation and random forests for this high-dimensional, unobservable data setting. An application to a cohort of HIV-1 infected individuals receiving anti-retroviral therapies is presented. A simulation study is also presented to characterize method performance

    Design sequences for sensory studies: achieving balance for carry-over and position effects.

    In sequences of human sensory assessments, the response to a stimulus may be influenced by previous stimuli. When investigating this phenomenon experimentally with several types or levels of stimulus, it is useful to have treatment sequences which are balanced for first-order carry-over effects. The requirement of balance for each experimental participant leads us to consider sequences of n symbols comprising an initial symbol followed by n 'blocks' each containing a permutation of the symbols. These sequences are designed to include all n² ordered pairs of symbols once each, and to have treatment and sequence position effects which are approximately orthogonal. Such sequences were suggested by Finney and Outhwaite (1956), who were able to find examples for particular values of n. We describe and illustrate a computer algorithm for systematically enumerating the sequences for those values of n for which they exist. Criteria are proposed for choosing between the sequences according to the nearness to orthogonality of their treatment and position effects
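The balance property described in this abstract can be checked mechanically. The following is a minimal sketch, not the authors' enumeration algorithm: it only tests whether a given sequence contains every ordered pair of symbols exactly once as adjacent items, and whether the items after the initial symbol split into blocks that are each a permutation of the symbols. It assumes a sequence with n² adjacent pairs has n² + 1 items; the function names and the n = 2 example are hypothetical.

```python
from itertools import product


def adjacent_pair_counts(seq):
    """Count how often each ordered pair (previous, current) occurs in seq."""
    counts = {}
    for prev, cur in zip(seq, seq[1:]):
        counts[(prev, cur)] = counts.get((prev, cur), 0) + 1
    return counts


def is_carryover_balanced(seq, symbols):
    """True if every ordered pair of symbols occurs exactly once as neighbours."""
    n = len(symbols)
    if len(seq) != n * n + 1:  # n^2 adjacent pairs need n^2 + 1 items
        return False
    counts = adjacent_pair_counts(seq)
    return all(counts.get(pair, 0) == 1 for pair in product(symbols, repeat=2))


def blocks_are_permutations(seq, symbols):
    """True if, after the initial symbol, seq splits into n blocks of n
    items, each block containing every symbol exactly once."""
    n = len(symbols)
    body = seq[1:]
    if len(body) != n * n:
        return False
    blocks = [body[i:i + n] for i in range(0, n * n, n)]
    return all(sorted(block) == sorted(symbols) for block in blocks)


# Hypothetical example with n = 2 symbols: initial 0, then blocks [0, 1] and [1, 0].
example = [0, 0, 1, 1, 0]
print(is_carryover_balanced(example, [0, 1]))    # True
print(blocks_are_permutations(example, [0, 1]))  # True
```

A brute-force enumeration could call these two checks on every candidate sequence, although that becomes impractical quickly as n grows.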

    Multiplicative models for combining information from several sensory experiments : A Bayesian analysis

    Consider the situation in which several quantitative sensory experiments are carried out on the same type of product, and the assessors in these experiments are drawn from a common pool. The data from such a sequence of experiments contain information on the relative biases and variability of individual assessors, and on any temporal influences on the experiments. This information can be extracted by extending models for individual experiments to encompass variation between them. Because each assessor takes part in several experiments, adjustment can be made for the absence of individuals from some of them. By including future experiments in the extended models, it is also possible to use information on assessors' previous performance to reduce the average variance of product differences in future. Such a combination of information over experiments is illustrated using a sequence of 45 apple-tasting experiments conducted with the main aim of monitoring assessor performance over time. Models with multiplicative interaction terms have been used for modelling heterogeneous interaction between assessors and products in individual sensory experiments. Under the assumption that the data are normally distributed (or can be suitably transformed) we extend such models to analyse data from sequences of experiments. A Bayesian approach is used because of the complexity of the extended model and the need to incorporate future experiments

    Serologic Responses in Childhood Pulmonary Tuberculosis.

    BACKGROUND: Identification of the Mycobacterium tuberculosis immunoproteome and antigens associated with serologic responses in adults has renewed interest in developing a serologic test for childhood tuberculosis (TB). We investigated IgG antibody responses against M. tuberculosis antigens in children with well-characterized TB. METHODS: We studied archived sera obtained from hospitalized children with suspected pulmonary TB, and classified as having confirmed TB (culture-confirmed), unlikely TB (clinical improvement without TB treatment), or unconfirmed TB (all others). A multiplexed bead-based assay for IgG antibodies against 119 M. tuberculosis antigens was developed, validated and used to test sera. The areas under the curves (AUC) of the empiric receiver-operator characteristic curves were generated as measures of predictive ability. A cross-validated generalized linear model was used to select the most predictive combinations of antigens. RESULTS: For the confirmed TB versus unlikely TB comparison, the maximal single antigen AUC was 0.63, corresponding to sensitivity 0.60 and specificity 0.60. Responses of older children (age 60+ months) were more predictive of TB status than those of younger children (age 12-59 months), with a maximal single antigen AUC of 0.76. For the confirmed TB versus unlikely TB groups, the most predictive combinations of antigens assigned TB risk probabilities of 0.33 and 0.33, respectively, when all ages were considered, and 0.57 (IQR 0.48, 0.64) and 0.35 (IQR 0.32, 0.40) when only older children were considered. CONCLUSION: An antigen-based IgG test is unlikely to meet the performance characteristics required of a TB detection test applicable to all age groups
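For readers unfamiliar with the empirical AUC used in this abstract, a minimal sketch follows. It computes the area under the empirical ROC curve as the proportion of (case, control) pairs in which the case has the higher score, counting ties as one half. This is a generic illustration, not the study's analysis code; the function name and the score values are hypothetical.

```python
def empirical_auc(case_scores, control_scores):
    """Empirical ROC AUC: fraction of (case, control) pairs where the case
    scores higher than the control, with ties counted as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical antibody signal values, for illustration only.
cases = [1.0, 3.0]
controls = [2.0, 4.0]
print(empirical_auc(cases, controls))  # 0.25
```

The result always lies between 0 and 1, with 0.5 corresponding to no discrimination between the two groups.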