4 research outputs found

    Regression for Pooled Testing Data with Biomedical Applications

    Since it was first introduced by Dorfman in 1943, pooled testing has been widely used as a cost- and time-effective testing protocol in a variety of applications. This dissertation consists of three projects that examine the use of pooling techniques in disease prevention from the perspective of regression. For disease monitoring and control, individual covariate information is often of practical interest and yields meaningful interpretations. It is natural to model the outcome of interest, which can be either a disease status (binary) or a biomarker concentration index (continuous), with individual-specific covariates through a regression analysis. Chapter 2 focuses on pooled biomarker assessment, where a pooling procedure is implemented to measure a continuous outcome of interest. A semi-parametric single-index model is developed to model the mean trend of biomarker concentration. In addition to pooled biomarker assessment, this dissertation also addresses group testing problems in infectious disease studies. In Chapter 3, we propose a multivariate logistic regression model for multiple-infection group testing data. To facilitate variable selection and model interpretation, we further develop a regularized approach that selects the active risk factors for each infection. Beyond significant cost savings, the pooling strategy provides more precise estimates of the biomarker mean curve (Chapter 2) and more accurate variable selection (Chapter 3). Given these benefits, and with the aim of promoting group testing in laboratories, Chapter 4 discusses how to simplify the pooled testing routine in practice without significantly impairing regression estimation.

    A Bayesian Model for Detection of High-order Interactions Among Genetic Variants in Genome-Wide Association Studies

    Background: A central question for disease studies and crop improvement is how genetic variants drive phenotypes. Genome-wide association studies (GWAS) provide a powerful tool for characterizing genotype-phenotype relationships in complex traits and diseases. Epistasis (gene-gene interaction), including high-order interaction among more than two genes, often plays an important role in complex traits and diseases, but current GWAS analysis usually focuses only on the additive effects of single nucleotide polymorphisms (SNPs). The lack of effective computational modelling of high-order functional interactions often leads to significant under-utilization of GWAS data. Results: We have developed a novel Bayesian computational method with a Markov Chain Monte Carlo (MCMC) search, and implemented the method as the Bayesian High-order Interaction Toolkit (BHIT) for detecting epistatic interactions among SNPs. BHIT first builds a Bayesian model on both continuous and discrete data, which is capable of detecting high-order interactions among SNPs related to case-control or quantitative phenotypes. We also developed a pipeline that enables users to apply BHIT to different species in different use cases. Conclusions: Using both simulation data and soybean nutritional seed composition studies on oil content and protein content, BHIT effectively detected some high-order interactions associated with phenotypes, and it outperformed a number of other available tools. BHIT is freely available for academic users at http://digbio.missouri.edu/BHIT/

    Single-index regression for pooled biomarker data

    Laboratory assays used to evaluate biomarkers (biological markers) are often prohibitively expensive. As an efficient data collection mechanism to save on testing costs, pooling has become more commonly used in epidemiological research. Useful statistical methods have been proposed to relate pooled biomarker measurements to individual covariate information. However, most of these regression techniques have proceeded under parametric linear assumptions. To relax such assumptions, we propose a semiparametric approach that originates from the context of the single-index model. Unlike with traditional single-index methodologies, we face a challenge in that the observed data are biomarker measurements on pools rather than individual specimens. In this article, we propose a method that addresses this challenge. The asymptotic properties of our estimators are derived. We illustrate the finite sample performance of our estimators through simulation and by applying them to a diabetes data set and a chemokine data set.

    Single-index regression for pooled biomarker data
