3 research outputs found

    Generative Method to Discover Genetically Driven Image Biomarkers

    Abstract. We present a generative probabilistic approach to discovering disease subtypes determined by genetic variants. In many diseases, multiple types of pathology may be present simultaneously in a patient, making quantification of the disease challenging. Our method seeks common co-occurring image and genetic patterns in a population as a way to model these two different data types jointly. We assume that each patient is a mixture of multiple disease subtypes and use the joint generative model of image and genetic markers to identify disease subtypes guided by known genetic influences. Our model is based on a variant of the so-called topic models that uncover the latent structure in a collection of data. We derive an efficient variational inference algorithm to extract patterns of co-occurrence and to quantify the presence of heterogeneous disease processes in each patient. We evaluate the method on simulated data and illustrate its use in the context of Chronic Obstructive Pulmonary Disease (COPD) to characterize the relationship between image and genetic signatures of COPD subtypes in a large patient cohort.

    Spherical Topic Models for Imaging Phenotype Discovery in Genetic Studies

    In this paper, we use Spherical Topic Models to discover the latent structure of lung disease. This method can be widely employed when a measurement for each subject is provided as a normalized histogram of relevant features. The resulting descriptors are used as phenotypes to identify genetic markers associated with Chronic Obstructive Pulmonary Disease (COPD). Features extracted from images capture the heterogeneity of the disease and therefore promise to improve detection of relevant genetic variants in Genome Wide Association Studies (GWAS). Our generative model is based on normalized histograms of image intensity of each subject, and it can be readily extended to other forms of features as long as they are provided as normalized histograms. The resulting algorithm represents the intensity distribution as a combination of meaningful latent factors and mixing coefficients that can be used for genetic association analysis. This approach is motivated by a clinical hypothesis that COPD symptoms are caused by multiple coexisting disease processes. Our experiments show that the new features enhance the previously detected signal on chromosome 15 with respect to standard respiratory and imaging measurements.

    Funding: National Institutes of Health (U.S.) (National Institute for Biomedical Imaging and Bioengineering (U.S.)/National Alliance for Medical Image Computing (U.S.) U54-EB005149); National Institutes of Health (U.S.) (National Center for Research Resources (U.S.)/Neuroimaging Analysis Center (U.S.) P41-RR13218); National Institutes of Health (U.S.) (National Institute for Biomedical Imaging and Bioengineering (U.S.)/Neuroimaging Analysis Center (U.S.) P41-EB-015902); National Heart, Lung, and Blood Institute (R01HL089856); National Heart, Lung, and Blood Institute (R01HL089897); National Heart, Lung, and Blood Institute (K08HL097029); National Heart, Lung, and Blood Institute (R01HL113264)