6 research outputs found

    Scalable Statistical Methods for Cell Type Deconvolution and Mixed Models Applied to High Dimensional Genomic Data

    Utilizing genomic data in the clinical setting provides new opportunities for biomarker discovery, disease characterization, and personalizing treatment, but also poses new statistical challenges. In the first part of the dissertation, we propose a new computational method, IsoDeconvMM, which estimates cell type fractions using isoform-level RNA-seq gene expression data one gene at a time. The cell type composition of a tissue sample may itself be of interest and is needed for proper analysis of differential gene expression of heterogeneous tissues. Although a variety of existing computational methods estimate cell type proportions using gene-level expression data, isoform-level expression could be equally or more informative for determining cell type origin. In genomics datasets as well as many other modern biomedical datasets, the data are increasingly high dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. In the second part of this dissertation, we implement several statistical and computational innovations to improve the speed of a high dimensional penalized GLMM framework for simultaneously selecting fixed and random effects, resulting in the efficient R package glmmPen. Although this framework extends the feasible dimensionality of GLMMs relative to existing methods, new methodology is needed to alleviate computational burden as the dimension increases and allow scalability to hundreds of predictors. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in higher dimensions by reducing the latent space from a large number of random effects to a smaller set of common factors. 
We extend our prior work to estimate model parameters and perform simultaneous selection of fixed and random effects using a modified version of the Monte Carlo Expectation Conditional Minimization (MCECM) algorithm. We show that this factor model decomposition improves the speed and scalability of fitting high dimensional penalized GLMMs. Finally, we extend our framework for fitting high dimensional penalized generalized linear mixed models to survival outcome data. We approximate proportional hazards mixed effects models using piecewise constant hazards mixed effects survival models. Doctor of Philosophy

    glmmPen: High Dimensional Penalized Generalized Linear Mixed Models

    Generalized linear mixed models (GLMMs) are widely used in research for their ability to model correlated outcomes with non-Gaussian conditional distributions. The proper selection of fixed and random effects is a critical part of the modeling process, where model misspecification may lead to significant bias. However, the joint selection of fixed and random effects has historically been limited to lower dimensional GLMMs, largely due to the use of criterion-based model selection strategies. Here we present the R package glmmPen, one of the first to select fixed and random effects in higher dimensions using a penalized GLMM modeling framework. Model parameters are estimated using a Monte Carlo expectation conditional minimization (MCECM) algorithm, which leverages Stan and RcppArmadillo for increased computational efficiency. Our package supports multiple distributional families and penalty functions. In this manuscript we discuss the modeling procedure, estimation scheme, and software implementation through application to a pancreatic cancer subtyping study.

    Efficient Computation of High-Dimensional Penalized Generalized Linear Mixed Models by Latent Factor Modeling of the Random Effects

    Modern biomedical datasets are increasingly high dimensional and exhibit complex correlation structures. Generalized Linear Mixed Models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that, with this factor model decomposition, our method can fit high dimensional penalized GLMMs faster than comparable methods and scale more easily to larger dimensions not previously seen in existing approaches.
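One common way to write a factor model decomposition of the random effects is sketched below. The notation (groups k, loading matrix B, latent factors alpha_k, dimensions q and r) is assumed here for illustration and is not taken from the abstract, so the paper's own formulation may differ in detail.

```latex
% Standard GLMM for observation i in group k, with q random effects:
g\bigl(\mathbb{E}[y_{ki} \mid \gamma_k]\bigr) = x_{ki}^{\top}\beta + z_{ki}^{\top}\gamma_k,
\qquad \gamma_k \sim N_q(0, \Sigma)

% Assumed factor decomposition: q random effects driven by r << q latent factors
\gamma_k = B\,\alpha_k, \qquad \alpha_k \sim N_r(0, I_r), \qquad B \in \mathbb{R}^{q \times r}

% Implied random-effects covariance and linear predictor
\Sigma = B B^{\top}, \qquad
g\bigl(\mathbb{E}[y_{ki} \mid \alpha_k]\bigr) = x_{ki}^{\top}\beta + z_{ki}^{\top} B\,\alpha_k
```

Under this form, any Monte Carlo step integrates over the r-dimensional alpha_k rather than the q-dimensional gamma_k, which is where the reduction of the latent space described in the abstract would come from.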

    Exploring the Acceptability of Text Messages to Inform and Support Shared Decision-making for Colorectal Cancer Screening: Online Panel Survey

    Background: While online portals may be helpful to engage patients in shared decision-making at the time of cancer screening, because of known disparities in patient portal use, sole reliance on portals to support cancer screening decision-making could exacerbate well-known disparities in this health care area. Innovative approaches are needed to engage patients in health care decision-making and to support equitable shared decision-making. Objective: We assessed the acceptability of text messages to engage sociodemographically diverse individuals in colorectal cancer (CRC) screening decisions and support shared decision-making in practice. Methods: We developed a brief text message program offering educational information consisting of components of shared decision-making regarding CRC screening (eg, for whom screening is recommended, screening test options, and pros/cons of options). The program and postprogram survey were offered to members of an online panel. The outcome of interest was program acceptability measured by observed program engagement, participant-reported acceptability, and willingness to use similar programs (behavioral intent). We evaluated acceptability among historically marginalized categories of people defined by income, literacy, and race. Results: Of the 289 participants, 115 reported having a low income, 146 were Black/African American, and 102 had less than extreme confidence in their health literacy. With one exception, we found equal or greater acceptability, regardless of measure, within each of the marginalized categories of people compared to their counterparts. The exception was that participants reporting an income below US $50,000 were less likely to engage with sufficient content of the program to learn that there was a choice among different CRC screening tests (difference –10.4%, 95% CI –20.1 to –0.8).
Of note, Black/African American participants reported being more likely to sign up to receive text messages from their doctor’s office compared to white participants (difference 18.7%, 95% CI 7.0-30.3). Conclusions: Study findings demonstrate general acceptance of text messages to inform and support CRC screening shared decision-making.

    Randomly organized lipids and marginally stable proteins: A coupling of weak interactions to optimize membrane signaling

    Eukaryotic lipids in a bilayer are dominated by weak cooperative interactions. These interactions impart highly dynamic and pliable properties to the membrane. C2 domain-containing proteins in the membrane also interact weakly and cooperatively, giving rise to a high degree of conformational plasticity. We propose that this feature of weak energetics and plasticity shared by lipids and C2 domain-containing proteins enhances a cell's ability to transduce information across the membrane. We explored this hypothesis using information theory to assess the information storage capacity of model and mast cell membranes, as well as differential scanning calorimetry, carboxyfluorescein release assays, and tryptophan fluorescence to assess protein and membrane stability. The distribution of lipids in mast cell membranes encoded 5.6–5.8 bits of information. More information resided in the acyl chains than the head groups and in the inner leaflet of the plasma membrane than the outer leaflet. When the lipid composition and information content of model membranes were varied, the associated C2 domains underwent large changes in stability and denaturation profile. The C2 domain-containing proteins are therefore acutely sensitive to the composition and information content of their associated lipids. Together, these findings suggest that the maximum flow of signaling information through the membrane and into the cell is optimized by the cooperation of near-random distributions of membrane lipids and proteins. This article is part of a Special Issue entitled: Interfacially Active Peptides and Proteins. Guest Editors: William C. Wimley and Kalina Hristova.
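A figure in bits for a lipid distribution can be illustrated with Shannon entropy over the relative abundances of lipid species. The abstract does not say exactly how its information content was computed, so the sketch below assumes plain Shannon entropy; the function name and the example inputs are hypothetical and are not data from the study.

```python
import math


def shannon_entropy_bits(abundances):
    """Shannon entropy, in bits, of a composition given as non-negative abundances.

    Assumption: information content is taken as H = -sum(p_i * log2(p_i)),
    where p_i is the fraction of species i. Inputs are normalized to sum to 1.
    """
    if any(a < 0 for a in abundances):
        raise ValueError("abundances must be non-negative")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one abundance must be positive")
    # Species with zero abundance contribute nothing to the sum.
    probs = [a / total for a in abundances if a > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical compositions, for illustration only.
uniform_four = [1, 1, 1, 1]        # four equally abundant species
skewed_four = [97, 1, 1, 1]        # one dominant species

print(shannon_entropy_bits(uniform_four))
print(shannon_entropy_bits(skewed_four))
```

Under this definition a uniform distribution over 2^n species has an entropy of exactly n bits, and any skew toward a few dominant species lowers the value.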

    Patient-reported outcomes in CD30-directed CAR-T cells against relapsed/refractory CD30+ lymphomas

    Chimeric antigen receptor (CAR)-T cells targeting CD30 have demonstrated high response rates with durable remissions observed in a subset of patients with relapsed/refractory CD30+ hematologic malignancies, particularly classical Hodgkin lymphoma. This therapy has low rates of toxicity including cytokine release syndrome with no neurotoxicity observed in our phase 2 study. We collected patient-reported outcomes (PROs) on patients treated with CD30-directed CAR-T cells to evaluate the impact of this therapy on their symptom experience. We collected PROs including PROMIS (Patient-Reported Outcomes Measurement Information System) Global Health and Physical Function questionnaires and selected symptom questions from the NCI PRO-CTCAE in patients enrolled on our clinical trial of CD30-directed CAR-T cells at procurement, at time of CAR-T cell infusion, and at various time points post treatment. We compared PROMIS scores and overall symptom burden between pre-procurement, time of infusion, and 4 weeks post infusion. At least one PRO measurement during the study period was found in 23 out of the 28 enrolled patients. Patient overall symptom burden, global health and mental health, and physical function were at or above baseline levels at 4 weeks post CAR-T cell infusion. In addition, PROMIS scores for patients who participated in the clinical trial were similar to the average healthy population. CD30 CAR-T cell therapy has a favorable toxicity profile with patient physical function and symptom burden recovering to at least their baseline pretreatment health by 1 month post infusion. Trial registration number: NCT02690545.