6 research outputs found
Scalable Statistical Methods for Cell Type Deconvolution and Mixed Models Applied to High Dimensional Genomic Data
Utilizing genomic data in the clinical setting provides new opportunities for biomarker discovery, disease characterization, and personalizing treatment, but also poses new statistical challenges. In the first part of the dissertation, we propose a new computational method, IsoDeconvMM, which estimates cell type fractions using isoform-level RNA-seq gene expression data one gene at a time. The cell type composition of a tissue sample may itself be of interest and is needed for proper analysis of differential gene expression of heterogeneous tissues. Although a variety of existing computational methods estimate cell type proportions using gene-level expression data, isoform-level expression could be equally or more informative for determining cell type origin. In genomics datasets as well as many other modern biomedical datasets, the data are increasingly high dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. In the second part of this dissertation, we implement several statistical and computational innovations to improve the speed of a high dimensional penalized GLMM framework for simultaneously selecting fixed and random effects, resulting in the efficient R package glmmPen. Although this framework extends the feasible dimensionality of GLMMs relative to existing methods, new methodology is needed to alleviate computational burden as the dimension increases and allow scalability to hundreds of predictors. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in higher dimensions by reducing the latent space from a large number of random effects to a smaller set of common factors. 
We extend our prior work to estimate model parameters and perform simultaneous selection of fixed and random effects using a modified version of the Monte Carlo Expectation Conditional Minimization (MCECM) algorithm. We show that through this factor model decomposition, we can improve the speed and scalability of fitting high dimensional penalized GLMMs. Finally, we extend our high dimensional penalized GLMM framework to survival outcome data. We approximate proportional hazards mixed effects models using piecewise constant hazards mixed effects survival models.
Doctor of Philosophy
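The piecewise constant hazards approximation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the cut points, the hazard rates, and the single `linear_predictor` argument (standing in for the combined fixed and random effects) are all assumptions chosen for illustration.

```python
import math


def cumulative_hazard(t, cut_points, hazards):
    """Cumulative baseline hazard H0(t) under a piecewise constant hazard.

    cut_points: increasing interval start times, beginning at 0 (hypothetical).
    hazards: hazard rate on each interval [cut_points[j], cut_points[j + 1]);
             the last interval is assumed to extend to infinity.
    """
    total = 0.0
    for j, rate in enumerate(hazards):
        start = cut_points[j]
        end = cut_points[j + 1] if j + 1 < len(cut_points) else math.inf
        if t <= start:
            break
        # Add the hazard accumulated over the part of this interval before t.
        total += rate * (min(t, end) - start)
    return total


def survival(t, cut_points, hazards, linear_predictor=0.0):
    """S(t) = exp(-H0(t) * exp(eta)) under proportional hazards.

    linear_predictor (eta) is a placeholder for fixed plus random effects.
    """
    return math.exp(-cumulative_hazard(t, cut_points, hazards) * math.exp(linear_predictor))
```

For example, with hypothetical cut points `[0, 1, 3]` and rates `[0.1, 0.2, 0.5]`, the cumulative hazard at time 2 is 0.1 × 1 + 0.2 × 1 = 0.3. Because the hazard is constant within each interval, the survival likelihood takes a form that can be handled with Poisson-type GLMM machinery, which is the usual motivation for this kind of approximation.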
glmmPen: High Dimensional Penalized Generalized Linear Mixed Models
Generalized linear mixed models (GLMMs) are widely used in research for their
ability to model correlated outcomes with non-Gaussian conditional
distributions. The proper selection of fixed and random effects is a critical
part of the modeling process, where model misspecification may lead to
significant bias. However, the joint selection of fixed and random effects
has historically been limited to lower dimensional GLMMs, largely due to the
use of criterion-based model selection strategies. Here we present the R
package glmmPen, one of the first to select fixed and random effects in
higher dimensions using a penalized GLMM modeling framework. Model parameters
are estimated using a Monte Carlo expectation conditional minimization (MCECM)
algorithm, which leverages Stan and RcppArmadillo for increased computational
efficiency. Our package supports multiple distributional families and penalty
functions. In this manuscript we discuss the modeling procedure, estimation
scheme, and software implementation through application to a pancreatic cancer
subtyping study.
Efficient Computation of High-Dimensional Penalized Generalized Linear Mixed Models by Latent Factor Modeling of the Random Effects
Modern biomedical datasets are increasingly high dimensional and exhibit
complex correlation structures. Generalized Linear Mixed Models (GLMMs) have
long been employed to account for such dependencies. However, proper
specification of the fixed and random effects in GLMMs is increasingly
difficult in high dimensions, and computational complexity grows with
increasing dimension of the random effects. We present a novel reformulation of
the GLMM using a factor model decomposition of the random effects, enabling
scalable computation of GLMMs in high dimensions by reducing the latent space
from a large number of random effects to a smaller set of latent factors. We
also extend our prior work to estimate model parameters using a modified Monte
Carlo Expectation Conditional Minimization algorithm, allowing us to perform
variable selection on both the fixed and random effects simultaneously. We show
through simulation that, with this factor model decomposition, our method can
fit high dimensional penalized GLMMs faster than comparable methods and more
easily scale to larger dimensions not previously seen in existing approaches.
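The dimension reduction described in this abstract can be sketched numerically. The sketch below assumes the random effects for each group are written as a loading matrix times a small vector of standard normal latent factors; that specific form, the names `B`, `alpha`, and `gamma`, and the sizes used are assumptions for illustration and are not taken from the paper.

```python
import numpy as np


def simulate_factor_random_effects(q, r, n_groups, seed=0):
    """Draw q-dimensional random effects from r latent factors per group.

    Assumed form: gamma_k = B @ alpha_k with alpha_k ~ N(0, I_r), so the
    implied covariance of gamma_k is B @ B.T, which has rank at most r.
    """
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(q, r))             # hypothetical loading matrix
    alpha = rng.normal(size=(n_groups, r))  # latent common factors
    gamma = alpha @ B.T                     # random effects, shape (n_groups, q)
    return B, alpha, gamma


def covariance_parameter_counts(q, r):
    """Free parameters: unstructured q x q covariance vs. q x r loadings."""
    return {"unstructured": q * (q + 1) // 2, "factor": q * r}
```

With hypothetical sizes q = 50 random effects and r = 3 factors, the latent space sampled per group shrinks from 50 dimensions to 3, and the covariance is described by 150 loadings instead of 1275 unstructured entries.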
Exploring the Acceptability of Text Messages to Inform and Support Shared Decision-making for Colorectal Cancer Screening: Online Panel Survey
Background: While online portals may be helpful to engage patients in shared decision-making at the time of cancer screening, because of known disparities in patient portal use, sole reliance on portals to support cancer screening decision-making could exacerbate well-known disparities in this health care area. Innovative approaches are needed to engage patients in health care decision-making and to support equitable shared decision-making.
Objective: We assessed the acceptability of text messages to engage sociodemographically diverse individuals in colorectal cancer (CRC) screening decisions and support shared decision-making in practice.
Methods: We developed a brief text message program offering educational information consisting of components of shared decision-making regarding CRC screening (eg, for whom screening is recommended, screening test options, and pros/cons of options). The program and postprogram survey were offered to members of an online panel. The outcome of interest was program acceptability measured by observed program engagement, participant-reported acceptability, and willingness to use similar programs (behavioral intent). We evaluated acceptability among historically marginalized categories of people defined by income, literacy, and race.
Results: Of the 289 participants, 115 reported having a low income, 146 were Black/African American, and 102 had less than extreme confidence in their health literacy. With one exception, we found equal or greater acceptability, regardless of measure, within each of the marginalized categories of people compared to their counterparts. The exception was that participants reporting an income below US $50,000 were less likely to engage with sufficient content of the program to learn that there was a choice among different CRC screening tests (difference –10.4%, 95% CI –20.1 to –0.8). Of note, Black/African American participants reported being more likely to sign up to receive text messages from their doctor’s office compared to white participants (difference 18.7%, 95% CI 7.0-30.3).
Conclusions: Study findings demonstrate general acceptance of text messages to inform and support CRC screening shared decision-making.
Randomly organized lipids and marginally stable proteins: A coupling of weak interactions to optimize membrane signaling
Eukaryotic lipids in a bilayer are dominated by weak cooperative interactions. These interactions impart highly dynamic and pliable properties to the membrane. C2 domain-containing proteins in the membrane also interact weakly and cooperatively, giving rise to a high degree of conformational plasticity. We propose that this feature of weak energetics and plasticity shared by lipids and C2 domain-containing proteins enhances a cell's ability to transduce information across the membrane. We explored this hypothesis using information theory to assess the information storage capacity of model and mast cell membranes, as well as differential scanning calorimetry, carboxyfluorescein release assays, and tryptophan fluorescence to assess protein and membrane stability. The distribution of lipids in mast cell membranes encoded 5.6–5.8 bits of information. More information resided in the acyl chains than the head groups and in the inner leaflet of the plasma membrane than the outer leaflet. When the lipid composition and information content of model membranes were varied, the associated C2 domains underwent large changes in stability and denaturation profile. The C2 domain-containing proteins are therefore acutely sensitive to the composition and information content of their associated lipids. Together, these findings suggest that the maximum flow of signaling information through the membrane and into the cell is optimized by the cooperation of near-random distributions of membrane lipids and proteins. This article is part of a Special Issue entitled: Interfacially Active Peptides and Proteins. Guest Editors: William C. Wimley and Kalina Hristova.
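One way to read an information content expressed in bits is as the Shannon entropy of the lipid composition. The abstract does not state which measure the authors used, so the sketch below is an assumption, and the function name and any input fractions are hypothetical rather than data from the study.

```python
import math


def shannon_entropy_bits(fractions):
    """Shannon entropy, in bits, of a composition given as fractions or counts.

    Inputs are normalized to sum to 1; zero entries contribute nothing.
    """
    total = sum(fractions)
    probs = [f / total for f in fractions if f > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Under this reading, four equally abundant species carry exactly 2 bits, and a value of 5.6 to 5.8 bits would match a uniform distribution over roughly 49 to 56 species (2^5.6 ≈ 48.5, 2^5.8 ≈ 55.7). Real compositions are uneven, so more species would be needed to reach the same entropy.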
Patient-reported outcomes in CD30-directed CAR-T cells against relapsed/refractory CD30+ lymphomas
Chimeric antigen receptor (CAR)-T cells targeting CD30 have demonstrated high response rates with durable remissions observed in a subset of patients with relapsed/refractory CD30+ hematologic malignancies, particularly classical Hodgkin lymphoma. This therapy has low rates of toxicity including cytokine release syndrome with no neurotoxicity observed in our phase 2 study. We collected patient-reported outcomes (PROs) on patients treated with CD30-directed CAR-T cells to evaluate the impact of this therapy on their symptom experience. We collected PROs including PROMIS (Patient-Reported Outcomes Measurement Information System) Global Health and Physical Function questionnaires and selected symptom questions from the NCI PRO-CTCAE in patients enrolled on our clinical trial of CD30-directed CAR-T cells at procurement, at time of CAR-T cell infusion, and at various time points post treatment. We compared PROMIS scores and overall symptom burden between pre-procurement, time of infusion, and at 4 weeks post infusion. At least one PRO measurement during the study period was found in 23 out of the 28 enrolled patients. Patient overall symptom burden, global health and mental health, and physical function were at or above baseline levels at 4 weeks post CAR-T cell infusion. In addition, PROMIS scores for patients who participated in the clinical trial were similar to the average healthy population. CD30 CAR-T cell therapy has a favorable toxicity profile with patient physical function and symptom burden recovering to at least their baseline pretreatment health by 1 month post infusion. Trial registration number: NCT02690545.