PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference
Generalized linear models (GLMs) -- such as logistic regression, Poisson
regression, and robust regression -- provide interpretable models for diverse
data types. Probabilistic approaches, particularly Bayesian ones, allow
coherent estimates of uncertainty, incorporation of prior information, and
sharing of power across experiments via hierarchical models. In practice,
however, the approximate Bayesian methods necessary for inference have either
failed to scale to large data sets or failed to provide theoretical guarantees
on the quality of inference. We propose a new approach based on constructing
polynomial approximate sufficient statistics for GLMs (PASS-GLM). We
demonstrate that our method admits a simple algorithm as well as trivial
streaming and distributed extensions that do not compound error across
computations. We provide theoretical guarantees on the quality of point (MAP)
estimates, the approximate posterior, and posterior mean and uncertainty
estimates. We validate our approach empirically in the case of logistic
regression using a quadratic approximation and show competitive performance
with stochastic gradient descent, MCMC, and the Laplace approximation in terms
of speed and multiple measures of accuracy -- including on an advertising data
set with 40 million data points and 20,000 covariates.
Comment: In Proceedings of the 31st Annual Conference on Neural Information
Processing Systems (NIPS 2017). v3: corrected typos in Appendix
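The quadratic case described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes labels in {-1, +1}, a zero-mean isotropic Gaussian prior, and a least-squares quadratic fit to log sigmoid(s) on a hypothetical interval [-radius, radius] (the abstract does not say how the polynomial is chosen). All function and class names are invented for the example. Because the approximate log-likelihood depends on the data only through the sums of y*x and x*x^T, batches can be processed in a stream or merged across workers without changing the result.

```python
import numpy as np


def quadratic_coeffs(radius=4.0, grid=2001):
    """Least-squares fit b0 + b1*s + b2*s^2 to log sigmoid(s) on [-radius, radius].

    The interval and the least-squares criterion are assumptions of this sketch.
    """
    s = np.linspace(-radius, radius, grid)
    phi = -np.logaddexp(0.0, -s)  # numerically stable log sigmoid(s)
    b0, b1, b2 = np.polynomial.polynomial.polyfit(s, phi, 2)
    return b0, b1, b2


class QuadraticSuffStats:
    """Running sums that summarize the data for the quadratic approximation."""

    def __init__(self, dim):
        self.n = 0
        self.s1 = np.zeros(dim)         # sum of y_n * x_n
        self.s2 = np.zeros((dim, dim))  # sum of x_n x_n^T (y_n^2 = 1)

    def update(self, X, y):
        """Add one batch; X has shape (batch, dim), y has entries in {-1, +1}."""
        self.n += X.shape[0]
        self.s1 += X.T @ y
        self.s2 += X.T @ X

    def merge(self, other):
        """Combine statistics computed on another worker or an earlier batch."""
        self.n += other.n
        self.s1 += other.s1
        self.s2 += other.s2


def gaussian_posterior(stats, b1, b2, prior_var=1.0):
    """Mean and precision of the Gaussian implied by the quadratic log-likelihood."""
    dim = stats.s1.shape[0]
    # b2 is negative because log sigmoid is concave, so the precision is positive definite.
    precision = -2.0 * b2 * stats.s2 + np.eye(dim) / prior_var
    mean = np.linalg.solve(precision, b1 * stats.s1)
    return mean, precision


# Usage on synthetic data, processed in two batches.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 3))
theta_true = np.array([1.0, -1.0, 0.5])
p_demo = 1.0 / (1.0 + np.exp(-X_demo @ theta_true))
y_demo = np.where(rng.random(1000) < p_demo, 1.0, -1.0)

_, b1_demo, b2_demo = quadratic_coeffs()
stats_demo = QuadraticSuffStats(3)
stats_demo.update(X_demo[:500], y_demo[:500])
stats_demo.update(X_demo[500:], y_demo[500:])
post_mean, post_precision = gaussian_posterior(stats_demo, b1_demo, b2_demo)
```

A quadratic fit is only accurate when the inner products y*x.theta stay within the fitting interval, so the choice of `radius` matters in practice.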
Statistical Modeling of Epistasis and Linkage Decay using Logic Regression
Logic regression has been recognized as a tool that can identify and model non-additive genetic interactions using Boolean logic groups. Logic regression, TASSEL-GLM and SAS-GLM were compared for analytical precision using a previously characterized model system to identify the best genetic model explaining epistatic interaction of vernalization-sensitivity in barley. A genetic model containing two molecular markers identified in vernalization response in barley was selected using logic regression, while both TASSEL-GLM and SAS-GLM included spurious associations in their models. The results also suggest that logic regression can be used to identify dominant/recessive relationships between epistatic alleles through its use of conjugate operators.
Improved physiological noise regression in fNIRS: a multimodal extension of the General Linear Model using temporally embedded Canonical Correlation Analysis
For the robust estimation of evoked brain activity from functional Near-Infrared Spectroscopy (fNIRS) signals, it is crucial to reduce nuisance signals from systemic physiology and motion. The current best practice incorporates short-separation (SS) fNIRS measurements as regressors in a General Linear Model (GLM). However, several challenging signal characteristics such as non-instantaneous and non-constant coupling are not yet addressed by this approach and additional auxiliary signals are not optimally exploited. We have recently introduced a new methodological framework for the unsupervised multivariate analysis of fNIRS signals using Blind Source Separation (BSS) methods. Building onto the framework, in this manuscript we show how to incorporate the advantages of regularized temporally embedded Canonical Correlation Analysis (tCCA) into the supervised GLM. This approach allows flexible integration of any number of auxiliary modalities and signals. We provide guidance for the selection of optimal parameters and auxiliary signals for the proposed GLM extension. Its performance in the recovery of evoked HRFs is then evaluated using both simulated ground truth data and real experimental data and compared with the GLM with short-separation regression. Our results show that the GLM with tCCA significantly improves upon the current best practice, yielding significantly better results across all applied metrics: Correlation (HbO max. +45%), Root Mean Squared Error (HbO max. -55%), F-Score (HbO up to 3.25-fold) and p-value as well as power spectral density of the noise floor. The proposed method can be incorporated into the GLM in an easily applicable way that flexibly combines any available auxiliary signals into optimal nuisance regressors. 
This work has potential significance both for conventional neuroscientific fNIRS experiments as well as for emerging applications of fNIRS in everyday environments, medicine and BCI, where high Contrast to Noise Ratio is of importance for single trial analysis. Published version
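The baseline that this abstract compares against, a GLM with a short-separation channel as a nuisance regressor, can be sketched with ordinary least squares. The sketch below does not implement the tCCA extension. The signals are synthetic, the task regressor is a plain boxcar standing in for an HRF-convolved regressor, and every name and amplitude is an assumption made for illustration.

```python
import numpy as np


def fit_glm(y, task, nuisance):
    """Ordinary least squares with columns [task, nuisance, intercept]."""
    X = np.column_stack([task, nuisance, np.ones_like(task)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
t = np.arange(0.0, 300.0, 0.1)            # 300 s sampled at 10 Hz (assumed)
task = ((t % 30.0) < 10.0).astype(float)  # 10 s on, 20 s off boxcar
physio = np.sin(2.0 * np.pi * 0.1 * t)    # slow systemic oscillation (assumed)

# The short-separation channel sees the physiology but not the evoked response.
short_sep = physio + 0.1 * rng.normal(size=t.size)
# The long-separation channel mixes evoked response, physiology, and noise.
y = 0.5 * task + 2.0 * physio + 0.2 * rng.normal(size=t.size)

beta = fit_glm(y, task, short_sep)
evoked_amplitude = beta[0]  # estimate of the simulated amplitude 0.5
```

Here the nuisance regressor enters with a single instantaneous weight, which is the limitation the abstract attributes to the standard approach when coupling is non-instantaneous or non-constant.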
Comparative analysis of US real-world dosing patterns and direct infusion-related costs for matched cohorts of rheumatoid arthritis patients treated with infliximab or intravenous golimumab.
Purpose: The objectives of this study were to evaluate and compare treatment patterns and infusion-related health care resource expenditures for rheumatoid arthritis (RA) patients initiating golimumab for intravenous use (GLM-IV) and infliximab (IFX) therapy and to assess cost implications from the commercial perspective.
Methods: Adult RA patients with a new episode of GLM-IV or IFX treatment between January 1, 2014 and March 31, 2016 were identified from MarketScan databases and evaluated for maintenance infusion intervals and related costs of treatment. IFX and GLM-IV patients were matched 1:1 on index medication treatment duration, gender, payer type, prior biologic use, and post-index methotrexate use. Paid amounts for drugs and associated administration costs were applied to treatment group dosing patterns.
Results: Final matched treatment groups included 547 GLM-IV and 547 IFX patients (mean age = 55-56 years). Mean (SD) follow-up was 609 (161) days for GLM-IV and 613 (163) days for IFX. Treatment duration was 396 (240) days for GLM-IV and 397 (239) days for IFX. Overall, 80% of GLM-IV and 39% of IFX maintenance infusions were given approximately every 8 weeks; and 6% of GLM-IV and 53% of IFX maintenance infusions occurred more frequently than every 8 weeks (P<0.001). When weighting of the maintenance infusion interval was applied, the mean number of induction plus maintenance infusions during the first year of treatment was estimated at 7.03 for GLM-IV and 9.48 for IFX. From the commercial perspective, drug plus administration costs per infusion were 5,444 for IFX with total annual cost of therapy for GLM-IV patients costing 6,774 less than that for IFX patients in subsequent years.
Conclusion: Annual GLM-IV drug plus administration costs for commercial health plans were significantly less than IFX in RA patients due to differences in real-world dosing and administration. © 2019 Ellis et al
Linear Models for Multivariate Repeated Measures Data
We study the general linear model (GLM) with doubly exchangeable distributed error for m observed random variables. The doubly exchangeable linear model (DEGLM) arises when the m-dimensional error vectors are "doubly exchangeable" (defined later) and jointly normally distributed, which is a much weaker assumption than the independent and identically distributed error vectors assumed in the GLM or classical GLM (CGLM). We estimate the parameters in the model and also find their distributions.
Keywords: Multivariate repeated measures; Linear model; Replicated observations.
