Boosting Black Box Variational Inference
Approximating a probability density in a tractable manner is a central task
in Bayesian statistics. Variational Inference (VI) is a popular technique that
achieves tractability by choosing a relatively simple variational family.
Borrowing ideas from the classic boosting framework, recent approaches attempt to boost VI by replacing the selection of a single density with a greedily constructed mixture of densities. In order to guarantee convergence,
previous works impose stringent assumptions that require significant effort for
practitioners. Specifically, they require a custom implementation of the greedy step (the linear minimization oracle, LMO) for every probabilistic model, with respect to an unnatural variational family of truncated distributions. Our work fixes these
issues with novel theoretical and algorithmic insights. On the theoretical
side, we show that boosting VI satisfies a relaxed smoothness assumption which
is sufficient for the convergence of the functional Frank-Wolfe (FW) algorithm.
Furthermore, we rephrase the LMO problem and propose to maximize the Residual ELBO (RELBO), which replaces the standard ELBO optimization in VI. These theoretical enhancements allow for a black box implementation of the boosting
subroutine. Finally, we present a stopping criterion drawn from the duality gap
in the classic FW analyses and exhaustive experiments to illustrate the
usefulness of our theoretical and algorithmic contributions.
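As a concrete illustration of this recipe, the following is a minimal sketch of the greedy boosting loop with a RELBO-style LMO. The one-dimensional Gaussian components, the Monte Carlo grid-search LMO, and the reuse of the RELBO value as a rough duality-gap proxy are all simplifying assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch: boosted VI as functional Frank-Wolfe with a residual-ELBO
# greedy step. All names and simplifications here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def log_p(x):
    # Unnormalized bimodal target we want to approximate.
    return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

def log_q(x, comps, weights):
    # Log-density of the current Gaussian mixture iterate q_t.
    mus, sigmas = comps[:, 0], comps[:, 1]
    logs = (-0.5 * ((x[:, None] - mus) / sigmas) ** 2
            - np.log(sigmas * np.sqrt(2 * np.pi)) + np.log(weights))
    return np.logaddexp.reduce(logs, axis=1)

def relbo_lmo(comps, weights, n_mc=256):
    # Black-box greedy step: pick the Gaussian s maximizing a Monte Carlo
    # estimate of E_s[log p - log q_t] + H(s) over a coarse (mu, sigma) grid.
    best, best_val = None, -np.inf
    for mu in np.linspace(-4, 4, 17):
        for sigma in (0.5, 1.0, 2.0):
            z = mu + sigma * rng.standard_normal(n_mc)
            entropy = 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)
            val = np.mean(log_p(z) - log_q(z, comps, weights)) + entropy
            if val > best_val:
                best, best_val = (mu, sigma), val
    return np.array(best), best_val

comps = np.array([[0.0, 3.0]])          # start from one broad component
weights = np.array([1.0])
for t in range(10):
    s, gap_proxy = relbo_lmo(comps, weights)
    if gap_proxy < 1e-2:                # duality-gap style stopping criterion
        break
    gamma = 2.0 / (t + 3.0)             # classic 2/(t+2) schedule, shifted by
    comps = np.vstack([comps, s])       # one to keep the initial component
    weights = np.concatenate([(1 - gamma) * weights, [gamma]])
print(comps, weights)
```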
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization
We propose a novel Stochastic Frank-Wolfe (a.k.a. conditional gradient)
algorithm for constrained smooth finite-sum minimization with a generalized
linear prediction/structure. This class of problems includes empirical risk
minimization with sparse, low-rank, or other structured constraints. The
proposed method is simple to implement, does not require step-size tuning, and
has a constant per-iteration cost that is independent of the dataset size.
Furthermore, as a byproduct of the method we obtain a stochastic estimator of
the Frank-Wolfe gap that can be used as a stopping criterion. Depending on the
setting, the proposed method matches or improves on the best computational
guarantees for Stochastic Frank-Wolfe algorithms. Benchmarks on several
datasets highlight different regimes in which the proposed method exhibits a
faster empirical convergence than related methods. Finally, we provide an
implementation of all considered methods in an open-source package.
Comment: To appear in the Proceedings of the 37th International Conference on Machine Learning, 2020. Main text: 9 pages, 1 figure. Fixes previously found errors.
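The constant per-iteration cost and the gap-based stopping rule can be sketched as follows for an L1-ball-constrained logistic regression. The cached per-example scalar gradients follow the spirit of the generalized linear structure described above, while the variable names and the logistic loss are illustrative choices rather than the paper's exact setup.

```python
# Minimal sketch: stochastic Frank-Wolfe for min_w (1/n) sum_i loss(a_i @ w, b_i)
# over an L1 ball, with O(d) work per iteration independent of the dataset size.
import numpy as np

rng = np.random.default_rng(0)
n, d, radius = 1000, 50, 5.0
A = rng.standard_normal((n, d))
b = rng.integers(0, 2, n) * 2.0 - 1.0   # labels in {-1, +1}

def dloss(z, y):
    # Derivative of the logistic loss log(1 + exp(-y z)) w.r.t. z.
    return -y / (1.0 + np.exp(y * z))

w = np.zeros(d)
alpha = dloss(A @ w, b)                 # cached per-example scalar gradients
g = A.T @ alpha / n                     # running full-gradient estimate
for t in range(2000):
    i = rng.integers(n)                 # sample one example per iteration
    new_alpha = dloss(A[i] @ w, b[i])
    g += A[i] * (new_alpha - alpha[i]) / n
    alpha[i] = new_alpha
    # LMO over the L1 ball: best vertex puts +/- radius on the largest |g_k|.
    k = np.argmax(np.abs(g))
    s = np.zeros(d); s[k] = -radius * np.sign(g[k])
    gap = g @ (w - s)                   # stochastic Frank-Wolfe gap estimate
    if gap < 1e-3:                      # the gap doubles as a stopping rule
        break
    w += 2.0 / (t + 2.0) * (s - w)
print(f"iterations: {t}, gap estimate: {gap:.4f}")
```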
ACE: A fast, skillful learned global atmospheric model for climate prediction
Existing ML-based atmospheric models are not suitable for climate prediction,
which requires long-term stability and physical consistency. We present ACE
(AI2 Climate Emulator), a 200M-parameter, autoregressive machine learning
emulator of an existing comprehensive 100-km resolution global atmospheric
model. The formulation of ACE allows evaluation of physical laws such as the
conservation of mass and moisture. The emulator is stable for 100 years, nearly
conserves column moisture without explicit constraints and faithfully
reproduces the reference model's climate, outperforming a challenging baseline
on over 90% of tracked variables. ACE requires nearly 100x less wall clock time
and is 100x more energy efficient than the reference model using typically
available resources. Without fine-tuning, ACE can stably generalize to a
previously unseen historical sea surface temperature dataset.
Comment: Accepted at Tackling Climate Change with Machine Learning: workshop at NeurIPS 2023.
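A toy sketch of the kind of autoregressive rollout and conservation diagnostic described above; emulator_step and the grid layout are hypothetical placeholders, not the ACE interface.

```python
# Minimal sketch: roll a learned emulator forward autoregressively and check
# how well a column-integrated quantity is conserved over a long run.
import numpy as np

rng = np.random.default_rng(0)

def emulator_step(state):
    # Stand-in for one learned 6-hourly update of the atmospheric state;
    # the real emulator is a trained 200M-parameter network.
    return state + 0.001 * rng.standard_normal(state.shape)

state = np.ones((8, 16, 32))            # toy (level, lat, lon) grid
initial_moisture = state.sum(axis=0)    # column-integrated moisture proxy
for _ in range(4 * 365):                # one simulated year at 6-hour steps
    state = emulator_step(state)
drift = np.abs(state.sum(axis=0) - initial_moisture).mean()
print(f"mean column-moisture drift after one year: {drift:.4f}")
```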
Comprehensive molecular characterization of gastric adenocarcinoma
Gastric cancer is a leading cause of cancer deaths, but analysis of its molecular and clinical characteristics has been complicated by histological and aetiological heterogeneity. Here we describe a comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project. We propose a molecular classification dividing gastric cancer into four subtypes: tumours positive for Epstein–Barr virus, which display recurrent PIK3CA mutations, extreme DNA hypermethylation, and amplification of JAK2, CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2); microsatellite unstable tumours, which show elevated mutation rates, including mutations of genes encoding targetable oncogenic signalling proteins; genomically stable tumours, which are enriched for the diffuse histological variant and mutations of RHOA or fusions involving RHO-family GTPase-activating proteins; and tumours with chromosomal instability, which show marked aneuploidy and focal amplification of receptor tyrosine kinases. Identification of these subtypes provides a roadmap for patient stratification and trials of targeted therapies.
Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification within and across Tissues of Origin
Recent genomic analyses of pathologically defined tumor types identify “within-a-tissue” disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head & neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multi-platform classification, while correlated with tissue-of-origin, provides independent information for predicting clinical outcomes. All datasets are available for data-mining from a unified resource to support further biological discoveries and insights into novel therapeutic strategies.
ai2cm/ace: 2023.12.0
Inference code for the model described in https://arxiv.org/abs/2310.02074
Sparse Gaussian Processes on Discrete Domains
Kernel methods on discrete domains have shown great promise for many challenging data types, for instance, biological sequence data and molecular structure data. Scalable kernel methods like Support Vector Machines may offer good predictive performance but do not intrinsically provide uncertainty estimates. In contrast, probabilistic kernel methods like Gaussian Processes offer uncertainty estimates in addition to good predictive performance but fall short in terms of scalability. While the scalability of Gaussian processes can be improved using sparse inducing point approximations, the selection of these inducing points remains challenging. We explore different techniques for selecting inducing points on discrete domains, including greedy selection, determinantal point processes, and simulated annealing. We find that simulated annealing, which can select inducing points that are not in the training set, can perform competitively with support vector machines and full Gaussian processes on synthetic data, as well as on challenging real-world DNA sequence data.
ISSN: 2169-3536
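A minimal sketch of the simulated-annealing selection described above, under illustrative assumptions: a Hamming kernel on binary strings and a log-determinant diversity objective standing in for the sparse GP ELBO.

```python
# Minimal sketch: simulated annealing over inducing points on a discrete
# (binary string) domain. Objective and kernel are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
L, m = 20, 8                            # string length, number of inducing points

def kernel(X, Y):
    # Hamming kernel: similarity decays with the number of mismatches.
    mismatches = (X[:, None, :] != Y[None, :, :]).sum(-1)
    return np.exp(-0.5 * mismatches)

def objective(Z):
    # Diversity proxy: log-determinant of the inducing-point Gram matrix.
    K = kernel(Z, Z) + 1e-6 * np.eye(len(Z))
    return np.linalg.slogdet(K)[1]

Z = rng.integers(0, 2, (m, L))          # start from random binary strings
best = objective(Z)
for step in range(2000):
    temp = 1.0 / (1 + step / 100)       # annealing schedule
    # Propose flipping one character of one inducing point; crucially, the
    # proposal need not be a training sequence.
    cand = Z.copy()
    cand[rng.integers(m), rng.integers(L)] ^= 1
    val = objective(cand)
    if val > best or rng.random() < np.exp((val - best) / temp):
        Z, best = cand, val
print(f"final objective: {best:.3f}")
```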
Boosting Variational Inference With Locally Adaptive Step-Sizes
Variational Inference makes a trade-off between the capacity of the variational family and the tractability of finding an approximate posterior distribution. Instead, Boosting Variational Inference allows practitioners to obtain increasingly good posterior approximations by spending more compute. The main obstacle to widespread adoption of Boosting Variational Inference is the amount of resources necessary to improve over a strong Variational Inference baseline. In our work, we trace this limitation back to the global curvature of the KL-divergence. We characterize how the global curvature impacts time and memory consumption, address the problem with the notion of local curvature, and provide a novel approximate backtracking algorithm for estimating local curvature. We give new theoretical convergence rates for our algorithms and provide experimental validation on synthetic and real-world datasets.
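The local-curvature idea can be sketched as a backtracking step-size search: shrink or grow a curvature estimate L until a quadratic upper bound holds at the candidate step. The KL-over-mixtures objective from the paper is abstracted behind generic function and gradient arguments; this shows the mechanism under those assumptions, not the authors' exact algorithm.

```python
# Minimal sketch: backtracking estimation of local curvature along a
# Frank-Wolfe direction d = s - x, with gap g = -grad_x @ d.
import numpy as np

def backtracking_step(f, grad_x, x, d, L_prev, tau=2.0, eta=0.9):
    g = -grad_x @ d                     # (positive) Frank-Wolfe gap
    L = eta * L_prev                    # optimistically shrink the estimate
    while True:
        gamma = min(g / (L * (d @ d)), 1.0)
        # Accept once f satisfies the quadratic bound implied by curvature L.
        if f(x + gamma * d) <= f(x) - gamma * g + 0.5 * L * gamma**2 * (d @ d):
            return gamma, L
        L *= tau                        # otherwise local curvature is larger

# Toy usage on a quadratic: the accepted L overestimates the true curvature
# (here 2) by at most a constant factor.
f = lambda x: x @ x
x = np.array([1.0, 1.0]); s = np.array([0.0, -1.0])
gamma, L = backtracking_step(f, 2 * x, x, s - x, L_prev=1.0)
print(gamma, L)
```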
Neighborhood Contrastive Learning Applied to Online Patient Monitoring
Intensive care units (ICUs) are increasingly looking towards machine learning for methods to provide online monitoring of critically ill patients. In machine learning, online monitoring is often formulated as a supervised learning problem. Recently, contrastive learning approaches have demonstrated promising improvements over competitive supervised benchmarks. These methods rely on well-understood data augmentation techniques developed for image data, which do not apply to online monitoring. In this work, we overcome this limitation by supplementing time-series data augmentation techniques with a novel contrastive learning objective which we call neighborhood contrastive learning (NCL). Our objective explicitly groups together contiguous time segments from each patient while maintaining state-specific information. Our experiments demonstrate a marked improvement over existing work applying contrastive methods to medical time-series.
ISSN: 2640-3498
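A minimal sketch of a neighborhood contrastive objective in the spirit described above: windows from the same patient that are close in time are treated as positives in an InfoNCE-style loss. The shapes and the neighborhood rule are illustrative assumptions, not the exact NCL formulation.

```python
# Minimal sketch: InfoNCE over a temporal neighborhood within each patient.
import numpy as np

def ncl_loss(z, patient_ids, times, window=4, temp=0.1):
    # z: (N, d) L2-normalized embeddings of time segments.
    sim = z @ z.T / temp                              # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # exclude self-pairs
    # Neighborhood mask: same patient and within `window` time steps.
    same = patient_ids[:, None] == patient_ids[None, :]
    near = np.abs(times[:, None] - times[None, :]) <= window
    pos = same & near
    np.fill_diagonal(pos, False)
    log_prob = sim - np.logaddexp.reduce(sim, axis=1, keepdims=True)
    # Average log-probability over each anchor's neighborhood positives.
    per_anchor = np.where(pos, log_prob, 0.0).sum(1) / np.maximum(pos.sum(1), 1)
    return -per_anchor.mean()

rng = np.random.default_rng(0)
z = rng.standard_normal((32, 16))
z /= np.linalg.norm(z, axis=1, keepdims=True)
ids = np.repeat(np.arange(4), 8)                      # 4 patients, 8 windows each
times = np.tile(np.arange(8), 4)
print(ncl_loss(z, ids, times))
```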