Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent
Gaussian processes are a powerful framework for quantifying uncertainty and
for sequential decision-making but are limited by the requirement of solving
linear systems. In general, this has a cubic cost in dataset size and is
sensitive to conditioning. We explore stochastic gradient algorithms as a
computationally efficient method of approximately solving these linear systems:
we develop low-variance optimization objectives for sampling from the posterior
and extend these to inducing points. Counterintuitively, stochastic gradient
descent often produces accurate predictions, even in cases where it does not
converge quickly to the optimum. We explain this through a spectral
characterization of the implicit bias from non-convergence. We show that
stochastic gradient descent produces predictive distributions close to the true
posterior both in regions with sufficient data coverage, and in regions
sufficiently far away from the data. Experimentally, stochastic gradient
descent achieves state-of-the-art performance on sufficiently large-scale or
ill-conditioned regression tasks. Its uncertainty estimates match the
performance of significantly more expensive baselines on a large-scale Bayesian
optimization task.
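As a rough illustration of the idea in this abstract, the sketch below solves the linear system behind a Gaussian process posterior mean with a stochastic, minibatched gradient method instead of a direct cubic-cost solve. It is a simplified stand-in (a randomized block-coordinate gradient step on the quadratic objective), not the low-variance objectives developed in the paper. The kernel, the data, the step-size rule, and all function names are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    # Squared-exponential kernel between two 1-D input arrays.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def stochastic_solve(A, y, batch_size=10, num_steps=10000, seed=0):
    # Minimize 0.5 * v^T A v - v^T y, whose minimizer solves A v = y.
    # Each step evaluates the gradient (A v - y) on a random minibatch
    # of coordinates only, so no factorization of A is needed.
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # Gershgorin bound on the largest eigenvalue gives a safe step size.
    step = 1.0 / np.abs(A).sum(axis=1).max()
    v = np.zeros(n)
    for _ in range(num_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        v[idx] -= step * (A[idx] @ v - y[idx])
    return v

# Toy regression data (hypothetical, for illustration only).
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-3.0, 3.0, size=40))
y = np.sin(x) + 0.1 * rng.standard_normal(40)
noise_var = 0.5

A = rbf_kernel(x, x) + noise_var * np.eye(40)
v_sgd = stochastic_solve(A, y)
v_exact = np.linalg.solve(A, y)  # direct solve, used only as a reference

x_test = np.linspace(-3.0, 3.0, 5)
mean_sgd = rbf_kernel(x_test, x) @ v_sgd
mean_exact = rbf_kernel(x_test, x) @ v_exact
```

The noise variance here is deliberately large so that the system is well conditioned and the iteration converges. The abstract's point concerns the harder, ill-conditioned regime, which this toy example does not reproduce.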
Classical and quantum algorithms for scaling problems
This thesis is concerned with scaling problems, which have a plethora of connections to different areas of mathematics, physics and computer science. Although many structural aspects of these problems are understood by now, we only know how to solve them efficiently in special cases. We give new algorithms for non-commutative scaling problems with complexity guarantees that match the prior state of the art. To this end, we extend the well-known (self-concordance based) interior-point method (IPM) framework to Riemannian manifolds, motivated by its success in the commutative setting. Moreover, the IPM framework does not obviously suffer from the same obstructions to efficiency as previous methods. It also yields the first high-precision algorithms for other natural geometric problems in non-positive curvature. For the (commutative) problems of matrix scaling and balancing, we show that quantum algorithms can outperform the (already very efficient) state-of-the-art classical algorithms. Their time complexity can be sublinear in the input size; in certain parameter regimes they are also optimal, whereas in others we show no quantum speedup over the classical methods is possible. Along the way, we provide improvements over the long-standing state of the art for searching for all marked elements in a list, and computing the sum of a list of numbers. We identify a new application in the context of tensor networks for quantum many-body physics. We define a computable canonical form for uniform projected entangled pair states (as the solution to a scaling problem), circumventing previously known undecidability results. We also show, by characterizing the invariant polynomials, that the canonical form is determined by evaluating the tensor network contractions on networks of bounded size.
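For readers unfamiliar with the (commutative) matrix scaling problem mentioned in this abstract, the sketch below shows the classical Sinkhorn iteration: it looks for positive vectors r and c such that diag(r) A diag(c) has all row and column sums equal to 1. This is the textbook baseline, not one of the thesis's interior-point or quantum algorithms. The matrix, the iteration count, and the function name are assumptions for the example.

```python
import numpy as np

def sinkhorn_scale(A, num_iters=200):
    # Alternately rescale rows and columns of an entrywise-positive matrix.
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(num_iters):
        r = 1.0 / (A @ c)      # row sums of diag(r) A diag(c) become 1
        c = 1.0 / (A.T @ r)    # column sums of diag(r) A diag(c) become 1
    return r, c

# Hypothetical strictly positive input matrix.
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.1, size=(4, 4))

r, c = sinkhorn_scale(A)
B = r[:, None] * A * c[None, :]   # the scaled matrix diag(r) A diag(c)
row_sums = B.sum(axis=1)
col_sums = B.sum(axis=0)
```

For a strictly positive square matrix the iteration converges, so `row_sums` and `col_sums` both approach the all-ones vector.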
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Spectral Sparsification for Communication-Efficient Collaborative Rotation and Translation Estimation
We propose fast and communication-efficient optimization algorithms for
multi-robot rotation averaging and translation estimation problems that arise
from collaborative simultaneous localization and mapping (SLAM),
structure-from-motion (SfM), and camera network localization applications. Our
methods are based on theoretical relations between the Hessians of the
underlying Riemannian optimization problems and the Laplacians of suitably
weighted graphs. We leverage these results to design a collaborative solver in
which robots coordinate with a central server to perform approximate
second-order optimization, by solving a Laplacian system at each iteration.
Crucially, our algorithms permit robots to employ spectral sparsification to
sparsify intermediate dense matrices before communication, and hence provide a
mechanism to trade off accuracy with communication efficiency with provable
guarantees. We perform rigorous theoretical analysis of our methods and prove
that they enjoy a (local) linear rate of convergence. Furthermore, we show that
our methods can be combined with graduated non-convexity to achieve
outlier-robust estimation. Extensive experiments on real-world SLAM and SfM
scenarios demonstrate the superior convergence rate and communication
efficiency of our methods. Comment: Revised extended technical report (37 pages, 15 figures, 6 tables).
QuantEase: Optimization-based Quantization for Language Models -- An Efficient and Intuitive Algorithm
With the rising popularity of Large Language Models (LLMs), there has been an
increasing interest in compression techniques that enable their efficient
deployment. This study focuses on the Post-Training Quantization (PTQ) of LLMs.
Drawing from recent advances, our work introduces QuantEase, a layer-wise
quantization framework where individual layers undergo separate quantization.
The problem is framed as a discrete-structured non-convex optimization,
prompting the development of algorithms rooted in Coordinate Descent (CD)
techniques. These CD-based methods provide high-quality solutions to the
complex non-convex layer-wise quantization problems. Notably, our CD-based
approach features straightforward updates, relying solely on matrix and vector
operations, circumventing the need for matrix inversion or decomposition. We
also explore an outlier-aware variant of our approach, allowing for retaining
significant weights (outliers) with complete precision. Our proposal attains
state-of-the-art performance in terms of perplexity and zero-shot accuracy in
empirical evaluations across various LLMs and datasets, with relative
improvements up to 15% over methods such as GPTQ. Particularly noteworthy is
our outlier-aware algorithm's capability to achieve near or sub-3-bit
quantization of LLMs with an acceptable drop in accuracy, obviating the need
for non-uniform quantization or grouping techniques, improving upon methods
such as SpQR by up to two times in terms of perplexity.
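The coordinate-descent idea described in this abstract can be sketched for a single weight vector as follows. The layer-wise objective is the squared error between the original and quantized layer outputs, and each coordinate update is a closed-form one-dimensional minimization followed by rounding to the quantization grid, using only matrix and vector operations. This is a minimal sketch under assumptions (a uniform grid, a single output column, random calibration data, hypothetical function names) and should not be read as QuantEase's actual implementation.

```python
import numpy as np

def nearest_grid(v, scale, qmin, qmax):
    # Round to the nearest point of the uniform grid scale * {qmin, ..., qmax}.
    return scale * np.clip(np.round(v / scale), qmin, qmax)

def recon_error(X, w, w_hat):
    # Layer-wise objective: ||X w - X w_hat||^2.
    return float(np.sum((X @ (w - w_hat)) ** 2))

def cd_quantize(X, w, scale, qmin, qmax, num_sweeps=5):
    G = X.T @ X                                # Gram matrix, no inversion needed
    w_hat = nearest_grid(w, scale, qmin, qmax) # start from round-to-nearest
    for _ in range(num_sweeps):
        for j in range(w.shape[0]):
            if G[j, j] == 0.0:
                continue
            # Unconstrained minimizer over coordinate j with the others fixed.
            target = w_hat[j] + G[j] @ (w - w_hat) / G[j, j]
            # For a 1-D quadratic, the best grid point is the nearest one.
            w_hat[j] = nearest_grid(target, scale, qmin, qmax)
    return w_hat

# Hypothetical calibration inputs and weights.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))
w = rng.standard_normal(8)

qmin, qmax = -4, 3                  # 3-bit signed integer levels
scale = np.max(np.abs(w)) / 4.0

w_rtn = nearest_grid(w, scale, qmin, qmax)
w_cd = cd_quantize(X, w, scale, qmin, qmax)
err_rtn = recon_error(X, w, w_rtn)
err_cd = recon_error(X, w, w_cd)
```

Because every coordinate update picks the best grid point for that coordinate, the reconstruction error never increases, so `err_cd` is at most `err_rtn` for this initialization.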
Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial
On top of machine learning models, uncertainty quantification (UQ) functions
as an essential layer of safety assurance that could lead to more principled
decision making by enabling sound risk assessment and management. The safety
and reliability improvement of ML models empowered by UQ has the potential to
significantly facilitate the broad adoption of ML solutions in high-stakes
decision settings, such as healthcare, manufacturing, and aviation, to name a
few. In this tutorial, we aim to provide a holistic lens on emerging UQ methods
for ML models with a particular focus on neural networks and the applications
of these UQ methods in tackling engineering design as well as prognostics and
health management problems. Toward this goal, we start with a comprehensive
classification of uncertainty types, sources, and causes pertaining to UQ of ML
models. Next, we provide a tutorial-style description of several
state-of-the-art UQ methods: Gaussian process regression, Bayesian neural
networks, neural network ensembles, and deterministic UQ methods focusing on
the spectral-normalized neural Gaussian process. Building on the mathematical
formulations, we subsequently examine the soundness of these UQ methods
quantitatively and qualitatively (using a toy regression example) to assess their
strengths and shortcomings along different dimensions. Then, we review
quantitative metrics commonly used to assess the quality of predictive
uncertainty in classification and regression problems. Afterward, we discuss
the increasingly important role of UQ of ML models in solving challenging
problems in engineering design and health prognostics. Two case studies with
source code available on GitHub are used to demonstrate these UQ methods and
compare their performance in the life prediction of lithium-ion batteries at
the early stage and the remaining useful life prediction of turbofan engines.
The OpenMolcas Web: A Community-Driven Approach to Advancing Computational Chemistry
The developments of the open-source OpenMolcas chemistry software environment since spring 2020 are described, with a focus on novel functionalities accessible in the stable branch of the package or via interfaces with other packages. These developments span a wide range of topics in computational chemistry and are presented in thematic sections: electronic structure theory, electronic spectroscopy simulations, analytic gradients and molecular structure optimizations, ab initio molecular dynamics, and other new features. This report offers an overview of the chemical phenomena and processes OpenMolcas can address, while showing that OpenMolcas is an attractive platform for state-of-the-art atomistic computer simulations.
Iterative solution to the biharmonic equation in mixed form discretized by the Hybrid High-Order method
We consider the solution to the biharmonic equation in mixed form discretized
by the Hybrid High-Order (HHO) methods. The two resulting second-order elliptic
problems can be decoupled via the introduction of a new unknown, corresponding
to the boundary value of the solution of the first Laplacian problem. This
technique yields a global linear problem that can be solved iteratively via a
Krylov-type method. More precisely, at each iteration of the scheme, two
second-order elliptic problems have to be solved, and a normal derivative on
the boundary has to be computed. In this work, we specialize this scheme for
the HHO discretization. To this aim, an explicit technique to compute the
discrete normal derivative of an HHO solution of a Laplacian problem is
proposed. Moreover, we show that the resulting discrete scheme is well-posed.
Finally, a new preconditioner is designed to speed up the convergence of the
Krylov method. Numerical experiments assessing the performance of the proposed
iterative algorithm on both two- and three-dimensional test cases are
presented.