Probabilistic Interpretation of Linear Solvers
This manuscript proposes a probabilistic framework for algorithms that
iteratively solve unconstrained linear problems Bx = b with positive definite
B for x. The goal is to replace the point estimates returned by existing
methods with a Gaussian posterior belief over the elements of the inverse of
B, which can be used to estimate errors. Recent probabilistic interpretations
of the secant family of quasi-Newton optimization algorithms are extended.
Combined with properties of the conjugate gradient algorithm, this leads to
uncertainty-calibrated methods with very limited cost overhead over conjugate
gradients, a self-contained novel interpretation of the quasi-Newton and
conjugate gradient algorithms, and a foundation for new nonlinear optimization
methods.
Comment: final version, in press at SIAM J Optimization
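The secant-family connection the abstract mentions can be illustrated with a small sketch (an illustration of the quasi-Newton machinery involved, not the paper's exact posterior): a BFGS-style inverse update can be read as revising an estimate of B's inverse after observing one matrix-vector product y = B s.

```python
import numpy as np

def inverse_secant_update(H, s, y):
    """BFGS-style update of an inverse estimate H ~ inv(B), conditioned on
    one matrix-vector observation y = B @ s.
    Illustrative sketch only, not the paper's exact Gaussian posterior."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# After the update, the estimate satisfies the secant condition
# H_new @ y == s: it is exact along the explored direction.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = A @ A.T + 4 * np.eye(4)          # positive definite test matrix
s = rng.standard_normal(4)
y = B @ s
H_new = inverse_secant_update(np.eye(4), s, y)
print(bool(np.allclose(H_new @ y, s)))  # True
```

Iterating this update over conjugate directions is what ties the secant family to conjugate gradients in the paper's framework.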
Shampoo: Preconditioned Stochastic Tensor Optimization
Preconditioned gradient methods are among the most general and powerful tools
in optimization. However, preconditioning requires storing and manipulating
prohibitively large matrices. We describe and analyze a new structure-aware
preconditioning algorithm, called Shampoo, for stochastic optimization over
tensor spaces. Shampoo maintains a set of preconditioning matrices, each of
which operates on a single dimension, contracting over the remaining
dimensions. We establish convergence guarantees in the stochastic convex
setting, the proof of which builds upon matrix trace inequalities. Our
experiments with state-of-the-art deep learning models show that Shampoo is
capable of converging considerably faster than commonly used optimizers.
Although it involves a more complex update rule, Shampoo's runtime per step is
comparable to that of simple gradient methods such as SGD, AdaGrad, and Adam.
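For a 2-D parameter, the per-dimension preconditioning the abstract describes can be sketched as follows (a minimal NumPy sketch; hyperparameters and the toy objective are chosen for illustration): statistics L and R accumulate G G^T and G^T G, and the gradient is preconditioned by L^(-1/4) on the left and R^(-1/4) on the right.

```python
import numpy as np

def matrix_power(M, p, floor=1e-6):
    """Symmetric matrix power via eigendecomposition, flooring
    eigenvalues to keep negative powers finite."""
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, floor) ** p) @ V.T

def shampoo_step(W, G, L, R, lr=0.1):
    """One Shampoo step for a matrix-shaped parameter W with gradient G.
    L and R are the per-dimension preconditioner statistics."""
    L = L + G @ G.T                       # contract over columns
    R = R + G.T @ G                       # contract over rows
    W = W - lr * matrix_power(L, -0.25) @ G @ matrix_power(R, -0.25)
    return W, L, R

# Usage: minimise ||W - T||_F^2 (gradient 2*(W - T)) for a toy target T.
T = np.arange(6.0).reshape(2, 3)
W = np.zeros((2, 3))
L, R = 1e-4 * np.eye(2), 1e-4 * np.eye(3)   # eps * I initialisation
for _ in range(100):
    W, L, R = shampoo_step(W, 2 * (W - T), L, R)
print(bool(np.linalg.norm(W - T) < np.linalg.norm(T)))  # True: loss decreased
```

Storing one small matrix per tensor dimension, instead of one preconditioner over all parameters jointly, is what keeps the memory cost tractable.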
Incremental Processing and Optimization of Update Streams
Over the recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring, which monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in the applications.
Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques, if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in incremental processing of update streams that are not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as the TPC-H and Linear Road benchmarks.
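The first challenge, maintaining transitive closure over an update stream, can be illustrated with a toy delta rule for insert-only streams (a sketch of the view-maintenance idea, not the thesis's Aspen implementation): when edge (u, v) arrives, every known predecessor of u becomes connected to every known successor of v.

```python
def insert_edge(tc, u, v):
    """Incrementally maintain a transitive-closure view `tc`
    (a set of reachable (source, target) pairs) under insertion
    of edge (u, v). Toy rule for insert-only edge streams."""
    preds = {a for (a, b) in tc if b == u} | {u}
    succs = {b for (a, b) in tc if a == v} | {v}
    tc |= {(a, b) for a in preds for b in succs}
    return tc

# Usage: edges arrive one at a time on the stream.
tc = set()
for edge in [(1, 2), (3, 4), (2, 3)]:
    tc = insert_edge(tc, *edge)
print((1, 4) in tc)  # True: 1 -> 2 -> 3 -> 4 derived incrementally
```

The point of the incremental formulation is that each arriving edge touches only the affected pairs, rather than recomputing reachability from scratch.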
Efficient neural network verification and training
In spite of their highly-publicized achievements in disparate applications, neural networks are yet to be widely deployed in safety-critical applications. In fact, fundamental concerns exist on the robustness, fairness, privacy and explainability of deep learning systems. In this thesis, we strive to increase trust in deep learning systems by presenting contributions pertaining to neural network verification and training. First, by designing dual solvers for popular network relaxations, we provide fast and scalable bounds on neural network outputs. In particular, we present two solvers for the convex hull of element-wise activation functions, and two algorithms for a formulation based on the convex hull of the composition of ReLU activations with the preceding linear layer. We show that these methods are significantly faster than off-the-shelf solvers, and improve on the speed-accuracy trade-offs of previous dual algorithms. In order to efficiently employ them for formal neural network verification, we design a massively parallel Branch-and-Bound framework around the bounding algorithms. Our contributions, which we publicly released as part of the OVAL verification framework, improved on the scalability of existing network verifiers, and proved to be influential for the development of more recent algorithms. Second, we present an intuitive and inexpensive algorithm to train neural networks for verifiability via Branch-and-Bound. Our method is shown to yield state-of-the-art performance on verifying robustness to small adversarial perturbations while reducing the training costs compared to previous algorithms. Finally, we conduct a comprehensive experimental evaluation of specialized training schemes to train networks for multiple tasks at once, showing that they perform on par with a simple baseline. We provide a partial explanation of our surprising results, aiming to spur further research towards the understanding of deep multi-task learning.
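The bounding step such verifiers build on can be illustrated with the simplest relaxation, interval bound propagation through an affine layer followed by ReLU (a sketch for intuition only; the thesis's dual solvers compute tighter bounds than this):

```python
import numpy as np

def ibp_layer(W, b, l, u):
    """Propagate elementwise input bounds l <= x <= u through
    x -> relu(W @ x + b). Interval arithmetic: the positive part
    of W takes the like-signed bound, the negative part the other."""
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    lo = Wp @ l + Wn @ u + b
    hi = Wp @ u + Wn @ l + b
    return np.maximum(lo, 0), np.maximum(hi, 0)

# Sanity check: the bounds are sound for any input inside the box.
rng = np.random.default_rng(0)
W, b = rng.standard_normal((3, 2)), rng.standard_normal(3)
l, u = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
lo, hi = ibp_layer(W, b, l, u)
x = rng.uniform(l, u)
out = np.maximum(W @ x + b, 0)
print(bool(np.all((lo <= out) & (out <= hi))))  # True
```

Verification then amounts to showing that such output bounds exclude every misclassification, with Branch-and-Bound splitting the input box whenever the bounds are too loose to decide.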