
    Probabilistic Interpretation of Linear Solvers

    This manuscript proposes a probabilistic framework for algorithms that iteratively solve unconstrained linear problems Bx = b with positive definite B for x. The goal is to replace the point estimates returned by existing methods with a Gaussian posterior belief over the elements of the inverse of B, which can be used to estimate errors. Recent probabilistic interpretations of the secant family of quasi-Newton optimization algorithms are extended. Combined with properties of the conjugate gradient algorithm, this leads to uncertainty-calibrated methods with very limited cost overhead over conjugate gradients, a self-contained novel interpretation of the quasi-Newton and conjugate gradient algorithms, and a foundation for new nonlinear optimization methods. Comment: final version, in press at SIAM J Optimization
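    To ground the iterative setting described above, here is a minimal NumPy sketch of the standard conjugate gradient iteration for Bx = b with symmetric positive definite B. It returns only a point estimate and does not implement the paper's Gaussian posterior over the inverse of B; the function name and tolerance are illustrative.

```python
import numpy as np

def conjugate_gradient(B, b, tol=1e-10, max_iter=None):
    """Solve B x = b for symmetric positive definite B (plain CG, point estimate only)."""
    n = b.shape[0]
    max_iter = max_iter or n
    x = np.zeros(n)
    r = b - B @ x            # residual
    d = r.copy()             # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Bd = B @ d
        alpha = rs_old / (d @ Bd)        # exact line search along d
        x += alpha * d
        r -= alpha * Bd
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        d = r + (rs_new / rs_old) * d    # keep new direction B-conjugate to the old ones
        rs_old = rs_new
    return x
```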

    Shampoo: Preconditioned Stochastic Tensor Optimization

    Preconditioned gradient methods are among the most general and powerful tools in optimization. However, preconditioning requires storing and manipulating prohibitively large matrices. We describe and analyze a new structure-aware preconditioning algorithm, called Shampoo, for stochastic optimization over tensor spaces. Shampoo maintains a set of preconditioning matrices, each of which operates on a single dimension, contracting over the remaining dimensions. We establish convergence guarantees in the stochastic convex setting, the proof of which builds upon matrix trace inequalities. Our experiments with state-of-the-art deep learning models show that Shampoo is capable of converging considerably faster than commonly used optimizers. Although it involves a more complex update rule, Shampoo's runtime per step is comparable to that of simple gradient methods such as SGD, AdaGrad, and Adam.
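    For a matrix-shaped parameter, the per-dimension preconditioners reduce to a left and a right matrix. The sketch below illustrates that two-dimensional special case in NumPy; the class name, learning rate, and epsilon initialization are illustrative choices, not values taken from the paper's experiments.

```python
import numpy as np

def matrix_power(M, p):
    """Fractional power of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 1e-12) ** p) @ V.T   # clip eigenvalues for numerical safety

class ShampooMatrix:
    """Shampoo-style update for a single matrix parameter W (2-D special case)."""
    def __init__(self, shape, lr=0.1, eps=1e-4):
        self.lr = lr
        self.L = eps * np.eye(shape[0])   # left preconditioner statistics
        self.R = eps * np.eye(shape[1])   # right preconditioner statistics

    def step(self, W, G):
        self.L += G @ G.T                 # contract the gradient over columns
        self.R += G.T @ G                 # contract the gradient over rows
        precond_grad = matrix_power(self.L, -0.25) @ G @ matrix_power(self.R, -0.25)
        return W - self.lr * precond_grad
```

    The inverse fourth roots on the two sides together play the role of the inverse square root of a full (Kronecker-structured) preconditioner, which is what keeps the memory cost linear in each dimension rather than quadratic in the full parameter size.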

    Incremental Processing and Optimization of Update Streams

    Over recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring that monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in these applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in incremental processing of update streams that are not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as the TPC-H and LinearRoad benchmarks.
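    As an illustration of the first challenge, maintaining a transitive closure view incrementally under edge insertions can be sketched as a delta-propagation step: each new edge only derives reachability facts between the ancestors of its source and the descendants of its target. This is a generic insert-only Python sketch of that idea, not the Aspen implementation; class and method names are hypothetical.

```python
from collections import defaultdict

class IncrementalReachability:
    """Maintain the transitive closure of a directed graph under edge insertions."""
    def __init__(self):
        self.reach = defaultdict(set)   # node -> nodes reachable from it
        self.rev = defaultdict(set)     # node -> nodes that can reach it

    def insert_edge(self, u, v):
        # New facts: every ancestor of u (and u itself) now reaches
        # every descendant of v (and v itself).
        sources = self.rev[u] | {u}
        targets = self.reach[v] | {v}
        for s in sources:
            for t in targets:
                if t not in self.reach[s]:
                    self.reach[s].add(t)
                    self.rev[t].add(s)

    def reaches(self, u, v):
        return v in self.reach[u]
```

    Only the newly derivable reachability pairs are materialized on each insertion, which is the same delta-based reasoning that incremental view maintenance applies to recursive queries over update streams.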

    Efficient neural network verification and training

    In spite of their highly-publicized achievements in disparate applications, neural networks are yet to be widely deployed in safety-critical applications. In fact, fundamental concerns exist on the robustness, fairness, privacy and explainability of deep learning systems. In this thesis, we strive to increase trust in deep learning systems by presenting contributions pertaining to neural network verification and training. First, by designing dual solvers for popular network relaxations, we provide fast and scalable bounds on neural network outputs. In particular, we present two solvers for the convex hull of element-wise activation functions, and two algorithms for a formulation based on the convex hull of the composition of ReLU activations with the preceding linear layer. We show that these methods are significantly faster than off-the-shelf solvers, and improve on the speed-accuracy trade-offs of previous dual algorithms. In order to efficiently employ them for formal neural network verification, we design a massively parallel Branch-and-Bound framework around the bounding algorithms. Our contributions, which we publicly released as part of the OVAL verification framework, improved on the scalability of existing network verifiers, and proved to be influential for the development of more recent algorithms. Second, we present an intuitive and inexpensive algorithm to train neural networks for verifiability via Branch-and-Bound. Our method is shown to yield state-of-the-art performance on verifying robustness to small adversarial perturbations while reducing the training costs compared to previous algorithms. Finally, we conduct a comprehensive experimental evaluation of specialized training schemes to train networks for multiple tasks at once, showing that they perform on par with a simple baseline. We provide a partial explanation of our surprising results, aiming to stir further research towards the understanding of deep multi-task learning.
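    The bounding step that such a Branch-and-Bound verifier repeatedly invokes can be illustrated with the loosest common relaxation, interval bound propagation through affine layers followed by ReLU. This NumPy sketch is only a stand-in for the tighter dual solvers developed in the thesis; the function name and layer representation are assumptions for the example.

```python
import numpy as np

def interval_bounds(layers, lb, ub):
    """Propagate elementwise input bounds [lb, ub] through a ReLU network.

    layers: list of (W, b) tuples; ReLU is applied after every layer except the last.
    Returns lower/upper bounds on the network outputs over the input box.
    """
    for i, (W, b) in enumerate(layers):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        new_lb = W_pos @ lb + W_neg @ ub + b   # worst case for the lower bound
        new_ub = W_pos @ ub + W_neg @ lb + b   # worst case for the upper bound
        if i < len(layers) - 1:                # ReLU on hidden layers only
            new_lb, new_ub = np.maximum(new_lb, 0.0), np.maximum(new_ub, 0.0)
        lb, ub = new_lb, new_ub
    return lb, ub
```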