33 research outputs found

    Improving the Practicality of Model-Based Reinforcement Learning: An Investigation into Scaling up Model-Based Methods in Online Settings

    Get PDF
    This thesis is a response to the current scarcity of practical model-based control algorithms in the reinforcement learning (RL) framework. As of yet there is no consensus on how best to integrate imperfect transition models into RL whilst mitigating policy improvement instabilities in online settings. Current state-of-the-art policy learning algorithms that surpass human performance often rely on model-free approaches that enjoy unmitigated sampling of transition data. Model-based RL (MBRL) instead attempts to distil experience into transition models that allow agents to plan new policies without needing to return to the environment and sample more data. The initial focus of this investigation is on kernel conditional mean embeddings (CMEs) (Song et al., 2009) deployed in an approximate policy iteration (API) algorithm (Grünewälder et al., 2012a). This existing MBRL algorithm boasts theoretically stable policy updates in continuous state and discrete action spaces. The Bellman operator’s value function and (transition) conditional expectation are modelled and embedded respectively as functions in a reproducing kernel Hilbert space (RKHS). The resulting finite-induced approximate pseudo-MDP (Yao et al., 2014a) can be solved exactly in a dynamic programming algorithm with policy improvement suboptimality guarantees. However model construction and policy planning scale cubically and quadratically respectively with the training set size, rendering the CME impractical for sampleabundant tasks in online settings. Three variants of CME API are investigated to strike a balance between stable policy updates and reduced computational complexity. The first variant models the value function and state-action representation explicitly in a parametric CME (PCME) algorithm with favourable computational complexity. However a soft conservative policy update technique is developed to mitigate policy learning oscillations in the planning process. The second variant returns to the non-parametric embedding and contributes (along with external work) to the compressed CME (CCME); a sparse and computationally more favourable CME. The final variant is a fully end-to-end differentiable embedding trained with stochastic gradient updates. The value function remains modelled in an RKHS such that backprop is driven by a non-parametric RKHS loss function. Actively compressed CME (ACCME) satisfies the pseudo-MDP contraction constraint using a sparse softmax activation function. The size of the pseudo-MDP (i.e. the size of the embedding’s last layer) is controlled by sparsifying the last layer weight matrix by extending the truncated gradient method (Langford et al., 2009) with group lasso updates in a novel ‘use it or lose it’ neuron pruning mechanism. Surprisingly this technique does not require extensive fine-tuning between control tasks

    SCALABLE INTEGRATED CIRCUIT SIMULATION ALGORITHMS FOR ENERGY-EFFICIENT TERAFLOP HETEROGENEOUS PARALLEL COMPUTING PLATFORMS

    Get PDF
    Integrated circuit technology has gone through several decades of aggressive scaling.It is increasingly challenging to analyze growing design complexity. Post-layout SPICE simulation can be computationally prohibitive due to the huge amount of parasitic elements, which can easily boost the computation and memory cost. As the decrease in device size, the circuits become more vulnerable to process variations. Designers need to statistically simulate the probability that a circuit does not meet the performance metric, which requires millions times of simulations to capture rare failure events. Recent, multiprocessors with heterogeneous architecture have emerged as mainstream computing platforms. The heterogeneous computing platform can achieve highthroughput energy efficient computing. However, the application of such platform is not trivial and needs to reinvent existing algorithms to fully utilize the computing resources. This dissertation presents several new algorithms to address those aforementioned two significant and challenging issues on the heterogeneous platform. Harmonic Balance (HB) analysis is essential for efficient verification of large postlayout RF and microwave integrated circuits (ICs). However, existing methods either suffer from excessively long simulation time and prohibitively large memory consumption or exhibit poor stability. This dissertation introduces a novel transient-simulation guided graph sparsification technique, as well as an efficient runtime performance modeling approach tailored for heterogeneous manycore CPU-GPU computing system to build nearly-optimal subgraph preconditioners that can lead to minimum HB simulation runtime. Additionally, we propose a novel heterogeneous parallel sparse block matrix algorithm by taking advantages of the structure of HB Jacobian matrices as well as GPU’s streaming multiprocessors to achieve optimal workload balancing during the preconditioning phase of HB analysis. We also show how the proposed preconditioned iterative algorithm can efficiently adapt to heterogeneous computing systems with different CPU and GPU computing capabilities. Extensive experimental results show that our HB solver can achieve up to 20X speedups and 5X memory reduction when compared with the state-of-the-art direct solver highly optimized for twelve-core CPUs. In nowadays variation-aware IC designs, cell characterizations and SRAM memory yield analysis require many thousands or even millions of repeated SPICE simulations for relatively small nonlinear circuits. In this dissertation, for the first time, we present a massively parallel SPICE simulator on GPU, TinySPICE, for efficiently analyzing small nonlinear circuits. TinySPICE integrates a highly-optimized shared-memory based matrix solver and fast parametric three-dimensional (3D) LUTs based device evaluation method. A novel circuit clustering method is also proposed to improve the stability and efficiency of the matrix solver. Compared with CPU-based SPICE simulator, TinySPICE achieves up to 264X speedups for parametric SRAM yield analysis without loss of accuracy

    Stochastic Optimization For Multi-Agent Statistical Learning And Control

    Get PDF
    The goal of this thesis is to develop a mathematical framework for optimal, accurate, and affordable complexity statistical learning among networks of autonomous agents. We begin by noting the connection between statistical inference and stochastic programming, and consider extensions of this setup to settings in which a network of agents each observes a local data stream and would like to make decisions that are good with respect to information aggregated across the entire network. There is an open-ended degree of freedom in this problem formulation, however: the selection of the estimator function class which defines the feasible set of the stochastic program. Our central contribution is the design of stochastic optimization tools in reproducing kernel Hilbert spaces that yield optimal, accurate, and affordable complexity statistical learning for a multi-agent network. To obtain this result, we first explore the relative merits and drawbacks of different function class selections. In Part I, we consider multi-agent expected risk minimization this problem setting for the case that each agent seems to learn a common globally optimal generalized linear models (GLMs) by developing a stochastic variant of Arrow-Hurwicz primal-dual method. We establish convergence to the primal-dual optimal pair when either consensus or ``proximity constraints encode the fact that we want all agents\u27 to agree, or nearby agents to make decisions that are close to one another. Empirically, we observe that these convergence results are substantiated but that convergence may not translate into statistical accuracy. More broadly, optimality within a given estimator function class is not the same as one that makes minimal inference errors. The optimality-accuracy tradeoff of GLMs motivates subsequent efforts to learn more sophisticated estimators based upon learned feature encodings of the data that is fed into the statistical model. The specific tool we turn to in Part II is dictionary learning, where we optimize both over regression weights and an encoding of the data, which yields a non-convex problem. We investigate the use of stochastic methods for online task-driven dictionary learning, and obtain promising performance for the task of a ground robot learning to anticipate control uncertainty based on its past experience. Heartened by this implementation, we then consider extensions of this framework for a multi-agent network to each learn globally optimal task-driven dictionaries based on stochastic primal-dual methods. However, it is here the non-convexity of the optimization problem causes problems: stringent conditions on stochastic errors and the duality gap limit the applicability of the convergence guarantees, and impractically small learning rates are required for convergence in practice. Thus, we seek to learn nonlinear statistical models while preserving convexity, which is possible through kernel methods ( Part III). However, the increased descriptive power of nonparametric estimation comes at the cost of infinite complexity. Thus, we develop a stochastic approximation algorithm in reproducing kernel Hilbert spaces (RKHS) that ameliorates this complexity issue while preserving optimality: we combine the functional generalization of stochastic gradient method (FSGD) with greedily constructed low-dimensional subspace projections based on matching pursuit. We establish that the proposed method yields a controllable trade-off between optimality and memory, and yields highly accurate parsimonious statistical models in practice. % Then, we develop a multi-agent extension of this method by proposing a new node-separable penalty function and applying FSGD together with low-dimensional subspace projections. This extension allows a network of autonomous agents to learn a memory-efficient approximation to the globally optimal regression function based only on their local data stream and message passing with neighbors. In practice, we observe agents are able to stably learn highly accurate and memory-efficient nonlinear statistical models from streaming data. From here, we shift focus to a more challenging class of problems, motivated by the fact that true learning is not just revising predictions based upon data but augmenting behavior over time based on temporal incentives. This goal may be described by Markov Decision Processes (MDPs): at each point, an agent is in some state of the world, takes an action and then receives a reward while randomly transitioning to a new state. The goal of the agent is to select the action sequence to maximize its long-term sum of rewards, but determining how to select this action sequence when both the state and action spaces are infinite has eluded researchers for decades. As a precursor to this feat, we consider the problem of policy evaluation in infinite MDPs, in which we seek to determine the long-term sum of rewards when starting in a given state when actions are chosen according to a fixed distribution called a policy. We reformulate this problem as a RKHS-valued compositional stochastic program and we develop a functional extension of stochastic quasi-gradient algorithm operating in tandem with the greedy subspace projections mentioned above. We prove convergence with probability 1 to the Bellman fixed point restricted to this function class, and we observe a state of the art trade off in memory versus Bellman error for the proposed method on the Mountain Car driving task, which bodes well for incorporating policy evaluation into more sophisticated, provably stable reinforcement learning techniques, and in time, developing optimal collaborative multi-agent learning-based control systems

    Regularized approximate policy iteration using kernel for on-line reinforcement learning

    Get PDF
    By using Reinforcement Learning (RL), an autonomous agent interacting with the environment can learn how to take adequate actions for every situation in order to optimally achieve its own goal. RL provides a general methodology able to solve uncertain and complex decision problems which may be present in many real-world applications. RL problems are usually modeled as a Markov Decision Processes (MDPs) deeply studied in the literature. The main peculiarity of a RL algorithm is that the RL agent is assumed to learn the optimal policies from its experiences without knowing the parameters of the MDP. The key element in solving the MDP is learning a value function which gives the expectation of total reward an agent might expect at its current state taking a given action. This value function allows to obtain the optimal policy. In this thesis we study the capacity of SVR using kernel methods to adapt and solve complex RL problems in large or continuous state space. SVR can be studied using a geometrical interpretation in terms of optimal margin or can be seen as a regularization problem given in a Reproducing Kernel Hilbert Space (RKHS) SVR have good properties over the generalization ability and as they are based a on convex optimization problem, they do not suffer from sub-optimality. SVR are non-parametric showing the ability to automatically adapt to the complexity of the problem. Accordingly, applying SVR to approximate value functions sounds to be a good approach. SVR can be solved both in batch mode when the whole set of training sample are at disposal of the learning agents or incrementally which enables the addition or removal of training samples very effectively. Incremental SVR finds the appropriate KKT conditions for new or updated data by modifying their influences into the regression function maintaining consistence in the KKT conditions for the rest of data used for learning. In RL problems an incremental SVR should be able to approximate the action value function leading to the optimal policy. Accordingly, computation load should be lower, learning speed faster and generalization more effective than other existing method The overall contribution coming from of our work is to develop, formalize, implement and study a new RL technique for generalization in discrete and continuous state spaces with finite actions. Our method uses the Approximate Policy Iteration (API) framework with the BRM criterion which allows to represent the action value function using SVR. This approach for RL is the first one we know using SVR compatible to the agent interaction- with-the-environment framework of RL which shows his power by solving a large number of benchmark problems, including very difficult ones, like the bicycle driving and riding control problem. In addition, unlike most RL approaches to generalization, we develop a proof finding theoretical bounds for the convergence of the method to the optimal solution under given conditions.Mediante el uso de aprendizaje por refuerzo (RL), un agente autónomo interactuando con el medio ambiente puede aprender a tomar adecuada acciones para cada situación con el fin de lograr de manera óptima su propia meta. RL proporciona una metodología general capaz de resolver problemas de decisión complejos que pueden estar presentes en muchas aplicaciones del mundo real. Problemas RL usualmente se modelan como una Procesos de Decisión de Markov (MDP) estudiados profundamente en la literatura. La principal peculiaridad de un algoritmo de RL es que el agente es asumido para aprender las políticas óptimas de sus experiencias sin saber los parámetros de la MDP. El elemento clave en resolver el MDP está en el aprender una función de valor que da la expectativa de recompensa total que un agente puede esperar en su estado actual para tomar una acción determinada. Esta función de valor permite obtener la política óptima. En esta tesis se estudia la capacidad del SVR utilizando núcleo métodos para adaptarse y resolver problemas RL complejas en el espacio estado grande o continua. RVS puede ser estudiado mediante un interpretación geométrica en términos de margen óptimo o puede ser visto como un problema de regularización dado en un Reproducing Kernel Hilbert Space (RKHS). SVR tiene buenas propiedades sobre la capacidad de generalización y ya que se basan en una optimización convexa problema, ellos no sufren de sub-optimalidad. SVR son no paramétrico que muestra la capacidad de adaptarse automáticamente a la complejidad del problema. En consecuencia, la aplicación de RVS para aproximar funciones de valor suena para ser un buen enfoque. SVR puede resolver tanto en modo batch cuando todo el conjunto de muestra de entrenamiento están a disposición de los agentes de aprendizaje o incrementalmente que permite la adición o eliminación de muestras de entrenamiento muy eficaz. Incremental SVR encuentra las condiciones adecuadas para KKT nuevas o actualizadas de datos modificando sus influencias en la función de regresión mantener consistencia en las condiciones KKT para el resto de los datos utilizados para el aprendizaje. En los problemas de RL una RVS elemental será capaz de aproximar la función de valor de acción que conduce a la política óptima. En consecuencia, la carga de cálculo debería ser menor, la velocidad de aprendizaje más rápido y generalización más efectivo que el otro método existente La contribución general que viene de nuestro trabajo es desarrollar, formalizar, ejecutar y estudiar una nueva técnica de RL para la generalización en espacio de estados discretos y continuos con acciones finitas. Nuestro método utiliza el marco de la Approximate Policy Iteration (API) con el criterio de BRM que permite representar la función de valor de acción utilizando SVR. Este enfoque de RL es el primero que conocemos usando SVR compatible con el marco de RL con agentes interaccionado con el ambiente que muestra su poder mediante la resolución de un gran número de problemas de referencia, incluyendo los muy difíciles, como la conducción de bicicletas y problema de control de conducción. Además, a diferencia de la mayoría RL se acerca a la generalización, desarrollamos un hallazgo prueba límites teóricos para la convergencia del método a la solución óptima en condiciones dadas.Postprint (published version

    Towards Performance Portable Graph Algorithms

    Get PDF
    In today's data-driven world, our computational resources have become heterogeneous, making the processing of large-scale graphs in an architecture agnostic manner crucial. Traditionally, hand-optimized high-performance computing (HPC) solutions have been studied and used to implement highly efficient and scalable graph algorithms. In recent years, several graph processing and management systems have also been proposed. Hand optimized HPC approaches require high levels of expertise and graph processing frameworks suffer from expressibility and performance. Portability is a major concern for both approaches. The main thesis of this work is that block-based graph algorithms offer a compromise between efficient parallelism and architecture agnostic algorithm design for a wide class of graph problems. This dissertation seeks to prove this thesis by focusing the work on the three pillars; data/computation partitioning, block-based algorithm design, and performance portability. In this dissertation, we first show how we can partition the computation and the data to design efficient block-based algorithms for solving graph merging and triangle counting problems. Then, generalizing from our experiences, we propose an algorithmic framework, for shared-memory, heterogeneous machines for implementing block-based graph algorithms; PGAbB. PGAbB aims to maximally leverage different architectures by implementing a task-based execution on top of a block-based programming model. In this talk we will discuss PGAbB's programming model, algorithmic optimizations for scheduling, and load-balancing strategies for graph problems on real-world and synthetic inputs.Ph.D

    Sur la conception de solveurs linéaires hybrides pour les architectures parallèles modernes

    Get PDF
    In the context of this thesis, our focus is on numerical linear algebra, more precisely on solution of large sparse systems of linear equations. We focus on designing efficient parallel implementations of MaPHyS, an hybrid linear solver based on domain decomposition techniques. First we investigate the MPI+threads approach. In MaPHyS, the first level of parallelism arises from the independent treatment of the various subdomains. The second level is exploited thanks to the use of multi-threaded dense and sparse linear algebra kernels involved at the subdomain level. Such an hybrid implementation of an hybrid linear solver suitably matches the hierarchical structure of modern supercomputers and enables a trade-off between the numerical and parallel performances of the solver. We demonstrate the flexibility of our parallel implementation on a set of test examples. Secondly, we follow a more disruptive approach where the algorithms are described as sets of tasks with data inter-dependencies that leads to a directed acyclic graph (DAG) representation. The tasks are handled by a runtime system. We illustrate how a first task-based parallel implementation can be obtained by composing task-based parallel libraries within MPI processes throught a preliminary prototype implementation of our hybrid solver. We then show how a task-based approach fully abstracting the hardware architecture can successfully exploit a wide range of modern hardware architectures. We implemented a full task-based Conjugate Gradient algorithm and showed that the proposed approach leads to very high performance on multi-GPU, multicore and heterogeneous architectures.Dans le contexte de cette thèse, nous nous focalisons sur des algorithmes pour l’algèbre linéaire numérique, plus précisément sur la résolution de grands systèmes linéaires creux. Nous mettons au point des méthodes de parallélisation pour le solveur linéaire hybride MaPHyS. Premièrement nous considerons l'aproche MPI+threads. Dans MaPHyS, le premier niveau de parallélisme consiste au traitement indépendant des sous-domaines. Le second niveau est exploité grâce à l’utilisation de noyaux multithreadés denses et creux au sein des sous-domaines. Une telle implémentation correspond bien à la structure hiérarchique des supercalculateurs modernes et permet un compromis entre les performances numériques et parallèles du solveur. Nous démontrons la flexibilité de notre implémentation parallèle sur un ensemble de cas tests. Deuxièmement nous considérons un approche plus innovante, où les algorithmes sont décrits comme des ensembles de tâches avec des inter-dépendances, i.e., un graphe de tâches orienté sans cycle (DAG). Nous illustrons d’abord comment une première parallélisation à base de tâches peut être obtenue en composant des librairies à base de tâches au sein des processus MPI illustrer par un prototype d’implémentation préliminaire de notre solveur hybride. Nous montrons ensuite comment une approche à base de tâches abstrayant entièrement le matériel peut exploiter avec succès une large gamme d’architectures matérielles. À cet effet, nous avons implanté une version à base de tâches de l’algorithme du Gradient Conjugué et nous montrons que l’approche proposée permet d’atteindre une très haute performance sur des architectures multi-GPU, multicoeur ainsi qu’hétérogène

    Learning Dynamic Systems for Intention Recognition in Human-Robot-Cooperation

    Get PDF
    This thesis is concerned with intention recognition for a humanoid robot and investigates how the challenges of uncertain and incomplete observations, a high degree of detail of the used models, and real-time inference may be addressed by modeling the human rationale as hybrid, dynamic Bayesian networks and performing inference with these models. The key focus lies on the automatic identification of the employed nonlinear stochastic dependencies and the situation-specific inference

    Regularized approximate policy iteration using kernel for on-line reinforcement learning

    Get PDF
    By using Reinforcement Learning (RL), an autonomous agent interacting with the environment can learn how to take adequate actions for every situation in order to optimally achieve its own goal. RL provides a general methodology able to solve uncertain and complex decision problems which may be present in many real-world applications. RL problems are usually modeled as a Markov Decision Processes (MDPs) deeply studied in the literature. The main peculiarity of a RL algorithm is that the RL agent is assumed to learn the optimal policies from its experiences without knowing the parameters of the MDP. The key element in solving the MDP is learning a value function which gives the expectation of total reward an agent might expect at its current state taking a given action. This value function allows to obtain the optimal policy. In this thesis we study the capacity of SVR using kernel methods to adapt and solve complex RL problems in large or continuous state space. SVR can be studied using a geometrical interpretation in terms of optimal margin or can be seen as a regularization problem given in a Reproducing Kernel Hilbert Space (RKHS) SVR have good properties over the generalization ability and as they are based a on convex optimization problem, they do not suffer from sub-optimality. SVR are non-parametric showing the ability to automatically adapt to the complexity of the problem. Accordingly, applying SVR to approximate value functions sounds to be a good approach. SVR can be solved both in batch mode when the whole set of training sample are at disposal of the learning agents or incrementally which enables the addition or removal of training samples very effectively. Incremental SVR finds the appropriate KKT conditions for new or updated data by modifying their influences into the regression function maintaining consistence in the KKT conditions for the rest of data used for learning. In RL problems an incremental SVR should be able to approximate the action value function leading to the optimal policy. Accordingly, computation load should be lower, learning speed faster and generalization more effective than other existing method The overall contribution coming from of our work is to develop, formalize, implement and study a new RL technique for generalization in discrete and continuous state spaces with finite actions. Our method uses the Approximate Policy Iteration (API) framework with the BRM criterion which allows to represent the action value function using SVR. This approach for RL is the first one we know using SVR compatible to the agent interaction- with-the-environment framework of RL which shows his power by solving a large number of benchmark problems, including very difficult ones, like the bicycle driving and riding control problem. In addition, unlike most RL approaches to generalization, we develop a proof finding theoretical bounds for the convergence of the method to the optimal solution under given conditions.Mediante el uso de aprendizaje por refuerzo (RL), un agente autónomo interactuando con el medio ambiente puede aprender a tomar adecuada acciones para cada situación con el fin de lograr de manera óptima su propia meta. RL proporciona una metodología general capaz de resolver problemas de decisión complejos que pueden estar presentes en muchas aplicaciones del mundo real. Problemas RL usualmente se modelan como una Procesos de Decisión de Markov (MDP) estudiados profundamente en la literatura. La principal peculiaridad de un algoritmo de RL es que el agente es asumido para aprender las políticas óptimas de sus experiencias sin saber los parámetros de la MDP. El elemento clave en resolver el MDP está en el aprender una función de valor que da la expectativa de recompensa total que un agente puede esperar en su estado actual para tomar una acción determinada. Esta función de valor permite obtener la política óptima. En esta tesis se estudia la capacidad del SVR utilizando núcleo métodos para adaptarse y resolver problemas RL complejas en el espacio estado grande o continua. RVS puede ser estudiado mediante un interpretación geométrica en términos de margen óptimo o puede ser visto como un problema de regularización dado en un Reproducing Kernel Hilbert Space (RKHS). SVR tiene buenas propiedades sobre la capacidad de generalización y ya que se basan en una optimización convexa problema, ellos no sufren de sub-optimalidad. SVR son no paramétrico que muestra la capacidad de adaptarse automáticamente a la complejidad del problema. En consecuencia, la aplicación de RVS para aproximar funciones de valor suena para ser un buen enfoque. SVR puede resolver tanto en modo batch cuando todo el conjunto de muestra de entrenamiento están a disposición de los agentes de aprendizaje o incrementalmente que permite la adición o eliminación de muestras de entrenamiento muy eficaz. Incremental SVR encuentra las condiciones adecuadas para KKT nuevas o actualizadas de datos modificando sus influencias en la función de regresión mantener consistencia en las condiciones KKT para el resto de los datos utilizados para el aprendizaje. En los problemas de RL una RVS elemental será capaz de aproximar la función de valor de acción que conduce a la política óptima. En consecuencia, la carga de cálculo debería ser menor, la velocidad de aprendizaje más rápido y generalización más efectivo que el otro método existente La contribución general que viene de nuestro trabajo es desarrollar, formalizar, ejecutar y estudiar una nueva técnica de RL para la generalización en espacio de estados discretos y continuos con acciones finitas. Nuestro método utiliza el marco de la Approximate Policy Iteration (API) con el criterio de BRM que permite representar la función de valor de acción utilizando SVR. Este enfoque de RL es el primero que conocemos usando SVR compatible con el marco de RL con agentes interaccionado con el ambiente que muestra su poder mediante la resolución de un gran número de problemas de referencia, incluyendo los muy difíciles, como la conducción de bicicletas y problema de control de conducción. Además, a diferencia de la mayoría RL se acerca a la generalización, desarrollamos un hallazgo prueba límites teóricos para la convergencia del método a la solución óptima en condiciones dadas

    Inference, Computation, and Games

    Get PDF
    In this thesis, we use statistical inference and competitive games to design algorithms for computational mathematics. In the first part, comprising chapters two through six, we use ideas from Gaussian process statistics to obtain fast solvers for differential and integral equations. We begin by observing the equivalence of conditional (near-)independence of Gaussian processes and the (near-)sparsity of the Cholesky factors of its precision and covariance matrices. This implies the existence of a large class of dense matrices with almost sparse Cholesky factors, thereby greatly increasing the scope of application of sparse Cholesky factorization. Using an elimination ordering and sparsity pattern motivated by the screening effect in spatial statistics, we can compute approximate Cholesky factors of the covariance matrices of Gaussian processes admitting a screening effect in near-linear computational complexity. These include many popular smoothness priors such as the Matérn class of covariance functions. In the special case of Green's matrices of elliptic boundary value problems (with possibly unknown elliptic operators of arbitrarily high order, with possibly rough coefficients), we can use tools from numerical homogenization to prove the exponential accuracy of our method. This result improves the state-of-the-art for solving general elliptic integral equations and provides the first proof of an exponential screening effect. We also derive a fast solver for elliptic partial differential equations, with accuracy-vs-complexity guarantees that improve upon the state-of-the-art. Furthermore, the resulting solver is performant in practice, frequently beating established algebraic multigrid libraries such as AMGCL and Trilinos on a series of challenging problems in two and three dimensions. Finally, for any given covariance matrix, we obtain a closed-form expression for its optimal (in terms of Kullback-Leibler divergence) approximate inverse-Cholesky factorization subject to a sparsity constraint, recovering Vecchia approximation and factorized sparse approximate inverses. Our method is highly robust, embarrassingly parallel, and further improves our asymptotic results on the solution of elliptic integral equations. We also provide a way to apply our techniques to sums of independent Gaussian processes, resolving a major limitation of existing methods based on the screening effect. As a result, we obtain fast algorithms for large-scale Gaussian process regression problems with possibly noisy measurements. In the second part of this thesis, comprising chapters seven through nine, we study continuous optimization through the lens of competitive games. In particular, we consider competitive optimization, where multiple agents attempt to minimize conflicting objectives. In the single-agent case, the updates of gradient descent are minimizers of quadratically regularized linearizations of the loss function. We propose to generalize this idea by using the Nash equilibria of quadratically regularized linearizations of the competitive game as updates (linearize the game). We provide fundamental reasons why the natural notion of linearization for competitive optimization problems is given by the multilinear (as opposed to linear) approximation of the agents' loss functions. The resulting algorithm, which we call competitive gradient descent, thus provides a natural generalization of gradient descent to competitive optimization. By using ideas from information geometry, we extend CGD to competitive mirror descent (CMD) that can be applied to a vast range of constrained competitive optimization problems. CGD and CMD resolve the cycling problem of simultaneous gradient descent and show promising results on problems arising in constrained optimization, robust control theory, and generative adversarial networks. Finally, we point out the GAN-dilemma that refutes the common interpretation of GANs as approximate minimizers of a divergence obtained in the limit of a fully trained discriminator. Instead, we argue that GAN performance relies on the implicit competitive regularization (ICR) due to the simultaneous optimization of generator and discriminator and support this hypothesis with results on low-dimensional model problems and GANs on CIFAR10.</p
    corecore