6,713 research outputs found

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents that learn to optimize long-term utility by interacting directly with the environment. Creating RL methods that scale to large state-action spaces is critical to deploying RL systems in the real world. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints such as decentralization, and a lack of guarantees about important properties such as performance, generalization, and robustness in potentially unseen scenarios. This thesis aims to bridge this gap. We propose several principled algorithms and frameworks for studying and addressing these challenges in RL. The proposed methods cover a wide range of RL settings: single-agent and multi-agent systems (MAS), with all the variations in the latter; prediction and control; model-based and model-free methods; and value-based and policy-based methods. We present the first results on several problems: tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. We also shed light on the generalization behavior of the agents under different frameworks. These properties are enabled by several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, and tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    This book offers a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using color and depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing and a multidisciplinary topic to analyze, so it requires familiarity with other areas of Artificial Intelligence (AI) such as machine learning and digital image processing, as well as psychology and more. This makes it a good opportunity to write a book that covers all of these topics for readers in the field of AI, from beginners to professionals, and even for readers with no AI background. Our goal is to provide a standalone introduction to FMER analysis in the form of theoretical descriptions, for readers with no background in image processing, together with reproducible MATLAB practical examples. We also describe the basic definitions for FMER analysis and the MATLAB libraries used in the text, which helps the reader apply the experiments in real-world applications. We believe that this book is suitable for students, researchers, and professionals alike who need to develop practical skills along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with key stages such as color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expressions recognition, feature extraction, and dimensionality reduction. Comment: This is the second edition of the boo

    Green Carbon Footprint for Model Inference Serving via Exploiting Mixed-Quality Models and GPU Partitioning

    This paper presents a solution to the challenge of mitigating carbon emissions from large-scale high performance computing (HPC) systems and datacenters that host machine learning (ML) inference services. ML inference is critical to modern technology products, but it is also a significant contributor to datacenter compute cycles and carbon emissions. We introduce Clover, a carbon-friendly ML inference service runtime system that balances performance, accuracy, and carbon emissions through mixed-quality models and GPU resource partitioning. Our experimental results demonstrate that Clover is effective in substantially reducing carbon emissions while maintaining high accuracy and meeting service level agreement (SLA) targets. Therefore, it is a promising solution toward achieving carbon neutrality in HPC systems and datacenters.

    Computational Geometry Contributions Applied to Additive Manufacturing

    This Doctoral Thesis develops novel applications of Computational Geometry to Additive Manufacturing, as follows: (1) Shape Optimization in Lattice Structures. Implementation and sensitivity analysis of the SIMP (Solid Isotropic Material with Penalization) topology optimization strategy. Implementation of a method to transform density maps, resulting from topology optimization, into surface lattice structures. Procedure to integrate material homogenization and Design of Experiments (DOE) to estimate the stress/strain response of large surface lattice domains. (2) Simulation of Laser Metal Deposition. Finite Element Method implementation of a 2D nonlinear thermal model of the Laser Metal Deposition (LMD) process considering temperature-dependent material properties, phase change and radiation. Finite Element Method implementation of a 2D linear transient thermal model for a metal substrate that is heated by the action of a laser. (3) Process Planning for Laser Metal Deposition. Implementation of a 2.5D path planning method for Laser Metal Deposition. Conceptualization of a workflow for the synthesis of the Reeb Graph for a solid region in ℝ³ denoted by its Boundary Representation (B-Rep). Implementation of a voxel-based geometric simulator for the LMD process. Conceptualization, implementation, and validation of a tool for the minimization of material over-deposition at corners in LMD. Implementation of a 3D (non-planar) slicing and path planning method for the LMD manufacturing of overhanging features in revolute workpieces. The aforementioned contributions have been screened by the international scientific community via journal and conference submissions and publications.

    Modelling, Monitoring, Control and Optimization for Complex Industrial Processes

    This reprint includes 22 research papers and an editorial, collected from the Special Issue "Modelling, Monitoring, Control and Optimization for Complex Industrial Processes", highlighting recent research advances and emerging research directions in complex industrial processes. This reprint aims to promote the research field and to benefit readers from both academic communities and industrial sectors.

    Sparse Cholesky factorization by greedy conditional selection

    Dense kernel matrices resulting from pairwise evaluations of a kernel function arise naturally in machine learning and statistics. Previous work in constructing sparse approximate inverse Cholesky factors of such matrices by minimizing Kullback-Leibler divergence recovers the Vecchia approximation for Gaussian processes. These methods rely only on the geometry of the evaluation points to construct the sparsity pattern. In this work, we instead construct the sparsity pattern by leveraging a greedy selection algorithm that maximizes mutual information with target points, conditional on all points previously selected. For selecting $k$ points out of $N$, the naive time complexity is $\mathcal{O}(N k^4)$, but by maintaining a partial Cholesky factor we reduce this to $\mathcal{O}(N k^2)$. Furthermore, for multiple ($m$) targets we achieve a time complexity of $\mathcal{O}(N k^2 + N m^2 + m^3)$, which is maintained in the setting of aggregated Cholesky factorization where a selected point need not condition every target. We apply the selection algorithm to image classification and recovery of sparse Cholesky factors. By minimizing Kullback-Leibler divergence, we apply the algorithm to Cholesky factorization, Gaussian process regression, and preconditioning with the conjugate gradient, improving over $k$-nearest neighbors selection.
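
    The greedy selection described in this abstract can be illustrated for a single target with a minimal sketch. This is not the authors' implementation: the 1D points, the `rbf` kernel, and the names `greedy_select`, `cond_var`, and `cond_cov` are assumptions made for illustration. For Gaussian variables, maximizing mutual information with the target is equivalent to maximizing the reduction in the target's conditional variance, and keeping the partial Cholesky columns makes step $t$ cost $\mathcal{O}(N t)$, so $k$ steps cost $\mathcal{O}(N k^2)$.

```python
import math


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel on scalars (an illustrative choice)."""
    return math.exp(-0.5 * ((x - y) / length_scale) ** 2)


def greedy_select(candidates, target, k, kernel=rbf):
    """Greedily pick up to k candidates that most reduce the target's variance.

    Returns the selected candidate indices and the target's conditional
    variance given the selected points.
    """
    pts = list(candidates) + [target]
    n = len(candidates)
    t = n  # index of the target inside pts
    # Variance of each point, conditional on the points selected so far.
    cond_var = [kernel(p, p) for p in pts]
    # Covariance of each point with the target, conditional on the selection.
    cond_cov = [kernel(p, target) for p in pts]
    factor = []    # columns of the partial Cholesky factor
    selected = []
    for _ in range(k):
        best, best_score = None, 0.0
        for i in range(n):
            if i in selected or cond_var[i] < 1e-12:
                continue
            # Reduction in target variance if candidate i were added.
            score = cond_cov[i] ** 2 / cond_var[i]
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break
        # New Cholesky column: kernel column minus earlier contributions.
        col = [kernel(p, pts[best]) for p in pts]
        for prev in factor:
            for i in range(n + 1):
                col[i] -= prev[i] * prev[best]
        scale = math.sqrt(cond_var[best])
        col = [c / scale for c in col]
        factor.append(col)
        # Rank-one update of the conditional variances and covariances.
        for i in range(n + 1):
            cond_var[i] -= col[i] ** 2
            cond_cov[i] -= col[i] * col[t]
        selected.append(best)
    return selected, cond_var[t]


picked, target_var = greedy_select([0.0, 0.9, 1.1, 3.0, 5.0], target=1.0, k=2)
```

    With these inputs the second pick is the point on the other side of the target rather than the next-nearest remaining distance alone would suggest, because the score accounts for what the first pick already explains.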

    SNEkhorn: Dimension Reduction with Symmetric Entropic Affinities

    Many approaches in machine learning rely on a weighted graph to encode the similarities between samples in a dataset. Entropic affinities (EAs), which are notably used in the popular Dimensionality Reduction (DR) algorithm t-SNE, are particular instances of such graphs. To ensure robustness to heterogeneous sampling densities, EAs assign a kernel bandwidth parameter to every sample in such a way that the entropy of each row in the affinity matrix is kept constant at a specific value, whose exponential is known as perplexity. EAs are inherently asymmetric and row-wise stochastic, but they are used in DR approaches after undergoing heuristic symmetrization methods that violate both the row-wise constant entropy and stochasticity properties. In this work, we uncover a novel characterization of EA as an optimal transport problem, allowing a natural symmetrization that can be computed efficiently using dual ascent. The corresponding novel affinity matrix derives advantages from symmetric doubly stochastic normalization in terms of clustering performance, while also effectively controlling the entropy of each row, thus making it particularly robust to varying noise levels. We then present a new DR algorithm, SNEkhorn, that leverages this new affinity matrix. We show its clear superiority to state-of-the-art approaches with several indicators on both synthetic and real-world datasets.
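
    The row-wise entropy constraint that defines a standard entropic affinity can be sketched as follows. This illustrates only the asymmetric EA that the abstract takes as its starting point, not the symmetric construction or SNEkhorn itself. The Gaussian kernel form, the bisection search, and the names `row_affinity` and `entropic_affinity_row` are assumptions for illustration. The sketch expects `1 < perplexity < len(sq_dists)` and a unique smallest distance.

```python
import math


def row_affinity(sq_dists, beta):
    """Normalized Gaussian weights exp(-beta * d) over one row of distances."""
    shift = min(sq_dists)  # subtracting the minimum avoids underflow
    weights = [math.exp(-beta * (d - shift)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(p):
    """Shannon entropy in nats, skipping zero entries."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def entropic_affinity_row(sq_dists, perplexity, tol=1e-6, max_iter=200):
    """Find beta so that exp(entropy of the row) matches the perplexity.

    sq_dists holds squared distances from one sample to all other samples.
    Entropy decreases as beta grows, so bisection on beta converges.
    """
    target = math.log(perplexity)
    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the row is sharper than the target.
    while entropy(row_affinity(sq_dists, hi)) > target:
        hi *= 2.0
    beta = 0.5 * (lo + hi)
    for _ in range(max_iter):
        beta = 0.5 * (lo + hi)
        h = entropy(row_affinity(sq_dists, beta))
        if abs(h - target) < tol:
            break
        if h > target:
            lo = beta  # row too flat: increase beta
        else:
            hi = beta  # row too peaked: decrease beta
    return row_affinity(sq_dists, beta), beta


row, beta = entropic_affinity_row([1.0, 4.0, 9.0, 16.0, 25.0], perplexity=2.0)
```

    Because each row gets its own bandwidth, stacking such rows into a matrix generally gives an asymmetric result, which is the property the abstract's symmetrization addresses.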

    Stability Verification of Neural Network Controllers using Mixed-Integer Programming

    We propose a framework for the stability verification of Mixed-Integer Linear Programming (MILP) representable control policies. This framework compares a fixed candidate policy, which admits an efficient parameterization and can be evaluated at a low computational cost, against a fixed baseline policy, which is known to be stable but expensive to evaluate. We provide sufficient conditions for the closed-loop stability of the candidate policy in terms of the worst-case approximation error with respect to the baseline policy, and we show that these conditions can be checked by solving a Mixed-Integer Quadratic Program (MIQP). Additionally, we demonstrate that an outer and inner approximation of the stability region of the candidate policy can be computed by solving an MILP. The proposed framework is sufficiently general to accommodate a broad range of candidate policies including ReLU Neural Networks (NNs), optimal solution maps of parametric quadratic programs, and Model Predictive Control (MPC) policies. We also present an open-source toolbox in Python based on the proposed framework, which allows for the easy verification of custom NN architectures and MPC formulations. We showcase the flexibility and reliability of our framework in the context of a DC-DC power converter case study and investigate its computational complexity.

    Combinatorial-Based Auction For The Transportation Procurement: An Optimization-Oriented Review

    This paper reviews the literature on freight transport service procurement (FTSP) and explores the application of combinatorial auction (CA) mechanisms and the mathematical modeling of the associated problems. It provides an overview of how these problems are modeled and of their solution strategies. The results demonstrate that there has been limited scholarly attention to sustainability issues, risk mitigation and the stochastic nature of parameters. Finally, several promising future directions for FTSP research are proposed, including FTSP for green orientation in the context of carbon reduction, shipper’s reputation, carrier collaboration for bid generation, etc.