
    On Performance Estimation in Automatic Algorithm Configuration

    Over the last decade, research on automated parameter tuning, often referred to as automatic algorithm configuration (AAC), has made significant progress. Although the usefulness of such tools has been widely recognized in real-world applications, the theoretical foundations of AAC are still very weak. This paper addresses this gap by studying the performance estimation problem in AAC. More specifically, it first proves the universally best performance estimator in a practical setting, and then establishes theoretical bounds on the estimation error, i.e., the difference between the training performance and the true performance of a parameter configuration, for finite and infinite configuration spaces respectively. These findings were verified in extensive experiments conducted on four algorithm configuration scenarios involving different problem domains. Moreover, insights for enhancing existing AAC methods are identified. Comment: accepted by AAAI 202
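    A minimal sketch of the quantity at stake, with hypothetical names (run_cost and train_instances are illustrative, not from the paper): the training performance is the mean cost of a configuration over a finite training sample, and the estimation error the paper bounds is the gap between this mean and the true expected cost.

```python
import random
import statistics

def training_performance(run_cost, config, train_instances, runs=3):
    """Hypothetical estimator: average the observed cost of `runs`
    runs of `config` on each training instance. The paper's bounds
    concern the gap between this average and the true expected cost."""
    costs = [run_cost(config, inst)
             for inst in train_instances
             for _ in range(runs)]
    return statistics.mean(costs)

# Usage with a toy noisy cost function standing in for real solver runs.
toy_cost = lambda cfg, inst: cfg["penalty"] + inst + random.gauss(0.0, 0.1)
print(training_performance(toy_cost, {"penalty": 1.0}, [0.2, 0.5, 0.9]))
```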

    Automatic Differentiation of Rigid Body Dynamics for Optimal Control and Estimation

    Many algorithms for control, optimization and estimation in robotics depend on derivatives of the underlying system dynamics, e.g. to compute linearizations, sensitivities or gradient directions. However, we show that when dealing with rigid body dynamics, these derivatives are difficult to derive analytically and to implement efficiently. To overcome this issue, we extend the modelling tool 'RobCoGen' to be compatible with automatic differentiation. Additionally, we propose how to automatically obtain the derivatives and generate highly efficient source code. We highlight the flexibility and performance of the approach in two application examples. First, we show a trajectory optimization example for the quadrupedal robot HyQ, which employs auto-differentiation on the dynamics, including a contact model. Second, we present a hardware experiment in which a 6-DoF robotic arm avoids a randomly moving obstacle in a go-to task by fast, dynamic replanning.
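    The core idea, obtaining exact dynamics derivatives automatically rather than by hand, can be sketched with forward-mode dual numbers; the one-line quadratic-drag "dynamics" below is a stand-in for a rigid-body model, not the paper's RobCoGen-based implementation.

```python
class Dual:
    """Minimal forward-mode autodiff value carrying f(x) and df/dx."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def v_dot(v, u, c=0.1):
    # Toy 1-DoF "dynamics" with quadratic drag: v_dot = u - c * v^2.
    return u + (-c) * v * v

d = v_dot(Dual(2.0, 1.0), 0.5)  # seed dv/dv = 1 to differentiate w.r.t. v
print(d.val, d.dot)             # value 0.1 and the exact derivative -0.4
```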

    autoAx: An Automatic Design Space Exploration and Circuit Building Methodology utilizing Libraries of Approximate Components

    Approximate computing is an emerging paradigm for developing highly energy-efficient computing systems such as various accelerators. In the literature, many libraries of elementary approximate circuits have already been proposed to simplify the design process of approximate accelerators. Because these libraries contain tens to thousands of approximate implementations for a single arithmetic operation, it is intractable to find an optimal combination of approximate circuits in the library even for an application consisting of a few operations. An open problem is "how to effectively combine circuits from these libraries to construct complex approximate accelerators". This paper proposes a novel methodology for searching, selecting and combining the most suitable approximate circuits from a set of available libraries to generate an approximate accelerator for a given application. To enable fast design space generation and exploration, the methodology utilizes machine learning techniques to create computational models estimating the overall quality of processing and hardware cost without performing full synthesis at the accelerator level. Using the methodology, we construct hundreds of approximate accelerators (for a Sobel edge detector) showing different but relevant trade-offs between the quality of processing and hardware cost, and identify the corresponding Pareto frontier. Furthermore, when searching for approximate implementations of a generic Gaussian filter consisting of 17 arithmetic operations, the proposed approach allows us to identify approximately 10^3 highly important implementations from 10^23 possible solutions in a few hours, whereas an exhaustive search would take four months on a high-end processor. Comment: Accepted for publication at the Design Automation Conference 2019 (DAC'19), Las Vegas, Nevada, US
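    As a sketch of the final step of such a flow, the snippet below extracts a Pareto front from predicted (quality error, hardware cost) pairs; the numbers are placeholders, and the paper's actual machine-learned quality and cost models are not reproduced here.

```python
def pareto_front(points):
    """Keep the non-dominated (error, cost) points, minimizing both:
    a point is dropped if another, distinct point is no worse in both
    objectives (exact duplicates compare equal, so both survive)."""
    return [p for p in points
            if not any(q != p and q[0] <= p[0] and q[1] <= p[1]
                       for q in points)]

# Predicted (quality error, hardware cost) for candidate accelerators.
candidates = [(0.02, 9.0), (0.05, 6.5), (0.05, 7.0), (0.10, 4.0)]
print(pareto_front(candidates))  # (0.05, 7.0) is dominated by (0.05, 6.5)
```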

    AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

    CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs, whose programming is conventionally an RTL design practice. Although recent advances in high-level synthesis (HLS) significantly improve FPGA programmability, programmers still face the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs to high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template for accelerator designs. Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels and, more importantly, drastically reduces the design space. Also, we introduce an analytical model to capture the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to make the entire accelerator generation automated. AutoAccel accepts a software program as input and performs a series of code transformations, guided by the analytical-model-based design space exploration, to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels.
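    A toy illustration of analytical-model-driven design space exploration (the knobs, the DSP cost model and the cycle model below are invented for this sketch; AutoAccel's real models cover the CPP microarchitecture and actual FPGA resources):

```python
from itertools import product

def explore(total_ops, dsp_budget=512):
    """Score every configuration with closed-form models instead of
    full synthesis, and keep the fastest design that fits the budget."""
    best = None
    for unroll, ii in product([1, 2, 4, 8, 16, 32], [1, 2, 4]):
        dsps = 5 * unroll                 # assumed resource model
        if dsps > dsp_budget:
            continue                      # prune infeasible designs
        cycles = ii * total_ops / unroll  # assumed performance model
        if best is None or cycles < best[0]:
            best = (cycles, {"unroll": unroll, "ii": ii, "dsps": dsps})
    return best

print(explore(1_000_000))  # -> (31250.0, {'unroll': 32, 'ii': 1, 'dsps': 160})
```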