118 research outputs found
Learning Event-triggered Control from Data through Joint Optimization
We present a framework for model-free learning of event-triggered control
strategies. Event-triggered methods aim to achieve high control performance
while only closing the feedback loop when needed. This enables resource
savings, e.g., network bandwidth if control commands are sent via communication
networks, as in networked control systems. Event-triggered controllers consist
of a communication policy, determining when to communicate, and a control
policy, deciding what to communicate. It is essential to jointly optimize the
two policies since individual optimization does not necessarily yield the
overall optimal solution. To address this need for joint optimization, we
propose a novel algorithm based on hierarchical reinforcement learning. The
resulting algorithm is shown to accomplish high-performance control in line
with resource savings and scales seamlessly to nonlinear and high-dimensional
systems. The method's applicability to real-world scenarios is demonstrated
through experiments on a six degrees of freedom real-time controlled
manipulator. Further, we propose an approach towards evaluating the stability
of the learned neural network policies.
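A minimal sketch of the split between a communication policy (when to send) and a control policy (what to send) described in the abstract above. It is not the paper's learned, hierarchical method: the scalar plant, the fixed linear gain `k`, the threshold trigger, and all parameter values are illustrative assumptions.

```python
def simulate(threshold, steps=50, a=1.1, b=1.0, k=0.6, x0=1.0):
    """Simulate x[t+1] = a*x[t] + b*u[t] with an event-triggered loop.

    All dynamics and gains here are hypothetical, chosen only for illustration.
    Returns the final state and the number of transmissions.
    """
    x, u, sent = x0, 0.0, 0
    x_last_sent = None
    for _ in range(steps):
        # Communication policy (assumed): transmit only when the state has
        # drifted more than `threshold` from the last transmitted state.
        if x_last_sent is None or abs(x - x_last_sent) > threshold:
            # Control policy (assumed): simple linear state feedback.
            u = -k * x
            x_last_sent = x
            sent += 1
        # Between transmissions the actuator holds the last command.
        x = a * x + b * u
    return x, sent


x_event, sent_event = simulate(threshold=0.05)
x_always, sent_always = simulate(threshold=0.0)
print(f"event-triggered: {sent_event} transmissions, |x| = {abs(x_event):.4f}")
print(f"threshold 0.0:   {sent_always} transmissions, |x| = {abs(x_always):.4f}")
```

With a nonzero threshold the loop is closed less often, at the cost of the state settling only to within a small band around the origin rather than converging to it.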
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning-based
approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches, and finally the influential papers of the field.
Comment: version 5.0 (updated September 2018). Preprint version of our
article accepted at ACM CSUR 2018 (42 pages). This survey will be updated
quarterly here (send me your newly published papers to be added in the
subsequent version). History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
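A small sketch of why the two problems named in the abstract above produce large search spaces. The pass names are hypothetical, and the ordering problem is simplified here to permutations of all passes with no repetition; real phase-ordering spaces can also allow repeated and omitted passes.

```python
from itertools import permutations, product

# Hypothetical pass names, for illustration only.
passes = ["inline", "unroll", "vectorize"]

# Optimization selection: each pass is either on or off (2**n subsets).
selections = [
    tuple(p for p, on in zip(passes, mask) if on)
    for mask in product([False, True], repeat=len(passes))
]

# Phase-ordering (simplified): every order of applying all n passes (n!).
orderings = list(permutations(passes))

print(len(selections), "selections;", len(orderings), "orderings")
```

Even under this simplification both counts grow quickly with the number of passes, which is the motivation the survey gives for learned search strategies.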
Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
Predicting the number of clock cycles a processor takes to execute a block of
assembly instructions in steady state (the throughput) is important for both
compiler designers and performance engineers. Building an analytical model to
do so is especially complicated in modern x86-64 Complex Instruction Set
Computer (CISC) machines with sophisticated processor microarchitectures in
that it is tedious, error prone, and must be performed from scratch for each
processor generation. In this paper we present Ithemal, the first tool which
learns to predict the throughput of a set of instructions. Ithemal uses a
hierarchical LSTM-based approach to predict throughput based on the opcodes
and operands of instructions in a basic block. We show that Ithemal is more
accurate than state-of-the-art hand-written tools currently used in compiler
backends and static machine code analyzers. In particular, our model has less
than half the error of state-of-the-art analytical models (LLVM's llvm-mca and
Intel's IACA). Ithemal is also able to predict these throughput values just as
fast as the aforementioned tools, and is easily ported across a variety of
processor microarchitectures with minimal developer effort.
Comment: Published at 36th International Conference on Machine Learning (ICML)
201
Task Feasibility Maximization using Model-Free Policy Search and Model-Based Whole-Body Control
Producing feasible motions for highly redundant robots, such as humanoids, is a complicated and high-dimensional problem. Model-based whole-body control of such robots can generate complex dynamic behaviors through the simultaneous execution of multiple tasks. Unfortunately, tasks are generally planned without close consideration for the underlying controller being used, or the other tasks being executed, and are often infeasible when executed on the robot. Consequently, there is no guarantee that the motion will be accomplished. In this work, we develop an optimization loop which automatically improves task feasibility using model-free policy search in conjunction with model-based whole-body control. This combination allows problems to be solved which would otherwise be intractable using only one or the other. Through experiments on both the simulated and real iCub humanoid robot, we show that by optimizing task feasibility, initially infeasible complex dynamic motions can be realized, specifically a sit-to-stand transition.
Actor-Critic based Improper Reinforcement Learning
We consider an improper reinforcement learning setting where a learner is
given base controllers for an unknown Markov decision process, and wishes
to combine them optimally to produce a potentially new controller that can
outperform each of the base ones. This can be useful in tuning across
controllers, learnt possibly in mismatched or simulated environments, to obtain
a good controller for a given target environment with relatively few trials.
Towards this, we propose two algorithms: (1) a Policy Gradient-based
approach; and (2) an algorithm that can switch between a simple Actor-Critic
(AC) based scheme and a Natural Actor-Critic (NAC) scheme depending on the
available information. Both algorithms operate over a class of improper
mixtures of the given controllers. For the first case, we derive convergence
rate guarantees assuming access to a gradient oracle. For the AC-based approach
we provide convergence rate guarantees to a stationary point in the basic AC
case and to a global optimum in the NAC case. Numerical results on (i) the
standard control-theoretic benchmark of stabilizing a cartpole; and (ii) a
constrained queueing task show that our improper policy optimization algorithm
can stabilize the system even when the base policies at its disposal are
unstable.
Comment: arXiv admin note: substantial text overlap with arXiv:2102.0820
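A minimal sketch of what a mixture over given base controllers can look like. The softmax parameterization, the function names, and the two toy controllers are assumptions made for illustration; the abstract above only states that the algorithms operate over a class of improper mixtures, and no learning update is shown here.

```python
import math
import random


def softmax(theta):
    """Map unconstrained weights to mixture probabilities."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_action(theta, controllers, state, rng):
    """Sample one base controller according to softmax(theta) and apply it."""
    probs = softmax(theta)
    idx = rng.choices(range(len(controllers)), weights=probs, k=1)[0]
    return idx, controllers[idx](state)


# Two hypothetical base controllers for a scalar state.
controllers = [lambda x: -0.5 * x, lambda x: -2.0 * x]
theta = [0.0, 0.0]  # equal weights give a uniform mixture

rng = random.Random(0)
idx, action = mixture_action(theta, controllers, state=1.0, rng=rng)
print(softmax(theta), idx, action)
```

A learner in this setting would adjust `theta` from observed returns so that the mixture outperforms each base controller alone.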