Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
We derive sublinear regret bounds for undiscounted reinforcement learning in
continuous state space. The proposed algorithm combines state aggregation with
the use of upper confidence bounds for implementing optimism in the face of
uncertainty. Besides the existence of an optimal policy which satisfies the
Poisson equation, the only assumptions made are Hölder continuity of rewards
and transition probabilities.
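For orientation, a Hölder continuity assumption of this kind is commonly written as below. This is a sketch under stated assumptions, not the paper's exact statement: the constants $L > 0$ and $\alpha \in (0, 1]$, the one-dimensional state distance $|s - s'|$, and the choice of the $\ell_1$ norm on transition distributions are all illustrative.

```latex
% Assumed notation: r = mean reward, p = transition kernel,
% L > 0 and 0 < alpha <= 1 are the Hölder constants.
\[
  |r(s,a) - r(s',a)| \le L\,|s - s'|^{\alpha},
  \qquad
  \bigl\| p(\cdot \mid s,a) - p(\cdot \mid s',a) \bigr\|_{1} \le L\,|s - s'|^{\alpha}
  \quad \text{for all states } s, s' \text{ and actions } a.
\]
```

Under such a condition, states that fall into the same aggregation cell of width $\Delta$ have rewards differing by at most $L\,\Delta^{\alpha}$, which is what makes treating them as a single aggregated state a controlled approximation.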
Machine Learning in Wireless Sensor Networks: Algorithms, Strategies, and Applications
Wireless sensor networks monitor dynamic environments that change rapidly
over time. This dynamic behavior is either caused by external factors or
initiated by the system designers themselves. To adapt to such conditions,
sensor networks often adopt machine learning techniques to eliminate the need
for unnecessary redesign. Machine learning also inspires many practical
solutions that maximize resource utilization and prolong the lifespan of the
network. In this paper, we present an extensive literature review over the
period 2002-2013 of machine learning methods that were used to address common
issues in wireless sensor networks (WSNs). The advantages and disadvantages of
each proposed algorithm are evaluated against the corresponding problem. We
also provide a comparative guide to aid WSN designers in developing suitable
machine learning solutions for their specific application challenges.
Comment: Accepted for publication in IEEE Communications Surveys and Tutorials
Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures
Reinforcement learning (RL) constitutes a promising solution for alleviating
the problem of traffic congestion. In particular, deep RL algorithms have been
shown to produce adaptive traffic signal controllers that outperform
conventional systems. However, in order to be reliable in highly dynamic urban
areas, such controllers need to be robust with respect to a series of
exogenous sources of uncertainty. In this paper, we develop an open-source
callback-based framework for promoting the flexible evaluation of different
deep RL configurations under a traffic simulation environment. With this
framework, we investigate how deep RL-based adaptive traffic controllers
perform under different scenarios, namely under demand surges caused by special
events, capacity reductions from incidents and sensor failures. We extract
several key insights for the development of robust deep RL algorithms for
traffic control and propose concrete designs to mitigate the impact of the
considered exogenous uncertainties.
Comment: 8 pages
Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems
Many modern nonlinear control methods aim to endow systems with guaranteed
properties, such as stability or safety, and have been successfully applied to
the domain of robotics. However, model uncertainty remains a persistent
challenge, weakening theoretical guarantees and causing implementation failures
on physical systems. This paper develops a machine learning framework centered
around Control Lyapunov Functions (CLFs) to adapt to parametric uncertainty and
unmodeled dynamics in general robotic systems. Our proposed method proceeds by
iteratively updating estimates of Lyapunov function derivatives and improving
controllers, ultimately yielding a stabilizing quadratic program model-based
controller. We validate our approach on a planar Segway simulation,
demonstrating substantial performance improvements by iteratively refining a
base model-free controller.
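As a rough illustration of the kind of controller the abstract refers to, the sketch below solves a single-constraint CLF quadratic program in closed form. It is a minimal sketch under assumptions, not the paper's method: the function name `min_norm_clf_control`, the parameter `rate`, and the control-affine form of the Lyapunov derivative (`lf_v + lg_v . u`) are hypothetical, and the learned derivative estimates described in the abstract are replaced here by values passed in directly.

```python
def min_norm_clf_control(lf_v, lg_v, v, rate=1.0):
    """Minimum-norm input u solving
           min ||u||^2   s.t.   lf_v + lg_v . u <= -rate * v
    in closed form (hypothetical helper, for illustration only).

    lf_v : drift term of the Lyapunov derivative (float)
    lg_v : input-dependent term of the Lyapunov derivative (list of floats)
    v    : current value of the Lyapunov function (float)
    rate : assumed desired exponential decay rate (float)
    """
    # Amount by which the decrease condition is violated when u = 0.
    a = lf_v + rate * v
    b_sq = sum(b * b for b in lg_v)
    if a <= 0.0:
        # u = 0 already satisfies the decrease condition.
        return [0.0 for _ in lg_v]
    if b_sq == 0.0:
        # The input cannot affect the derivative; the constraint is
        # infeasible here, so this sketch simply falls back to zero input.
        return [0.0 for _ in lg_v]
    # Project onto the constraint boundary: smallest u with a + lg_v . u = 0.
    scale = -a / b_sq
    return [scale * b for b in lg_v]


# Toy example (assumed system): x' = x + u with V(x) = x^2 / 2 at x = 2,
# so lf_v = x * x = 4, lg_v = [x] = [2], and V = 2.
u = min_norm_clf_control(4.0, [2.0], 2.0, rate=1.0)
```

In the toy example the resulting input makes the Lyapunov derivative `4 + 2 * u[0]` equal to `-rate * V`, so the constraint is met with equality, which is the expected behavior of a minimum-norm solution when the constraint is active.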