Connectivity of Direct Products of Graphs
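Let $\kappa(G)$ be the connectivity of $G$ and $G \times H$ the direct product
of $G$ and $H$. We prove that for any graphs $G$ and $K_n$ with $n \ge 3$,
$\kappa(G \times K_n) = \min\{n\kappa(G), (n-1)\delta(G)\}$, which was conjectured
by Guji and Vumar. Comment: 5 pages, accepted by Ars Com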
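As a small sanity check of this kind of statement, the sketch below builds a direct product by brute force and compares its vertex connectivity with a closed-form value. It assumes the result reads $\kappa(G \times K_n) = \min\{n\kappa(G), (n-1)\delta(G)\}$ for $n \ge 3$, with $\delta(G)$ the minimum degree. The helper names (`direct_product`, `connectivity`, and so on) are illustrative and not taken from the paper, and the exhaustive search is only practical for very small graphs.

```python
from itertools import combinations


def direct_product(adj_g, adj_h):
    """Direct (tensor) product: (g, h) ~ (g2, h2) iff g ~ g2 in G and h ~ h2 in H."""
    return {
        (g, h): {(g2, h2) for g2 in adj_g[g] for h2 in adj_h[h]}
        for g in adj_g
        for h in adj_h
    }


def is_connected(adj, removed=frozenset()):
    """Depth-first search on the graph with the vertices in `removed` deleted."""
    nodes = [v for v in adj if v not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def connectivity(adj):
    """Smallest number of vertices whose removal disconnects the graph.

    Exhaustive search over vertex subsets; returns |V| - 1 if no cut exists
    (the usual convention for complete graphs).
    """
    n = len(adj)
    for k in range(n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, set(cut)):
                return k
    return n - 1


def complete_graph(n):
    return {i: {j for j in range(n) if j != i} for i in range(n)}


def min_degree(adj):
    return min(len(neighbours) for neighbours in adj.values())


# Example: G is the path a - b - c, so kappa(G) = 1 and delta(G) = 1.
path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
n = 3
product = direct_product(path3, complete_graph(n))

lhs = connectivity(product)
rhs = min(n * connectivity(path3), (n - 1) * min_degree(path3))
print(lhs, rhs)  # prints "2 2"
```

Here the minimum-degree term is the binding one: each end vertex of the path has only $n - 1 = 2$ neighbours in the product, so deleting those two neighbours isolates it.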
Learning to Control in Metric Space with Optimal Regret
We study online reinforcement learning for finite-horizon deterministic
control systems with arbitrary state and action spaces. Suppose that the
transition dynamics and reward function are unknown, but the state and action
space is endowed with a metric that characterizes the proximity between
different states and actions. We provide a surprisingly simple upper-confidence
reinforcement learning algorithm that uses a function approximation oracle to
estimate optimistic Q functions from experiences. We show that the regret of
the algorithm after $K$ episodes is $O(HL(KH)^{\frac{d-1}{d}})$, where $L$ is a
smoothness parameter and $d$ is the doubling dimension of the state-action
space with respect to the given metric. We also establish a near-matching
regret lower bound. The proposed method can be adapted to work for more
structured transition systems, including the finite-state case and the case
where value functions are linear combinations of features, where the method
also achieves the optimal regret.