4,327 research outputs found

    Time Complexity of Decentralized Fixed-Mode Verification

    Get PDF
    Given an interconnected system, this note is concerned with the time complexity of verifying whether an unrepeated mode of the system is a decentralized fixed mode (DFM). It is shown that checking the decentralized fixedness of any distinct mode is tantamount to testing the strong connectivity of a digraph formed based on the system. It is subsequently proved that the time complexity of this decision problem using the proposed approach is the same as the complexity of matrix multiplication. This work concludes that the identification of distinct DFMs (by means of a deterministic algorithm, rather than a randomized one) is computationally very easy, although the existing algorithms for solving this problem would wrongly imply that it is cumbersome. This note provides not only a complexity analysis, but also an efficient algorithm for tackling the underlying problem

    VIME: Variational Information Maximizing Exploration

    Full text link
    Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.Comment: Published in Advances in Neural Information Processing Systems 29 (NIPS), pages 1109-111
    • …
    corecore