
    CompILE: Compositional Imitation Learning and Execution

    We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data. CompILE uses a novel unsupervised, fully-differentiable sequence segmentation module to learn latent encodings of sequential data that can be re-composed and executed to perform new tasks. Once trained, our model generalizes to sequences of longer length and from environment instances not seen during training. We evaluate CompILE in a challenging 2D multi-task environment and a continuous control task, and show that it can find correct task boundaries and event encodings in an unsupervised manner. Latent codes and associated behavior policies discovered by CompILE can be used by a hierarchical agent, where the high-level policy selects actions in the latent code space, and the low-level, task-specific policies are simply the learned decoders. We found that our CompILE-based agent could learn given only sparse rewards, where agents without task-specific policies struggle. Comment: ICML (2019)

    Formal Estimation of Collision Risks for Autonomous Vehicles: A Compositional Data-Driven Approach

    In this work, we propose a compositional data-driven approach for the formal estimation of collision risks for autonomous vehicles (AVs) while acting in a stochastic multi-agent framework. The proposed approach is based on the construction of sub-barrier certificates for each stochastic agent via a set of data collected from its trajectories while providing an a priori guaranteed confidence on the data-driven estimation. In our proposed setting, we first cast the original collision risk problem for each agent as a robust optimization program (ROP). Solving the acquired ROP is not tractable due to an unknown model that appears in one of its constraints. To tackle this difficulty, we collect finite numbers of data from trajectories of each agent and provide a scenario optimization program (SOP) corresponding to the original ROP. We then establish a probabilistic bridge between the optimal value of SOP and that of ROP, and accordingly, we formally construct the sub-barrier certificate for each unknown agent based on the number of data and a required level of confidence. We then propose a compositional technique based on small-gain reasoning to quantify the collision risk for multi-agent AVs with some desirable confidence based on sub-barrier certificates of individual agents constructed from data. For the case that the proposed compositionality conditions are not satisfied, we provide a relaxed version of compositional results without requiring any compositionality conditions but at the cost of providing a potentially conservative collision risk. Eventually, we also present our approaches for non-stochastic multi-agent AVs. We demonstrate the effectiveness of our proposed results by applying them to a vehicle platooning consisting of 100 vehicles with 1 leader and 99 followers. We formally estimate the collision risk by collecting data from trajectories of each agent. Comment: This work has been accepted at IEEE Transactions on Control of Network Systems

    Decentralized bisimulation for multiagent systems

    Copyright © 2015, International Foundation for Autonomous Agents and Multiagent Systems. The notion of bisimulation has been introduced as a powerful way to abstract from details of systems in the formal verification community. When applied to multiagent systems, classical bisimulations will allow one agent to make decisions based on full histories of others. Thus, as a general concept, classical bisimulations are unrealistically powerful for such systems. In this paper, we define a coarser notion of bisimulation under which an agent can only make realistic decisions based on information available to it. Our bisimulation still implies trace distribution equivalence of the systems, and moreover, it allows a compositional abstraction framework of reasoning about the systems.

    A Compositional Framework for Preference-Aware Agents

    A formal description of a Cyber-Physical system should include a rigorous specification of the computational and physical components involved, as well as their interaction. Such a description, thus, lends itself to a compositional model where every module in the model specifies the behavior of a (computational or physical) component or the interaction between different components. We propose a framework based on Soft Constraint Automata that facilitates the component-wise description of such systems and includes the tools necessary to compose subsystems in a meaningful way, to yield a description of the entire system. Most importantly, Soft Constraint Automata allow the description and composition of components' preferences as well as environmental constraints in a uniform fashion. We illustrate the utility of our framework using a detailed description of a patrolling robot, while highlighting methods of composition as well as possible techniques to employ them. Comment: In Proceedings V2CPS-16, arXiv:1612.0402

    Projective simulation for artificial intelligence

    We propose a model of a learning agent whose interaction with the environment is governed by a simulation-based projection, which allows the agent to project itself into future situations before it takes real action. Projective simulation is based on a random walk through a network of clips, which are elementary patches of episodic memory. The network of clips changes dynamically, both due to new perceptual input and due to certain compositional principles of the simulation process. During simulation, the clips are screened for specific features which trigger factual action of the agent. The scheme is different from other, computational, notions of simulation, and it provides a new element in an embodied cognitive science approach to intelligent action and learning. Our model provides a natural route for generalization to quantum-mechanical operation and connects the fields of reinforcement learning and quantum computation. Comment: 22 pages, 18 figures. Close to published version, with footnotes retained