49,571 research outputs found

    Object-Oriented Dynamics Learning through Multi-Level Abstraction

    Full text link
    Object-based approaches for learning action-conditioned dynamics has demonstrated promise for generalization and interpretability. However, existing approaches suffer from structural limitations and optimization difficulties for common environments with multiple dynamic objects. In this paper, we present a novel self-supervised learning framework, called Multi-level Abstraction Object-oriented Predictor (MAOP), which employs a three-level learning architecture that enables efficient object-based dynamics learning from raw visual observations. We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability. Our results show that MAOP significantly outperforms previous methods in terms of sample efficiency and generalization over novel environments for learning environment models. We also demonstrate that learned dynamics models enable efficient planning in unseen environments, comparable to true environment models. In addition, MAOP learns semantically and visually interpretable disentangled representations.Comment: Accepted to the Thirthy-Fourth AAAI Conference On Artificial Intelligence (AAAI), 202

    Grounding semantics in robots for Visual Question Answering

    Get PDF
    In this thesis I describe an operational implementation of an object detection and description system that incorporates in an end-to-end Visual Question Answering system and evaluated it on two visual question answering datasets for compositional language and elementary visual reasoning

    Level-Set Based Artery-Vein Separation in Blood Pool Agent CE-MR Angiograms

    Get PDF
    Blood pool agents (BPAs) for contrast-enhanced (CE) magnetic-resonance angiography (MRA) allow prolonged imaging times for higher contrast and resolution. Imaging is performed during the steady state when the contrast agent is distributed through the complete vascular system. However, simultaneous venous and arterial enhancement in this steady state hampers interpretation. In order to improve visualization of the arteries and veins from steady-state BPA data, a semiautomated method for artery-vein separation is presented. In this method, the central arterial axis and central venous axis are used as initializations for two surfaces that simultaneously evolve in order to capture the arterial and venous parts of the vasculature using the level-set framework. Since arteries and veins can be in close proximity of each other, leakage from the evolving arterial (venous) surface into the venous (arterial) part of the vasculature is inevitable. In these situations, voxels are labeled arterial or venous based on the arrival time of the respective surface. The evolution is steered by external forces related to feature images derived from the image data and by internal forces related to the geometry of the level sets. In this paper, the robustness and accuracy of three external forces (based on image intensity, image gradient, and vessel-enhancement filtering) and combinations of them are investigated and tested on seven patient datasets. To this end, results with the level-set-based segmentation are compared to the reference-standard manually obtained segmentations. Best results are achieved by applying a combination of intensity- and gradient-based forces and a smoothness constraint based on the curvature of the surface. By applying this combination to the seven datasets, it is shown that, with minimal user interaction, artery-vein separation for improved arterial and venous visualization in BPA CE-MRA can be achieved

    Virtual to Real Reinforcement Learning for Autonomous Driving

    Full text link
    Reinforcement learning is considered as a promising direction for driving policy learning. However, training autonomous driving vehicle with reinforcement learning in real environment involves non-affordable trial-and-error. It is more desirable to first train in a virtual environment and then transfer to the real environment. In this paper, we propose a novel realistic translation network to make model trained in virtual environment be workable in real world. The proposed network can convert non-realistic virtual image input into a realistic one with similar scene structure. Given realistic frames as input, driving policy trained by reinforcement learning can nicely adapt to real world driving. Experiments show that our proposed virtual to real (VR) reinforcement learning (RL) works pretty well. To our knowledge, this is the first successful case of driving policy trained by reinforcement learning that can adapt to real world driving data
    • …
    corecore