Object-Oriented Dynamics Learning through Multi-Level Abstraction
Object-based approaches for learning action-conditioned dynamics have
demonstrated promise for generalization and interpretability. However, existing
approaches suffer from structural limitations and optimization difficulties for
common environments with multiple dynamic objects. In this paper, we present a
novel self-supervised learning framework, called Multi-level Abstraction
Object-oriented Predictor (MAOP), which employs a three-level learning
architecture that enables efficient object-based dynamics learning from raw
visual observations. We also design a spatial-temporal relational reasoning
mechanism for MAOP to support instance-level dynamics learning and handle
partial observability. Our results show that MAOP significantly outperforms
previous methods in terms of sample efficiency and generalization over novel
environments for learning environment models. We also demonstrate that learned
dynamics models enable efficient planning in unseen environments, comparable to
true environment models. In addition, MAOP learns semantically and visually
interpretable disentangled representations.
Comment: Accepted to the Thirty-Fourth AAAI Conference on Artificial
Intelligence (AAAI), 202
Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.
Level-Set Based Artery-Vein Separation in Blood Pool Agent CE-MR Angiograms
Blood pool agents (BPAs) for contrast-enhanced (CE) magnetic-resonance angiography (MRA) allow prolonged imaging times for higher contrast and resolution. Imaging is performed during the steady state, when the contrast agent is distributed through the complete vascular system. However, simultaneous venous and arterial enhancement in this steady state hampers interpretation. In order to improve visualization of the arteries and veins from steady-state BPA data, a semiautomated method for artery-vein separation is presented. In this method, the central arterial axis and central venous axis are used as initializations for two surfaces that simultaneously evolve in order to capture the arterial and venous parts of the vasculature using the level-set framework. Since arteries and veins can be in close proximity to each other, leakage from the evolving arterial (venous) surface into the venous (arterial) part of the vasculature is inevitable. In these situations, voxels are labeled arterial or venous based on the arrival time of the respective surface. The evolution is steered by external forces related to feature images derived from the image data and by internal forces related to the geometry of the level sets. In this paper, the robustness and accuracy of three external forces (based on image intensity, image gradient, and vessel-enhancement filtering) and combinations of them are investigated and tested on seven patient datasets. To this end, results of the level-set-based segmentation are compared to manually obtained reference-standard segmentations. Best results are achieved by applying a combination of intensity- and gradient-based forces and a smoothness constraint based on the curvature of the surface. By applying this combination to the seven datasets, it is shown that, with minimal user interaction, artery-vein separation for improved arterial and venous visualization in BPA CE-MRA can be achieved.
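The arrival-time labeling rule described in this abstract can be illustrated with a minimal sketch. This is not the authors' level-set implementation: it swaps in a simple Dijkstra-style front propagation on a 2D grid as a stand-in for surface evolution, and the names `arrival_times`, `label_by_arrival`, the `speed` grid, and the seed positions are all hypothetical. It assumes each voxel is assigned to whichever front reaches it first, with ties given to the arterial front.

```python
import heapq

INF = float("inf")


def arrival_times(speed, seeds):
    """Propagate a front from `seeds` over a 2D grid (Dijkstra-style stand-in
    for level-set evolution). Cells with speed <= 0 are never entered."""
    rows, cols = len(speed), len(speed[0])
    times = [[INF] * cols for _ in range(rows)]
    heap = []
    for r, c in seeds:
        times[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    while heap:
        t, r, c = heapq.heappop(heap)
        if t > times[r][c]:
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and speed[nr][nc] > 0:
                nt = t + 1.0 / speed[nr][nc]  # slower cells take longer to cross
                if nt < times[nr][nc]:
                    times[nr][nc] = nt
                    heapq.heappush(heap, (nt, nr, nc))
    return times


def label_by_arrival(t_artery, t_vein):
    """Label each cell by the front that arrives first (assumed tie rule: artery)."""
    labels = []
    for row_a, row_v in zip(t_artery, t_vein):
        row = []
        for a, v in zip(row_a, row_v):
            if a == INF and v == INF:
                row.append("background")
            elif a <= v:
                row.append("artery")
            else:
                row.append("vein")
        labels.append(row)
    return labels


# Hypothetical 1x5 "vessel" with an arterial seed on the left and a venous seed on the right.
speed = [[1.0, 1.0, 1.0, 1.0, 1.0]]
labels = label_by_arrival(
    arrival_times(speed, [(0, 0)]),
    arrival_times(speed, [(0, 4)]),
)
```

In the method described above, the speed of each front would depend on the image-derived external forces and the curvature-based internal force; here it is a fixed grid purely for illustration.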
Virtual to Real Reinforcement Learning for Autonomous Driving
Reinforcement learning is considered a promising direction for driving
policy learning. However, training an autonomous driving vehicle with
reinforcement learning in a real environment involves unaffordable
trial and error. It is more desirable to first train in a virtual environment
and then transfer to the real environment. In this paper, we propose a novel
realistic translation network that makes a model trained in a virtual
environment workable in the real world. The proposed network can convert a
non-realistic virtual image input into a realistic one with similar scene
structure. Given realistic frames as input, a driving policy trained by
reinforcement learning can adapt well to real-world driving. Experiments show
that our proposed virtual-to-real (VR) reinforcement learning (RL) works well.
To our knowledge, this is the first successful case of a driving policy trained
by reinforcement learning that can adapt to real-world driving data.