9,880 research outputs found
Active Object Localization in Visual Situations
We describe a method for performing active localization of objects in
instances of visual situations. A visual situation is an abstract
concept---e.g., "a boxing match", "a birthday party", "walking the dog",
"waiting for a bus"---whose image instantiations are linked more by their
common spatial and semantic structure than by low-level visual similarity. Our
system combines given and learned knowledge of the structure of a particular
situation, and adapts that knowledge to a new situation instance as it actively
searches for objects. More specifically, the system learns a set of probability
distributions describing spatial and other relationships among relevant
objects. The system uses those distributions to iteratively sample object
proposals on a test image, but also continually uses information from those
object proposals to adaptively modify the distributions based on what the
system has detected. We test our approach's ability to efficiently localize
objects, using a situation-specific image dataset created by our group. We
compare the results with several baselines and variations on our method, and
demonstrate the strong benefit of using situation knowledge and active
context-driven localization. Finally, we contrast our method with several other
approaches that use context as well as active search for object localization in
images.Comment: 14 page
Real-time 3D object detection and SLAM fusion in a low-cost LiDAR test vehicle setup
Recently released research about deep learning applications related to perception for autonomous driving focuses heavily on the usage of LiDAR point cloud data as input for the neural networks, highlighting the importance of LiDAR technology in the field of Autonomous Driving (AD). In this sense, a great percentage of the vehicle platforms used to create the datasets released for the development of these neural networks, as well as some AD commercial solutions available on the market, heavily invest in an array of sensors, including a large number of sensors as well as several sensor modalities. However, these costs create a barrier to entry for low-cost solutions for the performance of critical perception tasks such as Object Detection and SLAM. This paper explores current vehicle platforms and proposes a low-cost, LiDAR-based test vehicle platform capable of running critical perception tasks (Object Detection and SLAM) in real time. Additionally, we propose the creation of a deep learning-based inference model for Object Detection deployed in a resource-constrained device, as well as a graph-based SLAM implementation, providing important considerations, explored while taking into account the real-time processing requirement and presenting relevant results demonstrating the usability of the developed work in the context of the proposed low-cost platform
Semantic Robot Programming for Goal-Directed Manipulation in Cluttered Scenes
We present the Semantic Robot Programming (SRP) paradigm as a convergence of
robot programming by demonstration and semantic mapping. In SRP, a user can
directly program a robot manipulator by demonstrating a snapshot of their
intended goal scene in workspace. The robot then parses this goal as a scene
graph comprised of object poses and inter-object relations, assuming known
object geometries. Task and motion planning is then used to realize the user's
goal from an arbitrary initial scene configuration. Even when faced with
different initial scene configurations, SRP enables the robot to seamlessly
adapt to reach the user's demonstrated goal. For scene perception, we propose
the Discriminatively-Informed Generative Estimation of Scenes and Transforms
(DIGEST) method to infer the initial and goal states of the world from RGBD
images. The efficacy of SRP with DIGEST perception is demonstrated for the task
of tray-setting with a Michigan Progress Fetch robot. Scene perception and task
execution are evaluated with a public household occlusion dataset and our
cluttered scene dataset.Comment: published in ICRA 201
- …