3D environment mapping using the Kinect V2 and path planning based on RRT algorithms
This paper describes a 3D path planning system that is able to provide a solution trajectory for the automatic control of a robot. The proposed system uses a point cloud of the robot workspace, obtained with a Kinect V2 sensor, to identify the interest regions and the obstacles of the environment. Our proposal includes a collision-free path planner based on a Rapidly-exploring Random Trees variant (RRT*) for safe and optimal navigation of robots in 3D spaces. Results on RGB-D segmentation and recognition, point cloud processing, and comparisons between different RRT* algorithms are presented.
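The planner above builds on Rapidly-exploring Random Trees. A minimal 2D sketch of the basic RRT idea (plain RRT rather than the paper's RRT* variant, with illustrative circular obstacles, workspace bounds, and parameters) might look like:

```python
import math
import random

random.seed(0)

def rrt(start, goal, obstacles, step=0.5, iters=2000, goal_tol=0.5):
    """Grow a tree from `start` toward random samples in [0, 10]^2; return a
    path reaching `goal`, or None. Obstacles are (cx, cy, r) circles. All
    names and parameters here are illustrative, not the paper's setup."""
    nodes = [start]
    parent = {start: None}
    for _ in range(iters):
        # Goal bias: 10% of samples are the goal itself.
        sample = goal if random.random() < 0.1 else (
            random.uniform(0.0, 10.0), random.uniform(0.0, 10.0))
        near = min(nodes, key=lambda n: math.dist(n, sample))
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Steer one fixed step from the nearest node toward the sample.
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if any(math.dist(new, (ox, oy)) <= r for ox, oy, r in obstacles):
            continue  # reject nodes that collide with an obstacle
        nodes.append(new)
        parent[new] = near
        if math.dist(new, goal) < goal_tol:
            # Walk parent pointers back to the start to extract the path.
            path = [new]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None

path = rrt((0.0, 0.0), (9.0, 9.0), [(5.0, 5.0, 1.5)])
```

RRT* extends this by rewiring nearby nodes toward lower-cost parents, which is what yields the asymptotically optimal trajectories the paper compares.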
Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work
Deep networks thrive when trained on large scale data collections. This has
given ImageNet a central role in the development of deep architectures for
visual object classification. However, ImageNet was created during a specific
period in time, and as such it is prone to aging, as well as dataset bias
issues. Moving beyond fixed training datasets will lead to more robust visual
systems, especially when deployed on robots in new environments which must
train on the objects they encounter there. To make this possible, it is
important to break free from the need for manual annotators. Recent work has
begun to investigate how to use the massive amount of images available on the
Web in place of manual image annotations. We contribute to this research thread
with two findings: (1) a study correlating a given level of label noise with
the expected drop in accuracy, for two deep architectures and two different
types of noise, which clearly identifies GoogLeNet as a suitable architecture
for learning from Web data; (2) a recipe for the creation of Web datasets with
minimal noise and maximum visual variability, based on a visual and natural
language processing concept expansion strategy. By combining these two results,
we obtain a method for learning powerful deep object models automatically from
the Web. We confirm the effectiveness of our approach through object
categorization experiments using our Web-derived version of ImageNet on a
popular robot vision benchmark database, and on a lifelong object discovery
task on a mobile robot.
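The first finding above concerns how accuracy degrades as the fraction of noisy labels grows. A toy illustration of that effect, using a 1-nearest-neighbour classifier as a noise-sensitive stand-in for a deep model on synthetic 1D data (all distributions and parameters are illustrative):

```python
import random

random.seed(1)

def make_data(n):
    """Two 1D Gaussian classes: label 0 centred at -2, label 1 at +2."""
    data = []
    for _ in range(n):
        y = random.randint(0, 1)
        data.append((random.gauss(-2.0 if y == 0 else 2.0, 1.0), y))
    return data

def flip_labels(data, rate):
    """Symmetric label noise: flip each label with probability `rate`."""
    return [(x, 1 - y if random.random() < rate else y) for x, y in data]

def nn_predict(train, x):
    """1-nearest-neighbour prediction (memorises the training labels)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, test):
    return sum(nn_predict(train, x) == y for x, y in test) / len(test)

train, test = make_data(2000), make_data(1000)
acc_clean = accuracy(train, test)
acc_noisy = accuracy(flip_labels(train, 0.3), test)  # 30% of labels flipped
```

A model that memorises labels, as 1-NN does, loses accuracy roughly in proportion to the noise rate; the paper's study quantifies the analogous drop for real deep architectures and realistic Web-noise types.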
Open World Assistive Grasping Using Laser Selection
Many people with motor disabilities are unable to complete activities of
daily living (ADLs) without assistance. This paper describes a complete robotic
system developed to provide mobile grasping assistance for ADLs. The system is
comprised of a robot arm from a Rethink Robotics Baxter robot mounted to an
assistive mobility device, a control system for that arm, and a user interface
with a variety of access methods for selecting desired objects. The system uses
grasp detection to allow previously unseen objects to be picked up by the
system. The grasp detection algorithms also allow for objects to be grasped in
cluttered environments. We evaluate our system in a number of experiments on a
large variety of objects. Overall, we achieve an object selection success rate
of 88% and a grasp detection success rate of 90% in a non-mobile scenario, and
success rates of 89% and 72%, respectively, in a mobile scenario.
Vision-based deep execution monitoring
Execution monitoring of high-level robot actions can be effectively improved by
visually monitoring the state of the world in terms of the preconditions and
postconditions that hold before and after the execution of an action.
Furthermore, a policy for deciding where to look, either to verify the
relations that specify the pre- and postconditions or to refocus after a
failure, can greatly improve robot execution in an uncharted environment.
Thanks to the remarkable results of deep learning, it is now possible to rely
strongly on visual perception and assume that the environment is observable.
In this work we present visual execution monitoring for a robot executing
tasks in an uncharted lab environment. The execution monitor interacts with
the environment via a visual stream that uses two DCNNs for recognizing the
objects the robot has to deal with and manipulate, and a non-parametric Bayes
estimator to discover the relations from the DCNN features. To recover from
lack of focus and from failures due to missed objects, we resort to visual
search policies learned via deep reinforcement learning.
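The search policy in the last sentence is learned with deep reinforcement learning. A stripped-down, bandit-style stand-in (tabular values instead of a deep network, on an invented five-region toy task where the sought object sits in one fixed region) shows the core idea of learning where to look:

```python
import random

random.seed(2)

# Toy "where to look" task: the agent picks one of 5 image regions per
# episode and is rewarded when it looks at the region holding the target.
# All of this is an illustrative simplification, not the paper's deep-RL setup.
N_REGIONS, TARGET = 5, 3
q = [0.0] * N_REGIONS          # tabular value estimate per region
alpha, epsilon = 0.1, 0.2      # learning rate, exploration rate

for episode in range(500):
    # Epsilon-greedy action selection over regions.
    if random.random() < epsilon:
        a = random.randrange(N_REGIONS)
    else:
        a = max(range(N_REGIONS), key=q.__getitem__)
    reward = 1.0 if a == TARGET else 0.0
    q[a] += alpha * (reward - q[a])  # one-step value update

best = max(range(N_REGIONS), key=q.__getitem__)
```

In the paper the tabular values are replaced by a deep network over visual features, so the learned policy generalises across scenes rather than memorising one fixed target location.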