198 research outputs found
Multiform Adaptive Robot Skill Learning from Humans
Object manipulation is a basic element in everyday human lives. Robotic
manipulation has progressed from maneuvering single-rigid-body objects with
firm grasping to maneuvering soft objects and handling contact-rich actions.
Meanwhile, technologies such as robot learning from demonstration have enabled
humans to intuitively train robots. This paper discusses a new level of robotic
learning-based manipulation. In contrast to the single form of learning from
demonstration, we propose a multiform learning approach that integrates
additional forms of skill acquisition, including adaptive learning from
definition and evaluation. Moreover, going beyond state-of-the-art technologies
of handling purely rigid or soft objects in a pseudo-static manner, our work
allows robots to learn to handle partly rigid, partly soft objects with
time-critical skills and sophisticated contact control. Such capability of
robotic manipulation offers a variety of new possibilities in human-robot
interaction. Comment: Accepted to 2017 Dynamic Systems and Control Conference (DSCC),
Tysons Corner, VA, October 11-1
Dynamic Scene Reconstruction and Understanding
Traditional approaches to 3D reconstruction have achieved remarkable progress in static scene acquisition. The acquired data serves as priors or benchmarks for many vision and graphics tasks, such as object detection and robotic navigation. Thus, obtaining interpretable and editable representations from a raw monocular RGB-D video sequence is an outstanding goal in scene understanding. However, acquiring an interpretable representation becomes significantly more challenging when a scene contains dynamic activities; for example, a moving camera, rigid object movement, and non-rigid motions. These dynamic scene elements introduce a scene factorization problem, i.e., dividing a scene into elements and jointly estimating elements’ motion and geometry. Moreover, the monocular setting brings in the problems of tracking and fusing partially occluded objects as they are scanned from one viewpoint at a time.
This thesis explores several ideas for acquiring an interpretable model in dynamic environments. Firstly, we utilize synthetic assets such as floor plans and object meshes to generate dynamic data for training and evaluation. Then, we explore the idea of learning geometry priors with an instance segmentation module, which predicts the location and grouping of indoor objects. We use the learned geometry priors to infer the occluded object geometry for tracking and reconstruction. While instance segmentation modules usually have a generalization issue, i.e., they struggle to handle unknown objects, we observe that the empty-space information in the background geometry is more reliable for detecting moving objects. Thus, we propose a segmentation-by-reconstruction strategy for acquiring rigidly moving objects and backgrounds. Finally, we present a novel neural representation to learn a factorized scene representation, reconstructing every dynamic element. The proposed model supports both rigid and non-rigid motions without pre-trained templates. We demonstrate that our systems and representation improve the reconstruction quality on synthetic test sets and real-world scans.
Cloth manipulation and perception competition
In the last decade, several competitions in robotic manipulation have been organised as a way to drive scientific progress in the field. They enable comparison of different approaches through a well-defined benchmark with equal test conditions. However, current competitions usually focus on rigid-object manipulation, leaving aside the challenges posed by grasping deformable objects, especially highly deformable ones such as cloth-like objects. In this paper, we present the first competition in perception and manipulation of textile objects as an efficient method to accelerate scientific progress in the domain of domestic service robots. To do so, we selected a small set of tasks to benchmark in a common framework using the same set of objects and assessment methods. The competition has been conceived to freely distribute the Household Cloth Object Set to research groups working on cloth manipulation and perception so that they can participate in the challenge. In this work, we present an overview of the tasks proposed in the competition; detailed descriptions of the tasks and more information on the scoring and rules are provided on the website http://www.iri.upc.edu/groups/perception/ClothManipulationChallenge/ Peer Reviewed. Postprint (published version)
State of the Art in Dense Monocular Non-Rigid 3D Reconstruction
3D reconstruction of deformable (or non-rigid) scenes from a set of monocular
2D image observations is a long-standing and actively researched area of
computer vision and graphics. It is an ill-posed inverse problem,
since--without additional prior assumptions--it permits infinitely many
solutions leading to accurate projection to the input 2D images. Non-rigid
reconstruction is a foundational building block for downstream applications
like robotics, AR/VR, or visual content creation. The key advantage of using
monocular cameras is their omnipresence and availability to the end users as
well as their ease of use compared to more sophisticated camera set-ups such as
stereo or multi-view systems. This survey focuses on state-of-the-art methods
for dense non-rigid 3D reconstruction of various deformable objects and
composite scenes from monocular videos or sets of monocular views. It reviews
the fundamentals of 3D reconstruction and deformation modeling from 2D image
observations. We then start from general methods--that handle arbitrary scenes
and make only a few prior assumptions--and proceed towards techniques making
stronger assumptions about the observed objects and types of deformations (e.g.
human faces, bodies, hands, and animals). A significant part of this STAR is
also devoted to classification and a high-level comparison of the methods, as
well as an overview of the datasets for training and evaluation of the
discussed techniques. We conclude by discussing open challenges in the field
and the social aspects associated with the usage of the reviewed methods. Comment: 25 page