Understanding Contexts Inside Robot and Human Manipulation Tasks through a Vision-Language Model and Ontology System in a Video Stream
Manipulation tasks in daily life, such as pouring water, unfold intentionally
within specialized manipulation contexts. Being able to process contextual
knowledge in these Activities of Daily Living (ADLs) over time helps us
understand manipulation intentions, which are essential for an intelligent
robot to transition smoothly between manipulation actions. In this paper, to
model intended manipulation concepts, we present a vision dataset under a
strictly constrained knowledge domain for both robot and human manipulations,
in which manipulation concepts and relations are stored in an ontology system
in a taxonomic manner. Furthermore, we propose a scheme to
generate visual attention maps together with an evolving knowledge graph
filled with commonsense knowledge. Our scheme works with real-world camera
streams and fuses an attention-based Vision-Language model with the ontology
system. The experimental results demonstrate that the proposed scheme can
successfully represent the evolution of an intended object manipulation
procedure for both robots and humans. The proposed scheme allows the robot to
mimic human-like intentional behaviors by watching real-time videos. We aim to
develop this scheme further for real-world robot intelligence in Human-Robot
Interaction.
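A minimal sketch of the core idea may help: the snippet below shows how attention-scored (subject, relation, object) triples from a vision-language model could be filtered through an ontology's type constraints to grow an evolving knowledge graph frame by frame. The taxonomy, relations, and threshold are hypothetical illustrations, not the paper's actual dataset or model.

```python
# Minimal sketch (all names hypothetical): filtering attention-scored
# (subject, relation, object) triples through a tiny hand-written ontology
# so that only type-consistent facts enter the evolving knowledge graph.

# Hypothetical taxonomy: maps each concept to its class.
TAXONOMY = {
    "kettle": "container",
    "cup": "container",
    "water": "liquid",
}

# Hypothetical typed relations: relation -> (subject class, object class).
VALID_RELATIONS = {
    "pour_into": ("container", "container"),
    "contains": ("container", "liquid"),
}

class KnowledgeGraph:
    """Evolving set of ontology-consistent triples built from a video stream."""

    def __init__(self):
        self.edges = set()

    def update(self, triples, threshold=0.5):
        """Keep triples whose attention score and types satisfy the ontology."""
        for subj, rel, obj, score in triples:
            if score < threshold or rel not in VALID_RELATIONS:
                continue
            subj_cls, obj_cls = VALID_RELATIONS[rel]
            if TAXONOMY.get(subj) == subj_cls and TAXONOMY.get(obj) == obj_cls:
                self.edges.add((subj, rel, obj))

# One simulated frame of vision-language output.
kg = KnowledgeGraph()
kg.update([
    ("kettle", "pour_into", "cup", 0.9),   # accepted: container -> container
    ("cup", "contains", "water", 0.8),     # accepted: container -> liquid
    ("water", "pour_into", "cup", 0.7),    # rejected: subject is not a container
])
print(sorted(kg.edges))
```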
Constrained Motion Planning Networks X
Constrained motion planning is a challenging field of research, aiming for
computationally efficient methods that can find a collision-free path on the
constraint manifolds between a given start and goal configuration. These
planning problems arise surprisingly often, for example in robot manipulation
for daily-life assistive tasks. However, few solutions to constrained motion
planning are available, and those that exist suffer from high computational
complexity when searching for a path solution on the
manifolds. To address this challenge, we present Constrained Motion Planning
Networks X (CoMPNetX), a neural planning approach comprising a conditional
deep neural generator and discriminator with a neural-gradient-based fast
projection operator. We also introduce neural task and scene representations,
conditioned on which CoMPNetX generates implicit manifold configurations to
accelerate any underlying classical planner, such as sampling-based motion
planning methods, for quickly solving complex constrained planning tasks. We
show that our method finds path solutions with high success
rates and lower computation times than state-of-the-art traditional
path-finding tools in various challenging scenarios.

Comment: This is a preprint version of a paper published in IEEE Transactions
on Robotics. The videos, code, dataset, and trained models can be found here:
https://sites.google.com/view/compnetx/hom
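For context on the projection step, below is a minimal sketch of the classical Jacobian-based (Gauss-Newton) projection that pulls a sampled configuration onto a constraint manifold; CoMPNetX's learned neural gradients are aimed at speeding up exactly this kind of operation. The unit-sphere constraint and all function names are illustrative assumptions, not the paper's tasks or implementation.

```python
import numpy as np

# Hypothetical constraint: configurations must lie on a unit sphere,
# e.g. a fixed end-effector distance. c(q) = 0 defines the manifold.
def constraint(q):
    return np.array([np.dot(q, q) - 1.0])

def jacobian(q, eps=1e-6):
    """Numerical Jacobian of the constraint, shape (n_constraints, dim)."""
    c0 = constraint(q)
    J = np.zeros((c0.size, q.size))
    for i in range(q.size):
        dq = np.zeros_like(q)
        dq[i] = eps
        J[:, i] = (constraint(q + dq) - c0) / eps
    return J

def project(q, tol=1e-8, max_iters=100):
    """Gauss-Newton projection: iteratively pull q onto the manifold."""
    for _ in range(max_iters):
        c = constraint(q)
        if np.linalg.norm(c) < tol:
            break
        J = jacobian(q)
        q = q - np.linalg.pinv(J) @ c  # Newton-style correction step
    return q

q_on_manifold = project(np.array([0.4, 1.3, -0.2]))
print(q_on_manifold, constraint(q_on_manifold))
```

In a sampling-based planner, each randomly drawn or generated configuration would be passed through such a projection before being connected into the search tree, which is why fast projection dominates the runtime of constrained planning.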