9,504 research outputs found
Joint Attention in Driver-Pedestrian Interaction: from Theory to Practice
Today, one of the major challenges that autonomous vehicles are facing is the
ability to drive in urban environments. Such a task requires communication
between autonomous vehicles and other road users in order to resolve various
traffic ambiguities. The interaction between road users is a form of
negotiation in which the parties involved have to share their attention
regarding a common objective or a goal (e.g. crossing an intersection), and
coordinate their actions in order to accomplish it. In this literature review,
we aim to address the interaction problem between pedestrians and drivers (or
vehicles) from a joint attention point of view. More specifically, we will
discuss the theoretical background behind joint attention, its application to
traffic interaction and practical approaches to implementing joint attention
for autonomous vehicles.
A Survey of Deep Learning Techniques for Mobile Robot Applications
Advancements in deep learning over the years have attracted research into how
deep artificial neural networks can be used in robotic systems. This survey
summarizes the current research, with a specific focus on the gains and
obstacles of applying deep learning to mobile robotics.
SIGVerse: A cloud-based VR platform for research on social and embodied human-robot interaction
Common sense and social interaction related to daily-life environments are
considerably important for autonomous robots, which support human activities.
One of the practical approaches for acquiring such social interaction skills
and semantic information as common sense in human activity is the application
of recent machine learning techniques. Although recent machine learning
techniques have been successful in realizing automatic manipulation and driving
tasks, it is difficult to use these techniques in applications that require
human-robot interaction experience. Humans have to demonstrate embodied and
social interaction behaviors to robots or learning systems many times over a
long period. To address this problem, we propose a cloud-based immersive
virtual reality (VR) platform which enables virtual human-robot interaction to
collect the social and embodied knowledge of human activities in a variety of
situations. To realize a flexible and reusable system, we develop a real-time
bridging mechanism between ROS and Unity, which is one of the standard
platforms for developing VR applications. We apply the proposed system to a
robot competition field named RoboCup@Home to confirm the feasibility of the
system in a realistic human-robot interaction scenario. Through demonstration
experiments at the competition, we show the usefulness and potential of the
system for the development and evaluation of social intelligence through
human-robot interaction. The proposed VR platform enables robot systems to
collect social experiences with several users in a short time. The platform
also contributes by providing a dataset of social behaviors, which would be a
key aspect for intelligent service robots to acquire social interaction skills
based on machine learning techniques.
Comment: 16 pages. Under review in Frontiers in Robotics and A
Grasp2Vec: Learning Object Representations from Self-Supervised Grasping
Well-structured visual representations can make robot learning faster and can
improve generalization. In this paper, we study how we can acquire effective
object-centric representations for robotic manipulation tasks without human
labeling by using autonomous robot interaction with the environment. Such
representation learning methods can benefit from continuous refinement of the
representation as the robot collects more experience, allowing them to scale
effectively without human intervention. Our representation learning approach is
based on object persistence: when a robot removes an object from a scene, the
representation of that scene should change according to the features of the
object that was removed. We formulate an arithmetic relationship between
feature vectors from this observation, and use it to learn a representation of
scenes and objects that can then be used to identify object instances, localize
them in the scene, and perform goal-directed grasping tasks where the robot
must retrieve commanded objects from a bin. The same grasping procedure can
also be used to automatically collect training data for our method, by
recording images of scenes, grasping and removing an object, and recording the
outcome. Our experiments demonstrate that this self-supervised approach for
tasked grasping substantially outperforms direct reinforcement learning from
images and prior representation learning methods.
Comment: CoRL 2018. Eric Jang and Coline Devin contributed equally to this work.
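The object-persistence idea in the abstract above can be illustrated with a toy sketch: if removing an object changes the scene representation by that object's features, then the difference between the before and after scene vectors should point at the removed object. This is not the authors' implementation. The additive scene embedding, the hand-written feature vectors, the cosine-similarity lookup and all names (`OBJECT_FEATURES`, `identify_removed`) are assumptions made purely for illustration; the actual method learns these embeddings from images.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def vec_sub(a: Sequence[float], b: Sequence[float]) -> Vector:
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def vec_sum(vectors: Sequence[Sequence[float]]) -> Vector:
    """Element-wise sum of several vectors of equal length."""
    return [sum(column) for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_removed(scene_before: Vector, scene_after: Vector,
                     object_features: Dict[str, Vector]) -> str:
    """Return the object whose features best match (before - after)."""
    difference = vec_sub(scene_before, scene_after)
    return max(object_features,
               key=lambda name: cosine(difference, object_features[name]))


# Hypothetical hand-written object features (assumption, not learned).
OBJECT_FEATURES: Dict[str, Vector] = {
    "mug": [1.0, 0.0, 0.2],
    "ball": [0.0, 1.0, 0.1],
    "pen": [0.1, 0.1, 1.0],
}

# Toy assumption: a scene embedding is the sum of its objects' features.
before = vec_sum(list(OBJECT_FEATURES.values()))
after = vec_sum([OBJECT_FEATURES["ball"], OBJECT_FEATURES["pen"]])  # mug grasped

removed = identify_removed(before, after, OBJECT_FEATURES)
print(removed)
```

In this toy setting the difference vector equals the mug's features up to rounding, so the lookup recovers the grasped object without any label, which is the property the abstract relies on for self-supervision.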
Machine Vision in the Context of Robotics: A Systematic Literature Review
Machine vision is critical to robotics because a wide range of applications,
such as autonomous mobile robots and smart production systems, rely on input
from visual sensors. To create the smart homes and systems of tomorrow, an
overview of current challenges in the research field, created in a systematic
and reproducible manner, would help to identify possible further directions.
In this work a systematic literature review was conducted covering
research from the last 10 years. We screened 172 papers from four databases and
selected 52 relevant papers. While robustness and computation time were
improved greatly, occlusion and lighting variance are still the biggest
problems faced. From the number of recent publications, we conclude that the
observed field is of relevance and interest to the research community. Further
challenges arise in many areas of the field.
Comment: 10 pages, 5 figures, systematic literature study
Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations
This paper focuses on the problem of learning 6-DOF grasping with a parallel
jaw gripper in simulation. We propose the notion of a geometry-aware
representation in grasping based on the assumption that knowledge of 3D
geometry is at the heart of interaction. Our key idea is constraining and
regularizing grasping interaction learning through 3D geometry prediction.
Specifically, we formulate the learning of a deep geometry-aware grasping model
in two steps: First, we learn to build a mental geometry-aware representation by
reconstructing the scene (i.e., 3D occupancy grid) from RGBD input via
generative 3D shape modeling. Second, we learn to predict grasping outcome with
its internal geometry-aware representation. The learned outcome prediction
model is used to sequentially propose grasping solutions via
analysis-by-synthesis optimization. Our contributions are fourfold: (1) To the best
of our knowledge, we are presenting for the first time a method to learn a
6-DOF grasping net from RGBD input; (2) We build a grasping dataset from
demonstrations in virtual reality with rich sensory and interaction
annotations. This dataset includes 101 everyday objects spread across 7
categories. Additionally, we propose a data augmentation strategy for effective
learning; (3) We demonstrate that the learned geometry-aware representation
leads to about 10 percent relative performance improvement over the baseline
CNN on grasping objects from our dataset; (4) We further demonstrate that the
model generalizes to novel viewpoints and object instances.
Comment: Published at ICRA 201
Deep Learning in Robotics: A Review of Recent Research
Advances in deep learning over the last decade have led to a flurry of
research in the application of deep artificial neural networks to robotic
systems, with at least thirty papers published on the subject between 2014 and
the present. This review discusses the applications, benefits, and limitations
of deep learning vis-à-vis physical robotic systems, using contemporary
research as exemplars. It is intended to communicate recent advances to the
wider robotics community and inspire additional interest in and application of
deep learning in robotics.
Comment: 41 pages, 135 references
Learning to Take Good Pictures of People with a Robot Photographer
We present a robotic system capable of navigating autonomously by following a
line and taking good-quality pictures of people. When a group of people is
detected, the robot rotates towards them and then back to the line while
continuously taking pictures from different angles. Each picture is processed
in the cloud where its quality is estimated in a two-stage algorithm. First,
features such as the face orientation and likelihood of facial emotions are
input to a fully connected neural network to assign a quality score to each
face. Second, a representation is extracted by abstracting faces from the image
and it is input to a Convolutional Neural Network (CNN) to classify the
quality of the overall picture. We collected a dataset in which a picture was
labeled as good quality if subjects are well-positioned in the image and
oriented towards the camera with a pleasant expression. Our approach detected
the quality of pictures with 78.4% accuracy in this dataset and received a
better mean user rating (3.71/5) than a heuristic method that uses photographic
composition procedures in a study where 97 human judges rated each picture. A
statistical analysis against the state-of-the-art verified the quality of the
resulting pictures.
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey
With widespread applications of artificial intelligence (AI), the
capabilities of the perception, understanding, decision-making and control for
autonomous systems have improved significantly in the past years. With respect
to the accuracy and transferability of autonomous systems, several AI methods,
such as adversarial learning, reinforcement learning (RL) and meta-learning,
perform strongly. Here, we review the
learning-based approaches in autonomous systems from the perspectives of
accuracy and transferability. Accuracy means that a well-trained model shows
good results during the testing phase, in which the testing set shares the same
task or a data distribution with the training set. Transferability means that
when a well-trained model is transferred to other testing domains, the accuracy
is still good. Firstly, we introduce some basic concepts of transfer learning
and then present some preliminaries of adversarial learning, RL and
meta-learning. Secondly, we focus on reviewing the accuracy or transferability
or both of them to show the advantages of adversarial learning, like generative
adversarial networks (GANs), in typical computer vision tasks in autonomous
systems, including image style transfer, image superresolution, image
deblurring/dehazing/rain removal, semantic segmentation, depth estimation,
pedestrian detection and person re-identification (re-ID). Then, we further
review the performance of RL and meta-learning from the aspects of accuracy or
transferability or both of them in autonomous systems, involving pedestrian
tracking, robot navigation and robotic manipulation. Finally, we discuss
several challenges and future topics for using adversarial learning, RL and
meta-learning in autonomous systems.
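The two definitions in the abstract above can be made concrete with a small sketch: accuracy is measured on a test set drawn from the same distribution as the training data, while transferability asks how much of that accuracy survives on a different test domain. The one-dimensional threshold model, both toy datasets and all names (`threshold_model`, `source_test`, `target_test`) are hypothetical and chosen only to make the distinction visible; the survey itself does not prescribe this setup.

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]  # (feature value, class label)


def accuracy(model: Callable[[float], int], samples: List[Sample]) -> float:
    """Fraction of samples for which the model predicts the correct label."""
    correct = sum(1 for x, y in samples if model(x) == y)
    return correct / len(samples)


def threshold_model(x: float) -> int:
    """Toy 'well-trained' model: decision boundary fitted to the source domain."""
    return 1 if x >= 0.5 else 0


# Test set sharing the training distribution (class boundary near 0.5).
source_test: List[Sample] = [(0.1, 0), (0.3, 0), (0.7, 1), (0.9, 1)]

# Shifted test domain (assumption: class boundary has moved to about 0.7).
target_test: List[Sample] = [(0.4, 0), (0.6, 0), (0.8, 1), (1.0, 1)]

source_accuracy = accuracy(threshold_model, source_test)
target_accuracy = accuracy(threshold_model, target_test)

print(f"accuracy (same distribution): {source_accuracy:.2f}")
print(f"accuracy (shifted domain):    {target_accuracy:.2f}")
```

Here the model is fully accurate on the source test set but misclassifies one of four samples after the shift, so the gap between the two numbers is a simple proxy for limited transferability.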
Visual Affordance and Function Understanding: A Survey
Nowadays, robots are dominating the manufacturing, entertainment and
healthcare industries. Robot vision aims to equip robots with the ability to
discover information, understand it and interact with the environment. These
capabilities require an agent to effectively understand object affordances and
functionalities in complex visual domains. In this literature survey, we first
focus on visual affordances and summarize the state of the art as well as open
problems and research gaps. Specifically, we discuss sub-problems such as
affordance detection, categorization, segmentation and high-level reasoning.
Furthermore, we cover functional scene understanding and the prevalent
functional descriptors used in the literature. The survey also provides
necessary background to the problem, sheds light on its significance and
highlights the existing challenges for affordance and functionality learning.
Comment: 26 pages, 22 images