9,504 research outputs found

Joint Attention in Driver-Pedestrian Interaction: from Theory to Practice

Authors: Rasouli Amir; Tsotsos John K.
Publication date: 27/03/2018

Today, one of the major challenges facing autonomous vehicles is driving in urban environments. Such a task requires communication between autonomous vehicles and other road users in order to resolve various traffic ambiguities. The interaction between road users is a form of negotiation in which the parties involved have to share their attention regarding a common objective or goal (e.g. crossing an intersection) and coordinate their actions in order to accomplish it. In this literature review we aim to address the interaction problem between pedestrians and drivers (or vehicles) from a joint attention point of view. More specifically, we discuss the theoretical background behind joint attention, its application to traffic interaction, and practical approaches to implementing joint attention for autonomous vehicles.

arXiv.org e-Print Archive

A Survey of Deep Learning Techniques for Mobile Robot Applications

Authors: Anwer Tarique; Shabbir Jahanzaib
Publication date: 20/03/2018

Advancements in deep learning over the years have attracted research into how deep artificial neural networks can be used in robotic systems. This survey summarizes the current research, with a specific focus on the gains and obstacles in applying deep learning to mobile robotics.

arXiv.org e-Print Archive

SIGVerse: A cloud-based VR platform for research on social and embodied human-robot interaction

Authors: Inamura Tetsunari; Mizuchi Yoshiaki
Publication date: 02/05/2020

Common sense and social interaction related to daily-life environments are important for autonomous robots that support human activities. One practical approach to acquiring such social interaction skills, and the semantic information that constitutes common sense in human activity, is to apply recent machine learning techniques. Although these techniques have been successful in realizing automatic manipulation and driving tasks, they are difficult to use in applications that require human-robot interaction experience, because humans have to perform repeatedly over a long period to demonstrate embodied and social interaction behaviors to robots or learning systems. To address this problem, we propose a cloud-based immersive virtual reality (VR) platform that enables virtual human-robot interaction to collect the social and embodied knowledge of human activities in a variety of situations. To realize a flexible and reusable system, we develop a real-time bridging mechanism between ROS and Unity, which is one of the standard platforms for developing VR applications. We apply the proposed system to a robot competition field named RoboCup@Home to confirm the feasibility of the system in a realistic human-robot interaction scenario. Through demonstration experiments at the competition, we show the usefulness and potential of the system for the development and evaluation of social intelligence through human-robot interaction. The proposed VR platform enables robot systems to collect social experiences with several users in a short time. The platform also contributes by providing a dataset of social behaviors, which would be a key aspect for intelligent service robots to acquire social interaction skills based on machine learning techniques.

Comment: 16 pages. Under review in Frontiers in Robotics and AI.

arXiv.org e-Print Archive

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

Authors: Devin Coline; Jang Eric; Levine Sergey; Vanhoucke Vincent
Publication date: 19/11/2018

Well-structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling by using autonomous robot interaction with the environment. Such representation learning methods can benefit from continuous refinement of the representation as the robot collects more experience, allowing them to scale effectively without human intervention. Our representation learning approach is based on object persistence: when a robot removes an object from a scene, the representation of that scene should change according to the features of the object that was removed. We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin. The same grasping procedure can also be used to automatically collect training data for our method, by recording images of scenes, grasping and removing an object, and recording the outcome. Our experiments demonstrate that this self-supervised approach for tasked grasping substantially outperforms direct reinforcement learning from images and prior representation learning methods.

Comment: CoRL 2018. Eric Jang and Coline Devin contributed equally to this work.

arXiv.org e-Print Archive
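
The arithmetic relationship described in the Grasp2Vec abstract above (scene features before a grasp, minus scene features after it, should resemble the features of the removed object) can be illustrated with a toy sketch. This is not the authors' model or training objective: the vectors are hand-made numbers rather than learned embeddings, and the helper names (`subtract`, `cosine`, `identify_grasped`) are hypothetical.

```python
# Toy illustration of object persistence with made-up feature vectors.
# Assumption: a scene's features are the sum of its objects' features.
from math import sqrt


def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm


def identify_grasped(scene_before, scene_after, object_features):
    """Return the object whose features best match the scene difference."""
    diff = subtract(scene_before, scene_after)
    return max(object_features, key=lambda name: cosine(diff, object_features[name]))


# Hypothetical object embeddings (not learned from data).
objects = {
    "cup": [1.0, 0.0, 0.2],
    "ball": [0.0, 1.0, 0.1],
    "block": [0.1, 0.1, 1.0],
}
scene_before = [1.1, 1.1, 1.3]  # cup + ball + block
scene_after = [1.1, 0.1, 1.2]   # the same bin with the ball removed

grasped = identify_grasped(scene_before, scene_after, objects)
print(grasped)
```

In this sketch the scene difference points almost exactly along the ball's feature vector, so the nearest-neighbour lookup recovers the removed object; a learned representation would have to be trained so that this relationship holds for real images.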

Machine Vision in the Context of Robotics: A Systematic Literature Review

Authors: Dimter Tom; Ghofrani Javad; Kirschne Robert; Reichelt Dirk; Rossburg Daniel
Publication date: 03/05/2019

Machine vision is critical to robotics because a wide range of applications, such as autonomous mobile robots and smart production systems, rely on input from visual sensors. To create the smart homes and systems of tomorrow, a systematic and reproducible overview of current challenges in the research field would help identify possible further directions. In this work a systematic literature review was conducted covering research from the last 10 years. We screened 172 papers from four databases and selected 52 relevant papers. While robustness and computation time have improved greatly, occlusion and lighting variance are still the biggest problems faced. From the number of recent publications, we conclude that the observed field is of relevance and interest to the research community. Further challenges arise in many areas of the field.

Comment: 10 pages, 5 figures, systematic literature study.

arXiv.org e-Print Archive

Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations

Authors: Bai Yunfei; Davidson James; Gupta Abhinav; Hsu Jasmine; Khansari Mohi; Lee Honglak; Pathak Arkanath; Yan Xinchen
Publication date: 14/06/2018

This paper focuses on the problem of learning 6-DOF grasping with a parallel jaw gripper in simulation. We propose the notion of a geometry-aware representation in grasping based on the assumption that knowledge of 3D geometry is at the heart of interaction. Our key idea is constraining and regularizing grasping interaction learning through 3D geometry prediction. Specifically, we formulate the learning of a deep geometry-aware grasping model in two steps. First, we learn to build a mental geometry-aware representation by reconstructing the scene (i.e., a 3D occupancy grid) from RGBD input via generative 3D shape modeling. Second, we learn to predict the grasping outcome with this internal geometry-aware representation. The learned outcome prediction model is used to sequentially propose grasping solutions via analysis-by-synthesis optimization. Our contributions are fourfold: (1) to the best of our knowledge, we present for the first time a method to learn a 6-DOF grasping net from RGBD input; (2) we build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations, covering 101 everyday objects spread across 7 categories, and we additionally propose a data augmentation strategy for effective learning; (3) we demonstrate that the learned geometry-aware representation leads to about 10 percent relative performance improvement over the baseline CNN on grasping objects from our dataset; (4) we further demonstrate that the model generalizes to novel viewpoints and object instances.

Comment: Published at ICRA 201

arXiv.org e-Print Archive

Deep Learning in Robotics: A Review of Recent Research

Authors: Gashler Michael S.; Pierson Harry A.
Publication date: 22/07/2017

Advances in deep learning over the last decade have led to a flurry of research in the application of deep artificial neural networks to robotic systems, with at least thirty papers published on the subject between 2014 and the present. This review discusses the applications, benefits, and limitations of deep learning vis-à-vis physical robotic systems, using contemporary research as exemplars. It is intended to communicate recent advances to the wider robotics community and inspire additional interest in and application of deep learning in robotics.

Comment: 41 pages, 135 references.

arXiv.org e-Print Archive

Learning to Take Good Pictures of People with a Robot Photographer

Authors: Cosgun Akansel; Drummond Tom; Koseoglu Mehmet; Newbury Rhys
Publication date: 11/04/2019

We present a robotic system capable of navigating autonomously by following a line and taking good quality pictures of people. When a group of people is detected, the robot rotates towards them and then back to the line while continuously taking pictures from different angles. Each picture is processed in the cloud, where its quality is estimated by a two-stage algorithm. First, features such as the face orientation and likelihood of facial emotions are input to a fully connected neural network to assign a quality score to each face. Second, a representation is extracted by abstracting faces from the image and is input to a Convolutional Neural Network (CNN) to classify the quality of the overall picture. We collected a dataset in which a picture was labeled as good quality if subjects are well-positioned in the image and oriented towards the camera with a pleasant expression. Our approach detected the quality of pictures with 78.4% accuracy on this dataset and received a better mean user rating (3.71/5) than a heuristic method that uses photographic composition procedures, in a study where 97 human judges rated each picture. A statistical analysis against the state-of-the-art verified the quality of the resulting pictures.

arXiv.org e-Print Archive

When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey

Authors: Kurths Jürgen; Qian Feng; Sun Qiyu; Tang Yang; Wang Jianrui; Yen Gary G.; Zhang Chongzhen; Zhao Chaoqiang
Publication date: 24/05/2020

With the widespread application of artificial intelligence (AI), the perception, understanding, decision-making and control capabilities of autonomous systems have improved significantly in recent years. Where both accuracy and transferability matter, several AI methods, such as adversarial learning, reinforcement learning (RL) and meta-learning, perform strongly. Here, we review the learning-based approaches in autonomous systems from the perspectives of accuracy and transferability. Accuracy means that a well-trained model shows good results during the testing phase, in which the testing set shares the same task or data distribution as the training set. Transferability means that when a well-trained model is transferred to other testing domains, the accuracy is still good. First, we introduce some basic concepts of transfer learning and then present some preliminaries of adversarial learning, RL and meta-learning. Second, we review the accuracy, the transferability, or both, of adversarial learning methods such as generative adversarial networks (GANs) in typical computer vision tasks in autonomous systems, including image style transfer, image super-resolution, image deblurring/dehazing/rain removal, semantic segmentation, depth estimation, pedestrian detection and person re-identification (re-ID). Then, we further review the performance of RL and meta-learning in terms of accuracy, transferability, or both, in autonomous systems, covering pedestrian tracking, robot navigation and robotic manipulation. Finally, we discuss several challenges and future topics for using adversarial learning, RL and meta-learning in autonomous systems.

arXiv.org e-Print Archive

Visual Affordance and Function Understanding: A Survey

Authors: Hassanin Mohammed; Khan Salman; Tahtali Murat
Publication date: 18/07/2018

Nowadays, robots are dominating the manufacturing, entertainment and healthcare industries. Robot vision aims to equip robots with the ability to discover information, understand it and interact with the environment. These capabilities require an agent to effectively understand object affordances and functionalities in complex visual domains. In this literature survey, we first focus on visual affordances and summarize the state of the art as well as open problems and research gaps. Specifically, we discuss sub-problems such as affordance detection, categorization, segmentation and high-level reasoning. Furthermore, we cover functional scene understanding and the prevalent functional descriptors used in the literature. The survey also provides necessary background to the problem, sheds light on its significance and highlights the existing challenges for affordance and functionality learning.

Comment: 26 pages, 22 images.

arXiv.org e-Print Archive