    Grounding of Human Environments and Activities for Autonomous Robots

    With the recent proliferation of robotic applications in domestic and industrial scenarios, it is vital for robots to continually learn about their environments and about the humans they share those environments with. In this paper, we present a framework for autonomous, unsupervised learning of useful human ‘concepts’ from various sensory sources, including colours, people’s names, usable objects and simple activities. This is achieved by integrating state-of-the-art object segmentation, pose estimation, activity analysis and language grounding into a continual learning framework. Learned concepts are grounded to natural language if commentary is available, allowing the robot to communicate in a human-understandable way. We show, using a challenging, real-world dataset of human activities, that our framework is able to extract useful concepts, ground natural language descriptions to them, and, as a proof of concept, generate simple sentences from templates to describe people and activities.
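    As a hedged illustration of the template step mentioned above (not the authors' code), the short Python sketch below fills fixed sentence templates with a grounded concept's natural-language labels; the GroundedConcept fields and the templates themselves are assumptions made for this example.

        # Minimal sketch, assuming concepts carry grounded labels for a person,
        # an activity, an object and a colour; not the paper's implementation.
        from dataclasses import dataclass

        @dataclass
        class GroundedConcept:
            person: str      # learned person name
            activity: str    # learned activity label
            obj: str         # usable object the person interacts with
            colour: str      # grounded colour word

        TEMPLATES = [
            "{person} is {activity} the {colour} {obj}.",
            "I can see {person} near the {obj}.",
        ]

        def describe(concept: GroundedConcept) -> list:
            """Fill each template with the concept's natural-language labels."""
            return [t.format(**vars(concept)) for t in TEMPLATES]

        print(describe(GroundedConcept("Alice", "using", "cup", "red")))
        # -> ['Alice is using the red cup.', 'I can see Alice near the cup.']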

    Learning and Mining Player Motion Profiles in Physically Interactive Robogames

    Physically-Interactive RoboGames (PIRG) are an emerging application whose aim is to develop robotic agents able to interact with and engage humans in a game situation. In this framework, learning a model of players’ activity is relevant both to understanding their engagement and to understanding the specific strategies they adopt, which in turn can foster game adaptation. Following these directions, and given the lack of quantitative methods for player modeling in PIRG, we propose a methodology for representing players as a mixture of player types uncovered from data. This is done while dealing both with the intrinsic uncertainty of the setting and with the agent’s need to act in real time to support the game interaction. Our methodology first encodes the time series data generated from player-robot interaction as images, in particular Gramian angular field images, to represent the continuous data. To these, we apply latent Dirichlet allocation to summarize a player’s motion style as a probabilistic mixture of different styles discovered from data. The approach has been tested on a dataset collected from a real, physical robot game, where activity patterns are extracted using a custom three-axis accelerometer sensor module. The results suggest that the proposed system is able to provide a robust description of player interaction.
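    As a rough illustration of the encoding step described above, the sketch below computes a Gramian angular (summation) field image from a single 1-D window of sensor data; the window length and signal are placeholders, and the paper's full pipeline (three accelerometer axes, image aggregation and latent Dirichlet allocation over discovered styles) is not reproduced here.

        import numpy as np

        def gramian_angular_field(x):
            """GASF encoding: rescale the series to [-1, 1], map samples to
            angles, then take pairwise cosines of angle sums."""
            x = np.asarray(x, dtype=float)
            x_scaled = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1
            phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))
            return np.cos(phi[:, None] + phi[None, :])

        # Example: one accelerometer axis over a short window
        window = np.sin(np.linspace(0, 4 * np.pi, 64))
        gaf_image = gramian_angular_field(window)   # shape (64, 64)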

    Unsupervised Human Activity Analysis for Intelligent Mobile Robots

    The success of intelligent mobile robots in daily living environments depends on their ability to understand human movements and behaviours. One goal of recent research is to understand human activities performed in real human environments from long-term observation. We consider a human activity to be a temporally dynamic configuration of a person interacting with key objects within the environment that provide some functionality. This can be a motion trajectory made of a sequence of 2-dimensional points representing a person’s position, as well as more detailed sequences of high-dimensional body poses, a collection of 3-dimensional points representing body joint positions, as estimated from the point of view of the robot. The limited field of view of the robot, restricted by the limitations of its sensory modalities, poses the challenge of understanding human activities from obscured, incomplete and noisy observations. As an embedded system, it also has perceptual limitations which restrict the resolution of the human activity representations it can hope to achieve. In this thesis, an approach for unsupervised learning of activities, implemented on an autonomous mobile robot, is presented. This research makes the following novel contributions: 1) a qualitative spatial-temporal vector space encoding of human activities as observed by an autonomous mobile robot; 2) methods for learning a low-dimensional representation of common and repeated patterns from multiple encoded visual observations. In order to handle the perceptual challenges, multiple abstractions are applied to the robot’s perception data. The human observations are first encoded using a leg detector, an upper-body image classifier, and a convolutional neural network for pose estimation, while objects within the environment are automatically segmented from a 3-dimensional point cloud representation. Central to the success of the presented framework is mapping these encodings into an abstract qualitative space in order to generalise patterns invariant to exact quantitative positions within the real world. This is performed using a number of qualitative spatial-temporal representations which capture different aspects of the relations between the human subject and the objects in the environment. The framework auto-generates a vocabulary of discrete spatial-temporal descriptors extracted from the video sequences, and each observation is represented as a vector over this vocabulary. Analogously to information retrieval on text corpora, we use generative probabilistic techniques to recover latent, semantically meaningful concepts from the encoded observations in an unsupervised manner. The relatively small number of concepts discovered are defined as multinomial distributions over the vocabulary and considered as human activity classes, granting the robot a high-level understanding of visually observed complex scenes. We validate the framework using: 1) a dataset collected by a physical robot autonomously patrolling and performing tasks in an office environment during a six-week deployment, and 2) a high-dimensional “full body pose” dataset captured over multiple days by a mobile robot observing a kitchen area of an office environment from multiple viewpoints. We show that the emergent categories from our framework align well with how humans interpret behaviours and simple activities.
    Our presented framework models each extended observation as a probabilistic mixture over the learned activities, meaning it can learn human activity models even when they are embedded in continuous video sequences, without the need for manual temporal segmentation, which can be time-consuming and costly. Finally, we present methods for learning such human activity models in an incremental and continuous setting, using variational inference to update the activity distributions online. This allows the mobile robot to efficiently learn and update its models of human activity over time while discarding the raw data, allowing for life-long learning.
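    As a minimal sketch of the approach described above, and not the thesis code, the Python fragment below treats each encoded observation as a bag-of-words count vector over a vocabulary of qualitative spatial-temporal descriptors and learns or updates activity classes with online variational LDA; the vocabulary size, topic count and random counts are illustrative assumptions.

        import numpy as np
        from sklearn.decomposition import LatentDirichletAllocation

        vocab_size, n_activities = 500, 10
        lda = LatentDirichletAllocation(n_components=n_activities,
                                        learning_method="online", random_state=0)

        # Each row: counts of discrete descriptors in one encoded observation.
        batch = np.random.randint(0, 5, size=(20, vocab_size))
        lda.partial_fit(batch)        # incremental update; raw data can then be discarded

        theta = lda.transform(batch)  # per-observation mixture over activity classes
        phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
        # phi: activity classes as multinomial distributions over the vocabulary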

    Detection of unanticipated faults for autonomous underwater vehicles using online topic models

    © The Author(s), 2017. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Journal of Field Robotics 35 (2018): 705-716, doi:10.1002/rob.21771. For robots to succeed in complex missions, they must be reliable in the face of subsystem failures and environmental challenges. In this paper, we focus on autonomous underwater vehicle (AUV) autonomy as it pertains to self-perception and health monitoring, and we argue that automatic classification of state-sensor data represents an important enabling capability. We apply an online Bayesian nonparametric topic modeling technique to AUV sensor data in order to automatically characterize its performance patterns, then demonstrate how, in combination with operator-supplied semantic labels, these patterns can be used for fault detection and diagnosis by means of a nearest-neighbor classifier. The method is evaluated using data collected by the Monterey Bay Aquarium Research Institute's Tethys long-range AUV in three separate field deployments. Our results show that the proposed method is able to accurately identify and characterize patterns that correspond to various states of the AUV, and to classify faults at a high rate of correct detection with a very low false detection rate. Office of Naval Research Grant Number: N00014-14-1-0199; David and Lucile Packard Foundation.
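    The classification stage described above can be sketched as follows, assuming each mission segment has already been summarized as a topic-proportion vector by the online topic model; the vectors and labels here are placeholders, not Tethys data.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        # Topic-proportion vectors for segments with operator-supplied labels.
        train_topics = np.array([[0.8, 0.1, 0.1],    # "nominal"
                                 [0.1, 0.7, 0.2],    # "thruster_fault"
                                 [0.2, 0.1, 0.7]])   # "sensor_dropout"
        train_labels = ["nominal", "thruster_fault", "sensor_dropout"]

        clf = KNeighborsClassifier(n_neighbors=1).fit(train_topics, train_labels)

        # A new segment's topic mixture is classified by its nearest labelled neighbour.
        print(clf.predict(np.array([[0.15, 0.65, 0.20]])))   # -> ['thruster_fault']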

    Whole brain Probabilistic Generative Model toward Realizing Cognitive Architecture for Developmental Robots

    Building a humanlike integrative artificial cognitive system, that is, an artificial general intelligence, is one of the goals of artificial intelligence and developmental robotics. Furthermore, a computational model that enables an artificial cognitive system to achieve cognitive development will be an excellent reference for brain and cognitive science. This paper describes the development of a cognitive architecture using probabilistic generative models (PGMs) to fully mirror the human cognitive system. The integrative model is called a whole-brain PGM (WB-PGM). It is both brain-inspired and PGM-based. In this paper, the process of building the WB-PGM and learning from the human brain to build cognitive architectures is described. Comment: 55 pages, 8 figures, submitted to Neural Networks.

    CLAD: A Complex and Long Activities Dataset with Rich Crowdsourced Annotations

    This paper introduces a novel activity dataset which exhibits real-life and diverse scenarios of complex, temporally extended human activities and actions. The dataset presents a set of videos of actors performing everyday activities in a natural and unscripted manner. It was recorded using a static Kinect 2 sensor, which is commonly used on many robotic platforms. The dataset comprises RGB-D images, point cloud data and automatically generated skeleton tracks, in addition to crowdsourced annotations. Furthermore, we describe the methodology used to acquire annotations through crowdsourcing. Finally, some activity recognition benchmarks are presented using current state-of-the-art techniques. We believe that this dataset is particularly suitable as a testbed for activity recognition research, but it is also applicable to other common tasks in robotics/computer vision research, such as object detection and human skeleton tracking.

    3D Robotic Sensing of People: Human Perception, Representation and Activity Recognition

    The robots are coming. Their presence will eventually bridge the digital-physical divide and dramatically impact human life by taking over tasks where our current society has shortcomings (e.g., search and rescue, elderly care, and child education). Human-centered robotics (HCR) is a vision to address how robots can coexist with humans and help people live safer, simpler and more independent lives. As humans, we have a remarkable ability to perceive the world around us, perceive people, and interpret their behaviors. Endowing robots with these critical capabilities in highly dynamic human social environments is a significant but very challenging problem in practical human-centered robotics applications. This research focuses on robotic sensing of people, that is, how robots can perceive and represent humans and understand their behaviors, primarily through 3D robotic vision. In this dissertation, I begin with a broad perspective on human-centered robotics by discussing its real-world applications and significant challenges. Then, I will introduce a real-time perception system, based on the concept of Depth of Interest, to detect and track multiple individuals using a color-depth camera installed on moving robotic platforms. In addition, I will discuss human representation approaches based on local spatio-temporal features, including new “CoDe4D” features that incorporate both color and depth information, a new “SOD” descriptor to efficiently quantize 3D visual features, and the novel AdHuC features, which are capable of representing the activities of multiple individuals. Several new algorithms to recognize human activities are also discussed, including the RG-PLSA model, which allows us to discover activity patterns without supervision; the MC-HCRF model, which can explicitly investigate certainty in latent temporal patterns; and the FuzzySR model, which is used to segment continuous data into events and probabilistically recognize human activities. Cognition models based on recognition results are also implemented for decision making, allowing robotic systems to react to human activities. Finally, I will conclude with a discussion of future directions that will accelerate the upcoming technological revolution of human-centered robotics.
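    The dissertation's named features and models (CoDe4D, SOD, AdHuC, RG-PLSA, MC-HCRF, FuzzySR) are not public APIs, so no attempt is made to reproduce them here; purely as a hedged illustration, the sketch below shows the generic bag-of-features quantization step that local spatio-temporal descriptors commonly feed into before unsupervised pattern discovery, using placeholder data.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        descriptors = rng.normal(size=(1000, 64))   # placeholder local feature descriptors
        codebook = KMeans(n_clusters=50, random_state=0).fit(descriptors)

        def bag_of_features(obs_descriptors):
            """Histogram of codebook assignments for one observation's descriptors."""
            words = codebook.predict(obs_descriptors)
            return np.bincount(words, minlength=codebook.n_clusters)

        histogram = bag_of_features(rng.normal(size=(120, 64)))   # one observation -> counts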