
    Modeling High-Dimensional Humans for Activity Anticipation using Gaussian Process Latent CRFs

    For robots, the ability to model human configurations and temporal dynamics is crucial for the task of anticipating future human activities, yet it requires conflicting properties: on one hand, we need a detailed high-dimensional description of human configurations to reason about the physical plausibility of the prediction; on the other hand, we need a compact representation to be able to parsimoniously model the relations between the human and the environment. We therefore propose a new model, GP-LCRF, which admits both a high-dimensional and a low-dimensional representation of humans. It assumes that the high-dimensional representation is generated from a latent variable corresponding to its low-dimensional representation using a Gaussian process. The generative process not only defines the mapping function between the high- and low-dimensional spaces, but also models a distribution over humans that is embedded as a potential function in GP-LCRF, along with other potentials that jointly model the rich context among humans, objects and the activity. Through extensive experiments on activity anticipation, we show that our GP-LCRF consistently outperforms the state of the art and reduces the predicted human trajectory error by 11.6%.
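    A minimal sketch of the core idea above, a generative Gaussian-process mapping from a low-dimensional latent representation back to a high-dimensional pose, using scikit-learn. The dimensionalities, the PCA embedding and the random data are stand-ins; the actual GP-LCRF couples such a mapping with CRF potentials over humans, objects and activities.

    ```python
    # Illustrative sketch only: a GP regression from a low-dimensional embedding
    # back to a high-dimensional pose, loosely analogous to the generative mapping
    # described above (the actual GP-LCRF adds CRF potentials on top of this).
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    poses = rng.normal(size=(200, 45))        # stand-in for 45-D joint configurations

    # Low-dimensional representation (here simply PCA; the paper learns a latent space).
    latent = PCA(n_components=3).fit_transform(poses)

    # The GP maps latent coordinates back to full-body configurations.
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-3))
    gp.fit(latent, poses)

    # Predict a pose (with uncertainty) for a new latent point.
    z_new = latent[:1] + 0.1
    pose_mean, pose_std = gp.predict(z_new, return_std=True)
    print(pose_mean.shape, pose_std.shape)
    ```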

    Action Recognition: From Static Datasets to Moving Robots

    Deep learning models have achieved state-of-the-art performance in recognizing human activities, but often rely on background cues present in typical computer vision datasets, which predominantly have a stationary camera. If these models are to be employed by autonomous robots in real-world environments, they must be adapted to perform independently of background cues and camera-motion effects. To address these challenges, we propose a new method that first generates generic action region proposals with good potential to locate a human action in unconstrained videos regardless of camera motion, and then uses these proposals to extract and classify effective shape and motion features with a ConvNet framework. In a range of experiments, we demonstrate that by actively proposing action regions during both training and testing, state-of-the-art or better performance is achieved on benchmarks. We show that our approach outperforms the state of the art on two new datasets: one emphasizes irrelevant background, the other highlights camera motion. We also validate our action recognition method in an abnormal behavior detection scenario to improve workplace safety. The results verify a higher success rate for our method, owing to the ability of our system to recognize human actions regardless of environment and camera motion.
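    A minimal sketch of the propose-then-classify structure described above: hypothetical action-region boxes are cropped from a frame and scored by a small ConvNet. The boxes, network and input sizes are illustrative assumptions, not the authors' proposal generator or architecture.

    ```python
    # Sketch only: crop assumed action-region proposals and classify them with a
    # tiny ConvNet, mirroring the "propose, then classify" pipeline structure.
    import torch
    import torch.nn as nn

    class ProposalClassifier(nn.Module):
        def __init__(self, num_actions=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, num_actions)

        def forward(self, crops):              # crops: (N, 3, 64, 64) proposal regions
            return self.head(self.features(crops).flatten(1))

    frame = torch.rand(3, 240, 320)            # one RGB frame
    proposals = [(50, 40, 114, 104), (120, 80, 184, 144)]   # hypothetical (x1, y1, x2, y2) boxes

    crops = torch.stack([
        nn.functional.interpolate(frame[:, y1:y2, x1:x2].unsqueeze(0), size=(64, 64)).squeeze(0)
        for (x1, y1, x2, y2) in proposals
    ])
    logits = ProposalClassifier()(crops)       # one action score vector per proposal
    print(logits.shape)                        # torch.Size([2, 10])
    ```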

    Human Activity Recognition and Prediction using RGBD Data

    Being able to predict and recognize human activities is an essential element of how we effectively communicate with other humans during our day-to-day activities. A system that can do this has a number of appealing applications, from assistive robotics to health care and preventative medicine. Previous work in supervised video-based human activity prediction and detection fails to capture the richness of the spatiotemporal data that these activities generate. Convolutional long short-term memory (Convolutional LSTM) networks are a useful tool for analyzing this type of data and have shown good results in many other areas. This thesis focuses on utilizing RGB-D data to improve human activity prediction and recognition, and a modified Convolutional LSTM network is introduced to do so. Experiments are performed on the network and compared with other models in use as well as the current state-of-the-art system. We show that our proposed model for human activity prediction and recognition outperforms the current state-of-the-art models on the CAD-120 dataset without being given bounding frames or ground truths about objects.
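    A minimal Convolutional LSTM cell sketch in PyTorch, illustrating the kind of recurrent unit the thesis builds on. The channel counts and the way RGB and depth are stacked into four input channels here are assumptions; the thesis' modified network differs from this toy version.

    ```python
    # Sketch of a ConvLSTM cell rolled over a short RGB-D clip (assumed layout:
    # RGB + depth stacked as 4 channels). Not the thesis' actual architecture.
    import torch
    import torch.nn as nn

    class ConvLSTMCell(nn.Module):
        def __init__(self, in_ch, hid_ch, k=3):
            super().__init__()
            # A single convolution produces the input, forget, output and cell gates.
            self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

        def forward(self, x, state):
            h, c = state
            gates = self.conv(torch.cat([x, h], dim=1))
            i, f, o, g = gates.chunk(4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            return h, c

    cell = ConvLSTMCell(in_ch=4, hid_ch=16)
    clip = torch.rand(8, 2, 4, 64, 64)         # (time, batch, channels, H, W)
    h = torch.zeros(2, 16, 64, 64)
    c = torch.zeros(2, 16, 64, 64)
    for frame in clip:                          # unroll the cell over time
        h, c = cell(frame, (h, c))
    print(h.shape)                              # torch.Size([2, 16, 64, 64])
    ```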

    Goal Set Inverse Optimal Control and Iterative Re-planning for Predicting Human Reaching Motions in Shared Workspaces

    To enable safe and efficient human-robot collaboration in shared workspaces, it is important for the robot to predict how a human will move when performing a task. While predicting human motion for tasks not known a priori is very challenging, we argue that single-arm reaching motions for known tasks in collaborative settings (which are especially relevant for manufacturing) are indeed predictable. Two hypotheses underlie our approach for predicting such motions: first, that the trajectory the human performs is optimal with respect to an unknown cost function, and second, that human adaptation to their partner's motion can be captured well through iterative re-planning with that cost function. The key to our approach is thus to learn a cost function which "explains" the motion of the human. To do this, we gather example trajectories from pairs of participants performing a collaborative assembly task using motion capture. We then use Inverse Optimal Control to learn a cost function from these trajectories. Finally, we predict reaching motions from the human's current configuration to a task-space goal region by iteratively re-planning a trajectory using the learned cost function. Our planning algorithm is based on the trajectory optimizer STOMP; it plans for a 23-DoF human kinematic model and accounts for the presence of a moving collaborator and obstacles in the environment. Our results suggest that in most cases our method outperforms baseline methods when predicting motions. We also show that our method outperforms baselines for predicting human motion when a human and a robot share the workspace.
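    A minimal STOMP-flavoured re-planning sketch: a trajectory is refined by sampling noisy perturbations and averaging them with weights derived from a cost function. The 2-D waypoints, the hand-written cost and its weights are illustrative stand-ins; the paper instead learns the cost via Inverse Optimal Control and plans for a 23-DoF human model around a moving collaborator.

    ```python
    # Sketch only: stochastic trajectory refinement in the spirit of STOMP, with
    # a hypothetical hand-written cost standing in for the learned IOC cost.
    import numpy as np

    def cost(traj, obstacle, w_smooth=1.0, w_clear=5.0):
        """Hypothetical cost: smoothness plus a penalty for passing near an obstacle."""
        smooth = np.sum(np.diff(traj, axis=0) ** 2)
        clear = np.sum(np.exp(-np.linalg.norm(traj - obstacle, axis=1)))
        return w_smooth * smooth + w_clear * clear

    def replan(start, goal, obstacle, n_way=20, n_samples=32, iters=50, sigma=0.05):
        rng = np.random.default_rng(0)
        traj = np.linspace(start, goal, n_way)              # straight-line initialisation
        for _ in range(iters):
            noise = rng.normal(0.0, sigma, size=(n_samples,) + traj.shape)
            noise[:, 0] = noise[:, -1] = 0.0                # keep endpoints fixed
            costs = np.array([cost(traj + n, obstacle) for n in noise])
            weights = np.exp(-(costs - costs.min()))        # low-cost samples dominate
            traj = traj + (weights[:, None, None] * noise).sum(0) / weights.sum()
        return traj

    start, goal = np.array([0.0, 0.0]), np.array([1.0, 0.0])
    obstacle = np.array([0.5, 0.02])                        # e.g. the partner's hand
    print(replan(start, goal, obstacle)[:3])
    ```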

    A framework for digitisation of manual manufacturing task knowledge using gaming interface technology

    Intense market competition and the global skill-supply crunch are hurting the manufacturing industry, which is heavily dependent on skilled labour. To remain competitive, companies must look for innovative ways to acquire manufacturing skills from their experts and transfer them to novices and, eventually, to machines. Both the manufacturing industry and research lack systematic processes for the cost-effective capture and transfer of human skills. The aim of this research is therefore to develop a framework for the digitisation of manual manufacturing task knowledge, a major constituent of which is human skill. The proposed digitisation framework is based on the theory of human-workpiece interactions developed in this research. The unique aspect of the framework is the use of consumer-grade gaming interface technology to capture and record manual manufacturing tasks in digital form, enabling the extraction, decoding and transfer of the manufacturing knowledge constituents associated with the task. The framework is implemented, tested and refined using five case studies: one toy assembly task, two real-life-like assembly tasks, one simulated assembly task and one real-life composite layup task. It is successfully validated based on the outcomes of the case studies and a benchmarking exercise conducted to evaluate its performance. This research contributes to knowledge in five main areas: (1) the theory of human-workpiece interactions to decipher human behaviour in manual manufacturing tasks, (2) a cohesive and holistic framework to digitise manual manufacturing task knowledge, especially tacit knowledge such as human action and reaction skills, (3) the use of low-cost gaming interface technology to capture human actions and the effect of those actions on workpieces during a manufacturing task, (4) a new way to use hidden Markov modelling to produce digital skill models that represent human ability to perform complex tasks, and (5) the extraction and decoding of manufacturing knowledge constituents from the digital skill models.
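    A minimal sketch of the hidden-Markov-modelling step using hmmlearn: synthetic 6-D motion features stand in for the captured hand data, and a small Gaussian HMM plays the role of the digital skill model. The framework's actual features, state count and decoding pipeline are its own.

    ```python
    # Illustrative only: fit a Gaussian HMM to (synthetic) captured motion sequences
    # and decode a sequence into its most likely skill phases.
    import numpy as np
    from hmmlearn import hmm

    rng = np.random.default_rng(0)
    # Stand-in for sensor streams: two demonstrations of a task, each a sequence of
    # 6-D features (e.g. hand position and orientation from a gaming interface).
    demo1 = rng.normal(size=(120, 6))
    demo2 = rng.normal(size=(90, 6))
    X = np.vstack([demo1, demo2])
    lengths = [len(demo1), len(demo2)]

    # A 5-state HMM as the "digital skill model".
    skill_model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50)
    skill_model.fit(X, lengths)

    # Decode a demonstration into its most likely sequence of skill states.
    states = skill_model.predict(demo1)
    print(states[:20])
    ```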