Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research. Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages
Towards Active Event Recognition
Directing robot attention to recognise activities and to anticipate events like goal-directed actions is a crucial skill for human-robot interaction. Unfortunately, issues like intrinsic time constraints, the spatially distributed nature of the entailed information sources, and the existence of a multitude of unobservable states affecting the system, like latent intentions, have long rendered achievement of such skills a rather elusive goal. The problem tests the limits of current attention control systems. It requires an integrated solution for tracking, exploration and recognition, which traditionally have been seen as separate problems in active vision. We propose a probabilistic generative framework based on a mixture of Kalman filters and information gain maximisation that uses predictions in both recognition and attention-control. This framework can efficiently use the observations of one element in a dynamic environment to provide information on other elements, and consequently enables guided exploration. Interestingly, the sensors-control policy, directly derived from first principles, represents the intuitive trade-off between finding the most discriminative clues and maintaining overall awareness. Experiments on a simulated humanoid robot observing a human executing goal-oriented actions demonstrated improvement in recognition time and precision over baseline systems.
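The combination of a filter mixture with information gain maximisation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each hypothesis (for example, a candidate goal) has its own Kalman filter whose prediction step has already produced a scalar (mean, variance) for each observable target, and it approximates the expected information gain by evaluating the posterior only at each filter's predicted mean. All function names and numbers are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def update_weights(weights, predictions, obs, obs_var):
    """Bayes update of the hypothesis weights given one scalar observation.

    predictions[k] is the (mean, variance) that filter k predicts for the
    observed quantity; obs_var is the assumed sensor noise variance.
    """
    unnorm = [w * gaussian_pdf(obs, m, v + obs_var)
              for w, (m, v) in zip(weights, predictions)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def expected_gain(weights, predictions, obs_var):
    """Crude expected entropy reduction from observing one target.

    Assumption: the expectation over observations is approximated by
    evaluating the update at each filter's predicted mean, weighted by
    that filter's current probability.
    """
    prior_h = entropy(weights)
    gain = 0.0
    for w, (m, _) in zip(weights, predictions):
        posterior = update_weights(weights, predictions, m, obs_var)
        gain += w * (prior_h - entropy(posterior))
    return gain

def select_target(weights, targets, obs_var):
    """Return the index of the target with the largest expected gain."""
    gains = [expected_gain(weights, preds, obs_var) for preds in targets]
    best = max(range(len(targets)), key=gains.__getitem__)
    return best, gains

# Two hypotheses, two candidate targets to look at (hypothetical numbers).
weights = [0.5, 0.5]
targets = [
    [(0.0, 1.0), (0.0, 1.0)],  # both hypotheses predict the same thing
    [(0.0, 1.0), (5.0, 1.0)],  # hypotheses disagree, so looking is informative
]
best, gains = select_target(weights, targets, obs_var=0.5)
print(best, gains)
```

In this toy setting the second target is preferred because the hypotheses disagree about it, which mirrors the trade-off the abstract mentions: attention goes to the most discriminative clue, while targets that cannot separate the hypotheses yield no gain.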
3D Robotic Sensing of People: Human Perception, Representation and Activity Recognition
The robots are coming. Their presence will eventually bridge the digital-physical divide and dramatically impact human life by taking over tasks where our current society has shortcomings (e.g., search and rescue, elderly care, and child education). Human-centered robotics (HCR) is a vision to address how robots can coexist with humans and help people live safer, simpler and more independent lives.
As humans, we have a remarkable ability to perceive the world around us, perceive people, and interpret their behaviors. Endowing robots with these critical capabilities in highly dynamic human social environments is a significant but very challenging problem in practical human-centered robotics applications.
This research focuses on robotic sensing of people, that is, how robots can perceive and represent humans and understand their behaviors, primarily through 3D robotic vision. In this dissertation, I begin with a broad perspective on human-centered robotics by discussing its real-world applications and significant challenges. Then, I will introduce a real-time perception system, based on the concept of Depth of Interest, to detect and track multiple individuals using a color-depth camera that is installed on moving robotic platforms. In addition, I will discuss human representation approaches, based on local spatio-temporal features, including new "CoDe4D" features that incorporate both color and depth information, a new "SOD" descriptor to efficiently quantize 3D visual features, and the novel AdHuC features, which are capable of representing the activities of multiple individuals. Several new algorithms to recognize human activities are also discussed, including the RG-PLSA model, which allows us to discover activity patterns without supervision, the MC-HCRF model, which can explicitly investigate certainty in latent temporal patterns, and the FuzzySR model, which is used to segment continuous data into events and probabilistically recognize human activities. Cognition models based on recognition results are also implemented for decision making that allow robotic systems to react to human activities. Finally, I will conclude with a discussion of future directions that will accelerate the upcoming technological revolution of human-centered robotics.
A Novel Predictive-Coding-Inspired Variational RNN Model for Online Prediction and Recognition
This study introduces PV-RNN, a novel variational RNN inspired by
predictive-coding ideas. The model learns to extract the probabilistic
structures hidden in fluctuating temporal patterns by dynamically changing the
stochasticity of its latent states. Its architecture attempts to address two
major concerns of variational Bayes RNNs: how can latent variables learn
meaningful representations and how can the inference model transfer future
observations to the latent variables. PV-RNN does both by introducing adaptive
vectors mirroring the training data, whose values can then be adapted
differently during evaluation. Moreover, prediction errors during
backpropagation, rather than external inputs during the forward computation,
are used to convey information to the network about the external data. For
testing, we introduce error regression for predicting unseen sequences as
inspired by predictive coding that leverages those mechanisms. The model
introduces a weighting parameter, the meta-prior, to balance the optimization
pressure placed on two terms of a lower bound on the marginal likelihood of the
sequential data. We test the model on two datasets with probabilistic
structures and show that with high values of the meta-prior the network
develops deterministic chaos through which the data's randomness is imitated.
For low values, the model behaves as a random process. The network performs
best on intermediate values, and is able to capture the latent probabilistic
structure with good generalization. Analyzing the meta-prior's impact on the
network allows us to precisely study the theoretical value and practical benefits
of incorporating stochastic dynamics in our model. We demonstrate better
prediction performance on a robot imitation task with our model using error
regression compared to a standard variational Bayes model lacking such a
procedure. Comment: The paper is accepted in Neural Computation
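The role of the meta-prior described in this abstract can be sketched with a generic weighted lower bound. This is an illustration under assumptions, not the paper's exact objective: the abstract says only that a weight balances two terms of the bound, so placing the weight $w$ on the regularization (KL) term, and the symbols $x_{1:T}$, $z_{1:T}$, $q_\phi$ and $p_\theta$, are choices made here for exposition.

```latex
% Generic weighted variational lower bound (illustrative assumption):
% x_{1:T} is the observed sequence, z_{1:T} the latent states,
% q_\phi the inference model, p_\theta the generative model,
% and w the meta-prior weighting the regularization term.
\mathcal{L}_w(\theta,\phi)
  = \sum_{t=1}^{T} \Big(
      \mathbb{E}_{q_\phi(z_t \mid \cdot)}\big[\log p_\theta(x_t \mid z_t, \cdot)\big]
      \;-\; w \, D_{\mathrm{KL}}\big(q_\phi(z_t \mid \cdot)\,\big\|\,p_\theta(z_t \mid \cdot)\big)
    \Big)
```

With $w = 1$ this reduces to a standard variational lower bound; changing $w$ shifts the optimization pressure between reconstructing the data and keeping the posterior close to the prior, which is the trade-off the abstract studies.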
Quantitative Measures of Regret and Trust in Human-Robot Collaboration Systems
Human-robot collaboration (HRC) systems integrate the strengths of both humans and robots to improve the joint system performance. In this thesis, we focus on social human-robot interaction (sHRI) factors, in particular regret and trust. Humans experience regret during decision-making under uncertainty when they feel that a better result could have been obtained had they chosen differently. A framework to quantitatively measure regret is proposed in this thesis. We embed quantitative regret analysis into Bayesian sequential decision-making (BSD) algorithms for HRC shared vision tasks in both domain search and assembly tasks. The BSD method has been used for robot decision-making tasks, but it has been shown to differ substantially from human decision-making patterns. Regret theory, in contrast, qualitatively models a human's rational decision-making behavior under uncertainty. Moreover, it has been shown that the joint performance of a team improves if all members share the same decision-making logic. Trust plays a critical role in determining the level of a human's acceptance, and hence utilization, of a robot. A dynamic-network-based trust model combining the time-series trust model is first implemented in a multi-robot motion planning task with a human-in-the-loop. However, in this model, the trust estimates for each robot are independent, which fails to model the correlative trust in multi-robot collaboration. To address this issue, the above model is extended to interdependent multi-robot Dynamic Bayesian Networks.
Visual Dynamics Models for Robotic Planning and Control
For a robot to interact with its environment, it must perceive the world and understand how the world evolves as a consequence of its actions. This thesis studies a few methods that a robot can use to respond to its observations, with a focus on instances that can leverage visual dynamic models. In general, these are models of how the visual observations of a robot evolve as a consequence of its actions. This could be in the form of predictive models that directly predict the future in the space of image pixels, in the space of visual features extracted from these images, or in the space of compact learned latent representations. The three instances that this thesis studies are in the context of visual servoing, visual planning, and representation learning for reinforcement learning. In the first case, we combine learned visual features with learned single-step predictive dynamics models and reinforcement learning to learn visual servoing mechanisms. In the second case, we use a deterministic multi-step video prediction model to achieve various manipulation tasks through visual planning. In addition, we show that conventional video prediction models are unequipped to model uncertainty and multiple futures, which could limit the planning capabilities of the robot. To address this, we propose a stochastic video prediction model that is trained with a combination of variational losses, adversarial losses, and perceptual losses, and show that this model can predict futures that are more realistic, diverse, and accurate. Unlike the first two cases, in which the dynamics model is used to make predictions for decision-making, the third case learns the model solely for representation learning. We learn a stochastic sequential latent variable model to obtain a latent representation, and then use it as an intermediate representation for reinforcement learning. We show that this approach improves final performance and sample efficiency.