Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research. Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages.
Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark
In this paper, we introduce a large Multi-Attribute and Language Search
dataset for text-based person retrieval, called MALS, and explore the
feasibility of performing pre-training on both attribute recognition and
image-text matching tasks jointly. In particular, MALS contains 1,510,330
image-text pairs, which is about 37.5 times larger than the prevailing CUHK-PEDES,
and all images are annotated with 27 attributes. Considering the privacy
concerns and annotation costs, we leverage the off-the-shelf diffusion models
to generate the dataset. To verify the feasibility of learning from the
generated data, we develop a new joint Attribute Prompt Learning and Text
Matching Learning (APTM) framework, considering the shared knowledge between
attribute and text. As the name implies, APTM contains an attribute prompt
learning stream and a text matching learning stream. (1) The attribute prompt
learning leverages the attribute prompts for image-attribute alignment, which
enhances the text matching learning. (2) The text matching learning facilitates
the representation learning on fine-grained details, and in turn, boosts the
attribute prompt learning. Extensive experiments validate the effectiveness of
the pre-training on MALS, achieving state-of-the-art retrieval performance via
APTM on three challenging real-world benchmarks. In particular, APTM improves
Recall@1 accuracy by +6.96%, +7.68%, and +16.95% on the CUHK-PEDES, ICFG-PEDES,
and RSTPReid datasets, respectively.
Pedestrian Models for Autonomous Driving Part II: High-Level Models of Human Behavior
Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as detecting and tracking them. This narrative review article is Part II of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychological models, from the perspective of an AV designer. This self-contained Part II covers the higher levels of this stack, consisting of models of pedestrian behaviour, from prediction of individual pedestrians’ likely destinations and paths, to game-theoretic models of interactions between pedestrians and autonomous vehicles. This survey clearly shows that, although there are good models for optimal walking behaviour, high-level psychological and social modelling of pedestrian behaviour still remains an open research question that requires many conceptual issues to be clarified. Early work has been done on descriptive and qualitative models of behaviour, but much work is still needed to translate them into quantitative algorithms for practical AV control.
Authoring virtual crowds: a survey
Recent advancements in crowd simulation unravel a wide range of functionalities for virtual agents, delivering highly-realistic, natural virtual crowds. Such systems are of particular importance to a variety of applications in fields such as: entertainment (e.g., movies, computer games); architectural and urban planning; and simulations for sports and training. However, providing their capabilities to untrained users necessitates the development of authoring frameworks. Authoring virtual crowds is a complex and multi-level task, varying from assuming control and assisting users to realise their creative intents, to delivering intuitive and easy to use interfaces, facilitating such control. In this paper, we present a categorisation of the authorable crowd simulation components, ranging from high-level behaviours and path-planning to local movements, as well as animation and visualisation. We provide a review of the most relevant methods in each area, emphasising the amount and nature of influence that the users have over the final result. Moreover, we discuss the currently available authoring tools (e.g., graphical user interfaces, drag-and-drop), identifying the trends of early and recent work. Finally, we suggest promising directions for future research that mainly stem from the rise of learning-based methods, and the need for a unified authoring framework. This work has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska Curie grant agreement No 860768 (CLIPE project). This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No 739578 and the Government of the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy. Peer Reviewed. Postprint (author's final draft).
Do You Need Instructions Again? Predicting Wayfinding Instruction Demand
The demand for instructions during wayfinding, defined as the frequency of requesting instructions for each decision point, can be considered an important indicator of the internal cognitive processes during wayfinding. This demand can be a consequence of the mental state of feeling lost, being uncertain, mind wandering, having difficulty following the route, etc. Therefore, it can be of great importance for theoretical cognitive studies on human perception of the environment. From an application perspective, this demand can be used as a measure of the effectiveness of the navigation assistance system. It is therefore worthwhile to be able to predict this demand and also to know what factors trigger it. This paper takes a step in this direction by reporting a successful prediction of instruction demand (accuracy of 78.4%) in a real-world wayfinding experiment with 45 participants, and interpreting the environmental, user, instructional, and gaze-related features that caused it.