
    Is the Pedestrian going to Cross? Answering by 2D Pose Estimation

    Our recent work suggests that, thanks to today's powerful CNNs, image-based 2D pose estimation is a promising cue for determining pedestrian intentions such as crossing the road in the path of the ego-vehicle, stopping before entering the road, and starting to walk or bending towards the road. This statement is based on results obtained on non-naturalistic sequences (Daimler dataset), i.e., sequences choreographed specifically for the study. Fortunately, a new publicly available dataset (JAAD) has recently appeared that allows developing methods for detecting pedestrian intentions in naturalistic driving conditions; more specifically, for addressing the relevant question: is the pedestrian going to cross? Accordingly, in this paper we use JAAD to assess the usefulness of 2D pose estimation for answering this question. We combine CNN-based pedestrian detection, tracking and pose estimation to predict the crossing action from monocular images. Overall, the proposed pipeline provides new state-of-the-art results. Comment: This is a paper presented at the IEEE Intelligent Vehicles Symposium (IEEE IV 2018).

    Pedestrian Models for Autonomous Driving Part I: Low-Level Models, from Sensing to Tracking

    Autonomous vehicles (AVs) must share space with pedestrians, both in carriageway cases such as cars at pedestrian crossings and off-carriageway cases such as delivery vehicles navigating through crowds on pedestrianized high-streets. Unlike static obstacles, pedestrians are active agents with complex, interactive motions. Planning AV actions in the presence of pedestrians thus requires modelling of their probable future behaviour as well as detecting and tracking them. This narrative review article is Part I of a pair, together surveying the current technology stack involved in this process, organising recent research into a hierarchical taxonomy ranging from low-level image detection to high-level psychology models, from the perspective of an AV designer. This self-contained Part I covers the lower levels of this stack, from sensing, through detection and recognition, up to tracking of pedestrians. Technologies at these levels are found to be mature and available as foundations for use in high-level systems, such as behaviour modelling, prediction and interaction control.

    "Let’s get ready to bundle!": Crowd-level Human Keypoint Tracking

    This work examines the suitability of a state-of-the-art human pose tracking method for application within surveillance scenarios and focuses on public places in urban areas that tend to suffer from crowdedness, such as city centers. Starting with a short introduction to motivate keypoint tracking in surveillance applications, this report presents details about the adapted method, which follows an LSTM-based approach. Afterwards, the changes that had to be incorporated in order to successfully apply the given method to our target setting are presented. Finally, experiments with simulated data show how the chosen method performs.

    A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

    Tracking humans that are interacting with other subjects or the environment remains unsolved in visual tracking, because the visibility of the human of interest in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the cause-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure, which enables fast search algorithms such as dynamic programming. We apply the proposed method to challenging video sequences to evaluate its capability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Results with comparisons demonstrate that our method outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions. Comment: accepted by CVPR 201

    Pedestrian Environment Model for Automated Driving

    Besides interacting correctly with other vehicles, automated vehicles should also be able to react in a safe manner to vulnerable road users like pedestrians or cyclists. For a safe interaction between pedestrians and automated vehicles, the vehicle must be able to interpret the pedestrian's behavior. Common environment models do not contain information like body poses used to understand the pedestrian's intent. In this work, we propose an environment model that includes the position of the pedestrians as well as their pose information. We only use images from a monocular camera and the vehicle's localization data as input to our pedestrian environment model. We extract the skeletal information from the image with a neural network human pose estimator. Furthermore, we track the skeletons with a simple tracking algorithm based on the Hungarian algorithm and ego-motion compensation. To obtain the 3D information of the position, we aggregate the data from consecutive frames in conjunction with the vehicle position. We demonstrate our pedestrian environment model on data generated with the CARLA simulator and the nuScenes dataset. Overall, we reach a relative position error of around 16% on both datasets. Comment: Accepted for presentation at the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC 2023), 24-28 September 2023, Bilbao, Bizkaia, Spain.
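
    The frame-to-frame association step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: each skeleton is reduced to one hypothetical 2D reference point, ego-motion compensation is approximated by a single pixel shift, and a brute-force search over assignments stands in for the Hungarian algorithm (it finds the same minimum-cost matching, but is only practical for a handful of pedestrians). The names `compensate`, `associate`, `ego_shift` and `max_dist` are assumptions made for illustration.

```python
from itertools import permutations
from math import hypot


def compensate(point, ego_shift):
    """Move a previous-frame point by the image shift attributed to ego-motion (assumed known)."""
    return (point[0] + ego_shift[0], point[1] + ego_shift[1])


def associate(tracks, detections, ego_shift=(0.0, 0.0), max_dist=50.0):
    """Return {track_id: detection_index} minimising the total matching distance.

    tracks:     dict mapping a track id to its last (x, y) reference point
    detections: list of (x, y) reference points in the current frame
    Pairs farther apart than max_dist are dropped after the optimisation.
    """
    ids = list(tracks)
    predicted = [compensate(tracks[t], ego_shift) for t in ids]
    cost = [[hypot(p[0] - d[0], p[1] - d[1]) for d in detections] for p in predicted]
    n, m = len(ids), len(detections)
    # Enumerate every one-to-one pairing of the smaller set into the larger one.
    if n <= m:
        candidates = (list(zip(range(n), p)) for p in permutations(range(m), n))
    else:
        candidates = (list(zip(p, range(m))) for p in permutations(range(n), m))
    best = min(candidates, key=lambda pairs: sum(cost[i][j] for i, j in pairs))
    return {ids[i]: j for i, j in best if cost[i][j] <= max_dist}


tracks = {7: (100.0, 200.0), 9: (300.0, 220.0)}
detections = [(312.0, 221.0), (109.0, 201.0), (600.0, 50.0)]
matches = associate(tracks, detections, ego_shift=(10.0, 0.0))
```

    Here track 7 is matched to detection 1 and track 9 to detection 0, while the third detection stays unmatched and would start a new track. For larger scenes, a polynomial-time solver for the assignment problem would replace the enumeration.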

    Topological Mapping and Navigation in Real-World Environments

    We introduce the Hierarchical Hybrid Spatial Semantic Hierarchy (H2SSH), a hybrid topological-metric map representation. The H2SSH provides a more scalable representation of both small and large structures in the world than existing topological map representations, providing natural descriptions of a hallway lined with offices as well as a cluster of buildings on a college campus. By considering the affordances in the environment, we identify a division of space into three distinct classes: path segments afford travel between places at their ends, decision points present a choice amongst incident path segments, and destinations typically exist at the start and end of routes. Constructing an H2SSH map of the environment requires understanding both its local and global structure. We present a place detection and classification algorithm to create a semantic map representation that parses the free space in the local environment into a set of discrete areas representing features like corridors, intersections, and offices. Using these areas, we introduce a new probabilistic topological simultaneous localization and mapping algorithm based on lazy evaluation to estimate a probability distribution over possible topological maps of the global environment. After construction, an H2SSH map provides the necessary representations for navigation through large-scale environments. The local semantic map provides a high-fidelity metric map suitable for motion planning in dynamic environments, while the global topological map is a graph-like map that allows for route planning using simple graph search algorithms. For navigation, we have integrated the H2SSH with Model Predictive Equilibrium Point Control (MPEPC) to provide safe and efficient motion planning for our robotic wheelchair, Vulcan. However, navigation in human environments entails more than safety and efficiency, as human behavior is further influenced by complex cultural and social norms. 
We show how social norms for moving along corridors and through intersections can be learned by observing how pedestrians around the robot behave. We then integrate these learned norms with MPEPC to create a socially-aware navigation algorithm, SA-MPEPC. Through real-world experiments, we show how SA-MPEPC improves not only Vulcan’s adherence to social norms, but the adherence of pedestrians interacting with Vulcan as well. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144014/1/collinej_1.pd
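
The abstract above notes that a graph-like topological map allows route planning with simple graph search. As a rough illustration only, the sketch below runs breadth-first search over a small adjacency list; the place names and the `plan_route` function are hypothetical and are not taken from the H2SSH work.

```python
from collections import deque


def plan_route(graph, start, goal):
    """Breadth-first search: return the route with the fewest hops, or None if unreachable."""
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            route = []
            while node is not None:
                route.append(node)
                node = parents[node]
            return route[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical topological map: destinations, path segments and a decision point.
topo_map = {
    "office_a": ["corridor_1"],
    "corridor_1": ["office_a", "junction"],
    "junction": ["corridor_1", "corridor_2", "lobby"],
    "corridor_2": ["junction", "office_b"],
    "office_b": ["corridor_2"],
    "lobby": ["junction"],
}
route = plan_route(topo_map, "office_a", "office_b")
```

The resulting route passes through `corridor_1`, `junction` and `corridor_2`. Breadth-first search minimises the number of hops; weighting edges by travel distance would call for a shortest-path algorithm such as Dijkstra's instead.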

    User simulation of space utilisation : system for office building usage simulation
