2,573 research outputs found
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 page
Fine-grained traffic state estimation and visualisation
Tools for visualising the current traffic state are used by local authorities for strategic monitoring of the traffic network and by everyday users for planning their journey. Popular visualisations include those provided by Google Maps and by Inrix. Both employ a traffic lights colour-coding system, where roads on a map are coloured green if traffic is flowing normally and red or black if there is congestion. New sensor technology, especially from wireless sources, is allowing resolution down to lane level. A case study is reported in which a traffic micro-simulation test bed is used to generate high-resolution estimates. An interactive visualisation of the fine-grained traffic state is presented. The visualisation is demonstrated using Google Earth and affords the user a detailed three-dimensional view of the traffic state down to lane level in real time
aUToLights: A Robust Multi-Camera Traffic Light Detection and Tracking System
Following four successful years in the SAE AutoDrive Challenge Series I, the
University of Toronto is participating in the Series II competition to develop
a Level 4 autonomous passenger vehicle capable of handling various urban
driving scenarios by 2025. Accurate detection of traffic lights and correct
identification of their states is essential for safe autonomous operation in
cities. Herein, we describe our recently-redesigned traffic light perception
system for autonomous vehicles like the University of Toronto's self-driving
car, Artemis. Similar to most traffic light perception systems, we rely
primarily on camera-based object detectors. We deploy the YOLOv5 detector for
bounding box regression and traffic light classification across multiple
cameras and fuse the observations. To improve robustness, we incorporate priors
from high-definition semantic maps and perform state filtering using hidden
Markov models. We demonstrate a multi-camera, real time-capable traffic light
perception pipeline that handles complex situations including multiple visible
intersections, traffic light variations, temporary occlusion, and flashing
light states. To validate our system, we collected and annotated a varied
dataset incorporating flashing states and a range of occlusion types. Our
results show superior performance in challenging real-world scenarios compared
to single-frame, single-camera object detection.Comment: In Proceedings of the Conference on Robots and Vision (CRV'23),
Montreal, Canada, Jun. 6-8, 202
Recommended from our members
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls, accordingly
Traffic Light Recognition for Real Scenes Based on Image Processing and Deep Learning
Traffic light recognition in urban environments is crucial for vehicle control. Many studies have been devoted to recognizing traffic lights. However, existing recognition methods still face many challenges in terms of accuracy, runtime and size. This paper presents a novel robust traffic light recognition approach that takes into account these three aspects based on image processing and deep learning. The proposed approach adopts a two-stage architecture, first performing detection and then classification. In the detection, the perspective relationship and the fractal dimension are both considered to dramatically reduce the number of invalid candidate boxes, i.e. region proposals. In the classification, the candidate boxes are classified by SqueezeNet. Finally, the recognized traffic light boxes are reshaped by postprocessing. Compared with several reference models, this approach is significantly competitive in terms of accuracy and runtime. We show that our approach is lightweight, easy to implement, and applicable to smart terminals, mobile devices or embedded devices in practice
Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning
Recent advances in combining deep neural network architectures with
reinforcement learning techniques have shown promising potential results in
solving complex control problems with high dimensional state and action spaces.
Inspired by these successes, in this paper, we build two kinds of reinforcement
learning algorithms: deep policy-gradient and value-function based agents which
can predict the best possible traffic signal for a traffic intersection. At
each time step, these adaptive traffic light control agents receive a snapshot
of the current state of a graphical traffic simulator and produce control
signals. The policy-gradient based agent maps its observation directly to the
control signal, however the value-function based agent first estimates values
for all legal control signals. The agent then selects the optimal control
action with the highest value. Our methods show promising results in a traffic
network simulated in the SUMO traffic simulator, without suffering from
instability issues during the training process
- …