3,193 research outputs found
PIP-Net: Pedestrian Intention Prediction in the Wild
Accurate pedestrian intention prediction (PIP) remains an open research
challenge for Autonomous Vehicles (AVs). In this article, we
introduce PIP-Net, a novel framework designed to predict pedestrian crossing
intentions for AVs in real-world urban scenarios. We offer two variants of
PIP-Net designed for different camera mounts and setups. Leveraging both
kinematic data and spatial features from the driving scene, the proposed model
employs a recurrent, temporal attention-based solution that outperforms the
state of the art. To enhance the visual representation of road
users and their proximity to the ego vehicle, we introduce a categorical depth
feature map, combined with a local motion flow feature, providing rich insights
into the scene dynamics. Additionally, we explore the impact of expanding the
camera's field of view, from one to three cameras surrounding the ego vehicle,
which enhances the model's contextual perception. Depending on the
traffic scenario and road environment, the model excels in predicting
pedestrian crossing intentions up to 4 seconds in advance, a breakthrough
relative to current studies in pedestrian intention prediction.
Finally, for the first time, we present the Urban-PIP dataset, a customised
pedestrian intention prediction dataset, with multi-camera annotations in
real-world automated driving scenarios.
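The abstract does not specify the attention mechanism, but the core idea of a temporal attention-based solution over per-frame scene features can be sketched as follows. This is a minimal illustration only, assuming a learned query vector that pools a sequence of fused visual/kinematic embeddings; all names, dimensions, and the final classifier head are hypothetical, not PIP-Net's actual architecture:

```python
import numpy as np

def temporal_attention_pool(features, query):
    # features: (T, d) per-frame embeddings; query: (d,) learned query vector.
    # Scaled dot-product scores, softmax-normalized over the T frames.
    scores = features @ query / np.sqrt(features.shape[1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ features, weights  # (d,) pooled context, (T,) attention

rng = np.random.default_rng(0)
T, d = 16, 32                      # 16 observed frames, 32-dim features
frames = rng.normal(size=(T, d))   # stand-in for fused visual/kinematic features
query = rng.normal(size=d)
context, attn = temporal_attention_pool(frames, query)
crossing_logit = context.mean()    # stand-in for a crossing-intention head
```

The attention weights let the model emphasize the frames most indicative of an imminent crossing, rather than averaging the whole observation window uniformly.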
Multisource and Multitemporal Data Fusion in Remote Sensing
The sharp and recent increase in the availability of data captured by
different sensors combined with their considerably heterogeneous natures poses
a serious challenge for the effective and efficient processing of remotely
sensed data. Such an increase in remote sensing and ancillary datasets,
however, opens up the possibility of utilizing multimodal datasets in a joint
manner to further improve the performance of the processing approaches with
respect to the application at hand. Multisource data fusion has, therefore,
received enormous attention from researchers worldwide for a wide variety of
applications. Moreover, thanks to the revisit capability of several spaceborne
sensors, the integration of the temporal information with the spatial and/or
spectral/backscattering information of the remotely sensed data is possible and
helps to move from a representation of 2D/3D data to 4D data structures, where
the time variable adds new information as well as challenges for the
information extraction algorithms. A large body of research is dedicated to
multisource and multitemporal data fusion, but methods for fusing different
modalities have evolved along different paths within each research community.
This paper brings together the advances of multisource
and multitemporal data fusion approaches with respect to different research
communities and provides a thorough and discipline-specific starting point for
researchers at different levels (i.e., students, researchers, and senior
researchers) willing to conduct novel investigations on this challenging topic
by supplying sufficient detail and references.
Deep Learning based 3D Segmentation: A Survey
3D object segmentation is a fundamental and challenging problem in computer
vision with applications in autonomous driving, robotics, augmented reality and
medical image analysis. It has received significant attention from the computer
vision, graphics and machine learning communities. Traditionally, 3D
segmentation was performed with hand-crafted features and engineered methods
which failed to achieve acceptable accuracy and could not generalize to
large-scale data. Driven by their great success in 2D computer vision, deep
learning techniques have recently become the tool of choice for 3D segmentation
tasks as well. This has led to an influx of methods in the literature,
evaluated on different benchmark datasets. This paper
provides a comprehensive survey of recent progress in deep learning based 3D
segmentation covering over 150 papers. It summarizes the most commonly used
pipelines, discusses their highlights and shortcomings, and analyzes the
competitive results of these segmentation methods. Based on the analysis, it
also provides promising research directions for the future.
Comment: Under review at ACM Computing Surveys, 36 pages, 10 tables, 9 figures
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting the future
positions of dynamic agents, and planning that accounts for such predictions,
are key tasks for self-driving vehicles, service robots and advanced
surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.
Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 pages
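Surveys in this area commonly measure learned predictors against simple physics-based baselines. A minimal sketch of one such baseline, constant-velocity extrapolation of an observed track (the track values and horizon here are illustrative, not from the survey):

```python
import numpy as np

def constant_velocity_forecast(track, horizon):
    """Extrapolate the last observed velocity for `horizon` future steps."""
    # track: (T, 2) observed xy positions sampled at a fixed frame rate.
    velocity = track[-1] - track[-2]              # displacement per step
    steps = np.arange(1, horizon + 1)[:, None]    # (horizon, 1)
    return track[-1] + steps * velocity           # (horizon, 2) future xy

observed = np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.2]])
future = constant_velocity_forecast(observed, horizon=4)
# future[0] -> [1.5, 0.3]; future[3] -> [3.0, 0.6]
```

Despite ignoring all contextual information, this baseline is surprisingly competitive over short horizons, which is why context-aware methods are usually evaluated against it.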
3D Multiple Object Tracking on Autonomous Driving: A Literature Review
3D multi-object tracking (3D MOT) stands as a pivotal domain within
autonomous driving, experiencing a surge in scholarly interest and commercial
promise over recent years. Despite its paramount significance, 3D MOT confronts
a myriad of formidable challenges, encompassing abrupt alterations in object
appearances, pervasive occlusion, the presence of diminutive targets, data
sparsity, missed detections, and the unpredictable initiation and termination
of object motion trajectories. Countless methodologies have emerged to grapple
with these issues, yet 3D MOT endures as a formidable problem that warrants
further exploration. This paper undertakes a comprehensive examination,
assessment, and synthesis of the research landscape in this domain, remaining
attuned to the latest developments in 3D MOT while suggesting prospective
avenues for future investigation. Our exploration commences with a systematic
exposition of key facets of 3D MOT and its associated domains, including
problem delineation, classification, methodological approaches, fundamental
principles, and empirical investigations. Subsequently, we categorize these
methodologies into distinct groups, dissecting each group meticulously with
regard to its challenges, underlying rationale, progress, merits, and demerits.
Furthermore, we present a concise recapitulation of experimental metrics and
offer an overview of prevalent datasets, facilitating a quantitative comparison
for a more intuitive assessment. Lastly, our deliberations culminate in a
discussion of the prevailing research landscape, highlighting extant challenges
and charting possible directions for 3D MOT research. We present a structured
and lucid roadmap to guide forthcoming endeavors in this field.
Comment: 24 pages, 6 figures, 2 tables
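A recurring building block across the tracking-by-detection methods such reviews categorize is data association: matching existing tracks to new detections frame by frame. A minimal sketch of greedy nearest-neighbor association on 3D centroids (the gating threshold, coordinates, and function name are illustrative, not from any particular tracker):

```python
import numpy as np

def greedy_associate(tracks, detections, max_dist=2.0):
    """Greedily match track centroids to detection centroids by 3D distance."""
    # tracks: (M, 3) and detections: (N, 3) xyz centroids in meters.
    dist = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    matches = []
    # Repeatedly take the closest remaining pair within the gating distance.
    while dist.size and dist.min() <= max_dist:
        i, j = np.unravel_index(dist.argmin(), dist.shape)
        matches.append((i, j))
        dist[i, :] = np.inf   # each track matched at most once
        dist[:, j] = np.inf   # each detection matched at most once
    return matches

tracks = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
dets = np.array([[4.8, 0.1, 0.0], [0.2, 0.0, 0.0], [20.0, 0.0, 0.0]])
matches = greedy_associate(tracks, dets)
# matches pairs each track with its nearest detection; the unmatched far
# detection (20 m away) would spawn a new track in a full tracker.
```

Full 3D MOT systems typically replace the greedy loop with Hungarian assignment and the raw distance with motion-model or appearance costs, but the gate-then-match structure is the same.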