2,337 research outputs found

    Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

    Multi-modal 3D scene understanding has gained considerable attention due to its wide applications in many areas, such as autonomous driving and human-computer interaction. Compared to conventional single-modal 3D understanding, introducing an additional modality not only elevates the richness and precision of scene interpretation but also ensures a more robust and resilient understanding. This becomes especially crucial in varied and challenging environments where solely relying on 3D data might be inadequate. While there has been a surge in the development of multi-modal 3D methods over the past three years, especially those integrating multi-camera images (3D+2D) and textual descriptions (3D+language), a comprehensive and in-depth review is notably absent. In this article, we present a systematic survey of recent progress to bridge this gap. We begin by briefly introducing background that formally defines various 3D multi-modal tasks and summarizes their inherent challenges. After that, we present a novel taxonomy that delivers a thorough categorization of existing methods according to modalities and tasks, exploring their respective strengths and limitations. Furthermore, comparative results of recent approaches on several benchmark datasets, together with insightful analysis, are offered. Finally, we discuss the unresolved issues and provide several potential avenues for future research.

    LiDAR aided simulation pipeline for wireless communication in vehicular traffic scenarios

    Integrated Sensing and Communication (ISAC) is a modern technology under development for Sixth Generation (6G) systems. This thesis focuses on creating a simulation pipeline for dynamic vehicular traffic scenarios and a novel approach to reducing wireless communication overhead with a Light Detection and Ranging (LiDAR) based system. The simulation pipeline can be used to generate data sets for numerous problems. Additionally, the developed error model for vehicle detection algorithms can be used to identify LiDAR performance with respect to different parameters like LiDAR height, range, and laser point density. LiDAR behavior in the traffic environment is provided as part of the results in this study. A periodic beam index map is developed by capturing the antenna azimuth and elevation angles that yield the maximum Reference Signal Received Power (RSRP) for a simulated receiver grid on the road, and by classifying areas with a Support Vector Machine (SVM) algorithm, to reduce the number of Synchronization Signal Blocks (SSBs) that need to be sent in Vehicle to Infrastructure (V2I) communication. This approach effectively reduces the wireless communication overhead in V2I communication.

    Synthetic Datasets for Autonomous Driving: A Survey

    Autonomous driving techniques have been flourishing in recent years while thirsting for huge amounts of high-quality data. However, it is difficult for real-world datasets to keep up with the pace of changing requirements due to their expensive and time-consuming experimental and labeling costs. Therefore, more and more researchers are turning to synthetic datasets to easily generate rich and changeable data as an effective complement to the real world and to improve the performance of algorithms. In this paper, we summarize the evolution of synthetic dataset generation methods and review the work to date on synthetic datasets related to single-task and multi-task categories for autonomous driving research. We also discuss the role that synthetic datasets play in evaluation and gap testing, and their positive effect on the testing of autonomous driving related algorithms, especially regarding trustworthiness and safety. Finally, we discuss general trends and possible development directions. To the best of our knowledge, this is the first survey focusing on the application of synthetic datasets in autonomous driving. This survey also raises awareness of the problems of real-world deployment of autonomous driving technology and provides researchers with a possible solution. Comment: 19 pages, 5 figures

    A Review of Environmental Context Detection for Navigation Based on Multiple Sensors

    Current navigation systems use multi-sensor data to improve localization accuracy, but often without certainty about the quality of those measurements in certain situations. Context detection will enable us to build an adaptive navigation system that improves the precision and robustness of its localization solution by anticipating possible degradation in sensor signal quality (GNSS in urban canyons, for instance, or camera-based navigation in a non-textured environment). That is why context detection is considered the future of navigation systems. Thus, it is important first to define this concept of context for navigation and to find a way to extract it from available information. This paper overviews existing GNSS and on-board vision-based solutions for environmental context detection. This review shows that most state-of-the-art research works focus on only one type of data. It confirms that the main perspective for this problem is to combine different indicators from multiple sensors.

    Seamless Positioning and Navigation in Urban Environment

    The abstract is in the attachment.

    Self-supervised learning for point cloud data: A survey

    3D point clouds are a crucial type of data collected by LiDAR sensors and are widely used in transportation applications due to their concise descriptions and accurate localization. Deep neural networks (DNNs) have achieved remarkable success in processing large amounts of disordered and sparse 3D point clouds, especially in various computer vision tasks, such as pedestrian detection and vehicle recognition. Among all the learning paradigms, Self-Supervised Learning (SSL), an unsupervised training paradigm that mines effective information from the data itself, is considered an essential solution to the time-consuming and labor-intensive data labeling problem via smart pre-training task design. This paper provides a comprehensive survey of recent advances in SSL for point clouds. We first present an innovative taxonomy, categorizing the existing SSL methods into four broad categories based on the pretexts’ characteristics. Under each category, we then further categorize the methods into more fine-grained groups and summarize the strengths and limitations of the representative methods. We also compare the performance of the notable SSL methods in the literature on multiple downstream tasks on benchmark datasets, both quantitatively and qualitatively. Finally, we propose a number of future research directions based on the identified limitations of existing SSL research on point clouds.

    Long Range Automated Persistent Surveillance

    This dissertation addresses long range automated persistent surveillance with focus on three topics: sensor planning, size preserving tracking, and high magnification imaging. field of view should be reserved so that camera handoff can be executed successfully before the object of interest becomes unidentifiable or untraceable. We design a sensor planning algorithm that not only maximizes coverage but also ensures uniform and sufficient overlapped camera’s field of view for an optimal handoff success rate. This algorithm works for environments with multiple dynamic targets using different types of cameras. Significantly improved handoff success rates are illustrated via experiments using floor plans of various scales. Size preserving tracking automatically adjusts the camera’s zoom for a consistent view of the object of interest. Target scale estimation is carried out based on the paraperspective projection model, which compensates for the center offset and considers system latency and tracking errors. A computationally efficient foreground segmentation strategy, 3D affine shapes, is proposed. The 3D affine shapes feature direct and real-time implementation and improved flexibility in accommodating the target’s 3D motion, including off-plane rotations. The effectiveness of the scale estimation and foreground segmentation algorithms is validated via both offline and real-time tracking of pedestrians at various resolution levels. Face image quality assessment and enhancement compensate for the degradation in face recognition rates caused by high system magnifications and long observation distances. A class of adaptive sharpness measures is proposed to evaluate and predict this degradation. A wavelet based enhancement algorithm with automated frame selection is developed and proves effective, yielding a considerably elevated face recognition rate for severely blurred long range face images.