270 research outputs found

    Towards Vehicle-to-everything Autonomous Driving: A Survey on Collaborative Perception

    Full text link
    Vehicle-to-everything (V2X) autonomous driving opens up a promising direction for developing a new generation of intelligent transportation systems. Collaborative perception (CP) as an essential component to achieve V2X can overcome the inherent limitations of individual perception, including occlusion and long-range perception. In this survey, we provide a comprehensive review of CP methods for V2X scenarios, bringing a profound and in-depth understanding to the community. Specifically, we first introduce the architecture and workflow of typical V2X systems, which affords a broader perspective to understand the entire V2X system and the role of CP within it. Then, we thoroughly summarize and analyze existing V2X perception datasets and CP methods. Particularly, we introduce numerous CP methods from various crucial perspectives, including collaboration stages, roadside sensors placement, latency compensation, performance-bandwidth trade-off, attack/defense, pose alignment, etc. Moreover, we conduct extensive experimental analyses to compare and examine current CP methods, revealing some essential and unexplored insights. Specifically, we analyze the performance changes of different methods under different bandwidths, providing a deep insight into the performance-bandwidth trade-off issue. Also, we examine methods under different LiDAR ranges. To study the model robustness, we further investigate the effects of various simulated real-world noises on the performance of different CP methods, covering communication latency, lossy communication, localization errors, and mixed noises. In addition, we look into the sim-to-real generalization ability of existing CP methods. At last, we thoroughly discuss issues and challenges, highlighting promising directions for future efforts. Our codes for experimental analysis will be public at https://github.com/memberRE/Collaborative-Perception.Comment: 19 page

    Collaborative Perception in Autonomous Driving: Methods, Datasets and Challenges

    Full text link
    Collaborative perception is essential to address occlusion and sensor failure issues in autonomous driving. In recent years, theoretical and experimental investigations of novel works for collaborative perception have increased tremendously. So far, however, few reviews have focused on systematical collaboration modules and large-scale collaborative perception datasets. This work reviews recent achievements in this field to bridge this gap and motivate future research. We start with a brief overview of collaboration schemes. After that, we systematically summarize the collaborative perception methods for ideal scenarios and real-world issues. The former focuses on collaboration modules and efficiency, and the latter is devoted to addressing the problems in actual application. Furthermore, we present large-scale public datasets and summarize quantitative results on these benchmarks. Finally, we highlight gaps and overlook challenges between current academic research and real-world applications. The project page is https://github.com/CatOneTwo/Collaborative-Perception-in-Autonomous-DrivingComment: 18 pages, 6 figures. Accepted by IEEE Intelligent Transportation Systems Magazine. URL: https://github.com/CatOneTwo/Collaborative-Perception-in-Autonomous-Drivin

    Bridging the Domain Gap for Multi-Agent Perception

    Full text link
    Existing multi-agent perception algorithms usually select to share deep neural features extracted from raw sensing data between agents, achieving a trade-off between accuracy and communication bandwidth limit. However, these methods assume all agents have identical neural networks, which might not be practical in the real world. The transmitted features can have a large domain gap when the models differ, leading to a dramatic performance drop in multi-agent perception. In this paper, we propose the first lightweight framework to bridge such domain gaps for multi-agent perception, which can be a plug-in module for most existing systems while maintaining confidentiality. Our framework consists of a learnable feature resizer to align features in multiple dimensions and a sparse cross-domain transformer for domain adaption. Extensive experiments on the public multi-agent perception dataset V2XSet have demonstrated that our method can effectively bridge the gap for features from different domains and outperform other baseline methods significantly by at least 8% for point-cloud-based 3D object detection.Comment: Accepted by ICRA2023.Code: https://github.com/DerrickXuNu/MPD

    Collaboration Helps Camera Overtake LiDAR in 3D Detection

    Full text link
    Camera-only 3D detection provides an economical solution with a simple configuration for localizing objects in 3D space compared to LiDAR-based detection systems. However, a major challenge lies in precise depth estimation due to the lack of direct 3D measurements in the input. Many previous methods attempt to improve depth estimation through network designs, e.g., deformable layers and larger receptive fields. This work proposes an orthogonal direction, improving the camera-only 3D detection by introducing multi-agent collaborations. Our proposed collaborative camera-only 3D detection (CoCa3D) enables agents to share complementary information with each other through communication. Meanwhile, we optimize communication efficiency by selecting the most informative cues. The shared messages from multiple viewpoints disambiguate the single-agent estimated depth and complement the occluded and long-range regions in the single-agent view. We evaluate CoCa3D in one real-world dataset and two new simulation datasets. Results show that CoCa3D improves previous SOTA performances by 44.21% on DAIR-V2X, 30.60% on OPV2V+, 12.59% on CoPerception-UAVs+ for AP@70. Our preliminary results show a potential that with sufficient collaboration, the camera might overtake LiDAR in some practical scenarios. We released the dataset and code at https://siheng-chen.github.io/dataset/CoPerception+ and https://github.com/MediaBrain-SJTU/CoCa3D.Comment: Accepted by CVPR2

    CoBEVFusion: Cooperative Perception with LiDAR-Camera Bird's-Eye View Fusion

    Full text link
    Autonomous Vehicles (AVs) use multiple sensors to gather information about their surroundings. By sharing sensor data between Connected Autonomous Vehicles (CAVs), the safety and reliability of these vehicles can be improved through a concept known as cooperative perception. However, recent approaches in cooperative perception only share single sensor information such as cameras or LiDAR. In this research, we explore the fusion of multiple sensor data sources and present a framework, called CoBEVFusion, that fuses LiDAR and camera data to create a Bird's-Eye View (BEV) representation. The CAVs process the multi-modal data locally and utilize a Dual Window-based Cross-Attention (DWCA) module to fuse the LiDAR and camera features into a unified BEV representation. The fused BEV feature maps are shared among the CAVs, and a 3D Convolutional Neural Network is applied to aggregate the features from the CAVs. Our CoBEVFusion framework was evaluated on the cooperative perception dataset OPV2V for two perception tasks: BEV semantic segmentation and 3D object detection. The results show that our DWCA LiDAR-camera fusion model outperforms perception models with single-modal data and state-of-the-art BEV fusion models. Our overall cooperative perception architecture, CoBEVFusion, also achieves comparable performance with other cooperative perception models

    Analyzing Infrastructure LiDAR Placement with Realistic LiDAR Simulation Library

    Full text link
    Recently, Vehicle-to-Everything(V2X) cooperative perception has attracted increasing attention. Infrastructure sensors play a critical role in this research field; however, how to find the optimal placement of infrastructure sensors is rarely studied. In this paper, we investigate the problem of infrastructure sensor placement and propose a pipeline that can efficiently and effectively find optimal installation positions for infrastructure sensors in a realistic simulated environment. To better simulate and evaluate LiDAR placement, we establish a Realistic LiDAR Simulation library that can simulate the unique characteristics of different popular LiDARs and produce high-fidelity LiDAR point clouds in the CARLA simulator. Through simulating point cloud data in different LiDAR placements, we can evaluate the perception accuracy of these placements using multiple detection models. Then, we analyze the correlation between the point cloud distribution and perception accuracy by calculating the density and uniformity of regions of interest. Experiments show that when using the same number and type of LiDAR, the placement scheme optimized by our proposed method improves the average precision by 15%, compared with the conventional placement scheme in the standard lane scene. We also analyze the correlation between perception performance in the region of interest and LiDAR point cloud distribution and validate that density and uniformity can be indicators of performance. Both the RLS Library and related code will be released at https://github.com/PJLab-ADG/LiDARSimLib-and-Placement-Evaluation.Comment: 7 pages, 6 figures, accepted to the IEEE International Conference on Robotics and Automation (ICRA'23

    LiDAR aided simulation pipeline for wireless communication in vehicular traffic scenarios

    Get PDF
    Abstract. Integrated Sensing and Communication (ISAC) is a modern technology under development for Sixth Generation (6G) systems. This thesis focuses on creating a simulation pipeline for dynamic vehicular traffic scenarios and a novel approach to reducing wireless communication overhead with a Light Detection and Ranging (LiDAR) based system. The simulation pipeline can be used to generate data sets for numerous problems. Additionally, the developed error model for vehicle detection algorithms can be used to identify LiDAR performance with respect to different parameters like LiDAR height, range, and laser point density. LiDAR behavior on traffic environment is provided as part of the results in this study. A periodic beam index map is developed by capturing antenna azimuth and elevation angles, which denote maximum Reference Signal Receive Power (RSRP) for a simulated receiver grid on the road and classifying areas using Support Vector Machine (SVM) algorithm to reduce the number of Synchronization Signal Blocks (SSBs) that are needed to be sent in Vehicle to Infrastructure (V2I) communication. This approach effectively reduces the wireless communication overhead in V2I communication

    Point Cloud Processing Algorithms for Environment Understanding in Intelligent Vehicle Applications

    Get PDF
    Understanding the surrounding environment including both still and moving objects is crucial to the design and optimization of intelligent vehicles. In particular, acquiring the knowledge about the vehicle environment could facilitate reliable detection of moving objects for the purpose of avoiding collisions. In this thesis, we focus on developing point cloud processing algorithms to support intelligent vehicle applications. The contributions of this thesis are three-fold.;First, inspired by the analogy between point cloud and video data, we propose to formulate a problem of reconstructing the vehicle environment (e.g., terrains and buildings) from a sequence of point cloud sets. Built upon existing point cloud registration tool such as iterated closest point (ICP), we have developed an expectation-maximization (EM)-like technique that can automatically mosaic multiple point cloud sets into a larger one characterizing the still environment surrounding the vehicle.;Second, we propose to utilize the color information (from color images captured by the RGB camera) as a supplementary source to the three-dimensional point cloud data. Such joint color and depth representation has the potential of better characterizing the surrounding environment of a vehicle. Based on the novel joint RGBD representation, we propose training a convolution neural network on color images and depth maps generated from the point cloud data.;Finally, we explore a sensor fusion method that combines the results given by a Lidar based detection algorithm and vehicle to everything (V2X) communicated data. Since Lidar and V2X respectively characterize the environmental information from complementary sources, we propose to get a better localization of the surrounding vehicles by a linear sensor fusion method. The effectiveness of the proposed sensor fusion method is verified by comparing detection error profiles

    DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving

    Full text link
    Safety is the primary priority of autonomous driving. Nevertheless, no published dataset currently supports the direct and explainable safety evaluation for autonomous driving. In this work, we propose DeepAccident, a large-scale dataset generated via a realistic simulator containing diverse accident scenarios that frequently occur in real-world driving. The proposed DeepAccident dataset contains 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset with 40k annotated samples. In addition, we propose a new task, end-to-end motion and accident prediction, based on the proposed dataset, which can be used to directly evaluate the accident prediction ability for different autonomous driving algorithms. Furthermore, for each scenario, we set four vehicles along with one infrastructure to record data, thus providing diverse viewpoints for accident scenarios and enabling V2X (vehicle-to-everything) research on perception and prediction tasks. Finally, we present a baseline V2X model named V2XFormer that demonstrates superior performance for motion and accident prediction and 3D object detection compared to the single-vehicle model
    corecore