3,433 research outputs found

    ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation

    Domain shifts, such as changes in sensor type and geographic location, are prevalent in Autonomous Driving (AD). They pose a challenge because an AD model that relies on previous-domain knowledge can hardly be deployed directly to a new domain without additional cost. In this paper, we provide a new perspective on and approach to alleviating domain shifts by proposing a Reconstruction-Simulation-Perception (ReSimAD) scheme. Specifically, the implicit reconstruction process is based on knowledge from the old domain and aims to convert domain-related knowledge into domain-invariant representations, e.g., 3D scene-level meshes. The point cloud simulation process for multiple new domains is then conditioned on these reconstructed 3D meshes, yielding target-domain-like simulation samples and thus reducing the cost of collecting and annotating new-domain data for the subsequent perception process. For experiments, we consider different cross-domain settings, such as Waymo-to-KITTI, Waymo-to-nuScenes, Waymo-to-ONCE, etc., to verify zero-shot target-domain perception using ReSimAD. Results demonstrate that our method helps boost domain generalization ability and is even promising for 3D pre-training. Comment: Code and simulated points are available at https://github.com/PJLab-ADG/3DTrans#resima

    The Cityscapes Dataset for Semantic Urban Scene Understanding

    Visual understanding of complex urban street scenes is an enabling factor for a wide range of applications. Object detection has benefited enormously from large-scale datasets, especially in the context of deep learning. For semantic urban scene understanding, however, no current dataset adequately captures the complexity of real-world urban scenes. To address this, we introduce Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling. Cityscapes comprises a large, diverse set of stereo video sequences recorded in streets from 50 different cities. 5000 of these images have high-quality pixel-level annotations; 20000 additional images have coarse annotations to enable methods that leverage large volumes of weakly-labeled data. Crucially, our effort exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity. Our accompanying empirical study provides an in-depth analysis of the dataset characteristics, as well as a performance evaluation of several state-of-the-art approaches based on our benchmark. Comment: Includes supplemental material

    Lidar-based Gait Analysis and Activity Recognition in a 4D Surveillance System

    This paper presents new approaches for gait and activity analysis based on data streams of a Rotating Multi Beam (RMB) Lidar sensor. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and interactively display real scenarios in natural outdoor environments with walking pedestrians. The main focus of the investigations is gait-based person re-identification during tracking, and recognition of specific activity patterns such as bending, waving, making phone calls and checking the time on a wristwatch. The descriptors for training and recognition are observed and extracted from realistic outdoor surveillance scenarios, where multiple pedestrians walk in the field of interest along possibly intersecting trajectories, so the observations are often affected by occlusions or background noise. Since there is no public database available for such scenarios, we created and published on our website a new Lidar-based outdoor gait and activity dataset that contains point cloud sequences of 28 different persons extracted and aggregated from 35 minutes-long measurements. The presented results confirm that both efficient gait-based identification and activity recognition are achievable in the sparse point clouds of a single RMB Lidar sensor. After extracting the people's trajectories, we synthesized a free-viewpoint video, where moving avatar models follow the trajectories of the observed pedestrians in real time, ensuring that the leg movements of the animated avatars are synchronized with the real gait cycles observed in the Lidar stream.

    An Approach Of Automatic Reconstruction Of Building Models For Virtual Cities From Open Resources

    Along with the ever-increasing popularity of virtual reality technology in recent years, 3D city models have been used in different applications, such as urban planning, disaster management, tourism, entertainment, and video games. Currently, those models are mainly reconstructed from access-restricted data sources such as LiDAR point clouds, airborne images, satellite images, and UAV (uncrewed air vehicle) images, with a focus on structural illustration of buildings’ contours and layouts. To help make 3D models closer to their real-life counterparts, this thesis research proposes a new approach for the automatic reconstruction of building models from open resources. In this approach, building shapes are first reconstructed using the structural and geographic information retrievable from the open repository of OpenStreetMap (OSM). Then, images available from the street view of Google Maps are used to extract information about the exterior appearance of buildings for texture mapping onto their boundaries. The constructed 3D environment is used as prior knowledge for navigation purposes in a self-driving car. The static objects from the 3D model are compared with the real-time images of static objects to reduce the computation time by eliminating them from the detection process.