3,433 research outputs found
ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation
Domain shifts such as sensor type changes and geographical situation
variations are prevalent in Autonomous Driving (AD), which poses a challenge
since AD model relying on the previous-domain knowledge can be hardly directly
deployed to a new domain without additional costs. In this paper, we provide a
new perspective and approach of alleviating the domain shifts, by proposing a
Reconstruction-Simulation-Perception (ReSimAD) scheme. Specifically, the
implicit reconstruction process is based on the knowledge from the previous old
domain, aiming to convert the domain-related knowledge into domain-invariant
representations, e.g., 3D scene-level meshes. Besides, the point clouds
simulation process of multiple new domains is conditioned on the above
reconstructed 3D meshes, where the target-domain-like simulation samples can be
obtained, thus reducing the cost of collecting and annotating new-domain data
for the subsequent perception process. For experiments, we consider different
cross-domain situations such as Waymo-to-KITTI, Waymo-to-nuScenes,
Waymo-to-ONCE, etc, to verify the zero-shot target-domain perception using
ReSimAD. Results demonstrate that our method is beneficial to boost the domain
generalization ability, even promising for 3D pre-training.Comment: Code and simulated points are available at
https://github.com/PJLab-ADG/3DTrans#resima
The Cityscapes Dataset for Semantic Urban Scene Understanding
Visual understanding of complex urban street scenes is an enabling factor for
a wide range of applications. Object detection has benefited enormously from
large-scale datasets, especially in the context of deep learning. For semantic
urban scene understanding, however, no current dataset adequately captures the
complexity of real-world urban scenes.
To address this, we introduce Cityscapes, a benchmark suite and large-scale
dataset to train and test approaches for pixel-level and instance-level
semantic labeling. Cityscapes is comprised of a large, diverse set of stereo
video sequences recorded in streets from 50 different cities. 5000 of these
images have high quality pixel-level annotations; 20000 additional images have
coarse annotations to enable methods that leverage large volumes of
weakly-labeled data. Crucially, our effort exceeds previous attempts in terms
of dataset size, annotation richness, scene variability, and complexity. Our
accompanying empirical study provides an in-depth analysis of the dataset
characteristics, as well as a performance evaluation of several
state-of-the-art approaches based on our benchmark.Comment: Includes supplemental materia
Lidar-based Gait Analysis and Activity Recognition in a 4D Surveillance System
This paper presents new approaches for gait and activity analysis based on data streams of a Rotating Multi Beam (RMB) Lidar sensor. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and interactively display real scenarios in natural outdoor environments with walking pedestrians. The main focus of the investigations are gait based person re-identification during tracking, and recognition of specific activity patterns such as bending, waving, making phone calls and checking the time looking at wristwatches.
The descriptors for training and recognition are observed and extracted from realistic outdoor surveillance scenarios, where multiple pedestrians are walking in the field of interest following possibly intersecting trajectories, thus the observations might often be affected by occlusions or background noise.
Since there is no public database available for such scenarios, we created and published a new Lidar-based outdoors gait and activity dataset on our website, that contains point cloud sequences of 28 different persons extracted and aggregated from 35 minutes-long measurements.
The presented results confirm that both efficient gait-based identification and activity recognition is achievable in the sparse point clouds of a single RMB Lidar sensor. After extracting the people trajectories, we synthesized a free-viewpoint video, where moving avatar models follow the trajectories of the observed pedestrians in real time, ensuring that the leg movements of the animated avatars are synchronized with the real gait cycles observed in the Lidar stream
An Approach Of Automatic Reconstruction Of Building Models For Virtual Cities From Open Resources
Along with the ever-increasing popularity of virtual reality technology in recent years, 3D city models have been used in different applications, such as urban planning, disaster management, tourism, entertainment, and video games. Currently, those models are mainly reconstructed from access-restricted data sources such as LiDAR point clouds, airborne images, satellite images, and UAV (uncrewed air vehicle) images with a focus on structural illustration of buildings’ contours and layouts. To help make 3D models closer to their real-life counterparts, this thesis research proposes a new approach for the automatic reconstruction of building models from open resources. In this approach, first, building shapes are reconstructed by using the structural and geographic information retrievable from the open repository of OpenStreetMap (OSM). Later, images available from the street view of Google maps are used to extract information of the exterior appearance of buildings for texture mapping onto their boundaries. The constructed 3D environment is used as prior knowledge for the navigation purposes in a self-driving car. The static objects from the 3D model are compared with the real-time images of static objects to reduce the computation time by eliminating them from the detection proces
- …