Semantic Visual Localization
Robust visual localization under a wide range of viewing conditions is a
fundamental problem in computer vision. Handling the difficult cases of this
problem is not only very challenging but also of high practical relevance,
e.g., in the context of life-long localization for augmented reality or
autonomous robots. In this paper, we propose a novel approach based on a joint
3D geometric and semantic understanding of the world, enabling it to succeed
under conditions where previous approaches failed. Our method leverages a novel
generative model for descriptor learning, trained on semantic scene completion
as an auxiliary task. The resulting 3D descriptors are robust to missing
observations by encoding high-level 3D geometric and semantic information.
Experiments on several challenging large-scale localization datasets
demonstrate reliable localization under extreme viewpoint, illumination, and
geometry changes.
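As a rough illustration of the core idea, here is a minimal, hypothetical sketch of descriptor learning with semantic scene completion as an auxiliary task (a PyTorch-style 3D encoder-decoder; the architecture, names, and shapes are assumptions, not the paper's implementation):

```python
# Hypothetical sketch: a 3D descriptor supervised by a semantic scene
# completion decoder (architecture and shapes are illustrative only).
import torch
import torch.nn as nn

class CompletionDescriptorNet(nn.Module):
    def __init__(self, num_classes=12, descriptor_dim=128):
        super().__init__()
        # Encoder: partial semantic voxel grid -> compact latent code.
        self.encoder = nn.Sequential(
            nn.Conv3d(num_classes, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(64, descriptor_dim, 4, stride=2, padding=1),
        )
        # Decoder: latent code -> completed semantic voxel grid (auxiliary task).
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(descriptor_dim, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose3d(32, num_classes, 4, stride=2, padding=1),
        )

    def forward(self, partial_volume):
        code = self.encoder(partial_volume)
        # Pool over space to obtain a fixed-size descriptor for matching.
        descriptor = code.mean(dim=(2, 3, 4))
        return descriptor, self.decoder(code)

model = CompletionDescriptorNet()
partial = torch.randn(2, 12, 32, 32, 32)        # partially observed semantics
target = torch.randint(0, 12, (2, 32, 32, 32))  # complete semantic ground truth
_, logits = model(partial)
# The completion loss is what pushes the descriptor to encode high-level
# geometry and semantics, making it robust to missing observations.
nn.functional.cross_entropy(logits, target).backward()
```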
ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction
We propose an online 3D semantic segmentation method that incrementally
reconstructs a 3D semantic map from a stream of RGB-D frames. Unlike offline
methods, ours is directly applicable to scenarios with real-time constraints,
such as robotics or mixed reality. To overcome the inherent challenges of
online methods, we make two main contributions. First, to effectively extract
information from the input RGB-D video stream, we jointly estimate geometry and
semantic labels per frame in 3D. A key focus of our approach is to reason about
semantic entities both in the 2D input and the local 3D domain to leverage
differences in spatial context and network architectures. Our method predicts
2D features using an off-the-shelf segmentation network. The extracted 2D
features are refined by a lightweight 3D network to enable reasoning about the
local 3D structure. Second, to efficiently deal with an infinite stream of
input RGB-D frames, a subsequent network serves as a temporal expert predicting
the incremental scene updates by leveraging 2D, 3D, and past information in a
learned manner. These updates are then integrated into a global scene
representation. Together, these contributions enable scenarios with
real-time constraints and allow our method to scale to arbitrary scene sizes
by processing and updating the scene only in a local region defined by the
new measurement. Our experiments demonstrate improved results compared to
existing online methods that operate purely in local regions and show that
complementary sources of information boost performance. We provide a
thorough ablation study on the benefits of different architectural and
algorithmic design decisions. Our method yields competitive results on the
popular ScanNet benchmark and the SceneNN dataset.
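To make the incremental update concrete, the following is a minimal sketch of a gated fusion of 2D, 3D, and past per-voxel features (module names, shapes, and the gating scheme are assumptions; the paper's actual temporal expert may differ):

```python
# Hypothetical sketch: a temporal expert that blends current 2D/3D
# features with the stored per-voxel state in a learned manner.
import torch
import torch.nn as nn

class TemporalExpert(nn.Module):
    def __init__(self, feat_dim=64, num_classes=20):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(3 * feat_dim, feat_dim), nn.Sigmoid())
        self.update = nn.Linear(3 * feat_dim, feat_dim)
        self.classify = nn.Linear(feat_dim, num_classes)

    def forward(self, feat2d, feat3d, past_state):
        # Each input is (num_voxels, feat_dim): 2D features lifted into
        # the local 3D region, refined 3D features, and the prior state.
        joint = torch.cat([feat2d, feat3d, past_state], dim=-1)
        g = self.gate(joint)  # learned per-voxel blending weight
        new_state = g * self.update(joint) + (1 - g) * past_state
        return new_state, self.classify(new_state)

# Per frame, only voxels inside the local region touched by the new
# measurement are read, updated, and written back to the global scene
# representation, so cost stays bounded regardless of scene size.
expert = TemporalExpert()
n = 1024  # voxels in the current local region
f2d, f3d, past = (torch.randn(n, 64) for _ in range(3))
state, logits = expert(f2d, f3d, past)
```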
RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization
We study an important yet largely unexplored problem of large-scale
cross-modal visual localization by matching ground RGB images to a
geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). Prior
work was demonstrated only on small datasets and does not lend itself to
large-scale applications. To enable large-scale evaluation, we introduce a
new dataset containing over 550K pairs of RGB and aerial LIDAR depth images,
covering a 143 km^2 area. We propose a novel joint embedding based
method that effectively combines the appearance and semantic cues from both
modalities to handle drastic cross-modal variations. Experiments on the
proposed dataset show that our model achieves a strong result of a median rank
of 5 in matching across a large test set of 50K location pairs collected from a
14 km^2 area. This represents a significant advance over prior work in
performance and scale. We conclude with qualitative results to highlight the
challenging nature of this task and the benefits of the proposed model. Our
work provides a foundation for further research in cross-modal visual
localization.
Comment: ACM Multimedia 202
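As a rough illustration of a joint embedding for this kind of cross-modal retrieval, here is a minimal sketch with one branch per modality and a triplet loss (the backbones, the loss, and the treatment of depth renders as three-channel images are assumptions; the paper's model additionally fuses semantic cues):

```python
# Hypothetical sketch: two-branch embedding for RGB-to-LIDAR-depth matching.
import torch
import torch.nn as nn
import torchvision.models as models

class Branch(nn.Module):
    """Maps one modality into a shared, L2-normalised embedding space."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.backbone = models.resnet18(weights=None)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, embed_dim)

    def forward(self, x):
        return nn.functional.normalize(self.backbone(x), dim=-1)

rgb_branch, depth_branch = Branch(), Branch()
triplet = nn.TripletMarginLoss(margin=0.3)

rgb = torch.randn(8, 3, 224, 224)        # ground RGB images
depth_pos = torch.randn(8, 3, 224, 224)  # matching aerial LIDAR depth renders
depth_neg = torch.randn(8, 3, 224, 224)  # non-matching renders
loss = triplet(rgb_branch(rgb), depth_branch(depth_pos), depth_branch(depth_neg))
loss.backward()
```

At test time, localization reduces to nearest-neighbour search between the query RGB embedding and the embeddings of all geo-referenced depth renders, which is what the reported median rank measures.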
Clinical and virological characteristics of hospitalised COVID-19 patients in a German tertiary care centre during the first wave of the SARS-CoV-2 pandemic: a prospective observational study
Purpose: Adequate patient allocation is pivotal for optimal resource management in strained healthcare systems and requires detailed knowledge of clinical and virological disease trajectories. The purpose of this work was to identify risk factors associated with the need for invasive mechanical ventilation (IMV), to analyse viral kinetics in patients with and without IMV, and to provide a comprehensive description of the clinical course.
Methods: A cohort of 168 hospitalised adult COVID-19 patients enrolled in a prospective observational study at a large European tertiary care centre was analysed.
Results: Forty-four per cent (71/161) of patients required invasive mechanical ventilation (IMV). A shorter duration of symptoms before admission (aOR 1.22 per day less, 95% CI 1.10-1.37, p < 0.01) and a history of hypertension (aOR 5.55, 95% CI 2.00-16.82, p < 0.01) were associated with the need for IMV. Patients on IMV had higher maximal SARS-CoV-2 concentrations, slower decline rates, and longer viral shedding than non-IMV patients (33 days, IQR 26-46.75, vs 18 days, IQR 16-46.75, p < 0.01). Median duration of hospitalisation was 9 days (IQR 6-15.5) for non-IMV and 49.5 days (IQR 36.8-82.5) for IMV patients.
Conclusions: Our results identify a short duration of symptoms before admission as a risk factor for severe disease that merits further investigation, and show distinct viral load kinetics in severely affected patients. The median duration of hospitalisation of IMV patients was longer than that described for acute respiratory distress syndrome unrelated to COVID-19.
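For readers unfamiliar with the statistics above, here is a minimal sketch of how adjusted odds ratios with 95% CIs are typically obtained from a multivariable logistic regression (synthetic data and statsmodels; this is not the study's actual analysis):

```python
# Illustrative only: adjusted odds ratios (aOR) from logistic regression
# on synthetic data; no resemblance to the study's dataset is implied.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 168
df = pd.DataFrame({
    "symptom_days": rng.integers(1, 15, n),  # symptom duration pre-admission
    "hypertension": rng.integers(0, 2, n),   # history of hypertension (0/1)
})
# Synthetic ground truth: fewer symptom days and hypertension raise IMV risk.
logit = -1.0 - 0.2 * df["symptom_days"] + 1.7 * df["hypertension"]
df["imv"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(df[["symptom_days", "hypertension"]])
fit = sm.Logit(df["imv"], X).fit(disp=0)

# exp(coef) gives the aOR per unit increase; an aOR "per day less" as in
# the text corresponds to exp(-coef) of the symptom-duration coefficient.
print(np.exp(fit.params))
print(np.exp(fit.conf_int()))
```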