DDL-MVS: Depth Discontinuity Learning for MVS Networks
Traditional MVS methods have good accuracy but struggle with completeness,
while recently developed learning-based multi-view stereo (MVS) techniques have
improved completeness at the cost of accuracy. We propose depth
discontinuity learning for MVS methods, which further improves accuracy while
retaining the completeness of the reconstruction. Our idea is to jointly
estimate the depth and boundary maps where the boundary maps are explicitly
used for further refinement of the depth maps. We validate our idea and
demonstrate that our strategies can be easily integrated into the existing
learning-based MVS pipeline where the reconstruction depends on high-quality
depth map estimation. Extensive experiments on various datasets show that our
method improves reconstruction quality compared to baseline. Experiments also
demonstrate that the presented model and strategies have good generalization
capabilities. The source code will be available soon.
On the Estimation of Image-matching Uncertainty in Visual Place Recognition
In Visual Place Recognition (VPR) the pose of a query image is estimated by
comparing the image to a map of reference images with known reference poses. As
is typical for image retrieval problems, a feature extractor maps the query and
reference images to a feature space, where a nearest neighbor search is then
performed. However, until recently little attention has been given to
quantifying the confidence that a retrieved reference image is a correct match.
Highly certain but incorrect retrieval can lead to catastrophic failure of
VPR-based localization pipelines. This work compares for the first time the
main approaches for estimating the image-matching uncertainty, including the
traditional retrieval-based uncertainty estimation, more recent data-driven
aleatoric uncertainty estimation, and the compute-intensive geometric
verification. We further formulate a simple baseline method, ``SUE'', which
unlike the other methods considers the freely-available poses of the reference
images in the map. Our experiments reveal that a simple L2-distance between the
query and reference descriptors is already a better estimate of image-matching
uncertainty than current data-driven approaches. SUE outperforms the other
efficient uncertainty estimation methods, and its uncertainty estimates
complement the computationally expensive geometric verification approach.
Future work on uncertainty estimation in VPR should consider the baselines
discussed in this work.
Comment: To appear in the proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR) 202
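The abstract above reports that a plain L2 distance between the query descriptor and the retrieved reference descriptor already works as an uncertainty estimate. A minimal sketch of that idea follows, assuming descriptors are fixed-length NumPy vectors; the function name `l2_match_uncertainty` and the toy descriptors are hypothetical, and this is not the paper's SUE method, which additionally uses the reference poses.

```python
import numpy as np

def l2_match_uncertainty(query_desc, ref_descs):
    """Return (index of nearest reference, L2 distance to it).

    The distance itself serves as the uncertainty score:
    a larger distance means a less certain match.
    """
    query_desc = np.asarray(query_desc, dtype=float)
    ref_descs = np.asarray(ref_descs, dtype=float)
    # Distance from the query to every reference descriptor.
    dists = np.linalg.norm(ref_descs - query_desc, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])

# Toy 2-D descriptors (illustrative values only).
refs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
idx_near, unc_near = l2_match_uncertainty([0.9, 0.1], refs)
idx_far, unc_far = l2_match_uncertainty([0.5, 0.5], refs)
```

In this toy case the second query sits between two references, so its nearest-neighbor distance, and hence its uncertainty score, is larger than that of the first query.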
UniBEV: Multi-modal 3D Object Detection with Uniform BEV Encoders for Robustness against Missing Sensor Modalities
Multi-sensor object detection is an active research topic in automated
driving, but the robustness of such detection models against missing sensor
input (modality missing), e.g., due to a sudden sensor failure, is a critical
problem which remains under-studied. In this work, we propose UniBEV, an
end-to-end multi-modal 3D object detection framework designed for robustness
against missing modalities: UniBEV can operate on LiDAR plus camera input, but
also on LiDAR-only or camera-only input without retraining. To help its
detector head handle different input combinations, UniBEV aims to create
well-aligned Bird's Eye View (BEV) feature maps from each available modality.
Unlike prior BEV-based multi-modal detection methods, all sensor modalities
follow a uniform approach to resample features from the native sensor
coordinate systems to the BEV features. We furthermore investigate the
robustness of various fusion strategies w.r.t. missing modalities: the commonly
used feature concatenation, but also channel-wise averaging, and a
generalization to weighted averaging termed Channel Normalized Weights. To
validate its effectiveness, we compare UniBEV to state-of-the-art BEVFusion and
MetaBEV on nuScenes over all sensor input combinations. In this setting, UniBEV
achieves mAP on average over all input combinations, significantly
improving over the baselines ( mAP on average for BEVFusion,
mAP on average for MetaBEV). An ablation study shows the robustness benefits of
fusing by weighted averaging over regular concatenation, and of sharing queries
between the BEV encoders of each modality. Our code will be released upon paper
acceptance.
Comment: 6 pages, 5 figures
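One reason averaging-based fusion can tolerate a missing modality is that its output shape does not depend on how many inputs are present, whereas concatenation changes the channel count. The sketch below illustrates plain channel-wise averaging only, assuming each modality yields a BEV feature map of shape (C, H, W); the function name `fuse_by_averaging` and the dictionary layout are hypothetical, and this is not the paper's Channel Normalized Weights scheme.

```python
import numpy as np

def fuse_by_averaging(bev_features):
    """Average BEV feature maps over the modalities that are present.

    bev_features maps a modality name to an array of shape (C, H, W),
    or to None when that sensor is missing.
    """
    available = [f for f in bev_features.values() if f is not None]
    if not available:
        raise ValueError("at least one modality is required")
    # Stack to (M, C, H, W) and average over the modality axis.
    return np.mean(np.stack(available, axis=0), axis=0)

# Toy feature maps (illustrative values only).
lidar = np.full((4, 2, 2), 2.0)
camera = np.full((4, 2, 2), 4.0)
fused_both = fuse_by_averaging({"lidar": lidar, "camera": camera})
fused_lidar_only = fuse_by_averaging({"lidar": lidar, "camera": None})
```

Both results have shape (4, 2, 2), so a downstream detector head would see the same input layout whether or not the camera is available.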
PointeNet: A Lightweight Framework for Effective and Efficient Point Cloud Analysis
Current methodologies in point cloud analysis predominantly explore 3D
geometries, often achieved through the introduction of intricate learnable
geometric extractors in the encoder or by deepening networks with repeated
blocks. However, these approaches inevitably lead to a significant number of
learnable parameters, resulting in substantial computational costs and imposing
memory burdens on CPU/GPU. Additionally, the existing strategies are primarily
tailored for object-level point cloud classification and segmentation tasks,
with limited extensions to crucial scene-level applications, such as autonomous
driving. In response to these limitations, we introduce PointeNet, an efficient
network designed specifically for point cloud analysis. PointeNet distinguishes
itself with its lightweight architecture, low training cost, and plug-and-play
capability, effectively capturing representative features. The network consists
of a Multivariate Geometric Encoding (MGE) module and an optional
Distance-aware Semantic Enhancement (DSE) module. The MGE module employs
operations of sampling, grouping, and multivariate geometric aggregation to
efficiently capture and adaptively aggregate multivariate geometric features,
providing a comprehensive depiction of 3D geometries. The DSE module, designed
for real-world autonomous driving scenarios, enhances the semantic perception
of point clouds, particularly for distant points. Our method demonstrates
flexibility by seamlessly integrating with a classification/segmentation head
or embedding into off-the-shelf 3D object detection networks, achieving notable
performance improvements at a minimal cost. Extensive experiments on
object-level datasets, including ModelNet40, ScanObjectNN, ShapeNetPart, and
the scene-level dataset KITTI, demonstrate the superior performance of
PointeNet over state-of-the-art methods in point cloud analysis.
Cross-BERT for Point Cloud Pretraining
Introducing BERT into cross-modal settings raises difficulties in its
optimization for handling multiple modalities. Both the BERT architecture and
training objective need to be adapted to incorporate and model information from
different modalities. In this paper, we address these challenges by exploring
the implicit semantic and geometric correlations between 2D and 3D data of the
same objects/scenes. We propose a new cross-modal BERT-style self-supervised
learning paradigm, called Cross-BERT. To facilitate pretraining for irregular
and sparse point clouds, we design two self-supervised tasks to boost
cross-modal interaction. The first task, referred to as Point-Image Alignment,
aims to align features between unimodal and cross-modal representations to
capture the correspondences between the 2D and 3D modalities. The second task,
termed Masked Cross-modal Modeling, further improves mask modeling of BERT by
incorporating high-dimensional semantic information obtained by cross-modal
interaction. By performing cross-modal interaction, Cross-BERT can smoothly
reconstruct the masked tokens during pretraining, leading to notable
performance enhancements for downstream tasks. Through empirical evaluation, we
demonstrate that Cross-BERT outperforms existing state-of-the-art methods in 3D
downstream applications. Our work highlights the effectiveness of leveraging
cross-modal 2D knowledge to strengthen 3D point cloud representation and the
transferable capability of BERT across modalities.
A Structure Design Method for Reduction of MRI Acoustic Noise
The acoustic problem of the split gradient coil is one challenge in a Magnetic Resonance Imaging and Linear Accelerator (MRI-LINAC) system. In this paper, we aimed to develop a scheme to reduce the acoustic noise of the split gradient coil. First, a split gradient assembly with an asymmetric configuration was designed to avoid vibration in the same resonant modes for the two assembly cylinders. Next, the outer ends of the split main magnet were constructed using horn structures, which can distribute the acoustic field away from the patient region. Finally, a finite element method (FEM) was used to quantitatively evaluate the effectiveness of the above acoustic noise reduction scheme. Simulation results showed that the noise could be reduced by up to 6.9 dB and 5.6 dB inside and outside the central gap of the split MRI system, respectively, by increasing the length of one gradient assembly cylinder by 20 cm. The optimized horn length was observed to be 55 cm, which could reduce noise by up to 7.4 dB and 5.4 dB inside and outside the central gap, respectively. The proposed design could effectively reduce the acoustic noise without any influence on the application of other noise reduction methods.
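To put the reported reductions in perspective, a decibel figure can be converted to a ratio of sound pressure amplitudes. The sketch below assumes the values are sound pressure levels under the standard definition, in which a reduction of X dB corresponds to a pressure ratio of 10^(-X/20); the abstract does not state this, and the function name `pressure_ratio` is hypothetical.

```python
def pressure_ratio(reduction_db):
    """Ratio of sound pressure after/before a reduction given in dB.

    Assumes sound pressure level: L = 20 * log10(p / p_ref).
    """
    return 10.0 ** (-reduction_db / 20.0)

# Reductions quoted in the abstract (dB), inside the central gap.
ratio_longer_cylinder = pressure_ratio(6.9)
ratio_horn = pressure_ratio(7.4)
```

Under this assumption, both reductions bring the sound pressure inside the central gap to a little under half of its original amplitude.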
Microwave‐Assisted Pyrolysis of Biomass for Bio‐Oil Production
Microwave‐assisted pyrolysis (MAP) is a new thermochemical process that converts biomass to bio‐oil. Compared with conventional electrical heating pyrolysis, MAP is more rapid, efficient, selective, controllable, and flexible. This chapter provides up‐to‐date knowledge of bio‐oil production from microwave‐assisted pyrolysis of biomass. The chemical, physical, and energy properties of bio‐oils obtained from microwave‐assisted pyrolysis of biomass are described in comparison with those from conventional pyrolysis, and the characteristics of microwave‐assisted pyrolysis as affected by biomass feedstock properties, microwave heating operations, and the use of exogenous microwave absorbents and catalysts are discussed. With the advantages it offers and the further research and development recommended, microwave‐assisted pyrolysis has a bright future in the production of bio‐oils that can effectively narrow the energy gap and reduce the negative environmental impacts of our energy production and application practice.