CrossFusion: Interleaving Cross-modal Complementation for Noise-resistant 3D Object Detection
The combination of LiDAR and camera modalities has proven necessary and
typical for 3D object detection according to recent studies. However,
existing fusion strategies tend to rely overly on the LiDAR modality,
exploiting the abundant semantics of the camera sensor insufficiently; as a
consequence, when LiDAR features are corrupted, these methods cannot fall
back on the other modality because of the resulting large domain gap.
Motivated by this, we propose CrossFusion, a more robust and noise-resistant
scheme that makes full use of camera and LiDAR features through the designed
cross-modal complementation strategy. Extensive experiments show that our
method not only outperforms state-of-the-art methods without introducing an
extra depth estimation network, but also demonstrates noise resistance
without re-training for specific malfunction scenarios, improving mAP by
5.2% and NDS by 2.4%.
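The core idea of cross-modal complementation can be illustrated with a toy sketch: a per-location confidence weight blends LiDAR and camera features, so regions with corrupted LiDAR returns lean on the camera branch. The function name, shapes, and the confidence signal here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def complement_features(lidar_feat, cam_feat, lidar_conf):
    """Blend per-location LiDAR and camera features.

    lidar_feat, cam_feat: (N, C) spatially aligned feature vectors.
    lidar_conf: (N,) confidence in [0, 1]; low values indicate
    corrupted LiDAR returns, so the camera branch dominates there.
    """
    w = lidar_conf[:, None]                  # (N, 1) broadcastable weight
    return w * lidar_feat + (1.0 - w) * cam_feat

# Toy example: location 0 has clean LiDAR, location 1 is corrupted.
lidar = np.array([[1.0, 1.0], [9.9, 9.9]])   # 9.9 mimics sensor noise
cam = np.array([[0.0, 0.0], [2.0, 2.0]])
conf = np.array([1.0, 0.0])
fused = complement_features(lidar, cam, conf)
print(fused)  # row 0 keeps LiDAR, row 1 falls back to camera
```

A learned complementation module would predict the weights rather than take them as input, but the fallback behavior under LiDAR corruption is the same.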
Transforming Healthcare Quality through Information Technology
Information and information exchange are crucial to the delivery of care at all levels of the health care delivery system—the patient, the care team, the health care organization, and the encompassing political-economic environment. To diagnose and treat individual patients effectively, individual care providers and care teams must have access to at least three major types of clinical information: the patient’s health record, the rapidly changing medical-evidence base, and provider orders guiding the process of patient care. In this frame, Information Technology can help healthcare organizations improve the quality of care they provide, improve patient safety, improve cost-effectiveness, accelerate the translation of research findings into practice, improve care for the medically underserved, increase consumer involvement, improve accuracy and privacy, and increase their ability to monitor health nationally. Consequently, the present article presents some implementations of Information and Communication Technologies in the health care field.
Keywords: Healthcare; Quality; Information and Communication Technologies
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
In this paper, we propose a robust 3D detector, named Cross Modal Transformer
(CMT), for end-to-end 3D multi-modal detection. Without explicit view
transformation, CMT takes the image and point clouds tokens as inputs and
directly outputs accurate 3D bounding boxes. The spatial alignment of
multi-modal tokens is performed by encoding the 3D points into multi-modal
features. The core design of CMT is quite simple while its performance is
impressive. It achieves 74.1% NDS (state-of-the-art with single model) on
nuScenes test set while maintaining faster inference speed. Moreover, CMT has a
strong robustness even if the LiDAR is missing. Code is released at
https://github.com/junjie18/CMT
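The fusion step described above — object queries attending jointly over image and point-cloud tokens, with no explicit view transformation — can be sketched in a few lines of numpy. This is a minimal single-head attention illustration under assumed shapes, not CMT's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_decode(queries, img_tokens, pts_tokens):
    """One cross-attention step: object queries attend over the
    concatenated image and point-cloud tokens simultaneously."""
    tokens = np.concatenate([img_tokens, pts_tokens], axis=0)  # (Ti+Tp, C)
    attn = softmax(queries @ tokens.T / np.sqrt(queries.shape[1]))
    return attn @ tokens  # updated queries, later regressed to 3D boxes

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))       # 4 object queries, dim 8
img = rng.normal(size=(6, 8))     # 6 image tokens
pts = rng.normal(size=(5, 8))     # 5 point-cloud tokens
out = cross_modal_decode(q, img, pts)
print(out.shape)  # (4, 8)
```

Because both modalities live in one token sequence, dropping the point-cloud tokens (e.g., a missing LiDAR) leaves the attention mechanism structurally intact, which is one intuition for the robustness the abstract reports.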
Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving
3D object detection is an essential perception task in autonomous driving to
understand the environments. The Bird's-Eye-View (BEV) representations have
significantly improved the performance of 3D detectors with camera inputs on
popular benchmarks. However, there still lacks a systematic understanding of
the robustness of these vision-dependent BEV models, which is closely related
to the safety of autonomous driving systems. In this paper, we evaluate the
natural and adversarial robustness of various representative models under
extensive settings, to fully understand their behaviors influenced by explicit
BEV features compared with those without BEV. In addition to the classic
settings, we propose a 3D consistent patch attack by applying adversarial
patches in the 3D space to guarantee the spatiotemporal consistency, which is
more realistic for the scenario of autonomous driving. With substantial
experiments, we draw several findings: 1) BEV models tend to be more stable
than previous methods under different natural conditions and common corruptions
due to the expressive spatial representations; 2) BEV models are more
vulnerable to adversarial noises, mainly caused by the redundant BEV features;
3) Camera-LiDAR fusion models have superior performance under different
settings with multi-modal inputs, but BEV fusion model is still vulnerable to
adversarial noises of both point cloud and image. These findings alert the
safety issue in the applications of BEV detectors and could facilitate the
development of more robust models.
Comment: 8 pages, CVPR202
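The "3D consistent" property of the proposed patch attack can be illustrated with a pinhole projection: the patch is anchored at a fixed 3D location, and only the ego-motion changes where it lands in each frame. The intrinsics, identity rotation, and translation below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def project(point_world, cam_t, K):
    """Pinhole projection of a world point for a camera translated by
    cam_t (identity rotation assumed for brevity)."""
    p_cam = point_world - cam_t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
corner = np.array([2.0, 0.0, 10.0])            # patch corner, world coords
uv_t0 = project(corner, np.zeros(3), K)        # ego at origin
uv_t1 = project(corner, np.array([0.0, 0.0, 1.0]), K)  # ego 1 m forward
print(uv_t0, uv_t1)  # same physical point, two pixel locations
```

Rendering the adversarial texture from this fixed 3D anchor in every frame is what guarantees spatiotemporal consistency, in contrast to pasting an independent 2D patch per image.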
EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps
Accurate and reliable ego-localization is critical for autonomous driving. In
this paper, we present EgoVM, an end-to-end localization network that achieves
comparable localization accuracy to prior state-of-the-art methods, but uses
lightweight vectorized maps instead of heavy point-based maps. To begin with,
we extract BEV features from online multi-view images and LiDAR point cloud.
Then, we employ a set of learnable semantic embeddings to encode the semantic
types of map elements and supervise them with semantic segmentation, to make
their feature representation consistent with BEV features. After that, we feed
map queries, composed of learnable semantic embeddings and coordinates of map
elements, into a transformer decoder to perform cross-modality matching with
BEV features. Finally, we adopt a robust histogram-based pose solver to
estimate the optimal pose by searching exhaustively over candidate poses. We
comprehensively validate the effectiveness of our method using both the
nuScenes dataset and a newly collected dataset. The experimental results show
that our method achieves centimeter-level localization accuracy, and
outperforms existing methods using vectorized maps by a large margin.
Furthermore, our model has been extensively tested in a large fleet of
autonomous vehicles under various challenging urban scenes.
Comment: 8 page
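The exhaustive candidate search behind a histogram-based pose solver can be sketched as follows: score every candidate offset by how well the observations align with the map, then take the argmax. The 2D translation-only search, distance threshold, and grid resolution are simplifying assumptions for illustration; EgoVM's actual solver operates on learned features.

```python
import numpy as np

def score(map_pts, obs_pts, dx, dy):
    """Count observed points landing near a map point after (dx, dy)."""
    shifted = obs_pts + np.array([dx, dy])
    d = np.linalg.norm(shifted[:, None, :] - map_pts[None, :, :], axis=-1)
    return (d.min(axis=1) < 0.05).sum()

def solve_pose(map_pts, obs_pts, candidates):
    """Exhaustive search over candidate offsets; the argmax of the
    score histogram gives the best pose."""
    scores = [score(map_pts, obs_pts, dx, dy) for dx, dy in candidates]
    return candidates[int(np.argmax(scores))]

# Map elements and observations offset by a known (0.2, -0.1) translation.
map_pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
obs_pts = map_pts - np.array([0.2, -0.1])
grid = [(dx, dy) for dx in np.arange(-0.3, 0.31, 0.1)
                 for dy in np.arange(-0.3, 0.31, 0.1)]
best = solve_pose(map_pts, obs_pts, grid)
print(best)  # close to (0.2, -0.1)
```

Because every candidate is scored independently, the solver is robust to outliers that would derail a gradient-based refinement, at the cost of searching over the full candidate set.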
Utilizing artificial intelligence in perioperative patient flow: systematic literature review
Abstract. The purpose of this thesis was to map the existing landscape of artificial intelligence (AI) applications used in secondary healthcare, with a focus on perioperative care. The goal was to find out what systems have been developed, and how capable they are at controlling perioperative patient flow. The review was guided by the following research question: How is AI currently utilized in patient flow management in the context of perioperative care?
This systematic literature review examined the current evidence regarding the use of AI in perioperative patient flow. A comprehensive search was conducted in four databases, resulting in 33 articles meeting the inclusion criteria. Findings demonstrated that AI technologies, such as machine learning (ML) algorithms and predictive analytics tools, have shown somewhat promising outcomes in optimizing perioperative patient flow. Specifically, AI systems have proven effective in predicting surgical case durations, assessing risks, planning treatments, supporting diagnosis, improving bed utilization, reducing cancellations and delays, and enhancing communication and collaboration among healthcare providers. However, several challenges were identified, including the need for accurate and reliable data sources, ethical considerations, and the potential for biased algorithms. Further research is needed to validate and optimize the application of AI in perioperative patient flow.
The contribution of this thesis is a summary of the current characteristics of AI applications in perioperative patient flow. This systematic literature review provides information about the features of perioperative patient flow and the clinical tasks of previously identified AI applications.
Bidirectional Propagation for Cross-Modal 3D Object Detection
Recent works have revealed the superiority of feature-level fusion for
cross-modal 3D object detection, where fine-grained feature propagation from 2D
image pixels to 3D LiDAR points has been widely adopted for performance
improvement. Still, the potential of heterogeneous feature propagation between
2D and 3D domains has not been fully explored. In this paper, in contrast to
existing pixel-to-point feature propagation, we investigate an opposite
point-to-pixel direction, allowing point-wise features to flow inversely into
the 2D image branch. Thus, when jointly optimizing the 2D and 3D streams, the
gradients back-propagated from the 2D image branch can boost the representation
ability of the 3D backbone network working on LiDAR point clouds. Then,
combining pixel-to-point and point-to-pixel information flow mechanisms, we
construct a bidirectional feature propagation framework, dubbed BiProDet. In
addition to the architectural design, we also propose normalized local
coordinates map estimation, a new 2D auxiliary task for the training of the 2D
image branch, which facilitates learning local spatial-aware features from the
image modality and implicitly enhances the overall 3D detection performance.
Extensive experiments and ablation studies validate the effectiveness of our
method. Notably, we rank on the highly competitive
KITTI benchmark on the cyclist class by the time of submission. The source code
is available at https://github.com/Eaphan/BiProDet.
Comment: Accepted by ICLR2023.
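The point-to-pixel direction described above amounts to projecting each LiDAR point into the image plane and scattering its feature into the 2D feature map, so that 2D-branch gradients reach the 3D backbone during joint training. This numpy sketch assumes the projection (the pixel coordinates) is already computed; it is an illustration, not BiProDet's implementation.

```python
import numpy as np

def points_to_pixels(pt_feats, uv, img_feat):
    """Scatter point-wise features into the 2D feature map at their
    projected pixel locations (the point-to-pixel direction)."""
    out = img_feat.copy()
    for f, (u, v) in zip(pt_feats, uv):
        out[v, u] += f    # accumulate; gradients would flow back to 3D
    return out

H, W, C = 4, 4, 2
img_feat = np.zeros((H, W, C))
pt_feats = np.array([[1.0, 2.0], [3.0, 4.0]])
uv = np.array([[1, 2], [1, 2]])   # both points project to pixel (u=1, v=2)
fused = points_to_pixels(pt_feats, uv, img_feat)
print(fused[2, 1])  # [4.0, 6.0] — summed contributions
```

The classic pixel-to-point direction is the same gather/scatter pattern run the other way: sample the image feature map at each projected point and append it to the point feature.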
ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions
3D human reconstruction from RGB images achieves decent results in good
weather conditions but degrades dramatically in rough weather. Complementary,
mmWave radars have been employed to reconstruct 3D human joints and meshes in
rough weather. However, combining RGB and mmWave signals for robust all-weather
3D human reconstruction is still an open challenge, given the sparse nature of
mmWave and the vulnerability of RGB images. In this paper, we present
ImmFusion, the first mmWave-RGB fusion solution to reconstruct 3D human bodies
in all weather conditions robustly. Specifically, our ImmFusion consists of
image and point backbones for token feature extraction and a Transformer module
for token fusion. The image and point backbones refine global and local
features from original data, and the Fusion Transformer Module aims for
effective information fusion of two modalities by dynamically selecting
informative tokens. Extensive experiments on a large-scale dataset, mmBody,
captured in various environments demonstrate that ImmFusion can efficiently
utilize the information of two modalities to achieve a robust 3D human body
reconstruction in all weather conditions. In addition, our method's accuracy is
significantly superior to that of state-of-the-art Transformer-based
LiDAR-camera fusion methods.
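The dynamic token selection mentioned above can be sketched as a top-k filter over the concatenated image and point tokens, keeping only the most informative ones before fusion. The random scores here stand in for a learned scorer; shapes and k are illustrative assumptions.

```python
import numpy as np

def select_informative_tokens(tokens, scores, k):
    """Keep the k tokens with the highest informativeness scores,
    mimicking dynamic token selection before fusion."""
    idx = np.argsort(scores)[::-1][:k]
    return tokens[np.sort(idx)]   # preserve original token order

rng = np.random.default_rng(1)
img_tokens = rng.normal(size=(6, 4))   # tokens from the image backbone
pts_tokens = rng.normal(size=(5, 4))   # tokens from the mmWave point backbone
all_tokens = np.concatenate([img_tokens, pts_tokens])
scores = rng.uniform(size=len(all_tokens))  # stand-in for a learned scorer
kept = select_informative_tokens(all_tokens, scores, k=4)
print(kept.shape)  # (4, 4)
```

Selecting tokens across both modalities by a shared score is what lets such a fusion module down-weight a degraded modality (e.g., RGB in bad weather) without any per-weather re-training.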