CrossFusion: Interleaving Cross-modal Complementation for Noise-resistant 3D Object Detection
The combination of LiDAR and camera modalities has proven necessary and
typical for 3D object detection according to recent studies. However,
existing fusion strategies tend to rely overly on the LiDAR modality,
exploiting the abundant semantics of the camera sensor insufficiently; as a
consequence, when LiDAR features are corrupted, these methods cannot fall
back on the other modality because of the resulting large domain gap.
Motivated by this, we propose CrossFusion, a more robust and noise-resistant
scheme that makes full use of camera and LiDAR features through the designed
cross-modal complementation strategy. Extensive experiments show that our
method not only outperforms state-of-the-art methods without introducing an
extra depth estimation network, but also demonstrates noise resistance
without re-training for specific malfunction scenarios, improving mAP by
5.2% and NDS by 2.4%.
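The core idea of cross-modal complementation can be illustrated with a toy sketch: a per-location confidence weight blends LiDAR and camera features, so regions with corrupted LiDAR returns lean on the camera branch. The function name, shapes, and the confidence signal here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def complement_features(lidar_feat, cam_feat, lidar_conf):
    """Blend per-location LiDAR and camera features.

    lidar_feat, cam_feat: (N, C) spatially aligned feature vectors.
    lidar_conf: (N,) confidence in [0, 1]; low values indicate
    corrupted LiDAR returns, so the camera branch dominates there.
    """
    w = lidar_conf[:, None]                  # (N, 1) broadcastable weight
    return w * lidar_feat + (1.0 - w) * cam_feat

# Toy example: location 0 has clean LiDAR, location 1 is corrupted.
lidar = np.array([[1.0, 1.0], [9.9, 9.9]])   # 9.9 mimics sensor noise
cam = np.array([[0.0, 0.0], [2.0, 2.0]])
conf = np.array([1.0, 0.0])
fused = complement_features(lidar, cam, conf)
print(fused)  # row 0 keeps LiDAR, row 1 falls back to camera
```

A learned complementation module would predict the weights rather than take them as input, but the fallback behavior under LiDAR corruption is the same.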
Transforming Healthcare Quality through Information Technology
Information and information exchange are crucial to the delivery of care at all levels of the health care delivery system—the patient, the care team, the health care organization, and the encompassing political-economic environment. To diagnose and treat individual patients effectively, individual care providers and care teams must have access to at least three major types of clinical information: the patient’s health record, the rapidly changing medical-evidence base, and provider orders guiding the process of patient care. In this frame, Information Technology can help healthcare organizations improve the quality of care they provide, improve patient safety, improve cost-effectiveness, accelerate the translation of research findings into practice, improve care for the medically underserved, increase consumer involvement, improve accuracy and privacy, and increase their ability to monitor health nationally. Consequently, the present article presents some implementations of Information and Communication Technologies in the health care field.
Keywords: Healthcare; Quality; Information and Communication Technologies
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection
In this paper, we propose a robust 3D detector, named Cross Modal Transformer
(CMT), for end-to-end 3D multi-modal detection. Without explicit view
transformation, CMT takes the image and point clouds tokens as inputs and
directly outputs accurate 3D bounding boxes. The spatial alignment of
multi-modal tokens is performed by encoding the 3D points into multi-modal
features. The core design of CMT is quite simple while its performance is
impressive. It achieves 74.1% NDS (state-of-the-art with single model) on
nuScenes test set while maintaining faster inference speed. Moreover, CMT has a
strong robustness even if the LiDAR is missing. Code is released at
https://github.com/junjie18/CMT
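The fusion step described above — object queries attending jointly over image and point-cloud tokens, with no explicit view transformation — can be sketched in a few lines of numpy. This is a minimal single-head attention illustration under assumed shapes, not CMT's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_decode(queries, img_tokens, pts_tokens):
    """One cross-attention step: object queries attend over the
    concatenated image and point-cloud tokens simultaneously."""
    tokens = np.concatenate([img_tokens, pts_tokens], axis=0)  # (Ti+Tp, C)
    attn = softmax(queries @ tokens.T / np.sqrt(queries.shape[1]))
    return attn @ tokens  # updated queries, later regressed to 3D boxes

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))       # 4 object queries, dim 8
img = rng.normal(size=(6, 8))     # 6 image tokens
pts = rng.normal(size=(5, 8))     # 5 point-cloud tokens
out = cross_modal_decode(q, img, pts)
print(out.shape)  # (4, 8)
```

Because both modalities live in one token sequence, dropping the point-cloud tokens (e.g., a missing LiDAR) leaves the attention mechanism structurally intact, which is one intuition for the robustness the abstract reports.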
Understanding the Robustness of 3D Object Detection with Bird's-Eye-View Representations in Autonomous Driving
3D object detection is an essential perception task in autonomous driving to
understand the environments. The Bird's-Eye-View (BEV) representations have
significantly improved the performance of 3D detectors with camera inputs on
popular benchmarks. However, there still lacks a systematic understanding of
the robustness of these vision-dependent BEV models, which is closely related
to the safety of autonomous driving systems. In this paper, we evaluate the
natural and adversarial robustness of various representative models under
extensive settings, to fully understand their behaviors influenced by explicit
BEV features compared with those without BEV. In addition to the classic
settings, we propose a 3D consistent patch attack by applying adversarial
patches in the 3D space to guarantee the spatiotemporal consistency, which is
more realistic for the scenario of autonomous driving. With substantial
experiments, we draw several findings: 1) BEV models tend to be more stable
than previous methods under different natural conditions and common corruptions
due to the expressive spatial representations; 2) BEV models are more
vulnerable to adversarial noises, mainly caused by the redundant BEV features;
3) Camera-LiDAR fusion models have superior performance under different
settings with multi-modal inputs, but BEV fusion model is still vulnerable to
adversarial noises of both point cloud and image. These findings alert the
safety issue in the applications of BEV detectors and could facilitate the
development of more robust models.
Comment: 8 pages, CVPR202
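The "3D consistent" property of the proposed patch attack can be illustrated with a pinhole projection: the patch is anchored at a fixed 3D location, and only the ego-motion changes where it lands in each frame. The intrinsics, identity rotation, and translation below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def project(point_world, cam_t, K):
    """Pinhole projection of a world point for a camera translated by
    cam_t (identity rotation assumed for brevity)."""
    p_cam = point_world - cam_t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
corner = np.array([2.0, 0.0, 10.0])            # patch corner, world coords
uv_t0 = project(corner, np.zeros(3), K)        # ego at origin
uv_t1 = project(corner, np.array([0.0, 0.0, 1.0]), K)  # ego 1 m forward
print(uv_t0, uv_t1)  # same physical point, two pixel locations
```

Rendering the adversarial texture from this fixed 3D anchor in every frame is what guarantees spatiotemporal consistency, in contrast to pasting an independent 2D patch per image.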
EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps
Accurate and reliable ego-localization is critical for autonomous driving. In
this paper, we present EgoVM, an end-to-end localization network that achieves
comparable localization accuracy to prior state-of-the-art methods, but uses
lightweight vectorized maps instead of heavy point-based maps. To begin with,
we extract BEV features from online multi-view images and LiDAR point cloud.
Then, we employ a set of learnable semantic embeddings to encode the semantic
types of map elements and supervise them with semantic segmentation, to make
their feature representation consistent with BEV features. After that, we feed
map queries, composed of learnable semantic embeddings and coordinates of map
elements, into a transformer decoder to perform cross-modality matching with
BEV features. Finally, we adopt a robust histogram-based pose solver to
estimate the optimal pose by searching exhaustively over candidate poses. We
comprehensively validate the effectiveness of our method using both the
nuScenes dataset and a newly collected dataset. The experimental results show
that our method achieves centimeter-level localization accuracy, and
outperforms existing methods using vectorized maps by a large margin.
Furthermore, our model has been extensively tested in a large fleet of
autonomous vehicles under various challenging urban scenes.
Comment: 8 page
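The exhaustive candidate search behind a histogram-based pose solver can be sketched as follows: score every candidate offset by how well the observations align with the map, then take the argmax. The 2D translation-only search, distance threshold, and grid resolution are simplifying assumptions for illustration; EgoVM's actual solver operates on learned features.

```python
import numpy as np

def score(map_pts, obs_pts, dx, dy):
    """Count observed points landing near a map point after (dx, dy)."""
    shifted = obs_pts + np.array([dx, dy])
    d = np.linalg.norm(shifted[:, None, :] - map_pts[None, :, :], axis=-1)
    return (d.min(axis=1) < 0.05).sum()

def solve_pose(map_pts, obs_pts, candidates):
    """Exhaustive search over candidate offsets; the argmax of the
    score histogram gives the best pose."""
    scores = [score(map_pts, obs_pts, dx, dy) for dx, dy in candidates]
    return candidates[int(np.argmax(scores))]

# Map elements and observations offset by a known (0.2, -0.1) translation.
map_pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
obs_pts = map_pts - np.array([0.2, -0.1])
grid = [(dx, dy) for dx in np.arange(-0.3, 0.31, 0.1)
                 for dy in np.arange(-0.3, 0.31, 0.1)]
best = solve_pose(map_pts, obs_pts, grid)
print(best)  # close to (0.2, -0.1)
```

Because every candidate is scored independently, the solver is robust to outliers that would derail a gradient-based refinement, at the cost of searching over the full candidate set.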
Utilizing artificial intelligence in perioperative patient flow: systematic literature review
Abstract. The purpose of this thesis was to map the existing landscape of artificial intelligence (AI) applications used in secondary healthcare, with a focus on perioperative care. The goal was to find out what systems have been developed, and how capable they are at controlling perioperative patient flow. The review was guided by the following research question: How is AI currently utilized in patient flow management in the context of perioperative care?
This systematic literature review examined the current evidence regarding the use of AI in perioperative patient flow. A comprehensive search was conducted in four databases, resulting in 33 articles meeting the inclusion criteria. Findings demonstrated that AI technologies, such as machine learning (ML) algorithms and predictive analytics tools, have shown somewhat promising outcomes in optimizing perioperative patient flow. Specifically, AI systems have proven effective in predicting surgical case durations, assessing risks, planning treatments, supporting diagnosis, improving bed utilization, reducing cancellations and delays, and enhancing communication and collaboration among healthcare providers. However, several challenges were identified, including the need for accurate and reliable data sources, ethical considerations, and the potential for biased algorithms. Further research is needed to validate and optimize the application of AI in perioperative patient flow.
The contribution of this thesis is a summary of the current characteristics of AI applications in perioperative patient flow. This systematic literature review provides information about the features of perioperative patient flow and the clinical tasks of previously identified AI applications.
Bidirectional Propagation for Cross-Modal 3D Object Detection
Recent works have revealed the superiority of feature-level fusion for
cross-modal 3D object detection, where fine-grained feature propagation from 2D
image pixels to 3D LiDAR points has been widely adopted for performance
improvement. Still, the potential of heterogeneous feature propagation between
2D and 3D domains has not been fully explored. In this paper, in contrast to
existing pixel-to-point feature propagation, we investigate an opposite
point-to-pixel direction, allowing point-wise features to flow inversely into
the 2D image branch. Thus, when jointly optimizing the 2D and 3D streams, the
gradients back-propagated from the 2D image branch can boost the representation
ability of the 3D backbone network working on LiDAR point clouds. Then,
combining pixel-to-point and point-to-pixel information flow mechanisms, we
construct a bidirectional feature propagation framework, dubbed BiProDet. In
addition to the architectural design, we also propose normalized local
coordinates map estimation, a new 2D auxiliary task for the training of the 2D
image branch, which facilitates learning local spatial-aware features from the
image modality and implicitly enhances the overall 3D detection performance.
Extensive experiments and ablation studies validate the effectiveness of our
method. Notably, we rank on the highly competitive
KITTI benchmark on the cyclist class by the time of submission. The source code
is available at https://github.com/Eaphan/BiProDet.
Comment: Accepted by ICLR2023.
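The point-to-pixel direction described above amounts to projecting each LiDAR point into the image plane and scattering its feature into the 2D feature map, so that 2D-branch gradients reach the 3D backbone during joint training. This numpy sketch assumes the projection (the pixel coordinates) is already computed; it is an illustration, not BiProDet's implementation.

```python
import numpy as np

def points_to_pixels(pt_feats, uv, img_feat):
    """Scatter point-wise features into the 2D feature map at their
    projected pixel locations (the point-to-pixel direction)."""
    out = img_feat.copy()
    for f, (u, v) in zip(pt_feats, uv):
        out[v, u] += f    # accumulate; gradients would flow back to 3D
    return out

H, W, C = 4, 4, 2
img_feat = np.zeros((H, W, C))
pt_feats = np.array([[1.0, 2.0], [3.0, 4.0]])
uv = np.array([[1, 2], [1, 2]])   # both points project to pixel (u=1, v=2)
fused = points_to_pixels(pt_feats, uv, img_feat)
print(fused[2, 1])  # [4.0, 6.0] — summed contributions
```

The classic pixel-to-point direction is the same gather/scatter pattern run the other way: sample the image feature map at each projected point and append it to the point feature.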
ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions
3D human reconstruction from RGB images achieves decent results in good
weather conditions but degrades dramatically in rough weather. Complementary,
mmWave radars have been employed to reconstruct 3D human joints and meshes in
rough weather. However, combining RGB and mmWave signals for robust all-weather
3D human reconstruction is still an open challenge, given the sparse nature of
mmWave and the vulnerability of RGB images. In this paper, we present
ImmFusion, the first mmWave-RGB fusion solution to reconstruct 3D human bodies
in all weather conditions robustly. Specifically, our ImmFusion consists of
image and point backbones for token feature extraction and a Transformer module
for token fusion. The image and point backbones refine global and local
features from original data, and the Fusion Transformer Module aims for
effective information fusion of two modalities by dynamically selecting
informative tokens. Extensive experiments on a large-scale dataset, mmBody,
captured in various environments demonstrate that ImmFusion can efficiently
utilize the information of two modalities to achieve a robust 3D human body
reconstruction in all weather conditions. In addition, our method's accuracy is
significantly superior to that of state-of-the-art Transformer-based
LiDAR-camera fusion methods.
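The dynamic token selection mentioned above can be sketched as a top-k filter over the concatenated image and point tokens, keeping only the most informative ones before fusion. The random scores here stand in for a learned scorer; shapes and k are illustrative assumptions.

```python
import numpy as np

def select_informative_tokens(tokens, scores, k):
    """Keep the k tokens with the highest informativeness scores,
    mimicking dynamic token selection before fusion."""
    idx = np.argsort(scores)[::-1][:k]
    return tokens[np.sort(idx)]   # preserve original token order

rng = np.random.default_rng(1)
img_tokens = rng.normal(size=(6, 4))   # tokens from the image backbone
pts_tokens = rng.normal(size=(5, 4))   # tokens from the mmWave point backbone
all_tokens = np.concatenate([img_tokens, pts_tokens])
scores = rng.uniform(size=len(all_tokens))  # stand-in for a learned scorer
kept = select_informative_tokens(all_tokens, scores, k=4)
print(kept.shape)  # (4, 4)
```

Selecting tokens across both modalities by a shared score is what lets such a fusion module down-weight a degraded modality (e.g., RGB in bad weather) without any per-weather re-training.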