Multi-Modal 3D Object Detection in Autonomous Driving: a Survey
In the past few years, we have witnessed rapid development of autonomous
driving. However, achieving full autonomy remains a daunting task due to the
complex and dynamic driving environment. As a result, self-driving cars are
equipped with a suite of sensors to conduct robust and accurate environment
perception. As the number and type of sensors keep increasing, combining them
for better perception is becoming a natural trend. So far, there has been no
in-depth review that focuses on multi-sensor fusion-based perception. To bridge
this gap and motivate future research, this survey reviews recent
fusion-based 3D detection deep learning models that leverage multiple sensor
data sources, especially cameras and LiDARs. In this survey, we first introduce
the background of popular sensors for autonomous cars, including their common
data representations as well as object detection networks developed for each
type of sensor data. Next, we discuss some popular datasets for multi-modal 3D
object detection, with a special focus on the sensor data included in each
dataset. Then we present in-depth reviews of recent multi-modal 3D detection
networks by considering the following three aspects of the fusion: fusion
location, fusion data representation, and fusion granularity. After a detailed
review, we discuss open challenges and point out possible solutions. We hope
that our detailed review can help researchers embark on investigations in the
area of multi-modal 3D object detection.
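One of the taxonomy axes the survey uses is fusion location. A minimal sketch of the simplest option, late (decision-level) fusion, is shown below; the detection format, matching rule, and thresholds are all illustrative assumptions, not any particular network from the survey.

```python
import math

def fuse_detections(cam_dets, lidar_dets, max_dist=1.0):
    """Late (decision-level) fusion: match camera and LiDAR 3D detections
    by box-center distance and average the positions and scores of each
    matched pair. Each detection is a (x, y, z, score) tuple."""
    fused, used = [], set()
    for cx, cy, cz, cs in cam_dets:
        best, best_d = None, max_dist
        for i, (lx, ly, lz, _) in enumerate(lidar_dets):
            d = math.dist((cx, cy, cz), (lx, ly, lz))
            if i not in used and d < best_d:
                best, best_d = i, d
        if best is not None:
            lx, ly, lz, ls = lidar_dets[best]
            used.add(best)
            # average the matched pair's centers and confidences
            fused.append(((cx + lx) / 2, (cy + ly) / 2, (cz + lz) / 2,
                          (cs + ls) / 2))
        else:
            fused.append((cx, cy, cz, cs))  # camera-only detection
    # keep LiDAR-only detections that found no camera match
    fused += [d for i, d in enumerate(lidar_dets) if i not in used]
    return fused
```

Early and deep fusion instead combine raw data or intermediate features, which the surveyed networks realize inside the detection backbone rather than in a post-processing step like this.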
OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving
Visual Odometry (VO) plays a pivotal role in autonomous systems, with a
principal challenge being the lack of depth information in camera images. This
paper introduces OCC-VO, a novel framework that capitalizes on recent advances
in deep learning to transform 2D camera images into 3D semantic occupancy,
thereby circumventing the traditional need for concurrent estimation of ego
poses and landmark locations. Within this framework, we utilize the TPV-Former
to convert surround view cameras' images into 3D semantic occupancy. Addressing
the challenges presented by this transformation, we have specifically tailored
a pose estimation and mapping algorithm that incorporates a Semantic Label
Filter and a Dynamic Object Filter, and finally utilizes a Voxel PFilter to
maintain a consistent global semantic map. Evaluations on Occ3D-nuScenes
not only showcase a 20.6% improvement in Success Ratio and a 29.6% enhancement
in trajectory accuracy against ORB-SLAM3, but also emphasize our ability to
construct a comprehensive map. Our implementation is open-sourced and available
at: https://github.com/USTCLH/OCC-VO.
Comment: 7 pages, 3 figures
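The abstract's idea of maintaining a consistent global semantic map from per-frame occupancy predictions can be illustrated with a toy voxel map that accumulates label votes; this is only a sketch of the general principle, not the paper's actual Voxel PFilter, whose details the abstract does not specify.

```python
from collections import defaultdict

class SemanticVoxelMap:
    """Toy global semantic map: each inserted point votes for a label in
    its voxel, and the map reports the majority label per voxel, so noisy
    per-frame predictions converge to a consistent map over time."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size
        # voxel key -> {label: vote count}
        self.votes = defaultdict(lambda: defaultdict(int))

    def _key(self, point):
        return tuple(int(c // self.voxel_size) for c in point)

    def insert(self, point, label):
        self.votes[self._key(point)][label] += 1

    def label_at(self, point):
        counts = self.votes.get(self._key(point))
        if not counts:
            return None  # voxel never observed
        return max(counts, key=counts.get)
```

Repeated observations of the same voxel reinforce the dominant label, which is one simple way to suppress transient errors from any single occupancy prediction.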
Mga Modulates Bmpr1a Activity by Antagonizing Bs69 in Zebrafish
MAX giant associated protein (MGA) is a dual transcription factor containing both T-box and bHLHzip DNA binding domains. In vitro studies have shown that MGA functions as a transcriptional repressor or activator to regulate transcription from promoters containing either E-box or T-box binding sites. BS69 (ZMYND11), a multidomain-containing (i.e., PHD, BROMO, PWWP, and MYND) protein, has been shown to selectively recognize the histone variant H3.3 lysine 36 trimethylation mark (H3.3K36me3), modulate RNA Polymerase II elongation, and function as an RNA splicing regulator. Mutations in MGA or BS69 have been linked to multiple cancers and neural developmental disorders. Here, by TALEN- and CRISPR/Cas9-mediated loss-of-function assays, we show that zebrafish Mga and Bs69 are required to maintain proper Bmp signaling during early embryogenesis. We found that Mga protein localized in the cytoplasm modulates Bmpr1a activity by physical association with Zmynd11/Bs69. The Mynd domain of Bs69 specifically binds the kinase domain of Bmpr1a and interferes with its phosphorylation and activation of Smad1/5/8. Mga acts to antagonize Bs69 and facilitate the Bmp signaling pathway by disrupting the Bs69-Bmpr1a association. Functionally, Bmp signaling under the control of Mga and Bs69 is required for properly specifying the ventral tailfin cell fate.
Transferring Visual Representations for Reinforcement Learning via Prompting
It is important for deep reinforcement learning (DRL) algorithms to transfer
their learned policies to new environments that have different visual inputs.
In this paper, we introduce Prompt-based Proximal Policy Optimization, a
three-stage DRL algorithm that transfers visual representations from a target
to a source environment by applying prompting. The process consists of three
stages: pre-training, prompting, and predicting. In particular, we specify a
prompt-transformer for representation conversion and propose a two-step
training process to train the prompt-transformer for the target environment,
while the rest of the DRL pipeline remains unchanged. We implement and
evaluate the algorithm on the OpenAI CarRacing video game. The experimental
results show that it outperforms state-of-the-art visual transfer schemes; in
particular, it allows the learned policies to perform well in environments
with different visual inputs, which is much more effective than retraining
the policies in those environments.
Comment: This paper has been accepted for presentation at the IEEE
International Conference on Multimedia & Expo (ICME) in 202
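The abstract does not spell out the prompt-transformer's internals, so the following is purely an assumed sketch of what "representation conversion via prompting" could look like: a linear map plus a learned additive prompt vector, with both parameters left untrained (identity and zero) here.

```python
import numpy as np

def prompt_transform(z_target, W, prompt):
    """Map a target-environment feature vector into the source feature
    space via a linear map W plus an additive prompt vector. In the
    paper, such parameters would be learned in a two-step training
    process; here W and prompt are illustrative placeholders."""
    return W @ z_target + prompt

rng = np.random.default_rng(0)
d = 8
W = np.eye(d)          # untrained placeholder: identity map
prompt = np.zeros(d)   # untrained placeholder: zero prompt
z_target = rng.normal(size=d)
z_source = prompt_transform(z_target, W, prompt)
```

The appeal of such a design is that only the small converter is trained for each new environment, while the policy network downstream stays frozen.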
Jamming Sensor Networks: Attack and Defense Strategies
Wireless sensor networks are built upon a shared medium that makes it easy for adversaries to conduct radio interference, or jamming, attacks that effectively cause a denial of service of either transmission or reception functionalities. These attacks can easily be accomplished by an adversary either by bypassing MAC-layer protocols or by emitting a radio signal targeted at jamming a particular channel. In this article we survey different jamming attacks that may be employed against a sensor network. In order to cope with the problem of jamming, we discuss a two-phase strategy involving the diagnosis of the attack, followed by a suitable defense strategy. We highlight the challenges associated with detecting jamming. To cope with jamming, we propose two different but complementary approaches. One approach is to simply retreat from the interferer, which may be accomplished by either spectral evasion (channel surfing) or spatial evasion (spatial retreats). The second approach aims to compete more actively with the interferer by adjusting resources, such as power levels and communication coding, to achieve communication in the presence of the jammer.
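The channel-surfing idea above can be sketched as a simple loop: treat a low packet delivery ratio (PDR) as a crude jamming indicator and hop to the next channel until a usable one is found. The PDR-only test and the threshold are simplifications of the article's two-phase diagnose-then-defend strategy.

```python
def channel_surf(pdr_by_channel, threshold=0.5, start=0):
    """Spectral evasion sketch: starting from `start`, hop through the
    channel list until one whose packet delivery ratio meets the
    threshold is found; return its index, or None if every channel
    appears jammed."""
    n = len(pdr_by_channel)
    ch = start
    for _ in range(n):
        if pdr_by_channel[ch] >= threshold:
            return ch              # channel is usable, stay here
        ch = (ch + 1) % n          # surf to the next channel
    return None                    # all channels appear jammed
```

A real deployment would also need the diagnosis phase the article describes (distinguishing jamming from congestion or poor links) before triggering evasion, since hopping away from ordinary congestion wastes coordination overhead.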
mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar
Place recognition is crucial for tasks like loop-closure detection and
re-localization. Single-chip millimeter wave radar (single-chip radar in short)
emerges as a low-cost sensor option for place recognition, with the advantage
of insensitivity to degraded visual environments. However, it encounters two
challenges. Firstly, sparse point cloud from single-chip radar leads to poor
performance when using current place recognition methods, which assume much
denser data. Secondly, its performance significantly declines in scenarios
involving rotational and lateral variations, due to limited overlap in its
field of view (FOV). We propose mmPlace, a robust place recognition system to
address these challenges. Specifically, mmPlace transforms intermediate
frequency (IF) signal into range azimuth heatmap and employs a spatial encoder
to extract features. Additionally, to improve the performance in scenarios
involving rotational and lateral variations, mmPlace employs a rotating
platform and concatenates heatmaps in a rotation cycle, effectively expanding
the system's FOV. We evaluate mmPlace's performance on the milliSonic dataset,
which is collected on the University of Science and Technology of China (USTC)
campus, the city roads surrounding the campus, and an underground parking
garage. The results demonstrate that mmPlace outperforms point cloud-based
methods and achieves 87.37% recall@1 in scenarios involving rotational and
lateral variations.
Comment: 8 pages, 8 figures
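The pipeline described above, turning the IF signal into a range-azimuth heatmap and concatenating heatmaps over a rotation cycle, can be sketched with standard FMCW radar processing: an FFT over fast-time samples yields range bins and an FFT across the antenna array yields azimuth bins. The exact windowing and calibration in mmPlace may differ; this is the textbook form only.

```python
import numpy as np

def range_azimuth_heatmap(if_frame):
    """Convert one radar IF frame of shape (n_antennas, n_samples) into a
    range-azimuth magnitude heatmap: FFT along fast time gives range,
    FFT across the antenna array gives azimuth."""
    range_fft = np.fft.fft(if_frame, axis=1)     # fast time -> range bins
    azimuth_fft = np.fft.fft(range_fft, axis=0)  # antennas -> azimuth bins
    # center the azimuth spectrum so angle 0 sits in the middle
    return np.abs(np.fft.fftshift(azimuth_fft, axes=0))

def concat_rotation_cycle(heatmaps):
    """Stack the heatmaps captured during one platform rotation along the
    azimuth axis, emulating the FOV expansion described in the abstract."""
    return np.concatenate(heatmaps, axis=0)
```

Concatenating along azimuth is what lets a narrow-FOV single-chip radar produce a panoramic descriptor, which is the mechanism the abstract credits for robustness to rotational and lateral variations.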