Multi-Modal 3D Object Detection in Autonomous Driving: a Survey
In the past few years, we have witnessed rapid development of autonomous
driving. However, achieving full autonomy remains a daunting task due to the
complex and dynamic driving environment. As a result, self-driving cars are
equipped with a suite of sensors to conduct robust and accurate environment
perception. As the number and type of sensors keep increasing, combining them
for better perception is becoming a natural trend. So far, there has been no
in-depth review that focuses on multi-sensor fusion-based perception. To bridge
this gap and motivate future research, this survey reviews recent
fusion-based 3D detection deep learning models that leverage multiple sensor
data sources, especially cameras and LiDARs. In this survey, we first introduce
the background of popular sensors for autonomous cars, including their common
data representations as well as object detection networks developed for each
type of sensor data. Next, we discuss some popular datasets for multi-modal 3D
object detection, with a special focus on the sensor data included in each
dataset. Then we present in-depth reviews of recent multi-modal 3D detection
networks by considering the following three aspects of the fusion: fusion
location, fusion data representation, and fusion granularity. After a detailed
review, we discuss open challenges and point out possible solutions. We hope
that our detailed review can help researchers embark on investigations in the
area of multi-modal 3D object detection.
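One of the taxonomy axes the survey uses is fusion location. A minimal sketch of the simplest option, late (decision-level) fusion, is shown below; the detection format, matching rule, and thresholds are all illustrative assumptions, not any particular network from the survey.

```python
import math

def fuse_detections(cam_dets, lidar_dets, max_dist=1.0):
    """Late (decision-level) fusion: match camera and LiDAR 3D detections
    by box-center distance and average the positions and scores of each
    matched pair. Each detection is a (x, y, z, score) tuple."""
    fused, used = [], set()
    for cx, cy, cz, cs in cam_dets:
        best, best_d = None, max_dist
        for i, (lx, ly, lz, _) in enumerate(lidar_dets):
            d = math.dist((cx, cy, cz), (lx, ly, lz))
            if i not in used and d < best_d:
                best, best_d = i, d
        if best is not None:
            lx, ly, lz, ls = lidar_dets[best]
            used.add(best)
            # average the matched pair's centers and confidences
            fused.append(((cx + lx) / 2, (cy + ly) / 2, (cz + lz) / 2,
                          (cs + ls) / 2))
        else:
            fused.append((cx, cy, cz, cs))  # camera-only detection
    # keep LiDAR-only detections that found no camera match
    fused += [d for i, d in enumerate(lidar_dets) if i not in used]
    return fused
```

Early and deep fusion instead combine raw data or intermediate features, which the surveyed networks realize inside the detection backbone rather than in a post-processing step like this.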
OCC-VO: Dense Mapping via 3D Occupancy-Based Visual Odometry for Autonomous Driving
Visual Odometry (VO) plays a pivotal role in autonomous systems, with a
principal challenge being the lack of depth information in camera images. This
paper introduces OCC-VO, a novel framework that capitalizes on recent advances
in deep learning to transform 2D camera images into 3D semantic occupancy,
thereby circumventing the traditional need for concurrent estimation of ego
poses and landmark locations. Within this framework, we utilize the TPV-Former
to convert surround view cameras' images into 3D semantic occupancy. Addressing
the challenges presented by this transformation, we have specifically tailored
a pose estimation and mapping algorithm that incorporates a Semantic Label
Filter and a Dynamic Object Filter, and finally utilizes a Voxel PFilter to
maintain a consistent global semantic map. Evaluations on Occ3D-nuScenes
not only showcase a 20.6% improvement in Success Ratio and a 29.6% enhancement
in trajectory accuracy against ORB-SLAM3, but also emphasize our ability to
construct a comprehensive map. Our implementation is open-sourced and available
at: https://github.com/USTCLH/OCC-VO.
Comment: 7 pages, 3 figures
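The abstract's idea of maintaining a consistent global semantic map from per-frame occupancy predictions can be illustrated with a toy voxel map that accumulates label votes; this is only a sketch of the general principle, not the paper's actual Voxel PFilter, whose details the abstract does not specify.

```python
from collections import defaultdict

class SemanticVoxelMap:
    """Toy global semantic map: each inserted point votes for a label in
    its voxel, and the map reports the majority label per voxel, so noisy
    per-frame predictions converge to a consistent map over time."""

    def __init__(self, voxel_size=0.5):
        self.voxel_size = voxel_size
        # voxel key -> {label: vote count}
        self.votes = defaultdict(lambda: defaultdict(int))

    def _key(self, point):
        return tuple(int(c // self.voxel_size) for c in point)

    def insert(self, point, label):
        self.votes[self._key(point)][label] += 1

    def label_at(self, point):
        counts = self.votes.get(self._key(point))
        if not counts:
            return None  # voxel never observed
        return max(counts, key=counts.get)
```

Repeated observations of the same voxel reinforce the dominant label, which is one simple way to suppress transient errors from any single occupancy prediction.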
Mga Modulates Bmpr1a Activity by Antagonizing Bs69 in Zebrafish
MAX giant associated protein (MGA) is a dual transcription factor containing both T-box and bHLHzip DNA binding domains. In vitro studies have shown that MGA functions as a transcriptional repressor or activator to regulate transcription from promoters containing either E-box or T-box binding sites. BS69 (ZMYND11), a multidomain-containing (i.e., PHD, BROMO, PWWP, and MYND) protein, has been shown to selectively recognize the histone variant H3.3 lysine 36 trimethylation mark (H3.3K36me3), modulate RNA Polymerase II elongation, and function as an RNA splicing regulator. Mutations in MGA or BS69 have been linked to multiple cancers and neural developmental disorders. Here, by TALEN- and CRISPR/Cas9-mediated loss-of-function assays, we show that zebrafish Mga and Bs69 are required to maintain proper Bmp signaling during early embryogenesis. We found that Mga protein localized in the cytoplasm modulates Bmpr1a activity by physical association with Zmynd11/Bs69. The Mynd domain of Bs69 specifically binds the kinase domain of Bmpr1a and interferes with its phosphorylation and activation of Smad1/5/8. Mga acts to antagonize Bs69 and facilitate the Bmp signaling pathway by disrupting the Bs69-Bmpr1a association. Functionally, Bmp signaling under the control of Mga and Bs69 is required for properly specifying the ventral tailfin cell fate.
Transferring Visual Representations for Reinforcement Learning via Prompting
It is important for deep reinforcement learning (DRL) algorithms to transfer
their learned policies to new environments that have different visual inputs.
In this paper, we introduce Prompt-based Proximal Policy Optimization, a
three-stage DRL algorithm that transfers visual representations from a target
to a source environment by applying prompting. The process consists of three
stages: pre-training, prompting, and predicting. In particular, we specify a
prompt-transformer for representation conversion and propose a two-step
training process to train the prompt-transformer for the target environment,
while the rest of the DRL pipeline remains unchanged. We implement and
evaluate the algorithm on the OpenAI CarRacing video game. The experimental
results show that it outperforms state-of-the-art visual transfer schemes; in
particular, it allows the learned policies to perform well in environments
with different visual inputs, which is much more effective than retraining
the policies in those environments.
Comment: This paper has been accepted for presentation at the IEEE
International Conference on Multimedia & Expo (ICME) in 202
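The abstract does not spell out the prompt-transformer's internals, so the following is purely an assumed sketch of what "representation conversion via prompting" could look like: a linear map plus a learned additive prompt vector, with both parameters left untrained (identity and zero) here.

```python
import numpy as np

def prompt_transform(z_target, W, prompt):
    """Map a target-environment feature vector into the source feature
    space via a linear map W plus an additive prompt vector. In the
    paper, such parameters would be learned in a two-step training
    process; here W and prompt are illustrative placeholders."""
    return W @ z_target + prompt

rng = np.random.default_rng(0)
d = 8
W = np.eye(d)          # untrained placeholder: identity map
prompt = np.zeros(d)   # untrained placeholder: zero prompt
z_target = rng.normal(size=d)
z_source = prompt_transform(z_target, W, prompt)
```

The appeal of such a design is that only the small converter is trained for each new environment, while the policy network downstream stays frozen.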
Jamming Sensor Networks: Attack and Defense Strategies
Wireless sensor networks are built upon a shared medium that makes it easy for adversaries to conduct radio interference, or jamming, attacks that effectively cause a denial of service of either transmission or reception functionalities. These attacks can easily be accomplished by an adversary either by bypassing MAC-layer protocols or by emitting a radio signal targeted at jamming a particular channel. In this article we survey different jamming attacks that may be employed against a sensor network. In order to cope with the problem of jamming, we discuss a two-phase strategy involving the diagnosis of the attack, followed by a suitable defense strategy. We highlight the challenges associated with detecting jamming. To cope with jamming, we propose two different but complementary approaches. One approach is to simply retreat from the interferer, which may be accomplished by either spectral evasion (channel surfing) or spatial evasion (spatial retreats). The second approach aims to compete more actively with the interferer by adjusting resources, such as power levels and communication coding, to achieve communication in the presence of the jammer.
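The channel-surfing idea above can be sketched as a simple loop: treat a low packet delivery ratio (PDR) as a crude jamming indicator and hop to the next channel until a usable one is found. The PDR-only test and the threshold are simplifications of the article's two-phase diagnose-then-defend strategy.

```python
def channel_surf(pdr_by_channel, threshold=0.5, start=0):
    """Spectral evasion sketch: starting from `start`, hop through the
    channel list until one whose packet delivery ratio meets the
    threshold is found; return its index, or None if every channel
    appears jammed."""
    n = len(pdr_by_channel)
    ch = start
    for _ in range(n):
        if pdr_by_channel[ch] >= threshold:
            return ch              # channel is usable, stay here
        ch = (ch + 1) % n          # surf to the next channel
    return None                    # all channels appear jammed
```

A real deployment would also need the diagnosis phase the article describes (distinguishing jamming from congestion or poor links) before triggering evasion, since hopping away from ordinary congestion wastes coordination overhead.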
mmPlace: Robust Place Recognition with Intermediate Frequency Signal of Low-cost Single-chip Millimeter Wave Radar
Place recognition is crucial for tasks like loop-closure detection and
re-localization. Single-chip millimeter wave radar (single-chip radar in short)
emerges as a low-cost sensor option for place recognition, with the advantage
of insensitivity to degraded visual environments. However, it encounters two
challenges. Firstly, sparse point cloud from single-chip radar leads to poor
performance when using current place recognition methods, which assume much
denser data. Secondly, its performance significantly declines in scenarios
involving rotational and lateral variations, due to limited overlap in its
field of view (FOV). We propose mmPlace, a robust place recognition system to
address these challenges. Specifically, mmPlace transforms intermediate
frequency (IF) signal into range azimuth heatmap and employs a spatial encoder
to extract features. Additionally, to improve the performance in scenarios
involving rotational and lateral variations, mmPlace employs a rotating
platform and concatenates heatmaps in a rotation cycle, effectively expanding
the system's FOV. We evaluate mmPlace's performance on the milliSonic dataset,
which is collected on the University of Science and Technology of China (USTC)
campus, the city roads surrounding the campus, and an underground parking
garage. The results demonstrate that mmPlace outperforms point cloud-based
methods and achieves 87.37% recall@1 in scenarios involving rotational and
lateral variations.
Comment: 8 pages, 8 figures
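The pipeline described above, turning the IF signal into a range-azimuth heatmap and concatenating heatmaps over a rotation cycle, can be sketched with standard FMCW radar processing: an FFT over fast-time samples yields range bins and an FFT across the antenna array yields azimuth bins. The exact windowing and calibration in mmPlace may differ; this is the textbook form only.

```python
import numpy as np

def range_azimuth_heatmap(if_frame):
    """Convert one radar IF frame of shape (n_antennas, n_samples) into a
    range-azimuth magnitude heatmap: FFT along fast time gives range,
    FFT across the antenna array gives azimuth."""
    range_fft = np.fft.fft(if_frame, axis=1)     # fast time -> range bins
    azimuth_fft = np.fft.fft(range_fft, axis=0)  # antennas -> azimuth bins
    # center the azimuth spectrum so angle 0 sits in the middle
    return np.abs(np.fft.fftshift(azimuth_fft, axes=0))

def concat_rotation_cycle(heatmaps):
    """Stack the heatmaps captured during one platform rotation along the
    azimuth axis, emulating the FOV expansion described in the abstract."""
    return np.concatenate(heatmaps, axis=0)
```

Concatenating along azimuth is what lets a narrow-FOV single-chip radar produce a panoramic descriptor, which is the mechanism the abstract credits for robustness to rotational and lateral variations.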