561 research outputs found

    Learning Robust Object Recognition Using Composed Scenes from Generative Models

    Recurrent feedback connections in the mammalian visual system have been hypothesized to play a role in synthesizing input within the theoretical framework of analysis by synthesis. Comparing the internally synthesized representation with that of the input provides a validation mechanism during perceptual inference and learning. Inspired by these ideas, we proposed that the synthesis machinery can compose new, unobserved images by imagination to train the network itself, increasing the robustness of the system in novel scenarios. As a proof of concept, we investigated whether images composed by imagination could help an object recognition system deal with occlusion, which remains challenging for current state-of-the-art deep convolutional neural networks. We fine-tuned a network on images containing objects in various occlusion scenarios that are imagined, or self-generated, through a deep generator network. Trained on imagined occlusion scenarios under an object persistence constraint, our network discovered more subtle and localized image features that were neglected by the original network for object classification, obtaining better separability of different object classes in the feature space. This leads to a significant improvement in object recognition under occlusion for our network relative to the original network trained only on un-occluded images. In addition to providing practical benefits for object recognition under occlusion, this work demonstrates that self-generated composition of visual scenes through the synthesis loop, combined with the object persistence constraint, can provide opportunities for neural networks to discover new relevant patterns in the data and become more flexible in dealing with novel situations. Comment: Accepted by 14th Conference on Computer and Robot Vision
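    As a rough illustration of the idea, the following PyTorch-style sketch shows one way such fine-tuning could be wired up: composed occluded images from a generator are classified alongside the originals, and an object persistence term keeps features of the occluded and un-occluded views close. The generator interface, the loss weighting, and all function names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def persistence_finetune_step(backbone, classifier, generator, images, labels,
                              optimizer, lambda_persist=0.1):
    """One fine-tuning step on self-composed occluded scenes (illustrative sketch).

    `generator` stands in for the deep generator network that "imagines"
    occluded versions of the input images; its interface is assumed here.
    """
    # Compose occluded variants of the same objects (imagined scenes).
    with torch.no_grad():
        occluded = generator(images)          # assumed: returns occluded renderings

    feats_clean = backbone(images)            # features of the original images
    feats_occ = backbone(occluded)            # features of the occluded compositions

    logits_clean = classifier(feats_clean)
    logits_occ = classifier(feats_occ)

    # Standard classification loss on both views.
    cls_loss = F.cross_entropy(logits_clean, labels) + F.cross_entropy(logits_occ, labels)

    # Object persistence constraint: the same object should map to similar
    # features whether or not it is occluded.
    persist_loss = F.mse_loss(feats_occ, feats_clean.detach())

    loss = cls_loss + lambda_persist * persist_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```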

    AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

    An open-source Mandarin speech corpus called AISHELL-1 is released. It is by far the largest corpus suitable for conducting speech recognition research and building speech recognition systems for Mandarin. The recording procedure, including the audio capturing devices and environments, is presented in detail. The preparation of the related resources, including the transcriptions and lexicon, is described. The corpus is released with a Kaldi recipe. Experimental results imply that the quality of the audio recordings and transcriptions is promising. Comment: Oriental COCOSDA 2017
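    For readers who want to work with the corpus directly, the sketch below pairs utterances with their transcripts, assuming the commonly distributed layout (per-speaker wav/ subfolders plus a single transcript file); the exact paths and file names are assumptions about the release rather than details stated in the abstract.

```python
from pathlib import Path

def load_aishell_pairs(corpus_root, split="train"):
    """Pair utterance audio with transcripts (sketch; the layout is assumed).

    Assumes corpus_root/wav/<split>/<speaker>/<utt_id>.wav and a transcript
    file corpus_root/transcript/aishell_transcript_v0.8.txt whose lines look
    like "<utt_id> <space-separated characters>".
    """
    root = Path(corpus_root)
    transcripts = {}
    with open(root / "transcript" / "aishell_transcript_v0.8.txt", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            utt_id, *words = line.strip().split()
            transcripts[utt_id] = "".join(words)

    pairs = []
    for wav_path in sorted((root / "wav" / split).rglob("*.wav")):
        utt_id = wav_path.stem
        if utt_id in transcripts:            # some utterances may lack transcripts
            pairs.append((wav_path, transcripts[utt_id]))
    return pairs
```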

    Advancing Adversarial Robustness Through Adversarial Logit Update

    Deep neural networks are susceptible to adversarial perturbations. Adversarial training and adversarial purification are among the most widely recognized defense strategies. Although these methods follow different underlying logic, both rely on absolute logit values to generate label predictions. In this study, we theoretically analyze the logit difference around successful adversarial attacks and propose a new principle, namely Adversarial Logit Update (ALU), to infer the labels of adversarial samples. Based on ALU, we introduce a new classification paradigm that utilizes pre- and post-purification logit differences to boost a model's adversarial robustness. Without requiring adversarial or additional data for model training, our clean-data synthesis model can be easily applied to various pre-trained models for both adversarial sample detection and ALU-based data classification. Extensive experiments on the CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets show that, even with simple components, the proposed solution achieves superior robustness compared to state-of-the-art methods against a wide range of adversarial attacks. Our Python implementation is submitted in our supplementary document and will be published upon the paper's acceptance.
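    One plausible reading of the ALU idea is sketched below: the input is purified, and the class whose logit increases most after purification is taken as the prediction. This decision rule, and the classifier/purifier interfaces, are assumptions inferred from the abstract, not the authors' exact formulation.

```python
import torch

def alu_predict(classifier, purifier, x):
    """Classify from the logit update caused by purification (illustrative sketch).

    The rule below (take the class whose logit rises most after purification)
    is one reading of the Adversarial Logit Update principle described in the
    abstract; the authors' exact decision rule may differ.
    """
    with torch.no_grad():
        logits_pre = classifier(x)             # logits on the (possibly attacked) input
        logits_post = classifier(purifier(x))  # logits after adversarial purification
    logit_update = logits_post - logits_pre    # per-class change induced by purification
    return logit_update.argmax(dim=-1)
```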

    Effects of fully open-air [CO2] elevation on leaf photosynthesis and ultrastructure of Isatis indigotica Fort

    Traditional Chinese medicine relies heavily on herbs, yet there is no information on how these herb plants would respond to climate change. To gain insight into such responses, we studied the effect of elevated [CO2] on Isatis indigotica Fort., one of the most popular Chinese herb plants. The changes in leaf photosynthesis, chlorophyll fluorescence, leaf ultrastructure and biomass yield in response to elevated [CO2] (550–619 μmol mol–1) were determined at the Free-Air Carbon dioxide Enrichment (FACE) experimental facility in North China. The photosynthetic ability of I. indigotica was improved under elevated [CO2]. Elevated [CO2] increased the net photosynthetic rate (PN), water use efficiency (WUE) and maximum rate of electron transport (Jmax) of the uppermost fully-expanded leaves, but not stomatal conductance (gs), transpiration rate (Tr) or maximum velocity of carboxylation (Vc,max). Elevated [CO2] significantly increased the leaf intrinsic efficiency of PSII (Fv'/Fm') and quantum yield of PSII (ΦPSII), but decreased leaf non-photochemical quenching (NPQ), and did not affect the proportion of open PSII reaction centers (qP) or the maximum quantum efficiency of PSII (Fv/Fm). The chloroplast membranes, grana layers and stroma thylakoid membranes remained structurally intact under elevated [CO2], though more starch grains accumulated within the chloroplasts than under ambient [CO2]. While the yield of I. indigotica was higher due to the improved photosynthesis under elevated [CO2], the content of adenosine, one of the functional ingredients in indigowoad root, was not affected.

    Introducing Depth into Transformer-based 3D Object Detection

    In this paper, we present DAT, a Depth-Aware Transformer framework designed for camera-based 3D detection. Our model is motivated by two major issues we observe in existing methods: large depth translation errors and duplicate predictions along the depth axis. To mitigate these issues, we propose two key solutions within DAT. To address the first issue, we introduce a Depth-Aware Spatial Cross-Attention (DA-SCA) module that incorporates depth information into spatial cross-attention when lifting image features to 3D space. To address the second issue, we introduce an auxiliary learning task called the Depth-aware Negative Suppression (DNS) loss. First, based on their reference points, we organize features as a Bird's-Eye-View (BEV) feature map. Then, we sample positive and negative features along each object ray that connects an object and a camera, and train the model to distinguish between them. The proposed DA-SCA and DNS methods effectively alleviate these two problems. We show that DAT is a versatile method that enhances the performance of all three popular models, BEVFormer, DETR3D, and PETR. Our evaluation on BEVFormer demonstrates that DAT achieves a significant improvement of +2.8 NDS on nuScenes val under the same settings. Moreover, when using pre-trained VoVNet-99 as the backbone, DAT achieves strong results of 60.0 NDS and 51.5 mAP on nuScenes test. Our code will be released soon. Comment: revision
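    A rough sketch of what a Depth-aware Negative Suppression objective could look like is given below: BEV features are sampled along the camera-to-object ray, the sample at the object location is treated as positive and the rest as negatives, and a small scoring head is trained to separate them. The sampling scheme, the binary objective, and all interfaces are assumptions based on the abstract, not the paper's actual loss.

```python
import torch
import torch.nn.functional as F

def dns_loss(bev_features, score_head, object_xy, camera_xy, num_samples=8):
    """Depth-aware Negative Suppression loss (illustrative sketch).

    Samples BEV features along the camera-to-object ray; the feature at the
    object location is treated as positive and the remaining ray samples as
    negatives. The details below are assumptions, not the paper's formulation.

    bev_features: (C, H, W) BEV feature map in normalized [-1, 1] coordinates.
    object_xy, camera_xy: (2,) normalized BEV coordinates of object and camera.
    score_head: small module mapping a feature vector to a single logit.
    """
    # Points along the ray from the camera through the object (object is last).
    t = torch.linspace(0.0, 1.0, num_samples)
    ray_xy = camera_xy[None, :] + t[:, None] * (object_xy - camera_xy)[None, :]

    # Bilinearly sample the BEV feature map at the ray points.
    grid = ray_xy.view(1, 1, num_samples, 2)                        # (1, 1, N, 2)
    sampled = F.grid_sample(bev_features[None], grid, align_corners=True)
    sampled = sampled.view(bev_features.shape[0], num_samples).t()  # (N, C)

    scores = score_head(sampled).squeeze(-1)                        # (N,) objectness logits
    targets = torch.zeros(num_samples)
    targets[-1] = 1.0                                               # only the object point is positive
    return F.binary_cross_entropy_with_logits(scores, targets)
```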