
6,894 research outputs found

Neural Network Ambient Occlusion

Author: Bergstra J.
Kingma D. P.
Srivastava N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 28/11/2016

Crossref

Edinburgh Research Explorer

Ambient Sound Helps: Audiovisual Crowd Counting in Extreme Conditions

Author: Dou Dejing
Gao Junyu
Hu Di
Hua Yuansheng
Mou Lichao
Wang Qingzhong
Zhu Xiao Xiang
Publication date: 16/05/2020

Visual crowd counting has been recently studied as a way to enable people counting in crowd scenes from images. Albeit successful, vision-based crowd counting approaches could fail to capture informative features in extreme conditions, e.g., imaging at night and occlusion. In this work, we introduce a novel task of audiovisual crowd counting, in which visual and auditory information are integrated for counting purposes. We collect a large-scale benchmark, named auDiovISual Crowd cOunting (DISCO) dataset, consisting of 1,935 images and the corresponding audio clips, and 170,270 annotated instances. In order to fuse the two modalities, we make use of a linear feature-wise fusion module that carries out an affine transformation on visual and auditory features. Finally, we conduct extensive experiments using the proposed dataset and approach. Experimental results show that introducing auditory information can benefit crowd counting under different illumination, noise, and occlusion conditions. The dataset and code will be released. Code and data have been made availabl
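The feature-wise fusion mentioned in the abstract above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes the audio feature vector is mapped by two linear layers to a per-channel scale and shift, which are then applied to the visual feature map. The function name `feature_wise_fusion`, the parameter names, and the tensor shapes are all hypothetical.

```python
import numpy as np

def feature_wise_fusion(visual, audio, w_gamma, b_gamma, w_beta, b_beta):
    """Fuse audio into visual features with a per-channel affine transform.

    Assumed (hypothetical) shapes:
      visual:  (C, H, W) visual feature map
      audio:   (D,)      audio feature vector
      w_gamma, w_beta: (C, D) linear weights
      b_gamma, b_beta: (C,)   linear biases
    """
    # Linear maps from the audio vector to one scale and one shift per channel.
    gamma = w_gamma @ audio + b_gamma   # shape (C,)
    beta = w_beta @ audio + b_beta      # shape (C,)
    # Broadcast over the spatial dimensions: fused = gamma * visual + beta.
    return gamma[:, None, None] * visual + beta[:, None, None]

# Usage with toy values (assumed, for illustration only).
visual = np.ones((2, 2, 2))
audio = np.array([1.0, 2.0])
fused = feature_wise_fusion(
    visual, audio,
    w_gamma=np.eye(2), b_gamma=np.zeros(2),
    w_beta=np.zeros((2, 2)), b_beta=np.array([10.0, 20.0]),
)
```

With these toy parameters the scale equals the audio vector itself and the shift is constant, so each visual channel is scaled and offset independently of spatial position.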

arXiv.org e-Print Archive

Institute of Transport Research:Publications

SA-Net: Deep Neural Network for Robot Trajectory Recognition from RGB-D Streams

Author: Asali Ehsan
Doshi Prashant
Hong Yi
Soans Nihal
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/08/2020

Learning from demonstration (LfD) and imitation learning offer new paradigms for transferring task behavior to robots. A class of methods that enable such online learning require the robot to observe the task being performed and decompose the sensed streaming data into sequences of state-action pairs, which are then input to the methods. Thus, recognizing the state-action pairs correctly and quickly in sensed data is a crucial prerequisite for these methods. We present SA-Net, a deep neural network architecture that recognizes state-action pairs from RGB-D data streams. SA-Net performed well in two diverse robotic applications of LfD -- one involving mobile ground robots and another involving a robotic manipulator -- which demonstrates that the architecture generalizes well to differing contexts. Comprehensive evaluations including deployment on a physical robot show that SA-Net significantly improves on the accuracy of the previous method that utilizes traditional image processing and segmentation. Comment: (in press

arXiv.org e-Print Archive

Crossref

Robustness of 3D Deep Learning in an Adversarial Setting

Author: Kwiatkowska Marta
Wicker Matthew
Publication date: 01/04/2019

Understanding the spatial arrangement and nature of real-world objects is of paramount importance to many complex engineering tasks, including autonomous navigation. Deep learning has revolutionized state-of-the-art performance for tasks in 3D environments; however, relatively little is known about the robustness of these approaches in an adversarial setting. The lack of comprehensive analysis makes it difficult to justify deployment of 3D deep learning models in real-world, safety-critical applications. In this work, we develop an algorithm for analysis of pointwise robustness of neural networks that operate on 3D data. We show that current approaches presented for understanding the resilience of state-of-the-art models vastly overestimate their robustness. We then use our algorithm to evaluate an array of state-of-the-art models in order to demonstrate their vulnerability to occlusion attacks. We show that, in the worst case, these networks can be reduced to 0% classification accuracy after the occlusion of at most 6.5% of the occupied input space. Comment: 10 pages, 8 figures, 1 table
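To make the idea of an occlusion budget over the occupied input space concrete, here is a minimal NumPy sketch. It is not the authors' algorithm: it assumes a boolean voxel occupancy grid and a precomputed per-voxel importance score (both hypothetical), and it removes the highest-scoring occupied voxels up to a fixed fraction of the occupied ones. The name `occlude_voxels` and its parameters are illustrative only.

```python
import numpy as np

def occlude_voxels(grid, saliency, max_fraction=0.065):
    """Return a copy of `grid` with the most salient occupied voxels cleared.

    grid:         boolean occupancy array of any shape (assumed input format)
    saliency:     float array of the same shape (hypothetical importance scores)
    max_fraction: upper bound on the fraction of occupied voxels to occlude
    """
    flat = grid.ravel().copy()
    occupied = np.flatnonzero(flat)
    # Budget is counted against occupied voxels only, not the full grid.
    budget = int(np.floor(max_fraction * occupied.size))
    if budget > 0:
        scores = saliency.ravel()[occupied]
        # Pick the `budget` occupied voxels with the highest scores.
        top = occupied[np.argsort(scores)[::-1][:budget]]
        flat[top] = False
    return flat.reshape(grid.shape)

# Usage with a toy 2x2x2 fully occupied grid and a 25% budget (assumed values).
grid = np.ones((2, 2, 2), dtype=bool)
saliency = np.arange(8.0).reshape(2, 2, 2)
occluded = occlude_voxels(grid, saliency, max_fraction=0.25)
```

A robustness evaluation would then compare a classifier's prediction on `grid` and on `occluded`. The classifier and the way the scores are obtained are left out here because the abstract does not specify them.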

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive