ModDrop: adaptive multi-modal gesture recognition
We present a method for gesture detection and localisation based on
multi-scale and multi-modal deep learning. Each visual modality captures
spatial information at a particular spatial scale (such as motion of the upper
body or a hand), and the whole system operates at three temporal scales. Key to
our technique is a training strategy which exploits: i) careful initialization
of individual modalities; and ii) gradual fusion involving random dropping of
separate channels (dubbed ModDrop) for learning cross-modality correlations
while preserving uniqueness of each modality-specific representation. We
present experiments on the ChaLearn 2014 Looking at People Challenge gesture
recognition track, in which we placed first out of 17 teams. Fusing multiple
modalities at several spatial and temporal scales leads to a significant
increase in recognition rates, allowing the model to compensate for errors of
the individual classifiers as well as noise in the separate channels.
Furthermore, the proposed ModDrop training technique ensures robustness of the
classifier to missing signals in one or several channels, so that it produces
meaningful predictions from any number of available modalities. In addition, we
demonstrate the applicability of the proposed fusion scheme to modalities of
arbitrary nature by experiments on the same dataset augmented with audio.
Comment: 14 pages, 7 figures
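The core of the ModDrop idea, dropping whole modality channels at random during training so the fused network learns cross-modality correlations without over-relying on any single channel, can be sketched as follows. This is a minimal illustrative sketch in NumPy, not the authors' code; the function name `moddrop` and the drop probability are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def moddrop(modalities, p_drop=0.3, rng=rng):
    """Randomly zero out whole modality channels for one training batch.

    `modalities` is a list of per-modality feature arrays. Each modality
    is kept independently with probability 1 - p_drop; at least one
    modality is always retained so the fused input is never fully empty.
    (Illustrative sketch, not the paper's implementation.)
    """
    keep = rng.random(len(modalities)) >= p_drop
    if not keep.any():                      # guarantee one surviving channel
        keep[rng.integers(len(modalities))] = True
    return [m if k else np.zeros_like(m) for m, k in zip(modalities, keep)]

# usage: three toy modalities (e.g. depth hand, RGB upper body, audio)
batch = [np.ones((4, 8)), np.ones((4, 16)), np.ones((4, 5))]
dropped = moddrop(batch)
```

Because dropped channels are zeroed rather than removed, the fusion layers see a fixed input shape and learn representations that degrade gracefully when a sensor is missing at test time.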
Deep Learning Face Attributes in the Wild
Predicting face attributes in the wild is challenging due to complex face
variations. We propose a novel deep learning framework for attribute prediction
in the wild. It cascades two CNNs, LNet and ANet, which are fine-tuned jointly
with attribute tags, but pre-trained differently. LNet is pre-trained by
massive general object categories for face localization, while ANet is
pre-trained by massive face identities for attribute prediction. This framework
not only outperforms the state-of-the-art with a large margin, but also reveals
valuable facts on learning face representation.
(1) It shows how the performances of face localization (LNet) and attribute
prediction (ANet) can be improved by different pre-training strategies.
(2) It reveals that although the filters of LNet are fine-tuned only with
image-level attribute tags, their response maps over entire images have strong
indication of face locations. This fact enables training LNet for face
localization with only image-level annotations, but without face bounding boxes
or landmarks, which are required by all attribute recognition works.
(3) It also demonstrates that the high-level hidden neurons of ANet
automatically discover semantic concepts after pre-training with massive face
identities, and such concepts are significantly enriched after fine-tuning with
attribute tags. Each attribute can be well explained with a sparse linear
combination of these concepts.
Comment: To appear in International Conference on Computer Vision (ICCV) 201
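Explaining an attribute as a sparse linear combination of discovered concepts amounts to a sparse-approximation problem. A minimal greedy (orthogonal-matching-pursuit-style) sketch in NumPy is shown below; the function name, the concept matrix layout, and the sparsity level `k` are assumptions for illustration, not the paper's method.

```python
import numpy as np

def sparse_explain(concepts, attribute, k=3):
    """Greedily approximate `attribute` as a sparse combination of the
    columns of `concepts`, keeping at most k nonzero weights.

    `concepts` is a (d, n_concepts) matrix of concept responses and
    `attribute` a length-d response vector. (Illustrative sketch.)
    """
    residual = attribute.astype(float).copy()
    support = []
    for _ in range(k):
        corr = np.abs(concepts.T @ residual)
        corr[support] = -np.inf               # never reselect a concept
        support.append(int(np.argmax(corr)))
        sub = concepts[:, support]
        w, *_ = np.linalg.lstsq(sub, attribute, rcond=None)
        residual = attribute - sub @ w
    weights = np.zeros(concepts.shape[1])
    weights[support] = w
    return weights
```

The few concepts receiving nonzero weight then serve as a human-readable explanation of the attribute.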
From Facial Parts Responses to Face Detection: A Deep Learning Approach
In this paper, we propose a novel deep convolutional network (DCN) that
achieves outstanding performance on FDDB, PASCAL Face, and AFW. Specifically,
our method achieves a high recall rate of 90.99% on the challenging FDDB
benchmark, outperforming the state-of-the-art method by a large margin of
2.91%. Importantly, we consider finding faces from a new perspective through
scoring facial parts responses by their spatial structure and arrangement. The
scoring mechanism is carefully formulated considering challenging cases where
faces are only partially visible. This consideration allows our network to
detect faces under severe occlusion and unconstrained pose variation, which are
the main difficulty and bottleneck of most existing face detection approaches.
We show that despite the use of DCN, our network can achieve practical runtime
speed.
Comment: To appear in ICCV 201
Spatial patterns of knowledge-intensive business services in cities of various sizes, morphologies and economies
We compare intra-urban localization patterns of advertising and IT companies in three large Czech cities. The main aim of our analysis is an empirically based contribution to the question of the extent to which knowledge bases affect the spatial distribution of various knowledge-intensive business industries. The central research question is: to what extent is the localization of these two industries influenced by different modes of innovation/knowledge bases (symbolic vs. synthetic), and to what extent by contextual factors such as urban size, morphology, position in the urban hierarchy, and the economic profile of the given city? We found that urban contexts shape the localization patterns of advertising and IT companies more than differences in knowledge bases do: both industries cluster primarily in the inner cities and urban cores, and the formation of more suburban IT "scientific neighborhoods" is limited.
Gas Source Localization Strategies for Teleoperated Mobile Robots. An Experimental Analysis
Gas source localization (GSL) is one of the most important and direct applications of a gas-sensitive mobile robot, and consists in searching for one or multiple volatile emission sources with a mobile robot that has enhanced sensing capabilities (i.e. olfaction, wind flow, etc.). This work addresses GSL by employing a teleoperated mobile robot, and focuses on which search strategy is the most suitable for this teleoperated approach. Four different search strategies, namely chemotaxis, anemotaxis, gas mapping, and visual-aided search, are analyzed and evaluated according to a set of proposed indicators (e.g. accuracy, efficiency, success rate) to determine the most suitable one for a human-teleoperated mobile robot. Experimental validation is carried out on a large dataset of over 150 trials in which volunteer operators had to locate a gas leak in a virtual environment under varied and realistic environmental conditions (i.e. different wind flow patterns and gas source locations). We report several findings, among which we highlight that, against intuition, visual-aided search is not always the best strategy; its suitability depends on the environmental conditions and the operator's ability to understand how gas disperses.
Multispectral Palmprint Encoding and Recognition
Palmprints are emerging as a new entity in multi-modal biometrics for human
identification and verification. Multispectral palmprint images captured in the
visible and infrared spectrum not only contain the wrinkles and ridge structure
of a palm, but also the underlying pattern of veins; making them a highly
discriminating biometric identifier. In this paper, we propose a feature
encoding scheme for robust and highly accurate representation and matching of
multispectral palmprints. To facilitate compact storage of the feature, we
design a binary hash table structure that allows for efficient matching in
large databases. Comprehensive experiments for both identification and
verification scenarios are performed on two public datasets -- one captured
with a contact-based sensor (PolyU dataset), and the other with a contact-free
sensor (CASIA dataset). Recognition results in various experimental setups show
that the proposed method consistently outperforms existing state-of-the-art
methods. Error rates achieved by our method (0.003% on PolyU and 0.2% on CASIA)
are the lowest reported in the literature on both datasets and clearly indicate
the viability of the palmprint as a reliable and promising biometric. All source
codes are publicly available.
Comment: A preliminary version of this manuscript was published in ICCV 2011:
Z. Khan, A. Mian and Y. Hu, "Contour Code: Robust and Efficient Multispectral
Palmprint Encoding for Human Recognition", International Conference on
Computer Vision, 2011. MATLAB code available at:
https://sites.google.com/site/zohaibnet/Home/code
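Matching binary feature codes such as those produced by the proposed encoding typically reduces to nearest-neighbor search under Hamming distance. The sketch below illustrates that general idea only; it is not the paper's hash-table structure, and the function name and code lengths are assumptions.

```python
import numpy as np

def hamming_match(probe, gallery):
    """Match a binary probe code against a gallery of binary codes.

    Returns (best_index, distance), where distance is the normalized
    Hamming distance, i.e. the fraction of differing bits.
    (Illustrative of binary-code matching in general.)
    """
    dists = np.mean(gallery != probe, axis=1)
    best = int(np.argmin(dists))
    return best, float(dists[best])

# usage: 4 gallery codes of 16 bits, probe is a noisy copy of code 2
rng = np.random.default_rng(1)
gallery = rng.integers(0, 2, size=(4, 16))
probe = gallery[2].copy()
probe[0] ^= 1                                # flip one bit
idx, d = hamming_match(probe, gallery)
```

Because the codes are binary, such comparisons can be implemented with XOR and popcount, which is what makes matching in large databases fast.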