3,145 research outputs found

Deep Visual Attention Prediction

Author: Shen Jianbing
Wang Wenguan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/03/2018

In this work, we aim to predict human eye fixations in view-free scenes with an end-to-end deep learning architecture. Although Convolutional Neural Networks (CNNs) have substantially improved human attention prediction, CNN-based attention models still need to leverage multi-scale features more efficiently. Our visual attention network is designed to capture hierarchical saliency information, from deep, coarse layers carrying global saliency information to shallow, fine layers carrying local saliency responses. The model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with different receptive fields. The final saliency prediction is obtained by combining these global and local predictions. The model is trained with deep supervision: supervision is fed directly into layers at multiple levels, whereas previous approaches provide supervision only at the output layer and propagate it back to earlier layers. The model thus incorporates multi-level saliency predictions within a single network, which significantly reduces the redundancy of previous approaches that learn multiple network streams with different input scales. Extensive experimental analysis on various challenging benchmark datasets demonstrates that our method yields state-of-the-art performance with competitive inference time.

Comment: W. Wang and J. Shen. Deep visual attention prediction. IEEE TIP, 27(5):2368-2378, 2018. Code and results can be found in https://github.com/wenguanwang/deepattentio
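The deep-supervision idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of binary cross-entropy, and the weighted-average fusion rule are all assumptions made for illustration, and saliency maps are flattened to plain lists.

```python
import math

def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)

def fuse(side_outputs, weights):
    """Weighted average of same-sized side-output maps (hypothetical fusion rule)."""
    norm = sum(weights)
    size = len(side_outputs[0])
    return [
        sum(w * s[i] for w, s in zip(weights, side_outputs)) / norm
        for i in range(size)
    ]

def deeply_supervised_loss(side_outputs, target, weights):
    """Sum of one loss per side output plus the loss of the fused prediction.

    Every level receives its own supervision term, instead of relying only on
    the gradient that flows back from the final output.
    """
    side_losses = [bce(s, target) for s in side_outputs]
    fused = fuse(side_outputs, weights)
    return sum(side_losses) + bce(fused, target), fused

# Two hypothetical side outputs (e.g. a coarse and a fine layer), two pixels each.
loss, fused = deeply_supervised_loss([[0.9, 0.1], [0.8, 0.2]], [1.0, 0.0], [1.0, 1.0])
```

In this sketch each side output contributes its own loss term, so a poor prediction at any level is penalized directly.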

arXiv.org e-Print Archive

Crossref

STA: Spatial-Temporal Attention for Large-Scale Video-based Person Re-Identification

Author: Fu Yang
Huang Thomas
Wang Xiaoyang
Wei Yunchao
Publication venue: 'Association for the Advancement of Artificial Intelligence (AAAI)'
Publication date: 09/11/2018

In this work, we propose a novel Spatial-Temporal Attention (STA) approach to tackle the large-scale person re-identification task in videos. Unlike most existing methods, which simply compute representations of video clips using frame-level aggregation (e.g. average pooling), the proposed STA produces a more robust clip-level feature representation. Concretely, STA fully exploits the discriminative parts of a target person in both the spatial and temporal dimensions, producing, via inter-frame regularization, a 2-D attention score matrix that measures the importance of each spatial part across different frames. A more robust clip-level feature representation is then generated by a weighted sum guided by this 2-D attention score matrix. In this way, STA handles challenging cases for video-based person re-identification, such as pose variation and partial occlusion. We conduct extensive experiments on two large-scale benchmarks, MARS and DukeMTMC-VideoReID. In particular, the mAP reaches 87.7% on MARS, outperforming the state of the art by a large margin of more than 11.6%.

Comment: Accepted as a conference paper at AAAI 201
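The weighted-sum aggregation described in this abstract can be sketched as follows. This is an illustrative reading, not the STA implementation: the function names are hypothetical, and normalizing each spatial part's scores so that they sum to 1 over the frames is an assumption about how the 2-D attention matrix is turned into weights.

```python
def normalize_over_frames(scores):
    """scores[t][k] >= 0 is the raw attention of spatial part k in frame t.

    Each column (one spatial part) is rescaled to sum to 1 over the frames.
    This normalization is an assumption made for the sketch.
    """
    n_frames, n_parts = len(scores), len(scores[0])
    col_sums = [sum(scores[t][k] for t in range(n_frames)) for k in range(n_parts)]
    return [
        [
            scores[t][k] / col_sums[k] if col_sums[k] > 0 else 1.0 / n_frames
            for k in range(n_parts)
        ]
        for t in range(n_frames)
    ]

def clip_feature(part_feats, scores):
    """Attention-weighted sum over frames, concatenated over spatial parts.

    part_feats[t][k] is the feature vector (a list) of part k in frame t.
    """
    attn = normalize_over_frames(scores)
    n_frames, n_parts = len(attn), len(attn[0])
    dim = len(part_feats[0][0])
    clip = []
    for k in range(n_parts):
        pooled = [
            sum(attn[t][k] * part_feats[t][k][d] for t in range(n_frames))
            for d in range(dim)
        ]
        clip.extend(pooled)
    return clip

# Two frames, two spatial parts, 1-D features. Part 1 is ignored in frame 0,
# as it might be if that part were occluded there.
feats = [[[1.0], [2.0]], [[3.0], [4.0]]]
scores = [[1.0, 0.0], [1.0, 2.0]]
clip = clip_feature(feats, scores)
```

With uniform scores this reduces to the average pooling that the abstract contrasts against. Non-uniform scores let a part that is occluded in some frames draw its feature mainly from the frames where it is visible.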

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Mathematical models for somite formation

Author: Armstrong
Aulehla
Baker
Chernoff
Collier
Cooke
Dale
del Barco Barrantes
Dequéant
Diez del Corral
Drake
Duband
Dubrulle
Duguay
Foty
Galli
Giudicelli
Goldbeter
Gossler
Grima
Haraguichi
Hirata
Horikawa
Ishikawa
Jiang
Kalcheim
Kulesa
Lewis
Maroto
McGrew
McInerney
Meier
Molotkova
Monk
Moreno
Morimoto
Packard
Palmeirim
Pourquié
Primmett
Rida
Saga
Schnell
Stockdale
Tabin
Takahashi
Takashi
Tam
Turner
Publication date: 01/01/2007

Somitogenesis is the process by which the anterior–posterior axis of the vertebrate embryo is divided into similar morphological units known as somites. These segments generate the prepattern that guides formation of the vertebrae, ribs and other associated features of the body trunk. In this work, we review and discuss a series of mathematical models that account for different stages of somite formation. We begin by presenting current experimental information and mechanisms explaining somite formation, highlighting the features that will be included in the models. For each model we outline the mathematical basis, show results of numerical simulations, and discuss its successes, shortcomings and avenues for future exploration. We conclude with a brief discussion of the state of modeling in the field and the current challenges that must be overcome to further our understanding in this area.

Crossref

Oxford University Research Archive

Archivo Digital UPM (Univ. Politécnica de Madrid)

Multi-scale Discriminant Saliency with Wavelet-based Hidden Markov Tree Modelling

Author: Ang Kenneth Li-Minn
Ngo Anh Cat Le
Qiu Guoping
Seng Jasmine Kah-Phooi
Publication date: 06/06/2013

Bottom-up saliency, an early stage of human visual attention, can be considered as a binary classification problem between centre and surround classes. The discriminant power of features for this classification is measured as the mutual information between the distributions of image features and the corresponding classes. As the estimated discrepancy depends strongly on the scale level considered, multi-scale structure and discriminant power are integrated by employing discrete wavelet features and a Hidden Markov Tree (HMT). With the wavelet coefficients and Hidden Markov Tree parameters, quad-tree-like label structures are constructed and used in maximum a posteriori (MAP) estimation of the hidden class variables at the corresponding dyadic sub-squares. Then, a saliency value for each square block at each scale level is computed using the discriminant power principle. Finally, the saliency values are integrated across multiple scales into the final saliency map by an information maximization rule. Both standard quantitative tools, such as NSS, LCC and AUC, and qualitative assessments are used to evaluate the proposed multi-scale discriminant saliency (MDIS) method against the well-known information-based approach AIM on its released image collection with eye-tracking data. Simulation results are presented and analysed to verify the validity of MDIS and to point out its limitations for further research.

Comment: arXiv admin note: substantial text overlap with arXiv:1301.396
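The discriminant-power measure mentioned in this abstract, mutual information between a feature and the centre/surround class, can be sketched for the simplest case of a discretized feature and empirical counts. This is only an illustration of the quantity itself. It does not reproduce the wavelet or Hidden Markov Tree machinery of MDIS, and the function name and toy data are assumptions.

```python
import math
from collections import Counter

def mutual_information(features, labels):
    """Empirical mutual information I(X; Y) in bits.

    features: discrete feature values X, one per sample.
    labels:   class labels Y (e.g. "centre" or "surround"), one per sample.
    """
    n = len(features)
    joint = Counter(zip(features, labels))
    p_x = Counter(features)
    p_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi

labels = ["centre", "centre", "surround", "surround"]

# A feature that separates the two classes perfectly carries 1 bit.
discriminative = mutual_information([0, 0, 1, 1], labels)

# A feature that is independent of the class carries 0 bits.
uninformative = mutual_information([0, 1, 0, 1], labels)
```

A feature whose distribution differs sharply between centre and surround scores high and marks the location as salient. A feature distributed identically in both classes scores zero.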

arXiv.org e-Print Archive

Crossref

DRO Deakin Research Online

Research Online @ ECU