Pedestrian Attribute Recognition: A Survey
Recognizing pedestrian attributes is an important task in the computer vision
community because it plays a key role in video surveillance, and many
algorithms have been proposed to handle it. The goal of this paper is to
review existing works, covering both traditional methods and those based on
deep learning networks. Firstly, we introduce the background of pedestrian
attribute recognition (PAR, for short), including the fundamental concepts of
pedestrian attributes and the corresponding challenges. Secondly, we introduce
existing benchmarks, including popular datasets and evaluation criteria.
Thirdly, we analyse the concepts of multi-task learning and multi-label
learning, and explain the relations between these two learning paradigms and
pedestrian attribute recognition; we also review some popular network
architectures that have been widely applied in the deep learning community.
Fourthly, we analyse popular solutions for this task, such as attribute
grouping, part-based models, \emph{etc}. Fifthly, we show some applications
that take pedestrian attributes into consideration and achieve better
performance. Finally, we summarize this paper and give several possible
research directions for pedestrian attribute recognition. The project page of
this paper can be found at
\url{https://sites.google.com/view/ahu-pedestrianattributes/}.
Comment: Check our project page for a high-resolution version of this survey:
https://sites.google.com/view/ahu-pedestrianattributes
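The survey's connection between pedestrian attribute recognition and multi-label learning can be made concrete with a small sketch: each attribute gets its own independent sigmoid output, trained with binary cross-entropy, and predictions are obtained by thresholding each attribute separately. This is a generic illustration, not code from the survey; the attribute names and values below are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multi_label_bce(logits, targets):
    """Mean binary cross-entropy over all attributes of all samples."""
    p = sigmoid(logits)
    eps = 1e-12  # guard against log(0)
    return float(np.mean(-(targets * np.log(p + eps)
                           + (1 - targets) * np.log(1 - p + eps))))

# Two pedestrians scored on three hypothetical attributes,
# e.g. [male, backpack, long_hair]
logits = np.array([[ 2.0, -1.0,  0.5],
                   [-0.5,  3.0, -2.0]])
targets = np.array([[1, 0, 1],
                    [0, 1, 0]], dtype=float)

loss = multi_label_bce(logits, targets)
# Attributes are thresholded independently, unlike single-label softmax
preds = (sigmoid(logits) > 0.5).astype(int)
```

The key difference from multi-class classification is that several attributes may fire at once for the same pedestrian, which is why per-attribute sigmoids replace a single softmax.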
Person Re-identification by Articulated Appearance Matching
Re-identification of pedestrians in video-surveillance settings can be effectively approached by treating each human figure as an articulated body, whose pose is estimated through the framework of Pictorial Structures (PS). In this way, we can focus selectively on similarities between the appearance of body parts to recognize a previously seen individual. In fact, this strategy resembles what humans employ to solve the same task in the absence of facial details or other reliable biometric information. Based on these insights, we show how to perform single-image re-identification by matching signatures coming from articulated appearances, and how to strengthen this process in multi-shot re-identification by using Custom Pictorial Structures (CPS) to produce improved body localizations and appearance signatures. Moreover, we provide a complete and detailed breakdown of the system that surrounds these core procedures, with several novel arrangements devised for efficiency and flexibility. Finally, we test our approach on several public benchmarks, obtaining convincing results.
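The core matching step described above can be sketched as follows: once a pose estimate localizes body parts, each part yields an appearance signature, and two pedestrians are compared by accumulating per-part distances. This is a hedged toy illustration, not the paper's pipeline; the histogram signature and L1 distance here are stand-ins for the actual descriptors, and all names are hypothetical.

```python
import numpy as np

def part_histogram(pixels, bins=8):
    """Normalized intensity histogram as a toy per-part signature."""
    h, _ = np.histogram(pixels, bins=bins, range=(0.0, 1.0))
    return h / max(h.sum(), 1)

def match_distance(parts_a, parts_b):
    """Sum of L1 distances between corresponding part signatures."""
    return sum(float(np.abs(part_histogram(a) - part_histogram(b)).sum())
               for a, b in zip(parts_a, parts_b))

rng = np.random.default_rng(0)
# Three synthetic "parts" (e.g. head, torso, legs) as flat pixel samples
person = [rng.random(200) for _ in range(3)]
# Same person, slightly perturbed appearance
same = [p + rng.normal(0.0, 0.01, p.shape) for p in person]
# A differently distributed impostor
other = [rng.random(200) ** 2 for _ in range(3)]

d_same = match_distance(person, same)
d_other = match_distance(person, other)
```

Comparing parts separately, rather than whole silhouettes, is what lets the approach tolerate pose changes: each part's signature is extracted from its own localized region.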
On Symbiosis of Attribute Prediction and Semantic Segmentation
In this paper, we propose to employ semantic segmentation to improve
person-related attribute prediction. The core idea lies in the fact that the
probability of an attribute appearing in an image is far from uniform in
the spatial domain. We build our attribute prediction model jointly with a deep
semantic segmentation network. This harnesses the localization cues learned by
the semantic segmentation to guide the attention of the attribute prediction to
the regions where different attributes naturally show up. Therefore, in
addition to prediction, we are able to localize the attributes despite merely
having access to image-level labels (weak supervision) during training. We
first propose semantic segmentation-based pooling and gating, respectively
denoted as SSP and SSG. In the former, the estimated segmentation masks are
used to pool the final activations of the attribute prediction network, from
multiple semantically homogeneous regions. In SSG, the same idea is applied to
the intermediate layers of the network. SSP and SSG, while effective, impose
heavy memory utilization since each channel of the activations is pooled/gated
with all the semantic segmentation masks. To circumvent this, we propose
Symbiotic Augmentation (SA), where we learn only one mask per activation
channel. SA allows the model to either pick one, or combine (weighted
superposition) multiple semantic maps, in order to generate the proper mask for
each channel. SA simultaneously applies the same mechanism to the reverse
problem by leveraging output logits of attribute prediction to guide the
semantic segmentation task. We evaluate our proposed methods on the CelebA and
LFWA datasets for facial attributes, and on WIDER Attribute and Berkeley
Attributes of People for whole-body attributes. Our proposed methods achieve
superior results compared to previous works.
Comment: Accepted for publication in PAMI. arXiv admin note: substantial text
overlap with arXiv:1704.0874
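The Symbiotic Augmentation idea described in the abstract, learning one mask per activation channel as a weighted superposition of the semantic maps instead of gating every channel with every mask, can be sketched in a few lines. This is an illustrative reconstruction from the abstract only: the shapes, the softmax weighting, and all names are assumptions, and the reverse direction (attribute logits guiding segmentation) is omitted.

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def per_channel_masks(semantic_maps, channel_weights):
    """semantic_maps: (K, H, W); channel_weights: (C, K) -> masks: (C, H, W).

    Each channel's softmax weights either pick one semantic map or blend
    several into a single gating mask for that channel.
    """
    alphas = np.stack([softmax(w) for w in channel_weights])   # (C, K)
    return np.einsum('ck,khw->chw', alphas, semantic_maps)

K, C, H, W = 3, 4, 5, 5
rng = np.random.default_rng(1)
maps = rng.random((K, H, W))          # K estimated segmentation maps
weights = np.zeros((C, K))
weights[0, 2] = 10.0                  # channel 0 nearly picks map 2 alone
masks = per_channel_masks(maps, weights)

activations = rng.random((C, H, W))   # stand-in attribute-network features
gated = activations * masks           # one mask per channel, as in SA
```

Note the memory saving relative to SSP/SSG as described: instead of C x K pooled/gated combinations, each channel stores only K mixing weights and applies a single (H, W) mask.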
Automatic Multilevel Feature Abstraction in Adaptable Machine Vision Systems
Vision is a complex task which can be accomplished with apparent ease by biological systems, but for which the design of artificial systems is difficult. Although machine vision systems can be successfully designed for a specific task, under certain conditions, they are likely to fail if circumstances change. This was the motivation for the research into ways in which systems can be self-designing and adaptable to new visual tasks. The research was conducted in three vital areas of concern for machine vision systems.
The first area is finding a suitable architecture for forming an appropriate representation for the current task. The research investigated the application of Hypernetworks theory to building a multilevel, generally-applicable representation, through repeated application of a fundamental 'self-similarity' principle, that parts of objects assembled under a particular relation at one level, form whole objects at the next. Results show that this is potentially a powerful approach for autonomously generating an adaptable system-architecture suitable for multiple visual tasks.
The second area is the autonomous extraction of suitable low-level features, which the research investigated through random generation of minimally-constrained pixel-configurations and algorithmic generation of homogeneous and heterogeneous polygons. The results suggest that, despite the simplicity of the features making them vulnerable to image transformations, these are promising approaches worth developing further.
The third area is automatic feature selection. The research explored management of 'dimensionality' and of 'combinatorial explosion', as well as how to locate relevant features at multiple representation levels, in the context of 'emergence' of structure. Results indicate that this approach can find useful 'intermediate-level' constructs through analysis of the connectivity of the simplices representing objects at higher levels.
The research concludes that the proposed novel approaches to tackling the above issues, in particular the application of hypernetworks to the formation of multilevel representations and the resulting emergence of higher-level structure, are fruitful.