Distraction-Aware Feature Learning for Human Attribute Recognition via Coarse-to-Fine Attention Mechanism
Recently, Human Attribute Recognition (HAR) has become a hot topic due to its
scientific challenges and application potentials, where localizing attributes
is a crucial stage but not well handled. In this paper, we propose a novel deep
learning approach to HAR, namely Distraction-aware HAR (Da-HAR). It enhances
deep CNN feature learning by improving attribute localization through a
coarse-to-fine attention mechanism. At the coarse step, a self-mask block is
built to roughly discriminate and reduce distractions, while at the fine step,
a masked attention branch is applied to further eliminate irrelevant regions.
Thanks to this mechanism, feature learning is more accurate, especially when
heavy occlusions and complex backgrounds exist. Extensive experiments are
conducted on the WIDER-Attribute and RAP databases, and state-of-the-art
results are achieved, demonstrating the effectiveness of the proposed approach.
Comment: 8 pages, 5 figures, accepted by AAAI-20 as an oral presentation
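The coarse-to-fine idea in the abstract above can be illustrated with a toy sketch. It is not the Da-HAR architecture: the function names, the sigmoid mask, the 0.5 threshold, and the use of flat lists of per-location scores are all assumptions made for illustration. The coarse step turns raw relevance scores into a soft mask, and the fine step runs softmax attention only over locations the mask keeps.

```python
import math

def coarse_mask(scores):
    """Coarse step (assumed form): squash raw relevance scores into a soft mask in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

def masked_attention_pool(features, scores, mask, threshold=0.5):
    """Fine step (assumed form): softmax attention restricted to locations whose
    coarse mask exceeds `threshold`; returns the attention-weighted mean feature."""
    keep = [i for i, m in enumerate(mask) if m > threshold]
    if not keep:  # nothing survived the coarse step: fall back to all locations
        keep = list(range(len(features)))
    top = max(scores[i] for i in keep)  # subtract the max for numerical stability
    exps = {i: math.exp(scores[i] - top) for i in keep}
    norm = sum(exps.values())
    dim = len(features[0])
    return [sum(exps[i] / norm * features[i][d] for i in keep) for d in range(dim)]

# Three locations with 2-D features; the third one is a low-scoring distraction.
features = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
scores = [2.0, 2.0, -3.0]
pooled = masked_attention_pool(features, scores, coarse_mask(scores))
```

In this toy case the third location is dropped by the coarse mask, so the pooled feature is an equal blend of the first two locations.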
Modeling Pedestrian and Bicyclist Crash Exposure with Location-Based Service Data
The rising popularity of non-motorized transportation has brought about safety risks. The inclusion of accurate traffic volume information is one of the key elements to produce robust outcomes when researching safety. The project investigates and enhances transportation safety for pedestrians and bicyclists, emphasizing the integration of Location-Based Services (LBS) data, particularly StreetLight data, to analyze traffic volume and associated risks. The calibration process demonstrates the significance of traffic volume as a key variable influencing prediction accuracy across pedestrian, bicyclist, and vehicle models. The crash analysis reveals a strong correlation between crash counts and traffic activity, with noteworthy findings emerging at the facility scale. The safety ranking analysis identifies higher and lower risk areas in the city. The spatial and temporal analysis at the street segment level highlights changes in traffic volumes before and during the COVID-19 period, along with distinct geographic patterns of activities at downtown and recreational locations. The project contributes to the knowledge and methodology surrounding transportation safety for non-motorized road users. By leveraging LBS data, the research provides comprehensive analyses of traffic patterns and safety considerations, aiming to create a safer environment for pedestrians and bicyclists
Soft Biometric Analysis: Multi-Person and Real-Time Pedestrian Attribute Recognition in Crowded Urban Environments
Traditionally, recognition systems were based only on human hard biometrics. However,
the ubiquity of CCTV cameras has raised the desire to analyze human biometrics from
far distances, without people's participation in the acquisition process. High-resolution
face close-shots are rarely available at far distances, so face-based systems cannot
provide reliable results in surveillance applications. Human soft biometrics such as body
and clothing attributes are believed to be more effective in analyzing human data collected
by security cameras.
This thesis contributes to human soft biometric analysis in uncontrolled environments
and mainly focuses on two tasks: Pedestrian Attribute Recognition (PAR) and person
re-identification (re-id).
We first review the literature of both tasks and highlight the history
of advancements, recent developments, and the existing benchmarks. PAR and person re-id
difficulties are due to significant distances between intra-class samples, which originate
from variations in several factors such as body pose, illumination, background, occlusion,
and data resolution. Recent state-of-the-art approaches present end-to-end models that
can extract discriminative and comprehensive feature representations from people.
Modeling the correlation between different regions of the body and dealing with limited
learning data are also objectives of many recent works. Moreover, class imbalance and
correlation between human attributes are specific challenges associated with the PAR problem.
We collect a large surveillance dataset to train a novel gender recognition model suitable
for uncontrolled environments. We propose a deep residual network that extracts several
pose-wise patches from samples and obtains a comprehensive feature representation. In
the next step, we develop a model that recognizes multiple attributes at once. To account
for the correlation between human semantic attributes and for class imbalance, we use a
multi-task model and a weighted loss function, respectively. We also propose a multiplication
layer on top of the backbone feature extraction layers to exclude the background features
from the final representation of samples and draw the attention of the model to the
foreground area.
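As a rough illustration of a weighted loss for imbalanced attributes, the sketch below uses a weighted binary cross-entropy. The abstract does not say which weighting the thesis uses. The exponential weights based on each attribute's positive ratio are one scheme used in the PAR literature and are an assumption here, as are all function names.

```python
import math

def positive_ratios(labels):
    """Fraction of positive samples per attribute; `labels` is a list of 0/1 rows."""
    n = len(labels)
    return [sum(row[j] for row in labels) / n for j in range(len(labels[0]))]

def weighted_bce(probs, targets, ratios, eps=1e-7):
    """Mean weighted binary cross-entropy over the attributes of one sample.
    Assumed weighting: exp(1 - r) for positive targets and exp(r) for negative
    targets, where r is the attribute's positive ratio, so that errors on the
    rarer class of each attribute cost more."""
    total = 0.0
    for p, y, r in zip(probs, targets, ratios):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        weight = math.exp(1.0 - r) if y == 1 else math.exp(r)
        total += -weight * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Two attributes, each positive in one of four training samples.
ratios = positive_ratios([[1, 0], [0, 0], [0, 0], [0, 1]])
missed_positive = weighted_bce([0.1], [1], ratios[:1])
missed_negative = weighted_bce([0.9], [0], ratios[:1])
```

With a positive ratio of 0.25, an equally confident mistake on a positive label is penalized more heavily than one on a negative label.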
We address the problem of person re-id by implicitly defining the receptive fields of
deep learning classification frameworks. The receptive fields of deep learning models
determine the most significant regions of the input data for providing correct decisions.
Therefore, we synthesize a set of learning data in which the destructive regions (e.g.,
background) in each pair of instances are interchanged. A segmentation module
determines destructive and useful regions in each sample, and the labels of synthesized
instances are inherited from the sample that shared the useful regions in the synthesized
image. The synthesized learning data are then used in the learning phase and help
the model rapidly learn that the identity and background regions are not correlated.
Meanwhile, the proposed solution could be seen as a data augmentation approach that
fully preserves the label information and is compatible with other data augmentation
techniques.
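The background interchange described above can be sketched as follows. This is a minimal illustration, not the thesis pipeline: images are small 2-D lists of pixel values, the person masks are assumed to be given (standing in for the segmentation module), both images are assumed to have the same size, and the function names are hypothetical. Pixels outside the kept person are copied wholesale from the other image, so the other person's pixels can leak into the new background.

```python
def swap_background(image, person_mask, other_image):
    """Keep the pixels of `image` where person_mask == 1 and take every
    other pixel from `other_image` (same height and width assumed)."""
    return [
        [pix if m == 1 else other for pix, m, other in zip(row, mask_row, other_row)]
        for row, mask_row, other_row in zip(image, person_mask, other_image)
    ]

def synthesize_pair(sample_a, sample_b):
    """Each sample is (image, person_mask, identity_label). The synthesized
    samples inherit the label of the image that contributed the person region."""
    img_a, mask_a, id_a = sample_a
    img_b, mask_b, id_b = sample_b
    new_a = (swap_background(img_a, mask_a, img_b), mask_a, id_a)
    new_b = (swap_background(img_b, mask_b, img_a), mask_b, id_b)
    return new_a, new_b

sample_a = ([[1, 1], [1, 1]], [[1, 0], [0, 1]], "person_a")
sample_b = ([[9, 9], [9, 9]], [[0, 1], [0, 0]], "person_b")
new_a, new_b = synthesize_pair(sample_a, sample_b)
```

Because only the non-person pixels change, the identity label of each synthesized sample is unchanged, which is why the abstract can describe the approach as label-preserving augmentation.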
When re-id methods are learned in scenarios where the target person appears with
identical garments in the gallery, the visual appearance of clothes is given the most
importance in the final feature representation. Cloth-based representations are not
reliable in long-term re-id settings, as people may change their clothes. Therefore,
solutions that ignore clothing cues and focus on identity-relevant features are
in demand. We transform the original data such that the identity-relevant information of
people (e.g., face and body shape) is removed, while the identity-unrelated cues (i.e.,
color and texture of clothes) remain unchanged. A model learned on the synthesized
dataset predicts the identity-unrelated cues (short-term features). Therefore, we train a
second model that is coupled with the first model and learns the embeddings of the original
data such that the similarity between the embeddings of the original and synthesized data is
minimized. This way, the second model predicts based on the identity-related (long-term)
representation of people.
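The coupling between the two models can be pictured as a penalty on embedding similarity. The abstract does not name the similarity measure. The sketch below assumes cosine similarity clamped at zero, so that pairs which are already dissimilar add no penalty. The function names are hypothetical.

```python
import math

def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)  # eps guards against zero vectors

def similarity_penalty(long_term_emb, short_term_emb):
    """Assumed loss term: positive cosine similarity between the second model's
    embedding (original image) and the first model's embedding (synthesized,
    clothing-only image); zero once the two are orthogonal or opposed."""
    return max(0.0, cosine_similarity(long_term_emb, short_term_emb))

aligned = similarity_penalty([1.0, 0.0], [1.0, 0.0])     # same direction
orthogonal = similarity_penalty([1.0, 0.0], [0.0, 1.0])  # unrelated directions
```

Minimizing such a term alongside the usual identity loss pushes the second model's embedding away from the clothing-driven embedding of the first model.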
To evaluate the performance of the proposed models, we use PAR and person re-id
datasets, namely BIODI, PETA, RAP, Market-1501, MSMT-V2, PRCC, LTCC, and MIT,
and compare our experimental results with state-of-the-art methods in the field.
In conclusion, the data collected from surveillance cameras have low resolution, such
that the extraction of hard biometric features is not possible, and face-based approaches
produce poor results. In contrast, soft biometrics are robust to variations in data quality.
So, we propose approaches for both PAR and person re-id to learn discriminative features
from each instance and evaluate our proposed solutions on several publicly available
benchmarks.
This thesis was prepared at the University of Beira Interior, IT Instituto de Telecomunicações, Soft Computing and Image Analysis Laboratory (SOCIA Lab), Covilhã Delegation, and was submitted to the University of Beira Interior for defense in a public examination session
Proceedings of the 2nd IUI Workshop on Interacting with Smart Objects
These are the Proceedings of the 2nd IUI Workshop on Interacting with Smart Objects. Objects that we use in our everyday life are expanding their restricted interaction capabilities and provide functionalities that go far beyond their original functionality. They feature computing capabilities and are thus able to capture information, process and store it and interact with their environments, turning them into smart objects