3,025 research outputs found
Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.Comment: Computer Vision and Pattern Recognition (CVPR) 201
An original framework for understanding human actions and body language by using deep neural networks
The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour.
By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way.
These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively.
While the processing of body movements play a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements;
both are essential tasks in many computer vision applications, including event recognition, and video surveillance.
In this Ph.D. thesis, an original framework for understanding Actions and body language is presented. The framework is composed of three main modules: in the first one, a Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) based method for the Recognition of Sign Language and Semaphoric Hand Gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided.
The performances of RNN-LSTMs are explored in depth, due to their ability to model the long term contextual information of temporal sequences, making them suitable for analysing body movements.
All the modules were tested by using challenging datasets, well known in the state of the art, showing remarkable results compared to the current literature methods
AFFECT-PRESERVING VISUAL PRIVACY PROTECTION
The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications other than security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasing used for observations in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concern, but they also obliterate important visual cues of affect and social behaviors that are crucial for the target applications. In this dissertation, we propose to balance the privacy protection and the utility of the data by preserving the privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding.
The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating facial image and body shape of individuals, which: (1) is able to conceal the identity of individuals; (2) provide a way to preserve the utility of the data, such as expression and pose information; (3) balance the utility of the data and capacity of the privacy protection.
The Broader Impacts of the dissertation focus on the significance of privacy protection on visual data, and the inadequacy of current privacy enhancing technologies in preserving affect and behavioral attributes of the visual content, which are highly useful for behavior observation in educational and medical settings. This work in this dissertation represents one of the first attempts in achieving both goals simultaneously
Physical Adversarial Attacks for Surveillance: A Survey
Modern automated surveillance techniques are heavily reliant on deep learning
methods. Despite the superior performance, these learning systems are
inherently vulnerable to adversarial attacks - maliciously crafted inputs that
are designed to mislead, or trick, models into making incorrect predictions. An
adversary can physically change their appearance by wearing adversarial
t-shirts, glasses, or hats or by specific behavior, to potentially avoid
various forms of detection, tracking and recognition of surveillance systems;
and obtain unauthorized access to secure properties and assets. This poses a
severe threat to the security and safety of modern surveillance systems. This
paper reviews recent attempts and findings in learning and designing physical
adversarial attacks for surveillance applications. In particular, we propose a
framework to analyze physical adversarial attacks and provide a comprehensive
survey of physical adversarial attacks on four key surveillance tasks:
detection, identification, tracking, and action recognition under this
framework. Furthermore, we review and analyze strategies to defend against the
physical adversarial attacks and the methods for evaluating the strengths of
the defense. The insights in this paper present an important step in building
resilience within surveillance systems to physical adversarial attacks
Intelligent Sensors for Human Motion Analysis
The book, "Intelligent Sensors for Human Motion Analysis," contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems
Towards Reversible De-Identification in Video Sequences Using 3D Avatars and Steganography
We propose a de-identification pipeline that protects the privacy of humans
in video sequences by replacing them with rendered 3D human models, hence
concealing their identity while retaining the naturalness of the scene. The
original images of humans are steganographically encoded in the carrier image,
i.e. the image containing the original scene and the rendered 3D human models.
We qualitatively explore the feasibility of our approach, utilizing the Kinect
sensor and its libraries to detect and localize human joints. A 3D avatar is
rendered into the scene using the obtained joint positions, and the original
human image is steganographically encoded in the new scene. Our qualitative
evaluation shows reasonably good results that merit further exploration.Comment: Part of the Proceedings of the Croatian Computer Vision Workshop,
CCVW 2015, Year
- …