Semantic Activity Recognition
Extracting semantics automatically from visual data is a real challenge. We describe in this paper how recent work in cognitive vision leads to significant results in activity recognition for visual surveillance and video monitoring. In particular, we present work performed in the domain of video understanding in our PULSAR team at INRIA in Sophia Antipolis. Our main objective is to analyse in real time video streams captured by static video cameras and to recognize their semantic content. We present a cognitive vision approach mixing 4D computer vision techniques and activity recognition based on a priori knowledge. Applications in visual surveillance and healthcare monitoring are shown. We conclude with current issues in cognitive vision for activity recognition.
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
Research on depth-based human activity analysis achieved outstanding
performance and demonstrated the effectiveness of 3D representation for action
recognition. The existing depth-based and RGB+D-based action recognition
benchmarks have a number of limitations, including the lack of large-scale
training samples, realistic number of distinct class categories, diversity in
camera views, varied environmental conditions, and variety of human subjects.
In this work, we introduce a large-scale dataset for RGB+D human action
recognition, which is collected from 106 distinct subjects and contains more
than 114 thousand video samples and 8 million frames. This dataset contains 120
different action classes including daily, mutual, and health-related
activities. We evaluate the performance of a series of existing 3D activity
analysis methods on this dataset, and show the advantage of applying deep
learning methods for 3D-based human action recognition. Furthermore, we
investigate a novel one-shot 3D activity recognition problem on our dataset,
and a simple yet effective Action-Part Semantic Relevance-aware (APSR)
framework is proposed for this task, which yields promising results for
recognition of the novel action classes. We believe the introduction of this
large-scale dataset will enable the community to apply, adapt, and develop
various data-hungry learning techniques for depth-based and RGB+D-based human
activity understanding. [The dataset is available at:
http://rose1.ntu.edu.sg/Datasets/actionRecognition.asp]
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
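The abstract does not spell out how the APSR framework matches a query to a novel class, but the one-shot setting it describes is commonly handled by comparing a query embedding against one exemplar embedding per novel class. A minimal sketch of that baseline, with purely hypothetical embeddings and class names:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def one_shot_classify(query, exemplars):
    """Assign the query to the class of its most similar exemplar."""
    return max(exemplars, key=lambda c: cosine(query, exemplars[c]))

# Hypothetical embeddings for two novel action classes (one exemplar each).
exemplars = {
    "drink water": [0.9, 0.1, 0.0],
    "throw": [0.1, 0.8, 0.3],
}
print(one_shot_classify([0.85, 0.2, 0.05], exemplars))  # → drink water
```

This is only the generic nearest-exemplar baseline; the paper's APSR framework additionally exploits semantic relevance between actions and body parts, which is not modeled here.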
Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints
Language is increasingly being used to define rich visual recognition
problems with supporting image collections sourced from the web. Structured
prediction models are used in these tasks to take advantage of correlations
between co-occurring labels and visual input but risk inadvertently encoding
social biases found in web corpora. In this work, we study data and models
associated with multilabel object classification and visual semantic role
labeling. We find that (a) datasets for these tasks contain significant gender
bias and (b) models trained on these datasets further amplify existing bias.
For example, the activity cooking is over 33% more likely to involve females
than males in a training set, and a trained model further amplifies the
disparity to 68% at test time. We propose to inject corpus-level constraints
for calibrating existing structured prediction models and design an algorithm
based on Lagrangian relaxation for collective inference. Our method results in
almost no performance loss for the underlying recognition task but decreases
the magnitude of bias amplification by 47.5% and 40.5% for multilabel
classification and visual semantic role labeling, respectively.
Comment: 11 pages, published in EMNLP 2017
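The bias statistic described above (a gender ratio measured on the training set versus on model predictions) can be sketched directly. The counts below are toy values chosen for illustration, not the paper's data:

```python
def gender_ratio(pairs, activity, gender="female"):
    """Fraction of instances of `activity` whose agent is `gender`."""
    genders = [g for a, g in pairs if a == activity]
    return sum(g == gender for g in genders) / len(genders)

# Toy (activity, agent-gender) corpora — hypothetical counts:
train = [("cooking", "female")] * 66 + [("cooking", "male")] * 34
preds = [("cooking", "female")] * 84 + [("cooking", "male")] * 16

b_train = gender_ratio(train, "cooking")   # 0.66 on the training set
b_pred = gender_ratio(preds, "cooking")    # 0.84 on model predictions
amplification = b_pred - b_train           # 0.18: the model amplified the bias
```

The paper's calibration method constrains predictions so that `b_pred` stays close to `b_train` per label, solved via Lagrangian relaxation; the snippet only shows the quantity being constrained.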
Human Activity Recognition using a Semantic Ontology-Based Framework
In recent years, the use of smart objects embedded in the physical world to monitor and record physical or environmental conditions has increased rapidly. In this scenario, heterogeneous devices are connected together in a network. Data generated by such a system are usually stored in a database, which often lacks semantic information and relationships among devices. Moreover, the data can be incomplete, unreliable, incorrect and noisy, so both the integration of information and the interoperability of applications become important. For this reason, ontologies are becoming widely used to describe the domain and to achieve efficient interoperability of information systems. An example of the described situation is the Ambient Assisted Living context, which aims to enable older or disabled people to continue living independently in their own homes for longer. In this context, human activity recognition plays a main role because it can be considered a starting point for facilitating assistance and care for the elderly. Due to the nature of human behavior, it is necessary to manage temporal and spatial restrictions. We therefore propose a framework that implements a novel methodology based on the integration of an ontology, for representing contextual knowledge, with a Complex Event Processing engine, for supporting timed reasoning. Moreover, it is an infrastructure in which knowledge, organized into conceptual spaces based on its meaning, can be semantically queried, discovered, and shared across applications. In our framework, the benefits of implementing a domain ontology are exploited at different levels of abstraction. Reasoning techniques then serve as a preprocessing step that prepares the data for the final temporal analysis.
The results presented in this paper were obtained by applying the methodology in AALISABETH, an Ambient Assisted Living project aimed at monitoring the lifestyle of older people not suffering from major chronic diseases or severe disabilities.
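The abstract does not give the framework's rule syntax, but the core idea of a Complex Event Processing temporal rule — recognizing a composite activity when atomic sensor events occur in order within a time window — can be sketched as follows. The event names and the window are hypothetical:

```python
from datetime import datetime, timedelta

def detect_sequence(events, pattern, window):
    """Return True if the atomic events in `pattern` occur in order
    within `window` — a simple stand-in for a CEP temporal rule."""
    matched_times = []
    i = 0
    for t, name in sorted(events):
        if i < len(pattern) and name == pattern[i]:
            matched_times.append(t)
            i += 1
    return i == len(pattern) and matched_times[-1] - matched_times[0] <= window

t0 = datetime(2024, 1, 1, 8, 0)
# Hypothetical timestamped events from smart-home sensors:
stream = [
    (t0, "enter_kitchen"),
    (t0 + timedelta(minutes=2), "open_fridge"),
    (t0 + timedelta(minutes=5), "use_stove"),
]
# Composite activity: the three events in order within 15 minutes.
print(detect_sequence(stream, ["enter_kitchen", "open_fridge", "use_stove"],
                      timedelta(minutes=15)))  # → True
```

In the actual framework, the events and the composite activities would be concepts in the domain ontology, and the CEP engine would evaluate such temporal constraints over the live stream rather than a static list.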