Episodic Reasoning for Vision-Based Human Action Recognition
Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by only analyzing body postures. On the contrary, this task should be supported by profound knowledge of human agency nature and its tight connection to the reasons and motivations that explain it. The combination of this knowledge and the knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has in improving the accuracy of a computer vision system for human action recognition. This work also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning.
Temporal Relational Reasoning in Videos
Temporal relational reasoning, the ability to link meaningful transformations
of objects or entities over time, is a fundamental property of intelligent
species. In this paper, we introduce an effective and interpretable network
module, the Temporal Relation Network (TRN), designed to learn and reason about
temporal dependencies between video frames at multiple time scales. We evaluate
TRN-equipped networks on activity recognition tasks using three recent video
datasets - Something-Something, Jester, and Charades - which fundamentally
depend on temporal relational reasoning. Our results demonstrate that the
proposed TRN gives convolutional neural networks a remarkable capacity to
discover temporal relations in videos. Through only sparsely sampled video
frames, TRN-equipped networks can accurately predict human-object interactions
in the Something-Something dataset and identify various human gestures on the
Jester dataset with very competitive performance. TRN-equipped networks also
outperform two-stream networks and 3D convolution networks in recognizing daily
activities in the Charades dataset. Further analyses show that the models learn
intuitive and interpretable visual common sense knowledge in videos.
Comment: camera-ready version for ECCV'1
Learning the Semantics of Manipulation Action
In this paper we present a formal computational framework for modeling
manipulation actions. The introduced formalism leads to semantics of
manipulation action and has applications to both observing and understanding
human manipulation actions as well as executing them with a robotic mechanism
(e.g. a humanoid robot). It is based on a Combinatory Categorial Grammar. The
goal of the introduced framework is to: (1) represent manipulation actions with
both syntax and semantic parts, where the semantic part employs
λ-calculus; (2) enable a probabilistic semantic parsing schema to learn
the λ-calculus representation of manipulation action from an annotated
action corpus of videos; (3) use (1) and (2) to develop a system that visually
observes manipulation actions and understands their meaning while it can reason
beyond observations using propositional logic and axiom schemata. The
experiments conducted on a publicly available large manipulation action dataset
validate the theoretical framework and our implementation.
The Knowledge Level in Cognitive Architectures: Current Limitations and Possible Developments
In this paper we identify and characterize two problematic aspects affecting the representational level of cognitive architectures (CAs), namely the limited size and the homogeneous typology of the encoded and processed knowledge.
We argue that such aspects may constitute not only a technological problem that, in our opinion, should be addressed in order to build artificial agents able to exhibit intelligent behaviours in general scenarios, but also an epistemological one, since they limit the plausibility of the comparison of the CAs' knowledge representation and processing mechanisms with those executed by humans in their everyday activities. In the final part of the paper further directions of research will be explored, trying to address current limitations and future challenges.