164,025 research outputs found
Log-Euclidean Bag of Words for Human Action Recognition
Representing videos by densely extracted local space-time features has
recently become a popular approach for analysing actions. In this paper, we
tackle the problem of categorising human actions by devising Bag of Words (BoW)
models based on covariance matrices of spatio-temporal features, with the
features formed from histograms of optical flow. Since covariance matrices form
a special type of Riemannian manifold, the space of Symmetric Positive Definite
(SPD) matrices, non-Euclidean geometry should be taken into account while
discriminating between covariance matrices. To this end, we propose to embed
SPD manifolds to Euclidean spaces via a diffeomorphism and extend the BoW
approach to its Riemannian version. The proposed BoW approach takes into
account the manifold geometry of SPD matrices during the generation of the
codebook and histograms. Experiments on challenging human action datasets show
that the proposed method obtains notable improvements in discrimination
accuracy, in comparison to several state-of-the-art methods
Platonic model of mind as an approximation to neurodynamics
Hierarchy of approximations involved in simplification of microscopic theories, from sub-cellural to the whole brain level, is presented. A new approximation to neural dynamics is described, leading to a Platonic-like model of mind based on psychological spaces. Objects and events in these spaces correspond to quasi-stable states of brain dynamics and may be interpreted from psychological point of view. Platonic model bridges the gap between neurosciences and psychological sciences. Static and dynamic versions of this model are outlined and Feature Space Mapping, a neurofuzzy realization of the static version of Platonic model, described. Categorization experiments with human subjects are analyzed from the neurodynamical and Platonic model points of view
Multi-Label Zero-Shot Human Action Recognition via Joint Latent Ranking Embedding
Human action recognition refers to automatic recognizing human actions from a
video clip. In reality, there often exist multiple human actions in a video
stream. Such a video stream is often weakly-annotated with a set of relevant
human action labels at a global level rather than assigning each label to a
specific video episode corresponding to a single action, which leads to a
multi-label learning problem. Furthermore, there are many meaningful human
actions in reality but it would be extremely difficult to collect/annotate
video clips regarding all of various human actions, which leads to a zero-shot
learning scenario. To the best of our knowledge, there is no work that has
addressed all the above issues together in human action recognition. In this
paper, we formulate a real-world human action recognition task as a multi-label
zero-shot learning problem and propose a framework to tackle this problem in a
holistic way. Our framework holistically tackles the issue of unknown temporal
boundaries between different actions for multi-label learning and exploits the
side information regarding the semantic relationship between different human
actions for knowledge transfer. Consequently, our framework leads to a joint
latent ranking embedding for multi-label zero-shot human action recognition. A
novel neural architecture of two component models and an alternate learning
algorithm are proposed to carry out the joint latent ranking embedding
learning. Thus, multi-label zero-shot recognition is done by measuring
relatedness scores of action labels to a test video clip in the joint latent
visual and semantic embedding spaces. We evaluate our framework with different
settings, including a novel data split scheme designed especially for
evaluating multi-label zero-shot learning, on two datasets: Breakfast and
Charades. The experimental results demonstrate the effectiveness of our
framework.Comment: 27 pages, 10 figures and 7 tables. Technical report submitted to a
journal. More experimental results/references were added and typos were
correcte
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Visual analysis of sensor logs in smart spaces: Activities vs. situations
Models of human habits in smart spaces can be expressed by using a multitude of representations whose readability influences the possibility of being validated by human experts. Our research is focused on developing a visual analysis pipeline (service) that allows, starting from the sensor log of a smart space, to graphically visualize human habits. The basic assumption is to apply techniques borrowed from the area of business process automation and mining on a version of the sensor log preprocessed in order to translate raw sensor measurements into human actions. The proposed pipeline is employed to automatically extract models to be reused for ambient intelligence. In this paper, we present an user evaluation aimed at demonstrating the effectiveness of the approach, by comparing it wrt. a relevant state-of-the-art visual tool, namely SITUVIS
The Knowledge Level in Cognitive Architectures: Current Limitations and Possible Developments
In this paper we identify and characterize an analysis of two problematic aspects affecting the representational level of cognitive architectures (CAs), namely: the limited size and the homogeneous typology of the encoded and processed knowledge.
We argue that such aspects may constitute not only a technological problem that, in our opinion, should be addressed in order to build articial agents able to exhibit intelligent behaviours in general scenarios, but also an epistemological one, since they limit the plausibility of the comparison of the CAs' knowledge representation and processing mechanisms with those executed by humans in their everyday activities. In the final part of the paper further directions of research will be explored, trying to address current limitations and
future challenges
- …