8,302 research outputs found
Learning functional object categories from a relational spatio-temporal representation
Abstract. We propose a framework that learns functional objectcategories from spatio-temporal data sets such as those abstracted from video. The data is represented as one activity graph that encodes qualitative spatio-temporal patterns of interaction between objects. Event classes are induced by statistical generalization, the instances of which encode similar patterns of spatio-temporal relationships between objects. Equivalence classes of objects are discovered on the basis of their similar role in multiple event instantiations. Objects are represented in a multidimensional space that captures their role in all the events. Unsupervised learning in this space results in functional object-categories. Experiments in the domain of food preparation suggest that our techniques represent a significant step in unsupervised learning of functional object categories from spatio-temporal patterns of object interaction.
Narrative Language as an Expression of Individual and Group Identity
Scientific Narrative Psychology integrates quantitative methodologies into the study of identity. Its methodology, Narrative Categorical Analysis, and its toolkit, NarrCat, were both originally developed by the Hungarian Narrative Psychology Group. NarrCat is for machine-made transformation of sentences in self-narratives into psychologically relevant, statistically processable narrative categories. The main body of this flexible and comprehensive system is formed by Psycho-Thematic modules, such as Agency, Evaluation, Emotion, Cognition, Spatiality, and Temporality. The Relational Modules include Social References, Semantic Role Labeling (SRL), and Negation. Certain elements can be combined into Hypermodules, such as Psychological Perspective and Spatio-Temporal Perspective, which allow for even more complex, higher level exploration of composite psychological processes. Using up-to-date developments of corpus linguistics and Natural Language Processing (NLP), a unique feature of NarrCat is its capacity of SRL. The structure of NarrCat, as well as the empirical results in group identity research, is discussed
"Mental Rotation" by Optimizing Transforming Distance
The human visual system is able to recognize objects despite transformations
that can drastically alter their appearance. To this end, much effort has been
devoted to the invariance properties of recognition systems. Invariance can be
engineered (e.g. convolutional nets), or learned from data explicitly (e.g.
temporal coherence) or implicitly (e.g. by data augmentation). One idea that
has not, to date, been explored is the integration of latent variables which
permit a search over a learned space of transformations. Motivated by evidence
that people mentally simulate transformations in space while comparing
examples, so-called "mental rotation", we propose a transforming distance.
Here, a trained relational model actively transforms pairs of examples so that
they are maximally similar in some feature space yet respect the learned
transformational constraints. We apply our method to nearest-neighbour problems
on the Toronto Face Database and NORB
Action Recognition in Video Using Sparse Coding and Relative Features
This work presents an approach to category-based action recognition in video
using sparse coding techniques. The proposed approach includes two main
contributions: i) A new method to handle intra-class variations by decomposing
each video into a reduced set of representative atomic action acts or
key-sequences, and ii) A new video descriptor, ITRA: Inter-Temporal Relational
Act Descriptor, that exploits the power of comparative reasoning to capture
relative similarity relations among key-sequences. In terms of the method to
obtain key-sequences, we introduce a loss function that, for each video, leads
to the identification of a sparse set of representative key-frames capturing
both, relevant particularities arising in the input video, as well as relevant
generalities arising in the complete class collection. In terms of the method
to obtain the ITRA descriptor, we introduce a novel scheme to quantify relative
intra and inter-class similarities among local temporal patterns arising in the
videos. The resulting ITRA descriptor demonstrates to be highly effective to
discriminate among action categories. As a result, the proposed approach
reaches remarkable action recognition performance on several popular benchmark
datasets, outperforming alternative state-of-the-art techniques by a large
margin.Comment: Accepted to CVPR 201
Relational Graph Representation Learning for Predicting Object Affordances
We address the problem of affordance classification for class-agnostic objects considering an open set of actions, by unsupervised learning of object interactions,inducing object affordance classes. A novel qualitative spatial representation incorporating depth information is used to construct Activity Graphs which encode object interactions. These Activity Graphs are clustered to obtain interaction classes, and subsequently extract classes of object affordances. Our experiments demonstrate that our method learns object affordances without being scene- or object-specific
Grounding Dynamic Spatial Relations for Embodied (Robot) Interaction
This paper presents a computational model of the processing of dynamic
spatial relations occurring in an embodied robotic interaction setup. A
complete system is introduced that allows autonomous robots to produce and
interpret dynamic spatial phrases (in English) given an environment of moving
objects. The model unites two separate research strands: computational
cognitive semantics and on commonsense spatial representation and reasoning.
The model for the first time demonstrates an integration of these different
strands.Comment: in: Pham, D.-N. and Park, S.-B., editors, PRICAI 2014: Trends in
Artificial Intelligence, volume 8862 of Lecture Notes in Computer Science,
pages 958-971. Springe
- …