8,302 research outputs found

    Learning functional object categories from a relational spatio-temporal representation

    We propose a framework that learns functional object categories from spatio-temporal data sets such as those abstracted from video. The data is represented as one activity graph that encodes qualitative spatio-temporal patterns of interaction between objects. Event classes are induced by statistical generalization, the instances of which encode similar patterns of spatio-temporal relationships between objects. Equivalence classes of objects are discovered on the basis of their similar role in multiple event instantiations. Objects are represented in a multidimensional space that captures their role in all the events. Unsupervised learning in this space results in functional object categories. Experiments in the domain of food preparation suggest that our techniques represent a significant step in unsupervised learning of functional object categories from spatio-temporal patterns of object interaction.

    Narrative Language as an Expression of Individual and Group Identity

    Scientific Narrative Psychology integrates quantitative methodologies into the study of identity. Its methodology, Narrative Categorical Analysis, and its toolkit, NarrCat, were both originally developed by the Hungarian Narrative Psychology Group. NarrCat automatically transforms sentences in self-narratives into psychologically relevant, statistically processable narrative categories. The main body of this flexible and comprehensive system is formed by Psycho-Thematic modules, such as Agency, Evaluation, Emotion, Cognition, Spatiality, and Temporality. The Relational Modules include Social References, Semantic Role Labeling (SRL), and Negation. Certain elements can be combined into Hypermodules, such as Psychological Perspective and Spatio-Temporal Perspective, which allow for even more complex, higher-level exploration of composite psychological processes. Drawing on recent developments in corpus linguistics and Natural Language Processing (NLP), NarrCat offers a unique capacity for SRL. The structure of NarrCat and the empirical results in group identity research are discussed.

    "Mental Rotation" by Optimizing Transforming Distance

    The human visual system is able to recognize objects despite transformations that can drastically alter their appearance. To this end, much effort has been devoted to the invariance properties of recognition systems. Invariance can be engineered (e.g. convolutional nets), or learned from data explicitly (e.g. temporal coherence) or implicitly (e.g. by data augmentation). One idea that has not, to date, been explored is the integration of latent variables which permit a search over a learned space of transformations. Motivated by evidence that people mentally simulate transformations in space while comparing examples, so-called "mental rotation", we propose a transforming distance. Here, a trained relational model actively transforms pairs of examples so that they are maximally similar in some feature space yet respect the learned transformational constraints. We apply our method to nearest-neighbour problems on the Toronto Face Database and NORB.

    Action Recognition in Video Using Sparse Coding and Relative Features

    This work presents an approach to category-based action recognition in video using sparse coding techniques. The proposed approach includes two main contributions: (i) a new method to handle intra-class variations by decomposing each video into a reduced set of representative atomic action acts or key-sequences, and (ii) a new video descriptor, ITRA: Inter-Temporal Relational Act Descriptor, that exploits the power of comparative reasoning to capture relative similarity relations among key-sequences. To obtain key-sequences, we introduce a loss function that, for each video, leads to the identification of a sparse set of representative key-frames capturing both the relevant particularities of the input video and the relevant generalities of the complete class collection. To obtain the ITRA descriptor, we introduce a novel scheme to quantify relative intra- and inter-class similarities among local temporal patterns arising in the videos. The resulting ITRA descriptor proves highly effective at discriminating among action categories. As a result, the proposed approach reaches remarkable action recognition performance on several popular benchmark datasets, outperforming alternative state-of-the-art techniques by a large margin. Comment: Accepted to CVPR 201

    Relational Graph Representation Learning for Predicting Object Affordances

    We address the problem of affordance classification for class-agnostic objects considering an open set of actions, by unsupervised learning of object interactions, inducing object affordance classes. A novel qualitative spatial representation incorporating depth information is used to construct Activity Graphs which encode object interactions. These Activity Graphs are clustered to obtain interaction classes, from which classes of object affordances are subsequently extracted. Our experiments demonstrate that our method learns object affordances without being scene- or object-specific.

    Grounding Dynamic Spatial Relations for Embodied (Robot) Interaction

    This paper presents a computational model of the processing of dynamic spatial relations occurring in an embodied robotic interaction setup. A complete system is introduced that allows autonomous robots to produce and interpret dynamic spatial phrases (in English) given an environment of moving objects. The model unites two separate research strands: computational cognitive semantics and commonsense spatial representation and reasoning. For the first time, the model demonstrates an integration of these different strands. Comment: in: Pham, D.-N. and Park, S.-B., editors, PRICAI 2014: Trends in Artificial Intelligence, volume 8862 of Lecture Notes in Computer Science, pages 958-971. Springe