5,852 research outputs found

    Between Sense and Sensibility: Declarative narrativisation of mental models as a basis and benchmark for visuo-spatial cognition and computation focussed collaborative cognitive systems

    Full text link
    What lies between `\emph{sensing}' and `\emph{sensibility}'? In other words, what kind of cognitive processes mediate sensing capability, and the formation of sensible impressions ---e.g., abstractions, analogies, hypotheses and theory formation, beliefs and their revision, argument formation--- in domain-specific problem solving, or in regular activities of everyday living, working and simply going around in the environment? How can knowledge and reasoning about such capabilities, as exhibited by humans in particular problem contexts, be used as a model and benchmark for the development of collaborative cognitive (interaction) systems concerned with human assistance, assurance, and empowerment? We pose these questions in the context of a range of assistive technologies concerned with \emph{visuo-spatial perception and cognition} tasks encompassing aspects such as commonsense, creativity, and the application of specialist domain knowledge and problem-solving thought processes. Assistive technologies being considered include: (a) human activity interpretation; (b) high-level cognitive robotics; (c) people-centred creative design in domains such as architecture & digital media creation; and (d) qualitative analyses in geographic information systems. Computational narratives not only provide a rich cognitive basis, but they also serve as a benchmark of functional performance in our development of computational cognitive assistance systems. We posit that computational narrativisation pertaining to space, actions, and change provides a useful model of \emph{visual} and \emph{spatio-temporal thinking} within a wide range of problem-solving tasks and application areas where collaborative cognitive systems could serve an assistive and empowering function. Comment: 5 pages, research statement summarising recent publication

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, thus, the constraints imposed on the type of video that each technique is able to address. Making these hypotheses and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables
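    The survey's organizing principle, grouping methods by their input representation and making their assumed constraints explicit, can be pictured with a small data-structure sketch. The category names and constraint strings below are purely hypothetical placeholders, not the survey's actual taxonomy.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class InputRepresentation(Enum):
    # Hypothetical coarse categories; the survey defines its own, finer taxonomy.
    BODY_MODEL = auto()             # e.g. motion-capture skeletons
    GLOBAL_IMAGE_FEATURES = auto()  # e.g. silhouettes, optical-flow fields
    LOCAL_DESCRIPTORS = auto()      # e.g. spatio-temporal interest points

@dataclass
class ActionRecognitionMethod:
    name: str
    representation: InputRepresentation
    constraints: list[str] = field(default_factory=list)  # video assumptions made explicit

def suitable_methods(methods: list[ActionRecognitionMethod],
                     tolerable: set[str]) -> list[ActionRecognitionMethod]:
    """Keep only methods whose constraints the target application can satisfy."""
    return [m for m in methods if set(m.constraints) <= tolerable]

# Example: an "in the wild" application cannot guarantee a static camera or a lab
# background, so a heavily constrained motion-capture method is filtered out.
methods = [
    ActionRecognitionMethod("mocap-lab-method", InputRepresentation.BODY_MODEL,
                            ["static camera", "controlled background", "single actor"]),
    ActionRecognitionMethod("local-feature-method", InputRepresentation.LOCAL_DESCRIPTORS,
                            ["single actor"]),
]
print([m.name for m in suitable_methods(methods, tolerable={"single actor"})])
# -> ['local-feature-method']
```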

    On human motion prediction using recurrent neural networks

    Full text link
    Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion, with the goal of learning time-dependent representations that perform tasks such as short-term motion prediction and long-term human motion synthesis. We examine recent work, with a focus on the evaluation methodologies commonly used in the literature, and show that, surprisingly, state-of-the-art performance can be achieved by a simple baseline that does not attempt to model motion at all. We investigate this result, and analyze recent RNN methods by looking at the architectures, loss functions, and training procedures used in state-of-the-art approaches. We propose three changes to the standard RNN models typically used for human motion, which result in a simple and scalable RNN architecture that obtains state-of-the-art performance on human motion prediction. Comment: Accepted at CVPR 17
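    The "simple baseline that does not attempt to model motion at all" can be read as a predictor that freezes the skeleton: it repeats the last observed pose for every future frame (often called a zero-velocity baseline). The sketch below is an illustrative reading of that idea, not the paper's code; the pose dimensionality and horizon are arbitrary.

```python
import numpy as np

def last_pose_baseline(observed_poses: np.ndarray, horizon: int) -> np.ndarray:
    """Predict future motion by repeating the last observed pose.

    observed_poses: (T, D) array of T observed frames with D pose dimensions each.
    horizon: number of future frames to predict.
    Returns a (horizon, D) array of predicted frames.
    """
    last_pose = observed_poses[-1]            # freeze the skeleton at its final observed state
    return np.tile(last_pose, (horizon, 1))   # repeat that pose for every future frame

# Example: 50 observed frames of a 54-dimensional pose, 25 predicted frames.
observed = np.random.randn(50, 54)
prediction = last_pose_baseline(observed, horizon=25)
assert prediction.shape == (25, 54)
```

    Such a baseline is a useful benchmark precisely because any learned model must beat it to show that it captures motion dynamics rather than just the current pose.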