Between Sense and Sensibility: Declarative narrativisation of mental models as a basis and benchmark for visuo-spatial cognition and computation focussed collaborative cognitive systems
What lies between 'sensing' and 'sensibility'? In other words,
what kind of cognitive processes mediate sensing capability, and the formation
of sensible impressions ---e.g., abstractions, analogies, hypotheses and theory
formation, beliefs and their revision, argument formation--- in domain-specific
problem solving, or in regular activities of everyday living, working and
simply going around in the environment? How can knowledge and reasoning about
such capabilities, as exhibited by humans in particular problem contexts, be
used as a model and benchmark for the development of collaborative cognitive
(interaction) systems concerned with human assistance, assurance, and
empowerment?
We pose these questions in the context of a range of assistive technologies
concerned with visuo-spatial perception and cognition tasks encompassing
aspects such as commonsense, creativity, and the application of specialist
domain knowledge and problem-solving thought processes. Assistive technologies
being considered include: (a) human activity interpretation; (b) high-level
cognitive robotics; (c) people-centred creative design in domains such as
architecture and digital media creation; and (d) qualitative analyses in geographic
information systems. Computational narratives not only provide a rich cognitive
basis, but they also serve as a benchmark of functional performance in our
development of computational cognitive assistance systems. We posit that
computational narrativisation pertaining to space, actions, and change provides
a useful model of visual and spatio-temporal thinking within a
wide range of problem-solving tasks and application areas where collaborative
cognitive systems could serve an assistive and empowering function.
Comment: 5 pages, research statement summarising recent publication
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the hypothesis assumed and thus the constraints imposed on the type of video
that each technique is able to address. Making the hypothesis and
constraints explicit makes the framework particularly useful for selecting a method, given
an application. Another advantage of the proposed organization is that it
allows categorizing the newest approaches seamlessly with traditional ones, while
providing an insightful perspective of the evolution of the action recognition
task up to now. That perspective is the basis for the discussion at the end of
the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables
On human motion prediction using recurrent neural networks
Human motion modelling is a classical problem at the intersection of graphics
and computer vision, with applications spanning human-computer interaction,
motion synthesis, and motion prediction for virtual and augmented reality.
Following the success of deep learning methods in several computer vision
tasks, recent work has focused on using deep recurrent neural networks (RNNs)
to model human motion, with the goal of learning time-dependent representations
that perform tasks such as short-term motion prediction and long-term human
motion synthesis. We examine recent work, with a focus on the evaluation
methodologies commonly used in the literature, and show that, surprisingly,
state-of-the-art performance can be achieved by a simple baseline that does not
attempt to model motion at all. We investigate this result, and analyze recent
RNN methods by looking at the architectures, loss functions, and training
procedures used in state-of-the-art approaches. We propose three changes to the
standard RNN models typically used for human motion, which result in a simple
and scalable RNN architecture that obtains state-of-the-art performance on
human motion prediction.
Comment: Accepted at CVPR 1
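The abstract above does not spell out the "simple baseline that does not attempt to model motion at all". One baseline of that kind, assumed here purely for illustration, repeats the last observed pose for every future frame. The minimal Python sketch below shows the idea on toy data; the function names `repeat_last_pose` and `mean_frame_error`, the pose layout (a list of per-frame parameter lists), and the Euclidean error measure are all hypothetical choices, not taken from the paper.

```python
import math


def repeat_last_pose(observed, horizon):
    """Predict `horizon` future frames by copying the last observed pose.

    observed: list of frames, each a list of pose parameters (assumed layout).
    """
    last_pose = observed[-1]
    return [list(last_pose) for _ in range(horizon)]


def mean_frame_error(predicted, target):
    """Average per-frame Euclidean distance between prediction and target."""
    distances = [math.dist(p, t) for p, t in zip(predicted, target)]
    return sum(distances) / len(distances)


# Toy data (hypothetical): a 3-parameter pose drifting by 0.01 per frame.
frames = [[0.01 * (i + 1)] * 3 for i in range(10)]
observed, future = frames[:6], frames[6:]

prediction = repeat_last_pose(observed, horizon=len(future))
error = mean_frame_error(prediction, future)
print(f"mean per-frame error of the repeat-last-pose baseline: {error:.4f}")
```

A baseline like this needs no training, so any learned model evaluated with the same error measure can be compared against it directly.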