Learning the Semantics of Manipulation Action
In this paper we present a formal computational framework for modeling
manipulation actions. The introduced formalism leads to semantics of
manipulation action and has applications both to observing and understanding
human manipulation actions and to executing them with a robotic mechanism
(e.g. a humanoid robot). It is based on a Combinatory Categorial Grammar. The
goal of the introduced framework is to: (1) represent manipulation actions with
both syntax and semantic parts, where the semantic part employs
λ-calculus; (2) enable a probabilistic semantic parsing schema to learn
the λ-calculus representation of manipulation action from an annotated
action corpus of videos; (3) use (1) and (2) to develop a system that visually
observes manipulation actions and understands their meaning while it can reason
beyond observations using propositional logic and axiom schemata. The
experiments conducted on a publicly available, large manipulation action
dataset validate the theoretical framework and our implementation.
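The pairing of syntax with λ-calculus semantics described above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation; the predicate and object names are assumptions.

```python
# Hypothetical sketch of lambda-calculus semantics for a manipulation
# action, in the spirit of the CCG-based framework described above.
# Names ("cut", "knife", "cucumber") are illustrative assumptions.

def transitive_action(predicate):
    # A transitive action word receives semantics \y.\x. predicate(x, y).
    return lambda obj: lambda subj: (predicate, subj, obj)

cut = transitive_action("cut")

# Function application mirrors CCG combination:
# "knife cut cucumber" -> cut applied to object, then to subject.
meaning = cut("cucumber")("knife")
print(meaning)  # ('cut', 'knife', 'cucumber')
```

Currying the action word this way lets the same lexical entry combine with its arguments in the order the grammar dictates, which is the essence of pairing syntactic categories with λ-terms.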
A semantics-based approach to sensor data segmentation in real-time Activity Recognition
Department of Information Engineering, Dalian University, China
Activity Recognition (AR) is key in context-aware assistive living systems. One
challenge in AR is the segmentation of observed sensor events when interleaved
or concurrent activities of daily living (ADLs) are performed. Several studies
have proposed methods of separating and organising sensor observations and
recognising generic ADLs performed in a simple or composite manner. However,
little has been explored in semantically distinguishing individual sensor events
directly and passing them to the relevant ongoing/new atomic activities. This
paper proposes a Semiotic-theory-inspired ontological model, capturing generic
knowledge and inhabitant-specific preferences for conducting ADLs to support
the segmentation process. A multithreaded decision algorithm and system prototype
were developed and evaluated against 30 use-case scenarios in which each
event was simulated at a 10-second interval on a machine with an i7 2.60 GHz
CPU (2 cores) and 8 GB RAM. The results suggest that all sensor events were
adequately segmented, with 100% accuracy for single-ADL scenarios and a
slightly lower 97.8% accuracy for composite-ADL scenarios. However,
per-event segmentation was slow, with average classification times of
3,971 ms and 62,183 ms for single and composite ADL scenarios, respectively.
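The routing of individual sensor events to ongoing or new atomic activities can be sketched as below. The ontology dictionary, event names, and tie-breaking rule are illustrative assumptions, not the paper's actual model.

```python
# Hypothetical sketch: assign each incoming sensor event to an already
# ongoing atomic activity when the ontology permits, otherwise start a
# new activity thread. All names here are illustrative assumptions.

ONTOLOGY = {
    "kettle_on":   {"make_tea", "make_coffee"},
    "tea_bag":     {"make_tea"},
    "fridge_open": {"make_tea", "make_sandwich"},
}

def segment(events):
    threads = {}  # activity name -> list of events assigned to it
    for ev in events:
        candidates = ONTOLOGY.get(ev, set())
        # Prefer an activity that is already ongoing; otherwise open a
        # new thread (deterministic tie-break by sorted name).
        ongoing = [a for a in sorted(candidates) if a in threads]
        if ongoing:
            target = ongoing[0]
        elif candidates:
            target = sorted(candidates)[0]
        else:
            target = "unknown"
        threads.setdefault(target, []).append(ev)
    return threads

print(segment(["kettle_on", "tea_bag", "fridge_open"]))
```

Real systems would replace the dictionary with an ontology reasoner and run each activity thread concurrently, which is where the multithreaded decision algorithm and its classification-time cost come in.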
VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation
Rich and dense human labeled datasets are among the main enabling factors for
the recent advances in vision-language understanding. Many seemingly distant
annotations (e.g., semantic segmentation and visual question answering (VQA))
are inherently connected in that they reveal different levels and perspectives
of human understanding about the same visual scenes, and even the same set
of images (e.g., of COCO). The popularity of COCO correlates those annotations
and tasks. Explicitly linking them up may significantly benefit both individual
tasks and the unified vision and language modeling. We present the preliminary
work of linking the instance segmentations provided by COCO to the questions
and answers (QAs) in the VQA dataset, and name the collected links visual
questions and segmentation answers (VQS). They transfer human supervision
between the previously separate tasks, offer more effective leverage to
existing problems, and also open the door for new research problems and models.
We study two applications of the VQS data in this paper: supervised attention
for VQA and a novel question-focused semantic segmentation task. For the
former, we obtain state-of-the-art results on the VQA real multiple-choice task
by simply augmenting the multilayer perceptrons with some attention features
that are learned using the segmentation-QA links as explicit supervision. To
put the latter in perspective, we study two plausible methods and compare them
to an oracle method assuming that the instance segmentations are given at the
test stage.

Comment: To appear at ICCV 2017.
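The linking step that produces VQS pairs can be sketched as a match between QA answers and instance-segmentation categories on the same image. The records and matching rule below are illustrative assumptions, not the actual dataset-construction procedure.

```python
# Hypothetical sketch: link COCO-style instance segmentations to VQA
# question-answer pairs on the same image when the instance category
# appears in the answer string. Records are toy data, not the datasets.

segmentations = [
    {"image_id": 1, "category": "dog",     "mask_id": 101},
    {"image_id": 1, "category": "frisbee", "mask_id": 102},
]
qas = [
    {"image_id": 1,
     "question": "What is the animal catching?",
     "answer": "frisbee"},
]

def link_vqs(segmentations, qas):
    links = []
    for qa in qas:
        for seg in segmentations:
            if (seg["image_id"] == qa["image_id"]
                    and seg["category"] in qa["answer"]):
                links.append((qa["question"], seg["mask_id"]))
    return links

print(link_vqs(segmentations, qas))
# [('What is the animal catching?', 102)]
```

Each resulting link is exactly the kind of mask-level supervision that can be fed to an attention module, or used as the ground truth for the question-focused segmentation task.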