Linking Conversation Analysis and Motion Capturing: How to robustly track multiple participants?
Pitsch K, Brüning B-A, Schnier C, Dierker H, Wachsmuth S. Linking Conversation Analysis and Motion Capturing: How to robustly track multiple participants? In: Kipp M, Martin J-C, Paggio P, Heylen D, eds. Proceedings of the LREC Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010). 2010: 63-69.

If we want to model the dynamic and contingent nature of human social interaction (e.g. for the design of human-robot interaction), an analysis and description of natural interaction is required that combines different methodologies and research tools (qualitative/quantitative; manual/automated). In this paper, we pinpoint the requirements and technical challenges for constituting and managing multimodal corpora that arise when linking Conversation Analysis with novel 3D motion capture technologies, i.e. robustly tracking multiple participants over an extended period of time. We present and evaluate a solution that bypasses the limits of the current standard Vicon system (using rigid bodies), ways of mapping the obtained coordinates to a human skeleton model (inverse kinematics), and a way to export the data into a format supported by standard annotation tools (such as ANVIL).
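As a rough illustration of the coordinate-to-skeleton mapping the abstract mentions (not the authors' actual pipeline), one joint angle can be recovered from the tracked 3D positions of adjacent rigid bodies. The point coordinates below are invented for the example:

```python
import math

# Hedged sketch: compute the angle at one joint (e.g. the elbow) from
# three tracked 3-D positions in the capture frame. The positions for
# shoulder, elbow and wrist are illustrative, not real capture data.

def joint_angle(a, b, c):
    """Angle at point b, in degrees, between segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    nu = math.sqrt(sum(ui * ui for ui in u))
    nv = math.sqrt(sum(vi * vi for vi in v))
    return math.degrees(math.acos(dot / (nu * nv)))

shoulder = (0.0, 1.4, 0.0)
elbow = (0.3, 1.1, 0.0)
wrist = (0.6, 1.4, 0.0)
print(round(joint_angle(shoulder, elbow, wrist), 1))  # 90.0
```

A full skeleton fit repeats this kind of computation per joint and per frame, which is what makes the exported time series usable in annotation tools.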
Text2Action: Generative Adversarial Synthesis from Language to Action
In this paper, we propose a generative model which learns the relationship
between language and human action in order to generate a human action sequence
given a sentence describing human behavior. The proposed generative model is a
generative adversarial network (GAN), which is based on the sequence to
sequence (SEQ2SEQ) model. Using the proposed generative network, we can
synthesize various actions for a robot or a virtual agent using a text encoder
recurrent neural network (RNN) and an action decoder RNN. The proposed
generative network is trained from 29,770 pairs of actions and sentence
annotations extracted from MSR-Video-to-Text (MSR-VTT), a large-scale video
dataset. We demonstrate that the network can generate human-like actions which
can be transferred to a Baxter robot, such that the robot performs an action
based on a provided sentence. Results show that the proposed generative network
correctly models the relationship between language and action and can generate
a diverse set of actions from the same sentence.

Comment: 8 pages, 10 figures
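The encoder-decoder shape described in the abstract (a text-encoder RNN compressing a sentence, an action-decoder RNN unrolling pose vectors) can be sketched minimally as below. This is an illustrative toy forward pass with made-up sizes and random weights, not the paper's GAN or trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(x, h, Wx, Wh, b):
    """One vanilla RNN step: h' = tanh(Wx x + Wh h + b)."""
    return np.tanh(Wx @ x + Wh @ h + b)

def encode(tokens, emb, Wx, Wh, b):
    """Run the text encoder over embedded tokens; return the final state."""
    h = np.zeros(Wh.shape[0])
    for t in tokens:
        h = rnn_step(emb[t], h, Wx, Wh, b)
    return h

def decode(context, steps, Wx, Wh, b, Wo, pose_dim):
    """Unroll the action decoder, feeding back the previous pose vector."""
    h = context
    pose = np.zeros(pose_dim)
    poses = []
    for _ in range(steps):
        h = rnn_step(pose, h, Wx, Wh, b)
        pose = Wo @ h  # linear readout to joint coordinates
        poses.append(pose)
    return np.stack(poses)

# Illustrative sizes: 10-word vocabulary, 6-D pose (e.g. a few joints).
vocab, emb_dim, hid, pose_dim = 10, 8, 16, 6
emb = rng.normal(size=(vocab, emb_dim))
enc = (rng.normal(size=(hid, emb_dim)) * 0.1,
       rng.normal(size=(hid, hid)) * 0.1,
       np.zeros(hid))
dec_Wx = rng.normal(size=(hid, pose_dim)) * 0.1
dec_Wh = rng.normal(size=(hid, hid)) * 0.1
dec_b = np.zeros(hid)
Wo = rng.normal(size=(pose_dim, hid)) * 0.1

context = encode([1, 4, 7], emb, *enc)  # a "sentence" as token ids
motion = decode(context, steps=5, Wx=dec_Wx, Wh=dec_Wh, b=dec_b,
                Wo=Wo, pose_dim=pose_dim)
print(motion.shape)  # (5, 6): 5 frames of a 6-D pose vector
```

In the actual system the decoder is trained adversarially against a discriminator on sentence-action pairs, which this forward-pass sketch omits.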
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader.
Discrete event simulation and virtual reality use in industry: new opportunities and future trends
This paper reviews the area of combined discrete
event simulation (DES) and virtual reality (VR) use within industry.
While establishing a state of the art for progress in this
area, this paper makes the case for VR DES as the vehicle of choice
for complex data analysis through interactive simulation models,
highlighting both its advantages and current limitations. This paper
reviews active research topics such as VR and DES real-time
integration, communication protocols, system design considerations,
model validation, and applications of VR and DES. While
summarizing future research directions for this technology combination,
the case is made for smart factory adoption of VR DES as
a new platform for scenario testing and decision making. It is argued
that, in order for VR DES to fully meet the visualization requirements
of both the Industry 4.0 and Industrial Internet visions of digital
manufacturing, further research is required in the areas of lower-latency
image processing, DES delivery as a service, gesture recognition
for VR DES interaction, and linkage of DES to real-time data streams and Big Data sets.
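For readers unfamiliar with DES itself, the core of any such engine is a time-ordered event queue. The following is a minimal illustrative sketch (hypothetical event names and times, not any reviewed system):

```python
import heapq

# Minimal event-queue core of a discrete event simulation: pop events in
# timestamp order; each handler may schedule further events.

def simulate(events):
    """Replay (time, name, handler) events in time order; return the log."""
    heapq.heapify(events)
    log = []
    while events:
        time, name, handler = heapq.heappop(events)
        log.append((time, name))
        for new_event in handler(time):
            heapq.heappush(events, new_event)
    return log

def arrival(t):
    # A part arrives; each arrival schedules its own processing step.
    return [(t + 1.5, "process", lambda _t: [])]

log = simulate([(0.0, "arrival", arrival), (1.0, "arrival", arrival)])
print(log)  # events replayed in time order
```

A VR front end would consume such a log (or a live event stream) to animate the model state, which is where the real-time integration challenges discussed above arise.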