
    Linking Conversation Analysis and Motion Capturing: How to robustly track multiple participants?

    Pitsch K, Brüning B-A, Schnier C, Dierker H, Wachsmuth S. Linking Conversation Analysis and Motion Capturing: How to robustly track multiple participants? In: Kipp M, Martin J-C, Paggio P, Heylen D, eds. Proceedings of the LREC Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010). 2010: 63-69.

    If we want to model the dynamic and contingent nature of human social interaction (e.g. for the design of human-robot interaction), we need analyses and descriptions of natural interaction that combine different methodologies and research tools (qualitative/quantitative; manual/automated). In this paper, we pinpoint the requirements and technical challenges for constituting and managing multimodal corpora that arise when linking Conversation Analysis with novel 3D motion capture technologies, i.e. robustly tracking multiple participants over an extended period of time. We present and evaluate a solution that bypasses the limits of the current standard Vicon system (using rigid bodies), maps the obtained coordinates to a human skeleton model (inverse kinematics), and exports the data into a format supported by standard annotation tools (such as ANVIL).
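    As a rough illustration of the last two steps of that pipeline, the sketch below derives one joint angle from rigid-body centroids (a toy stand-in for full inverse kinematics) and writes it out as a time-aligned annotation track. All names and the XML layout are illustrative assumptions, not the authors' toolchain or the official ANVIL schema.

        import numpy as np

        def joint_angle(parent, joint, child):
            """Angle in degrees at `joint` between its two adjacent segments,
            e.g. the elbow angle from shoulder/elbow/wrist centroids."""
            u, v = parent - joint, child - joint
            cos_a = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
            return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

        def export_track(times, values, label, path):
            """Write a minimal ANVIL-like annotation track, one element per
            frame interval (simplified assumption, not the real ANVIL spec)."""
            with open(path, "w") as f:
                f.write('<annotation>\n  <track name="%s">\n' % label)
                for t0, t1, v in zip(times[:-1], times[1:], values):
                    f.write('    <el start="%.3f" end="%.3f">%.1f</el>\n' % (t0, t1, v))
                f.write('  </track>\n</annotation>\n')

        # Three 10 ms mocap frames: fixed shoulder/elbow, moving wrist.
        shoulder = np.array([[0.0, 1.4, 0.0]] * 3)
        elbow = np.array([[0.3, 1.1, 0.0]] * 3)
        wrist = np.array([[0.1, 0.9, 0.2], [0.2, 0.9, 0.1], [0.3, 0.8, 0.0]])
        angles = [joint_angle(s, e, w) for s, e, w in zip(shoulder, elbow, wrist)]
        export_track([0.00, 0.01, 0.02, 0.03], angles, "elbow_angle", "elbow_track.xml")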

    Text2Action: Generative Adversarial Synthesis from Language to Action

    In this paper, we propose a generative model which learns the relationship between language and human action in order to generate a human action sequence given a sentence describing human behavior. The proposed generative model is a generative adversarial network (GAN) based on the sequence-to-sequence (SEQ2SEQ) model. Using the proposed generative network, we can synthesize various actions for a robot or a virtual agent using a text encoder recurrent neural network (RNN) and an action decoder RNN. The proposed generative network is trained on 29,770 pairs of actions and sentence annotations extracted from MSR-Video-to-Text (MSR-VTT), a large-scale video dataset. We demonstrate that the network can generate human-like actions which can be transferred to a Baxter robot, such that the robot performs an action based on a provided sentence. Results show that the proposed generative network correctly models the relationship between language and action and can generate a diverse set of actions from the same sentence.
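    A minimal sketch of the generator this abstract describes: a text-encoder RNN whose final hidden state, together with a noise vector, conditions an autoregressive action-decoder RNN. The discriminator and training loop are omitted, and all dimensions, the choice of GRUs, and the module names are assumptions for illustration rather than the paper's actual architecture.

        import torch
        import torch.nn as nn

        class Text2ActionGenerator(nn.Module):
            """Seq2seq-style GAN generator: encode a sentence, then decode a
            pose sequence conditioned on the sentence encoding and noise."""
            def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256,
                         noise_dim=32, pose_dim=30):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
                self.decoder = nn.GRU(noise_dim + pose_dim, hid_dim, batch_first=True)
                self.to_pose = nn.Linear(hid_dim, pose_dim)

            def forward(self, tokens, noise, n_frames):
                # tokens: (B, T_text) word indices; noise: (B, noise_dim)
                _, h = self.encoder(self.embed(tokens))  # h: (1, B, hid_dim)
                pose = torch.zeros(tokens.size(0), 1, self.to_pose.out_features)
                frames = []
                for _ in range(n_frames):  # autoregressive frame-by-frame decoding
                    step = torch.cat([noise.unsqueeze(1), pose], dim=-1)
                    out, h = self.decoder(step, h)
                    pose = self.to_pose(out)
                    frames.append(pose)
                return torch.cat(frames, dim=1)  # (B, n_frames, pose_dim)

        # Usage: one 6-token sentence -> 40 generated pose frames.
        gen = Text2ActionGenerator()
        poses = gen(torch.randint(0, 10000, (1, 6)), torch.randn(1, 32), n_frames=40)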

    Going Deeper into Action Recognition: A Survey

    Understanding human actions in visual data is tied to advances in complementary research areas including object recognition, human dynamics, domain adaptation and semantic segmentation. Over the last decade, human action analysis has evolved from early schemes, often limited to controlled environments, to today's advanced solutions that can learn from millions of videos and apply to almost all daily activities. Given the broad range of applications from video surveillance to human-computer interaction, scientific milestones in action recognition are achieved more rapidly, quickly rendering what used to be state of the art obsolete. This motivated us to provide a comprehensive review of the notable steps taken towards recognizing human actions. To this end, we start our discussion with the pioneering methods that use handcrafted representations, and then navigate into the realm of deep learning based approaches. We aim to remain objective throughout this survey, touching upon encouraging improvements as well as inevitable setbacks, in the hope of raising fresh questions and motivating new research directions for the reader.

    Discrete event simulation and virtual reality use in industry: new opportunities and future trends

    This paper reviews the combined use of discrete event simulation (DES) and virtual reality (VR) within industry. While establishing a state of the art for progress in this area, this paper makes the case for VR DES as the vehicle of choice for complex data analysis through interactive simulation models, highlighting both its advantages and current limitations. This paper reviews active research topics such as VR and DES real-time integration, communication protocols, system design considerations, model validation, and applications of VR and DES. While summarizing future research directions for this technology combination, the case is made for smart factory adoption of VR DES as a new platform for scenario testing and decision making. It is argued that in order for VR DES to fully meet the visualization requirements of both the Industry 4.0 and Industrial Internet visions of digital manufacturing, further research is required in the areas of lower-latency image processing, DES delivery as a service, gesture recognition for VR DES interaction, and linkage of DES to real-time data streams and Big Data sets.