    Egocentric Action Understanding by Learning Embodied Attention

    Videos captured from wearable cameras, known as egocentric videos, create a continuous record of human daily visual experience, and thereby offer a new perspective for human activity understanding. Importantly, egocentric video aligns gaze, embodied movement, and action in the same “first-person” coordinate system. The rich egocentric cues reflect the attended scene context of an action, and thereby provide a novel means of reasoning about human daily routines. In my thesis work, I describe my efforts to develop novel computational models that learn embodied egocentric attention for the automatic analysis of egocentric actions. First, I introduce a probabilistic model for learning gaze and actions in egocentric video and further demonstrate that attention can serve as a robust tool for learning motion-aware video representations. Second, I develop a novel deep model to address the challenging problem of jointly recognizing and localizing actions of a mobile user on a known 3D map from egocentric videos. Third, I present a novel deep latent variable model that makes use of human intentional body movement (motor attention) as a key representation for forecasting human-object interaction in egocentric video. Finally, I propose a novel task of future hand segmentation from egocentric videos, and show how explicitly modeling future head motion can facilitate future hand movement forecasting. Ph.D.

    Learning and Games

    Part of the Volume on the Ecology of Games: Connecting Youth, Games, and Learning. In this chapter, I argue that good video games recruit good learning and that a game's design is inherently connected to designing good learning for players. I start with a perspective on learning now common in the Learning Sciences that argues that people primarily think and learn through experiences they have had, not through abstract calculations and generalizations. People store these experiences in memory -- and human long-term memory is now viewed as nearly limitless -- and use them to run simulations in their minds to prepare for problem solving in new situations. These simulations help them to form hypotheses about how to proceed in the new situation based on past experiences. The chapter also discusses the conditions experience must meet if it is to be optimal for learning and shows how good video games can deliver such optimal learning experiences. Some of the issues covered include: identity and learning; models and model-based thinking; the control of avatars and "empathy for a complex system"; distributed intelligence and cross-functional teams for learning; motivation and ownership; emotion in learning; and situated meaning, that is, the ways in which games represent verbal meaning through images, actions, and dialogue, not just other words and definitions.

    The Challenge of Believability in Video Games: Definitions, Agents Models and Imitation Learning

    In this paper, we address the problem of creating believable agents (virtual characters) in video games. We consider only one meaning of believability, "giving the feeling of being controlled by a player", and outline the problem of its evaluation. We present several models for agents in games which can produce believable behaviours, both from industry and research. For a high level of believability, learning, and especially imitation learning, seems to be the way to go. We give a quick overview of different approaches to making video game agents learn from players. To conclude, we propose a two-step method to develop new models for believable agents. First, we must find the criteria for believability for our application and define an evaluation method. Then the model and the learning algorithm can be designed.

    Virtual Meeting Rooms: From Observation to Simulation

    Virtual meeting rooms are used for simulation of real meeting behavior and can show how people behave during conversations: how they gesture, how they move their heads and bodies, and how they direct their gaze. They are used for visualising models of meeting behavior, and they can be used for the evaluation of these models. They are also used to show the effects of controlling certain parameters on the behavior and in experiments to see what the effect is on communication when various channels of information - speech, gaze, gesture, posture - are switched off or manipulated in other ways. The paper presents the various stages in the development of a virtual meeting room and illustrates its uses by presenting some results of experiments to see whether human judges can induce conversational roles in a virtual meeting situation when they only see the head movements of participants in the meeting.

    Embodied Artificial Intelligence through Distributed Adaptive Control: An Integrated Framework

    In this paper, we argue that the future of Artificial Intelligence research resides in two keywords: integration and embodiment. We support this claim by analyzing the recent advances of the field. Regarding integration, we note that the most impactful recent contributions have been made possible through the integration of recent Machine Learning methods (based in particular on Deep Learning and Recurrent Neural Networks) with more traditional ones (e.g. Monte-Carlo tree search, goal babbling exploration or addressable memory systems). Regarding embodiment, we note that the traditional benchmark tasks (e.g. visual classification or board games) are becoming obsolete as state-of-the-art learning algorithms approach or even surpass human performance in most of them, which has recently encouraged the development of first-person 3D game platforms embedding realistic physics. Building upon this analysis, we first propose an embodied cognitive architecture integrating heterogeneous sub-fields of Artificial Intelligence into a unified framework. We demonstrate the utility of our approach by showing how major contributions of the field can be expressed within the proposed framework. We then claim that benchmarking environments need to reproduce ecologically-valid conditions for bootstrapping the acquisition of increasingly complex cognitive skills through the concept of a cognitive arms race between embodied agents. Comment: Updated version of the paper accepted to the ICDL-Epirob 2017 conference (Lisbon, Portugal).

    Meetings and Meeting Modeling in Smart Environments

    In this paper we survey our research on smart meeting rooms and its relevance for augmented reality meeting support and virtual reality generation of meetings in real time or off-line. The research reported here forms part of the European 5th and 6th framework programme projects multi-modal meeting manager (M4) and augmented multi-party interaction (AMI). Both projects aim at building a smart meeting environment that is able to collect multimodal captures of the activities and discussions in a meeting room, with the aim of using this information as input to tools that allow real-time support, browsing, retrieval and summarization of meetings. Our aim is to research (semantic) representations of what takes place during meetings in order to allow generation, e.g. in virtual reality, of meeting activities (discussions, presentations, voting, etc.). Being able to do so also allows us to look at tools that provide support during a meeting and at tools that allow those not able to be physically present during a meeting to take part in a virtual way. This may lead to situations where the differences between real meeting participants, human-controlled virtual participants and (semi-) autonomous virtual participants disappear.